A complete, task-oriented guide for operating bulkers end-to-end
Bulkers manages multi-container computing environments. You define a "crate": a YAML manifest listing Docker images and the commands they provide. When you activate a crate, bulkers creates symlinks that dispatch each command to the right container at runtime, so the commands appear on your PATH as if natively installed. Crates need no separate install step: run bulkers activate and go.
# One-liner install (Linux/macOS)
curl -sL https://raw.githubusercontent.com/databio/bulkers/master/install.sh | bash
# This installs the binary to ~/.local/bin/bulkers
# and adds a shell function to ~/.bashrc or ~/.zshrc
Verify the installation:
bulkers --version
If not already initialized, create a config file:
# Auto-detects docker or apptainer
bulkers config init
# Or specify config path and engine explicitly
bulkers config init -c ~/.config/bulker/bulker_config.yaml -e docker
Activate a crate (interactive shell):
bulkers activate databio/pepatac
# Auto-fetches from registry if not cached locally
# You're now in a shell where samtools, bowtie2, etc. are on PATH
samtools --version
bulkers deactivate
activate auto-fetches the manifest from the registry on first use, creates symlinks for each command, and prepends them to your PATH. deactivate restores the original PATH. No separate install step needed.
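The mechanism described above can be illustrated with a small, self-contained sketch. It does not call bulkers: a stand-in dispatcher script (hypothetical, with a hard-coded image table) checks the name it was invoked under, two symlinks point at it, and the crate directory is prepended to PATH and then removed again. The echoed docker run line is only illustrative and is not the command bulkers actually builds.

```bash
# Sketch only: mimics the idea of symlink dispatch, not bulkers' real code.
demo_dir="$(mktemp -d)"

# Stand-in dispatcher: looks at the name it was invoked as and
# reports which image it would run (the image table is an assumption).
cat > "$demo_dir/dispatcher" << 'EOF'
#!/bin/sh
invoked_as="$(basename "$0")"
case "$invoked_as" in
  samtools) image="quay.io/biocontainers/samtools:1.9--h91753b0_8" ;;
  bedtools) image="quay.io/biocontainers/bedtools:2.29.2--hc088bd4_0" ;;
  *) echo "no container mapped to '$invoked_as'" >&2; exit 127 ;;
esac
echo "would run: docker run $image $invoked_as $*"
EOF
chmod +x "$demo_dir/dispatcher"

# One symlink per command, all pointing at the same dispatcher.
mkdir "$demo_dir/crate"
ln -s "$demo_dir/dispatcher" "$demo_dir/crate/samtools"
ln -s "$demo_dir/dispatcher" "$demo_dir/crate/bedtools"

# "Activate": remember the old PATH and prepend the crate directory.
original_path="$PATH"
PATH="$demo_dir/crate:$PATH"
shim_output="$(samtools view -h input.bam)"
echo "$shim_output"

# "Deactivate": restore the original PATH and clean up.
PATH="$original_path"
rm -rf "$demo_dir"
```

Because every symlink resolves to the same executable, adding a command to the environment only requires one more link and one more table entry.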
Pre-cache for offline use (optional):
bulkers crate install databio/pepatac:1.0.13 # cache manifest
bulkers crate install databio/pepatac:1.0.13 -b # cache manifest + pull images
Run a single command (non-interactive, for scripts/AI):
bulkers exec databio/pepatac -- samtools view -h input.bam
exec is the preferred method for AI agents and scripts — it runs a single command in the crate's environment without modifying your current shell.
See what's cached:
bulkers crate list # list all cached crates
bulkers crate list --simple # space-separated, for scripting
bulkers crate inspect databio/pepatac # show commands in this crate
bulkers crate clean databio/pepatac # remove a cached crate
bulkers crate clean --all # clear entire cache
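For scripting, the space-separated output of --simple can drive a loop. A minimal sketch, assuming the output consists of crate names separated by whitespace; inspect_all_crates is a hypothetical helper name, not part of bulkers.

```bash
# Sketch: inspect every cached crate in turn.
inspect_all_crates() {
  local crate
  # The command substitution is left unquoted on purpose so the
  # space-separated list splits into one word per crate.
  for crate in $(bulkers crate list --simple); do
    echo "== $crate =="
    bulkers crate inspect "$crate"
  done
}
```

Call inspect_all_crates after the function is defined; it only uses the crate list and crate inspect subcommands shown above.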
manifest:
  name: my-tools
  commands:
  - command: samtools
    docker_image: quay.io/biocontainers/samtools:1.9--h91753b0_8
Each field:
- command is the name you'll type on the command line
- docker_image is the full Docker image reference (registry/repo:tag)
manifest:
  name: biotools
  version: "1.0"
  commands:
  - command: samtools
    docker_image: quay.io/biocontainers/samtools:1.9--h91753b0_8
  - command: bedtools
    docker_image: quay.io/biocontainers/bedtools:2.29.2--hc088bd4_0
  - command: python
    docker_image: python:3.11
    docker_command: python
    docker_args: "-it"
- docker_command is what to run inside the container. It defaults to the command value. Use it when the container command name differs from what you want the user to type.
- docker_args holds extra Docker flags. -i keeps stdin open (needed for piping); -it is for interactive tools like python/R.
manifest:
  name: data-tools
  commands:
  - command: mytool
    docker_image: myimage:latest
    volumes:
    - /data/shared
    - /scratch
How mounting works:
- $HOME is always mounted (configured globally in bulker config).
- Each volume is mounted as --volume "/path:/path", the same path inside and outside the container.
- The working directory is set to $(pwd) so relative paths work.
- Containers run as your user (--user=$(id -u):$(id -g)) so files are owned by you, not root.
- System paths are also mounted: /etc/passwd, /etc/group, /etc/shadow, /etc/sudoers.d (read-only), /tmp/.X11-unix (for GUI apps).
When you need extra volumes:
- Paths outside $HOME: add them to volumes:
- Paths under $HOME need no entry; it's already mounted globally
manifest:
  name: display-tools
  commands:
  - command: firefox
    docker_image: jess/firefox
    docker_args: "-it"
    envvars:
    - DISPLAY
    - XAUTHORITY
    no_network: false
- envvars is a list of environment variable names to pass into the container.
- DISPLAY is passed globally by default (in bulker config).
- no_network: false (the default) adds --network=host.
manifest:
  name: advanced
  commands:
  - command: redis-server
    docker_image: redis:7
    docker_command: redis-server
    docker_args: "--name redis -p 6379:6379"
    workdir: /data
    no_user: true
    no_network: false
  - command: jq
    docker_image: ghcr.io/jqlang/jq
    docker_args: "-i --entrypoint jq"
    docker_command: " "
Field reference:
| Field | Type | Default | Description |
|---|---|---|---|
| command | string | required | Executable name created on PATH |
| docker_image | string | required | Docker image reference |
| docker_command | string | command | Command to run in container |
| docker_args | string | none | Extra docker run flags |
| volumes | list | [] | Additional mount paths |
| envvars | list | [] | Env var names to pass through |
| no_user | bool | false | If true, run as root (not current user) |
| no_network | bool | false | If true, don't add --network=host |
| workdir | string | $(pwd) | Working directory in container |
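As a rough illustration of how these fields might translate into a docker run invocation, the sketch below assembles a command line from a few of them and prints it. build_docker_cmd and its argument order are hypothetical; the exact command bulkers generates is not shown in this guide, so treat the output as an approximation based on the flags mentioned above (--volume, --user, --network=host) plus Docker's standard --workdir flag. The envvars field is left out for brevity.

```bash
# Sketch only: prints an approximate docker run line for inspection.
# Usage: build_docker_cmd IMAGE COMMAND NO_USER NO_NETWORK WORKDIR [VOLUME...]
build_docker_cmd() {
  local image="$1" cmd="$2" no_user="$3" no_network="$4" workdir="$5"
  local line vol
  shift 5
  line="docker run"
  # $HOME is mounted globally; extra volumes use the same path on both sides.
  for vol in "$HOME" "$@"; do
    line="$line --volume $vol:$vol"
  done
  # Run as the calling user unless no_user is true.
  if [ "$no_user" != "true" ]; then
    line="$line --user=$(id -u):$(id -g)"
  fi
  # Use host networking unless no_network is true.
  if [ "$no_network" != "true" ]; then
    line="$line --network=host"
  fi
  echo "$line --workdir $workdir $image $cmd"
}

# Defaults from the table: run as current user, host network, current directory.
build_docker_cmd myimage:latest mytool false false "$PWD" /data/shared /scratch
```

Setting both no_user and no_network to true drops the --user and --network flags, which makes the effect of each field easy to compare.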
manifest:
  name: pepatac
  version: "1.0.14"
  imports:
  - bulker/coreutils
  - databio/bedstuff
  host_commands:
  - python3
  - perl
  - git
  commands:
  - command: samtools
    docker_image: quay.io/biocontainers/samtools:1.9--h91753b0_8
- imports lists other crates whose commands are included when this crate is activated. They are resolved at runtime, so updating an imported crate propagates automatically.
- host_commands lists commands from the host system (not containerized) that should be available in the crate. Symlinks to the host binary are created in the crate directory.
- Use host_commands for tools that don't need containerization (python3, perl, git) or that need direct host access.
# Activate directly from a local file
bulkers activate ./my-manifest.yaml
# Pre-cache a local manifest (optional)
bulkers crate install ./my-manifest.yaml
# Pre-cache and pull all Docker images
bulkers crate install ./my-manifest.yaml -b
# Exec with a local manifest
bulkers exec ./my-manifest.yaml -- samtools --version
When the argument starts with ./, /, or ends in .yaml/.yml, it is treated as a local manifest file. The manifest is cached automatically for shimlink dispatch at runtime. Otherwise the argument is treated as a registry shorthand (namespace/crate:tag).
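The rule above can be expressed as a single case statement. This is a sketch of the stated rule only; is_local_manifest is a hypothetical helper name and bulkers' own check may differ in detail.

```bash
# Returns 0 if the argument looks like a local manifest file, 1 otherwise.
is_local_manifest() {
  case "$1" in
    ./*|/*|*.yaml|*.yml) return 0 ;;
    *) return 1 ;;
  esac
}

for arg in ./my-manifest.yaml /opt/crates/tools.yml manifest.yaml databio/pepatac:1.0.13; do
  if is_local_manifest "$arg"; then
    echo "$arg -> local manifest file"
  else
    echo "$arg -> registry shorthand"
  fi
done
```

Note that a bare file name such as manifest.yaml counts as local because of its extension, even without a leading ./ prefix.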
The registry is a static collection of YAML files served from GitHub via hub.bulker.io. Publishing means adding your manifest file to this repository.
# 1. Write your manifest
cat > my-crate.yaml << 'EOF'
manifest:
  name: my-crate
  version: "1.0"
  commands:
  - command: mytool
    docker_image: myorg/mytool:1.0
EOF
# 2. Fork or clone the hub.bulker.io repo
git clone git@github.com:databio/hub.bulker.io.git
cd hub.bulker.io
# 3. Add your manifest under your namespace
mkdir -p mynamespace
cp ../my-crate.yaml mynamespace/my-crate.yaml
# For versioned: mynamespace/my-crate_1.0.yaml
# 4. Commit and push
git add mynamespace/my-crate.yaml
git commit -m "Add mynamespace/my-crate"
git push origin main
# 5. Open a pull request to databio/hub.bulker.io
File naming convention:
- Unversioned: namespace/crate_name.yaml
- Versioned: namespace/crate_name_1.0.yaml
After merge, the crate is available via: bulkers activate mynamespace/my-crate:1.0
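To check that a file is named so that a given shorthand will find it, the convention can be written as a small mapping. This is a sketch under the assumption that the shorthand has the form namespace/crate[:tag] and that an omitted tag maps to the unversioned file; registry_path is a hypothetical helper, and how bulkers joins this path with the registry URL is not covered here.

```bash
# Maps namespace/crate[:tag] to the file path implied by the naming convention.
registry_path() {
  local shorthand="$1" name tag
  name="${shorthand%%:*}"        # everything before the first colon
  if [ "$name" = "$shorthand" ]; then
    tag=""                       # no colon present, so no tag was given
  else
    tag="${shorthand#*:}"        # everything after the first colon
  fi
  if [ -n "$tag" ]; then
    echo "${name}_${tag}.yaml"
  else
    echo "${name}.yaml"
  fi
}

registry_path mynamespace/my-crate:1.0   # prints mynamespace/my-crate_1.0.yaml
registry_path mynamespace/my-crate       # prints mynamespace/my-crate.yaml
```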
Full config file with annotations:
bulker:
  container_engine: docker # "docker" or "apptainer"
  default_namespace: bulker # used when namespace is omitted
  registry_url: http://hub.bulker.io/ # default manifest registry
  # Shell settings
  shell_path: ${SHELL}
  shell_rc: $HOME/.bashrc
  # Global defaults (applied to all containers)
  volumes:
  - $HOME
  envvars:
  - DISPLAY
  # Build template (for crate install --build)
  build_template: docker_build.tera
  # Apptainer-specific
  apptainer_image_folder: ~/.local/share/apptainer/images
Manifests are cached in ~/.config/bulker/manifests/ (managed automatically by activate). There is no crates map in the config — the filesystem cache is the source of truth.
Config file location lookup order:
1. -c flag on any command
2. $BULKERCFG environment variable
3. ~/.config/bulker/bulker_config.yaml
Setting config values from the CLI:
# Add global volumes that all containers will mount
bulkers config set volumes=$HOME,/data/shared,/scratch
# Add global environment variables
bulkers config set envvars=DISPLAY,LANG,MY_VAR
# Change container engine
bulkers config set container_engine=apptainer
# View current config
bulkers config show
# Get a specific value
bulkers config get container_engine
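When debugging which config file is in effect, it can help to reproduce the lookup order (the -c flag, then $BULKERCFG, then the default path) by hand. A minimal sketch; resolve_config is a hypothetical helper that takes the value of -c as its first argument, and it does not check that the file exists.

```bash
# Prints the config path that the documented lookup order would select.
resolve_config() {
  local flag_path="${1:-}"
  if [ -n "$flag_path" ]; then
    echo "$flag_path"                                  # 1. explicit -c value
  elif [ -n "${BULKERCFG:-}" ]; then
    echo "$BULKERCFG"                                  # 2. environment variable
  else
    echo "$HOME/.config/bulker/bulker_config.yaml"     # 3. default location
  fi
}

resolve_config ""
```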
# Initialize with apptainer
bulkers config init -e apptainer
# Or change engine in existing config
bulkers config set container_engine=apptainer
Differences from Docker:
- Images are stored as .sif files in apptainer_image_folder.
- $HOME is always mounted by Apptainer (not listed separately).
- Apptainer uses -B for bind mounts instead of --volume.
Pattern: Set up a bioinformatics environment
# Just exec — auto-fetches manifest on first use
bulkers exec databio/pepatac -- samtools view -h input.bam > output.sam
Pattern: Create a custom crate for a project
cat > manifest.yaml << 'EOF'
manifest:
  name: my-analysis
  commands:
  - command: samtools
    docker_image: quay.io/biocontainers/samtools:1.17--hd87286a_2
    docker_args: "-i"
  - command: bedtools
    docker_image: quay.io/biocontainers/bedtools:2.31.0--hf5e1c6e_2
  - command: R
    docker_image: r-base:4.3
    docker_command: R
    docker_args: "-it"
  host_commands:
  - python3
  - git
EOF
# Activate directly from local file
bulkers activate ./manifest.yaml
samtools --version
bulkers deactivate
# Or exec without activating
bulkers exec ./manifest.yaml -- samtools --version
Pattern: Run a pipeline step
# Non-interactive, for scripts -- no shell function needed
bulkers exec databio/pepatac -- trim_galore --paired R1.fastq.gz R2.fastq.gz -o trimmed/
bulkers exec databio/pepatac -- bowtie2 -x /data/genomes/hg38 -1 trimmed/R1.fq.gz -2 trimmed/R2.fq.gz -S aligned.sam
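In a longer script it is usually worth stopping at the first failed step. One way to arrange that is sketched below, reusing the two commands above; run_step and run_pipeline are hypothetical helper names, and the sketch assumes bulkers exec returns a non-zero exit status when the wrapped command fails.

```bash
CRATE="databio/pepatac"

# Runs one command inside the crate and reports a failure on stderr.
run_step() {
  echo "[step] $*" >&2
  bulkers exec "$CRATE" -- "$@" || {
    echo "[failed] $1" >&2
    return 1
  }
}

# Runs the steps in order and stops at the first failure.
run_pipeline() {
  run_step trim_galore --paired R1.fastq.gz R2.fastq.gz -o trimmed/ || return 1
  run_step bowtie2 -x /data/genomes/hg38 -1 trimmed/R1.fq.gz -2 trimmed/R2.fq.gz -S aligned.sam || return 1
}
```

Call run_pipeline from the directory holding the input files; progress messages go to stderr so they do not mix with any output a tool writes to stdout.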
Pattern: Check what's available before running
bulkers crate list
bulkers crate inspect databio/pepatac
# Shows: samtools, bowtie2, trim_galore, bedtools, ...
Pattern: Multiple crates at once
# Activate multiple crates (commands from all are on PATH)
bulkers activate databio/pepatac,bulker/coreutils
# Exec with multiple crates
bulkers exec databio/pepatac,bulker/coreutils -- samtools --version
Pattern: Strict mode (only crate commands on PATH)
# No host commands leak into the environment
bulkers activate -s databio/pepatac
bulkers exec -s databio/pepatac -- samtools --version
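The difference between normal and strict activation can be pictured as prepending to PATH versus replacing it. The sketch below is an illustration of that idea only and does not call bulkers: crate_dir is a temporary stand-in for the crate's command directory, the samtools file is a dummy script, and how bulkers implements strict mode internally is an assumption.

```bash
crate_dir="$(mktemp -d)"
printf '#!/bin/sh\necho "samtools shim"\n' > "$crate_dir/samtools"
chmod +x "$crate_dir/samtools"

# Normal: crate commands come first, host commands such as ls stay visible.
normal_ls="$(PATH="$crate_dir:$PATH"; command -v ls)"

# Strict: PATH holds only the crate directory, so host commands disappear.
strict_ls="$(PATH="$crate_dir"; command -v ls || echo "not found")"
strict_samtools="$(PATH="$crate_dir"; command -v samtools)"

echo "normal ls:       $normal_ls"
echo "strict ls:       $strict_ls"
echo "strict samtools: $strict_samtools"

rm -rf "$crate_dir"
```

Strict mode is useful for confirming that a pipeline depends only on tools the crate declares, since any undeclared host tool fails to resolve.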
| Term | Definition |
|---|---|
| Crate | A loaded collection of containerized commands, identified by namespace/name:tag |
| Manifest | YAML file defining the commands in a crate, their Docker images, and configuration |
| Namespace | Organizational prefix (e.g., databio, bulker). Like a GitHub org. |
| Tag | Version identifier for a crate (e.g., 1.0.13, default) |
| Registry | HTTP endpoint serving manifest YAML files (default: hub.bulker.io) |
| Shimlink | A symlink to the bulkers binary. When invoked (e.g., as samtools), bulkers checks argv[0] and dispatches to the right container at runtime. |
| Activate | Auto-fetch a manifest, create shimlinks, and put them on your PATH. Works with registry crates and local files. |
| Exec | Run a single command in a crate's environment without modifying your shell |