# lc run

Materialize outputs declared in `astra.yaml`. Generates a Snakefile
and dispatches through Snakemake on a Dask cluster.
## Synopsis

`lc run [OPTIONS] [OUTPUTS]...`

`OUTPUTS` is zero or more output ids. With no arguments, materializes
everything (Snakemake's rule `all`).
## Options

| Option | Default | Effect |
|---|---|---|
| `--universe, -u NAME` | all universes in `universes/*.yaml` (or `["default"]` if none exist) | Restrict to one universe. |
| `--jobs, -j N` | `os.cpu_count()` | Parallel jobs / Dask submission concurrency. Passed as both `--cores` and `--jobs` to Snakemake. |
| `--rerun-triggers TRIGGERS` | `code,input,mtime,params` | Comma-separated rerun triggers (forwarded to Snakemake). |
| `--force, -f` | off | `--force` when targets are named, `--forceall` otherwise. |
| `--verbose, -v` | off | Show the underlying Snakemake / executor chatter and the spawned `snakemake` invocation. |
## What happens, step by step

- Find the project (walk up looking for `astra.yaml`).
- Discover universes from `universes/*.yaml` (default to `["default"]`).
- Resolve the container runtime via `lightcone.engine.container.load_runtime`. If `auto` falls back to `none` while the spec declares containers, print a loud provenance warning.
- Generate `.lightcone/Snakefile` and `.lightcone/snakefile-config.json` for the selected universes.
- Translate any explicit `OUTPUTS` into Snakemake target paths (`<output_dir>/.lightcone-manifest.json`); this is what tells Snakemake "build that specific output."
- Open a Dask cluster context (local, `srun`-backed inside `SLURM_JOB_ID`, or external if `DASK_SCHEDULER_ADDRESS` is set).
- Spawn `snakemake -s … -d … --cores N --jobs N --executor dask --rerun-triggers …` with `DASK_SCHEDULER_ADDRESS` in the environment.
- In the default (non-verbose) path, filter the executor's banner chatter so the output reads as lightcone's, not Snakemake's. Real error content always passes through.
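The target-translation step above can be pictured as a small sketch. The function name and the `output_dirs` mapping are illustrative assumptions, not lightcone internals; the real mapping lives in `snakefile-config.json`:

```python
import os

def snakemake_targets(outputs, output_dirs):
    """Translate explicit output ids into the manifest paths that are
    handed to Snakemake as targets. An empty result means "no explicit
    targets", so Snakemake falls back to its default rule `all`.

    output_dirs: hypothetical {output_id: output_dir} mapping.
    """
    if not outputs:
        return []  # let Snakemake build rule `all`
    return [
        os.path.join(output_dirs[out], ".lightcone-manifest.json")
        for out in outputs
    ]
```

For example, `snakemake_targets(["accuracy"], {"accuracy": "results/accuracy"})` yields `["results/accuracy/.lightcone-manifest.json"]`, which is exactly the path form described above.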
## Output qualification

When the same `output_id` appears in multiple sub-analyses, you must
qualify it as `<analysis_id>.<output_id>`, e.g. `lc run fit.accuracy`
for a hypothetical `fit` sub-analysis.
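The resolution rule can be sketched as follows; the function name, the error message, and the dict shape are assumptions for illustration, not lightcone's actual code:

```python
def resolve_output(requested, outputs_by_analysis):
    """Resolve a possibly-qualified output id against
    {analysis_id: {output_id, ...}}.

    A qualified "analysis.output" id picks its analysis directly; a
    bare id is accepted only if exactly one analysis declares it.
    """
    if "." in requested:
        analysis, output = requested.split(".", 1)
        if output not in outputs_by_analysis.get(analysis, set()):
            raise KeyError(requested)
        return analysis, output
    owners = [a for a, outs in outputs_by_analysis.items() if requested in outs]
    if len(owners) != 1:
        raise ValueError(f"ambiguous or unknown output id: {requested!r}")
    return owners[0], requested
```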
## Inside a generated rule

Each rule's body wraps the recipe in a
`<runtime> run --rm --pull=never -v "$PWD":"$PWD" -w "$PWD" <image> bash -c '<recipe>'`
shell when a container is configured. After the recipe shell exits,
the Snakefile calls `write_manifest()` host-side, and the validation
snippet emits warnings for empty, all-NaN, or wrong-extension outputs.
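A minimal sketch of that recipe wrapping, assuming a quoting helper and a runtime name for illustration (the real Snakefile emits the shell string itself):

```python
import shlex

def containerized(recipe, image, runtime):
    """Wrap a recipe so it runs inside the configured container,
    mounting the project directory read-write at the same path and
    using it as the working directory."""
    return (
        f'{runtime} run --rm --pull=never -v "$PWD":"$PWD" -w "$PWD" '
        f"{image} bash -c {shlex.quote(recipe)}"
    )
```

`--pull=never` keeps the run reproducible against whatever image is already present locally rather than silently fetching a newer tag.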
## Examples

```shell
lc run                                          # all outputs, all universes
lc run --universe baseline                      # one universe
lc run accuracy                                 # one output
lc run accuracy precision --universe baseline   # several
lc run --jobs 4 --verbose                       # parallel, with stack noise
lc run --force --universe baseline              # rebuild everything
lc run --rerun-triggers params,input            # tighter staleness
```
## Inside SLURM

`lc run` detects `SLURM_JOB_ID`, binds the Dask scheduler to the
driver's hostname, and launches one `dask worker` per node via `srun`.
Workers advertise `cpus`, `memory`, and `gpus` resources. Per-rule
resource hints (`cpus_per_task`, `mem_mb`, `gpus_per_task`) constrain
which workers can pick up which jobs.
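The matching between advertised worker resources and per-rule hints can be pictured with this sketch; the dict shapes and defaults are assumptions (Snakemake and the Dask executor negotiate the real scheduling):

```python
def worker_can_run(worker, rule):
    """True if a worker's advertised resources cover a rule's hints.

    worker: resources a Dask worker advertises, e.g.
            {"cpus": 8, "memory": 16000, "gpus": 0} (memory in MB).
    rule:   per-rule hints from the Snakefile, keyed as in the docs.
    """
    needs = {
        "cpus": rule.get("cpus_per_task", 1),
        "memory": rule.get("mem_mb", 0),
        "gpus": rule.get("gpus_per_task", 0),
    }
    return all(worker.get(res, 0) >= need for res, need in needs.items())
```

A rule hinting `gpus_per_task: 1` therefore never lands on a CPU-only worker, while an unhinted rule fits anywhere with at least one free CPU.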
## Provenance gotcha

If `~/.lightcone/config.yaml` says `runtime: auto` and no runtime is
on `PATH`, `lc run` falls back to running recipes on the host. Because
each manifest still records the declared `container_image`, this is a
provenance lie. `lc run` prints a yellow warning telling you to either
install a runtime or set `container.runtime: none` explicitly.
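A minimal config fragment making the host-execution choice explicit; the key layout is inferred from the `container.runtime` setting named in the warning, so treat it as a sketch:

```yaml
# ~/.lightcone/config.yaml
container:
  runtime: none   # run recipes on the host, and record that honestly
```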
See `api/dask_cluster` for the cluster-shape decision and Architecture for the full execution flow.