Math (Offline)#

Run the math RL recipe on a node with no internet access by pre-downloading every training and evaluation dataset to a local directory.

Recipe: examples/math/offline/qwen3-8b-m2po-full-offline/

Downloader: examples/math/offline/download_math_datasets.py

This is the same Qwen3-8B / M2PO / TCP recipe as Math, with one difference: at startup the AstraFlow service loads every dataset from disk instead of fetching from the HuggingFace Hub.

1. One-time prep — download datasets#

From the repo root:

python examples/math/offline/download_math_datasets.py --root data-data/math

This writes 8 dataset directories under data-data/math/ (~400 MB total) plus a MANIFEST.json:

Directory	HF source	Split	Use
`deepscaler`	`agentica-org/DeepScaleR-Preview-Dataset`	train	rollout
`dapo_filter`	`aaabiao/dapo_filter`	train	rollout
`aime24`	`HuggingFaceH4/aime_2024`	train	eval
`aime25`	`math-ai/aime25`	test	eval
`amc`	`rawsh/2024_AMC12`	train	eval
`math500`	`HuggingFaceH4/MATH-500`	test	eval
`minerva`	`math-ai/minervamath`	test	eval
`olympiadbench`	`math-ai/olympiadbench`	test	eval

Re-running is idempotent (skips populated dirs). Useful flags:

--force — re-download even if a directory exists
--only deepscaler,aime24 — partial subset
--verify — skip download; just load each from disk and assert non-empty

2. Run training#

bash examples/math/offline/qwen3-8b-m2po-full-offline/scripts/run_qwen3-8b-m2po-full-offline.sh

You can confirm the offline path is active by looking for these lines in the AstraFlow service log:

Auto-derived offline_dir for dataset 'deepscaler': data-data/math/deepscaler
Loading DeepScaleR dataset from offline path: data-data/math/deepscaler
Auto-derived offline_dir for dataset 'aime24': data-data/math/aime24
... (same for aime25, amc, minerva, math500)

How it works#

The recipe’s experiment.yaml sets a single field under dataflow:

dataflow:
  data_root: data-data/math

At startup astraflow.dataflow.service walks every entry in rollout_dataset and eval_datasets; for each one that does not already specify offline_dir, it auto-derives offline_dir = f"{data_root}/{name}". The name is:

the dict key for eval datasets (aime24, aime25, amc, minerva, math500)
the dataset_fn module basename for the rollout dataset (deepscaler from astraflow.dataflow.dataset.deepscaler:get_deepscaler_rl_dataset)

The downloader uses the same naming convention, so the two sides stay in sync. To opt a single dataset out — e.g. point one eval at a different snapshot — just set offline_dir: explicitly on that entry; explicit values always win.

To convert any other recipe to offline mode, add the same dataflow.data_root field; no other changes are required.

Caveats#

Model and tokenizer weights are not covered by the dataset downloader. model_path / tokenizer_path still point at Qwen/Qwen3-8B and resolve via the HuggingFace cache. For a fully air-gapped run, pre-fetch them with huggingface-cli download Qwen/Qwen3-8B --local-dir /local/models/Qwen3-8B and edit the two paths in experiment.yaml.
The downloader needs internet at prep time. Once data-data/math/ is populated, training itself works with HF_HUB_OFFLINE=1 / HF_DATASETS_OFFLINE=1.