Math (Offline)
Run the math RL recipe on a node with no internet access by pre-downloading every training and evaluation dataset to a local directory.
Recipe: examples/math/offline/qwen3-8b-m2po-full-offline/
Downloader: examples/math/offline/download_math_datasets.py
This is the same Qwen3-8B / M2PO / TCP recipe as Math, with one difference: at startup the AstraFlow service loads every dataset from disk instead of fetching from the HuggingFace Hub.
1. One-time prep: download datasets
From the repo root:
python examples/math/offline/download_math_datasets.py --root data-data/math
This writes 8 dataset directories under data-data/math/ (~400 MB total) plus a MANIFEST.json:
| Directory | HF source | Split | Use |
|---|---|---|---|
| | | train | rollout |
| | | train | rollout |
| | | train | eval |
| | | test | eval |
| | | train | eval |
| | | test | eval |
| | | test | eval |
| | | test | eval |
Re-running is idempotent (skips populated dirs). Useful flags:
- --force: re-download even if a directory exists
- --only deepscaler,aime24: download a partial subset
- --verify: skip download; just load each dataset from disk and assert it is non-empty
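As a lighter-weight complement to --verify, the sketch below only checks that each expected directory exists and contains at least one entry. It is not the downloader's implementation and does not load any dataset; the function name `verify_root` is hypothetical, and the dataset names are the ones mentioned elsewhere in this guide, so adjust them to match your MANIFEST.json.

```python
from pathlib import Path


def verify_root(root: str, names: list[str]) -> dict[str, bool]:
    """Map each expected dataset name to True if its directory exists and is non-empty."""
    base = Path(root)
    status = {}
    for name in names:
        d = base / name
        # A directory counts as populated if it contains at least one entry.
        status[name] = d.is_dir() and any(d.iterdir())
    return status


if __name__ == "__main__":
    # Names assumed from this guide; they may not cover all 8 directories.
    expected = ["deepscaler", "aime24", "aime25", "amc", "minerva", "math500"]
    for name, ok in verify_root("data-data/math", expected).items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Because this only inspects the filesystem, it can run on the offline node before training starts.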
2. Run training
bash examples/math/offline/qwen3-8b-m2po-full-offline/scripts/run_qwen3-8b-m2po-full-offline.sh
You can confirm the offline path is active by looking for these lines in the AstraFlow service log:
Auto-derived offline_dir for dataset 'deepscaler': data-data/math/deepscaler
Loading DeepScaleR dataset from offline path: data-data/math/deepscaler
Auto-derived offline_dir for dataset 'aime24': data-data/math/aime24
... (same for aime25, amc, minerva, math500)
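If you would rather check the log programmatically, a minimal sketch follows. The regular expression is modeled on the lines quoted above, so the exact log format is an assumption, and the helper names `derived_offline_dirs` and `missing_datasets` are hypothetical.

```python
import re

# Pattern modeled on the log lines quoted above; the exact format is an assumption.
_DERIVED = re.compile(r"Auto-derived offline_dir for dataset '([^']+)': (\S+)")


def derived_offline_dirs(log_text: str) -> dict[str, str]:
    """Return {dataset_name: offline_dir} for every auto-derive line in the log."""
    return {m.group(1): m.group(2) for m in _DERIVED.finditer(log_text)}


def missing_datasets(log_text: str, expected: list[str]) -> list[str]:
    """List expected datasets with no auto-derive line, preserving input order."""
    seen = derived_offline_dirs(log_text)
    return [name for name in expected if name not in seen]
```

A dataset that sets offline_dir explicitly would presumably not emit an auto-derive line, so treat a "missing" result as a prompt to inspect that entry rather than as proof of a problem.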
How it works
The recipe’s experiment.yaml sets a single field under dataflow:
dataflow:
  data_root: data-data/math
At startup, astraflow.dataflow.service walks every entry in rollout_dataset and eval_datasets; for each one that does not already specify offline_dir, it auto-derives offline_dir = f"{data_root}/{name}". The name is:
- the dict key for eval datasets (aime24, aime25, amc, minerva, math500)
- the dataset_fn module basename for the rollout dataset (deepscaler from astraflow.dataflow.dataset.deepscaler:get_deepscaler_rl_dataset)
The downloader uses the same naming convention, so the two sides stay in sync. To opt a single dataset out — e.g. point one eval at a different snapshot — just set offline_dir: explicitly on that entry; explicit values always win.
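The naming rule and the "explicit values win" override can be illustrated with a short sketch. This is not the code in astraflow.dataflow.service; the function names are hypothetical and the logic is only a reading of the description above.

```python
from typing import Optional


def name_from_dataset_fn(dataset_fn: str) -> str:
    """Take the module basename of a 'package.module:function' reference."""
    module_path = dataset_fn.split(":", 1)[0]
    return module_path.rsplit(".", 1)[-1]


def resolve_offline_dir(data_root: str, name: str, explicit: Optional[str] = None) -> str:
    """An explicit offline_dir wins; otherwise derive it from data_root and the name."""
    if explicit:
        return explicit
    return f"{data_root}/{name}"


if __name__ == "__main__":
    rollout = name_from_dataset_fn(
        "astraflow.dataflow.dataset.deepscaler:get_deepscaler_rl_dataset"
    )
    print(resolve_offline_dir("data-data/math", rollout))   # data-data/math/deepscaler
    print(resolve_offline_dir("data-data/math", "aime24"))  # data-data/math/aime24
    # "/snapshots/aime24-v2" is a made-up path standing in for an explicit override.
    print(resolve_offline_dir("data-data/math", "aime24", explicit="/snapshots/aime24-v2"))
```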
To convert any other recipe to offline mode, add the same dataflow.data_root field; no other changes are required.
Caveats
- Model and tokenizer weights are not covered by the dataset downloader. model_path / tokenizer_path still point at Qwen/Qwen3-8B and resolve via the HuggingFace cache. For a fully air-gapped run, pre-fetch them with huggingface-cli download Qwen/Qwen3-8B --local-dir /local/models/Qwen3-8B and edit the two paths in experiment.yaml.
- The downloader needs internet at prep time. Once data-data/math/ is populated, training itself works with HF_HUB_OFFLINE=1 / HF_DATASETS_OFFLINE=1.
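A small preflight check can catch a missing dataset or model directory before the job starts. The sketch below assumes the example paths used in this guide; `offline_env` and `missing_paths` are hypothetical helpers, not part of the recipe.

```python
import os
from pathlib import Path


def offline_env(base: dict[str, str]) -> dict[str, str]:
    """Copy an environment and set the two offline switches named above."""
    env = dict(base)
    env["HF_HUB_OFFLINE"] = "1"
    env["HF_DATASETS_OFFLINE"] = "1"
    return env


def missing_paths(paths: list[str]) -> list[str]:
    """Return the paths that do not exist on local disk."""
    return [p for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    # /local/models/Qwen3-8B is the example --local-dir target from the caveat above.
    absent = missing_paths(["data-data/math", "/local/models/Qwen3-8B"])
    if absent:
        print("not found locally:", ", ".join(absent))
    else:
        env = offline_env(dict(os.environ))
        flags = {k: env[k] for k in ("HF_HUB_OFFLINE", "HF_DATASETS_OFFLINE")}
        print("ready; launch the run script with", flags)
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching the run script from Python.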