Trainer
The Trainer (astraflow/train_worker/) is the distributed training engine
supporting FSDP and Megatron backends.
Design Principle: Swappable Trainer
The trainer is a swappable component. AstraFlow provides a built-in PPO trainer, but users can replace it with their own training framework. The trainer communicates with Dataflow purely over HTTP; no shared library or class hierarchy couples the two.
The integration surface for a custom trainer is six HTTP APIs: three outbound calls to Dataflow and three inbound endpoints for weight transfer. Any training framework (PyTorch, JAX, Megatron, custom) can integrate with Dataflow as long as it speaks this protocol:
┌──────────────┐     HTTP (3 calls)      ┌──────────────┐
│              │ ──────────────────────► │              │
│   Trainer    │                         │   Dataflow   │
│  (swappable) │                         │   (stable)   │
│              │ ◄────────────────────── │              │
└──────┬───────┘                         └──────────────┘
       │
       │ TCP weight pull (3 endpoints)
       │
┌──────▼───────┐
│     RaaS     │
│  (swappable) │
└──────────────┘
The trainer interacts with two components:
- Dataflow service (outbound): 3 HTTP calls for registration, data pulling, and version notification.
- RaaS (inbound): 3 HTTP endpoints served by a weight sender agent that RaaS connects to for pulling updated weights.
Trainer → Dataflow API
| Category | Method | Endpoint | Purpose |
|---|---|---|---|
| Registration | | | Signal readiness, pass batch size and sender endpoint |
| Data | | | Pull a training batch (blocks until available) |
| Weights | | | Notify new weight version, trigger RaaS broadcast |
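The three outbound calls could be wrapped roughly as follows. This is a minimal sketch under stated assumptions, not the built-in trainer: the class `DataflowClient`, the function `run_training`, the JSON field names, and the use of POST for every call are all hypothetical, and the endpoint paths are deliberately left for the caller to supply because they are not listed here.

```python
import json
from typing import Any, Callable, Dict
from urllib import request

# A transport takes (url, payload) and returns the decoded JSON reply.
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def http_post_json(url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """POST a JSON body and decode a JSON reply using only the stdlib."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the real HTTP methods may differ
    )
    # No timeout is set, so a blocking batch pull waits for the server.
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


class DataflowClient:
    """Hypothetical wrapper around the three Trainer -> Dataflow calls."""

    def __init__(self, base_url: str, paths: Dict[str, str],
                 transport: Transport = http_post_json) -> None:
        self.base_url = base_url.rstrip("/")
        # Caller supplies the real paths under "register", "pull", "notify".
        self.paths = paths
        self.transport = transport

    def _call(self, name: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        return self.transport(self.base_url + self.paths[name], payload)

    def register(self, batch_size: int, sender_endpoint: str) -> Dict[str, Any]:
        # Field names are illustrative, not the actual request schema.
        return self._call("register", {"batch_size": batch_size,
                                       "sender_endpoint": sender_endpoint})

    def pull_batch(self) -> Dict[str, Any]:
        return self._call("pull", {})

    def notify_version(self, version: int) -> Dict[str, Any]:
        return self._call("notify", {"version": version})


def run_training(client: DataflowClient,
                 train_step: Callable[[Dict[str, Any]], None],
                 num_steps: int, batch_size: int, sender_endpoint: str) -> int:
    """Register once, then pull -> train -> notify on every step."""
    client.register(batch_size, sender_endpoint)
    for version in range(1, num_steps + 1):
        batch = client.pull_batch()
        train_step(batch)
        client.notify_version(version)
    return num_steps
```

Injecting the transport keeps the loop independent of any particular HTTP library, so the same control flow can be exercised against a stub before pointing it at a live Dataflow service.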
RaaS → Trainer API (weight sender)
| Category | Method | Endpoint | Frequency |
|---|---|---|---|
| Weights | | | Once per RaaS (query tensor layout) |
| Weights | | | Once per RaaS (establish TCP link) |
| Weights | | | Every step (pull weights over TCP) |
The weight sender is provided as a reusable library
(astraflow.weight_manager.transfer.sender_agent) so custom trainers
don’t need to reimplement TCP/ZMQ machinery. See
WeightManager for details.
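To make the call frequencies in the RaaS → Trainer table concrete, the following in-memory simulation walks through the order in which a single RaaS instance would touch the sender: layout once, connect once, then one pull per step. It is only an illustration of the sequence. `FakeWeightSender`, its method names, and `simulate` are hypothetical and are not the `sender_agent` library; no HTTP or TCP traffic is involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class FakeWeightSender:
    """In-memory stand-in for a weight sender (illustration only)."""
    tensors: Dict[str, List[float]]
    version: int = 0
    connected: Set[str] = field(default_factory=set)
    calls: List[str] = field(default_factory=list)

    def publish(self, tensors: Dict[str, List[float]]) -> None:
        # Trainer side: store new weights and bump the version.
        self.tensors = tensors
        self.version += 1

    def layout(self) -> Dict[str, int]:
        # Queried once per RaaS: tensor names and element counts.
        self.calls.append("layout")
        return {name: len(values) for name, values in self.tensors.items()}

    def connect(self, raas_id: str) -> None:
        # Called once per RaaS: stands in for establishing the TCP link.
        self.calls.append("connect")
        self.connected.add(raas_id)

    def pull(self, raas_id: str) -> Tuple[int, Dict[str, List[float]]]:
        # Called every step: return the current version and a weight copy.
        if raas_id not in self.connected:
            raise RuntimeError("connect() must be called before pull()")
        self.calls.append("pull")
        return self.version, {n: list(v) for n, v in self.tensors.items()}


def simulate(steps: int) -> List[str]:
    """Run one RaaS against the fake sender and return the call order."""
    sender = FakeWeightSender(tensors={"w": [0.0, 0.0]})
    layout = sender.layout()
    sender.connect("raas-0")
    for step in range(1, steps + 1):
        sender.publish({"w": [float(step), float(step)]})
        version, weights = sender.pull("raas-0")
        # Each pull must match the layout queried up front.
        assert version == step
        assert {n: len(v) for n, v in weights.items()} == layout
    return sender.calls
```

Calling `simulate(3)` records one layout query, one connect, and three pulls, which mirrors the "once per RaaS" and "every step" entries in the table.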
For a full guide on implementing a custom trainer, see Custom Trainer Integration.