vortex_torch.cache.context
Classes
Context: Mutable, single-instance context; populate later via .create(...).
- class vortex_torch.cache.context.Context
Bases: ContextBase
Mutable, single-instance context; populate later via .create(...).
- create(parent, model_runner, *, overwrite=False)
Populate this instance once. Pass overwrite=True to allow re-initialization. NOTE: create() takes no lock, so concurrent callers may race; call it from a single thread.
- mode: Literal['profile', 'execute']
Current operating mode.
- max_new_tokens_per_batch
- page_size
- total_num_pages
- head_dim
- head_num
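The populate-once behaviour described for create() can be illustrated with a small stand-alone sketch. This is not the vortex_torch implementation: SketchContext is a hypothetical class, and two points are assumptions the reference above does not confirm: that a repeat call without overwrite=True raises an error, and that values such as page_size are read from attributes on model_runner.

```python
from types import SimpleNamespace
from typing import Any, Optional


class SketchContext:
    """Hypothetical stand-in for a populate-once context (not vortex_torch code)."""

    def __init__(self) -> None:
        # Fields start empty and are filled in later by create().
        self._created = False
        self.page_size: Optional[int] = None
        self.total_num_pages: Optional[int] = None

    def create(self, parent: Any, model_runner: Any, *, overwrite: bool = False) -> "SketchContext":
        # Assumption: a second call without overwrite=True is rejected.
        if self._created and not overwrite:
            raise RuntimeError("already populated; pass overwrite=True to re-init")
        # Assumption: the runner exposes these values as plain attributes.
        self.page_size = model_runner.page_size
        self.total_num_pages = model_runner.total_num_pages
        # No lock is taken here, so this must run on a single thread.
        self._created = True
        return self


# Usage: build the empty instance first, populate it once, then re-init explicitly.
ctx = SketchContext()
runner = SimpleNamespace(page_size=16, total_num_pages=1024)
ctx.create(None, runner)
ctx.create(None, SimpleNamespace(page_size=32, total_num_pages=512), overwrite=True)
```

Because the check on the populated flag and the assignment that sets it are not atomic, two threads calling create() at the same time could both pass the check, which is why the single-thread caveat matters.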