vortex_torch.cache.context
Classes
| Context | Mutable, single-instance cache context; populate later via .create(...). |
- class Context
Bases: ContextBase
Mutable, single-instance cache context; populate later via .create(…).
Beyond the minimal runtime knobs (page/block layout, head shape), this context also carries the graph and codegen state used by the cache compiler, mirroring vortex_torch.indexer.context.Context.
The graph state (tensor_list, op_list, op_to_input_tensor_list, op_to_output_tensor_list, output_tensor_to_op_list) is populated during the profile phase as cache ops register themselves; the codegen state (compilation_header_lines, auxilary_func_def_lines, tensor_id_to_tensor_name_map, compilation_cache_dir, sparse_attention_name, impl_backend) is consumed by vortex_torch.cache.compiler.
- create(parent, model_runner, *, overwrite=False)
Populate this instance once (no locking). Set overwrite=True to allow re-initialization. Note: because there is no locking, concurrent callers may race; call this from a single thread.
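The populate-once contract described for create can be illustrated with a small standalone sketch. It does not use vortex_torch: OnceContext and create_once are hypothetical names, and raising RuntimeError on a second call without overwrite=True is an assumption, since the documentation does not say how a repeated call is handled. The caller-side lock shows one way to serialize calls, given that the context itself does no locking.

```python
import threading


class OnceContext:
    """Hypothetical stand-in for a populate-once context (not the vortex_torch class)."""

    def __init__(self):
        self._created = False
        self.parent = None
        self.model_runner = None

    def create(self, parent, model_runner, *, overwrite=False):
        # Assumption: a second call without overwrite=True is rejected.
        if self._created and not overwrite:
            raise RuntimeError("already created; pass overwrite=True to re-init")
        self.parent = parent
        self.model_runner = model_runner
        self._created = True
        return self


# The context does no locking of its own, so callers that may run
# concurrently serialize create() themselves.
_create_lock = threading.Lock()


def create_once(ctx, parent, model_runner):
    """Populate ctx only if nobody has done so yet (hypothetical helper)."""
    with _create_lock:
        if not ctx._created:
            ctx.create(parent, model_runner)
    return ctx
```

If all initialization happens on one thread, as the note on create recommends, the lock is unnecessary and create can be called directly.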
- vortex_dtype: torch.dtype
Intermediate-tensor dtype (default torch.bfloat16).