vortex_torch.indexer.utils_sglang
Functions
- vortex_torch.indexer.utils_sglang.plan_decode(cached_seq_lens, req_to_token, req_indices, ctx)
- Parameters:
cached_seq_lens (torch.Tensor)
req_to_token (torch.Tensor)
req_indices (torch.Tensor)
ctx (Context)
- vortex_torch.indexer.utils_sglang.plan_prefill(cached_seq_lens, dense_kv_indptr, dense_kv_indices, input_seq_lens, qo_indptr_ragged, qo_indptr_paged, kv_last_page_len, req_to_token, req_indices, batch_table, page_size, num_kv_heads)
- Parameters:
cached_seq_lens (torch.Tensor)
dense_kv_indptr (torch.Tensor)
dense_kv_indices (torch.Tensor)
input_seq_lens (torch.Tensor)
qo_indptr_ragged (torch.Tensor)
qo_indptr_paged (torch.Tensor)
kv_last_page_len (torch.Tensor)
req_to_token (torch.Tensor)
req_indices (torch.Tensor)
batch_table (torch.Tensor)
page_size (int)
num_kv_heads (int)
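The page does not document the layout of these arguments. The sketch below is an illustration only. It assumes that the `*_indptr` arguments are CSR-style offset arrays (an exclusive prefix sum over per-request lengths) and that `kv_last_page_len` is the number of tokens in the final KV page of each request, as in other paged-attention planners. The helper names `build_indptr` and `pages_and_last_len` are hypothetical and not part of `vortex_torch`. Plain Python lists stand in for the `torch.Tensor` arguments.

```python
from itertools import accumulate


def build_indptr(seq_lens):
    """Hypothetical helper: exclusive prefix sum over per-request lengths.

    Under the assumed CSR-style layout, request i owns the half-open
    range [indptr[i], indptr[i + 1]) of the flattened token dimension.
    """
    return [0] + list(accumulate(seq_lens))


def pages_and_last_len(seq_len, page_size):
    """Hypothetical helper: pages needed for seq_len tokens, and how many
    tokens occupy the last page (assumed meaning of kv_last_page_len)."""
    if seq_len == 0:
        return 0, 0
    num_pages = (seq_len + page_size - 1) // page_size  # ceiling division
    last_page_len = (seq_len - 1) % page_size + 1       # value in 1..page_size
    return num_pages, last_page_len


# Three requests contributing 3, 5 and 2 new tokens to the batch.
input_seq_lens = [3, 5, 2]
qo_indptr = build_indptr(input_seq_lens)

# Per-request page count and last-page fill for a page size of 16.
kv_lens = [17, 16, 1]
page_info = [pages_and_last_len(n, 16) for n in kv_lens]
```

If the assumed layout holds, `qo_indptr[-1]` equals the total number of query tokens in the batch, and a request whose KV length is an exact multiple of `page_size` has a completely full last page.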
- vortex_torch.indexer.utils_sglang.chunkwise_nh2hn_transpose(x, indptr, batch_table, num_qo_heads, num_kv_heads, head_dim)
- vortex_torch.indexer.utils_sglang.chunkwise_hn2nh_transpose(x, y, indptr, batch_table, num_qo_heads, num_kv_heads, head_dim)
- vortex_torch.indexer.utils_sglang.plan_prefill_fa3(cached_seq_lens, cu_seqlens_q, req_to_token, req_indices, page_table, batch_table, page_size, num_kv_heads)
- vortex_torch.indexer.utils_sglang.plan_decode_fa3(cached_seq_lens, req_to_token, req_indices, dense_page_table, dense_cache_seqlens, sparse_page_table, sparse_cache_seqlens, ctx)
- Parameters:
cached_seq_lens (torch.Tensor)
req_to_token (torch.Tensor)
req_indices (torch.Tensor)
dense_page_table (torch.Tensor)
dense_cache_seqlens (torch.Tensor)
sparse_page_table (torch.Tensor)
sparse_cache_seqlens (torch.Tensor)
ctx (Context)