vortex_torch.indexer.utils_sglang

Functions

chunkwise_hn2nh_transpose(x, y, indptr, ...)

chunkwise_hn2nh_transpose_fa3(x, indptr, ...)

chunkwise_nh2hn_transpose(x, indptr, ...)

indices_to_page_table(page_table, ctx)

plan_decode(cached_seq_lens, req_to_token, ...)

plan_decode_fa3(cached_seq_lens, ...)

plan_prefill(cached_seq_lens, ...)

plan_prefill_fa3(cached_seq_lens, ...)

vortex_torch.indexer.utils_sglang.plan_decode(cached_seq_lens, req_to_token, req_indices, ctx)
Parameters:
  • cached_seq_lens (torch.Tensor)

  • req_to_token (torch.Tensor)

  • req_indices (torch.Tensor)

  • ctx (Context)

vortex_torch.indexer.utils_sglang.plan_prefill(cached_seq_lens, dense_kv_indptr, dense_kv_indices, input_seq_lens, qo_indptr_ragged, qo_indptr_paged, kv_last_page_len, req_to_token, req_indices, batch_table, page_size, num_kv_heads)
Parameters:
  • cached_seq_lens (torch.Tensor)

  • dense_kv_indptr (torch.Tensor)

  • dense_kv_indices (torch.Tensor)

  • input_seq_lens (torch.Tensor)

  • qo_indptr_ragged (torch.Tensor)

  • qo_indptr_paged (torch.Tensor)

  • kv_last_page_len (torch.Tensor)

  • req_to_token (torch.Tensor)

  • req_indices (torch.Tensor)

  • batch_table (torch.Tensor)

  • page_size (int)

  • num_kv_heads (int)
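
The parameter names dense_kv_indptr, kv_last_page_len and page_size resemble the paged KV-cache layout used by FlashInfer, where request i owns a contiguous range of page slots and the final page may be only partly filled. Whether plan_prefill follows exactly this convention is an assumption; the source documents only names and types. The sketch below uses plain Python lists in place of tensors, and paged_kv_layout is a hypothetical helper, not part of this module.

```python
def paged_kv_layout(seq_lens, page_size):
    """Return (kv_indptr, kv_last_page_len) for a batch of sequence lengths.

    Assumed convention (not confirmed by the source): request i owns page
    slots kv_indptr[i]:kv_indptr[i + 1], and its final page holds
    kv_last_page_len[i] valid tokens.
    """
    kv_indptr = [0]
    kv_last_page_len = []
    for n in seq_lens:
        num_pages = (n + page_size - 1) // page_size  # ceiling division
        kv_indptr.append(kv_indptr[-1] + num_pages)
        # A completely full final page reports page_size rather than 0;
        # an empty request reports 0.
        kv_last_page_len.append((n - 1) % page_size + 1 if n > 0 else 0)
    return kv_indptr, kv_last_page_len


indptr, last = paged_kv_layout([5, 16, 17], page_size=16)
```

With page_size=16, a 17-token request needs two pages and its last page holds a single token, which is why the two arrays are tracked separately.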

vortex_torch.indexer.utils_sglang.chunkwise_nh2hn_transpose(x, indptr, batch_table, num_qo_heads, num_kv_heads, head_dim)
Parameters:
  • x (torch.Tensor)

  • indptr (torch.Tensor)

  • batch_table (torch.Tensor)

  • num_qo_heads (int)

  • num_kv_heads (int)

  • head_dim (int)

Return type:

torch.Tensor
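
The source gives no description of this function, so the following is only a reading of its name: "nh" is taken to mean token-major order (tokens, then heads) and "hn" head-major order, with the reordering applied separately inside each chunk delimited by indptr. That interpretation is an assumption. The sketch works on a flat Python list, ignores batch_table, head_dim and the split between num_qo_heads and num_kv_heads, and chunkwise_nh_to_hn is a hypothetical name.

```python
def chunkwise_nh_to_hn(rows, indptr, num_heads):
    """Reorder each chunk from token-major to head-major order.

    rows holds one entry per (token, head) pair, token-major.
    indptr[i]:indptr[i + 1] is the token range of chunk i.
    This layout is assumed from the function name, not from the source.
    """
    out = []
    for start, end in zip(indptr[:-1], indptr[1:]):
        num_tokens = end - start
        base = start * num_heads  # offset of the chunk's first entry
        for h in range(num_heads):
            for t in range(num_tokens):
                out.append(rows[base + t * num_heads + h])
    return out


rows = ["t0h0", "t0h1", "t1h0", "t1h1", "t2h0", "t2h1"]
out = chunkwise_nh_to_hn(rows, indptr=[0, 2, 3], num_heads=2)
```

Here the first chunk (tokens 0 and 1) is regrouped so that both head-0 entries precede both head-1 entries, while the single-token second chunk is unchanged.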

vortex_torch.indexer.utils_sglang.chunkwise_hn2nh_transpose(x, y, indptr, batch_table, num_qo_heads, num_kv_heads, head_dim)
Parameters:
  • x (torch.Tensor)

  • y (torch.Tensor)

  • indptr (torch.Tensor)

  • batch_table (torch.Tensor)

  • num_qo_heads (int)

  • num_kv_heads (int)

  • head_dim (int)

Return type:

Tuple[torch.Tensor, torch.Tensor]

vortex_torch.indexer.utils_sglang.plan_prefill_fa3(cached_seq_lens, cu_seqlens_q, req_to_token, req_indices, page_table, batch_table, page_size, num_kv_heads)
Parameters:
  • cached_seq_lens (torch.Tensor)

  • cu_seqlens_q (torch.Tensor)

  • req_to_token (torch.Tensor)

  • req_indices (torch.Tensor)

  • page_table (torch.Tensor)

  • batch_table (torch.Tensor)

  • page_size (int)

  • num_kv_heads (int)
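
In FlashAttention's variable-length interface, cu_seqlens_q conventionally holds cumulative query lengths with a leading zero, so that request i occupies rows cu_seqlens_q[i]:cu_seqlens_q[i + 1] of the packed query tensor. That plan_prefill_fa3 expects the same layout is an assumption based on the parameter name alone. A minimal sketch with plain Python lists (cumulative_seqlens is a hypothetical helper):

```python
from itertools import accumulate


def cumulative_seqlens(seq_lens):
    """Build a cu_seqlens-style offset list: a leading 0 plus running totals."""
    return [0] + list(accumulate(seq_lens))


cu = cumulative_seqlens([3, 1, 4])
# Under the assumed convention, request i spans rows cu[i]:cu[i + 1].
spans = list(zip(cu[:-1], cu[1:]))
```

The list has one more entry than there are requests, and its last entry equals the total number of packed query tokens.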

vortex_torch.indexer.utils_sglang.plan_decode_fa3(cached_seq_lens, req_to_token, req_indices, dense_page_table, dense_cache_seqlens, sparse_page_table, sparse_cache_seqlens, ctx)
Parameters:
  • cached_seq_lens (torch.Tensor)

  • req_to_token (torch.Tensor)

  • req_indices (torch.Tensor)

  • dense_page_table (torch.Tensor)

  • dense_cache_seqlens (torch.Tensor)

  • sparse_page_table (torch.Tensor)

  • sparse_cache_seqlens (torch.Tensor)

  • ctx (Context)

vortex_torch.indexer.utils_sglang.chunkwise_hn2nh_transpose_fa3(x, indptr, batch_table, num_qo_heads, num_kv_heads, head_dim)
Parameters:
  • x (torch.Tensor)

  • indptr (torch.Tensor)

  • batch_table (torch.Tensor)

  • num_qo_heads (int)

  • num_kv_heads (int)

  • head_dim (int)

Return type:

torch.Tensor

vortex_torch.indexer.utils_sglang.indices_to_page_table(page_table, ctx)
Parameters:
  • page_table (torch.Tensor)

  • ctx (Context)