vortex_torch.indexer.scan

Classes

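`Conv1d`(weight[, dim, dtype, device])	Segmented depth-wise causal 1-D convolution over the packed sequence axis.
`Normalize`([dim])	Segmented \(L_2\) normalization over the packed sequence axis.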
`Softmax`([dim, scale])	Segmented scaled softmax over the packed sequence axis.

class Softmax(dim=0, scale=1.0)

Bases: vOp

Segmented scaled softmax over the packed sequence axis.

Math:

The leading axis is a packed concatenation of \(B\) per-request segments \(\mathcal{I}_b\) (total length \(S = \sum_b S_b\)). For each segment and each channel \((d_0,d_1)\), softmax runs within the segment:

\[Y_{s,d_0,d_1} = \frac{\exp(\text{scale}\cdot X_{s,d_0,d_1})} {\sum_{s'\in\mathcal{I}_b}\exp(\text{scale}\cdot X_{s',d_0,d_1})}, \qquad s\in\mathcal{I}_b.\]

__init__:

Softmax(dim=0, scale=1.0): dim must be 0 (the packed S axis); scale multiplies the logits before the exponential.

__call__:

y = op(x, ctx=ctx): x has shape [S, D_0, D_1]; y has the same shape.

Note:

RAGGED only.

Parameters:

dim (int)
scale (float)
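
To make the segment-local behaviour of the formula concrete, here is a minimal pure-Python reference sketch. It is not the vortex_torch kernel: it works on nested lists rather than tensors, and the `seg_lens` argument (the list of per-segment lengths \(S_b\)) is a hypothetical stand-in for the segment boundaries that the real op obtains from ctx.

```python
import math


def segmented_softmax(x, seg_lens, scale=1.0):
    """Reference sketch: softmax within each packed segment.

    x is a nested list of shape [S][D0][D1]; seg_lens (hypothetical
    argument, not part of the documented API) lists each segment length.
    """
    S = len(x)
    assert sum(seg_lens) == S, "segment lengths must cover the packed axis"
    if S == 0:
        return []
    D0, D1 = len(x[0]), len(x[0][0])
    y = [[[0.0] * D1 for _ in range(D0)] for _ in range(S)]
    start = 0
    for n in seg_lens:
        rows = range(start, start + n)
        start += n
        if n == 0:
            continue  # an empty segment contributes no rows
        for d0 in range(D0):
            for d1 in range(D1):
                # Subtracting the segment maximum keeps exp() from
                # overflowing and does not change the result.
                m = max(scale * x[s][d0][d1] for s in rows)
                e = [math.exp(scale * x[s][d0][d1] - m) for s in rows]
                z = sum(e)
                for s, v in zip(rows, e):
                    y[s][d0][d1] = v / z
    return y


# Two requests packed along the leading axis: lengths 2 and 1.
x = [[[0.0]], [[0.0]], [[1.0]]]
y = segmented_softmax(x, seg_lens=[2, 1])
print(y)
```

Each segment sums to one independently of the others, so the single-row second segment maps to 1 whatever its logit.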

class Normalize(dim=0)

Bases: vOp

Segmented \(L_2\) normalization over the packed sequence axis.

Math:

The leading axis is a packed concatenation of \(B\) per-request segments \(\mathcal{I}_b\). For each segment and each channel \((d_0,d_1)\), the values are divided by their segment-local \(L_2\) norm:

\[Y_{s,d_0,d_1} = \frac{X_{s,d_0,d_1}} {\sqrt{\sum_{s'\in\mathcal{I}_b} X_{s',d_0,d_1}^2}}, \qquad s\in\mathcal{I}_b.\]

__init__:

Normalize(dim=0): dim must be 0 (the packed S axis).

__call__:

y = op(x, ctx=ctx): x has shape [S, D_0, D_1]; y has the same shape.

Note:

RAGGED only.

Parameters:

dim (int)
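
The normalization formula can be illustrated with a small pure-Python reference sketch. It is not the vortex_torch kernel: `seg_lens` is a hypothetical argument standing in for the segment boundaries the real op reads from ctx. The documented formula is undefined for a channel that is all zeros within a segment; leaving such a channel at zero is an assumption of this sketch, not documented behaviour.

```python
import math


def segmented_l2_normalize(x, seg_lens):
    """Reference sketch: divide each channel by its segment-local L2 norm.

    x is a nested list of shape [S][D0][D1]; seg_lens (hypothetical
    argument, not part of the documented API) lists each segment length.
    """
    S = len(x)
    assert sum(seg_lens) == S, "segment lengths must cover the packed axis"
    if S == 0:
        return []
    D0, D1 = len(x[0]), len(x[0][0])
    y = [[[0.0] * D1 for _ in range(D0)] for _ in range(S)]
    start = 0
    for n in seg_lens:
        rows = range(start, start + n)
        start += n
        for d0 in range(D0):
            for d1 in range(D1):
                norm = math.sqrt(sum(x[s][d0][d1] ** 2 for s in rows))
                if norm == 0.0:
                    continue  # assumption: all-zero channels stay zero
                for s in rows:
                    y[s][d0][d1] = x[s][d0][d1] / norm
    return y


# Two requests packed along the leading axis: lengths 2 and 1.
x = [[[3.0]], [[4.0]], [[2.0]]]
print(segmented_l2_normalize(x, seg_lens=[2, 1]))
```

The first segment has norm 5 for its single channel, and the second segment is normalized on its own, so its lone value becomes 1.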

class Conv1d(weight, dim=0, dtype=torch.bfloat16, device=None)

Bases: vOp

Segmented depth-wise causal 1-D convolution over the packed sequence axis.

Math:

The leading axis is a packed concatenation of \(B\) per-request segments. Within each segment, every channel \((d_0,d_1)\) is convolved by its own \(K\)-tap causal filter \(W\in\mathbb{R}^{K\times D_0\times D_1}\):

\[Y_{s,d_0,d_1} = \sum_{k=0}^{K-1} W_{k,d_0,d_1}\, X_{s-k,\,d_0,d_1},\]

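where \(X_{s-k,\,d_0,d_1}\) is taken to be \(0\) whenever \(s-k\) falls before the start of the segment, so no values leak across segment boundaries. The op runs only on the mid-range \([b_{\text{bos}},\,S_b-b_{\text{eos}})\) of each segment (set by ctx.block_reserved_bos / ctx.block_reserved_eos); the reserved BOS/EOS rows are neither read nor written.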

__init__:

Conv1d(weight, dim=0, dtype=torch.bfloat16, device=None): weight is a nested Python list of shape [K, D_0, D_1] (kernel size \(K\) = len(weight)); dim must be 0.

__call__:

y = op(x, ctx=ctx): x has shape [S, D_0, D_1]; y has the same shape. The inner dimensions of weight must match (D_0, D_1).

Note:

RAGGED only.

Parameters:

weight (list)
dim (int)
dtype (torch.dtype)
device (torch.device | None)
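
The causal, per-segment behaviour of the convolution can be sketched in pure Python as follows. This is an illustrative reference, not the vortex_torch kernel. The arguments `seg_lens`, `bos` and `eos` are hypothetical stand-ins for what the real op reads from ctx, and `bos` / `eos` are assumed here to be row counts. Two further assumptions of the sketch: taps that would reach into the reserved BOS rows contribute zero (one reading of "neither read nor written"), and reserved output rows are left at zero.

```python
def segmented_causal_conv1d(x, weight, seg_lens, bos=0, eos=0):
    """Reference sketch: depth-wise causal convolution within each segment.

    x is a nested list [S][D0][D1]; weight is a nested list [K][D0][D1].
    seg_lens, bos and eos are hypothetical arguments, not documented API.
    """
    S, K = len(x), len(weight)
    assert sum(seg_lens) == S, "segment lengths must cover the packed axis"
    if S == 0:
        return []
    D0, D1 = len(x[0]), len(x[0][0])
    assert all(len(w) == D0 and all(len(r) == D1 for r in w) for w in weight)
    y = [[[0.0] * D1 for _ in range(D0)] for _ in range(S)]
    start = 0
    for n in seg_lens:
        # Mid-range of this segment; reserved rows outside it are skipped.
        lo, hi = start + bos, start + n - eos
        start += n
        for s in range(lo, hi):
            for d0 in range(D0):
                for d1 in range(D1):
                    acc = 0.0
                    for k in range(K):
                        # Taps before the mid-range start act as zero padding.
                        if s - k >= lo:
                            acc += weight[k][d0][d1] * x[s - k][d0][d1]
                    y[s][d0][d1] = acc
    return y


# K = 2 taps on a single channel; two requests of lengths 3 and 1.
weight = [[[1.0]], [[0.5]]]
x = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]
print(segmented_causal_conv1d(x, weight, seg_lens=[3, 1]))
```

The first row of the second segment sees only its own value, because the previous row belongs to a different request.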