vortex_torch.indexer.matmul
Classes
| GeMM | Per-page matrix–matrix product, \(O[s] = Y[s]\,X[s]^{\top}\). |
| GeMV | Per-request batched matrix–vector product, \(O = Y X^{\top}\). |
- class GeMV
Bases: vOp
Per-request batched matrix–vector product, \(O = Y X^{\top}\).
- Math:
Batched query \(X\in\mathbb{R}^{B\times 1\times D}\), packed pages \(Y\in\mathbb{R}^{S\times 1\times D}\); for page \(s\) belonging to request \(i(s)\),
\[O_{s,0,0} = \sum_{d=0}^{D-1} Y_{s,0,d}\,X_{i(s),0,d} = \langle Y_s,\, X_{i(s)} \rangle, \qquad O\in\mathbb{R}^{S\times 1\times 1}.\]
- __init__:
GeMV() takes no arguments.
- __call__:
o = op(x, y, ctx=ctx), where x is [B, 1, D] and y is [S, 1, D] (matching D); returns o of shape [S, 1, 1]. The output is BATCHED iff both inputs are BATCHED, else RAGGED.
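The GeMV formula above can be checked against a small NumPy reference. This is a minimal sketch of the math only, not the vortex_torch kernel. The function name gemv_reference and the explicit page_to_request array (standing in for the page-to-request map \(i(s)\)) are hypothetical; how the real op obtains that mapping from ctx is not shown here.

```python
import numpy as np

def gemv_reference(x, y, page_to_request):
    """Reference for O[s, 0, 0] = <Y[s, 0, :], X[i(s), 0, :]>.

    x: [B, 1, D] batched queries
    y: [S, 1, D] packed pages
    page_to_request: length-S integer array, a hypothetical stand-in for i(s)
    """
    x = np.asarray(x)
    y = np.asarray(y)
    idx = np.asarray(page_to_request)
    assert x.ndim == 3 and x.shape[1] == 1
    assert y.ndim == 3 and y.shape[1] == 1 and y.shape[2] == x.shape[2]
    assert idx.shape == (y.shape[0],)
    x_per_page = x[idx]  # gather each page's query: [S, 1, D]
    # Dot product over D, keeping the trailing axis so the result is [S, 1, 1].
    return np.sum(y * x_per_page, axis=-1, keepdims=True)

# B = 2 requests, D = 3, S = 3 pages; pages 0 and 1 belong to request 0.
x_gemv = np.array([[[1, 0, 2]], [[0, 1, 1]]])
y_gemv = np.array([[[1, 1, 1]], [[2, 0, 1]], [[3, 4, 5]]])
o_gemv = gemv_reference(x_gemv, y_gemv, page_to_request=[0, 0, 1])
print(o_gemv.shape)     # (3, 1, 1)
print(o_gemv[:, 0, 0])  # [3 4 9]
```

Each page yields one scalar, so the output keeps the page count S as its leading dimension rather than the request count B.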
- class GeMM
Bases: vOp
Per-page matrix–matrix product, \(O[s] = Y[s]\,X[s]^{\top}\).
- Math:
\(Y\in\mathbb{R}^{S\times N_y\times K}\), \(X\in\mathbb{R}^{(B\text{ or }S)\times N_x\times K}\); per page \(s\) this is \(O_s = Y_s X_s^{\top}\) (i.e. GeMM(x, y) = y xᵀ):
\[O_{s,a,b} = \sum_{k=0}^{K-1} Y_{s,a,k}\,X_{s,b,k}, \qquad O\in\mathbb{R}^{S\times N_y\times N_x}.\]
- __init__:
GeMM() takes no arguments.
- __call__:
o = op(x, y, ctx=ctx), where x is [B|S, N_x, K] and y is [S, N_y, K] (matching K); returns o of shape [S, N_y, N_x]. The output is BATCHED iff both inputs are BATCHED, else RAGGED.
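The per-page product above can be sketched the same way in NumPy. This is an illustration of the formula under stated assumptions, not the library's implementation. The name gemm_reference and the optional page_to_request argument are hypothetical. The sketch assumes that when x has leading dimension B, \(X_s\) means the matrix of the request that owns page \(s\), by analogy with the \(i(s)\) indexing in GeMV.

```python
import numpy as np

def gemm_reference(x, y, page_to_request=None):
    """Reference for O[s, a, b] = sum_k Y[s, a, k] * X[s, b, k].

    x: [S, N_x, K], or [B, N_x, K] together with page_to_request
    y: [S, N_y, K]
    page_to_request: optional length-S integer array (hypothetical), used to
        expand a per-request x to one matrix per page
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if page_to_request is not None:
        x = x[np.asarray(page_to_request)]  # [B, N_x, K] -> [S, N_x, K]
    assert x.shape[0] == y.shape[0] and x.shape[2] == y.shape[2]
    # Contract over k for every page s: this is Y_s @ X_s^T.
    return np.einsum("sak,sbk->sab", y, x)  # [S, N_y, N_x]

# B = 1 request with N_x = 2 rows, K = 2; S = 2 pages, both in request 0.
x_gemm = np.array([[[1, 0], [1, 1]]])
y_gemm = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
o_gemm = gemm_reference(x_gemm, y_gemm, page_to_request=[0, 0])
print(o_gemm.shape)  # (2, 2, 2)
print(o_gemm[0])     # [[1 3]
                     #  [3 7]]
```

Note the argument order: the op is called as op(x, y) but computes y xᵀ, so the rows of the output index y (N_y) and the columns index x (N_x).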