vortex_torch.cache.elementwise¶
Classes
| Abs | Absolute-value transform of an affine argument. |
| Add_Mul | Affine transformation \(y = \beta x + \alpha\). |
| Elementwise | Unary elementwise operator dispatcher (e.g. ReLU/Sigmoid/SiLU/Abs/Affine). |
| Relu | Piecewise ReLU-like activation. |
| Sigmoid | Sigmoid activation with configurable shift and slope. |
| Silu | SiLU-like activation with configurable shift and slope. |
- class vortex_torch.cache.elementwise.Elementwise(alpha=0.0, beta=1.0)[source]¶
Bases: vOp
Unary elementwise operator dispatcher (e.g. ReLU/Sigmoid/SiLU/Abs/Affine).
This class dispatches a family of unary elementwise operations on rank-3 tensors. The input is treated as
\[X \in \mathbb{R}^{B \times N \times D},\]where:
\(B\) is a leading batch-like axis (for example, max_new_tokens_per_batch * head_num coming from the runtime context),
\(N\) is a sequence or position dimension, and
\(D\) is a feature/channel dimension.
The operation is applied pointwise:
\[Y[b, n, d] = f(X[b, n, d]; \alpha, \beta, \text{op\_type}),\]
where the actual function \(f\) is selected by op_type and may make use of the scalar parameters alpha and beta (for example, in affine or activation variants).
Dispatch is based on the pair of tensor formats (x_format, o_format) and a registry mapping:
(x_format, o_format) -> (impl, resolved_output_format)
Policy¶
- If output is None, profile() selects an implementation with o_format == FORMAT.RAGGED, allocates an internal buffer of logical shape [B, N, D], and returns a vTensor view.
- If output is provided, profile() requires an exact (x_fmt, o_fmt) mapping in _impl_map and validates shape/device consistency.
- The logical (N, D) axes are preserved by design; only the leading B comes from the runtime context.
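The dispatch policy above can be sketched in plain Python. This is a minimal illustration, not the actual vortex_torch code: FORMAT, the placeholder kernel, and select_impl are hypothetical stand-ins for the real registry machinery.

```python
from enum import Enum, auto

class FORMAT(Enum):
    # Hypothetical tensor formats standing in for the real ones.
    CONTIGUOUS = auto()
    RAGGED = auto()

def _contig_to_ragged(x):
    # Placeholder kernel; the real implementations are fused device kernels.
    return list(x)

# Registry mapping (x_format, o_format) -> (impl, resolved_output_format).
_impl_map = {
    (FORMAT.CONTIGUOUS, FORMAT.RAGGED): (_contig_to_ragged, FORMAT.RAGGED),
}

def select_impl(x_format, o_format=None):
    # Mirrors the documented policy: with no output format, fall back to a
    # RAGGED-output implementation; otherwise require an exact mapping.
    key = (x_format, o_format if o_format is not None else FORMAT.RAGGED)
    assert key in _impl_map, f"no implementation for {key}"
    return _impl_map[key]
```

A missing (x_fmt, o_fmt) pair fails the assertion, matching the AssertionError behavior documented for profile() below.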
- _impl_map¶
Dispatch table keyed by (x_format, o_format). Each entry maps to (callable_impl, resolved_output_format).
- op_type¶
Runtime-set enum/int describing the specific elementwise operation.
- Type:
Optional[ElementwiseOpType]
- output_buffer¶
Internal output buffer allocated when output is None.
- Type:
Optional[torch.Tensor]
- profile(x, output, loc, ctx)[source]¶
Validate inputs, select implementation and output format, and optionally allocate an internal output buffer.
The input tensor x is expected to have logical shape [B, N, D]. The auxiliary tensor loc carries per-position metadata used by the implementation (for example, mapping positions to segments or other runtime indices); its exact shape and semantics are defined by the kernel.
There are two modes:
No output provided (output is None):
- Select an implementation for (x._format, FORMAT.RAGGED).
- Allocate an internal buffer with shape [B, N, D], where \[B = \text{ctx.max\_new\_tokens\_per\_batch} \times \text{ctx.head\_num}.\]
- Wrap it as a vTensor with the resolved output format.
Output provided (output is not None):
- Require an exact mapping for (x._format, output._format).
- Validate that output has rank 3 and preserves the (N, D) dimensions of x.
- Validate device consistency between x and output.
- Parameters:
x (vTensor) – Input tensor with logical shape [B, N, D].
output (Optional[vTensor]) – Optional preallocated output tensor. If None, an internal buffer is allocated; otherwise, this tensor must have shape [B_out, N, D] for some B_out and a format compatible with _impl_map.
loc (torch.Tensor) – Auxiliary tensor carrying per-position metadata required by the implementation (e.g., location/segment indices).
ctx (Context) – Execution context that provides the runtime value of B (via ctx.max_new_tokens_per_batch and ctx.head_num) and is used for auxiliary memory accounting.
- Returns:
A vTensor view representing the resolved output: either the provided output or an internally allocated buffer.
- Return type:
vTensor
- Raises:
AssertionError – If types, ranks, formats, shapes, or devices are incompatible, or if no implementation is found in _impl_map.
- execute(x, output, loc, ctx)[source]¶
Execute the selected unary elementwise implementation.
This method assumes that profile() has already selected an implementation and, if needed, allocated an internal output buffer.
- Parameters:
x (torch.Tensor) – Plain input tensor with shape consistent with the vTensor validated in profile().
output (Optional[torch.Tensor]) – Optional preallocated output tensor. If None, the internal buffer created during profile() will be used.
loc (torch.Tensor) – Auxiliary tensor carrying per-position metadata required by the implementation (e.g., location/segment indices).
ctx (Context) – Execution context forwarded to the implementation.
- Returns:
The output tensor written by the implementation: either the provided output or the internal buffer.
- Return type:
torch.Tensor
- Raises:
AssertionError – If profile() has not been called and no implementation or internal buffer is available.
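The profile()/execute() two-phase protocol can be illustrated with a small pure-Python sketch. Context, TwoPhaseOp, and the nested-list "tensor" below are hypothetical stand-ins for the real vTensor/torch.Tensor machinery; only the shape arithmetic (B = max_new_tokens_per_batch * head_num) comes from the documentation above.

```python
from dataclasses import dataclass

@dataclass
class Context:
    # Stand-in for the runtime context fields named in the docs.
    max_new_tokens_per_batch: int
    head_num: int

class TwoPhaseOp:
    def __init__(self):
        self.output_buffer = None

    def profile(self, n, d, ctx):
        # The leading B is derived from the runtime context, as documented.
        b = ctx.max_new_tokens_per_batch * ctx.head_num
        self.output_buffer = [[[0.0] * d for _ in range(n)] for _ in range(b)]
        return self.output_buffer

    def execute(self, x, f):
        # execute() assumes profile() already allocated the internal buffer.
        assert self.output_buffer is not None, "profile() must run first"
        for bi, plane in enumerate(x):
            for ni, row in enumerate(plane):
                for di, v in enumerate(row):
                    self.output_buffer[bi][ni][di] = f(v)
        return self.output_buffer
```

Calling execute() before profile() trips the assertion, mirroring the AssertionError documented above.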
- class vortex_torch.cache.elementwise.Relu(alpha=0.0, beta=0.0)[source]¶
Bases: Elementwise
Piecewise ReLU-like activation.
This operator applies, elementwise, the scalar function
\[\begin{split}f(x; \alpha, \beta) = \begin{cases} x, & x \ge \alpha, \\ \beta, & x < \alpha. \end{cases}\end{split}\]
Given an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is defined by
\[Y[b, n, d] = f\bigl(X[b, n, d]; \alpha, \beta\bigr).\]
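The piecewise rule can be checked with a one-line pure-Python reference; the function name is illustrative, not part of the vortex_torch API.

```python
def relu_like(x, alpha=0.0, beta=0.0):
    # Documented rule: pass x through when x >= alpha, else emit the
    # constant beta. With the defaults this is the standard ReLU.
    return x if x >= alpha else beta
```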
- class vortex_torch.cache.elementwise.Silu(alpha=0.0, beta=0.0)[source]¶
Bases: Elementwise
SiLU-like activation with configurable shift and slope.
This operator applies, elementwise, the scalar function
\[\operatorname{SiLU}(x; \alpha, \beta) = \frac{x}{1 + \exp(\beta x + \alpha)}.\]
Given an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is
\[Y[b, n, d] = \operatorname{SiLU}\bigl(X[b, n, d]; \alpha, \beta\bigr).\]
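A pure-Python reference of the documented scalar formula (the function name is illustrative, not part of the API). Note that with \(\alpha = 0\) and \(\beta = -1\) the formula reduces to the standard SiLU \(x \cdot \sigma(x)\).

```python
import math

def silu_like(x, alpha=0.0, beta=0.0):
    # Documented formula: x / (1 + exp(beta * x + alpha)).
    return x / (1.0 + math.exp(beta * x + alpha))
```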
- class vortex_torch.cache.elementwise.Sigmoid(alpha=0.0, beta=0.0)[source]¶
Bases: Elementwise
Sigmoid activation with configurable shift and slope.
This operator applies, elementwise, the scalar function
\[\sigma(x; \alpha, \beta) = \frac{1}{1 + \exp(\beta x + \alpha)}.\]
Given an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is
\[Y[b, n, d] = \sigma\bigl(X[b, n, d]; \alpha, \beta\bigr).\]
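A pure-Python reference of the documented scalar formula (the name is illustrative, not part of the API). With \(\alpha = 0\) and \(\beta = -1\) it reduces to the standard logistic sigmoid \(1 / (1 + e^{-x})\).

```python
import math

def sigmoid_like(x, alpha=0.0, beta=0.0):
    # Documented formula: 1 / (1 + exp(beta * x + alpha)).
    return 1.0 / (1.0 + math.exp(beta * x + alpha))
```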
- class vortex_torch.cache.elementwise.Add_Mul(alpha=0.0, beta=1.0)[source]¶
Bases: Elementwise
Affine transformation \(y = \beta x + \alpha\).
This operator applies, elementwise, the scalar function
\[f(x; \alpha, \beta) = \beta x + \alpha.\]
For an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is
\[Y[b, n, d] = \beta \, X[b, n, d] + \alpha.\]
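A one-line pure-Python reference of the affine rule (the name is illustrative, not part of the API). The defaults \(\alpha = 0, \beta = 1\) make it the identity.

```python
def add_mul(x, alpha=0.0, beta=1.0):
    # Documented formula: beta * x + alpha.
    return beta * x + alpha
```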
- class vortex_torch.cache.elementwise.Abs(alpha=0.0, beta=1.0)[source]¶
Bases: Elementwise
Absolute-value transform of an affine argument.
This operator applies, elementwise, the scalar function
\[f(x; \alpha, \beta) = \bigl|\beta x + \alpha\bigr|.\]
For an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is
\[Y[b, n, d] = \bigl|\beta \, X[b, n, d] + \alpha\bigr|.\]
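A one-line pure-Python reference of this rule (the name is illustrative, not part of the API). The defaults \(\alpha = 0, \beta = 1\) reduce it to a plain absolute value.

```python
def abs_affine(x, alpha=0.0, beta=1.0):
    # Documented formula: |beta * x + alpha|.
    return abs(beta * x + alpha)
```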