vortex_torch.cache.elementwise

Classes

Abs([alpha, beta])

Absolute-value transform of an affine argument.

Add_Mul([alpha, beta])

Affine transformation \(y = \beta x + \alpha\).

Elementwise([alpha, beta])

Unary elementwise operator dispatcher (e.g. ReLU/Sigmoid/SiLU/Abs/Affine).

Relu([alpha, beta])

Piecewise ReLU-like activation.

Sigmoid([alpha, beta])

Sigmoid activation with configurable shift and slope.

Silu([alpha, beta])

SiLU-like activation with configurable shift and slope.

class vortex_torch.cache.elementwise.Elementwise(alpha=0.0, beta=1.0)[source]

Bases: vOp

Unary elementwise operator dispatcher (e.g. ReLU/Sigmoid/SiLU/Abs/Affine).

This class dispatches a family of unary elementwise operations on rank-3 tensors. The input is treated as

\[X \in \mathbb{R}^{B \times N \times D},\]

where:

  • \(B\) is a leading batch-like axis (for example, max_new_tokens_per_batch * head_num coming from the runtime context),

  • \(N\) is a sequence or position dimension, and

  • \(D\) is a feature/channel dimension.

The operation is applied pointwise:

\[Y[b, n, d] = f(X[b, n, d]; \alpha, \beta, \text{op_type}),\]

where the actual function \(f\) is selected by op_type, and may make use of scalar parameters alpha and beta (for example, in affine or activation variants).

Dispatch is based on the pair of tensor formats (x_format, o_format) and a registry mapping:

(x_format, o_format) -> (impl, resolved_output_format)

Policy

  • If output is None, profile() selects an implementation with o_format == FORMAT.RAGGED, allocates an internal buffer of logical shape [B, N, D], and returns a vTensor view.

  • If output is provided, profile() requires an exact (x_fmt, o_fmt) mapping in _impl_map and validates shape/device consistency.

  • The logical (N, D) axes are preserved by design; only the leading B comes from the runtime context.
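The registry-based dispatch above can be sketched with a toy table in plain Python. The `FORMAT` values and kernel names here are stand-ins, not the library's actual identifiers:

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for the FORMAT enum; the real values live in vortex_torch.
RAGGED, PAGED = "RAGGED", "PAGED"

def ragged_to_ragged(x):
    # Placeholder kernel: the real implementation is an elementwise GPU kernel.
    return x

# Dispatch table keyed by (x_format, o_format), mirroring _impl_map:
# each entry resolves to (callable_impl, resolved_output_format).
impl_map: Dict[Tuple[str, str], Tuple[Callable, str]] = {
    (RAGGED, RAGGED): (ragged_to_ragged, RAGGED),
}

# Lookup, as profile() would do; a missing key means no implementation exists.
impl, out_fmt = impl_map[(RAGGED, RAGGED)]
```

An unmatched `(x_format, o_format)` pair raises `KeyError` here, which corresponds to the `AssertionError` that profile() raises when no entry is found in `_impl_map`.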

_impl_map

Dispatch table keyed by (x_format, o_format). Each entry maps to (callable_impl, resolved_output_format).

Type:

Dict[Tuple[FORMAT, FORMAT], Tuple[Callable, FORMAT]]

alpha

Scalar parameter used by certain unary ops.

Type:

float

beta

Scalar parameter used by certain unary ops.

Type:

float

op_type

Runtime-set enum/int describing the specific elementwise operation.

Type:

Optional[ElementwiseOpType]

impl

The resolved implementation selected during profile().

Type:

Optional[Callable]

output_format

The output tensor format as determined in profile().

Type:

Optional[FORMAT]

output_buffer

Internal output buffer allocated when output is None.

Type:

Optional[torch.Tensor]

profile(x, output, loc, ctx)[source]

Validate inputs, select implementation and output format, and optionally allocate an internal output buffer.

The input tensor x is expected to have logical shape [B, N, D]. The auxiliary tensor loc carries per-position metadata used by the implementation (for example, mapping positions to segments or other runtime indices); its exact shape and semantics are defined by the kernel.

There are two modes:

  • No output provided (output is None):

    • Select an implementation for (x._format, FORMAT.RAGGED).

    • Allocate an internal buffer with shape [B, N, D], where

      \[B = \text{ctx.max_new_tokens_per_batch} \times \text{ctx.head_num}.\]

    • Wrap it as a vTensor with the resolved output format.

  • Output provided (output is not None):

    • Require an exact mapping for (x._format, output._format).

    • Validate that output has rank 3 and preserves the (N, D) dimensions of x.

    • Validate device consistency between x and output.

Parameters:
  • x (vTensor) – Input tensor with logical shape [B, N, D].

  • output (Optional[vTensor]) – Optional preallocated output tensor. If None, an internal buffer is allocated; otherwise, this tensor must have shape [B_out, N, D] for some B_out and a format compatible with _impl_map.

  • loc (torch.Tensor) – Auxiliary tensor carrying per-position metadata required by the implementation (e.g., location/segment indices).

  • ctx (Context) – Execution context that provides the runtime value of B (via ctx.max_new_tokens_per_batch and ctx.head_num) and is used for auxiliary memory accounting.

Returns:

A vTensor view representing the resolved output: either the provided output or an internally allocated buffer.

Return type:

vTensor

Raises:

AssertionError – If types, ranks, formats, shapes, or devices are incompatible, or if no implementation is found in _impl_map.

execute(x, output, loc, ctx)[source]

Execute the selected unary elementwise implementation.

This method assumes that profile() has already selected an implementation and, if needed, allocated an internal output buffer.

Parameters:
  • x (torch.Tensor) – Plain input tensor with shape consistent with the vTensor validated in profile().

  • output (Optional[torch.Tensor]) – Optional preallocated output tensor. If None, the internal buffer created during profile() will be used.

  • loc (torch.Tensor) – Auxiliary tensor carrying per-position metadata required by the implementation (e.g., location/segment indices).

  • ctx (Context) – Execution context forwarded to the implementation.

Returns:

The output tensor written by the implementation: either the provided output or the internal buffer.

Return type:

torch.Tensor

Raises:

AssertionError – If profile() has not been called, leaving no resolved implementation or internal buffer available.
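The two-phase profile()/execute() contract can be modeled with a minimal mock. This is a toy illustration of the call order and buffer reuse, not the real class; the list-based "buffer" stands in for the internally allocated tensor:

```python
class MiniElementwise:
    """Toy model of the profile()/execute() contract (not the real vOp class)."""

    def __init__(self, fn):
        self.fn = fn
        self.impl = None           # resolved during profile()
        self.output_buffer = None  # allocated during profile() when output is None

    def profile(self, x):
        # Select the implementation and allocate an internal output buffer.
        self.impl = self.fn
        self.output_buffer = [0.0] * len(x)
        return self.output_buffer

    def execute(self, x, output=None):
        # Assumes profile() already ran, mirroring the documented precondition.
        assert self.impl is not None, "profile() must be called before execute()"
        out = output if output is not None else self.output_buffer
        for i, v in enumerate(x):
            out[i] = self.impl(v)
        return out

op = MiniElementwise(lambda v: max(v, 0.0))  # ReLU-like scalar function
op.profile([1.0, -2.0, 3.0])
y = op.execute([1.0, -2.0, 3.0])  # → [1.0, 0.0, 3.0]
```

Calling execute() before profile() trips the assertion, matching the Raises clause above.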

class vortex_torch.cache.elementwise.Relu(alpha=0.0, beta=0.0)[source]

Bases: Elementwise

Piecewise ReLU-like activation.

This operator applies, elementwise, the scalar function

\[\begin{split}f(x; \alpha, \beta) = \begin{cases} x, & x \ge \alpha, \\ \beta, & x < \alpha. \end{cases}\end{split}\]

Given an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is defined by

\[Y[b, n, d] = f\bigl(X[b, n, d]; \alpha, \beta\bigr).\]
Parameters:
  • alpha (float, optional) – Threshold value \(\alpha\). Inputs greater than or equal to this threshold are passed through unchanged. Default is 0.0.

  • beta (float, optional) – Fallback value \(\beta\) used when \(x < \alpha\). Default is 0.0.
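The piecewise definition above can be sketched as a scalar reference function in plain Python (the actual kernel applies it pointwise over rank-3 tensors; `relu_ref` is an illustrative name, not a library API):

```python
def relu_ref(x: float, alpha: float = 0.0, beta: float = 0.0) -> float:
    # Pass inputs through unchanged at or above the threshold alpha;
    # otherwise emit the constant fallback beta.
    return x if x >= alpha else beta
```

With the defaults (alpha=0.0, beta=0.0) this reduces to the standard ReLU.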

class vortex_torch.cache.elementwise.Silu(alpha=0.0, beta=0.0)[source]

Bases: Elementwise

SiLU-like activation with configurable shift and slope.

This operator applies, elementwise, the scalar function

\[\operatorname{SiLU}(x; \alpha, \beta) = \frac{x}{1 + \exp(\beta x + \alpha)}.\]

Given an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is

\[Y[b, n, d] = \operatorname{SiLU}\bigl(X[b, n, d]; \alpha, \beta\bigr).\]
Parameters:
  • alpha (float, optional) – Bias term \(\alpha\) added inside the exponential. Default is 0.0.

  • beta (float, optional) – Slope \(\beta\) multiplying \(x\) inside the exponential. Default is 0.0.
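A scalar reference sketch of the formula above (`silu_ref` is an illustrative name, not a library API). Note that the textbook SiLU \(x \cdot \sigma(x)\) is recovered with \(\beta = -1\), \(\alpha = 0\):

```python
import math

def silu_ref(x: float, alpha: float = 0.0, beta: float = 0.0) -> float:
    # SiLU(x; alpha, beta) = x / (1 + exp(beta*x + alpha))
    return x / (1.0 + math.exp(beta * x + alpha))
```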

class vortex_torch.cache.elementwise.Sigmoid(alpha=0.0, beta=0.0)[source]

Bases: Elementwise

Sigmoid activation with configurable shift and slope.

This operator applies, elementwise, the scalar function

\[\sigma(x; \alpha, \beta) = \frac{1}{1 + \exp(\beta x + \alpha)}.\]

Given an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is

\[Y[b, n, d] = \sigma\bigl(X[b, n, d]; \alpha, \beta\bigr).\]
Parameters:
  • alpha (float, optional) – Bias term \(\alpha\) added inside the exponential. Default is 0.0.

  • beta (float, optional) – Slope \(\beta\) multiplying \(x\) inside the exponential. Default is 0.0.
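A scalar reference sketch of the formula above (`sigmoid_ref` is an illustrative name, not a library API). The textbook logistic sigmoid corresponds to \(\beta = -1\), \(\alpha = 0\); with the documented defaults \(\beta = 0\), the function is constant at 0.5:

```python
import math

def sigmoid_ref(x: float, alpha: float = 0.0, beta: float = 0.0) -> float:
    # sigma(x; alpha, beta) = 1 / (1 + exp(beta*x + alpha))
    return 1.0 / (1.0 + math.exp(beta * x + alpha))
```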

class vortex_torch.cache.elementwise.Add_Mul(alpha=0.0, beta=1.0)[source]

Bases: Elementwise

Affine transformation \(y = \beta x + \alpha\).

This operator applies, elementwise, the scalar function

\[f(x; \alpha, \beta) = \beta x + \alpha.\]

For an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is

\[Y[b, n, d] = \beta \, X[b, n, d] + \alpha.\]
Parameters:
  • alpha (float, optional) – Additive term \(\alpha\) in the affine transform. Default is 0.0.

  • beta (float, optional) – Multiplicative term \(\beta\) in the affine transform. Default is 1.0.
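A scalar reference sketch of the affine transform above (`add_mul_ref` is an illustrative name, not a library API):

```python
def add_mul_ref(x: float, alpha: float = 0.0, beta: float = 1.0) -> float:
    # f(x; alpha, beta) = beta*x + alpha
    return beta * x + alpha
```

With the defaults (alpha=0.0, beta=1.0) the transform is the identity.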

class vortex_torch.cache.elementwise.Abs(alpha=0.0, beta=1.0)[source]

Bases: Elementwise

Absolute-value transform of an affine argument.

This operator applies, elementwise, the scalar function

\[f(x; \alpha, \beta) = \bigl|\beta x + \alpha\bigr|.\]

For an input tensor \(X \in \mathbb{R}^{B \times N \times D}\), the output is

\[Y[b, n, d] = \bigl|\beta \, X[b, n, d] + \alpha\bigr|.\]
Parameters:
  • alpha (float, optional) – Additive term \(\alpha\) inside the absolute value. Default is 0.0.

  • beta (float, optional) – Multiplicative term \(\beta\) inside the absolute value. Default is 1.0.
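A scalar reference sketch of the formula above (`abs_ref` is an illustrative name, not a library API): the affine argument is formed first, then its absolute value is taken.

```python
def abs_ref(x: float, alpha: float = 0.0, beta: float = 1.0) -> float:
    # f(x; alpha, beta) = |beta*x + alpha|
    return abs(beta * x + alpha)
```

With the defaults (alpha=0.0, beta=1.0) this reduces to the plain absolute value.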