vortex_torch.indexer.transpose

Classes

Transpose()

Transpose dispatcher for rank-3 logical tensors.

class vortex_torch.indexer.transpose.Transpose

Bases: vOp

Transpose dispatcher for rank-3 logical tensors.

This operator transposes the last two dimensions of a rank-3 tensor while keeping the leading axis unchanged. The input is treated as

\[X \in \mathbb{R}^{S \times D_0 \times D_1},\]

and the output has logical shape

\[Y \in \mathbb{R}^{S \times D_1 \times D_0},\]

with

\[Y[s, d_1, d_0] = X[s, d_0, d_1].\]

The leading dimension \(S\) may represent a true sequence axis or a packed axis (e.g. \(S_{\text{pack}} = \sum_b S_b\)); the transpose is applied independently for each slice along that axis.
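The index mapping above can be checked with a small reference sketch. It uses plain nested lists instead of tensors and is not the library's kernel; the helper name `transpose_last_two` is made up for illustration. The leading axis here stands for three packed slices, each transposed on its own.

```python
def transpose_last_two(x):
    """Return y with y[s][d1][d0] == x[s][d0][d1] for a rank-3 nested list."""
    # zip(*slice_) turns the rows of one D_0 x D_1 slice into its columns.
    return [[list(col) for col in zip(*slice_)] for slice_ in x]


# Leading axis S = 3 (for example, sequences of length 1 and 2 packed together);
# every slice has shape D_0 x D_1 = 2 x 3.
x = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
    [[13, 14, 15], [16, 17, 18]],
]
y = transpose_last_two(x)

# The output has logical shape S x D_1 x D_0 = 3 x 3 x 2.
shape = (len(y), len(y[0]), len(y[0][0]))
```

On a dense `torch.Tensor`, the same mapping is what `x.transpose(1, 2)` computes.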

Dispatch is keyed only by the input tensor format x._format.

_impl_map

Dispatch table keyed by x_format. Each entry maps to (callable_impl, resolved_output_format).

Type:

Dict[FORMAT, Tuple[Callable, FORMAT]]
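A minimal sketch of how a table of this type might be consulted, assuming a plain dictionary lookup. The `FORMAT` enum below is a hypothetical stand-in with invented member names, and `_transpose_a` and `resolve` are illustrative helpers rather than part of vortex_torch.

```python
from enum import Enum, auto
from typing import Callable, Dict, Tuple


class FORMAT(Enum):
    # Hypothetical members; the real FORMAT values are not shown in this page.
    FORMAT_A = auto()
    FORMAT_B = auto()


def _transpose_a(x):
    # Placeholder implementation: swap the last two axes of a nested list.
    return [[list(col) for col in zip(*slice_)] for slice_ in x]


# Each input format maps to (implementation, resolved output format).
_impl_map: Dict[FORMAT, Tuple[Callable, FORMAT]] = {
    FORMAT.FORMAT_A: (_transpose_a, FORMAT.FORMAT_A),
}


def resolve(x_format: FORMAT) -> Tuple[Callable, FORMAT]:
    """Look up the implementation and output format for one input format."""
    assert x_format in _impl_map, f"no transpose implementation for {x_format}"
    return _impl_map[x_format]


impl, output_format = resolve(FORMAT.FORMAT_A)
```

Keying on the input format alone keeps the lookup a single dictionary access, and an unregistered format fails early with an assertion.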

impl

The resolved implementation selected during profile().

Type:

Optional[Callable]

output_format

The output tensor format as determined in profile().

Type:

Optional[FORMAT]

output_buffer

Preallocated output tensor buffer with logical shape [S, D_1, D_0].

Type:

Optional[torch.Tensor]

profile(x, ctx)

Validate the input, select an implementation, allocate the output buffer, and return a vTensor view with the resolved format.

The input tensor is expected to have logical shape [S_in, D_0, D_1]. The output buffer is allocated with shape

\[[S, D_1, D_0],\]

where \(S\) is taken from ctx.max_num_pages rather than from S_in, so that the buffer matches the runtime configuration for the leading dimension.

Parameters:
  • x (vTensor) – Input tensor to be transposed, with logical shape [S_in, D_0, D_1].

  • ctx (Context) – Execution context providing ctx.max_num_pages for the leading dimension and tracking auxiliary memory usage.

Returns:

A vTensor view wrapping the internally allocated output buffer with the resolved output format.

Return type:

vTensor

Raises:

AssertionError – If x is not a vTensor, if its rank is not 3, or if no implementation is registered for x._format.

execute(x, ctx)

Run the selected transpose implementation and return the output buffer.

The implementation transposes the last two dimensions of x into the internal buffer stored in output_buffer, leaving the leading dimension unchanged.

Parameters:
  • x (torch.Tensor) – Input tensor to be transposed, on the same device as the internal output buffer.

  • ctx (Context) – Execution context passed through to the implementation.

Returns:

The internally allocated output tensor with shape [S, D_1, D_0].

Return type:

torch.Tensor

Raises:

AssertionError – If profile() has not been called, so that the internal output buffer or the selected implementation is not available.
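The two-phase contract described for profile() and execute() can be illustrated with a toy class. This is a sketch under stated assumptions, not the real operator: `ToyTranspose` is hypothetical, uses nested lists in place of tensors, and takes the leading extent as a plain `max_num_pages` argument instead of a Context.

```python
class ToyTranspose:
    """Hypothetical stand-in for the profile/execute split; not the real vOp."""

    def __init__(self):
        self.output_buffer = None

    def profile(self, shape, max_num_pages):
        """Allocate the output buffer once and return its logical shape."""
        assert len(shape) == 3, "expected a rank-3 logical shape"
        _, d0, d1 = shape
        # The leading extent comes from the runtime configuration,
        # not from the leading extent of the input.
        self.output_buffer = [
            [[0] * d0 for _ in range(d1)] for _ in range(max_num_pages)
        ]
        return (max_num_pages, d1, d0)

    def execute(self, x):
        """Write the transposed slices into the preallocated buffer."""
        assert self.output_buffer is not None, "call profile() before execute()"
        for s, slice_ in enumerate(x):
            for i, row in enumerate(slice_):
                for j, value in enumerate(row):
                    self.output_buffer[s][j][i] = value
        return self.output_buffer


op = ToyTranspose()
out_shape = op.profile(shape=(1, 2, 3), max_num_pages=4)
out = op.execute([[[1, 2, 3], [4, 5, 6]]])
```

In this sketch, allocating in profile() means execute() performs no allocation and returns the same buffer object on every call.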