Framework integrations

BlazeRPC provides optional helpers for PyTorch, TensorFlow, and ONNX Runtime. These handle the conversion between NumPy arrays (BlazeRPC's wire format) and framework-specific tensor types so you can write model code in the framework's native API.

PyTorch

Install the extra:

uv add "blazerpc[pytorch]"

@torch_model decorator

The @torch_model decorator converts NumPy inputs to PyTorch tensors before your function runs, and converts the PyTorch tensor output back to NumPy when it returns:

import numpy as np
from blazerpc import BlazeApp, TensorInput, TensorOutput
from blazerpc.contrib.pytorch import torch_model

app = BlazeApp()

@app.model("classifier")
@torch_model(device="cuda")
def classify(
    image: TensorInput[np.float32, "batch", 3, 224, 224],
) -> TensorOutput[np.float32, "batch", 1000]:
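    # `image` arrives as a torch.Tensor on the CUDA device.
    # `model` is assumed to be a torch.nn.Module loaded elsewhere on the same device.
    # The returned tensor is converted back to np.ndarray automatically.
    return model(image)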

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| device | str | "cpu" | Target device ("cpu", "cuda", "cuda:0"). |

The decorator can be used with or without arguments:

@torch_model           # Uses default device="cpu"
@torch_model()         # Same as above
@torch_model(device="cuda:1")  # Specify a GPU
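
A decorator that works both bare and with arguments usually checks whether its first positional argument is the decorated function. The sketch below shows that pattern in plain Python, together with the convert-in, call, convert-out flow described earlier. The names `convert_model`, `to_native`, and `to_wire` are hypothetical and chosen for illustration; this is not BlazeRPC's implementation.

```python
import functools


def convert_model(func=None, *, to_native=lambda x: x, to_wire=lambda x: x):
    """Hypothetical decorator usable as @convert_model or @convert_model(...)."""

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Convert every input before the wrapped function sees it.
            args = tuple(to_native(a) for a in args)
            kwargs = {k: to_native(v) for k, v in kwargs.items()}
            # Convert the result back on the way out.
            return to_wire(fn(*args, **kwargs))

        return wrapper

    if func is not None:
        # Bare form: @convert_model passes the function directly.
        return decorate(func)
    # Called form: @convert_model() or @convert_model(to_native=...).
    return decorate


@convert_model
def identity(x):
    return x


@convert_model(to_native=float, to_wire=str)
def double(x):
    return x * 2
```

Making the options keyword-only keeps the bare form unambiguous: the only positional argument the decorator can ever receive is the function being decorated.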

Standalone conversion functions

If you need more control, use the conversion functions directly:

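import numpy as np
import torch
from blazerpc.contrib.pytorch import torch_to_numpy, numpy_to_torch

arr = np.zeros((1, 3, 224, 224), dtype=np.float32)  # example input

# NumPy -> PyTorch (placed on the GPU and cast to float16)
tensor = numpy_to_torch(arr, device="cuda", dtype=torch.float16)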

# PyTorch -> NumPy (detaches from graph, moves to CPU)
array = torch_to_numpy(tensor)

TensorFlow

Install the extra:

uv add "blazerpc[tensorflow]"

@tf_model decorator

The @tf_model decorator works the same way as @torch_model: it converts NumPy inputs to TensorFlow tensors before your function runs and converts the returned tensor back to NumPy:

import numpy as np
from blazerpc import BlazeApp, TensorInput, TensorOutput
from blazerpc.contrib.tensorflow import tf_model

app = BlazeApp()

@app.model("classifier")
@tf_model
def classify(
    image: TensorInput[np.float32, "batch", 224, 224, 3],
) -> TensorOutput[np.float32, "batch", 1000]:
    # `image` is a tf.Tensor; `model` is assumed to be a TensorFlow model loaded elsewhere
    return model(image)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dtype | Any | None | Optional TensorFlow dtype to cast inputs to. |

Standalone conversion functions

import tensorflow as tf
from blazerpc.contrib.tensorflow import tf_to_numpy, numpy_to_tf

# NumPy -> TensorFlow (`arr` is any np.ndarray)
tensor = numpy_to_tf(arr, dtype=tf.float16)

# TensorFlow -> NumPy
array = tf_to_numpy(tensor)

ONNX Runtime

Install the extra:

uv add "blazerpc[onnx]"

ONNXModel wrapper

ONNXModel manages an ONNX Runtime inference session and exposes a simple predict() method:

import numpy as np
from blazerpc import BlazeApp, TensorInput, TensorOutput
from blazerpc.contrib.onnx import ONNXModel

onnx_model = ONNXModel(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

app = BlazeApp()

@app.model("classifier")
def classify(
    image: TensorInput[np.float32, "batch", 3, 224, 224],
) -> TensorOutput[np.float32, "batch", 1000]:
    # predict() returns a list of output arrays; this model has a single output
    return onnx_model.predict(image)[0]

| Constructor parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model_path | str \| Path | required | Path to the .onnx file. |
| providers | list[str] \| None | ["CPUExecutionProvider"] | Execution providers. |
| session_options | Any | None | Optional ort.SessionOptions. |
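
ONNX Runtime treats the providers list as a priority order, so a GPU build typically lists "CUDAExecutionProvider" first and "CPUExecutionProvider" as the fallback. If you would rather pass only providers that are actually present on the host, a small filter like the one below may help. `select_providers` is a hypothetical helper, not part of BlazeRPC; in practice the `available` list would come from `onnxruntime.get_available_providers()`.

```python
def select_providers(preferred, available):
    """Keep the preferred providers that are available, preserving priority order."""
    chosen = [p for p in preferred if p in available]
    # Fall back to the CPU provider when nothing preferred is available.
    return chosen or ["CPUExecutionProvider"]


# `available` is hard-coded here so the sketch runs without ONNX Runtime
# installed; a real deployment would query the runtime instead.
available = ["CPUExecutionProvider"]
providers = select_providers(
    ["CUDAExecutionProvider", "CPUExecutionProvider"], available
)
print(providers)  # ['CPUExecutionProvider']
```

The resulting list can then be passed as the providers argument of ONNXModel.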

Positional inputs

predict() matches positional arguments to input names in order:

results = onnx_model.predict(input_1, input_2)
# Returns a list of output arrays
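
Conceptually, positional matching pairs the first argument with the first input name, the second with the second, and so on. The sketch below illustrates that pairing in plain Python. `bind_positional` is a hypothetical helper, and raising on a count mismatch is an assumption; the source does not say how BlazeRPC handles that case.

```python
def bind_positional(input_names, args):
    """Pair positional arrays with model input names, in declaration order."""
    if len(args) != len(input_names):
        # Assumed behavior for illustration: reject a wrong number of inputs.
        raise ValueError(
            f"expected {len(input_names)} inputs {input_names}, got {len(args)}"
        )
    return dict(zip(input_names, args))


# Plain lists stand in for NumPy arrays so the sketch has no dependencies.
feed = bind_positional(["input_ids", "attention_mask"], [[101, 2023], [1, 1]])
print(feed)  # {'input_ids': [101, 2023], 'attention_mask': [1, 1]}
```

Because the pairing depends on order, models with several inputs are often easier to call by name.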

Named inputs

predict_dict() accepts a dictionary of named inputs and returns a dictionary of named outputs:

results = onnx_model.predict_dict({
    "input_ids": input_ids_array,
    "attention_mask": attention_mask_array,
})
# Returns {"output_name": array, ...}

Introspection

print(onnx_model.input_names)   # ["input_ids", "attention_mask"]
print(onnx_model.output_names)  # ["logits"]
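
One use for these properties is checking a feed dictionary before calling predict_dict(), so that a misspelled or missing input name is caught with a clear message. The following is a minimal sketch; `check_feed` is a hypothetical helper, and the list passed as `input_names` stands in for `onnx_model.input_names`.

```python
def check_feed(feed, input_names):
    """Return (missing, unexpected) input names for a feed dictionary."""
    expected = set(input_names)
    provided = set(feed)
    missing = sorted(expected - provided)
    unexpected = sorted(provided - expected)
    return missing, unexpected


# Stand-in for onnx_model.input_names.
input_names = ["input_ids", "attention_mask"]

# Plain lists stand in for NumPy arrays.
feed = {"input_ids": [101, 2023], "token_type_ids": [0, 0]}

missing, unexpected = check_feed(feed, input_names)
print(missing)     # ['attention_mask']
print(unexpected)  # ['token_type_ids']
```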

Installing all extras

To install all framework integrations at once:

uv add "blazerpc[all]"