
🐍 Pure Python OpenVINO Runtime Example

Note: This section demonstrates how to work with OpenVINO via pure Python for development or debugging.
You do not need this if you are using the Atriva API; this is for reference and low-level control only.

1. Initialize Runtime

from atriva_inference import InferenceRuntime

runtime = InferenceRuntime(backend="openvino")

Available backends:

Name          Devices
openvino      CPU, GPU, NPU (Meteor Lake)
onnxruntime   CPU / CUDA
tensorrt      NVIDIA GPU
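
If you are unsure which backend is available in a given environment, one option is to try them in order of preference. This is a minimal sketch, assuming that constructing InferenceRuntime with an unavailable backend raises an exception (that behavior is an assumption, not documented above):

from atriva_inference import InferenceRuntime

# Try backends in order of preference; assumes an unavailable backend
# raises an exception when InferenceRuntime is constructed.
runtime = None
for backend in ("openvino", "onnxruntime"):
    try:
        runtime = InferenceRuntime(backend=backend)
        break
    except Exception:
        continue

if runtime is None:
    raise RuntimeError("No supported inference backend is available")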

2. Load a Model

model = runtime.load_model(
    model_path="ir/model.xml",
    device="AUTO"  # CPU / GPU / NPU / AUTO
)

Device Options

  • "AUTO" → prefers GPU → NPU → CPU
  • "CPU" → FP32 / FP16 / INT8
  • "GPU" → iGPU or Arc
  • "NPU" → Meteor Lake AI Boost

3. Prepare Input Tensors

Convert a preprocessed NumPy array into a runtime tensor:

tensor = runtime.tensor_from_numpy(
    np_array, layout="NCHW"
)

The runtime will:

  • Normalize dtype
  • Convert layout when needed
  • Allocate device-optimized memory
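
For context, here is a sketch of preparing an image-like NCHW float array before handing it to the runtime. The 224×224 input size and the [0, 1] normalization are placeholder assumptions, not requirements of the API:

import numpy as np

# Placeholder 224x224 RGB image; in practice this would come from a decoder
# such as OpenCV or Pillow.
image_hwc = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)

# Scale to [0, 1], reorder HWC -> CHW, and add a batch dimension -> NCHW.
np_array = image_hwc.astype(np.float32) / 255.0
np_array = np.transpose(np_array, (2, 0, 1))[np.newaxis, ...]

tensor = runtime.tensor_from_numpy(np_array, layout="NCHW")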

4. Execute Synchronously

output = model.run({"input": tensor})

Output is a dict mapping output names → tensors.

Convert to numpy:

output_np = output["output"].to_numpy()
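
As a follow-on, a minimal postprocessing sketch for a classification-style result. The single "output" name and the (1, num_classes) score layout are assumptions about the model, not guarantees of the API:

import numpy as np

output_np = output["output"].to_numpy()

# Assume a (1, num_classes) score tensor; take the most likely class.
class_id = int(np.argmax(output_np, axis=-1)[0])
score = float(np.max(output_np))
print(f"predicted class {class_id} (score {score:.3f})")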

5. Execute Asynchronously

For pipelines (video analytics), async mode is preferred:

request = model.create_async_request()

request.start({"input": tensor})
result = request.wait()

wait() blocks until inference completes.
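
One way to exploit async mode is to overlap preprocessing of the next frame with inference on the current one. A rough sketch, assuming frames is an iterable of preprocessed NumPy arrays, that a request can be restarted after wait() returns, and with handle_result as a hypothetical callback:

request = model.create_async_request()
in_flight = False

for frame in frames:
    tensor = runtime.tensor_from_numpy(frame, layout="NCHW")
    if in_flight:
        # Collect the previous frame's result before reusing the request.
        handle_result(request.wait())  # handle_result is a hypothetical callback
    request.start({"input": tensor})
    in_flight = True

if in_flight:
    handle_result(request.wait())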

6. Streaming Inference

For continuous inference over many inputs (such as video frames), create a stream:

stream = model.create_stream()

for frame in frames:
    inp = runtime.tensor_from_numpy(frame)
    out = stream.run({"input": inp})

Stream keeps the model compiled and reuses device buffers.
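
To see the benefit of the reused buffers, you can time the stream loop. A small sketch using only the calls shown above plus the standard library:

import time

stream = model.create_stream()

start = time.perf_counter()
count = 0
for frame in frames:
    inp = runtime.tensor_from_numpy(frame)
    out = stream.run({"input": inp})
    count += 1
elapsed = time.perf_counter() - start

if count:
    print(f"{count} frames in {elapsed:.2f}s ({count / elapsed:.1f} FPS)")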

7. Working With Dynamic Shapes

If your IR model has dynamic inputs:

tensor = runtime.tensor_from_numpy(
    np_array,
    shape_override=(1, 3, h, w)
)

OpenVINO compiles a new shape on first use, then caches it.
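
For example, running two different resolutions back to back: the first call at each new shape pays the compilation cost, while repeated calls reuse the cached compilation (the resolutions below are arbitrary examples):

import numpy as np

for (h, w) in [(480, 640), (720, 1280), (480, 640)]:
    np_array = np.zeros((1, 3, h, w), dtype=np.float32)
    tensor = runtime.tensor_from_numpy(
        np_array,
        shape_override=(1, 3, h, w)
    )
    # The first call at each new shape triggers compilation; the final
    # (480, 640) iteration reuses the cached compilation.
    output = model.run({"input": tensor})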

8. Performance Profiling

profile = model.get_profiling()
print(profile)

Typical metrics:

  • Layer execution time
  • Memory transfers
  • Device overhead
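
Layer-level profiling can be complemented by simple wall-clock measurements around model.run(), which needs nothing beyond the standard library:

import time

# Warm-up run so first-use compilation and allocations are excluded.
model.run({"input": tensor})

latencies = []
for _ in range(50):
    start = time.perf_counter()
    model.run({"input": tensor})
    latencies.append(time.perf_counter() - start)

print(f"mean latency: {1000 * sum(latencies) / len(latencies):.2f} ms")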

9. Error Handling

# Assumption: InferenceRuntimeError is exported by the same package as InferenceRuntime.
from atriva_inference import InferenceRuntimeError

try:
    output = model.run(inputs)
except InferenceRuntimeError as e:
    print("Inference error:", e)