README

This document describes the architecture of the Atriva AI Inference Service, a RESTful API backend that runs inference on decoded frames shared by the video-pipeline container and returns the results to the Atriva Core API.


1. System Overview

The Atriva AI Inference Service operates as a microservice within the Atriva ecosystem, bridging the video-pipeline and the Core API through AI-powered analysis.

┌─────────────────────────────────────────────────────────────────────────────┐
│                           Atriva Platform                                   │
│                                                                             │
│  ┌──────────────────┐    Shared Volume    ┌──────────────────────────────┐  │
│  │  Video-Pipeline  │ ──────────────────► │   AI Inference Service       │  │
│  │    Container     │   (Decoded Frames)  │      (OpenVINO)              │  │
│  │                  │                     │                              │  │
│  │  • RTSP/RTMP     │                     │  • FastAPI REST API          │  │
│  │  • Decode        │                     │  • OpenVINO Runtime          │  │
│  │  • Frame Export  │                     │  • Model Management          │  │
│  └──────────────────┘                     └──────────────┬───────────────┘  │
│                                                          │                  │
│                                                          │ Inference        │
│                                                          │ Results          │
│                                                          ▼                  │
│                                           ┌──────────────────────────────┐  │
│                                           │   Atriva Core API Backend    │  │
│                                           │                              │  │
│                                           │  • Event Processing          │  │
│                                           │  • Detection Storage         │  │
│                                           │  • Alert Management          │  │
│                                           └──────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘

2. Core Components

2.1 FastAPI REST Server

The service exposes a RESTful API built with FastAPI, providing:

| Component   | Purpose                                          |
|-------------|--------------------------------------------------|
| main.py     | Application entry point, server configuration    |
| routes.py   | API endpoint definitions and request handling    |
| services.py | Business logic and model inference orchestration |
| models.py   | Pydantic schemas for request/response validation |
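
A minimal sketch of how an endpoint in routes.py might be wired together, assuming FastAPI's router pattern; the run_inference helper is hypothetical and stands in for the service layer in services.py:

```python
from fastapi import APIRouter, HTTPException

router = APIRouter()

def run_inference(camera_id: str, model_name: str) -> dict:
    """Hypothetical stand-in for the service-layer call in services.py."""
    raise FileNotFoundError  # the real implementation loads a frame and runs the model

@router.post("/shared/cameras/{camera_id}/inference")
async def infer_camera(camera_id: str, model_name: str = "yolov8n"):
    try:
        return run_inference(camera_id, model_name)
    except FileNotFoundError:
        raise HTTPException(status_code=404, detail=f"No frames found for {camera_id}")
```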

2.2 Shared Frame Access Layer

Frames decoded by the video-pipeline container are accessed through a shared volume:

/shared/frames/
├── camera1/
│   ├── frame_0001.jpg
│   ├── frame_0002.jpg
│   └── latest.jpg
├── camera2/
│   └── ...
└── metadata/
    └── cameras.json

The shared_data.py module handles the following (a sketch of this layer appears after the list):

  • Camera discovery and enumeration
  • Frame file access and validation
  • Latest frame retrieval per camera
  • Frame metadata management
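
A minimal sketch of this layer under the directory layout shown above; the function names are illustrative rather than the actual shared_data.py API:

```python
from pathlib import Path

FRAMES_ROOT = Path("/shared/frames")

def list_cameras() -> list[str]:
    """Enumerate camera directories, skipping the metadata folder."""
    return sorted(
        p.name for p in FRAMES_ROOT.iterdir()
        if p.is_dir() and p.name != "metadata"
    )

def latest_frame(camera_id: str) -> Path:
    """Return the most recent frame for a camera, validating that it exists."""
    frame = FRAMES_ROOT / camera_id / "latest.jpg"
    if not frame.is_file():
        raise FileNotFoundError(f"No latest frame for {camera_id}")
    return frame
```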

2.3 OpenVINO Inference Engine

The inference backend leverages Intel OpenVINO for optimized execution:

┌─────────────────────────────────────────────────────┐
│              OpenVINO Inference Engine              │
│                                                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │   Model     │  │  Compiled   │  │   Infer     │  │
│  │   Loading   │──│   Model     │──│   Request   │  │
│  │   (.xml)    │  │   Cache     │  │   Queue     │  │
│  └─────────────┘  └─────────────┘  └─────────────┘  │
│                                                     │
│  Supported Devices: CPU (INT8/FP16/FP32)            │
└─────────────────────────────────────────────────────┘
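
A sketch of the load, compile, and infer cycle pictured above, using the standard OpenVINO Python API; the model path and single-output assumption are illustrative:

```python
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("/models/yolov8n/yolov8n.xml")  # IR comes as an .xml/.bin pair
compiled = core.compile_model(model, "CPU")             # compiled once, then reused

def infer(tensor: np.ndarray) -> np.ndarray:
    """Run one synchronous inference request and return the first output."""
    request = compiled.create_infer_request()
    request.infer({0: tensor})
    return request.get_output_tensor(0).data
```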

2.4 Model Registry

Pre-trained models optimized for OpenVINO:

| Model            | Type                 | Use Case                  |
|------------------|----------------------|---------------------------|
| YOLOv8n          | Detection            | General object detection  |
| YOLOv8s/m        | Detection            | Higher accuracy detection |
| LPRNet           | Recognition          | License plate reading     |
| Vehicle Tracking | Detection + Tracking | Vehicle analytics         |
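
For illustration, a simple registry could map the model names above to their IR files and task types; the paths and keys here are hypothetical:

```python
# Hypothetical mapping from model name to OpenVINO IR file and task type.
MODEL_REGISTRY = {
    "yolov8n": {"path": "/models/yolov8n/yolov8n.xml", "task": "detection"},
    "yolov8s": {"path": "/models/yolov8s/yolov8s.xml", "task": "detection"},
    "lprnet":  {"path": "/models/lprnet/lprnet.xml",   "task": "recognition"},
}
```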

3. Data Flow

3.1 Frame Ingestion Flow

Video-Pipeline Container              AI Inference Service
        │                                     │
        │  1. Decode RTSP/RTMP Stream         │
        ▼                                     │
  ┌───────────┐                               │
  │  Decoded  │                               │
  │   Frame   │                               │
  └─────┬─────┘                               │
        │                                     │
        │  2. Write to Shared Volume          │
        ▼                                     │
  ┌───────────┐    3. Read Frame        ┌─────┴─────┐
  │  /shared/ │ ──────────────────────► │  Frame    │
  │  frames/  │                         │  Loader   │
  └───────────┘                         └─────┬─────┘
                                              │
                                              ▼
                                        ┌───────────┐
                                        │ Inference │
                                        │  Engine   │
                                        └───────────┘
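
One way the frame loader could detect new frames, assuming the service polls each camera's latest.jpg by modification time (the actual trigger mechanism is an implementation detail not specified here):

```python
import time
from pathlib import Path

def watch_latest(camera_dir: Path, interval: float = 0.1):
    """Yield the latest frame path each time the file is rewritten."""
    last_mtime = 0.0
    frame = camera_dir / "latest.jpg"
    while True:
        if frame.is_file():
            mtime = frame.stat().st_mtime
            if mtime > last_mtime:  # a new frame has landed since the last check
                last_mtime = mtime
                yield frame
        time.sleep(interval)
```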

3.2 Inference Request Flow

┌──────────────────┐
│  Atriva Core API │
│    or Client     │
└────────┬─────────┘

         │  POST /inference/latest-frame
         │  POST /shared/cameras/{id}/inference

┌──────────────────────────────────────────────────────┐
│                 AI Inference Service                 │
│                                                      │
│  1. Validate Request                                 │
│       │                                              │
│       ▼                                              │
│  2. Load Frame from Shared Volume                    │
│       │                                              │
│       ▼                                              │
│  3. Preprocess (Resize, Normalize, Tensor Convert)   │
│       │                                              │
│       ▼                                              │
│  4. Run OpenVINO Inference                           │
│       │                                              │
│       ▼                                              │
│  5. Post-process (NMS, Decode Boxes, Filter)         │
│       │                                              │
│       ▼                                              │
│  6. Return JSON Response                             │
│                                                      │
└──────────────────────────────────────────────────────┘

         │  { "detections": [...], "camera_id": "..." }

┌──────────────────┐
│  Atriva Core API │
└──────────────────┘
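
A sketch of steps 3 and 5 for a YOLO-style detector; the 640x640 input size, NCHW layout, and thresholds are typical defaults, not confirmed service settings:

```python
import cv2
import numpy as np

def preprocess(frame_path: str, size: int = 640) -> np.ndarray:
    """Step 3: resize, normalize to [0, 1], and convert HWC -> NCHW."""
    img = cv2.imread(frame_path)  # BGR, HWC, uint8
    if img is None:
        raise FileNotFoundError(frame_path)
    img = cv2.resize(img, (size, size))
    tensor = img.astype(np.float32) / 255.0
    return np.expand_dims(tensor.transpose(2, 0, 1), 0)  # shape 1x3xHxW

def postprocess(boxes, scores, conf_thresh=0.5, nms_thresh=0.45):
    """Step 5: confidence filtering plus OpenCV's built-in NMS (boxes as x,y,w,h)."""
    keep = cv2.dnn.NMSBoxes(boxes, scores, conf_thresh, nms_thresh)
    return [boxes[i] for i in keep]
```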

4. API Integration Points

4.1 Inbound APIs (Consumed by Core API)

| Endpoint                       | Method | Description                               |
|--------------------------------|--------|-------------------------------------------|
| /health                        | GET    | Service health and shared volume status   |
| /models                        | GET    | Available models list                     |
| /shared/cameras                | GET    | Cameras with available frames             |
| /shared/cameras/{id}/inference | POST   | Run inference on a camera’s latest frame  |
| /inference/latest-frame        | POST   | Run inference on a specified camera frame |
| /inference/background          | POST   | Start a background inference worker       |
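
For example, the Core API (or any internal client) could call the camera inference endpoint as below, assuming the container name and port mapping shown in section 5.1; the model_name query parameter is illustrative:

```python
import requests

resp = requests.post(
    "http://ai-inference:8001/shared/cameras/camera1/inference",
    params={"model_name": "yolov8n"},  # hypothetical parameter
    timeout=10,
)
resp.raise_for_status()
for det in resp.json().get("detections", []):
    print(det["class_name"], det["confidence"])
```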

4.2 Outbound Communication (To Core API)

Inference results are returned synchronously via HTTP response:

{
  "camera_id": "camera1",
  "frame_path": "/shared/frames/camera1/latest.jpg",
  "timestamp": "2025-12-14T10:30:00Z",
  "model_name": "yolov8n",
  "detections": [
    {
      "class_id": 2,
      "class_name": "car",
      "confidence": 0.92,
      "bbox": [150, 200, 350, 450]
    }
  ]
}
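
A Pydantic sketch mirroring the fields of this response (models.py holds the actual definitions); the bbox ordering is assumed to be [x_min, y_min, x_max, y_max] in pixels:

```python
from pydantic import BaseModel

class Detection(BaseModel):
    class_id: int
    class_name: str
    confidence: float
    bbox: list[int]  # assumed [x_min, y_min, x_max, y_max] in pixels

class InferenceResult(BaseModel):
    camera_id: str
    frame_path: str
    timestamp: str
    model_name: str
    detections: list[Detection]
```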

5. Deployment Architecture

5.1 Container Topology

┌─────────────────────────────────────────────────────────────┐
│                      Docker Host                            │
│                                                             │
│  ┌─────────────────┐         ┌─────────────────────────┐    │
│  │ video-pipeline  │         │   ai-inference          │    │
│  │                 │         │                         │    │
│  │  Port: N/A      │         │  Port: 8001:8001        │    │
│  │                 │         │                         │    │
│  │  Volumes:       │         │  Volumes:               │    │
│  │  - /shared      │◄───────►│  - /shared (readonly)   │    │
│  │                 │         │  - /models              │    │
│  └─────────────────┘         └─────────────────────────┘    │
│                                       │                     │
│                                       │ REST API            │
│                                       ▼                     │
│                              ┌─────────────────────────┐    │
│                              │   atriva-core-api       │    │
│                              │                         │    │
│                              │  Port: 8080:8080        │    │
│                              └─────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘

5.2 Volume Mounts

| Volume         | Purpose              | Access     |
|----------------|----------------------|------------|
| /shared/frames | Decoded video frames | Read-only  |
| /models        | OpenVINO model files | Read-only  |
| /tmp/inference | Temporary processing | Read-write |
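
The topology and mounts could be expressed in a compose file along these lines; the image names, volume names, and host paths are illustrative, not the project's actual configuration:

```yaml
services:
  video-pipeline:
    image: atriva/video-pipeline:latest    # hypothetical image name
    volumes:
      - shared-frames:/shared
  ai-inference:
    image: atriva/ai-inference:latest      # hypothetical image name
    ports:
      - "8001:8001"
    volumes:
      - shared-frames:/shared:ro           # read-only frame access
      - ./models:/models:ro
volumes:
  shared-frames:
```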

6. Performance Considerations

6.1 Accelerator Selection

| Accelerator | Precision | Use Case                                      |
|-------------|-----------|-----------------------------------------------|
| cpui8       | INT8      | Maximum throughput, slight accuracy trade-off |
| cpu16       | FP16      | Balanced performance/accuracy                 |
| cpu32       | FP32      | Maximum accuracy, baseline performance        |
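
One way the accelerator names might map onto OpenVINO compile options; note that INT8 execution normally comes from a quantized model rather than a runtime hint, and this mapping is an assumption:

```python
import openvino as ov

core = ov.Core()

def compile_for(accelerator: str):
    """Compile a model on CPU with the precision implied by the accelerator name."""
    if accelerator == "cpui8":
        # INT8 path: load a quantized IR instead (hypothetical path).
        model = core.read_model("/models/yolov8n/yolov8n-int8.xml")
        return core.compile_model(model, "CPU")
    model = core.read_model("/models/yolov8n/yolov8n.xml")
    precision = "f16" if accelerator == "cpu16" else "f32"
    return core.compile_model(model, "CPU", {"INFERENCE_PRECISION_HINT": precision})
```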

6.2 Optimization Strategies

  • Model Caching: Compiled models are cached to avoid recompilation
  • Frame Polling: Efficient file-based frame access from shared volume
  • Async Inference: Non-blocking inference for high-throughput scenarios (see the sketch after this list)
  • Batch Processing: Multiple frames processed in single inference call (when supported)
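
As referenced in the list above, a sketch of non-blocking inference with OpenVINO's AsyncInferQueue; the queue depth, input shape, and dummy inputs are illustrative:

```python
import numpy as np
import openvino as ov

core = ov.Core()
compiled = core.compile_model(core.read_model("/models/yolov8n/yolov8n.xml"), "CPU")

results: dict[int, np.ndarray] = {}

def on_done(request, frame_id):
    """Callback: store each finished result keyed by frame id."""
    results[frame_id] = request.get_output_tensor(0).data.copy()

queue = ov.AsyncInferQueue(compiled, jobs=4)  # four in-flight requests
queue.set_callback(on_done)

# Dummy tensors standing in for preprocessed frames (shape is an assumption).
frames = [np.random.rand(1, 3, 640, 640).astype(np.float32) for _ in range(8)]
for i, tensor in enumerate(frames):
    queue.start_async({0: tensor}, userdata=i)  # returns as soon as a slot is free
queue.wait_all()
```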

7. Security Considerations

  • Shared volume mounted as read-only for frame access
  • API endpoints validated with Pydantic schemas
  • No direct external network access required
  • Internal service communication only

Next Steps

  • ➡️ API Endpoints — Detailed endpoint documentation
  • ➡️ Models — Supported models and preparation
  • ➡️ Development — Local development setup