# README
This guide covers everything you need to develop, test, and extend the Atriva AI Inference Service built on OpenVINO.
## 1. Project Structure

```
atriva-ai-openvino/
├── app/
│   ├── __init__.py
│   ├── routes.py                 # FastAPI endpoint definitions
│   ├── services.py               # Inference logic & model orchestration
│   ├── models.py                 # Pydantic request/response schemas
│   ├── model_capabilities.py     # Model metadata & supported objects
│   └── shared_data.py            # Shared volume frame access utilities
├── models/                       # OpenVINO IR model files (.xml, .bin)
│   ├── yolov8n/
│   ├── yolov8s/
│   ├── lprnet/
│   └── vehicle_tracking/
├── scripts/
│   ├── convert_to_openvino.py    # PyTorch → OpenVINO conversion
│   └── requirements.txt
├── tests/
│   ├── test_runner.py            # Main test orchestrator
│   ├── test_yolov8_openvino.py   # YOLOv8 detection tests
│   ├── test_vehicle_tracking.py  # Tracking tests
│   ├── test_images/              # Sample images
│   ├── test_videos/              # Sample videos
│   └── requirements.txt
├── main.py                       # FastAPI application entry point
├── config.py                     # Configuration settings
├── requirements.txt              # Production dependencies
└── Dockerfile                    # Container build configuration
```
## 2. Environment Setup

### 2.1 Python Version Requirements
| Environment | Python Version | Reason |
|---|---|---|
| Docker AI Service | 3.12 | Latest stable, handled by Dockerfile |
| Testing | 3.11 | PyTorch wheel availability |
| Model Conversion | 3.11 | PyTorch + OpenVINO compatibility |
> ⚠️ **Important:** PyTorch does not provide pre-built wheels for Python 3.13. Use Python 3.11 for any tasks involving PyTorch.
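To fail fast on an interpreter mismatch, you can add a guard like this to PyTorch-dependent scripts (a minimal sketch, not part of the repository):

```python
import sys

# Conversion and test scripts expect Python 3.11 (see the table above)
if sys.version_info[:2] != (3, 11):
    raise SystemExit(f"Python 3.11 required, found {sys.version.split()[0]}")
```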
### 2.2 Local Development Setup

```bash
# Clone the repository
git clone https://github.com/atriva-ai/atriva-ai-openvino.git
cd atriva-ai-openvino

# Create a virtual environment
python3.11 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# Verify the OpenVINO installation
python -c "from openvino import Core; print('OpenVINO OK:', Core().available_devices)"
```
### 2.3 IDE Configuration

**VS Code** (recommended extensions):

- Python (`ms-python.python`)
- Pylance (`ms-python.vscode-pylance`)
- Black Formatter (`ms-python.black-formatter`)

**PyCharm:**

- Set the Python interpreter to `venv/bin/python`
- Enable FastAPI support in project settings
## 3. Running the Development Server

### 3.1 Start FastAPI with Hot Reload

```bash
# From the project root
uvicorn main:app --host 0.0.0.0 --port 8001 --reload

# With debug logging
LOG_LEVEL=debug uvicorn main:app --host 0.0.0.0 --port 8001 --reload
```
Access points:
- API: http://localhost:8001
- Swagger UI: http://localhost:8001/docs
- ReDoc: http://localhost:8001/redoc
### 3.2 Environment Variables

| Variable | Default | Description |
|---|---|---|
| `PORT` | `8001` | API server port |
| `LOG_LEVEL` | `info` | Logging verbosity (`debug`, `info`, `warning`, `error`) |
| `MODELS_PATH` | `./models` | Path to OpenVINO model files |
| `SHARED_FRAMES_PATH` | `/shared/frames` | Shared volume for video frames |
| `DEFAULT_ACCELERATOR` | `cpu32` | Default inference precision |
### 3.3 Configuration File

Edit `config.py` for persistent configuration:

```python
# config.py
import os

class Settings:
    MODELS_PATH = os.getenv("MODELS_PATH", "./models")
    SHARED_FRAMES_PATH = os.getenv("SHARED_FRAMES_PATH", "/shared/frames")
    DEFAULT_ACCELERATOR = os.getenv("DEFAULT_ACCELERATOR", "cpu32")
    CONFIDENCE_THRESHOLD = 0.5
    NMS_THRESHOLD = 0.4
```
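Other modules can then read these values directly from the class (a minimal usage sketch, assuming `Settings` stays a plain class with class attributes as above):

```python
from config import Settings

# Settings uses class attributes, so no instantiation is needed
print("Models dir:", Settings.MODELS_PATH)
print("Default accelerator:", Settings.DEFAULT_ACCELERATOR)
```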
## 4. Development Workflow

### 4.1 Adding a New API Endpoint

- Define the route in `app/routes.py`:
```python
from fastapi import APIRouter, Query

from app.models import InferenceResponse
from app.services import run_custom_inference

router = APIRouter()

@router.post("/inference/custom", response_model=InferenceResponse)
async def custom_inference(
    model_name: str = Query(..., description="Model to use"),
    threshold: float = Query(0.5, description="Confidence threshold"),
):
    """Run custom inference with specified parameters."""
    result = await run_custom_inference(model_name, threshold)
    return result
```
- Add business logic in `app/services.py`:

```python
async def run_custom_inference(model_name: str, threshold: float):
    # Load the model (cached after the first call)
    model = get_or_load_model(model_name)

    # Run inference; `inputs` is the preprocessed image tensor
    outputs = model.infer(inputs)

    # Post-process raw outputs with the confidence threshold
    detections = post_process(outputs, threshold)
    return {"detections": detections}
```
- Define schemas in `app/models.py`:

```python
from typing import List

from pydantic import BaseModel

class Detection(BaseModel):
    class_id: int
    class_name: str
    confidence: float
    bbox: List[int]

class InferenceResponse(BaseModel):
    model_name: str
    detections: List[Detection]
```
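With all three pieces in place, the new endpoint can be exercised from Python (a sketch; assumes the dev server from section 3.1 is running and the `requests` package is installed, and the model name and threshold here are illustrative):

```python
import requests

resp = requests.post(
    "http://localhost:8001/inference/custom",
    params={"model_name": "yolov8n", "threshold": 0.6},
)
resp.raise_for_status()

# Fields match the InferenceResponse schema defined above
for det in resp.json()["detections"]:
    print(det["class_name"], det["confidence"], det["bbox"])
```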
### 4.2 Adding a New Model

- Convert the model to OpenVINO IR format:

```bash
cd scripts
source scripts-venv-py311/bin/activate

# For YOLO models
python convert_to_openvino.py --model yolov8n --output ../models/yolov8n

# For custom ONNX models
mo --input_model custom_model.onnx --output_dir ../models/custom
```
- Register model capabilities in `app/model_capabilities.py` (a lookup sketch follows this list):

```python
MODEL_REGISTRY = {
    "custom_model": {
        "type": "detection",
        "input_size": (640, 640),
        "classes": ["class1", "class2", "class3"],
        "path": "models/custom/model.xml",
    }
}
```
- Test the model:

```bash
curl -X POST "http://localhost:8001/inference/direct?model_name=custom_model" \
  -F "image=@test_image.jpg"
```
## 5. Testing

### 5.1 Test Environment Setup

```bash
cd tests

# Use Python 3.11 for PyTorch compatibility
pyenv local 3.11.13  # optional, if you manage interpreters with pyenv
python3.11 -m venv test-venv-py311
source test-venv-py311/bin/activate
pip install -r requirements.txt
```
### 5.2 Running Tests

```bash
# Download models first (if needed)
python test_runner.py --download-models

# Run all tests
python test_runner.py

# Run a specific model test
python test_runner.py --model yolov8n --input test_images/sample.jpg

# Run with video input
python test_runner.py --model vehicle_tracking --input test_videos/traffic.mp4
```
### 5.3 Test Output

Test results are saved to `tests/output/`:
- Annotated images with bounding boxes
- Detection JSON files
- Performance metrics
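To inspect the detection files programmatically (a sketch; the exact JSON layout is defined by `test_runner.py`, so no field names are assumed here):

```python
import json
from pathlib import Path

for path in sorted(Path("tests/output").glob("*.json")):
    with path.open() as f:
        data = json.load(f)
    # One-line summary per file without assuming the schema
    size = len(data) if isinstance(data, (list, dict)) else "?"
    print(f"{path.name}: {type(data).__name__} with {size} entries")
```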
## 6. Debugging

### 6.1 Enable Debug Logging

```bash
# From the terminal
LOG_LEVEL=debug uvicorn main:app --reload
```

```python
# Or in code
import logging
logging.basicConfig(level=logging.DEBUG)
```
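Inside application modules, prefer a module-level named logger over the root logger so `LOG_LEVEL` filtering stays granular (a standard-library sketch; the function name is illustrative):

```python
import logging

logger = logging.getLogger(__name__)

def load_model(name: str) -> None:
    # Emitted only when the effective level is DEBUG
    logger.debug("Loading model %s", name)
```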
### 6.2 OpenVINO Profiling

Performance counters must be enabled when the model is compiled:

```python
from openvino import Core

core = Core()
model = core.read_model("models/yolov8n/model.xml")

# Enable performance counters at compile time
compiled = core.compile_model(model, "CPU", {"PERF_COUNT": "YES"})

# Run inference; `inputs` is your preprocessed input tensor
request = compiled.create_infer_request()
request.infer(inputs)

# Get per-layer timing
for layer in request.profiling_info:
    print(f"{layer.node_name}: {layer.real_time.total_seconds() * 1000:.2f}ms")
```
### 6.3 Common Issues

| Issue | Solution |
|---|---|
| `ModuleNotFoundError: openvino` | Reinstall: `pip install openvino --upgrade` |
| Model load fails | Check that the `.xml` and `.bin` files exist in the model directory |
| Slow first inference | Normal; OpenVINO compiles the model on first run, so subsequent runs are faster |
| Shape mismatch error | Verify the input image is resized to the model's expected dimensions |
| Out of memory | Use an INT8-quantized model or reduce the batch size |
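For the shape-mismatch case, preprocessing typically resizes and re-lays-out the image before inference. A sketch using OpenCV, assuming an NCHW float32 input normalized to [0, 1], which is common for YOLOv8 IR models (verify against your model's actual input shape and normalization):

```python
import cv2
import numpy as np

def preprocess(image_bgr: np.ndarray, size: tuple = (640, 640)) -> np.ndarray:
    """Resize to the model's input size and convert HWC uint8 -> NCHW float32."""
    resized = cv2.resize(image_bgr, size)
    blob = resized.astype(np.float32) / 255.0        # normalize to [0, 1]
    blob = blob.transpose(2, 0, 1)[np.newaxis, ...]  # HWC -> CHW, add batch dim
    return blob
```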
### 6.4 Interactive Debugging

```bash
# Start with debugger support
python -m debugpy --listen 5678 --wait-for-client -m uvicorn main:app --reload
```
VS Code `launch.json`:

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Attach to FastAPI",
      "type": "python",
      "request": "attach",
      "connect": {"host": "localhost", "port": 5678}
    }
  ]
}
```
## 7. Code Style & Conventions

### 7.1 Formatting

```bash
# Format code with Black
black app/ tests/ --line-length 100

# Sort imports
isort app/ tests/

# Type checking
mypy app/ --ignore-missing-imports
```
### 7.2 Naming Conventions

| Type | Convention | Example |
|---|---|---|
| Files | snake_case | `model_capabilities.py` |
| Classes | PascalCase | `InferenceService` |
| Functions | snake_case | `run_detection()` |
| Constants | UPPER_SNAKE | `DEFAULT_THRESHOLD` |
| API endpoints | kebab-case | `/inference/latest-frame` |
### 7.3 Documentation

- All public functions must have docstrings
- Use Google-style docstrings:

```python
def detect_objects(image: np.ndarray, threshold: float = 0.5) -> List[Detection]:
    """Detect objects in an image using the loaded model.

    Args:
        image: Input image as a numpy array (H, W, C) in BGR format.
        threshold: Minimum confidence threshold for detections.

    Returns:
        List of Detection objects with class, confidence, and bbox.

    Raises:
        ModelNotLoadedError: If no model is currently loaded.
    """
```
## 8. Docker Development

### 8.1 Build Development Image

```bash
# Build with cache
docker build -t atriva-ai-openvino:dev .

# Build without cache (for clean builds)
docker build --no-cache -t atriva-ai-openvino:dev .
```
### 8.2 Run with Volume Mounts (for Development)

```bash
docker run -d \
  -p 8001:8001 \
  -v $(pwd)/app:/app/app \
  -v $(pwd)/models:/app/models \
  -v /path/to/shared/frames:/shared/frames:ro \
  --name ai-dev \
  atriva-ai-openvino:dev
```
### 8.3 View Logs

```bash
# Follow logs
docker logs -f ai-dev

# Last 100 lines
docker logs --tail 100 ai-dev
```
### 8.4 Shell Access

```bash
docker exec -it ai-dev /bin/bash
```
## 9. Related Documentation
- ➡️ API Endpoints — Complete REST API reference
- ➡️ Pure Python Runtime — Low-level OpenVINO usage
- ➡️ Model Preparation — Converting models to OpenVINO
- ➡️ Architecture — System design overview
- ➡️ Testing Guide — Running and writing tests