What are the differences between Neuron Runtime API V1 and V2?

I have learned that Neuron Runtime API V1 and Neuron Runtime API V2 are not compatible with each other. What are the main differences between these two API versions in terms of interface design, supported features, or usage patterns?
How should we decide which API version to use for our project?

Thank you for your question.

Neuron Runtime API V2 was developed as an evolution of V1, offering enhanced functionality, improved parallelism, and broader usability for AI workloads on MediaTek platforms. The two APIs are not compatible with each other: each version has its own interface contract and usage patterns.

Key Differences: Neuron Runtime API V1 vs. V2

Aspect             | Neuron Runtime API V1                           | Neuron Runtime API V2
-------------------|-------------------------------------------------|------------------------------------------------------------------------------------
Interface Design   | Simpler, serial execution                       | Enhanced, supports parallel execution
Supported Features | Basic model inference                           | Multiple models, concurrent inference
Usage Patterns     | Use when inferencing jobs do not overlap in time | Use when multiple models may execute simultaneously or inference requests may overlap
Compatibility      | Not compatible with V2                          | Not compatible with V1

How to Choose

  • Use Neuron Runtime API V1 if your workflow is strictly serial—each inference starts only after the previous one has finished. This is suitable for simple, single-model deployments.
  • Use Neuron Runtime API V2 if you require concurrent execution—such as launching new inferences before previous ones complete, or managing multiple models on the same system.
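
To make the two usage patterns above concrete, here is a minimal Python sketch. It does not call the Neuron Runtime API at all: `fake_inference`, `run_serial`, `run_overlapping`, and `is_strictly_serial` are hypothetical names invented for this illustration, and the thread pool is only a stand-in for whatever mechanism your application uses to issue requests. The point is to help you check which pattern your own workload resembles.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_inference(model_name, request_id, log):
    """Hypothetical stand-in for one inference; NOT a Neuron Runtime function."""
    log.append(("start", model_name, request_id))
    time.sleep(0.01)  # pretend the accelerator is busy for a moment
    log.append(("end", model_name, request_id))
    return f"{model_name}:{request_id}"


def run_serial(requests):
    """Serial pattern: each inference starts only after the previous one ends."""
    log = []
    results = [fake_inference(model, req, log) for model, req in requests]
    return results, log


def run_overlapping(requests, workers=2):
    """Overlapping pattern: new requests may be issued before earlier ones finish."""
    log = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fake_inference, model, req, log)
                   for model, req in requests]
        results = [f.result() for f in futures]  # collected in submission order
    return results, log


def is_strictly_serial(log):
    """True if every 'start' event is immediately followed by its own 'end'."""
    if len(log) % 2 != 0:
        return False
    for i in range(0, len(log), 2):
        start, end = log[i], log[i + 1]
        if start[0] != "start" or end[0] != "end" or start[1:] != end[1:]:
            return False
    return True


if __name__ == "__main__":
    reqs = [("model_a", 1), ("model_b", 2)]
    _, serial_log = run_serial(reqs)
    _, overlap_log = run_overlapping(reqs)
    print("serial run is strictly serial:", is_strictly_serial(serial_log))
    print("overlapping run is strictly serial:", is_strictly_serial(overlap_log))
```

If a trace of your real workload looks like the first log (every start followed directly by its matching end), the serial description applies; if starts from different requests or models interleave, you are in the concurrent case described for V2.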

MediaTek recommends adopting the latest Neuron Runtime API V2 for new projects, as it provides expanded features and flexibility for advanced AI use cases.

Additional Reference

For detailed developer guides and API documentation, visit:
Neuron Runtime API (MediaTek documentation)


Note:
When selecting API versions for deployment, ensure that the chosen version matches your project’s concurrency and feature needs.