Technology

From RTL to Inference

The engineering inside HonChen NPU IP — hand-crafted hardware, automated software toolchain, and a cloud-assisted optimization service designed to make integration painless.

Compiler & SDK

Model Deployment Pipeline

From standard ONNX input to NPU-executable bytecode, our toolchain handles graph optimization, quantization, operator fusion, memory planning and code generation — so your team doesn't have to manually tune the model.

01 · FRONTEND

ONNX Model

Standard format input. Bring any model that exports to ONNX.

02 · FRONTEND

Graph Parser

Parse computational graph and validate operator support.
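
As a rough illustration of the operator-support check, the sketch below scans a graph and reports any node whose operator type is not in a supported set. The graph representation, the `find_unsupported` helper, and the contents of `SUPPORTED_OPS` are assumptions made for this example; the authoritative list is the ONNX operator support matrix shipped with the documentation.

```python
# Hypothetical supported-operator set, for illustration only.
SUPPORTED_OPS = {
    "Conv", "MaxPool", "BatchNormalization", "Concat", "Add",
    "MatMul", "Softmax", "LayerNormalization", "Gelu", "Relu",
}

def find_unsupported(nodes):
    """Return (name, op_type) for every node whose op type is not supported."""
    return [(name, op) for name, op in nodes if op not in SUPPORTED_OPS]

# Toy graph: a list of (node name, ONNX op type) pairs.
graph = [
    ("conv1", "Conv"),
    ("bn1", "BatchNormalization"),
    ("relu1", "Relu"),
    ("nms", "NonMaxSuppression"),
]

unsupported = find_unsupported(graph)
for name, op in unsupported:
    print(f"unsupported operator {op} at node {name}")
```

Reporting every unsupported node in one pass, rather than stopping at the first, lets a model author see the full porting gap up front.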

03 · OPTIMIZER

Graph Optimization

Constant folding, dead-code elimination, operator pruning.
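
The two best-known passes named above can be sketched on a toy graph. This is a generic textbook version, not HonChen's implementation: the dictionary-based graph format and the function names are assumptions for illustration.

```python
import operator

# Arithmetic ops the toy folder understands.
OPS = {"add": operator.add, "mul": operator.mul}

def fold_constants(graph):
    """Replace every op whose inputs are all constants with a constant node."""
    folded = dict(graph)
    changed = True
    while changed:
        changed = False
        for name, (op, args) in list(folded.items()):
            if op in OPS and all(folded[a][0] == "const" for a in args):
                values = [folded[a][1] for a in args]
                folded[name] = ("const", OPS[op](*values))
                changed = True
    return folded

def eliminate_dead_code(graph, outputs):
    """Keep only the nodes reachable from the graph outputs."""
    live, stack = set(), list(outputs)
    while stack:
        name = stack.pop()
        if name in live:
            continue
        live.add(name)
        op, args = graph[name]
        if op in OPS:
            stack.extend(args)
    return {n: v for n, v in graph.items() if n in live}

graph = {
    "x": ("input", None),
    "two": ("const", 2),
    "three": ("const", 3),
    "six": ("mul", ("two", "three")),  # foldable: 2 * 3
    "y": ("add", ("x", "six")),
    "unused": ("add", ("x", "two")),   # never reaches the output
}

optimized = eliminate_dead_code(fold_constants(graph), outputs=["y"])
```

After both passes only `x`, the folded constant `six`, and `y` remain; the original constants and the unused branch are gone.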

04 · OPTIMIZER

Quantization

INT8 / FP16 mixed-precision, per-channel calibration.
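
To show what per-channel calibration buys, here is a minimal sketch of symmetric INT8 weight quantization with one scale per output channel. It is a common scheme assumed for illustration; the page does not specify the exact quantization method, and the function names are hypothetical.

```python
def quantize_per_channel(weights):
    """Symmetric INT8 quantization with one scale per output channel.

    weights: list of channels, each a list of floats.
    Returns (quantized integer values, per-channel scales).
    """
    q_channels, scales = [], []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        scale = max_abs / 127 if max_abs > 0 else 1.0
        q = [max(-127, min(127, round(w / scale))) for w in channel]
        q_channels.append(q)
        scales.append(scale)
    return q_channels, scales

def dequantize(q_channels, scales):
    """Map integer values back to floats using each channel's scale."""
    return [[v * s for v in ch] for ch, s in zip(q_channels, scales)]

# Two channels with very different magnitudes.
weights = [[0.5, -1.0, 0.25], [0.01, 0.02, -0.04]]
q, scales = quantize_per_channel(weights)
restored = dequantize(q, scales)
```

Because the second channel gets its own scale, its small weights still span the full INT8 range instead of collapsing toward zero, which is what would happen with a single scale derived from the first channel.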

05 · OPTIMIZER

Operator Fusion

Conv + BN + ReLU fused. MatMul + Add + Activation fused.
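
Conv + BN fusion relies on a standard identity: BatchNorm at inference time is a per-channel affine transform, so it can be folded into the preceding convolution's weights and bias. The sketch below checks this on a single pixel of a 1x1 convolution; all names and numbers are illustrative assumptions, not toolchain code.

```python
import math

def fold_batchnorm(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm parameters into conv weights and biases."""
    new_w, new_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weights, biases, gamma, beta, mean, var):
        k = g / math.sqrt(v + eps)          # per-channel BN scale factor
        new_w.append([w * k for w in w_c])
        new_b.append((b_c - m) * k + bt)
    return new_w, new_b

def conv1x1(x, weights, biases):
    """1x1 convolution at one pixel: one dot product per output channel."""
    return [sum(w * xi for w, xi in zip(w_c, x)) + b
            for w_c, b in zip(weights, biases)]

def batchnorm(z, gamma, beta, mean, var, eps=1e-5):
    return [g * (zi - m) / math.sqrt(v + eps) + bt
            for zi, g, bt, m, v in zip(z, gamma, beta, mean, var)]

def relu(z):
    return [max(0.0, zi) for zi in z]

x = [1.0, 2.0, 0.5]
weights = [[0.2, 0.1, -0.3], [-0.5, 0.4, 0.6]]
biases = [0.1, -0.2]
gamma, beta, mean, var = [1.5, 0.8], [0.05, -0.1], [0.2, -0.3], [0.9, 1.2]

# Unfused reference: three separate operators.
reference = relu(batchnorm(conv1x1(x, weights, biases), gamma, beta, mean, var))

# Fused: BN folded into the conv, ReLU applied to its output.
fused_w, fused_b = fold_batchnorm(weights, biases, gamma, beta, mean, var)
fused = relu(conv1x1(x, fused_w, fused_b))
```

The fused path produces the same result as the three-operator reference while removing the BatchNorm pass over the activations entirely.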

06 · BACKEND

Memory Planning

SRAM allocation, DMA scheduling, double-buffer management.
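
The benefit of double buffering can be seen in a small timing model: while the compute engine works on one tile, the DMA fills the other buffer. The model below is a simplified sketch with made-up cycle counts; it assumes one DMA channel, one compute engine, and exactly two SRAM buffers, and does not describe the actual scheduler.

```python
def serial_cycles(tiles, dma_cycles, compute_cycles):
    """Single buffer: every tile is loaded, then computed, with no overlap."""
    return tiles * (dma_cycles + compute_cycles)

def double_buffered_cycles(tiles, dma_cycles, compute_cycles):
    """Two buffers: tile i+1 is loaded while tile i is being computed."""
    load_done = []     # cycle at which tile i is resident in SRAM
    compute_done = []  # cycle at which tile i has been fully consumed
    for i in range(tiles):
        dma_free = load_done[i - 1] if i >= 1 else 0
        # Tile i reuses the buffer of tile i-2, so that tile must be finished.
        buffer_free = compute_done[i - 2] if i >= 2 else 0
        load_done.append(max(dma_free, buffer_free) + dma_cycles)
        engine_free = compute_done[i - 1] if i >= 1 else 0
        compute_done.append(max(load_done[i], engine_free) + compute_cycles)
    return compute_done[-1] if tiles else 0

# Illustrative numbers only.
serial = serial_cycles(tiles=4, dma_cycles=100, compute_cycles=150)
overlapped = double_buffered_cycles(tiles=4, dma_cycles=100, compute_cycles=150)
print(serial, overlapped)
```

When compute takes longer than DMA, as in this example, only the first load is exposed and every later transfer hides behind computation.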

07 · BACKEND

Code Generation

NPU instruction set, multi-model runtime descriptor.

08 · DEPLOY

NPU Bytecode

Ready-to-execute binary, runs on HonChen NPU IP.

Multi-Model Runtime

The same NPU runs ASR → LLM → TTS as a pipeline, switching models at runtime with no recompilation or chip reset.

Mixed-Precision Strategy

The toolchain automatically selects INT8 or FP16 for each layer to balance accuracy and memory footprint. Quantization-aware fine-tuning is supported.
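
One plausible way to make a per-layer choice is to quantize each layer's weights to INT8, measure how much they change, and fall back to FP16 when the change is too large. The criterion, the threshold, and the helper names below are assumptions for illustration; the page does not describe the selection heuristic the toolchain actually uses.

```python
def int8_mean_relative_error(weights):
    """Mean per-weight relative error after a symmetric INT8 round trip."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return 0.0
    scale = max_abs / 127
    errors = [abs(round(w / scale) * scale - w) / abs(w)
              for w in weights if w != 0]
    return sum(errors) / len(errors)

def choose_precision(layers, max_error=0.05):
    """Pick INT8 where the round-trip error is small, FP16 otherwise."""
    return {
        name: "INT8" if int8_mean_relative_error(w) <= max_error else "FP16"
        for name, w in layers.items()
    }

layers = {
    # Evenly spread weights survive INT8 quantization well.
    "conv1": [0.9, -0.7, 0.5, -0.3, 0.8],
    # One large outlier forces a coarse scale, so the small weights round to zero.
    "attn_out": [50.0, 0.02, -0.03, 0.01, 0.04],
}

plan = choose_precision(layers)
print(plan)
```

In this toy case the well-conditioned layer is assigned INT8, while the layer with an outlier stays in FP16 because most of its weights would otherwise be lost.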

Zero Manual Tuning

Customers supply an ONNX model and receive an NPU-ready binary. Memory layout, kernel fusion, and instruction scheduling are fully automated.

SoC Integration

How HonChen NPU IP Drops into Your SoC

Industry-standard bus protocols. Clean clock and reset boundaries. Designed to integrate, not disrupt — your existing SoC pipeline keeps working.

Process-Independent · 55 nm → 2 nm
Multi-Precision INT8 + FP16
Multi-Model Runtime Switching
Custom AI Instruction Set
Low-Power Architecture
HonChen NPU IP
AXI Master Interface ▲ to SoC fabric / DRAM
CNN Compute Engine

Conv2D · Depthwise · Pointwise · Pooling · Upsample · BatchNorm · Element-wise · Concat

Transformer Engine

MatMul · Multi-Head Attention · Softmax · LayerNorm · GELU · KV Cache · Embedding

Local SRAM Scratchpad

Weights / activations cache · double-buffer · streaming buffer · DMA-managed

Quantize / Activation

INT8 / FP16 · Per-channel · ReLU · Leaky ReLU · GELU · Softmax · Sigmoid · Tanh

5-Stage RISC-V Control Core

Custom AI instructions · Runtime scheduler · DMA orchestration · Multi-model switching · Clock gating

APB Slave · Synchronous FIFO ▼ control + streaming data
AXI Master → SoC fabric / DRAM
APB Slave ← CPU control registers
Sync FIFO ↔ streaming data
Local SRAM · DDR optional
Clock + external power gating

HonChen NPU IP uses standardized interfaces that integrate directly into your existing SoC fabric. Industry-standard AXI Master for memory access, APB Slave for CPU control, and Synchronous FIFO for streaming data — no custom interconnect required.

Verification & Delivery

Engineering You Can Actually Verify

25 years of IP development discipline applied to NPU — deliverables built for real integration and production.

FPGA PROTOTYPING

Pre-silicon validation on Xilinx-class FPGA boards for full workload regression

TESTED MODELS

Whisper small / tiny · Qwen 500M / 1.7B LLM · VITS · YOLOv3-tiny · YOLOv4-tiny · Face Detection · Face Angle — continuously expanding

ONNX COMPATIBLE

Standard model format input — bring your trained model from PyTorch, TensorFlow, or any ONNX exporter

BUILT-IN SELF-TEST

BIST mode for integration debugging and post-silicon production testing: verify the IP both before and after tape-out

INTEGRATION SUPPORT

On-site / remote engineering hours during customer SoC integration phase

SDK + REFERENCE

Toolchain + model porting examples + integration reference platform

DOCUMENTATION

Integration guide · register reference · performance characterization · ONNX operator support matrix

MAINTENANCE

Annual subscription for bug fixes, minor enhancements, technical consultation

HonChen Cloud Service

Model Optimization, Remote & Effortless

A planned cloud-assisted service for customers: upload an ONNX model and receive an NPU-optimized binary ready to deploy on HonChen IP, with no local toolchain installation needed.

HonChen Cloud Optimizer

★ Coming soon

How it works. Customers connect to HonChen's secured server, upload an ONNX model, and receive an optimized binary built for their target NPU configuration. No SDK installation, no local compute, no GPU farm: just the result.

Designed for ASIC service providers and OEM partners who need to iterate on multiple model variants quickly without maintaining the toolchain themselves.

Want to evaluate the toolchain?

Get a technical briefing on architecture, compiler, and integration — under NDA.