● Technology
From RTL to Inference
The engineering inside HonChen NPU IP: hand-crafted hardware, an automated software toolchain, and a cloud-assisted optimization service designed to make integration painless.
● Compiler & SDK
Model Deployment Pipeline
From standard ONNX input to NPU-executable bytecode, our toolchain handles graph optimization, quantization, operator fusion, memory planning, and code generation, so your team doesn't have to tune the model by hand.
02 · FRONTEND
Graph Parser
Parse computational graph and validate operator support.
03 · OPTIMIZER
Graph Optimization
Constant folding, dead-code elimination, operator pruning.
04 · OPTIMIZER
Quantization
INT8 / FP16 mixed-precision, per-channel calibration.
05 · OPTIMIZER
Operator Fusion
Conv + BN + ReLU fused. MatMul + Add + Activation fused.
06 · BACKEND
Memory Planning
SRAM allocation, DMA scheduling, double-buffer management.
07 · BACKEND
Code Generation
NPU instruction set, multi-model runtime descriptor.
08 · DEPLOY
NPU Bytecode
Ready-to-execute binary, runs on HonChen NPU IP.
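To make the Operator Fusion stage concrete, here is a minimal sketch of the standard arithmetic for folding BatchNorm into the preceding convolution, so that Conv + BN + ReLU can run as a single Conv + ReLU. It is not HonChen's implementation: the function names are hypothetical, and the "convolution" is reduced to one dot product per output channel to keep the example short.

```python
import math


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm parameters into conv weights and bias.

    weights: one flattened kernel (list of floats) per output channel.
    bias, gamma, beta, mean, var: one float per output channel.
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)             # BN multiplier for this channel
        folded_w.append([w * scale for w in w_c])  # w' = w * scale
        folded_b.append((b_c - mu) * scale + bt)   # b' = (b - mean) * scale + beta
    return folded_w, folded_b


def conv_channel(x, w_c, b_c):
    """Simplified convolution: a dot product plus bias for one output channel."""
    return sum(xi * wi for xi, wi in zip(x, w_c)) + b_c


def conv_bn_relu_reference(x, weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Unfused reference: Conv, then BatchNorm, then ReLU as three steps."""
    out = []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        y = conv_channel(x, w_c, b_c)
        y = g * (y - mu) / math.sqrt(v + eps) + bt
        out.append(max(0.0, y))
    return out


def conv_relu_fused(x, folded_w, folded_b):
    """Fused version: BatchNorm already lives inside the weights and bias."""
    return [max(0.0, conv_channel(x, w_c, b_c))
            for w_c, b_c in zip(folded_w, folded_b)]


if __name__ == "__main__":
    w = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
    b = [0.1, -0.2]
    gamma, beta = [1.2, 0.8], [0.05, -0.3]
    mean, var = [0.4, -0.1], [0.9, 1.6]
    x = [1.0, 2.0, -0.5]
    fw, fb = fold_batchnorm(w, b, gamma, beta, mean, var)
    print(conv_bn_relu_reference(x, w, b, gamma, beta, mean, var))
    print(conv_relu_fused(x, fw, fb))
```

The two printed results should agree up to floating-point rounding. The fold happens once at compile time, so inference skips the BatchNorm step entirely.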
◆
Multi-Model Runtime
The same NPU runs ASR → LLM → TTS as a pipeline, switching models at runtime with no recompilation or chip reset.
◆
Mixed-Precision Strategy
The toolchain automatically selects INT8 or FP16 per layer to balance accuracy and memory footprint. Quantization-aware fine-tuning is supported.
◆
Zero Manual Tuning
Customers supply an ONNX model and get an NPU-ready binary. Memory layout, kernel fusion, and instruction scheduling are fully automated.
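As an illustration of what a per-layer INT8 vs FP16 decision could look like, the sketch below quantizes each layer's weights to INT8 with one scale per channel, measures the round-trip error, and falls back to FP16 when the error exceeds a threshold. The selection rule, the error metric, the threshold, and every function name are assumptions made for this example. The actual toolchain's criteria are not described here and would more plausibly rely on calibration data.

```python
import math


def quantize_int8_per_channel(channels):
    """Symmetric INT8 quantization with a separate scale for each channel."""
    result = []
    for ch in channels:
        max_abs = max(abs(v) for v in ch)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = [max(-127, min(127, round(v / scale))) for v in ch]
        result.append((scale, q))
    return result


def dequantize(quantized):
    """Map INT8 codes back to floats using each channel's scale."""
    return [[code * scale for code in codes] for scale, codes in quantized]


def relative_error(original, restored):
    """Relative L2 error between original and round-tripped weights."""
    num = sum((a - b) ** 2
              for ch_o, ch_r in zip(original, restored)
              for a, b in zip(ch_o, ch_r))
    den = sum(a * a for ch in original for a in ch)
    return math.sqrt(num / den) if den > 0 else 0.0


def choose_precision(layers, max_rel_error=0.01):
    """Pick INT8 where the round-trip error is acceptable, FP16 otherwise.

    layers: dict mapping a layer name to its weights (list of channels).
    max_rel_error: hypothetical accuracy budget per layer.
    """
    plan = {}
    for name, channels in layers.items():
        restored = dequantize(quantize_int8_per_channel(channels))
        err = relative_error(channels, restored)
        plan[name] = "INT8" if err <= max_rel_error else "FP16"
    return plan


if __name__ == "__main__":
    layers = {
        "conv1": [[127.0, -127.0, 0.0], [64.0, 32.0, -16.0]],
        "attn_proj": [[1.0, 0.3, -0.004], [0.02, -0.9, 0.5]],
    }
    print(choose_precision(layers))
    print(choose_precision(layers, max_rel_error=0.0))
```

Per-channel scales keep a single large weight in one channel from degrading the resolution of every other channel, which is why the error is measured after per-channel rather than per-tensor quantization.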
● SoC Integration
How HonChen NPU IP
Drops into Your SoC
Industry-standard bus protocols. Clean clock and reset boundaries. Designed to integrate, not disrupt — your existing SoC pipeline keeps working.
Process-Independent · 55 nm → 2 nm
Multi-Precision INT8 + FP16
Multi-Model Runtime Switching
Custom AI Instruction Set
Low-Power Architecture
HonChen NPU IP
▲ AXI Master Interface ▲ to SoC fabric / DRAM
CNN Compute Engine
Conv2D · Depthwise · Pointwise · Pooling · Upsample · BatchNorm · Element-wise · Concat
Transformer Engine
MatMul · Multi-Head Attention · Softmax · LayerNorm · GELU · KV Cache · Embedding
Local SRAM Scratchpad
Weights / activations cache · double-buffer · streaming buffer · DMA-managed
Quantize / Activation
INT8 / FP16 · Per-channel · ReLU · Leaky ReLU · GELU · Softmax · Sigmoid · Tanh
5-Stage RISC-V Control Core
Custom AI instructions · Runtime scheduler · DMA orchestration · Multi-model switching · Clock gating
▼ APB Slave · Synchronous FIFO ▼ control + streaming data
AXI Master → SoC fabric / DRAM
APB Slave ← CPU control registers
Sync FIFO ↔ streaming data
Local SRAM · DDR optional
Clock + external power gating
HonChen NPU IP uses standardized interfaces that integrate directly into your existing SoC fabric: an AXI Master for memory access, an APB Slave for CPU control, and a synchronous FIFO for streaming data. No custom interconnect is required.
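The division of labor between the interfaces can be sketched from the host CPU's side: the CPU writes a few control registers over APB, and the NPU then fetches weights and activations on its own through the AXI Master. The example below simulates that flow against a mock register file. Every register name, offset, and bit position is hypothetical and chosen only for illustration. The real map is defined in the register reference that ships with the IP.

```python
# Hypothetical register map: offsets and bits are NOT taken from HonChen docs.
REG_CTRL = 0x00       # bit 0: START
REG_STATUS = 0x04     # bit 0: DONE
REG_DESC_ADDR = 0x08  # DRAM address of the runtime model descriptor

CTRL_START = 1 << 0
STATUS_DONE = 1 << 0


class MockNpuApb:
    """Stand-in for the APB slave: reports DONE after a few status polls."""

    def __init__(self, not_done_polls=3):
        self.regs = {REG_CTRL: 0, REG_STATUS: 0, REG_DESC_ADDR: 0}
        self._not_done_polls = not_done_polls
        self._remaining = None  # None means no job is running

    def write(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF
        if offset == REG_CTRL and value & CTRL_START:
            self.regs[REG_STATUS] &= ~STATUS_DONE  # new job clears DONE
            self._remaining = self._not_done_polls

    def read(self, offset):
        if offset == REG_STATUS and self._remaining is not None:
            if self._remaining == 0:
                self.regs[REG_STATUS] |= STATUS_DONE
                self._remaining = None
            else:
                self._remaining -= 1
        return self.regs[offset]


def run_model(bus, descriptor_addr, max_polls=1000):
    """Point the NPU at a model descriptor, start it, and poll for completion.

    Returns the number of status polls it took to observe DONE.
    """
    bus.write(REG_DESC_ADDR, descriptor_addr)
    bus.write(REG_CTRL, CTRL_START)
    for polls in range(1, max_polls + 1):
        if bus.read(REG_STATUS) & STATUS_DONE:
            return polls
    raise TimeoutError("NPU did not report completion")


if __name__ == "__main__":
    bus = MockNpuApb()
    # Switching models is just a new descriptor address: no reset in between.
    for name, addr in [("asr", 0x80000000), ("llm", 0x80100000)]:
        print(name, "finished after", run_model(bus, addr), "polls")
```

A production driver would normally wait on an interrupt rather than poll, but polling keeps the sketch self-contained.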
● Verification & Delivery
Engineering You Can
Actually Verify
25 years of IP development discipline applied to NPU design, with deliverables built for real integration and production.
FPGA PROTOTYPING
Pre-silicon validation on Xilinx-class FPGA boards for full workload regression
TESTED MODELS
Whisper small / tiny · Qwen 500M / 1.7B LLM · VITS · YOLOv3-tiny · YOLOv4-tiny · Face Detection · Face Angle — continuously expanding
ONNX COMPATIBLE
Standard model format input — bring your trained model from PyTorch, TensorFlow, or any ONNX exporter
BUILT-IN SELF-TEST
BIST mode for integration debugging and post-silicon production testing — verify the IP before tape-out and after
INTEGRATION SUPPORT
On-site / remote engineering hours during customer SoC integration phase
SDK + REFERENCE
Toolchain + model porting examples + integration reference platform
DOCUMENTATION
Integration guide · register reference · performance characterization · ONNX operator support matrix
MAINTENANCE
Annual subscription for bug fixes, minor enhancements, technical consultation
● HonChen Cloud Service
Model Optimization,
Remote & Effortless
A planned cloud-assisted service for customers: upload an ONNX model and receive an NPU-optimized binary ready to deploy on HonChen IP, with no local toolchain installation needed.
☁
HonChen Cloud Optimizer
★ Coming soon
How it works. Customers connect to HonChen's secure server, upload an ONNX model, and receive an optimized binary built for their target NPU configuration. No SDK installation, no local compute, no GPU farm: just the result.
Designed for ASIC service providers and OEM partners who need to iterate on multiple model variants quickly without maintaining the toolchain themselves.