Raspberry Pi AI HAT+ Hits 26 TOPS for On-Device Inference

A 26 TOPS Hailo-8 NPU on a $110 PCIe HAT — strong for real-time vision on the Pi 5, useless for local LLMs.

By Mike Perry · Published 2026-06-14 · Last verified 2026-07-22 · 10 min read

The Raspberry Pi AI HAT+ at 26 TOPS doubles inference throughput for object detection and edge ML on Pi 5 — but still cannot run LLMs.

The Raspberry Pi AI HAT+ 26 TOPS is the higher-tier variant of Raspberry Pi's M.2-based neural accelerator HAT, built around the Hailo-8 NPU and rated at 26 trillion INT8 operations per second. As of 2026 it sells for roughly $110, drops onto a Pi 5 over PCIe, and is purpose-built for real-time vision inference — object detection, pose estimation, segmentation — not for running large language models. The lower-tier 13 TOPS version uses the Hailo-8L and costs around $70.

What's actually inside the AI HAT+?

The AI HAT+ is a small daughterboard that mates to the Raspberry Pi 5's PCIe x1 connector through the M.2 HAT+ form factor. Underneath the heatsink sits a single Hailo silicon part: either the Hailo-8L (13 TOPS, 8 megapixels per second of throughput at INT8) on the entry-tier $70 board, or the full Hailo-8 (26 TOPS) on the $110 board you came here to read about. There is no DRAM on the HAT itself — the Hailo silicon ships with on-die SRAM measured in single-digit megabytes, and your compiled model graph plus its weights have to fit inside that envelope after Hailo's compiler quantizes and tiles it. Anything that doesn't fit gets streamed across PCIe from Pi system memory, which is where your real-world frame rates start sliding off the published headline number.

You wire it the same way Raspberry Pi documents on the Raspberry Pi AI HAT+ product page: unscrew the GPIO standoffs on a Raspberry Pi 5, seat the HAT+ on the PCIe ribbon, screw it down, install hailo-all from apt, and the device shows up as a PCIe accelerator that GStreamer's hailonet element can target. Importantly, this is a Pi 5 part. If you are still running a Raspberry Pi 4 Model B 8GB, the AI HAT+ is not your upgrade path — there is no PCIe lane exposed on the Pi 4. Pi 4 owners shopping the maker shelf today should either accept CPU-only inference, hang a USB accelerator off a port, or plan a board upgrade.
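Once the board is seated and hailo-all is installed, a quick sanity check is to confirm that the accelerator actually enumerates on the PCIe bus. The helper below is a minimal sketch, not part of Raspberry Pi's documented procedure: it shells out to the standard lspci utility and looks for any line that mentions Hailo. The function names are hypothetical, and because the exact device string can differ between boards and firmware versions, the match is deliberately loose.

```python
import shutil
import subprocess
from typing import List


def find_hailo_lines(lspci_output: str) -> List[str]:
    """Return the lspci lines that mention Hailo (case-insensitive match)."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if "hailo" in line.lower()
    ]


def hailo_present() -> bool:
    """Run lspci and report whether any Hailo device is listed.

    Returns False if lspci is not installed or exits with an error.
    """
    if shutil.which("lspci") is None:
        return False
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return False
    return bool(find_hailo_lines(result.stdout))


if __name__ == "__main__":
    if hailo_present():
        print("Hailo accelerator detected on PCIe")
    else:
        print("No Hailo device found on PCIe")
```

If this reports nothing on a Pi 5, reseat the PCIe ribbon before suspecting the software stack.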

Hailo-8 architecture at 26 TOPS: how it compares to Coral and Jetson

Hailo's pitch with the 8-series is dataflow over the von Neumann model: instead of a fixed matrix-multiply unit that you feed serialized tensor ops, the Hailo-8 spreads a model graph across a grid of small compute clusters that pass activations to each other directly, with weights pinned next to the clusters that need them. The result, in 2026, is the best TOPS-per-watt figure you can buy on a hobby budget — about 2.6 TOPS per watt at the 10 W package envelope.

Accelerator	INT8 TOPS	Approx. price	Power envelope	Host requirement
Pi AI HAT+ (Hailo-8L)	13	$70	5 W	Raspberry Pi 5
Pi AI HAT+ 26 TOPS (Hailo-8)	26	$110	10 W	Raspberry Pi 5
Google Coral Edge TPU (USB)	4	$60	2 W	Any USB 3 host
NVIDIA Jetson Orin Nano 8GB	40	$250	7–15 W	Standalone SoM

The Coral is the cheap, easy comparison, but Google deprecated active development of the Edge TPU stack in 2024, the compiler tops out at TensorFlow Lite, and its 4 TOPS is now less than a third of what the 13 TOPS Hailo-8L delivers for $10 more. Coral is still fine for "detect a person every two seconds on a doorbell" hobby builds, but if you are starting a project in 2026, the AI HAT+ obsoletes the Coral on raw throughput and on toolchain longevity.

The Jetson Orin Nano is the serious comparison. NVIDIA's $250 module hits 40 TOPS INT8, gives you a full CUDA stack, and — crucially — has enough on-board LPDDR5 to run small quantized LLMs like Phi-3 Mini and Qwen2 1.5B at usable token rates, which the AI HAT+ cannot do at all. You pay for it in money (more than 2× the Hailo-8 board), in power (up to 15 W vs. ~10 W), and in form factor (the Orin Nano is a system-on-module, not a HAT — you also buy a carrier board). For a fleet of vision-only edge nodes the AI HAT+ wins on $/TOPS and $/node; for a single workstation that needs flexibility, Jetson still wins. Tom's Hardware tracks the broader board market if you want a non-vendor pulse-check on where each platform sits this quarter.
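The $/TOPS comparison is easy to reproduce from the table. The sketch below uses only the approximate prices and INT8 TOPS figures listed there. The class and function names are illustrative, and the Jetson price excludes the carrier board it also needs, so its real cost per node is higher than this shows.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Accelerator:
    name: str
    int8_tops: float
    price_usd: float  # approximate price from the comparison table


# Figures copied from the comparison table in this article.
ACCELERATORS = [
    Accelerator("Pi AI HAT+ (Hailo-8L)", 13, 70),
    Accelerator("Pi AI HAT+ 26 TOPS (Hailo-8)", 26, 110),
    Accelerator("Google Coral Edge TPU (USB)", 4, 60),
    Accelerator("NVIDIA Jetson Orin Nano 8GB", 40, 250),
]


def dollars_per_tops(acc: Accelerator) -> float:
    """Purchase price divided by rated INT8 TOPS."""
    return acc.price_usd / acc.int8_tops


def rank_by_value(accelerators: List[Accelerator]) -> List[Accelerator]:
    """Sort accelerators from cheapest to most expensive per TOPS."""
    return sorted(accelerators, key=dollars_per_tops)


if __name__ == "__main__":
    for acc in rank_by_value(ACCELERATORS):
        print(f"{acc.name:32s} ${dollars_per_tops(acc):5.2f} per TOPS")
```

On these figures the Hailo-8 board comes out at about $4.23 per TOPS, the Hailo-8L at about $5.38, the Jetson at $6.25, and the Coral at $15.00.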

Real benchmarks: object detection FPS on Pi 5 + AI HAT+

Hailo's own published numbers for YOLOv8n at 640×640 are 140 FPS on the Hailo-8L and 270 FPS on the Hailo-8. Those are pure-inference figures — model already loaded, batch of one, no decode, no draw, no tracker. When you build a real pipeline you eat that budget fast.

A realistic 1080p security-camera pipeline on a Pi 5 + AI HAT+ 26 TOPS looks like this: RTSP stream pulled in via FFmpeg/GStreamer, H.264 hardware-decoded by the Pi 5's VideoCore VII, scaled to 640×640, run through YOLOv8n on hailonet, then handed to a ByteTrack tracker on the Pi 5 CPU, then drawn onto the frame and re-encoded. Sustained throughput on that loop sits around 30 FPS per camera, with two cameras per HAT possible before tracker overhead on the four Cortex-A76 cores becomes the bottleneck. The Hailo NPU is mostly idle in that scenario — you are paying for headroom, not for raw frames.
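To put a number on "mostly idle", compare the frames the pipeline actually submits with the published pure-inference rate. This is a rough sketch under two stated assumptions: each camera frame triggers exactly one YOLOv8n inference, and the published FPS figure is a fair ceiling for the NPU. The function name is hypothetical.

```python
def npu_utilization(cameras: int, fps_per_camera: float, peak_inference_fps: float) -> float:
    """Fraction of the accelerator's pure-inference frame budget the pipeline uses.

    Assumes one inference per camera frame and that `peak_inference_fps`
    is the ceiling the NPU can sustain for this model.
    """
    if peak_inference_fps <= 0:
        raise ValueError("peak_inference_fps must be positive")
    return cameras * fps_per_camera / peak_inference_fps


if __name__ == "__main__":
    # Two 30 FPS cameras against Hailo's published YOLOv8n figures.
    print(f"Hailo-8:  {npu_utilization(2, 30, 270):.0%}")
    print(f"Hailo-8L: {npu_utilization(2, 30, 140):.0%}")
```

With two cameras at 30 FPS the Hailo-8 sits at roughly 22% of its YOLOv8n budget and the Hailo-8L at roughly 43%, which is why the CPU-side tracker, not the NPU, sets the limit in this pipeline.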

Where the 26 TOPS variant earns its $40 premium over the 13 TOPS sibling is when you stack heavier models: YOLOv8s instead of n, or pair a detector with a re-identification head, or run pose estimation alongside detection. The 8L runs out of compile-time SRAM budget on those stacked graphs; the 8 finishes them with margin. If your project is single-model, single-camera, the 8L is fine. If it is multi-model or multi-camera, buy the 8.

Use cases that finally make sense locally

Three project categories cross the line from "interesting demo" to "actually deploy" with the 26 TOPS HAT.

The first is local security cameras. A Raspberry Pi 5, the AI HAT+ 26 TOPS, and an enclosure like the Pironman 5 case gives you a self-contained, fanless-or-quiet-fan box that ingests two 1080p RTSP feeds, runs YOLOv8 with person/vehicle/package classes, and writes only events — not 24/7 footage — to local disk. No Frigate-on-an-RTX-3060 desktop. No paid cloud detection tier. Frigate, Viseron, and the Hailo-supported hailo_rt GStreamer path all support this configuration today.
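The "events, not 24/7 footage" behaviour comes down to a small filter sitting after the detector. The sketch below is one possible way to write it, not how Frigate or Viseron implement theirs: it keeps a detection only if its class is one you care about, its confidence clears a threshold, and no event for that class fired within a cooldown window. The threshold and cooldown values are hypothetical defaults.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

WATCHED_CLASSES = {"person", "vehicle", "package"}  # classes named in the text


@dataclass(frozen=True)
class Detection:
    timestamp_s: float
    label: str
    confidence: float


def select_events(
    detections: Iterable[Detection],
    min_confidence: float = 0.5,  # hypothetical threshold
    cooldown_s: float = 10.0,     # hypothetical per-class cooldown
) -> List[Detection]:
    """Return the detections worth writing to disk as events.

    A detection becomes an event when its label is watched, its confidence
    clears the threshold, and no event for the same label was emitted within
    the last `cooldown_s` seconds.
    """
    last_event_at: Dict[str, float] = {}
    events: List[Detection] = []
    for det in sorted(detections, key=lambda d: d.timestamp_s):
        if det.label not in WATCHED_CLASSES or det.confidence < min_confidence:
            continue
        previous = last_event_at.get(det.label)
        if previous is not None and det.timestamp_s - previous < cooldown_s:
            continue
        last_event_at[det.label] = det.timestamp_s
        events.append(det)
    return events
```

A person standing in frame for a minute then produces a handful of events rather than 1,800 identical ones, which is what keeps local disk usage manageable.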

The second is edge speech wake-word and keyword spotting. Hailo's compiler accepts the small Conformer and Wav2Vec2 encoder variants that the wake-word and keyword-spotting community has standardized on. You will not run Whisper-large on it, but openWakeWord and pocketsphinx-replacement keyword spotters run with single-digit-millisecond latency and free up the Pi CPU for the rest of the assistant stack.

The third is retro-gaming and video upscalers — a niche that has quietly become one of the most popular Pi 5 use cases. The 26 TOPS budget is enough to run a small ESRGAN or RealESRGAN variant at 480p input → 1080p output in real time, which is exactly the workload a Pi-based MiSTer-style emulation box or a CRT-feed digitizer wants. The retro maker scene has been wiring these into the Raspberry Pi Zero W kit form factor for years as proof-of-concepts; the AI HAT+ is the first piece of hardware that makes the upscaler real-time at 1080p output on a Pi-class machine.

How to wire it: M.2 HAT compatibility, power budget, thermals

Three gotchas burn first-time AI HAT+ buyers.

First, the M.2 connector on the AI HAT+ board is an M-key NVMe-style slot — but it is not for your SSD. The Hailo silicon is the M.2 card. If you also want NVMe boot, you need a different HAT (the M.2 HAT+ for storage) and you cannot stack both on the same PCIe lane without a switch. In practice most builders boot from a fast microSD or USB SSD on the AI HAT+ build and put NVMe on a second Pi.

Second, the power budget. A Pi 5 alone draws about 5 W idle and 8 W under all-core load. The AI HAT+ 26 TOPS at full inference duty adds another ~3.5 W on top, so a real build sustains roughly 8.5 W when the CPU is otherwise idle and about 11.5 W when all four cores are busy with decode and tracking, and it peaks higher during model load. Treat Raspberry Pi's official 27 W USB-C PSU as the floor, not as a recommendation. Third-party 5 V/3 A bricks will throttle the Pi under inference load and you will spend a week debugging "random frame drops" that are actually voltage warnings.
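The arithmetic behind that warning is short enough to write down. The sketch below uses the draw figures quoted above and treats a 5 V/3 A brick as a 15 W supply. The peripheral figure in the usage example is a hypothetical placeholder for whatever you hang off USB, not a measured value.

```python
PI5_IDLE_W = 5.0       # Pi 5 idle draw, from the text
PI5_LOAD_W = 8.0       # Pi 5 all-core load, from the text
HAT_INFERENCE_W = 3.5  # AI HAT+ 26 TOPS at full inference duty, from the text


def supply_watts(volts: float, amps: float) -> float:
    """Nominal power a supply can deliver."""
    return volts * amps


def headroom_w(supply_w: float, peripherals_w: float = 0.0, cpu_w: float = PI5_LOAD_W) -> float:
    """Watts left over once the Pi, the HAT, and any peripherals are running.

    A negative result means the supply is undersized for the build.
    """
    return supply_w - (cpu_w + HAT_INFERENCE_W + peripherals_w)


if __name__ == "__main__":
    brick = supply_watts(5.0, 3.0)  # third-party 5 V/3 A supply
    print(f"5 V/3 A brick, no peripherals:  {headroom_w(brick):+.1f} W")
    # 4.5 W is a hypothetical allowance for a USB SSD and camera.
    print(f"5 V/3 A brick, 4.5 W of USB:    {headroom_w(brick, peripherals_w=4.5):+.1f} W")
    print(f"Official 27 W PSU, 4.5 W of USB: {headroom_w(27.0, peripherals_w=4.5):+.1f} W")
```

A 15 W brick leaves 3.5 W of margin before any peripherals or model-load peaks, and a modest USB load pushes it negative. The 27 W supply keeps double-digit headroom in the same scenario.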

Third, thermals. The AI HAT+ ships with a passive heatsink that is enough for short bursts. Sustained 24/7 vision inference in a closed enclosure needs airflow. The official Pi 5 active cooler handles the CPU side; the HAT runs hot independently. The Pironman 5 case is one of the few off-the-shelf enclosures with adequate top-side airflow for the HAT specifically — it became briefly hard to find in early 2026 because of exactly this use case.

Limitations: no LLM inference, INT8/INT4 only, model conversion friction

Be honest about what this board cannot do.

It will not run an LLM. Hailo's compiler targets fixed-graph CNN and transformer-encoder workloads — image classifiers, detectors, segmenters, audio encoders. It does not handle autoregressive decode, which is the entire computational shape of a language model. There is no llama.cpp Hailo backend in 2026 and the architecture does not lend itself to one. If LLMs are on your roadmap, buy a Jetson Orin Nano or a small x86 box with a used GPU; do not buy the AI HAT+.

It is INT8-first, with some INT4 support on newer compiler builds. Your model gets quantized. For YOLO-family detectors that is fine — Hailo's Model Zoo ships pre-quantized variants and the mAP loss versus FP32 is in the 1–2 percentage-point range. For research models, especially ones with attention heads that hate quantization, expect to spend real engineering time on calibration data and per-layer quantization configs to recover accuracy.
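To see why calibration data matters, it helps to look at the simplest form of INT8 quantization. The sketch below is generic symmetric per-tensor quantization in plain Python. It illustrates the general idea only and is not a description of what Hailo's compiler does internally, which is more sophisticated. All function names are illustrative.

```python
from typing import List, Sequence


def calibrate_scale(calibration_values: Sequence[float]) -> float:
    """Pick a scale so the largest calibration magnitude maps to 127."""
    max_abs = max(abs(v) for v in calibration_values)
    if max_abs == 0:
        raise ValueError("calibration data must contain a non-zero value")
    return max_abs / 127.0


def quantize(values: Sequence[float], scale: float) -> List[int]:
    """Map floats to INT8 codes, clipping anything outside [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(codes: Sequence[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate float values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    # Calibration saw activations up to 2.0, so 2.0 maps to code 127.
    scale = calibrate_scale([-1.0, 0.5, 2.0])
    codes = quantize([0.3, 2.0, 4.0], scale)
    print(codes)                     # 4.0 is clipped to the same code as 2.0
    print(dequantize(codes, scale))
```

Values inside the calibrated range come back with an error of at most half a quantization step, but anything larger than the calibration set ever showed is clipped. That clipping is the accuracy loss you fight with better calibration data and per-layer configs.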

The conversion pipeline is real friction. You take an ONNX file, run it through the Hailo Dataflow Compiler, hand it a representative calibration dataset, and it spits out a .hef file you load on-device. For models from Hailo's Model Zoo this is a one-liner. For a custom model — say you trained your own YOLOv8 fine-tune — it is a half-day of compiler tuning the first time. Hailo links the Model Zoo and the compiler docs from its accelerator product page, and they are unusually good for a niche silicon vendor, but the friction is the cost. Plan for it.

Should you wait for the next revision?

The honest answer depends on what you are building.

If you are starting a 2026 project that needs on-device vision inference and you are committing to the Pi 5 platform, buy the AI HAT+ 26 TOPS now. It is the right hardware for the price tier, the software stack is mature enough to run in production, and there is no public Raspberry Pi/Hailo roadmap suggesting a successor in the next six months.

If you are a Pi 4 owner — including buyers of the very popular Raspberry Pi 4 Model B 8GB — this product is not for you, and the right move is to wait. Either wait for a USB or HAT-form-factor accelerator that targets the Pi 4's interfaces, or upgrade the host board to a Pi 5 when your build calls for it. Do not buy the AI HAT+ planning to "make it work" on a Pi 4. You will not.

If you are between a 13 TOPS Hailo-8L and a 26 TOPS Hailo-8, pay the $40 difference for the 8. The headroom is the product. The 8L exists for cost-sensitive single-model builds, but the 8 is what you want the moment you stack a second model or a second camera.

If you are between an AI HAT+ and a Jetson Orin Nano, the answer is the workload. Vision-only and edge-deployed at multiple nodes — AI HAT+. Workstation-style with mixed LLM and vision workloads on a single box — Jetson. You can quickly stack them side by side with our comparison view before committing the budget.

Bottom line

The Raspberry Pi AI HAT+ 26 TOPS is the first sub-$120 accelerator that makes real-time, multi-stream vision inference on a hobbyist board feel boring instead of heroic. As of 2026 it is the right buy for a Pi 5–based security cam, a small fleet of edge vision nodes, or a retro-gaming upscaler. It is the wrong buy for LLM tinkering, for Pi 4 owners, or for anyone unwilling to invest a day in the Hailo compiler. Pay the extra $40 over the 13 TOPS variant, budget for the 27 W PSU and real airflow, and skip the Coral.

Frequently asked questions

What does 26 TOPS mean for a Raspberry Pi?

TOPS stands for trillions of operations per second, counted here at INT8 precision. At 26 TOPS the AI HAT+ can run real-time object detection and pose estimation on a camera feed locally, work that a bare Pi CPU handles only at a few frames per second. It targets vision inference, not large language models.

Can the AI HAT+ run large language models locally?

Not meaningfully. The Hailo NPU is optimized for quantized vision and audio models, not the autoregressive decoding and large memory footprints LLMs need. For local LLM experiments on a Pi-class budget you are still better off running small quantized models from system RAM on the Pi's own CPU (a Raspberry Pi 4 8GB is enough to start), accepting low tokens per second.

Does the AI HAT+ work with the Raspberry Pi 4?

The AI HAT+ is designed around the Pi 5's PCIe interface, and the Pi 4 exposes no PCIe lane, so the HAT does not work on a Pi 4. Pi 4 owners who want local AI typically rely on the CPU or a USB accelerator instead. If on-device 26-TOPS vision is the goal, plan the build around the newer board.

Is on-device inference better than cloud for camera projects?

For privacy and latency, yes. Running detection locally means video frames never leave the device, which matters for home security and always-on cameras, and it removes per-call cloud cost. The tradeoff is model size: edge NPUs run smaller, quantized models than a datacenter GPU, so accuracy on hard scenes can be lower.

What else do I need besides the HAT to start?

A compatible Raspberry Pi, a quality power supply, adequate cooling because sustained inference raises board temperature, and a camera module or USB camera for vision projects. Storage matters too — a fast microSD or SSD keeps model loading and logging responsive on an always-on edge-AI build.
