What is the Parallel Processing Unit (PPU)?
News – 28/07/25
Modern workloads are pushing server CPUs beyond their limits. Cloud infrastructure, AI inference, and data-intensive applications demand scalable parallelism, but today’s CPUs weren’t built for it.
That’s why we created the Parallel Processing Unit (PPU).
A powerful partner to the CPU
The PPU is a licensable IP block that integrates directly into a CPU chip. It's a general-purpose parallel co-processor that works alongside your CPU: not to replace it, but to unlock scalable, high-throughput performance.
Together, they form a heterogeneous architecture designed for the next era of compute: efficient, massively parallel, and software-adaptable.
The PPU is instruction set independent, compatible with Arm, x86, RISC-V and Power architectures, and supports a step-by-step migration path.
Why server CPUs hit a wall
Conventional CPUs scale by replicating processor cores that were originally designed for sequential computing. But as core counts increase, this approach hits structural limitations:
- Thread management overhead
- Memory bottlenecks and cache contention
- Poor scalability beyond a few dozen cores
And while GPUs, NPUs, and custom accelerators offer brute-force throughput, they're poorly suited to assisting the CPU with general-purpose or real-time workloads.
How the PPU changes the game
Flow’s PPU is based on a novel architectural model called Thick Control Flow (TCF). It introduces a new way to scale parallelism, without the overhead of traditional multicore designs.
Key benefits:
- Near-linear performance scalability (up to 256 cores)
- Dramatically simplified parallel execution and thread management
- Less synchronization overhead and faster memory throughput without coherency issues
- ISA independence across all leading architectures
Built for cloud, hyperscale, and beyond
The PPU is ideal for environments where parallel throughput, determinism, and energy efficiency are critical:
- Cloud providers & hyperscalers: Boost compute density and reduce energy cost per operation
- AI/ML workloads: Accelerate pre- and post-processing in LLM training
- Data centers & HPC systems: Deliver scalable performance for general-purpose tasks
- Custom CPU designs: Integrate as ready-to-license IP
A smarter path to parallel
The PPU isn’t a patch. It’s a platform for rethinking how we do parallel computing.
It empowers developers to choose which parts of their software to offload to the PPU, keeping the rest on the CPU. With the right workload fit, this unlocks at least 2× performance out of the box, and up to 100× when optimized.
For those ready to scale smarter, the PPU opens the door to next-generation performance.
Contact us for integration details and full performance results.
info@flow-computing.com