Building a Bird Flock: From Canvas to WASM Components
In 1987, Craig Reynolds published a paper called “Flocks, Herds, and Schools: A Distributed Behavioral Model.” It described how three deceptively simple rules - applied independently to each agent - produce the mesmerizing, coordinated motion we see in bird flocks and fish schools.
Those three rules are:
- Separation - steer away from neighbors that are too close
- Alignment - steer toward the average heading of nearby neighbors
- Cohesion - steer toward the average position of nearby neighbors
No leader. No global plan. Just local rules producing emergent order.
This makes boids a perfect benchmark for exploring web performance. The simulation is embarrassingly parallel, computationally uniform, and scales predictably: double the boids, quadruple the neighbor comparisons. Let’s build it four different ways, from simplest to most scalable.
The live demo
Before we dive into the implementation, here’s the final result running live on this page - a Rust-compiled WebAssembly component rendering to Canvas 2D. Expand the parameters panel to adjust the flock in real-time.
Parameters
Behavior Presets
Approach 1: Pure TypeScript + Canvas 2D
The simplest approach is to implement everything in TypeScript. No build tools, no WASM, just a script tag and a canvas.
The data model
Each boid is an object with position and velocity:
interface Boid {
x: number;
y: number;
vx: number;
vy: number;
}
The update loop
For each boid, scan all other boids and accumulate the three steering forces:
function tick(boids: Boid[], params: Params, dt: number) {
for (const boid of boids) {
let sepX = 0, sepY = 0;
let aliVx = 0, aliVy = 0, aliCount = 0;
let cohX = 0, cohY = 0, cohCount = 0;
for (const other of boids) {
if (other === boid) continue;
const dx = other.x - boid.x;
const dy = other.y - boid.y;
const distSq = dx * dx + dy * dy;
// Guard distSq > 0: two coincident boids would otherwise divide by zero.
if (distSq > 0 && distSq < params.separationDistance ** 2) {
sepX -= dx / distSq;
sepY -= dy / distSq;
}
if (distSq < params.alignmentDistance ** 2) {
aliVx += other.vx;
aliVy += other.vy;
aliCount++;
}
if (distSq < params.cohesionDistance ** 2) {
cohX += other.x;
cohY += other.y;
cohCount++;
}
}
// Apply forces, clamp speed, update position...
boid.vx += sepX * params.separationStrength;
boid.vy += sepY * params.separationStrength;
// ... alignment and cohesion similarly
boid.x += boid.vx * dt;
boid.y += boid.vy * dt;
}
}
Tradeoffs
Pros:
- Zero toolchain complexity. Works in any browser, any bundler.
- Easy to debug - just console.log a boid.
- Hot-reloading is instant.
Cons:
- Each boid is a heap-allocated object. At 200 boids with 60fps, that’s a lot of GC pressure.
- The O(n²) neighbor scan runs in plain JavaScript. V8’s JIT helps, but you’ll feel the strain around 300-400 boids.
- No way to leverage SIMD or shared memory.
For a blog demo with ~200 boids, this works fine. But we can do better.
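The elided “apply forces, clamp speed” step deserves spelling out, since the clamp is where most first implementations go wrong. Here is a minimal sketch: the accumulator names mirror the tick() snippet above, while alignmentStrength, cohesionStrength, maxSpeed, and minSpeed are assumed parameter fields.

```typescript
// Sketch of the elided force application and speed clamp.
// Alignment steers toward the neighbors' average velocity; cohesion
// steers toward their average position; separation was accumulated already.
interface Boid { x: number; y: number; vx: number; vy: number; }

interface Accumulators {
  sepX: number; sepY: number;
  aliVx: number; aliVy: number; aliCount: number;
  cohX: number; cohY: number; cohCount: number;
}

interface ForceParams {
  separationStrength: number; alignmentStrength: number; cohesionStrength: number;
  maxSpeed: number; minSpeed: number;
}

function applyForces(boid: Boid, acc: Accumulators, p: ForceParams): void {
  boid.vx += acc.sepX * p.separationStrength;
  boid.vy += acc.sepY * p.separationStrength;
  if (acc.aliCount > 0) {
    // Steer toward the average neighbor velocity.
    boid.vx += (acc.aliVx / acc.aliCount - boid.vx) * p.alignmentStrength;
    boid.vy += (acc.aliVy / acc.aliCount - boid.vy) * p.alignmentStrength;
  }
  if (acc.cohCount > 0) {
    // Steer toward the average neighbor position.
    boid.vx += (acc.cohX / acc.cohCount - boid.x) * p.cohesionStrength;
    boid.vy += (acc.cohY / acc.cohCount - boid.y) * p.cohesionStrength;
  }
  // Clamp speed into [minSpeed, maxSpeed] while preserving heading.
  const speed = Math.hypot(boid.vx, boid.vy);
  if (speed > p.maxSpeed) {
    boid.vx *= p.maxSpeed / speed;
    boid.vy *= p.maxSpeed / speed;
  } else if (speed > 0 && speed < p.minSpeed) {
    boid.vx *= p.minSpeed / speed;
    boid.vy *= p.minSpeed / speed;
  }
}
```

The min-speed clamp matters for boids specifically: without it, cohesion can drag the flock to a standstill instead of producing continuous motion.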
Approach 2: Rust + wasm-bindgen
Moving the simulation into Rust and compiling to WebAssembly eliminates GC pressure and gives us predictable, near-native performance.
Flat memory layout
Instead of objects, we use a flat Vec<f32> where each boid occupies 4 consecutive floats:
// Boid i lives at index i * 4:
// state[i*4] = x
// state[i*4 + 1] = y
// state[i*4 + 2] = vx
// state[i*4 + 3] = vy
pub fn tick(state: &[f32], params: &Params, dt: f32) -> Vec<f32> {
let n = state.len() / 4;
let mut out = vec![0.0f32; n * 4];
for i in 0..n {
let xi = state[i * 4];
let yi = state[i * 4 + 1];
// ... accumulate forces, update position
}
out
}
This layout is cache-friendly: iterating over boids walks memory linearly instead of chasing pointers.
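From the host’s perspective, the same flat layout is just a Float32Array. A small TypeScript sketch of how to read one boid out of it - the STRIDE constant and getBoid helper are illustrative, not part of any generated bindings:

```typescript
// Host-side view of the flat layout: boid i's state is four consecutive
// floats in a Float32Array, so there are no per-boid heap objects.
const STRIDE = 4;

function getBoid(state: Float32Array, i: number) {
  const base = i * STRIDE;
  return {
    x: state[base],
    y: state[base + 1],
    vx: state[base + 2],
    vy: state[base + 3],
  };
}

// Two boids packed into one buffer: (10, 20) moving right, (30, 40) moving up.
const packed = new Float32Array([10, 20, 1, 0, 30, 40, 0, -1]);
```

The same buffer can be handed to the renderer directly, so drawing never needs to materialize boid objects either.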
The wasm-bindgen approach
With wasm-bindgen and wasm-pack, you’d expose functions directly:
#[wasm_bindgen]
pub fn init(num_boids: u32, width: f32, height: f32) -> Vec<f32> {
// ...
}
#[wasm_bindgen]
pub fn tick(state: &[f32], /* 13 individual params */) -> Vec<f32> {
// ...
}
Tradeoffs
Pros:
- 2-5x faster than TypeScript for the core simulation loop.
- No GC pauses - Rust manages its own memory.
- Predictable, consistent frame times.
- wasm-pack makes the build straightforward.
Cons:
- The interface is a flat list of function parameters. Add a new param? Change every call site.
- No schema for the interface - consumers need to read docs or source code.
- Tightly coupled to the JavaScript host: the #[wasm_bindgen] boundary is specific to JS.
This is where most Rust+WASM projects stop. But the WebAssembly Component Model offers something more interesting.
Approach 3: WASM Component Model
The Component Model introduces WIT (WebAssembly Interface Types) - a way to define typed, language-agnostic interfaces for WASM modules. Instead of exposing raw functions with primitive parameters, you define a contract.
The WIT interface
Here’s our boids simulation as a WIT world:
package blog:boids@0.1.0;
interface types {
record params {
num-boids: u32,
max-speed: f32,
min-speed: f32,
separation-distance: f32,
alignment-distance: f32,
cohesion-distance: f32,
separation-strength: f32,
alignment-strength: f32,
cohesion-strength: f32,
turn-margin: f32,
turn-factor: f32,
width: f32,
height: f32,
}
type state-buffer = list<f32>;
}
interface simulation {
use types.{params, state-buffer};
init: func(p: params) -> state-buffer;
tick: func(state: state-buffer, p: params, dt: f32) -> state-buffer;
resize: func(state: state-buffer, old-count: u32,
new-count: u32, p: params) -> state-buffer;
}
world boids {
export simulation;
}
This is a typed contract. The params record bundles all configuration into a single structured type. The state-buffer is an explicit alias making the flat-array convention part of the spec.
The toolchain
Building a WASM component from Rust uses cargo-component:
# Install the tools
rustup target add wasm32-wasip1
cargo install cargo-component
npm install -D @bytecodealliance/jco
# Build + transpile to ESM
cargo component build --release
npx jco transpile target/wasm32-wasip1/release/boids.wasm \
  -o src/lib/wasm/boids/ --name boids
cargo-component reads the WIT definition and generates Rust bindings via wit-bindgen. You implement a trait:
// cargo-component generates the `bindings` module from the WIT world
// (import paths follow the package and interface names).
mod bindings;
use bindings::exports::blog::boids::simulation::{Guest, Params};
struct Component;
impl Guest for Component {
fn init(p: Params) -> Vec<f32> {
boid::init(&p)
}
fn tick(state: Vec<f32>, p: Params, dt: f32) -> Vec<f32> {
boid::tick(&state, &p, dt)
}
fn resize(state: Vec<f32>, old_count: u32,
new_count: u32, p: Params) -> Vec<f32> {
boid::resize(&state, old_count, new_count, &p)
}
}
bindings::export!(Component with_types_in bindings);
jco transpile then converts the WASM component into browser-compatible ESM with TypeScript definitions. The generated types mirror the WIT exactly:
export interface Params {
numBoids: number;
maxSpeed: number;
separationDistance: number;
// ... all 13 fields, typed and documented
}
export function init(p: Params): Float32Array;
export function tick(state: Float32Array, p: Params, dt: number): Float32Array;
Using it from Svelte
The Svelte component dynamically imports the WASM (to avoid SSR issues) and runs the animation loop:
onMount(async () => {
// Dynamic import keeps the WASM out of the SSR bundle.
const { simulation } = await import('$lib/wasm/boids/boids');
let state = simulation.init(params);
let lastTime = performance.now();
const loop = (now: number) => {
const dt = Math.min((now - lastTime) / 1000, 0.05); // cap dt so a backgrounded tab can't cause a huge jump
lastTime = now;
state = simulation.tick(state, params, dt);
draw(state, ctx);
requestAnimationFrame(loop);
};
requestAnimationFrame(loop);
});
Parameter changes take effect on the next tick - no reinitialization needed (except when changing flock size, which uses resize()).
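Changing the flock size goes through the resize export declared in the WIT above. A sketch of how the host might wrap it - setFlockSize is an illustrative helper, not part of the generated bindings:

```typescript
// Grow or shrink the flock without restarting the simulation, using the
// resize export from the WIT interface: resize(state, old-count, new-count, p).
interface Simulation {
  resize(state: Float32Array, oldCount: number, newCount: number, p: unknown): Float32Array;
}

function setFlockSize(
  simulation: Simulation,
  state: Float32Array,
  params: { numBoids: number },
  newCount: number,
): Float32Array {
  const next = simulation.resize(state, params.numBoids, newCount, params);
  params.numBoids = newCount; // keep params in sync for subsequent ticks
  return next;
}
```

Existing boids keep their positions and velocities across the call; only the buffer length changes.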
Why the Component Model matters
The value isn’t just performance - it’s composability:
- Typed contract: The WIT file is the API documentation. Add a wind-strength field to params? The Rust compiler and TypeScript types both break until you handle it.
- Language-agnostic: Someone could reimplement the simulation in Go, C, or Python, targeting the same WIT interface. The Svelte frontend wouldn’t change.
- Composable: Components can be combined. You could compose a “predator” component with the boids component, both conforming to the same types interface.
- Versionable: The @0.1.0 in the package name means the interface can evolve with semver semantics.
Tradeoffs
Pros:
- All the performance benefits of Rust + WASM.
- Structured, typed, language-neutral interface.
- Future-proof: as WASI matures, the same component runs in browsers, servers, and edge runtimes.
Cons:
- Extra toolchain (cargo-component, jco).
- The WASI shims add some bundle weight (~60KB for the core WASM + JS glue).
- The component boundary has a small per-call overhead (data copying). Negligible for 200 boids (3.2KB/tick), but worth measuring at 1000+.
Shrinking the bundle
The initial WASM build was 76KB for the component, expanding to 58KB core WASM + 191KB JavaScript glue after jco transpile. That’s a lot for a simulation that’s fundamentally just arithmetic. Here’s how to cut it nearly in half.
Cargo release profile
Add size-focused settings to Cargo.toml:
[profile.release]
opt-level = "z" # optimize for size over speed
lto = true # link-time optimization across crates
codegen-units = 1 # single codegen unit for max optimization
panic = "abort" # no unwinding machinery
strip = true        # remove debug symbols
opt-level = "z" trades micro-optimizations for smaller code. For our boids loop — simple arithmetic, no complex control flow — the performance difference is negligible. lto = true combined with codegen-units = 1 is the big win: it lets LLVM see across crate boundaries (wit-bindgen, wit-bindgen-rt) and eliminate dead code aggressively.
Post-build optimization
After cargo component build --release, two more tools help:
# Strip custom sections (names, producers metadata)
wasm-tools strip boids.wasm -o boids.wasm
# Minify the JS glue code
npx jco transpile boids.wasm -o output/ --minify
wasm-tools strip removes metadata sections the browser doesn’t need. The --minify flag on jco transpile compresses the JavaScript glue from 191KB to 97KB.
Note: jco transpile -O runs wasm-opt on extracted core modules, but as of early 2026, wasm-opt doesn’t fully support trunc_sat instructions emitted by recent Rust toolchains. Once this lands, expect another 15-20% reduction.
Results
| Artifact | Before | After | Reduction |
|---|---|---|---|
| Component WASM | 76 KB | 59 KB | -22% |
| core.wasm | 58 KB | 40 KB | -33% |
| boids.js (glue) | 191 KB | 97 KB | -49% |
| Total bundle | 265 KB | 148 KB | -44% |
The remaining 97KB of JS glue is mostly WASI shims (@bytecodealliance/preview2-shim) that our boids module doesn’t actually use. As the Component Model matures and browsers gain native support, this overhead will disappear.
Composability: Custom behavior via WIT imports
So far, the WASM component only exports functionality. But the Component Model’s real power is bidirectional: a component can also import interfaces from the host. This lets the JavaScript side inject custom behavior into the simulation.
The customizer interface
We extend our WIT world with an imported customizer interface:
interface customizer {
/// Called per-boid after forces, before speed clamping.
modify-velocity: func(x: f32, y: f32, vx: f32, vy: f32,
speed: f32) -> tuple<f32, f32>;
/// Returns RGBA color for a boid based on its state.
boid-color: func(x: f32, y: f32, vx: f32, vy: f32,
speed: f32) -> tuple<f32, f32, f32, f32>;
}
world boids {
import customizer;
export simulation;
}
The import keyword is the key difference. The WASM component declares that it needs these functions; the host provides them. This is the opposite of export — it’s a contract that flows inward.
One-line Rust change
In the boids tick loop, after computing forces and before clamping speed, we call the imported function:
// Host customizer: modify velocity (e.g. add wind, vortex)
let speed = (nvx * nvx + nvy * nvy).sqrt();
(nvx, nvy) = customizer::modify_velocity(xi, yi, nvx, nvy, speed);
That’s it. wit-bindgen generates the import stubs automatically. The Rust code calls a function that doesn’t exist in Rust — it’s provided by JavaScript at runtime.
The JavaScript host implementation
The jco transpile command maps the import to a JS module:
npx jco transpile boids.wasm -o output/ \
  --map 'blog:boids/customizer@0.1.0=../../boids/customizer.js'
The customizer module exports swappable functions:
let modifier = (_x, _y, vx, vy, _speed) => [vx, vy]; // default: pass-through
export function setModifier(fn) { modifier = fn; }
export function modifyVelocity(x, y, vx, vy, speed) {
return modifier(x, y, vx, vy, speed);
}
Now anyone can change the flock’s behavior by calling setModifier with a new function. The demo above includes presets like “Vortex” (circular force toward center), “Wind” (constant rightward push), and “Predator” (repulsion from a point).
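As a sketch of what such presets can look like - the constants and the makeVortex helper here are illustrative, not the demo’s actual code:

```typescript
// Preset modifiers for the customizer hook: each takes a boid's position
// and velocity and returns a new velocity, matching modify-velocity in WIT.
type Modifier = (x: number, y: number, vx: number, vy: number, speed: number) => [number, number];

// "Wind": a constant rightward push applied every tick.
const wind: Modifier = (_x, _y, vx, vy) => [vx + 0.5, vy];

// "Vortex": a force perpendicular to the offset from (cx, cy),
// which curls the flock into circular motion around that point.
function makeVortex(cx: number, cy: number, strength: number): Modifier {
  return (x, y, vx, vy) => {
    const dx = x - cx, dy = y - cy;
    const d = Math.hypot(dx, dy) || 1; // avoid dividing by zero at the center
    return [vx - (dy / d) * strength, vy + (dx / d) * strength];
  };
}
```

A preset is then installed with setModifier(wind) or setModifier(makeVortex(width / 2, height / 2, 0.3)); the simulation picks it up on the next tick.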
Why this matters
This pattern is the Component Model’s answer to plugin systems:
- The WASM component is sandboxed. It can only call functions explicitly declared in the WIT interface. No filesystem access, no network, no globals — just the functions you provide.
- The interface is a contract. If you change modify-velocity to return three values instead of two, both the Rust compiler and the TypeScript types break. You can’t silently break the integration.
- It’s language-agnostic. The customizer could be implemented in JavaScript today, or compiled from Rust/Go/Python as a separate WASM component and composed using wac plug.
Each per-boid call across the component boundary adds overhead on the order of a microsecond. At 200 boids this is invisible; at 5000 boids it’s measurable. For hot paths at scale, a batch API (modify-velocities(state) -> state) would amortize the overhead.
Looking ahead: WebGPU compute shaders
All three approaches above share a limitation: the O(n²) neighbor scan runs on a single thread. Even WASM can only push this to about 1000-2000 boids at 60fps. For 10,000+ boids, we need parallelism.
WebGPU compute shaders can run the neighbor scan across thousands of GPU threads simultaneously. A WGSL compute shader for boid updates would look something like:
@group(0) @binding(0) var<storage, read> input: array<vec4f>;
@group(0) @binding(1) var<storage, read_write> output: array<vec4f>;
@compute @workgroup_size(64)
fn update(@builtin(global_invocation_id) id: vec3u) {
let i = id.x;
if (i >= arrayLength(&input)) { return; }
let pos = input[i].xy;
let vel = input[i].zw;
var sep = vec2f(0.0);
var ali = vec2f(0.0);
var coh = vec2f(0.0);
var aliCount = 0u;
var cohCount = 0u;
for (var j = 0u; j < arrayLength(&input); j++) {
if (j == i) { continue; }
let other = input[j];
let d = other.xy - pos;
let distSq = dot(d, d);
if (distSq < separationDist * separationDist) {
sep -= d / distSq;
}
// ... alignment and cohesion similarly
}
// Apply forces, clamp, write output
output[i] = vec4f(newPos, newVel);
}
Each boid’s update runs on a separate GPU thread. The O(n²) is still there per-boid, but with 64 boids per workgroup and thousands of concurrent invocations, 10,000 boids becomes feasible at 60fps.
WebGPU is available in Chrome, Edge, and Firefox (behind a flag). The API is more complex than Canvas 2D, but for large-scale particle simulations, it’s the endgame.
A hybrid approach is compelling: use the WASM Component Model for the simulation interface (WIT types, parameter management), but dispatch the heavy computation to a GPU compute shader when WebGPU is available, falling back to the WASM simulation otherwise.
Performance comparison
Rather than quoting static numbers, you can run the benchmark yourself on your own hardware. It measures average tick time (headless, no rendering) for both the pure TypeScript and WASM Component Model implementations at 100, 200, 500, 1000, and 5000 boids.
On a mid-range laptop, expect WASM to be 2-5x faster than TypeScript at larger flock sizes, with the gap widening as boid count increases. The Component Model has a slight overhead vs. raw wasm-bindgen due to the component boundary, but the difference is negligible in practice — well under 10%.
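A minimal sketch of this kind of harness - benchTicks is an illustrative helper that times any headless tick callback, not the benchmark’s actual code:

```typescript
// Times an arbitrary tick function at several flock sizes and reports
// the mean milliseconds per tick (headless: no rendering in the loop).
function benchTicks(
  tick: (numBoids: number) => void,
  boidCounts: number[],
  iterations = 100,
): Map<number, number> {
  const results = new Map<number, number>();
  for (const n of boidCounts) {
    tick(n); // one untimed warm-up call so the JIT settles first
    const start = performance.now();
    for (let i = 0; i < iterations; i++) tick(n);
    results.set(n, (performance.now() - start) / iterations);
  }
  return results;
}
```

Running the same harness against both the TypeScript tick and the WASM component’s tick keeps the comparison apples-to-apples, since rendering and GC noise from the page are excluded.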
Conclusion
The progression tells a story about the web platform’s evolution:
- TypeScript is the right starting point. Simple, debuggable, good enough for small demos.
- Rust + WASM unlocks consistent performance without GC surprises.
- The Component Model adds a typed interface contract. It’s not just about speed - it’s about making the simulation composable. Someone else can implement the WIT interface differently, or compose multiple components together, without touching the rendering code.
- WebGPU is the future for truly large-scale simulations, bringing GPU parallelism to the browser.
The WASM Component Model is still young - the toolchain has rough edges, and browser support requires jco transpilation. But the direction is clear: portable, composable, language-agnostic modules with typed interfaces. That’s worth building toward.
You can try the interactive demo or explore the source code for this implementation.