Building a Bird Flock: From Canvas to WASM Components
In 1987, Craig Reynolds published a paper called “Flocks, Herds, and Schools: A Distributed Behavioral Model.” It described how three deceptively simple rules - applied independently to each agent - produce the mesmerizing, coordinated motion we see in bird flocks and fish schools.
Those three rules are:
- Separation - steer away from neighbors that are too close
- Alignment - steer toward the average heading of nearby neighbors
- Cohesion - steer toward the average position of nearby neighbors
No leader. No global plan. Just local rules producing emergent order.
This makes boids a perfect benchmark for exploring web performance. The simulation is embarrassingly parallel, computationally uniform, and scales predictably: double the boids, quadruple the neighbor comparisons. Let’s build it four different ways, from simplest to most scalable.
The live demo
Before we dive into the implementation, here’s the final result running live on this page - a Rust-compiled WebAssembly component rendering to Canvas 2D. Expand the parameters panel to adjust the flock in real-time.
Parameters
Behavior Presets
Approach 1: Pure TypeScript + Canvas 2D
The simplest approach is to implement everything in TypeScript. No build tools, no WASM, just a script tag and a canvas.
The data model
Each boid is an object with position and velocity:
interface Boid {
x: number;
y: number;
vx: number;
vy: number;
}
The update loop
For each boid, scan all other boids and accumulate the three steering forces:
function tick(boids: Boid[], params: Params, dt: number) {
for (const boid of boids) {
let sepX = 0, sepY = 0;
let aliVx = 0, aliVy = 0, aliCount = 0;
let cohX = 0, cohY = 0, cohCount = 0;
for (const other of boids) {
if (other === boid) continue;
const dx = other.x - boid.x;
const dy = other.y - boid.y;
const distSq = dx * dx + dy * dy;
// Guard distSq > 0: two coincident boids would otherwise divide by zero.
if (distSq > 0 && distSq < params.separationDistance ** 2) {
sepX -= dx / distSq;
sepY -= dy / distSq;
}
if (distSq < params.alignmentDistance ** 2) {
aliVx += other.vx;
aliVy += other.vy;
aliCount++;
}
if (distSq < params.cohesionDistance ** 2) {
cohX += other.x;
cohY += other.y;
cohCount++;
}
}
// Apply forces, clamp speed, update position...
boid.vx += sepX * params.separationStrength;
boid.vy += sepY * params.separationStrength;
// ... alignment and cohesion similarly
boid.x += boid.vx * dt;
boid.y += boid.vy * dt;
}
}
Tradeoffs
Pros:
- Zero toolchain complexity. Works in any browser, any bundler.
- Easy to debug - just console.log a boid.
- Hot-reloading is instant.
Cons:
- Each boid is a heap-allocated object. At 200 boids with 60fps, that’s a lot of GC pressure.
- The O(n²) neighbor scan runs in plain JavaScript. V8’s JIT helps, but you’ll feel the strain around 300-400 boids.
- No way to leverage SIMD or shared memory.
For a blog demo with ~200 boids, this works fine. But we can do better.
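The elided “apply forces, clamp speed” step deserves spelling out, since the clamp is where most first implementations go wrong. Here is a minimal sketch: the accumulator names mirror the tick() snippet above, while alignmentStrength, cohesionStrength, maxSpeed, and minSpeed are assumed parameter fields.

```typescript
// Sketch of the elided force application and speed clamp.
// Alignment steers toward the neighbors' average velocity; cohesion
// steers toward their average position; separation was accumulated already.
interface Boid { x: number; y: number; vx: number; vy: number; }

interface Accumulators {
  sepX: number; sepY: number;
  aliVx: number; aliVy: number; aliCount: number;
  cohX: number; cohY: number; cohCount: number;
}

interface ForceParams {
  separationStrength: number; alignmentStrength: number; cohesionStrength: number;
  maxSpeed: number; minSpeed: number;
}

function applyForces(boid: Boid, acc: Accumulators, p: ForceParams): void {
  boid.vx += acc.sepX * p.separationStrength;
  boid.vy += acc.sepY * p.separationStrength;
  if (acc.aliCount > 0) {
    // Steer toward the average neighbor velocity.
    boid.vx += (acc.aliVx / acc.aliCount - boid.vx) * p.alignmentStrength;
    boid.vy += (acc.aliVy / acc.aliCount - boid.vy) * p.alignmentStrength;
  }
  if (acc.cohCount > 0) {
    // Steer toward the average neighbor position.
    boid.vx += (acc.cohX / acc.cohCount - boid.x) * p.cohesionStrength;
    boid.vy += (acc.cohY / acc.cohCount - boid.y) * p.cohesionStrength;
  }
  // Clamp speed into [minSpeed, maxSpeed] while preserving heading.
  const speed = Math.hypot(boid.vx, boid.vy);
  if (speed > p.maxSpeed) {
    boid.vx *= p.maxSpeed / speed;
    boid.vy *= p.maxSpeed / speed;
  } else if (speed > 0 && speed < p.minSpeed) {
    boid.vx *= p.minSpeed / speed;
    boid.vy *= p.minSpeed / speed;
  }
}
```

The min-speed clamp matters for boids specifically: without it, cohesion can drag the flock to a standstill instead of producing continuous motion.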
Approach 2: Rust + wasm-bindgen
Moving the simulation into Rust and compiling to WebAssembly eliminates GC pressure and gives us predictable, near-native performance.
Flat memory layout
Instead of objects, we use a flat Vec<f32> where each boid occupies 4 consecutive floats:
// Boid i lives at index i * 4:
// state[i*4] = x
// state[i*4 + 1] = y
// state[i*4 + 2] = vx
// state[i*4 + 3] = vy
pub fn tick(state: &[f32], params: &Params, dt: f32) -> Vec<f32> {
let n = state.len() / 4;
let mut out = vec![0.0f32; n * 4];
for i in 0..n {
let xi = state[i * 4];
let yi = state[i * 4 + 1];
// ... accumulate forces, update position
}
out
}
This layout is cache-friendly: iterating over boids walks memory linearly instead of chasing pointers.
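From the host’s perspective, the same flat layout is just a Float32Array. A small TypeScript sketch of how to read one boid out of it - the STRIDE constant and getBoid helper are illustrative, not part of any generated bindings:

```typescript
// Host-side view of the flat layout: boid i's state is four consecutive
// floats in a Float32Array, so there are no per-boid heap objects.
const STRIDE = 4;

function getBoid(state: Float32Array, i: number) {
  const base = i * STRIDE;
  return {
    x: state[base],
    y: state[base + 1],
    vx: state[base + 2],
    vy: state[base + 3],
  };
}

// Two boids packed into one buffer: (10, 20) moving right, (30, 40) moving up.
const packed = new Float32Array([10, 20, 1, 0, 30, 40, 0, -1]);
```

The same buffer can be handed to the renderer directly, so drawing never needs to materialize boid objects either.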
The wasm-bindgen approach
With wasm-bindgen and wasm-pack, you’d expose functions directly:
#[wasm_bindgen]
pub fn init(num_boids: u32, width: f32, height: f32) -> Vec<f32> {
// ...
}
#[wasm_bindgen]
pub fn tick(state: &[f32], /* 13 individual params */) -> Vec<f32> {
// ...
}
Tradeoffs
Pros:
- 2-5x faster than TypeScript for the core simulation loop.
- No GC pauses - Rust manages its own memory.
- Predictable, consistent frame times.
- wasm-pack makes the build straightforward.
Cons:
- The interface is a flat list of function parameters. Add a new param? Change every call site.
- No schema for the interface - consumers need to read docs or source code.
- Tightly coupled to the JavaScript host: the #[wasm_bindgen] boundary is specific to JS.
This is where most Rust+WASM projects stop. But the WebAssembly Component Model offers something more interesting.
Approach 3: WASM Component Model
The Component Model introduces WIT (WebAssembly Interface Types) - a way to define typed, language-agnostic interfaces for WASM modules. Instead of exposing raw functions with primitive parameters, you define a contract.
The WIT interface
Here’s our boids simulation as a WIT world:
package blog:boids@0.1.0;
interface types {
record params {
num-boids: u32,
max-speed: f32,
min-speed: f32,
separation-distance: f32,
alignment-distance: f32,
cohesion-distance: f32,
separation-strength: f32,
alignment-strength: f32,
cohesion-strength: f32,
turn-margin: f32,
turn-factor: f32,
width: f32,
height: f32,
}
type state-buffer = list<f32>;
}
interface simulation {
use types.{params, state-buffer};
init: func(p: params) -> state-buffer;
tick: func(state: state-buffer, p: params, dt: f32) -> state-buffer;
resize: func(state: state-buffer, old-count: u32,
new-count: u32, p: params) -> state-buffer;
}
world boids {
export simulation;
}
This is a typed contract. The params record bundles all configuration into a single structured type. The state-buffer is an explicit alias making the flat-array convention part of the spec.
The toolchain
Building a WASM component from Rust uses cargo-component:
# Install the tools
rustup target add wasm32-wasip1
cargo install cargo-component
npm install -D @bytecodealliance/jco
# Build + transpile to ESM
cargo component build --release
npx jco transpile target/wasm32-wasip1/release/boids.wasm \
  -o src/lib/wasm/boids/ --name boids
cargo-component reads the WIT definition and generates Rust bindings via wit-bindgen. You implement a trait:
// cargo-component generates the `bindings` module from the WIT world
// (import paths follow the package and interface names).
mod bindings;
use bindings::exports::blog::boids::simulation::{Guest, Params};
struct Component;
impl Guest for Component {
fn init(p: Params) -> Vec<f32> {
boid::init(&p)
}
fn tick(state: Vec<f32>, p: Params, dt: f32) -> Vec<f32> {
boid::tick(&state, &p, dt)
}
fn resize(state: Vec<f32>, old_count: u32,
new_count: u32, p: Params) -> Vec<f32> {
boid::resize(&state, old_count, new_count, &p)
}
}
bindings::export!(Component with_types_in bindings);
jco transpile then converts the WASM component into browser-compatible ESM with TypeScript definitions. The generated types mirror the WIT exactly:
export interface Params {
numBoids: number;
maxSpeed: number;
separationDistance: number;
// ... all 13 fields, typed and documented
}
export function init(p: Params): Float32Array;
export function tick(state: Float32Array, p: Params, dt: number): Float32Array;
Using it from Svelte
The Svelte component dynamically imports the WASM (to avoid SSR issues) and runs the animation loop:
onMount(async () => {
// Dynamic import keeps the WASM out of the SSR bundle.
const { simulation } = await import('$lib/wasm/boids/boids');
let state = simulation.init(params);
let lastTime = performance.now();
const loop = (now: number) => {
const dt = Math.min((now - lastTime) / 1000, 0.05); // cap dt so a backgrounded tab can't cause a huge jump
lastTime = now;
state = simulation.tick(state, params, dt);
draw(state, ctx);
requestAnimationFrame(loop);
};
requestAnimationFrame(loop);
});
Parameter changes take effect on the next tick - no reinitialization needed (except when changing flock size, which uses resize()).
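Changing the flock size goes through the resize export declared in the WIT above. A sketch of how the host might wrap it - setFlockSize is an illustrative helper, not part of the generated bindings:

```typescript
// Grow or shrink the flock without restarting the simulation, using the
// resize export from the WIT interface: resize(state, old-count, new-count, p).
interface Simulation {
  resize(state: Float32Array, oldCount: number, newCount: number, p: unknown): Float32Array;
}

function setFlockSize(
  simulation: Simulation,
  state: Float32Array,
  params: { numBoids: number },
  newCount: number,
): Float32Array {
  const next = simulation.resize(state, params.numBoids, newCount, params);
  params.numBoids = newCount; // keep params in sync for subsequent ticks
  return next;
}
```

Existing boids keep their positions and velocities across the call; only the buffer length changes.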
Why the Component Model matters
The value isn’t just performance - it’s composability:
- Typed contract: The WIT file is the API documentation. Add a wind-strength field to params? The Rust compiler and TypeScript types both break until you handle it.
- Language-agnostic: Someone could reimplement the simulation in Go, C, or Python, targeting the same WIT interface. The Svelte frontend wouldn’t change.
- Composable: Components can be combined. You could compose a “predator” component with the boids component, both conforming to the same types interface.
- Versionable: The @0.1.0 in the package name means the interface can evolve with semver semantics.
Tradeoffs
Pros:
- All the performance benefits of Rust + WASM.
- Structured, typed, language-neutral interface.
- Future-proof: as WASI matures, the same component runs in browsers, servers, and edge runtimes.
Cons:
- Extra toolchain (cargo-component, jco).
- The WASI shims add some bundle weight (~60KB for the core WASM + JS glue).
- The component boundary has a small per-call overhead (data copying). Negligible for 200 boids (3.2KB/tick), but worth measuring at 1000+.
Shrinking the bundle
The initial WASM build was 76KB for the component, expanding to 58KB core WASM + 191KB JavaScript glue after jco transpile. That’s a lot for a simulation that’s fundamentally just arithmetic. Here’s how to cut it nearly in half.
Cargo release profile
Add size-focused settings to Cargo.toml:
[profile.release]
opt-level = "z" # optimize for size over speed
lto = true # link-time optimization across crates
codegen-units = 1 # single codegen unit for max optimization
panic = "abort" # no unwinding machinery
strip = true        # remove debug symbols
opt-level = "z" trades micro-optimizations for smaller code. For our boids loop — simple arithmetic, no complex control flow — the performance difference is negligible. lto = true combined with codegen-units = 1 is the big win: it lets LLVM see across crate boundaries (wit-bindgen, wit-bindgen-rt) and eliminate dead code aggressively.
Post-build optimization
After cargo component build --release, two more tools help:
# Strip custom sections (names, producers metadata)
wasm-tools strip boids.wasm -o boids.wasm
# Minify the JS glue code
npx jco transpile boids.wasm -o output/ --minify
wasm-tools strip removes metadata sections the browser doesn’t need. The --minify flag on jco transpile compresses the JavaScript glue from 191KB to 97KB.
Note: jco transpile -O runs wasm-opt on extracted core modules, but as of early 2026, wasm-opt doesn’t fully support trunc_sat instructions emitted by recent Rust toolchains. Once this lands, expect another 15-20% reduction.
Results
| Artifact | Before | After | Reduction |
|---|---|---|---|
| Component WASM | 76 KB | 59 KB | -22% |
| core.wasm | 58 KB | 40 KB | -33% |
| boids.js (glue) | 191 KB | 97 KB | -49% |
| Total bundle | 265 KB | 148 KB | -44% |
The remaining 97KB of JS glue is mostly WASI shims (@bytecodealliance/preview2-shim) that our boids module doesn’t actually use. As the Component Model matures and browsers gain native support, this overhead will disappear.
Composability: Custom behavior via WIT imports
So far, the WASM component only exports functionality. But the Component Model’s real power is bidirectional: a component can also import interfaces from the host. This lets the JavaScript side inject custom behavior into the simulation.
The customizer interface
We extend our WIT world with an imported customizer interface:
interface customizer {
/// Called per-boid after forces, before speed clamping.
modify-velocity: func(x: f32, y: f32, vx: f32, vy: f32,
speed: f32) -> tuple<f32, f32>;
/// Returns RGBA color for a boid based on its state.
boid-color: func(x: f32, y: f32, vx: f32, vy: f32,
speed: f32) -> tuple<f32, f32, f32, f32>;
}
world boids {
import customizer;
export simulation;
}
The import keyword is the key difference. The WASM component declares that it needs these functions; the host provides them. This is the opposite of export — it’s a contract that flows inward.
One-line Rust change
In the boids tick loop, after computing forces and before clamping speed, we call the imported function:
// Host customizer: modify velocity (e.g. add wind, vortex)
let speed = (nvx * nvx + nvy * nvy).sqrt();
(nvx, nvy) = customizer::modify_velocity(xi, yi, nvx, nvy, speed);
That’s it. wit-bindgen generates the import stubs automatically. The Rust code calls a function that doesn’t exist in Rust — it’s provided by JavaScript at runtime.
The JavaScript host implementation
The jco transpile command maps the import to a JS module:
npx jco transpile boids.wasm -o output/ \
  --map 'blog:boids/customizer@0.1.0=../../boids/customizer.js'
The customizer module exports swappable functions:
let modifier = (_x, _y, vx, vy, _speed) => [vx, vy]; // default: pass-through
export function setModifier(fn) { modifier = fn; }
export function modifyVelocity(x, y, vx, vy, speed) {
return modifier(x, y, vx, vy, speed);
}
Now anyone can change the flock’s behavior by calling setModifier with a new function. The demo above includes presets like “Vortex” (circular force toward center), “Wind” (constant rightward push), and “Predator” (repulsion from a point).
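As a sketch of what such presets can look like - the constants and the makeVortex helper here are illustrative, not the demo’s actual code:

```typescript
// Preset modifiers for the customizer hook: each takes a boid's position
// and velocity and returns a new velocity, matching modify-velocity in WIT.
type Modifier = (x: number, y: number, vx: number, vy: number, speed: number) => [number, number];

// "Wind": a constant rightward push applied every tick.
const wind: Modifier = (_x, _y, vx, vy) => [vx + 0.5, vy];

// "Vortex": a force perpendicular to the offset from (cx, cy),
// which curls the flock into circular motion around that point.
function makeVortex(cx: number, cy: number, strength: number): Modifier {
  return (x, y, vx, vy) => {
    const dx = x - cx, dy = y - cy;
    const d = Math.hypot(dx, dy) || 1; // avoid dividing by zero at the center
    return [vx - (dy / d) * strength, vy + (dx / d) * strength];
  };
}
```

A preset is then installed with setModifier(wind) or setModifier(makeVortex(width / 2, height / 2, 0.3)); the simulation picks it up on the next tick.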
Why this matters
This pattern is the Component Model’s answer to plugin systems:
- The WASM component is sandboxed. It can only call functions explicitly declared in the WIT interface. No filesystem access, no network, no globals — just the functions you provide.
- The interface is a contract. If you change modify-velocity to return three values instead of two, both the Rust compiler and the TypeScript types break. You can’t silently break the integration.
- It’s language-agnostic. The customizer could be implemented in JavaScript today, or compiled from Rust/Go/Python as a separate WASM component and composed using wac plug.
Each per-boid call across the component boundary adds overhead on the order of a microsecond. At 200 boids this is invisible; at 5000 boids it’s measurable. For hot paths at scale, a batch API (modify-velocities(state) -> state) would amortize the overhead.
Looking ahead: WebGPU compute shaders
All three approaches above share a limitation: the O(n²) neighbor scan runs on a single thread. Even WASM can only push this to about 1000-2000 boids at 60fps. For 10,000+ boids, we need parallelism.
WebGPU compute shaders can run the neighbor scan across thousands of GPU threads simultaneously. A WGSL compute shader for boid updates would look something like:
@group(0) @binding(0) var<storage, read> input: array<vec4f>;
@group(0) @binding(1) var<storage, read_write> output: array<vec4f>;
@compute @workgroup_size(64)
fn update(@builtin(global_invocation_id) id: vec3u) {
let i = id.x;
if (i >= arrayLength(&input)) { return; }
let pos = input[i].xy;
let vel = input[i].zw;
var sep = vec2f(0.0);
var ali = vec2f(0.0);
var coh = vec2f(0.0);
var aliCount = 0u;
var cohCount = 0u;
for (var j = 0u; j < arrayLength(&input); j++) {
if (j == i) { continue; }
let other = input[j];
let d = other.xy - pos;
let distSq = dot(d, d);
if (distSq < separationDist * separationDist) {
sep -= d / distSq;
}
// ... alignment and cohesion similarly
}
// Apply forces, clamp, write output
output[i] = vec4f(newPos, newVel);
}
Each boid’s update runs on a separate GPU thread. The O(n²) is still there per-boid, but with 64 boids per workgroup and thousands of concurrent invocations, 10,000 boids becomes feasible at 60fps.
WebGPU is available in Chrome, Edge, and Firefox (behind a flag). The API is more complex than Canvas 2D, but for large-scale particle simulations, it’s the endgame.
A hybrid approach is compelling: use the WASM Component Model for the simulation interface (WIT types, parameter management), but dispatch the heavy computation to a GPU compute shader when WebGPU is available, falling back to the WASM simulation otherwise.
Performance comparison
Rather than quoting static numbers, you can run the benchmark yourself on your own hardware. It measures average tick time (headless, no rendering) for both the pure TypeScript and WASM Component Model implementations at 100, 200, 500, 1000, and 5000 boids.
On a mid-range laptop, expect WASM to be 2-5x faster than TypeScript at larger flock sizes, with the gap widening as boid count increases. The Component Model has a slight overhead vs. raw wasm-bindgen due to the component boundary, but the difference is negligible in practice — well under 10%.
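A minimal sketch of this kind of harness - benchTicks is an illustrative helper that times any headless tick callback, not the benchmark’s actual code:

```typescript
// Times an arbitrary tick function at several flock sizes and reports
// the mean milliseconds per tick (headless: no rendering in the loop).
function benchTicks(
  tick: (numBoids: number) => void,
  boidCounts: number[],
  iterations = 100,
): Map<number, number> {
  const results = new Map<number, number>();
  for (const n of boidCounts) {
    tick(n); // one untimed warm-up call so the JIT settles first
    const start = performance.now();
    for (let i = 0; i < iterations; i++) tick(n);
    results.set(n, (performance.now() - start) / iterations);
  }
  return results;
}
```

Running the same harness against both the TypeScript tick and the WASM component’s tick keeps the comparison apples-to-apples, since rendering and GC noise from the page are excluded.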
Conclusion
The progression tells a story about the web platform’s evolution:
- TypeScript is the right starting point. Simple, debuggable, good enough for small demos.
- Rust + WASM unlocks consistent performance without GC surprises.
- The Component Model adds a typed interface contract. It’s not just about speed - it’s about making the simulation composable. Someone else can implement the WIT interface differently, or compose multiple components together, without touching the rendering code.
- WebGPU is the future for truly large-scale simulations, bringing GPU parallelism to the browser.
The WASM Component Model is still young - the toolchain has rough edges, and browser support requires jco transpilation. But the direction is clear: portable, composable, language-agnostic modules with typed interfaces. That’s worth building toward.
You can try the interactive demo or explore the source code for this implementation.