⚠️ Beta State

PyBevy is in an early and experimental stage. The API is incomplete, subject to breaking changes without notice, and you should expect bugs. Many features are still under development.

ViewColumn and Numba JIT

Near-native performance with compiled kernels and zero-copy ECS access.

Introduction

For maximum performance, PyBevy's ViewColumn API gives you direct zero-copy pointer access to ECS component storage. Combined with Numba's LLVM JIT compiler, you can write Python functions that compile to native machine code and run across multiple CPU cores.

This is PyBevy's fastest tier — achieving ~34% of native Rust speed, or about 500x faster than traditional Query iteration.

Setup

ViewColumn requires the Numba package. Install it alongside PyBevy with pip install pybevy[jit] (or separately with pip install numba).

Numba compiles Python functions to native machine code using LLVM. The first call to a JIT-compiled function has a small warm-up cost; subsequent calls run at native speed.
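The warm-up cost is easy to see by timing the first and second calls to a JIT-compiled function. A minimal sketch using plain Numba and NumPy (no PyBevy required; the `scale` function is illustrative, and a plain-Python fallback is used when Numba is not installed):

```python
import time

import numpy as np

try:
    import numba
except ImportError:
    numba = None

# Plain-Python fallback so the sketch still runs without Numba installed.
jit = numba.jit(nopython=True) if numba is not None else (lambda f: f)

@jit
def scale(values, factor):
    # Multiply every element in place.
    for i in range(len(values)):
        values[i] = values[i] * factor

data = np.ones(1_000, dtype=np.float64)

start = time.perf_counter()
scale(data, 2.0)  # first call pays the one-time compilation cost
first_call = time.perf_counter() - start

start = time.perf_counter()
scale(data, 2.0)  # already compiled: runs at native speed
second_call = time.perf_counter() - start

print(f"first: {first_call:.4f}s  second: {second_call:.6f}s")
```

In practice the first call is dominated by LLVM compilation, so avoid JIT-compiling inside a hot loop; define kernels once at module scope.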

The Pattern

The ViewColumn workflow has three steps:

  1. Iterate batches — view.iter_batches() yields archetype batches
  2. Get column handles — batch.column_mut(Transform) returns a ViewColumn
  3. Pass to Numba kernel — The kernel receives direct pointers to ECS storage
import math
 
try:
    import numba  # type: ignore
except ImportError:
    numba = None
 
from pybevy.prelude import *

Writing a Numba Kernel

A kernel is a function decorated with @numba.jit(nopython=True). Inside the kernel, you access ViewColumn data with array indexing — col[i] reads or writes the i-th entity's value directly in ECS memory.

Simple Wave Kernel

@numba.jit(nopython=True)
def wave_kernel(pos_x, pos_y, time):
    for i in range(len(pos_x)):
        pos_y[i] = math.sin(pos_x[i] * 0.5 + time * 2.0) * 3.0

The nopython=True flag ensures Numba compiles everything to native code — if it can't, you'll get a clear error instead of silently falling back to slow Python.
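Because the kernel only relies on len() and element indexing, it can be exercised outside the engine with NumPy arrays standing in for ViewColumn handles. A sketch (the stand-in arrays are an assumption for testing; inside a system you would pass the real handles), with a plain-Python fallback when Numba is missing:

```python
import math

import numpy as np

try:
    import numba
    jit = numba.jit(nopython=True)
except ImportError:
    jit = lambda f: f  # plain-Python fallback when Numba is missing

@jit
def wave_kernel(pos_x, pos_y, time):
    for i in range(len(pos_x)):
        pos_y[i] = math.sin(pos_x[i] * 0.5 + time * 2.0) * 3.0

# NumPy arrays stand in for ViewColumn handles: both expose len()
# and element indexing, which is all the kernel touches.
pos_x = np.linspace(0.0, 10.0, 8)
pos_y = np.zeros_like(pos_x)
wave_kernel(pos_x, pos_y, 1.0)
print(pos_y)
```

Testing kernels this way, against plain arrays, is a convenient workflow before wiring them into a system.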

Parallel Execution with numba.prange()

Replace range() with numba.prange() and add parallel=True to distribute work across all CPU cores:

@numba.jit(nopython=True, parallel=True)
def wave_kernel_parallel(pos_x, pos_y, pos_z, time):
    for i in numba.prange(len(pos_x)):
        x, z = pos_x[i], pos_z[i]
        dist = math.sqrt(x * x + z * z)
        pos_y[i] = math.sin(dist * 0.5 - time * 2.0) * 3.0

This gives an additional 4-8x speedup on multi-core machines.
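A parallel kernel can be smoke-tested the same way, with NumPy arrays in place of ViewColumn handles. A sketch (the `ripple_kernel` name and test data are illustrative; `prange` falls back to serial `range` when Numba is absent):

```python
import math

import numpy as np

try:
    import numba
    prange = numba.prange
    pjit = numba.jit(nopython=True, parallel=True)
except ImportError:
    prange = range        # serial fallback when Numba is missing
    pjit = lambda f: f

@pjit
def ripple_kernel(pos_x, pos_y, pos_z, time):
    # Each iteration is independent, so prange can split the loop
    # across all CPU cores.
    for i in prange(len(pos_x)):
        x, z = pos_x[i], pos_z[i]
        dist = math.sqrt(x * x + z * z)
        pos_y[i] = math.sin(dist * 0.5 - time * 2.0) * 3.0

n = 16
pos_x = np.linspace(-5.0, 5.0, n)
pos_z = np.linspace(-5.0, 5.0, n)
pos_y = np.zeros(n)
ripple_kernel(pos_x, pos_y, pos_z, 0.0)
```

Note that prange is only safe when loop iterations are independent; here each iteration writes a distinct pos_y[i], so there are no data races.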

The System Function

In your ECS system, iterate over batches, extract ViewColumn handles, and pass them to the kernel:

@component
class Cube(Component):
    pass
 
 
if numba is not None:
 
    @numba.jit(nopython=True, parallel=True)
    def animate_cubes_kernel(pos_x, pos_y, pos_z, time, amplitude):
        for i in numba.prange(len(pos_x)):
            x, z = pos_x[i], pos_z[i]
            dist = math.sqrt(x * x + z * z)
            pos_y[i] = math.sin(dist * 0.5 - time * 2.0) * amplitude
 
 
def animate_cubes(view: View[Mut[Transform], With[Cube]], time: Res[Time]) -> None:
    if numba is None:
        return  # the kernel is only defined when Numba is installed
    t = time.elapsed_secs()
 
    for batch in view.iter_batches():
        transform = batch.column_mut(Transform)
        animate_cubes_kernel(
            transform.translation.x,
            transform.translation.y,
            transform.translation.z,
            t,
            3.0,
        )

Each field access (transform.translation.x) returns a ViewColumn handle — a lightweight object that Numba understands as a pointer with stride information. No data is copied.
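PyBevy's internal handle layout isn't shown here, but the "pointer with stride information" idea can be modeled with a NumPy structured array: component fields are interleaved in storage, so a single-field view is strided, zero-copy, and writes through to the underlying buffer. A sketch (the field names are illustrative):

```python
import numpy as np

# Model interleaved translation storage as a structured array: each
# record holds x, y, z, so a single field view is strided, not contiguous.
transforms = np.zeros(4, dtype=[("x", "f4"), ("y", "f4"), ("z", "f4")])
transforms["x"] = [0.0, 1.0, 2.0, 3.0]

xs = transforms["x"]   # zero-copy view into the same buffer
print(xs.strides)      # stride equals the 12-byte record size
xs[2] = 99.0           # writes through to the underlying storage
print(transforms[2])
```

The view shares memory with the original array, so mutating `xs` mutates the record it came from, which is the same zero-copy behavior ViewColumn provides over ECS storage.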

Safety: Poison-Pill Invalidation

ViewColumn handles are automatically invalidated when the system function returns. This prevents use-after-free bugs — if you try to use a stale handle, you'll get a clear RuntimeError instead of undefined behavior.

def my_system(view: View[Mut[Transform]]) -> None:
    for batch in view.iter_batches():
        col = batch.column_mut(Transform)
        process(col.translation.x)  # OK: within system scope
 
# After system returns, col is poisoned:
# process(col.translation.x)  # RuntimeError: stale ViewColumn
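PyBevy's actual implementation isn't shown here, but the poison-pill idea can be sketched in a few lines of plain Python: a handle that works while its scope is live and raises RuntimeError after it is invalidated. All names below are hypothetical, for illustration only:

```python
class ColumnHandle:
    """Toy stand-in for a ViewColumn: usable until invalidated."""

    def __init__(self, data):
        self._data = data
        self._valid = True

    def __getitem__(self, i):
        if not self._valid:
            raise RuntimeError("stale ViewColumn")
        return self._data[i]

    def invalidate(self):
        # The "poison pill": called when the system scope ends.
        self._data = None
        self._valid = False

col = ColumnHandle([1.0, 2.0, 3.0])
assert col[0] == 1.0   # fine while the scope is live

col.invalidate()       # simulate the system returning
try:
    col[0]
except RuntimeError as err:
    print(err)         # stale ViewColumn
```

The key design point is that invalidation also drops the data reference, so a stale handle cannot reach freed ECS memory even if the error is caught and ignored.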

Performance Comparison

All tiers benchmarked at 1M entities (single float increment):

| Approach | Time | Speedup | vs Native Rust |
|----------|------|---------|----------------|
| Query iteration | 740 ms | 1x | 0.07% |
| View API (bytecode VM) | 3.7 ms | 200x | 14% |
| ViewColumn + Numba | 1.5 ms | 500x | 34% |
| Native Rust (par_iter_mut) | 0.52 ms | — | 100% |

At 10M entities, ViewColumn achieves 41% of native Rust speed (8.5ms vs 3.5ms).

Complete Example: Cube Wave

A complete runnable example that animates a grid of cubes using ViewColumn + Numba JIT.

def setup(
    commands: Commands,
    meshes: ResMut[Assets[Mesh]],
    materials: ResMut[Assets[StandardMaterial]],
) -> None:
    cube_mesh = meshes.add(Cuboid.from_length(0.8))
    cube_material = materials.add(Color.srgb(0.3, 0.7, 0.9))
 
    # Spawn a 100x100 grid (10,000 cubes)
    grid_size = 100
    half = grid_size // 2
    spacing = 1.5
 
    for row in range(grid_size):
        for col in range(grid_size):
            x = (col - half) * spacing
            z = (row - half) * spacing
            commands.spawn(
                Cube(),
                Mesh3d(cube_mesh),
                MeshMaterial3d(cube_material),
                Transform.from_xyz(x, 0.0, z),
            )
 
    commands.spawn(
        DirectionalLight(illuminance=10000.0),
        Transform.IDENTITY.looking_at(Vec3(-1.0, -2.5, -1.0), Vec3.Y),
    )
    commands.spawn(
        Camera3d(),
        Transform.from_xyz(0.0, 80.0, 120.0).looking_at(Vec3.ZERO, Vec3.Y),
    )

Running the Example

@entrypoint
def main(app: App) -> App:
    return (
        app
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .add_systems(Update, animate_cubes)
    )
 
if __name__ == "__main__":
    main().run()