PyBevy is in an early and experimental stage. The API is incomplete, subject to breaking changes without notice, and you should expect bugs. Many features are still under development.
Understanding Performance: Query vs View
PyBevy's three performance tiers and when to use each.
Introduction
PyBevy gives you three ways to work with entity components, each with dramatically different performance characteristics. Choosing the right one is key to building smooth, responsive applications — especially when your entity count grows.
| Approach | Speed vs Query | Best For |
|----------|----------------|----------|
| Query iteration | 1x (baseline) | Prototyping, <1k entities |
| View API (bytecode VM) | ~200x | Batch mutations, 10k-1M entities |
| ViewColumn + Numba JIT | ~500x | Heavy computation, 1M+ entities |
Tier 1: Query Iteration
The simplest approach — iterate over entities one at a time with a Query. This is the
standard ECS pattern and works great for small entity counts or logic that runs
infrequently.
```python
from pybevy.prelude import *
```

Example: Rotating Cubes with Query
Here we rotate every cube a little each frame using a standard query loop.
```python
@component
class RotatingCube(Component):
    pass

def rotate_cubes(time: Res[Time], query: Query[Mut[Transform], With[RotatingCube]]) -> None:
    for transform in query:
        transform.rotate_y(time.delta_secs())
```

This is clean and readable. But at 10k+ entities, the Python-level for loop becomes the bottleneck: each iteration crosses the Python/Rust boundary.
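The same per-element cost shows up in plain Python. The sketch below is an analogy, not PyBevy code: it contrasts a hand-written loop over a NumPy array (one interpreter round-trip per element, like a Query loop) with a single vectorized expression (the loop runs in native code, like a View):

```python
import numpy as np

def rotate_loop(angles, delta):
    # Per-element Python loop: interpreter overhead on every entity
    out = angles.copy()
    for i in range(len(out)):
        out[i] += delta
    return out

def rotate_batch(angles, delta):
    # One vectorized expression: the loop happens in native code
    return angles + delta

angles = np.linspace(0.0, 6.28, 100_000)
assert np.allclose(rotate_loop(angles, 0.016), rotate_batch(angles, 0.016))
```

Timing these two functions at 100k+ elements shows the same order-of-magnitude gap the benchmarks below report for Query vs View.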
Tier 2: View API
The View API compiles your Python expressions into bytecode and executes them across all matching entities in parallel — no Python loop required.
```python
@component
class WaveCube(Component):
    pass

def animate_wave(view: View[Mut[Transform], With[WaveCube]], time: Res[Time]) -> None:
    t = time.elapsed_secs()
    transform = view.column_mut(Transform)
    # This single expression updates ALL matching entities in parallel
    transform.translation.y = (transform.translation.x * 0.5 + t * 3.0).sin() * 2.0
```

The same sine-wave animation that would take ~740 ms at 1M entities with Query takes only ~3.7 ms with the View API, a 200x speedup.
Tier 3: ViewColumn + Numba JIT
For maximum performance, combine ViewColumn's zero-copy access with Numba's LLVM JIT compiler. This gives you direct pointer access to ECS storage compiled to native machine code with multi-core parallelization.
Example: Numba JIT Kernel
```python
import numba
import math

@numba.jit(nopython=True, parallel=True)
def wave_kernel(pos_x, pos_y, pos_z, time):
    for i in numba.prange(len(pos_x)):
        dist = math.sqrt(pos_x[i]**2 + pos_z[i]**2)
        pos_y[i] = math.sin(dist * 0.5 - time * 2.0) * 3.0

def animate_jit(view: View[Mut[Transform], With[WaveCube]], time: Res[Time]) -> None:
    t = time.elapsed_secs()
    for batch in view.iter_batches():
        transform = batch.column_mut(Transform)
        wave_kernel(
            transform.translation.x,
            transform.translation.y,
            transform.translation.z,
            t,
        )
```

This achieves ~1.5 ms at 1M entities, a 500x speedup over Query.
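Because the kernel only touches flat arrays, its math can be sanity-checked outside the engine. The sketch below reimplements the same formula in pure Python (no Numba, no PyBevy) so it can be unit-tested; `wave_ref` is a name introduced here for illustration:

```python
import math

def wave_ref(pos_x, pos_z, t):
    # Same formula as wave_kernel, evaluated per element in pure Python
    out = []
    for x, z in zip(pos_x, pos_z):
        dist = math.sqrt(x * x + z * z)
        out.append(math.sin(dist * 0.5 - t * 2.0) * 3.0)
    return out

# A point at the origin at t=0: dist == 0, so y == sin(0) * 3 == 0
assert wave_ref([0.0], [0.0], 0.0)[0] == 0.0
```

Feeding the same arrays to both `wave_kernel` and a reference like this is a cheap way to catch regressions before profiling.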
Benchmark Comparison
Here are real benchmark results at 1M entities (incrementing one float field):
| Approach | Time @ 1M | Speedup | vs Native Rust |
|----------|-----------|---------|----------------|
| Query iteration | 740 ms | 1x | 0.07% |
| View API (bytecode VM) | 3.7 ms | 200x | 14% |
| ViewColumn + Numba JIT | 1.5 ms | 500x | 34% |
| Native Rust (par_iter_mut) | 0.52 ms | — | 100% |
The performance gap widens with scale:
| Entities | Query | ViewColumn | Speedup |
|----------|-------|------------|---------|
| 1,000 | 1.3 ms | 50 us | 27x |
| 10,000 | 7.8 ms | 53 us | 147x |
| 100,000 | 75.8 ms | 219 us | 346x |
| 1,000,000 | 740 ms | 1.2 ms | 608x |
| 10,000,000 | 7,306 ms | 9.1 ms | 804x |
Decision Guide
Use Query when:
- Prototyping and exploring ideas
- Small entity counts (<1k)
- Complex per-entity logic (spawning, despawning, event handling)
- Infrequent operations (setup, teardown)
Use View API when:
- Animating 1k+ entities every frame
- The operation can be expressed as arithmetic on component fields
- You want the best balance of performance and simplicity
Use ViewColumn + Numba when:
- Processing 100k+ entities with complex math
- You need multi-core parallelization
- Near-native performance is required
- You're comfortable with Numba's constraints (no Python objects in kernels)
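The guide above can be condensed into a rough rule of thumb. The thresholds and the `choose_tier` helper below are illustrative only, not part of PyBevy's API:

```python
def choose_tier(entity_count: int, heavy_math: bool = False) -> str:
    """Pick an access pattern from the decision guide's thresholds."""
    if entity_count < 1_000:
        return "query"             # simple, readable, fast enough
    if entity_count >= 100_000 and heavy_math:
        return "viewcolumn+numba"  # JIT kernel for complex per-entity math
    return "view"                  # bytecode VM covers bulk arithmetic

assert choose_tier(500) == "query"
assert choose_tier(50_000) == "view"
```

In practice, profiling beats any fixed threshold; treat these cutoffs as starting points.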
Complete Example
Here's a complete example that runs the Query and View approaches side by side, each on its own cube; the Numba-based system from Tier 3 can be registered the same way.
```python
def setup(
    commands: Commands,
    meshes: ResMut[Assets[Mesh]],
    materials: ResMut[Assets[StandardMaterial]],
) -> None:
    cube_mesh = meshes.add(Cuboid.from_length(1.0))

    # Red cube — animated with Query
    commands.spawn(
        RotatingCube(),
        Mesh3d(cube_mesh),
        MeshMaterial3d(materials.add(Color.srgb(1.0, 0.3, 0.3))),
        Transform.from_xyz(-3.0, 0.0, 0.0),
    )

    # Green cube — animated with View API
    commands.spawn(
        WaveCube(),
        Mesh3d(cube_mesh),
        MeshMaterial3d(materials.add(Color.srgb(0.3, 1.0, 0.3))),
        Transform.from_xyz(0.0, 0.0, 0.0),
    )

    # Light and camera
    commands.spawn(
        DirectionalLight(illuminance=5000.0),
        Transform.IDENTITY.looking_at(Vec3(-1.0, -1.0, -1.0), Vec3.Y),
    )
    commands.spawn(
        Camera3d(),
        Transform.from_xyz(0.0, 5.0, 10.0).looking_at(Vec3.ZERO, Vec3.Y),
    )

@entrypoint
def main(app: App) -> App:
    return (
        app
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .add_systems(Update, (rotate_cubes, animate_wave))
    )

if __name__ == "__main__":
    main().run()
```