PyBevy is in an early and experimental stage. The API is incomplete, subject to breaking changes without notice, and you should expect bugs. Many features are still under development.
Understanding Performance: Query vs View
PyBevy's three performance tiers and when to use each.
Introduction
PyBevy gives you three ways to work with entity components, each with dramatically different performance characteristics. Choosing the right one is key to building smooth, responsive applications — especially when your entity count grows.
| Approach | Speed vs Query | Best For |
|----------|----------------|----------|
| Query iteration | 1x (baseline) | Prototyping, <1k entities |
| View API (bytecode VM) | ~200x | Batch mutations, 10k-1M entities |
| ViewColumn + Numba JIT | ~500x | Heavy computation, 1M+ entities |
Tier 1: Query Iteration
The simplest approach — iterate over entities one at a time with a Query. This is the
standard ECS pattern and works great for small entity counts or logic that runs
infrequently.
```python
from pybevy.prelude import *
```

Example: Rotating Cubes with Query
Here we rotate every cube a little each frame using a standard query loop.
```python
@component
class RotatingCube(Component):
    pass

def rotate_cubes(time: Res[Time], query: Query[Mut[Transform], With[RotatingCube]]) -> None:
    for transform in query:
        transform.rotate_y(time.delta_secs())
```

This is clean and readable. But at 10k+ entities, the Python-level for loop becomes the bottleneck: each iteration crosses the Python/Rust boundary.
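The same per-element cost shows up in plain Python. The sketch below is an analogy, not PyBevy code: it contrasts a hand-written loop over a NumPy array (one interpreter round-trip per element, like a Query loop) with a single vectorized expression (the loop runs in native code, like a View):

```python
import numpy as np

def rotate_loop(angles, delta):
    # Per-element Python loop: interpreter overhead on every entity
    out = angles.copy()
    for i in range(len(out)):
        out[i] += delta
    return out

def rotate_batch(angles, delta):
    # One vectorized expression: the loop happens in native code
    return angles + delta

angles = np.linspace(0.0, 6.28, 100_000)
assert np.allclose(rotate_loop(angles, 0.016), rotate_batch(angles, 0.016))
```

Timing these two functions at 100k+ elements shows the same order-of-magnitude gap the benchmarks below report for Query vs View.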
Tier 2: View API
The View API compiles your Python expressions into bytecode and executes them across all matching entities in parallel — no Python loop required.
```python
@component
class WaveCube(Component):
    pass

def animate_wave(view: View[Mut[Transform], With[WaveCube]], time: Res[Time]) -> None:
    t = time.elapsed_secs()
    transform = view.column_mut(Transform)
    # This single expression updates ALL matching entities in parallel
    transform.translation.y = (transform.translation.x * 0.5 + t * 3.0).sin() * 2.0
```

The same sine-wave animation that would take ~740 ms at 1M entities with Query takes only ~3.7 ms with the View API, a 200x speedup.
Tier 3: ViewColumn + Numba JIT
For maximum performance, combine ViewColumn's zero-copy access with Numba's LLVM JIT compiler. This gives you direct pointer access to ECS storage compiled to native machine code with multi-core parallelization.
Example: Numba JIT Kernel
```python
import numba
import math

@numba.jit(nopython=True, parallel=True)
def wave_kernel(pos_x, pos_y, pos_z, time):
    for i in numba.prange(len(pos_x)):
        dist = math.sqrt(pos_x[i]**2 + pos_z[i]**2)
        pos_y[i] = math.sin(dist * 0.5 - time * 2.0) * 3.0

def animate_jit(view: View[Mut[Transform], With[WaveCube]], time: Res[Time]) -> None:
    t = time.elapsed_secs()
    for batch in view.iter_batches():
        transform = batch.column_mut(Transform)
        wave_kernel(
            transform.translation.x,
            transform.translation.y,
            transform.translation.z,
            t,
        )
```

This achieves ~1.5 ms at 1M entities, a 500x speedup over Query.
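Because the kernel only touches flat arrays, its math can be sanity-checked outside the engine. The sketch below reimplements the same formula in pure Python (no Numba, no PyBevy) so it can be unit-tested; `wave_ref` is a name introduced here for illustration:

```python
import math

def wave_ref(pos_x, pos_z, t):
    # Same formula as wave_kernel, evaluated per element in pure Python
    out = []
    for x, z in zip(pos_x, pos_z):
        dist = math.sqrt(x * x + z * z)
        out.append(math.sin(dist * 0.5 - t * 2.0) * 3.0)
    return out

# A point at the origin at t=0: dist == 0, so y == sin(0) * 3 == 0
assert wave_ref([0.0], [0.0], 0.0)[0] == 0.0
```

Feeding the same arrays to both `wave_kernel` and a reference like this is a cheap way to catch regressions before profiling.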
Benchmark Comparison
Here are real benchmark results at 1M entities (incrementing one float field):
| Approach | Time @ 1M | Speedup | vs Native Rust |
|----------|-----------|---------|----------------|
| Query iteration | 740 ms | 1x | 0.07% |
| View API (bytecode VM) | 3.7 ms | 200x | 14% |
| ViewColumn + Numba JIT | 1.5 ms | 500x | 34% |
| Native Rust (par_iter_mut) | 0.52 ms | — | 100% |
The performance gap widens with scale:
| Entities | Query | ViewColumn | Speedup |
|----------|-------|------------|---------|
| 1,000 | 1.3 ms | 50 us | 27x |
| 10,000 | 7.8 ms | 53 us | 147x |
| 100,000 | 75.8 ms | 219 us | 346x |
| 1,000,000 | 740 ms | 1.2 ms | 608x |
| 10,000,000 | 7,306 ms | 9.1 ms | 804x |
Decision Guide
Use Query when:
- Prototyping and exploring ideas
- Small entity counts (<1k)
- Complex per-entity logic (spawning, despawning, event handling)
- Infrequent operations (setup, teardown)
Use View API when:
- Animating 1k+ entities every frame
- The operation can be expressed as arithmetic on component fields
- You want the best balance of performance and simplicity
Use ViewColumn + Numba when:
- Processing 100k+ entities with complex math
- You need multi-core parallelization
- Near-native performance is required
- You're comfortable with Numba's constraints (no Python objects in kernels)
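The guide above can be condensed into a rough rule of thumb. The thresholds and the `choose_tier` helper below are illustrative only, not part of PyBevy's API:

```python
def choose_tier(entity_count: int, heavy_math: bool = False) -> str:
    """Pick an access pattern from the decision guide's thresholds."""
    if entity_count < 1_000:
        return "query"             # simple, readable, fast enough
    if entity_count >= 100_000 and heavy_math:
        return "viewcolumn+numba"  # JIT kernel for complex per-entity math
    return "view"                  # bytecode VM covers bulk arithmetic

assert choose_tier(500) == "query"
assert choose_tier(50_000) == "view"
```

In practice, profiling beats any fixed threshold; treat these cutoffs as starting points.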
Complete Example
Here's a complete example that runs the Query and View approaches side by side, each on its own cube; the Numba-based system from Tier 3 can be registered the same way.
```python
def setup(
    commands: Commands,
    meshes: ResMut[Assets[Mesh]],
    materials: ResMut[Assets[StandardMaterial]],
) -> None:
    cube_mesh = meshes.add(Cuboid.from_length(1.0))

    # Red cube — animated with Query
    commands.spawn(
        RotatingCube(),
        Mesh3d(cube_mesh),
        MeshMaterial3d(materials.add(Color.srgb(1.0, 0.3, 0.3))),
        Transform.from_xyz(-3.0, 0.0, 0.0),
    )

    # Green cube — animated with View API
    commands.spawn(
        WaveCube(),
        Mesh3d(cube_mesh),
        MeshMaterial3d(materials.add(Color.srgb(0.3, 1.0, 0.3))),
        Transform.from_xyz(0.0, 0.0, 0.0),
    )

    # Light and camera
    commands.spawn(
        DirectionalLight(illuminance=5000.0),
        Transform.IDENTITY.looking_at(Vec3(-1.0, -1.0, -1.0), Vec3.Y),
    )
    commands.spawn(
        Camera3d(),
        Transform.from_xyz(0.0, 5.0, 10.0).looking_at(Vec3.ZERO, Vec3.Y),
    )

@entrypoint
def main(app: App) -> App:
    return (
        app
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .add_systems(Update, (rotate_cubes, animate_wave))
    )

if __name__ == "__main__":
    main().run()
```