Skip to content

Benchmarks

Benchmarks

To check Brahe's consistency with other astrodynamics software, we've built a benchmark suite that exercises a wide range of astrodynamics tasks across multiple libraries. It's extensible, so new tasks and libraries can be added over time. The current set covers 32 tasks across 8 modules: time, coordinates, attitude, frames, orbits, propagation, force model, and access computation.

The primary "source of truth" is OreKit 13.1.5 (Java), arguably the most widely used open-source astrodynamics library. We also compare against GMAT R2026a (NASA Goddard's General Mission Analysis Tool) on 31 of the 32 tasks, Basilisk (AVS Lab) on the 14 tasks across attitude, orbits, frames, coordinates, and propagation where the two libraries' user-facing APIs overlap, and Nyx/ANISE (Nyx Space) on 26 tasks where the libraries have comparable APIs. The benchmarks report both performance and accuracy metrics for each library on each task, with OreKit as the reference baseline for accuracy comparisons.

Tip

These benchmarks show how Brahe agrees with other astrodynamics libraries and help users compare options. They are not meant to declare a winner. Brahe itself wouldn't exist without the work of other projects building out an open astrodynamics ecosystem, and we think it's healthy for users to have multiple viable tools to pick from.

Libraries have different design goals and use cases. Basilisk, for example, is a full spacecraft simulation environment with a Python interface; its API is not designed for atomic conversion calls or single-step propagation, since it computes those internally during simulation. GMAT is a script-driven mission analysis tool with a Python API layered on top of the script interface, so some tasks are implemented via script execution rather than direct API calls. Brahe's benchmark suite exercises atomic conversion calls (frame transforms, force-model accelerations, time-scale conversions, single-step Keplerian propagation) that aren't always exposed at the user-API level in other libraries; the suite only includes tasks where the API call being timed is apples-to-apples.

A useful side effect of comparing this many libraries is that broad agreement across them gives confidence that the implementations are correct and that results should be similar regardless of which library you pick. The performance numbers are also informative, but they should be read in light of each library's design goals and use cases.

Warning

Benchmarks are difficult and nuanced to put together. They're highly sensitive to the specific hardware and software environment they run in, and small implementation details can shift performance significantly. These comparisons tried to be faithful to each library's "out-of-the-box" capabilities; better ways to set up the tasks or use the libraries may well exist. Treat the numbers here as a starting point for comparison, not a final word. If you're choosing a library for a particular use case, it's worth running your own benchmarks on your target hardware and software.

Methodology

Languages and Libraries:

  • Java: OreKit 13.1.5 on OpenJDK 21
  • Basilisk: AVS Lab Basilisk (bsk Python wheel from PyPI), imported in-process by the Python runner; participates on a 14-task subset.
  • GMAT: GMAT R2026a (gmatpy API), accessed via a local GMAT install pointed to by GMAT_ROOT_PATH; participates on a 31-task subset.
  • Nyx Space: Nyx 2.4.0, ANISE 0.10.1, and hifitime Rust libraries. Nyx/ANISE participate on a 26-task subset.
  • Brahe (Python): Brahe Python bindings (PyO3)
  • Brahe (Rust): Brahe native Rust library Test Environment: 2021 MacBook Pro, Apple M1 Max, 64 GB RAM

Protocol: Two independent harnesses share the same task registry:

  • Performance — each task runs 200 iterations of a single fixed input with a fixed random seed. Mean / median / std / min / max execution time is reported per language. Per-module bar charts plot the mean time per task; the error bars span ±1 standard deviation of the iteration timings (raw spread of a single run, not a confidence interval on the mean).
  • Accuracy — each task runs once across a sweep of 200 independent initial conditions (configurable via --samples). For every non-baseline language the per-sample max-abs and RMS errors are computed against OreKit, then aggregated to p50 / p95 / p99 / max distributional statistics. Per-module CDFs visualize the full error distribution; per-task scatter plots are emitted where a single scalar (altitude, epoch, …) usefully indexes the sample. Accuracy comparisons take OreKit as the single reference baseline.

Performance Overview

The table below summarizes average speedup relative to different baselines. Values greater than 1.0× indicate the library is faster than the baseline; values less than 1.0× indicate the baseline is faster.

Module Tasks Avg GMAT Speedup Avg Basilisk Speedup Avg Nyx Speedup Avg Brahe (Python) Speedup Avg Brahe (Rust) Speedup
Time 5 1.70× 741.2× 22.7× 104.3×
Coordinates 5 0.90× 1.70× 28.9× 1.97× 88.3×
Attitude 4 1.81× 1.45× 94.7× 5.26× 204.9×
Frames 2 0.07× 5.69× 13.5× 12.7× 16.0×
Orbits 2 1.92× 1.33× 41.1× 4.51× 48.8×
Propagation 8 0.20× 0.34× 6.12× 4.61× 8.97×
Force Model 5 0.15× 0.94× 1.25× 91.9×
Access 1 0.07× 4.55× 4.59×

OreKit is the single performance reference. We don't show per-baseline speedup charts using Basilisk or GMAT as the denominator because each ratio is derivable from this chart, and the extra framings tend to read as "who won" comparisons rather than the consistency check we're after.


Time

Five tasks covering epoch creation and time system conversions (UTC → TAI, TT, GPS, UT1).

Task Orekit GMAT Nyx Brahe (Python) Brahe (Rust) GMAT Speedup Nyx Speedup Brahe (Python) Speedup Brahe (Rust) Speedup
Epoch Creation 223.5 µs 80.3 µs 8.8 µs 27.8 µs 12.5 µs 2.78× 25.4× 8.03× 17.9×
UTC To GPS 24.6 µs 34.7 µs 2.5 µs 3.4 µs 192.9 ns 0.71× 10.00× 7.30× 127.4×
UTC To TAI 25.6 µs 34.7 µs 2.4 µs 3.4 µs 190.8 ns 0.74× 10.7× 7.50× 134.3×
UTC To TT 24.9 µs 34.0 µs 2.6 µs 3.3 µs 197.1 ns 0.73× 9.67× 7.58× 126.4×
UTC To UT1 1.14 ms 325.3 µs 313.3 ns 13.8 µs 9.9 µs 3.52× 3650.4× 83.0× 115.3×

Accuracy (initial-condition sweep; sample count per row; p50 / p95 / p99 / max max-abs error vs Orekit):

Sampling range: each sample is a random calendar datetime with year uniformly in [2000, 2030], month [1, 12], day [1, 28], hour [0, 23], minute [0, 59], second [0, 60) s, and nanosecond [0, 10⁹) ns. Components are drawn independently per sample.

Task Comparison Samples p50 Max Abs p95 Max Abs p99 Max Abs Max Abs
Epoch Creation Orekit vs GMAT 200 2.8 ns 5.6 ns 5.6 ns 6.1 ns
Epoch Creation Orekit vs Nyx 200 0 s 0.5 ns 0.5 ns 0.5 ns
Epoch Creation Orekit vs Brahe (Python) 200 0 s 0 s 0 s 0 s
Epoch Creation Orekit vs Brahe (Rust) 200 0 s 0 s 0 s 0 s
UTC To GPS Orekit vs GMAT 200 2.8 ns 5.6 ns 5.6 ns 6.1 ns
UTC To GPS Orekit vs Nyx 200 0.5 ns 0.5 ns 0.5 ns 0.5 ns
UTC To GPS Orekit vs Brahe (Python) 200 0 s 0 s 0 s 0 s
UTC To GPS Orekit vs Brahe (Rust) 200 0 s 0 s 0 s 0 s
UTC To TAI Orekit vs GMAT 200 2.8 ns 5.6 ns 5.6 ns 6.1 ns
UTC To TAI Orekit vs Nyx 200 0 s 0.5 ns 0.5 ns 0.5 ns
UTC To TAI Orekit vs Brahe (Python) 200 0 s 0 s 0 s 0 s
UTC To TAI Orekit vs Brahe (Rust) 200 0 s 0 s 0 s 0 s
UTC To TT Orekit vs GMAT 200 2.8 ns 5.6 ns 5.6 ns 5.6 ns
UTC To TT Orekit vs Nyx 200 0 s 0.5 ns 0.5 ns 0.5 ns
UTC To TT Orekit vs Brahe (Python) 200 0 s 0 s 0 s 0 s
UTC To TT Orekit vs Brahe (Rust) 200 0 s 0 s 0 s 0 s
UTC To UT1 Orekit vs GMAT 200 3.3 ns 1.06 µs 1.06 µs 1.06 µs
UTC To UT1 Orekit vs Nyx 200 2.07 µs 5.37 µs 6.70 µs 7.70 µs
UTC To UT1 Orekit vs Brahe (Python) 200 0.5 ns 1.00 µs 1.00 µs 1.00 µs
UTC To UT1 Orekit vs Brahe (Rust) 200 0.5 ns 1.00 µs 1.00 µs 1.00 µs

Coordinates

Five tasks covering coordinate system transformations: geodetic/geocentric to/from ECEF, and ECEF to azimuth-elevation.

Task Orekit GMAT Basilisk Nyx Brahe (Python) Brahe (Rust) GMAT Speedup Basilisk Speedup Nyx Speedup Brahe (Python) Speedup Brahe (Rust) Speedup
ECEF To AzEl 256.3 µs 56.6 µs 120.2 µs 11.0 µs 4.53× 2.13× 23.3×
ECEF To Geocentric 97.9 µs 97.9 µs 49.6 µs 820.6 ns 1.00× 1.97× 119.3×
ECEF To Geodetic 106.1 µs 148.8 µs 187.2 µs 7.2 µs 60.0 µs 5.2 µs 0.71× 0.57× 14.8× 1.77× 20.6×
Geocentric To ECEF 109.4 µs 99.6 µs 51.2 µs 665.8 ns 1.10× 2.14× 164.3×
Geodetic To ECEF 92.7 µs 118.3 µs 32.8 µs 1.4 µs 50.6 µs 813.8 ns 0.78× 2.83× 67.5× 1.83× 113.9×

Accuracy (initial-condition sweep; sample count per row; p50 / p95 / p99 / max max-abs error vs Orekit):

Sampling range: geodetic/geocentric inputs use longitude uniform in [-180°, 180°], latitude uniform in [-90°, 90°], and altitude (or radius offset above the sphere) uniform in [0, 1000] km. The ecef_to_azel task additionally samples a satellite point above each station with longitude/latitude offsets in ±10° and satellite altitude uniform in [200, 1000] km; station latitude is restricted to [-70°, 70°] to avoid degeneracies near the poles.

Geodetic and geocentric outputs combine angular and altitude components in different units; the benchmark converts angle residuals to surface-arc distance so each row reports a single position-equivalent error in meters. The Orekit-vs-GMAT row on the geodetic conversions is dominated by GMAT's choice of equatorial radius (6378.1363 km vs WGS84's 6378.137 km, a ~0.7 m offset); on the WGS84 baselines the implementation-level agreement is sub-nanometer. The Orekit-vs-Nyx geodetic rows show ~350–415 mm residuals because ANISE's frame_info populates the Earth ellipsoid from pck11.pca (IAU PCK, \(a\) = 6378.1366 km) rather than WGS84 (\(a\) = 6378.137 km); the ~0.4 m equatorial-radius difference is the source of the offset. ANISE 0.10.1 doesn't expose a WGS84-specific ellipsoid override, so this is fixed in the library's coordinate conversion path.

Task Comparison Samples p50 Max Abs p95 Max Abs p99 Max Abs Max Abs
ECEF To AzEl Orekit vs Nyx 50 79.70 nm 182.29 nm 634.19 nm 666.12 nm
ECEF To AzEl Orekit vs Brahe (Python) 50 0.12 nm 0.47 nm 0.58 nm 0.70 nm
ECEF To AzEl Orekit vs Brahe (Rust) 50 0.20 nm 0.47 nm 0.58 nm 0.70 nm
ECEF To Geocentric Orekit vs GMAT 200 0.79 nm 3.24 nm 8.24 nm 2.44 µm
ECEF To Geocentric Orekit vs Brahe (Python) 200 0.10 nm 4.75 nm 19.32 nm 2.48 µm
ECEF To Geocentric Orekit vs Brahe (Rust) 200 0.40 nm 4.75 nm 19.32 nm 2.48 µm
ECEF To Geodetic Orekit vs Basilisk 200 0.79 nm 1.81 nm 2.37 nm 2.79 nm
ECEF To Geodetic Orekit vs GMAT 200 597.50 mm 694.08 mm 699.94 mm 701.90 mm
ECEF To Geodetic Orekit vs Nyx 200 407.80 mm 414.13 mm 414.25 mm 414.25 mm
ECEF To Geodetic Orekit vs Brahe (Python) 200 0.79 nm 1.75 nm 2.33 nm 2.56 nm
ECEF To Geodetic Orekit vs Brahe (Rust) 200 0.79 nm 1.86 nm 2.33 nm 2.56 nm
Geocentric To ECEF Orekit vs GMAT 200 0.47 nm 0.93 nm 0.93 nm 1.86 nm
Geocentric To ECEF Orekit vs Brahe (Python) 200 0.12 nm 0.93 nm 0.93 nm 0.93 nm
Geocentric To ECEF Orekit vs Brahe (Rust) 200 0.23 nm 1.05 nm 1.87 nm 2.79 nm
Geodetic To ECEF Orekit vs Basilisk 200 0.93 nm 1.86 nm 2.79 nm 2.79 nm
Geodetic To ECEF Orekit vs GMAT 200 624.70 mm 709.59 mm 718.09 mm 718.59 mm
Geodetic To ECEF Orekit vs Nyx 200 355.32 mm 412.70 mm 414.24 mm 414.25 mm
Geodetic To ECEF Orekit vs Brahe (Python) 200 0 nm 0.93 nm 0.93 nm 0.93 nm
Geodetic To ECEF Orekit vs Brahe (Rust) 200 0 nm 1.16 nm 1.87 nm 2.79 nm

Attitude

Four tasks covering conversions between quaternions, rotation matrices, and Euler angles.

Task Orekit GMAT Basilisk Nyx Brahe (Python) Brahe (Rust) GMAT Speedup Basilisk Speedup Nyx Speedup Brahe (Python) Speedup Brahe (Rust) Speedup
Euler Angle To Quaternion 136.5 µs 88.6 µs 70.8 µs 3.4 µs 20.4 µs 1.9 µs 1.54× 1.93× 39.9× 6.71× 71.5×
Quaternion To Euler Angle 165.3 µs 89.4 µs 83.1 µs 2.5 µs 15.2 µs 1.6 µs 1.85× 1.99× 65.6× 10.9× 106.2×
Quaternion To Rotation Matrix 132.3 µs 163.5 µs 175.4 µs 4.2 µs 79.9 µs 339.4 ns 0.81× 0.75× 31.6× 1.66× 389.8×
Rotation Matrix To Quaternion 248.4 µs 81.8 µs 218.7 µs 1.0 µs 141.6 µs 984.8 ns 3.04× 1.14× 241.8× 1.75× 252.2×

Accuracy: Quaternion comparisons are performed in rotation-matrix space (Frobenius norm of \(R_a - R_b\), where \(R\) is the rotation matrix induced by each quaternion). This removes the \(q \equiv -q\) sign ambiguity at the comparison boundary, so any residual is a real rotation difference rather than a representation artifact.

Sampling range: quaternion inputs are drawn uniformly from the unit 3-sphere by sampling each component i.i.d. from a unit Gaussian and normalizing — this yields a uniform distribution on SO(3) rotations. Rotation-matrix inputs are derived from the same quaternion samples. The euler_angle_to_quaternion task samples Euler angles directly with \(\phi \in [-\pi, \pi]\), \(\theta \in [-\pi/2, \pi/2]\), \(\psi \in [-\pi, \pi]\) (ZYX convention).

Task Comparison Samples p50 Max Abs p95 Max Abs p99 Max Abs Max Abs
Euler Angle To Quaternion Orekit vs Basilisk 200 2.22e-16 4.44e-16 6.66e-16 6.66e-16
Euler Angle To Quaternion Orekit vs GMAT 200 3.33e-16 6.66e-16 8.88e-16 8.88e-16
Euler Angle To Quaternion Orekit vs Nyx 200 3.33e-16 6.77e-16 8.9e-16 1.11e-15
Euler Angle To Quaternion Orekit vs Brahe (Python) 200 2.22e-16 6.66e-16 8.9e-16 1.33e-15
Euler Angle To Quaternion Orekit vs Brahe (Rust) 200 2.22e-16 8.88e-16 1.11e-15 1.33e-15
Quaternion To Euler Angle Orekit vs Basilisk 200 0 4.55e-16 8.88e-16 1.78e-15
Quaternion To Euler Angle Orekit vs GMAT 200 4.16e-17 8.88e-16 8.9e-16 2.22e-15
Quaternion To Euler Angle Orekit vs Nyx 200 2.78e-17 8.88e-16 1.78e-15 1.78e-15
Quaternion To Euler Angle Orekit vs Brahe (Python) 200 0 8.88e-16 1.78e-15 1.78e-15
Quaternion To Euler Angle Orekit vs Brahe (Rust) 200 2.78e-17 8.88e-16 1.78e-15 1.78e-15
Quaternion To Rotation Matrix Orekit vs Basilisk 200 1.11e-16 3.33e-16 4.44e-16 5.55e-16
Quaternion To Rotation Matrix Orekit vs GMAT 200 2.22e-16 4.44e-16 5.55e-16 7.77e-16
Quaternion To Rotation Matrix Orekit vs Nyx 200 2.22e-16 4.44e-16 6.67e-16 7.77e-16
Quaternion To Rotation Matrix Orekit vs Brahe (Python) 200 2.22e-16 4.47e-16 6.67e-16 7.77e-16
Quaternion To Rotation Matrix Orekit vs Brahe (Rust) 200 2.22e-16 4.47e-16 6.67e-16 7.77e-16
Rotation Matrix To Quaternion Orekit vs Basilisk 200 2.22e-16 6.66e-16 8.88e-16 1.11e-15
Rotation Matrix To Quaternion Orekit vs GMAT 200 2.22e-16 6.66e-16 8.88e-16 1.11e-15
Rotation Matrix To Quaternion Orekit vs Nyx 200 2.22e-16 6.66e-16 6.66e-16 1.11e-15
Rotation Matrix To Quaternion Orekit vs Brahe (Python) 200 2.22e-16 6.66e-16 8.88e-16 1.11e-15
Rotation Matrix To Quaternion Orekit vs Brahe (Rust) 200 2.22e-16 6.66e-16 8.9e-16 1.11e-15

Frames

Two tasks covering full 6-DOF state vector transformations between ECEF and ECI reference frames using the IAU 2006/2000A precession-nutation model.

Task Orekit GMAT Basilisk Nyx Brahe (Python) Brahe (Rust) GMAT Speedup Basilisk Speedup Nyx Speedup Brahe (Python) Speedup Brahe (Rust) Speedup
State ECEF To ECI 2.85 ms 41.70 ms 486.4 µs 156.4 µs 220.5 µs 176.8 µs 0.07× 5.86× 18.2× 12.9× 16.1×
State ECI To ECEF 2.79 ms 41.76 ms 505.4 µs 319.6 µs 222.6 µs 176.8 µs 0.07× 5.53× 8.74× 12.6× 15.8×

Accuracy (initial-condition sweep; sample count per row; p50 / p95 / p99 / max max-abs error vs Orekit):

Sampling range: each sample is a 6-DOF Cartesian state derived from random Keplerian elements with semi-major axis uniform in \(R_\oplus + [200, 36{,}000]\) km, eccentricity uniform in [0.001, 0.3], inclination uniform in [0°, 180°], and RAAN / argument of periapsis / true anomaly uniform in [0°, 360°). Epochs are drawn uniformly in [2024-01-01, 2025-01-01) (Basilisk's bundled ITRF93 PCK does not extend far past the 2024 window, which constrains the swept range).

Task Comparison Samples p50 Max Abs p95 Max Abs p99 Max Abs Max Abs
State ECEF To ECI Orekit vs Basilisk 200 1.20 m 1037.7 m 1497.7 m 1882.8 m
State ECEF To ECI Orekit vs GMAT 200 3.22 m 7.55 m 8.55 m 9.56 m
State ECEF To ECI Orekit vs Nyx 200 0.555 m 1.21 m 1.37 m 1.40 m
State ECEF To ECI Orekit vs Brahe (Python) 200 0.036 m 0.103 m 0.140 m 0.151 m
State ECEF To ECI Orekit vs Brahe (Rust) 200 0.037 m 0.103 m 0.140 m 0.151 m
State ECI To ECEF Orekit vs Basilisk 200 1.08 m 882.9 m 1411.8 m 1557.4 m
State ECI To ECEF Orekit vs GMAT 200 3.34 m 7.98 m 8.58 m 9.80 m
State ECI To ECEF Orekit vs Nyx 200 0.536 m 1.12 m 1.32 m 1.55 m
State ECI To ECEF Orekit vs Brahe (Python) 200 0.038 m 0.116 m 0.140 m 0.150 m
State ECI To ECEF Orekit vs Brahe (Rust) 200 0.038 m 0.116 m 0.140 m 0.150 m

Orbits

Two tasks covering conversions between Keplerian orbital elements and Cartesian state vectors.

Task Orekit GMAT Basilisk Nyx Brahe (Python) Brahe (Rust) GMAT Speedup Basilisk Speedup Nyx Speedup Brahe (Python) Speedup Brahe (Rust) Speedup
Cartesian To Keplerian 299.3 µs 158.8 µs 1.96 ms 5.5 µs 64.1 µs 5.0 µs 1.88× 0.15× 54.4× 4.67× 59.4×
Keplerian To Cartesian 289.1 µs 148.3 µs 114.9 µs 10.4 µs 66.4 µs 7.6 µs 1.95× 2.52× 27.8× 4.36× 38.2×

Accuracy (initial-condition sweep; sample count per row; p50 / p95 / p99 / max max-abs error vs Orekit):

Sampling range: Keplerian elements are drawn with semi-major axis uniform in \(R_\oplus + [200, 36{,}000]\) km (LEO through GEO), eccentricity uniform in [0.001, 0.5], inclination uniform in [0°, 180°], and RAAN / argument of periapsis / mean anomaly (or true anomaly for the Cartesian-input task) uniform in [0°, 360°). Cartesian-input samples are produced by converting these elements to a state vector via the standard perifocal-to-ECI rotation.

Task Comparison Samples p50 Max Abs p95 Max Abs p99 Max Abs Max Abs
Cartesian To Keplerian Orekit vs Basilisk 200 22.12 mm 50.56 mm 64.00 mm 70.68 mm
Cartesian To Keplerian Orekit vs GMAT 200 22.12 mm 50.56 mm 64.00 mm 70.68 mm
Cartesian To Keplerian Orekit vs Nyx 200 22.12 mm 50.56 mm 64.00 mm 70.68 mm
Cartesian To Keplerian Orekit vs Brahe (Python) 200 3.73 nm 11.36 nm 22.35 nm 29.80 nm
Cartesian To Keplerian Orekit vs Brahe (Rust) 200 3.73 nm 14.90 nm 22.35 nm 29.80 nm
Keplerian To Cartesian Orekit vs Basilisk 200 1.28 µm 2.20 µm 2.80 µm 3.38 µm
Keplerian To Cartesian Orekit vs GMAT 200 1.28 µm 2.20 µm 2.80 µm 3.38 µm
Keplerian To Cartesian Orekit vs Nyx 200 1.37 µm 5.19 µm 12.44 µm 15.75 µm
Keplerian To Cartesian Orekit vs Brahe (Python) 200 3.73 nm 13.07 nm 22.39 nm 37.25 nm
Keplerian To Cartesian Orekit vs Brahe (Rust) 200 3.73 nm 18.63 nm 37.31 nm 44.70 nm

Propagation

Five tasks covering Keplerian (two-body analytical), numerical (RK4/RK78 two-body), and SGP4/SDP4 propagation.

Task Orekit GMAT Basilisk Nyx Brahe (Python) Brahe (Rust) GMAT Speedup Basilisk Speedup Nyx Speedup Brahe (Python) Speedup Brahe (Rust) Speedup
Keplerian Single 300.5 µs 87.82 ms 16.3 µs 52.6 µs 33.1 µs 0.00× 18.4× 5.72× 9.07×
Keplerian Trajectory 281.1 µs 1.38 ms 55.7 µs 45.2 µs 14.8 µs 0.20× 5.05× 6.22× 19.0×
SGP4 Single 391.0 µs 88.16 ms 27.5 µs 13.4 µs 0.00× 14.2× 29.3×
SGP4 Trajectory 2.55 ms 135.49 ms 884.1 µs 396.9 µs 0.02× 2.88× 6.43×
Numerical Two-Body 1.50 ms 1.36 ms 1.68 ms 127.6 µs 502.9 µs 478.6 µs 1.10× 0.89× 11.7× 2.98× 3.13×
RK4 + 5x5 Gravity 4.20 ms 62.40 ms 96.69 ms 11.79 ms 1.52 ms 1.53 ms 0.07× 0.04× 0.36× 2.76× 2.75×
RK4 + 20x20 + Sun/Moon 6.24 ms 72.60 ms 99.35 ms 18.40 ms 9.04 ms 8.78 ms 0.09× 0.06× 0.34× 0.69× 0.71×
RK4 + 80x80 + Drag + SRP 44.77 ms 381.56 ms 117.38 ms 55.35 ms 32.06 ms 31.84 ms 0.12× 0.38× 0.81× 1.40× 1.41×

Accuracy (initial-condition sweep; sample count per row; p50 / p95 / p99 / max max-abs error vs Orekit):

Sampling range:

  • keplerian_single: random orbit with \(a \in R_\oplus + [200, 36{,}000]\) km, \(e \in [0.001, 0.3]\), \(i \in [0°, 180°]\), RAAN / \(\omega\) / \(M\) uniform in [0°, 360°); per-sample propagation duration \(\Delta t\) uniform in [1 h, 24 h]; epoch JD spread uniformly over calendar year 2024.
  • keplerian_trajectory, numerical_twobody, and the three RK4 force-model tasks (grav5x5, grav20x20_sun_moon, grav80x80_full): LEO IC sweep with \(a \in R_\oplus + [400, 1500]\) km, \(e \in [0.001, 0.02]\), \(i \in [0°, 180°]\), angles uniform in [0°, 360°), epoch JD spread over 2024. Altitude floor of 400 km and eccentricity ceiling of 0.02 keep every sampled perigee above ~250 km, where fixed-step RK4 remains well-behaved on every backend; the 80×80 + drag + SRP case otherwise aborted with "accuracy settings violated" errors in GMAT and produced 10¹¹ m state offsets in brahe. Each case is propagated over one nominal LEO orbital period (~90 minutes); only the final state is compared.
  • sgp4_single: ISS TLE evaluated at \(n\) random time offsets uniformly distributed in [0, 86400] s (sorted).
  • sgp4_trajectory: ISS TLE sampled at \(n\) uniformly spaced points across a fixed 48-hour horizon — points along a single trajectory rather than statistically independent draws, so this row reads as an error-growth profile rather than an IC distribution.
Task Comparison Samples p50 Max Abs p95 Max Abs p99 Max Abs Max Abs
Keplerian Single Orekit vs GMAT 200 1.55 m 3.56 m 4.12 m 4.89 m
Keplerian Single Orekit vs Nyx 200 0.057 m 0.143 m 0.177 m 0.238 m
Keplerian Single Orekit vs Brahe (Python) 200 11 nm 63 nm 89 nm 117 nm
Keplerian Single Orekit vs Brahe (Rust) 200 15 nm 76 nm 117 nm 189 nm
Keplerian Trajectory Orekit vs GMAT 200 0.520 m 0.738 m 0.779 m 0.842 m
Keplerian Trajectory Orekit vs Nyx 200 0.013 m 0.016 m 0.016 m 0.016 m
Keplerian Trajectory Orekit vs Brahe (Python) 200 2.8 nm 10 nm 14 nm 16 nm
Keplerian Trajectory Orekit vs Brahe (Rust) 200 3.3 nm 13 nm 16 nm 22 nm
SGP4 Single Orekit vs GMAT 200 0.435 m 0.538 m 0.553 m 0.559 m
SGP4 Single Orekit vs Brahe (Python) 200 18.9 m 44.0 m 52.7 m 54.4 m
SGP4 Single Orekit vs Brahe (Rust) 200 18.9 m 44.0 m 52.7 m 54.4 m
SGP4 Trajectory Orekit vs GMAT 200 0.394 m 0.513 m 0.543 m 0.552 m
SGP4 Trajectory Orekit vs Brahe (Python) 200 32.7 m 74.2 m 85.5 m 94.8 m
SGP4 Trajectory Orekit vs Brahe (Rust) 200 32.7 m 74.2 m 85.5 m 94.8 m
Numerical Two-Body Orekit vs Basilisk 200 0.512 m 0.723 m 0.786 m 0.799 m
Numerical Two-Body Orekit vs GMAT 200 15.7 m 24.5 m 26.7 m 27.1 m
Numerical Two-Body Orekit vs Nyx 200 0.013 m 0.016 m 0.016 m 0.016 m
Numerical Two-Body Orekit vs Brahe (Python) 200 28 nm 87 nm 100 nm 117 nm
Numerical Two-Body Orekit vs Brahe (Rust) 200 31 nm 94 nm 111 nm 122 nm
RK4 + 5x5 Gravity Orekit vs Basilisk 200 49.4 m 73.7 m 79.6 m 82.3 m
RK4 + 5x5 Gravity Orekit vs GMAT 200 1.03 m 1.91 m 2.20 m 2.26 m
RK4 + 5x5 Gravity Orekit vs Nyx 200 3.58 m 8.97 m 10.5 m 12.2 m
RK4 + 5x5 Gravity Orekit vs Brahe (Python) 200 71.9 µm 283.8 µm 428.1 µm 488.0 µm
RK4 + 5x5 Gravity Orekit vs Brahe (Rust) 200 71.9 µm 283.9 µm 428.2 µm 488.0 µm
RK4 + 20x20 + Sun/Moon Orekit vs Basilisk 200 49.4 m 73.8 m 78.6 m 83.7 m
RK4 + 20x20 + Sun/Moon Orekit vs GMAT 200 1.22 m 3.26 m 6.33 m 7.79 m
RK4 + 20x20 + Sun/Moon Orekit vs Nyx 200 3.59 m 8.97 m 10.5 m 11.4 m
RK4 + 20x20 + Sun/Moon Orekit vs Brahe (Python) 200 71.8 µm 282.5 µm 427.6 µm 489.2 µm
RK4 + 20x20 + Sun/Moon Orekit vs Brahe (Rust) 200 71.8 µm 282.5 µm 427.6 µm 489.2 µm
RK4 + 80x80 + Drag + SRP Orekit vs Basilisk 30 47.1 m 70.9 m 223.4 m 285.3 m
RK4 + 80x80 + Drag + SRP Orekit vs GMAT 30 1.29 m 12.9 m 47.0 m 59.5 m
RK4 + 80x80 + Drag + SRP Orekit vs Nyx 30 5.23 m 12.7 m 13.7 m 13.9 m
RK4 + 80x80 + Drag + SRP Orekit vs Brahe (Python) 30 2.01 m 12.6 m 43.8 m 55.4 m
RK4 + 80x80 + Drag + SRP Orekit vs Brahe (Rust) 30 2.01 m 12.6 m 43.8 m 55.4 m

Force Model

Five tasks evaluating a single acceleration term at a fixed spacecraft state and epoch (no integrator, no time stepping). This isolates the force-model implementations from the propagator, so any disagreement here comes from the force-model calculation itself — gravity coefficients, third-body ephemeris, frame transformations — rather than from numerical integration.

Task Orekit GMAT Nyx Brahe (Python) Brahe (Rust) GMAT Speedup Nyx Speedup Brahe (Python) Speedup Brahe (Rust) Speedup
Point Mass Gravity 7.8 µs 903.0 µs 38.8 µs 17.3 ns 0.01× 0.20× 452.8×
Spherical Harmonics 20x20 1.52 ms 60.71 ms 1.75 ms 483.7 µs 419.2 µs 0.03× 0.87× 3.15× 3.63×
Spherical Harmonics 80x80 5.69 ms 63.92 ms 4.09 ms 3.36 ms 3.21 ms 0.09× 1.39× 1.70× 1.77×
Third Body (Moon) 167.3 µs 926.5 µs 335.7 µs 431.4 µs 404.8 µs 0.18× 0.50× 0.39× 0.41×
Third Body (Sun) 423.5 µs 1.00 ms 425.7 µs 521.8 µs 501.8 µs 0.42× 0.99× 0.81× 0.84×

Accuracy (LEO IC sweep; per-pair max-abs error distribution vs Orekit):

Sampling range: force-model accuracy is swept over the same LEO range used for the propagation accuracy sweep — \(a \in R_\oplus + [400, 1500]\) km, \(e \in [0.001, 0.02]\), \(i \in [0°, 180°]\), angles uniform in [0°, 360°), epoch JD spread over 2024. Each case's Keplerian elements are converted to a Cartesian GCRF state in the task generator so every backend sees the identical state per case (avoiding implementation-drift in KOE→ECI conversion as a source of residual). Each task therefore yields one residual per case, so the harness reports a distribution rather than a single number.

Task Comparison Samples p50 Max Abs p95 Max Abs p99 Max Abs Max Abs
Point Mass Gravity Orekit vs GMAT 200 8.88e-16 m/s² 2.66e-15 m/s² 3.55e-15 m/s² 3.55e-15 m/s²
Point Mass Gravity Orekit vs Brahe (Python) 200 8.88e-16 m/s² 1.78e-15 m/s² 2.66e-15 m/s² 3.55e-15 m/s²
Point Mass Gravity Orekit vs Brahe (Rust) 200 8.88e-16 m/s² 1.78e-15 m/s² 2.66e-15 m/s² 3.55e-15 m/s²
Spherical Harmonics 20x20 Orekit vs GMAT 200 1.50e-07 m/s² 6.06e-07 m/s² 8.27e-07 m/s² 1.06e-06 m/s²
Spherical Harmonics 20x20 Orekit vs Nyx 200 7.04e-07 m/s² 1.34e-06 m/s² 1.73e-06 m/s² 1.94e-06 m/s²
Spherical Harmonics 20x20 Orekit vs Brahe (Python) 200 1.49e-08 m/s² 5.92e-08 m/s² 1.29e-07 m/s² 1.51e-07 m/s²
Spherical Harmonics 20x20 Orekit vs Brahe (Rust) 200 1.49e-08 m/s² 5.92e-08 m/s² 1.29e-07 m/s² 1.51e-07 m/s²
Spherical Harmonics 80x80 Orekit vs GMAT 200 1.60e-07 m/s² 1.15e-06 m/s² 2.27e-06 m/s² 2.98e-06 m/s²
Spherical Harmonics 80x80 Orekit vs Nyx 200 6.83e-07 m/s² 1.35e-06 m/s² 1.87e-06 m/s² 2.79e-06 m/s²
Spherical Harmonics 80x80 Orekit vs Brahe (Python) 200 1.43e-08 m/s² 6.28e-08 m/s² 1.34e-07 m/s² 1.51e-07 m/s²
Spherical Harmonics 80x80 Orekit vs Brahe (Rust) 200 1.43e-08 m/s² 6.28e-08 m/s² 1.34e-07 m/s² 1.51e-07 m/s²
Third Body (Moon) Orekit vs GMAT 200 8.48e-14 m/s² 1.43e-13 m/s² 1.75e-13 m/s² 2.09e-13 m/s²
Third Body (Moon) Orekit vs Nyx 200 7.52e-15 m/s² 1.19e-14 m/s² 1.61e-14 m/s² 1.73e-14 m/s²
Third Body (Moon) Orekit vs Brahe (Python) 200 9.14e-14 m/s² 1.69e-13 m/s² 2.27e-13 m/s² 2.38e-13 m/s²
Third Body (Moon) Orekit vs Brahe (Rust) 200 9.14e-14 m/s² 1.69e-13 m/s² 2.27e-13 m/s² 2.38e-13 m/s²
Third Body (Sun) Orekit vs GMAT 200 4.16e-15 m/s² 7.39e-15 m/s² 7.94e-15 m/s² 7.98e-15 m/s²
Third Body (Sun) Orekit vs Nyx 200 1.99e-18 m/s² 5.20e-18 m/s² 7.03e-18 m/s² 9.54e-18 m/s²
Third Body (Sun) Orekit vs Brahe (Python) 200 4.20e-14 m/s² 7.63e-14 m/s² 9.35e-14 m/s² 9.41e-14 m/s²
Third Body (Sun) Orekit vs Brahe (Rust) 200 4.20e-14 m/s² 7.63e-14 m/s² 9.35e-14 m/s² 9.41e-14 m/s²

Access (Comparative)

One task: computing all satellite-to-ground-station access windows over a 48-hour period using SGP4 propagation.

Task Orekit GMAT Brahe (Python) Brahe (Rust) GMAT Speedup Brahe (Python) Speedup Brahe (Rust) Speedup
SGP4 Access 650.14 ms 8.757 s 142.77 ms 141.59 ms 0.07× 4.55× 4.59×

Accuracy (per-location contact-count and per-window timing residuals vs Orekit):

Sampling range: a single fixed SGP4 propagator (ISS TLE) is evaluated against 100 random ground locations sampled with longitude uniform in [-180°, 180°], latitude uniform in [-90°, 90°], altitude fixed at 0 m, 10° minimum elevation, over a 24-hour search window starting at the TLE epoch. The samples here index ground locations rather than orbit initial conditions, so the accuracy distribution describes how access-window detection varies geographically for one orbit, not how it varies across orbits.

The table reports, for each backend pair: how many ground locations produced any contacts, the total contact count each backend found across those locations, the worst per-location count mismatch, and the distribution of matched-window start/end time residuals across all locations. Windows are matched greedily by nearest start time within a 120 s tolerance; unmatched windows contribute to the count difference but not the timing residuals.

Comparison Locations Contacts (Orekit) Contacts (Comparison) Contact Count Diff (max per loc) Start Err p50 Start Err p95 Start Err p99 Start Err max End Err p50 End Err p95 End Err p99 End Err max
Orekit vs GMAT 73 277 297 2 4.989 s 16.564 s 24.559 s 25.375 s 5.700 s 16.816 s 22.866 s 26.962 s
Orekit vs Brahe (Python) 73 277 277 1 7.483 ms 18.531 ms 29.258 ms 42.728 ms 7.725 ms 20.567 ms 34.493 ms 48.803 ms
Orekit vs Brahe (Rust) 73 277 277 1 7.483 ms 18.531 ms 29.258 ms 42.728 ms 7.725 ms 20.567 ms 34.493 ms 48.803 ms

Access Computation (Brahe vs Skyfield)

In addition to the OreKit comparison above, Brahe is also benchmarked against Skyfield, a popular Python astronomy library, for access computation. This benchmark focuses on Brahe's serial vs parallel execution modes and Python bindings vs native Rust performance.

The benchmark propagates the ISS (from a single, static TLE) over a 2-day (48-hour) window and computes all access intervals against 100 randomly sampled ground station locations with a 5° minimum elevation. Access start and end times agree to within one second between Brahe and Skyfield.

Implementation Avg Time vs Skyfield
Skyfield 32.78ms baseline
Brahe-Python (serial) 2.99ms 11.0x faster
Brahe-Python (parallel) 3.08ms 10.6x faster
Brahe-Rust (serial) 2.47ms 13.3x faster
Brahe-Rust (parallel) 2.00ms 16.4x faster

The parallel rows report the per-location amortized time when all 100 locations are batched into a single call; the serial rows report the mean time for a single location computed in isolation. Brahe's Python bindings dispatch into the same Rust core as the native Rust path, so the serial Python row sits within ~20% of the serial Rust row. For the Python path at this problem size, batching does not improve per-location time because the per-location work is already short relative to the call setup amortized in the serial measurement.


Notes on Basilisk Comparisons

Basilisk participates in 14 of 32 tasks. The gap reflects API surface, not capability: brahe's suite exercises atomic conversion calls (frame transforms, force-model accelerations, time-scale conversions, single-step Keplerian propagation), and Basilisk doesn't expose all of those at the user-API level since its dynamics modules compute them internally during simulation.

Where Basilisk participates: 4 attitude conversions (RigidBodyKinematics), 2 orbital-element conversions (orbitalMotion.elem2rv / rv2elem), 2 frame transformations (via the bundled pyswice SPICE Toolkit, J2000 ↔ ITRF93), 2 geodetic/ECEF coordinate conversions (also pyswice), and 4 numerical orbit propagation cases (spacecraft.Spacecraft + simIncludeGravBody with the RK4 default integrator).

Where Basilisk does not participate: SGP4 propagation (Basilisk has no SGP4 in its core), analytical Keplerian propagation (orbitalMotion.elem2rv is a single closed-form function call, not a propagator object), atomic force-model accelerations (computed inside dynamics modules), time-scale conversions (Basilisk uses SPICE's unitim_c, which would measure SPICE Toolkit rather than a peer time-scale library), and access calculations (depend on SGP4).

Frame definitions: Basilisk-via-pyswice transforms between J2000 and ITRF93 using NAIF's high-precision Earth orientation binary PCK (earth_latest_high_prec.bpc, downloaded automatically by bench-compare-setup). OreKit uses EME2000 (≡ J2000 to sub-meter precision) and ITRF (IERS 2010 conventions). ITRF93 follows the IERS 1996 conventions, so expect kilometer-scale ECEF position differences vs. the Java baseline.

Propagation methodology: Basilisk's SimBaseClass setup cost is included in the per-iteration timing, matching the OreKit/Rust pattern of timing the full run. Basilisk's inertial output (r_BN_N, v_BN_N, J2000-equatorial) is captured raw inside the timed region; the J2000 → GCRF frame-bias transform via brahe.state_eme2000_to_gcrf runs after time_iterations returns, so the timed work is purely Basilisk. Basilisk's default Earth mu is overridden from 3.986004360e14 to 3.986004418e14 so all three baselines see the same central-body parameter. Gravity coefficients come from GGM03S (Basilisk-bundled, up to degree 180) versus EIGEN-5C (Orekit) and brahe's native EGM family. The numerical two-body and three RK4 force-model tasks all run fixed-step Classical Runge-Kutta 4 (RK4) at the same step size on every implementation, so the residual reflects gravity-coefficient sources and floating-point ordering rather than integrator precision.

Anomaly convention: Basilisk's orbitalMotion.ClassicElements uses true anomaly in radians; brahe/Orekit use mean anomaly in degrees per the existing benchmark convention. Each library is timed on its native call; conversion happens outside the timed region. The scalar-first [w, x, y, z] quaternion convention is already consistent across all baselines, and Basilisk's RigidBodyKinematics.EP2C / C2EP / EP2Euler321 / euler3212EP use the same passive-rotation (DCM) convention as brahe's Quaternion primitives, so all four attitude conversions run on native library calls.

Atmospheric drag inputs: For the 80×80 full task, Basilisk's msisAtmosphere is driven with representative quiet-Sun space-weather values (Ap=8, F10.7=110) rather than by interpolating SpaceWeather-All-v1.2.txt the way Orekit's CssiSpaceWeatherData does. That's roughly what a Basilisk user would do without bespoke SW preparation, and it adds some bounded divergence from the Java baseline on that task.


Notes on GMAT Comparisons

GMAT R2026a (the General Mission Analysis Tool, developed by NASA Goddard) participates in 31 of the 32 tasks. The one exclusion is coordinates.ecef_to_azel: GMAT's GroundStation exposes no azimuth/elevation accessor via the gmatpy API, and the script-based Topocentric coordinate system path segfaults under gmat.RunScript(). The only working route is a pure-Python ENZ rotation, which doesn't exercise GMAT code and would be misleading to label as a GMAT comparison.

Where GMAT participates: all 5 time tasks, 4 of 5 coordinates tasks, all 4 attitude tasks, both frames tasks, both orbits tasks, all 8 propagation tasks, all 5 force-model tasks, and the access task — 31 tasks in total.

Setup: Install GMAT R2026a, export GMAT_ROOT_PATH pointing at the install root (the directory containing bin/, data/, api/), and run just bench-compare-setup. The setup recipe generates api_startup_file.txt via GMAT's bundled BuildApiStartupFile.py. When GMAT_ROOT_PATH is unset, GMAT comparisons are silently skipped.

Frame and coordinate disclosure: GMAT's EarthFixed body-fixed frame uses FK5 reference theory with IAU 1980 nutation; brahe and OreKit use the IERS 2010 ITRF realization. Position errors of ~10 m at LEO are attributable to frame-definition choice, not implementation precision. For geodetic coordinate conversions, GMAT uses an Earth equatorial radius of 6378.1363 km versus WGS84's 6378.137 km — accounting for ~0.7 m position error.

Quaternion convention: GMAT uses scalar-last [q1, q2, q3, q4]; the benchmark canonical is scalar-first [w, x, y, z]. Reordering is applied outside the timed region. All four attitude tasks (quaternion ↔ rotation matrix, quaternion ↔ Euler ZYX) call GMAT's native AttitudeConversionUtility primitives — ToCosineMatrix, ToQuaternion, ToEulerAngles(rv, 3, 2, 1).

Time scales: GMAT's A1ModJulian epoch is rooted at 1941-01-05 (JD 2430000.0), not the standard MJD epoch of 1858-11-17 (JD 2400000.5). The benchmark applies the appropriate offset for time-conversion inputs. GMAT has no GPS time-system enum; GPS times are derived as TAI − 19 s. GMAT's Gregorian time parser has millisecond precision, contributing ~0.5 ms to time-conversion accuracy comparisons.

Propagation methodology: Per-iteration construction of Spacecraft + ForceModel + Propagator is included in the timing, which reflects how a GMAT user actually exercises the API. This matches Basilisk's per-iteration setup methodology. The EarthMJ2000Eq → GCRF transform is applied after timing using brahe.state_eme2000_to_gcrf.

Units: GMAT uses km / km/s / km/s² internally for position, velocity, and acceleration; the benchmark canonical is SI (m / m/s / m/s²). Scaling by 10³ is applied outside the timed region.

Gravity coefficients and atmosphere sources: GMAT uses bundled JGM-2 / EGM-96 coefficients depending on degree; the JGM-2 file supports up to degree 70, so 80×80 tasks use EGM-96. Drag uses MSISE-90 from GMAT's bundled space-weather data. OreKit uses EIGEN-6S, Basilisk uses GGM03S, and brahe uses per-task bundled data, so differences reflect data-source variation rather than implementation precision.

SGP4 propagation and access tasks: These run through GMAT's script-based interface (gmat.LoadScript + gmat.RunScript) because the SGP4 and ContactLocator plugins don't activate cleanly through the direct gmatpy API path. The results are still correct; it's a GMAT integration detail.


Notes on Nyx / ANISE Comparisons

Nyx 2.4.0 is a Rust astrodynamics library designed for mission design, trajectory optimization, and orbit determination. ANISE 0.10.1 is its companion library for reference-frame and ephemeris operations; it provides the Almanac abstraction that loads NAIF SPICE-compatible kernels (SPK, PCK, FK) and exposes attitude, rotation, and ephemeris queries in native Rust. hifitime 4.x is the precision time library shared by both and underpins all epoch and duration arithmetic.

The Nyx/ANISE baseline runs as a separate Rust binary (compiled from benchmarks/nyx/), invoked as a subprocess by the Python runner the same way as the Java OreKit binary. This mirrors the brahe-Rust execution model: the Rust binary receives task parameters via stdin JSON and writes results to stdout JSON, while the Python harness handles timing and statistical aggregation identically to all other baselines.

Where Nyx/ANISE participates: 26 of 32 tasks. The 6 dropped tasks are listed below.

Dropped tasks:

  • coordinates.geocentric_to_ecef and coordinates.ecef_to_geocentric: Brahe defines these as pure spherical-coordinate conversions (latitude-longitude-radius ↔ XYZ without ellipsoidal flattening). ANISE 0.10.1 has no equivalent API that we found; its only coordinate conversion path is geodetic (WGS84 ellipsoid). Implementing this with hand-rolled trigonometry wouldn't exercise any Nyx/ANISE code, so we drop these tasks.
  • force_model.accel_point_mass_gravity: Brahe exposes a standalone single-body Newtonian evaluation. Nyx 2.4.0 doesn't expose this in a public API; the two-body term is internal to OrbitalDynamics::eom. A Nyx implementation would reduce to hand-rolled μ·r/r³ rather than library code, so we drop this task too.
  • access.sgp4_access: Brahe's location_accesses API finds all satellite-to-ground-station contact windows over an interval. Nyx 2.4.0 has no direct equivalent: the eclipse module is unrelated, and a window-finder built on top of Nyx's propagator would have to reproduce brahe's search logic rather than compare library APIs. Dropped.
  • propagation.sgp4_single and propagation.sgp4_trajectory: Nyx 2.4.0 doesn't expose an SGP4 propagator in its public API.

Earth orientation parameters: ANISE 0.10.1 doesn't load IERS finals2000A.all data. Earth rotation is driven by ANISE's own earth_latest_high_prec.bpc binary PCK, which is downloaded fresh from JPL at setup time by bench-compare-setup. Frame-transform residuals of roughly 0.5–1.5 m relative to OreKit (which uses IERS finals2000A.all) reflect this BPC-vs-IERS-finals2000A algorithmic difference rather than a precision defect — that difference is the variable under test for the frames tasks.

Gravity coefficient file format: Nyx supports SHADR and COF gravity-coefficient formats but not ICGEM. Brahe ships EGM2008_360.gfc in ICGEM format. The Nyx benchmark implementation parses the ICGEM file and writes a temporary SHADR file at benchmark startup; coefficient values are numerically identical to the source, so no accuracy impact is expected from the format translation.

Atmospheric drag model substitution: Nyx 2.4.0's public atmosphere models are ConstantDrag, an exponential model, and US Standard Atmosphere 1976. Neither NRLMSISE-00 nor Harris-Priester is exposed in a public API. The propagation.numerical_rk4_grav80x80_full task uses US Standard Atmosphere 1976 for drag, while OreKit uses NRLMSISE-00, so accuracy residuals on this task are larger than on the other propagation tasks. That's expected given the different atmosphere models and not a numerical precision difference.

Time tasks: All five time tasks run directly against hifitime's Epoch API, with no extra library layers. hifitime uses TAI internally with UT1 obtained from a bundled IERS file; conversions to GPS, TT, UTC, and UT1 match OreKit at nanosecond precision across the swept epoch range.

Attitude tasks: Attitude conversions use ANISE's DCM, EulerParameter, and matrix-product primitives. The DCM ↔ Euler-angle path goes through ANISE's ZYX decomposition; the quaternion ↔ DCM path uses ANISE's EulerParameter::to_dcm / DCM::to_euler_parameters. All four tasks run on native ANISE calls.

Frame tasks: The two frame-transformation tasks call Almanac::transform_to between EARTH_J2000 and EARTH_ITRF93 frames. ANISE 0.10.1 provides DynamicFrame with five variants: EarthMeanOfDate (MOD, precession only), EarthTrueOfDate (TOD, precession + nutation), EarthTrueEquatorMeanEquinox (TEME, precession + nutation + equation of the equinoxes), BodyMeanOfDate, and BodyTrueOfDate. All three Earth variants compute rotations relative to the ICRS/J2000 inertial parent using SOFA precession-nutation models, covering the GCRF → MOD → TOD segment of the IAU 2006/2000A chain. None of them include the Earth Rotation Angle (ERA) step that rotates the true-of-date pole to the Terrestrial Intermediate Reference System (TIRS), or the polar-motion step that maps TIRS to ITRF. So EARTH_ITRF93 is the only high-precision Earth body-fixed frame in ANISE 0.10.1; IAU_EARTH_FRAME uses a low-fidelity polynomial PCK rotation and isn't used here. The earth_latest_high_prec.bpc kernel is downloaded fresh from JPL at setup time and contains current Earth orientation parameters (UT1, polar motion). The NAIF orientation ID "ITRF93" refers to the frame definition standard used by NAIF/SPICE (IERS 1996 conventions), not to stale 1993-era data.

Performance characteristics: hifitime time-conversion tasks have no I/O or kernel-lookup overhead and are among the fastest tasks in the suite. ANISE attitude and orbit tasks carry per-call kernel-parsing overhead on the first call; subsequent calls within a benchmark run reuse the loaded Almanac. Numerical propagation tasks use Nyx's RungeKutta4 fixed-step integrator at the same step size as the other Rust and Java baselines, so integrator-introduced residuals are comparable.


GPU Comparison Suite (Brahe vs Astrojax)

A separate benchmark suite at benchmarks/gpu_comparison/ looks at a different question: at what batch size does running the same astrodynamics computation on a GPU start to beat Brahe's Rust + rayon CPU implementation? The comparison target is Astrojax, an experimental JAX-native astrodynamics library aimed at GPUs. Astrojax is derived from brahe's design but built on JAX so that GPU parallelization is direct. The design assumes batched work, since the GPU only pays off above some task-dependent crossover batch size.

The suite sweeps a geometric batch-size ladder (1 … 10^8 depending on task) for every task and reports throughput in operations per second. Multi-GPU cells are gated by a per-task multigpu_min_batch() below which pmap overhead dominates.

Different from the comparative suite

The comparative suite above measures per-iteration latency on a fixed input with Orekit as the accuracy baseline. The GPU comparison suite measures throughput vs batch size across two libraries with no accuracy baseline. The two suites use different harnesses, different result schemas, and different reporting conventions, and aren't meant to be merged.

Configurations:

Config Backend dtype Parallelism
brahe-rust-rayon Rust subprocess (bench_gpu_rust) f64 rayon, all CPU cores
astrojax-cpu JAX in a spawned Python child (JAX_PLATFORMS=cpu) f64 jit(vmap(...)) on one CPU device
astrojax-gpu in-process JAX f32 jit(vmap(...)) on one GPU
astrojax-multigpu in-process JAX f32 jit(vmap(...)) with NamedSharding across every visible GPU

Brahe runs in f64 (its only mode). Astrojax on CPU is f64 for apples-to-apples comparison. Astrojax on GPU is f32 because that's the realistic deployment choice on tensor-core hardware, where you trade precision for throughput.

Test environment: AMD EPYC 7713 (64 physical / 128 logical cores), 503 GB RAM, Linux 6.17. 2× NVIDIA A100 80 GB PCIe (driver 595.71.05, CUDA 13.2). Brahe 1.5.2 (commit 715192ec), Astrojax 0.8.0 (commit 98454ab), JAX 0.10.1, Rust 1.93.1, Python 3.14.

What an 'op' means

Throughput is reported in operations per second, where an operation is one user-facing call of the task, not a single integration step or low-level math op. The unit therefore varies by task family:

  • Time conversions (time.utc_mjd_to_tt_mjd): 1 op = 1 MJD-UTC → MJD-TT conversion.
  • Coordinate transformations (coordinates.*): 1 op = 1 conversion of one input vector (3-vector for geodetic / ECEF / ENZ; 6-vector for Keplerian / Cartesian state).
  • Frame transformations (frames.gcrf_to_itrf): 1 op = 1 (epoch, state) → ITRF state transformation, including the underlying precession / nutation / polar-motion rotations.
  • SGP4 propagation (propagation.sgp4_iss_sweep): 1 op = 1 propagation of the ISS TLE from its epoch to one tsince_minutes offset (one full SGP4 evaluation per element of the batch).
  • Numerical / force-model propagation (propagation.numerical_twobody_j2, force_model.grav_5x5): 1 op = 1 complete orbit propagation — i.e. 180 RK4 steps spanning one ~90-minute LEO period at 30-second cadence.

Peak Speedup vs Brahe-Rayon

For each (task, non-baseline-config) pair, this is the peak speedup of that config's throughput over brahe-rust-rayon at any batch size on the ladder. Values greater than 1 mean the config beats Brahe somewhere along its ladder.

task astrojax-cpu_peak_speedup astrojax-gpu_peak_speedup astrojax-multigpu_peak_speedup
coordinates.enz_to_azel 22.38x 22.25x 27.51x
coordinates.geodetic_to_ecef 18.75x 20.21x 15.65x
coordinates.keplerian_to_cartesian 1.98x 19.58x 29.85x
force_model.grav_5x5 0.10x 6.47x 9.21x
frames.gcrf_to_itrf 0.12x 16.86x 16.22x
propagation.numerical_twobody_j2 0.49x 4.86x 5.04x
propagation.sgp4_iss_sweep 5.47x 948.76x 1206.41x
time.utc_mjd_to_tt_mjd 6.68x 19.14x 35.63x

Per-Task Throughput vs Batch Size

Each chart plots throughput (ops/s, log-log) for every config that completed at least one cell. The crossover point, where an Astrojax curve crosses above brahe-rust-rayon, is the batch size at which that config starts outperforming Brahe.

Time conversions

batch_size brahe-rust-rayon (ops/s) astrojax-cpu (ops/s) astrojax-gpu (ops/s) astrojax-multigpu (ops/s)
1 3.10M 45.2K 4.59K skipped:below_multigpu_min_batch
100 690K 4.60M 414K skipped:below_multigpu_min_batch
10K 26.4M 47.5M 27.1M skipped:below_multigpu_min_batch
1M 244M 120M 3.07B 692M
10M 352M 71.5M 6.73B 9.97B
100M 384M 131M 4.18B 13.7B

Coordinate transformations

batch_size brahe-rust-rayon (ops/s) astrojax-cpu (ops/s) astrojax-gpu (ops/s) astrojax-multigpu (ops/s)
1 6.32M 56.5K 980 skipped:below_multigpu_min_batch
100 111K 2.09M 78.2K skipped:below_multigpu_min_batch
10K 8.24M 8.72M 93.2M skipped:below_multigpu_min_batch
100K 53.3M 111M 306M 221M
1M 327M 160M 1.59B 4.97B
10M 980M 172M 19.8B 15.3B
batch_size brahe-rust-rayon (ops/s) astrojax-cpu (ops/s) astrojax-gpu (ops/s) astrojax-multigpu (ops/s)
1 2.94M 57.7K 2.97K skipped:below_multigpu_min_batch
100 715K 1.42M 466K skipped:below_multigpu_min_batch
10K 8.79M 2.43M 7.32M skipped:below_multigpu_min_batch
100K 42.2M 10.7M 507M 266M
1M 176M 19.8M 2.06B 2.16B
10M 176M 21.5M 3.45B 5.25B
batch_size brahe-rust-rayon (ops/s) astrojax-cpu (ops/s) astrojax-gpu (ops/s) astrojax-multigpu (ops/s)
1 7.92M 87.0K 2.91K skipped:below_multigpu_min_batch
100 191K 4.28M 833K skipped:below_multigpu_min_batch
10K 14.1M 8.10M 62.2M skipped:below_multigpu_min_batch
100K 66.2M 67.3M 365M 224M
1M 390M 101M 8.67B 6.94B
10M 752M 178M 6.87B 20.7B

Frame transformations

batch_size brahe-rust-rayon (ops/s) astrojax-cpu (ops/s) astrojax-gpu (ops/s) astrojax-multigpu (ops/s)
1 187K 6.06K 1.92K skipped:below_multigpu_min_batch
100 599K 69.7K 177K skipped:below_multigpu_min_batch
1K 1.75M 124K 1.69M skipped:below_multigpu_min_batch
10K 2.53M 160K 13.9M skipped:below_multigpu_min_batch
100K 2.53M 219K 42.6M 41.0M
1M 2.39M 270K 29.2M 29.3M

SGP4 propagation

batch_size brahe-rust-rayon (ops/s) astrojax-cpu (ops/s) astrojax-gpu (ops/s) astrojax-multigpu (ops/s)
1 107K 50.4K 3.57K skipped:below_multigpu_min_batch
100 355K 579K 405K skipped:below_multigpu_min_batch
1K 1.11M 381K 3.66M skipped:below_multigpu_min_batch
10K 1.28M 795K 32.2M skipped:below_multigpu_min_batch
100K 1.25M 4.48M 314M 310M
1M 1.26M 6.89M 1.20B 1.52B

Numerical two-body / J2 propagation

batch_size brahe-rust-rayon (ops/s) astrojax-cpu (ops/s) astrojax-gpu (ops/s) astrojax-multigpu (ops/s)
1 2.20K 30.6 8.33 skipped:below_multigpu_min_batch
10 9.78K 268 97.6 skipped:below_multigpu_min_batch
100 17.8K 1.99K 958 skipped:below_multigpu_min_batch
1K 18.3K 8.98K 9.27K 9.38K
10K 17.7K 5.83K 86.2K 89.4K

Full force model (5×5 spherical harmonic)

batch_size brahe-rust-rayon (ops/s) astrojax-cpu (ops/s) astrojax-gpu (ops/s) astrojax-multigpu (ops/s)
1 1.35K 28.1 1.19 skipped:below_multigpu_min_batch
10 3.85K 130 11.0 skipped:below_multigpu_min_batch
100 9.64K 721 108 skipped:below_multigpu_min_batch
1K 10.6K 1.02K 1.03K 1.04K
10K 10.6K 945 10.4K 10.4K
100K 10.2K skipped:budget_projected_exceeded 65.9K 93.9K

force_model.grav_5x5 is the clearest example of the crossover. Brahe's hand-tuned spherical-harmonic gravity saturates at ~11k orbits/s above batch ~1k (limited by CPU thread count). Astrojax's RK4 + SH(5×5) graph is much slower per element at small batches because of JIT-compile overhead and a heavier per-step XLA graph, but it scales nearly linearly with batch size and crosses Brahe at roughly batch 30k. By batch 100k single-GPU is 6.5× faster than Brahe and dual-GPU is 9.2× faster. The gravity model is closed over inside _propagate_one, so each device JIT-compiles its own copy and only the per-orbit initial states are sharded.

Notes on the GPU Comparison

dtype heterogeneity: The Brahe (f64) vs Astrojax-GPU (f32) mismatch is intentional. F32 on A100 tensor cores is the realistic deployment choice; forcing astrojax-GPU to f64 would halve its throughput and turn the comparison into "JAX overhead vs Rust" instead of "what's realistically achievable on A100s". Astrojax-CPU stays at f64 so the JAX-vs-Rust comparison on a CPU device is apples-to-apples.

Spawn-isolated astrojax-cpu: JAX can't host CPU and GPU devices in the same Python process once it's initialized. The runner launches every astrojax-cpu cell in a multiprocessing.spawn-ed child with JAX_PLATFORMS=cpu set on the child, which isolates the CPU JAX init from the parent (which has CUDA jaxlib loaded) at the cost of one process spawn plus JIT-compile per cell.

Input generation is vectorized: Per-row inputs come from numpy.random.default_rng(seed) (one uniform(low, high, batch_size) call per column) rather than Python loops. The same seed produces the same N distinct rows across runs. At batch 10^7 generation takes ~2 s; at batch 10^8 (the largest cell, used by the time tasks) it takes ~4 s. This is setup overhead and is not part of the timed window.

Per-task batch ladders are tuned: cheap tasks (time conversion, coordinate transforms) sweep up to 10^7 or 10^8 to find the GPU crossover; heavier per-element tasks (frames, numerical propagation, full force model) cap at 10^4–10^6 so individual cells stay within the 90 s wall-clock budget.

Astrojax data alignment: both backends use Brahe's bundled data/eop/finals.all.iau2000.txt and data/space_weather/sw19571001.txt. Astrojax's load_eop_from_file and load_sw_from_file accept Brahe's files directly with no conversion step. JAX 64-bit mode is enabled via JAX_ENABLE_X64=1 so the EOP table preserves f64 precision, and integrator-using kernels also call astrojax.set_dtype(jnp.float64) so the RK4 scan body's carry input and output dtypes match.


Reproducing These Results

Comparative benchmarks (Java/Python/Rust/Basilisk/GMAT):

GMAT setup is opt-in via environment variable. To include GMAT in the comparison, install GMAT R2026a locally and export GMAT_ROOT_PATH pointing at the install root (the directory containing bin/, data/, api/) before running just bench-compare-setup. Example (macOS): export GMAT_ROOT_PATH="/Applications/GMAT R2026a". If GMAT_ROOT_PATH is unset, the setup recipe and all benchmark runs skip GMAT silently — the other four baselines run normally.

Prerequisites: brew install just uv openjdk (macOS). The just recipes auto-detect Homebrew's openjdk and Rust's ~/.cargo/bin/cargo, so no manual JAVA_HOME export or symlink is needed. If Rust isn't installed yet, follow the rustup instructions. The Rust toolchain is required to build the Brahe and Nyx benchmark binaries.

# Optional: point at a local GMAT install to include GMAT in the comparison.
export GMAT_ROOT_PATH="/Applications/GMAT R2026a"

# One-time setup: build Java/Rust implementations, install Basilisk wheel,
# download OreKit data, and (if GMAT_ROOT_PATH is set) generate GMAT's
# api_startup_file.txt via the bundled BuildApiStartupFile.py.
just bench-compare-setup

# Run benchmarks, generate figures + CSV tables, and stage artifacts for commit
# (perf takes --iterations, accuracy takes --samples; both default to 100, but
#  200 gives noticeably tighter ±1σ bars on the performance charts.)
just bench-compare --iterations 200 --seed 42
just bench-compare-accuracy --samples 200 --seed 42
BRAHE_FIGURE_OUTPUT_DIR=./docs/figures/ uv run python plots/fig_comparative_benchmarks.py

# Review staged changes and commit
git status
git commit -m "Update benchmark data"

Individual steps can also be run separately:

1
2
3
4
5
6
# Run benchmarks only (results saved to benchmarks/comparative/results/)
just bench-compare --iterations 200 --seed 42
just bench-compare-accuracy --samples 200 --seed 42

# Regenerate figures and tables from existing results (without re-running benchmarks)
python plots/fig_comparative_benchmarks.py

Access benchmark (Brahe vs Skyfield):

uv run scripts/benchmark_access_three_way.py --n-locations 100 --seed 42 --output chart.html --plot-style scatter --csv accesses.csv

GPU comparison benchmarks (Brahe-Rust vs Astrojax on CPU / single GPU / multi-GPU):

Prerequisites: CUDA-capable GPU(s) and an installed CUDA driver. The gpu-comparison extra in pyproject.toml carries only the Python tooling deps (typer, rich, psutil). Astrojax and the CUDA-capable jaxlib are installed by the just bench-gpu-install recipe — astrojax editable from ~/repos/astrojax when present, otherwise from PyPI; jaxlib matched to the detected CUDA major version. They are intentionally NOT pinned in pyproject.toml so uv sync --all-extras --frozen works in CI without a sibling astrojax checkout.

# One-time setup: install brahe[gpu-comparison] + astrojax + a CUDA-matched
# jaxlib. NO_LOCAL=1 forces the PyPI astrojax install regardless of
# ~/repos/astrojax presence.
just bench-gpu-install
just bench-gpu-build                          # compiles bench_gpu_rust

# Run the full suite — 8 tasks × 4 configs × geometric batch ladder.
# JAX_ENABLE_X64=1 is required so EOP-backed tasks preserve f64 precision
# on the astrojax-CPU side.
JAX_ENABLE_X64=1 just bench-gpu --budget 90 --iterations 5

# Regenerate figures + CSVs from the most recent results JSON.
BRAHE_FIGURE_OUTPUT_DIR=./docs/figures/ uv run python plots/fig_gpu_comparison.py

# Inspect a results file as a per-task table without re-running:
just bench-gpu-inspect benchmarks/gpu_comparison/results/run_<timestamp>.json

Individual cells can be triaged with just bench-gpu-cell <task> <config> <batch>. Use just bench-gpu-list to see registered tasks.