Bounded Multi-Source Shortest Paths: Algorithm Theory, Implementation Comparison, and Benchmarking
This page provides detailed information about each language implementation of the BMSSP algorithm, including performance characteristics, build instructions, and implementation notes.
All implementations follow the same standardized interface:
./bmssp --graph <type> --rows <r> --cols <c> --k <sources> --B <bound> --seed <s> --trials <n> --json
{
"impl": "rust-bmssp",
"lang": "Rust",
"graph": "grid",
"n": 2500,
"m": 9800,
"k": 4,
"B": 50,
"seed": 1,
"time_ns": 741251,
"popped": 868,
"edges_scanned": 3423,
"heap_pushes": 1047,
"B_prime": 50,
"mem_bytes": 241824
}
Based on empirical benchmarks in GitHub Actions environment:
🚀 Fastest (< 200μs): C, C++, Rust
⚡ Fast (200μs - 2ms): Nim, Crystal
🐌 Slower (> 5ms): Kotlin, Elixir, Erlang
Location: impls/c/
Build: make
Binary: bin/bmssp_c
cd impls/c
make
./bin/bmssp_c --graph grid --rows 50 --cols 50 --k 4 --B 50 --seed 1 --trials 5 --json
Performance: ⭐⭐⭐⭐⭐ Fastest overall
Typical time: ~99μs for 50×50 grid, k=4, B=50
Implementation highlights:
Code structure:
typedef struct {
uint32_t node;
uint64_t dist;
} HeapEntry;
typedef struct {
HeapEntry* data;
size_t size, capacity;
} BinaryHeap;
Location: impls/cpp/
Build: make
Binary: bin/bmssp_cpp
Performance: ⭐⭐⭐⭐⭐ Near C performance
Typical time: ~117μs for 50×50 grid, k=4, B=50
Implementation highlights:
priority_queue with custom comparatorCode structure:
using PQEntry = std::pair<uint64_t, uint32_t>;
std::priority_queue<PQEntry, std::vector<PQEntry>, std::greater<PQEntry>> pq;
Location: bmssp/ (Cargo crate)
Build: cargo build --release
Binary: target/release/bmssp
Performance: ⭐⭐⭐⭐ Excellent
Typical time: ~741μs for 50×50 grid, k=4, B=50
Implementation highlights:
BinaryHeap from standard libraryCode structure:
use std::collections::BinaryHeap;
#[derive(Copy, Clone, Eq, PartialEq)]
struct State {
cost: u64,
position: usize,
}
impl Ord for State {
fn cmp(&self, other: &Self) -> Ordering {
other.cost.cmp(&self.cost) // Min-heap
}
}
Cargo features:
cargo test # Run test suite
cargo bench -p bmssp # Benchmark suite
cargo doc --open # Generate docs
Location: impls/nim/
Build: nim c -d:release src/bmssp.nim
Binary: src/bmssp
Performance: ⭐⭐⭐⭐ Fast
Typical time: ~2ms for 50×50 grid, k=4, B=50
Implementation highlights:
heapqueue moduleLocation: impls/crystal/
Build: shards build --release
Binary: bin/bmssp_cr
Performance: ⭐⭐⭐⭐ Fast
Typical time: ~2ms for 50×50 grid, k=4, B=50
Implementation highlights:
Build and run:
cd impls/crystal
shards build --release
./bin/bmssp_cr --graph grid --rows 20 --cols 20 --k 8 --B 50 --seed 1 --trials 2 --json
Location: impls/kotlin/
Build: gradle shadowJar
Binary: build/libs/bmssp-all.jar
Performance: ⭐⭐ Slower (JVM overhead)
Typical time: ~5.3ms for 50×50 grid, k=4, B=50
Implementation highlights:
Run:
cd impls/kotlin
gradle shadowJar
java -jar build/libs/bmssp-all.jar --graph grid --rows 50 --cols 50 --k 4 --B 50
Location: impls/elixir/
Build: No build step (interpreted)
Script: bmssp.exs
Performance: ⭐⭐ Slower (BEAM VM)
Typical time: ~5.4ms for 50×50 grid, k=4, B=50
Implementation highlights:
Run:
cd impls/elixir
elixir bmssp.exs --graph grid --rows 50 --cols 50 --k 4 --B 50
Location: impls/erlang/
Build: erlc bmssp.erl
Binary: bmssp.beam
Performance: ⭐⭐⭐ Moderate (BEAM VM)
Typical time: ~1.2ms for 50×50 grid, k=4, B=50
Implementation highlights:
All implementations support three graph types:
--graph grid --rows R --cols C
--graph er --n N --p P
--graph ba --n N --m0 M0 --m M
For deterministic comparison across languages:
python3 bench/runner.py --shared-inputs --include-impls rust,c,cpp
This generates graphs once and reuses them, ensuring identical inputs.
Each implementation reports peak memory usage:
| Priority queue: Up to $O( | U | )$ entries |
All implementations must:
popped, edges_scanned, B_prime# Install dependencies
scripts/install_deps.sh --yes
# Build all implementations
python3 bench/runner.py --build-only
# Quick test (subset of languages)
python3 bench/runner.py --quick --include-impls rust,c,cpp --out results-test
# Full benchmark suite
python3 bench/runner.py --release --out results --timeout-seconds 600
See CONTRIBUTING.md for detailed instructions on adding new language implementations.
Requirements:
bench/runner.pyVerification checklist:
| ← Algorithm Theory | Benchmarking → |