X-bench is a controllable and end-to-end benchmark suite for evaluating filtered vector search—a core capability of modern vector databases that combines high-dimensional similarity search with structured scalar filtering. It is designed to reflect realistic filtered search scenarios, enabling fair, scalable, and explainable performance comparison across diverse database systems.
X-bench provides a modular, auto-scalable benchmarking pipeline covering data generation, query synthesis, and workload execution. The benchmark leverages statistically grounded methods—such as Johnson–Lindenstrauss–based random projection and distribution-aware query generation—to preserve vector-scalar correlations while flexibly controlling data dimension, scale, filtering rate, and filter correlation. This design enables systematic stress testing of vector databases under both static and dynamic workloads, including concurrent queries and online updates.
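To make the dimension-control idea concrete, here is a minimal sketch of a Gaussian random projection in the Johnson–Lindenstrauss style. It is not taken from the X-bench code base: the function names, dimensions, and seeds are illustrative assumptions. The point it shows is that pairwise distances survive the change of dimension approximately, while each row's scalar attributes can be carried over untouched, so the pairing between vectors and scalars is not disturbed.

```python
import math
import random

def random_projection_matrix(src_dim, dst_dim, seed=0):
    """Gaussian matrix with entries drawn from N(0, 1/dst_dim).

    With this scaling, squared Euclidean norms are preserved in expectation.
    (Hypothetical helper, not an X-bench API.)
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dst_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(src_dim)]
            for _ in range(dst_dim)]

def project(matrix, vector):
    """Map a src_dim vector to dst_dim by a plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Illustrative dimensions; a real run would use the dataset's own values.
src_dim, dst_dim = 256, 64
rng = random.Random(1)
a = [rng.gauss(0.0, 1.0) for _ in range(src_dim)]
b = [rng.gauss(0.0, 1.0) for _ in range(src_dim)]

R = random_projection_matrix(src_dim, dst_dim)
pa, pb = project(R, a), project(R, b)

# Ratio of projected distance to original distance; values near 1 mean
# the neighbourhood structure is roughly preserved.
ratio = math.dist(pa, pb) / math.dist(a, b)
print(f"distance ratio after projection: {ratio:.3f}")
```

A larger target dimension tightens the ratio around 1, which is the trade-off a benchmark would be tuning when it scales dimension up or down.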
A unified ranking-based metric (Vrank) aggregates performance across all six phases (index construction, filtered query, concurrent query, insertion, update, and deletion) into a single comparable score, providing a holistic view of system efficiency and robustness.
X-bench is system-agnostic and can be easily adapted to a wide range of vector databases and indexing strategies. It encourages combined software–hardware optimization and fair comparison under standardized configurations, promoting deeper understanding of the trade-offs in filtered vector search and driving innovation in next-generation vector data systems.
With its controllable workloads, realistic data distributions, and unified metric, X-bench offers the first principled framework for ranking and understanding filtered vector search performance at scale.
X-bench adopts a six-phase end-to-end evaluation workflow to comprehensively measure vector database performance across static and dynamic workloads. Each phase mirrors a real-world hybrid search stage, and the unified Vrank metric aggregates all results.
Load the initial portion of the dataset and build vector indexes.
Measures: Index construction latency (T₀)
Execute filtered vector search queries over the indexed data.
Measures: Average latency and recall
Run multiple filtered queries simultaneously to assess scalability.
Measures: Queries per second (QPS)
Insert the remaining dataset portion and trigger incremental index updates.
Measures: Insertion and maintenance time (T₁)
Modify a subset of scalar attributes to evaluate index maintenance under updates.
Measures: Update latency (T₂)
Delete part of the dataset to study removal overhead and cleanup efficiency.
Measures: Deletion latency (T₃)
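The query-phase measurements above can be illustrated with a small sketch that computes exact filtered top-k results by brute force, then scores a returned result list against them and times the call. Everything here is an assumption for illustration: the helper names, the synthetic data, the `attr < 20` filter, and the reading of "filtering rate" as the fraction of rows that pass the filter. A real run would replace the brute-force call with a query against the database under test.

```python
import math
import random
import time

def filtered_top_k(vectors, attrs, query, predicate, k):
    """Exact filtered search: keep rows whose scalar attribute passes
    the predicate, then return the ids of the k nearest by distance."""
    candidates = [
        (math.dist(vec, query), idx)
        for idx, (vec, attr) in enumerate(zip(vectors, attrs))
        if predicate(attr)
    ]
    candidates.sort()
    return [idx for _, idx in candidates[:k]]

def recall_at_k(returned_ids, true_ids):
    """Fraction of the true neighbours that appear in the returned list."""
    if not true_ids:
        return 1.0
    return len(set(returned_ids) & set(true_ids)) / len(true_ids)

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Synthetic data: 1000 vectors of dimension 8, one integer attribute each.
rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(1000)]
attrs = [rng.randrange(100) for _ in range(1000)]
query = [rng.random() for _ in range(8)]

def predicate(attr):
    return attr < 20  # hypothetical scalar filter

filtering_rate = sum(predicate(a) for a in attrs) / len(attrs)
truth, latency = timed(filtered_top_k, vectors, attrs, query, predicate, 10)

# Stand-in for a system's approximate answer: it misses two true neighbours.
simulated = truth[:8]
print(f"filtering rate: {filtering_rate:.2f}")
print(f"latency: {latency * 1000:.2f} ms")
print(f"recall@10: {recall_at_k(simulated, truth):.2f}")
```

The same `timed` wrapper could be applied to the build, insert, update, and delete calls to obtain T₀ through T₃, and QPS follows from dividing the number of completed queries by the wall-clock time of a concurrent run.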
All six phases are aggregated into the unified Vrank score, delivering a holistic ranking that captures indexing throughput, query efficiency, concurrency resilience, and dynamic maintenance cost.
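The exact Vrank formula is not given in this text, so the following is only a sketch of one common rank-aggregation scheme, the mean of per-metric ranks, to show how measurements with different units can be folded into one score. The system names and all numbers are made-up placeholders, and the choice of mean rank with shared ranks for ties is an assumption rather than the Vrank definition.

```python
# Which direction counts as "better" for each measured quantity.
HIGHER_IS_BETTER = {
    "T0": False,       # index construction latency
    "latency": False,  # average query latency
    "recall": True,
    "qps": True,
    "T1": False,       # insertion and maintenance time
    "T2": False,       # update latency
    "T3": False,       # deletion latency
}

def rank_systems(values, higher_is_better):
    """Return {system: rank} with rank 1 as best; ties share the better rank."""
    ordered = sorted(values.items(), key=lambda kv: kv[1],
                     reverse=higher_is_better)
    ranks = {}
    for pos, (name, val) in enumerate(ordered, start=1):
        prev = ordered[pos - 2] if pos > 1 else None
        if prev is not None and prev[1] == val:
            ranks[name] = ranks[prev[0]]
        else:
            ranks[name] = pos
    return ranks

def mean_rank(results, directions):
    """Average each system's rank over all metrics (lower is better)."""
    totals = {system: 0 for system in results}
    for metric, higher in directions.items():
        per_metric = {s: results[s][metric] for s in results}
        for system, rank in rank_systems(per_metric, higher).items():
            totals[system] += rank
    return {s: totals[s] / len(directions) for s in results}

# Placeholder measurements for three hypothetical systems.
results = {
    "system_a": {"T0": 120, "latency": 4.0, "recall": 0.95, "qps": 900,
                 "T1": 60, "T2": 8, "T3": 5},
    "system_b": {"T0": 90, "latency": 6.0, "recall": 0.99, "qps": 700,
                 "T1": 75, "T2": 6, "T3": 7},
    "system_c": {"T0": 150, "latency": 5.0, "recall": 0.90, "qps": 1100,
                 "T1": 50, "T2": 9, "T3": 6},
}

scores = mean_rank(results, HIGHER_IS_BETTER)
for system, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{system}: mean rank {score:.2f}")
```

Ranking before averaging removes the unit problem (seconds versus queries per second versus a recall fraction), at the cost of discarding how large the gaps between systems are.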
Researchers and engineers across search, recommendation, and data infrastructure collaborate to advance benchmarking standards.