
Architecture Design

This project is a lightweight embedded vector database implemented in pure Go, providing high-performance, low-latency vector similarity search for local applications, desktop programs, and edge devices. It offers:

  • Qdrant-compatible REST API (optional microservice mode)
  • Embedded library mode: zero network overhead, direct use in applications
  • HNSW approximate nearest neighbor index, supporting sub-linear search complexity
  • BoltDB-based persistence using Protobuf serialization
  • Support for multiple distance metrics (Cosine, Euclidean, Dot product)
  • Advanced metadata filtering (exact, range, prefix, regex, contains)
  • Optional 8-bit scalar quantization to reduce disk usage
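The three distance metrics can be sketched in plain Go. These helpers only illustrate the math; the actual function names and signatures in core are not shown here:

```go
package main

import (
	"fmt"
	"math"
)

// dot returns the inner product of two equal-length vectors.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// euclidean returns the L2 distance between a and b.
func euclidean(a, b []float32) float32 {
	var s float64
	for i := range a {
		d := float64(a[i] - b[i])
		s += d * d
	}
	return float32(math.Sqrt(s))
}

// cosine returns the cosine similarity of a and b.
func cosine(a, b []float32) float32 {
	na := math.Sqrt(float64(dot(a, a)))
	nb := math.Sqrt(float64(dot(b, b)))
	if na == 0 || nb == 0 {
		return 0
	}
	return float32(float64(dot(a, b)) / (na * nb))
}

func main() {
	a := []float32{1, 0, 0}
	b := []float32{0, 1, 0}
	fmt.Println(dot(a, b), euclidean(a, b), cosine(a, b)) // 0 1.4142135 0
}
```

Note that cosine and dot are similarities (higher is closer) while Euclidean is a distance (lower is closer), so result ordering must respect the chosen metric.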

Project Structure

The repository uses a layered structure organized by "functional domains":

  • cmd/govector: Standalone service entry point, starts HTTP API server
  • api: HTTP API layer, exposing Qdrant-compatible interfaces
  • core: Core engine layer, containing collection management, indexing, storage, models and quantization
  • example/embedded: Embedded library usage examples
  • Scripts: Build, release and demo scripts
```mermaid
graph TB
    subgraph "Application Layer"
        EMB["Embedded Application<br/>example/embedded/main.go"]
        SVC["Standalone Service<br/>cmd/govector/main.go"]
    end
    subgraph "API Layer"
        API["HTTP API Server<br/>api/server.go"]
    end
    subgraph "Business Logic Layer"
        COL["Collection<br/>core/collection.go"]
        IDX_IF["VectorIndex Interface<br/>core/index.go"]
        HNSW["HNSW Index Implementation<br/>core/hnsw_index.go"]
        FLAT["Flat Index Implementation<br/>core/flat_index.go"]
        QUANT["Quantizer Interface/Implementation<br/>core/quantization.go"]
    end
    subgraph "Storage Layer"
        STORE["Storage<br/>core/storage.go"]
        BBOLT["BoltDB Engine<br/>external dependency in go.mod"]
        PROTO["Protobuf Serialization<br/>core/proto/*.proto"]
    end
    EMB --> COL
    SVC --> API
    API --> COL
    COL --> IDX_IF
    IDX_IF --> HNSW
    IDX_IF --> FLAT
    COL --> STORE
    STORE --> BBOLT
    STORE --> PROTO
    QUANT --> STORE
```

Core Components

  • Storage Layer: bbolt-based key-value storage, one bucket per collection; collection metadata stored in special bucket; uses Protobuf serialization for point data; supports optional 8-bit scalar quantization (SQ8)
  • Index Layer (VectorIndex interface and implementations): Unified abstraction supporting flat brute-force index (FlatIndex) and HNSW graph index (HNSWIndex), transparently switchable at runtime
  • Business Logic Layer (Collection): Encapsulates collection dimension, distance metric, concurrency control, persistence consistency guarantee; provides Upsert/Search/Delete
  • API Layer (Server): Provides Qdrant-compatible REST interface, responsible for routing, parameter parsing, error handling and result encoding; supports automatic loading of persisted collections

Architecture Overview

The system adopts a four-layer architecture of "Storage Layer - Index Layer - Business Logic Layer - API Layer", combined with "Embedded Library Mode" and "Standalone Service Mode" to meet different deployment scenarios.

```mermaid
graph TB
    Client["Client/Application"] --> API["API Layer<br/>api/server.go"]
    API --> COL["Business Logic Layer<br/>core/collection.go"]
    COL --> IDX["Index Layer Interface<br/>core/index.go"]
    IDX --> HNSW["HNSW Implementation<br/>core/hnsw_index.go"]
    IDX --> FLAT["Flat Implementation<br/>core/flat_index.go"]
    COL --> STORE["Storage Layer<br/>core/storage.go"]
    STORE --> BBOLT["bbolt database"]
    STORE --> PROTO["Protobuf Serialization"]
    STORE --> QUANT["SQ8 Quantizer"]
```

Detailed Component Analysis

API Layer (HTTP Server)

Responsibilities and Features:

  • Provides Qdrant-compatible endpoints: collection management, point upsert, query, delete
  • Automatically loads collection metadata from storage and initializes the in-memory index
  • Thread safety: a mutex protects the HTTP server and the collection map
  • Error handling: returns standard status codes for invalid JSON, non-existent collections, and internal errors

Key Flow (using search as example):

```mermaid
sequenceDiagram
    participant C as "Client"
    participant S as "Server (api/server.go)"
    participant M as "Collection"
    participant I as "Index (VectorIndex)"
    participant ST as "Storage"
    C->>S: POST /collections/{name}/points/search
    S->>S: Parse request body / validate parameters
    S->>M: Search(query, filter, limit)
    M->>I: Search(query, filter, topK)
    I->>I: HNSW graph traversal / flat scan
    I-->>M: TopK results (with ID/Payload)
    M-->>S: Results
    S-->>C: JSON response
```

Business Logic Layer (Collection)

Responsibilities and Features:

  • Uniformly manages collection dimension, distance metric, and concurrent reads and writes
  • Saves collection metadata at creation and ensures the collection bucket exists in storage
  • Upsert strictly follows the "persist first, then update the in-memory index" order, with best-effort rollback on failure
  • Search/Count/Delete expose a consistent interface and execute through the index layer internally

Upsert Flow (with persistence consistency):

```mermaid
flowchart TD
    Start(["Enter Upsert"]) --> Lock["Acquire write lock"]
    Lock --> Validate["Validate dimension / generate version"]
    Validate --> Persist{"Storage configured?"}
    Persist -->|Yes| Save["Storage layer write"]
    Persist -->|No| UpdateMem["Skip storage"]
    Save --> UpdateMem
    UpdateMem --> IndexUpsert["Index layer Upsert"]
    IndexUpsert --> Ok{"Success?"}
    Ok -->|Yes| Unlock["Unlock and return"]
    Ok -->|No| Rollback["Delete written points (best effort)"]
    Rollback --> Unlock
```
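The "persist first, then update the index" ordering with best-effort rollback can be sketched as follows. The Point type, the map-backed stores, and the injected index function are simplifications, not the actual core types:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Point is a simplified vector record (the real model lives in core).
type Point struct {
	ID     string
	Vector []float32
}

// Collection sketches the upsert ordering: persist first, then update
// the in-memory index, rolling back the persisted write on failure.
type Collection struct {
	mu          sync.Mutex
	store       map[string]Point  // stand-in for the bbolt-backed storage layer
	index       map[string]Point  // stand-in for the in-memory vector index
	indexUpsert func(Point) error // injected so the failure path can be shown
}

func (c *Collection) Upsert(p Point) error {
	c.mu.Lock()
	defer c.mu.Unlock()

	// 1. Persist first: if storage fails, the index is never touched.
	c.store[p.ID] = p

	// 2. Then update the in-memory index.
	if err := c.indexUpsert(p); err != nil {
		// 3. Best-effort rollback so storage and index do not diverge.
		delete(c.store, p.ID)
		return fmt.Errorf("index upsert failed, rolled back: %w", err)
	}
	c.index[p.ID] = p
	return nil
}

func main() {
	c := &Collection{
		store:       map[string]Point{},
		index:       map[string]Point{},
		indexUpsert: func(Point) error { return errors.New("index full") },
	}
	err := c.Upsert(Point{ID: "p1", Vector: []float32{0.1, 0.2}})
	fmt.Println(err != nil, len(c.store)) // true 0
}
```

Because the rollback is best-effort, a crash between the storage write and the index update can still leave a persisted point that is not indexed until the next reload.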

Index Layer (VectorIndex Abstraction and Implementation)

  • Abstraction interface: Unified Upsert/Search/Delete/Count/filter by condition capabilities
  • HNSW implementation: Based on external library, supports custom M, EfSearch and other parameters; sets distance function according to distance metric; supports post-filtering by filter condition (current strategy)
  • Flat implementation: Maintains point map in memory, brute-force distance calculation, suitable for small-scale or scenarios requiring exact retrieval
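The interface below is a sketch inferred from the description; the exact method set in core/index.go may differ. A minimal flat index shows how the abstraction admits a brute-force implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// Result pairs a point ID with its similarity score.
type Result struct {
	ID    string
	Score float32
}

// VectorIndex sketches the unified index abstraction; the real
// interface likely carries payloads and a filter type as well.
type VectorIndex interface {
	Upsert(id string, vector []float32)
	Delete(id string)
	Count() int
	Search(query []float32, topK int) []Result
}

// FlatIndex is a brute-force implementation: every query scans all points.
type FlatIndex struct {
	points map[string][]float32
}

func NewFlatIndex() *FlatIndex { return &FlatIndex{points: map[string][]float32{}} }

func (f *FlatIndex) Upsert(id string, v []float32) { f.points[id] = v }
func (f *FlatIndex) Delete(id string)              { delete(f.points, id) }
func (f *FlatIndex) Count() int                    { return len(f.points) }

// Search scores every point by dot product and returns the topK.
func (f *FlatIndex) Search(query []float32, topK int) []Result {
	results := make([]Result, 0, len(f.points))
	for id, v := range f.points {
		var s float32
		for i := range query {
			s += query[i] * v[i]
		}
		results = append(results, Result{ID: id, Score: s})
	}
	sort.Slice(results, func(i, j int) bool { return results[i].Score > results[j].Score })
	if len(results) > topK {
		results = results[:topK]
	}
	return results
}

func main() {
	var idx VectorIndex = NewFlatIndex()
	idx.Upsert("a", []float32{1, 0})
	idx.Upsert("b", []float32{0, 1})
	fmt.Println(idx.Search([]float32{1, 0.1}, 1)[0].ID) // a
}
```

Because Collection only depends on the interface, an HNSW-backed implementation can be swapped in without touching the business logic layer.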

HNSW Search Flow (with post-filtering):

```mermaid
flowchart TD
    Enter(["Enter Search"]) --> CheckFilter{"Filter exists?"}
    CheckFilter -->|Yes| FetchK["Expand fetch K (estimate post-filter count)"]
    CheckFilter -->|No| UseTopK["Use topK"]
    FetchK --> Graph["HNSW graph search"]
    UseTopK --> Graph
    Graph --> Iterate["Traverse neighbor nodes"]
    Iterate --> Lookup["Look up point by ID / validate Payload"]
    Lookup --> Calc["Recalculate score by metric"]
    Calc --> Collect["Collect results"]
    Collect --> Done(["Return TopK"])
```
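The post-filtering strategy (over-fetch, then filter, then truncate) can be sketched as below. The expansion factor of 4 is an illustrative heuristic, not the value used in core:

```go
package main

import "fmt"

// searchWithPostFilter over-fetches candidates from an ANN search and
// applies the payload filter afterwards, truncating to topK.
func searchWithPostFilter(ann func(k int) []string, match func(id string) bool, topK int) []string {
	const expand = 4 // hypothetical over-fetch factor
	candidates := ann(topK * expand)
	out := make([]string, 0, topK)
	for _, id := range candidates {
		if match(id) {
			out = append(out, id)
			if len(out) == topK {
				break
			}
		}
	}
	return out
}

func main() {
	// Fake ANN search: returns IDs in descending similarity order.
	ann := func(k int) []string {
		all := []string{"p1", "p2", "p3", "p4", "p5", "p6"}
		if k < len(all) {
			all = all[:k]
		}
		return all
	}
	// Filter that only admits even-numbered points.
	even := map[string]bool{"p2": true, "p4": true, "p6": true}
	fmt.Println(searchWithPostFilter(ann, func(id string) bool { return even[id] }, 2))
	// [p2 p4]
}
```

The trade-off: post-filtering is simple but can return fewer than topK results when the filter is very selective, which is why the fetch K is expanded first.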

Storage Layer (Storage)

  • Persistence model: One bucket per collection; collection metadata stored in special bucket
  • Serialization: Uses Protobuf to serialize point structures to bytes, supporting multi-type payloads
  • Quantization: Optional SQ8 quantization, compresses on write, decompresses on read
  • Concurrency: Based on bbolt's transaction model, provides read-only/read-write transactions
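Scalar quantization to 8 bits can be sketched as a per-vector linear mapping into [0, 255]. This is an illustration of the SQ8 idea; the actual codec in core/quantization.go may store its parameters differently:

```go
package main

import (
	"fmt"
	"math"
)

// quantizeSQ8 maps each component linearly onto [0, 255], returning the
// codes plus the (min, scale) pair needed to decode.
func quantizeSQ8(v []float32) (codes []uint8, min, scale float32) {
	min, max := v[0], v[0]
	for _, x := range v {
		if x < min {
			min = x
		}
		if x > max {
			max = x
		}
	}
	scale = (max - min) / 255
	codes = make([]uint8, len(v))
	if scale == 0 {
		return codes, min, scale // constant vector: all codes are 0
	}
	for i, x := range v {
		codes[i] = uint8(math.Round(float64((x - min) / scale)))
	}
	return codes, min, scale
}

// dequantizeSQ8 reverses the mapping; error is at most half a step.
func dequantizeSQ8(codes []uint8, min, scale float32) []float32 {
	out := make([]float32, len(codes))
	for i, c := range codes {
		out[i] = min + float32(c)*scale
	}
	return out
}

func main() {
	v := []float32{-1, 0, 0.5, 1}
	codes, min, scale := quantizeSQ8(v)
	fmt.Println(codes)                            // 4 bytes instead of 16
	fmt.Println(dequantizeSQ8(codes, min, scale)) // close to the original values
}
```

The 4x space saving (one byte per float32 component, plus two floats of per-vector overhead) is what makes SQ8 attractive for large collections.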

Data Models and Filtering

  • PointStruct/SimplePoint: Contains ID, version, vector, payload
  • Filter/Condition: Supports Must/MustNot, multiple match types (exact, range, prefix, contains, regex)
  • Matching algorithm: Handles string, array, numeric types separately, supports regex compilation and matching
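The per-type string matching can be sketched with the standard library. The Condition shape and type names here are assumptions standing in for the actual core model (numeric range matching is omitted for brevity):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Condition is a simplified stand-in for the core filter condition.
type Condition struct {
	Type  string // "exact", "prefix", "contains", or "regex"
	Value string
}

// matchString dispatches on the match type.
func matchString(c Condition, s string) bool {
	switch c.Type {
	case "exact":
		return s == c.Value
	case "prefix":
		return strings.HasPrefix(s, c.Value)
	case "contains":
		return strings.Contains(s, c.Value)
	case "regex":
		ok, err := regexp.MatchString(c.Value, s)
		return err == nil && ok
	}
	return false
}

// evaluate combines Must and MustNot clauses: every Must condition has
// to hold and no MustNot condition may hold.
func evaluate(must, mustNot []Condition, s string) bool {
	for _, c := range must {
		if !matchString(c, s) {
			return false
		}
	}
	for _, c := range mustNot {
		if matchString(c, s) {
			return false
		}
	}
	return true
}

func main() {
	must := []Condition{{Type: "prefix", Value: "img/"}}
	mustNot := []Condition{{Type: "regex", Value: `\.tmp$`}}
	fmt.Println(evaluate(must, mustNot, "img/cat.png"), evaluate(must, mustNot, "img/cat.tmp"))
	// true false
}
```

In a production path, compiled regexes should be cached per condition rather than recompiled on every point, since regexp.MatchString compiles on each call.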

Standalone Service and Embedded Mode

  • Standalone service: the command-line entry point starts the HTTP server, loads the default collections, and registers them with the API server
  • Embedded library: the application imports the core package directly, creates a Storage/Collection, and performs Upsert/Search

Dependency Analysis

  • External dependencies
      • HNSW graph library: used for approximate nearest neighbor search
      • bbolt: embedded key-value database
      • Protobuf: structured serialization
  • Internal coupling
      • The API layer depends on Collection
      • Collection depends on the VectorIndex interface and Storage
      • The HNSW/Flat implementations depend on Collection's data structures
      • Storage depends on the Protobuf definitions and bbolt
```mermaid
graph LR
    API["api/server.go"] --> COL["core/collection.go"]
    COL --> IDX["core/index.go"]
    IDX --> HNSW["core/hnsw_index.go"]
    IDX --> FLAT["core/flat_index.go"]
    COL --> STORE["core/storage.go"]
    STORE --> BBOLT["bbolt"]
    STORE --> PROTO["protobuf"]
    STORE --> QUANT["core/quantization.go"]
```

Performance and Scalability

  • Index selection
      • HNSW: sub-linear search complexity, suitable for large-scale data; tunable via parameters (M, EfSearch, EfConstruction, K)
      • Flat index: O(n) brute-force search, suitable for small datasets or exact retrieval
  • Persistence and serialization
      • Uses bbolt and Protobuf, balancing throughput and reliability
      • SQ8 quantization significantly reduces disk usage, suitable for large-scale vector storage
  • Concurrency and consistency
      • Collection uses a read-write lock, and the API layer also locks the collection map, avoiding races
      • Upsert strictly follows the "persist first, then update index" order, with best-effort rollback on failure
  • Scalability recommendations
      • Horizontal scaling: the current design is single-machine and embedded, with no distributed replication; if horizontal scaling is needed, shard collections at the application layer or introduce a proxy layer
      • Parameter tuning: adjust HNSW parameters to the data scale and query characteristics; enable quantization to save space
      • Caching: a query cache can be added at an outer layer (keep it synchronized with versions/updates)

Security and Operations

  • Security
      • No built-in authentication/authorization or TLS yet; add authentication and encryption at a reverse proxy layer
      • Input validation: the API layer validates JSON, collection name, dimension, metric, etc.
  • Monitoring
      • Request count, latency, and error rate metrics can be added at the API layer, combined with log output
  • Disaster recovery
      • The bbolt file is the data file; regular backups are recommended. Collection metadata and points are reloaded automatically after restart
      • Best-effort rollback on Upsert failure reduces the risk of inconsistency

Troubleshooting Guide

  • Startup failure
      • Check bbolt database file permissions and path
      • Check storage layer open errors and collection metadata loading errors
  • Query exceptions
      • Confirm the query vector dimension matches the collection dimension
      • Check the filter conditions (Must/MustNot, match types)
  • Write failure
      • Check the Upsert persistence and index update order; inspect rollback logs
  • Service shutdown
      • Use graceful shutdown: wait for in-flight connections to finish or time out

Conclusion

This project implements the core capabilities of an embedded vector database with a clear layered architecture: high-performance retrieval (HNSW), reliable persistence (bbolt + Protobuf), flexible filtering, and dual-mode deployment (embedded/microservice). Through the abstracted index interface and strict persistence ordering, the system strikes a good balance between availability and consistency. Future work could focus on pushing filters down into the index, horizontal scaling, and improved observability.