
Architecture Design

This project is a lightweight embedded vector database implemented in pure Go, providing high-performance, low-latency vector similarity search for local applications, desktop programs, and edge devices. It offers:

  • Qdrant-compatible REST API (optional microservice mode)
  • Embedded library mode: zero network overhead, direct use in applications
  • HNSW approximate nearest neighbor index, supporting sub-linear search complexity
  • BoltDB-based persistence using Protobuf serialization
  • Support for multiple distance metrics (Cosine, Euclidean, Dot product)
  • Advanced metadata filtering (exact, range, prefix, regex, contains)
  • Optional 8-bit scalar quantization to reduce disk usage
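The three distance metrics can be sketched in plain Go. These helpers only illustrate the math; the actual function names and signatures in core are not shown here:

```go
package main

import (
	"fmt"
	"math"
)

// dot returns the inner product of two equal-length vectors.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// euclidean returns the L2 distance between a and b.
func euclidean(a, b []float32) float32 {
	var s float64
	for i := range a {
		d := float64(a[i] - b[i])
		s += d * d
	}
	return float32(math.Sqrt(s))
}

// cosine returns the cosine similarity of a and b.
func cosine(a, b []float32) float32 {
	na := math.Sqrt(float64(dot(a, a)))
	nb := math.Sqrt(float64(dot(b, b)))
	if na == 0 || nb == 0 {
		return 0
	}
	return float32(float64(dot(a, b)) / (na * nb))
}

func main() {
	a := []float32{1, 0, 0}
	b := []float32{0, 1, 0}
	fmt.Println(dot(a, b), euclidean(a, b), cosine(a, b)) // 0 1.4142135 0
}
```

Note that cosine and dot are similarities (higher is closer) while Euclidean is a distance (lower is closer), so result ordering must respect the chosen metric.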

Project Structure

The repository uses a layered structure organized by "functional domains":

  • cmd/govector: Standalone service entry point, starts HTTP API server
  • api: HTTP API layer, exposing Qdrant-compatible interfaces
  • core: Core engine layer, containing collection management, indexing, storage, models and quantization
  • example/embedded: Embedded library usage examples
  • Scripts: Build, release and demo scripts
```mermaid
graph TB
    subgraph "Application Layer"
        EMB["Embedded Application<br/>example/embedded/main.go"]
        SVC["Standalone Service<br/>cmd/govector/main.go"]
    end
    subgraph "API Layer"
        API["HTTP API Server<br/>api/server.go"]
    end
    subgraph "Business Logic Layer"
        COL["Collection<br/>core/collection.go"]
        IDX_IF["VectorIndex Interface<br/>core/index.go"]
        HNSW["HNSW Index Implementation<br/>core/hnsw_index.go"]
        FLAT["Flat Index Implementation<br/>core/flat_index.go"]
        QUANT["Quantizer Interface/Implementation<br/>core/quantization.go"]
    end
    subgraph "Storage Layer"
        STORE["Storage<br/>core/storage.go"]
        BBOLT["BoltDB Engine<br/>external dependency in go.mod"]
        PROTO["Protobuf Serialization<br/>core/proto/*.proto"]
    end
    EMB --> COL
    SVC --> API
    API --> COL
    COL --> IDX_IF
    IDX_IF --> HNSW
    IDX_IF --> FLAT
    COL --> STORE
    STORE --> BBOLT
    STORE --> PROTO
    QUANT --> STORE
```

Core Components

  • Storage Layer: bbolt-based key-value storage, one bucket per collection; collection metadata stored in special bucket; uses Protobuf serialization for point data; supports optional 8-bit scalar quantization (SQ8)
  • Index Layer (VectorIndex interface and implementations): Unified abstraction supporting flat brute-force index (FlatIndex) and HNSW graph index (HNSWIndex), transparently switchable at runtime
  • Business Logic Layer (Collection): Encapsulates collection dimension, distance metric, concurrency control, persistence consistency guarantee; provides Upsert/Search/Delete
  • API Layer (Server): Provides Qdrant-compatible REST interface, responsible for routing, parameter parsing, error handling and result encoding; supports automatic loading of persisted collections

Architecture Overview

The system adopts a four-layer architecture of "Storage Layer - Index Layer - Business Logic Layer - API Layer", combined with "Embedded Library Mode" and "Standalone Service Mode" to meet different deployment scenarios.

```mermaid
graph TB
    Client["Client/Application"] --> API["API Layer<br/>api/server.go"]
    API --> COL["Business Logic Layer<br/>core/collection.go"]
    COL --> IDX["Index Layer Interface<br/>core/index.go"]
    IDX --> HNSW["HNSW Implementation<br/>core/hnsw_index.go"]
    IDX --> FLAT["Flat Implementation<br/>core/flat_index.go"]
    COL --> STORE["Storage Layer<br/>core/storage.go"]
    STORE --> BBOLT["bbolt database"]
    STORE --> PROTO["Protobuf Serialization"]
    STORE --> QUANT["SQ8 Quantizer"]
```

Detailed Component Analysis

API Layer (HTTP Server)

Responsibilities and Features:

  • Provides Qdrant-compatible endpoints: collection management, point upsert, query, delete
  • Automatically loads collection metadata from storage and initializes the in-memory index
  • Thread safety: a mutex protects the HTTP server and the collection map
  • Error handling: returns standard status codes for invalid JSON, non-existent collections, and internal errors

Key Flow (using search as example):

```mermaid
sequenceDiagram
    participant C as "Client"
    participant S as "Server (api/server.go)"
    participant M as "Collection"
    participant I as "Index (VectorIndex)"
    participant ST as "Storage"
    C->>S: POST /collections/{name}/points/search
    S->>S: Parse request body / validate parameters
    S->>M: Search(query, filter, limit)
    M->>I: Search(query, filter, topK)
    I->>I: HNSW graph traversal / flat scan
    I-->>M: TopK results (with ID/Payload)
    M-->>S: Results
    S-->>C: JSON response
```

Business Logic Layer (Collection)

Responsibilities and Features:

  • Uniformly manages collection dimension, distance metric, and concurrent reads and writes
  • Saves collection metadata at creation and ensures the collection bucket exists in storage
  • Upsert strictly follows the "persist first, then update the in-memory index" order, with best-effort rollback on failure
  • Search/Count/Delete expose a consistent interface and execute through the index layer internally

Upsert Flow (with persistence consistency):

```mermaid
flowchart TD
    Start(["Enter Upsert"]) --> Lock["Acquire write lock"]
    Lock --> Validate["Validate dimension / generate version"]
    Validate --> Persist{"Storage configured?"}
    Persist -->|Yes| Save["Storage layer write"]
    Persist -->|No| UpdateMem["Skip storage"]
    Save --> UpdateMem
    UpdateMem --> IndexUpsert["Index layer Upsert"]
    IndexUpsert --> Ok{"Success?"}
    Ok -->|Yes| Unlock["Unlock and return"]
    Ok -->|No| Rollback["Delete written points (best effort)"]
    Rollback --> Unlock
```
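The "persist first, then update the index" ordering with best-effort rollback can be sketched as follows. The Point type, the map-backed stores, and the injected index function are simplifications, not the actual core types:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Point is a simplified vector record (the real model lives in core).
type Point struct {
	ID     string
	Vector []float32
}

// Collection sketches the upsert ordering: persist first, then update
// the in-memory index, rolling back the persisted write on failure.
type Collection struct {
	mu          sync.Mutex
	store       map[string]Point  // stand-in for the bbolt-backed storage layer
	index       map[string]Point  // stand-in for the in-memory vector index
	indexUpsert func(Point) error // injected so the failure path can be shown
}

func (c *Collection) Upsert(p Point) error {
	c.mu.Lock()
	defer c.mu.Unlock()

	// 1. Persist first: if storage fails, the index is never touched.
	c.store[p.ID] = p

	// 2. Then update the in-memory index.
	if err := c.indexUpsert(p); err != nil {
		// 3. Best-effort rollback so storage and index do not diverge.
		delete(c.store, p.ID)
		return fmt.Errorf("index upsert failed, rolled back: %w", err)
	}
	c.index[p.ID] = p
	return nil
}

func main() {
	c := &Collection{
		store:       map[string]Point{},
		index:       map[string]Point{},
		indexUpsert: func(Point) error { return errors.New("index full") },
	}
	err := c.Upsert(Point{ID: "p1", Vector: []float32{0.1, 0.2}})
	fmt.Println(err != nil, len(c.store)) // true 0
}
```

Because the rollback is best-effort, a crash between the storage write and the index update can still leave a persisted point that is not indexed until the next reload.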

Index Layer (VectorIndex Abstraction and Implementation)

  • Abstraction interface: Unified Upsert/Search/Delete/Count/filter by condition capabilities
  • HNSW implementation: Based on external library, supports custom M, EfSearch and other parameters; sets distance function according to distance metric; supports post-filtering by filter condition (current strategy)
  • Flat implementation: Maintains point map in memory, brute-force distance calculation, suitable for small-scale or scenarios requiring exact retrieval
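The interface below is a sketch inferred from the description; the exact method set in core/index.go may differ. A minimal flat index shows how the abstraction admits a brute-force implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// Result pairs a point ID with its similarity score.
type Result struct {
	ID    string
	Score float32
}

// VectorIndex sketches the unified index abstraction; the real
// interface likely carries payloads and a filter type as well.
type VectorIndex interface {
	Upsert(id string, vector []float32)
	Delete(id string)
	Count() int
	Search(query []float32, topK int) []Result
}

// FlatIndex is a brute-force implementation: every query scans all points.
type FlatIndex struct {
	points map[string][]float32
}

func NewFlatIndex() *FlatIndex { return &FlatIndex{points: map[string][]float32{}} }

func (f *FlatIndex) Upsert(id string, v []float32) { f.points[id] = v }
func (f *FlatIndex) Delete(id string)              { delete(f.points, id) }
func (f *FlatIndex) Count() int                    { return len(f.points) }

// Search scores every point by dot product and returns the topK.
func (f *FlatIndex) Search(query []float32, topK int) []Result {
	results := make([]Result, 0, len(f.points))
	for id, v := range f.points {
		var s float32
		for i := range query {
			s += query[i] * v[i]
		}
		results = append(results, Result{ID: id, Score: s})
	}
	sort.Slice(results, func(i, j int) bool { return results[i].Score > results[j].Score })
	if len(results) > topK {
		results = results[:topK]
	}
	return results
}

func main() {
	var idx VectorIndex = NewFlatIndex()
	idx.Upsert("a", []float32{1, 0})
	idx.Upsert("b", []float32{0, 1})
	fmt.Println(idx.Search([]float32{1, 0.1}, 1)[0].ID) // a
}
```

Because Collection only depends on the interface, an HNSW-backed implementation can be swapped in without touching the business logic layer.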

HNSW Search Flow (with post-filtering):

```mermaid
flowchart TD
    Enter(["Enter Search"]) --> CheckFilter{"Filter exists?"}
    CheckFilter -->|Yes| FetchK["Expand fetch K (estimate post-filter count)"]
    CheckFilter -->|No| UseTopK["Use topK"]
    FetchK --> Graph["HNSW graph search"]
    UseTopK --> Graph
    Graph --> Iterate["Traverse neighbor nodes"]
    Iterate --> Lookup["Look up point by ID / validate Payload"]
    Lookup --> Calc["Recalculate score by metric"]
    Calc --> Collect["Collect results"]
    Collect --> Done(["Return TopK"])
```
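The post-filtering strategy (over-fetch, then filter, then truncate) can be sketched as below. The expansion factor of 4 is an illustrative heuristic, not the value used in core:

```go
package main

import "fmt"

// searchWithPostFilter over-fetches candidates from an ANN search and
// applies the payload filter afterwards, truncating to topK.
func searchWithPostFilter(ann func(k int) []string, match func(id string) bool, topK int) []string {
	const expand = 4 // hypothetical over-fetch factor
	candidates := ann(topK * expand)
	out := make([]string, 0, topK)
	for _, id := range candidates {
		if match(id) {
			out = append(out, id)
			if len(out) == topK {
				break
			}
		}
	}
	return out
}

func main() {
	// Fake ANN search: returns IDs in descending similarity order.
	ann := func(k int) []string {
		all := []string{"p1", "p2", "p3", "p4", "p5", "p6"}
		if k < len(all) {
			all = all[:k]
		}
		return all
	}
	// Filter that only admits even-numbered points.
	even := map[string]bool{"p2": true, "p4": true, "p6": true}
	fmt.Println(searchWithPostFilter(ann, func(id string) bool { return even[id] }, 2))
	// [p2 p4]
}
```

The trade-off: post-filtering is simple but can return fewer than topK results when the filter is very selective, which is why the fetch K is expanded first.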

Storage Layer (Storage)

  • Persistence model: One bucket per collection; collection metadata stored in special bucket
  • Serialization: Uses Protobuf to serialize point structures to bytes, supporting multi-type payloads
  • Quantization: Optional SQ8 quantization, compresses on write, decompresses on read
  • Concurrency: Based on bbolt's transaction model, provides read-only/read-write transactions
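Scalar quantization to 8 bits can be sketched as a per-vector linear mapping into [0, 255]. This is an illustration of the SQ8 idea; the actual codec in core/quantization.go may store its parameters differently:

```go
package main

import (
	"fmt"
	"math"
)

// quantizeSQ8 maps each component linearly onto [0, 255], returning the
// codes plus the (min, scale) pair needed to decode.
func quantizeSQ8(v []float32) (codes []uint8, min, scale float32) {
	min, max := v[0], v[0]
	for _, x := range v {
		if x < min {
			min = x
		}
		if x > max {
			max = x
		}
	}
	scale = (max - min) / 255
	codes = make([]uint8, len(v))
	if scale == 0 {
		return codes, min, scale // constant vector: all codes are 0
	}
	for i, x := range v {
		codes[i] = uint8(math.Round(float64((x - min) / scale)))
	}
	return codes, min, scale
}

// dequantizeSQ8 reverses the mapping; error is at most half a step.
func dequantizeSQ8(codes []uint8, min, scale float32) []float32 {
	out := make([]float32, len(codes))
	for i, c := range codes {
		out[i] = min + float32(c)*scale
	}
	return out
}

func main() {
	v := []float32{-1, 0, 0.5, 1}
	codes, min, scale := quantizeSQ8(v)
	fmt.Println(codes)                            // 4 bytes instead of 16
	fmt.Println(dequantizeSQ8(codes, min, scale)) // close to the original values
}
```

The 4x space saving (one byte per float32 component, plus two floats of per-vector overhead) is what makes SQ8 attractive for large collections.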

Data Models and Filtering

  • PointStruct/SimplePoint: Contains ID, version, vector, payload
  • Filter/Condition: Supports Must/MustNot, multiple match types (exact, range, prefix, contains, regex)
  • Matching algorithm: Handles string, array, numeric types separately, supports regex compilation and matching
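The per-type string matching can be sketched with the standard library. The Condition shape and type names here are assumptions standing in for the actual core model (numeric range matching is omitted for brevity):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Condition is a simplified stand-in for the core filter condition.
type Condition struct {
	Type  string // "exact", "prefix", "contains", or "regex"
	Value string
}

// matchString dispatches on the match type.
func matchString(c Condition, s string) bool {
	switch c.Type {
	case "exact":
		return s == c.Value
	case "prefix":
		return strings.HasPrefix(s, c.Value)
	case "contains":
		return strings.Contains(s, c.Value)
	case "regex":
		ok, err := regexp.MatchString(c.Value, s)
		return err == nil && ok
	}
	return false
}

// evaluate combines Must and MustNot clauses: every Must condition has
// to hold and no MustNot condition may hold.
func evaluate(must, mustNot []Condition, s string) bool {
	for _, c := range must {
		if !matchString(c, s) {
			return false
		}
	}
	for _, c := range mustNot {
		if matchString(c, s) {
			return false
		}
	}
	return true
}

func main() {
	must := []Condition{{Type: "prefix", Value: "img/"}}
	mustNot := []Condition{{Type: "regex", Value: `\.tmp$`}}
	fmt.Println(evaluate(must, mustNot, "img/cat.png"), evaluate(must, mustNot, "img/cat.tmp"))
	// true false
}
```

In a production path, compiled regexes should be cached per condition rather than recompiled on every point, since regexp.MatchString compiles on each call.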

Standalone Service and Embedded Mode

  • Standalone service: the command-line entry point starts the HTTP server, loads the default collections, and registers them with the API server
  • Embedded library: the application imports the core package directly, creates a Storage/Collection, and performs Upsert/Search

Dependency Analysis

  • External dependencies
      • HNSW graph library: used for approximate nearest neighbor search
      • bbolt: embedded key-value database
      • Protobuf: structured serialization
  • Internal coupling
      • The API layer depends on Collection
      • Collection depends on the VectorIndex interface and Storage
      • The HNSW/Flat implementations depend on Collection's data structures
      • Storage depends on the Protobuf definitions and bbolt
```mermaid
graph LR
    API["api/server.go"] --> COL["core/collection.go"]
    COL --> IDX["core/index.go"]
    IDX --> HNSW["core/hnsw_index.go"]
    IDX --> FLAT["core/flat_index.go"]
    COL --> STORE["core/storage.go"]
    STORE --> BBOLT["bbolt"]
    STORE --> PROTO["protobuf"]
    STORE --> QUANT["core/quantization.go"]
```

Performance and Scalability

  • Index selection
      • HNSW: sub-linear search complexity, suitable for large-scale data; tunable via parameters (M, EfSearch, EfConstruction, K)
      • Flat index: O(n) brute-force search, suitable for small datasets or exact retrieval
  • Persistence and serialization
      • Uses bbolt and Protobuf, balancing throughput and reliability
      • SQ8 quantization significantly reduces disk usage, suitable for large-scale vector storage
  • Concurrency and consistency
      • Collection uses a read-write lock, and the API layer also locks the collection map, avoiding races
      • Upsert strictly follows the "persist first, then update index" order, with best-effort rollback on failure
  • Scalability recommendations
      • Horizontal scaling: the current design is single-machine and embedded, with no distributed replication; if horizontal scaling is needed, shard collections at the application layer or introduce a proxy layer
      • Parameter tuning: adjust HNSW parameters to the data scale and query characteristics; enable quantization to save space
      • Caching: a query cache can be added at an outer layer (keep it synchronized with versions/updates)

Security and Operations

  • Security
      • No built-in authentication/authorization or TLS yet; add authentication and encryption at a reverse proxy layer
      • Input validation: the API layer validates JSON, collection name, dimension, metric, etc.
  • Monitoring
      • Request count, latency, and error rate metrics can be added at the API layer, combined with log output
  • Disaster recovery
      • The bbolt file is the data file; regular backups are recommended. Collection metadata and points are reloaded automatically after restart
      • Best-effort rollback on Upsert failure reduces the risk of inconsistency

Troubleshooting Guide

  • Startup failure
      • Check bbolt database file permissions and path
      • Check storage layer open errors and collection metadata loading errors
  • Query exceptions
      • Confirm the query vector dimension matches the collection dimension
      • Check the filter conditions (Must/MustNot, match types)
  • Write failure
      • Check the Upsert persistence and index update order; inspect rollback logs
  • Service shutdown
      • Use graceful shutdown: wait for in-flight connections to finish or time out

Conclusion

This project implements the core capabilities of an embedded vector database with a clear layered architecture: high-performance retrieval (HNSW), reliable persistence (bbolt + Protobuf), flexible filtering, and dual-mode deployment (embedded/microservice). Through the abstracted index interface and strict persistence ordering, the system strikes a good balance between availability and consistency. Future work could focus on pushing filters down into the index, horizontal scaling, and improved observability.