# Architecture Design

This project is a lightweight embedded vector database implemented in pure Go, aiming to provide high-performance, low-latency vector similarity search for local applications, desktop programs, and edge devices. It offers:
- Qdrant-compatible REST API (optional microservice mode)
- Embedded library mode: zero network overhead, direct use in applications
- HNSW approximate nearest neighbor index, supporting sub-linear search complexity
- bbolt-based persistence using Protobuf serialization
- Support for multiple distance metrics (Cosine, Euclidean, Dot product)
- Advanced metadata filtering (exact, range, prefix, regex, contains)
- Optional 8-bit scalar quantization to reduce disk usage
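The three distance metrics listed above can be sketched in Go as follows. This is a minimal illustration of the math only; the function names and signatures here are assumptions, not the project's actual API.

```go
package main

import (
	"fmt"
	"math"
)

// dot returns the dot product of two equal-length vectors.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// euclidean returns the L2 distance between a and b.
func euclidean(a, b []float32) float32 {
	var s float64
	for i := range a {
		d := float64(a[i] - b[i])
		s += d * d
	}
	return float32(math.Sqrt(s))
}

// cosine returns the cosine similarity of a and b.
func cosine(a, b []float32) float32 {
	na := math.Sqrt(float64(dot(a, a)))
	nb := math.Sqrt(float64(dot(b, b)))
	return float32(float64(dot(a, b)) / (na * nb))
}

func main() {
	a := []float32{1, 0}
	b := []float32{0, 1}
	fmt.Println(dot(a, b), cosine(a, b)) // 0 0
	fmt.Println(euclidean([]float32{0, 0}, []float32{3, 4})) // 5
}
```

Note that cosine and dot produce similarity scores (higher is better) while Euclidean produces a distance (lower is better); a ranking layer has to account for that direction difference.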
## Project Structure
The repository uses a layered structure organized by "functional domains":
- cmd/govector: Standalone service entry point, starts HTTP API server
- api: HTTP API layer, exposing Qdrant-compatible interfaces
- core: Core engine layer, containing collection management, indexing, storage, models and quantization
- example/embedded: Embedded library usage examples
- scripts: Build, release, and demo scripts
```mermaid
graph TB
    subgraph "Application Layer"
        EMB["Embedded Application<br/>example/embedded/main.go"]
        SVC["Standalone Service<br/>cmd/govector/main.go"]
    end
    subgraph "API Layer"
        API["HTTP API Server<br/>api/server.go"]
    end
    subgraph "Business Logic Layer"
        COL["Collection<br/>core/collection.go"]
        IDX_IF["Index Interface VectorIndex<br/>core/index.go"]
        HNSW["HNSW Index Implementation<br/>core/hnsw_index.go"]
        FLAT["Flat Index Implementation<br/>core/flat_index.go"]
        QUANT["Quantizer Interface/Implementation<br/>core/quantization.go"]
    end
    subgraph "Storage Layer"
        STORE["Storage<br/>core/storage.go"]
        BBOLT["BoltDB Engine<br/>go.mod external dependency"]
        PROTO["Protobuf Serialization<br/>core/proto/*.proto"]
    end
    EMB --> COL
    SVC --> API
    API --> COL
    COL --> IDX_IF
    IDX_IF --> HNSW
    IDX_IF --> FLAT
    COL --> STORE
    STORE --> BBOLT
    STORE --> PROTO
    QUANT --> STORE
```
## Core Components
- Storage Layer: bbolt-based key-value storage, one bucket per collection; collection metadata stored in special bucket; uses Protobuf serialization for point data; supports optional 8-bit scalar quantization (SQ8)
- Index Layer (VectorIndex interface and implementations): Unified abstraction supporting flat brute-force index (FlatIndex) and HNSW graph index (HNSWIndex), transparently switchable at runtime
- Business Logic Layer (Collection): Encapsulates the collection's dimension, distance metric, concurrency control, and persistence consistency guarantees; provides Upsert/Search/Delete
- API Layer (Server): Provides Qdrant-compatible REST interface, responsible for routing, parameter parsing, error handling and result encoding; supports automatic loading of persisted collections
## Architecture Overview
The system adopts a four-layer architecture of "Storage Layer - Index Layer - Business Logic Layer - API Layer", combined with "Embedded Library Mode" and "Standalone Service Mode" to meet different deployment scenarios.
```mermaid
graph TB
    Client["Client/Application"] --> API["API Layer<br/>api/server.go"]
    API --> COL["Business Logic Layer<br/>core/collection.go"]
    COL --> IDX["Index Layer Interface<br/>core/index.go"]
    IDX --> HNSW["HNSW Implementation<br/>core/hnsw_index.go"]
    IDX --> FLAT["Flat Implementation<br/>core/flat_index.go"]
    COL --> STORE["Storage Layer<br/>core/storage.go"]
    STORE --> BBOLT["bbolt database"]
    STORE --> PROTO["Protobuf Serialization"]
    STORE --> QUANT["SQ8 Quantizer"]
```
## Detailed Component Analysis
### API Layer (HTTP Server)

Responsibilities and features:
- Provides Qdrant-compatible endpoints: collection management, point upsert, query, and delete
- Automatically loads collection metadata from storage and initializes the in-memory index
- Thread-safe: internally uses a mutex to protect the HTTP server and the collection map
- Error handling: returns standard status codes for invalid JSON, non-existent collections, and internal errors
Key Flow (using search as example):
```mermaid
sequenceDiagram
    participant C as "Client"
    participant S as "Server (api/server.go)"
    participant M as "Collection"
    participant I as "Index (VectorIndex)"
    participant ST as "Storage"
    C->>S: POST /collections/{name}/points/search
    S->>S: Parse request body / validate parameters
    S->>M: Search(query, filter, limit)
    M->>I: Search(query, filter, topK)
    I->>I: HNSW graph traversal / flat scan
    I-->>M: TopK results (with ID/Payload)
    M-->>S: Results
    S-->>C: JSON response
```
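As a concrete illustration of the request the API layer parses in this flow, a Qdrant-style search body might look like the following. This is a hedged example based on Qdrant's search API conventions; which optional fields this project actually supports is not confirmed by the source.

```json
{
  "vector": [0.12, 0.98, 0.34, 0.55],
  "limit": 5,
  "filter": {
    "must": [
      { "key": "category", "match": { "value": "news" } }
    ]
  }
}
```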
### Business Logic Layer (Collection)

Responsibilities and features:
- Uniformly manages the collection's dimension, distance metric, and concurrent reads and writes
- Saves collection metadata at creation time and ensures the collection's bucket exists in storage
- Upsert strictly follows a "persist first, then update the in-memory index" order; on index failure it rolls back the persisted write on a best-effort basis
- Search/Count/Delete expose a consistent interface and execute internally through the index layer
Upsert Flow (with persistence consistency):
```mermaid
flowchart TD
    Start(["Enter Upsert"]) --> Lock["Acquire write lock"]
    Lock --> Validate["Validate dimension / generate version"]
    Validate --> Persist{"Storage configured?"}
    Persist -->|Yes| Save["Storage layer write"]
    Persist -->|No| UpdateMem["Skip storage"]
    Save --> UpdateMem
    UpdateMem --> IndexUpsert["Index layer Upsert"]
    IndexUpsert --> Ok{"Success?"}
    Ok -->|Yes| Unlock["Unlock and return"]
    Ok -->|No| Rollback["Try to delete written points (best effort)"]
    Rollback --> Unlock
```
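The "persist first, then index, best-effort rollback" order in the flow above can be sketched as follows. The `Storage` and `VectorIndex` types here are simplified stand-ins, not the project's exact signatures.

```go
package main

import (
	"fmt"
	"sync"
)

// Point is a simplified stand-in for the project's point model.
type Point struct {
	ID     string
	Vector []float32
}

// Storage and VectorIndex are minimal stand-ins for the real interfaces.
type Storage interface {
	SavePoint(collection string, p Point) error
	DeletePoint(collection string, id string) error
}

type VectorIndex interface {
	Upsert(p Point) error
}

type Collection struct {
	mu      sync.Mutex
	name    string
	dim     int
	storage Storage // may be nil in pure in-memory mode
	index   VectorIndex
}

// Upsert persists first, then updates the in-memory index; on index
// failure it rolls back the persisted write on a best-effort basis.
func (c *Collection) Upsert(p Point) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if len(p.Vector) != c.dim {
		return fmt.Errorf("dimension mismatch: got %d, want %d", len(p.Vector), c.dim)
	}
	if c.storage != nil {
		if err := c.storage.SavePoint(c.name, p); err != nil {
			return err
		}
	}
	if err := c.index.Upsert(p); err != nil {
		if c.storage != nil {
			// Best-effort rollback: the delete error is ignored.
			_ = c.storage.DeletePoint(c.name, p.ID)
		}
		return err
	}
	return nil
}

// In-memory fakes used only to exercise the flow.
type memStore struct{ m map[string]Point }

func (s *memStore) SavePoint(_ string, p Point) error     { s.m[p.ID] = p; return nil }
func (s *memStore) DeletePoint(_ string, id string) error { delete(s.m, id); return nil }

type memIndex struct{ m map[string]Point }

func (i *memIndex) Upsert(p Point) error { i.m[p.ID] = p; return nil }

func main() {
	c := &Collection{
		name:    "demo",
		dim:     2,
		storage: &memStore{m: map[string]Point{}},
		index:   &memIndex{m: map[string]Point{}},
	}
	fmt.Println(c.Upsert(Point{ID: "a", Vector: []float32{1, 2}})) // <nil>
	fmt.Println(c.Upsert(Point{ID: "b", Vector: []float32{1}}))    // dimension mismatch: got 1, want 2
}
```

Persisting before indexing means a crash between the two steps leaves an extra durable point that the index will pick up on reload, rather than an indexed point with no durable copy.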
### Index Layer (VectorIndex Abstraction and Implementation)

- Abstraction interface: unifies Upsert/Search/Delete/Count and filter-by-condition capabilities
- HNSW implementation: based on an external library; supports custom M, EfSearch, and other parameters; sets the distance function according to the collection's metric; applies filter conditions as post-filtering (current strategy)
- Flat implementation: maintains a point map in memory and computes distances by brute force; suitable for small datasets or scenarios requiring exact retrieval
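A flat index search is essentially a linear scan plus a sort. The sketch below illustrates the idea with squared L2 distance; the types and method names are simplified, not the exact core API.

```go
package main

import (
	"fmt"
	"sort"
)

// ScoredPoint pairs a point ID with its distance to the query.
type ScoredPoint struct {
	ID    string
	Score float32
}

// FlatIndex keeps all vectors in memory and scans them on every search.
type FlatIndex struct {
	vectors map[string][]float32
}

func (f *FlatIndex) Upsert(id string, v []float32) { f.vectors[id] = v }

// Search brute-forces the squared L2 distance to every stored vector
// and returns the topK closest points.
func (f *FlatIndex) Search(query []float32, topK int) []ScoredPoint {
	results := make([]ScoredPoint, 0, len(f.vectors))
	for id, v := range f.vectors {
		var d float32
		for i := range query {
			diff := query[i] - v[i]
			d += diff * diff
		}
		results = append(results, ScoredPoint{ID: id, Score: d})
	}
	sort.Slice(results, func(i, j int) bool { return results[i].Score < results[j].Score })
	if len(results) > topK {
		results = results[:topK]
	}
	return results
}

func main() {
	idx := &FlatIndex{vectors: map[string][]float32{}}
	idx.Upsert("a", []float32{0, 0})
	idx.Upsert("b", []float32{1, 1})
	idx.Upsert("c", []float32{5, 5})
	res := idx.Search([]float32{0.9, 0.9}, 2)
	fmt.Println(res[0].ID, res[1].ID) // b a
}
```

The O(n) scan is exact, which is why the flat index doubles as a correctness baseline when tuning HNSW recall.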
HNSW Search Flow (with post-filtering):
```mermaid
flowchart TD
    Enter(["Enter Search"]) --> CheckFilter{"Filter exists?"}
    CheckFilter -->|Yes| FetchK["Expand fetch K (estimate post-filter count)"]
    CheckFilter -->|No| UseTopK["Use topK"]
    FetchK --> Graph["HNSW graph search"]
    UseTopK --> Graph
    Graph --> Iterate["Traverse neighbor nodes"]
    Iterate --> Lookup["Look up point by ID / validate Payload"]
    Lookup --> Calc["Recalculate score by metric"]
    Calc --> Collect["Collect results"]
    Collect --> Done(["Return TopK"])
```
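The over-fetch-then-filter strategy in the flow above can be sketched generically: when a filter is present, ask the ANN layer for more candidates than topK, then drop non-matching hits and trim. The multiplier and helper names here are illustrative, not the library's API.

```go
package main

import "fmt"

// Filter is a simplified predicate over a point's payload.
type Filter func(payload map[string]string) bool

type Hit struct {
	ID      string
	Payload map[string]string
}

// searchWithPostFilter over-fetches from the underlying ANN search when a
// filter is present, then drops non-matching hits and trims to topK.
func searchWithPostFilter(annSearch func(k int) []Hit, filter Filter, topK int) []Hit {
	fetchK := topK
	if filter != nil {
		fetchK = topK * 4 // illustrative over-fetch factor
	}
	candidates := annSearch(fetchK)
	if filter == nil {
		return candidates
	}
	out := make([]Hit, 0, topK)
	for _, h := range candidates {
		if filter(h.Payload) {
			out = append(out, h)
			if len(out) == topK {
				break
			}
		}
	}
	return out
}

func main() {
	// A fake ANN search returning hits in (pretend) distance order.
	ann := func(k int) []Hit {
		all := []Hit{
			{"a", map[string]string{"lang": "go"}},
			{"b", map[string]string{"lang": "rust"}},
			{"c", map[string]string{"lang": "go"}},
			{"d", map[string]string{"lang": "go"}},
		}
		if k < len(all) {
			all = all[:k]
		}
		return all
	}
	onlyGo := Filter(func(p map[string]string) bool { return p["lang"] == "go" })
	for _, h := range searchWithPostFilter(ann, onlyGo, 2) {
		fmt.Println(h.ID)
	}
	// prints "a" then "c"
}
```

The trade-off of post-filtering: if the filter is very selective, even an over-fetched candidate set may contain fewer than topK matches, which is why filter pushdown is listed as a future direction.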
### Storage Layer (Storage)
- Persistence model: One bucket per collection; collection metadata stored in special bucket
- Serialization: Uses Protobuf to serialize point structures to bytes, supporting multi-type payloads
- Quantization: Optional SQ8 quantization, compresses on write, decompresses on read
- Concurrency: Based on bbolt's transaction model, provides read-only/read-write transactions
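8-bit scalar quantization maps each float32 component onto a per-vector [min, max] range encoded in one byte, cutting storage to roughly a quarter. A minimal sketch follows; the real codec's on-disk layout and function names are not shown in the source, so everything here is illustrative.

```go
package main

import "fmt"

// quantizeSQ8 compresses a float32 vector to one byte per component,
// plus the per-vector min and scale needed to decode it.
func quantizeSQ8(v []float32) (codes []byte, min, scale float32) {
	min, max := v[0], v[0]
	for _, x := range v {
		if x < min {
			min = x
		}
		if x > max {
			max = x
		}
	}
	scale = (max - min) / 255
	if scale == 0 {
		scale = 1 // avoid division by zero for constant vectors
	}
	codes = make([]byte, len(v))
	for i, x := range v {
		codes[i] = byte((x-min)/scale + 0.5) // round to nearest code
	}
	return codes, min, scale
}

// dequantizeSQ8 reconstructs an approximate float32 vector.
func dequantizeSQ8(codes []byte, min, scale float32) []float32 {
	out := make([]float32, len(codes))
	for i, c := range codes {
		out[i] = min + float32(c)*scale
	}
	return out
}

func main() {
	v := []float32{-1, 0, 0.5, 1}
	codes, min, scale := quantizeSQ8(v)
	fmt.Println(codes[0], codes[len(codes)-1]) // 0 255
	fmt.Println(dequantizeSQ8(codes, min, scale))
}
```

Decoding only approximates the original values (error bounded by about half a quantization step), which is the usual recall-versus-disk trade-off of SQ8.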
### Data Models and Filtering
- PointStruct/SimplePoint: Contains ID, version, vector, payload
- Filter/Condition: Supports Must/MustNot, multiple match types (exact, range, prefix, contains, regex)
- Matching algorithm: Handles string, array, numeric types separately, supports regex compilation and matching
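The Must/MustNot semantics reduce to: every must condition has to match and no must_not condition may match. The sketch below keeps only exact-match and numeric-range conditions for brevity; the real model also covers prefix, contains, and regex, and these type names are assumptions.

```go
package main

import "fmt"

// Condition is a simplified payload condition: exact string match on a
// key, or a numeric range check.
type Condition struct {
	Key   string
	Match *string  // exact string match, if set
	Gte   *float64 // lower bound, if set
	Lte   *float64 // upper bound, if set
}

func (c Condition) matches(payload map[string]interface{}) bool {
	v, ok := payload[c.Key]
	if !ok {
		return false
	}
	if c.Match != nil {
		s, ok := v.(string)
		return ok && s == *c.Match
	}
	n, ok := v.(float64)
	if !ok {
		return false
	}
	if c.Gte != nil && n < *c.Gte {
		return false
	}
	if c.Lte != nil && n > *c.Lte {
		return false
	}
	return true
}

// Filter combines conditions: all Must match, and no MustNot matches.
type Filter struct {
	Must    []Condition
	MustNot []Condition
}

func (f Filter) Matches(payload map[string]interface{}) bool {
	for _, c := range f.Must {
		if !c.matches(payload) {
			return false
		}
	}
	for _, c := range f.MustNot {
		if c.matches(payload) {
			return false
		}
	}
	return true
}

func main() {
	news := "news"
	minYear := 2020.0
	f := Filter{Must: []Condition{{Key: "category", Match: &news}, {Key: "year", Gte: &minYear}}}
	fmt.Println(f.Matches(map[string]interface{}{"category": "news", "year": 2023.0})) // true
	fmt.Println(f.Matches(map[string]interface{}{"category": "news", "year": 2019.0})) // false
}
```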
### Standalone Service and Embedded Mode

- Standalone service: the command-line binary starts the HTTP server, loads default collections, and registers them with the API server
- Embedded library: the application imports the core package directly, creates a Storage/Collection, and performs Upsert/Search
## Dependency Analysis

- External dependencies
  - HNSW graph library: used for approximate nearest neighbor search
  - bbolt: embedded key-value database
  - Protobuf: structured serialization
- Internal coupling
  - The API layer depends on Collection
  - Collection depends on the VectorIndex interface and Storage
  - The HNSW/Flat implementations depend on Collection's data structures
  - Storage depends on the Protobuf definitions and bbolt
```mermaid
graph LR
    API["api/server.go"] --> COL["core/collection.go"]
    COL --> IDX["core/index.go"]
    IDX --> HNSW["core/hnsw_index.go"]
    IDX --> FLAT["core/flat_index.go"]
    COL --> STORE["core/storage.go"]
    STORE --> BBOLT["bbolt"]
    STORE --> PROTO["protobuf"]
    STORE --> QUANT["core/quantization.go"]
```
## Performance and Scalability

- Index selection
  - HNSW: sub-linear search complexity, suitable for large-scale data; tunable via parameters (M, EfSearch, EfConstruction, K)
  - Flat index: O(n) brute-force search, suitable for small datasets or scenarios requiring exact retrieval
- Persistence and serialization
  - Uses bbolt and Protobuf, balancing throughput and reliability
  - SQ8 quantization significantly reduces disk usage, suitable for large-scale vector storage
- Concurrency and consistency
  - Collection uses a read-write lock, and the API layer also locks the collection map, avoiding races
  - Upsert strictly follows a "persist first, then update the index" order; on index failure it performs a best-effort rollback
- Scalability recommendations
  - Horizontal scaling: the current design is single-machine and embedded, with no distributed replication; if horizontal scaling is needed, collections can be split at the application layer or a proxy layer introduced
  - Parameter tuning: adjust HNSW parameters to the data scale and query characteristics; enable quantization to save space
  - Caching: a query cache can be added at the outer layer (keep it synchronized with versions/updates)
## Security and Operations

- Security
  - No built-in authentication/authorization or TLS yet; adding authentication and encryption at a reverse proxy layer is recommended
  - Input validation: the API layer validates JSON, collection name, dimension, metric, etc.
- Monitoring
  - Request count, latency, and error rate metrics can be added at the API layer, combined with log output
- Disaster recovery
  - The bbolt file is the data file, and regular backups are recommended; collection metadata and points are automatically loaded after restart
  - Best-effort rollback on Upsert failure reduces the risk of inconsistency
## Troubleshooting Guide

- Startup failure
  - Check bbolt database file permissions and path
  - Check for storage layer open errors and collection metadata loading errors
- Query exceptions
  - Confirm the query vector dimension matches the collection dimension
  - Check whether filter conditions are correct (Must/MustNot, match types)
- Write failure
  - Note the Upsert persistence and index update order; check rollback logs
- Service shutdown
  - Use graceful shutdown: wait for connections to complete or time out
## Conclusion

This project implements the core capabilities of an embedded vector database with a clear layered architecture: high-performance retrieval (HNSW), reliable persistence (bbolt + Protobuf), flexible filtering, and dual-mode deployment (embedded/microservice). Through the abstracted index interface and the strict persistence order, the system strikes a good balance between availability and consistency. Future work could focus on query-time filter pushdown, horizontal scaling, and observability.