
Deployment Operations

This document covers deploying and operating GoVector in production environments: hardware and system configuration, network and ports, monitoring and alerting, backup and recovery, upgrades and migration, logging and troubleshooting, containerization and cloud platform integration, and automated scripts and configuration templates. GoVector is an embedded vector database that can run as a standalone microservice (with an HTTP API compatible with Qdrant) or be embedded in an application as a Go library.

Project Structure

  • Executable entry: cmd/govector/main.go
  • HTTP API layer: api/server.go
  • Core engine: core/collection.go, core/storage.go, core/index.go, core/hnsw_index.go
  • Build and release: scripts/build_release.sh, scripts/release/govector.service, scripts/release/govector.rb
  • Examples and benchmarks: demo.sh, Makefile
  • Dependency declaration: go.mod
graph TB
subgraph "Process"
S["Server<br/>cmd/govector/main.go"]
end
subgraph "API Layer"
API["HTTP API Server<br/>api/server.go"]
end
subgraph "Core Engine"
COL["Collection<br/>core/collection.go"]
ST["Storage(BoltDB)<br/>core/storage.go"]
IDX["VectorIndex Interface<br/>core/index.go"]
HNSW["HNSWIndex Implementation<br/>core/hnsw_index.go"]
end
S --> API
API --> COL
COL --> ST
COL --> IDX
IDX --> HNSW

Core Components

  • Service entry and parameters
      • The service entry parses the command-line parameters (port, database path, whether to enable the HNSW index, and so on) and passes them to the API layer.
  • HTTP API server
      • Provides REST endpoints for collection management and for writing, searching, and deleting points; supports graceful shutdown.
  • Storage layer
      • Local persistence built on bbolt, with point data serialized as Protobuf; supports optional vector quantization to reduce disk usage.
  • Collection and index
      • A Collection encapsulates its metadata, distance metric, in-memory index, and storage; it supports two index strategies, Flat and HNSW.
  • HNSW index
      • Built on the coder/hnsw graph implementation; supports the Cosine, Euclid, and Dot distance metrics and parameter tuning.

Architecture Overview

The diagram below traces the full path from process startup through API request handling to index queries and storage reads and writes.

sequenceDiagram
participant Proc as "Process<br/>cmd/govector/main.go"
participant API as "HTTP Server<br/>api/server.go"
participant Col as "Collection<br/>core/collection.go"
participant St as "Storage<br/>core/storage.go"
participant Idx as "Index Interface<br/>core/index.go"
participant Hnsw as "HNSW Implementation<br/>core/hnsw_index.go"
Proc->>API : Initialize and listen on port
API->>Col : Register collection/load metadata
API->>API : Handle requests(write/search/delete)
API->>Col : Upsert/Search/Delete
Col->>St : Write/read point data
Col->>Idx : Upsert/Search/Delete
Idx->>Hnsw : HNSW operations
Hnsw-->>Col : Results
St-->>Col : Point data
Col-->>API : Results
API-->>Proc : Respond to client

Detailed Component Analysis

Service Entry and Startup Flow

  • Parse parameters: port, database path, whether to enable HNSW.
  • Initialize storage (bbolt) and default collection.
  • Start HTTP server and register collections.
  • Handle interrupt signals and shut down gracefully.
flowchart TD
Start(["Process Startup"]) --> Parse["Parse command-line arguments"]
Parse --> InitStore["Initialize Storage(BoltDB)"]
InitStore --> InitCol["Create/Load default collection"]
InitCol --> InitAPI["Initialize HTTP server"]
InitAPI --> Listen["Listen on port for requests"]
Listen --> Graceful["Receive signals and graceful shutdown"]
Graceful --> End(["End"])
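
The startup flow above can be sketched with the Go standard library alone. This is a minimal illustration, not the code in cmd/govector/main.go: the flag names (`-port`, `-db`, `-hnsw`), their defaults, and the 10-second shutdown timeout are all assumptions, and storage and collection initialization is reduced to a comment.

```go
package main

import (
	"context"
	"errors"
	"flag"
	"fmt"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// config holds the startup options. Field and flag names are hypothetical.
type config struct {
	port    int
	dbPath  string
	useHNSW bool
}

// parseFlags parses args (without the program name) into a config.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("govector", flag.ContinueOnError)
	fs.IntVar(&c.port, "port", 8080, "HTTP listen port (assumed default)")
	fs.StringVar(&c.dbPath, "db", "govector.db", "path to the database file")
	fs.BoolVar(&c.useHNSW, "hnsw", false, "enable the HNSW index")
	err := fs.Parse(args)
	return c, err
}

func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	log.Printf("db=%s hnsw=%t", cfg.dbPath, cfg.useHNSW)

	// Storage and default-collection initialization would happen here.
	srv := &http.Server{
		Addr:    fmt.Sprintf("127.0.0.1:%d", cfg.port),
		Handler: http.NewServeMux(), // real handlers would be registered on this mux
	}

	// ctx is cancelled on Ctrl-C or SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	go func() {
		if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
			log.Fatalf("listen: %v", err)
		}
	}()

	<-ctx.Done()

	// Give in-flight requests a bounded amount of time to finish.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		log.Printf("shutdown: %v", err)
	}
}
```

Keeping flag parsing in its own function makes the option handling testable without starting a listener.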

HTTP API and Request Processing

  • Endpoint Overview
      • Collection management: POST/DELETE /collections, GET /collections, GET /collections/{name}
      • Point operations: PUT /collections/{name}/points, POST /collections/{name}/points/search, POST /collections/{name}/points/delete
  • Key Processing Logic
      • Write: Decode JSON -> Validate -> Write to storage -> Update in-memory index
      • Search: Decode JSON -> Validate dimensions -> Index search -> Filter -> Return results
      • Delete: Select targets by ID or filter -> Delete from storage -> Delete from index
sequenceDiagram
participant Client as "Client"
participant API as "API Server"
participant Col as "Collection"
participant St as "Storage"
participant Idx as "Index"
Client->>API : PUT /collections/{name}/points
API->>API : Decode request
API->>Col : Upsert(points)
Col->>St : Write point
Col->>Idx : Update index
API-->>Client : {status: ok}
Client->>API : POST /collections/{name}/points/search
API->>API : Decode request
API->>Col : Search(vector, filter, limit)
Col->>Idx : Search
Idx-->>Col : Results
API-->>Client : {status: ok, result: ...}
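
From the client side, the write and search calls in the diagram could be built as follows. The paths come from the endpoint overview; the JSON field names (`points`, `id`, `vector`, `payload`, `limit`) follow Qdrant's conventions and are an assumption here, so check them against api/server.go before relying on them. The base URL and collection name are placeholders.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// Point mirrors a Qdrant-style point. The exact GoVector schema is assumed.
type Point struct {
	ID      uint64         `json:"id"`
	Vector  []float32      `json:"vector"`
	Payload map[string]any `json:"payload,omitempty"`
}

type upsertBody struct {
	Points []Point `json:"points"`
}

type searchBody struct {
	Vector []float32 `json:"vector"`
	Limit  int       `json:"limit"`
}

// newJSONRequest marshals body and wraps it in a request with a JSON content type.
func newJSONRequest(method, rawURL string, body any) (*http.Request, error) {
	buf, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(method, rawURL, bytes.NewReader(buf))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// NewUpsertRequest builds PUT /collections/{name}/points.
func NewUpsertRequest(baseURL, collection string, points []Point) (*http.Request, error) {
	u := fmt.Sprintf("%s/collections/%s/points", baseURL, url.PathEscape(collection))
	return newJSONRequest(http.MethodPut, u, upsertBody{Points: points})
}

// NewSearchRequest builds POST /collections/{name}/points/search.
func NewSearchRequest(baseURL, collection string, vector []float32, limit int) (*http.Request, error) {
	u := fmt.Sprintf("%s/collections/%s/points/search", baseURL, url.PathEscape(collection))
	return newJSONRequest(http.MethodPost, u, searchBody{Vector: vector, Limit: limit})
}

func main() {
	req, err := NewUpsertRequest("http://127.0.0.1:8080", "docs",
		[]Point{{ID: 1, Vector: []float32{0.1, 0.2}}})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```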

Storage and Persistence

  • Storage engine: bbolt, with one bucket per collection; collection metadata is stored in a dedicated bucket.
  • Serialization: point data is encoded and decoded with Protobuf; optional vector quantization (SQ8) is supported.
  • Consistency: writes go to storage first and then to the index; on failure, a best-effort storage rollback is attempted.
flowchart TD
UStart(["Upsert Entry"]) --> Validate["Validate vector dimensions/version"]
Validate --> Persist["Write to storage(BoltDB)"]
Persist --> Fail{"Write to storage failed?"}
Fail --> |No| UpdateIdx["Update in-memory index"]
UpdateIdx --> Done(["Done"])
Fail --> |Yes| Rollback["Rollback storage (best effort)"] --> Error(["Return error"])
DelStart(["Delete Entry"]) --> Decide["Select targets by ID or filter"]
Decide --> DelStore["Delete from storage"]
DelStore --> DelIdx["Delete from index"]
DelIdx --> DDone(["Done"])
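
The storage-first ordering can be illustrated with a small sketch. The `Store` and `Index` interfaces and the in-memory implementations below are hypothetical stand-ins, not the types in core/. The sketch also assumes one particular reading of the rollback: it fires when the index update fails after the point has already been persisted.

```go
package main

import (
	"errors"
	"fmt"
)

type Point struct {
	ID     uint64
	Vector []float32
}

// Store and Index are hypothetical stand-ins for the persistence layer
// and the in-memory index.
type Store interface {
	Put(p Point) error
	Delete(id uint64) error
}

type Index interface {
	Upsert(p Point) error
	Delete(id uint64) error
}

type Collection struct {
	dim   int
	store Store
	index Index
}

// Upsert validates, persists, then indexes. If indexing fails, it tries to
// undo the storage write. Note: deleting is a simplification; it does not
// restore a previous version of an overwritten point.
func (c *Collection) Upsert(p Point) error {
	if len(p.Vector) != c.dim {
		return fmt.Errorf("dimension mismatch: got %d, want %d", len(p.Vector), c.dim)
	}
	if err := c.store.Put(p); err != nil {
		return fmt.Errorf("persist: %w", err)
	}
	if err := c.index.Upsert(p); err != nil {
		if rbErr := c.store.Delete(p.ID); rbErr != nil {
			return fmt.Errorf("index update: %w (rollback also failed: %v)", err, rbErr)
		}
		return fmt.Errorf("index update: %w", err)
	}
	return nil
}

// Delete removes the point from storage first, then from the index.
func (c *Collection) Delete(id uint64) error {
	if err := c.store.Delete(id); err != nil {
		return fmt.Errorf("delete from storage: %w", err)
	}
	return c.index.Delete(id)
}

// memStore is an in-memory Store used only for this illustration.
type memStore map[uint64]Point

func (m memStore) Put(p Point) error      { m[p.ID] = p; return nil }
func (m memStore) Delete(id uint64) error { delete(m, id); return nil }

// memIndex is an in-memory Index that can be told to fail.
type memIndex struct {
	ids  map[uint64]bool
	fail bool
}

func (m *memIndex) Upsert(p Point) error {
	if m.fail {
		return errors.New("index unavailable")
	}
	m.ids[p.ID] = true
	return nil
}

func (m *memIndex) Delete(id uint64) error { delete(m.ids, id); return nil }

func main() {
	st := memStore{}
	c := &Collection{dim: 2, store: st, index: &memIndex{ids: map[uint64]bool{}, fail: true}}
	err := c.Upsert(Point{ID: 1, Vector: []float32{1, 2}})
	fmt.Println("error:", err, "stored points:", len(st))
}
```

Writing to storage first means a crash between the two steps leaves the durable copy ahead of the index, which can be rebuilt from storage on restart; the reverse order could serve results that were never persisted.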

HNSW Index and Parameters

  • Parameters: M, EfConstruction, EfSearch, K; see the implementation for default values.
  • Distance metrics: Cosine/Euclid/Dot; Dot scores are negated so that, like the other metrics, a smaller value means a closer match.
  • Search strategy: when a filter is present, the search oversamples (retrieves more candidates than requested) and then post-filters them.
classDiagram
class HNSWParams {
    +int M
    +int EfConstruction
    +int EfSearch
    +int K
}
class HNSWIndex {
    -Graph
    -Points
    -Metric
    -Params
    +Upsert(points)
    +Search(query, filter, topK)
    +Delete(id)
    +Count()
    +GetIDsByFilter(filter)
    +DeleteByFilter(filter)
}
HNSWIndex --> HNSWParams : "uses"
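
Two of the points above, the negated dot product and oversampling with post-filtering, can be shown in a short sketch. A brute-force scan stands in for the HNSW graph lookup, and the oversampling factor, function names, and string-valued payload are all assumptions made for illustration; none of this is the coder/hnsw API.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

type Point struct {
	ID      uint64
	Vector  []float32
	Payload map[string]string
}

// distFunc returns a distance where smaller means closer.
// All functions below assume a and b have the same length.
type distFunc func(a, b []float32) float32

func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// negDotDistance negates the dot product so that a larger similarity
// becomes a smaller distance.
func negDotDistance(a, b []float32) float32 { return -dot(a, b) }

func euclidDistance(a, b []float32) float32 {
	var s float32
	for i := range a {
		d := a[i] - b[i]
		s += d * d
	}
	return float32(math.Sqrt(float64(s)))
}

func cosineDistance(a, b []float32) float32 {
	na := math.Sqrt(float64(dot(a, a)))
	nb := math.Sqrt(float64(dot(b, b)))
	if na == 0 || nb == 0 {
		return 1
	}
	return 1 - dot(a, b)/float32(na*nb)
}

// nearest is a brute-force stand-in for the index lookup: it returns the
// k points closest to q under dist.
func nearest(points []Point, q []float32, k int, dist distFunc) []Point {
	if k < 0 {
		k = 0
	}
	out := append([]Point(nil), points...)
	sort.SliceStable(out, func(i, j int) bool {
		return dist(out[i].Vector, q) < dist(out[j].Vector, q)
	})
	if k < len(out) {
		out = out[:k]
	}
	return out
}

// searchFiltered fetches topK*oversample candidates, then keeps the first
// topK that pass the filter.
func searchFiltered(points []Point, q []float32, topK, oversample int, dist distFunc, keep func(Point) bool) []Point {
	candidates := nearest(points, q, topK*oversample, dist)
	res := make([]Point, 0, len(candidates))
	for _, p := range candidates {
		if len(res) == topK {
			break
		}
		if keep(p) {
			res = append(res, p)
		}
	}
	return res
}

func main() {
	pts := []Point{
		{1, []float32{1, 0}, map[string]string{"lang": "en"}},
		{2, []float32{0.9, 0.1}, map[string]string{"lang": "de"}},
		{3, []float32{0.8, 0.2}, map[string]string{"lang": "en"}},
		{4, []float32{0, 1}, map[string]string{"lang": "en"}},
	}
	onlyEN := func(p Point) bool { return p.Payload["lang"] == "en" }
	for _, p := range searchFiltered(pts, []float32{1, 0}, 2, 2, negDotDistance, onlyEN) {
		fmt.Println(p.ID)
	}
}
```

With an oversampling factor of 1, the same query would fetch only points 1 and 2 and return a single match after filtering, which is the shortfall oversampling is meant to reduce.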

Dependency Analysis

  • External dependencies
      • bbolt: Local key-value storage
      • protobuf: Point data serialization
      • coder/hnsw: HNSW graph algorithm
  • Internal modules
      • cmd/govector: Process entry
      • api: HTTP server
      • core: Collection, storage, index, model
graph LR
GOV["govector(core)"] --> BBOLT["go.etcd.io/bbolt"]
GOV --> PROTO["google.golang.org/protobuf"]
GOV --> HNSWLIB["github.com/coder/hnsw"]
MAIN["cmd/govector/main.go"] --> API["api/server.go"]
API --> CORE["core/*"]

Performance and Capacity Planning

  • Index selection
      • Small scale/low latency priority: Flat index
      • Large scale/high throughput: HNSW index, with parameters (EfSearch, K) tuned to the scenario
  • Vector quantization
      • Enabling SQ8 quantization can significantly reduce disk usage and is suited to large-scale data
  • Port and concurrency
      • The listen port is configurable via startup parameters; in production, bind to an internal address and place the service behind a reverse proxy
  • Benchmark reference
      • The README provides latency and throughput figures for different data scales and index types, which can be used for capacity assessment
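
To make the disk-usage claim for SQ8 concrete, here is a minimal sketch of 8-bit scalar quantization. It assumes a per-vector min/max range; GoVector's actual scheme (per-vector, per-dimension, or global ranges) is not described in this section, and the type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// SQ8 holds a vector quantized to one byte per dimension, plus the
// offset and step needed to decode it.
type SQ8 struct {
	Min   float32
	Scale float32
	Codes []uint8
}

// Quantize maps each component linearly from [min, max] onto 0..255.
func Quantize(v []float32) SQ8 {
	if len(v) == 0 {
		return SQ8{}
	}
	lo, hi := v[0], v[0]
	for _, x := range v[1:] {
		if x < lo {
			lo = x
		}
		if x > hi {
			hi = x
		}
	}
	q := SQ8{Min: lo, Scale: (hi - lo) / 255, Codes: make([]uint8, len(v))}
	if q.Scale == 0 {
		return q // constant vector: every code is 0 and decodes to Min
	}
	for i, x := range v {
		c := math.Round(float64((x - lo) / q.Scale))
		if c > 255 {
			c = 255 // guard against floating-point overshoot
		}
		q.Codes[i] = uint8(c)
	}
	return q
}

// Dequantize reconstructs an approximation of the original vector.
func (q SQ8) Dequantize() []float32 {
	out := make([]float32, len(q.Codes))
	for i, c := range q.Codes {
		out[i] = q.Min + float32(c)*q.Scale
	}
	return out
}

func main() {
	v := []float32{-1, 0, 0.5, 1}
	q := Quantize(v)
	fmt.Println(q.Codes, q.Dequantize())
}
```

Each component takes one byte instead of the four used by a float32, at the cost of a reconstruction error of roughly half a quantization step per component.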