Configuration Issues

This guide covers how to troubleshoot and fix GoVector configuration issues, including:

  • Configuration file format errors (e.g., invalid JSON)
  • Improper parameter settings (e.g., vector dimensions, distance metrics, HNSW parameters)
  • Port conflicts and service startup failures
  • Verification methods for server, collection, and index configuration
  • Configuration syntax checks and parameter validity checks
  • Log diagnostics, configuration rollback, and repair strategies
  • Configuration templates and best practices

Project Structure

GoVector provides two usage modes: an embedded library and a standalone microservice. In microservice mode, the command-line entry point starts the service, loads the storage engine and the default collection, and exposes a Qdrant-compatible HTTP API.

graph TB
subgraph "Command-line Entry"
MAIN["cmd/govector/main.go<br/>Parse command-line arguments and start service"]
end
subgraph "API Layer"
API["api/server.go<br/>HTTP server and route handling"]
end
subgraph "Core Layer"
COL["core/collection.go<br/>Collection management and consistency guarantee"]
IDX["core/index.go<br/>Index interface abstraction"]
HNSW["core/hnsw_index.go<br/>HNSW implementation and parameters"]
STOR["core/storage.go<br/>BoltDB persistence and metadata"]
MODELS["core/models.go<br/>Data model and filters"]
end
MAIN --> API
API --> COL
COL --> IDX
COL --> STOR
IDX --> HNSW
API --> MODELS

Core Components

  • The command-line entry parses parameters such as the port, the database path, and whether to enable HNSW, then initializes storage, the default collection, and the HTTP service.
  • The API layer loads persisted collection metadata, registers collections, and provides REST interfaces for collection and point operations.
  • The storage layer is based on BoltDB and provides collection buckets, a metadata bucket, point serialization, and quantization support.
  • The collection layer handles dimension validation and keeps writes to the in-memory index and to storage consistent.
  • The HNSW index layer exposes adjustable parameters (M, EfConstruction, EfSearch, K) and adapts to the different distance metrics.

Architecture Overview

The diagram below shows the overall flow from command-line to API, then to storage and index, as well as key error points and log locations.

sequenceDiagram
participant CLI as "Command-line entry"
participant API as "API Server"
participant COL as "Collection"
participant STOR as "Storage(BoltDB)"
participant IDX as "Index(HNSW/Flat)"
CLI->>STOR : Initialize storage(open database)
STOR-->>CLI : Success/failure
CLI->>COL : Create/load default collection(dimension/metric/HNSW)
COL->>STOR : Save collection metadata
COL->>IDX : Initialize in-memory index
CLI->>API : Start HTTP service(listen on port)
API->>STOR : Load collection metadata on startup
STOR-->>API : Return collection metadata list
API->>COL : Rebuild collection instance for each metadata
API-->>CLI : Service ready/error

Detailed Component Analysis

Server Configuration and Startup Flow

  • Command-line parameters:
    • Port: Neither the default value nor the valid range is explicitly restricted in code, so check that the port is free on the system and that the process has permission to bind it.
    • Database path: Used as the bbolt file path; make sure the directory exists and is readable and writable.
    • Whether to enable HNSW: Affects the index type of the default collection.
  • Startup sequence:
    • Initialize storage -> Load/create default collection -> Register collection with the API -> Start HTTP service -> Listen for signals for graceful shutdown.
flowchart TD
Start(["Startup entry"]) --> Parse["Parse command-line arguments"]
Parse --> InitStore["Initialize storage engine"]
InitStore --> CreateCol["Create/load default collection"]
CreateCol --> RegAPI["Register collection to API"]
RegAPI --> Listen["Start HTTP listen"]
Listen --> Wait["Wait for signals or errors"]
Wait --> Graceful["Graceful shutdown(timeout)"]
Graceful --> End(["End"])
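
The port and database-path concerns above can be checked before the service starts. The following is a minimal preflight sketch, not GoVector's own startup code: the helper names checkPort and checkDBDir, the port 18080, and the file name govector.db are all hypothetical and use only the Go standard library.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// checkPort reports an error if the TCP port cannot be bound,
// for example because another process is already listening on it.
// Port 0 asks the OS for any free port.
func checkPort(port int) error {
	ln, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		return fmt.Errorf("port %d unavailable: %w", port, err)
	}
	return ln.Close()
}

// checkDBDir reports an error if the directory that would hold the
// database file does not exist or is not a directory.
func checkDBDir(dbPath string) error {
	dir := filepath.Dir(dbPath)
	info, err := os.Stat(dir)
	if err != nil {
		return fmt.Errorf("database directory %q: %w", dir, err)
	}
	if !info.IsDir() {
		return fmt.Errorf("%q is not a directory", dir)
	}
	return nil
}

func main() {
	// Hypothetical values; substitute the port and path you pass to the service.
	if err := checkPort(18080); err != nil {
		fmt.Println("port check failed:", err)
	}
	if err := checkDBDir("govector.db"); err != nil {
		fmt.Println("database path check failed:", err)
	}
}
```

Running a check like this first separates environment problems (port in use, missing directory) from problems in the service configuration itself. It does not verify write permission on the directory, which would need an additional test write.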

Collection Configuration and Parameter Validation

  • Required fields and constraints:
    • Name: Unique identifier; creation returns a conflict if the name already exists.
    • Vector dimension: Must be positive; dimension consistency is also validated on query and insert.
    • Distance metric: Supports Euclidean, Dot, Cosine; returns an error for unknown values.
    • HNSW switch: Boolean; an optional parameter object contains M, EfConstruction, EfSearch, K.
  • Metadata persistence:
    • Collection metadata is stored in a dedicated bucket; on restart it is loaded automatically and the collection instances are rebuilt.
flowchart TD
Req["Create collection request(JSON)"] --> Decode["Decode JSON"]
Decode --> Exists{"Name already exists?"}
Exists -- Yes --> Err409["Return 409 conflict"]
Exists -- No --> DimCheck{"Vector dimension > 0?"}
DimCheck -- No --> Err400A["Return 400 invalid dimension"]
DimCheck -- Yes --> Metric["Parse distance metric"]
Metric --> MetricOK{"Metric valid?"}
MetricOK -- No --> Err400B["Return 400 invalid metric"]
MetricOK -- Yes --> Params["Parse HNSW parameters(optional)"]
Params --> Create["Create collection(memory+storage)"]
Create --> Register["Register to server"]
Register --> OK["Return 200 success"]
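
The checks in the flowchart can be mirrored on the client side to predict which status a request would receive. This is an illustrative sketch only: the createRequest type, its field names, and validateCreate are hypothetical and are not taken from GoVector's source; only the order of checks and the status codes follow the flowchart.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// createRequest is a hypothetical request shape used for illustration.
type createRequest struct {
	Name       string
	VectorSize int
	Distance   string
}

// validateCreate applies the checks in the same order as the flowchart
// and returns the HTTP status the request would be expected to receive.
func validateCreate(req createRequest, existing map[string]bool) (int, error) {
	if existing[req.Name] {
		return http.StatusConflict, fmt.Errorf("collection %q already exists", req.Name)
	}
	if req.VectorSize <= 0 {
		return http.StatusBadRequest, fmt.Errorf("vector dimension must be positive, got %d", req.VectorSize)
	}
	switch strings.ToLower(req.Distance) {
	case "euclidean", "dot", "cosine":
		// supported metric
	default:
		return http.StatusBadRequest, fmt.Errorf("unknown distance metric %q", req.Distance)
	}
	return http.StatusOK, nil
}

func main() {
	existing := map[string]bool{"docs": true}
	requests := []createRequest{
		{Name: "docs", VectorSize: 128, Distance: "cosine"},
		{Name: "images", VectorSize: 0, Distance: "cosine"},
		{Name: "images", VectorSize: 512, Distance: "manhattan"},
		{Name: "images", VectorSize: 512, Distance: "Cosine"},
	}
	for _, r := range requests {
		code, err := validateCreate(r, existing)
		fmt.Println(r.Name, code, err)
	}
}
```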

Index Configuration and Parameters

  • HNSW parameters:
    • M: Maximum connections per node, default 16.
    • EfConstruction: Candidate list size during the build phase, default 200.
    • EfSearch: Candidate list size during the search phase, default 64.
    • K: Number of neighbors to return, default 10.
  • Distance metrics:
    • Euclidean, Dot, Cosine; the Dot value is negated to fit the underlying library's convention of minimizing distance.
  • Parameter application path:
    • Parameters are passed through the collection creation interface; if they are omitted, the defaults are used.
classDiagram
class HNSWParams {
    +int M
    +int EfConstruction
    +int EfSearch
    +int K
}
class HNSWIndex {
    +Upsert(points)
    +Search(query, filter, topK)
    +Delete(id)
    +Count()
    +GetIDsByFilter(filter)
    +DeleteByFilter(filter)
}
HNSWIndex --> HNSWParams : "uses"
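
To make the defaults concrete, the sketch below fills unset HNSW parameters with the default values listed above (16, 200, 64, 10). The struct mirrors the class diagram, but the withDefaults helper and the rule that a non-positive value means "unset" are assumptions for illustration, not GoVector's confirmed behavior.

```go
package main

import "fmt"

// HNSWParams mirrors the fields shown in the class diagram.
type HNSWParams struct {
	M              int
	EfConstruction int
	EfSearch       int
	K              int
}

// withDefaults replaces non-positive fields with the documented defaults.
// Treating non-positive values as "unset" is an assumption of this sketch.
func withDefaults(p HNSWParams) HNSWParams {
	if p.M <= 0 {
		p.M = 16
	}
	if p.EfConstruction <= 0 {
		p.EfConstruction = 200
	}
	if p.EfSearch <= 0 {
		p.EfSearch = 64
	}
	if p.K <= 0 {
		p.K = 10
	}
	return p
}

func main() {
	// Only EfSearch is overridden; the other fields fall back to defaults.
	p := withDefaults(HNSWParams{EfSearch: 128})
	fmt.Printf("%+v\n", p) // {M:16 EfConstruction:200 EfSearch:128 K:10}
}
```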

Data Model and Filters

  • The point structure contains an ID, a version, a vector, and an optional payload.
  • Filters support exact match, range, prefix, contains, regex, and other condition types.
  • Matching logic runs on the server side. Note that a regex that fails to compile simply results in a non-match rather than an error.
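
Because a regex that does not compile yields no matches instead of an error, an invalid pattern can look like an empty result set. The sketch below reproduces that behavior with Go's standard regexp package and shows a client-side check; matchRegex and checkPattern are hypothetical helpers, not GoVector functions.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchRegex imitates the described behavior: if the pattern does not
// compile, the value is treated as a non-match and no error surfaces.
func matchRegex(pattern, value string) bool {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false
	}
	return re.MatchString(value)
}

// checkPattern lets a client validate a pattern before sending a filter,
// so a syntax error is not mistaken for "no results".
func checkPattern(pattern string) error {
	_, err := regexp.Compile(pattern)
	return err
}

func main() {
	fmt.Println(matchRegex("^doc-[0-9]+$", "doc-42")) // valid pattern, matches
	fmt.Println(matchRegex("doc-(", "doc-42"))        // invalid pattern, silently no match
	if err := checkPattern("doc-("); err != nil {
		fmt.Println("pattern rejected before sending:", err)
	}
}
```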

Dependency Analysis

  • External dependencies:
    • bbolt: Local key-value storage.
    • hnsw: HNSW graph search library.
    • protobuf: Point structure serialization.
  • Internal modules:
    • api: HTTP service and routing.
    • core: Collection, index, storage, model.
graph LR
GOVEC["github.com/DotNetAge/govector"] --> BBOLT["go.etcd.io/bbolt"]
GOVEC --> HNSWLIB["github.com/coder/hnsw"]
GOVEC --> PROTO["google.golang.org/protobuf"]

Performance Considerations

  • The default HNSW parameters suit most scenarios; for large datasets with latency-sensitive queries, consider adjusting EfConstruction, EfSearch, and K.
  • The Dot value is negated in HNSW to satisfy the underlying library's assumption of "minimizing distance".
  • The storage layer supports optional quantization, which reduces disk usage but may sacrifice accuracy.
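
The negation of the dot product can be illustrated in a few lines. A larger dot product means a closer match, so returning its negative lets a library that minimizes distance rank the best match first. The negDot function below is a hypothetical stand-in, not the function GoVector registers with the HNSW library.

```go
package main

import "fmt"

// negDot returns the negated dot product of a and b, so that a larger
// similarity becomes a smaller "distance". Both slices are assumed to
// have the same length.
func negDot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return -sum
}

func main() {
	query := []float32{1, 0}
	near := []float32{2, 0} // dot product with query is 2
	far := []float32{1, 0}  // dot product with query is 1

	// The vector with the larger dot product gets the smaller distance.
	fmt.Println(negDot(query, near) < negDot(query, far)) // true
}
```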

Troubleshooting Guide

1. Configuration File Format Errors

  • Symptoms
    • Creating a collection or writing points returns 400 with a message indicating invalid JSON.
  • Cause identification
    • The request body is not valid JSON; the API layer returns 400 as soon as decoding fails.
  • Troubleshooting steps
    • Use curl or Postman to send a minimal valid request body, then add fields one at a time to isolate the one that causes the failure.
    • Refer to the JSON field definitions and examples for the collection creation interface.
  • Fix suggestions
    • Ensure the JSON structure is complete, the key names are correct, and the value types are as expected (e.g., integer parameters must be numbers, not strings).
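
A request body can be checked locally before it is sent. The sketch below decodes a body into a typed struct with Go's encoding/json, which catches truncated JSON, string values where integers are expected, and misspelled keys. The hnswConfig type and its JSON key names are hypothetical; use the field names from the collection creation interface. Rejecting unknown keys is a strictness choice of this sketch and may be stricter than the server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// hnswConfig is a hypothetical payload fragment; the key names are illustrative.
type hnswConfig struct {
	M              int `json:"m"`
	EfConstruction int `json:"ef_construction"`
}

// decodeStrict parses body into hnswConfig and rejects unknown keys,
// so a misspelled field name is reported instead of silently ignored.
func decodeStrict(body string) (hnswConfig, error) {
	var cfg hnswConfig
	dec := json.NewDecoder(strings.NewReader(body))
	dec.DisallowUnknownFields()
	err := dec.Decode(&cfg)
	return cfg, err
}

func main() {
	bodies := []string{
		`{"m": 16, "ef_construction": 200}`, // valid
		`{"m": "16"}`,                       // string instead of number
		`{"m": 16`,                          // truncated JSON
		`{"mm": 16}`,                        // misspelled key
	}
	for _, b := range bodies {
		cfg, err := decodeStrict(b)
		fmt.Printf("%-36s cfg=%+v err=%v\n", b, cfg, err)
	}
}
```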

2. Improper Parameter Settings

  • Invalid vector dimension
    • Symptoms: Creating a collection returns 400 with a message that the dimension must be positive, or writing points fails with a dimension mismatch.
    • Troubleshooting: Verify that vector_size in the request matches the actual vector length.
    • Fix: Unify the dimensions so that every point's vector length matches the collection.
  • Invalid distance metric
    • Symptoms: Creating a collection returns 400 with a message that the metric is invalid.
    • Troubleshooting: Only euclidean, dot, and cosine are allowed (case-insensitive).
    • Fix: Use a supported metric name.
  • HNSW parameter out of range or type error