Monitoring and Alerting¶
This guide targets operations and platform engineering teams and provides actionable configuration recommendations for monitoring and alerting on GoVector deployments. It covers:
- Collection and observation methods for key performance indicators (query latency, memory usage, disk I/O, network throughput)
- Prometheus monitoring integration and Grafana dashboard template recommendations
- Alert rule design (thresholds and levels)
- Log collection and analysis (access logs, error logs, performance logs)
- Fault detection and automatic recovery mechanism configuration
- Monitoring data visualization and report generation
Because the current repository ships no built-in Prometheus/Grafana configuration or logging-framework integration, this guide describes general best practices and incremental integration paths.
Project Structure¶
GoVector uses a layered organization of "command-line entry + HTTP API server + core engine":
- Command-line entry: Responsible for parameter parsing, storage initialization, collection loading and HTTP service startup/graceful shutdown
- API layer: Provides Qdrant-compatible REST interfaces, encapsulates collection management and vector operations
- Storage layer: Local persistence based on bbolt, supports metadata and point read/write
- Model and filtering: Defines point structure, scored point, filter and multiple matching types
- Benchmark tool: Provides memory statistics and measurement of query throughput/latency
```mermaid
graph TB
  subgraph "Process"
    CLI["Command-line entry<br/>cmd/govector/main.go"]
    API["HTTP API Server<br/>api/server.go"]
    CORE["Core engine and model<br/>core/*"]
  end
  CLI --> API
  API --> CORE
```
Core Components¶
- Command-line entry and lifecycle
- Parse parameters such as port, database path, index type
- Initialize storage engine and default collection
- Start HTTP service and register signal handling for graceful shutdown
- HTTP API server
- Provides collection management and point operation interfaces
- Internally maintains collection mapping and concurrency safety
- Storage engine
- Bucket-based storage on top of bbolt; supports collection metadata and point read/write
- Supports optional vector quantization compression
- Model and filtering
- Defines point structure, scored point, filter and multiple matching types
- Benchmark tool
- Outputs memory statistics, index build time, average query latency and QPS
Architecture Overview¶
The diagram below shows the call chain from client to storage and key observation points:
```mermaid
sequenceDiagram
  participant Client as "Client"
  participant HTTP as "HTTP API Server"
  participant Coll as "Collection"
  participant Store as "Storage (bbolt)"
  Client->>HTTP: PUT /collections/{name}/points
  HTTP->>Coll: Upsert(batch points)
  Coll->>Store: UpsertPoints (serialize + persist)
  Store-->>Coll: Write result
  Coll-->>HTTP: Status code/results
  HTTP-->>Client: Response
  Client->>HTTP: POST /collections/{name}/points/search
  HTTP->>Coll: Search(query)
  Coll->>Store: Read/deserialize as needed
  Store-->>Coll: Point set
  Coll-->>HTTP: TopK results
  HTTP-->>Client: Response
```
Detailed Component Analysis¶
Component 1: HTTP API Server (Monitoring Focus)¶
- Observation dimensions
- Total request volume, success rate, error code distribution
- Route-level latency (distinguished by endpoint)
- Concurrent connections and queue wait time
- Recommended instrumentation locations
- Record timestamps and status codes at request entry and return
- Separate statistics for collection management and point operation endpoints
- Observability output
- Export metrics in Prometheus text format
- Combine with existing logs to form access logs
```mermaid
flowchart TD
  Start(["Request entry"]) --> Pre["Record start time/tags"]
  Pre --> Route{"Route match"}
  Route --> |Upsert| Upsert["Handle Upsert"]
  Route --> |Search| Search["Handle Search"]
  Route --> |Delete| Delete["Handle Delete"]
  Route --> |Management| Manage["Handle collection management"]
  Upsert --> Post["Record end time/status code"]
  Search --> Post
  Delete --> Post
  Manage --> Post
  Post --> Export["Export metrics/write logs"]
  Export --> End(["Done"])
```
Component 2: Storage Engine (Disk I/O and Memory)¶
- Observation dimensions
- Upsert/Load/Delete time consumption and throughput
- bbolt transaction commit count and failure rate
- Serialization/deserialization overhead (Protobuf)
- CPU and memory changes when vector quantization is enabled
- Recommended instrumentation locations
- Instrument before and after UpsertPoints/LoadCollection/DeletePoints
- Record batch size, point count, byte length
- Observability output
- Metrics: Per-batch write latency, read latency, failure count
- Logs: Exception stack and context information
```mermaid
flowchart TD
  UStart["UpsertPoints start"] --> Serialize["Serialize points (Protobuf)"]
  Serialize --> Tx["bbolt transaction write"]
  Tx --> UEnd["UpsertPoints end"]
  LStart["LoadCollection start"] --> Tx2["bbolt transaction read"]
  Tx2 --> Deserialize["Deserialize points (Protobuf)"]
  Deserialize --> LEnd["LoadCollection end"]
  DStart["DeletePoints start"] --> Tx3["bbolt transaction delete"]
  Tx3 --> DEnd["DeletePoints end"]
```
Component 3: Collection and Filtering (Query Latency)¶
- Observation dimensions
- Search average latency, P95/P99 latency
- Impact of filter condition complexity on latency
- Comparison of different distance metrics and index types (HNSW/Flat)
- Recommended instrumentation locations
- Instrument at Search entry and return
- Distinguish whether filter is used, TopK size
- Observability output
- Latency histogram and summary
- Error classification statistics
```mermaid
flowchart TD
  SStart["Search entry"] --> Filter["Apply filter conditions"]
  Filter --> Index["Index search (HNSW/Flat)"]
  Index --> Sort["Sort/take TopK"]
  Sort --> SEpilogue["Record latency/results"]
  SEpilogue --> SEnd["Return"]
```
Component 4: Benchmark Tool (Performance Baseline)¶
- Observation dimensions
- Index build time, average time per point
- Query average latency, QPS
- Runtime memory allocation and GC count
- Recommended instrumentation locations
- Print key metrics in benchmark script
- Observability output
- Structured logs (JSON), easy for subsequent analysis
- Comparison with different scales/dimensions/index types
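The structured JSON output suggested above can be produced with the standard library alone. `report` and `runQueries` are hypothetical names for illustration, not the repository's benchmark API:

```go
package main

import (
	"encoding/json"
	"os"
	"runtime"
	"time"
)

// report is a structured (JSON) benchmark result, easy to diff across
// scales, dimensions, and index types.
type report struct {
	Queries    int     `json:"queries"`
	AvgLatency string  `json:"avg_latency"`
	QPS        float64 `json:"qps"`
	AllocMB    float64 `json:"alloc_mb"`
	NumGC      uint32  `json:"num_gc"`
}

// runQueries executes fn n times, derives average latency and QPS, and
// snapshots memory allocation and GC count via runtime.ReadMemStats.
func runQueries(n int, fn func()) report {
	start := time.Now()
	for i := 0; i < n; i++ {
		fn()
	}
	elapsed := time.Since(start)

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return report{
		Queries:    n,
		AvgLatency: (elapsed / time.Duration(n)).String(),
		QPS:        float64(n) / elapsed.Seconds(),
		AllocMB:    float64(ms.Alloc) / (1 << 20),
		NumGC:      ms.NumGC,
	}
}

func main() {
	// Stand-in workload; a real run would issue Search queries here.
	r := runQueries(1000, func() { time.Sleep(10 * time.Microsecond) })
	json.NewEncoder(os.Stdout).Encode(r)
}
```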
```mermaid
flowchart TD
  BStart["Start benchmark"] --> Build["Batch Upsert"]
  Build --> Warm["Warm-up (optional)"]
  Warm --> Queries["Random query N times"]
  Queries --> Metrics["Calculate latency/QPS"]
  Metrics --> BEnd["Output report"]
```
Dependency Analysis¶
- External dependencies
- bbolt: Local key-value storage
- Protobuf: Point structure serialization
- HNSW: Approximate nearest neighbor index
- Internal coupling
- API server depends on collection and storage
- Storage depends on Protobuf and bbolt
- Benchmark tool is independent of runtime, only reuses core models
```mermaid
graph LR
  API["api/server.go"] --> CORE["core/*"]
  CORE --> BBOLT["bbolt"]
  CORE --> PROTO["Protobuf"]
  CORE --> HNSW["HNSW index"]
  BENCH["cmd/bench/main.go"] --> CORE
```
Performance Considerations¶
- Query latency
- Using HNSW can significantly reduce query latency; Flat (exact brute-force search) suits small collections or scenarios where exact results matter more than latency
- More complex filter conditions result in higher query cost
- Memory usage
- Benchmark tool shows memory allocation and GC count under different scales, can be used for capacity planning
- Disk I/O
- Batch Upsert/Load/Delete operations each trigger a bbolt transaction; avoid overly small batches, which cause frequent small I/O and transaction-commit overhead
- Network throughput
- API layer is a single-machine HTTP service, bottleneck is usually network stack and CPU; can be optimized through concurrency and connection pooling
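To avoid the frequent-I/O pitfall noted above, writes can be buffered and flushed in larger groups. This batcher is a sketch with a hypothetical `flush` callback standing in for the real storage write (e.g. an UpsertPoints transaction):

```go
package main

import "fmt"

// batcher accumulates items and hands them to flush in groups of at least
// batchSize, amortizing per-transaction overhead. The int payload is a
// stand-in for points; flush is a hypothetical persistence callback.
type batcher struct {
	batchSize int
	buf       []int
	flush     func(batch []int)
	flushes   int
}

// Add buffers one item and flushes automatically once the batch is full.
func (b *batcher) Add(p int) {
	b.buf = append(b.buf, p)
	if len(b.buf) >= b.batchSize {
		b.Flush()
	}
}

// Flush writes any buffered items as a single batch (one "transaction").
func (b *batcher) Flush() {
	if len(b.buf) == 0 {
		return
	}
	b.flush(b.buf)
	b.flushes++
	b.buf = nil
}

func main() {
	var written int
	b := &batcher{batchSize: 100, flush: func(batch []int) { written += len(batch) }}
	for i := 0; i < 250; i++ {
		b.Add(i)
	}
	b.Flush() // drain the remainder
	fmt.Println(written, b.flushes) // 250 points in 3 flushes instead of 250
}
```

A production variant would typically also flush on a timer so partially filled batches are not held indefinitely.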