Engineering · 13 min read

Vector Databases Explained: When You Need One and How to Choose

A comprehensive comparison of Pinecone, Weaviate, Qdrant, Chroma, and pgvector with benchmarks, use cases, and architecture guidance for AI teams.

Priya Sharma·2026-02-18

If you are building any AI application that involves semantic search, retrieval-augmented generation (RAG), recommendation engines, or similarity matching, you have probably encountered the term "vector database." Over the past two years, vector databases have gone from a niche technology to one of the most discussed infrastructure components in the AI stack. But there is a lot of confusion around when you actually need a dedicated vector database versus when simpler alternatives will do.

At Obaro Labs, we have deployed vector search across more than thirty production systems. We have used nearly every major option on the market. This post distills what we have learned into practical guidance.

What Is a Vector Database?

At its core, a vector database stores and indexes high-dimensional vectors - numerical representations of data (text, images, audio, etc.) produced by embedding models. Unlike traditional databases that search by exact matches or keyword patterns, vector databases find items that are semantically similar by computing distances in vector space.

When you convert a sentence like "What are the side effects of metformin?" into a 1536-dimensional vector using an embedding model such as OpenAI text-embedding-3-small, the resulting vector captures the meaning of that sentence. A vector database can then find other vectors (and their associated documents) that are close in meaning, even if they use completely different words.
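
To make "close in meaning" concrete, here is a minimal Python sketch of cosine similarity, one common way to compare embeddings. The three-dimensional vectors are made-up stand-ins for real 1536-dimensional embeddings, and `cosine_similarity` is an illustrative helper rather than part of any library.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings" with hypothetical values, for illustration only.
query = [0.9, 0.1, 0.0]      # "What are the side effects of metformin?"
related = [0.8, 0.2, 0.1]    # a passage about adverse reactions to a diabetes drug
unrelated = [0.0, 0.1, 0.9]  # a passage about quarterly revenue

# The related passage scores much closer to 1.0 than the unrelated one.
print(cosine_similarity(query, related))
print(cosine_similarity(query, unrelated))
```

A vector database answers the same question, but against millions of stored vectors and without comparing the query to every one of them.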

When You Actually Need a Vector Database

Not every AI application needs a dedicated vector database. Here is our decision framework:

You likely need a vector database when:

  • You have more than 100,000 documents to search over
  • You need sub-100ms query latency at scale
  • You require filtering combined with semantic search (metadata filtering)
  • Your index is updated frequently (not just batch)
  • You need multi-tenancy with data isolation

You can probably skip a dedicated vector database when:

  • You have fewer than 50,000 documents
  • You are prototyping or building an MVP
  • Your data is static or updated infrequently
  • You already run PostgreSQL and can tolerate slightly higher latency

For small-scale projects, pgvector in PostgreSQL or even in-memory search with FAISS can be perfectly adequate. We have seen teams over-engineer their vector search infrastructure for datasets that could fit in memory on a single machine.
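
For a sense of how little machinery a small corpus needs, here is a sketch of exact (brute-force) nearest-neighbor search in plain Python. It is a slow stand-in for what a flat FAISS index does far more efficiently; `normalize`, `top_k`, and the toy vectors are hypothetical.

```python
import heapq
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def top_k(query: list[float], corpus: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Score every stored vector against the query and return the k best (id, score) pairs."""
    q = normalize(query)
    scored = (
        (doc_id, sum(a * b for a, b in zip(q, normalize(vec))))
        for doc_id, vec in corpus.items()
    )
    # nlargest returns the pairs sorted from highest to lowest score.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


corpus = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.1, 0.9, 0.0],
    "doc-c": [0.7, 0.3, 0.1],
}
hits = top_k([1.0, 0.0, 0.0], corpus, k=2)
print([doc_id for doc_id, _ in hits])
```

Every query scans the whole corpus, so cost grows linearly with the number of vectors. Approximate indexes such as HNSW exist to avoid that scan once the corpus gets large.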

Comparing the Major Options

We have benchmarked five popular vector database options across the dimensions that matter most in production: query latency, indexing speed, filtering capabilities, operational complexity, and cost.

Pinecone

Pinecone is a fully managed vector database that emphasizes simplicity. You do not manage infrastructure - you get an API endpoint and start indexing.

Strengths:

  • Zero operational overhead - no clusters to manage
  • Excellent metadata filtering performance
  • Consistent sub-50ms query latency at scale
  • Namespace isolation for multi-tenancy
  • Serverless tier for cost-effective small workloads

Weaknesses:

  • Vendor lock-in (proprietary, no self-hosting)
  • Cost scales steeply above 1M vectors on the standard tier
  • Limited control over index configuration
  • No on-premise option for regulated industries

Best for: Teams that want to move fast without managing infrastructure, SaaS products with moderate scale.

import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index("knowledge-base");

// Upsert vectors with metadata
await index.namespace("tenant-123").upsert([
  {
    id: "doc-001",
    values: embedding, // float32 array from your embedding model
    metadata: {
      source: "internal-wiki",
      department: "engineering",
      lastUpdated: "2026-01-15",
    },
  },
]);

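// Query with metadata filtering (queryEmbedding is the embedded query text,
// produced by the same embedding model as the stored vectors)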
const results = await index.namespace("tenant-123").query({
  vector: queryEmbedding,
  topK: 10,
  filter: { department: { $eq: "engineering" } },
  includeMetadata: true,
});

Weaviate

Weaviate is an open-source vector database with a rich feature set including built-in vectorization, hybrid search (combining BM25 keyword search with vector search), and a GraphQL API.

Strengths:

  • Open source with an active community
  • Built-in hybrid search (BM25 + vector)
  • Native multi-modal support (text, images)
  • GraphQL API is powerful for complex queries
  • Can run self-hosted or managed (Weaviate Cloud)

Weaknesses:

  • Higher operational complexity when self-hosted
  • Memory consumption can be significant for large indexes
  • GraphQL learning curve for teams used to REST
  • Cluster management requires Kubernetes expertise

Best for: Teams that need hybrid search, open-source requirements, or multi-modal applications.

Qdrant

Qdrant is a Rust-based open-source vector database that has gained significant traction for its performance and developer experience.

Strengths:

  • Excellent performance - written in Rust with SIMD optimizations
  • Advanced filtering with payload indexes
  • Quantization support for reducing memory usage
  • Simple REST and gRPC APIs
  • Good Docker and Kubernetes support

Weaknesses:

  • Smaller community than Weaviate
  • Managed cloud offering is newer and less mature
  • Documentation can be sparse for advanced features

Best for: Performance-sensitive applications, teams comfortable with self-hosting, cost-conscious deployments.

Chroma

Chroma positions itself as the "AI-native" open-source embedding database, designed to be the simplest possible developer experience.

Strengths:

  • Extremely simple API - get started in minutes
  • Excellent for prototyping and local development
  • Python-native with good LangChain and LlamaIndex integration
  • Lightweight - runs in-process or as a server

Weaknesses:

  • Not yet proven at large scale in production
  • Limited filtering compared to Pinecone or Qdrant
  • Single-node only (distributed mode is in development)
  • Performance degrades above approximately 1M vectors

Best for: Prototyping, small to medium projects, Python-heavy teams, quick experimentation.

import chromadb

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection(
    name="documents",
    metadata={"hnsw:space": "cosine"}
)

# Add documents with metadata
collection.add(
    ids=["doc1", "doc2", "doc3"],
    documents=[
        "Patient reported improved symptoms after 4 weeks",
        "Quarterly revenue exceeded projections by 12%",
        "The API endpoint returns a 429 status under load",
    ],
    metadatas=[
        {"type": "clinical", "department": "cardiology"},
        {"type": "financial", "quarter": "Q3"},
        {"type": "technical", "service": "payments"},
    ],
)

# Query with filtering
results = collection.query(
    query_texts=["heart condition improvement"],
    n_results=5,
    where={"type": "clinical"},
)

pgvector (PostgreSQL Extension)

pgvector adds vector similarity search to PostgreSQL. If you are already running Postgres, this is the path of least resistance.

Strengths:

  • No new infrastructure - extends your existing PostgreSQL
  • Full SQL capabilities for filtering and joins
  • ACID transactions for vector data
  • Familiar tooling, monitoring, and backup processes
  • Excellent for hybrid workloads (relational + vector)

Weaknesses:

  • Query performance lags behind dedicated vector databases at scale
  • HNSW index build times are slow for large datasets
  • Memory tuning is critical and non-obvious
  • No built-in sharding for vector indexes

Best for: Teams already on PostgreSQL, applications under 5M vectors, regulated industries that cannot add new infrastructure easily.

-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table with a vector column
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  metadata JSONB,
  embedding vector(1536)
);

-- Create an HNSW index for fast similarity search
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 200);

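-- Query for similar documents with metadata filtering.
-- $1 is the query embedding, bound as a parameter. <=> is cosine distance,
-- so 1 - distance gives cosine similarity.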
SELECT id, content, metadata,
       1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE metadata->>'department' = 'engineering'
ORDER BY embedding <=> $1::vector
LIMIT 10;

Benchmarks: What We Have Measured

We ran benchmarks on a standardized dataset of 2 million vectors (1536 dimensions, OpenAI embeddings) across all five options. Here are the results for 95th percentile query latency, index build time, memory usage, and monthly cost:

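| Database      | p95 Latency (top-10) | Index Build Time | Memory Usage  | Monthly Cost (managed) |
|---------------|----------------------|------------------|---------------|------------------------|
| Pinecone (s2) | 28 ms                | N/A (streaming)  | N/A (managed) | ~$280                  |
| Weaviate      | 35 ms                | 45 min           | 12 GB         | ~$350 (cloud)          |
| Qdrant        | 22 ms                | 38 min           | 9 GB          | ~$200 (cloud)          |
| Chroma        | 89 ms                | 62 min           | 14 GB         | Self-hosted only       |
| pgvector      | 120 ms               | 95 min           | 8 GB          | Existing PG cost       |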

These numbers will vary significantly based on hardware, configuration, and query patterns. But the relative ordering is consistent with what we have seen across production deployments.
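
If you want to reproduce this kind of measurement, here is a minimal sketch of computing a p95 from raw query timings using the nearest-rank definition. This is one common definition, shown for illustration and not necessarily the one behind the figures above; `percentile` and the sample latencies are hypothetical.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted samples
    return ordered[rank - 1]


# Hypothetical per-query latencies in milliseconds, including one slow outlier.
latencies_ms = [18.0, 21.5, 19.2, 20.1, 22.3, 95.0, 20.8, 19.9, 21.1, 20.4]
print("p50:", percentile(latencies_ms, 50))
print("p95:", percentile(latencies_ms, 95))
```

With only ten samples a single slow query decides the p95, which is why a benchmark needs thousands of queries before tail latency means much.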

Architecture Patterns We Recommend

Pattern 1: pgvector for MVP, migrate later. Start with pgvector if you are already on PostgreSQL. You can always migrate to a dedicated solution when you hit scale limits. We have done this migration for three clients, and it is straightforward because the vector data is the same - only the query layer changes.

Pattern 2: Dedicated vector DB with PostgreSQL for metadata. Store your vectors in Qdrant or Pinecone, but keep your source documents and rich metadata in PostgreSQL. Query the vector DB for IDs, then hydrate from Postgres. This gives you the best query performance while maintaining relational integrity.
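
The query-then-hydrate flow in Pattern 2 can be sketched as follows. To keep the example self-contained, `sqlite3` stands in for PostgreSQL and `search_vector_db` is a hypothetical stub in place of a real Qdrant or Pinecone call; the table and column names are also assumptions.

```python
import sqlite3


def search_vector_db(query_embedding: list[float], top_k: int) -> list[tuple[int, float]]:
    """Hypothetical stub for a vector DB query: returns (document id, score) pairs, best first."""
    canned = [(3, 0.91), (1, 0.84), (2, 0.40)]
    return canned[:top_k]


def hydrate(conn: sqlite3.Connection, hits: list[tuple[int, float]]) -> list[dict]:
    """Fetch full rows for the hit IDs and return them in the vector store's ranking order."""
    if not hits:
        return []
    ids = [doc_id for doc_id, _ in hits]
    # Only "?" placeholders are interpolated; the IDs themselves are bound as parameters.
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, title, body FROM documents WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # IN (...) does not preserve order, so re-apply the ranking here.
    # IDs missing from the relational store (e.g. deleted documents) are skipped.
    return [
        {"id": doc_id, "title": by_id[doc_id][1], "body": by_id[doc_id][2], "score": score}
        for doc_id, score in hits
        if doc_id in by_id
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [
        (1, "Onboarding", "How new hires get access."),
        (2, "Expenses", "How to file an expense report."),
        (3, "Deploy guide", "How to ship a release."),
    ],
)
results = hydrate(conn, search_vector_db([0.1, 0.2], top_k=2))
print([r["title"] for r in results])
```

The two details that tend to bite in practice are both in `hydrate`: restoring the ranking after the SQL lookup, and tolerating IDs that exist in the vector index but no longer in the relational store.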

Pattern 3: Hybrid search with Weaviate. When you need both keyword and semantic search - common in legal, compliance, and research applications - Weaviate's built-in hybrid search avoids the complexity of running two separate systems.

Embedding Model Considerations

Your choice of embedding model matters as much as your choice of vector database. A few recommendations:

  • OpenAI text-embedding-3-small: Good balance of quality and cost. 1536 dimensions. Our default recommendation for most use cases.
  • OpenAI text-embedding-3-large: Higher quality, 3072 dimensions. Use when retrieval accuracy is critical and you can afford the storage.
  • Cohere embed-v3: Strong multilingual support. Good option if you need to search across languages.
  • Open-source (e.g., bge-large-en-v1.5): No API costs, can run on your infrastructure. Quality is competitive for English-language use cases.
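
To put "afford the storage" in numbers, here is a back-of-the-envelope estimate of raw vector size. It assumes uncompressed float32 values (4 bytes each) and ignores index and metadata overhead, which varies by database; `raw_vector_gb` is an illustrative helper.

```python
def raw_vector_gb(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> float:
    """Raw size of the vectors alone, in decimal gigabytes (10**9 bytes)."""
    return num_vectors * dimensions * bytes_per_value / 1e9


# Compare the two OpenAI dimension counts at 2 million vectors.
for dims in (1536, 3072):
    print(f"{dims} dimensions: {raw_vector_gb(2_000_000, dims):.2f} GB")
```

Doubling the dimension count doubles the raw footprint, before any index structures are added on top.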

Remember that changing embedding models requires re-embedding your entire corpus. Choose carefully upfront, or build your pipeline to make re-embedding straightforward.

Common Mistakes We See

  1. Choosing based on benchmarks alone. Operational complexity matters more than a 10ms latency difference. Pick the database your team can operate confidently.

  2. Ignoring chunk strategy. The biggest lever for retrieval quality is not the vector database - it is how you chunk your documents. We typically use 512-token chunks with 50-token overlap for general text, and semantic chunking for structured documents.

  3. Skipping evaluation. You need a retrieval evaluation pipeline before going to production. Build a test set of queries with known relevant documents and measure recall at k and precision at k.

  4. Not planning for index updates. Many teams build a static index and do not plan for how new documents get added, old ones get removed, or existing ones get updated. Build this into your pipeline from day one.

  5. Over-indexing metadata. Every metadata filter adds complexity. Start with the minimum viable filtering and add more as needed.
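
Points 2 and 3 lend themselves to a short sketch: fixed-size chunking with overlap, and recall and precision at k. As an assumption for illustration, tokens are approximated by whitespace-separated words (a real pipeline would use the embedding model's tokenizer), and `chunk_tokens`, `recall_at_k`, and `precision_at_k` are hypothetical names.

```python
def chunk_tokens(tokens: list[str], size: int = 512, overlap: int = 50) -> list[list[str]]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top k results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(retrieved[:k]) & relevant) / k


# Chunking: a 1000-"token" document with the default 512/50 settings.
tokens = ("word " * 1000).split()
print([len(chunk) for chunk in chunk_tokens(tokens)])

# Evaluation: one test query with a hand-labeled set of relevant document IDs.
retrieved = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d3", "d8"}
print(recall_at_k(retrieved, relevant, k=5), precision_at_k(retrieved, relevant, k=5))
```

Averaging these two metrics over a few dozen labeled queries gives a baseline, so a change in chunk size or embedding model can be judged by measurement rather than impression.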

Our Recommendation

For most Obaro Labs clients, we recommend starting with pgvector if the dataset is under 500K vectors and the team already runs PostgreSQL. For larger-scale or performance-critical applications, we recommend Qdrant for self-hosted deployments and Pinecone for fully managed. If you need hybrid search, Weaviate is the clear choice.

The vector database space is maturing rapidly. Whatever you choose, design your application with a clean abstraction layer so you can swap implementations as the landscape evolves. At Obaro Labs, we use a repository pattern that abstracts the vector store behind a consistent interface - making it straightforward to migrate between providers as our clients' needs change.
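
One way such an abstraction can look, as a minimal Python sketch. All names here are hypothetical, and this illustrates the repository pattern in general rather than any specific production interface.

```python
import math
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SearchHit:
    id: str
    score: float


class VectorStore(Protocol):
    """The interface application code depends on; each provider gets its own adapter."""

    def upsert(self, id: str, vector: list[float]) -> None: ...

    def query(self, vector: list[float], top_k: int) -> list[SearchHit]: ...


def _cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Reference adapter using exact cosine similarity; handy for tests and prototypes."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, id: str, vector: list[float]) -> None:
        self._vectors[id] = list(vector)

    def query(self, vector: list[float], top_k: int) -> list[SearchHit]:
        hits = [SearchHit(doc_id, _cosine(vector, stored)) for doc_id, stored in self._vectors.items()]
        return sorted(hits, key=lambda hit: hit.score, reverse=True)[:top_k]


def find_similar(store: VectorStore, query_vector: list[float], top_k: int = 3) -> list[str]:
    """Application code: knows only the VectorStore interface, not the provider behind it."""
    return [hit.id for hit in store.query(query_vector, top_k)]


store = InMemoryVectorStore()
store.upsert("doc-1", [1.0, 0.0])
store.upsert("doc-2", [0.0, 1.0])
print(find_similar(store, [0.9, 0.1], top_k=1))
```

A pgvector or Qdrant adapter would implement the same two methods, so `find_similar` and the rest of the application stay unchanged when the backing store is swapped.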

If you are evaluating vector databases for your project, we are happy to share our benchmarking framework and help you make the right choice for your specific requirements.
