Vector Databases: Pinecone, Weaviate, and pgvector

Compare vector databases for AI applications: indexing algorithms, performance, and use cases.

Tags: Vector Database, AI, Pinecone, pgvector

By MinhVo

Introduction

The explosion of AI and machine learning applications has created a new category of database: the vector database. These specialized systems are designed to store, index, and query high-dimensional vector embeddings—numerical representations of data produced by machine learning models. Whether you're building semantic search, recommendation systems, anomaly detection, or retrieval-augmented generation (RAG) applications, vector databases are the backbone that makes similarity search fast and scalable.

Traditional databases like PostgreSQL and MySQL are optimized for structured data and exact matches. Out of the box, they have no index suited to similarity search across millions of high-dimensional vectors, so every query degrades to a full scan. Vector databases solve this with approximate nearest neighbor indexes such as HNSW (Hierarchical Navigable Small World) and IVF (Inverted File Index), which find the most similar vectors without scanning every row.

In this guide, we'll compare three leading vector database solutions: Pinecone, a fully managed cloud service; Weaviate, an open-source AI-native database; and pgvector, a PostgreSQL extension that brings vector search to your existing Postgres deployment. We'll examine their architectures, performance characteristics, and ideal use cases to help you choose the right solution for your project.

Understanding Vector Databases: Core Concepts

Vector databases store data as high-dimensional vectors, typically produced by embedding models like OpenAI's text-embedding-ada-002, Sentence Transformers, or Cohere's embed models. These vectors represent the semantic meaning of text, images, audio, or other data types in a mathematical space where similar concepts are close together.

How Embeddings Work

When you pass text through an embedding model, it produces a vector of floating-point numbers, commonly a few hundred to a few thousand dimensions (384, 1536, and 3072 are typical sizes). These numbers capture the semantic meaning of the text, so similar texts produce similar vectors. For example, "dog" and "puppy" would have vectors that are close together, while "dog" and "airplane" would be far apart.

Similarity Metrics

Vector databases support several distance metrics for comparing vectors:

  • Cosine similarity: Measures the angle between vectors, ignoring magnitude. Best for text embeddings.
  • Euclidean distance (L2): Measures straight-line distance between points. Good for image embeddings.
  • Dot product: Combines magnitude and direction. Used when vector magnitude is meaningful.
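
To make the three metrics concrete, here is a minimal sketch in TypeScript. The function names and the toy 3-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions, and a vector database computes these metrics for you.

```typescript
// Dot product: sensitive to both direction and magnitude.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Euclidean (L2) distance: straight-line distance between the two points.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Cosine similarity: dot product of the vectors after scaling both to length 1.
function cosineSimilarity(a: number[], b: number[]): number {
  const normA = Math.sqrt(dot(a, a));
  const normB = Math.sqrt(dot(b, b));
  if (normA === 0 || normB === 0) throw new Error("zero vector has no direction");
  return dot(a, b) / (normA * normB);
}

// Toy vectors (not real embeddings).
const dog = [1, 2, 2];
const puppy = [2, 4, 4]; // same direction as dog, twice the magnitude
const airplane = [2, -2, 1]; // perpendicular to dog

console.log(cosineSimilarity(dog, puppy)); // 1: identical direction
console.log(cosineSimilarity(dog, airplane)); // 0: unrelated direction
console.log(euclideanDistance(dog, puppy)); // 3: magnitude still separates them
```

Note how "dog" and "puppy" are identical under cosine similarity but three units apart under L2. That difference is why the metric you query with should match the one your embedding model was trained for.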

Indexing Algorithms

To search millions of vectors efficiently, vector databases use approximate nearest neighbor (ANN) algorithms:

  • HNSW: Creates a multi-layer graph that enables fast traversal to find similar vectors. Excellent recall and speed.
  • IVF: Partitions vectors into clusters and searches only relevant clusters. Good for large datasets.
  • PQ (Product Quantization): Compresses vectors to reduce memory usage at the cost of some accuracy.
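
The IVF idea is easy to sketch in a few lines. The TypeScript below is a toy illustration under simplifying assumptions, not how any of the three databases implements it: the centroids are hand-picked (a real index learns them, typically with k-means), and the names `buildIvf`, `ivfSearch`, and `nprobe` are chosen for this example.

```typescript
type Vec = number[];

function squaredL2(a: Vec, b: Vec): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return sum;
}

// Index of the centroid closest to v.
function nearestCentroid(v: Vec, centroids: Vec[]): number {
  let best = 0;
  for (let c = 1; c < centroids.length; c++) {
    if (squaredL2(v, centroids[c]) < squaredL2(v, centroids[best])) best = c;
  }
  return best;
}

// Build step: put each vector id into the list of its nearest centroid.
function buildIvf(vectors: Vec[], centroids: Vec[]): number[][] {
  const lists: number[][] = centroids.map(() => []);
  vectors.forEach((v, id) => lists[nearestCentroid(v, centroids)].push(id));
  return lists;
}

// Query step: scan only the `nprobe` lists whose centroids are closest to the query.
function ivfSearch(
  query: Vec, vectors: Vec[], centroids: Vec[], lists: number[][], k: number, nprobe: number
): number[] {
  const probed = centroids
    .map((c, i) => ({ i, d: squaredL2(query, c) }))
    .sort((x, y) => x.d - y.d)
    .slice(0, nprobe);
  const candidates: number[] = [];
  for (const p of probed) for (const id of lists[p.i]) candidates.push(id);
  return candidates
    .sort((x, y) => squaredL2(query, vectors[x]) - squaredL2(query, vectors[y]))
    .slice(0, k);
}

// Toy 2-D data with two hand-picked centroids.
const centroids: Vec[] = [[0, 0], [10, 10]];
const vectors: Vec[] = [[0, 1], [1, 0], [9, 10], [10, 9], [5, 6]];
const lists = buildIvf(vectors, centroids);

const approx = ivfSearch([4, 4], vectors, centroids, lists, 2, 1); // probes one list
const exact = ivfSearch([4, 4], vectors, centroids, lists, 2, 2); // probes every list
console.log(approx, exact);
```

With `nprobe = 1` the search scans two vectors instead of five but misses the true nearest neighbor (id 4), which sits in the other cluster. Probing more lists recovers it at the cost of more distance computations, which is the recall versus speed tradeoff every ANN index exposes.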

Architecture and Design Patterns

Pinecone Architecture

Pinecone is a fully managed vector database that abstracts away all infrastructure concerns. You interact with it through a simple API: create an index, upsert vectors, and query. Pinecone handles sharding, replication, and scaling automatically.

Key architectural features:

  • Serverless and pod-based tiers: Choose between pay-per-query serverless or dedicated pods
  • Namespaces: Logical partitions within an index for multi-tenancy
  • Metadata filtering: Filter results by metadata alongside vector similarity
  • Near-real-time indexing: Upserted vectors typically become searchable within seconds (reads are eventually consistent)

Weaviate Architecture

Weaviate is an open-source, AI-native vector database with a GraphQL-based query language. It can run self-hosted or as a managed service (Weaviate Cloud, formerly Weaviate Cloud Services or WCS). A distinguishing feature is its modular architecture: you can plug in different vectorizers (text2vec-openai, text2vec-cohere, etc.) that automatically generate embeddings.

Key architectural features:

  • Module system: Swap embedding models without changing application code
  • Multi-tenancy: Built-in support for SaaS applications
  • Hybrid search: Combine vector search with traditional keyword search
  • Generative modules: Built-in RAG capabilities with OpenAI, Cohere, etc.

pgvector Architecture

pgvector is a PostgreSQL extension that adds vector similarity search to your existing Postgres database. It stores vectors as a native data type and supports IVFFlat and HNSW indexes. pgvector is ideal when you want to combine vector search with relational data without managing a separate database.

Key architectural features:

  • Native PostgreSQL integration: Use SQL to query vectors alongside relational data
  • ACID compliance: Full transactional support
  • Index types: HNSW for the best speed-recall tradeoff (slower builds, more memory); IVFFlat for faster builds and lower memory use
  • Familiar tooling: Use existing Postgres tools for backup, replication, and monitoring

Step-by-Step Implementation

Pinecone Implementation

// Installation: npm install @pinecone-database/pinecone
 
import { Pinecone } from '@pinecone-database/pinecone';
 
// Initialize client
const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY,
});
 
// Create index
await pinecone.createIndex({
  name: 'my-index',
  dimension: 1536,
  metric: 'cosine',
  spec: { serverless: { cloud: 'aws', region: 'us-east-1' } },
});
 
// Get index reference
const index = pinecone.index('my-index');
 
// Upsert vectors
await index.upsert([
  {
    id: 'doc-1',
    values: [0.1, 0.2, 0.3 /* ... */], // truncated for display; supply all 1536 values
    metadata: { text: 'Hello world', category: 'greeting' },
  },
  {
    id: 'doc-2',
    values: [0.4, 0.5, 0.6 /* ... */],
    metadata: { text: 'Goodbye world', category: 'farewell' },
  },
]);
 
// Query similar vectors
const results = await index.query({
  vector: [0.1, 0.2, 0.3 /* ... */],
  topK: 10,
  includeMetadata: true,
  filter: { category: { $eq: 'greeting' } },
});
 
console.log(results.matches);
// [{ id: 'doc-1', score: 0.95, metadata: { text: 'Hello world', ... } }, ...]

Weaviate Implementation

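// Installation: npm install weaviate-ts-client
// Note: weaviate-ts-client is the older v2 client; the newer weaviate-client package has a different API.
 
import weaviate from 'weaviate-ts-client';
 
const client = weaviate.client({
  scheme: 'https',
  host: 'your-cluster.weaviate.network',
  apiKey: new weaviate.ApiKey(process.env.WEAVIATE_API_KEY),
  // text2vec-openai calls OpenAI on your behalf, so it needs an OpenAI key
  // (the OPENAI_API_KEY variable name is an assumption)
  headers: { 'X-OpenAI-Api-Key': process.env.OPENAI_API_KEY },
});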
 
// Create class (schema)
await client.schema
  .classCreator()
  .withClass({
    class: 'Document',
    vectorizer: 'text2vec-openai',
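    moduleConfig: {
      // text2vec-openai splits the model name: model 'ada' plus modelVersion '002'
      'text2vec-openai': { model: 'ada', modelVersion: '002', type: 'text' },
    },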
    properties: [
      { name: 'title', dataType: ['text'] },
      { name: 'content', dataType: ['text'] },
      { name: 'category', dataType: ['text'] },
    ],
  })
  .do();
 
// Insert objects (vectors generated automatically)
await client.data
  .creator()
  .withClassName('Document')
  .withProperties({
    title: 'Introduction to Vector Databases',
    content: 'Vector databases store high-dimensional embeddings...',
    category: 'tutorial',
  })
  .do();
 
// Query with nearText (semantic search)
const result = await client.graphql
  .get()
  .withClassName('Document')
  .withNearText({ concepts: ['machine learning databases'] })
  .withLimit(10)
  .withFields('title content category _additional { distance }')
  .do();
 
// Hybrid search (combine vector + keyword)
const hybridResult = await client.graphql
  .get()
  .withClassName('Document')
  .withHybrid({
    query: 'vector search performance',
    alpha: 0.75, // 0 = keyword, 1 = vector
  })
  .withLimit(10)
  .withFields('title content _additional { score }')
  .do();

pgvector Implementation

-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;
 
-- Create table with vector column
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  category TEXT,
  embedding vector(1536)
);
 
-- Create HNSW index for fast similarity search
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
 
-- Insert vectors
INSERT INTO documents (title, content, category, embedding) VALUES
  ('Introduction to Vectors', 'Vectors are mathematical...', 'tutorial',
   '[0.1, 0.2, 0.3, ...]'::vector),
  ('Database Design', 'Relational databases use...', 'architecture',
   '[0.4, 0.5, 0.6, ...]'::vector);
 
-- Query: Find 10 most similar documents
SELECT title, content, category,
       1 - (embedding <=> '[0.1, 0.2, 0.3, ...]'::vector) AS similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3, ...]'::vector
LIMIT 10;
 
-- Filtered search: combine vector similarity with a metadata filter
SELECT title, content,
       1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE category = $2
ORDER BY embedding <=> $1::vector
LIMIT 10;
 
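-- Alternative to HNSW: IVFFlat index (faster to build and smaller, but a weaker
-- speed-recall tradeoff; create it after the table has data so the lists are representative)
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);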
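
From application code, the `$1::vector` parameter in the filtered query is just a string in pgvector's text format. The sketch below shows one way to build it in TypeScript. The helper names `toVectorLiteral` and `buildFilteredSearch` are hypothetical, and the `{ text, values }` shape is assumed to be passed to a node-postgres style `query()` call; no database connection is made here.

```typescript
// Convert a JavaScript embedding into pgvector's text form, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0) throw new Error("embedding must not be empty");
  for (const x of embedding) {
    if (!isFinite(x)) throw new Error("embedding contains a non-finite value");
  }
  return "[" + embedding.join(",") + "]";
}

// Build a parameterized version of the filtered query.
// Values are passed separately from the SQL text, so nothing is interpolated into the query.
function buildFilteredSearch(embedding: number[], category: string, limit = 10) {
  return {
    text:
      "SELECT title, content, 1 - (embedding <=> $1::vector) AS similarity " +
      "FROM documents WHERE category = $2 " +
      "ORDER BY embedding <=> $1::vector LIMIT $3",
    values: [toVectorLiteral(embedding), category, limit],
  };
}

// Toy 3-value embedding for display; the table above expects 1536 values.
const filteredSearch = buildFilteredSearch([0.1, 0.2, 0.3], "tutorial");
console.log(filteredSearch.values[0]); // "[0.1,0.2,0.3]"
```

Validating the numbers before building the literal keeps a stray `NaN` from an upstream embedding call from reaching the database as a confusing cast error.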

Real-World Use Cases and Case Studies

Use Case 1: Semantic Search Engine

A legal technology company uses Pinecone to build a semantic search engine for court documents. Lawyers can search for cases using natural language queries like "patent infringement involving software algorithms" and find relevant precedents even when the exact terminology differs. Pinecone's managed infrastructure handles millions of document embeddings with sub-100ms query latency.

Use Case 2: E-Commerce Recommendation System

An online retailer uses Weaviate to power their product recommendation engine. Product descriptions and user reviews are vectorized using Weaviate's built-in text2vec-openai module. When a user views a product, the system queries for similar products using vector similarity. Weaviate's filtered vector search combines semantic similarity with category filters to keep recommendations relevant.

Use Case 3: RAG Application with Existing Postgres

A SaaS company already using PostgreSQL for their application database adds pgvector to implement a RAG (Retrieval-Augmented Generation) chatbot. By storing document embeddings alongside their relational data, they avoid the operational overhead of managing a separate vector database. The chatbot retrieves relevant documentation chunks and uses them as context for GPT-4 responses.

Use Case 4: Reverse Image Search

A stock photography platform uses Weaviate to enable reverse image search. Users upload an image, which is vectorized using a CLIP model, and the system finds visually similar photographs. Weaviate's multi-modal support allows combining image and text queries for more precise results.

Best Practices for Production

  1. Choose the right embedding model: The quality of your vectors determines search quality. OpenAI's text-embedding-3-small offers a good balance of cost and quality for most applications. For specialized domains, fine-tune a model on your data.

  2. Optimize index parameters: HNSW parameters (m, ef_construction, ef_search) trade off between recall, speed, and memory. Start with defaults and tune based on your recall requirements.

  3. Use metadata filtering wisely: Filter before vector search when possible to reduce the search space. Most vector databases support pre-filtering for this purpose.

  4. Batch your operations: Upsert and query vectors in batches rather than one at a time. This dramatically improves throughput—typically 10-50x for batch sizes of 100-500.

  5. Monitor recall and latency: Track the percentage of true nearest neighbors your index returns (recall) alongside query latency. If recall drops below 95%, consider re-indexing with different parameters.

  6. Implement caching: Cache frequent queries at the application layer. Vector similarity search results are deterministic, so caching is safe for static datasets.

  7. Plan for re-indexing: As your dataset grows, you may need to rebuild indexes with different parameters. Plan for this operational task in advance.

  8. Use namespaces for multi-tenancy: If serving multiple customers, use namespaces (Pinecone) or multi-tenancy features (Weaviate) to isolate data efficiently.
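
Recall monitoring (practice 5) needs very little code. The sketch below assumes you can get ground-truth neighbors for a sample of queries, for example from an exact scan run offline; the function names and the sample ids are made up for illustration.

```typescript
// recall@k = (true top-k neighbors that the index also returned) / (number of true top-k neighbors)
// Assumes ids are unique within each list.
function recallAtK(approxIds: string[], exactIds: string[], k: number): number {
  const truth = exactIds.slice(0, k);
  if (truth.length === 0) throw new Error("need at least one ground-truth neighbor");
  let hits = 0;
  for (const id of approxIds.slice(0, k)) {
    if (truth.indexOf(id) !== -1) hits++;
  }
  return hits / truth.length;
}

// Average recall over a sample of queries.
function meanRecall(runs: { approx: string[]; exact: string[] }[], k: number): number {
  if (runs.length === 0) throw new Error("no queries to evaluate");
  let total = 0;
  for (const r of runs) total += recallAtK(r.approx, r.exact, k);
  return total / runs.length;
}

const runs = [
  { approx: ["a", "b", "c", "d"], exact: ["a", "b", "c", "e"] }, // 3 of 4 found
  { approx: ["x", "y", "z", "w"], exact: ["w", "x", "y", "z"] }, // 4 of 4 found (order is ignored)
];
const recall = meanRecall(runs, 4);
console.log(recall); // 0.875
```

Running this on a fixed query sample after every index rebuild or parameter change gives you a number to compare against the 95% threshold above.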

Common Pitfalls and Solutions

| Pitfall | Impact | Solution |
| --- | --- | --- |
| Wrong embedding model | Poor search quality | Benchmark multiple models on your specific data |
| Ignoring index parameters | Slow queries or low recall | Tune m, ef_construction, and ef_search for your workload |
| Storing large metadata in vectors | Increased memory and slower queries | Store only filterable metadata; keep large blobs elsewhere |
| Not batching operations | Poor throughput | Batch upserts and queries in groups of 100-500 |
| Mixing distance metrics | Incorrect results | Use the same metric for indexing and querying |
| Neglecting index maintenance | Degrading performance over time | Monitor and rebuild indexes as data changes |

Performance Optimization

Performance in vector databases depends on dataset size, dimensionality, index type, and query patterns. The snippets below show common optimization techniques:

// Batch upsert for maximum throughput
async function batchUpsert(index, vectors, batchSize = 100) {
  for (let i = 0; i < vectors.length; i += batchSize) {
    const batch = vectors.slice(i, i + batchSize);
    await index.upsert(batch);
  }
}
 
// Use namespaces for multi-tenant isolation
// (index and queryVector are assumed to exist, as in the Pinecone example earlier)
const tenantIndex = index.namespace('tenant-123');
const tenantResults = await tenantIndex.query({
  vector: queryVector,
  topK: 10,
  includeMetadata: true,
});
 
// pgvector: raise ef_search (default 40) for better recall at the cost of latency
// (pgClient is assumed to be a connected node-postgres client; SET applies to that session only)
await pgClient.query("SET hnsw.ef_search = 100");
 
// Pinecone: Filter early to reduce search space
const filteredResults = await index.query({
  vector: queryVector,
  topK: 10,
  filter: {
    category: { $in: ['tech', 'science'] },
    // Range operators such as $gte compare numbers, so store dates as Unix timestamps
    date: { $gte: 1704067200 }, // 2024-01-01T00:00:00Z
  },
});

Comparison with Alternatives

| Feature | Pinecone | Weaviate | pgvector | Milvus | Qdrant |
| --- | --- | --- | --- | --- | --- |
| Deployment | Managed only | Self-hosted or managed | PostgreSQL extension | Self-hosted or managed | Self-hosted or managed |
| Max Dimensions | 20,000 | 65,535 | 2,000 indexed (16,000 stored) | 32,768 | 65,535 |
| Metadata Filtering | Yes | Yes (GraphQL) | Yes (SQL) | Yes | Yes |
| Multi-tenancy | Namespaces | Built-in | Separate tables | Partitions | Collections |
| Hybrid Search | Yes (sparse-dense vectors) | Yes | Manual (full-text search or pg_trgm) | Yes | Yes |
| Pricing | Pay per query/pod | Open source or cloud | Free (Postgres) | Open source or cloud | Open source or cloud |
| Best For | Simple managed solution | AI-native applications | Existing Postgres users | Large-scale deployments | High-performance search |

Advanced Patterns

Hybrid Search with Reciprocal Rank Fusion

# Combining vector and keyword search results.
# vector_db and full_text_search are placeholders for your own clients.
RRF_K = 60  # smoothing constant; 60 is the conventional default

def hybrid_search(query_text, query_vector, alpha=0.5):
    # Vector search results
    vector_results = vector_db.query(query_vector, top_k=50)
    
    # Keyword search results  
    keyword_results = full_text_search(query_text, limit=50)
    
    # Weighted Reciprocal Rank Fusion; ranks start at 1
    scores = {}
    for rank, doc in enumerate(vector_results, start=1):
        scores[doc.id] = scores.get(doc.id, 0) + alpha / (RRF_K + rank)
    
    for rank, doc in enumerate(keyword_results, start=1):
        scores[doc.id] = scores.get(doc.id, 0) + (1 - alpha) / (RRF_K + rank)
    
    # Sort by combined score and keep the top 20 (id, score) pairs
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)[:20]

Dynamic Embedding Model Selection

class EmbeddingRouter {
  constructor() {
    // cost is in USD per 1K tokens; prices change, so check each provider's pricing page
    this.models = {
      'text-embedding-3-small': { dim: 1536, cost: 0.00002 },
      'text-embedding-3-large': { dim: 3072, cost: 0.00013 },
      'cohere-embed-v3': { dim: 1024, cost: 0.0001 },
    };
  }
 
  selectModel(textLength, qualityRequirement) {
    if (qualityRequirement === 'high') {
      return 'text-embedding-3-large';
    }
    if (textLength > 5000) {
      return 'cohere-embed-v3'; // Illustrative rule; check each model's input limit and chunk long text
    }
    return 'text-embedding-3-small'; // Default, most cost-effective
  }
}

Testing Strategies

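// SearchService is the class under test; the import path is a placeholder
import { SearchService } from './search-service';
 
describe('Vector Search Service', () => {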
  let mockVectorDb;
 
  beforeEach(() => {
    mockVectorDb = {
      query: jest.fn().mockResolvedValue([
        { id: 'doc-1', score: 0.95, metadata: { title: 'Test Doc' } },
      ]),
      upsert: jest.fn().mockResolvedValue({ upsertedCount: 1 }),
    };
  });
 
  test('should return relevant results', async () => {
    const service = new SearchService(mockVectorDb);
    const results = await service.search('machine learning', { limit: 5 });
 
    expect(results).toHaveLength(1);
    expect(results[0].score).toBeGreaterThan(0.9);
    expect(mockVectorDb.query).toHaveBeenCalledWith(
      expect.any(Array),
      expect.objectContaining({ topK: 5 })
    );
  });
 
  test('should filter by category', async () => {
    const service = new SearchService(mockVectorDb);
    await service.search('test', { category: 'tutorial' });
 
    expect(mockVectorDb.query).toHaveBeenCalledWith(
      expect.any(Array),
      expect.objectContaining({
        filter: { category: { $eq: 'tutorial' } },
      })
    );
  });
});

Hybrid Search: Combining Vector and Keyword Approaches

Pure vector search excels at semantic similarity but may miss exact keyword matches, while traditional keyword search handles exact terms but lacks semantic understanding. Hybrid search combines both approaches by running vector and keyword queries in parallel, then merging results using a weighted scoring algorithm. This pattern is particularly valuable for e-commerce search where users may search for exact product names (keyword) or describe products in natural language (vector).

Implement hybrid search by maintaining both a vector index and a keyword index (ranked with BM25, for example) on the same data. Query both indexes and combine the results using reciprocal rank fusion (RRF) or a weighted sum of normalized scores. Weaviate and Pinecone both support hybrid queries natively, while pgvector requires a manual implementation: run the full-text (tsvector) query and the vector query separately, then merge the results in SQL or application code.

The weighting between vector and keyword scores depends on your use case. Technical documentation search benefits from higher keyword weight because users search for specific function names and error codes. Customer support search benefits from higher vector weight because users describe problems in their own words rather than using exact terminology. Tune the weights based on user behavior analytics and search result click-through rates.
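
The weighted-sum variant can be sketched as follows. This is a minimal TypeScript illustration, assuming each search returns `{ id, score }` pairs; the `Scored` type, the function names, and the scores are made up, and min-max normalization is one reasonable choice among several.

```typescript
type Scored = { id: string; score: number };

// Min-max normalize to [0, 1] so cosine scores and BM25-style scores become comparable.
function normalize(results: Scored[]): Scored[] {
  if (results.length === 0) return [];
  let min = results[0].score;
  let max = results[0].score;
  for (const r of results) {
    min = Math.min(min, r.score);
    max = Math.max(max, r.score);
  }
  const range = max - min;
  return results.map((r) => ({ id: r.id, score: range === 0 ? 1 : (r.score - min) / range }));
}

// alpha = 1 is pure vector search, alpha = 0 is pure keyword search.
function weightedHybrid(vectorResults: Scored[], keywordResults: Scored[], alpha: number): Scored[] {
  const combined: { [id: string]: number } = {};
  for (const r of normalize(vectorResults)) {
    combined[r.id] = (combined[r.id] || 0) + alpha * r.score;
  }
  for (const r of normalize(keywordResults)) {
    combined[r.id] = (combined[r.id] || 0) + (1 - alpha) * r.score;
  }
  return Object.keys(combined)
    .map((id) => ({ id, score: combined[id] }))
    .sort((a, b) => b.score - a.score);
}

// Made-up scores: similarities for the vector list, BM25-style scores for the keyword list.
const vectorHits: Scored[] = [
  { id: "doc-a", score: 0.92 }, { id: "doc-b", score: 0.85 }, { id: "doc-c", score: 0.6 },
];
const keywordHits: Scored[] = [
  { id: "doc-c", score: 12 }, { id: "doc-a", score: 7 }, { id: "doc-d", score: 2 },
];

const vectorHeavy = weightedHybrid(vectorHits, keywordHits, 0.75);
const keywordHeavy = weightedHybrid(vectorHits, keywordHits, 0.25);
console.log(vectorHeavy[0].id, keywordHeavy[0].id); // doc-a doc-c
```

The same two result lists produce a different winner depending on `alpha`, which is why the weight is worth tuning against click-through data rather than guessing.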

Future Outlook

The vector database landscape is evolving rapidly. Key trends include:

  • Convergence with traditional databases: PostgreSQL (pgvector), MySQL, and even SQLite are adding vector capabilities, reducing the need for specialized databases in many use cases.
  • Multi-modal embeddings: Models that embed text, images, audio, and video into a shared vector space are becoming mainstream.
  • Larger context windows: As embedding models support longer inputs, vector databases must handle larger documents and more granular chunking strategies.
  • Serverless vector search: Pinecone's serverless offering and similar products are making vector search accessible without infrastructure management.

The choice between dedicated vector databases and database extensions will increasingly depend on scale and operational requirements rather than capability.

Conclusion

Vector databases are essential infrastructure for AI-powered applications. Pinecone offers the simplest managed experience with excellent performance. Weaviate provides the most AI-native features with its module system and hybrid search. pgvector brings vector search to your existing PostgreSQL deployment with zero additional infrastructure.

Key takeaways:

  1. Start with pgvector if you already use PostgreSQL—operational simplicity matters
  2. Choose Pinecone for zero-ops managed vector search at scale
  3. Use Weaviate when you need AI-native features like automatic vectorization and hybrid search
  4. Always benchmark with your actual data—performance varies significantly by dataset
  5. Plan for embedding model changes—your vector database is only as good as your embeddings

The vector database space is moving fast, but the fundamentals of good vector search—quality embeddings, proper indexing, and thoughtful filtering—remain constant regardless of which solution you choose.