Sparse Vector Guide

This guide explains how to use the Sparse Vector feature in PowerMem, including configuration, query usage, schema upgrades, and historical data migration.

Prerequisites

Python 3.11+
powermem installed (pip install powermem)
Database requirements: seekdb or OceanBase >= 4.5.0

Note: The sparse vector feature is supported only with the OceanBase storage backend. SQLite does not support it.
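
The Python-side prerequisites can be checked before going further. The following is a minimal sketch using only the standard library; the helper names `python_ok` and `powermem_installed` are made up for this example and are not part of PowerMem. It does not check the database version.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 11)  # taken from the prerequisites above


def python_ok(version_info=sys.version_info):
    """Return True when the interpreter meets the Python 3.11+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def powermem_installed():
    """Return True when the powermem package can be located for import."""
    return importlib.util.find_spec("powermem") is not None


if __name__ == "__main__":
    print("Python >= 3.11:", python_ok())
    print("powermem installed:", powermem_installed())
```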

Configuring Sparse Vector

Enabling sparse vectors requires two settings:

vector_store.config.include_sparse = True - Enable sparse vector support
sparse_embedder - Configure sparse vector embedding service

Environment Variable Configuration

Add the following configuration to your .env file:

# Database configuration
DATABASE_PROVIDER=oceanbase
OCEANBASE_HOST=127.0.0.1
OCEANBASE_PORT=2881
OCEANBASE_USER=root
OCEANBASE_PASSWORD=your_password
OCEANBASE_DATABASE=powermem
OCEANBASE_COLLECTION=memories
OCEANBASE_EMBEDDING_MODEL_DIMS=1536

# Enable sparse vector
SPARSE_VECTOR_ENABLE=true

# Sparse vector embedding configuration
SPARSE_EMBEDDER_PROVIDER=qwen
SPARSE_EMBEDDER_API_KEY=your_api_key
SPARSE_EMBEDDER_MODEL=text-embedding-v4
SPARSE_EMBEDDER_DIMS=1536
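
A missing or empty variable is easy to overlook, so it can help to check the sparse vector settings before starting the application. The sketch below parses simple KEY=VALUE text and reports which of the sparse vector variables shown above are absent. Treating these four variables as required is an assumption of this example, and `parse_env_text` and `missing_sparse_env` are hypothetical helpers, not PowerMem functions.

```python
import os

# Variable names come from the .env example above; treating all of them
# as required is an assumption of this sketch.
SPARSE_ENV_VARS = (
    "SPARSE_VECTOR_ENABLE",
    "SPARSE_EMBEDDER_PROVIDER",
    "SPARSE_EMBEDDER_API_KEY",
    "SPARSE_EMBEDDER_MODEL",
)


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_sparse_env(env=None):
    """Return the sparse vector variables that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in SPARSE_ENV_VARS if not env.get(name, "").strip()]
```

To check a file rather than the process environment, pass the parsed text, for example `missing_sparse_env(parse_env_text(open(".env").read()))`.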

Dictionary Configuration

Configure sparse vectors using a Python dictionary:

from powermem import Memory

config = {
    'llm': {
        'provider': 'qwen',
        'config': {
            'api_key': 'your_api_key',
            'model': 'qwen-plus'
        }
    },
    'embedder': {
        'provider': 'qwen',
        'config': {
            'api_key': 'your_api_key',
            'model': 'text-embedding-v4',
            'embedding_dims': 1536
        }
    },
    # Sparse vector embedding configuration
    'sparse_embedder': {
        'provider': 'qwen',
        'config': {
            'api_key': 'your_api_key',
            'model': 'text-embedding-v4'
        }
    },
    'vector_store': {
        'provider': 'oceanbase',
        'config': {
            'collection_name': 'memories',
            'embedding_model_dims': 1536,
            'include_sparse': True,  # Enable sparse vector
            'connection_args': {
                'host': '127.0.0.1',
                'port': 2881,
                'user': 'root',
                'password': 'your_password',
                'db_name': 'powermem'
            },
            # Optional: Configure search weights
            'vector_weight': 0.5,
            'fts_weight': 0.5,
            'sparse_weight': 0.25
        }
    }
}

memory = Memory(config=config)

Configuration Parameters

Parameter	Type	Default	Description
`include_sparse`	bool	`False`	Whether to enable sparse vector support
`sparse_embedder.provider`	string	-	Sparse vector embedding provider (currently supports `qwen`)
`sparse_embedder.config.api_key`	string	-	API key
`sparse_embedder.config.model`	string	-	Embedding model name
`vector_weight`	float	`0.5`	Vector search weight
`fts_weight`	float	`0.5`	Full-text search weight
`sparse_weight`	float	`0.25`	Sparse vector search weight
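
Because sparse vector support depends on both `include_sparse` and a `sparse_embedder` section, a quick check of the dictionary before constructing `Memory` can catch a half-finished configuration. This is a minimal sketch based only on the keys listed in the table above; `sparse_config_problems` is a hypothetical helper, and PowerMem may perform its own validation.

```python
def sparse_config_problems(config):
    """Return a list of problems; an empty list means both parts are present."""
    problems = []

    # Part 1: the vector store must have sparse vectors switched on.
    store_cfg = config.get("vector_store", {}).get("config", {})
    if store_cfg.get("include_sparse") is not True:
        problems.append("vector_store.config.include_sparse is not True")

    # Part 2: a sparse embedder with provider, api_key and model.
    sparse = config.get("sparse_embedder")
    if not sparse:
        problems.append("sparse_embedder section is missing")
        return problems
    if not sparse.get("provider"):
        problems.append("sparse_embedder.provider is missing")
    sparse_cfg = sparse.get("config", {})
    for key in ("api_key", "model"):
        if not sparse_cfg.get(key):
            problems.append(f"sparse_embedder.config.{key} is missing")
    return problems
```

An empty result means the two required parts are in place; it says nothing about whether the API key or database connection actually works.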

Query Usage

Once sparse vectors are configured, searches automatically include sparse vector retrieval as part of hybrid search. No code changes are required.

Basic Search

from powermem import Memory, auto_config

# Load configuration (automatically loads from .env)
config = auto_config()
memory = Memory(config=config)

# Add memory (automatically generates sparse vector)
memory.add(
    messages="Machine learning is a branch of artificial intelligence, I love machine learning",
    user_id="user123"
)

# Search (automatically uses sparse vector for hybrid search)
results = memory.search(
    query="AI technology",
    user_id="user123",
    limit=10
)

Search Weight Configuration

Search combines three methods: vector search, full-text search, and sparse vector search. You can adjust the influence of each search method by configuring weights:

vector_weight: Vector search weight (default 0.5)
fts_weight: Full-text search weight (default 0.5)
sparse_weight: Sparse vector search weight (default 0.25)
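
When experimenting with different weightings, it can be convenient to derive a modified configuration without touching the original dictionary. The sketch below copies a configuration and sets the three weight keys under `vector_store.config`. The helper `with_search_weights` is hypothetical, and rejecting negative weights is an assumption of this example rather than a documented PowerMem rule.

```python
import copy


def with_search_weights(config, vector_weight=0.5, fts_weight=0.5, sparse_weight=0.25):
    """Return a deep copy of config with the three search weights set."""
    weights = {
        "vector_weight": vector_weight,
        "fts_weight": fts_weight,
        "sparse_weight": sparse_weight,
    }
    for name, value in weights.items():
        if value < 0:  # assumption: negative weights are not meaningful
            raise ValueError(f"{name} must be non-negative, got {value}")

    updated = copy.deepcopy(config)  # leave the caller's dictionary unchanged
    store_cfg = updated.setdefault("vector_store", {}).setdefault("config", {})
    store_cfg.update(weights)
    return updated
```

For example, `with_search_weights(config, sparse_weight=0.4)` returns a configuration that gives sparse vector search more influence while keeping the other two weights at the defaults listed above.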

Schema Upgrade and Data Migration

If you already have a table without sparse vector support, you need to upgrade the schema and optionally migrate historical data.

For detailed instructions on upgrading existing tables and migrating historical data, please refer to:

Sparse Vector Migration Guide

The migration guide covers:

Schema upgrade steps
Migration parameters and options
Progress monitoring
Verification methods
Rollback procedures

Complete Usage Workflow

New Table (Recommended)

For a new table, enable sparse vectors in the configuration:

from powermem import Memory, auto_config

config = auto_config()  # Ensure include_sparse=True in configuration
memory = Memory(config=config)

# Add memory (automatically generates sparse vector)
memory.add(messages="memory content", user_id="user123")

# Search (automatically uses sparse vector)
results = memory.search(query="query content", user_id="user123")

Existing Table Upgrade

If upgrading an existing table, please refer to the Sparse Vector Migration Guide for detailed instructions.
