freeradiantbunny.org

AI Database Infrastructure

This page surveys the kinds of databases that support AI and LLM applications, and the role each one plays in the infrastructure.

1. Vector Databases

Vector databases are core to LLM retrieval and inference systems. They store embeddings and perform approximate nearest neighbor (ANN) searches.
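To make the idea of a nearest neighbor search over embeddings concrete, here is a minimal sketch in plain Python. It performs an exact, brute-force cosine-similarity search, not the approximate (ANN) indexing a real vector database would use, and the three-dimensional embeddings and document ids are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, store, k=2):
    """Return the k document ids whose embeddings are most similar to query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical embeddings; real ones typically have hundreds of dimensions.
store = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}

top = nearest([1.0, 0.0, 0.0], store, k=2)
print(top)
```

A brute-force scan like this compares the query against every stored vector, which is why production systems swap it for an ANN index once the collection grows large.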

2. Relational Databases

Relational databases hold structured data such as logs, metadata, and fine-tuning datasets.
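As a small illustration of keeping a fine-tuning dataset in a relational store, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, and sample rows are assumptions made for this example, not a recommended schema.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE finetune_examples (
        id INTEGER PRIMARY KEY,
        prompt TEXT NOT NULL,
        completion TEXT NOT NULL,
        source TEXT NOT NULL
    )
    """
)

# Hypothetical prompt/completion pairs tagged with where they came from.
rows = [
    ("What is ANN search?", "Approximate nearest neighbor search.", "docs"),
    ("Name a vector database.", "Qdrant is one example.", "faq"),
    ("What does RLHF stand for?", "Reinforcement learning from human feedback.", "docs"),
]
conn.executemany(
    "INSERT INTO finetune_examples (prompt, completion, source) VALUES (?, ?, ?)",
    rows,
)
conn.commit()

# Select only the examples that came from the "docs" source.
docs_examples = conn.execute(
    "SELECT prompt, completion FROM finetune_examples WHERE source = ? ORDER BY id",
    ("docs",),
).fetchall()
print(len(docs_examples))
```

Storing the metadata (here, the source column) next to each example is what makes it easy to filter or audit a dataset before a fine-tuning run.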

3. Document Stores

Document stores are useful for managing prompts, configuration, and LLM session context.

4. Time-Series & Logging Databases

Time-series and logging databases track usage, monitor performance, and support RLHF feedback loops.

5. Data Lakes & Warehouses

Data lakes and warehouses handle the bulk of pre-training data and feature store infrastructure.

6. Graph Databases

Graph databases store knowledge graphs, which enable reasoning and symbolic augmentation in LLM apps.
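One way to picture how a knowledge graph can augment an LLM is to walk the graph from an entity and collect the facts found along the way, which could then be added to a prompt. The sketch below does this with an in-memory adjacency list in plain Python rather than a real graph database; the triples, relation names, and the facts_about helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples.
triples = [
    ("Qdrant", "is_a", "vector database"),
    ("Milvus", "is_a", "vector database"),
    ("vector database", "supports", "ANN search"),
    ("Neo4j", "is_a", "graph database"),
]

# Adjacency list: subject -> list of (relation, object) edges.
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[subject].append((relation, obj))

def facts_about(entity, depth=2):
    """Collect facts reachable from entity within `depth` hops (breadth-first)."""
    facts = []
    frontier = [entity]
    seen = {entity}
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for relation, obj in graph.get(node, []):
                facts.append(f"{node} {relation} {obj}")
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts

context = facts_about("Qdrant")
print(context)
```

The second hop is what surfaces a fact the entity's own edges do not state directly, which is the kind of multi-step lookup graph stores are suited to.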

7. Hybrid Systems

Hybrid systems combine keyword, semantic, and embedding search in AI apps.
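One common way to merge results from different search methods is reciprocal rank fusion (RRF), which scores each document by summing 1 / (k + rank) over every ranked list it appears in. The sketch below is a minimal plain-Python version; the two input rankings are made-up stand-ins for a keyword search and an embedding search, and k=60 is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results, best match first in each list.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
embedding_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, embedding_hits])
print(fused)
```

Because RRF uses only ranks, it does not need the keyword and embedding scores to be on the same scale, which is a frequent obstacle when combining the two directly.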

Infrastructure Summary Table

Role               | Common Technologies
Vector Search      | Pinecone, FAISS, Weaviate, Qdrant, Milvus
Structured Data    | PostgreSQL, MySQL, SQLite
Unstructured Data  | MongoDB, Couchbase
Monitoring & Logs  | InfluxDB, ClickHouse, Prometheus
Training Storage   | Databricks, Snowflake, BigQuery
Knowledge Graphs   | Neo4j, TigerGraph, Neptune

Trends & Takeaways