Retrieval-augmented generation (RAG) needs two things that can quietly add cost: a vector store and an embedding model. Both have strong free options in 2026. Here is how to start building AI search and RAG for $0.
Free vector storage options
You often do not need a dedicated vector database to begin:
- Postgres + pgvector — run it on a free Postgres tier (for example, Neon's free plan). For many apps this is all you need, and it keeps your vectors next to your relational data.
- Dedicated vector DB free tiers — several managed vector databases offer free tiers suitable for prototypes and small indexes.
- SQLite / libSQL (Turso) — a free tier that works well for smaller, embedded use cases.
Check the Perkstack catalog for current free tiers and credits on data and infrastructure services.
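To illustrate why a dedicated vector database is often unnecessary at first, here is a minimal sketch that stores embeddings in plain SQLite and ranks them by cosine similarity in Python. It uses only the standard library. The table name `chunks`, the helper names, and the three-dimensional toy vectors are assumptions made for illustration; real embeddings come from a model and have hundreds of dimensions. This brute-force scan is a stand-in for the idea, not for pgvector itself, which runs the nearest-neighbor query inside Postgres.

```python
import json
import math
import sqlite3


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def create_store(path=":memory:"):
    """Open a SQLite database and create the (hypothetical) chunks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, embedding TEXT NOT NULL)"
    )
    return conn


def add_chunk(conn, text, embedding):
    """Store one chunk with its embedding serialized as JSON."""
    conn.execute(
        "INSERT INTO chunks (text, embedding) VALUES (?, ?)",
        (text, json.dumps(embedding)),
    )
    conn.commit()


def search(conn, query_embedding, k=3):
    """Score every stored chunk against the query and return the top k."""
    rows = conn.execute("SELECT text, embedding FROM chunks").fetchall()
    scored = [
        (cosine_similarity(query_embedding, json.loads(emb)), text)
        for text, emb in rows
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy vectors, made up for the example; they are not real embeddings.
conn = create_store()
add_chunk(conn, "pgvector setup notes", [1.0, 0.0, 0.0])
add_chunk(conn, "billing FAQ", [0.0, 1.0, 0.0])
add_chunk(conn, "postgres tuning", [0.9, 0.1, 0.0])

top = search(conn, [1.0, 0.0, 0.0], k=2)
for score, text in top:
    print(f"{score:.3f}  {text}")
```

A full scan like this is typically fine for a few thousand chunks. Once the index grows, moving the same rows into Postgres with pgvector lets the database do the ranking and adds the option of an approximate index.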
Free and cheap embeddings
- Several LLM providers include embeddings in their free tiers or give signup credits you can spend on them — see free AI API credits.
- Open embedding models can be run on free inference tiers or cheap hosts.
- Cache embeddings aggressively — re-embedding the same text is wasted spend.
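The caching advice above can be sketched as a small wrapper that keys each embedding by a hash of the model name and the text, so identical text is only embedded once per model. `EmbeddingCache`, `fake_embed`, and the model name `example-model` are hypothetical names for this example; `fake_embed` stands in for whatever provider call you actually use, and the in-memory dict would normally be replaced by a table or key-value store so the cache survives restarts.

```python
import hashlib


class EmbeddingCache:
    """Caches embeddings by a SHA-256 hash of (model name, text)."""

    def __init__(self, embed_fn, model_name):
        self._embed_fn = embed_fn      # function: str -> list[float]
        self._model_name = model_name  # part of the key: vectors differ per model
        self._store = {}
        self.misses = 0                # how many times embed_fn was really called

    def _key(self, text):
        payload = f"{self._model_name}\x00{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, text):
        """Return the cached embedding, computing it only on the first request."""
        key = self._key(text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]


def fake_embed(text):
    # Placeholder for a real (possibly billed) embedding call.
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(fake_embed, model_name="example-model")
first = cache.get("hello world")     # miss: calls fake_embed
second = cache.get("hello world")    # hit: served from the cache
third = cache.get("another chunk")   # miss: new text
print(cache.misses)
```

Including the model name in the key matters because switching models changes the vectors, and a stale hit would silently mix incompatible embeddings in one index.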
Free retrieval/search APIs for agents
If your "RAG" is really web retrieval, Tavily and Exa are search APIs built for AI agents, and each offers around 1,000 free API requests a month. They pair well with any LLM.
Keep RAG cheap as you grow
- Chunk and embed once; store, do not recompute.
- Use a smaller embedding dimension where quality allows; vector storage and search cost grow with the dimension.
- For the generation step, compare inference providers per model in the rankings.
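As a rough illustration of the first two points, the sketch below splits a document into overlapping chunks once and estimates how much raw storage the resulting vectors need at two dimensions. The chunk size, overlap, vector count, and the dimensions 1536 and 384 are example values chosen for the arithmetic, not recommendations, and the estimate assumes 4-byte float32 values with no index or row overhead.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def index_size_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw size of the vectors alone, assuming float32 (4 bytes per value)."""
    return num_vectors * dimensions * bytes_per_value


document = "x" * 500                  # stand-in for real document text
chunks = chunk_text(document)         # chunk once, then embed and store each chunk
print([len(c) for c in chunks])       # [200, 200, 180]

large = index_size_bytes(100_000, 1536)
small = index_size_bytes(100_000, 384)
print(large, small)                   # 614400000 153600000
```

At these example values the smaller dimension needs a quarter of the space, which can decide whether an index fits inside a free tier. Whether retrieval quality holds up at the smaller size depends on the model and the data, so it is worth measuring on your own queries.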
Bottom line
Start RAG on Postgres + pgvector with free-tier embeddings, and add web retrieval via a free Tavily or Exa tier. Create a free account to find the credits that cover the rest of your stack.