Simple, Predictable Pricing
Start with a 30-day free trial, then choose the monthly lookup volume and namespace coverage that fit your AI app.
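All plans include exact-match caching, semantic matching, namespace TTL controls, test mode, and dashboard visibility.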
Estimate the value of cache hits
Every cache hit is a provider call your app does not have to make. Actual dollar savings depend on model pricing, prompt size, response size, and workload repetition.
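As a rough illustration of that estimate, the sketch below multiplies the number of cache hits by the cost of one provider call. The hit rate, token counts, and per-million-token prices are made-up inputs, not figures from PromptCacheAI or any model provider; substitute your own.

```python
def estimate_monthly_savings(lookups, hit_rate, input_tokens, output_tokens,
                             input_price_per_mtok, output_price_per_mtok):
    """Estimate provider spend avoided by cache hits in one month.

    Prices are in dollars per million tokens. All inputs are assumptions
    supplied by the caller, not values published by any provider.
    """
    # Cost of one provider call that a cache hit would have avoided.
    cost_per_call = (input_tokens * input_price_per_mtok
                     + output_tokens * output_price_per_mtok) / 1_000_000
    # Each hit is one provider call the app did not make.
    hits = lookups * hit_rate
    return hits * cost_per_call


# Hypothetical workload: 50,000 lookups, 30% hit rate,
# 800 prompt tokens and 300 response tokens per call.
savings = estimate_monthly_savings(
    lookups=50_000, hit_rate=0.30, input_tokens=800, output_tokens=300,
    input_price_per_mtok=1.00, output_price_per_mtok=4.00)
print(f"Estimated provider spend avoided: ${savings:.2f}")
```

Subtracting the plan price from this figure gives a net estimate; low workload repetition lowers the hit rate and the savings with it.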
Starter
Validate caching in one app
Best for prototypes, indie apps, and validating one low-volume workflow before expanding.
$19
monthly • after free trial
- 50,000 cache lookups / month
- 3 namespaces
- 5,000 validator checks / month
- 60 / min rate limit
Growth
Production cache for SaaS and support workflows
Best for production apps, support bots, internal tools, and repeated RAG questions.
$49
monthly • after free trial
- 250,000 cache lookups / month
- 15 namespaces
- 25,000 validator checks / month
- 300 / min rate limit
Scale
High-volume cache with broad namespace coverage
Best for high-volume products, multiple apps, heavier traffic, and broad namespace usage.
$149
monthly • after free trial
- 1,000,000 cache lookups / month
- 75 namespaces
- 100,000 validator checks / month
- 600 / min rate limit
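One way to compare the three plans is to encode the limits listed above and pick the cheapest plan that covers an expected workload. This is only a sketch: the `Plan` class and `smallest_fitting_plan` helper are hypothetical names for illustration, and the numbers are copied from the plan cards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    price_usd: int          # monthly price after the free trial
    lookups: int            # cache lookups per month
    namespaces: int
    validator_checks: int   # validator checks per month
    rate_per_min: int       # rate limit per minute


# Limits as listed on the plan cards, ordered cheapest first.
PLANS = [
    Plan("Starter", 19, 50_000, 3, 5_000, 60),
    Plan("Growth", 49, 250_000, 15, 25_000, 300),
    Plan("Scale", 149, 1_000_000, 75, 100_000, 600),
]


def smallest_fitting_plan(lookups, namespaces, validator_checks=0, rate_per_min=0):
    """Return the cheapest plan whose limits cover the workload, or None."""
    for plan in PLANS:
        if (lookups <= plan.lookups
                and namespaces <= plan.namespaces
                and validator_checks <= plan.validator_checks
                and rate_per_min <= plan.rate_per_min):
            return plan
    return None  # exceeds every listed plan


plan = smallest_fitting_plan(lookups=40_000, namespaces=5)
print(plan.name if plan else "No listed plan covers this workload")
```

In this example the lookup volume fits Starter, but five namespaces exceed its limit of three, so the helper moves up to Growth.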
Pricing FAQ
What is a lookup?
A lookup is one /chat request to PromptCacheAI. It checks for an exact or semantic cache hit before your app calls a model provider.
Does saving a response count as another lookup?
No. /cache/save stores the response from a cache miss and does not count as an extra lookup.
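The counting rule can be illustrated with a small in-memory stand-in. This does not call the real service and makes no claim about the request or response shapes of /chat or /cache/save; `InMemoryCacheStandIn`, `answer`, and `call_provider` are hypothetical names, and the stand-in does exact matching only.

```python
class InMemoryCacheStandIn:
    """Hypothetical stand-in for the cache service (exact match only)."""

    def __init__(self):
        self.store = {}
        self.lookups_used = 0

    def chat(self, namespace, prompt):
        # Plays the role of a /chat request: every call counts as one lookup.
        self.lookups_used += 1
        return self.store.get((namespace, prompt))

    def save(self, namespace, prompt, response):
        # Plays the role of /cache/save: stores the response, counts no lookup.
        self.store[(namespace, prompt)] = response


def answer(cache, namespace, prompt, call_provider):
    """Check the cache first; call the provider and save only on a miss."""
    cached = cache.chat(namespace, prompt)
    if cached is not None:
        return cached, "hit"
    response = call_provider(prompt)
    cache.save(namespace, prompt, response)
    return response, "miss"


cache = InMemoryCacheStandIn()
provider_calls = []


def fake_provider(prompt):
    # Stand-in for a model provider call.
    provider_calls.append(prompt)
    return f"answer to: {prompt}"


first = answer(cache, "support", "How do I reset my password?", fake_provider)
second = answer(cache, "support", "How do I reset my password?", fake_provider)
print(first[1], second[1], cache.lookups_used, len(provider_calls))
```

Two requests use two lookups and one provider call; the save after the first miss adds nothing to the lookup count.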
Do all plans include semantic caching?
Yes. All plans include exact-match caching, semantic matching, namespace TTL controls, and dashboard visibility.
Do all plans include test mode?
Yes. Test mode lets you simulate cache behavior, review semantic matches, and approve or reject prompt variants before serving cached responses live.
What are validator checks?
Validator checks apply to mid-confidence semantic matches: PromptCacheAI asks an AI validator whether the cached response is safe to reuse. If validator capacity is exhausted, those matches are treated as cache misses instead of being served automatically.
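A minimal sketch of that decision, under stated assumptions: the `high` and `low` similarity thresholds are illustrative values, not PromptCacheAI's actual cutoffs, and serving high-confidence matches without a validator check is assumed rather than documented here. The `decide` function and `validator_approves` callback are hypothetical.

```python
def decide(similarity, validator_checks_left, validator_approves,
           high=0.95, low=0.80):
    """Return (decision, validator_checks_left) for one semantic match.

    `high` and `low` are illustrative thresholds, not the service's real ones.
    `validator_approves` is a zero-argument callable standing in for the
    AI validator.
    """
    if similarity >= high:
        # Assumed: high-confidence matches need no validator check.
        return "hit", validator_checks_left
    if similarity < low:
        # Assumed: low-confidence matches are never served.
        return "miss", validator_checks_left
    if validator_checks_left <= 0:
        # Validator capacity exhausted: treat as a miss, do not serve.
        return "miss", validator_checks_left
    # Mid-confidence match: spend one validator check.
    approved = validator_approves()
    return ("hit" if approved else "miss"), validator_checks_left - 1


print(decide(0.88, 10, lambda: True))   # validator consulted, one check spent
print(decide(0.88, 0, lambda: True))    # no capacity left, falls back to a miss
```

Falling back to a miss when capacity runs out means the app pays for a provider call rather than serving an unvalidated cached response.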
Do all plans include the dashboard?
Yes. The dashboard includes hit rate, exact hits, similarity hits, test-mode would-hits, prompt variants, prompt/response search, TTL status, and editable cached responses.
What happens if I hit my plan limits?
Plan quotas and rate limits are enforced to keep the service reliable. If your app needs more lookups, namespaces, or throughput, upgrade from billing settings or contact us for higher-volume needs.
Can I change plans later?
Yes. You can manage your subscription from billing settings after signing in.
Can I cancel during the trial?
Yes. You can cancel through the billing portal during the trial.