Question 1

What is Semantara and what problem does it solve?

Accepted Answer

Semantara is an intelligent proxy that sits between your app and AI providers (OpenAI, Anthropic, and soon Gemini). It adds semantic caching, complexity-based routing, and spend control so you pay less for the same answers — without changing your code.

Question 2

What does BYOK ("bring your own key") mean?

Accepted Answer

You use your own provider keys. Semantara never resells tokens or marks up your usage: you keep your billing relationship with OpenAI/Anthropic, and we only charge for the platform subscription.

Question 3

How does semantic caching save me money?

Accepted Answer

When two questions mean the same thing even if worded differently, Semantara reuses the answer it already generated instead of calling (and paying) the model again. In real workloads, 30-40% of queries are semantically similar, which typically means 40-70% lower cost on those queries.

Question 4

How is this different from OpenAI's or Anthropic's caching?

Accepted Answer

Providers cache identical prefixes (the exact same text). Semantara caches by meaning: "how do I reset my password?" and "I forgot my login, what now?" share one answer. And Semantara stacks on top of provider caching — it doesn't replace it.

Question 5

What is complexity-based routing?

Accepted Answer

Semantara classifies each request and sends it to the right model: simple ones to a cheaper, faster model; complex ones to the powerful model. You pay for the premium model only when it's actually needed.

Question 6

Do I have to change my code to integrate it?

Accepted Answer

No. Semantara exposes an OpenAI-compatible endpoint (/v1/chat/completions). Change the base URL (and optionally the model) and you're done; the rest of your integration stays the same.

Question 7

Which providers are supported?

Accepted Answer

Today OpenAI and Anthropic, with Gemini on the way. Being OpenAI-compatible, you can unify several providers behind a single integration.

Question 8

Are my keys and data secure?

Accepted Answer

Yes. Provider keys are encrypted with AES-256-GCM, every client is isolated in a multi-tenant environment, and we never use your data to train models. See the Security page for details.

Question 9

How much can I really save?

Accepted Answer

It depends on how many queries repeat and on your model mix. Use the Savings Calculator to estimate your case; typical savings on repeated queries range from 40% to 70%.

Question 10

Can I control my spend?

Accepted Answer

Yes. You track your usage and savings in real time from the dashboard —cost is computed on every request, per client and per key— around our core metric: dollars saved per month. And you set per-key rate limits to curb unexpected usage.

Question 11

How do plans and billing work?

Accepted Answer

There's a Free, Pro, and Business plan with fixed pricing, plus custom Enterprise. Because it's BYOK, the subscription covers the platform; model usage is billed directly by your providers. See the Plans page.

Question 12

Where is it hosted? Is there an on-premise option?

Accepted Answer

The managed version runs on cloud infrastructure. For enterprise needs with sensitive data, contact us: the Enterprise plan supports custom deployments.

Frequently asked questions