Business model
Unit economics for the prototype: subscription revenue and inference
spend for one primary model. Inputs are assumptions, not measurements
— calibrate against real Gemini promptTokenCount /
candidatesTokenCount logs as they accumulate.
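One way to do that calibration, as a minimal sketch: average the two token counts over logged calls and feed the results back in as the input assumptions. The log shape (a list of dicts keyed by the Gemini usage field names) and the sample values are assumptions for illustration, not real measurements.

```python
from statistics import mean

def average_tokens(usage_logs):
    """Return (avg input tokens, avg output tokens) per call.

    Assumes each record is a dict carrying the Gemini usage fields
    promptTokenCount and candidatesTokenCount.
    """
    avg_prompt = mean(r["promptTokenCount"] for r in usage_logs)
    avg_output = mean(r["candidatesTokenCount"] for r in usage_logs)
    return avg_prompt, avg_output

# Made-up sample records, for illustration only.
logs = [
    {"promptTokenCount": 2400, "candidatesTokenCount": 310},
    {"promptTokenCount": 2600, "candidatesTokenCount": 290},
]
avg_prompt, avg_output = average_tokens(logs)
print(avg_prompt, avg_output)
```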
Monthly P&L
Caching only applies to the system-prompt portion of input. Gemini cached rates assume a paid Vertex AI / AI Studio tier. Anthropic cached rates use the published 90%-off cache-read price; the one-time 25% cache-write surcharge isn't modeled — assume the cache is hot.
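A rough sketch of the per-call input cost under these rules, assuming a hot cache: system-prompt tokens are billed at the cached rate and the rest of the input at the normal rate. The function name, parameters, and the rates in the example are placeholders, not published prices.

```python
def input_cost_per_call(system_tokens, other_input_tokens,
                        input_rate_per_mtok, cached_rate_per_mtok):
    """Dollar cost of the input side of one call.

    Only the system-prompt tokens get the cached rate. The cache-write
    surcharge is ignored (cache assumed hot).
    """
    cached_part = system_tokens * cached_rate_per_mtok / 1_000_000
    uncached_part = other_input_tokens * input_rate_per_mtok / 1_000_000
    return cached_part + uncached_part

# Placeholder rates: $1.00 per million input tokens, cache reads 90% off.
base_rate = 1.00
cached_rate = base_rate * 0.10
cost = input_cost_per_call(2000, 500, base_rate, cached_rate)
print(f"${cost:.6f} per call")
```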
Models the Flash → Pro fallback. Cost is the escalation model's per-call rate × calls × escalation rate. If the escalation model is the same as the primary, this row stays at $0, so those calls are not billed twice.
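The escalation row can be sketched as below. The function and argument names are hypothetical, and the model labels and numbers in the example are placeholders.

```python
def escalation_cost(per_call_cost, calls, escalation_fraction,
                    primary_model, escalation_model):
    """Monthly cost of calls that fall back to the escalation model.

    per_call_cost is the escalation model's cost per call;
    escalation_fraction is the share of calls that escalate (0..1).
    """
    if escalation_model == primary_model:
        # Same model on both tiers: already counted in the primary row.
        return 0.0
    return per_call_cost * calls * escalation_fraction

# Placeholder inputs: 10,000 calls, 5% escalate, $0.01 per escalated call.
monthly = escalation_cost(0.01, 10_000, 0.05, "flash", "pro")
same_model = escalation_cost(0.01, 10_000, 0.05, "pro", "pro")
print(monthly, same_model)
```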
Covers what scales with calls: Cloud Functions invocation + GB-seconds, Firestore writes for the session doc and analytics events, Storage upload of the clip bytes. Calibrate against the Overview Firebase trend — recent monthly cost ÷ analyses ≈ per-analyze rate.
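The calibration in the last sentence is a single division. Here is a small sketch, with hypothetical names and made-up figures standing in for the numbers read off the Firebase trend:

```python
def per_analyze_rate(recent_monthly_cost, analyses):
    """Approximate variable infrastructure cost per analysis, in dollars."""
    if analyses <= 0:
        raise ValueError("need at least one analysis to calibrate")
    return recent_monthly_cost / analyses

def variable_infra_cost(rate, calls):
    """Projected monthly variable cost for a given call volume."""
    return rate * calls

# Made-up figures: $42 of Firebase spend over 12,000 analyses last month.
rate = per_analyze_rate(42.0, 12_000)
projected = variable_infra_cost(rate, 20_000)
print(f"rate=${rate:.4f} per analysis, projected=${projected:.2f}")
```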
Covers what doesn't scale with calls: BigQuery billing export, scheduler triggers, dashboard reads, base infrastructure.
Optimal price forecast
Sweeps subscription price across the range using a constant-elasticity
demand curve anchored at your current Subscription × Users:
users(p) = u₀ · (p₀/p)^ε. Elasticity 1.0 means a 10% price hike
costs roughly 10% of users (9.1% under this curve); 1.5 is typical for
prosumer SaaS; <1 means demand is inelastic (no interior optimum:
profit keeps rising with price, so the sweep lands on the upper bound
of the range). Treat the optimum as a starting hypothesis for a real
pricing test, not a target.
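A minimal sketch of such a sweep follows. The profit formula is an assumption made for illustration: a flat variable cost per user plus a fixed monthly cost. The function names, grid size, and every number in the example are likewise assumptions, not the dashboard's actual implementation.

```python
def users_at(price, p0, u0, elasticity):
    """Constant-elasticity demand: users(p) = u0 * (p0 / p) ** elasticity."""
    return u0 * (p0 / price) ** elasticity

def sweep_price(p0, u0, elasticity, cost_per_user, fixed_cost,
                lo, hi, steps=250):
    """Evaluate profit on an even price grid; return (best_price, best_profit)."""
    best_price, best_profit = None, None
    for i in range(steps + 1):
        price = lo + (hi - lo) * i / steps
        users = users_at(price, p0, u0, elasticity)
        # Assumed profit model: per-user margin times users, minus fixed cost.
        profit = (price - cost_per_user) * users - fixed_cost
        if best_profit is None or profit > best_profit:
            best_price, best_profit = price, profit
    return best_price, best_profit

# Placeholder inputs: 100 users at $10, $4 variable cost per user, $200 fixed.
elastic_price, elastic_profit = sweep_price(10.0, 100, 1.5, 4.0, 200.0, 5.0, 30.0)
inelastic_price, _ = sweep_price(10.0, 100, 0.8, 4.0, 200.0, 5.0, 30.0)
print(elastic_price, inelastic_price)
```

With elasticity above 1, the sweep finds a peak inside the range. With elasticity below 1, the best price is simply the top of the range, which is the inelastic case described above.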