📉 Collapsing Token Pricing 📈 Rising Costs 💵 Bay Area Startups Collectively Secured $35B+ in May Week 3

📉 Collapsing Token Pricing 📈 Rising Costs

💡 The price of a single AI token has collapsed. Hardware improvements, inference engine optimization, and architectural changes like disaggregated prefill/decode systems have pushed token costs down fast. API pricing keeps getting cheaper as open models and infrastructure providers fight for efficiency.

By itself, that should have lowered enterprise AI spend.
…BUT IT DIDN'T.

Enterprise AI bills are climbing because cheaper tokens unlocked far more usage. (Yes, this is the Jevons paradox at work.) We moved past the phase where AI handled short chatbot conversations and lightweight prompts. Tokens are now tied directly to production workloads: code generation, autonomous agents, background processing, internal copilots, research loops, and operational automation running continuously inside the business.

A growing percentage of AI compute now happens at test time instead of training time. Companies are no longer paying for a static answer. They are paying for reasoning depth, iteration, verification, memory management, tool usage, and agent coordination happening live during inference.

A single enterprise task can trigger massive hidden token generation behind the scenes. Multi-agent systems may explore several solution paths, validate outputs, call external tools, retry failed actions, and run internal reasoning loops before returning a final response. One prompt can quietly generate millions of tokens worth of compute.
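To see how quickly that fan-out adds up, here is a rough back-of-the-envelope sketch in Python. Every input (agent count, paths explored, steps per path, tokens per step, retry rate) is a hypothetical assumption chosen for illustration, not a measurement from any real system.

```python
def tokens_per_task(agents, paths_per_agent, steps_per_path,
                    tokens_per_step, retry_rate):
    """Estimate the total tokens generated behind a single task.

    retry_rate is the assumed fraction of steps that get re-run once
    (failed tool calls, rejected outputs, and so on).
    """
    base_steps = agents * paths_per_agent * steps_per_path
    retried_steps = base_steps * retry_rate
    return int((base_steps + retried_steps) * tokens_per_step)


# Hypothetical workload: 4 agents, each exploring 3 solution paths of
# 20 steps, about 4,000 tokens per step, with a quarter of steps retried.
total = tokens_per_task(agents=4, paths_per_agent=3, steps_per_path=20,
                        tokens_per_step=4_000, retry_rate=0.25)
print(f"{total:,} tokens for one task")  # 1,200,000 tokens for one task
```

Under these assumed inputs, one request crosses the million-token mark without the user ever seeing more than the final answer.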

So even as the cost per million tokens drops, overall infrastructure spend keeps rising because workload volume is scaling faster than efficiency gains.
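The arithmetic behind that is simple: spend equals volume times unit price, so spend rises whenever volume grows faster than price falls. The sketch below uses made-up starting values and growth rates (a 50% price cut and a 5x volume increase per period) purely to illustrate the mechanism.

```python
def project_spend(tokens_millions, price_per_million,
                  volume_growth, price_decline, periods):
    """Return (period, tokens_millions, price_per_million, spend) rows.

    volume_growth is a multiplier per period (5.0 means 5x);
    price_decline is the fractional price cut per period (0.5 means -50%).
    """
    rows = []
    for period in range(periods + 1):
        spend = tokens_millions * price_per_million
        rows.append((period, tokens_millions, price_per_million, spend))
        tokens_millions *= volume_growth
        price_per_million *= (1 - price_decline)
    return rows


# Hypothetical start: 1,000M tokens at $10 per million tokens.
rows = project_spend(1_000, 10.0, volume_growth=5.0,
                     price_decline=0.5, periods=2)
for period, tokens, price, spend in rows:
    print(f"period {period}: {tokens:,.0f}M tokens at ${price:.2f}/M = ${spend:,.0f}")
```

With these assumptions the unit price falls by 75% over two periods while the bill grows from $10,000 to $62,500.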

This is changing how enterprises evaluate ROI around AI infrastructure.

The question is no longer "What does a million tokens cost?"

The question is "How much labor, operational throughput, and decision-making capacity did those tokens replace?"

Once companies see measurable gains from agentic systems, they become willing to consume enormous amounts of inference. At that point, lower token pricing stops reducing spend and starts accelerating adoption.
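One simple way to frame that ROI question is to compare inference spend against the value of the work it displaced. The function below is a minimal sketch under that assumption. The hours saved, hourly cost, and token figures are hypothetical, and a real evaluation would need to account for more than labor hours.

```python
def inference_roi(tokens_millions, price_per_million,
                  hours_saved, hourly_cost):
    """Return ROI as (value of work replaced - spend) / spend."""
    spend = tokens_millions * price_per_million
    if spend <= 0:
        raise ValueError("inference spend must be positive")
    value_replaced = hours_saved * hourly_cost
    return (value_replaced - spend) / spend


# Hypothetical month: 2,000M tokens at $5 per million tokens,
# replacing 400 hours of work valued at $75 per hour.
roi = inference_roi(tokens_millions=2_000, price_per_million=5.0,
                    hours_saved=400, hourly_cost=75.0)
print(f"ROI: {roi:.1f}x")  # ROI: 2.0x
```

Framed this way, a falling token price raises the return on each task, which is why a positive result tends to lead teams to run more inference, not less.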

The unit economics are collapsing.
The aggregate infrastructure demand is exploding.

That is why the era of the multi-million dollar AI inference bill is only getting started.

The infrastructure underneath has to keep up.
That's what AI INFRA SUMMIT exists to support.

See you at AIS 6 December 4th, San Francisco.
Secure your spot with Super Early Bird Tickets below

Bay Area Startups Collectively Secured $35B+ in May Week 3

The third week of the month closed with more than $35B in funding, 98% of which came in eleven megadeals. The three largest were in the billions: Anthropic, $30B; Unnamed Anthropic JV, $1.5B; and AMP PBC, $1.3B. Among the other ten megadeals were the $100M stakes that the U.S. government took in nine quantum computing companies, including four in Silicon Valley: two startups (PsiQuantum, Atom Computing) and two public companies (D-Wave Quantum, Rigetti Computing).

For startups raising capital: Stay on top of who's raising, who's closing and who's investing with the Pulse of the Valley weekday newsletter. Founders get the newsletter, database and alerts for just $7/month ($50 value). Check it out and sign up here.

Follow LinkSV on LinkedIn to stay on top of SV funding intelligence, and the companies, investors and executives impacting the startup ecosystem.

Early Stage:

  • Unnamed Anthropic JV closed a $1.5B Series A, an AI-native enterprise services firm to help mid-size companies bring Claude into their core operations.

  • AMP PBC closed a $1.3B Series A, AMP Infra PBC provides pooled, automated infrastructure on the global AI grid.

  • Hark closed a $700M Series A, building the most advanced personal intelligence in the world.

  • Vital Signals closed a $15M Seed, a health innovation company addressing how people understand and manage blood pressure over time.

  • Origin Lab closed an $8M Seed, a technology platform turning licensed game worlds into structured training data for world models and multimodal AI.

Growth Stage:

  • Anthropic closed a $30B round (series unknown), an AI safety and research company developing AI systems that are helpful, honest and harmless.

  • Decart closed a $300M Series B, building efficient systems-level AI infrastructure that it says enables a tenfold improvement in both training and inference of the largest generative models.

  • Exa closed a $250M Series C, developing novel representation learning techniques and crawling infrastructure so that LLMs can intelligently find relevant information.

  • Armada closed a $230M Series B, the hyperscaler for the edge, delivering modular AI infrastructure from first deployment to AI factory.

  • Commure closed a $70M Series F, delivers next-generation AI infrastructure for health systems.

VAST DATA // AI infrastructure is running into a data problem at the same time compute demand is exploding. 

Training clusters are scaling, inference workloads are multiplying, and agentic systems are increasing pressure on storage, retrieval, orchestration, and real-time data movement.

Most organizations are not looking for "more storage." They want infrastructure that can feed GPUs efficiently, eliminate bottlenecks, reduce latency, and support AI systems operating at production scale. That is where VAST Data fits.

Who they are
VAST Data is an AI infrastructure company building a unified software platform that combines storage, database, compute orchestration, and real-time data services into a single AI operating system.

What they deliver
They provide AI-native data infrastructure designed for large-scale training, inference, analytics, and agentic workloads. Their platform unifies structured and unstructured data, multi-protocol storage, real-time data access, and distributed compute orchestration into a shared architecture built for AI-era workloads.

Who they serve
Cloud providers, hyperscalers, enterprises, research organizations, and AI infrastructure operators managing large-scale GPU environments and data-intensive workloads.

VAST Data joined us at AI Infra Summit 5, where conversations around inference economics, data movement, GPU utilization, and agentic infrastructure repeatedly surfaced across the event. As AI infrastructure shifts toward distributed, real-time, and increasingly autonomous systems, the data layer is becoming one of the most important parts of the stack.

Explore VAST Data to see how organizations are building AI infrastructure around unified data architecture.

Your Feedback Matters!

Your feedback is crucial in helping us refine our content and maintain the newsletter's value for you and your fellow readers. We welcome your suggestions on how we can improve our offering. [email protected] 

Logan Lemery
Head of Content // Team Ignite
