A token is roughly three-quarters of a word. The sentence "the policy excludes flood damage" contains five words and, by that rule of thumb, roughly seven tokens. Every time you send a message to an AI - and every time it responds - the platform counts up the tokens on both sides and charges you accordingly.
That's it. Tokens are units of text, and AI is billed by the unit.
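As a back-of-envelope illustration, the three-quarters rule can be turned into a small estimator. This is only a sketch: the function name and the example sentence are made up for illustration, and a real tokenizer will give somewhat different counts depending on the model.

```python
import math

# The article's rule of thumb: one token is about three-quarters of a word.
# This is an approximation, not what any real tokenizer does.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate tokens from a whitespace word count, rounding up."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


# A hypothetical six-word request.
print(estimate_tokens("summarise this endorsement for the client"))  # 8
```

For budgeting purposes this kind of estimate is usually close enough; exact counts only matter when you are near a limit.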
The reason it matters is that for most of the past three years, businesses didn't pay per token - they paid a flat monthly subscription. A broker could run an entire policy wording through ChatGPT a hundred times and pay the same $20 a month as someone who used it twice. That model is ending. Anthropic, OpenAI and others are now moving enterprise customers to usage-based billing, which means the token count is no longer academic. It's a line on someone's budget.
To get a feel for the scale involved, here are some rough token counts for tasks a broker would recognise.

| Task | Approx. tokens |
|---|---|
| A short client email (in + out) | 300 - 500 |
| Summarising a one-page endorsement | 800 - 1,500 |
| Drafting a renewal cover letter | 1,000 - 2,000 |
| Analysing a 10-page policy section | 800 - 1,500 |
| Full commercial property policy wording (input only) | 50,000 - 150,000 |
| Complete renewal pack: policy + submission + prior year | 200,000 - 400,000 |
| AI agent running autonomously for one hour | 500,000 - 1,000,000+ |

These are indicative estimates. Actual consumption varies by model, prompt length and output complexity.
The last row is the one that caught companies like Uber off guard. When AI agents - software that can work through a task sequence autonomously without a human prompting each step - started running in the background, the token meter kept ticking whether anyone was watching or not. Uber burned through its entire 2026 AI budget by April. Workato saw its bill jump sevenfold the day its provider switched to usage-based pricing.
Token prices vary significantly by model and tier. As a rough guide based on current API pricing:
Budget models (older or smaller): around $0.50 - $2 per million tokens
Mid-range models (GPT-4 class, Claude Sonnet): around $3 - $10 per million tokens
Frontier models (Claude Opus, GPT-5.5): around $15 - $30 per million tokens
Running a complete renewal pack analysis through a frontier model - say 300,000 tokens in total - would cost roughly $4.50 to $9 per analysis at current API rates. Do that fifty times a month and you're spending $225 to $450 on that task alone, before anything else.
Under a flat $20 subscription, the same work costs nothing extra. That is why the shift to usage-based pricing landed so hard, so fast.
For brokerages on subscription plans rather than direct API access, the impact shows up differently - as usage caps, throttled access during peak hours, or prompts to upgrade to a more expensive tier. The underlying dynamic is the same: heavy document work burns through a lot of tokens, and the platforms are no longer absorbing that cost quietly.
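The renewal-pack arithmetic above can be reproduced in a few lines. This is a minimal sketch that assumes a single blended price per million tokens, as the rough guide does; real API price lists typically charge different rates for input and output tokens, so treat the function names and figures as illustrative.

```python
# Frontier-model price band from the rough guide above (USD per million tokens).
FRONTIER_LOW = 15.0
FRONTIER_HIGH = 30.0


def analysis_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of one run, assuming one blended per-token rate."""
    return tokens * price_per_million / 1_000_000


def monthly_cost(tokens: int, price_per_million: float, runs: int) -> float:
    """Cost in USD of repeating the same run a number of times."""
    return analysis_cost(tokens, price_per_million) * runs


pack_tokens = 300_000  # the renewal-pack example from the text
low = analysis_cost(pack_tokens, FRONTIER_LOW)
high = analysis_cost(pack_tokens, FRONTIER_HIGH)
print(f"Per analysis: ${low:.2f} to ${high:.2f}")

monthly_low = monthly_cost(pack_tokens, FRONTIER_LOW, 50)
monthly_high = monthly_cost(pack_tokens, FRONTIER_HIGH, 50)
print(f"Fifty a month: ${monthly_low:.2f} to ${monthly_high:.2f}")
```

Swapping in a budget-tier price shows how quickly the same task becomes cheap on a smaller model.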
The context window is how much text a model can hold in its working memory at once. It is measured in tokens.
This matters for insurance work more than most. If you're asking a model to compare two policy wordings, identify gaps in a submission, or summarise a claims history alongside a renewal proposal, everything you're working with needs to fit inside the context window at the same time. Material that doesn't fit gets dropped, and the model works with an incomplete picture without necessarily flagging that it's doing so.
Both Claude Opus 4.8 and GPT-5.5 now offer one-million-token context windows via their APIs - enough for most complex commercial broking tasks. A standard commercial property policy wording runs to roughly 50,000 to 100,000 tokens; a complete renewal pack including submission, prior year policy and claims schedule might reach 200,000 to 400,000. The context window determines whether you can load all of that at once - or whether you need to break it into chunks and lose the whole-picture view that makes AI genuinely useful for complex placements.
The catch is that filling a large context window costs more, because every token you load is billed. At a flat per-token rate, loading 400,000 tokens into Claude Opus costs forty times as much per query as loading 10,000. For routine tasks - drafting a standard letter, answering a quick coverage question - using a large frontier model is like hiring a barrister to post a letter. A smaller, cheaper model will do fine. Knowing which tool to reach for is becoming a real operational skill.
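Whether a pack fits in one go is a simple sum. The sketch below assumes you already have approximate token counts per document; the document names, the counts and the allowance reserved for the model's reply are all hypothetical.

```python
def fits_in_context(documents: dict[str, int], window_tokens: int,
                    reserved_for_reply: int = 10_000) -> bool:
    """True if every document, plus room for the reply, fits in one window."""
    return sum(documents.values()) + reserved_for_reply <= window_tokens


# Hypothetical token counts for one renewal pack (illustrative, not measured).
renewal_pack = {
    "policy_wording": 100_000,
    "submission": 60_000,
    "prior_year_policy": 100_000,
    "claims_schedule": 40_000,
}

print(fits_in_context(renewal_pack, window_tokens=1_000_000))  # True
print(fits_in_context(renewal_pack, window_tokens=200_000))    # False
```

When the check fails, the choice is between a model with a larger window and splitting the pack, with the loss of whole-picture view described above.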
The brokerages managing AI costs well in 2026 aren't restricting access - they're setting expectations. A few practical principles that come up consistently:
Match the model to the task. Routine drafting and quick summaries don't need a frontier model. Save the expensive models for complex multi-document analysis, high-stakes client correspondence and anything where accuracy is non-negotiable.
Build per-user spend awareness, don't just set caps. Hard caps create frustration. Helping staff understand roughly what different tasks cost - so they make informed choices rather than just hitting a wall - works better and produces fewer complaints.
Watch agent usage closely. Autonomous agents that run in the background can consume tokens at a rate that individual human users never would. Build monitoring into any agentic workflow before it goes live, not after the bill arrives.
Keep client data out of public interfaces regardless of cost. The token bill is the manageable risk. A data breach or a regulatory question about where client information went is not.
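The agent-monitoring principle can be sketched as a running token budget that is checked after every step. Everything here is an assumption made for illustration: the class name, the 80% warning threshold and the per-step figures are invented, and a real deployment would read usage from whatever the provider reports.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    limit: int                  # hypothetical token allowance for one agent run
    warn_fraction: float = 0.8  # warn at 80% of the limit (an assumption)
    used: int = 0

    def record(self, tokens: int) -> str:
        """Add one step's usage and return 'ok', 'warn' or 'stop'."""
        self.used += tokens
        if self.used >= self.limit:
            return "stop"
        if self.used >= self.limit * self.warn_fraction:
            return "warn"
        return "ok"


budget = TokenBudget(limit=500_000)
# Hypothetical token usage for four consecutive agent steps.
statuses = [budget.record(step) for step in (150_000, 150_000, 120_000, 100_000)]
print(statuses)  # ['ok', 'ok', 'warn', 'stop']
```

The point of the warning tier is that someone hears about the spend before the run is cut off, not after.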
Tokens are the unit of work for AI, the same way kilowatt-hours are the unit for electricity. Most businesses spent the first few years of AI adoption with the meter covered up. It's now visible, and the reading matters. For insurance brokerages running document-heavy workflows, understanding roughly how many tokens your common tasks consume - and which models are appropriate for which jobs - is no longer just a question for your IT team. It's a budgeting question.
For a full comparison of which AI platforms suit insurance brokers and what each costs, see our companion piece: Which AI is right for your brokerage?