What is a token - and why does it matter to your brokerage's AI budget?

You keep seeing the word "tokens" whenever anyone talks about AI costs. Here's what they actually are.

By Matthew Sellers

A token is roughly three-quarters of a word. The sentence "the policy excludes flood damage" contains five words and approximately seven tokens. Every time you send a message to an AI - and every time it responds - the platform counts up the tokens on both sides and charges you accordingly.

That's it. Tokens are units of text, and AI is billed by the unit.
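For a rough feel for the arithmetic, the three-quarters rule can be turned into a quick estimator. This is a minimal Python sketch, not how any provider actually counts: real tokenizers split text differently from model to model, and the function names and the 600-word example are illustrative assumptions.

```python
import math

# Rule of thumb from the article: one token is roughly 0.75 of a word.
# Real tokenizers vary by model, so treat the result as an estimate only.
WORDS_PER_TOKEN = 0.75


def tokens_from_words(word_count: int) -> int:
    """Estimate tokens from a word count, rounding up."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def estimate_tokens(text: str) -> int:
    """Estimate tokens for a text by counting whitespace-separated words."""
    return tokens_from_words(len(text.split()))


# Hypothetical example: a 600-word document comes out at about 800 tokens.
print(tokens_from_words(600))  # 800
```

For anything that feeds a budget, the token counts reported by the platform itself are the figures to trust; an estimator like this is only for back-of-envelope planning.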

The reason it matters is that for most of the past three years, businesses didn't pay per token - they paid a flat monthly subscription. A broker could run an entire policy wording through ChatGPT a hundred times and pay the same $20 a month as someone who used it twice. That model is ending. Anthropic, OpenAI and others are now moving enterprise customers to usage-based billing, which means the token count is no longer academic. It's a line on someone's budget.

A practical conversion

To get a feel for the scale involved, here are some rough token counts for tasks a broker would recognise.

| Task | Approx. tokens |
| --- | --- |
| A short client email (in + out) | 300 - 500 |
| Summarising a one-page endorsement | 800 - 1,500 |
| Drafting a renewal cover letter | 1,000 - 2,000 |
| Analysing a 10-page policy section | 800 - 1,500 |
| Full commercial property policy wording (input only) | 50,000 - 150,000 |
| Complete renewal pack: policy + submission + prior year | 200,000 - 400,000 |
| AI agent running autonomously for one hour | 500,000 - 1,000,000+ |


These are indicative estimates. Actual consumption varies by model, prompt length and output complexity.
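To see how these per-task figures add up over a month, here is a small Python sketch. The token ranges are taken from the table; the monthly task counts are invented for illustration and should be replaced with your own.

```python
# Token ranges (low, high) taken from the table above.
TASK_TOKENS = {
    "short_client_email": (300, 500),
    "renewal_cover_letter": (1_000, 2_000),
    "complete_renewal_pack": (200_000, 400_000),
}

# Hypothetical monthly task counts for one team (assumption, not data).
MONTHLY_TASKS = {
    "short_client_email": 400,
    "renewal_cover_letter": 40,
    "complete_renewal_pack": 10,
}


def monthly_token_range(task_tokens, monthly_tasks):
    """Return (low, high) total tokens for a month's mix of tasks."""
    low = sum(task_tokens[name][0] * count for name, count in monthly_tasks.items())
    high = sum(task_tokens[name][1] * count for name, count in monthly_tasks.items())
    return low, high


low, high = monthly_token_range(TASK_TOKENS, MONTHLY_TASKS)
print(f"{low:,} to {high:,} tokens per month")
# 2,160,000 to 4,280,000 tokens per month
```

On these made-up counts, the ten renewal packs account for more than 90% of the total, even though the team sends forty times as many emails. Document-heavy work dominates the meter.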

The last row is the one that caught companies like Uber off guard. When AI agents - software that can work through a task sequence autonomously without a human prompting each step - started running in the background, the token meter kept ticking whether anyone was watching or not. Uber burned through its entire 2026 AI budget by April. Workato saw its bill jump sevenfold the day its provider switched to usage-based pricing.

What this costs in practice

Token prices vary significantly by model and tier. As a rough guide based on current API pricing:

Budget models (older or smaller): around $0.50 - $2 per million tokens

Mid-range models (GPT-4 class, Claude Sonnet): around $3 - $10 per million tokens

Frontier models (Claude Opus, GPT-5.5): around $15 - $30 per million tokens

Running a complete renewal pack analysis through a frontier model - say 300,000 tokens in total - would cost roughly $4.50 to $9 per analysis at current API rates. Do that fifty times a month and you're spending $225 to $450 on that task alone, before anything else.
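The arithmetic behind those figures is simple enough to check yourself. The Python sketch below uses the article's rough frontier price range and treats it as a single blended rate, which is a simplification: providers typically price input and output tokens separately. The function and constant names are illustrative.

```python
def cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of a given number of tokens at a price quoted per million tokens."""
    return tokens * price_per_million_usd / 1_000_000


# Assumptions taken from the article's rough guide, not from any price list.
FRONTIER_PRICE_RANGE = (15.0, 30.0)  # USD per million tokens
PACK_TOKENS = 300_000                # one renewal pack analysis
ANALYSES_PER_MONTH = 50

low, high = (cost_usd(PACK_TOKENS, price) for price in FRONTIER_PRICE_RANGE)
print(f"Per analysis: ${low:.2f} to ${high:.2f}")
print(f"Per month: ${low * ANALYSES_PER_MONTH:.2f} to ${high * ANALYSES_PER_MONTH:.2f}")
# Per analysis: $4.50 to $9.00
# Per month: $225.00 to $450.00
```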

Under a flat $20 subscription, the same work costs nothing extra. That is why the shift to usage-based pricing landed so hard, so fast.

For brokerages on subscription plans rather than direct API access, the impact shows up differently - as usage caps, throttled access during peak hours, or prompts to upgrade to a more expensive tier. The underlying dynamic is the same: heavy document work burns through a lot of tokens, and the platforms are no longer absorbing that cost quietly.

The context window question

The context window is how much text a model can hold in its working memory at once. It is measured in tokens.

This matters for insurance work more than most. If you're asking a model to compare two policy wordings, identify gaps in a submission, or summarise a claims history alongside a renewal proposal - all at the same time - everything you're working with needs to fit inside the context window simultaneously. Material that doesn't fit gets dropped, and the model works with an incomplete picture without necessarily flagging that it's doing so.

Both Claude Opus 4.8 and GPT-5.5 now offer one-million-token context windows via their APIs - enough for most complex commercial broking tasks. A standard commercial property policy wording runs to roughly 50,000 to 150,000 tokens; a complete renewal pack including submission, prior year policy and claims schedule might reach 200,000 to 400,000. The context window determines whether you can load all of that at once - or whether you need to break it into chunks and lose the whole-picture view that makes AI genuinely useful for complex placements.
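The fit-or-chunk decision can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the document sizes, the 8,000 tokens held back for the model's answer, and the smaller 200,000-token window used for comparison. The batching is a simple in-order grouping, not a recommendation for how to split real documents.

```python
def fits_in_context(doc_tokens, window, reserve_for_output=0):
    """True if all documents plus room for the reply fit in one context window."""
    return sum(doc_tokens) + reserve_for_output <= window


def split_into_batches(doc_tokens, window, reserve_for_output=0):
    """Group documents, in order, into batches that each fit the window."""
    budget = window - reserve_for_output
    batches, current, used = [], [], 0
    for tokens in doc_tokens:
        if tokens > budget:
            raise ValueError("a single document exceeds the context window")
        if used + tokens > budget:
            # Current batch is full: close it and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(tokens)
        used += tokens
    if current:
        batches.append(current)
    return batches


# Hypothetical renewal pack: policy wording, prior year policy,
# submission, claims schedule (400,000 tokens in total).
pack = [150_000, 120_000, 100_000, 30_000]

print(fits_in_context(pack, window=1_000_000, reserve_for_output=8_000))  # True
print(fits_in_context(pack, window=200_000, reserve_for_output=8_000))    # False
print(split_into_batches(pack, window=200_000, reserve_for_output=8_000))
# [[150000], [120000], [100000, 30000]]
```

In the smaller window the pack ends up in three separate batches, so the model never sees the policy wording and the submission side by side. That is the whole-picture view being lost.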

The catch is that filling a larger context window costs more, because every token you load is billed. At a flat per-token rate, loading 400,000 tokens into Claude Opus costs about 40 times as much per query as loading 10,000. For routine tasks - drafting a standard letter, answering a quick coverage question - using a large frontier model is like hiring a barrister to post a letter. A smaller, cheaper model will do fine. Knowing which tool to reach for is becoming a real operational skill.
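One way to make "which tool to reach for" concrete is a simple routing rule. The Python sketch below is hypothetical throughout: the tier names and upper-bound prices come from the rough guide earlier in this article, while the 20,000-token threshold and the multi-document flag are invented rules of thumb, not anything a provider publishes.

```python
# Upper-bound prices per million tokens from the article's rough guide.
TIER_PRICE_USD = {"budget": 2.0, "mid-range": 10.0, "frontier": 30.0}


def pick_tier(total_tokens: int, multi_document: bool) -> str:
    """Pick a model tier for a task (illustrative thresholds only)."""
    if multi_document:
        return "frontier"   # complex cross-document analysis
    if total_tokens > 20_000:
        return "mid-range"  # long single documents
    return "budget"         # routine letters and quick questions


def task_cost_usd(total_tokens: int, tier: str) -> float:
    """Cost of a task at the chosen tier's per-million-token price."""
    return total_tokens * TIER_PRICE_USD[tier] / 1_000_000


# A routine 1,500-token letter versus a 300,000-token renewal pack comparison.
for tokens, multi in [(1_500, False), (300_000, True)]:
    tier = pick_tier(tokens, multi)
    print(f"{tokens:,} tokens -> {tier}: ${task_cost_usd(tokens, tier):.4f}")
```

The point of the sketch is the shape of the decision rather than the numbers: a written rule, however crude, stops every task defaulting to the most expensive model.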

What a sensible usage policy looks like

The brokerages managing AI costs well in 2026 aren't restricting access - they're setting expectations. A few practical principles that come up consistently:

Match the model to the task. Routine drafting and quick summaries don't need a frontier model. Save the expensive models for complex multi-document analysis, high-stakes client correspondence and anything where accuracy is non-negotiable.

Set per-user spend awareness, not just caps. Hard caps create frustration. Helping staff understand roughly what different tasks cost - so they make informed choices rather than just hitting a wall - works better and produces fewer complaints.

Watch agent usage closely. Autonomous agents that run in the background can consume tokens at a rate that individual human users never would. Build monitoring into any agentic workflow before it goes live, not after the bill arrives.

Keep client data out of public interfaces regardless of cost. The token bill is the manageable risk. A data breach or a regulatory question about where client information went is not.
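The agent-monitoring principle above can be sketched as a running token budget that warns before the limit and stops at it. This is a minimal Python illustration under stated assumptions: the class name, the 80% warning level and the step sizes are all hypothetical, and in a real workflow the token counts would come from the usage figures your provider reports rather than being typed in.

```python
class TokenBudget:
    """Track tokens used against a limit and signal when to warn or stop."""

    def __init__(self, limit_tokens: int, warn_fraction: float = 0.8):
        self.limit_tokens = limit_tokens
        self.warn_fraction = warn_fraction
        self.used_tokens = 0

    def record(self, tokens: int) -> str:
        """Add tokens from one agent step and return 'ok', 'warn' or 'stop'."""
        self.used_tokens += tokens
        if self.used_tokens >= self.limit_tokens:
            return "stop"
        if self.used_tokens >= self.warn_fraction * self.limit_tokens:
            return "warn"
        return "ok"


# Hypothetical agent run against a 1,000,000-token budget.
budget = TokenBudget(limit_tokens=1_000_000)
for step_tokens in (500_000, 350_000, 200_000):
    status = budget.record(step_tokens)
    print(f"used {budget.used_tokens:,}: {status}")
    if status == "stop":
        break  # halt the agent instead of letting the meter keep running
# used 500,000: ok
# used 850,000: warn
# used 1,050,000: stop
```

The "warn" state is where the awareness-over-caps idea fits: someone gets a message while there is still budget left to make a choice.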

The bottom line

Tokens are the unit of work for AI, the same way kilowatt-hours are the unit for electricity. Most businesses spent the first few years of AI adoption with the meter covered up. It's now visible, and the reading matters. For insurance brokerages running document-heavy workflows, understanding roughly how many tokens your common tasks consume - and which models are appropriate for which jobs - is no longer just a question for your IT team. It's a budgeting question.

For a full comparison of which AI platforms suit insurance brokers and what each costs, see our companion piece: Which AI is right for your brokerage?
