Playbook

AI vendor selection for insurance agencies.

A practitioner-grade framework for evaluating AI tools across underwriting, distribution, and operations. Built for agency owners, COOs, and consulting leads who do not have time to fall for the next demo pitch.

Why vendor selection is the highest-stakes AI decision.

A bad AI vendor decision in an insurance agency costs $25,000 to $80,000 in direct implementation cost, plus 6 to 12 months of internal disruption. The direct cost is the licensing fee and the integration hours. The disruption cost is the meetings, the workflow rewrites, the producer pushback, and the credibility hit when the rollout stalls.

If the agency makes two bad calls in the same year, the compounding effect is not linear. Producers and underwriters get vendor-fatigued. The next tool gets installed against active skepticism, and skepticism in change management is more expensive than the first failure was.

The market does not filter for the agency yet. The AI insurance vendor category is two years old. Reviews are thin. Reputation signals are unreliable because most vendors have fewer than 50 named customers. The agency is the filter. The framework below is how to be a good filter.

The five vendor archetypes.

Every AI vendor pitching insurance agencies in 2026 sits in one of five archetypes. The archetype matters because it predicts where the vendor is strong, where it is weak, and which risks dominate.

A1. Foundation-model wrappers

ChatGPT, Claude, or Gemini on rails with insurance-specific prompts and a thin UI. Often built by small teams. Fast to ship, easy to swap, but offers limited moat. Best for prospecting and document generation. Weakest on anything that requires structured data integration.

A2. AMS-native AI

Vertafore, Applied, AgencyZoom, NowCerts, and other AMS vendors baking AI into the workflows you already use. Lower switching cost, native data access, but innovation pace tied to AMS roadmap. Best when the workflow is already AMS-resident. Weakest when the AMS data model fights the AI use case.

A3. Specialty insurtech point solutions

Purpose-built point solutions for submission intake, claims triage, certificate generation, prospecting, or proposal automation. Deeper than wrappers, narrower than AMS-native. Best when the workflow is high-volume and well-defined. Risk: vendor concentration if they fail or get acquired.

A4. Generic enterprise AI

Microsoft Copilot, Google Workspace AI, Notion AI, and other horizontal tools. Cheapest per seat, broadest capability, but no insurance-specific tuning. Best for back-office productivity, internal knowledge search, and meeting notes. Weakest on anything that requires insurance domain reasoning.

A5. Custom builds

In-house engineering or contracted development against open-source models. Maximum control, maximum cost, maximum carrying overhead. Best for agencies with engineering depth, a specific moat to defend, or a workflow no vendor serves. The worst default: the build-versus-buy answer is almost always buy.

The Build / Buy / Borrow framework.

Before you score vendors, decide which lane the capability belongs in: build it in-house, buy it from a vendor, or borrow it from a general-purpose tool you already license, with a thin internal layer on top.

Apply the framework one capability at a time. Submission intake might be a clear buy. Internal knowledge search might be a borrow on Claude or Copilot. A custom underwriting model might be a build, or might be deferred entirely.
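
As a concreteness check, here is a minimal sketch of the lane decision in Python, using the three build conditions and the borrow condition stated in the FAQ below. The function name and boolean inputs are illustrative, not part of any vendor tooling.

```python
def choose_lane(has_engineering_capacity: bool,
                defends_a_moat: bool,
                acceptable_vendor_exists: bool,
                buy_market_is_mature: bool) -> str:
    """Pick a lane for one capability under the Build / Buy / Borrow framework."""
    # Build only when every condition holds: engineering depth, a moat
    # worth defending, and no acceptable vendor on the market.
    if has_engineering_capacity and defends_a_moat and not acceptable_vendor_exists:
        return "build"
    # Borrow a general-purpose tool when the buy market for this workflow
    # is still immature.
    if not buy_market_is_mature:
        return "borrow"
    # The default answer is buy.
    return "buy"

# Example: submission intake at an agency with no engineering team.
print(choose_lane(False, False, True, True))  # -> "buy"
```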

Twelve questions every vendor should answer.

The questions below are designed to surface gaps in the vendor's own thinking, not to chase compliance theater. Read the answers for substance, not for length.

  1. What foundation model does the product run on, and how do you handle model updates? If the vendor cannot tell you which model class powers their AI, walk. If they handle updates by re-prompting silently, you have a regression risk you do not control.
  2. What training data was used to fine-tune the model, if any? Most insurance AI vendors do not actually fine-tune. They prompt-engineer on a foundation model. That is fine, but the vendor should be honest about it.
  3. How does the product integrate with our AMS? API, webhook, RPA, or copy-paste? Each has different cost and reliability profiles. The vendor should know which AMSs they support natively, which they bridge through, and which they cannot serve.
  4. What is your security posture? SOC 2 Type 2 is the floor in 2026 for any vendor handling PII or producer data. Ask for the report. If they have not started SOC 2, expect to be the security validation customer, with the costs that come with it.
  5. What contractual escape do we have? Month-to-month is ideal at the pilot stage. Annual is acceptable for established vendors with a track record. Multi-year auto-renewals with no opt-out window are a hard no.
  6. How is the product priced? Per seat, per policy, per submission, flat agency tier, or usage-based? Each model creates different incentives for the vendor. Per-policy or per-submission rewards them for your growth, which can be good or expensive.
  7. Show us three named agency references at our size. Not logos. Named agencies, with named contacts, who agreed in writing to take a reference call. If the vendor cannot provide three, you are the proof point. Charge them for being one.
  8. What is the worst failure mode of the product, and how is it surfaced? Every AI system fails. The question is whether the failure is visible to the user, silently logged, or hidden. The best vendors will name their failure modes specifically.
  9. What audit trail does the product produce? For E&O purposes, the agency needs to reconstruct what the AI did and why. The vendor should produce per-action logs with timestamps, model version, prompt or input, output, and any user override (a minimal record sketch follows this list).
  10. What is your model update cadence? Daily, weekly, monthly, or quarterly? The vendor should know. If they say "we update continuously," ask whether updates are tested against a regression suite. If they say "we update when the foundation model changes," ask whether you are notified.
  11. What is your E&O coverage and indemnification posture? Does the vendor carry their own E&O? Will they indemnify the agency for product failures? Vague language here usually means no.
  12. If you go out of business or get acquired, what happens to our data and our integration? Data export terms in writing. API SLA in writing. Acquisition language in writing. This is where most vendor contracts are weakest, and it is the highest-leverage clause to negotiate.
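
For question 9, a minimal sketch of what a per-action audit record could look like, in Python. The class name and field values are hypothetical, and a real system would persist these records rather than hold them in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class AIActionRecord:
    """One per-action audit entry, matching the fields named in question 9."""
    timestamp: datetime          # when the action ran
    model_version: str           # vendor-reported model identifier
    input_text: str              # prompt or structured input sent to the model
    output_text: str             # what the model returned
    user_override: Optional[str] = None  # set when a human corrected the output

# Hypothetical example record.
record = AIActionRecord(
    timestamp=datetime.now(timezone.utc),
    model_version="vendor-model-2026-05",
    input_text="Summarize this GL submission",
    output_text="GL submission, $2M/$4M limits, restaurant class",
)
```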

Score each vendor on the same 12 questions. Differences in answers are where the decision lives. Sameness in answers means the vendor has a polished sales playbook, not a better product.
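
One way to keep scoring comparable across vendors is a fixed numeric rubric. A minimal sketch, assuming a 0-to-2 scale per question (0 = no answer, 1 = vague, 2 = specific and verifiable); the scale is an assumption for illustration, not part of the framework above.

```python
QUESTIONS = range(1, 13)  # the twelve questions above

def score_vendor(answers: dict[int, int]) -> int:
    """Sum a vendor's 0-2 scores across all twelve questions.

    Unanswered questions score 0, which is the point: a vendor that
    cannot answer should lose ground against one that can.
    """
    return sum(answers.get(q, 0) for q in QUESTIONS)

# Illustrative answers only: strong on provenance and security,
# silent on references and exit terms.
vendor_a = {1: 2, 2: 2, 3: 1, 4: 2, 9: 1}
print(score_vendor(vendor_a))  # -> 8 of a possible 24
```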

Five red flags.

Any single flag is a signal for more diligence. Two or more is usually disqualifying.

  1. Demo-only references, with no named agencies willing to take a call.
  2. Locked-in contracts longer than 12 months with no escape clause.
  3. Pricing that starts at "let's discuss."
  4. No published security or compliance attestation.
  5. A sales team that cannot answer technical questions about model behavior.

Pilot scoping that tells you something.

A pilot exists to surface failure modes that the demo cannot show. Most insurance agency AI pilots fail not because the product is bad but because the pilot was scoped to confirm rather than to test.

A pilot worth running has all of the following: a fixed 30-day window, a readout schedule, a week-4 decision document that forces a scale-or-kill call, success metrics measured net of change-management overhead, and a designated skeptic tasked with surfacing failure modes.

Most pilots fail by drifting. A 30-day pilot with no readout schedule becomes a 90-day pilot with no decision, then a 180-day pilot the agency cannot kill without admitting it should have killed at day 30.
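
A readout schedule is cheap to generate and hard to ignore once it is on calendars. A minimal sketch assuming weekly readouts inside the 30-day window, with the week-4 decision document as the forcing function; the weekly cadence is an assumption, not a requirement of the framework.

```python
from datetime import date, timedelta

def pilot_schedule(start: date) -> dict[str, date]:
    """Readout dates for a 30-day pilot; week 4 forces the scale/kill call."""
    return {
        "week-1 readout": start + timedelta(days=7),
        "week-2 readout": start + timedelta(days=14),
        "week-3 readout": start + timedelta(days=21),
        "week-4 decision document": start + timedelta(days=28),
        "pilot end": start + timedelta(days=30),
    }

for name, when in pilot_schedule(date(2026, 6, 1)).items():
    print(f"{name}: {when}")
```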

AMS and carrier integration.

Four integration patterns matter for insurance AI, each with a different cost, debuggability, and reliability profile: API (cleanest, when the AMS exposes one), webhook (event-driven, harder to debug), RPA bridges (for when neither API nor webhook exists, with terms-of-service caveats), and copy-paste (the fallback for month one of a pilot).
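
Of the four patterns, webhooks are a common middle ground, and the debugging burden lands on the receiving side. A minimal Flask sketch of a receiver, assuming a hypothetical AMS endpoint and payload shape; durable logging of every delivery is the point, not the routing.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Hypothetical AMS webhook endpoint; the route and payload fields
# are illustrative, not any real AMS's API.
@app.route("/ams/policy-events", methods=["POST"])
def handle_policy_event():
    event = request.get_json(force=True)
    # Record every delivery first: event-driven integrations are harder
    # to debug, so a durable log is the first thing worth building.
    print(f"received {event.get('event_type')} for policy {event.get('policy_id')}")
    return jsonify({"status": "accepted"}), 202

if __name__ == "__main__":
    app.run(port=8080)
```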

Carrier-side integration is harder. Most carriers expose limited APIs to agencies for AI use. Wholesale and MGA integration is more open than direct carrier integration in 2026. If your AI vendor promises "seamless carrier integration," ask which carriers, through which APIs, with what SLA. The honest answer is usually "fewer than they implied in the demo."

What's reasonable to pay.

2026 pricing patterns across the five archetypes, drawn from publicly listed and reference-verified ranges.

| Archetype | Typical pricing model | 2026 range (SMB agency) |
| --- | --- | --- |
| Foundation-model wrapper | Per seat or flat agency | $30 to $100 / user / month, or $500 to $2,000 / agency / month |
| AMS-native AI | Add-on to AMS subscription | $10 to $40 / user / month above AMS base |
| Specialty insurtech | Per policy, per submission, or flat tier | $300 to $2,000 / agency / month, or $1 to $10 per submission |
| Generic enterprise AI | Per seat | $20 to $30 / user / month (Microsoft Copilot, Google Workspace AI) |
| Custom build | Project + ongoing | $30,000 to $200,000 build, plus 20% / year ongoing |

Ranges reflect public pricing and reference-verified contract observations across SMB agencies in May 2026. Enterprise and mid-market tiers diverge.

The reasonable test: does the AI save more time per month than its monthly cost, accounting for change management overhead? Time saved should be measured net of the meeting hours, the prompt-tuning, the producer training, and the integration maintenance. Apparent savings before change management is what every vendor sells. Net savings after change management is what matters.
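
The test reduces to one line of arithmetic once the hours are measured honestly. A minimal sketch with illustrative numbers; every figure below is hypothetical.

```python
# Hypothetical monthly numbers for one specialty tool.
hours_saved = 40        # apparent time saved across producers
hourly_cost = 45.0      # loaded hourly cost of the staff doing the work
tool_cost = 1200.0      # monthly subscription (specialty insurtech tier)

# Change-management overhead: meetings, prompt-tuning, producer training,
# integration maintenance -- measured in the same hours.
overhead_hours = 12

net_savings = (hours_saved - overhead_hours) * hourly_cost - tool_cost
print(f"net monthly savings: ${net_savings:,.0f}")  # -> $60: barely passes
```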

Scaling or killing after the pilot.

The week-4 decision document forces a binary at the right moment. The three honest outcomes: scale the tool into production, kill it, or extend once with a re-scoped hypothesis and a hard end date.

Document what you learned regardless of the outcome. The next vendor decision in this category gets faster because you have a baseline. A killed pilot is not a wasted pilot if the post-mortem informs the next call.

For the governance layer that wraps vendor selection (policy, audit trail, incident response), see the AI Governance for Insurance Agencies playbook. For the operator-side view of where claims AI vendors deliver the most ROI, see the AI in Claims Operations playbook.

FAQ

Vendor selection questions.

How do I evaluate an AI vendor for an insurance agency?

Apply the Build / Buy / Borrow framework first. Then use the twelve-question vendor checklist above: model provenance, training data, AMS integration, security posture, contractual escape, pricing model, named references, failure modes, audit trail, model update cadence, E&O coverage, and exit terms. Score each vendor on the same rubric. Pilot only after the top two clear the rubric.

What are the red flags for an insurance AI vendor?

The five most common: demo-only references with no named agencies, locked-in contracts longer than 12 months with no escape clause, pricing on "let's discuss," no published security or compliance attestation, and a sales team that cannot answer technical questions about model behavior.

How long should an AI vendor pilot run?

30 days is the right size for most insurance AI pilots. Shorter does not give the workflow enough cycles to surface real failure modes. Longer lets sunk cost set in.

What is a fair price for AI tools in an insurance agency?

Per-seat productivity AI runs $30 to $100 per user per month in 2026. Specialty insurtech runs $300 to $2,000 per month per agency for SMB tiers. The honest test: does the AI save more time per month than its monthly cost, net of change management overhead?

Should an agency build AI in-house or buy from a vendor?

For 95% of insurance agencies, buy. Build requires engineering capacity, a competitive moat, and no acceptable vendor existing. All three conditions. Borrow (open-source models with a thin internal layer) is the right middle path when the buy market is immature for a specific workflow.

Which AI vendor categories are most mature for insurance in 2026?

Submission triage and intake automation are the most mature. Claims triage and document extraction are close behind. AI for prospecting and proposal generation is mature in the producer tier. Underwriting AI is more fragmented and varies by line of business.

What integration patterns work for AI in an agency AMS?

Four common patterns: API (cleanest when the AMS exposes one), webhook (event-driven, harder to debug), RPA bridges (where neither API nor webhook exists, with TOS caveats), and copy-paste (fallback for month one of a pilot).

Who should make the AI vendor decision in an agency?

The agency owner or COO owns the decision, with input from the line-of-business leader closest to the workflow being automated and a designated skeptic tasked with surfacing failure modes. Two named decision-makers with one designated tiebreaker is the right size.

Where this framework lives in CAIC

This is Module 3 and Module 6 in the curriculum.

The framework above is a compressed version of the vendor evaluation methodology inside the Certified AI Insurance Credential (CAIC). The full curriculum covers Build / Buy / Borrow in Module 3, the live vendor landscape in Module 6 (refreshed quarterly), and the E&O exposure layer in Module 9. Get Module 1 free below.