Loupe — What AI Actually Costs, and What It Saves

01 The Pricing Principle

We don't sell you tokens. We sell you the harness.

Every other "AI" vendor in this space buys tokens wholesale and resells them to you at a markup. That model quietly punishes you for using the product, and it puts the vendor's incentive in direct conflict with yours: they profit when your consumption rises.

Loupe takes the opposite position. We pass model tokens through at cost — zero markup. We charge only for the harness: the orchestration, governance, memory, and tooling that turn a raw language model into a jeweler that can actually quote, cost, transfer stock, and close the books correctly.

Tokens — pass-through, at cost

The fee charged by the underlying AI model. A raw commodity, bought at the same published rate available to anyone. We add nothing to it.

Harness — our value, flatly priced

Billed at $0.0025 per tool run. A workflow that touches one tool costs one unit; a workflow that chains seventy-five tool runs costs seventy-five units. You pay for work performed, nothing more.
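
As a rough sketch of that arithmetic, assuming the harness fee is simply the tool-run count times $0.0025 and the token cost passes through unchanged. The function and parameter names are illustrative and not part of any Loupe API; the two token figures are taken from the indicative price list (WF-001 and WF-076).

```python
from decimal import Decimal

HARNESS_FEE_PER_TOOL_RUN = Decimal("0.0025")  # USD per tool run, as stated above


def workflow_cost(token_cost_usd: str, tool_runs: int) -> Decimal:
    """Total for one workflow run: pass-through tokens plus the flat harness fee."""
    if tool_runs < 0:
        raise ValueError("tool_runs must be non-negative")
    return Decimal(token_cost_usd) + HARNESS_FEE_PER_TOOL_RUN * tool_runs


# A single-tool lookup versus a seventy-five-run build.
single = workflow_cost("0.011", 1)
chain = workflow_cost("1.076", 75)
```

Decimal is used here only so that sub-cent amounts add up exactly; a float would serve for rough budgeting.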

Why this is the honest position — and the durable one. Because we don't profit from token volume, our roadmap is free to do the one thing a markup vendor never will: relentlessly drive your token cost down. The harness is our edge, so making the tokens cheap is in our interest as much as yours.

Pricing & ROI Briefing · 01

02 What You're Paying For

The model is the smallest part. The harness does the work.

A language model on its own cannot run a jewelry business. It cannot see your inventory, respect an approval gate, remember a session, or be trusted with a price. Seven systems around it — the harness — are what make it safe, accurate, and useful. This is what the $0.0025 buys.

LLM

The smallest part

Cost & Workflow Optimization

How we choose what runs where

Deterministic vs non-deterministic
Model selection — frontier to small
Skills vs memory encoding
Token & latency budgets

Context & Memory

What the model sees

Working memory
Episodic & long-term stores
Retrieval & RAG pipelines
Prompt assembly

Tools & Action

How the model acts

Tool registry
Dispatch & argument validation
External & ERP APIs
Permissions & approval gates

Orchestration & Loop

How the model reasons

Think → act → observe loop
Planning & decomposition
Sub-agents & delegation
Retries, budgets, stop rules

State & Persistence

What the model remembers

File system & workspace
Checkpoint & resume
Session & thread persistence
Artifacts & snapshots

Sandbox & Compute

Where the model runs

Isolated workspace
Controlled package & data access
Network egress controls
Credentials outside the model

Observability & Governance

How you trust the output

Tracing & structured logs
Evals & regression suites
Guardrails & policy
Human-in-the-loop
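
To make the division of labour concrete, here is a deliberately tiny sketch of how a harness might wrap a model's tool calls: a registry, argument validation, an approval gate on writes, and a tool-run counter that would drive the harness fee. Every name below (`Tool`, `Harness`, `find_quote`, `delete_quote`) is hypothetical and chosen for illustration, and the choice not to count a rejected call as a run is an assumption; this is not Loupe's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    func: Callable[..., Any]
    required_args: tuple
    needs_approval: bool = False  # writes are approval-gated, reads are not


@dataclass
class Harness:
    approve: Callable[[str, dict], bool]  # human-in-the-loop callback
    registry: dict = field(default_factory=dict)
    tool_runs: int = 0  # metered count of executed tool runs

    def register(self, tool: Tool) -> None:
        self.registry[tool.name] = tool

    def dispatch(self, name: str, args: dict) -> Any:
        tool = self.registry[name]  # an unknown tool raises KeyError
        missing = [a for a in tool.required_args if a not in args]
        if missing:
            raise ValueError(f"missing arguments: {missing}")
        if tool.needs_approval and not self.approve(name, args):
            return {"status": "rejected"}  # assumption: a refused call is not a run
        self.tool_runs += 1
        return tool.func(**args)


harness = Harness(approve=lambda name, args: False)  # this reviewer rejects every write
harness.register(Tool("find_quote", lambda number: {"quote": number}, ("number",)))
harness.register(Tool("delete_quote", lambda number: {"deleted": number}, ("number",), True))

read_result = harness.dispatch("find_quote", {"number": "Q-1"})
write_result = harness.dispatch("delete_quote", {"number": "Q-1"})
```

The model never calls a tool directly in this sketch: it can only ask the harness to dispatch, which is where validation, approval and metering live.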

03 Indicative Price List

Cost per workflow run, in plain numbers.

Eighteen representative workflows, priced end-to-end. Token is the pass-through model cost; harness is $0.0025 per tool run; total is what one full run costs. The complexity tag reflects how much work each workflow chains — that is what drives cost.

18 Workflows priced

$2.652 Total token cost

$0.5375 Total harness cost

$3.1895 Total cost (one run each)

WF-001 · Find a quote by number or customer · LOW · SALES / QUOTES & ORDERS

Look up one or a handful of quotes by quote number, customer name, status, or free text and report their parties, status, and item count. Pure read, no writes.

Token $0.011 · Harness $0.0025 · Total $0.014

WF-002 · Quote expiry chase list · LOW · SALES / QUOTES & ORDERS

Sweep for pending quotes whose expiration date is approaching or has passed and surface a chase list to the salesperson so they can follow up before the offer lapses. Read-only.

Token $0.013 · Harness $0.0025 · Total $0.016

WF-004 · Add catalogue variants to a draft quote in bulk · LOW · SALES / QUOTES & ORDERS

Add several catalogue variants as line items to an existing draft quote, letting the server build each self-contained variantSnapshot at the chosen config. Each add is an agentSafe approval-gated write.

Token $0.087 · Harness $0.022 · Total $0.110

WF-006 · Apply a discount to a quote line · LOW · SALES / QUOTES & ORDERS

Edit a single quote item to set a discount or override the committed unit price on its valuation, then re-read the item to confirm the new discounted price. Editing items is approval-gated.

Token $0.054 · Harness $0.015 · Total $0.069

WF-007 · Remove a line or delete a draft quote · LOW · SALES / QUOTES & ORDERS

Remove one or more line items from a draft quote, or delete a draft quote entirely (drafts only; there is no archive for quotes). Both are approval-gated writes.

Token $0.048 · Harness $0.010 · Total $0.058

WF-008 · Finalize-readiness check before sending a quote · MEDIUM · SALES / QUOTES & ORDERS

Before finalizing, verify the draft quote has a customer set, at least one item, and that every item carries a variantSnapshot; report any gaps so the supplier can fix them. Read-only diagnostic that ends at the optional finalize approval card.

Token $0.027 · Harness $0.0075 · Total $0.035

WF-011 · Finalize a quote, then withdraw it to edit again · MEDIUM · SALES / QUOTES & ORDERS

Move a complete draft quote to pending (finalize, supplier-side lock) and, if the customer asks for a change, withdraw it back to draft so the items become editable again. Both are approval-gated supplier actions.

Token $0.062 · Harness $0.028 · Total $0.089

WF-013 · Convert complex selection to draft quote · HIGH · SALES / QUOTES & ORDERS

AI vision interprets the intent and objectives behind a hand-written selection and turns it into a draft quote, including complex modifications, new-variant creation, and high-fidelity image generation. Complex orchestration and dependency management.

Token $0.600 · Harness $0.1225 · Total $0.7225

WF-065 · Post a Found / Loss Stock Adjustment · MEDIUM · INVENTORY / WAREHOUSE

Adjust fungible stock up (found / opening) or down (loss / shrinkage) for a single variant at a location, posting a balanced adjustment against the tenant variance account. Single approval-gated write.

Token $0.040 · Harness $0.020 · Total $0.060

WF-066 · Instant Transfer of Free Stock Between Locations · MEDIUM · INVENTORY / WAREHOUSE

Move free (unreserved) stock instantly between two locations, posting from → to immediately as a 'received' transfer. Only free stock moves; cost-neutral. Single approval-gated write.

Token $0.040 · Harness $0.018 · Total $0.057

WF-067 · Ship a Transfer to In-Transit with Carrier & Tracking · MEDIUM · INVENTORY / WAREHOUSE

Dispatch stock that physically travels between sites: post a shipped transfer to the tenant In-Transit location with carrier and tracking, so goods show as gone on dispatch and are received later. Single approval-gated write.

Token $0.035 · Harness $0.022 · Total $0.057

WF-076 · New-Warehouse Onboarding (Build Tree + Opening Balances) · HIGH · INVENTORY / WAREHOUSE

Stand up a new site → warehouse → bin location tree (plus a QC hold dock with promisable:false), then load opening balances via manual stock-in and verify with a subtree rollup. Multi-step, several approval-gated writes.

Token $1.076 · Harness $0.188 · Total $1.263

WF-079 · Export an Inventory Valuation Snapshot to File · MEDIUM · INVENTORY / WAREHOUSE

Capture a point-in-time inventory carrying-value readout and hand the user a downloadable file by kicking off an export job (the only outward data path; no accounting connector exists). Read plus an approval-gated export-job write and a download.

Token $0.020 · Harness $0.010 · Total $0.030

WF-106 · What-if recost a variant at today's metal spot · LOW · MASTER DATA / PRODUCTS & VARIANTS

Re-price one existing variant config at today's (or a user-supplied) gold/platinum spot price without swapping any materials and without persisting anything: a pure read-only what-if.

Token $0.154 · Harness $0.025 · Total $0.179

WF-107 · What-if cost a variant in a different metal (no create) · LOW · MASTER DATA / PRODUCTS & VARIANTS

Cost a hypothetical version of an existing variant with a material swapped per slot (e.g. the band in 14k rose gold instead of yellow) at today's spot, returning full valuations without persisting anything.

Token $0.188 · Harness $0.022 · Total $0.211

WF-108 · Find all variants of a product in a given metal · LOW · MASTER DATA / PRODUCTS & VARIANTS

Search the catalog for every variant under a product (or matching a fuzzy term) and filter to those built in a specific metal/karat, returning a concise list with status.

Token $0.160 · Harness $0.0050 · Total $0.165

WF-109 · Look up a material/process cost in a grid · LOW · MASTER DATA / PRODUCTS & VARIANTS

Look up the per-unit cost of one material (e.g. a diamond by carat range) or a process in a chosen pricing grid, range/property-matched, and report the matched cost and unit.

Token $0.026 · Harness $0.0075 · Total $0.033

WF-110 · Today's metal spot price in working units · LOW · MASTER DATA / PRODUCTS & VARIANTS

Report the latest market spot price for a pure metal and translate it into a jewelry working figure (e.g. USD per dwt for a given karat) for quick costing.

Token $0.0098 · Harness $0.010 · Total $0.020
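
The headline totals can be checked against the rows above with a few lines of arithmetic. The sketch below uses the token and harness figures exactly as printed. The harness column sums to $0.5375; the printed token figures sum to about $2.651 against the $2.652 headline, a gap that is presumably display rounding in the individual rows. The implied tool-run counts rest on the assumption that each harness figure is a whole number of runs at $0.0025, rounded for display.

```python
# (workflow id, token USD, harness USD) as printed in the price list above.
ROWS = [
    ("WF-001", 0.011, 0.0025), ("WF-002", 0.013, 0.0025),
    ("WF-004", 0.087, 0.022), ("WF-006", 0.054, 0.015),
    ("WF-007", 0.048, 0.010), ("WF-008", 0.027, 0.0075),
    ("WF-011", 0.062, 0.028), ("WF-013", 0.600, 0.1225),
    ("WF-065", 0.040, 0.020), ("WF-066", 0.040, 0.018),
    ("WF-067", 0.035, 0.022), ("WF-076", 1.076, 0.188),
    ("WF-079", 0.020, 0.010), ("WF-106", 0.154, 0.025),
    ("WF-107", 0.188, 0.022), ("WF-108", 0.160, 0.0050),
    ("WF-109", 0.026, 0.0075), ("WF-110", 0.0098, 0.010),
]

token_total = sum(token for _, token, _ in ROWS)
harness_total = sum(harness for _, _, harness in ROWS)
grand_total = token_total + harness_total

# Implied tool runs per workflow, assuming harness = runs x $0.0025.
# Printed harness figures are rounded, so round to the nearest whole run.
implied_runs = {wf: round(harness / 0.0025) for wf, _, harness in ROWS}
```

Under that assumption the eighteen workflows together imply 215 tool runs, with new-warehouse onboarding (WF-076) accounting for 75 of them.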

04 How To Read The Numbers

These are estimates. Here's what moves them.

The price list is an honest indication, not a fixed tariff. Three variables move the real figure, and you should understand each before you model your spend.

1 · Run-to-run variance

The same workflow varies slightly each run — model responses differ in length and a tool may fire more than once. On a typical simple workflow of around $0.08, we observe a standard deviation of roughly ±$0.02. Budget the mean; expect the spread.

2 · Complexity, not category

The ±$0.02 figure describes the simple band. It does not describe a 75-run build like new-warehouse onboarding, which is an order of magnitude larger and far less frequent. Read each band on its own terms.

3 · Model selection

Token cost depends on which model the workflow requires. As frontier prices fall and our broker routes more aggressively, the token column trends down over time — the harness column stays flat.

Currency

All figures are in USD. Token costs are set by model providers in USD; invoiced amounts are converted to your billing currency at the prevailing rate on the date of quote.

The honest summary: a simple workflow runs a few cents, give or take two; a standard workflow a dime to a quarter; the occasional heavy build a dollar or so. Tokens drift down; the harness is fixed and predictable.
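
For budgeting, one simple way to use the mean-and-spread figures is sketched below. It assumes runs are statistically independent and share the quoted mean ($0.08) and standard deviation ($0.02) of a typical simple workflow. The independence assumption, the normal approximation, and the function name are ours for illustration, so treat the band as a planning aid rather than a guarantee.

```python
import math


def monthly_budget(runs: int, mean_usd: float = 0.08, sd_usd: float = 0.02,
                   z: float = 2.0) -> tuple:
    """Return (expected, low, high) spend in USD for `runs` independent runs.

    The standard deviation of the total grows with sqrt(runs), so the relative
    spread shrinks as volume rises. z = 2 gives a rough 95% band under a
    normal approximation.
    """
    expected = runs * mean_usd
    spread = z * sd_usd * math.sqrt(runs)
    return expected, expected - spread, expected + spread


expected, low, high = monthly_budget(10_000)
```

The practical reading is that per-run variance matters for a single quote but largely averages out across a month of volume; heavy builds should be budgeted separately, as noted above.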

05 The Comparison That Matters

Measured against the people doing it today.

You already employ skilled staff who do this work by hand — and you keep costs lean by throwing capable people at problems. The fair question isn't "AI vs. nothing." It's "what does one workflow cost as a Loupe run, versus the same task done by a person?" Here is that comparison, honestly framed.

The manual path

5–15 min

of a person's time per task — costing, quoting, reconciling, chasing

A fully-loaded back-office hour, even in a low-cost hub, rarely lands below a few dollars once you count salary, supervision, space, and rework. A single quote or recost consumes a measurable slice of it — plus the error rate that comes with manual data entry.

The Loupe path

2 min · cents

a few cents of token + harness, delivered while the client is still in the room

A standard quote-and-cost workflow runs well under a dollar, every time, with the variant snapshot, valuation, and audit trail captured automatically. No fatigue, no transcription error, no waiting for the analyst to be free.

The point is not to remove your people — it is to redeploy them: payroll should back rainmakers, not paperwork. Every workflow Loupe absorbs is an hour your team spends selling, sourcing, and serving clients instead of fighting a spreadsheet. The cost comparison is decisive; the strategic one is more so.

06 Prove It Against Your Own Numbers

The human-equivalent cost, calculated live.

The comparison above uses round figures. This does the arithmetic properly. It models the fully-loaded annual cost of a mid-level data-entry clerk — salary, statutory months, employer contributions, management overhead, attrition, and the cost of human error — against the same volume of records run through Loupe. Adjust any assumption to match your operation; every number is a default you can move.

Records per clerk / day: 200
Working days / year: 230
Human field-error rate: 1.0% (industry QA benchmark is ~1%; move it to your reality)
Rework minutes / error: 15
Downstream cost / error: $8 (conservative: sits below the 1-10-100 "$10 to correct" tier)
Management overhead: 15%
Annual turnover: 35%
Replacement cost (months): 2

Display currency

Override FX rate (optional · local per USD)

Include cost of errors (applied fairly to both human and AI)

Loupe AI cost: $0.08 / record · residual error 0.5% · no fixed platform fee

Location	Human (fully loaded / yr)	Loupe / yr	Annual saving

How to read this — honestly

Salaries are indicative city-level benchmarks (±20–30%), not firm payroll quotes; India and Italy figures are the noisiest. FX is a snapshot dated 2026-06-18 — use the override box for volatile currencies like TRY and VND.
Error defaults are deliberately conservative (1% rate, $8 downstream cost). We have not inflated them to flatter the result; raise them to your own experience if they understate it.
Currency selection is cosmetic — it relabels the USD total at the chosen rate; it does not re-benchmark local salaries.
The error toggle applies the same per-error cost to both human and AI, so the comparison stays like-for-like.
Loupe's AI cost is set at $0.08 per record — the mid-point of our real per-workflow range, not a best case.
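
A minimal sketch of the Loupe side of the comparison, using the calculator's stated defaults. It assumes Loupe's annual cost is records × $0.08 plus, when the error toggle is on, residual errors priced at the same per-error cost as the human side. The function name and signature are illustrative. The human total ($12,043) and per-error cost (~$8.89) plugged in are the Ho Chi Minh City values from the methodology's worked example.

```python
def loupe_annual_cost(records_per_day: int = 200, working_days: int = 230,
                      cost_per_record: float = 0.08,
                      residual_error_rate: float = 0.005,
                      cost_per_error: float = 0.0) -> float:
    """Annual Loupe cost in USD; cost_per_error=0 models the error toggle off."""
    records = records_per_day * working_days
    run_cost = records * cost_per_record
    error_cost = records * residual_error_rate * cost_per_error
    return run_cost + error_cost


human_hcmc = 12_043.0  # fully loaded clerk, from the worked example
loupe_hcmc = loupe_annual_cost(cost_per_error=8.89)  # error toggle on
saving = human_hcmc - loupe_hcmc
```

Under these assumptions one clerk's volume of 46,000 records a year costs $3,680 in runs, or roughly $5,725 with residual errors included, which is the figure to set against the human total for the same location.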

Skeptical of the human-cost figures? See exactly how they're built.

Methodology

How we calculate a fully-loaded employee

We did not pull "a few dollars an hour" out of the air. Every figure in the calculator above is built from a two-stage, country-specific waterfall — first the true cost of employing the person in their own jurisdiction, then the operational cost of running human data entry on top. Everything resolves to USD, then converts cosmetically to your chosen display currency.

Stage 1

All-in employment cost — what you actually pay for the person

Base annual salary. Mid-level local monthly salary × 12.
Statutory & customary extra-month pay. The country-correct piece, not a flat assumption: Belgium ~1.92 months (13th + double holiday pay), Italy 2.0 (13th + 14th), Indonesia / Vietnam / China / Hong Kong / India ~1.0 (THR / Tet / 13th / bonus + gratuity), Turkey ~1.0 (severance accrual), USA 0. This yields "total cash."
Employer social contributions. Total cash × the national employer rate, capped where the law caps it: for Thailand SSF (฿9,000/yr) and Hong Kong MPF (HK$18,000/yr) the contribution is the lesser of the rate-based amount and the cap; everywhere else it is uncapped.
Convert to USD. (Total cash + contributions) ÷ FX rate.
Other mandatory benefits, in USD. Chiefly the US row — employer health insurance ($7,884, KFF 2025 single coverage at ~85% employer share). Most countries are $0 here because their equivalent is already inside social contributions.
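
The five Stage 1 steps can be sketched as a single function. This is one reading of the waterfall rather than the calculator's actual code, and the parameter names are ours. The first call uses the Ho Chi Minh City inputs quoted in the worked example (₫130.0M base per year, one extra month, 21.5% contributions, 26,000 ₫ per USD); the second uses made-up local-currency figures, with the FX rate set to 1, purely to show the cap taking effect.

```python
from typing import Optional


def employment_cost_usd(monthly_salary: float, extra_months: float,
                        employer_rate: float, fx_local_per_usd: float,
                        contribution_cap: Optional[float] = None,
                        other_benefits_usd: float = 0.0) -> float:
    """Stage 1: all-in annual employment cost in USD."""
    total_cash = monthly_salary * (12 + extra_months)  # base plus extra-month pay
    contributions = total_cash * employer_rate
    if contribution_cap is not None:  # capped jurisdictions pay the lesser amount
        contributions = min(contributions, contribution_cap)
    # Local-currency cost converts at the FX rate; USD benefits are added after.
    return (total_cash + contributions) / fx_local_per_usd + other_benefits_usd


hcmc = employment_cost_usd(130_000_000 / 12, 1.0, 0.215, 26_000)

# Hypothetical capped case, left in local currency (FX = 1) for readability:
# 5% of 360,000 would be 18,000, but the cap limits it to 9,000.
capped = employment_cost_usd(30_000, 0.0, 0.05, 1.0, contribution_cap=9_000)
```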

Stage 2

Operational loadings — the true annual cost

Management & QA overhead. Employment cost × 15% — a supervisor / QA layer per pod of clerks.
Attrition & replacement. Employment cost × (annual turnover × replacement-months ÷ 12) = 35% × 2 ÷ 12 ≈ 5.8%.
Human-error cost (the toggleable layer). Errored records/yr × cost-per-error, where errored records = working days × records/day × error rate, and cost-per-error = (rework minutes × that location's own loaded cost-per-minute) + a flat downstream-consequence amount ($8 by default). The rework half is geo-specific; the consequence half is flat, because a bad record costs roughly the same downstream wherever it was keyed.

True annual cost = employment + management + attrition + error. The per-day, per-hour, and per-minute figures are simply that total divided by the working-time basis (230 days, 8 hours per day).

Worked example · Ho Chi Minh City

Base ₫130.0M → +1 month → ₫140.8M cash → +21.5% contributions (₫30.3M) → ₫171.1M ÷ 26,000 = $6,581 employment cost. Then +15% management ($987) + 5.8% attrition ($384) + error cost ($4,091, from 460 errored records/yr at ~$8.89 each) = $12,043 true annual cost — i.e. $52.36/day, $6.55/hour, $0.109/minute.
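
The Stage 2 loadings and this worked example can be reproduced with the sketch below. The function name and return keys are illustrative. One point is an inference rather than something the text states: the rework rate is taken as the Stage 1 employment cost spread over working minutes, because that is the reading that reproduces the ~$8.89 per-error figure.

```python
def true_annual_cost(employment_usd: float, *, overhead: float = 0.15,
                     turnover: float = 0.35, replacement_months: float = 2,
                     working_days: int = 230, hours_per_day: int = 8,
                     records_per_day: int = 200, error_rate: float = 0.01,
                     rework_minutes: float = 15,
                     downstream_usd: float = 8.0) -> dict:
    """Stage 2: add management, attrition and error loadings (all in USD)."""
    management = employment_usd * overhead
    attrition = employment_usd * turnover * replacement_months / 12
    # Assumption: rework is priced at employment cost per working minute.
    cost_per_minute = employment_usd / (working_days * hours_per_day * 60)
    cost_per_error = rework_minutes * cost_per_minute + downstream_usd
    errored_records = working_days * records_per_day * error_rate
    error_cost = errored_records * cost_per_error
    total = employment_usd + management + attrition + error_cost
    return {
        "management": management,
        "attrition": attrition,
        "errored_records": errored_records,
        "cost_per_error": cost_per_error,
        "error_cost": error_cost,
        "total": total,
        "per_day": total / working_days,
        "per_hour": total / (working_days * hours_per_day),
        "per_minute": total / (working_days * hours_per_day * 60),
    }


hcmc = true_annual_cost(6581.0)  # Stage 1 result for Ho Chi Minh City
```

With the $6,581 employment cost as input, the sketch lands within a dollar of each figure in the example: about $987 management, $384 attrition, $4,091 error cost and $12,043 in total.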

Inputs, sources & three honesty notes

Per-location levers (country-correct): monthly salary, FX, employer-contribution %, contribution cap, extra months, benefits. Global defaults (shared): 230 working days, 8 hrs/day, 200 records/day, 1% error rate, 15 min rework, $8 consequence, 15% overhead, 35% turnover, 2-month replacement. Salary data is reconciled from Glassdoor / Indeed / ERI and local boards; error economics rest on the 1% outsourcing-QA benchmark and the 1-10-100 rule.

Salary inputs carry a ±20–30% spread (India and Italy noisiest) — treat the tiers as solid and any single figure as indicative.
The error layer is the biggest swing factor and is deliberately conservative. Because its consequence component is a flat dollar, error cost is a larger share in low-wage locations — it actually exceeds salary in Yogyakarta. That is the real strategic finding, not an artifact.
Currency selection only relabels the USD total at the chosen rate; it does not re-benchmark local salaries.

07 The Brokering Layer

Always cheaper than going to the model yourself.

Loupe routes every workflow to the lowest-cost model that still clears the bar for that task — frontier intelligence only where the work demands it, a smaller model everywhere it doesn't. You get the right answer at the floor price, without ever managing a model yourself. Weigh that against the two alternatives:

Your option	What it really costs	Verdict
Subscribe to a single AI provider (e.g. OpenAI) and try to wire it up with your ERP system	You pay top-tier token rates on every task — including the trivial ones a cheap model handles fine — and you still have no jewelry harness around it. You're buying the most expensive intelligence and using it as a calculator.	Overpays and under-performs, no liability cover on AI
Build your own AI-native ERP	Years of engineering, a standing team to maintain seven harness systems, evals, and model routing — rebuilt every time the frontier moves. A multi-million-dollar permanent cost centre to replicate what you can rent today.	Huge sunk costs at risk & likely under-performant, no liability cover on AI
Use Loupe as your Operating System and broker layer for intelligence	No markup on tokens, a flat harness fee, and routing that drives your per-task cost toward its minimum automatically — maintained by us, improving without your effort.	Lowest cost & highest performance, contractual liability cover on AI where it counts
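
The routing rule described in this section, the cheapest model that still clears the bar, can be sketched in a few lines. Everything here is hypothetical: the model names, the capability scores, and the prices are placeholders rather than real provider rates, and a production broker would score capability per task from evals rather than from a single integer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    capability: int  # hypothetical quality score; higher is stronger
    usd_per_1k_tokens: float  # hypothetical price, not a real provider rate


def route(models: list, required_capability: int) -> Optional[Model]:
    """Pick the cheapest model whose capability clears the task's bar."""
    eligible = [m for m in models if m.capability >= required_capability]
    # default=None covers the case where no model is strong enough.
    return min(eligible, key=lambda m: m.usd_per_1k_tokens, default=None)


CATALOGUE = [
    Model("small", capability=3, usd_per_1k_tokens=0.0002),
    Model("mid", capability=6, usd_per_1k_tokens=0.002),
    Model("frontier", capability=9, usd_per_1k_tokens=0.02),
]

lookup = route(CATALOGUE, required_capability=2)  # e.g. a simple read
build = route(CATALOGUE, required_capability=8)   # e.g. a complex build
```

When a cheaper model later clears the same bar, only the catalogue changes; the routing rule and the workflows that call it stay as they are.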

08 The Compounding Case

Cheap tokens, a fixed harness, and a widening lead.

Gold and diamonds are commodities; everyone pays roughly the same for the goods. Overhead is where the margin is won — and AI workflow cost is now part of that overhead. Loupe is engineered so that line item only ever gets smaller relative to the work it does.

No markup, ever. Tokens pass through at cost. You are never penalized for using the system that runs your business.

The floor price, automatically. Our broker routes each task to the cheapest model that clears the bar — and re-routes as the frontier shifts.

Labor redeployed to revenue. Cents per workflow replaces minutes per task, freeing your people to sell rather than reconcile.

The math compounds in one direction. As token costs fall and the harness stays flat, the gap between what a workflow costs you and what it would cost a competitor — running legacy systems and manual labor — widens every quarter. First movers don't just save; they pull ahead.

We run only one or two implementations at a time. Each is led by our senior leadership and accelerated by AI, so we can re-platform your ERP in a fraction of the time a migration used to take. If the numbers in this briefing make sense for your operation, the next step is a deposit to hold your place. Implementation bandwidth is our hardest constraint, the queue is growing, and first movers go first.