Feature · 4 min read

AI-Powered CSV Summaries: What Changed and Why Your Data Stays Private

csvdiff.app uses Gemini or OpenRouter to explain your diff in plain English. Your CSV files never leave the browser; only a compact summary of the changes goes to the AI provider you choose.

A diff that shows 847 modified rows is technically accurate and practically overwhelming. You know something changed — but what changed, and does it represent a meaningful pattern or random noise? This is where the AI summary feature in csvdiff.app becomes useful: instead of scrolling through hundreds of highlighted cells, you get a concise plain-English explanation of what happened across the entire dataset.

How the AI Summary Works

After the diff runs, csvdiff.app compiles a structured representation of the changes — which columns changed, how many rows were affected per change type, sample old and new values — and sends this summary (not your raw CSV data) to the AI model of your choice. The model returns a natural-language explanation that describes the patterns it sees. The whole exchange takes a few seconds.
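
The structured representation described above could look something like the following minimal sketch. The type names, field names, and the `summarizeDiff` function are hypothetical; the post does not document csvdiff.app's internal format. The sketch only shows the idea of reducing a list of cell changes to per-column counts plus a few sample values.

```typescript
// Hypothetical shapes for illustration; not csvdiff.app's actual data model.
type ChangeType = "modified" | "added" | "removed";

interface ValuePair {
  oldValue: string;
  newValue: string;
}

interface CellChange extends ValuePair {
  rowKey: string;
  column: string;
}

interface ColumnSummary {
  column: string;
  rowsAffected: number;
  samples: ValuePair[];
}

interface DiffSummary {
  counts: Record<ChangeType, number>;
  columns: ColumnSummary[];
}

// Reduce cell-level changes to per-column counts and a capped set of samples.
function summarizeDiff(
  counts: Record<ChangeType, number>,
  changes: CellChange[],
  maxSamples = 3,
): DiffSummary {
  const byColumn = new Map<string, { rows: Set<string>; samples: ValuePair[] }>();

  for (const change of changes) {
    let entry = byColumn.get(change.column);
    if (!entry) {
      entry = { rows: new Set<string>(), samples: [] };
      byColumn.set(change.column, entry);
    }
    entry.rows.add(change.rowKey);
    // Keep only the first few old/new pairs so the summary stays compact.
    if (entry.samples.length < maxSamples) {
      entry.samples.push({ oldValue: change.oldValue, newValue: change.newValue });
    }
  }

  // Most-affected columns first, so the model sees the biggest patterns early.
  const columns = Array.from(byColumn.entries())
    .map(([column, entry]) => ({
      column,
      rowsAffected: entry.rows.size,
      samples: entry.samples,
    }))
    .sort((a, b) => b.rowsAffected - a.rowsAffected);

  return { counts, columns };
}
```

Capping the samples per column is what keeps the payload small no matter how many rows changed.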

AI summary (Generated locally)
Modified: 847 · Added: 124 · Removed: 38
  • The status field changed from pending to approved on 312 rows, all with invoice dates in Q1 2025 — likely a scheduled batch approval.
  • unit_price increased on 156 rows in the Electronics category, with a median change of +8.3%. No other categories saw price movement.
  • 44 rows now have an empty phone_number where they previously had a value — worth investigating as a potential data-loss bug.
  • 38 removed rows all share the same legacy_account flag. This looks intentional, but confirm with the owning team before signing off.
Example output from running the AI summary on a 2,000-row subscriptions diff.

What Gets Sent to the Model

The prompt sent to the AI contains the diff statistics and a representative sample of changed rows — not the full CSV files. Your original data files are parsed and diffed entirely within your browser using JavaScript. They are never uploaded to csvdiff.app's servers, and they are never included wholesale in the AI prompt. The model sees enough to describe patterns, but not a full copy of your dataset.
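
As a rough illustration of a prompt that carries statistics and a bounded sample rather than whole files, here is a small sketch. The wording of the prompt, the `buildSummaryPrompt` name, and the default cap of 20 samples are all assumptions made for this example; the actual prompt csvdiff.app sends is not shown in this post.

```typescript
// Hypothetical types and prompt wording; illustration only.
interface DiffStats {
  modified: number;
  added: number;
  removed: number;
}

interface SampleChange {
  column: string;
  oldValue: string;
  newValue: string;
}

// Build a text prompt from counts plus at most `maxSamples` changed cells.
function buildSummaryPrompt(
  stats: DiffStats,
  changes: SampleChange[],
  maxSamples = 20,
): string {
  const sample = changes.slice(0, maxSamples);
  const lines = [
    "Summarize the following CSV diff in plain English.",
    `Rows modified: ${stats.modified}, added: ${stats.added}, removed: ${stats.removed}.`,
    `Sample of changed cells (${sample.length} of ${changes.length}):`,
    // JSON.stringify quotes the values so empty strings stay visible.
    ...sample.map(
      (c) => `- ${c.column}: ${JSON.stringify(c.oldValue)} -> ${JSON.stringify(c.newValue)}`,
    ),
  ];
  return lines.join("\n");
}
```

Whatever the real cap is, the point is that the prompt size is bounded by the sample limit, not by the size of the input files.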

Bring Your Own Key

csvdiff.app does not proxy AI requests through its own backend. You connect directly to Gemini or to any OpenRouter-compatible model using an API key you supply. The key is stored only in your browser's local storage and is sent directly from your browser to the AI provider — csvdiff.app's servers are not in the request path at all. This means csvdiff.app cannot see your API key, cannot log your prompts, and cannot accumulate your data.
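
A sketch of what a direct browser-to-provider request might look like is below. The storage key prefix and function names are hypothetical, and the endpoints and headers reflect the providers' public REST documentation as understood at the time of writing, so check the current Gemini and OpenRouter docs before relying on them. The key store is passed in as a small interface that the browser's `localStorage` satisfies, which keeps the sketch runnable outside a browser.

```typescript
type Provider = "gemini" | "openrouter";

// Minimal subset of the Web Storage interface; window.localStorage fits this shape.
interface KeyStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface ProviderRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical storage key prefix; the real app's key name is not documented here.
const STORAGE_PREFIX = "csvdiff.aiKey";

function saveKey(store: KeyStore, provider: Provider, apiKey: string): void {
  store.setItem(`${STORAGE_PREFIX}.${provider}`, apiKey);
}

// Build a request aimed straight at the provider, with no intermediate backend.
function buildProviderRequest(
  store: KeyStore,
  provider: Provider,
  model: string,
  prompt: string,
): ProviderRequest {
  const apiKey = store.getItem(`${STORAGE_PREFIX}.${provider}`);
  if (!apiKey) {
    throw new Error(`No API key saved for ${provider}`);
  }

  if (provider === "gemini") {
    return {
      url: `https://generativelanguage.googleapis.com/v1beta/models/${encodeURIComponent(model)}:generateContent`,
      method: "POST",
      headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    };
  }

  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  };
}
```

In a browser, the result could be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. The only host in the request is the provider's own.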

What the AI Can Surface

  • Bulk field updates — "The status field changed from pending to approved on 312 rows, all with invoice dates in Q1 2025. This looks like a scheduled batch approval."
  • Rebrand or rename signals — "The company_name field changed on 89 rows, with old values containing 'Acme Corp' and new values containing 'Acme Inc'. This may reflect a corporate name change."
  • Data quality regressions — "The phone_number field is now empty on 44 rows where it previously had values. This may indicate a migration issue or accidental deletion."
  • Price or rate adjustments — "The unit_price column increased on all 156 affected rows, with a median change of +8.3%. Changes are consistent across the Electronics category."
  • Encoding or format anomalies — "Several rows show special characters replaced with question marks in the description field, which may indicate a character encoding mismatch."
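
Figures like the ones quoted in these examples (a median percentage change, a count of values that became empty) are cheap to compute in the browser. The sketch below shows the arithmetic; the function names are hypothetical, and the post does not say whether csvdiff.app precomputes such statistics or leaves them to the model.

```typescript
// Median of per-row percentage changes, given [oldValue, newValue] pairs.
// Rows with an old value of 0 are skipped because the percentage is undefined.
function medianPercentChange(pairs: Array<[number, number]>): number | null {
  const changes = pairs
    .filter(([oldValue]) => oldValue !== 0)
    .map(([oldValue, newValue]) => ((newValue - oldValue) / oldValue) * 100)
    .sort((a, b) => a - b);

  if (changes.length === 0) return null;

  const mid = Math.floor(changes.length / 2);
  return changes.length % 2 === 1
    ? changes[mid]
    : (changes[mid - 1] + changes[mid]) / 2;
}

// Count cells that had a value before and are blank (or whitespace) now.
function countBecameEmpty(pairs: Array<[string, string]>): number {
  return pairs.filter(
    ([oldValue, newValue]) => oldValue.trim() !== "" && newValue.trim() === "",
  ).length;
}
```

For example, prices moving from 8 to 9, 4 to 5, and 2 to 3 are changes of 12.5%, 25%, and 50%, so the median change is 25%.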

Choosing a Model

csvdiff.app supports Google Gemini directly and any model available through OpenRouter, which includes GPT-4o, Claude, Mistral, and others. For most diff summaries, a fast mid-tier model is more than sufficient: the prompt is structured and the task is pattern recognition, not open-ended reasoning. Gemini Flash or a comparable OpenRouter model typically returns a summary within a few seconds at low cost.

Privacy Guarantees

The privacy model is worth stating clearly. Your CSV files are loaded into browser memory via the File API and processed entirely by client-side JavaScript. Nothing is uploaded. When you trigger an AI summary, a compact structured prompt is constructed in the browser and sent directly from your browser to the AI provider using your own API key. csvdiff.app has no server that handles this request. The provider (Google or OpenRouter) does receive the prompt, so you should review their data policies if your data is particularly sensitive — but the files themselves never leave your machine.
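
To make "processed entirely by client-side JavaScript" concrete, here is a deliberately simplified sketch of a key-based diff that needs no server. It is not csvdiff.app's implementation: the function names are hypothetical, and the parser is naive (it splits on commas and does not handle quoted fields), which a real CSV parser would need to do. In a browser, the input text would come from the File API, for example via `await file.text()`.

```typescript
type Row = Record<string, string>;

interface RowDiff {
  added: string[];
  removed: string[];
  modified: string[];
}

// Naive CSV parsing for illustration only: no quoted fields, no escaped commas.
function parseSimpleCsv(text: string): Row[] {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split(",");
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: Row = {};
    header.forEach((name, i) => {
      row[name] = cells[i] ?? "";
    });
    return row;
  });
}

// Compare column by column so that column order does not matter.
function rowsEqual(a: Row, b: Row): boolean {
  return Object.keys(a)
    .concat(Object.keys(b))
    .every((column) => a[column] === b[column]);
}

// Classify rows as added, removed, or modified, matching them on a key column.
function diffByKey(oldRows: Row[], newRows: Row[], keyColumn: string): RowDiff {
  const oldByKey = new Map<string, Row>();
  for (const row of oldRows) {
    oldByKey.set(row[keyColumn] ?? "", row);
  }

  const result: RowDiff = { added: [], removed: [], modified: [] };
  const seen = new Set<string>();

  for (const row of newRows) {
    const id = row[keyColumn] ?? "";
    seen.add(id);
    const before = oldByKey.get(id);
    if (!before) {
      result.added.push(id);
    } else if (!rowsEqual(before, row)) {
      result.modified.push(id);
    }
  }

  oldByKey.forEach((_row, id) => {
    if (!seen.has(id)) result.removed.push(id);
  });

  return result;
}
```

Everything here operates on strings already held in memory, so no network call is involved at any step.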

Client-side processing is not a marketing claim — it is enforced by the architecture. There is no upload endpoint. The app has no backend that could receive your data even if it wanted to.

When AI Summaries Are Most Useful

AI summaries add the most value when the diff is large and you need a quick triage before diving into specifics — a 2,000-row export diff after a system migration, a weekly product catalog update with changes spread across dozens of columns, or a vendor file where you are not sure what to expect. For small diffs with five or ten changes, reading the diff directly is just as fast. For large diffs, the AI summary gives you the narrative that turns raw change data into an actionable understanding of what happened.

Getting Started

To use the AI summary feature on csvdiff.app, run a diff as normal, then open the AI panel and paste in a Gemini API key (free tier available from Google AI Studio) or an OpenRouter key. Select your preferred model and click Summarize. The explanation appears inline alongside the diff — no copy-paste, no switching tabs.

