Comparing two versions of text — code, documents, configuration files, contracts — is one of the most common technical tasks, and a good diff tool makes the difference between a quick review and an error-prone slog. The Unix `diff` command dates to 1974 and the algorithms behind modern diff tools haven't changed fundamentally since the 1980s, but the UI for visualizing differences has improved dramatically. The sections below cover how diff algorithms actually work under the hood, when to use line-level versus word-level comparison, and the specific workflows where a browser-based diff tool beats running `git diff` or opening a heavy IDE.
How Diff Algorithms Actually Work
Most diff tools are built around the Longest Common Subsequence (LCS) problem, a classic dynamic-programming problem first solved in the 1970s and refined since. Given two input texts, an LCS algorithm finds the longest sequence of tokens (typically lines or words) that appears in both inputs in the same order, even if separated by other tokens. Everything inside the LCS is marked as "unchanged"; everything outside it is classified as either an addition (present in the new version only) or a deletion (present in the old version only).

The textbook LCS algorithm runs in O(m × n) time and space, where m and n are the token counts of the two inputs. For typical documents and code files of a few hundred to a few thousand lines, this is essentially instant. For very large files (100,000+ lines), the full table would grow to roughly 10^10 cells, so tools switch to more sophisticated algorithms. Myers' diff (Git's default) does work proportional to the input size times the number of differences, which makes nearly identical files cheap to compare. Patience diff handles some edge cases better and often produces more readable output.

The result of an LCS-based diff is a minimal edit script: the fewest additions and deletions that transform one input into the other. Different diff tools render this edit script differently in their UI, but the underlying math is the same.

The token type sets the granularity. Line-level diff tokenizes by newline; word-level diff tokenizes at whitespace boundaries; character-level diff tokenizes by individual Unicode code points. Each level catches different kinds of changes: line-level is best for code, word-level is best for prose, and character-level is useful for detecting tiny changes like typos.
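The LCS approach described above can be sketched in a few lines of Python. This is a minimal illustration of the textbook O(m × n) dynamic-programming method, not the code any particular diff tool ships; the function name `lcs_diff` and the `' '`, `'-'`, `'+'` markers are assumptions chosen for the example.

```python
from typing import List, Tuple


def lcs_diff(old: List[str], new: List[str]) -> List[Tuple[str, str]]:
    """Return an edit script of (marker, token) pairs.

    Markers: ' ' = unchanged, '-' = deleted from old, '+' = added in new.
    Tokens can be lines, words, or characters; the algorithm does not care.
    """
    m, n = len(old), len(new)

    # table[i][j] holds the LCS length of old[i:] and new[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if old[i] == new[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start, emitting one operation per step.
    script: List[Tuple[str, str]] = []
    i = j = 0
    while i < m and j < n:
        if old[i] == new[j]:
            script.append((" ", old[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            script.append(("-", old[i]))
            i += 1
        else:
            script.append(("+", new[j]))
            j += 1

    # Whatever remains on either side is pure deletion or pure addition.
    script.extend(("-", token) for token in old[i:])
    script.extend(("+", token) for token in new[j:])
    return script


if __name__ == "__main__":
    for marker, token in lcs_diff(["a", "b", "c", "d"], ["a", "c", "d", "e"]):
        print(marker, token)
```

Run on the token lists `a b c d` and `a c d e`, the sketch keeps `a`, `c`, and `d`, marks `b` as deleted, and marks `e` as added. Passing `text.splitlines()`, `text.split()`, or `list(text)` as the inputs gives line-level, word-level, or character-level comparison with the same function.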
When to Use Line-Level vs Word-Level Diff
The tokenization level you choose dramatically affects how useful the diff output is for your task. Line-level diff treats each line as an atomic unit: a line either matches completely or is marked as changed. This works beautifully for code, where most meaningful changes involve adding, removing, or replacing whole lines or blocks. Line-level diff is what Git, GitHub PR views, VS Code's source control tab, and IDE merge tools all use by default.

For prose documents like blog posts, contracts, or documentation, line-level diff produces confusing results. Most prose edits change a few words within otherwise-identical paragraphs, and when a paragraph is stored as a single long line, line-level diff marks the whole paragraph as changed, forcing the reviewer to eyeball the difference manually. Word-level diff fixes this by splitting on whitespace and marking only the changed words. A sentence that shifted from "the quick brown fox" to "a fast brown fox" shows with just "the" → "a" and "quick" → "fast" highlighted, leaving "brown fox" unchanged. This matches how writers and editors actually think about changes.

Character-level diff is reserved for specific cases: detecting trailing whitespace changes, catching invisible Unicode character substitutions, and some spelling-correction workflows. Most practical tools default to line-level and offer word-level as a toggle, which is what this tool does.
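The "quick brown fox" example can be reproduced with Python's standard `difflib` module. Treat this as a rough sketch: `SequenceMatcher` uses its own matching heuristic rather than a strict LCS, and the helper name `changed_spans` is hypothetical, introduced only for this illustration.

```python
import difflib
from typing import List, Tuple


def changed_spans(old_tokens: List[str], new_tokens: List[str]) -> List[Tuple[str, List[str], List[str]]]:
    """Return (tag, old_slice, new_slice) for every non-matching region."""
    matcher = difflib.SequenceMatcher(None, old_tokens, new_tokens, autojunk=False)
    return [
        (tag, old_tokens[i1:i2], new_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


old = "the quick brown fox"
new = "a fast brown fox"

# Line-level: the sentence is one token, so the whole line is reported as changed.
line_changes = changed_spans(old.splitlines(), new.splitlines())

# Word-level: only the words that differ are reported; "brown fox" is left alone.
word_changes = changed_spans(old.split(), new.split())

print(line_changes)
print(word_changes)
```

At line level the only reported change is a replacement of the entire sentence. At word level the report shrinks to a single replacement of `['the', 'quick']` with `['a', 'fast']`, which is the part a reviewer of prose actually needs to see.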
When a Browser-Based Diff Beats Git or an IDE
Git and modern IDEs have excellent built-in diff tools, so why does a browser-based diff tool exist? Several practical scenarios make the browser tool the better choice.

Comparing text that isn't in a repository: pasted content from emails, responses from different API endpoints, two versions of a configuration copied from different servers, output from two runs of the same script. Git and IDEs generally expect content to be saved as files, usually inside a project, which is friction for ad-hoc comparisons.

Comparing generated output for debugging: two JSON responses, two log files, two rendered HTML pages. The browser tool accepts pasted text and shows differences immediately, without temp-file management.

Non-developer users who don't have Git or an IDE installed: product managers comparing spec revisions, writers comparing drafts, legal teams comparing contract versions, teachers comparing student submissions.

Quick one-off checks during meetings or calls: paste two strings someone is disputing and have the answer in seconds.

Privacy-sensitive content: this diff runs entirely in your browser, so comparing sensitive text (internal policies, HR documents, security configurations) doesn't require uploading it to a cloud diff service or storing it in a local repo.

The browser tool also supports sharing a URL of a specific comparison, which is useful for pointing to a particular finding in a code review comment or bug report without recreating the comparison manually.