defuddle: Get the main content of any page as Markdown

April 7, 2026


Black-handled scissors resting on a white surface next to an open magazine. — Photo by Karolina Grabowska www.kaboompics.com on Pexels

What it is

Defuddle strips the fluff. Feed it a URL or raw HTML and it finds the main article, discarding comments, sidebars, headers, footers and other clutter, so you’re left with readable content as HTML or Markdown. It was created for the Obsidian Web Clipper but is built to run anywhere: in the browser, in Node.js, or from the command line. Beware: the project’s README warns it’s “very much a work in progress.” So proceed with curiosity, not blind faith.

How it works

Compared with other extractors, Defuddle aims to be more forgiving and more consistent: it removes fewer elements it is unsure about, gives footnotes, math and code blocks consistent output, and extracts richer metadata (including schema.org data). It even uses a page’s mobile styles as a heuristic for spotting unnecessary elements, which is clever and a bit cheeky. The author positions it as an alternative to Mozilla Readability that differs from it in behavior and output.
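To make the mobile-styles idea concrete, here is a toy sketch of how such a heuristic could work. It is an illustration only, not Defuddle's actual implementation: `MobileRule` and `hiddenOnMobile` are hypothetical names, the rules are assumed to be already parsed out of `@media (max-width: …)` blocks, and the CSS cascade is ignored.

```typescript
// Hypothetical, simplified stand-in for one CSS rule found inside an
// "@media (max-width: Npx)" block. Not a Defuddle type.
interface MobileRule {
  maxWidth: number;  // the N in "@media (max-width: Npx)"
  selector: string;  // e.g. ".sidebar"
  display?: string;  // value of the "display" property, if the rule declares one
}

// Return the selectors that the stylesheet hides at a given viewport width.
// Elements a site hides on small screens are plausible clutter candidates.
function hiddenOnMobile(rules: MobileRule[], viewportWidth: number = 600): string[] {
  const hidden: string[] = [];
  for (const rule of rules) {
    // A max-width query matches when the viewport is no wider than the limit.
    const applies = viewportWidth <= rule.maxWidth;
    if (applies && rule.display === "none" && hidden.indexOf(rule.selector) === -1) {
      hidden.push(rule.selector);
    }
  }
  return hidden;
}
```

A real extractor would treat the resulting selectors as hints rather than verdicts, since a site may hide useful content on small screens as well.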

Why it matters

Why should you care? Because manual cleanup of clipped articles is a drag. If you save notes to Obsidian, feed a personal archive, or build an RSS-to-Markdown pipeline, getting predictable, clean Markdown output saves time and sanity. Want reliable metadata too, such as author, published date and main image? Defuddle promises that. It’s a small tool for a common frustration, and when it works, it’s the little jolt of relief every information worker needs.

Getting started

Install with npm (npm install defuddle), or run it ad hoc with npx. It ships a browser API, a Node-friendly module (works with JSDOM, linkedom, etc.), and a CLI with options for JSON, Markdown, debug mode and file output. One caveat: for the defuddle/node import to resolve, package.json needs "type": "module". Try it, tweak it, and expect improvements: this is an emerging utility, not a finished product.
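Once the extractor hands back a result, a small step can turn it into a note with front matter. The sketch below is a minimal illustration, not code from the project: `ExtractedPage`, `yamlString` and `toMarkdownNote` are hypothetical names, and the fields (`title`, `author`, `published`, `content`) are assumed to mirror the metadata the extractor returns, so check the README for the exact result shape in your version.

```typescript
// Hypothetical shape, assumed to mirror the fields an extractor returns.
interface ExtractedPage {
  title: string;
  author?: string;
  published?: string;
  content: string; // Markdown body
}

// Quote a value for YAML front matter, escaping backslashes and double quotes.
function yamlString(value: string): string {
  return '"' + value.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

// Build a Markdown note: YAML front matter first, then the article body.
function toMarkdownNote(page: ExtractedPage, sourceUrl: string): string {
  const lines: string[] = ["---", "title: " + yamlString(page.title)];
  // Optional metadata is written only when the extractor found it.
  if (page.author) lines.push("author: " + yamlString(page.author));
  if (page.published) lines.push("published: " + yamlString(page.published));
  lines.push("source: " + yamlString(sourceUrl), "---", "", page.content.trim(), "");
  return lines.join("\n");
}
```

In a Node pipeline, `page` would be whatever the Defuddle module returns with Markdown output enabled, and the returned string could be written straight to a file in a vault or archive.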

Sources: github.com/kepano, Lobsters
