# Building a Seamless Blog Engine: Obsidian-First Markdown, MDX, and Build-Time Highlighting
I write almost everything in Obsidian. Notes, drafts, half-thoughts, snippets — they all live in a single vault, with backlinks, embedded images, and callouts. The problem is that Obsidian's flavor of Markdown isn't standard Markdown: `![[image.png]]` for images, `> [!NOTE]` for callouts, wiki-links everywhere. Pushing one of those files through a normal MDX pipeline produces broken images and ugly blockquotes.
So instead of changing how I write, I built a thin engine that lets me drop my Obsidian notes straight into /content/blog and have them render correctly — admonitions, embedded images, code blocks, the whole thing.
This post is both a tour of how that engine works and a worked example of every feature it supports, because every code block, callout, and image you're about to see is generated by the same pipeline you're reading about.
## The problem
The brief was deceptively simple:
- Authoring should feel like Obsidian. No frontmatter dance, no special fences, no "now port your callouts to MDX components" step.
- Output should feel like a hand-crafted Next.js site. Optimized images, dual-theme syntax highlighting, instant page loads, real SEO, no client-side JavaScript for static content.
- Adding a post should be a single git commit. No CMS, no database, no build step beyond `next build`.
The catch is that goals 1 and 2 fight each other. Obsidian markdown isn't a strict superset of CommonMark — `![[image.png]]` will trip up any standard parser, and `> [!WARNING]` parses as a perfectly valid (and perfectly ugly) blockquote. Bridging the two is the entire job.
## The architecture in one diagram
Every post moves through a four-stage pipeline. Each stage has exactly one job, which keeps the system debuggable.
```text
┌────────────────────┐   ┌─────────────────────┐   ┌───────────────────────┐   ┌─────────────────┐
│ /content/blog/*.md │──▶│ String preprocessor │──▶│ MDX + plugin pipeline │──▶│ React renderers │
│ (Obsidian native)  │   │ (lib/mdx.ts)        │   │ (remark + rehype)     │   │ (mdx-components)│
└────────────────────┘   └─────────────────────┘   └───────────────────────┘   └─────────────────┘
      author-time          build-time, regex         build-time, AST             render-time
```

The split between string-level and tree-level transformations is the most important design decision in the codebase, so let's walk through why.
## Stage 1 — String preprocessing
Some Obsidian quirks aren't valid Markdown at all. `![[image.png]]` doesn't parse — there's no AST node that represents it — so by the time MDX hands you a tree, it's too late: the syntax has already been mangled into plain text.
The fix is to rewrite Obsidian syntax into standard Markdown before the parser sees it, with a single regex pass:
```ts
export function preprocessObsidianMarkdown(raw: string): string {
  return raw.replace(
    /!\[\[([^\]|]+?)(?:\|([^\]]+))?\]\]/g,
    (_match, filename: string, alias?: string) => {
      const trimmedFile = filename.trim();
      const altFromName =
        trimmedFile.split("/").pop()?.replace(/\.[^.]+$/, "") ?? trimmedFile;
      const alt = (alias?.trim() || altFromName).replace(/[[\]]/g, "");
      const src = trimmedFile.startsWith("/") ? trimmedFile : `/${trimmedFile}`;
      return `![${alt}](${src})`;
    }
  );
}
```

That single regex covers three Obsidian shapes and rewrites them all into vanilla Markdown:
| Input | Rewritten to |
|---|---|
| `![[diagram.png]]` | `![diagram](/diagram.png)` |
| `![[assets/photo.jpg]]` | `![photo](/assets/photo.jpg)` |
| `![[photo.jpg\|A custom caption]]` | `![A custom caption](/photo.jpg)` |
After this pass, the rest of the pipeline never has to know Obsidian exists. Every downstream tool — remark, rehype, the next/image wrapper — sees a normal image node and behaves correctly.
This is also where I'd add ==highlight== marks, [[wiki-links]] for cross-post navigation, or Obsidian's ^block-id references — anything that can be massaged into valid Markdown belongs in stage one.
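Those stage-one candidates really are one-regex jobs. Here is a hedged sketch of what such a pass could look like; `preprocessExtras`, the `/blog/` URL shape, and the `<mark>` output are all assumptions for illustration, not part of the shipped engine:

```typescript
// Hypothetical stage-one pass for ==highlights== and [[wiki-links]].
export function preprocessExtras(raw: string): string {
  return raw
    // ==text== -> <mark>text</mark>, which MDX accepts as inline JSX/HTML
    .replace(/==([^=\n]+)==/g, "<mark>$1</mark>")
    // [[slug]] or [[slug|Label]] -> [Label](/blog/slug); the (?<!!) guard
    // skips ![[embeds]], which the image pass already owns
    .replace(
      /(?<!!)\[\[([^\]|]+?)(?:\|([^\]]+))?\]\]/g,
      (_m, slug: string, label?: string) =>
        `[${(label ?? slug).trim()}](/blog/${slug.trim()})`
    );
}
```

Because both rewrites emit plain Markdown (or inline HTML that MDX accepts), everything downstream stays untouched, exactly as with the image pass.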
## Stage 2 — MDX with a plugin pipeline
Once the string is standards-compliant Markdown, it goes into `next-mdx-remote/rsc`. The interesting part is the plugin list:
```tsx
<MDXRemote
  source={post.content}
  options={{
    mdxOptions: {
      remarkPlugins: [remarkGfm],
      rehypePlugins: [
        [rehypePrettyCode, rehypePrettyCodeOptions],
        rehypeCodeRaw,
      ],
    },
    parseFrontmatter: false,
  }}
  components={mdxComponents}
/>
```

Three plugins, each doing one thing:
- `remarkGfm` — adds GitHub-flavored Markdown features: tables, strikethrough, task lists, autolinks. Standard stuff, but the table you saw a moment ago wouldn't render without it.
- `rehypePrettyCode` — runs every fenced code block through Shiki at build time, producing fully tokenized HTML with two themes baked in (`github-light` for light mode, `github-dark-dimmed` for dark). Zero client-side JS for highlighting.
- `rehypeCodeRaw` — a tiny custom plugin (more on this below) that walks the highlighted output and stamps the original source onto each `<figure>` so the copy button has something to copy.
`parseFrontmatter: false` is important: we already stripped frontmatter with `gray-matter` upstream, and letting MDX try again would either fail or double-handle it.
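What that upstream strip amounts to: `gray-matter` splits the leading `---` block from the body (and parses the YAML into an object). The split alone can be sketched without the library; `splitFrontmatter` is a stand-in name, not the engine's code:

```typescript
// Minimal stand-in for the gray-matter split (no YAML parsing): peel the
// leading --- block off the raw file so MDX only ever sees the body.
function splitFrontmatter(raw: string): { frontmatter: string; content: string } {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(raw);
  if (!match) return { frontmatter: "", content: raw };
  return { frontmatter: match[1], content: raw.slice(match[0].length) };
}
```

Seen this way, the reason for `parseFrontmatter: false` is obvious: the body handed to MDX no longer contains the `---` block, so asking MDX to find one again is at best wasted work.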
## Stage 3 — Build-time syntax highlighting
The highlighting story deserves its own section because it's where the engine punches above its weight.
### Dual themes, zero client JS
Shiki tokenizes at build time and emits inline styles using two CSS variables per token: `--shiki-light` and `--shiki-dark` (plus `*-bg` variants for backgrounds). A small block of CSS in `globals.css` picks the right one based on the active color scheme:
```css
[data-rehype-pretty-code-figure] pre {
  background-color: var(--shiki-light-bg);
  color: var(--shiki-light);
}

.dark [data-rehype-pretty-code-figure] pre {
  background-color: var(--shiki-dark-bg);
  color: var(--shiki-dark);
}
```

Toggle the theme switcher in the top-right and watch every code block on this page re-color instantly — no flash, no re-tokenization, no JavaScript. Just CSS variables.
### Line numbers, line highlights, word highlights
You've already seen all three demonstrated in this post. The fence syntax matches Shiki's conventions:
| You write | You get |
|---|---|
| `` ```ts `` | Plain syntax highlighting |
| `` ```ts showLineNumbers `` | Adds gutter line numbers |
| `` ```ts {1,3-5} `` | Highlights lines 1, 3, 4, 5 |
| `` ```ts /useState/ `` | Highlights every `useState` token |
| `` `value{:ts}` `` | Inline-highlights with TS coloring |
Mixing them is fine — the TSX block earlier in this post uses `showLineNumbers {3,7-12}` together. The combinatorial fence syntax is the kind of thing that would be a nightmare to implement from scratch, which is why we lean on `rehype-pretty-code` for it.
### The copy button — and the plugin we had to write
We wanted a "Copy" button on every code block. The trouble is that Shiki shreds the original source into a tree of tokenized `<span>`s. Once that's done, there's no `data-source` attribute or hidden text node holding the original — the source has been atomized. Trying to reconstruct it from the DOM at click time is fragile (whitespace, ligatures, and highlighted spans all get in the way).

Older versions of rehype-pretty-code exposed a `__rawString__` property for exactly this, but it was removed. So we wrote a 50-line companion rehype plugin that runs after the highlighter and walks every highlighted figure, joining the leaf text back together:
```ts
import { visit } from "unist-util-visit";
import type { Element, Root } from "hast";
import type { Plugin } from "unified";

export const rehypeCodeRaw: Plugin<[], Root> = () => (tree) => {
  visit(tree, "element", (node: Element) => {
    if (node.tagName !== "figure") return;
    if (!node.properties?.["dataRehypePrettyCodeFigure"]) return;

    let raw = "";
    visit(node, "text", (textNode) => {
      raw += textNode.value;
    });

    node.properties["dataRaw"] = raw;
  });
};
```

That `data-raw` attribute makes it to the rendered HTML, where a tiny client component (`<CopyButton>`) reads it and writes to the clipboard. The button is the only client-side JavaScript in the entire blog reading experience.
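The `<CopyButton>` internals aren't shown in the post, so here is a hedged sketch of the part that matters; the `readRawSource` name and the stub attribute type are assumptions:

```typescript
// Read back the data-raw attribute that rehypeCodeRaw stamped on the figure.
// (hast's camelCase `dataRaw` property serializes as `data-raw` in HTML.)
type HasAttributes = { getAttribute(name: string): string | null };

function readRawSource(figure: HasAttributes): string {
  return figure.getAttribute("data-raw") ?? "";
}

// In the browser, the click handler would then call:
//   navigator.clipboard.writeText(readRawSource(figureEl))
```

Everything expensive happened at build time; the client only reads one attribute and calls the Clipboard API.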
## Stage 4 — React component overrides
After all that processing, the renderer emits standard HTML tags. The `mdxComponents` map swaps each one for a custom React component that handles the design system, accessibility, and the last few Obsidian quirks.
### Images
```tsx
function MdxImage({ src, alt }: { src?: string; alt?: string }) {
  if (!src) return null;
  const isExternal = /^https?:\/\//.test(src) || src.startsWith("//");
  if (isExternal) return <img src={src} alt={alt ?? ""} />;
  return <Image src={src} alt={alt ?? ""} width={1200} height={630} />;
}
```

This is why stage one always emits a leading slash: it guarantees the path goes through the `next/image` branch and gets optimized, sized, and lazy-loaded. External images fall back to a plain `<img>` so we don't have to whitelist domains in `next.config.js`.
### Admonitions — the trickiest one
Obsidian's callout syntax is just a blockquote whose first line happens to be `[!TYPE] Optional title`:
```md
> [!WARNING] Heads up
> This will overwrite your changes.
```

That's valid Markdown — MDX parses it into a perfectly normal `<blockquote><p>...</p></blockquote>`. So unlike the wiki-link case, we can't fix this at the string level without effectively re-implementing a Markdown parser to find blockquote boundaries.
Instead, the override does the detection at render time:
- Walk the React children tree with a `getNodeText` helper that recurses into elements and joins text nodes. Run the `[!TYPE] title` regex against the result.
- If matched — extract `kind` (lowercased) and `title`, then call `stripAdmonitionMarker(children)` to recursively walk the tree and remove the `[!TYPE] title` text from the first text node only. Everything after stays intact: paragraphs, lists, links, code, even nested admonitions.
- Render an `<Admonition>` — look up the styling (border, background, icon) in an `ADMONITION_STYLES` map keyed by kind, drop the cleaned children inside.
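The helpers themselves aren't listed in the post, so here is a minimal dependency-free sketch of the `getNodeText` walk and the marker regex. The simplified `TreeNode` shape is an assumption standing in for `React.ReactNode`:

```typescript
// Simplified stand-in for React's children tree: strings, numbers, arrays,
// and element-like objects with props.children.
type TreeNode =
  | string
  | number
  | null
  | undefined
  | TreeNode[]
  | { props: { children?: TreeNode } };

function getNodeText(node: TreeNode): string {
  if (node == null) return "";
  if (typeof node === "string" || typeof node === "number") return String(node);
  if (Array.isArray(node)) return node.map(getNodeText).join("");
  return getNodeText(node.props.children);
}

// The [!TYPE] title marker, matched against the first line of the joined text.
const ADMONITION_RE = /^\[!(\w+)\]\s*(.*)/;
```

Running `getNodeText` on a blockquote whose first paragraph starts with `[!WARNING] Heads up` yields a string the regex matches with kind `WARNING` and title `Heads up`.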
Eight kinds are supported: NOTE, TIP, INFO, WARNING, IMPORTANT, CAUTION, DANGER, SUCCESS. Here are a few in action:
## Bugs we hit along the way
Three issues bit us in production. Each one is worth knowing about because the symptom and the cause are far apart.
### Bug 1 — "Each child in a list should have a unique key"
When `stripAdmonitionMarker` cleaned the blockquote body and handed an array of children to `React.cloneElement`, React 19 logged a key warning for every item.
The fix is one line:
```ts
const finalChildren = Array.isArray(cleanedChildren)
  ? React.Children.toArray(cleanedChildren)
  : cleanedChildren;

return React.cloneElement(node, undefined, finalChildren);
```

`React.Children.toArray` synthesizes stable, position-based keys, which is exactly what `cloneElement` wants when handed a list.
### Bug 2 — "Encountered a script tag while rendering"
We wanted JSON-LD structured data for SEO. The obvious approach was a `<script type="application/ld+json">` inside a manual `<head>` block in `app/layout.tsx`. React 19 immediately complained, because scripts injected as JSX children never execute.
Moving the script into `<body>` made the warning go away, but introduced…
### Bug 3 — Hydration mismatch on the mobile menu button
The error pointed at a Radix `<Sheet>` trigger button:

```text
Hydration failed because the server rendered HTML didn't match the client.
- aria-controls="radix-_R_4qlb_"
+ aria-controls="radix-_R_6qlb_"
```

The button was fine. The cause was the JSON-LD `<script>` we'd just moved into `<body>`. React 19 hoists script resources during hydration, which shifts the Fiber index that Radix's `useId()` uses, so the server and client end up disagreeing on the auto-generated ID.
The fix was to render the JSON-LD via `next/script` with `strategy="afterInteractive"`. `next/script` injects the tag outside React's reconciliation tree, so it can't shift sibling Fiber indices:
```tsx
<Script
  id="ld-person"
  type="application/ld+json"
  strategy="afterInteractive"
  dangerouslySetInnerHTML={{ __html: JSON.stringify(personSchema) }}
/>
```

Three bugs, three completely different files, all caused by the same chain of consequences from one inline `<script>` tag. That's React 19 in 2026.
## What the engine doesn't do (yet)
Honest scope is the best feature.
| Feature | Status | Notes |
|---|---|---|
| Drafts | ✅ | `draft: true` in frontmatter excludes the post from the build |
| Tags / categories | ✅ | Listed on cards; tag pages would be a 20-line addition |
| RSS | ❌ | A `route.ts` over `getAllPosts()` is a one-afternoon job |
| Search | ❌ | Honestly fine without — there are <50 posts |
| Comments | ❌ | Would need a backend; not worth the complexity |
| Math (KaTeX) | ❌ | Add `remark-math` + `rehype-katex` if a post needs it |
| Mermaid / diagrams | ❌ | Same — plugin-shaped problem, not engine-shaped |
| Wiki-link cross-refs | ❌ | `[[other-post]]` rewrites would go in stage one |
The engine is intentionally a core with extension points, not a framework. If a post needs math, that one post can opt into the plugin; everything else stays light.
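As one worked example of that claim, the RSS row really is a small job. A hedged sketch, assuming `slug`/`title`/`date` frontmatter fields and an example site URL; a real `route.ts` would return this string with an `application/xml` content type:

```typescript
// Build an RSS 2.0 document from the post list (field names assumed).
type Post = { slug: string; title: string; date: string };

function rssXml(posts: Post[], site = "https://example.com"): string {
  const items = posts
    .map(
      (p) =>
        `<item><title>${p.title}</title>` +
        `<link>${site}/blog/${p.slug}</link>` +
        `<pubDate>${new Date(p.date).toUTCString()}</pubDate></item>`
    )
    .join("");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>` +
    `<rss version="2.0"><channel><title>Blog</title>${items}</channel></rss>`
  );
}
```

(Real posts would also need their titles XML-escaped; the sketch skips that for brevity.)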
## What I'd tell someone building this from scratch
A few things I'd get right on the first try if I did it again:
- Pre-process at the string level for syntax-incompatible features (wiki-links, embedded blocks). Don't try to patch the AST after the fact.
- Override at the component level for compatible features (admonitions, custom links, callouts). Don't try to invent grammar.
- Highlight code at build time, always. The page-weight and FCP wins compound across every post, every visit.
- Use `next/script` for anything that emits a real `<script>` tag in modern React. `useId()` collisions are not worth the debugging time.
- Keep the contract for authors as small as possible. Frontmatter + filename = published post. Anything more is friction.
The whole engine — preprocessor, plugins, components, and styles — is under 800 lines. Posts are pure Markdown. The build is `next build`. That's the entire system.
## Try it yourself
Every feature in this post — the wiki-link image rewriting, the eight admonition kinds, dual-theme syntax highlighting with line numbers and word highlights, the copy buttons, the GFM tables — works the same way for any post you drop into /content/blog. There's no hidden second system, no per-post configuration, and no client-side JavaScript involved in rendering anything you've read above.
That was the goal: make the file the source of truth, and let the engine disappear.