Your Website Has No Official "Readme" and That's Becoming a Real Problem
Think about the last API you integrated. There was documentation, a schema, maybe an OpenAPI spec. You knew exactly what endpoints existed, what they returned, and how to authenticate. Now think about your website. Can a browser agent, an AI crawler, or even a new team member look at a single canonical document and understand what your site does, who it's for, and how it's structured? Probably not.
That gap is exactly what The Website Specification is trying to close. It's a proposed open standard: not yet a W3C recommendation and not enforced by anyone, but a practical attempt to give every website an identity document that both machines and humans can read. Think of it like a package.json for your entire site.
What The Website Specification Actually Proposes
At its core, the spec defines a structured format, hosted at a known path on your domain, that describes your website's metadata in a consistent way. The idea borrows from conventions developers already know. Just as robots.txt lives at the root and tells crawlers what they can access, and sitemap.xml tells them what pages exist, a Website Specification file tells agents and humans what the site fundamentally is.
The proposed file is JSON-based and includes fields for the site's name, description, primary language, subject matter, intended audience, maintainer contact, and more. It's not trying to replace OpenGraph tags or JSON-LD structured data. It's filling a different role: a top-level declaration of intent and identity, rather than per-page metadata.
A minimal example looks something like this:
{
  "name": "Coders Vibe",
  "description": "A developer blog covering AI coding tools, vibe coding, web development, and practical bug fixes.",
  "url": "https://codersvibe.com",
  "language": "en",
  "topics": ["web development", "JavaScript", "AI tools", "developer tooling"],
  "audience": "working developers and indie hackers",
  "maintainer": {
    "name": "Coders Vibe Team",
    "email": "[email protected]"
  }
}
That's the rough shape of it. The actual spec goes further, but the philosophy stays the same: explicit beats implicit. Don't make a crawler or an AI agent guess what your site is about by scraping your homepage hero text.
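If you're consuming a file like this from TypeScript, a small runtime check is a sensible first step before trusting anything in it. Here's a minimal sketch, assuming the field names from the sample above; the published spec may name or nest them differently, and WebsiteSpec and isWebsiteSpec are names invented for this illustration.

```typescript
// Shape of the minimal sample only. This is an assumption for illustration,
// not the authoritative schema from the spec.
interface WebsiteSpec {
  name: string;
  description: string;
  url: string;
  language: string;
  topics: string[];
  audience: string;
  maintainer: { name: string; email: string };
}

const isString = (v: unknown): v is string => typeof v === "string";

// Runtime type guard: true only when every field from the sample is present
// with the expected type. Unknown extra fields are allowed.
function isWebsiteSpec(value: unknown): value is WebsiteSpec {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;

  const topics = v["topics"];
  if (!Array.isArray(topics) || !topics.every(isString)) return false;

  const m = v["maintainer"];
  if (typeof m !== "object" || m === null) return false;
  const maintainer = m as Record<string, unknown>;

  return (
    isString(v["name"]) &&
    isString(v["description"]) &&
    isString(v["url"]) &&
    isString(v["language"]) &&
    isString(v["audience"]) &&
    isString(maintainer["name"]) &&
    isString(maintainer["email"])
  );
}
```

Run the parsed JSON through the guard and only read fields once it returns true. Anything that fails the check should be treated as if the file did not exist.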
Why This Matters Right Now
The web is changing fast. AI agents are crawling, summarizing, and acting on website content in ways that a robots.txt file was never designed to handle. Tools like ChatGPT's browsing mode, Perplexity, and a wave of autonomous coding agents all need to understand websites, not just index them.
Right now, when an AI agent hits your domain, it guesses. It reads your title tag, scrapes your homepage, maybe looks at your OpenGraph data. That guessing produces errors. A portfolio site gets treated like a shop. A docs site gets treated like a blog. An internal tool with a public-facing domain confuses every automated system that touches it.
A standardized specification file fixes the guessing problem at the root. The site tells agents what it is, once, authoritatively. No inference required.
This is especially relevant if you're building tools that interact with external sites programmatically. Anyone using TypeScript to write a crawler or an agent that needs to understand third-party domains would benefit enormously from a standard they could rely on. Right now there's no convention. You're pattern-matching against HTML structure and hoping for the best.
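To make that concrete, here's roughly what "check the spec first, guess second" could look like in a TypeScript crawler. This is a sketch under assumptions: the /.well-known/website.json path is one of the candidate locations, not a finalized one, and describeSite, FetchLike, and guessFromHtml are hypothetical names. The fetch function is passed in so the logic can be exercised without a network.

```typescript
// Minimal fetch-like signature, so a stub can stand in for the real fetch.
type FetchLike = (
  url: string
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

interface SiteIdentity {
  name: string;
  description: string;
  source: "spec" | "heuristic"; // where the answer came from
}

// Assumed location of the file; the spec may settle on a different path.
const SPEC_PATH = "/.well-known/website.json";

async function describeSite(
  origin: string,
  fetchFn: FetchLike,
  guessFromHtml: () => Promise<{ name: string; description: string }>
): Promise<SiteIdentity> {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const specUrl = origin.replace(/\/+$/, "") + SPEC_PATH;
  try {
    const res = await fetchFn(specUrl);
    if (res.ok) {
      const data = await res.json();
      if (typeof data === "object" && data !== null) {
        const d = data as Record<string, unknown>;
        const name = d["name"];
        const description = d["description"];
        if (typeof name === "string" && typeof description === "string") {
          return { name, description, source: "spec" };
        }
      }
    }
  } catch {
    // Network or JSON parse failure: fall through to the heuristic path.
  }
  // No usable spec file, so fall back to today's behavior: guessing.
  const guess = await guessFromHtml();
  return { ...guess, source: "heuristic" };
}
```

In a runtime with a global fetch, you could pass it in directly as fetchFn. Keeping the source field on the result lets downstream code treat a declared identity differently from a scraped one.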
How It Fits Into the Existing Ecosystem
It's worth being clear about what this spec does not duplicate, because several metadata conventions already compete for attention:
- robots.txt controls crawler access. It says nothing about what the site is.
- sitemap.xml lists URLs. It says nothing about the site's purpose or audience.
- OpenGraph / Twitter Cards describe individual pages for social sharing previews. They're per-page, not site-wide.
- JSON-LD / Schema.org adds structured data to specific content types. Powerful, but complex and also per-page.
- manifest.json handles PWA installation metadata. Completely different concern.
None of these give you a single place to say "this is a developer blog, maintained by this team, covering these topics, for this audience." The Website Specification fills exactly that slot. It's complementary to all of the above, not a replacement.
The Vibe Coding Angle
Here's the part that's specifically interesting if you're building sites with AI coding tools. When you prompt an AI assistant to "build me a portfolio site" or "add an about page to my blog," the AI has no authoritative reference for what your site already says it is. It's working from whatever context you manually provide in the prompt.
Now imagine your site has a /.well-known/website.json or a root-level spec file. An AI agent with web access could read it before generating any code and instantly understand your site's purpose, audience, and structure. The output would be far more consistent. No more AI-generated copy that introduces your site as a "cutting-edge platform" when it's actually a straightforward personal blog.
That's not science fiction. Tools like Cursor, Copilot, and various MCP-enabled agents are already attempting to read project context from structured files. The Website Specification is the web-layer equivalent of a CLAUDE.md or a .cursorrules file. It's context, made explicit, for any agent that needs it.
If you care about the quality of AI-assisted development on your projects, you should care about this standard. The more structured context you give your tools, the better the output. That's been true since the first day anyone typed a prompt into a code editor.
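As a small illustration of the idea, here's one way an agent or a build script might turn a spec file into a context preamble for a prompt. Treat it as a sketch: the field names come from the sample earlier in this post, toAgentContext is a made-up helper, and no existing tool is known to consume the file this way.

```typescript
// Subset of the sample fields that matter for generated copy (assumed names).
interface SpecSummary {
  name: string;
  description: string;
  audience: string;
  topics: string[];
}

// Builds a plain-text preamble that can be prepended to a coding prompt.
function toAgentContext(spec: SpecSummary): string {
  return [
    `Site: ${spec.name}`,
    `What it is: ${spec.description}`,
    `Audience: ${spec.audience}`,
    `Topics: ${spec.topics.join(", ")}`,
    "Match this description when writing copy; do not invent a different positioning.",
  ].join("\n");
}
```

The last line of the preamble is the part that guards against the "cutting-edge platform" problem: it tells the model the declared identity is the one to follow.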
Should You Implement It Today?
This is where I'll give you a direct opinion: yes, in a limited way, even before the spec is formally standardized.
The cost of adding a JSON file to your domain root is nearly zero. The upside is that you're ahead of the curve when agents and tools start actively looking for it, and you've done something genuinely useful for any human who wants to understand your site at a glance. Treat it the way early adopters treated robots.txt before it was a de facto standard. They didn't need anyone to tell them it was a good idea to define their crawl rules.
A few practical steps to get started:
- Read the spec at specification.website to understand the current proposed fields.
- Create a JSON file matching the proposed format and host it at your domain root or at /.well-known/website.json, whichever the spec finalizes on.
- Keep it honest. Don't keyword-stuff your topics list. Write the description you'd give a developer colleague, not a marketing exec.
- Add it to your site's deploy process so it stays updated when your site's scope changes.
If you're running a React or Next.js site, this is just a static JSON file served from the public folder. If you're on a more dynamic stack, a single route handler returning a hardcoded JSON object gets the job done. Either way, it's a 20-minute task.
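For the dynamic-stack case, the handler really can be this small. The sketch below is framework-agnostic on purpose: handleSpecRequest and PlainResponse are hypothetical names, the path is the assumed /.well-known/website.json location, and wiring the function into Express, a Next.js route, or anything else is left out.

```typescript
// Plain response shape so the sketch does not depend on any framework.
interface PlainResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Assumed path; adjust to whatever location the spec finalizes on.
const SPEC_PATH = "/.well-known/website.json";

// Hardcoded site description, using fields from the sample in this post.
const WEBSITE_SPEC = {
  name: "Coders Vibe",
  description:
    "A developer blog covering AI coding tools, vibe coding, web development, and practical bug fixes.",
  url: "https://codersvibe.com",
  language: "en",
  topics: ["web development", "JavaScript", "AI tools", "developer tooling"],
  audience: "working developers and indie hackers",
};

function handleSpecRequest(method: string, path: string): PlainResponse {
  if (path !== SPEC_PATH) {
    return { status: 404, headers: { "Content-Type": "text/plain" }, body: "Not found" };
  }
  if (method !== "GET") {
    // Only reads make sense for a static identity document.
    return { status: 405, headers: { Allow: "GET" }, body: "Method not allowed" };
  }
  return {
    status: 200,
    headers: {
      "Content-Type": "application/json; charset=utf-8",
      "Cache-Control": "public, max-age=3600", // the file changes rarely
    },
    body: JSON.stringify(WEBSITE_SPEC, null, 2),
  };
}
```

Keeping the object in source control next to the handler also covers the deploy-process point above: the file changes only when someone edits it in a commit.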
The Bigger Picture for Web Standards
New conventions on the web rarely win because a standards body mandated them. They win because enough developers adopt them early, tooling starts consuming them, and the behavior becomes expected. robots.txt was created by a single developer in 1994 and became a de facto standard through adoption alone. Same story for favicon.ico and, more recently, llms.txt, another emerging convention aimed specifically at helping large language models understand site content.
The Website Specification is early. It might evolve significantly. But the underlying problem it solves, giving websites a canonical, machine-readable identity document, is real, and it's getting more urgent as more of the web is consumed by automated agents rather than human readers.
This connects to a broader shift in how developers need to think about their work. Writing code that humans can read has always mattered. Writing sites that agents can accurately understand is becoming just as important. The software development life cycle is already adapting to include AI tools at every stage. Site-level metadata standards are just the next layer of that adaptation.
If you're already thinking carefully about how open web standards evolve and who controls them, the Website Specification is worth watching closely. The question isn't whether web identity standards will exist. It's whether developers will shape them or inherit whatever the big platforms decide instead.
Start Small, Stay Honest
You don't need to wait for this to become an RFC. Pick your most important project, write a spec file that honestly describes what it does and who it's for, and drop it on your domain. See what happens when you start thinking about your site the way you'd think about documenting a library.
The question worth sitting with: if an AI agent read your website specification file right now and nothing else, would it build the right thing when you ask it to help you?





