Because of its origins, HTML has a number of elements and concepts which are, in retrospect, probably not so great. Headings being the way they are, for example, has caused seemingly no end of discussion. I'd like to explain why I think this is, as well as what we should do about it.
Let's talk about paragraphs. You can't get much more basic than a paragraph. When I say "paragraph" pretty much everyone will know immediately what I mean. While some of us might stumble a bit if pressed to describe it abstractly, most of us will, like the US Supreme Court, surely "know it when I see it". Most people reading this, for example, will recognize immediately that this sentence is in its own paragraph.
Except that the above sentence isn't its own paragraph. Kind of. Let me tell you a story that will help explain the disconnect...
You see, the interesting thing about a paragraph is that we've had the concept for hundreds of years and its entire history has been to visually convey the start of something. Its meaning winds up coming from the visual. The sentence above looks like a paragraph on the screen or on the printed page, so, in effect it is. Except that it isn't really. But we'll get to that.
Some of you may have seen a mark like this before: ¶. If not, that's ok - it's an editor's mark called a "pilcrow". It means "mark the start of a new paragraph". Interestingly enough, the reason that paragraphs were historically indented in print comes from the fact that scribes began the habit of just starting a new line and leaving space to come back and draw the pilcrow later. Once you had the new line and space, you had an easy visual semantic already - the pilcrow becomes just noise. Editors (or teachers) will sometimes draw this mark between sentences to convey the same: "You should start a new paragraph here".
The thing worth noting in this story is that a pilcrow (or newline+indent) was a visual marker that implied, logically, that you were not just beginning a paragraph, you were inherently ending the previous one.
Flash forward a few hundred years to when we were just beginning to use computers. In the 1960s, GML borrowed the concept of editor's marks like the pilcrow. Thus, its authors reasoned about things the same way: no "closing tag" was required for many things. These traditions carried on years later in SGML, and years later still in HTML. As you may know, the original vocabulary of HTML was based on a lot of the SGML already in use at CERN. It was built "on the bones" with the intent that you could view these documents with a web browser and still pretty much understand them.
Given this history, HTML originally (unsurprisingly) contained only the simplest and most basic concepts from print. It was originally intended to be written with a simple rich text style editor in your browser, in a more or less "flat" fashion - visually. Thus, something like this might be an example of a really common "original" document back in the day:
<h1>This marks the top level heading
<p>You can know that this is a paragraph and not heading, because paragraphs and headings are mutually exclusive ideas, so, no need for a closing tag to that h1
<p>Similarly, you can know that the previous paragraph ends when this one starts, because anything else would be non-sensical. It is, effectively, a pilcrow, no need to close the p
<h2>You can know that this is a subsection because we haven't hit another h1 yet.
<p>And so on..
Except... Remember at the beginning when I said that sentence "wasn't a paragraph"? In fact, it is a span inside of the previous paragraph. That is something that defies logic, and yet - there it is. Kind of.
A lot of people might think: who cares? Is it that big of a deal? Yes, I think it is. We can't understand an "outline" of the document and it would be super if we could. Have you ever wanted to auto-generate a table of contents? Well, you can't easily do that based on mud, or at best it is a really "dumb" outline. If the only way to really discern the proper boundaries of sections and their relationships with "headings" requires eyesight, well, machines don't have it. What if a new device, like a watch or something, came along and wanted to generate an outline and present you with sections? How could it? Or how about this: You're driving in your car and I send you a link that says "read the history section here". Wouldn't it be nice if you could ask your phone's voice assistant "open this link and read me the history section"? Or wouldn't it be nice if search engines and AI could analyze your sections? Oh, and also anyone using assistive technology (AT) would have a hard time. I mention them last because I want to stress that, in fact, all of these things rely on the same stuff. It's not just "people using screen readers" who need this. You too effectively use (and will increasingly use) a kind of AT.
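To make the "dumb outline" point a bit more concrete, here is a minimal sketch in Python of about the best a machine can do when all it has is a flat run of heading levels. The names `Node`, `flat_outline` and `render` are made up for illustration, and this is not the HTML5 outline algorithm, just nesting by level number.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    level: int
    text: str
    children: list = field(default_factory=list)


def flat_outline(headings):
    """Nest (level, text) pairs using nothing but the heading level."""
    root = Node(0, "(document)")
    stack = [root]
    for level, text in headings:
        # Like a pilcrow, a heading implicitly ends every open
        # "section" at the same or a deeper level.
        while stack[-1].level >= level:
            stack.pop()
        node = Node(level, text)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def render(node, depth=0):
    """Return the outline as a list of indented lines."""
    lines = []
    for child in node.children:
        lines.append("  " * depth + child.text)
        lines.extend(render(child, depth + 1))
    return lines


# A page where a reusable fragment happens to carry its own level-1 heading.
page = [(1, "Title"), (2, "History"), (1, "Related posts"), (2, "Proposal")]
print("\n".join(render(flat_outline(page))))
```

In this sketch "Proposal" ends up filed under "Related posts", because the fragment's level-1 heading silently closed everything before it. Nothing in the flat markers says that was not the intent.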
Ok, so why are we so bad at this? First, I think it is because we are by and large such visual creatures. If the visual semantics "seem right" we just kind of assume they are - even when they are total shit. While "default visual styling" was supposed to be a side-effect of getting it right, it seems to have played the opposite role since it is so easy to muck up everything else and still get the visuals right. You can put a heading in any old place - like as the only child of a meaningless element. What is that a heading for? When does that "section" start and stop? It's hard to definitively know. Likewise, search engines making use of headings intelligently was supposed to be a side effect of getting it right too. Given all this, we seem to have always stressed the side-effects and forgot the meaning.
However, it's not just that "we made a mess by misusing them" - the truth is, they are kind of unusable. It's implausible to imagine reasoning about things with flat markers anymore. Modern documents aren't flat, not remotely. They are increasingly rich and structured and highly stylized. Today we have complex sections and articles and navs and things between those and so on. Hell, modern documents frequently contain a lot of markup that exists purely as something to hang CSS on. We just cannot reason about "markers" in the same way. We can create one hell of a mess though thinking that it actually "means" what it seems to clearly mean visually.
Consider all of the ways that we stitch together forms of reusable fragments: CMSs, build tools, application servers and templates, etc. All of those are Really Good Things. A whole lot of the Web only exists because of them. Except that there is no way to know ahead of time which level heading is actually appropriate to include in a reusable fragment! Wow.
Now - here's the really really shitty part: We kind of seem to have known this since the beginning. Before there was a W3C, before there was a Netscape or even a Mosaic, before almost anyone even knew there was a "web", there was already (and continued to be) a lot of talk scratching around the idea that "flat" doesn't work here, and at least at some level trying to talk about things in terms of just "sections with headings". In fact, in the 4th email ever sent to the new www-talk mailing list, Sir Tim Berners-Lee described this as his own preference. Note that the first message was Tim testing the server, the second was announcing itself, and the 3rd was someone subscribing. Realistically, one can say this might have been the first real chat about something in HTML. Let that sink in.
So yeah - the earth is round, documents are structured, and we should fix this problem with headings.
During the creation of HTML5, this was much discussed and there was a proposal for how to create a "Document Outline". This was speculative fiction, no one implemented it. Not even a little bit. Not even at all.
In June 2014, Steve Faulkner posted a kind of speculative polyfill (aka prollyfill) for an h element which was implemented in Polymer and tried to straddle the line that the Document Outline did. Then, just recently, Jonathan Neal reopened the discussion with a "not exactly custom element" speculative polyfill (it uses Mutation Observers to achieve that effect). That's great, I'm very excited. So, Jon and I consulted a bit on this new h proposal. I felt like I was having a hard time articulating my thoughts, and particularly why I had the same kinds of worries and what I'd like to see (and why). So I wrote this to explain. You see, a lot of proposals so far seem to kind of attempt to straddle the flat earth/round earth line. They carry on or adapt, to some degree or other, either the idea that the implication of starts can work or that 'level' is separable somehow from structure. However, I think these things are just kind of fundamentally broken and failed at their core. So here's what I'd like to see, and what I think is missing that will really help... Nothing.
Ok, ready? What does the h tag mean? Nothing.
Ok, that probably needs more...
It is possible to create a fairly simple recipe in which you can express meaningful outlines with an h tag. If you use it this way, we can derive meaning. If you don't - for example, if an author tries to flat-earth them, or nest them, or do some other crazy shit - we can't. That's where the Nothing comes in.
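As a rough illustration of what such a recipe could look like, here is a small Python sketch built on the standard library's `html.parser`. The specific rule is my assumption for the sake of the example, not the rule from any of the proposals mentioned here: an h only counts when it is the first element child of a section, and its level is simply how many sections it sits inside. Everything else is Nothing. The names `SectionOutline` and `outline_of` are hypothetical, and the sketch assumes every element is explicitly closed (no void elements, no implied end tags).

```python
from html.parser import HTMLParser


class SectionOutline(HTMLParser):
    """Collect (depth, text) for each <h> that is the first element in a <section>."""

    def __init__(self):
        super().__init__()
        self.open_tags = []        # names of the currently open elements
        self.child_counts = [0]    # element children seen so far, per open element
        self.capture_depth = None  # stack depth of the <h> being read, if any
        self.chunks = []           # text collected for that <h>
        self.outline = []          # resulting (section depth, heading text) pairs

    def handle_starttag(self, tag, attrs):
        self.child_counts[-1] += 1
        is_first_child = self.child_counts[-1] == 1
        parent = self.open_tags[-1] if self.open_tags else None
        self.open_tags.append(tag)
        self.child_counts.append(0)
        # Assumed recipe: only a leading <h> directly inside a <section> means anything.
        if (tag == "h" and parent == "section" and is_first_child
                and self.capture_depth is None):
            self.capture_depth = len(self.open_tags)
            self.chunks = []

    def handle_data(self, data):
        if self.capture_depth is not None:
            self.chunks.append(data)

    def handle_endtag(self, tag):
        if not self.open_tags or self.open_tags[-1] != tag:
            return  # sketch only: mismatched end tags are ignored
        if self.capture_depth == len(self.open_tags):
            depth = self.open_tags.count("section")
            self.outline.append((depth, "".join(self.chunks).strip()))
            self.capture_depth = None
        self.open_tags.pop()
        self.child_counts.pop()


def outline_of(html):
    parser = SectionOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


doc = """
<section>
  <h>Paragraphs</h>
  <p>Intro.</p>
  <section><h>History</h><p>Pilcrows.</p></section>
  <div><h>Stray heading</h></div>
  <h>Late heading</h>
</section>
<h>Orphan</h>
"""
print(outline_of(doc))
```

Under these assumptions only "Paragraphs" (depth 1) and "History" (depth 2) make it into the outline. The stray, late and orphaned h elements contribute nothing, and a reusable fragment never needs to know its level ahead of time because the level falls out of where it gets nested.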
Let's pretend that you just open an editor and start typing HTML today and you use a foo tag. You can, you always could. You know what it means? Nothing. You know what it looks like? Again, Nothing. Its insides match its outsides. That seems... good?
Now, let's imagine something else: Take your foo and style it to look like an h1. Visually it means h1 - just like our opening example. But guess what it means to everything else? Nothing.
I would wager that if anyone suggested that we simply solve the heading problem by adding an h rule to the default UA stylesheet, pretty much everyone would have the same reaction: "you will confuse authors - because it means Nothing". And they'd be right - but that's kind of what we've been doing with headings all these years. In those scenarios where authors wrote funky markup, the visual looks great, but the tree is whack. It means "nothing" (or at least it is misleading enough to not have real meaning). That seems... bad?
I think this is kind of broken. Our "meaning" isn't obvious by default visually, so we're lulled into thinking it is "mostly right" or something, but really, that's a side effect.
Imagine instead that we wrote that rule such that only a "good" tag would take on any meaning at all - even visually. The rest would be explicitly Nothing. Then, our insides would match our outsides again. It would be a really good incentive to learn "the right way to convey meaning" and to apply it.
A really incomplete visualization/explanation of this can be seen in this codepen and I've opened an issue in Jon's repo to discuss it.
Very special thanks to Jeremy Keith for locating the email from www-talk that I was originally looking for but had been unable to find. This article originally contained a reference to one from a year and a half later.