Author Information

Brian Kardell
  • Developer Advocate at Igalia
  • Original Co-author/Co-signer of The Extensible Web Manifesto
  • Co-Founder/Chair, W3C Extensible Web CG
  • Member, W3C (OpenJS Foundation)
  • Co-author of HitchJS
  • Blogger
  • Art, Science & History Lover
  • Standards Geek
Posted on 12/03/2018

Note from the author...

My posts (like this one) frequently have a 'theme' and tend to use a number of images for visual flourish. Personally, I like it that way, I find it more engaging and I prefer for people to read it that way. However, for users on a metered or slow connection, downloading unnecessary images is, well, unnecessary, potentially costly and kind of rude. Just to be polite to my users, I offer the ability to opt out of 'optional' images if the total size of viewing the page would exceed a budget I have currently defined as 200k...
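For the curious, the gist of that feature is simple enough to sketch. This is not the actual code behind this page, and the data-src / data-bytes attributes below are purely illustrative assumptions, but it shows the shape of a byte budget for optional images:

const BUDGET_BYTES = 200 * 1024; // the 200k budget mentioned above

// Optional images are assumed to carry data-src / data-bytes instead of src,
// so nothing is fetched until we decide to load them.
const optional = [...document.querySelectorAll('img[data-src]')];
const total = optional.reduce((sum, img) => sum + (Number(img.dataset.bytes) || 0), 0);

// Under budget? Just load them. Over budget? Ask the reader first.
if (total <= BUDGET_BYTES || confirm(`Load ${Math.round(total / 1024)}k of optional images?`)) {
    optional.forEach(img => { img.src = img.dataset.src; });
}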

Standards Crisis on Earth-1

This week, not for the first time, some folks (here's one link) suggested that our inability to standardize really common things has had some negative impacts, and sparked discussion about how to fix that. I agree, but I also have complex thoughts on this that have been brewing for some time in the form of half-finished blog posts... So, I'll try to write those here and, hopefully, not make it too boring.

Introduction: A thought experiment setup

(Image: a mysterious stranger - it's the Flash... or... a flash)

Imagine that one day, during a small meeting of Web standards folks, a curiously dressed stranger arrives. "This earth's Web standards are in danger," he says. This earth?

Through him, we learn about the existence of a multiverse in which an infinite number of "earths" exist and play out differently. If you're not familiar with this, here's a clip from CW's show The Flash briefly explaining the multiverse (it should jump you right to the relevant part, 1:06 in, but you'll have to stop it yourself).

So, we can think about our own reality as "Earth-1" and his as "Earth-2". As he explains, his earth's Web (Earth-2's, that is) has a different timeline than ours, and he whiteboards a table for us roughly illustrating how they differ, with some very key elements called out.

Earth-1 (our reality) vs Earth-2:

1991
  Earth-1: TimBL creates "the Web" at CERN. HTML has 13 elements.
  Earth-2: Harrison Wells creates the Web at Star Labs. HTML has 80 elements (almost the identical set that will make up the Earth-1 HTML 4.01 specification).

1993
  Earth-1: "We should have way more tags!" Browsers support different subsets and supersets of Tim's originals. HTML+?
  Earth-2: People are pretty happily using HTML.

1994-1996
  Earth-1: Netscape! Ads! Java and JavaScript - DOM0! HTML sent to the IETF (RFC 1866) for standardization. HTML 2? W3C!
  Earth-2: Netscape! Ads! Java and JavaScript - DOM0! W3C!

1996
  Earth-1: Whoa, IE supports a new "CSS" thing. HTML 3!
  Earth-2: Whoa, IE supports a new "CSS" thing.

1997
  Earth-1: Whoops, no... HTML 3.2.
  Earth-2: People happily make websites.

1998
  Earth-1: HTML 4, CSS 2... Whoops, no wait, 4.01.
  Earth-2: CSS 2.

2000-2008
  Earth-1: XML ALL THE THINGS! Further CSS/JS development.
  Earth-2: XML ALL THE THINGS! Further CSS/JS development.

2008-2010
  Earth-1: WHATWG, "HTML5". Does hard interop work. Adds new input types. Adds appcache and some other APIs. Adds many new elements.
  Earth-2: WHATWG, "HTML Living Standard". Does hard interop work. Adds new input types. Defines the principles of the Extensible Web Manifesto and explains the platform, creates missing primitives, pioneers things like custom elements. By the end of this it is possible to implement just about all HTML elements as custom elements with very good parity.

2010-2014
  Earth-1: People are using parts of HTML5 with polyfills.
  Earth-2: People are beginning to make custom elements.

2014
  Earth-1: Extensible Web Manifesto; Custom Elements "v0" work developing.
  Earth-2: People have, surprisingly, created identical custom element doppelgangers for every "Earth-1" HTML5 counterpart.

2014-2018
  Earth-1: People are still just starting to scratch the surface of custom elements (v1) and things like shadow DOM, which aren't quite "there" yet but get continually closer.
  Earth-2: The doppelganger custom elements have exact usage parity with their native Earth-1 counterparts. That is, every line of Earth-1 code that says section simply says x-section on Earth-2, and so on.

"A rift," he explains, "was opened between universes and I have been trapped here for 6 months learning the differences between our worlds. They're actually very nearly identical in many ways, but I believe that my own earth provides some interesting lessons for yours."

Why?

For the last four or five years we've (in reality) made a lot of progress, on a lot of fronts with regard to reforming "standardization". We widely agree on a lot of foundational "bits" even across numerous standards bodies. But, there are still a lot of unanswered questions about how to make this really "click". While the basic challenges are largely similar across standards bodies, HTML is especially interesting because it adds additional complexities...

We all agree that standardization should be based on "science" and that it should involve developers and experimentation "in the wild". "The path to standardization leads through custom elements" is the refrain in both universes. The idea that standards bodies should work more like dictionary editors is a pretty well accepted "concept" all around.

But, "how, specifically, do we do those things?" is as yet unanswered and, it seems quite difficult to get people to discuss details in our own universe. Why? Because custom elements are young and it's very hard to speculate about hyptothetical elements and futures. It very quickly gets into all kinds of weeds and sputters out.

So, that's the idea with the multiverse: perhaps imagining a universe in which all of the HTML5 elements arose in this fashion is, in a way, easier. We don't have to get people to accept that some kind of "section" element "might be useful" or "might get popular". We can simply say "it did", in precise parallel with our universe. It's easier to imagine. It allows us to get to the "...then what?" and focus on the actual questions.

Also, I think it is more fun.

"Data" initial handwaving

So, we've said that data exists on Earth-2, but that doesn't mean we have it. In fact, that's job #1: Any kind of "science" is going to inevitably involve measuring. But who measures? How? And what do we accept that those measurements actually mean? What do we do with that information?

Who measures, and how, is a potentially more complex topic than it seems on its face, so I'd like to just handwave over it for a moment and come back to it later. For now, let's imagine that something like the HTTP Archive report on element use will do for initial discussion.

Aha... This is probably the first thing we can learn: that report is not currently capturing custom elements at all. In theory, though, I think that is pretty easy to remedy, so let's just pretend that we can actually get an Earth-2 report identical to the one linked, except that all of the "HTML5 elements" in that table are custom element doppelgangers like x-section, x-aside or x-progress instead.
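As a rough sketch of what 'capturing' them might involve: custom element names are required to contain a hyphen, so a crawler that already parses each page could count dashed tag names with something like the following (this is my own illustration, not how the HTTP Archive actually gathers its data):

// Count custom elements in a parsed document by looking for dashed tag names.
function countCustomElements(doc = document) {
    const counts = {};
    for (const el of doc.querySelectorAll('*')) {
        const name = el.localName;
        if (name.includes('-')) {
            counts[name] = (counts[name] || 0) + 1;
        }
    }
    return counts; // e.g. { 'x-section': 12, 'x-aside': 3 } on an Earth-2 page
}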

Usage and Languages

If we look at that information (measuring use on unique URLs to smooth out local biases where a few authors use an element really a lot) as a chart, it would look like this.

One thing that you'll notice is just how quickly usage "falls off". There are lots of ways to attempt to rationalize the falloff there, but the truth is that this seems to be true of all languages. Fascinatingly, it seems that no matter what dataset you look at, about 20% of words account for 80% of word occurrences, and you wind up with a similar kind of drop off and curve.

80% of pages counted here contain the elements html, head, body, meta, title, script, div, link, a, img, span, li, ul and p, but then things begin to drop off and they seem to do so kind of precipitously.

But even simple, mature and common tags like em, strong, u, b and i, which are easy to use regardless of how you even create your content, are used far less. i has even been popularly overloaded for use as an icon (no comment), increasing its chances of use. But none of these, not a single one, is in the "Top 20". Even with all of those advantages, i appears on less than 40% of URLs and em on less than 14%.

In fact, of this list of 135 elements that are actually part of some Earth-1 standard, almost 40% have less than 1% use. This is, again, interesting, as nearly half of all word use across any corpus is the same core 50-100 words, and nearly the other half will be words that appear in that same corpus only once. What's more, it's kind of hard to communicate using only those core words.

So what's the point? Well, simply that we should probably expect elements to follow "similar rules" of distribution and occurrence, and that this should probably inform what we accept these measures to mean. It might seem at first like 1% (or even 5%) is "basically useless compared to this other thing", but that's not actually the case at all.
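To make that concrete, here is a small sketch of the kind of arithmetic involved: given occurrence counts per element (the numbers below are invented purely to show the shape of the curve, not taken from the report), we can compute what share of all occurrences the top N elements account for and watch how quickly that share saturates:

// Given { elementName: occurrenceCount }, return cumulative coverage by rank:
// coverage[i] is the share of all occurrences accounted for by the top i+1 elements.
function coverageByRank(counts) {
    const sorted = Object.values(counts).sort((a, b) => b - a);
    const total = sorted.reduce((sum, n) => sum + n, 0);
    let running = 0;
    return sorted.map(n => (running += n) / total);
}

// Hypothetical numbers, just to illustrate the falloff:
const coverage = coverageByRank({
    div: 90000, a: 70000, span: 60000, li: 50000,
    p: 30000, img: 20000, em: 1500, progress: 90
});
console.log(coverage.map(c => (c * 100).toFixed(1) + '%'));
// The first few entries already cover the vast majority of occurrences.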

"Dictionary Research"

Ideally, someone (or something) would be "watching" data that probably looks a lot like the above and looking for slang that is getting popular enough to warrant further review. Ideally they would have some kind of rule.

Then, in terms of a language like English, once some slang triggers the metaphorical batcomputer and "has their attention", researchers have to review really a lot of actual uses to see if there is an accepted definition of the word. It seems we would need that too. We need to know not just that those tags appear, but whether they mean the same thing, and are used consistently enough to draw a definition from. It may well be the case that a lot of people are simply using a common word for very different and totally incompatible purposes.

What specific rule(s) would work to trigger such review is certainly debatable, but I think we can probably come up with at least initial ideas and even eliminate some as "probably unreasonable" by thinking through this exercise a little more.

For example, practically speaking, the more use an element has, the harder this would be. I think that we can probably agree that reviewing uses of Earth-2's x-section would be pretty hard, because it's used on nearly 1/3 of all URLs in that report. In fact, as I will argue later, it's probably much worse than that.

So, combined with what we know about languages and word occurrence, this seems to argue that we need a rule that catches more, earlier. Even a rule like "any element that appears in the Top 100 occurrences of a report like that" will miss important things, and maybe be too late. Anything considerably more stringent than that (like Top 40 or Top 50) is almost certainly not going to work. Or, perhaps it is a report that only shows the top N custom elements by occurrence.
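Just to make the discussion a little less abstract, a rule like that is easy enough to state in code. The report shape below (an array of { element, urls } rows) is an assumption of mine, not the HTTP Archive's actual format:

// Flag custom elements (dashed names) that rank in the top N of an occurrence report.
function elementsNeedingReview(report, topN = 100) {
    return report
        .slice()                                   // don't mutate the caller's report
        .sort((a, b) => b.urls - a.urls)           // rank by number of URLs the element appears on
        .slice(0, topN)                            // keep only the top N ranks
        .filter(row => row.element.includes('-')); // custom elements only
}

// e.g. elementsNeedingReview(earth2Report, 100) might surface rows for
// 'x-section', 'x-aside' and so on, queued up for "dictionary research".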

But, I think that it's more than just 'when do we review'. I think in many cases we want to proactively work more closely with at least some developers, very early on. There are several attempts at this sort of thing, but currently, at least, we've not figured out exactly how to do this or how it fits into standardization.

In any case, we can safely imagine that x-section use grabs attention, and that after some herculean review we agree that there is a common use/definition... Now what?

Dictionary, or Dictionaries?

Well, now it gets even a little fuzzier. The most straightforward way of understanding "standardization" here is that elements become part of HTML. I think that, at a minimum, everyone in all universes would agree that HTML should contain the "core words", but what are those? "section" is almost certainly one, but it's unclear what the "core words" really should be. Further, even in spoken languages, real communication usually requires more than core words. It seems very likely to me that HTML should have (and arguably does on Earth-1 already) at least some "more specific" words too. But, I'm not sure where we draw the line, or that we're discussing what that even means.

On Earth-1, for example, before "HTML5" one could definitely argue that HTML itself was about really general things whose semantics were about basic text. Earth-2's version of 'core words' still has that characteristic. But our HTML now includes elements like progress, which, in practice, does not really seem like that. Unless you really, really squint at it and imagine use cases that don't actually exist, progress really only has meaning with dynamic state - for something more "applicationy". It is undeniable that people build applications with the Web - but the question of whether HTML itself should include tags about "application stuff" was very real: many conversations took place in the early-to-mid 2000s. We decided "yes" and put them into HTML5 - but if things developed the way described on Earth-2, where HTML didn't include that sort of thing natively... would we add it to HTML itself? I'm honestly not sure.

What I do know is that we will have to draw lines: Not everything useful, even useful to significant numbers of people, can probably 'fit' in HTML. What do we do with those? Do we just pretend they don't exist? Are they simply relegated to life as custom elements and everything that currently entails?

If so, this seems like kind of a pity for a lot of reasons. First, the "price" of doing otherwise expands as very popular ideas build on one another. Second, some kind of standardization is still probably helpful, even if that is outside of HTML itself, and it seems that encouraging and incentivizing that for things that can find an audience would be helpful. Is it possible that there is some kind of thing in between "it's part of HTML" and "it's just a custom element like any other" which has some advantages and incentives that help encourage good things to happen?

Some idea around helping optimize or virtually eliminate loading times, for example, would be a big deal. Perhaps if dictionary researchers identified something like this we could provide a way to expose a 'universal cache' registry or something that effectively registered a hash with the browser itself, which developers could lean on to avoid 99% of loading for that definition... or something? Perhaps those could also use WebAssembly to further reduce parse? I don't know, honestly, I'm just describing vague ideas about what the possible 'advantage' characteristics of an 'in-between' solution might be.
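To be clear about just how vague that is, the sketch below is entirely hypothetical: none of these names exist in any browser, and it is only meant to illustrate the shape of the idea of a hash-keyed, shared definition cache:

// Hypothetical only: 'customElementRegistry.fromHash' and the blessed hash are
// invented for illustration; no browser exposes anything like this today.
const X_SECTION_HASH = 'sha256-...'; // content hash of the agreed x-section definition

async function ensureXSection() {
    if (window.customElementRegistry && customElementRegistry.fromHash) {
        const hit = await customElementRegistry.fromHash(X_SECTION_HASH);
        if (hit) return; // definition came from the shared cache; nothing to fetch or parse
    }
    // Fall back to loading the definition like any other custom element today.
    await import('./x-section.js'); // './x-section.js' is a made-up module path
}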

The closest thing I have seen to discussions like this is layered APIs, but I'm not quite sure that that's really the same thing, or that it is "it." I'd like us to think this through a lot more.

Codifying vs Similar Solutioning

Another thing that I think plays into this is thinking about the story of how we go from custom element to native HTML.

Again, considering that Earth-2's data suggests unequivocally that there should be a standard "section"... what does that actually mean in its final state and, probably as importantly, how do we arrive there along the way?

Let's imagine that they decide to just make a native section tag now. As always happens, a browser or two actually does the work and starts shipping it. An Earth-2 developer starting a new project is faced with a 'new' kind of problem that I think we've never faced before: they can choose between a popular custom element or its native doppelganger, but they are written differently. Ideally, they would start using the "native" one and optionally load the custom element definition as a polyfill -- but how? As far as I have seen there is no straightforward path, advice or pattern here about how to span this gap.

A developer could, perhaps, skip loading the actual definition, but they would have to keep using the custom element markup and registry... For example, perhaps they could write something like:

// Hypothetical: HTMLSectionElement is the interface a native Earth-2
// section element would expose once a browser ships it.
if ('HTMLSectionElement' in window) {
    // Point the x-section tag (which all existing markup uses) at the
    // native implementation instead of loading our own definition...
    customElements.define('x-section', HTMLSectionElement)
} else {
    // ...otherwise, load the x-section polyfill definition as before.
}

But... if I think about it, that's kind of screwed up and makes it seem harder to "dig out of" that situation. Until all browsers ship it, people are extremely unlikely to use it because so many already have a custom element solution that works everywhere. Until people use it, browsers are unlikely to prioritize shipping it, even if we can agree it is good. Even once everyone has shipped, there is legacy - and the bigger that legacy gets, the harder this problem is. This approach seems like it would cause the 'legacy' to considerably expand rather than contract.

That's kind of interesting because it's both a new problem for us (again, as far as I can see) and a problem that language dictionary editors don't have. They don't mint a similar solution and attempt to coin a new word that means the same thing as something in popular parlance. They just say "yeah, that's officially a word".

Perhaps instead an 'in-between' solution could be helpful in many of these cases too. Allowing that x-foo could actually be a standard, and making it easy to opt into a more official and 'advantaged' one, would actually be a lot closer to what language dictionary editors do, and maybe that is important in useful ways?

Again, I don't know. These are things I would like to see more discussion on.

Actually Measuring and Researching

Thus far we have simply imagined that the data from the HTTP Archive is our data set. But, as I said, I think that getting the "real" data that we actually need is probably going to be trickier than that.

To illustrate, let's turn back to the progress (or x-progress) example. Is that really an accurate read on its use? I would wager, very probably not. Why? Well, I think that it's because of what we're measuring there, and how.

While the HTTP Archive has really a lot of data, it's actually not even the tip of the iceberg that is the real "Web" (i.e., all the things you view in a Web browser). As great as it is, it is neither unbiased (not in an intentional or negative sense) nor complete.

x-progress is an example of something that I would fully expect to under-report in the HTTP Archive report. It is frequently only inserted into a document temporarily, and based upon some significant user interaction. It's considerably more likely to be used in things that are more "appy". Those are kind of hard for the HTTP Archive to capture even when they are "public", and those sorts of things are also frequently "behind" something that makes it even harder -- requiring a login, or a corporate intranet, for example. So, that really skews what is going to show up here and, if we are looking at occurrences and ranks as our 'trigger' or as how we go back and do the research, that's going to make whole classes of very useful elements substantially disadvantaged and others substantially over-advantaged in ways that might be worth considering.
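A contrived sketch makes the under-reporting easy to see: if x-progress only exists in the DOM while some user-initiated work is in flight, a crawler that snapshots the page as loaded will simply never see it (the '/upload' endpoint here is made up for illustration):

// The element is only present during the upload, so a load-time snapshot misses it.
async function uploadWithProgress(file) {
    const bar = document.createElement('x-progress');
    document.body.append(bar);       // inserted only after the user picks a file
    try {
        await fetch('/upload', { method: 'POST', body: file }); // hypothetical endpoint
    } finally {
        bar.remove();                // gone again before any crawler would look
    }
}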

What's more, as I said, it includes a lot of sites, but it's still limited not only by the above, but also by some measure of 'popularity'. Depending on what you are talking about, this can have a really big impact and, I think, lead us astray. Eating food, for example, is critical - but I spend far more time doing all sorts of other inane things. It's less popular by that measure than, say, playing Playstation. But... idk... it feels important.

Another, probably better, way to measure might be browser telemetry. But, afaik, browser telemetry doesn't give us back URLs or a good way to reach and research those things either.

Further, we'd have to be careful to somehow collect and coalesce this data seamlessly too, to avoid other kinds of hiccups. For example, there are sites/apps which are keenly important and widely used for specific purposes, but which, for various reasons, people avoid viewing in certain browsers. This is bad for the Web, but it isn't always for equally bad reasons: telemetry data from Firefox, for example, would for a long time have seriously under-reported how important video was for me, saying it was near 0. That would have been entirely untrue, though; I almost always have videos playing. Rather, most of us quickly learned that Firefox wouldn't play a lot of videos and would consciously view sites with videos in Chrome.

Similarly, I have seen other very particular, but not rare, use cases in which people are advised not to use specific browsers for something fairly specific because of issues with third-party software. Businesses and developers frequently have to play a very delicate balancing act, and it would be terrible if, for example, a browser were to claim that its own low usage numbers didn't need to be specially reconciled somehow with another browser's much higher numbers, because such a difference may well indicate exactly that kind of gap.

So....?

So, it seems like, somehow, we're going to have to start figuring out how we really science the shit out of this. How do we do it? How do we answer all of these questions and set up processes (and maybe new roles?) to make this really hum?

We want developers to play a big role here, but without answers, developers speculate about the future. I've seen this already and I worry that this could be disillusioning if that speculation winds up being far from reality. I don't want people to feel misled or disillusioned. If this goes badly, our hypothetical visitor from another earth might be right: our standards might be in trouble. In fact, I think that this might already be even more relevant and pressing than we realize... but I'll save that argument for another post.

It's entirely possible that there is some fairly complete and mature thinking around some of these issues that I've just not yet heard articulated. I'm fairly certain, though, that even if there is, there are still many additional discussions to be had, changes to try and so on before we've really figured it out - so I would love to have (or hear) more of those conversations. It definitely seems to me like there is a lot of value in having some open conversations on these topics.