Author Information

Brian Kardell
  • Developer Advocate at Igalia
  • Original Co-author/Co-signer of The Extensible Web Manifesto
  • Co-Founder/Chair, W3C Extensible Web CG
  • Member, W3C (OpenJS Foundation)
  • Co-author of HitchJS
  • Blogger
  • Art, Science & History Lover
  • Standards Geek
Posted on 3/29/2016

TokenLists: Missing Web DNA

As we move forward with the Web, it's important that we look down into its DNA and continue to find the missing connections: things that are fundamentally related in concept, and managed as such by the browser, but not exposed to developers in a similar fashion. I'd like to talk about one of those I recently uncovered...

A long long time ago, in a browser far far away, Brendan Eich introduced what would become known as "DOM Level 0" - basically: Simple reflective properties that allowed you to access useful bits of what would later become "DOM" and twiddle with them.  It looked something like this...

document.forms[0].firstName.value = "Brian";

However, there is a long, complex and twisted history that led us to where we are today (see my A Brief(-ish) History of the Web Universe series of posts).  To sum up some key bits: CSS and the actual DOM were conceived of separately from thoughts of DOM0 and JavaScript.  Unlike its predecessor, the "real" DOM was intended to be generic. Part of this was just trying to bring a lot of people together and "fix" the Web. At the time of its conception, there was a lot of focus on trying to address the problems of SGML that made HTML so appealing in the first place - so we got DTDs for HTML and work began on all things XML.

The "real" DOM, then, was intended to serve all masters, and as such it dealt with basic attributes in a tree which could be serialized, parsed, manipulated and rewritten in any language with a common interface. This meant that authors would use getAttribute(attributeName) and setAttribute(attributeName, value) to get and set attribute values respectively.  To those spec writers, then, it seemed absolutely logical to create an attribute called "class" and allow a user to type:

element.setAttribute("class", "intro")

Which would be serializable or parsable as something like

<div class="intro">

This was problematic in the browser, though, because DOM Level 0 was not only better known, but far more terse and convenient. Most authors really just wanted to deal with reflective properties and type something like:

element.class = "intro";

In CSS, the class attribute has special meaning, so surely it deserved some sugar. But the above wouldn't have worked because, at the time, property names that were JavaScript reserved words simply weren't allowed. To resolve all these issues, we got the .className property, which reflects the class attribute.  Problem solved... Except, not.

Not Simple Enough

CSS says that any element can specify 0...N classes, not 0 or 1. These are provided in a space separated list.  In SGML/XML terms these were "NMTokens".  It sounds quite simple - a space separated list of values with some simple constraints should work everywhere, and it does... kind of.

In the browser world, however, where we were messing with classes at runtime all over the place that needed to be reflected back (CSS wasn't based on runtime properties, it was based on attributes) we began facing issues.  Someone would come along and write code like the above example, which assumed that it was a single value:

element.className = "intro";

The net result being that any existing class names at the time of execution were replaced with just one.  Some other person would assume they wanted to toggle a value and write something like:

// Toggle the 'selected' class
element.className = (element.className === "selected") ? "" : "selected";

The problem is two-fold: first, it assumes it can === a single value; second, it can overwrite all the other classes.  We had problems adding classes, removing them, toggling them, and finding out whether an element had a given class.  It sounds trivial, but it turns out that it wasn't: each time you wanted to touch the className you had to deal with deserializing the string, doing your work and re-serializing it without stepping on any of a number of landmines.  The net result was, as one might expect, that we came up with libraries to help with this. However, they varied in quality and assumptions. It was still a mess.
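To make those landmines concrete, here is a minimal sketch of the kind of helper such libraries had to provide. The names parseTokens and toggleToken are hypothetical, made up for this illustration, and the functions work on plain strings so they run outside a browser too.

```javascript
// Hypothetical helpers illustrating the deserialize / modify / re-serialize cycle.

// Split a class string on whitespace, dropping empty entries and duplicates.
function parseTokens(value) {
  const tokens = [];
  for (const token of String(value).split(/\s+/)) {
    if (token !== "" && !tokens.includes(token)) tokens.push(token);
  }
  return tokens;
}

// Return a new class string with `token` toggled, leaving other tokens untouched.
function toggleToken(value, token) {
  const tokens = parseTokens(value);
  const index = tokens.indexOf(token);
  if (index === -1) tokens.push(token);
  else tokens.splice(index, 1);
  return tokens.join(" ");
}

// The naive comparison misses "selected" when other classes are present.
const className = "intro  selected";
const naiveMatch = className === "selected"; // false, even though it is selected
const toggledOff = toggleToken(className, "selected"); // "intro"
const toggledOn = toggleToken(toggledOff, "selected"); // "intro selected"
```

In a page, this would be used as element.className = toggleToken(element.className, "selected"), which is a lot of ceremony for flipping one class.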

Problem Solved.

When jQuery joined the W3C after becoming the most widely used solution, they lobbied to improve this situation (disclaimer, I represent jQuery in several W3C groups).  It wasn't long before we had the .classList interface.  The world is much better with .classList at our disposal - finally we can be rid of the above problem.  Now users can write:

element.classList.add("intro");

It's the missing interface developers always needed.

Wait... Problem Solved?

Sadly, I think still not quite.  While it's a major improvement, the trouble is that the NMTokens issue does not affect only the class attribute, or only JavaScript, yet we only exposed a solution through .classList.

It's quite possible that you are thinking "Well, this probably isn't something I need to worry about because I've never come across it".  However, I think you will, and that's the problem.

There are other NMTokens issues that you've probably not thought about before but eventually will have to.  Accessibility is a good example of where this pops up a lot, and if you've not thought about accessibility in the past, it's very possible you've never run into it for that reason alone.

The aria-describedby attribute is just one example. A control can be described by multiple elements for different purposes.  For example, an input element may have associated helpful advice that appears in a tooltip popup, as well as associated constraint validation errors.  Further, it works a lot like the class attribute and poses similar challenges in JavaScript, in that it frequently has to be actively maintained, not just written in markup, and that's deceptively hard.  For example, an author should not associate an input with an errors collection until there are actually errors.
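As a rough illustration of that maintenance burden, here is a sketch of doing it by hand. The helper setDescribedBy, the stand-in createFakeElement and the IDs name-help and name-errors are all hypothetical; the stand-in only exists so the sketch can run outside a browser, and a real element could be passed instead.

```javascript
// Hypothetical helper: add or remove one ID in aria-describedby by hand.
// `element` only needs getAttribute / setAttribute / removeAttribute.
function setDescribedBy(element, id, present) {
  const current = element.getAttribute("aria-describedby") || "";
  // Drop empty entries and any existing copy of `id`, keeping the other IDs.
  const ids = current.split(/\s+/).filter((t) => t !== "" && t !== id);
  if (present) ids.push(id);
  if (ids.length > 0) element.setAttribute("aria-describedby", ids.join(" "));
  else element.removeAttribute("aria-describedby");
}

// Minimal stand-in for a DOM element (an assumption for this sketch).
function createFakeElement(attributes = {}) {
  const attrs = new Map(Object.entries(attributes));
  return {
    getAttribute: (name) => (attrs.has(name) ? attrs.get(name) : null),
    setAttribute: (name, value) => { attrs.set(name, String(value)); },
    removeAttribute: (name) => { attrs.delete(name); },
  };
}

const input = createFakeElement({ "aria-describedby": "name-help" });
setDescribedBy(input, "name-errors", true); // validation failed: link the errors
const withErrors = input.getAttribute("aria-describedby");
setDescribedBy(input, "name-errors", false); // errors fixed: unlink them again
const withoutErrors = input.getAttribute("aria-describedby");
```

This is the same parse, modify and re-serialize dance as with className, just on a different attribute.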

This sucks.  ARIA is hard enough without re-facing remedial-seeming challenges that are identical to ones we've already solved.

Good News and Bad

The good news is that standards makers had the foresight to create an interface for this type of problem called DOMTokenList with all the useful methods and properties that .classList exposes.  The .classList property holds a DOMTokenList.

The bad news is that it's pretty much locked away and there's no way to easily re-apply it to new things as they emerge.  We could continue to identify spec properties and create new things like .classList each time we find them.  For example, we could expose .ariaDescribedByList - and we might want to occasionally do that - but it's not great.  It's just additive.  Each time we do, the API surface to learn gets bigger. It also doesn't expose these abilities to custom elements, and it doesn't help with anything that isn't specifically HTML (if you care about that sort of thing).

Alternatively, however, we could define a single new foundational DOM method to expose any attribute this way.

Can you show me?

Yes! The good news is that this is actually pretty easy to do, requires minimal new API to learn, and should reasonably work for everyone.  Jonathan Neal and I are providing a prollyfill, or a "speculative polyfill", for this (public domain). This allows people to ask for an attribute as a DOMTokenList and deal with it the same as they would .classList.  Because it's a proposal, and a standard may differ should one ultimately arrive, we've underscored the method name to keep it future safe, but here's an example of its use...

element._tokenListFor("aria-describedby").add("foo-help-text");

In Extensible Web terms, this isn't asking for new additive functionality at all - it is explaining magic that already exists, but lies mostly dormant and unexposed in the bowels of the platform.  Given this interface, the .classList property, for example, is then merely legacy sugar for its equivalent ._tokenListFor(attr) accessor (which doesn't require 'name' distinction either and deals with dasherized attributes just fine too):

element._tokenListFor("class").add("intro");
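For a feel of what sits behind such a method, here is a minimal sketch of a token-list view over an arbitrary attribute. It is not the actual prollyfill: the standalone tokenListFor function and the stand-in element are assumptions made for illustration, and only a few DOMTokenList-style methods are covered.

```javascript
// Hypothetical sketch: a DOMTokenList-like view over any attribute.
// It re-reads the attribute on every call, so it never goes stale
// when the attribute is changed elsewhere.
function tokenListFor(element, name) {
  const read = () =>
    (element.getAttribute(name) || "").split(/\s+/).filter((t) => t !== "");
  const write = (tokens) => element.setAttribute(name, tokens.join(" "));
  const contains = (token) => read().includes(token);
  const add = (token) => {
    const tokens = read();
    if (!tokens.includes(token)) write(tokens.concat(token));
  };
  const remove = (token) => {
    const tokens = read();
    // Only write when something changes, so no empty attribute is created.
    if (tokens.includes(token)) write(tokens.filter((t) => t !== token));
  };
  // Returns true if the token is present after the call.
  const toggle = (token) => {
    if (contains(token)) { remove(token); return false; }
    add(token);
    return true;
  };
  return { contains, add, remove, toggle };
}

// Stand-in for a DOM element (an assumption so the sketch runs anywhere).
const attrs = new Map([["class", "intro"]]);
const element = {
  getAttribute: (name) => (attrs.has(name) ? attrs.get(name) : null),
  setAttribute: (name, value) => { attrs.set(name, String(value)); },
};

const describedBy = tokenListFor(element, "aria-describedby");
describedBy.add("foo-help-text");
describedBy.add("foo-errors");
describedBy.remove("foo-errors");

const classes = tokenListFor(element, "class");
const nowSelected = classes.toggle("selected");
```

The same few lines serve class, aria-describedby, or any other space-separated attribute, which is the point of exposing one foundational method rather than one property per attribute.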

Thanks to the many people who proofread, looked at demos, discussed or gave thoughts on this as it developed, including Jonathan Neal, Bruce Lawson, Mathias Bynens, Simon St Laurent, Jake Archibald, and Alice Boxhall.