The Secret Life of Custom Elements
Twenty years ago last month, Google published an analysis of "slightly over a billion documents," a snapshot of the web that helped shape the early direction of HTML5. It followed a lineage of smaller, more personal studies — individuals poking at the web to answer some narrow question, often with datasets that would easily fit on a thumb drive today. For about half those two decades, I’ve been arguing that we need more study of the web, not less. The platform evolves faster than our understanding of it, and the only way to know what the web actually is — not what we imagine it to be — is to look.
Every month the HTTP Archive quietly captures a snapshot of the web as it actually exists: not the idealized web that we hope for, but the messy, improvised, duct‑taped reality of millions of sites in the wild. I’ve been collecting and studying the non-standard elements in those snapshots for the last six years.
This new dataset is the largest I’ve ever worked with: billions of pages, hundreds of thousands of distinct non-standard element names, and a long tail that stretches into places no standards body has ever seriously examined. And unlike the Google study, which looked for patterns in class names, this dataset captures the long tail of non‑standard elements: the names people invent for actual elements when the platform doesn’t give them what they need.
What emerges is a portrait of the web as it is lived: messy, inventive, repetitive, global, and full of reinvention. It’s also a mirror held up to the platform itself.
But it's also much more complex to study than I could have imagined a decade ago, and I really wish that the W3C (and its member orgs, which include academia) had taken up the charge of figuring out how to really study the web and use that information to inform standards work.
What's difficult about it...
One problem is that the dataset itself has some fairly extreme bias. The crawl doesn't hit anything that isn't on the public internet - that means it excludes intranets, which are massive. In fact, most of my career was spent working on intranets. The crawl captures only home pages, plus the target of whatever it interprets as the largest link on that page. It also can't get to anything that requires a login - which means that for a site like Twitter, Bluesky, or Mastodon, you're going to get something very unrepresentative. So, one challenge I'd love to see us tackle is how to get more representative data. It's hard to "pave cowpaths" if they're in a country we can't even see into.
Initially I had this idea that we could watch for the adoption of tags - imagining that some would become very popular, just like JavaScript libraries and frameworks did. However, it turns out that this is not the signal it might first appear to be. When an element appears on tens of thousands or even hundreds of thousands of pages, it is often simply because it is part of a larger successful system. If Wix or Shopify creates some custom elements that work behind the WYSIWYG tooling, and lots of people use that tooling to create their pages, then suddenly those elements get very, very popular - even if they aren't actually particularly good. In fact, we can see shifts in the data where the teams themselves changed their minds and another version supplanted the first very quickly, because it's simply internal.
Then I thought that perhaps what we can do with the dataset instead is to squint at it, look a little more abstractly at what people are naming their elements, and see if people are re-solving similar problems. Do we find, for example, multiple non-standard element names that appear to be about tabs? Yes! Clearly that is indicative that we need a native element, right? Maybe. It's a bit more nuanced than that. Here are the most commonly re-created/repeated non-standard element themes:
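One rough way to separate "popular because a platform emits it" from "popular because many independent authors chose it" is to ask how much of an element's usage comes from its single largest platform. The sketch below is only an illustration: the `(element, platform)` record format, the platform labels, and the sample values are all hypothetical and do not come from the dataset.

```python
from collections import Counter, defaultdict

def top_platform_share(observations):
    """Return {element: share of its pages that come from its largest platform}.

    `observations` is an iterable of (element_name, platform) pairs, one per
    page on which the element appeared. This record format is hypothetical.
    """
    per_element = defaultdict(Counter)
    for element, platform in observations:
        per_element[element][platform] += 1
    return {
        element: max(counts.values()) / sum(counts.values())
        for element, counts in per_element.items()
    }

# Made-up sample: one element emitted almost entirely by a single page builder,
# and one spread evenly across four unrelated platforms.
sample = (
    [("builder-section", "platform-a")] * 9
    + [("builder-section", "platform-b")] * 1
    + [("tab-panel", "platform-a")] * 2
    + [("tab-panel", "platform-b")] * 2
    + [("tab-panel", "platform-c")] * 2
    + [("tab-panel", "platform-d")] * 2
)
shares = top_platform_share(sample)
```

A share close to 1.0 suggests the element's popularity is mostly the popularity of one tool, which is the situation described above. Attributing a page to a platform is itself a hard problem that this sketch simply assumes away.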
- Navigation
- Headers and footers
- Carousels and sliders
- Modals
- Search bars
- Product cards
- Login forms
- Cookie banners
- Accordions
- Tabs
- Toasts
- Breadcrumbs
While we don't have several of these in standard HTML, we do have native <header>, <footer>, <nav>, <dialog>, and <search> elements, and even accordions via the name attribute of <details>. And yet, the wild still contains hundreds or thousands of custom elements with names like <app-header>, <site-footer>, <main-nav>, <modal-dialog>, <search-box>, and <accordion-panel>.
Native primitives may exist, but not at the same level of abstraction as these. <header> and <footer> in HTML are structural, not behavioral. <dialog> is behavioral, but not styled. <search> exists, but doesn’t solve autocomplete, filtering, or results.
So developers build those - and, if you stop and think about it, not all non-standard elements are equally undesirable. Many of them will be simple decorations or thin wrappers that do use their native counterparts. The more interesting thing to study is where there is a clear, generic need and the platform doesn't provide anything close. From the list above, tabs are one example.
Observations...
Here are some observations from the data, in no particular order of importance.
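The "squint at the names" approach can be sketched as a simple bucketing step: split each element name into tokens and assign it to a theme when a whole token matches a keyword. The keyword lists and the sample names here are assumptions made up for illustration (apart from `modal-dialog` and `site-footer`, which appear above), and a real study would need a much richer vocabulary.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists, far smaller than a real study would need.
THEME_KEYWORDS = {
    "tabs": ("tab", "tabs", "tablist", "tabpanel"),
    "carousel": ("carousel", "slider", "slideshow"),
    "modal": ("modal", "dialog", "popup"),
    "accordion": ("accordion", "collapse"),
}

def themes_for(name):
    """Return the themes whose keywords match a whole token of the tag name."""
    tokens = set(re.split(r"[-_:.]", name.lower()))
    return sorted(
        theme for theme, words in THEME_KEYWORDS.items()
        if tokens.intersection(words)
    )

def group_by_theme(names):
    """Map each theme to the element names that fall under it."""
    groups = defaultdict(list)
    for name in names:
        for theme in themes_for(name):
            groups[theme].append(name)
    return dict(groups)

names = ["tab-panel", "x-tabs", "image-slider", "modal-dialog", "data-table", "site-footer"]
groups = group_by_theme(names)
```

Matching whole tokens rather than substrings is a deliberate choice: it keeps `data-table` out of the tabs bucket, at the cost of missing names that run words together.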
Forms and Inputs: Tweaked, Wrapped, and Re‑Wrapped
Forms and inputs are a great example of the constant re-invention I just described. Sometimes it's because the native element is insufficient, but that's not necessarily the case. In some cases they're just thin wrappers. Among them are lots and lots of "pickers" and "selectors" that show up...
<custom-select>, <date-picker>, <variant-picker>, <quantity-selector>
There is already a lot of ongoing work to make native form elements (including selects) require less code and just be more stylable and flexible, and the data at least suggests that such efforts will be very welcome.
Hidden Machinery
A surprising number of elements aren’t UI components at all. They’re runtime markers:
<ng-container>, <router-outlet>, <astro-island>, <ion-router-outlet>, <next-route-announcer>
These exist because frameworks need declarative boundaries for hydration, routing, rendering, or template expansion. I suppose it is debatable whether these are an indicator of “missing HTML features”, or to what degree.
Carousels (and sliders... and toasts)
I don't love carousels, but it's hard to deny that they are popular. There are dozens of distinct and identifiable carousel/slider elements in the dataset and they appear a lot. I really dislike a few bits of Google's attempt to make CSS-only carousels possible, but it's pretty clear why they chose to tackle that problem. I guess it is worth stressing again the bias in the dataset here - if there is one page where I most expect to see a carousel, it is the home page, which is exactly what the archive crawls. So, while they are the most popular in the dataset, I don't know that they are the most popular all-around. You can see how Google winds up with its proposals, though: toasts are on that top list too.
Structural semantics?
There are a few broad categories where the main point seems to be "semantics". That is, very often these elements don't actually do anything beyond providing some hooks, mainly for styling. Sometimes (or maybe even often) they aren't even custom elements - just non-standard elements.
E-commerce
Dozens of these surround e-commerce. There are tens of thousands of sites that use elements with names (and variants) like the following.
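The distinction between "custom" and merely "non-standard" can be checked mechanically from the name alone, because a name can only be registered as a custom element if it contains a hyphen. The sketch below is a simplified approximation of that rule: the reserved names are the ones listed in the HTML specification, but the character check is looser than the specification's exact grammar, and the function name is my own.

```python
import re

# Hyphenated names that the HTML specification reserves, so they can never
# be registered as custom elements.
RESERVED = {
    "annotation-xml", "color-profile", "font-face", "font-face-src",
    "font-face-uri", "font-face-format", "font-face-name", "missing-glyph",
}

# Simplified shape: starts with a-z, contains at least one hyphen, and has no
# uppercase ASCII letters, whitespace, "/" or ">". The spec's rules are stricter.
_SHAPE = re.compile(r"^[a-z][^A-Z\s/>]*-[^A-Z\s/>]*$")

def could_be_custom_element(name):
    """Return True if `name` has the basic shape of a custom element name."""
    return bool(_SHAPE.match(name)) and name not in RESERVED
```

By this check `product-card` and `next-route-announcer` could be custom elements, while names like `byline`, `comment`, or `fb:like` could not: they are non-standard elements that the browser has no way to upgrade.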
Product & merchandising
<product-card>, <product-title>, <product-price>, <product-rating>, <product-variant>, <product-gallery>, <product-description>, <product-badge>
Pricing & money
<price-money>, <sale-price>, <compare-at-price>, <discount-amount>, <currency-display>
Inventory & availability
<stock-status>, <pickup-availability>, <delivery-estimate>, <inventory-level>
Cart & checkout
<cart-items>, <cart-count>, <checkout-button>, <order-summary>
Very interestingly, they are often used alongside actual machine-readable semantics via JSON-LD in the same markup.
While the vast majority of these elements appear because of common tooling, the fact that there are dozens of variants of similar names appearing on smaller numbers of sites indicates there is something widely interesting here. It's hard to say what it is other than that it would be nice to have a common structural semantic that would work for both purposes.
I guess the biggest surprise here is that, if that's true, why hasn't such a thing arisen already? It is entirely within the community's power to develop such a thing. Perhaps the answer is that there is just so much variance that it isn't really feasible. Maybe templating would somehow allow us to arrive at a common pattern based on the shared JSON-LD semantics.
Publishing & Editorial Semantics
CMSes and news sites often invent tags for editorial structure, and many of these are sticking around.
Content structure
<article-header>, <article-summary>, <article-author>, <article-date>, <article-tags>, <article-tag>, <article-category>, <byline>, <dateline>, <pullquote>, <footnote>
Taxonomy
<tag-list>, <category-label>, <topic-header>
These reflect the needs of journalism and long‑form content.
Social & Community Semantics
These show up in comment systems, forums, and social platforms.
User‑generated content
<comment>, <comment-list>, <comment-item>, <comment-author>, <comment-content>, <comment-date>, <comment-form>
Identity
<user-avatar>, <user-name>, <profile-card>
These encode relationships and interactions, not UI patterns.
Events
<event-date>, <event-location>, <event-schedule>, <event-details>
Again, these are domain objects, not widgets - and they have well-established schema.org or microformats vocabularies as well.
Invoicing
<invoice>, <invoice-line>, <invoice-total>, <invoice-summary>
Before the web came along, there were already national and international standards for electronically trading information like invoices - and when XML was being sold, invoices were a common example. Here we are again.
"Namespaced" Elements
Several elements like `o:p`, `rdf:rdf`, `dc:format`, `cc:work`, `fb:like`, `g:plusone` appear in the top 100. These were basically anticipating an XHTML future (namespacing) that never really arrived. However, HTML has always allowed such names - the colon is simply part of the tag name. In many ways, that's just as good. Interestingly, these may be some of the better examples of what I'd like to see happen - they are widely understood.
Conversely, while hugely successful, the share buttons are more an indication of a desire than something we could actually standardize in precisely that way. They also point to a desire _in time_. Google Plus doesn't even exist anymore, and `fb:like` is from a time when Facebook was at the top of the most interesting places to be. Maybe one of the things we've learned is that this is way handier to do at the browser/OS levels? I suppose the Web Share API was a part of thinking how we'd deal with this.
The fact that they both still appear so much is also kind of an indication of the age of the pages and the slow replacement of underlying tools.
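To illustrate that a "namespaced" name is just an ordinary tag name as far as HTML parsing goes, here is a small sketch that counts non-standard start tags with Python's standard-library `html.parser`. It is not how the HTTP Archive or my own analysis extracts elements: the `STANDARD` set is a deliberately tiny stand-in for the real list of HTML elements, the markup is made up, and `html.parser` is a lenient tokenizer rather than a spec-compliant HTML parser.

```python
from collections import Counter
from html.parser import HTMLParser

# A deliberately tiny stand-in for the full list of standard HTML elements.
STANDARD = {"html", "head", "body", "p", "div", "span", "a"}

class NonStandardCounter(HTMLParser):
    """Count start tags whose names are not in STANDARD."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag not in STANDARD:
            self.counts[tag] += 1

markup = '<body><p>Hi<o:p></o:p></p><fb:like href="#"></fb:like><rdf:RDF></rdf:RDF></body>'
counter = NonStandardCounter()
counter.feed(markup)
counter.close()
```

The colon-bearing names come through whole and lowercased, which is why `rdf:RDF` in a page's source shows up as `rdf:rdf` in a tally like this one.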
Typos, Encoding Errors, and the Weird Stuff
One of the most delightful parts of the dataset is the long tail of what are almost certainly just typos:
<prodcut-card>, <navgation>, <contianer>
The fact that these can appear on tens of thousands of sites because they are part of common tooling helps reinforce that not every non-standard element is a signal. :)
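Spotting likely typos like these can be partly automated by comparing each name against a vocabulary of names assumed to be spelled as intended. The sketch below uses the standard library's `difflib` for fuzzy matching. The `KNOWN` list, the `0.85` cutoff, and the function name are all assumptions chosen for illustration, not part of my actual analysis.

```python
import difflib

# Hypothetical vocabulary of names assumed to be spelled as intended.
KNOWN = ["product-card", "navigation", "container", "date-picker"]

def likely_typo_of(name, known=KNOWN, cutoff=0.85):
    """Return the known name that `name` probably misspells, or None."""
    if name in known:
        return None
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

suspects = {
    name: likely_typo_of(name)
    for name in ["prodcut-card", "navgation", "contianer", "search-box"]
}
```

Similarity alone will also flag names that are legitimately different (a singular and its plural, for instance), so anything this surfaces still needs a human to look at it.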
In conclusion...
I wish that I could say "Ah ha - the data says very clearly that these are the specific things we should definitely 'just write down' now" in the way that I imagined a decade ago, but I don't think we're there yet. I guess if I had to give three things I'd like to see happen from here they'd be:
- We need lots more effort in thinking about how to study these things. I would love to see real investment in this space. This year, at last, the W3C is hiring someone to study the web. I'm not yet sure what that looks like, but I look forward to discussing it more with them.
- We need a real community effort - an Underwriters Labs for custom elements, with participation and funding from orgs with money. We don't necessarily need "the one true tabs" as much as we need a place to find what I expect will be a very few sets of tabs as custom elements which we can trust like we trust native elements. Given a little bit of time, I have faith that this will naturally sort itself into a few 'winners'.
- That community effort might also include things which won't ever have native implementations, but which lay down some kind of light semantic meaning or compound styling structure that we all begin to agree on - like product cards or breadcrumbs.
A lot of this is pretty close to the ideas behind OpenUI, and it's possible some of this could just happen there. However, due mainly to limits and participation, OpenUI has not really produced custom elements or worked to somehow list, grade, and promote them (though we did study them quite a bit in the tabs research). The effort led by Brad Frost to think about a "global design system" in particular might be closer to some of these ideas.