To Close or Not to Close
Have you ever wondered if it’s better to “close” a br or input tag like <br /> or if it’s better to just write <br> in HTML5? Or why it’s not correct to write <script src="script.js" />? Well, so have I, and my findings on the subject were a lot more interesting than I anticipated (if for some strange reason you find stuff like this interesting).
If you are not interested in the whole story, just jump to the section «Validity» to get your answer.
Void elements
Void elements are a special kind of element that must not have content. That’s a big difference from other elements that can be empty but can also contain other elements and text (such as <div>s).
The best-known void elements are:
<br>
<hr>
<img>
<input>
<link>
<meta>
The lesser known are:
<area>
<base>
<col>
<command>
<embed>
<keygen>
<param>
<source>
<track>
<wbr>
That’s it. Those are all of the existing void elements.
It is not, and has never been, valid HTML to write <br></br>, since this would imply that the br element accepts content (writing <br>Hello!</br> has absolutely no meaning). However, it is very common to see both <br> and <br />.
Although most people know that in XHTML it is mandatory to write <br />, the rules for HTML are less obvious.
History
To fully understand the rules for void elements, a bit of history is necessary.
HTML, XML and X(HT)ML are all based on SGML, the Standard Generalized Markup Language, which became an ISO standard in 1986.
HTML and XML are derived directly from SGML. XML is a more restrictive subset of SGML, and that’s what XHTML is based on.
So far so good? Then let’s get to the interesting part:
SGML has a feature called NET (Null End Tag). This is a short notation to avoid having to close a tag when the content of your element is simple text. With NET you can write <quote/Quoted text/ instead of <quote>Quoted text</quote>.
As a side note, elements that do not contain any text can be written as <quote// which is called SHORTTAG NETENABL IMMEDNET and is the same as <quote></quote>.
Now, by that logic, since a void element does not have a closing tag, <br/ would be interpreted as <br>, and <br/> would be interpreted as <br>> which is obviously incorrect syntax. If you’re like me, you’re probably thinking «This is insane!». Unfortunately the authors of the HTML4 specification didn’t think so, which is why this is part of the specification. Apparently, the browser vendors at the time were not convinced either, which resulted in very poor browser support (which, in this case, is arguably not a bad thing).
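To make the NET rules above a bit more tangible, here is a rough Python sketch of how they play out. It is a toy of my own and not taken from any real SGML parser: the function name expand_net and the short EMPTY_ELEMENTS set are made up for illustration, and attributes are ignored entirely.

```python
import re

# Hypothetical, deliberately short list of elements declared EMPTY (void).
EMPTY_ELEMENTS = frozenset({"br", "hr", "img", "input", "link", "meta"})

# A tag name immediately followed by "/" opens a NET-enabled start tag.
_NET_START = re.compile(r"<([A-Za-z][A-Za-z0-9]*)/")


def expand_net(markup: str) -> str:
    """Rewrite NET shorthand into ordinary start and end tags (toy version)."""
    out = []
    pos = 0
    while True:
        match = _NET_START.search(markup, pos)
        if match is None:
            out.append(markup[pos:])
            break
        out.append(markup[pos:match.start()])
        name = match.group(1)
        if name.lower() in EMPTY_ELEMENTS:
            # A void element has no content, so the tag is complete at "/".
            # Whatever follows (for example ">") is just character data.
            out.append(f"<{name}>")
            pos = match.end()
            continue
        closing = markup.find("/", match.end())
        if closing == -1:
            # No terminating "/": leave the rest untouched.
            out.append(markup[match.start():])
            break
        content = markup[match.end():closing]
        out.append(f"<{name}>{content}</{name}>")
        pos = closing + 1
    return "".join(out)


if __name__ == "__main__":
    print(expand_net("<quote/Quoted text/"))
    print(expand_net("<quote//"))
    print(expand_net("<br/>"))
```

Under these assumptions, <br/> comes out as <br> followed by a stray >, which is exactly the oddity described above.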
XML (and thus XHTML) recognised the madness of such a syntax, and did not include the NET or the SHORTTAG NETENABL IMMEDNET «features», but provided a sane syntax for void elements, namely the Empty-Element tag that looks like this: <br />. It seems very natural, which is why most developers thought it was the right way to write it.
Luckily HTML evolves, and the people at the World Wide Web Consortium (who draft and set the standards for the web) learn from their past mistakes as well, which is why HTML5 makes a lot more sense.
Right in the introduction of the new HTML5 syntax, the W3C says:
HTML 5 defines an HTML syntax that is compatible with HTML 4 and XHTML 1 documents published on the Web, but is not compatible with the more esoteric SGML features of HTML 4, such as the NET syntax (i.e. <em/content/).
Yay for HTML5!
(I think they should have kept the cool SHORTTAG feature (<strong>Hell yea</>) but hey… at least HTML is not a complete mess any more)
Validity
So back to the question of validity, the current HTML5 specification for void elements is as follows:
Start tags consist of the following parts, in exactly the following order:
- A < character.
- The element’s tag name.
- Optionally, one or more attributes, each of which must be preceded by one or more space characters.
- Optionally, one or more space characters.
- Optionally, a / character, which may be present only if the element is a void element.
- A > character.
This means that the / character is optional in HTML5, but it doesn’t add any meaning. There is absolutely no difference between <br> and <br />.
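The start tag rules quoted above can be sketched as a small Python check. This is only an approximation under my own assumptions: the names VOID_ELEMENTS, START_TAG and is_valid_start_tag are invented for this example, the void list is simply the one from this article, and the attribute pattern is a simplification of what the specification actually allows.

```python
import re

# Void elements as listed in this article (an assumption, not the spec text).
VOID_ELEMENTS = frozenset({
    "br", "hr", "img", "input", "link", "meta",
    "area", "base", "col", "command", "embed",
    "keygen", "param", "source", "track", "wbr",
})

# "<", tag name, optional attributes (each preceded by whitespace),
# optional whitespace, optional "/", then ">".
START_TAG = re.compile(
    r"""<(?P<name>[A-Za-z][A-Za-z0-9]*)
        (?P<attrs>(?:\s+[^\s"'<>/=]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'=<>`]+))?)*)
        \s*
        (?P<slash>/?)
        >""",
    re.VERBOSE,
)


def is_valid_start_tag(tag: str) -> bool:
    """Return True if `tag` follows the quoted start tag rules."""
    match = START_TAG.fullmatch(tag)
    if match is None:
        return False
    # The trailing "/" may be present only on void elements.
    if match.group("slash") and match.group("name").lower() not in VOID_ELEMENTS:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("<br>", "<br />", '<script src="script.js" />'):
        print(candidate, is_valid_start_tag(candidate))
```

Both <br> and <br /> pass this check, while <script src="script.js" /> is rejected because script is not a void element.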
Correctness
Well, for those of you who are really addicted to X(HT)ML, you might think, «yeah, it’s optional, but <br /> is still ‘more correct’», but I have to tell you: it is not. Actually, one might argue that adding / to a void tag is an ignored syntax error. The possibility to write it has mostly been added for compatibility reasons, and no browser or parser should handle <br> and <br /> any differently.
Google’s style guide is also very clear on the subject: you should not close void tags.
Theoretical disadvantages
Of course, not closing void tags has its disadvantages as well, but I think that they do not outweigh the advantage of having clean and terse void tags like <meta>.
The first disadvantage of not closing void tags is that users need to have knowledge of the existing void tags. If, for example, you don’t know what an <img> element is, you might be confused if you can’t find any closing tags for it. But the list of void tags is very short and normally it’s quite obvious which tags are void tags.
The second disadvantage is that it gets more complicated for editors and parsers to get it right. They are required to know the complete list to properly interpret the code. If you write <input> in an editor, it has to know that there will never be a </input> following that.
But this is more an issue related to the specification than to closing or not closing void tags, because <input></input> is invalid, so void elements require special treatment in any case.
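To illustrate why a tool needs the list, here is a minimal tag balance checker built on Python’s standard html.parser module. It is a sketch and nowhere near a real validator: BalanceChecker and check are names I made up, the void list is the one from this article, and optional end tags (like </p>) are not handled.

```python
from html.parser import HTMLParser

# Void elements as listed in this article (an assumption, not the spec text).
VOID_ELEMENTS = frozenset({
    "br", "hr", "img", "input", "link", "meta",
    "area", "base", "col", "command", "embed",
    "keygen", "param", "source", "track", "wbr",
})


class BalanceChecker(HTMLParser):
    """Tracks open tags and reports end tags that do not match."""

    def __init__(self, void_elements=VOID_ELEMENTS):
        super().__init__()
        self.void_elements = void_elements
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # Void elements never get an end tag, so they are not tracked.
        if tag not in self.void_elements:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Called for "<tag />"; only acceptable on void elements.
        if tag not in self.void_elements:
            self.errors.append(f"<{tag} /> used on a non-void element")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.void_elements:
            self.errors.append(f"unexpected </{tag}> for a void element")
        elif self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def check(markup, void_elements=VOID_ELEMENTS):
    """Return a list of problems found in `markup` (empty if none)."""
    checker = BalanceChecker(void_elements)
    checker.feed(markup)
    checker.close()
    unclosed = [f"<{tag}> is never closed" for tag in checker.open_tags]
    return checker.errors + unclosed


if __name__ == "__main__":
    print(check("<p>Hello<br>world</p>"))
    print(check("<p><input></input></p>"))
    # Without the list, the very same markup looks broken:
    print(check("<p>Hello<br>world</p>", void_elements=frozenset()))
```

With the list, <p>Hello<br>world</p> is fine. Pass an empty set instead and the checker starts complaining about a <br> that is never closed, which is the special treatment mentioned above.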
My opinion on void tags
I think that the whole concept of void tags could be avoided completely by using the content of some tags instead of defining additional attributes. Let’s take the <img> tag for example. It has a mandatory alt attribute to add a description of the image, and for good reason: people who can’t see the image (either because they are physically incapable or because their device can’t display images) should have a way to know what is depicted (and if you’re adding an img tag solely for design purposes then you’re doing something wrong anyway). So my question is: why isn’t the content of the image tag the alternative text? It seems rather obvious to me to write <img src="doge.png">Image of doge</img>. The same goes for <meta> tags which even have a content attribute! Why not just use the actual element content for that? <input value="Value content"> should be <input>Value content</input> as is the case with <textarea>, etc…
So really there are only a few void tags that should exist anyway, but obviously the W3C has to take backwards compatibility into account which makes changes of this kind much more difficult.
Final thoughts: the <script> tag
The script tag has really been bothering me because it is such a verbose tag for such a simple directive. It seems wrong to write <script src="my-script.js"></script> since the content of this script tag has no logical correlation to my-script.js (and the HTML specification disallows having both content and the src attribute).
The problem is that <script> is not a void tag, since you can inline JavaScript on your page, and there are no “optional void tags”.
Using the <link> tag would have been perfect since it’s already used for other imports and provides all the attributes necessary to include external files. Of course, as so often on the web, the reason it is not used is backwards compatibility, since you would exclude all old browsers that don’t support that syntax.