Nov
28
How To Write Clean HTML
Filed Under Web Development
As a front-end Web developer, I write HTML almost daily at work, and I’ve learned that there is great value in writing clean HTML. By clean, I mean that I write no more HTML code than is necessary. This produces the most efficient code possible, makes it easier to maintain or to insert into a CMS, and makes it easier for others who work on my code to understand. Most developers agree that these are good things. But in order to write clean HTML, you first must forget everything you know about how HTML works.
When I began developing Web sites 9 years ago, I - like most everyone else - used table-based layouts. I would print out a given design, stare at it for a while, and then draw grid lines on it as I mentally cut up the design into efficient slices. Then I’d slowly piece the slices back together with nested HTML tables. After more than 4 years of doing this, the light came on and I realized that CSS could do more than set font attributes. And so I started using style sheets to control my layouts. But for the longest time, I still printed out a given design and mentally sliced it up before I began coding the HTML. And my HTML code suffered. To produce truly clean code, I needed to forget what I knew about HTML.
HTML elements, you see, have certain visual properties. For instance, <strong> elements are bold, <hx> elements are big, and so on. But HTML was not originally designed to be thought of in terms of design. Though XHTML may fail as a standard with the eventual release of HTML 5, it is best to think of HTML as an XML language. XML does not have any design associated with its elements at all. In fact, there are not even any defined elements. Someone who writes an XML document simply creates custom tags that best represent the content within them. And this is the secret to writing clean HTML.
HTML differs from XML in that it has clearly defined tags. But these tags, while having design-related features, were created to define the content that appears within them. Code written this way is referred to as semantic HTML. When coding semantically, a heading tag is only used for heading content, a table is only used for tabular data, and so on. In theory, this should produce code that is considerably leaner and more forward-compatible than non-semantic code. But the problem is this: developers start coding their HTML after having seen the design.
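To make the semantic versus non-semantic distinction concrete, here is a small made-up fragment (the "Our Services" content is purely hypothetical, not from any real project). The first version picks tags for how they look; the second picks tags for what the content is:

```html
<!-- Non-semantic: tags chosen for their appearance -->
<table>
  <tr>
    <td><font size="5"><b>Our Services</b></font></td>
  </tr>
  <tr>
    <td>Web design<br>Hosting<br>Support</td>
  </tr>
</table>

<!-- Semantic: tags chosen for what the content is -->
<h2>Our Services</h2>
<ul>
  <li>Web design</li>
  <li>Hosting</li>
  <li>Support</li>
</ul>
```

The second version is shorter, and it still says "this is a heading followed by a list" no matter what the design later turns out to be.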
When I begin a Web site project, the very first thing I do is request the content. I then use this content to build my initial HTML templates without ever seeing the design. So how do I know how to arrange the code, then? I simply pretend that I’m writing a book. Think of the content in terms of written reference material. The title of the book comes first, the table of contents typically comes second, the chapters come next - with tables and images scattered within them - and the author information and references come last. Granted, this is an oversimplified model that is not right in all cases, but it works well most of the time. When the HTML code is arranged in this fashion, using no more elements than are necessary to describe the content, your first HTML (body) template might look something like this:
<h1>How to Create Clean HTML</h1>
<ul id="navigation">
<li><a href="step1.html">Step 1</a></li>
<li><a href="step2.html">Step 2</a></li>
<li><a href="step3.html">Step 3</a></li>
</ul>
<div id="content">
<h2>Step 1</h2>
<p>The first step is to forget what you know about HTML.</p>
</div>
<div id="footer">
<p>Copyright 2008 - Greg Laycock</p>
</div>
In an ideal world, this is where I would stop. Unfortunately, more HTML tags are always necessary. It’s at this point that I look at the design. Then I add tags as necessary to account for inherent limitations in HTML and CSS. In this way, the final HTML body might look more like this:
<div id="main">
<div id="header">
<h1>How to Create Clean HTML</h1>
</div>
<ul id="navigation">
<li><a href="step1.html">Step 1</a></li>
<li><a href="step2.html">Step 2</a></li>
<li><a href="step3.html">Step 3</a></li>
</ul>
<div id="content">
<h2>Step 1</h2>
<p>The first step is to forget what you know about HTML.</p>
</div>
<div id="footer">
<p>Copyright 2008 - Greg Laycock</p>
</div>
</div>
The HTML has now gotten slightly more complex, but it is still semantic and is far from bloated. Granted, the more complex the design, the more “design-only” elements need to be added. But if you stick to your guns and start your HTML before you ever see the design, you’ll soon be producing cleaner HTML than ever.
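For illustration only, here is a sketch of how the two design-only wrappers, "main" and "header", might earn their place. The design itself (a fixed-width centered page with a colored band behind the title) and every width, color, and spacing value are assumptions made up for this example, not styles from an actual project:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>How to Create Clean HTML</title>
  <style>
    /* Hypothetical design: "main" exists only to center a fixed-width page */
    #main { width: 760px; margin: 0 auto; }

    /* "header" exists only to carry a background band behind the title */
    #header { background: #336; padding: 20px; }
    #header h1 { margin: 0; color: #fff; }

    /* The navigation list needs no extra wrapper; it is styled directly */
    #navigation { list-style: none; margin: 0; padding: 0; overflow: hidden; }
    #navigation li { float: left; margin-right: 1em; }

    #footer { border-top: 1px solid #ccc; font-size: 0.85em; }
  </style>
</head>
<body>
  <div id="main">
    <div id="header">
      <h1>How to Create Clean HTML</h1>
    </div>
    <ul id="navigation">
      <li><a href="step1.html">Step 1</a></li>
      <li><a href="step2.html">Step 2</a></li>
      <li><a href="step3.html">Step 3</a></li>
    </ul>
    <div id="content">
      <h2>Step 1</h2>
      <p>The first step is to forget what you know about HTML.</p>
    </div>
    <div id="footer">
      <p>Copyright 2008 - Greg Laycock</p>
    </div>
  </div>
</body>
</html>
```

Every element that describes content is the same as in the content-first template; only the two wrapper divs were added for the sake of the design.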
