An HTML document should identify its markup language. Currently, there are two choices: the familiar HTML (hypertext markup language) and the lesser-known XHTML (extensible HTML).
To most designers and developers, the differences seem minor. However, given the choice, XHTML is the better option in the long run. This is because XHTML is actually XML (extensible markup language), which means it can use the multitude of existing XML tools to automate validation and similar tasks.
For accessibility purposes, conformance to XML means more screen readers do a better job reading the content of an XHTML document than that of an HTML document. Because XHTML has a stricter syntax, a validated XHTML document will not contain malformed structures that can throw a screen reader off.
All documents should have a shell that is similar to that in listing 1.
Listing 1: A simple HTML shell document
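A minimal sketch of such a shell, assuming XHTML 1.0 Strict and English content. The "en" language codes and the placeholder title are assumptions made for illustration, and the doctype is kept on a single line so that it is line 1 of the document:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<!-- A title element is required; the text here is only a placeholder. -->
<title>Untitled document</title>
</head>
<body>
</body>
</html>
```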
This is, indeed, an empty document. However, there are already a few concepts that need to be explained.
Line 1 identifies the document. On this line, html names the top-level (root) element of the document. PUBLIC indicates that the document type definition is publicly available. Use PUBLIC for HTML and XHTML documents.
The quoted string "-//W3C//DTD XHTML 1.0 Strict//EN" deserves a bit more explanation:
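As a sketch of how this string can be read: it is a formal public identifier whose fields are separated by "//". The fragment below spreads the doctype declaration over several lines and annotates each field in a comment. The line breaks and the comment are added here purely for illustration.

```html
<!DOCTYPE html PUBLIC
    "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!--
  Fields of the public identifier, split on "//":
    -                      the owner is not an ISO-registered organization
    W3C                    the owner of the DTD
    DTD XHTML 1.0 Strict   the kind of text (a DTD) and its descriptive name
    EN                     the language the DTD itself is written in (English),
                           not the language of the document content
-->
```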
The last component is a URL that points to the document type definition (DTD).
The “doctype” line is useful for validators, as it specifies exactly which markup language is used and where to find its syntax and structural rules (the DTD).
Besides the “doctype” line, one can also see that the <html> element has additional attributes:
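A sketch of a typical opening tag for an XHTML 1.0 document, assuming English content. The attribute set shown is the one commonly used with XHTML 1.0, and the "en" values are assumptions for illustration:

```html
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <!-- head and body go here -->
</html>
```

Here xmlns places the element in the XHTML namespace, xml:lang declares the content language for XML-based tools, and lang declares it for HTML-based tools. The two language attributes are normally given the same value so that both kinds of tools agree.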
In summary, the shell document in listing 1 tells a browser, validator or screen reader what kind of document this is and how to read it. By making these specifications as exact and detailed as possible, a screen reader or validator can do its job more effectively.