HTML vs XHTML part 2
In 2006 we converted our site from HTML 4.01 to XHTML 1.0. It seemed like it should be straightforward. After all, we reasoned, XHTML 1.0 is compatible with HTML, so it should just be a case of changing the doctype from HTML to XHTML. Wrong.
Here is some legal HTML lifted from section 7 of the HTML 4.01 specification, which validates cleanly in the W3C validator:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
   <HEAD>
      <TITLE>My first HTML document</TITLE>
   </HEAD>
   <BODY>
      <P>Hello world!
   </BODY>
</HTML>
If you run the same example through the W3C validator but choose an XHTML doctype, every line triggers a validation error.
Why? The bulk of the problems stem from the fact that XHTML is case-sensitive, and all XHTML element and attribute names must be lowercase. This means none of the tags above are recognized because they’re uppercase.
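To see the case-sensitivity difference in isolation, here is a minimal sketch using only Python's standard library. It is not what the W3C validator does: neither parser checks a document against a DTD, so this shows only how an HTML parser and an XML parser treat tag names. The helper names (TagCollector, html_tag_names, xml_tag_names, is_well_formed_xml) are made up for this illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record start-tag names as Python's HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tag_names(markup):
    """Tag names seen by an HTML parser (hypothetical helper)."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_tag_names(markup):
    """Tag names seen by an XML parser, in document order (hypothetical helper)."""
    return [element.tag for element in ET.fromstring(markup).iter()]


def is_well_formed_xml(markup):
    """True if the markup parses as XML. This is well-formedness, not DTD validity."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# A simplified, fully closed version of the example, still in uppercase.
UPPERCASE = "<HTML><BODY><P>Hello world!</P></BODY></HTML>"

print(html_tag_names(UPPERCASE))  # the HTML parser folds names to lowercase
print(xml_tag_names(UPPERCASE))   # the XML parser keeps names exactly as written
print(is_well_formed_xml("<p>Hello world!</P>"))  # <p> and </P> are different names in XML
```

The HTML parser treats P and p as the same element, while the XML parser treats them as two unrelated names. Since the XHTML DTDs declare only the lowercase names, an XML-based validator has nothing to match the uppercase ones against.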
HTML is case-insensitive and nearly all W3C HTML examples use uppercase tags, which means authors diligently following the spirit and letter of the HTML specifications have produced countless valid HTML pages that are incompatible with XHTML. I have some sympathy for the authors of the standard: XHTML is based on XML, so it had to be case-sensitive, and no matter which element case they chose, someone was going to lose out. But it was unfortunate that authors following the W3C coding style lost out.