This came up again recently in a discussion on the W3C i18n Interest Group list, and I decided to put my thoughts in this post so that I can point people to them easily.

I think HTML4 and HTML5 should continue to support <b> and <i> tags, for backwards compatibility, but we should urge caution regarding their use and strongly encourage people to use <em> and <strong> or elements with class="…" where appropriate. (I reworded this 2008-02-01)

Here are a couple of reasons I say that:

  1. I constantly see people misusing these tags in ways that can make localization of content difficult.

    For example, just because an English document may use italicization for emphasis, document titles and foreign words, it doesn’t follow that a Japanese translation of the document will use a single presentational convention for all three. Japanese authors may avoid both italicization and bolding, since their characters are too complicated to look good in small sizes with these effects. Japanese translators may find that the content communicates better if they use wakiten (boten marks) for emphasis, but corner brackets for 『 document names 』, and guillemets for 《 foreign words 》. These are common Japanese typographic approaches that we don’t use in English.

    The problem is that, if the English author has used <i> tags everywhere (thinking about the presentational rendering he/she wants in English), the Japanese localizer will be unable to easily apply different styling to the different types of text.

    The problem can be avoided if semantic markup is used. If the English author had used <em>…</em>, <span class="doctitle">…</span> and <span class="foreignword">…</span> to distinguish the three cases, the localizer could easily change the CSS to achieve a different effect for each kind of text, one at a time.

    Of course, over time this is equally relevant to pages that are monolingual. Suppose your new corporate publishing guidelines change, and proclaim that bolding is better than italics for document names. With semantically marked up HTML, you can easily change a whole site with one tiny edit to the CSS. In the situation described above, however, you’d have to hunt through every page for relevant <i> tags and change them individually, so that you didn’t apply the same style change to emphasis and foreign words too.

  2. Allowing authors to use <b> and <i> tags is also problematic, in my mind, because it keeps authors thinking in presentational terms, rather than helping them move to properly semantic markup. At the very least, it blurs the ideas. To an author in a hurry, it is also tempting to just slap one of these tags on the text to make it look different, rather than to stop and think about things like consistency and future-proofing. (Yes, I’ve often done it too…)
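
To make the first point concrete, here is a minimal sketch of how the semantic markup could be styled differently for English and Japanese. The class names doctitle and foreignword come from the example above; the sample sentences, the use of :lang(ja) selectors, and the choice of the CSS text-emphasis property for the wakiten are my own assumptions for illustration, and browser support for text-emphasis varies.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Semantic markup sketch (hypothetical example)</title>
<style>
  /* English convention: all three kinds of text happen to be italic. */
  em, .doctitle, .foreignword { font-style: italic; }

  /* Japanese convention: switch off italics, then style each kind separately. */
  :lang(ja) em, :lang(ja) .doctitle, :lang(ja) .foreignword { font-style: normal; }

  /* Emphasis: wakiten (emphasis dots) instead of italics. */
  :lang(ja) em { text-emphasis: filled dot; }

  /* Document names: corner brackets. */
  :lang(ja) .doctitle::before { content: "『"; }
  :lang(ja) .doctitle::after  { content: "』"; }

  /* Foreign words: double angle brackets. */
  :lang(ja) .foreignword::before { content: "《"; }
  :lang(ja) .foreignword::after  { content: "》"; }
</style>
</head>
<body>
  <!-- Hypothetical sample text; only the markup pattern matters. -->
  <p>You <em>must</em> read <span class="doctitle">The User Guide</span>
     before the <span class="foreignword">soirée</span>.</p>

  <p lang="ja"><span class="foreignword">soirée</span>の前に、<em>必ず</em><span class="doctitle">利用ガイド</span>を読んでください。</p>
</body>
</html>
```

The same separation covers the monolingual case: if the house style switched to bold for document names, changing the single .doctitle rule to font-weight: bold (with font-style: normal) would restyle every title without touching emphasis or foreign words. None of this is possible if all three are marked up with <i>.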