I got an email this morning asking for some use cases for the CSS :lang selector. Here are some ideas. This should help content authors understand how using :lang can sometimes be better than other approaches when selecting content for styling. Of course, not all user agents support :lang, and hopefully these use cases will also show how enabling support could be useful.

Use case 1

One of the main cases where I want to use :lang is when I have a page that includes numerous short pieces of text in a different script. Take, for example, my notes on the Myanmar script. In such cases I want to assign a particular font and perhaps font-size, etc, to the numerous Myanmar examples.

It does my head in trying to ensure that I have labelled all the Myanmar text with class attributes so that I get the right font and colour applied. And it’s frustrating, because all I’m doing is repeating information that’s already there in the lang attribute (and in the xml:lang attribute too, given that this is XHTML).

Adding class="my" everywhere also bulks up the document. Even in this smallish document, it adds over 1K to the page size.

It would make life a lot easier to just include a single CSS rule:

:lang(my) { font-family: myanmar1, sans-serif; color:red; font-size: 130%; }
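
One thing to watch with a rule like this: :lang(my) matches every element whose language is Myanmar, including elements nested inside one that already matched, so a percentage font-size compounds. Here is a minimal sketch of one way around that, assuming the Myanmar examples may contain inline markup such as em or span. The second rule is an assumption added for illustration; it is not taken from the notes page.

```css
/* Applies to every element whose content language is Myanmar. */
:lang(my) {
  font-family: myanmar1, sans-serif;
  color: red;
  font-size: 130%;
}

/* A Myanmar element nested inside another Myanmar element has already
   inherited the enlarged size, so stop the percentage applying twice. */
:lang(my) :lang(my) {
  font-size: 100%;
}
```

The second selector has higher specificity than the first, so it wins for nested elements without needing to come later in the stylesheet.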

Use case 2

Suppose you have the following Japanese text in an English document:

<blockquote lang="ja" xml:lang="ja">ワールド・ワイド・ウェッブを<em>世界中</em>に広げましょう</blockquote>

Now suppose you want to apply different emphasis styling to the Japanese text, since italicisation doesn’t work well for ideographic scripts in small font sizes. Let’s suppose we wanted to add the proposed wakiten emphasis style that CSS3 describes. How do you make that happen?

Well, ideally, you’d just add the following rule to your CSS, and all would be taken care of:

em:lang(ja) { font-emphasize: dot before; font-style: normal; }

(“When you encounter an em tag and the language is Japanese use wakiten and remove the italics.”)
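
The property name in that rule comes from a CSS3 draft. As far as I know, the feature was later specified under the name text-emphasis, so in a browser that implements that property the same idea might look like the sketch below. Treat the values as an assumption about how you want the marks to look, not as a direct translation of the draft syntax.

```css
/* Emphasis in Japanese text: marks above each character, no italics. */
em:lang(ja) {
  text-emphasis-style: filled dot;
  text-emphasis-position: over right;
  font-style: normal;
}
```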

If you’re dealing with IE6, :lang is not supported, and you’d actually have to add a special class to each and every emphasis tag embedded in Japanese text and use a rule such as

em.ja { ... }

How annoying is that!

IE7 CR1 supports the CSS selectors lang |= and lang =. Aha! you might think, problem solved. We can use the following rule:

em[lang |= 'ja'] { ... }

But you’d be wrong. This only works if the language is declared on the em element itself. So you’d still have to go through and add lang="ja" xml:lang="ja" to each em element – even though you have already declared that the whole blockquote is in Japanese!
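
To make the difference concrete, here is a sketch of the selectors side by side. The descendant-selector variant in the middle is an assumption about what people tend to try next. It catches the em inside the blockquote, but it also catches emphasis in any non-Japanese text nested inside the Japanese, which :lang does not.

```css
/* Matches only <em lang="ja"> or <em lang="ja-JP">:
   the attribute must be on the em itself. */
em[lang|="ja"] { font-style: normal; }

/* Matches any em inside an element labelled ja, even if a nested
   element in between has switched the language to something else. */
[lang|="ja"] em { font-style: normal; }

/* Matches em only when the em's own content language is Japanese,
   wherever in the ancestry that language was declared. */
em:lang(ja) { font-style: normal; }
```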

Use case 3

This case is slightly less mainstream, but it is becoming more common with the increase in multilingual blogs and AJAX-powered pages. It applies when you include text in a page that comes from another environment, either by cut & paste or by automatic means, and you don’t have the styling information that was originally associated with it.

Assuming that the text has language attributes, or that you can apply those, you could have a set of default rules in your environment that, say, apply a nastaliq font with a percentage size scaling factor to all text in Urdu, so that it has some styling at least, and is a reasonable size relative to the Latin text.

For example, if I cut and paste some Urdu text into this blog, it could make the difference between seeing this:
[Image: Text in English and Urdu without styling.]

and this:
[Image: Text in English and Urdu with styling.]

Adding, once, a couple of rules in your blog css that say:

:lang(ur) { font-family: standardMSUrdufont, standardMacUrdufont, standardUnixUrdufont, serif; font-size: 140%; }
em:lang(ur) { font-weight: bold; font-style: normal; }

would be preferable to having to add extra inline markup to the text as you add it to your blog each time.

As a similar example, I just released the latest version of the UniView tool (a kind of web-based Character Map on steroids). It includes a facility that allows you to write your own notes about characters in a separate document and see the relevant notes when looking up a specific character. The information is sucked in using AJAX features. See [1].

At the moment we do not try to incorporate or recognize the other document’s style rules when the notes are displayed in UniView. However, while keeping things simple, it may be useful to allow the UniView user to switch on or off some very general default style rules that specify fonts and/or font sizing for text marked up for a particular language.

As long as the text is marked up for language, such defaults can be applied regardless of what class names or styling appeared in the original document. Of course, :lang would be very useful in this respect.
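
As a rough sketch of what such switchable defaults might look like: the class name lang-defaults and the element it sits on are hypothetical, and the values simply reuse the Myanmar and Urdu rules from the use cases above. The idea is that a script adds or removes the class on the container that holds the imported notes.

```css
/* Defaults apply only while the (hypothetical) lang-defaults class
   is present on an ancestor of the imported notes. */
.lang-defaults :lang(my) {
  font-family: myanmar1, sans-serif;
  font-size: 130%;
}

.lang-defaults :lang(ur) {
  font-family: standardMSUrdufont, standardMacUrdufont, standardUnixUrdufont, serif;
  font-size: 140%;
}
```

Because the selectors key on language and not on class names inside the notes, they work whatever markup conventions the imported document used.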

[1] To see this example
a. open UniView
b. where it says “Select a range to display” select Myanmar
c. click on character 1004 and see the description on the right
d. now click on the icon with a + sign between Notes: and Search string: fields
e. from the menu select Myanmar block and say ok, and dismiss the pop up
f. now click on character 1004 again, and see the notes added to the description on the right – these notes came from an XML file (see the same file served as xhtml)

(Anyone can write such a document, stick it on a server and include its information in UniView. The only requirement is that the notes you want to appear be surrounded by <div class="notes" id="C[hexCodepoint]"></div>. The example above is one such file supplied with UniView.)

Other useful stuff

At the W3C Internationalization site you can find:

  1. an article that answers the question: “What is the most appropriate way to associate CSS styles with text in a particular language in a multilingual XHTML/HTML document?”
  2. a set of test pages relating to user agent support of :lang, lang|= and lang= and a fairly recent summary of results