In this post I hope to clarify some of the concepts and issues surrounding jukugo ruby. If you don’t know what ruby is, see the article Ruby for a very quick introduction, or see Ruby Markup and Styling for a slightly longer introduction to how it was expected to work in XHTML and CSS.

You can find an explanation of jukugo ruby in Requirements for Japanese Text Layout, sections 3.3 Ruby and Emphasis Dots and Appendix F Positioning of Jukugo-ruby (you need to read both).

What is jukugo ruby?

Jukugo refers to a Japanese compound noun, i.e. a word made up of more than one kanji character. Here we will be talking about how to mark up these jukugo words with ruby.

There are three types of ruby behaviour.

Mono ruby is commonly used for phonetic annotation of text. In mono ruby, all the ruby text for a given character is positioned alongside that single base character, and doesn’t overlap adjacent base characters. Jukugo are often marked up using a mono ruby approach. You can break a word that uses mono ruby at any point, and the ruby text simply stays with its base character.

Group ruby is often used where phonetic annotations don’t map to discrete base characters, or for semantic glosses that span the whole base text. You can’t split text that is annotated with group ruby. It has to wrap onto the next line as a single unit.

The term jukugo ruby does not simply mean ruby annotations over jukugo text. Rather, it describes ruby that behaves slightly differently from both mono and group ruby. Jukugo ruby behaves like mono ruby, in that there is a strong association between ruby text and individual base characters. This becomes clear when you split a word at the end of a line: the ruby text is split too, so that the ruby annotating a specific base character stays with that character. What’s different about jukugo ruby is that when the word is NOT split at the end of the line, the ruby text can overlap adjacent base characters by a significant amount.
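To make the line-breaking behaviour of the three types a little more concrete, here is a minimal Python sketch. It is only a toy model, not how any layout engine actually works. The function name split_word, the kind labels, and the example word 表紙 (ひょうし) are my own assumptions for illustration and do not come from Requirements for Japanese Text Layout.

```python
from typing import List, Tuple

# Each pair is (base character, ruby text that annotates that character).
Pairs = List[Tuple[str, str]]


def split_word(pairs: Pairs, break_after: int, kind: str) -> List[Tuple[str, str]]:
    """Return one (base, ruby) fragment per line for a word broken
    after `break_after` base characters.

    kind is one of "mono", "group" or "jukugo" (hypothetical labels).
    """
    if kind not in ("mono", "group", "jukugo"):
        raise ValueError(f"unknown ruby kind: {kind}")

    def join(part: Pairs) -> Tuple[str, str]:
        # Concatenate base characters and their ruby separately.
        return ("".join(b for b, _ in part), "".join(r for _, r in part))

    # Group ruby cannot be split, so the whole unit stays on one line.
    # The same is true if the break position is not inside the word.
    if kind == "group" or break_after <= 0 or break_after >= len(pairs):
        return [join(pairs)]

    # Mono and jukugo ruby both keep each ruby run with its own base
    # character when the word is split across lines.
    return [join(pairs[:break_after]), join(pairs[break_after:])]


word = [("表", "ひょう"), ("紙", "し")]
for kind in ("mono", "group", "jukugo"):
    print(kind, split_word(word, 1, kind))
```

Note that in this model mono and jukugo ruby give the same result when the word is split. The difference between them only shows up in how the ruby is positioned when the word stays on one line.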

Example of ruby text.

The image to the right shows three examples of ruby annotating jukugo words.

In the top two examples, mono ruby can be used to produce the desired effect, since neither of the base characters is overlapped by ruby text that doesn’t relate to that character.

The third example is where we see the difference that is referred to as jukugo ruby. The first three ruby characters are associated with the first kanji character. Just the last ruby character is associated with the second kanji character. And yet the ruby text has been arranged evenly across both kanji characters.
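The arithmetic behind this can be sketched roughly as follows. This is a simplification under stated assumptions, not the positioning rules from Appendix F: I assume each base character is 1 em wide and each ruby character is set at half that size, and I use 表紙 (ひょうし) as a stand-in word with the same three-plus-one pattern, which is not necessarily the word shown in the image. The names RUBY_EM, ruby_overhang and needs_jukugo_behaviour are hypothetical.

```python
from typing import List, Tuple

Pairs = List[Tuple[str, str]]

BASE_EM = 1.0  # assumption: every base character is 1 em wide
RUBY_EM = 0.5  # assumption: ruby characters are half the base size


def ruby_overhang(pairs: Pairs) -> List[float]:
    """For each base character, how far (in em) its own ruby text
    extends beyond the width of that character."""
    return [max(0.0, len(ruby) * RUBY_EM - BASE_EM) for _, ruby in pairs]


def needs_jukugo_behaviour(pairs: Pairs) -> bool:
    """True if some ruby run is wider than its base character, so that
    on an unbroken line it would spill over a neighbouring character."""
    return any(extra > 0 for extra in ruby_overhang(pairs))


def total_widths(pairs: Pairs) -> Tuple[float, float]:
    """Total width of the base text and of the ruby text, in em."""
    base = len(pairs) * BASE_EM
    ruby = sum(len(r) for _, r in pairs) * RUBY_EM
    return base, ruby


word = [("表", "ひょう"), ("紙", "し")]
print(ruby_overhang(word))           # overhang per base character
print(needs_jukugo_behaviour(word))  # does any ruby run spill over?
print(total_widths(word))            # (base width, ruby width)
```

Under these assumptions the ruby for the first kanji is half an em too wide for its base character, while the second kanji has half an em to spare, and the total ruby width equals the total base width. That is why the four ruby characters can be arranged evenly across the two kanji.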

Note, however, that we aren’t simply spreading the ruby over the whole word, as we would with group ruby. There are rules that apply, and in some cases gaps will appear. See the following examples of distribution of ruby text over jukugo words.

Various examples of jukugo ruby.

In the next part of this post I will look at some of the problems encountered when trying to use HTML and CSS for jukugo ruby.

If you want to discuss this or contribute thoughts, please do so on the public-i18n-cjk@w3.org list. You can see the archive and subscribe at http://lists.w3.org/Archives/Public/public-i18n-cjk/