<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ishida &#62;&#62; blog</title>
	<atom:link href="http://rishida.net/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://rishida.net/blog</link>
	<description>News of changes to my main site, and W3C related posts.</description>
	<lastBuildDate>Fri, 04 May 2012 16:35:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Greenland</title>
		<link>http://rishida.net/blog/?p=919</link>
		<comments>http://rishida.net/blog/?p=919#comments</comments>
		<pubDate>Mon, 26 Mar 2012 08:58:04 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[photos]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=919</guid>
		<description><![CDATA[Greenland, a photo by r12a on Flickr. I&#8217;ve been processing some photos that have been lying around since last year. This is one of a few pictures of Greenland, taken as I flew over on the way to the Unicode Conference. Amazing glacier flows! See similar photos in lightbox view.]]></description>
			<content:encoded><![CDATA[<div style="float: left; margin: 0 20px 10px 0; padding: 0; font-size: 0.8em; line-height: 1.6em;"><a href="http://www.flickr.com/photos/ishida/7016948463/lightbox/" title="Greenland"><img style="box-shadow: 7px 7px 5px #888; margin-bottom: 10px; border-radius: 5px;" src="http://farm8.staticflickr.com/7243/7016948463_838b2402db.jpg" alt="Greenland by r12a" /></a><br /><span style="margin: 0;"><a href="http://www.flickr.com/photos/ishida/7016948463/">Greenland</a>, a photo by <a href="http://www.flickr.com/photos/ishida/">r12a</a> on Flickr.</span></div>
<p>I&#8217;ve been processing some photos that have been lying around since last year. This is one of a few pictures of Greenland, taken as I flew over on the way to the Unicode Conference. Amazing glacier flows!</p>
<p>See <a href="http://www.flickr.com/photos/ishida/7016948463/lightbox/" target="_blank">similar photos in lightbox view</a>.<br />
<br style="clear:both"/></p>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=919</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New: Balinese script notes</title>
		<link>http://rishida.net/blog/?p=916</link>
		<comments>http://rishida.net/blog/?p=916#comments</comments>
		<pubDate>Sat, 10 Mar 2012 10:21:32 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[script notes]]></category>
		<category><![CDATA[writings]]></category>
		<category><![CDATA[balinese]]></category>
		<category><![CDATA[unicode]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=916</guid>
		<description><![CDATA[I just uploaded an initial draft of an article Balinese Script Notes. It lists the Unicode characters used to represent Balinese text, and briefly describes their use. It starts with brief notes on general script features and discussions about which Unicode characters are most appropriate when there is a choice. The script type is abugida [...]]]></description>
			<content:encoded><![CDATA[<div>
<p style="float: right; width: 260px; margin-left: 1em; margin-bottom: 1em"><img src="http://rishida.net/blog/images/balinesenotes.png" alt="Characters in the Unicode Balinese block." /></p>
<p>I just uploaded an initial draft of an article <a href="http://rishida.net/scripts/balinese/">Balinese Script Notes</a>. It lists the Unicode characters used to represent Balinese text, and briefly describes their use. It starts with brief notes on general script features and discussions about which Unicode characters are most appropriate when there is a choice.</p>
<p>The script type is abugida &#8211; consonants carry an inherent vowel. It&#8217;s a complex script derived from Brahmi, and has lots of contextual shaping and positioning going on. Text runs left-to-right, and words are not separated by spaces.</p>
<p>I think it&#8217;s one of the most attractive scripts in Unicode, and for that reason I&#8217;ve been wanting to learn more about it for some time now.</p>
</div>
<p style="float:left; font-size: 150%"><a href="http://rishida.net/scripts/balinese/">&gt;&gt; Read it</a></p>
<p><br clear="all" /></p>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=916</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New picker: Balinese</title>
		<link>http://rishida.net/blog/?p=905</link>
		<comments>http://rishida.net/blog/?p=905#comments</comments>
		<pubDate>Mon, 05 Mar 2012 19:53:20 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[utilities]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[balinese]]></category>
		<category><![CDATA[kawi]]></category>
		<category><![CDATA[picker]]></category>
		<category><![CDATA[sasak]]></category>
		<category><![CDATA[unicode]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=905</guid>
		<description><![CDATA[&#62;&#62; Use it This picker contains characters from the Unicode Balinese block needed for writing the Balinese language. Characters needed for Sasak are also available in the Advanced section. Balinese musical notation characters are not included. About the tool: Pickers allow you to quickly create phrases in a script by clicking on Unicode characters arranged [...]]]></description>
			<content:encoded><![CDATA[<div>
<p style="float: right; width: 300px; margin-left: 1em; margin-bottom: 1em"><a href="http://rishida.net/blog/images/balinese-picker.png"><img src="http://rishida.net/blog/images/balinese-picker-small.png" alt="Picture of the page in action." /></a></p>
<p style="font-size: 150%"><a href="http://rishida.net/scripts/pickers/balinese/" target="_blank">&gt;&gt; Use it</a></p>
<p>This picker contains characters from the Unicode Balinese block needed for writing the Balinese language. Characters needed for Sasak are also available in the Advanced section. Balinese musical notation characters are not included.</p>
<p><strong>About the tool:</strong> Pickers allow you to quickly create phrases in a script by clicking on Unicode characters arranged in a way that aids their identification. Pickers are likely to be most useful if you don&#8217;t know a script well enough to use the native keyboard. The arrangement of characters also makes it much more usable than a regular character map utility.</p>
<p><strong>About this picker:</strong> Characters are grouped to aid input. The consonant block includes characters needed for Kawi and Sanskrit as well as the native Balinese characters, all arranged according to the Brahmi pronunciation grid.</p>
<p>The picker has only a default view and a font grid view. It&#8217;s difficult to put in the time for the shape-based, keyboard-based, and various transcription-based views in some other pickers. In a new departure, however, I have included a list of Latin characters on the default view to assist in writing transcriptions alongside Balinese text.</p>
<p>There is, however, a significant issue with this picker, due to the lack of support for Balinese as a script in computers. The only Unicode-based Balinese font I know of is Aksara Bali, but that font seems to only work as expected in Firefox on Mac OS X. Furthermore, the Aksara Bali font doesn&#8217;t handle ra repa as described in the Unicode Standard. The sequence &lt;consonant , adeg-adeg, ra repa> produces a visible adeg-adeg, rather than the post-fixed form of ra repa. The sequence &lt;consonant , vowel sign ra repa> produces the post-fixed form of ra repa, rather than the subjoined form. You can produce the post-fixed form with this font by using &lt;consonant , vowel sign ra repa> and the subjoined form by using &lt;consonant , adeg-adeg, ra, pepet>, but these sequences will produce content that cannot be matched against sequences using the Unicode approach, and content that may fail with other Unicode-compliant fonts in the future.</p>
<p>Hopefully some new, fully Unicode-compliant fonts will come along soon. This is one of the most beautiful scripts I have come across.</p>
<p>(Btw, I&#8217;m working on a set of notes for Balinese characters, linked from UniView, with some feature innovations to get around the font issue. Look out for that later. And I&#8217;m thinking I should develop a Javanese picker to go with this one. Just need a bit of time&#8230;)</p>
<p>For the curious, here&#8217;s the first article of the Universal Declaration of Human Rights, as typed in the Balinese picker. Translation by Tri Ediwan (reproduced from <a href="http://www.omniglot.com/writing/balinese.htm">Omniglot</a>). </p>
<p><img src="images/balinese-udhr.png" alt=" " /></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=905</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HTML5 adds new translate attribute</title>
		<link>http://rishida.net/blog/?p=831</link>
		<comments>http://rishida.net/blog/?p=831#comments</comments>
		<pubDate>Wed, 22 Feb 2012 11:07:07 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[i18n]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[writings]]></category>
		<category><![CDATA[html5]]></category>
		<category><![CDATA[translate]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=831</guid>
		<description><![CDATA[A translate attribute was recently added to HTML5. At the three MultilingualWeb workshops we have run over the past two years, the idea of this kind of &#8216;translate flag&#8217; has constantly excited strong interest from localizers, content creators, and from folks working with language technology. How it works Typically authors or automated script environments will [...]]]></description>
			<content:encoded><![CDATA[<p>A <code class="kw">translate</code> attribute was recently <a href="http://dev.w3.org/html5/spec/global-attributes.html#the-translate-attribute">added to HTML5</a>. At the three <a href="http://multilingualweb.eu/">MultilingualWeb workshops</a> we have run over the past two years, the idea of this kind of &#8216;translate flag&#8217; has constantly excited strong interest from localizers, content creators, and from folks working with language technology.</p>
<h3>How it works</h3>
<p>Typically authors or automated script environments will put the attribute in the markup of a page. You may also find that, in industrial translation scenarios, localizers add attributes during the translation preparation stage, as a way of avoiding the multiplicative effects of dealing with mistranslations in a large number of languages.</p>
<p>There is no effect on the rendered page (although you could, of course, style it if you found a good reason for doing so). The attribute will typically be used by workflow tools when the time comes to translate the text – be it by the careful craft of human translators, or by quick gist-translation APIs and services in the cloud.</p>
<p>The attribute can appear on any element, and it takes just two values: <code class="kw">yes</code> or <code class="kw">no</code>. If the value is <code class="kw">no</code>, translation tools should protect the text of the element from translation. The translation tool in question could be an automated translation engine, like those used in the online services offered by Google and Microsoft. Or it could be a human translator&#8217;s &#8216;workbench&#8217; tool, which would prevent the translator inadvertently changing the text.</p>
<p>Setting this translate flag on an element applies the value to all contained elements and to all attribute values of those elements.</p>
<p>You don&#8217;t have to use <code>translate="yes"</code> for this to work. If a page has no <code class="kw">translate</code> attribute, a translation system or translator should assume that all the text is to be translated.  The <code class="kw">yes</code> value is likely to see little use, though it could be very useful if you need to override a translate flag on a parent element and indicate some bits of text that should be translated. You may want to translate the natural language text in examples of source code, for example, but leave the code untranslated.</p>
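<p>As a minimal sketch of that override (the code sample and its comment are invented for illustration), the block below is protected as a whole, while the comment inside it is flagged for translation. Whether a given tool honours the inner <code class="kw">yes</code> value depends on the tool.</p>
<blockquote translate=no><p><code>&lt;pre translate=&quot;no&quot;&gt;&lt;code&gt;// &lt;span translate=&quot;yes&quot;&gt;add the delivery charge to the total&lt;/span&gt;<br />
total = subtotal + delivery;&lt;/code&gt;&lt;/pre&gt;</code></p></blockquote>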
<h3>Why it is needed</h3>
<p>You come across a need for this quite frequently. There is an example in the HTML5 spec about the Bee Game.  Here is a similar, but real example from my days at Xerox, where the documentation being translated referred to a machine with text on the hardware that wasn&#8217;t translated.</p>
<blockquote translate=no><p><code>&lt;p&gt;Click the Resume button on the Status Display or the<br />
&lt;span class=&quot;panelmsg&quot; translate=&quot;no&quot;&gt;CONTINUE&lt;/span&gt; button<br />
on the printer panel.&lt;/p&gt;</code></p></blockquote>
<p>Here are a couple more (real) examples of content that could benefit from the <code class="kw">translate</code> attribute.  The first is from a book, quoting a title of a work.</p>
<blockquote translate=no><p><code>&lt;p&gt;The question in the title &lt;cite translate=&quot;no&quot;&gt;How Far Can You Go?&lt;/cite&gt; applies to both the undermining of traditional religious belief by radical theology and the undermining of literary convention by the device of &quot;breaking frame&quot;...&lt;/p&gt;</code></p></blockquote>
<p>The next example is from a page about French bread – the French for bread is &#8216;<span lang="fr" xml:lang="fr" translate=no>pain</span>&#8216;.</p>
<blockquote translate=no><p><code>&lt;p&gt;Welcome to &lt;strong translate=&quot;no&quot;&gt;french pain&lt;/strong&gt; on Facebook. Join now to write reviews and connect with &lt;strong translate=&quot;no&quot;&gt;french pain&lt;/strong&gt;. Help your friends discover great places to visit by recommending &lt;strong translate=&quot;no&quot;&gt;french pain&lt;/strong&gt;.&lt;/p&gt;</code></p></blockquote>
<p>So adding the translate attribute to your page can help readers better understand your content when they run it through automatic translation systems, and can save a significant amount of cost and hassle for translation vendors with large throughput in many languages.</p>
<h3>What about Google Translate and Microsoft Translator?</h3>
<p>Both Google and Microsoft online translation services already provided the ability to prevent translation of content by adding markup to your content, although they did it in (multiple) different ways. Hopefully, the new attribute will help significantly by providing a standard approach.</p>
<p>Both Google and Microsoft currently support <code>class="notranslate"</code>, but replacing a class attribute value with an attribute that is a formal part of the language makes this feature much more reliable, especially in wider contexts. For example, a translation prep tool would be able to rely on the meaning of the HTML5 <code class="kw">translate</code> attribute always being what is expected. Also it becomes easier to port the concept to other scenarios, such as other translation APIs or localization standards such as XLIFF.</p>
<p>As it happens, the online service of Microsoft (who actually proposed a translate flag for HTML5 some time ago) already supported <code>translate="no"</code>. This, of course, was a proprietary attribute until now, and Google didn&#8217;t support it. However, just yesterday morning I received word, by coincidence, that WebKit/Chromium has just added support for the <code class="kw">translate</code> attribute, and yesterday afternoon Google added support for <code>translate="no"</code> to its online translation service. <a href="http://www.w3.org/International/tests/html-css/translate/results-online">See the results</a> of some tests I put together this morning. (Neither yet supports the <code>translate="yes"</code> override.)</p>
<p>In these proprietary systems, however, there are a good number of other non-standard ways to express similar ideas, even just sticking with Google and Microsoft.</p>
<p>Microsoft apparently supports <code>style="notranslate"</code>.  This is not one of the options Google lists for their online service, but on the other hand they have things that are not available via Microsoft&#8217;s service.</p>
<p>For example, if you have an entire page that should not be translated, you can add <code>&lt;meta name=&quot;google&quot; value=&quot;notranslate&quot;&gt;</code> inside the <code class="kw">head</code> element of your page and Google won&#8217;t translate any of the content on that page. (However, they also support <code>&lt;meta name=&quot;google&quot; content=&quot;notranslate&quot;&gt;</code>.) This shouldn&#8217;t be Google-specific, and a single way of doing this, i.e. <code>translate="no"</code> on the <code class="kw">html</code> tag, is far cleaner.</p>
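<p>A sketch of the standard form, assuming an English page that should be left entirely untranslated (the title and body text are placeholders invented for illustration):</p>
<blockquote translate=no><p><code>&lt;!DOCTYPE html&gt;<br />
&lt;html lang=&quot;en&quot; translate=&quot;no&quot;&gt;<br />
&lt;head&gt;&lt;meta charset=&quot;utf-8&quot;&gt;&lt;title&gt;Part numbers&lt;/title&gt;&lt;/head&gt;<br />
&lt;body&gt;&lt;p&gt;Placeholder text that should stay as it is.&lt;/p&gt;&lt;/body&gt;<br />
&lt;/html&gt;</code></p></blockquote>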
<p>It&#8217;s also not made clear, by the way, when dealing with either translation service, how to make sub-elements translatable inside an element where <code class="kw">translate</code> has been set to <code class="kw">no</code> &#8211; which may sometimes be needed.</p>
<p>As already mentioned, the new HTML5 translate attribute provides a simple and standard feature of HTML that can replace and simplify all these different approaches, and will help authors develop content that will work with other systems too.</p>
<h3>Can&#8217;t we just use the lang attribute?</h3>
<p>It was inevitable that someone would suggest this during the discussions around how to implement a translate flag; however, overloading language tags is not the solution. For example, a language tag can indicate which text is to be spellchecked against a particular dictionary. This has nothing to do with whether that text is to be translated or not. They are different concepts. In a document that has <code>lang="en"</code> in the html header, if you set <code>lang="notranslate"</code> lower down the page, that text will now not be spellchecked, since the language is no longer English. (Nor, for that matter, will styling work, voice browsers pronounce correctly, etc.)</p>
<h3>Going beyond the translate attribute</h3>
<p>The W3C&#8217;s <a href="http://www.w3.org/TR/its/">ITS (International Tag Set) Recommendation</a> proposes the use of a translate flag such as the attribute just added to HTML5, but also goes beyond that in describing a way to assign translate flag values to particular elements or combinations of markup throughout a document or set of documents. For example, you could say, if it makes sense for your content, that by default, all <code class="kw">p</code> elements with a particular class name should have the translate flag set to <code class="kw">no</code> for a specific set of documents.</p>
<p>Microsoft offers something along these lines already, although it is much less powerful than the ITS approach. If you use <code>&lt;meta name=&quot;microsoft&quot; content=&quot;notranslateclasses myclass1 myclass2&quot; /&gt;</code> anywhere on the page (or as part of a widget snippet), any of the CSS classes listed after “notranslateclasses” will behave the same as the “notranslate” class.</p>
<p>Microsoft and Google&#8217;s translation engines also don&#8217;t translate content within <code class="kw">code</code> elements.  Note, however, that you don&#8217;t seem to have any choice about this – there don&#8217;t seem to be instructions about how to override this if you do want your <code class="kw">code</code> element content translated.</p>
<p>By the way, there are plans afoot to set up a new MultilingualWeb-LT Working Group at the W3C in conjunction with a European Commission project to further develop ideas around the ITS spec, and create reference implementations. They will be looking, amongst many other things, at ways  of integrating the new <code class="kw">translate</code> attribute into localization industry workflows and standards. Keep an eye out for it.</p>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=831</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Fonts supplied with Windows7 and Mac OS X, by script</title>
		<link>http://rishida.net/blog/?p=808</link>
		<comments>http://rishida.net/blog/?p=808#comments</comments>
		<pubDate>Sun, 12 Feb 2012 10:34:58 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[general]]></category>
		<category><![CDATA[i18n]]></category>
		<category><![CDATA[script notes]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[writings]]></category>
		<category><![CDATA[fonts]]></category>
		<category><![CDATA[unicode]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=808</guid>
		<description><![CDATA[I&#8217;ve wanted to get around to this for years now. Here is a list of fonts that come with Windows7 and Mac OS X Snow Leopard/Lion, grouped by script. This kind of list could be used to set font-family styles for CSS, if you want to be reasonably sure what the user will see, or [...]]]></description>
			<content:encoded><![CDATA[<p style="float: right; width: 210px; margin-left: 1em; margin-bottom: 1em"><img src="http://rishida.net/blog/images/fontlist.png" alt="Picture of the page in action." /></p>
<p>I&#8217;ve wanted to get around to this for years now. Here is a list of fonts that come with Windows7 and Mac OS X Snow Leopard/Lion, grouped by script.</p>
<p>This kind of list could be used to set font-family styles for CSS, if you want to be reasonably sure what the user will see, or it could be used just to find a font you like for a particular script. I&#8217;m still working on the list, and there are some caveats.</p>
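<p>For instance, a rule for Thai text might look something like the following sketch. The font names are illustrative picks rather than a recommendation (to the best of my knowledge Leelawadee ships with Windows7 and Thonburi with Mac OS X), so check the list for the names that suit your content.</p>
<blockquote translate=no><p><code>&lt;style&gt;<br />
:lang(th) { font-family: Leelawadee, Thonburi, sans-serif; }<br />
&lt;/style&gt;</code></p></blockquote>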
<p style="font-size: 150%"><a href="http://rishida.net/scripts/fontlist/">&gt;&gt; See the list</a></p>
<p>Some of the fonts listed above may be disabled on the user&#8217;s system. I&#8217;m making an assumption that someone who reads Tibetan will have the Tibetan font turned on, but for my articles that explain writing systems to people in English, such assumptions may not hold.</p>
<p>The <a href="http://www.microsoft.com/typography/fonts/product.aspx?pid=161">list I used to identify Windows fonts</a> is Windows7-specific and fairly stable, but the Mac font list spans more than one version of Mac OS X. I could only find an <a href="http://en.wikipedia.org/wiki/List_of_typefaces_included_with_Mac_OS_X">unofficial list of fonts for Snow Leopard</a>, and there were some fonts on that list that I didn&#8217;t have on my system. Where a Mac font is new with Lion (and there are a significant number) it is indicated. See the <a href="http://support.apple.com/kb/HT5098">official list of fonts on Mac OS X Lion</a>.</p>
<p>There shouldn&#8217;t be any fonts listed here for a given script that aren&#8217;t supplied with Windows7 or Mac OS X Snow Leopard/Lion, but there are probably supplied fonts that are not yet listed here (typically these will be large fonts that cover multiple scripts). In particular, note that I haven&#8217;t yet made a list of fonts that support Latin, Greek and Cyrillic (mainly because there are so many of them and partly because I&#8217;m wondering how useful it will be.)</p>
<p>The text used is as much as would fit on one line of article 1 of the Universal Declaration of Human Rights, taken from <a href="http://unicode.org/udhr/index_by_name.html">this Unicode page</a>, wherever I could find it. I created a few instances myself, where it was missing, and occasionally I resorted to arbitrary lists of characters.</p>
<p>You can obtain a character-based version of the text used by looking at the source text: look for the title attribute on the section heading.</p>
<p>Things still to do:</p>
<ul>
<li>create sections for Latin, Greek and Cyrillic fonts</li>
<li>check for fonts covering multiple Unicode blocks</li>
<li>figure out how to tell, and how to show which is the system default</li>
<li>work out and show what&#8217;s not available in Windows XP</li>
<li>work out what&#8217;s new in Lion, and whether it&#8217;s worth including them</li>
<li>figure out whether people with different locale setups see different things</li>
<li>recapture all font images that need it at 36px, rather than varying sizes</li>
</ul>
<h2 style="color: orange;">Update, 19 Feb 2012</h2>
<p>I uploaded a new version of the font list with the following main changes:</p>
<ul>
<li>If you click on an image you see text with that font applied (if you have it on your system, of course). The text can be zoomed from 14px to 100px (using a nice HTML5 slider, if you have the right browser! [try Chrome, Safari or Opera]). This text includes a little Latin text so you can see the relationship between that and the script.</li>
<li>All font graphics are now standardised so that text is imaged at a font size of 36px. This makes it more difficult to see some fonts (unless you can use the zoom text feature), but gives a better idea of how fonts vary in default size.</li>
<li>I added a few extra fonts which contained multiple script support.</li>
<li>I split Chinese into Simplified and Traditional sections.</li>
<li>Various other improvements, such as adding real text for N&#8217;Ko, correcting the Traditional Chinese text, flipping headers to the left for RTL fonts, reordering fonts so that similar ones are near to each other, etc.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=808</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>UniView 6.1:  Unicode 6.1 support, popout windows, case converter, &#8230;</title>
		<link>http://rishida.net/blog/?p=791</link>
		<comments>http://rishida.net/blog/?p=791#comments</comments>
		<pubDate>Tue, 31 Jan 2012 11:40:10 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[i18n]]></category>
		<category><![CDATA[utilities]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[unicode]]></category>
		<category><![CDATA[uniview]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=791</guid>
		<description><![CDATA[&#62;&#62; Use UniView The major change in this update is the update of the data to support Unicode version 6.1.0, which should be released today. (See the list of links to new Unicode blocks below.) There are also a number of feature and bug related changes. What UniView does: Look up and see characters (using [...]]]></description>
			<content:encoded><![CDATA[<p style="float: right; width: 310px; margin-left: 1em; margin-bottom: 1em"><a href="http://rishida.net/blog/images/uniview61.png"><img src="http://rishida.net/blog/images/uniview61-small.png" alt="Picture of the page in action." /></a></p>
<p style="font-size: 150%"><a href="http://rishida.net/scripts/uniview/">&gt;&gt; Use UniView</a></p>
<p>The major change in this update is the update of the data to support <strong>Unicode version 6.1.0</strong>, which should be released today. (See the list of links to new Unicode blocks below.) </p>
<p>There are also a number of feature and bug related changes.</p>
<p><strong>What UniView does:</strong> Look up and see characters (using graphics or fonts) and property information, view whole character blocks or custom ranges, select characters to paste into your document, paste in and discover unknown characters, search for characters, do hex/dec/ncr conversions, highlight character types, etc. etc. Supports Unicode 6.1 and written with Web Standards to work on a variety of browsers. No need to install anything.</p>
<p><strong>List of changes:</strong> </p>
<ul>
<li>
<p>One significant change enables you to display information in a separate window, rather than overwriting the information currently displayed. This can be done by typing/pasting/dragging a set of characters or character code values into the new <span class="ui">Popout</span> area and selecting the <img border="0" align="bottom" alt=" " src="http://rishida.net/scripts/uniview/images/apply.gif"/> icon alongside the <span class="ui">Characters</span> or <span class="ui">Copy &amp; paste</span> input fields (depending on what you put in the popout window).</p>
</li>
<li>
<p>Two new icons were added to the <span class="ui">Copy &amp; paste</span> area:</p>
<p><img alt="Analyse" src="http://rishida.net/scripts/uniview/images/case-simple.png"/> Clicking on this will display the characters in the area in the lower right part of the page with all relevant characters converted to uppercase, lowercase and titlecase. Characters that had no case conversion information are also listed.</p>
<p><img alt="Analyse" src="http://rishida.net/scripts/uniview/images/case-detail.png"/> Clicking on this produces the same kind of output as clicking on the icon just above, but shows the mappings for those characters that have been changed, eg. e→E.</p>
</li>
<li>
<p>Where  character information displayed in the lower right panel has a case or decomposition mapping, UniView now displays the characters involved, rather than just giving the hex value(s), eg. Uppercase mapping:	0043    C. You will need a font on your system to see the characters displayed in this way, but whether or not you have a font, this provides a quick and easy way to copy the case-changed character (rather than having to copy the hex value and convert it first).</p>
</li>
<li>
<p>There is also a new line, slightly further down, when UniView is in graphic mode. This line starts with &#8216;As text:&#8217;, and shows the character using whatever default font you have on your system. Of course, if you don&#8217;t have a font that includes that character you won&#8217;t see it. This has been added to make it easier to copy and paste a character into text.</p>
</li>
<li>
<p>Fixed some small bugs, such as problems with search when U+29DC INCOMPLETE INFINITY is returned.</p>
</li>
</ul>
<p>Enjoy.</p>
<p>Here are direct links to the new blocks added to Unicode 6.1:</p>
<ul>
<li><a href="http://rishida.net/scripts/uniview/?block=Arabic_Extended-A">Arabic Extended-A</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Sundanese_Supplement">Sundanese Supplement</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Meetei_Mayek_Extensions">Meetei Mayek Extensions</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Meroitic_Hieroglyphs">Meroitic Hieroglyphs</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Meroitic_Cursive">Meroitic Cursive</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Sora_Sompeng">Sora Sompeng</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Chakma">Chakma</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Sharada">Sharada</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Takri">Takri</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Miao">Miao</a></li>
<li><a href="http://rishida.net/scripts/uniview/?block=Arabic_Mathematical_Alphabetic_Symbols">Arabic Mathematical Alphabetic Symbols</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=791</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using unicode-range in font-face in CSS</title>
		<link>http://rishida.net/blog/?p=760</link>
		<comments>http://rishida.net/blog/?p=760#comments</comments>
		<pubDate>Sun, 18 Dec 2011 13:47:36 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[i18n]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[writings]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[font-face]]></category>
		<category><![CDATA[fonts]]></category>
		<category><![CDATA[glyphs]]></category>
		<category><![CDATA[unicode-range]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=760</guid>
		<description><![CDATA[These are notes on using CSS @font-face to gain finer control over the fonts applied to characters in particular Unicode ranges of your text, without resorting to additional markup. Possibilities and problems. Changing the font used for certain characters Most non-English fonts mix glyphs from different writing systems. Usually the font contains glyphs for Latin [...]]]></description>
			<content:encoded><![CDATA[<p>These are notes on using CSS @font-face to gain finer control over the fonts applied to characters in particular Unicode ranges of your text, without resorting to additional markup. Possibilities and problems.</p>
<h2>Changing the font used for certain characters</h2>
<p>Most non-English fonts mix glyphs from different writing systems. Usually the font contains glyphs for Latin characters plus a non-Latin script, for example English+Japanese, or English+Thai, etc.</p>
<p>Normally the font designer will take care to harmonise the Latin script glyphs with the non-Latin, but there may be cases where you want to change the design of the glyphs for, say, an embedded script without adding markup to your page.</p>
<p>For example, if I apply the MS-Mincho font to some content in Japanese with embedded Latin text I&#8217;m likely to see the following:</p>
<p style="text-align: center;"><img src="/rishida/blog/images/ms-mincho-only.png" alt='' /></p>
<p>Let&#8217;s suppose I&#8217;d like the English text to appear in a different, proportionally-spaced font. I could put markup around the English and set a class on the markup to apply the font I want, but this is very time consuming and bloats your code.</p>
<p>An alternative is to use @font-face. Here is an example:</p>
<pre>
@font-face {
  font-family: myJapanesefont;
  src: local(MS-Mincho);
  }
@font-face {
  font-family: myJapanesefont;
  src: local(Gentium);
  unicode-range: U+41-5A, U+61-7A, U+C0-FF;
  }
p {
  font-family: myJapanesefont;
  }
</pre>
<p>The result would be:</p>
<p style="text-align: center;"><img src="/rishida/blog/images/ms-mincho-gentium.png" alt='' /></p>
<p>The first font-face declaration associates the MS-Mincho font with the name &#8216;myJapanesefont&#8217;. The second font-face declaration associates the Gentium font with the Unicode code points in the Latin-1 letter range (of course, you can extend this if you use Latin characters outside that range and they are covered by the font).</p>
<p>When specifying src, the local() keyword indicates that font-face should look for the font on the user&#8217;s system. Of course, to improve interoperability, you may want to specify a number of alternatives here, or a downloadable WOFF font.  The most interoperable value to use for local() is the PostScript name of the font. (On the Mac open Font Book, select the font, and choose Preview > Show Font Information to find this.)</p>
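<p>As a rough sketch of what such a list of alternatives might look like, here is a variant of the second rule above that tries two local names before falling back to a downloadable WOFF file. The local name <code>GentiumPlus</code> and the URL are assumptions made up for illustration, not values I have tested.</p>
<pre>
/* Sketch only: the second local name and the URL are hypothetical */
@font-face {
  font-family: myJapanesefont;
  src: local(Gentium),
       local(GentiumPlus),
       url(/fonts/gentium.woff) format("woff");
  unicode-range: U+41-5A, U+61-7A, U+C0-FF;
  }
</pre>
<p>The browser works through the src list in order and uses the first source it is able to load.</p>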
<p>Note how I was careful to set the unicode-range values to exclude punctuation (such as the exclamation mark) that would be used by (and harmonised with) the Japanese characters. </p>
<h2>Adding support for new characters to a font</h2>
<p>You can use the same approach for fonts that don&#8217;t have support for a particular Unicode range.  </p>
<p>For example, the Nafees Nastaliq font has no glyphs for the Latin range (other than digits), so the browser falls back to the system default.</p>
<p style="text-align: center;"><img src="/rishida/blog/images/nafees-only.png" alt='' /></p>
<p>With the following code, I can produce a more pleasant font for the &#8216;W3C&#8217; part:</p>
<pre>
@font-face {
  font-family: myUrduFont;
  src: local(NafeesNastaleeq);
  }
@font-face {
  font-family: myUrduFont;
  src: local(BookAntiqua);
  unicode-range: U+30-FF;
  }
div p {
  font-family: myUrduFont;
  font-size: 60px;
  }
</pre>
<p style="text-align: center;"><img src="/rishida/blog/images/nafees-bookantiqua.png" alt='' /></p>
<h2>A big fly in the ointment</h2>
<p>If you look at the ranges in the unicode-range value, you&#8217;ll see that I kept to just the letters of the alphabet in the Japanese example, and the missing glyphs in the Urdu case.</p>
<p>There are a number of characters that are used by all scripts, however, and these cause problems because you can&#8217;t apply fonts based on the context – even if you could work out what that context was.</p>
<p>In the case of the Japanese example above, numbers are left to be rendered by the Mincho font, but when those characters appear in the Latin text, they look incorrectly sized. Look, for example, at the 3 in W3C below.</p>
<p style="text-align: center;"><img src="/rishida/blog/images/mincho-gentium-digit.png" alt='' /></p>
<p>The same problem arises with spaces and punctuation marks. The exclamation mark was left in the Mincho font in the Japanese example because, in this case, it is part of the Japanese text. Punctuation of this kind could, however, be associated with the Latin text. See the question mark in this example.</p>
<p style="text-align: center;"><img src="/rishida/blog/images/mincho-baskerville-question.png" alt='' /></p>
<p>Even more problematic are the spaces in that example.  They are too wide in the Latin text. In Urdu text we have the opposite problem: if you use Urdu space glyphs in Latin text you don&#8217;t see them at all (there should be a gap between W3C and i18n below).</p>
<p style="text-align: center;"><img src="/rishida/blog/images/nafees-gentium-space.png" alt='' /></p>
<p>With my W3C hat on, I start wondering whether there are any rules we can use to apply different glyphs for some characters depending on the script context in which they are used, but then I realise that this is going to bring in all the problems we already have for bidi text when dealing with punctuation or spaces between flows of text in different scripts.  I&#8217;m not sure it&#8217;s a tractable problem without resorting to markup to delimit the boundaries. But then, of course, we end up right back where we started.</p>
<p>So it seems, disappointingly, that the unicode-range descriptor is destined to be of only limited usefulness for me.  That&#8217;s a real shame.</p>
<h2>Another small issue</h2>
<p>The examples don&#8217;t show major problems, but I assume that sometimes the fonts you want to bring together using font-face will have very different aspect ratios, so you may need to use something like font-size-adjust to balance the size of the fonts being used.</p>
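<p>A minimal sketch of that idea, assuming an aspect value of 0.5 (a made-up number; the right value depends on the fonts involved and would need to be found by experiment):</p>
<pre>
/* Sketch only: 0.5 is an assumed aspect value, not a measured one */
p {
  font-family: myJapanesefont;
  font-size-adjust: 0.5;
  }
</pre>
<p>font-size-adjust asks the browser to scale whichever font is actually used so that its x-height is the given fraction of the font-size, which may help keep the two sets of glyphs visually balanced. Browser support for the property varies, so it would need testing.</p>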
<h2>Browser support</h2>
<p>The CSS code above worked for me in Chrome and Safari on Mac OS X 10.6, but didn&#8217;t work in Firefox or Opera. Nor did it work in IE9 on Windows 7.</p>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=760</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>What is XHTML5?</title>
		<link>http://rishida.net/blog/?p=747</link>
		<comments>http://rishida.net/blog/?p=747#comments</comments>
		<pubDate>Mon, 05 Dec 2011 11:03:37 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[i18n]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[writings]]></category>
		<category><![CDATA[html5]]></category>
		<category><![CDATA[polyglot]]></category>
		<category><![CDATA[xhtml]]></category>
		<category><![CDATA[xhtml5]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=747</guid>
		<description><![CDATA[There appears to be some confusion about XHTML1.0 vs XHTML5. Here is my best shot at an explanation of what XHTML5 is. * This post is written for people with some background in MIME types and html/xml formats. In case that&#8217;s not you, this may give you enough to follow the idea: &#8216;served as&#8217; means [...]]]></description>
			<content:encoded><![CDATA[<p>There appears to be some confusion about XHTML1.0 vs XHTML5. Here is my best shot at an explanation of what XHTML5 is.</p>
<div style="float:right; width: 30%; margin-left: 10px; margin-bottom: 10px; color:#999999;">* This post is written for people with some background in MIME types and html/xml formats. In case that&#8217;s not you, this may give you enough to follow the idea: &#8216;served as&#8217; means sent from a server to the browser with a MIME type declaration in the HTTP protocol header that says that the content of the page is HTML (text/html) or XML (eg. <code>application/xhtml+xml</code>). <a href="http://www.w3.org/International/articles/serving-xhtml/#mime" target="_blank">See examples and more explanations</a>.</div>
<p>XHTML5 is an HTML5 document served as* <code>application/xhtml+xml</code> (or another XML mime type). The syntax rules for XHTML5 documents are simply those rules given by the <a href="http://www.w3.org/TR/REC-xml/" target="_blank">XML specification</a>. The vocabulary (elements and attributes) is defined by the <a href="http://dev.w3.org/html5/spec/" target="_blank">HTML5 spec</a>.</p>
<p>Anything served as <code>text/html</code> is not XHTML5.</p>
<p>Note that HTML5 (without the X) can be written in a style that looks like XML syntax. For example, using a / in empty elements (eg. <code>&lt;img src=&quot;...&quot; /&gt;</code>), or using quotes around attributes.  But code written this way is still HTML5, not XHTML5, if it is served as <code>text/html</code>.</p>
<p>There are normally other differences between HTML5 and XHTML5.  For example, XHTML5 documents may have an XML declaration at the start of the document. HTML5 documents cannot have that.  XHTML5 documents are likely to have a more complicated doctype (to facilitate XML processing).  And XHTML5 documents will have an xmlns attribute on the html tag.  There are a few other HTML5 features that are not compatible with XML, and must be avoided.</p>
<p>Similar differences existed between HTML 4.01 and XHTML 1.0.  However, moving on from XHTML 1.0 will typically involve a subtle but significant shift in thinking.  You might have written XHTML 1.0 with no intention of serving it as anything other than <code>text/html</code>. XHTML in the XHTML 1.0 sense tended to be seen  largely as a difference in syntax; it was originally designed to be served as XML, but (with some customisations to suit HTML documents) could be, and usually was, served with an HTML mime type. XHTML in the XHTML5 sense, means HTML5 documents served with an XML mime type (and appropriate customisations to suit XML documents), ie. it&#8217;s the MIME type, not the syntax, that makes it XHTML.</p>
<p>Which brings us to Polyglot documents.  A polyglot document is a document that uses the subset of HTML5 and XML that can be processed as either HTML or XHTML, and can be served as either <code>text/html</code> or <code>application/xhtml+xml</code>, ie. as either HTML5 or XHTML5, without any errors or warnings in either case.  The <a href="http://dev.w3.org/html5/html-xhtml-author-guide/" target="_blank">polyglot spec</a> defines the things which allow this compatibility (such as using no XML declaration, proper casing of element names, etc.), and which things to avoid.  It also mandates at least one additional restriction, ie. disallowing UTF-16 encoded documents.</p>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=747</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>How to generate a list of Unicode characters with names and/or codepoints</title>
		<link>http://rishida.net/blog/?p=702</link>
		<comments>http://rishida.net/blog/?p=702#comments</comments>
		<pubDate>Tue, 25 Oct 2011 12:31:54 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[i18n]]></category>
		<category><![CDATA[utilities]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[writings]]></category>
		<category><![CDATA[characters]]></category>
		<category><![CDATA[unicode]]></category>
		<category><![CDATA[uniview]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=702</guid>
		<description><![CDATA[One of the more useful features of UniView is its ability to list the characters in a string with names and codepoints. This is particularly useful when you can&#8217;t tell what a string of characters contains because you don&#8217;t have a font, or because the script is too complex, etc. For example, I was recently [...]]]></description>
			<content:encoded><![CDATA[<p>One of the more useful features of <a href="http://rishida.net/scripts/uniview/" target="_blank" style="font-size: 120%;">UniView</a> is its ability to list the characters in a string with names and codepoints. This is particularly useful when you can&#8217;t tell what a string of characters contains because you don&#8217;t have a font, or because the script is too complex, etc.</p>
<div style="float:left; margin: 0 20px 10px 20px;"><img src="http://rishida.net/blog/images/ishida-nastaliq.png" alt="'ishida' in Persian in  nastaliq font style" /></div>
<p>For example, I was recently sent an email where my name was written in Persian as <span style="font-family: IranNastaliq, 'Nafees Nastaleeq', serif; font-size: 120%;">ایشی‌دا</span>. The image shows how it looks in a nastaliq font. </p>
<p style="clear:both;">To see the component characters, drop the string into UniView&#8217;s <span class=ui>Copy &amp; Paste</span> field and click on the <img src="http://rishida.net/scripts/newuniview/images/apply.gif" alt="downwards pointing arrow" /> icon.  Here is the result:</p>
<p><img src="http://rishida.net/blog/images/ishida-nastaliq-list.png" alt="list of characters" /></p>
<p>Note how you can now see that there&#8217;s an invisible control character in the string.  Note also that you see a graphic image for each character, which is a big help if the string you are investigating is just a sequence of boxes on your system.</p>
<p>Not only can you discover characters in this way, but you can create lists of characters which can be pasted into another document, and customise the format of those lists.</p>
<h2>Pasting the list elsewhere</h2>
<p>If you select this list and paste it into a document, you&#8217;ll see something like this:</p>
<pre>
  0627  ARABIC LETTER ALEF
  06CC  ARABIC LETTER FARSI YEH
  0634  ARABIC LETTER SHEEN
  06CC  ARABIC LETTER FARSI YEH
  200C  ZERO WIDTH NON-JOINER
  062F  ARABIC LETTER DAL
  0627  ARABIC LETTER ALEF
</pre>
<p>You can make the characters appear by deselecting <span class="ui">Use graphics</span> on the <span class="ui">Look up</span> tab. (Of course, you need an Arabic font to see the list as intended.)</p>
<pre>
ا  ‎0627  ARABIC LETTER ALEF
ی  ‎06CC  ARABIC LETTER FARSI YEH
ش  ‎0634  ARABIC LETTER SHEEN
ی  ‎06CC  ARABIC LETTER FARSI YEH
‌  ‎200C  ZERO WIDTH NON-JOINER
د  ‎062F  ARABIC LETTER DAL
ا  ‎0627  ARABIC LETTER ALEF
</pre>
<h2>Customising the list format</h2>
<p>What may be less obvious is that you can also customise the format of this list using the settings under the <span class="ui">Options</span> tab.  For example, using the <span class="ui">List format</span> settings, I can produce a list that moves the character column between the number and the name, like this:</p>
<pre>
  0627  ا  ARABIC LETTER ALEF
  ‎06CC  ی  ARABIC LETTER FARSI YEH
  ‎0634  ش  ARABIC LETTER SHEEN
  ‎06CC  ی  ARABIC LETTER FARSI YEH
  ‎200C  ‌  ZERO WIDTH NON-JOINER
  ‎062F  د  ARABIC LETTER DAL
  ‎0627  ا  ARABIC LETTER ALEF
</pre>
<p>Or I can remove one or more columns from the list, such as:</p>
<pre>
  ا  ARABIC LETTER ALEF
  ی  ARABIC LETTER FARSI YEH
  ش  ARABIC LETTER SHEEN
  ی  ARABIC LETTER FARSI YEH
  ‌  ZERO WIDTH NON-JOINER
  د  ARABIC LETTER DAL
  ا  ARABIC LETTER ALEF
</pre>
<p>With the option <span class="ui">Show U+ in lists</span> I can also add or remove the U+ before the codepoint value.  For example, this lets me produce the following list:</p>
<pre>
  ‎U+0627  ARABIC LETTER ALEF
  ‎U+06CC  ARABIC LETTER FARSI YEH
  ‎U+0634  ARABIC LETTER SHEEN
  ‎U+06CC  ARABIC LETTER FARSI YEH
  ‎U+200C  ZERO WIDTH NON-JOINER
  ‎U+062F  ARABIC LETTER DAL
  ‎U+0627  ARABIC LETTER ALEF
</pre>
<h2>Other lists in UniView</h2>
<p>We&#8217;ve shown how you can make a list of characters in the <span class="ui">Copy &amp; Paste</span> box, but don&#8217;t forget that you can also create lists for a Unicode block, custom range of characters, search list results, or list of codepoint values, etc.  And not only that, but you can filter lists in various ways.</p>
<p>Here is just one quick example of how you can obtain a list of numbers for the Devanagari script.  </p>
<ol>
<li>On the <span class=ui>Look up</span> tab, select Devanagari from the <span class=ui>Unicode block</span> pull down list.</li>
<li>Select <span class=ui>Show range as list</span> and deselect (optional) <span class=ui>Use graphics</span>.</li>
<li>Under the <span class=ui>Filter</span> tab, select Number from the <span class=ui>Show properties</span> pull down list.</li>
<li>Click on <span class=ui>Make list from highlights</span></li>
</ol>
<p>You end up with the following list, which you can paste into your document.</p>
<pre>
०  ‎0966  DEVANAGARI DIGIT ZERO
१  ‎0967  DEVANAGARI DIGIT ONE
२  ‎0968  DEVANAGARI DIGIT TWO
३  ‎0969  DEVANAGARI DIGIT THREE
४  ‎096A  DEVANAGARI DIGIT FOUR
५  ‎096B  DEVANAGARI DIGIT FIVE
६  ‎096C  DEVANAGARI DIGIT SIX
७  ‎096D  DEVANAGARI DIGIT SEVEN
८  ‎096E  DEVANAGARI DIGIT EIGHT
९  ‎096F  DEVANAGARI DIGIT NINE
</pre>
<p>(Of course, you can also customise the layout of this list as described in the previous section.)</p>
<p><a href="http://rishida.net/scripts/uniview/" target="_blank">Try it out</a>.</p>
<h2>Reversing the process: from list to string</h2>
<p>To complete the circle, you can also cut &amp; paste any of the lists in the blog text above into UniView, to explore each character&#8217;s properties or recreate the string.  </p>
<p>Select one of the lists above and paste it into the <span class=ui>Characters</span> input field on the <span class=ui>Look up</span> tab. Hit the <img src="http://rishida.net/scripts/newuniview/images/apply.gif" alt="downwards pointing arrow" /> icon alongside, and UniView will recreate the list for you. Click on each character to view detailed information about it.</p>
<p>If you want to recreate the string from the list, simply click on the <img src="http://rishida.net/scripts/newuniview/images/tochar.png" alt="upwards pointing arrow" /> icon below the <span class=ui>Copy &amp; paste</span> box, and the list of characters will be reconstituted in the box as a string.</p>
<p>Voila!</p>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=702</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Managing CSS style sheets for LTR and RTL variants of a page</title>
		<link>http://rishida.net/blog/?p=671</link>
		<comments>http://rishida.net/blog/?p=671#comments</comments>
		<pubDate>Mon, 10 Oct 2011 08:12:30 +0000</pubDate>
		<dc:creator>r12a</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[i18n]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[writings]]></category>
		<category><![CDATA[arabic]]></category>
		<category><![CDATA[bidi]]></category>
		<category><![CDATA[hebrew]]></category>
		<category><![CDATA[left]]></category>
		<category><![CDATA[margin]]></category>
		<category><![CDATA[right]]></category>
		<category><![CDATA[style sheets]]></category>
		<category><![CDATA[text-align]]></category>

		<guid isPermaLink="false">http://rishida.net/blog/?p=671</guid>
		<description><![CDATA[I created a new HTML5-based template for our W3C Internationalization articles recently, and I&#8217;ve just received some requests to translate documents into Arabic and Hebrew, so I had to get around to updating the bidi style sheets. (To make it quicker to develop styles, I create the style sheet for ltr pages first, and only [...]]]></description>
			<content:encoded><![CDATA[<p>I created a new HTML5-based template for our <a href="http://www.w3.org/International/articlelist">W3C Internationalization articles</a> recently, and I&#8217;ve just received some requests to translate documents into Arabic and Hebrew, so I had to get around to updating the bidi style sheets. (To make it quicker to develop styles, I create the style sheet for ltr pages first, and only when that is working well do I create the rtl style sheet info.)</p>
<p>Here are some thoughts about how to deal with style sheets for both right-to-left (rtl) and left-to-right (ltr) documents.</p>
<h2>What needs changing?</h2>
<p>Converting a style sheet is a little more involved than using a global search and replace to convert left to right, and vice versa. While this may catch many of the things that need changing, it won&#8217;t catch all, and it could also introduce errors into the style sheet.</p>
<p>For example, I had selectors called .topleft and .bottomright in my style sheet. These, of course, shouldn&#8217;t be changed.  There may also be occasional situations where you don&#8217;t want to change the direction of a particular block.</p>
<p>Another thing to look out for: I tend to use -left and -right a lot when setting things like margins, but where I have set something like <code>margin: 1em 32% .5em 7.5%;</code> you can&#8217;t just use search and replace, and you have to carefully scour the whole of the main stylesheet to find the instances where the right and left margins are not balanced.</p>
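<p>To illustrate, here is roughly what the mirrored version of that shorthand would look like. The selector <code>.sidenote</code> is a hypothetical name used only for this sketch.</p>
<pre>
/* ltr style sheet: values are top, right, bottom, left */
.sidenote { margin: 1em 32% .5em 7.5%; }

/* rtl style sheet: the second and fourth values swap places */
.sidenote { margin: 1em 7.5% .5em 32%; }
</pre>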
<p>There is a web service called <a href="http://cssjanus.commoner.com/" target="_blank">CSSJanus</a> that can apply a little intelligence to convert most of what you need.  You still have to use it with care, but it does come with a convention to prevent conversion of properties where needed (you can stop CSSJanus from running on an entire class or any rule within a class by adding a <code>/* @noflip */</code> comment before the rule(s) you want CSSJanus to ignore).</p>
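<p>As a sketch of how that convention might be applied to one of the selectors mentioned earlier (the property values here are invented for illustration):</p>
<pre>
/* @noflip */
.topleft { position: absolute; top: 0; left: 0; }

/* no @noflip comment, so CSSJanus is free to flip this rule */
p { margin-left: 7.5%; text-align: left; }
</pre>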
<p>Note also that there are other things that may need changing besides the right and left values. For example, some of the graphics on our template need to be flipped (such as the dog-ear icon in the top corner of the page).</p>
<p>CSS may provide a way to do this in the future, but it is still only a proposal in a First Public Working Draft at the moment. (It would involve writing a selector such as <code>#site-navigation:dir(rtl) { background-image: url(standards-corner-rtl.png); }</code>.)</p>
<h2>Approach 1: extracting changed properties to an auxiliary style sheet</h2>
<p>For the old template I have a secondary, bidi style sheet that I load after the main style sheet. This bidi style sheet contains a copy of just the rules in the main style sheet that needed changing and overwrites the styles in the main style sheet.  These changes were mainly to margin, padding, and text-align properties, though there were also some others, such as positioning, background and border properties.</p>
<p>The cons of this approach were:</p>
<ol>
<li>it&#8217;s a pain to create and maintain a second style sheet in the first place</li>
<li>it&#8217;s an even bigger pain to remember to copy any relevant changes in the main style sheet to the bidi style sheet, not least because the structure is different, and it&#8217;s a little harder to locate things</li>
<li>everywhere that the main style sheet declared, say, a left margin without declaring a value for the right margin, you have to figure out what that other margin should be and add it to the bidi style sheet. For example, if a figure has just margin-left: 32%, that will be converted to margin-right: 32%, but because the bidi style sheet hasn&#8217;t overwritten the main style sheet&#8217;s margin-left value, the Arabic page will end up with both margins set to 32%, and a much thinner figure than desired.  To prevent this, you need to figure out what all those missing values should be, which is typically not straightforward, and add them explicitly to the bidi style sheet.</li>
<li>downloading a second style sheet and overwriting styles leads to higher bandwidth consumption and more processing work for the rtl pages.</li>
</ol>
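<p>The third point can be sketched as follows, using a hypothetical <code>.figure</code> class and assuming the margin that needs restoring is 0 (in a real style sheet the value has to be worked out from whatever else applies to the element):</p>
<pre>
/* main style sheet (ltr) */
.figure { margin-left: 32%; }

/* bidi style sheet, loaded after the main one */
.figure {
  margin-right: 32%;
  margin-left: 0;   /* without this line, both margins end up at 32% */
  }
</pre>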
<h2>Approach 2: copying the whole style sheet and making changes</h2>
<p>This is the approach that I&#8217;m trying for the moment. Rather than painstakingly picking out just the lines that changed, I take a copy of the whole main style sheet, and load that with the article <em>instead</em> of the main style sheet.  Of course, I still have to change all the lefts to rights, and vice versa, and change all the graphics, etc.  But I don&#8217;t need to add additional rules in places where I previously only specified one side margin, padding, etc. </p>
<p>We&#8217;ll see how it works out.  Of course, the big problem here is that <em>any</em> change I make to the main style sheet has to be copied to the bidi style sheet, whether it is related to direction or not. Editing in two places is definitely going to be a pain, and breaks the big advantage that style sheets usually give you of applying changes with a single edit. Hopefully, if I&#8217;m careful, CSSJanus will ease that pain a little.</p>
<p>Another significant advantage should be that the page loads faster, because you don&#8217;t have to download two style sheets and overwrite a good proportion of the main style sheet to display the page.</p>
<p>And finally, as long as I format things exactly the same way, by running a diff program I may be able to spot where I forgot to change things in a way that&#8217;s not possible with approach 1.</p>
<h2>Approach 3: using :lang and a single file</h2>
<p>On the face of it, this seems like a better approach. Basically you have a single style sheet, but when you have a pair of rules such as <code>p { margin-right: 32%; margin-left: 7.5%;}</code> you add another line that says <code>p:lang(ar) { margin-left: 32%; margin-right: 7.5%; }</code>.</p>
<p>For small style sheets, this would probably work fine, but in my case I see some cons with this approach, which is why I didn&#8217;t take it:</p>
<ol>
<li>there are so many places where these extra lines need to be added that it will make the style sheet much harder to read, and this is made worse because the <code>p:lang(ar)</code> in the example above would actually need to be <code>p:lang(ar), p:lang(he), p:lang(ur), p:lang(fa), p:lang(dv) ...</code>, which not only gets very messy, but also significantly pumps up the bandwidth and processing requirements compared with approach 2 (and not only for rtl docs).</li>
<li>you still have to add all those missing values we talked about in approach 1 that were not declared in the part of the style sheet dealing with ltr scripts</li>
<li>the list of languages could be long, since there is no way to say &#8220;make this rule work for any language with a predominantly rtl script&#8221;, and obscures those rules that really are language specific, such as for font settings, that I&#8217;d like to be able to find quickly when maintaining the style sheet</li>
<li>you really need to use the :lang() selector for this, and although it works on all recent versions of major browsers, it doesn&#8217;t work on, for example, IE6</li>
</ol>
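<p>To make the first three points more concrete, here is a sketch of what a single ltr rule might grow into. The <code>.figure</code> class is hypothetical, the list of language tags is only the partial one given above, and the reset value of 0 is an assumption.</p>
<pre>
.figure { margin-left: 32%; }

.figure:lang(ar), .figure:lang(he), .figure:lang(ur),
.figure:lang(fa), .figure:lang(dv) {
  margin-right: 32%;
  margin-left: 0;   /* assumed value, needed to undo the ltr margin */
  }
</pre>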
<p>Having said that, I may use this approach for the few things that CSSJanus can&#8217;t convert, such as flipping images. That will hopefully mean that I can produce the alternative stylesheet in approach 2 just by running through CSSJanus. (We&#8217;ll see if I&#8217;m right in the long run, but so far so good&#8230;)</p>
<h2>Approach 4: what I&#8217;d really like to do</h2>
<p>The cleanest way to reduce most of these problems would be to add some additional properties or values so that if you wanted to you could replace </p>
<p><code>p { margin-right: 32%; margin-left: 7.5%; text-align: left; }</code> </p>
<p>with</p>
<p><code>p { margin-end: 32%; margin-start: 7.5%; text-align: start; }</code></p>
<p>Where start refers to the left for ltr documents and right for rtl docs. (And end is the converse.)</p>
<p>This would mean that that one rule would work for both ltr and rtl pages and I wouldn&#8217;t have to worry about most of the above.  </p>
<p>The new properties have been strongly recommended to the CSS WG several times over recent years, but have been blocked mainly by people who fear that a proliferation of properties or values is confusing to users. There may be some issues to resolve with regards to the cascade, but I&#8217;ve never really understood why it&#8217;s so hard to use start and end. Nor have I met any users of RTL scripts (or vertical scripts, for that matter) who find using start and end more confusing than using right and left &#8211; in fact, on the contrary, the ones I have talked with are actively pushing for the introduction of start and end to make their life easier. But it seems we are currently still at an impasse.</p>
<h3>text-align</h3>
<p>Similarly, start and end values for text-align would be very useful. In fact, such values are in the CSS3 Text module and are already recognised by the latest versions of Firefox, Safari and Chrome, but unfortunately not IE8 or Opera, so I can&#8217;t really use them yet.</p>
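<p>A sketch of how the start value might be used, assuming a hypothetical <code>.note</code> class. The first declaration is there only as a fallback for browsers that don&#8217;t recognise start, and it is only appropriate for ltr content:</p>
<pre>
.note p {
  text-align: left;    /* fallback: overridden where the next line is understood */
  text-align: start;   /* left for ltr text, right for rtl text */
  }
</pre>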
<p>In my style sheet, due to some bad design on my part, what I actually needed most of the time was a value that says &#8220;turn off justify and apply the current default&#8221; &#8211; ie. align the text to left or right depending on the current direction of the text. Unfortunately, I think that we have to wait for full support of the start and end values to do that.  Applying text-align:left to unjustify, say, p elements in a particular context causes problems if some of those p elements are rtl and others ltr.  This is because, unlike mirroring margins or padding, text-align is more closely associated with the text itself than with page geometry. (I resolved this by reworking the style sheet so that I don&#8217;t need to unjustify elements, but I ought to follow <a href="http://www.w3.org/TR/2009/NOTE-i18n-html-tech-bidi-20090908/#tech-textalign">my own advice</a> more in future, and avoid using text-align unless absolutely necessary.)</p>
]]></content:encoded>
			<wfw:commentRss>http://rishida.net/blog/?feed=rss2&#038;p=671</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

