Standards and best practices for the Multilingual Web


MultilingualWeb workshop, Luxembourg: registration closing today!

Go here to register: /register

If you are still planning to register for the MultilingualWeb workshop in Luxembourg next week, this is the last day for registrations. If you don't register today you may not be able to gain access to the event location. Hope to see you there!

MultilingualWeb workshop, Luxembourg, to accept posters

If you would like to exhibit a poster during the MultilingualWeb workshop in Luxembourg, 15-16 March, please contact Manuel Tomas Carrasco Benitez and copy Richard Ishida for more details.

We made the decision to accept poster applications quite late in the process, as a result of requests from attendees. If you are interested, please make contact as soon as possible, since the deadline for registrations is now only a week away.

HTML5 adds new translate attribute

A translate attribute was recently added to HTML5. At the three MultilingualWeb workshops we have run over the past two years, the idea of this kind of ‘translate flag’ has consistently attracted strong interest from localizers, content creators, and people working with language technology. Richard Ishida has written a blog post describing how the attribute is meant to work and how Google Translate and Microsoft Translator support it. He also hints at some ways in which the translate flag could be extended.
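
To make the idea of the flag concrete, here is a minimal sketch in Python of how a tool might decide which text to send for translation. It assumes the behaviour described in the HTML specification: "yes" or an empty value enables translation, "no" disables it, and an element without the attribute inherits the state of its parent. The class and function names are made up for this illustration, translatable attributes such as title and alt are ignored, and nothing here reflects how Google Translate or Microsoft Translator actually implement their support.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they must not be pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TranslatableText(HTMLParser):
    """Hypothetical helper: collects text that is not flagged translate="no"."""

    def __init__(self):
        super().__init__()
        self._stack = [True]   # assumption: translation is enabled at the root
        self.segments = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        state = self._stack[-1]            # default: inherit from the parent
        if "translate" in attrs:
            value = (attrs["translate"] or "").lower()
            if value in ("", "yes"):
                state = True
            elif value == "no":
                state = False
            # any other value is invalid, so the inherited state is kept
        if tag not in VOID:
            self._stack.append(state)

    def handle_endtag(self, tag):
        # Assumes well-nested markup; mis-nested tags are not repaired here.
        if tag not in VOID and len(self._stack) > 1:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack[-1]:
            self.segments.append(text)


def translatable_segments(html):
    parser = TranslatableText()
    parser.feed(html)
    parser.close()
    return parser.segments


sample = ('<p>Click <code translate="no">git push</code> to '
          '<span translate="no">Save <b translate="yes">now</b></span></p>')
print(translatable_segments(sample))
```

In the sample, the command name and the word "Save" are protected, while the nested element marked translate="yes" switches translation back on for its own content.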

META-NET and Lionbridge sponsor MultilingualWeb workshop in Limerick

META-NET and Lionbridge are sponsoring the 3rd workshop in the MultilingualWeb series, which will be held in Limerick, Ireland on 21-22 September 2011. (See the Call for Participation and the recently published Program.)

If your organization would like to also sponsor the workshop, see how to apply. The deadline for sponsorship proposals for Limerick is 7 September 2011.

FLaReNet is also endorsing the workshop.

Freebie for Anyone who Likes the XLIFF Standard

Both the Madrid and the Pisa workshops of the Thematic Network “Multilingual Web” mentioned the XML Localization Interchange File Format (XLIFF) as a central component of streamlined localization processes. XLIFF came up in several of the presentations.

Just in time for the Second International XLIFF Symposium, the magazine MultiLingual has now published the article Insights into the Future of XLIFF. The magazine has also made an offer for anyone interested in XLIFF: a free digital-only subscription, or a print-with-digital subscription for a reduced fee (details of the offer).

The article answers three different questions related to XLIFF:

  1. What is the status quo related to implementation and adoption?
  2. Which ideas or recommendations exist for enhancements or future versions?
  3. Which general observations can be made?

The answers are based on the contributions to the First International XLIFF Symposium in Limerick and on discussions in the XLIFF Technical Committee related to those contributions.

MultilingualWeb site now available in Gaeilge

Given that the next workshop is in Limerick, we have translated the MultilingualWeb site into Irish.

A few user interface terms are still pending translation. As with all of the other languages, the reports, program, call for participation, etc. remain in English (mostly because we don't have the resources to translate them, and partly because the workshop is held in English). But a large amount of the text on the site, as well as the navigation, is now in Irish.

In addition to Irish, we have translated the site into Spanish, German, French, Italian and Romanian.

There are also two widgets at the bottom of each page, one from Microsoft and one from Google, that allow you to get gist translations of parts of the site that are not translated, or get gist translations into many other languages.

(For more information about the Limerick workshop, see /register)

Takeaways from the first two W3C Workshops “Multilingual Web” related to W3C ITS

The first two events of the Thematic Network “Multilingual Web” provided several opportunities to share information on the W3C Internationalization Tag Set (ITS). Presentations in which ITS was mentioned included:

a. Best Practices and Standards for Improving Globalization-related Processes

b. W3C Internationalization Tag Set (ITS)

c. The Bricks to Build Tomorrow's Translation Technologies and Processes

d. Using ITS in the Common Content Formats

The workshop in Pisa in particular prompted several interesting ITS-related thoughts:

1. Several speakers mentioned that it would be good if content could be categorized in a standard way as "Generated by Machine Translation (MT)". I guess there are various ways of looking at this from an ITS point of view:

  • a. an additional data category with semantics such as "generatedBy"
  • b. via a special, BCP47-compliant value for the existing ITS data category "Language Information"; that special value may actually be a composite one, since there may be a need to capture things like the following
    1. Name of the MT system that generated the content
    2. Quality of the input
    3. (Semi-)official quality rating of the system (BLEU score or the like)

2. Several speakers explained that it would be good if content could be categorized in a standard way as "OK to be submitted to Natural Language Processing (NLP)". Example: the Web is deemed to be an invaluable resource for building models for statistical Machine Translation. However, there seems to be some uncertainty about whether this use of Web-based content is permitted. A standardized categorization could help. I guess there are various ways of looking at this from an ITS point of view:

  • a. an additional data category with semantics such as "nlpOK"
  • b. something similar to the existing ITS data category "Localization Note" (namely one that captures information for machine processing, not for human consumption; see the discussion).

3. Charles McCathieNevile mentioned the addition of the notion of a default locale to the Widget Packaging and Configuration specification. This made me wonder whether "defaultLocale" might be useful in quite a number of contexts, and would thus be a candidate for an additional ITS data category. The Widget document also prompted another localization-related thought, namely that it should be required reading for anyone who works on standardized packaging for translation-related processes.
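
As a rough illustration of option b under point 1, the sketch below (in Python) builds a composite language value using BCP 47 private-use subtags, which follow an "x" singleton and must each be one to eight ASCII letters or digits. The layout chosen here ("mt", then a system name, then an optional "bleuNN" score) is purely an assumption for this example and is not defined by ITS or any other standard; the function name and the system name "acme" are likewise made up. The sketch also assumes the base tag has no private-use part of its own.

```python
import re

# BCP 47 private-use subtags: 1 to 8 ASCII letters or digits each.
_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def mt_language_tag(base, system, bleu=None):
    """Hypothetical helper: append MT provenance to a language tag.

    base   -- an existing language tag without private-use part, e.g. "de"
    system -- short name of the MT system (assumed to fit in one subtag)
    bleu   -- optional integer quality score, encoded as "bleuNN"
    """
    parts = ["mt", system]
    if bleu is not None:
        parts.append(f"bleu{bleu:d}")
    for part in parts:
        if not _SUBTAG.fullmatch(part):
            raise ValueError(f"not a valid private-use subtag: {part!r}")
    return f"{base}-x-" + "-".join(parts).lower()


print(mt_language_tag("de", "acme", 32))
```

A consumer that does not know this convention can still match on the "de" prefix, which is one argument for the private-use approach. Whether a dedicated data category would be a better fit is exactly the open question raised above.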

P.S.: The above is similar to a post to the mailing list for the W3C ITS Interest Group.

Slides and IRC log for Pisa now available

By all accounts, the MultilingualWeb Workshop in Pisa proved to be as popular as its predecessor in Madrid, thanks to the efforts of the many excellent speakers and the local organizers. Once again, we had around 100 attendees and 33 speakers. The program page has now been updated to point to speakers' slides and to the relevant part of the IRC log. Links to video recordings will follow in about a week's time.

There is also a page pointing to social media reports, such as blog posts, tweets and photos, related to the workshop. If you have other blog posts, photos, etc. online, please let Richard Ishida know so that we can link to them from this page.

A summary report of the workshop will be produced shortly.

Initial draft page of links to slides and IRC log available

The MultilingualWeb Workshop in Madrid appears to have been a great success, thanks to the efforts of the many excellent speakers. As a first step in reporting the workshop, a page of links is now available that points to speakers' slides and to the relevant part of the IRC log. It also points to blog posts, tweets and photos related to the workshop.

We are still missing a small number of slide sets, and those will be added as speakers provide them.

If you know of other social media references to the workshop, please inform Richard Ishida so that they can be added to the page.

More information about the workshop will be disseminated shortly.

MultilingualWeb Workshop, Madrid. Initial program published.

A first view of the workshop program has just been published. Speakers represent a wide range of organizations and interests, such as:

BBC, DFKI, European Commission, Facebook, Google, Loquendo, LRC, Microsoft, Mozilla, Opera, SAP, W3C, WWW Foundation, and more.

Session titles include: Developers, Creators, Localizers, Machines, and Users. The workshop should provide useful cross-domain networking opportunities.

The first workshop takes place in Madrid on 26-27 October 2010. It is free and open to the public.

If you are interested in attending the workshop, see the Call for Participation for details on how to register.