Efficiently localizing user interfaces is an age-old problem that has haunted
programmers since the early days of software development. Many tools and
techniques have been employed over the years for this with differing levels of
success by organizations across the world.
A few years ago, stakeholders from various areas of work came together in the
Unicode Consortium, bringing along their tools and knowledge, to build a
definitive system that could serve as a standard solution to these problems.
The first part of this design has taken shape as “MessageFormat 2”.
What is MessageFormat 2 like? How does it approach the vast problem space and
how exactly could it be adopted across various user interfaces? What further
tools and standards are already being developed on top of it? Join us in this
session to answer these questions and find out what the future of localization
will look like.
(c) FOSDEM 2025
1 & 2 February 2025
https://fosdem.org/2025/schedule/event/fosdem-2025-5561-solving-the-world-s-localization-problems/
Size: 4.65 MB
Language: en
Added: Mar 11, 2025
Slides: 39 pages
Slide Content
Solving the world’s
(localization) problems
Eemeli Aro (Mozilla), Ujjwal Sharma (Igalia)
Ujjwal Sharma
from New Delhi, India
based out of A Coruña, Galiza
OSS zealot, open web maximalist
love dogs, (masochistic) videogames
work at Igalia
Eemeli Aro
● From Helsinki, Finland
● Staff Software Engineer at Mozilla
● Maintainer of messageformat and yaml on npm
● Working on localization since 2012
What's wrong with
current solutions?
A world of silos and monoliths
- L10n frameworks are primarily chosen based on their developer front-end
- The message format/syntax is incidental
- Each framework provides all the answers
- Solution needs to get picked early (often doesn't…) and changing to
  another can have a really high cost
A world of limitations
- Dynamic messages vary on many aspects of language
  - Plural case
  - Grammatical gender
  - Personal gender
  - Vowel sounds: English a/an, French le/l'
  - Prepositions: in a car, on a bus
- Messages often vary in more than one dimension
- Variance depends on language
  - English he/she vs. Finnish hän (but oh so many suffixes)
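
To make the "more than one dimension" point concrete, here is a minimal TypeScript sketch of a message that varies on both plural case and personal gender. The `Variant` type, the `invitedVariants` table, and `formatInvite` are hypothetical names invented for this illustration; they are not part of MessageFormat 2 or of any library. The only real API used is `Intl.PluralRules`. The first-match lookup is a simplification and is not meant to reproduce the variant selection rules of the MF2 specification.

```typescript
// Hypothetical types and data, for illustration only.
type Gender = "feminine" | "masculine" | "other";

interface Variant {
  // One key per selector: [plural category, gender]. "*" is a catch-all.
  keys: [string, string];
  text: string;
}

// English needs 2 x 3 variants here; other languages may need more or fewer.
const invitedVariants: Variant[] = [
  { keys: ["one", "feminine"], text: "{name} invited {count} person to her party." },
  { keys: ["one", "masculine"], text: "{name} invited {count} person to his party." },
  { keys: ["one", "*"], text: "{name} invited {count} person to their party." },
  { keys: ["*", "feminine"], text: "{name} invited {count} people to her party." },
  { keys: ["*", "masculine"], text: "{name} invited {count} people to his party." },
  { keys: ["*", "*"], text: "{name} invited {count} people to their party." },
];

function formatInvite(locale: string, name: string, gender: Gender, count: number): string {
  // Intl.PluralRules maps a number to a plural category such as "one" or "other".
  const plural = new Intl.PluralRules(locale).select(count);

  // Simplified selection: take the first variant whose keys all match.
  const match = invitedVariants.find(
    (v) =>
      (v.keys[0] === plural || v.keys[0] === "*") &&
      (v.keys[1] === gender || v.keys[1] === "*"),
  );
  if (!match) throw new Error("No matching variant");

  // Function replacers avoid special handling of "$" sequences in the values.
  return match.text
    .replace("{name}", () => name)
    .replace("{count}", () => String(count));
}
```

For example, `formatInvite("en", "Ana", "feminine", 1)` picks the `["one", "feminine"]` variant, while a count of 3 with gender `"other"` falls through to the `["*", "*"]` catch-all. The table only covers English; a translator for another language would supply a different set of keys, which is why the variance has to live in the message data rather than in application code.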
Inflection is hard.
Even in English.
Variance is
multi-dimensional
Solving the problem for the web
- Explicitly identified as an interesting problem to solve in 2013, but no
  sufficiently good format was identified then or later.
- In 2019, TC39-TG2 formed the Message Format Working Group (MFWG) to define
  a new format that could be made available in JS via Intl.MessageFormat.
- The WG moved under Unicode CLDR as a more appropriate host – a solution
  for JS should be good for everyone else as well.
- After many meetings over five years, we think we're done.
Standards develop
very slowly.
Intl.MessageFormat was first proposed in 2013.
The world beyond a single message
- Syntax & data model for a single message is good, but how do we put
  together multiple messages?
- We need a new resource file format.
- Also, a metadata language – think JavaDoc/JSDoc for localization
  - @locale
  - @param
  - @allow-empty
  - …
A world of interoperability
- The message data model is not meant to be an abstract thing, but a tool
  to be used
- This makes it possible to compare and convert messages across all
  formats
  - npm: messageformat, @messageformat/fluent,
    @messageformat/icu-messageformat-1
  - python: moz.l10n
- A better translation memory?
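
As a rough illustration of using a data model as a conversion tool, the TypeScript sketch below parses a string with simple `{name}` placeholders into a tiny pattern model and then serialises that model as an MF2-style simple message. The `Pattern` type and all function names are hypothetical and far simpler than the real MF2 data model or the packages listed above; the parser deliberately ignores apostrophe quoting, plurals, and selects.

```typescript
// Hypothetical, heavily simplified model: a pattern is a sequence of
// literal text and variable references.
type PatternPart = string | { type: "variable"; name: string };
type Pattern = PatternPart[];

// Parse a source string with simple `{name}` placeholders.
// Assumption: no apostrophe quoting, no plural/select arguments.
function parseSimplePlaceholders(source: string): Pattern {
  const pattern: Pattern = [];
  const re = /\{(\w+)\}/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    if (m.index > last) pattern.push(source.slice(last, m.index));
    pattern.push({ type: "variable", name: m[1] });
    last = m.index + m[0].length;
  }
  if (last < source.length) pattern.push(source.slice(last));
  return pattern;
}

// Serialise the model as an MF2-style simple message: variables become
// `{$name}`, and literal `\`, `{`, `}` in text are backslash-escaped.
function toMF2(pattern: Pattern): string {
  return pattern
    .map((part) =>
      typeof part === "string"
        ? part.replace(/[\\{}]/g, "\\$&")
        : "{$" + part.name + "}",
    )
    .join("");
}

// Format the model directly, independent of either source syntax.
function formatPattern(pattern: Pattern, values: Record<string, string>): string {
  return pattern
    .map((part) => (typeof part === "string" ? part : values[part.name] ?? `{${part.name}}`))
    .join("");
}
```

With these definitions, `toMF2(parseSimplePlaceholders("Hello, {name}!"))` yields `Hello, {$name}!`. Because both steps go through the same model, two messages written in different syntaxes can also be compared structurally instead of as strings, which is the kind of comparison a translation memory could build on.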
We're providing you with building blocks
- Your favourite L10n framework probably doesn't support MF2 yet.
- The tools you need to adopt MF2 probably aren't there yet.
- This is by design: We are not presuming to solve all the problems at
  once, and we need your help.
- A key thought: Translatable human messages are not really that
  complex, and the MF2 data model can represent all of them.
- MF2 isn't going to replace your current framework; it's trying to make it
  better, and make it less of a silo.
Supporting localization in HTML
- Let's make localization declarative, and so web-native that you don't
  need JavaScript to make it work.
- Declare in HTML your MF2 resources with <link rel="messages">, and
  use them as <span msg="msg-id"></span>.
- This does depend on a message resource spec, and on the JS
  Intl.MessageFormat spec.
You should tell us if we're wrong
- The 2.0 version of the spec is currently a Final Candidate, and it'll be
  finalized within a month or so.
- If you think we're wrong about some part of this, you should tell us
  ASAP, or we'll likely be stuck with our mistakes for the next decade or
  three.