I came across an interesting iPhone app today: Word Lens. It’s an augmented reality application that finds text in the iPhone’s camera view, runs optical character recognition (OCR) on it, passes the result through machine translation (MT), then removes the original text and renders the translation in its place. I can’t explain the basic premise better than the video:
This is an excellent example of modern computer science. Individually, the methods they use have been around for a while, but they put together several different tasks and made it work on reasonable hardware. I applaud them for creating such a nice integration of OCR and MT along with the graphics work to render the final video stream.
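To make the integration concrete, here is a minimal sketch of how the stages might chain together for a single frame. It is not Word Lens’s actual code: the `TextRegion` type, the function names, and the tiny `ES_EN` dictionary are all made up for illustration, the OCR stage is assumed to have already produced the recognized text, and the "translation" is a toy word-by-word lookup rather than real MT.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TextRegion:
    """One piece of text found in a frame (hypothetical type)."""
    box: Tuple[int, int, int, int]  # (x, y, width, height) in frame coordinates
    text: str                       # text as recognized by the OCR stage


def translate_word_by_word(text: str, dictionary: Dict[str, str]) -> str:
    """Toy MT stage: look each word up; unknown words pass through unchanged."""
    return " ".join(dictionary.get(word.lower(), word) for word in text.split())


def process_frame(regions: List[TextRegion], dictionary: Dict[str, str]) -> List[TextRegion]:
    """Swap each region's text for its translation, keeping the same box
    so a renderer could draw the translation where the original was."""
    return [
        TextRegion(region.box, translate_word_by_word(region.text, dictionary))
        for region in regions
    ]


# Hypothetical three-word dictionary, just enough for the example.
ES_EN = {"salida": "exit", "de": "of", "emergencia": "emergency"}

frame = [TextRegion((10, 20, 200, 40), "SALIDA DE EMERGENCIA")]
for region in process_frame(frame, ES_EN):
    print(region.box, region.text)
```

The point of the sketch is the shape of the data flow: each stage is simple on its own, and the hard part is making all of them run fast enough on a phone to keep up with the video stream.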
Of course, such a system isn’t perfect. It only works on plain printed text. If the contrast between the text and its background is too low, or if the text sits on a patterned or image background, it doesn’t recognize the text. Stylized fonts and handwriting aren’t supported, which probably means that it’s matching against a database of known fonts.
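As a rough illustration of the contrast limitation, a recognizer could measure the contrast of a grayscale patch and skip anything below a cutoff. The sketch below uses Michelson contrast, which is a standard measure, but the `MIN_CONTRAST` value and the `is_readable` helper are assumptions for the example; I don’t know what test the app really applies.

```python
from typing import Sequence

# Hypothetical cutoff; a real recognizer would tune this empirically.
MIN_CONTRAST = 0.3


def michelson_contrast(pixels: Sequence[int]) -> float:
    """Michelson contrast of a grayscale patch: (max - min) / (max + min).

    Pixel values are assumed to be non-negative intensities (e.g. 0-255).
    Returns 0.0 for an all-black patch to avoid dividing by zero.
    """
    lo, hi = min(pixels), max(pixels)
    if hi + lo == 0:
        return 0.0
    return (hi - lo) / (hi + lo)


def is_readable(pixels: Sequence[int], threshold: float = MIN_CONTRAST) -> bool:
    """True if the patch has enough contrast to be worth passing to OCR."""
    return michelson_contrast(pixels) >= threshold


dark_text_on_light = [20, 25, 230, 228, 22, 231]   # strong contrast
gray_text_on_gray = [120, 125, 140, 138, 122, 139]  # weak contrast

print(is_readable(dark_text_on_light))
print(is_readable(gray_text_on_gray))
```

A single min/max measure like this also hints at why patterned backgrounds are hard: the pattern itself has plenty of contrast, so a simple cutoff can’t tell it apart from text.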
As an application, it puts some of the burden on the user. The user must specify the input and output languages: the application doesn’t detect the source language from the image or take the target language from the phone’s settings. (OCR is more difficult if you don’t know which language model to use.) Additionally, you must specify the orientation of the phone relative to the text (landscape or portrait).
Currently, the only supported conversions are Spanish to English and English to Spanish. I’d guess that they’re waiting to see if it’s successful before putting in the effort to implement other languages.
Also, if you have unsteady hands or the image quality is poor (say, in low lighting), recognition flickers: a word may be detected in one frame and missed in the next, and different words flicker on and off independently of each other. As a result, it can take some concentration to use in difficult environments.
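One common way to soften this kind of on-and-off recognition is to keep showing the last good result for a few frames after a miss. The sketch below shows that idea for a single word. It is purely my own illustration, not something the app is known to do, and the `HoldLastResult` class and its `hold_frames` parameter are hypothetical names.

```python
from typing import Optional


class HoldLastResult:
    """Keep the last recognized text alive across a few missed frames."""

    def __init__(self, hold_frames: int = 3) -> None:
        self.hold_frames = hold_frames       # misses tolerated before giving up
        self.last_text: Optional[str] = None
        self.misses = 0

    def update(self, recognized: Optional[str]) -> Optional[str]:
        """Feed one frame's result (None means OCR missed the word).

        Returns the text to display for this frame, or None if the word
        has been missing for more than `hold_frames` consecutive frames.
        """
        if recognized is not None:
            self.last_text = recognized
            self.misses = 0
        else:
            self.misses += 1
            if self.misses > self.hold_frames:
                self.last_text = None
        return self.last_text


# Simulated per-frame OCR results for one word: hits and misses interleaved.
frames = ["exit", None, "exit", None, None, None, None]
smoother = HoldLastResult(hold_frames=3)
shown = [smoother.update(result) for result in frames]
print(shown)
```

The trade-off is latency: the longer a result is held, the steadier the display, but the longer a stale translation lingers after the camera has moved away from the text.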
The base application is free, but only contains two demos — reversing letters and removing the text. You can purchase Spanish to English or English to Spanish packages for $5 each. This setup gives you a chance to experiment with their OCR for free and judge whether it’ll work or not before coughing up the cash.
The two language packs are currently 50% off until Dec. 31. I’m guessing the $5 price is after the 50% off.
Is it worth $5-10 to convert from Spanish to English for vacation/business? If you plan on exploring towns or sights I’d say absolutely. I’ll admit I’m a little biased though — $5 is nothing compared to airfare and I’d get much more entertainment out of it than half of a movie (which is about $10 these days).
I would’ve paid $5-10 for augmented reality French to English the last time I went to Montreal. Although most people speak English, translation for menus and signs would be useful. The downside is that menus in nice places are more likely to use stylized fonts (and thus wouldn’t be translated).
This is one of those applications with a very bright future. If the software has a chance to mature and expand to more languages, it will become a mainstay for travelers.
- Big thanks to Praveen for trying it out and helping.
- I can’t really evaluate the translation quality. If it’s like most MT systems, I’d expect good Spanish to English translation. Also, I’d expect that sometimes it’ll mess up, but that it’s close enough to guess the meaning.