Computer Science Department
School of Computer Science, Carnegie Mellon University


Text Detection and Translation from Natural Scenes

Jiang Gao, Jie Yang, Ying Zhang, and Alex Waibel

June 2001
(color images)

Keywords: Text detection, machine translation, OCR, intelligent interface, multimedia system

We present a system for automatic extraction and interpretation of signs from natural scenes. The system can capture images, detect and recognize signs, and translate them into a target language. The translation can be displayed on a hand-held wearable display or a head-mounted display, or synthesized as a voice message delivered over earphones. We address the challenges of automatic sign extraction and translation: we describe methods for automatic sign extraction and extend example-based machine translation technology to sign translation. System development follows a user-centered approach that draws on human intelligence when needed and leverages human capabilities. We are currently working on Chinese sign translation. We have developed a prototype system that recognizes Chinese signs captured by a video camera, a common gadget for tourists, and translates them into either English text or a voice stream. We have built a database of about 800 Chinese signs for development and evaluation. We present evaluation results and analyze errors. Sign translation, in conjunction with spoken language translation, can help international tourists overcome language barriers. The technology can also help visually handicapped people increase their environmental awareness.

24 pages
