Portable Optical Character to Braille Translator with Voice- Over “Providing vision to the visually challenged” EE382N: Advanced Embedded Systems Architecture Project Presentation Karthik Ramana Sankar Prasad Vidhyabhaskaran Tanvi Joshi

Page 1

Portable Optical Character to Braille Translator with Voice-Over

“Providing vision to the visually challenged”

EE382N: Advanced Embedded Systems Architecture
Project Presentation

Karthik Ramana Sankar
Prasad Vidhyabhaskaran
Tanvi Joshi

Page 2

Current State of the Art in Braille Translators

• Connect to a PC over a USB interface and translate keystrokes and on-screen text into Braille characters
• These displays work by raising dots through holes, as seen in the picture below
• Current Braille displays in market:
  • Very expensive (around $5000 for the hardware and $1000 for the software; possibly because of patents)
  • Not portable
  • Require a PC for operation

Page 3

Our Product Idea

• A portable Braille translator – a power efficient small device (the size of a smart phone) with Braille display and audio system
• Users can read or listen to any text from printed or electronic source, without a PC
• A system that can perform Real-time Image processing and give feedback to users
• Use open source software (like OpenCV) to reduce cost

Page 4

High level Block diagram

[Block diagram. Central block: ARM 926EJ-S or SoC (ARM + DSP cores). Labelled blocks around it:]

• Pot-meter to adjust the output speed
• ADC interface
• User input through push-button switches
• Braille output
• SD Card (file system)
• SPI Interface
• USB Interface to Camera/Scanner
• AC97 Codec interface for Voice Out

Page 5

Optical Character Recognition: OpenCV

• Open Source Computer Vision
• Library of image processing functions
• Usage well documented
• Lots of tutorials
  • OCR tutorial (Damiles blog)

Page 6

Basic steps in OCR

• Preprocess
  • Normalize size
  • Normalize contrast/color
• Train/Evaluate
  • Use K-Nearest Neighbor algorithm
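The "Normalize contrast" preprocessing step could look like the following min-max stretch over an 8-bit grayscale buffer. This is a minimal sketch, not the project's code: the slides do not say which normalization was used, and the function name `normalize_contrast` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the slides): stretch an 8-bit grayscale
 * image in place so its darkest pixel maps to 0 and its brightest to 255.
 * A flat image (all pixels equal) is left unchanged. */
void normalize_contrast(uint8_t *pixels, size_t count)
{
    if (pixels == NULL || count == 0)
        return;

    uint8_t lo = pixels[0], hi = pixels[0];
    for (size_t i = 1; i < count; i++) {
        if (pixels[i] < lo) lo = pixels[i];
        if (pixels[i] > hi) hi = pixels[i];
    }
    if (hi == lo)
        return; /* no contrast to stretch */

    unsigned range = (unsigned)(hi - lo);
    for (size_t i = 0; i < count; i++) {
        /* Integer scaling, rounded to nearest. */
        unsigned v = (unsigned)(pixels[i] - lo);
        pixels[i] = (uint8_t)((v * 255u + range / 2u) / range);
    }
}
```

Integer arithmetic is used here because the target core is described later in the deck as lacking an FPU.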

Page 7

Port to ARM

• Build OpenCV
• Write friendly code
• Cross compile

Page 8

Build OpenCV

• Host machine
  • PC Linux
• Specify toolchain to be used for cross compilation
  • Target machine
  • Target Kernel
  • Path to compiler, assemblers etc
  • CodeSourcery tools for ARM EABI
• Disable unwanted components
• Watch out for compiler restrictions

Page 9

OCR code and drivers

• Avoided usage of certain libraries
  • Ulib
    • Used for rapid development of applications
    • Windows, mouse interactions etc.
• Drivers to communicate result to LCD and audio codec
• Cross compile
  • ‘-static’ flag to prevent linking with shared libraries

Page 10

LCD driver and state machine

• Substitute for Braille display
• LCD displays number and the corresponding Braille symbol
• Driver sends custom character code
• Verilog FSM

Page 11

OCR Tests

Page 12

OCR Results

Page 13

OCR Results

Page 14

OCR Results

• Object file size: 2.3 MB
• Typical time required for one classification: ~0.6 s

Page 15

Analog Devices AD1981BL

Ref: EE382N-4 Lecture 1

Page 16

HW Design details

Ref: EE382N-4 Lecture 1

Page 17

AC97 Codec

• Audio codec standard developed by Intel, 1997
• Used in PC motherboards, modems, sound cards
• Features:
  • AC link: Digital link between controller and audio Codec
  • Bidirectional serial digital stream at 12.288 MHz
  • TDM on data lines: 256-bit wide frames @ 48 kHz, i.e. frame duration = 20.8 microseconds
  • Fixed rate of 48 kHz or variable rate, variable sample size: 20b / 16b
  • Sample rate conversions in controller or in software driver

Ref: Audio Codec ‘97 , Rev. 2.3, Intel
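The frame figures on this slide can be cross-checked with a few lines of integer arithmetic. The sketch below takes the 12.288 MHz bit clock and 256-bit frame from the slide. The split into one 16-bit tag slot plus twelve 20-bit slots comes from the AC '97 specification rather than from the slides. All names are hypothetical.

```c
#include <stdint.h>

#define AC97_BIT_CLK_HZ  12288000u /* serial bit clock (from the slide) */
#define AC97_FRAME_BITS  256u      /* bits per AC-link frame (from the slide) */
#define AC97_TAG_BITS    16u       /* slot 0 width, per the AC '97 spec */
#define AC97_SLOT_BITS   20u       /* slots 1..12 width, per the AC '97 spec */
#define AC97_DATA_SLOTS  12u

/* Frames per second: bit clock divided by frame length. */
uint32_t ac97_frame_rate_hz(void)
{
    return AC97_BIT_CLK_HZ / AC97_FRAME_BITS;
}

/* Frame duration in nanoseconds (truncated toward zero). */
uint32_t ac97_frame_duration_ns(void)
{
    return (uint32_t)(1000000000ull * AC97_FRAME_BITS / AC97_BIT_CLK_HZ);
}

/* Frame length rebuilt from the slot widths; should equal AC97_FRAME_BITS. */
uint32_t ac97_frame_bits_from_slots(void)
{
    return AC97_TAG_BITS + AC97_DATA_SLOTS * AC97_SLOT_BITS;
}
```

The three results are 48000 frames per second, 20833 ns per frame (the 20.8 microseconds quoted above), and 256 bits.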

Page 18

AD1981BL Functional Block Diagram

Page 19

AC-Link Output Frame

Slot 0: TAG / Codec ID
Slot 1: Command Address Port
Slot 2: Command Data Port
Slot 3: PCM Playback Left Channel
Slot 4: PCM Playback Right Channel
Slot 5: Modem Line 1 Output Channel
Slot 6: PCM Center DAC
Slot 7: PCM L Surround DAC
Slot 8: PCM R Surround DAC
Slot 9: PCM LFE DAC
Slot 10: Modem Line 2 Output Channel
Slot 11: Modem Handset Output Channel
Slot 12: Modem GPIO Control Channel

Page 20

AC-Link Input Frame

Slot 0: TAG
Slot 1: Status Address Port
Slot 2: Status Data Port
Slot 3: PCM Record Left Channel
Slot 4: PCM Record Right Channel
Slot 5: Modem Line 1 ADC
Slot 6: Dedicated Microphone Record Data
Slots 7-9: Vendor Reserved
Slot 10: Modem Line 2 ADC
Slot 11: Modem Handset ADC
Slot 12: Modem GPIO Status
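To show how the slot list is used, here is a minimal sketch of building the slot 0 TAG word of an output frame. The slides only list the slot assignments. The bit layout assumed here (bit 15 = frame valid, bit 15 - n = slot n valid for slots 1 to 12) is taken from the AC '97 specification, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Output-frame slots used by this project (numbers from the slide). */
enum ac97_out_slot {
    AC97_SLOT_TAG       = 0,
    AC97_SLOT_CMD_ADDR  = 1,
    AC97_SLOT_CMD_DATA  = 2,
    AC97_SLOT_PCM_LEFT  = 3,
    AC97_SLOT_PCM_RIGHT = 4
};

#define AC97_TAG_FRAME_VALID 0x8000u /* bit 15 of the 16-bit tag */

/* Valid bit for slot n (1..12) sits at tag bit (15 - n); 0 for other slots. */
uint16_t ac97_tag_slot_bit(unsigned slot)
{
    if (slot < 1u || slot > 12u)
        return 0;
    return (uint16_t)(1u << (15u - slot));
}

/* Tag for a frame carrying left and right playback samples. */
uint16_t ac97_playback_tag(void)
{
    return (uint16_t)(AC97_TAG_FRAME_VALID
                      | ac97_tag_slot_bit(AC97_SLOT_PCM_LEFT)
                      | ac97_tag_slot_bit(AC97_SLOT_PCM_RIGHT));
}

/* Tag for a frame carrying a codec register write (address + data). */
uint16_t ac97_reg_write_tag(void)
{
    return (uint16_t)(AC97_TAG_FRAME_VALID
                      | ac97_tag_slot_bit(AC97_SLOT_CMD_ADDR)
                      | ac97_tag_slot_bit(AC97_SLOT_CMD_DATA));
}
```

Under these assumptions a stereo playback frame carries tag 0x9800 and a register-write frame carries tag 0xE000.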

Page 21

AC97 Controller IP Core

• Source: OpenCores.org
• Author: Rudolf Usselmann
• Features:
  • 6 output, 3 input channels
    • Used: 2 Output channels (Main L/R)
  • Variable sample rate up to 48kHz
    • Used: Fixed rate of 48kHz
  • External DMA Engine Support
    • Not used. Future use for high speed audio transfer

Page 22

AC97 Controller IP Core (Contd..)

• Configuration through ac97_defines.v
  • Comment out extra channels. Main L/R channels selected by default
  • Output FIFO depth of 16 selected
  • In-fifo unit removed by commenting out the AC_SIN define macro

Page 23

AC97 Controller IP Core (Contd..)

[Figure: Original Architecture vs. Modified Architecture. Labels recovered from the diagram: Top Level Wrapper, Driver Interface, ac97_top2x]

Page 24

AC97 Controller IP Core (Contd..)

• ‘ac97_top’ block instantiated inside top-level wrapper
• Wishbone interface replaced by adding logic in the top-level wrapper
  • Control and data for configuring codec registers (ac_97rf module)
  • Control and data for output channel FIFOs (ac97_out_fifo module)
• Driver interface implemented as a state machine (Similar to Lab2)

Page 25

AC97 Controller IP Core (Contd..)

• Commands received from driver:
  • Read codec register
  • Write to codec register
  • Playback sound file
• Ideal case:
  • Playback command followed by sample wise transfer of 20-bit PCM audio data
  • 24MHz handshaking (as_strobe, dtack) takes 9 cycles per sample if state machine is not stalled
  • Data written to out FIFO in every next clock
  • Serial data streaming to Codec at fixed rate: 48kHz
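A quick budget check suggests why the ideal case is feasible: at 24 MHz there are 500 clock cycles between consecutive 48 kHz samples, far more than the 9-cycle handshake. The sketch below does that arithmetic. The function names are hypothetical, and the two-channel figure assumes the left and right samples are each transferred with their own 9-cycle handshake, which the slides do not state.

```c
#include <stdint.h>

/* Clock cycles available between two consecutive audio samples. */
uint32_t cycles_per_sample_period(uint32_t clk_hz, uint32_t sample_rate_hz)
{
    return sample_rate_hz ? clk_hz / sample_rate_hz : 0;
}

/* Cycles left over per sample period after the driver handshakes.
 * Negative means the handshake cannot keep up with the stream. */
int32_t handshake_headroom(uint32_t clk_hz, uint32_t sample_rate_hz,
                           uint32_t cycles_per_transfer, uint32_t channels)
{
    uint32_t budget = cycles_per_sample_period(clk_hz, sample_rate_hz);
    uint32_t used = cycles_per_transfer * channels;
    return (int32_t)budget - (int32_t)used;
}
```

With the slide's numbers, `handshake_headroom(24000000, 48000, 9, 2)` leaves 482 of the 500 cycles unused, as long as the state machine is not stalled.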

Page 26

AC97 Controller IP Core (Contd..)

• Current Implementation:
  • Playback command indicates tone frequency depending upon detected character
  • Square wave of the detected frequency generated and then sent out for streaming
  • Playback for fixed amount of time and then ‘dtack’ asserted
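The tone scheme above is implemented in hardware in the project. As a software illustration only, square-wave samples for a 48 kHz stream could be generated as below. The digit-to-frequency mapping in `tone_for_digit` is invented for this example, since the slides do not give the actual frequencies, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_RATE_HZ 48000u
#define PCM20_MAX      524287 /* largest signed 20-bit value, 2^19 - 1 */

/* Hypothetical mapping: digit 0..9 -> 400..1300 Hz in 100 Hz steps. */
uint32_t tone_for_digit(unsigned digit)
{
    return 400u + 100u * (digit % 10u);
}

/* Fill buf with `count` square-wave samples at freq_hz.
 * Samples are signed 20-bit values held in int32_t.
 * Returns the number of samples written, or 0 on invalid arguments. */
size_t square_wave(int32_t *buf, size_t count, uint32_t freq_hz,
                   int32_t amplitude)
{
    if (buf == NULL || freq_hz == 0 || freq_hz > SAMPLE_RATE_HZ / 2u)
        return 0;
    if (amplitude < 0) amplitude = 0;
    if (amplitude > PCM20_MAX) amplitude = PCM20_MAX;

    /* Phase accumulator: one full cycle spans SAMPLE_RATE_HZ units. */
    uint32_t phase = 0;
    for (size_t i = 0; i < count; i++) {
        buf[i] = (phase < SAMPLE_RATE_HZ / 2u) ? amplitude : -amplitude;
        phase += freq_hz;
        if (phase >= SAMPLE_RATE_HZ)
            phase -= SAMPLE_RATE_HZ;
    }
    return count;
}
```

Playing a tone for a fixed amount of time then amounts to generating a fixed number of samples, for example 4800 samples for 100 ms.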

Page 27

Future Work

• OpenCV – OCR
  • Process real world images and provide audio instructions to users. Example: Traffic lights at road intersections
• AC97 Codec
  • Explore the power-down feature of the AC97 controller, which puts the Codec into sleep mode. Power efficiency is very important for battery operated portable devices
• Develop other interfaces to ARM9
  • Interface a Camera/Scanner via a USB interface
  • Develop interface to motor/solenoid driver chips for driving the braille actuators
  • Interface SD card using SPI and develop a file system on ARM9

Page 28

Questions?????

Page 29

Back Up

Page 30

Sources

• CodeSourcery G++ Lite 2010q1-88 for ARM EABI
  • http://www.codesourcery.com/sgpp/lite/arm/portal/release1294
• OpenCV 1.1pre1
  • http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/
• OpenCV tutorials
  • http://opencv.willowgarage.com/wiki/
  • http://blog.damiles.com/

Page 31

KNN

• Two phases of operation
  1. Train with different variants of the character set
  2. Evaluate
    • Find ‘k’ nearest neighbors
      • Distance in multidimensional space
    • Majority class wins
• Simple algorithm, fast computation, relies on memory
• Widely used for OCR
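The evaluate phase can be sketched in plain C as follows. This is an illustration of the algorithm only: the project used OpenCV's implementation, whereas the struct, limits, and function names below are hypothetical, and "training" here is simply keeping the labelled feature vectors in memory.

```c
#include <stddef.h>

#define KNN_MAX_CLASSES 10  /* digits 0..9 */
#define KNN_MAX_TRAIN   256 /* illustrative cap on training-set size */

struct knn_sample {
    const float *features; /* feature vector of length dim */
    int label;             /* class label, 0..KNN_MAX_CLASSES-1 */
};

/* Squared Euclidean distance in dim-dimensional space. */
static float sq_dist(const float *a, const float *b, size_t dim)
{
    float sum = 0.0f;
    for (size_t i = 0; i < dim; i++) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Majority vote among the k training samples nearest to query.
 * Ties go to the smaller label. Returns -1 on invalid arguments. */
int knn_classify(const struct knn_sample *train, size_t n, size_t dim,
                 const float *query, size_t k)
{
    unsigned char used[KNN_MAX_TRAIN] = {0};
    int votes[KNN_MAX_CLASSES] = {0};

    if (!train || !query || n == 0 || n > KNN_MAX_TRAIN || k == 0)
        return -1;
    if (k > n)
        k = n;

    for (size_t round = 0; round < k; round++) {
        size_t best = n; /* sentinel: nothing picked yet */
        float best_d = 0.0f;
        for (size_t i = 0; i < n; i++) {
            if (used[i])
                continue;
            float d = sq_dist(train[i].features, query, dim);
            if (best == n || d < best_d) {
                best = i;
                best_d = d;
            }
        }
        used[best] = 1;
        if (train[best].label >= 0 && train[best].label < KNN_MAX_CLASSES)
            votes[train[best].label]++;
    }

    int winner = 0;
    for (int c = 1; c < KNN_MAX_CLASSES; c++)
        if (votes[c] > votes[winner])
            winner = c;
    return votes[winner] > 0 ? winner : -1;
}
```

Every query scans the whole training set, which is why the method "relies on memory": all training vectors must stay resident.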

Page 32

Embedded Application Binary Interface (EABI)

• “specifies standard conventions for file formats, data types, register usage, stack frame organization, and function parameter passing of an embedded software program” -wiki

• Allows developers to link libraries generated with one compiler with object code generated with a different compiler

• Dynamic linking is not required/not allowed.
• Compact stack frame organization is used to save memory

Page 33

ARM EABI

• Faster floating point operations
• ARM lacks FPU
  • It means that the compilers usually generate instructions for a piece of hardware, namely a Floating Point Unit, that is not actually there! When you make a floating point operation, such as 3.58*x, the CPU runs into an illegal instruction, and it raises an exception. The kernel catches this specific exception and performs the intended floating point operation, and then resumes executing the program. And this is slow because it implies a context switch.
  • Context switch – Icache and DCache are flushed.
• But with EABI, ‘softfloat’ (a software implementation of binary floating point arithmetic) is used by default.
• Entire root file system doesn’t have to be recompiled with softfloat enabled. With ARM EABI, there can be a mix of executables, where some use softfloat and some use the hardware FPU.

Page 34

Braille digits

Page 35

Displaying Braille characters on LCD

[Figure: 5-column by 8-row LCD character cell showing a Braille pattern]

Column weights: 16 8 4 2 1

Row   Value
0     3
1     3
2     0
3     24
4     24
5     0
6     27
7     27
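The row values above suggest that each Braille dot is drawn as a 2x2 pixel block: the left column of dots uses weights 16 + 8 = 24, the right column uses 2 + 1 = 3, and rows 2 and 5 are blank spacers. The sketch below builds the eight row bytes from a six-dot Braille mask under that assumption. The helper names are hypothetical. The digit table uses the standard Braille digit cells, which are the same cells as the letters a to j.

```c
#include <stdint.h>

/* Braille dots are numbered 1,2,3 down the left column and 4,5,6 down the
 * right column. Bit (n - 1) of a mask is set when dot n is raised. */
#define DOT(n) (1u << ((n) - 1))

#define LEFT_DOT  24u /* pixel columns with weights 16 and 8 */
#define RIGHT_DOT 3u  /* pixel columns with weights 2 and 1 */

/* Standard Braille cells for the digits 0..9. */
static const uint8_t braille_digit_dots[10] = {
    DOT(2) | DOT(4) | DOT(5),          /* 0 */
    DOT(1),                            /* 1 */
    DOT(1) | DOT(2),                   /* 2 */
    DOT(1) | DOT(4),                   /* 3 */
    DOT(1) | DOT(4) | DOT(5),          /* 4 */
    DOT(1) | DOT(5),                   /* 5 */
    DOT(1) | DOT(2) | DOT(4),          /* 6 */
    DOT(1) | DOT(2) | DOT(4) | DOT(5), /* 7 */
    DOT(1) | DOT(2) | DOT(5),          /* 8 */
    DOT(2) | DOT(4)                    /* 9 */
};

/* Convert a dot mask into 8 LCD row bytes (5 pixels per row). */
void braille_to_rows(uint8_t dots, uint8_t rows[8])
{
    for (int level = 0; level < 3; level++) {
        uint8_t bits = 0;
        if (dots & DOT(1 + level)) bits |= LEFT_DOT;
        if (dots & DOT(4 + level)) bits |= RIGHT_DOT;
        rows[3 * level] = bits;     /* each dot is two pixel rows tall */
        rows[3 * level + 1] = bits;
    }
    rows[2] = 0; /* spacer rows between dot levels */
    rows[5] = 0;
}

/* Row bytes for a decimal digit. Returns 0 on success, -1 if digit > 9. */
int braille_digit_rows(unsigned digit, uint8_t rows[8])
{
    if (digit > 9u)
        return -1;
    braille_to_rows(braille_digit_dots[digit], rows);
    return 0;
}
```

A mask with dots 2, 3, 4 and 6 set reproduces the row values shown on the slide: 3, 3, 0, 24, 24, 0, 27, 27.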

Page 36

LCD Driver

• Interface
  • init_lcd(int fd)
  • clear_lcd(int fd)
  • outbraille_lcd(int fd, unsigned int number)

Page 37

Steps to display custom characters on LCD

• Send command byte 0x40 (RS low)
• Write between 8 and 64 bytes of data (RS high)
  • 8 bytes per character. One byte per row.
  • Up to 8 custom characters can be displayed
• Send command 0x80 (RS low) to stop sending
• Send ASCII between 0x00 – 0x07 to access one of the 8 custom characters
• By writing to the CG RAM repeatedly on the fly, any number of custom characters can be displayed
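The steps above can be sketched as a function that builds the sequence of LCD writes for one custom character. It assumes an HD44780-compatible controller, which the slides do not name. On such a controller 0x40 is the "set CG RAM address" command and custom character n starts at CG RAM offset 8 * n. The struct and function names are hypothetical and are not the project's driver interface.

```c
#include <stddef.h>
#include <stdint.h>

/* One write to the LCD: rs = 0 for a command, rs = 1 for data. */
struct lcd_write {
    uint8_t rs;
    uint8_t byte;
};

#define LCD_CMD_SET_CGRAM 0x40u /* start writing custom-character rows */
#define LCD_CMD_SET_DDRAM 0x80u /* switch back to display memory */

/* Build the writes that load `rows` into custom-character slot 0..7 and
 * then display that character at the first display position.
 * `out` must hold 11 entries. Returns the number of entries written,
 * or 0 on invalid arguments. */
size_t lcd_custom_char_sequence(unsigned slot, const uint8_t rows[8],
                                struct lcd_write out[11])
{
    if (slot > 7u || rows == NULL || out == NULL)
        return 0;

    size_t n = 0;
    /* Command: point at this slot's 8 bytes of CG RAM (assumed layout). */
    out[n++] = (struct lcd_write){0, (uint8_t)(LCD_CMD_SET_CGRAM | (slot << 3))};
    /* Data: one byte per pixel row; only the low 5 bits are used. */
    for (int r = 0; r < 8; r++)
        out[n++] = (struct lcd_write){1, (uint8_t)(rows[r] & 0x1Fu)};
    /* Command: leave CG RAM and return to display memory. */
    out[n++] = (struct lcd_write){0, (uint8_t)LCD_CMD_SET_DDRAM};
    /* Data: character codes 0x00..0x07 select the custom characters. */
    out[n++] = (struct lcd_write){1, (uint8_t)slot};
    return n;
}
```

Rewriting the same slot with new row data before each display is one way to show more than 8 distinct patterns over time, as the last bullet describes.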