Camera Based Label Reader for Blind People


slide 1:

Int. J. Chem. Sci.: 14(S3), 2016, 840-844
ISSN 0972-768X, www.sadgurupublications.com

CAMERA BASED LABEL READER FOR BLIND PEOPLE

A. SUBBIAH, T. ARIVUKKARASU, M. S. SARAVANAN and V. BALAJI
Department of ECE, Aarupadai Veedu Institute of Technology, Chennai (T.N.), India
Author for correspondence. E-mail: subbiah@avit.ac.in

ABSTRACT

An OCR (Optical Character Recognition) system is a branch of computer vision and, in turn, a sub-class of Artificial Intelligence. Here, OCR is the translation of optically scanned bitmaps of printed or handwritten text into audio output using a Raspberry Pi. OCRs developed for many world languages are already in efficient use. The moving object region is extracted by a mixture-of-Gaussians-based background subtraction method, and text localization and recognition are then conducted to acquire the text information. To localize text regions within the object, a text localization algorithm learns gradient features of stroke orientations and distributions of edge pixels. Text characters in the localized regions are then binarized and recognized by off-the-shelf OCR software, and the recognized text codes are output to blind users as speech. Once recognition is complete, the character codes in the text file are processed on the Raspberry Pi device, which recognizes characters using the AdaBoost algorithm and Python programming, and the audio output is played back.

Key words: Label reader, Blind, Camera, OCR.

INTRODUCTION

Optical character recognition (OCR) technology offers blind and visually impaired persons the capacity to scan printed text and then have it spoken back in synthetic speech or saved to a computer. Little technology exists to interpret graphics such as line art, photographs and graphs into a medium easily accessible to blind and visually impaired persons. It is also not yet possible to convert handwriting, whether script or block printing, into an accessible medium. There are three essential elements to OCR technology: scanning, recognition and reading text. Initially, a printed document is scanned by a camera. OCR software then converts the images into recognized characters and words. The synthesizer in the OCR system then speaks the recognized text. Finally, the information is stored in electronic form, either in a personal computer (PC) or in the memory of the OCR system itself.
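To make the capture-OCR-speech pipeline described above concrete, the following is a minimal sketch in Python. It assumes, beyond what the paper specifies, that the pytesseract wrapper around Tesseract stands in for the off-the-shelf OCR software, that OpenCV performs the grayscale and binary conversion, and that the espeak command-line synthesizer on the Raspberry Pi produces the voice output; the function and file names are illustrative.

# Minimal sketch of the capture -> OCR -> speech pipeline (assumed tools:
# OpenCV for preprocessing, pytesseract for OCR, espeak for speech output).
import subprocess

import cv2
import pytesseract


def read_label_aloud(image_path: str) -> str:
    """Convert a captured label image to speech and return the recognized text."""
    image = cv2.imread(image_path)                     # captured image
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)     # gray image
    _, binary = cv2.threshold(gray, 0, 255,            # binary image (Otsu threshold)
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(binary)         # apply OCR
    if text.strip():
        subprocess.run(["espeak", text])               # voice output
    return text                                        # store text


if __name__ == "__main__":
    print(read_label_aloud("label.jpg"))

In the proposed system, the AdaBoost-based text localization described on the following slides would run before the OCR step, so that recognition is restricted to candidate text regions.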

slide 2:

The recognition process takes account of the logical structure of the language. An OCR system will deduce that the word "tke" at the beginning of a sentence is a mistake and should be read as the word "the." OCRs also use a lexicon and apply spell-checking techniques similar to those found in many word processors. All OCR systems create temporary files containing the text's characters and page layout. In some OCRs, these temporary files can be converted into formats retrievable by commonly used computer software, such as word processors, spreadsheets and database software. The blind or visually impaired user can access the scanned text by using adaptive technology devices that magnify the computer screen or provide speech or braille output.

The text given as input includes stereotypical forms, such as street signs, hospital signs and bus numbers, as well as more variable forms, such as shop signs, house numbers and billboards. Our database of city images was taken in ZZZZ, partly by normally sighted viewers and partly by blind volunteers who were accompanied by sighted guides for safety reasons, using automatic camera settings and little practical knowledge of where the text was located in the image. The databases have been labeled to enable us to train part of our algorithm and to evaluate its performance.

The first and most important component of the algorithm is a strong classifier, which is trained by the AdaBoost learning algorithm on labeled data. AdaBoost requires specifying a set of features from which to build the strong classifier. This paper selects the feature set guided by the principle of informative features. We calculate joint probability distributions of these feature responses on and off text, so that weak classifiers can be obtained as log-likelihood ratio tests. The strong classifier is applied to sub-regions of the image at multiple scales and outputs text candidate regions.

Fig. 1: Block diagram of conversion from image to voice (captured image → gray image → convert to binary image → apply OCR → recognize character → store text → voice output)
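The multi-scale application of the strong classifier can be pictured with a short sliding-window sketch. The strong_classifier callable, the 2:1 window size, the scale steps and the stride below are illustrative placeholders rather than details taken from the paper.

# Sketch of sweeping a trained strong classifier over sub-regions of the image
# at multiple scales. `strong_classifier` is a placeholder for the
# AdaBoost-trained detector; window size, scales and step are illustrative.
from typing import Callable, List, Tuple

import cv2
import numpy as np


def detect_text_regions(image: np.ndarray,
                        strong_classifier: Callable[[np.ndarray], bool],
                        window: Tuple[int, int] = (40, 20),    # width:height = 2:1
                        scales: Tuple[float, ...] = (1.0, 0.75, 0.5),
                        step: int = 8) -> List[Tuple[int, int, int, int]]:
    """Return candidate text boxes (x, y, w, h) in original-image coordinates."""
    w_win, h_win = window
    candidates = []
    for scale in scales:
        scaled = cv2.resize(image, None, fx=scale, fy=scale)   # one pyramid level
        rows, cols = scaled.shape[:2]
        for y in range(0, rows - h_win + 1, step):
            for x in range(0, cols - w_win + 1, step):
                patch = scaled[y:y + h_win, x:x + w_win]
                if strong_classifier(patch):                   # combined weak-classifier vote
                    candidates.append((int(x / scale), int(y / scale),
                                       int(w_win / scale), int(h_win / scale)))
    return candidates

Any patches returned by such a sweep would then be binarized and passed to the OCR stage of Fig. 1.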

slide 3:

Proposed system

We propose a system using the Raspberry Pi model for scanning images of stereotypical forms, such as street signs, hospital signs and bus numbers, as well as more variable forms, such as shop signs, house numbers and billboards. Here, we use the AdaBoost algorithm for processing the visual information and converting it into audio speech.

AdaBoost algorithm

AdaBoost is an instance of a technique known in computer science as the Multiplicative Weights Update Algorithm (MWUA), which has many applications ranging from learning theory to combinatorial optimization and game theory. The idea is to maintain a nonnegative weight for the elements of some set, draw a random element proportionally to the weights, do something with the chosen element and, based on the outcome of that "something", update the weights and repeat. The "something" is usually a black-box algorithm, such as "solve this simple optimization problem." The output of the "something" is interpreted, and the weights are updated according to the severity of the outcome. One can interpret MWUA as minimizing regret with respect to the best alternative element one could have chosen in hindsight. In fact, this is precisely the technique that has been used to attack the adversarial bandit learning problem.

AdaBoost requires a set of classified data, with image windows labeled manually as being text or non-text. We performed this labeling for the training dataset and divided each text window into several overlapping text segments with a fixed width-to-height ratio of 2:1. This led to a total of 7132 text segments, which were used as positive examples. The negative examples were obtained by a bootstrap process similar to that of Drucker et al.2 First, we selected negative examples by randomly sampling from windows in the image dataset. After training with these samples, we applied the AdaBoost algorithm to classify all windows in the training images at a range of sizes. Those misclassified as text were then used as negative examples for retraining AdaBoost. The image regions most easily confused with text were vegetation, repetitive structures such as railings or building facades, and some chance patterns.
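As a rough illustration of the multiplicative weight update described above, here is a generic discrete-AdaBoost training loop. The weak_learners pool (for example, the log-likelihood-ratio tests mentioned earlier), the feature representation and the number of rounds are assumptions for the sketch, not the paper's actual configuration.

# Generic discrete-AdaBoost sketch of the multiplicative weight update.
# `weak_learners` is assumed to be a pool of candidate weak classifiers,
# each mapping an array of feature vectors to +/-1 predictions.
import numpy as np


def adaboost_train(X, y, weak_learners, rounds=50):
    """Return a list of (alpha, weak_classifier) pairs forming the strong classifier.

    X: (n_samples, n_features) array; y: array of +/-1 labels (text / non-text).
    """
    n = len(y)
    w = np.full(n, 1.0 / n)                      # nonnegative weights over the training set
    strong = []
    for _ in range(rounds):
        # Pick the weak classifier with the lowest weighted error on the current weights.
        errors = [np.sum(w * (h(X) != y)) for h in weak_learners]
        best = int(np.argmin(errors))
        err = max(errors[best], 1e-10)
        if err >= 0.5:                           # no weak learner better than chance
            break
        alpha = 0.5 * np.log((1.0 - err) / err)  # vote weight for this round
        pred = weak_learners[best](X)
        # Multiplicative update: increase the weight of misclassified examples.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        strong.append((alpha, weak_learners[best]))
    return strong


def strong_classify(strong, X):
    """Combined prediction of the strong classifier: sign of the weighted vote."""
    votes = sum(alpha * h(X) for alpha, h in strong)
    return np.where(votes >= 0, 1, -1)

The bootstrap step described above would simply call adaboost_train again after adding the windows misclassified as text to the negative examples.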

slide 4:

RASPBERRY-PI

The Raspberry Pi is a credit-card sized single-board computer developed in the UK with the intention of promoting the teaching of basic computer science in schools. Being so versatile, it soon became an important tool that can be used in many different electronic projects. The Raspberry Pi has a Broadcom BCM2835 SoC, which includes an ARM1176JZF-S 700 MHz processor and a VideoCore IV GPU. The more recent Model B has 512 MB of RAM, two USB ports and an Ethernet port. It does not have a built-in hard disk, so it relies on an SD card for booting and persistent storage. It also supports a camera board that connects to the CSI-2 camera port on the Raspberry Pi using a short ribbon cable, providing connectivity for a camera capable of capturing still images or video recordings. The camera connects to the Image System Pipeline (ISP) in the Raspberry Pi's SoC, where the incoming camera data is processed and eventually converted to an image or video file on the SD card. This camera module can take photos of up to 5 MP (2592 × 1944 pixels) and can record video at resolutions up to 1080p30 (1920 × 1080 at 30 fps).

RESULTS AND DISCUSSION

In this paper, we have used the AdaBoost algorithm on the Raspberry Pi model for the conversion of text to audio speech, so that blind users can easily understand and "read" data that they cannot see. The same project can be extended and tested with other OCR algorithms.
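For reference, capturing a still image with the camera module described above can be sketched with the picamera library; the output file name and the two-second warm-up delay are illustrative choices.

# Sketch of capturing a full-resolution still with the Raspberry Pi camera
# module via the picamera library (camera board attached to the CSI port).
from time import sleep

from picamera import PiCamera

camera = PiCamera()
camera.resolution = (2592, 1944)   # full 5 MP still resolution
camera.start_preview()
sleep(2)                           # give the sensor time to adjust exposure
camera.capture("label.jpg")        # image is written to the SD card
camera.stop_preview()
camera.close()

The captured file would then be fed into the image-to-voice pipeline of Fig. 1.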

slide 5:

REFERENCES

1. X. Chen and A. L. Yuille, AdaBoost Learning for Detecting and Reading Text in City Scenes, Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2004, pp. 366-373.

2. Lukas Neumann and Jiri Matas, Real-Time Scene Text Localization and Recognition, Proceedings of the 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 2012.

3. Chucai Yi, Yingli Tian and Aries Arditi, Portable Camera-Based Assistive Text and Product Label Reading from Hand-Held Objects for Blind Persons, IEEE/ASME Transactions on Mechatronics, 2013.

4. N. Rajkumar, M. G. Anand and N. Barathiraja, Portable Camera-Based Product Label Reading for Blind People, Int. J. Engg. Trends Technol., 10(11), 521 (2014).

Accepted: 11.10.2016