Ocr ppt

LEARNER-OCR (for beginners) For learning how to write and pronounce English characters

Transcript of Ocr ppt

Page 1: Ocr ppt

ENGLISH LANGUAGE LEARNER-OCR (for beginners)

For learning how to write and pronounce English characters

Page 2: Ocr ppt

Introduction

1. About OCR

OCR is the acronym for Optical Character Recognition. Optical character recognition is needed when information should be readable by both humans and machines.

Both handwritten and printed characters can be recognized.

It converts scanned images of machine-printed or handwritten text (numerals, letters, and symbols) into a computer-processable format.

Optical recognition is performed off-line, after the writing or printing has been completed, as opposed to on-line recognition, where the computer recognizes the characters as they are drawn.

Page 3: Ocr ppt

2. About speech synthesis

The text-to-speech (TTS) synthesis procedure consists of two main phases: text analysis (input text -> phonetic output) and speech generation (phonetic information -> acoustic output).

Conversion of text into speech can be implemented using the Java Speech Application Programming Interface (JSAPI), through which applications can use the functionality of speech engines. FreeTTS is the JSAPI speech synthesis engine that we have used.

Page 4: Ocr ppt

Current status of development

• OCR readers can convert typed and handwritten documents into digital data. These readers scan the shape of a character on a document, compare the scanned character with a pre-defined shape, and convert the character into its corresponding bit pattern for storage in main computer memory. This technology is still in development.

• Speech synthesis has reached a high level of performance, with low error rates in text analysis and high intelligibility in synthesis, but many improvements are still needed to achieve more natural-sounding speech.

Page 5: Ocr ppt

Advantages

• Reduces the time required by the user to enter data.

• Helps in learning the language, with spoken help.

• No keyboard is required for entering text.

• A computer with handwriting recognition integrated with speech synthesis can teach at any time and in any place.

• Both writing and pronunciation can be learned.

• People with reading disabilities (for example, dyslexia) can use it.

• A person can change (improve) his own handwritten pattern of a letter as many times as required during the learning phase.

Page 6: Ocr ppt

Artificial neural networks

An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture. The objective of the neural network is to transform the inputs into meaningful outputs.

[Figure: artificial neural network, labelled "Inputs" and "Output"]

Page 7: Ocr ppt

Kohonen algorithm

The input to a Kohonen neural network is given using the input neurons. Each input neuron is given one of the floating-point numbers that make up the input pattern to the network. A Kohonen neural network requires that these inputs be normalized to the range between -1 and 1. One output neuron is chosen as the winner.

To determine which neuron wins and produces the output, the following steps are followed:

Normalize the input: First calculate the "vector length" of the input data by summing the squares of the components of the input vector (strictly speaking, this sum is the squared length). Then determine the normalization factor, which is the reciprocal of the square root of the vector length.
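The normalization step above can be sketched in Java as follows. This is a minimal illustration, not the code of the project itself; the class and method names (InputNormalization, vectorLength, normalizationFactor) are hypothetical, and the sample input values are made up.

```java
class InputNormalization {
    // "Vector length" as described on the slide: the sum of the squared components.
    static double vectorLength(double[] input) {
        double sum = 0.0;
        for (double v : input) {
            sum += v * v;
        }
        return sum;
    }

    // Normalization factor: reciprocal of the square root of the vector length.
    // Assumption: the input is not all zeros (otherwise the result is Infinity).
    static double normalizationFactor(double[] input) {
        return 1.0 / Math.sqrt(vectorLength(input));
    }

    public static void main(String[] args) {
        double[] input = {0.5, 0.75};                   // made-up input pattern
        System.out.println(vectorLength(input));        // 0.25 + 0.5625 = 0.8125
        System.out.println(normalizationFactor(input)); // roughly 1.1094
    }
}
```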

Page 8: Ocr ppt

Contd…

Calculate each output neuron’s output: For each output neuron, the dot product of the input vector and the connection weights between the input neurons and that output neuron must be calculated.

This output must then be normalized by multiplying it by the normalization factor.

The output calculated above is then mapped by adding 1 and dividing the result by 2 (this takes a bipolar value in the range -1 to 1 to the range 0 to 1).

Finally, choose the winning neuron: the output neuron that has the largest output value becomes the winner.
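The winner-selection steps can be sketched in Java as shown here. This is only an illustrative sketch under assumptions: the names (KohonenWinner, winner) are hypothetical, the weights are assumed to be stored as weights[outputNeuron][inputNeuron], and the sample numbers are made up.

```java
class KohonenWinner {
    /** Returns the index of the winning output neuron for one input pattern. */
    static int winner(double[] input, double[][] weights) {
        // Normalization factor: 1 / sqrt(sum of squared inputs).
        double sumSquares = 0.0;
        for (double v : input) {
            sumSquares += v * v;
        }
        double normFactor = 1.0 / Math.sqrt(sumSquares);

        int best = -1;
        double bestOutput = Double.NEGATIVE_INFINITY;
        for (int j = 0; j < weights.length; j++) {
            // Dot product of the input vector and the weights into neuron j.
            double dot = 0.0;
            for (int i = 0; i < input.length; i++) {
                dot += input[i] * weights[j][i];
            }
            // Normalize, then add 1 and divide by 2.
            double output = (dot * normFactor + 1.0) / 2.0;
            if (output > bestOutput) { // ties go to the lower index
                bestOutput = output;
                best = j;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] input = {0.5, 0.75};
        double[][] weights = {{0.1, 0.2}, {0.3, 0.4}};
        // Dot products are 0.2 (neuron 0) and 0.45 (neuron 1), so neuron 1 wins.
        System.out.println("winner = " + winner(input, weights));
    }
}
```

Because the normalization and the add-1-divide-by-2 mapping are the same increasing function for every neuron, the winner is always the neuron with the largest raw dot product.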

Page 9: Ocr ppt

Unsupervised learning

• No help from the outside

• No information available on the desired output

• Learning by doing

Page 10: Ocr ppt

Processes in our OCR…

• The handwritten characters are first drawn using the mouse, and the bit pattern of the image is grabbed.

• Cropping is done to eliminate the extra white space around the image.

• DownSampling, an algorithm that reduces the resolution of the letters being drawn, is used for character recognition and training.

• Recognition (using a Kohonen Self-Organizing Map) and speech synthesis (using JSAPI).

• Training the network to recognize identical or similar patterns (by classifying them to the same output neuron).

• Error calculation (how well the network classifies).
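The cropping and down-sampling steps listed above could look roughly like the following Java sketch. It is written under assumptions: the drawn image is represented as a rectangular boolean grid (true = drawn pixel), the target grid size is a free parameter because the slides do not state one, and the names (CropAndDownSample, crop, downSample) are hypothetical.

```java
class CropAndDownSample {
    /** Returns the smallest sub-image that contains every drawn pixel (empty if none). */
    static boolean[][] crop(boolean[][] img) {
        int top = Integer.MAX_VALUE, left = Integer.MAX_VALUE, bottom = -1, right = -1;
        for (int y = 0; y < img.length; y++) {
            for (int x = 0; x < img[y].length; x++) {
                if (img[y][x]) {
                    top = Math.min(top, y);
                    bottom = Math.max(bottom, y);
                    left = Math.min(left, x);
                    right = Math.max(right, x);
                }
            }
        }
        if (bottom < 0) {
            return new boolean[0][0]; // nothing was drawn
        }
        boolean[][] out = new boolean[bottom - top + 1][right - left + 1];
        for (int y = top; y <= bottom; y++) {
            for (int x = left; x <= right; x++) {
                out[y - top][x - left] = img[y][x];
            }
        }
        return out;
    }

    /** Reduces the image to rows x cols cells; a cell is set if any pixel mapped to it is set. */
    static boolean[][] downSample(boolean[][] img, int rows, int cols) {
        boolean[][] out = new boolean[rows][cols];
        int h = img.length;
        if (h == 0) {
            return out;
        }
        int w = img[0].length;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (img[y][x]) {
                    out[y * rows / h][x * cols / w] = true;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[][] drawing = new boolean[8][8];
        drawing[2][3] = true;
        drawing[5][4] = true;
        boolean[][] cropped = crop(drawing); // 4 rows x 2 columns
        boolean[][] small = downSample(cropped, 2, 2);
        System.out.println(cropped.length + "x" + cropped[0].length + " -> " + small.length + "x" + small[0].length);
    }
}
```

The down-sampled grid, flattened into numbers, would then serve as the input vector to the Kohonen network.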

Page 11: Ocr ppt

Language used: Java

• Java is a general-purpose programming language developed by Sun Microsystems.

• It is an object-oriented language.

• It is platform independent.

• Code written in Java will be easier to maintain and reuse in the long run.

• Java has two GUI packages, the original Abstract Window Toolkit (AWT) and the newer Swing.

• Swing components have the prefix J to distinguish them from the original AWT ones (e.g. JFrame instead of Frame). To use Swing components and methods in your project, import the java.awt.*, java.awt.event.*, and javax.swing.* packages.

Page 12: Ocr ppt

Containers are used to hold and group components such as text fields and checkboxes etc.

JFRAME AND JPANEL

JFrame is the most commonly used top-level container. It adds basic functionality such as minimize, maximize, close, title and border to basic frames and windows. Some important JFrame methods are: setBounds(x,y,w,h), setSize(w,h), setResizable(bool), setTitle(str), setVisible(bool), isResizable() and getTitle(). The setDefaultCloseOperation(constant) method controls the action that occurs when the close icon is clicked.

JPanel is the most commonly used content pane. An instance of the pane is created and then added to a frame. The add() method allows GUI components to be added to the pane. The way they are added is controlled by the current layout manager.
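A small sketch of how a JFrame and a JPanel fit together is given here. It is not the project's actual user interface: the class name FrameAndPanelDemo, the helper buildPanel, and the label and button texts are all made up for illustration. Only standard Swing and AWT calls are used.

```java
import java.awt.FlowLayout;
import java.awt.GraphicsEnvironment;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

class FrameAndPanelDemo {
    // Builds a panel; FlowLayout decides how the added components are arranged.
    static JPanel buildPanel() {
        JPanel panel = new JPanel(new FlowLayout());
        panel.add(new JLabel("Draw a letter:"));
        panel.add(new JButton("Recognize"));
        return panel;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; not opening a window.");
            return;
        }
        // Swing components should be created on the event dispatch thread.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame();
            frame.setTitle("OCR demo");
            frame.setBounds(100, 100, 320, 120); // x, y, width, height
            frame.setResizable(false);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(buildPanel());             // add the panel to the frame
            frame.setVisible(true);
        });
    }
}
```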

For text-to-speech conversion using Java, we need some packages (e.g. speech, util, synthesis, freetts) and some jar files to be installed in our working folder before compiling our program.

Page 13: Ocr ppt

Data flow diagrams

Page 14: Ocr ppt

Dfd

Page 15: Ocr ppt

Dfd

[Figure: Level-2 Data Flow Diagram. Labels: Hand written characters, Connection Weights, Vector Length, DownSampled image, Outputs, Normalized input, Recognition, Winner, Input vector, Normalized Outputs, User interface, Cropping, Kohonen Neural Network]

Page 16: Ocr ppt

Features in OCR

• It can recognize handwritten characters and simultaneously speak the recognized character.

• We can train the network to recognize our own handwriting, so that most of our characters can be recognized. The process for training a Kohonen neural network involves stepping through several epochs until the error of the network is below an acceptable level. An epoch occurs when the training data is presented to the network, the error is calculated, and the weights are adjusted to reduce the error.

• We have a training file that contains training samples of our own handwriting (capital versions of the 26 English letters), which can be loaded so that the application can be trained further to recognize characters drawn by us.
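To make the idea of an epoch concrete, here is a hedged Java sketch. The slides do not give the weight-adjustment rule or the error formula, so this sketch assumes a common textbook winner-take-all rule (move the winner's weights toward the sample) and uses the mean distance between each sample and its winner's weights as the error. All names and numbers are hypothetical.

```java
class KohonenTrainingSketch {
    // Winner = output neuron with the largest dot product (see the earlier slides).
    static int winner(double[] x, double[][] w) {
        int best = 0;
        double bestDot = Double.NEGATIVE_INFINITY;
        for (int j = 0; j < w.length; j++) {
            double dot = 0.0;
            for (int i = 0; i < x.length; i++) {
                dot += x[i] * w[j][i];
            }
            if (dot > bestDot) {
                bestDot = dot;
                best = j;
            }
        }
        return best;
    }

    /**
     * One epoch: present every sample, measure the error, adjust the weights.
     * Returns the mean distance between each sample and its winner's weights,
     * measured before that winner is adjusted (an assumed error measure).
     */
    static double runEpoch(double[][] samples, double[][] w, double rate) {
        double total = 0.0;
        for (double[] x : samples) {
            int k = winner(x, w);
            double dist2 = 0.0;
            for (int i = 0; i < x.length; i++) {
                double d = x[i] - w[k][i];
                dist2 += d * d;
                w[k][i] += rate * d; // assumed rule: move winner toward the sample
            }
            total += Math.sqrt(dist2);
        }
        return total / samples.length;
    }

    public static void main(String[] args) {
        double[][] samples = {{1.0, 0.0}, {0.0, 1.0}};
        double[][] weights = {{0.6, 0.4}, {0.4, 0.6}};
        for (int epoch = 1; epoch <= 100; epoch++) {
            double error = runEpoch(samples, weights, 0.5);
            System.out.println("epoch " + epoch + " error " + error);
            if (error < 0.10) {
                break; // below the acceptable level, stop training
            }
        }
    }
}
```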

Page 17: Ocr ppt

Features contd…

• It can create a list of the letters that the program has been trained for. By selecting a particular letter and deleting it, we can retrain the program for that letter.

• The error indicates how well the training inputs (the letters that you created) map to the output neurons (26 characters). If the error is below the acceptable level of error (10%), no further training is required.

• The first error, lastError, indicates the total error of the Kohonen neural network for the epoch that just occurred. The second error, bestError, indicates the best (lowest) lastError that has occurred so far. Tries counts the number of times we have tried to adjust the weight matrix to reduce the error while training the network.
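The bookkeeping of lastError, bestError and tries can be sketched as follows. This is a simplified illustration, not the project's code: it assumes one try per epoch, it replays a made-up sequence of epoch errors instead of training a real network, and the class TrainingProgress and its helper replay are hypothetical names.

```java
import java.util.function.DoubleSupplier;

class TrainingProgress {
    double lastError = Double.MAX_VALUE; // error of the epoch that just occurred
    double bestError = Double.MAX_VALUE; // lowest lastError seen so far
    int tries = 0;                       // number of epochs attempted (simplification)

    /** Runs epochs until bestError is below acceptableError or maxTries is reached. */
    boolean train(DoubleSupplier runEpoch, double acceptableError, int maxTries) {
        while (tries < maxTries) {
            lastError = runEpoch.getAsDouble();
            tries++;
            if (lastError < bestError) {
                bestError = lastError;
            }
            if (bestError < acceptableError) {
                return true; // no further training required
            }
        }
        return false;
    }

    // Stand-in for real training: replays a fixed list of epoch errors.
    static DoubleSupplier replay(double... errors) {
        int[] next = {0};
        return () -> errors[next[0]++];
    }

    public static void main(String[] args) {
        TrainingProgress p = new TrainingProgress();
        boolean ok = p.train(replay(0.40, 0.25, 0.30, 0.08), 0.10, 10);
        System.out.println(ok + " tries=" + p.tries
                + " lastError=" + p.lastError + " bestError=" + p.bestError);
    }
}
```

With the made-up sequence above, training stops after the fourth epoch, because 0.08 is below the 10% acceptable level.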

Page 18: Ocr ppt

Snapshot

Page 19: Ocr ppt

than