Department of Information Technology Government of Maharashtra
USE OF COMPUTER VISION AND MACHINE LEARNING TECHNOLOGIES IN E-GOVERNANCE A paper on implementation of OpenCV by UID Innovation Labs, Department of Information Technology, Government of Maharashtra
Use of computer vision and machine learning technologies in eGovernance UID Innovation Labs, Government of Maharashtra
P a g e 1 | 39
BACKGROUND
AADHAAR is a 12-digit Unique Identification (UID) number provided by the Unique Identification Authority of India (UIDAI) to empower residents of India. It is one of India's most ambitious eGovernance projects so far, and it envisages issuing UID numbers to all residents of India. As on 15th May 2014, more than 630 million UIDs had been issued.
The State of Maharashtra has been at the forefront of Aadhaar implementation since the project's inception, across all aspects of project implementation, i.e. from enrolment to service delivery. As on 15th May 2014, approximately 9.3 million citizens were enrolled for UID, of which 8.7 million UIDs had been issued.
The Government of Maharashtra (GoM) has deployed fully operational IT infrastructure and solutions to facilitate the use of Aadhaar authentication and eKYC in Government service delivery. The use of eKYC by the Directorate of Government Printing and Stationery (DGPS) in the process of name, religion and date-of-birth changes for citizens, and by the Department of Registrations and Stamps for registration of Leave and License agreements, has changed the paradigm of G2C service delivery.
GoM has also taken various initiatives, such as mobile-based self-seeding and web-based self-seeding, and was the first state to set up a State Resident Data Hub (SRDH) for facilitating Aadhaar-enabled service delivery. The SRDH encompasses the state's resident data: approximately 6.5 million residents' demographic data and photographs are stored in the SRDH, while the online authentication modules link to UIDAI central servers for fingerprint and iris data. In addition, GoM has set up a multidisciplinary technology lab focused on the Aadhaar project, called UID Innovation Labs, where various innovative tools and techniques are developed.
USE OF INNOVATIVE TECHNOLOGIES
With the ambition of creating a highly accurate and reliable resident data hub, GoM decided to conduct UID data-cleansing activities in a proactive manner. As a result, UID Innovation Labs evaluated various scientific tools for data cleansing. One of the challenges during these activities was validating citizens' photographs through an automated process. After evaluating various open source and commercial alternatives, the Open Source Computer Vision library (OpenCV) was adopted as the framework for this objective.
OpenCV is a cross-platform, open source computer vision and machine learning software library aimed at real-time computer vision applications such as imaging, video processing, image processing, HDR imaging, etc. OpenCV contains more than 2,500 optimized algorithms, including state-of-the-art computer vision and machine learning algorithms.
OpenCV 2.4 includes the FaceRecognizer class for face recognition. OpenCV implements a version of the face-detection technique first developed by Paul Viola and Michael Jones, commonly known as the Viola-Jones detector, and later extended by Rainer Lienhart and Jochen Maydt to use diagonal features. OpenCV refers to this detector as the "Haar classifier" because it uses Haar features or, more precisely, Haar-like wavelets that consist of adding and subtracting rectangular image regions before thresholding the result. It is a machine-learning-based approach in which a cascade function is trained from many positive and negative images and is then used to detect faces in similar images.
Face recognition through geometric features of a face, such as the positions of the eyes, ears, nose and lips, is probably the most intuitive approach to face recognition. These positions are used to build a feature vector. A number of face image databases, listed at http://face-rec.org/databases/, can be used for evaluation.
OpenCV ships with a set of pre-trained object-recognition files, but the code also allows you to train and store new object models for the detector. For face detection using Haar cascades, the algorithm initially needs a large number of positive and negative images to train the classifier. Therefore, the first task was to prepare a dataset to train the classifier.
ISSUES WITH SRDH
For any successful service delivery, the complete accuracy of citizens' data and the associated service data must be ensured. During Aadhaar enrolment, a citizen's biometric and demographic data is captured and processed. Biometric data is captured using an iris scanner for the eyes, a fingerprint scanner for the fingers, and a webcam for the photograph of the citizen as a visual identity.
UID being a project of such a large scale, it is evident that, even with the utmost quality assurance of the data-capture process, some data-quality issues will come up. It is also practically very difficult to supervise enrolment at the grassroots level in real time, as the enrolment process is a completely decentralized activity. It has been observed that enrolment operators sometimes click the photograph without taking a look at the captured image, in order to hasten enrolment.
GoM has adopted various tools for data validation and cleansing of demographic data using SAS. Data-quality issues with photographs include photos with very low exposure, photos where only half the face is captured, photos where face attributes are not visible, multiple faces in one frame, closed eyes, and photos without faces. The team also wanted to verify gender and age against the face, though this is a difficult call for the software and needs more manual checking.
IMPLICATION OF INVALID DATA
One of the core offerings of Aadhaar is unique-identity-based service delivery, which is only possible when service-delivery data is integrated with UIDs. While conducting integration (also called seeding) of UID and service data, the source UID data should remain a clean and organized master. Failing this would result in false master data across all service databases post-integration, which can also lead to entitled services not reaching citizens. Further, services like eKYC, which may be used extensively for G2C service delivery, will falter if the data is ambiguous. Unreliable resident data may lead to issues like non-delivery of services to citizens.
EXPECTED OUTCOME
GoM envisages cleansing the UID holders' photographs by applying innovative tools and techniques on an open source platform, to ensure a usable SRDH with data that is as reliable as possible. Identified bad records will then be referred to UIDAI for updates.
Further, USRDH data, after suitable masking/anonymizing, can be utilized to generate training files for entities or research organizations working on face-recognition projects. Currently, no such training files for Indian faces are available in the public domain. These training files (which we are providing in XML) can benefit research in face recognition pertaining to India in a big way.
KEY CHALLENGE
OpenCV has a library of trained files that can be used for object recognition. These trained files are contributed by the worldwide community and, in the case of face recognition, are mainly trained on non-Indian faces. Thus, the use of OpenCV with pre-trained object-recognition files was limited to face detection. The results for age mismatch, gender mismatch and other specific analyses, which required more discrete use of facial features, were not encouraging.
There were no training files available for the Indian face structure, so the need was felt to create training files on Indian faces in order to analyze the database further. These training files can also be shared in the public domain for entities and research organizations working on solutions or research on face recognition in the Indian context.
PROCESS
Face detection (or any object detection) with OpenCV involves two major stages: training and detection. The OpenCV library provides programs (or functions) that are used to train classifiers for its face-detection system, called HaarTraining, which create the training files to be used by the face-detection functions. Some standard training files available in the public domain can also be used.
Face detection is achieved using the OpenCV library function "cvHaarDetectObjects" (refer to http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html). This function takes the image to be analyzed as input, along with various other parameters including the training file location, and returns a sequence of rectangular regions in the given image.
For screening images, the training files available in the public domain for facial features were used. Detection was performed for the nose, mouth, eye pair and smile. In addition, alternate training files for the frontal face were used to validate the findings of the main training file. The following training files (available at www.opencv.org) were used:
OpenCV command-line utilities were used for the detection of bad images and for standardization, processed on a basic computing environment. The initial process was run on 35 million resident data sets, which resulted in processing challenges while handling such a large data set; hence the data sets were divided into 8 separate shards and the process was applied to each shard in parallel.
FACE DETECTION
ProcessPhotoFile: this function was created using the cvHaarDetectObjects function of OpenCV. It analyzes the datasets against the training files, records the features identified for each face, such as face, nose, mouth, eye pair and smile, and writes the results to a CSV file. After this process, the details against each UID are captured in the CSV file. In addition, a ListFileProperty function was created to retrieve image properties such as width and height, aspect ratio, and image size in KB. This function requires the same parameters as the Clean_Single function. (Refer to Appendix A for the code)
"haarcascade_frontalface_default.xml";
"haarcascade_mcs_nose.xml";
"haarcascade_mcs_mouth.xml";
"haarcascade_mcs_eyepair_small.xml";
"haarcascade_frontalface_alt.xml";
"haarcascade_frontalface_alt2.xml";
"haarcascade_smile.xml";
Parameters: dataset from pre-defined locations in an array
Output: a CSV file with the following fields:
uid,Faces,Face_Alt,Face_Alt2,Nose_Count,EyePair_Count,Mouth_Count,Eye_Glasses,Percentage_cover_1,Percentage_cover_2,Percentage_cover_3
GRANULAR APPROACH:
In order to detect a face, it must be ensured that the relevant facial features are detected and flagged appropriately. Accordingly, a number of iterations were processed for each image, one for each facial-feature detection. The following processes were conducted step by step on each image.
Figure 1. Iterations for Effective Face Detection
An image was considered valid when the result showed one face, one eye pair, one nose, at least one mouth, and at least 60% face coverage. The following process depicts the detailed steps:
OpenCV provides various Haar-cascade files for the various elements of face detection in an image. These cascades are located in the data folder of the OpenCV installation.
The main function that detects faces is cvHaarDetectObjects, which was called by our function ProcessPhotoFile.
Parameters: dataset from pre-defined locations in an array
Output: a CSV file with the following fields:
UID,WIDTH,HEIGHT,ASPECT_RATIO,FILE_SIZE
DETECTING FACES
OpenCV provides various cascade files for face detection in images. The Haar-cascade training XML files are used to detect full frontal faces, with the getFaceFeature() method detecting the faces. The primary objective is to ensure that images contain faces and are not blank and do not contain any other object. Further, the face should have at least 60% coverage in the image.
Sample code snippets:
const char * XML_FILE_FACE = "haarcascade_frontalface_default.xml";
CvHaarClassifierCascade *cascadeFace =
    (CvHaarClassifierCascade*)cvLoad( XML_FILE_FACE, 0, 0, 0 );

//--- Detect facial features using the Haar detection API of OpenCV
CvSeq * sign = cvHaarDetectObjects( img, cascadeFace, storageFace,
    scaleFactor, minNeighbors, CV_HAAR_DO_CANNY_PRUNING );

//--- Parameters:
//--- img (IplImage)             - image to which object detection is applied
//--- cascadeFace (CvHaarClassifierCascade) - Haar-cascade file
//--- storageFace (CvMemStorage) - storage for the detected objects
//--- scaleFactor (double) = 1.3
//--- minNeighbors (int)   = 3
//--- CV_HAAR_DO_CANNY_PRUNING   - flag defined by OpenCV
Used Haar-Cascades:
"haarcascade_frontalface_default.xml";
"haarcascade_frontalface_alt.xml";
"haarcascade_frontalface_alt2.xml";
Note: Haar-cascades for face detection are located in the data folder of the OpenCV installation.
CList<CFaceFeature> featureDetails;
cFaceUtility.getFaceFeature(CFaceFeature::Face,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
cFaceUtility.getFaceFeature(CFaceFeature::FaceAlternate1,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
cFaceUtility.getFaceFeature(CFaceFeature::FaceAlternate2,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
Output: The getFaceFeature() method returns "1" if a face is detected in the image.
DETECTING NOSE
Similarly, OpenCV provides various cascade files for nose detection in images.
Sample code snippets:
cFaceUtility.getFaceFeature(CFaceFeature::NosePosition,
_T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"),featureDetails);
Output: The getFaceFeature() method returns "1" if a nose is detected in the image.
Used Haar-Cascades:
"haarcascade_mcs_nose.xml";
DETECTING EYES
The following Haar-cascade was used for eye detection in images.
Sample code snippets:
cFaceUtility.getFaceFeature(CFaceFeature::EyePosition,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
cFaceUtility.getFaceFeature(CFaceFeature::SmallEyPosition,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
Output: The getFaceFeature() method returns "1" if eyes are detected in the image.
DETECTING MOUTH
The following Haar-cascade was used for mouth detection in images. However, limitations were observed in this process, as the method sometimes identified the eye area as a shape close to a mouth. Therefore, more than one detected mouth instance was still considered a valid image.
Sample code snippets:
Used Haar-Cascades:
"haarcascade_mcs_eyepair_small.xml";
Used Haar-Cascades:
"haarcascade_mcs_mouth.xml";
int CFaceUtility::getMouthCount(IplImage* object)
{
    try
    {
        CvSeq * sign = cvHaarDetectObjects(object, cascadeMouth, storageMouth,
            scaleFactor, minNeighbors, CV_HAAR_DO_CANNY_PRUNING);
        int total_Count = sign->total;   // read the count before clearing the
        cvClearMemStorage(storageMouth); // storage that backs the sequence
        //cvFree(&sign);
        return total_Count;
    }
    catch(exception * ex)
    {
    }
    return 0;
}
Output: The getMouthCount() method returns the number of mouth regions detected in the image.
Once all the features are marked as "1", the image can be inferred to be a good image. Using the OpenCV process it has thus been ascertained that the image has one face, one eye pair, one nose, one mouth, and at least 60% face coverage, while images marked "0" for any of the given parameters are identified as bad images. Some of the identified bad images are as follows.
ORGANIC VALIDATION PROCESS
All identified bad images are manually scrutinized through a Photo Validation tool developed by UID
Innovation Lab where Photo Validation operators manually check each identified image in the tool and
flag them either ‘Accept’ or ‘Reject’. Following two stage process of checker and verifier, once images
flagged as ‘Reject’, the same is flagged for updating.
IMAGE STANDARDIZATION
The dataset contains photographs with image sizes ranging from 1 KB to 186 KB. On review, images of less than 4 KB showed very bad quality, and the records for such images were omitted from the dataset.
To ensure uniformity of the images in the Usable Data Hub, an image-standardization process is applied to all non-uniform images. The uniform standard for the images is: width 160 pixels, aspect ratio 1.25 to 1.33, and face coverage of 60%.
The photo-standardization process optimizes, crops and adjusts the images to match these specifications. Cropping is done carefully to remove whitespace and not the face. Maintaining the aspect ratio is also important, as it prevents distortion of the images, and a uniform aspect ratio ensures uniform output when the images are used in an application. The following images illustrate the results of the standardization process, and the following code was implemented for it:
if(strCommand.Compare(_T("STANDARD_PHOTO")) >= 0)
{
    cFaceUtility.createStandardAndBWImages(argv[2], argv[3],
        CFaceUtility::Size_200X160, true, false);
}
IMPACT OF CLEANSING ACTIVITY USING OPENCV
Out of 50 million data sets analyzed using OpenCV, 1.14 million images were flagged by the software. These images went through the photo-validation process, with two data-entry operators independently reviewing each image and a supervisor then manually going through their suggestions. Overall, about 12,000 images were found to be unacceptable after the photo-validation process. Thus, manual effort was reduced almost 50 times by this specialized software.
The process identified around 2.2% of images as suspect. Post manual photo validation, it was found that 0.024% of images were not usable.
CASCADE HAAR TRAINING FOR CLASSIFIER
The earlier exercise was based on the training files available in the public domain (in the OpenCV library). These files were mainly contributed by worldwide communities and trained on non-Indian faces; there were no training files available for Indian faces.
With an extensive database of Indian images, the following process was initiated for generating Haar training files for Indian faces.
As a first case, the age group 0-10 was selected to generate the training vector file. Good-quality images from this age group were selected from the USRDH database. It was envisaged that, using these training files, the FaceDetect process would identify child images (in any Indian database) from a set of images and flag the rest as 'Reject'. This training vector file can be used by entities and research organizations to optimize their solutions and utilize it in their applications.
As part of the training, a significant number of images was selected, and the number of images was increased in incremental iterations for generating the training file. In all, after four iterations, a reliable training vector file was generated.
The input sample and results are depicted below:
SAMPLE:
ITERATIONS:
ITERATION 1
A positive image set of 450, with marked Regions of Interest (RoI), was provided for Haar training. A negative image set of 700 was provided; these were mainly blank images and did not contain persons.
Accepted Image
Results
Rejected Image
Results
Conclusion: Post training, OpenCV rejected images of very old residents; however, it accepted adult images.
ITERATION 2
The positive images (0-10 years) were increased from 450 to 1000, and 1000 negative images were used, including some adult images.
Accepted Image
Results
Rejected Image
Results
Conclusion: After the addition of adult images to the negative set, OpenCV rejected images of very old residents as well as some adult images. However, it still accepted some of the adult images.
ITERATION 3
In this iteration a complete set of 1000 adult images was used to train OpenCV.
It was found that false matches were reduced, and the selection ratio and detection of children increased, with some exceptions. The rejected set included some child images as well.
Accepted Image
Results as a child
Rejected Image
Results
ITERATION 4
The number of positive images was increased to around 1500, keeping the negative set at 1000 images. This should have shown better accuracy in identifying child images; however, no significant improvement was observed over iteration 3.
Conclusion: Based on the results of all the iterations, the training vector generated from iteration 3 was taken as the optimum training file to identify bad images (that is, demographic age mismatching the photo).
AUTOMATIC NUMBER PLATE RECOGNITION
Automatic number plate recognition (ANPR) uses optical character recognition on images to read vehicle registration plates. ANPR systems can use existing closed-circuit television or road-rule enforcement cameras, or cameras specifically designed for the task. ANPR can be used by police forces for traffic enforcement, identification of stolen vehicles, surveillance, etc. It can also be used for electronic toll collection on pay-per-use roads and for cataloguing the movements of traffic or individuals.
AUTOMATIC NUMBER PLATE RECOGNITION USING OPENCV
UID Innovation Labs conducts various research projects and use cases as part of its eGovernance initiatives. After the use of OpenCV for UID face recognition, the team also attempted automatic number plate recognition (ANPR) using OpenCV. As a concept, ANPR detects vehicle numbers from images and footage. ANPR carries immense applicability, adaptability and potential for use in various monitoring and mass-surveillance systems for governments. The algorithm used by UID Innovation Labs performs image segmentation and feature extraction, applying pattern-recognition basics, Support Vector Machines and K-Nearest Neighbors.
In order to implement ANPR, the first task was to train the software on Indian number plates. The following code was used to load the training samples and learn the K-nearest data from them. As the Hon'ble Supreme Court of India has given directions for standardizing vehicle number plates across the country, only the standardized template or its close variants were used to train the software.
NewOCROpenCV newOCROpenCV;

// Load the training samples and learn the K-nearest data from them
newOCROpenCV.setSize(15, 25, 25);
newOCROpenCV.LearnFromImages("<PATH>\\NumberPlateDetection\\OCRData25");
// OCRData25 folder contains the training images

//--- The main function to learn from training images
void NewOCROpenCV::LearnFromImages(char * pathToTraingImages)
{
    if(showSteps == true)
    {
        namedWindow("single", CV_WINDOW_AUTOSIZE);
        namedWindow("all", CV_WINDOW_AUTOSIZE);
    }
    CvMat* trainData = cvCreateMat(classes * nNumberOfSamples, nImageSize, CV_32FC1);
    CvMat* trainClasses = cvCreateMat(classes * nNumberOfSamples, 1, CV_32FC1);
    Mat img;
    char file[255];
    //--- Read the images one by one
    for (int j = 0; j < nNumberOfSamples; j++)
    {
        for (int i = 0; i < classes; i++)
        {
            sprintf(file, "%s\\%d\\%c.jpg", pathToTraingImages, j, strTrainCharacters[i]);
            img = imread(file, 1);
            if (!img.data)
            {
                cout << "File " << file << " not found\n";
                exit(1);
            }
            //--- Process the image and extract the training data from it
            Mat outfile;
            PreProcessImage(&img, &outfile, nSizex, nSizey);
            int idx = j * classes + i; // row index for this (sample, class) pair
            for (int n = 0; n < nImageSize; n++)
            {
                trainData->data.fl[idx * nImageSize + n] = outfile.data[n];
            }
            trainClasses->data.fl[idx] = (float)i;
        }
    }
    if(knearest != NULL)
    {
        delete knearest;
        knearest = NULL;
    }
    knearest = new KNearest(trainData, trainClasses);
}

The training code above writes the trained data to an output file.

PLATE DETECTION AND RECOGNITION
After training and creating the output file, the plate-detection and recognition functions were defined. Plate detection has the purpose of locating the plate within the whole camera frame. When a plate is detected in an image, the plate segment is passed to the second step, plate recognition, which uses an OCR algorithm to determine the alphanumeric characters on the plate.
In the next figure we can see the two main algorithm steps, plate detection and plate recognition. After these steps the program draws the detected plate characters over the camera frame. The algorithms can return bad results or even no result.
In each step shown in the previous figure, three additional steps commonly used in pattern-recognition algorithms are defined:
1. Segmentation: this step detects and extracts each patch/region of interest in the image.
2. Feature extraction: this step extracts a set of characteristics from each patch.
3. Classification: this step classifies each character in the plate-recognition step, or classifies each image patch as "plate" or "no plate" in the plate-detection step.
Training is critical for pattern-recognition algorithms. If the system is not trained effectively, ANPR may not work correctly, as different patterns need different types of training and evaluation. The system needs to be evaluated in different environments and conditions, and with different features, to get the best results.
PLATE DETECTION
To detect the plate, segmentation and segment classification of the image are performed. In segmentation, different filters, morphological operations, contour algorithms and validations are applied to retrieve those parts of the image that could contain a plate. In segment classification, a Support Vector Machine (SVM) classifier from OpenCV is applied to each image patch.
Before creating the main application, two different classes, plate and non-plate, were trained. The process used parallel frontal-view colour images that are 800 pixels wide and taken 2-4 metres from the car. These requirements are important to ensure correct segmentation.
The following image shows all the processes involved in plate detection:
Sobel filter
Threshold operation
Close morphologic operation
Mask of one filled area
Possible detected plates marked in red (features images)
Detected plates after the SVM classifier
SEGMENTATION
The second part is to divide the image into multiple segments, in order to simplify it for analysis and make further extraction easier. One important feature for plate segmentation is the high number of vertical edges in a license plate, assuming that the image was taken frontally and that the plate is not rotated and has no perspective distortion.
CLASSIFICATION
After preprocessing and segmenting all possible parts of the image, each segment needs to be classified as a license plate or not. To do this, a Support Vector Machine (SVM) algorithm is used.
A Support Vector Machine is a pattern-recognition algorithm from the family of supervised-learning algorithms originally created for binary classification. Supervised learning refers to machine-learning algorithms that learn from labeled data: the algorithm is trained with a labeled dataset in which each sample has a class. The SVM creates one or more hyperplanes that are used to discriminate between the classes of the data.
TRAINING THE CLASSIFIER
The first task before any classification is to train our classifier; this job is done before the main application starts and is called offline training. It requires a sufficient amount of data to train the system, but a bigger dataset does not always imply better results. In the current case, enough data was not available because there are no public license-plate databases; as a result, hundreds of car photos were taken and then all of them were preprocessed and segmented.
More than 75 license-plate images and 35 images without license plates, each 144 x 33 pixels, were used for training. A sample of this data is shown in the following image. This is sufficient to get decent results within the scope of the experiment.
PLATE RECOGNITION
This step aims to retrieve the characters of the license plate using optical character recognition (OCR). For each detected plate, we segment the plate into its characters and use an Artificial Neural Network (ANN) machine-learning algorithm to recognize each character.
OCR SEGMENTATION
The plate image patch is taken as the input to the OCR segmentation function with an equalized histogram. A threshold filter is then applied, and the thresholded image is used as the input to a find-contours algorithm; this process is shown in the next figure:
FEATURE EXTRACTION
The next step for each segmented character is to extract the features for training and classifying the Artificial Neural Network algorithm. Unlike the plate-detection feature-extraction step used with the SVM, not all of the image pixels are used; instead, more common OCR features are applied: horizontal and vertical accumulation histograms and a low-resolution image sample.
OCR CLASSIFICATION
In the classification step, we use an Artificial Neural Network machine-learning algorithm. More
specifically, a Multi-Layer Perceptron (MLP), which is the most commonly used ANN algorithm. MLP
consists of a network of neurons with an input layer, output layer, and one or more hidden layers. Each
layer has one or more neurons connected with the previous and next layer.
For our sample, we use only one hidden layer; we then define a matrix of 1 row and 3 columns. The first column position is the number of features, the second is the number of neurons in the hidden layer, and the third is the number of classes.
OpenCV defines a CvANN_MLP class for ANN. With the create function, we can initialize the class by defining the number of layers and neurons, the activation function, and the alpha and beta parameters.
In OpenCV, the CvANN_MLP class uses the predict function for classifying a feature vector into a class. Unlike the SVM classify function, the ANN's predict function returns a row whose size equals the number of classes, containing the probability that the input feature vector belongs to each class.
To get the best result, we can use the minMaxLoc function to get the maximum and minimum responses and their positions in the matrix. The class of our character is given by the x position of the highest value:
To finish, for each detected plate we order its characters and return a string using the str() function of the Plate class, and we can draw it on the original image:
MODIFICATION TO HANDLE ‘MH’ NUMBERS
The output of the above process is the string on the number plate. However, some intelligence to identify Maharashtra vehicles was built and successfully tested. The returned strings were often ambiguous number plates, which were then cleaned to retrieve valid MH number plates.
string licensePlate = plate.str();
rectangle(input_image, plate.position, Scalar(0,0,200));
putText(input_image, licensePlate, Point(plate.position.x, plate.position.y), CV_FONT_HERSHEY_SIMPLEX, 1, Scalar(0,0,200), 2);
The correction algorithm is based on analyzing the recognized string and correcting the identified characters to logical MH numbers.
bool NumberDetect::checkAndCorrectStateCode(CString& strStateCode)
{
    CString strNewCode = _T("");
    if (strStateCode.Mid(0,1).Compare(_T("M")) != 0)
    {
        strNewCode = _T("MH");
        strStateCode = strNewCode;
        return true;
    }
    else if (strStateCode.Mid(1,1).Compare(_T("H")) != 0)
    {
        strNewCode = _T("MH");
        strStateCode = strNewCode;
        return true;
    }
    return false;
}

bool NumberDetect::correctNumbertIfChar(CString& strNumber)
{
    strNumber.Replace(_T("B"), _T("8"));
    strNumber.Replace(_T("O"), _T("0"));
    strNumber.Replace(_T("Y"), _T("1"));
    strNumber.Replace(_T("G"), _T("6"));
    strNumber.Replace(_T("A"), _T("4"));
    strNumber.Replace(_T("S"), _T("6"));
    strNumber.Replace(_T("Z"), _T("2"));
    strNumber.Replace(_T("Q"), _T("0"));
    return true;
}

bool NumberDetect::correctCharIfNumber(CString& strChar)
{
    strChar.Replace(_T("8"), _T("B"));
    strChar.Replace(_T("0"), _T("O"));
    strChar.Replace(_T("1"), _T("Y"));
    strChar.Replace(_T("6"), _T("G"));
    strChar.Replace(_T("4"), _T("A"));
    return true;
}

bool NumberDetect::CorrectAsPerIndianStd(CString& strNumPlate)
{
    if (strNumPlate.GetLength() < 9)
    {
        cout << "Number Plate is not standard or OCR has failed to detect single number. Try again\n";
        return false;
    }
    // MH
    CString strStateCode = strNumPlate.Mid(0, 2);
    checkAndCorrectStateCode(strStateCode);
    // 03
    CString strAreaCode = strNumPlate.Mid(2, 2);
    correctNumbertIfChar(strAreaCode);
    if (strNumPlate.GetLength() == 9)
    {
        CString restOFNum = strNumPlate.Mid(4, strNumPlate.GetLength() - 4);
        strNumPlate = strStateCode + strAreaCode + restOFNum;
        return true;
    }
    // AZ
    CString strSerialChar = strNumPlate.Mid(4, 2);
    correctCharIfNumber(strSerialChar);
    // 0572
    CString strSrNum = strNumPlate.Mid(6, 4);
    correctNumbertIfChar(strSrNum);
    strNumPlate = strStateCode + strAreaCode + strSerialChar + strSrNum;
    return true;
}
APPLICATIONS OF ANPR
ANPR is in widespread use among law enforcement agencies across the world. ANPR can become a significant component of predictive policing strategies and intelligence gathering, recovery of stolen vehicles, identification of wanted felons, tracking of tax evaders, and so on. Of course, ANPR also has its uses in regular traffic management.
The real-time or batch-mode processing of license plate numbers can generate data that can be analyzed for better traffic management, surveillance, and other community benefits. Images, dates, times, and GPS coordinates can be stored and can help place a suspect at a scene, aid in witness identification, pattern recognition, or the tracking of individuals.

ANPR can also be used for parking enforcement, tracking vehicles involved in accidents, and tracking and seizure of vehicles with pending traffic fines.
Augmented with BI and Big Data analysis, ANPR can be an effective tool in the hands of the Government to enforce the law, prevent crime, and improve traffic flow, thus improving quality of life.
Appendix A
Function for File Properties
//--- This function writes the file properties to a csv file
int CFaceUtility::ListFileProperty(CString strFolderToClean)
{
    try
    {
        //--- Prepare the csv file to write the result
        CArray<CString> strFileList;
        int pos = strFolderToClean.ReverseFind(_T('\\'));
        CString strCSVFileName = _T("File_Prop_") + strFolderToClean.Mid(pos + 1) + _T(".csv");
        time_t time = GetTickCount();
        TCHAR currDir[MAX_PATH];
        GetDllDirectory(MAX_PATH, currDir);
        SetCurrentDirectory(GetExePath());

        //--- Log start time to measure performance
        time_t time2 = GetTickCount();
        CString strCSVPath = strFolderToClean + _T("\\") + strCSVFileName;
        if (isFolderOrFileExist(strCSVPath))
        {
            cout << "Folder Already Processed. Exiting Batch... \n";
            return -1;
        }
        CFile fNew;
        BOOL bIsSuccess = fNew.Open(strCSVPath, CFile::modeReadWrite | CFile::modeCreate);
        if (bIsSuccess == false)
        {
            //--- Fall back to an alternate csv name (open the csv path, not the folder)
            strCSVPath = strFolderToClean + _T("\\2_") + strCSVFileName;
            fNew.Open(strCSVPath, CFile::modeCreate);
        }

        //--- Get the file list for the directory
        listFiles(strFileList, strFolderToClean);
        int nTotalFiles = strFileList.GetSize();
        if (nTotalFiles == 0)
        {
            cout << "0 File Found \n";
            fNew.Close();
            return -1;
        }

        //--- Header information to put in the csv file
        CString strHeaderFile = _T("UID,WIDTH,HEIGHT,ASPECT_RATIO,STANDARD_SIZE,FILE_SIZE\r\n");
        fNew.Write((LPCTSTR)strHeaderFile, strHeaderFile.GetLength() * sizeof(TCHAR));
        cout << "Processing Started \n";
        CString strFileContent;

        //--- Iterate through each file and get its properties
        for (int nCount = 0; nCount < nTotalFiles; nCount++)
        {
            CString strUIDNumber = strFileList.GetAt(nCount);
            CString strPhotoFIle = strFolderToClean + _T("\\") + strUIDNumber;
            strUIDNumber = strUIDNumber.Mid(0, strUIDNumber.ReverseFind(_T('.')));
            CT2A ascii(strPhotoFIle);

            //--- Load the image file and get its parameters such as width, height, aspect ratio, file size etc.
            IplImage* img = cvLoadImage(ascii.m_psz); //, CV_LOAD_IMAGE_GRAYSCALE
            if (!img || img->imageData == NULL)
            {
                cout << "Can not load image file" << ascii.m_psz;
                continue;
            }
            CFile* f = new CFile(strPhotoFIle, CFile::modeRead);
            long l = f->GetLength();
            f->Close();
            delete f;
            double imageAspect = (double)img->height / (double)img->width;
            long imgSize = l;
            int batchSize = 1000;

            //--- Dump the collected data at the end of every 1000 file count
            if (nCount != 0 && (nCount % batchSize) == 0)
            {
                time2 = GetTickCount() - time2;
                cout << " Time Taken by each file in last " << batchSize << " files is (ms) " << time2 / batchSize << "\n";
                time2 = GetTickCount();
                fNew.Write((LPCTSTR)strFileContent, strFileContent.GetLength() * sizeof(TCHAR));
                strFileContent = "";
            }
            int nIsStandard = 0;
            if (imageAspect >= 1.25 && imageAspect <= 1.339)
            {
                nIsStandard = 1;
            }

            //--- Add the detected parameters
            CString strData = _T("");
            strData.Format(_T("%s,%d,%d,%f,%d,%d\r\n"), strUIDNumber, img->width, img->height, imageAspect, nIsStandard, imgSize);
            strFileContent += strData;
            cvReleaseImage(&img);
        }

        //--- Final dump of data and file close
        fNew.Write((LPCTSTR)strFileContent, strFileContent.GetLength() * sizeof(TCHAR));
        fNew.Close();
        time = GetTickCount() - time;
    }
    catch (Exception* ex2)
    {
        return -1;
    }
    return 0;
}
Function to detect faces and face parts such as nose, eyes, and mouth

//--- This function takes a command and a folder to process
int CFaceUtility::ProcessPhotoFile(CString strCommand, CString strFolderToClean, CString strForcePrcess)
{
    bool bForceProcess = false;
    bool bAddFileInfo = false;
    try
    {
        //--- Check the command
        if (strForcePrcess.Compare(_T("FORCE_PROCESS")) >= 0)
        {
            bForceProcess = true;
        }
        else if (strForcePrcess.Compare(_T("ADD_FILE_INFO")) >= 0)
        {
            bAddFileInfo = true;
        }

        CArray<CString> strFileList;
        CString clean_single("CLEAN_SINGLE");

        //--- Check whether the folder exists
        if (isFolderOrFileExist(strFolderToClean) == false)
        {
            AfxMessageBox(_T("Folder Does not exist"));
            return -1;
        }
        //--- or whether we have the command to process a single folder containing image files
        else if (strCommand.Compare(clean_single))
        {
            AfxMessageBox(_T("Wrong Command line argument"));
            return -1;
        }

        //--- Prepare the output file (CSV)
        int pos = strFolderToClean.ReverseFind(_T('\\'));
        CString strCSVFileName = _T("Result_") + strFolderToClean.Mid(pos + 1) + _T(".csv");
        time_t time = GetTickCount();
        TCHAR currDir[MAX_PATH];
        GetDllDirectory(MAX_PATH, currDir);
        SetCurrentDirectory(GetExePath());

        //--- Load Haar cascade files if not done earlier
        InitilizeFaceUtility();
        time_t time2 = GetTickCount();

        //--- If the output CSV file exists and the force flag is not set, exit (already processed)
        CString strCSVPath = strFolderToClean + _T("\\") + strCSVFileName;
        if (bForceProcess == false && isFolderOrFileExist(strCSVPath))
        {
            cout << "Folder Already Processed. Exiting Batch... \n";
            return -1;
        }
        CFile fNew;
        BOOL bIsSuccess = fNew.Open(strCSVPath, CFile::modeReadWrite | CFile::modeCreate);
        if (bIsSuccess == false)
        {
            //--- Fall back to an alternate csv name (open the csv path, not the folder)
            strCSVPath = strFolderToClean + _T("\\2_") + strCSVFileName;
            fNew.Open(strCSVPath, CFile::modeCreate);
        }

        cout << "File Listing Started \n";
        //--- Get the files from the folder
        listFiles(strFileList, strFolderToClean);
        int nTotalFiles = strFileList.GetSize();
        if (nTotalFiles == 0)
        {
            cout << "0 File Found \n";
            fNew.Close();
            return -1;
        }

        //--- Output header of the CSV file
        CString strHeaderFile = _T("uid,Faces,Face_Alt,Face_Alt2,Nose_Count,EyePair_Count,Mouth_Count,Eye_Glasses,Percentage_cover_1,Percentage_cover_2,Percentage_cover3");
        if (bAddFileInfo == true)
        {
            strHeaderFile += _T(",WIDTH,HEIGHT,ASPECT_RATIO,STANDARD_SIZE,FILE_SIZE");
        }
        strHeaderFile += _T("\r\n");
        fNew.Write((LPCTSTR)strHeaderFile, strHeaderFile.GetLength() * sizeof(TCHAR));
        cout << "Processing Started \n";
        CString strFileContent;

        //--- Parse each file and load it for detection
        for (int nCount = 0; nCount < nTotalFiles; nCount++)
        {
            CString strUIDNumber = strFileList.GetAt(nCount);
            CString strPhotoFIle = strFolderToClean + _T("\\") + strUIDNumber;
            strUIDNumber = strUIDNumber.Mid(0, strUIDNumber.ReverseFind(_T('.')));
            CT2A ascii(strPhotoFIle);
            IplImage* imgOriginal = NULL;
            IplImage* img = NULL;

            //--- Load the image file
            imgOriginal = cvLoadImage(ascii.m_psz); //, CV_LOAD_IMAGE_GRAYSCALE
            if (!imgOriginal || imgOriginal->imageData == NULL)
            {
                cout << "Can not load image file" << ascii.m_psz;
                continue;
            }
            //--- Compute the image area only after the NULL check above
            double imageArea = imgOriginal->width * imgOriginal->height;

            //--- If the user has also asked for file properties, append those as well
            CString strFileInfo = _T("");
            if (bAddFileInfo == true)
            {
                CFile* f = new CFile(strPhotoFIle, CFile::modeRead);
                long lFileSize = f->GetLength();
                f->Close();
                delete f;
                int imageHeight = imgOriginal->height;
                int imageWidth = imgOriginal->width;
                double imageAspect = (double)imageHeight / (double)imageWidth;
                int nIsStandard = 0;
                if (imageAspect >= 1.25 && imageAspect <= 1.339)
                {
                    nIsStandard = 1;
                }
                strFileInfo.Format(_T(",%d,%d,%f,%d,%d"), imageWidth, imageHeight, imageAspect, nIsStandard, lFileSize);
            }

            //--- In case of a non-standard image (wider than 160 px), scale down to 160 wide keeping the aspect ratio
            int imageWidth = imgOriginal->width;
            if (imageWidth > 160)
            {
                double imageAspect = (double)imgOriginal->height / (double)imageWidth;
                int newImageWidth = 160;
                int nNewHeight = (int)((float)newImageWidth * imageAspect);
                img = cvCreateImage(cvSize(newImageWidth, nNewHeight), imgOriginal->depth, imgOriginal->nChannels);
                cvResize(imgOriginal, img);
                cvReleaseImage(&imgOriginal);
                imgOriginal = NULL;
            }
            else
            {
                img = imgOriginal;
                imgOriginal = NULL;
            }

            //--- Using the Haar detection API of OpenCV, detect facial features
            CvSeq* sign = NULL;
            int total_Faces = -1;
            int total_faceAlt = -1;
            int total_faceAlt2 = -1;
            int total_Noses = -1;
            int total_Mouth = -1;
            int total_EyePair = -1;
            int total_EyGlasses = -1;
            try
            {
                sign = cvHaarDetectObjects(img, cascadeFace, storageFace, scaleFactor, minNeighbors, CV_HAAR_DO_CANNY_PRUNING);
                cvClearMemStorage(storageNose);
                if (sign == NULL)
                {
                    continue;
                }
                total_Faces = sign->total;
            }
            catch (exception* ex)
            {
                continue;
            }

            int batchSize = 500;
            if (nCount != 0 && (nCount % batchSize) == 0)
            {
                time2 = GetTickCount() - time2;
                cout << " Time Taken by each file in last " << batchSize << " files is (ms) " << time2 / batchSize << "\n";
                time2 = GetTickCount();
                fNew.Write((LPCTSTR)strFileContent, strFileContent.GetLength() * sizeof(TCHAR));
                strFileContent = "";
            }

            //--- No face detected
            if (total_Faces == 0)
            {
                total_faceAlt = getFaceAlternate(img);
                total_faceAlt2 = getFaceAlternate2(img);
                total_Noses = getNoseCount(img);
                total_EyePair = getEyePairCount(img);
                total_Mouth = getMouthCount(img);
                total_EyGlasses = -1; //--- As of now we don't care about eyeglasses

                //--- Prepare the output result
                CString strData = _T("");
                strData.Format(_T("%s,%d,%d,%d,%d,%d,%d,%d,0,0,0%s\r\n"), strUIDNumber, total_Faces, total_faceAlt, total_faceAlt2, total_Noses, total_EyePair, total_Mouth, total_EyGlasses, strFileInfo);
                strFileContent += strData;
            }
            //--- 1 face detected
            else if (total_Faces == 1)
            {
                int i = 0;
                CvRect r = *(CvRect*)cvGetSeqElem(sign, i);
                double face_area = r.width * r.height;
                total_faceAlt2 = getFaceAlternate2(img);
                total_Noses = getNoseCount(img);
                total_EyePair = getEyePairCount(img);
                total_Mouth = getMouthCount(img);
                total_EyGlasses = -1; //--- As of now we don't care about eyeglasses

                //--- Prepare the output result
                CString strData = _T("");
                strData.Format(_T("%s,%d,%d,%d,%d,%d,%d,%d,%f,0,0%s\r\n"), strUIDNumber, total_Faces, total_faceAlt, total_faceAlt2, total_Noses, total_EyePair, total_Mouth, total_EyGlasses, (face_area * 100 / imageArea), strFileInfo);
                strFileContent += strData;
            }
            //--- More than 1 face detected
            else
            {
                //--- If more than 3 faces are found, cap the count at 3
                if (total_Faces > 3)
                {
                    total_Faces = 3;
                }
                //--- Prepare the output result
                CString strData = _T("");
                strData.Format(_T("%s,%d,%d,%d,%d,%d,%d,%d"), strUIDNumber, total_Faces, total_faceAlt, total_faceAlt2, total_Noses, total_EyePair, total_Mouth, total_EyGlasses);
                strFileContent += strData;
                CString strFacePercantage = _T("");
                int i = 0;
                for (i = 0; i < total_Faces; i++)
                {
                    CvRect r = *(CvRect*)cvGetSeqElem(sign, i);
                    double face_area = r.width * r.height;
                    CString strPercen = _T("");
                    strPercen.Format(_T(",%f"), (face_area * 100 / imageArea));
                    strFacePercantage += strPercen;
                }
                for (int j = i; j < 3; j++)
                {
                    strFacePercantage += _T(",0");
                }
                strFileContent += strFacePercantage + strFileInfo + _T("\r\n");
            }
            cvReleaseImage(&img);
        }

        //--- Dump the result to the filesystem and close the file
        fNew.Write((LPCTSTR)strFileContent, strFileContent.GetLength() * sizeof(TCHAR));
        fNew.Close();
        time = GetTickCount() - time;
        cout << " Time Taken is (ms) " << time << " Average time taken in ms is " << time / nTotalFiles << "\n";
    }
    catch (Exception* ex2)
    {
        return -1;
    }
    return 0;
}