Department of Information Technology Government of Maharashtra
USE OF COMPUTER VISION AND MACHINE LEARNING TECHNOLOGIES IN E-GOVERNANCE A paper on implementation of OpenCV by UID Innovation Labs, Department of Information Technology, Government of Maharashtra
Use of computer vision and machine learning technologies in eGovernance UID Innovation Labs, Government of Maharashtra
P a g e 1 | 39
BACKGROUND
AADHAAR is a 12-digit Unique Identification (UID) number provided by the Unique Identification Authority of India (UIDAI) to empower residents of India. It is one of India's most ambitious eGovernance projects so far, and it envisages issuing UID numbers to all residents of India. As on 15th May 2014, more than 630 million UIDs had been issued.
The State of Maharashtra has been at the forefront of Aadhaar implementation since the project's inception, across all aspects of project implementation, i.e. from enrolment to service delivery. As on 15th May 2014, approximately 9.3 million citizens were enrolled for UID, of which 8.7 million UIDs had been issued.
The Government of Maharashtra (GoM) has deployed fully operational IT infrastructure and solutions to facilitate the use of Aadhaar authentication and eKYC in Government service delivery. The use of eKYC by the Directorate of Government Printing and Stationery (DGPS) in the process of name, religion and date-of-birth changes for citizens, and by the Department of Registrations and Stamps for registration of Leave and License agreements, has changed the paradigm of G2C service delivery.
GoM has also taken various initiatives, such as mobile-based self-seeding and web-based self-seeding, and was the first state to set up a State Resident Data Hub (SRDH) for facilitating Aadhaar-enabled service delivery. The SRDH encompasses the state's resident data: approximately 6.5 million residents' demographic data and photographs are stored in the SRDH, while the online authentication modules link to UIDAI central servers for fingerprint and iris data. In addition, GoM has set up a multidisciplinary technology lab focused on the Aadhaar project, called UID Innovation Labs, where various innovative tools and techniques are developed.
USE OF INNOVATIVE TECHNOLOGIES
With the ambition of creating a highly accurate and reliable resident data hub, GoM decided to conduct UID data-cleansing activities in a proactive manner. As a result, UID Innovation Labs evaluated various scientific tools for data cleansing. One of the challenges during these activities was validating citizens' photographs through an automated process. After evaluating various open source and commercial alternatives, the Open Source Computer Vision library (OpenCV) was adopted as the framework for this objective.
OpenCV is a cross-platform, open source computer vision and machine learning software library aimed at real-time computer vision applications such as imaging, video processing, image processing, HDR imaging, etc. OpenCV contains more than 2,500 optimized algorithms, including state-of-the-art computer vision and machine learning algorithms.
OpenCV 2.4 includes the FaceRecognizer class for face recognition. OpenCV implements a version of the face-detection technique first developed by Paul Viola and Michael Jones, commonly known as the Viola-Jones detector, and later extended by Rainer Lienhart and Jochen Maydt to use diagonal features. OpenCV refers to this detector as the "Haar classifier" because it uses Haar features or, more precisely, Haar-like wavelets that consist of adding and subtracting rectangular image regions before thresholding the result. It is a machine-learning-based approach in which a cascade function is trained from many positive and negative images and is then used to detect faces in similar images.
Face recognition through geometric features of a face, such as the positions of the eyes, ears, nose and lips, is probably the most intuitive approach to face recognition. These positions are used to build a feature vector. A number of face image databases, listed at http://face-rec.org/databases/, can be used for evaluation.
OpenCV ships with a set of pre-trained object-recognition files, but the code also allows you to train and store new object models for the detector. For face detection using Haar cascades, the algorithm initially needs a large number of positive and negative images to train the classifier. Therefore, the first task was to prepare a dataset to train the classifier.
ISSUES WITH SRDH
For any successful service delivery, the complete accuracy of citizens' data and the associated service data must be ensured. During Aadhaar enrolment, a citizen's biometric and demographic data is captured and processed. Biometric data is captured using an iris scanner for the eyes, a fingerprint scanner for the fingers, and a webcam for the photograph of the citizen as a visual identity.
UID being a project of such a large scale, it is evident that, even with the utmost quality assurance of the data-capture process, some data-quality issues will come up. It is also practically very difficult to supervise enrolment at the grassroots level in real time, as the enrolment process is a completely decentralized activity. It has been observed that enrolment operators sometimes click the photograph without taking a look at the captured image, in order to hasten enrolment.
GoM has adopted various tools for data validation and cleansing of demographic data using SAS. Data-quality issues with photographs include photos with very low exposure, photos where only half the face is captured, photos where face attributes are not visible, multiple faces in one frame, closed eyes, and photos without faces. The team also wanted to verify gender and age against the face, though this is a difficult call for the software and needs more manual checking.
IMPLICATION OF INVALID DATA
One of the core offerings of Aadhaar is unique-identity-based service delivery, which is only possible when service-delivery data is integrated with UIDs. While conducting integration (also called seeding) of UID and service data, the source UID data should remain a clean and organized master. Failing this would result in false master data across all service databases post-integration, which can also lead to entitled services not reaching citizens. Further, services like eKYC, which may be used extensively for G2C service delivery, will falter if the data is ambiguous. Unreliable resident data may lead to issues like non-delivery of services to citizens.
EXPECTED OUTCOME
GoM envisages cleansing the UID holders' photographs by applying innovative tools and techniques on an open source platform, to ensure a usable SRDH with data that is as reliable as possible. Identified bad records will then be referred to UIDAI for updates.
Further, USRDH data, after suitable masking/anonymizing, can be utilized to generate training files for entities or research organizations working on face-recognition projects. Currently, no such training files for Indian faces are available in the public domain. These training files (which we are providing in XML) can benefit research in face recognition pertaining to India in a big way.
KEY CHALLENGE
OpenCV has a library of trained files that can be used for object recognition. These trained files are contributed by the worldwide community and, in the case of face recognition, are mainly trained on non-Indian faces. Thus, the use of OpenCV with pre-trained object-recognition files was limited to face detection. The results for age mismatch, gender mismatch and other specific analyses, which required more discrete use of facial features, were not encouraging.
There were no training files available for the Indian face structure, so the need was felt to create training files on Indian faces in order to analyze the database further. These training files can also be shared in the public domain for entities and research organizations working on solutions or research on face recognition in the Indian context.
PROCESS
Face detection (or any object detection) with OpenCV involves two major stages: training and detection. The OpenCV library provides programs (or functions) that are used to train classifiers for its face-detection system, called HaarTraining, which create the training files to be used by the face-detection functions. Some standard training files available in the public domain can also be used.
Face detection is achieved using the OpenCV library function "cvHaarDetectObjects" (refer to http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html). This function takes the image to be analyzed as input, along with various other parameters including the training file location, and returns a sequence of rectangular regions in the given image.
For screening images, the training files available in the public domain for facial features were used. Detection was performed for the nose, mouth, eye pair and smile. In addition, alternate training files for the frontal face were used to validate the findings of the main training file. The following training files (available at www.opencv.org) were used:
OpenCV command-line utilities were used for the detection of bad images and for standardization, processed on a basic computing environment. The initial process was run on 35 million resident data sets, which resulted in processing challenges while handling such a large data set; hence the data sets were divided into 8 separate shards and the process was applied to each shard in parallel.
FACE DETECTION
ProcessPhotoFile: this function was created using the cvHaarDetectObjects function of OpenCV. It analyzes the datasets against the training files, records the features identified for each face, such as face, nose, mouth, eye pair and smile, and writes the results to a CSV file. After this process, the details against each UID are captured in the CSV file. In addition, a ListFileProperty function was created to retrieve image properties such as width and height, aspect ratio, and image size in KB. This function requires the same parameters as the Clean_Single function. (Refer to Appendix A for the code)
"haarcascade_frontalface_default.xml";
"haarcascade_mcs_nose.xml";
"haarcascade_mcs_mouth.xml";
"haarcascade_mcs_eyepair_small.xml";
"haarcascade_frontalface_alt.xml";
"haarcascade_frontalface_alt2.xml";
"haarcascade_smile.xml";
Parameters: dataset from pre-defined locations in an array
Output: a CSV file with the following fields:
uid,Faces,Face_Alt,Face_Alt2,Nose_Count,EyePair_Count,Mouth_Count,Eye_Glasses,Percentage_cover_1,Percentage_cover_2,Percentage_cover_3
GRANULAR APPROACH:
In order to detect a face, it must be ensured that the relevant facial features are detected and flagged appropriately. Accordingly, a number of iterations were processed for each image, one for each facial-feature detection. The following processes were conducted step by step on each image.
Figure 1. Iterations for Effective Face Detection
An image was considered valid when the result showed one face, one eye pair, one nose, at least one mouth, and at least 60% face coverage. The following process depicts the detailed steps:
OpenCV provides various Haar-cascade files for the various elements of face detection in an image. These cascades are located in the data folder of the OpenCV installation.
The main function that detects faces is cvHaarDetectObjects, which was called by our function ProcessPhotoFile.
Parameters: dataset from pre-defined locations in an array
Output: a CSV file with the following fields:
UID,WIDTH,HEIGHT,ASPECT_RATIO,FILE_SIZE
DETECTING FACES
OpenCV provides various cascade files for face detection in images. The Haar-cascade training XML files are used to detect full frontal faces, with the getFaceFeature() method detecting the faces. The primary objective is to ensure that images contain faces and are not blank and do not contain any other object. Further, the face should have at least 60% coverage in the image.
Sample code snippets:
const char * XML_FILE_FACE = "haarcascade_frontalface_default.xml";
CvHaarClassifierCascade *cascadeFace =
    (CvHaarClassifierCascade*)cvLoad( XML_FILE_FACE, 0, 0, 0 );

//--- Detect facial features using the Haar detection API of OpenCV
CvSeq * sign = cvHaarDetectObjects( img, cascadeFace, storageFace,
    scaleFactor, minNeighbors, CV_HAAR_DO_CANNY_PRUNING );

//--- Parameters:
//--- img (IplImage)             - image to which object detection is applied
//--- cascadeFace (CvHaarClassifierCascade) - Haar-cascade file
//--- storageFace (CvMemStorage) - storage for the detected objects
//--- scaleFactor (double) = 1.3
//--- minNeighbors (int)   = 3
//--- CV_HAAR_DO_CANNY_PRUNING   - flag defined by OpenCV
Used Haar-Cascades:
"haarcascade_frontalface_default.xml";
"haarcascade_frontalface_alt.xml";
"haarcascade_frontalface_alt2.xml";
Note: Haar-cascades for face detection are located in the data folder of the OpenCV installation.
CList<CFaceFeature> featureDetails;
cFaceUtility.getFaceFeature(CFaceFeature::Face,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
cFaceUtility.getFaceFeature(CFaceFeature::FaceAlternate1,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
cFaceUtility.getFaceFeature(CFaceFeature::FaceAlternate2,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
Output: The getFaceFeature() method returns "1" if a face is detected in the image.
DETECTING NOSE
Similarly, OpenCV provides various cascade files for nose detection in images.
Sample code snippets:
cFaceUtility.getFaceFeature(CFaceFeature::NosePosition,
_T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"),featureDetails);
Output: The getFaceFeature() method returns "1" if a nose is detected in the image.
Used Haar-Cascades:
"haarcascade_mcs_nose.xml";
DETECTING EYES
The following Haar-cascade was used for eye detection in images.
Sample code snippets:
cFaceUtility.getFaceFeature(CFaceFeature::EyePosition,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
cFaceUtility.getFaceFeature(CFaceFeature::SmallEyPosition,
    _T("F:\\UIDSplitedFiles\\XXXXXXX6486.jpeg"), featureDetails);
Output: The getFaceFeature() method returns "1" if eyes are detected in the image.
DETECTING MOUTH
The following Haar-cascade was used for mouth detection in images. However, limitations were observed in this process, as the method sometimes identified the eye area as a shape close to a mouth. Therefore, more than one detected mouth instance was still considered a valid image.
Sample code snippets:
Used Haar-Cascades:
"haarcascade_mcs_eyepair_small.xml";
Used Haar-Cascades:
"haarcascade_mcs_mouth.xml";
int CFaceUtility::getMouthCount(IplImage* object)
{
    try
    {
        CvSeq * sign = cvHaarDetectObjects(object, cascadeMouth, storageMouth,
            scaleFactor, minNeighbors, CV_HAAR_DO_CANNY_PRUNING);
        int total_Count = sign->total;   // read the count before clearing the
        cvClearMemStorage(storageMouth); // storage that backs the sequence
        //cvFree(&sign);
        return total_Count;
    }
    catch(exception * ex)
    {
    }
    return 0;
}
Output: The getMouthCount() method returns the number of mouth regions detected in the image.
Once all the features are marked as "1", the image can be inferred to be a good image. Using the OpenCV process it has thus been ascertained that the image has one face, one eye pair, one nose, one mouth, and at least 60% face coverage, while images marked "0" for any of the given parameters are identified as bad images. Some of the identified bad images are as follows.
ORGANIC VALIDATION PROCESS
All identified bad images are manually scrutinized through a Photo Validation tool developed by UID
Innovation Lab where Photo Validation operators manually check each identified image in the tool and
flag them either ‘Accept’ or ‘Reject’. Following two stage process of checker and verifier, once images
flagged as ‘Reject’, the same is flagged for updating.
IMAGE STANDARDIZATION
The dataset contains photographs with image sizes ranging from 1 KB to 186 KB. On review, images of less than 4 KB showed very bad quality, and the records for such images were omitted from the dataset.
To ensure uniformity of the images in the Usable Data Hub, an image-standardization process is applied to all non-uniform images. The uniform standard for the images is: width 160 pixels, aspect ratio 1.25 to 1.33, and face coverage of 60%.
The photo-standardization process optimizes, crops and adjusts the images to match these specifications. Cropping is done carefully to remove whitespace and not the face. Maintaining the aspect ratio is also important, as it prevents distortion of the images, and a uniform aspect ratio ensures uniform output when the images are used in an application. The following images illustrate the results of the standardization process, and the following code was implemented for it:
if(strCommand.Compare(_T("STANDARD_PHOTO")) >= 0)
{
    cFaceUtility.createStandardAndBWImages(argv[2], argv[3],
        CFaceUtility::Size_200X160, true, false);
}
IMPACT OF CLEANSING ACTIVITY USING OPENCV
Out of 50 million data sets analyzed using OpenCV, 1.14 million images were flagged by the software. These images went through the photo-validation process, with two data-entry operators independently reviewing each image and a supervisor then manually going through their suggestions. Overall, about 12,000 images were found to be unacceptable after the photo-validation process. Thus, manual effort was reduced almost 50 times by this specialized software.
The process identified around 2.2% of images as suspect. Post manual photo validation, it was found that 0.024% of images were not usable.
CASCADE HAAR TRAINING FOR CLASSIFIER
The earlier exercise was based on the training files available in the public domain (in the OpenCV library). These files were mainly contributed by worldwide communities and trained on non-Indian faces; there were no training files available for Indian faces.
With an extensive database of Indian images, the following process was initiated for generating Haar training files for Indian faces.
As a first case, the age group 0-10 was selected to generate the training vector file. Good-quality images from this age group were selected from the USRDH database. It was envisaged that, using these training files, the FaceDetect process would identify child images (in any Indian database) from a set of images and flag the rest as 'Reject'. This training vector file can be used by entities and research organizations to optimize their solutions and utilize it in their applications.
As part of the training, a significant number of images was selected, and the number of images was increased in incremental iterations for generating the training file. In all, after four iterations, a reliable training vector file was generated.
The input sample and results are depicted below:
SAMPLE:
ITERATIONS:
ITERATION 1
A positive image set of 450, with marked Regions of Interest (RoI), was provided for Haar training. A negative image set of 700 was provided; these were mainly blank images and did not contain persons.
Accepted Image
Results
Rejected Image
Results
Conclusion: Post training, OpenCV rejected images of very old residents; however, it accepted adult images.
ITERATION 2
The positive images (0-10 years) were increased from 450 to 1000, and 1000 negative images were used, including some adult images.
Accepted Image
Results
Rejected Image
Results
Conclusion: After the addition of adult images to the negative set, OpenCV rejected images of very old residents as well as some adult images. However, it still accepted some of the adult images.
ITERATION 3
In this iteration a complete set of 1000 adult images was used to train OpenCV.
It was found that false matches were reduced, and the selection ratio and detection of children increased, with some exceptions. The rejected set included some child images as well.
Accepted Image
Results as a child
Rejected Image
Results
ITERATION 4
The number of positive images was increased to around 1500, keeping the negative set at 1000 images. This should have shown better accuracy in identifying child images; however, no significant improvement was observed over iteration 3.
Conclusion: Based on the results of all the iterations, the training vector generated from iteration 3 was taken as the optimum training file to identify bad images (that is, demographic age mismatching the photo).
AUTOMATIC NUMBER PLATE RECOGNITION
Automatic number plate recognition (ANPR) uses optical character recognition on images to read vehicle registration plates. ANPR systems can use existing closed-circuit television or road-rule enforcement cameras, or cameras specifically designed for the task. ANPR can be used by police forces for traffic enforcement, identification of stolen vehicles, surveillance, etc. It can also be used for electronic toll collection on pay-per-use roads and for cataloguing the movements of traffic or individuals.
AUTOMATIC NUMBER PLATE RECOGNITION USING OPENCV
UID Innovation Labs conducts various research projects and use cases as part of its eGovernance initiatives. After the use of OpenCV for UID face recognition, the team also attempted automatic number plate recognition (ANPR) using OpenCV. As a concept, ANPR detects vehicle numbers from images and footage. ANPR carries immense applicability, adaptability and potential for use in various monitoring and mass-surveillance systems for governments. The algorithm used by UID Innovation Labs performs image segmentation and feature extraction, applying pattern-recognition basics, Support Vector Machines and K-Nearest Neighbors.
In order to implement ANPR, the first task was to train the software on Indian number plates. The following code was used to load the training samples and learn the K-nearest data from them. As the Hon'ble Supreme Court of India has given directions for standardizing vehicle number plates across the country, only the standardized template or its close variants were used to train the software.
NewOCROpenCV newOCROpenCV;

// Load the training samples and learn the K-nearest data from them
newOCROpenCV.setSize(15, 25, 25);
newOCROpenCV.LearnFromImages("<PATH>\\NumberPlateDetection\\OCRData25");
// OCRData25 folder contains the training images

//--- The main function to learn from training images
void NewOCROpenCV::LearnFromImages(char * pathToTraingImages)
{
    if(showSteps == true)
    {
        namedWindow("single", CV_WINDOW_AUTOSIZE);
        namedWindow("all", CV_WINDOW_AUTOSIZE);
    }
    CvMat* trainData = cvCreateMat(classes * nNumberOfSamples, nImageSize, CV_32FC1);
    CvMat* trainClasses = cvCreateMat(classes * nNumberOfSamples, 1, CV_32FC1);
    Mat img;
    char file[255];
    //--- Read the images one by one
    for (int j = 0; j < nNumberOfSamples; j++)
    {
        for (int i = 0; i < classes; i++)
        {
            sprintf(file, "%s\\%d\\%c.jpg", pathToTraingImages, j, strTrainCharacters[i]);
            img = imread(file, 1);
            if (!img.data)
            {
                cout << "File " << file << " not found\n";
                exit(1);
            }
            //--- Process the image and extract the training data from it
            Mat outfile;
            PreProcessImage(&img, &outfile, nSizex, nSizey);
            int idx = j * classes + i; // row index for this (sample, class) pair
            for (int n = 0; n < nImageSize; n++)
            {
                trainData->data.fl[idx * nImageSize + n] = outfile.data[n];
            }
            trainClasses->data.fl[idx] = (float)i;
        }
    }
    if(knearest != NULL)
    {
        delete knearest;
        knearest = NULL;
    }
    knearest = new KNearest(trainData, trainClasses);
}

The training code above writes the trained data to an output file.

PLATE DETECTION AND RECOGNITION
After training and creating the output file, the plate-detection and recognition functions were defined. Plate detection has the purpose of locating the plate within the whole camera frame. When a plate is detected in an image, the plate segment is passed to the second step, plate recognition, which uses an OCR algorithm to determine the alphanumeric characters on the plate.
In the next figure we can see the two main algorithm steps, plate detection and plate recognition. After these steps the program draws the detected plate characters over the camera frame. The algorithms can return bad results or even no result.
In each step shown in the previous figure, three additional steps commonly used in pattern-recognition algorithms are defined:
1. Segmentation: this step detects and extracts each patch/region of interest in the image.
2. Feature extraction: this step extracts a set of characteristics from each patch.
3. Classification: this step classifies each character in the plate-recognition step, or classifies each image patch as "plate" or "no plate" in the plate-detection step.
Training is critical for pattern-recognition algorithms. If the system is not trained effectively, ANPR may not work correctly, as different patterns need different types of training and evaluation. The system needs to be evaluated in different environments and conditions, and with different features, to get the best results.
PLATE DETECTION
To detect the plate, segmentation and segment classification of the image are performed. In segmentation, different filters, morphological operations, contour algorithms and validations are applied to retrieve those parts of the image that could contain a plate. In segment classification, a Support Vector Machine (SVM) classifier from OpenCV is applied to each image patch.
Before creating the main application, two different classes, plate and non-plate, were trained. The process used parallel frontal-view colour images that are 800 pixels wide and taken 2-4 metres from the car. These requirements are important to ensure correct segmentation.
The following image shows all the processes involved in plate detection:
Sobel filter
Threshold operation
Close morphologic operation
Mask of one filled area
Possible detected plates marked in red (features images)
Detected plates after the SVM classifier
SEGMENTATION
The second part is to divide the image into multiple segments, in order to simplify it for analysis and make further extraction easier. One important feature for plate segmentation is the high number of vertical edges in a license plate, assuming that the image was taken frontally and that the plate is not rotated and has no perspective distortion.
CLASSIFICATION
After preprocessing and segmenting all possible parts of the image, each segment needs to be classified as a license plate or not. To do this, a Support Vector Machine (SVM) algorithm is used.
A Support Vector Machine is a pattern-recognition algorithm from the family of supervised-learning algorithms originally created for binary classification. Supervised learning refers to machine-learning algorithms that learn from labeled data: the algorithm is trained with a labeled dataset in which each sample has a class. The SVM creates one or more hyperplanes that are used to discriminate between the classes of the data.
TRAINING THE CLASSIFIER
The first task before any classification is to train our classifier; this job is done before the main application starts and is called offline training. It requires a sufficient amount of data to train the system, but a bigger dataset does not always imply better results. In the current case, enough data was not available because there are no public license-plate databases; as a result, hundreds of car photos were taken and then all of them were preprocessed and segmented.
More than 75 license-plate images and 35 images without license plates, each 144 x 33 pixels, were used for training. A sample of this data is shown in the following image. This is sufficient to get decent results within the scope of the experiment.
PLATE RECOGNITION
This step aims to retrieve the characters of the license plate using optical character recognition (OCR). For each detected plate, we segment the plate into its characters and use an Artificial Neural Network (ANN) machine-learning algorithm to recognize each character.
OCR SEGMENTATION
The plate image patch is taken as the input to the OCR segmentation function with an equalized histogram. A threshold filter is then applied, and the thresholded image is used as the input to a find-contours algorithm; this process is shown in the next figure:
FEATURE EXTRACTION
The next step for each segmented character is to extract the features for training and classifying the Artificial Neural Network algorithm. Unlike the plate-detection feature-extraction step used with the SVM, not all of the image pixels are used; instead, more common OCR features are applied: horizontal and vertical accumulation histograms and a low-resolution image sample.
OCR CLASSIFICATION
In the classification step, we use an Artificial Neural Network machine-learning algorithm. More
specifically, a Multi-Layer Perceptron (MLP), which is the most commonly used ANN algorithm. MLP
consists of a network of neurons with an input layer, output layer, and one or more hidden layers. Each
layer has one or more neurons connected with the previous and next layer.
For our sample, we use only one hidden layer; we then define a matrix of 1 row and 3 columns. The first column position is the number of features, the second is the number of neurons in the hidden layer, and the third is the number of classes.
OpenCV defines a CvANN_MLP class for ANN. With the create function, we can initialize the class by defining the number of layers and neurons, the activation function, and the alpha and beta parameters.
In OpenCV, the CvANN_MLP class uses the predict function for classifying a feature vector into a class. Unlike the SVM classify function, the ANN's predict function returns a row whose size equals the number of classes, containing the probability that the input feature vector belongs to each class.
To get the best result, we can use the minMaxLoc function to get the maximum and minimum responses and their positions in the matrix. The class of our character is given by the x position of the highest value:
To finish, for each detected plate we order its characters and return a string using the str() function of the Plate class, and we can draw it on the original image:
MODIFICATION TO HANDLE ‘MH’ NUMBERS
The output of the above process is the string on the number plate. However, some intelligence to identify Maharashtra vehicles was built and successfully tested. The returned strings were often ambiguous number plates, which were then cleaned to retrieve valid MH number plates.
string licensePlate = plate.str();
rectangle(input_image, plate.position, Scalar(0,0,200));
putText(input_image, licensePlate, Point(plate.position.x, plate.position.y), CV_FONT_HERSHEY_SIMPLEX, 1, Scalar(0,0,200), 2);
The correction algorithm is based on analyzing the recognized string and correcting the identified characters to logical MH numbers.
bool NumberDetect::checkAndCorrectStateCode(CString& strStateCode)
{
    CString strNewCode = _T("");
    if (strStateCode.Mid(0,1).Compare(_T("M")) != 0)
    {
        strNewCode = _T("MH");
        strStateCode = strNewCode;
        return true;
    }
    else if (strStateCode.Mid(1,1).Compare(_T("H")) != 0)
    {
        strNewCode = _T("MH");
        strStateCode = strNewCode;
        return true;
    }
    return false;
}

bool NumberDetect::correctNumbertIfChar(CString& strNumber)
{
    strNumber.Replace(_T("B"), _T("8"));
    strNumber.Replace(_T("O"), _T("0"));
    strNumber.Replace(_T("Y"), _T("1"));
    strNumber.Replace(_T("G"), _T("6"));
    strNumber.Replace(_T("A"), _T("4"));
    strNumber.Replace(_T("S"), _T("6"));
    strNumber.Replace(_T("Z"), _T("2"));
    strNumber.Replace(_T("Q"), _T("0"));
    return true;
}

bool NumberDetect::correctCharIfNumber(CString& strChar)
{
    strChar.Replace(_T("8"), _T("B"));
    strChar.Replace(_T("0"), _T("O"));
    strChar.Replace(_T("1"), _T("Y"));
    strChar.Replace(_T("6"), _T("G"));
    strChar.Replace(_T("4"), _T("A"));
    return true;
}

bool NumberDetect::CorrectAsPerIndianStd(CString& strNumPlate)
{
    if (strNumPlate.GetLength() < 9)
    {
        cout << "Number Plate is not standard or OCR has failed to detect single number. Try again\n";
        return false;
    }
    // MH
    CString strStateCode = strNumPlate.Mid(0, 2);
    checkAndCorrectStateCode(strStateCode);
    // 03
    CString strAreaCode = strNumPlate.Mid(2, 2);
    correctNumbertIfChar(strAreaCode);
    if (strNumPlate.GetLength() == 9)
    {
        CString restOFNum = strNumPlate.Mid(4, strNumPlate.GetLength() - 4);
        strNumPlate = strStateCode + strAreaCode + restOFNum;
        return true;
    }
    // AZ
    CString strSerialChar = strNumPlate.Mid(4, 2);
    correctCharIfNumber(strSerialChar);
    // 0572
    CString strSrNum = strNumPlate.Mid(6, 4);
    correctNumbertIfChar(strSrNum);
    strNumPlate = strStateCode + strAreaCode + strSerialChar + strSrNum;
    return true;
}
APPLICATIONS OF ANPR
ANPR is in widespread use among law enforcement agencies across the world. ANPR can become a significant component of predictive policing strategies and intelligence gathering, recovery of stolen vehicles, identification of wanted felons, tracking of tax evaders, and so on. Of course, ANPR also has its uses in regular traffic management.
The real-time or batch-mode processing of license plate numbers can generate data that can be analyzed for better traffic management, surveillance, and other community benefits. Images, dates, times, and GPS coordinates can be stored and can help place a suspect at a scene, aid in witness identification, pattern recognition, or the tracking of individuals.

ANPR can also be used for parking enforcement, tracking vehicles involved in accidents, and tracking and seizure of vehicles with pending traffic fines.
Augmented with BI and Big Data analysis, ANPR can be an effective tool in the hands of the Government to enforce the law, prevent crime, and improve traffic flow, thus improving quality of life.
Appendix A
Function for File Properties
//--- This function writes the file properties to a csv file
int CFaceUtility::ListFileProperty(CString strFolderToClean)
{
    try
    {
        //--- Prepare the csv file to write the result
        CArray<CString> strFileList;
        int pos = strFolderToClean.ReverseFind(_T('\\'));
        CString strCSVFileName = _T("File_Prop_") + strFolderToClean.Mid(pos + 1) + _T(".csv");
        time_t time = GetTickCount();
        TCHAR currDir[MAX_PATH];
        GetDllDirectory(MAX_PATH, currDir);
        SetCurrentDirectory(GetExePath());

        //--- Log start time to measure performance
        time_t time2 = GetTickCount();
        CString strCSVPath = strFolderToClean + _T("\\") + strCSVFileName;
        if (isFolderOrFileExist(strCSVPath))
        {
            cout << "Folder Already Processed. Exiting Batch... \n";
            return -1;
        }
        CFile fNew;
        BOOL bIsSuccess = fNew.Open(strCSVPath, CFile::modeReadWrite | CFile::modeCreate);
        if (bIsSuccess == false)
        {
            //--- Fall back to an alternate csv name (open the csv path, not the folder)
            strCSVPath = strFolderToClean + _T("\\2_") + strCSVFileName;
            fNew.Open(strCSVPath, CFile::modeCreate);
        }

        //--- Get the file list for the directory
        listFiles(strFileList, strFolderToClean);
        int nTotalFiles = strFileList.GetSize();
        if (nTotalFiles == 0)
        {
            cout << "0 File Found \n";
            fNew.Close();
            return -1;
        }

        //--- Header information to put in the csv file
        CString strHeaderFile = _T("UID,WIDTH,HEIGHT,ASPECT_RATIO,STANDARD_SIZE,FILE_SIZE\r\n");
        fNew.Write((LPCTSTR)strHeaderFile, strHeaderFile.GetLength() * sizeof(TCHAR));
        cout << "Processing Started \n";
        CString strFileContent;

        //--- Iterate through each file and get its properties
        for (int nCount = 0; nCount < nTotalFiles; nCount++)
        {
            CString strUIDNumber = strFileList.GetAt(nCount);
            CString strPhotoFIle = strFolderToClean + _T("\\") + strUIDNumber;
            strUIDNumber = strUIDNumber.Mid(0, strUIDNumber.ReverseFind(_T('.')));
            CT2A ascii(strPhotoFIle);

            //--- Load the image file and get its parameters such as width, height, aspect ratio, file size etc.
            IplImage* img = cvLoadImage(ascii.m_psz); //, CV_LOAD_IMAGE_GRAYSCALE
            if (!img || img->imageData == NULL)
            {
                cout << "Can not load image file" << ascii.m_psz;
                continue;
            }
            CFile* f = new CFile(strPhotoFIle, CFile::modeRead);
            long l = f->GetLength();
            f->Close();
            delete f;
            double imageAspect = (double)img->height / (double)img->width;
            long imgSize = l;
            int batchSize = 1000;

            //--- Dump the collected data at the end of every 1000 file count
            if (nCount != 0 && (nCount % batchSize) == 0)
            {
                time2 = GetTickCount() - time2;
                cout << " Time Taken by each file in last " << batchSize << " files is (ms) " << time2 / batchSize << "\n";
                time2 = GetTickCount();
                fNew.Write((LPCTSTR)strFileContent, strFileContent.GetLength() * sizeof(TCHAR));
                strFileContent = "";
            }
            int nIsStandard = 0;
            if (imageAspect >= 1.25 && imageAspect <= 1.339)
            {
                nIsStandard = 1;
            }

            //--- Add the detected parameters
            CString strData = _T("");
            strData.Format(_T("%s,%d,%d,%f,%d,%d\r\n"), strUIDNumber, img->width, img->height, imageAspect, nIsStandard, imgSize);
            strFileContent += strData;
            cvReleaseImage(&img);
        }

        //--- Final dump of data and file close
        fNew.Write((LPCTSTR)strFileContent, strFileContent.GetLength() * sizeof(TCHAR));
        fNew.Close();
        time = GetTickCount() - time;
    }
    catch (Exception* ex2)
    {
        return -1;
    }
    return 0;
}
Function to detect faces and face parts such as nose, eyes, and mouth

//--- This function takes a command and a folder to process
int CFaceUtility::ProcessPhotoFile(CString strCommand, CString strFolderToClean, CString strForcePrcess)
{
    bool bForceProcess = false;
    bool bAddFileInfo = false;
    try
    {
        //--- Check the command
        if (strForcePrcess.Compare(_T("FORCE_PROCESS")) >= 0)
        {
            bForceProcess = true;
        }
        else if (strForcePrcess.Compare(_T("ADD_FILE_INFO")) >= 0)
        {
            bAddFileInfo = true;
        }

        CArray<CString> strFileList;
        CString clean_single("CLEAN_SINGLE");

        //--- Check whether the folder exists
        if (isFolderOrFileExist(strFolderToClean) == false)
        {
            AfxMessageBox(_T("Folder Does not exist"));
            return -1;
        }
        //--- or whether we have the command to process a single folder containing image files
        else if (strCommand.Compare(clean_single))
        {
            AfxMessageBox(_T("Wrong Command line argument"));
            return -1;
        }

        //--- Prepare the output file (CSV)
        int pos = strFolderToClean.ReverseFind(_T('\\'));
        CString strCSVFileName = _T("Result_") + strFolderToClean.Mid(pos + 1) + _T(".csv");
        time_t time = GetTickCount();
        TCHAR currDir[MAX_PATH];
        GetDllDirectory(MAX_PATH, currDir);
        SetCurrentDirectory(GetExePath());

        //--- Load Haar cascade files if not done earlier
        InitilizeFaceUtility();
        time_t time2 = GetTickCount();

        //--- If the output CSV file exists and the force flag is not set, exit (already processed)
        CString strCSVPath = strFolderToClean + _T("\\") + strCSVFileName;
        if (bForceProcess == false && isFolderOrFileExist(strCSVPath))
        {
            cout << "Folder Already Processed. Exiting Batch... \n";
            return -1;
        }
        CFile fNew;
        BOOL bIsSuccess = fNew.Open(strCSVPath, CFile::modeReadWrite | CFile::modeCreate);
        if (bIsSuccess == false)
        {
            //--- Fall back to an alternate csv name (open the csv path, not the folder)
            strCSVPath = strFolderToClean + _T("\\2_") + strCSVFileName;
            fNew.Open(strCSVPath, CFile::modeCreate);
        }

        cout << "File Listing Started \n";
        //--- Get the files from the folder
        listFiles(strFileList, strFolderToClean);
        int nTotalFiles = strFileList.GetSize();
        if (nTotalFiles == 0)
        {
            cout << "0 File Found \n";
            fNew.Close();
            return -1;
        }

        //--- Output header of the CSV file
        CString strHeaderFile = _T("uid,Faces,Face_Alt,Face_Alt2,Nose_Count,EyePair_Count,Mouth_Count,Eye_Glasses,Percentage_cover_1,Percentage_cover_2,Percentage_cover3");
        if (bAddFileInfo == true)
        {
            strHeaderFile += _T(",WIDTH,HEIGHT,ASPECT_RATIO,STANDARD_SIZE,FILE_SIZE");
        }
        strHeaderFile += _T("\r\n");
        fNew.Write((LPCTSTR)strHeaderFile, strHeaderFile.GetLength() * sizeof(TCHAR));
        cout << "Processing Started \n";
        CString strFileContent;

        //--- Parse each file and load it for detection
        for (int nCount = 0; nCount < nTotalFiles; nCount++)
        {
            CString strUIDNumber = strFileList.GetAt(nCount);
            CString strPhotoFIle = strFolderToClean + _T("\\") + strUIDNumber;
            strUIDNumber = strUIDNumber.Mid(0, strUIDNumber.ReverseFind(_T('.')));
            CT2A ascii(strPhotoFIle);
            IplImage* imgOriginal = NULL;
            IplImage* img = NULL;

            //--- Load the image file
            imgOriginal = cvLoadImage(ascii.m_psz); //, CV_LOAD_IMAGE_GRAYSCALE
            if (!imgOriginal || imgOriginal->imageData == NULL)
            {
                cout << "Can not load image file" << ascii.m_psz;
                continue;
            }
            //--- Compute the image area only after the NULL check above
            double imageArea = imgOriginal->width * imgOriginal->height;

            //--- If the user has also asked for file properties, append those as well
            CString strFileInfo = _T("");
            if (bAddFileInfo == true)
            {
                CFile* f = new CFile(strPhotoFIle, CFile::modeRead);
                long lFileSize = f->GetLength();
                f->Close();
                delete f;
                int imageHeight = imgOriginal->height;
                int imageWidth = imgOriginal->width;
                double imageAspect = (double)imageHeight / (double)imageWidth;
                int nIsStandard = 0;
                if (imageAspect >= 1.25 && imageAspect <= 1.339)
                {
                    nIsStandard = 1;
                }
                strFileInfo.Format(_T(",%d,%d,%f,%d,%d"), imageWidth, imageHeight, imageAspect, nIsStandard, lFileSize);
            }

            //--- In case of a non-standard image (wider than 160 px), scale down to 160 wide keeping the aspect ratio
            int imageWidth = imgOriginal->width;
            if (imageWidth > 160)
            {
                double imageAspect = (double)imgOriginal->height / (double)imageWidth;
                int newImageWidth = 160;
                int nNewHeight = (int)((float)newImageWidth * imageAspect);
                img = cvCreateImage(cvSize(newImageWidth, nNewHeight), imgOriginal->depth, imgOriginal->nChannels);
                cvResize(imgOriginal, img);
                cvReleaseImage(&imgOriginal);
                imgOriginal = NULL;
            }
            else
            {
                img = imgOriginal;
                imgOriginal = NULL;
            }

            //--- Using the Haar detection API of OpenCV, detect facial features
            CvSeq* sign = NULL;
            int total_Faces = -1;
            int total_faceAlt = -1;
            int total_faceAlt2 = -1;
            int total_Noses = -1;
            int total_Mouth = -1;
            int total_EyePair = -1;
            int total_EyGlasses = -1;
            try
            {
                sign = cvHaarDetectObjects(img, cascadeFace, storageFace, scaleFactor, minNeighbors, CV_HAAR_DO_CANNY_PRUNING);
                cvClearMemStorage(storageNose);
                if (sign == NULL)
                {
                    continue;
                }
                total_Faces = sign->total;
            }
            catch (exception* ex)
            {
                continue;
            }

            int batchSize = 500;
            if (nCount != 0 && (nCount % batchSize) == 0)
            {
                time2 = GetTickCount() - time2;
                cout << " Time Taken by each file in last " << batchSize << " files is (ms) " << time2 / batchSize << "\n";
                time2 = GetTickCount();
                fNew.Write((LPCTSTR)strFileContent, strFileContent.GetLength() * sizeof(TCHAR));
                strFileContent = "";
            }

            //--- No face detected
            if (total_Faces == 0)
            {
                total_faceAlt = getFaceAlternate(img);
                total_faceAlt2 = getFaceAlternate2(img);
                total_Noses = getNoseCount(img);
                total_EyePair = getEyePairCount(img);
                total_Mouth = getMouthCount(img);
                total_EyGlasses = -1; //--- As of now we don't care about eyeglasses

                //--- Prepare the output result
                CString strData = _T("");
                strData.Format(_T("%s,%d,%d,%d,%d,%d,%d,%d,0,0,0%s\r\n"), strUIDNumber, total_Faces, total_faceAlt, total_faceAlt2, total_Noses, total_EyePair, total_Mouth, total_EyGlasses, strFileInfo);
                strFileContent += strData;
            }
            //--- 1 face detected
            else if (total_Faces == 1)
            {
                int i = 0;
                CvRect r = *(CvRect*)cvGetSeqElem(sign, i);
                double face_area = r.width * r.height;
                total_faceAlt2 = getFaceAlternate2(img);
                total_Noses = getNoseCount(img);
                total_EyePair = getEyePairCount(img);
                total_Mouth = getMouthCount(img);
                total_EyGlasses = -1; //--- As of now we don't care about eyeglasses

                //--- Prepare the output result
                CString strData = _T("");
                strData.Format(_T("%s,%d,%d,%d,%d,%d,%d,%d,%f,0,0%s\r\n"), strUIDNumber, total_Faces, total_faceAlt, total_faceAlt2, total_Noses, total_EyePair, total_Mouth, total_EyGlasses, (face_area * 100 / imageArea), strFileInfo);
                strFileContent += strData;
            }
            //--- More than 1 face detected
            else
            {
                //--- If more than 3 faces are found, cap the count at 3
                if (total_Faces > 3)
                {
                    total_Faces = 3;
                }
                //--- Prepare the output result
                CString strData = _T("");
                strData.Format(_T("%s,%d,%d,%d,%d,%d,%d,%d"), strUIDNumber, total_Faces, total_faceAlt, total_faceAlt2, total_Noses, total_EyePair, total_Mouth, total_EyGlasses);
                strFileContent += strData;
                CString strFacePercantage = _T("");
                int i = 0;
                for (i = 0; i < total_Faces; i++)
                {
                    CvRect r = *(CvRect*)cvGetSeqElem(sign, i);
                    double face_area = r.width * r.height;
                    CString strPercen = _T("");
                    strPercen.Format(_T(",%f"), (face_area * 100 / imageArea));
                    strFacePercantage += strPercen;
                }
                for (int j = i; j < 3; j++)
                {
                    strFacePercantage += _T(",0");
                }
                strFileContent += strFacePercantage + strFileInfo + _T("\r\n");
            }
            cvReleaseImage(&img);
        }

        //--- Dump the result to the filesystem and close the file
        fNew.Write((LPCTSTR)strFileContent, strFileContent.GetLength() * sizeof(TCHAR));
        fNew.Close();
        time = GetTickCount() - time;
        cout << " Time Taken is (ms) " << time << " Average time taken in ms is " << time / nTotalFiles << "\n";
    }
    catch (Exception* ex2)
    {
        return -1;
    }
    return 0;
}