AADHAR Card- Database Creation

Creating A Unique Identity For Every Resident

Under the guidance of
Dr. T. Nambirajan
Professor
Department of Management Studies
School of Management

By Group 8
Basil John
Panchami
Saritha
Giridharan
Kabilan
Joel Joseph

2

• Collection of interrelated data
• Set of programs to access the data
• A DBMS contains information about a particular enterprise
• A DBMS provides an environment that is both convenient and efficient to use
• Database management systems were developed to handle the difficulties of typical file-processing systems supported by conventional operating systems

3

The Unique ID initiative

4

Principles of Aadhaar

One-time standardized Aadhaar enrollment establishes uniqueness of resident via ‘biometric de-duplication’

• Only one Aadhaar number per eligible individual

Online authentication is provided by UIDAI
• Demographic data (Name, Address, DOB, Gender)
• Biometric data (Fingerprint)

Aadhaar, subject to online authentication, is proof of ID
• Aadhaar enrolment / update = KYC
• Aadhaar number issued and stored in the authentication server
• Authentication = "verification" of KYC

5

Features of Aadhaar

1. Aadhaar is a 12-digit number – no cards
2. Random number – no intelligence
3. Standard attributes – no profiling or application information
4. All residents, including children, get numbers
5. Introducer system
6. Partnership model
7. Flexible authentication interface to partners
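The "random number, no intelligence" feature can be illustrated with a minimal sketch: the digits come from a random source and are not derived from any attribute of the resident. The function names `random_12_digit_id` and `format_id`, and the choice to avoid a leading zero so that the value always has 12 digits, are assumptions made for illustration; this is not UIDAI's actual generation scheme.

```python
import secrets


def random_12_digit_id() -> str:
    """Return a random 12-digit string that encodes no resident attributes."""
    # First digit is 1-9 so the number always has 12 digits (assumption).
    first = str(secrets.randbelow(9) + 1)
    rest = "".join(str(secrets.randbelow(10)) for _ in range(11))
    return first + rest


def format_id(number: str) -> str:
    """Group the 12 digits as 4-4-4 for readability."""
    return " ".join(number[i:i + 4] for i in range(0, 12, 4))


if __name__ == "__main__":
    print(format_id(random_12_digit_id()))
```

Because nothing about the person is encoded in the digits, the number alone reveals no region, age or gender; uniqueness has to be enforced separately, for example by checking a newly drawn number against those already issued.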

6

Benefits of Aadhaar (I)

Reduces leakages
• No fakes
• No duplicates

Provides identities
• To all, with special focus on the marginalised and the excluded

Breaks silos
• Enable connectivity among databases
• Enable consolidation

7

Benefits of Aadhaar (II)

• Financial inclusion

• Electronic transfer of benefits

• Security of transactions

• Access to services

• Mobility in various applications

• Building up of applications

Enables

Enhances

Ensures

8

Registrar On-boarding Process

1. MoU with the State Government
2. Empowered Committee and Implementation Committee
3. Nodal Department and Registrar
4. KYR+ fields
5. Vendor selection
6. Enrolment plan
7. IEC activity
8. 13th FC funds
9. Monitoring the enrolment process
10. ICT infrastructure
11. Aligning UID number to databases and Government programmes

9

Enrolment Station

10

Enrolment Station

11


Demographic data fields captured during Aadhaar enrolment

Field Name                     Comments
Name                           PoI documents required
Date of Birth                  Approximate / Declared / Verified
Gender                         M/F/T
Address                        PoA documents required
Parent/Spouse/Guardian Name    Optional (mandatory in case of a child below 5 years)
Introducer UID                 Where PoI/PoA not provided
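The rules in the comments column can be written down as a small validation routine. The sketch below is only an illustration of those rules: the class `DemographicRecord`, its field names, and the `age_years` and `has_poi_poa` helper fields are hypothetical and are not taken from the UIDAI enrolment client.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DemographicRecord:
    name: str
    date_of_birth: str                 # kept as text; may be approximate
    dob_status: str                    # "Approximate", "Declared" or "Verified"
    gender: str                        # "M", "F" or "T"
    address: str
    age_years: int                     # hypothetical helper for the child rule
    guardian_name: Optional[str] = None
    has_poi_poa: bool = True           # were PoI/PoA documents provided?
    introducer_uid: Optional[str] = None


def validate(rec: DemographicRecord) -> List[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    if not rec.name.strip():
        errors.append("name is required")
    if rec.dob_status not in {"Approximate", "Declared", "Verified"}:
        errors.append("dob_status must be Approximate, Declared or Verified")
    if rec.gender not in {"M", "F", "T"}:
        errors.append("gender must be M, F or T")
    if not rec.address.strip():
        errors.append("address is required")
    if rec.age_years < 5 and not rec.guardian_name:
        errors.append("guardian name is mandatory for a child below 5 years")
    if not rec.has_poi_poa and not rec.introducer_uid:
        errors.append("introducer UID is required when PoI/PoA is not provided")
    return errors
```

For example, a record for a three-year-old without a guardian name fails with a single error, while an adult with documents and all mandatory fields passes with an empty list.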

12

Capture Demographic & Biometric Data

Optional data:
• Introducer data for verification, and/or
• Data of a relative who has a UID number or an enrolment number
• Phone no., email address

Biometric data:
• Resident's photograph
• Resident's fingerprints
• Resident's iris

13

Aadhaar Letter

14

UID ecosystem – A symbiotic network

[Diagram: three tiers (Registrar, Sub-registrar, Enrolling agency) with continuous monitoring and feedback. Entities shown: Oil Ministry, States, LIC & Banks, Income Tax, Oil Cos., LPG agencies, Branches, Field agencies, Food & Supplies, Rural Development, Social Welfare, Ration shops.]

15

UID Agencies

16

Financial Inclusion - Aadhaar enabled bank account

[Diagram, steps 1-6. Parties: Resident, Enrollment Agencies, Registrars, UIDAI, Banks / POSB. Data exchanged between them: KYR, Biometrics, KYR+; KYR, Biometrics; Aadhaar; Aadhaar, KYR, Photo; Aadhaar bank a/c.]

17

Process Workflow

Preparation Activities
• Registrar readiness: MoU, committees etc.; process & technology alignment
• Certifications: devices, operators
• Prep enrolments: introducers; operators, supervisors
• Setup enrolment centers: devices, hardware, software, connectivity; people, admin support, logistics etc.

Enrolment Activities
• Verification procedures
• Demographic & biometric data capture
• Data transfer to CIDR
• CIDR: rejections identified, biometric de-duplication, UID assignment
• Letter printing & delivery

Ongoing Activities
• Data updation
• Authentication


18

UID System

19

UID Architecture

20

ER Diagram Flowchart

21

Enrolment Data

• 600 to 800 million UIDs in 4 years
– 1 million a day with transaction, durability guarantees
– 350+ trillion matches every day
• ~5 MB per resident
– Maps to about 10-15 PB of raw data (2048-bit PKI encrypted)
– About 30 TB I/O every day
– Replication and backup across DCs of about 5+ TB of incremental data every day
– Lifecycle updates and new enrolments will continue forever
• Enrolment data moves from very hot to cold, needing multi-layered storage architecture
• Additional process data
– Several million events on average moving through async channels (some persistent and some transient)
– Needing insert and update guarantees across data stores
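Two of the figures above can be cross-checked with simple arithmetic. The sketch below is a back-of-envelope check, not a capacity model: it uses decimal units, and it assumes that every new enrolment is matched once against every record already stored. The function names are hypothetical.

```python
ENROLMENTS_PER_DAY = 1_000_000       # from the slide
BYTES_PER_RESIDENT = 5 * 10**6       # ~5 MB per resident, from the slide
MATCHES_PER_DAY = 350 * 10**12       # 350+ trillion, from the slide


def daily_incremental_tb() -> float:
    """New raw data per day in TB, if each enrolment adds ~5 MB."""
    return ENROLMENTS_PER_DAY * BYTES_PER_RESIDENT / 10**12


def implied_gallery_size() -> float:
    """Stored records each new enrolment is compared against (assumption:
    one match per new record per stored record)."""
    return MATCHES_PER_DAY / ENROLMENTS_PER_DAY


if __name__ == "__main__":
    print(daily_incremental_tb(), "TB of new data per day")
    print(implied_gallery_size(), "stored records per de-duplication pass")
```

Under these assumptions, one million enrolments at about 5 MB each give 5 TB a day, in line with the "5+ TB of incremental data" figure, and 350 trillion daily matches correspond to comparing each new enrolment against roughly 350 million stored records.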

22

Authentication Data

• 100+ million authentications per day (10 hrs)
– Possible high variance on peak and average
– Sub-second response
– Guaranteed audits
• Multi-DC architecture
– All changes need to be propagated from enrolment data stores to all authentication sites
• Authentication request is about 4 KB
– 100 million authentications a day
– 1 billion audit records in 10 days (30+ billion a year)
– 4 TB encrypted audit logs in 10 days
– Audit write must be guaranteed
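The authentication figures are mutually consistent, which a short calculation makes visible. The sketch below uses only the numbers on the slide; the function names are hypothetical and decimal units (1 TB = 10^12 bytes) are assumed.

```python
AUTHS_PER_DAY = 100_000_000          # from the slide
PEAK_WINDOW_HOURS = 10               # from the slide
AUDIT_RECORDS_10_DAYS = 1_000_000_000
AUDIT_BYTES_10_DAYS = 4 * 10**12     # 4 TB, decimal units assumed


def average_requests_per_second() -> float:
    """Average load if the daily volume arrives within the 10-hour window."""
    return AUTHS_PER_DAY / (PEAK_WINDOW_HOURS * 3600)


def audit_bytes_per_record() -> float:
    """Average size of one encrypted audit record."""
    return AUDIT_BYTES_10_DAYS / AUDIT_RECORDS_10_DAYS


def audit_records_per_year() -> int:
    """Audit records per 365-day year at the stated 10-day rate."""
    return AUDIT_RECORDS_10_DAYS // 10 * 365


if __name__ == "__main__":
    print(round(average_requests_per_second()), "requests per second on average")
    print(audit_bytes_per_record(), "bytes per audit record")
    print(audit_records_per_year(), "audit records per year")
```

The average comes to roughly 2,800 requests per second, each audit record to about 4,000 bytes (matching the stated request size), and the yearly total to 36.5 billion records, which agrees with "30+ billion a year". Peak load would be higher than the average, given the variance the slide mentions.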

23

Aadhaar Data Stores

• Mongo cluster (all enrolment records/documents – demographics + photo)
– Shards 1 to 5
– Low latency indexed read (documents per sec), high latency random search (seconds per read)
• MySQL (all UID generated records – demographics only, track & trace, enrolment status)
– UID master (sharded), Enrolment DB
– Low latency indexed read (milliseconds per read), high latency random search (seconds per read)
• Solr cluster (all enrolment records/documents – selected demographics only)
– Shards 0, 2, 6, 9, a, d, f
– Low latency indexed read (documents per sec), low latency random search (documents per sec)
• HDFS (all raw packets)
– Data Nodes 1, 10, .., 20
– High read throughput (MB per sec), high latency read (seconds per read)
• HBase (all enrolment biometric templates)
– Region Servers 1, 10, .., 20
– High read throughput (MB per sec), low-to-medium latency read (milliseconds per read)
• NFS (all archived raw packets)
– LUN 1, LUN 2, LUN 3, LUN 4
– Moderate read throughput, high latency read (seconds per read)
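Several of the stores above are sharded, and some shards carry hexadecimal labels (0, 2, 6, 9, a, d, f). One common way to get such labels is to route each record by the first hex digit of a hash of its key. The sketch below shows that technique only as an illustration; the slide does not say how UIDAI assigns shards, and `shard_for`, `group_by_shard` and the sample IDs are hypothetical.

```python
import hashlib

SHARD_IDS = "0123456789abcdef"   # 16 logical shards, one per hex digit (assumption)


def shard_for(enrolment_id: str) -> str:
    """Pick a shard from the first hex digit of the SHA-1 of the key."""
    digest = hashlib.sha1(enrolment_id.encode("utf-8")).hexdigest()
    shard = digest[0]
    assert shard in SHARD_IDS
    return shard


def group_by_shard(ids):
    """Bucket a batch of keys by target shard, e.g. before a bulk write."""
    buckets = {}
    for eid in ids:
        buckets.setdefault(shard_for(eid), []).append(eid)
    return buckets


if __name__ == "__main__":
    batch = [f"EID-{i:04d}" for i in range(20)]
    for shard, members in sorted(group_by_shard(batch).items()):
        print(shard, len(members))
```

Hashing the key spreads records evenly and lets any node compute the target shard without a lookup table, at the cost of making range scans across keys expensive.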

24

Systems Architecture

• Work distribution using SEDA & messaging
• Ability to scale within JVM and across
• Recovery through check-pointing

• Sync HTTP-based Auth gateway
• Protocol Buffers & XML payloads
• Sharded clusters

• Near real-time data delivery to warehouse
• Nightly data-sets used to build dashboards, data marts and reports

• Real-time monitoring using events
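SEDA (staged event-driven architecture) splits processing into stages that are connected by queues, so each stage can be scaled or throttled on its own. The sketch below is a minimal two-stage pipeline in Python rather than on the JVM; the stage functions, packet fields and the `run_pipeline` name are hypothetical and do not describe the UIDAI implementation.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def stage(handler, inbox, outbox):
    """Start one stage: take an event, process it, pass the result on."""
    def loop():
        while True:
            event = inbox.get()
            if event is STOP:
                outbox.put(STOP)      # propagate shutdown downstream
                break
            outbox.put(handler(event))
    worker = threading.Thread(target=loop)
    worker.start()
    return worker


def run_pipeline(packets):
    """Push packets through a validate stage and an extract stage."""
    q_validate, q_extract, q_done = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        stage(lambda p: {**p, "valid": bool(p.get("name"))}, q_validate, q_extract),
        stage(lambda p: {**p, "template": f"tpl-{p['id']}"}, q_extract, q_done),
    ]
    for packet in packets:
        q_validate.put(packet)
    q_validate.put(STOP)
    for worker in workers:
        worker.join()
    results = []
    while True:
        item = q_done.get()
        if item is STOP:
            break
        results.append(item)
    return results


if __name__ == "__main__":
    print(run_pipeline([{"id": 1, "name": "a"}, {"id": 2, "name": ""}]))
```

Each stage here has a single worker, so order is preserved; adding more workers to a slow stage is how such a design scales, and persisting queue positions is one way to get the check-pointed recovery the slide mentions.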

25

Enrolment Biometric Middleware

• Distribute, reconcile biometric data extraction and de-dup requests across multiple vendors (ABISs)
• Biometric data de-referencing/read service (HTTP) over sharded HDFS and NFS
– Serves bulk of the HDFS read requests (25 TB per day)
– Locates data from multiple HDFS clusters, sharded by read/write patterns: New, Archive, Purge
• Calculates and maintains volume allocation, SLA breach thresholds of ABISs
– Thresholds stored in ZK and pushed to middleware nodes

26

Event Streams & Sinks

• Event framework supporting different interaction/data durability patterns
– P2P, Pub-Sub
– Intra-JVM and queue destinations: durable / non-durable
– Fire & forget, ack. after processing
• Event sinks
– Ephemeral data consumed by counters, metrics (dashboard)
– Rolling file appenders that push data to HDFS: the primary mechanism for delivering raw fact data from transactional systems to the warehouse staging area
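The publish-subscribe pattern with two kinds of sink can be sketched in a few lines. This is an in-process illustration only: `EventBus`, `CounterSink` and `FileSink` are hypothetical names, the file sink merely buffers CSV lines in memory where a real one would roll files and ship them to HDFS, and no broker is involved.

```python
from collections import Counter, defaultdict


class EventBus:
    """In-process pub-sub: every subscriber of a topic sees every event."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, sink):
        self._subscribers[topic].append(sink)

    def publish(self, topic, event):
        for sink in self._subscribers[topic]:
            sink(event)   # fire and forget: no acknowledgement expected


class CounterSink:
    """Ephemeral sink: keeps only running counts, e.g. for a dashboard."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, event):
        self.counts[event["type"]] += 1


class FileSink:
    """Stand-in for a rolling file appender: collects CSV lines."""

    def __init__(self):
        self.lines = []

    def __call__(self, event):
        self.lines.append(f"{event['type']},{event['operator']}")


if __name__ == "__main__":
    bus, counter, appender = EventBus(), CounterSink(), FileSink()
    bus.subscribe("enrolment", counter)
    bus.subscribe("enrolment", appender)
    bus.publish("enrolment", {"type": "captured", "operator": "op1"})
    print(counter.counts, appender.lines)
```

The same event reaches both sinks, which is the point of pub-sub here: transient metrics and durable fact data are fed from one stream without the producer knowing about either consumer.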

27

Data Analysis

• Statistical analysis from millions of events
• View into quality of enrolments – e.g. Enrolment Agencies, Operators
• Feature introduction – e.g. based on avg. time taken for biometric capture, demographic data input
• Enrolment volumes – e.g. by Registrar, Agency, Operator etc.
– Useful in fraud detection
• Goal to share anonymized data sets for use by industry and academia – information transparency
• Various reports – self-serve, canned, operational and/or aggregates
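As an illustration of the kind of per-operator statistic described above, the sketch below averages biometric capture time by operator and lists operators whose average falls under a threshold. The event fields, the function names and the idea of treating unusually fast capture as a signal worth reviewing are assumptions for the example, not rules stated in the source.

```python
from collections import defaultdict
from statistics import mean


def average_capture_time(events):
    """Mean biometric capture time in seconds, per operator."""
    times = defaultdict(list)
    for event in events:
        times[event["operator"]].append(event["capture_seconds"])
    return {operator: mean(values) for operator, values in times.items()}


def flag_fast_operators(averages, threshold_seconds):
    """Operators whose average is below the threshold (hypothetical signal)."""
    return sorted(op for op, avg in averages.items() if avg < threshold_seconds)


if __name__ == "__main__":
    sample = [
        {"operator": "op1", "capture_seconds": 60},
        {"operator": "op1", "capture_seconds": 80},
        {"operator": "op2", "capture_seconds": 5},
        {"operator": "op2", "capture_seconds": 7},
    ]
    averages = average_capture_time(sample)
    print(averages, flag_fast_operators(averages, 30))
```

At the volumes the slides describe, the same aggregation would run as a batch job over warehouse data rather than in memory, but the logic is the same group-and-average step.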

28

UID BI Platform

Data Analysis architecture

[Diagram. Sources: UIDAI systems events (RabbitMQ), server DB (MySQL). Storage: Hadoop HDFS (raw data, event CSV); data warehouse (HDFS/Hive) holding fact data and dimension data; dimension data (MySQL); datasets and on-demand datasets; datamarts (MySQL). Processing: Pig, Hive, Pentaho Kettle. Access and delivery: Data Access Framework; canned reports, dashboard and self-service analytics via Pentaho BI and FusionCharts; e-mail/portal/others.]

29

PERSONAL DETAILS 1

FIELD NAME        DATA TYPE    KEY
NAME              VARCHAR      CONSTRAINTS
MARITAL STATUS    VARCHAR      CONSTRAINTS
ADDRESS           VARCHAR      CONSTRAINTS
PHONE NUMBER      NUMBER       CONSTRAINTS
PINCODE           NUMBER       CONSTRAINTS
REGISTER NO       NUMBER       PRIMARY KEY

30

PERSONAL DETAILS 2

FIELD NAME        DATA TYPE    KEY
NAME              VARCHAR      CONSTRAINTS
Y.O.B             DATE         CONSTRAINTS
GENDER            VARCHAR      CONSTRAINTS
REGISTER NO       NUMBER       FOREIGN KEY

31

FUNCTIONS USED IN THE DATABASE CREATION

• Data Query Language
– Select (retrieve)
• Data Manipulation Language
– Insert
– Update
– Delete
• Data Definition Language
– Create
• Transaction Control Language
– Commit
– Rollback
– Savepoint

32

PROCESS OF DATABASE

• CREATION:

CREATE TABLE tablename (columnname datatype(size), columnname datatype(size));

CREATE TABLE personal_details (name VARCHAR(10), marital_status VARCHAR(12), address VARCHAR(60), phone_number NUMBER(10), pincode NUMBER(7), register_no NUMBER(17) PRIMARY KEY);

33

• INSERTION:

INSERT INTO tablename [(columnname, columnname)] VALUES (expression, expression);

INSERT INTO personal_details (name, marital_status, address, phone_number, pincode, register_no) VALUES ('xxx', 'w/o yyy', 'no:14, nehru street, kamaraj nagar, puducherry', 9123456789, 605011, 356025131913);

34

• UPDATION:

UPDATE tablename SET columnname = expression, columnname = expression, ... WHERE columnname = expression;

– UPDATE personal_details SET name = 'www' WHERE pincode = 605011;

35

• RETRIEVAL:

SELECT columnname, columnname FROM tablename;

– SELECT name, register_no FROM personal_details;
– SELECT name, register_no FROM personal_details WHERE phone_number = 9123456789;

• DELETION:

DELETE FROM tablename;

DELETE FROM personal_details;

(Without a WHERE clause, DELETE removes every row in the table.)
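The create, insert, update, select and delete steps can be run end to end without an Oracle installation. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module with an in-memory database, SQLite column types (TEXT, INTEGER) in place of Oracle's VARCHAR and NUMBER, and underscore-separated identifiers such as `personal_details` and `register_no`, since names containing spaces are not valid unquoted identifiers.

```python
import sqlite3


def demo():
    """Run create, insert, update, select and delete against an in-memory DB."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute(
        """CREATE TABLE personal_details (
               name TEXT, marital_status TEXT, address TEXT,
               phone_number INTEGER, pincode INTEGER,
               register_no INTEGER PRIMARY KEY)"""
    )
    cur.execute(
        "INSERT INTO personal_details VALUES (?, ?, ?, ?, ?, ?)",
        ("xxx", "w/o yyy", "no:14, nehru street, kamaraj nagar, puducherry",
         9123456789, 605011, 356025131913),
    )
    cur.execute(
        "UPDATE personal_details SET name = ? WHERE pincode = ?",
        ("www", 605011),
    )
    cur.execute(
        "SELECT name, register_no FROM personal_details WHERE phone_number = ?",
        (9123456789,),
    )
    selected = cur.fetchall()
    cur.execute("DELETE FROM personal_details")   # no WHERE: removes all rows
    cur.execute("SELECT COUNT(*) FROM personal_details")
    remaining = cur.fetchone()[0]
    conn.close()
    return selected, remaining


if __name__ == "__main__":
    print(demo())
```

The `?` placeholders pass values separately from the SQL text, which avoids quoting mistakes of the kind that arise when string and numeric literals are typed by hand.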

36

ORACLE FUNCTIONS

• AVG: AVG([DISTINCT | ALL] n)
• MIN: MIN([DISTINCT | ALL] expr)
• COUNT(expr): COUNT([DISTINCT | ALL] expr)
• COUNT(*): COUNT(*)
• MAX: MAX([DISTINCT | ALL] expr)
• SUM: SUM([DISTINCT | ALL] n)
• INITCAP: INITCAP(char)
• LTRIM: LTRIM(char [, set])
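Most of these functions can be tried out with Python's built-in sqlite3 module, used here as a stand-in for Oracle. SQLite provides AVG, MIN, MAX, COUNT, SUM and LTRIM, but it has no INITCAP, so that function is left out of the sketch. The table contents and the function name `aggregate_demo` are made up for the example.

```python
import sqlite3


def aggregate_demo():
    """Apply the aggregate functions and LTRIM to a three-row sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE personal_details (name TEXT, pincode INTEGER)")
    conn.executemany(
        "INSERT INTO personal_details VALUES (?, ?)",
        [("  asha", 605011), ("ravi", 605011), ("meena", 605013)],
    )
    row = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT pincode), MIN(pincode), MAX(pincode), "
        "SUM(pincode), AVG(pincode) FROM personal_details"
    ).fetchone()
    # LTRIM strips the leading spaces from the first inserted name.
    trimmed = conn.execute(
        "SELECT LTRIM(name) FROM personal_details ORDER BY rowid LIMIT 1"
    ).fetchone()[0]
    conn.close()
    return row, trimmed


if __name__ == "__main__":
    print(aggregate_demo())
```

COUNT(*) counts all three rows, while COUNT(DISTINCT pincode) counts only the two different pincodes, which is the difference the DISTINCT and ALL options control.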

37

Challenges in India Identity Card

38

References

• Aadhaar Portal: https://portal.uidai.gov.in/uidwebportal/dashboard.do
• Data Portal: https://data.uidai.gov.in/uiddatacatalog/dataCatalogHome.do
• Analytics whitepaper: http://uidai.gov.in/images/FrontPageUpdates/uid_doc_30012012.pdf
