AADHAR Card- Database Creation
Creating A Unique Identity For Every Resident
Under the guidance of Dr. T. Nambirajan, Professor, Department of Management Studies, School of Management
By Group 8: Basil John, Panchami, Saritha, Giridharan, Kabilan, Joel Joseph
2
• Collection of interrelated data
• Set of programs to access the data
• DBMS contains information about a particular enterprise
• DBMS provides an environment that is both convenient and efficient to use
• Database management systems were developed to handle the difficulties of typical file-processing systems supported by conventional operating systems
3
The Unique ID initiative
4
Principles of Aadhaar
One-time standardized Aadhaar enrollment establishes uniqueness of resident via ‘biometric de-duplication’
• Only one Aadhaar number per eligible individual
Online authentication is provided by UIDAI, using:
• Demographic Data (Name, Address, DOB, Gender)
• Biometric Data (Fingerprint)
Aadhaar, subject to online authentication, is proof of ID
• Aadhaar enrollment / update = KYC
• Aadhaar number issued, stored in the authentication server
• Authentication = “verification” of KYC
5
Features of Aadhaar
Aadhaar is a 12 digit number – No Cards
Random Number – No Intelligence
Standard Attributes – No Profiling or Application Information
All Residents including Children get Numbers
Introducer System
Partnership Model
Flexible Authentication Interface to Partners
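The "Random Number – No Intelligence" feature can be illustrated with a minimal sketch: the identifier is drawn at random, so it encodes nothing about the resident (no region, date of birth, or gender), and a uniqueness check ensures it is issued only once. This is not UIDAI's actual generator; the function name `issue_random_id` and the in-memory set of issued numbers are assumptions made for illustration.

```python
import secrets

DIGITS = "0123456789"

def issue_random_id(already_issued: set) -> str:
    """Draw a random 12-digit string that has not been issued before."""
    while True:
        # Every digit is random, so the number carries no embedded attributes.
        candidate = "".join(secrets.choice(DIGITS) for _ in range(12))
        if candidate not in already_issued:
            already_issued.add(candidate)
            return candidate

issued = set()                      # hypothetical stand-in for the issued-number store
first = issue_random_id(issued)
second = issue_random_id(issued)
print(first, second)
```

Because the number itself says nothing about its holder, all attributes have to be looked up through authentication rather than read off the identifier.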
6
Benefits of Aadhaar (I)
Reduces Leakages
• No fakes
• No duplicates
Provides Identities
• To all, with special focus on the marginalised and the excluded
Breaks Silos
• Enable connectivity among databases
• Enable consolidation
7
Benefits of Aadhaar (II)
• Financial inclusion
• Electronic transfer of benefits
• Security of transactions
• Access to services
• Mobility in various applications
• Building up of applications
Enables
Enhances
Ensures
8
Registrar On-boarding Process
1. MoU with the State Government
2. Empowered Committee and Implementation Committee
3. Nodal Department and Registrar
4. KYR+ Fields
5. Vendor Selection
6. Enrolment Plan
7. IEC Activity
8. 13th FC Funds
9. Monitoring the enrolment process
10. ICT Infrastructure
11. Aligning UID number to Databases & Government Programmes
9
Enrolment Station
10
Enrolment Station
11
Demographic data fields captured during Aadhaar enrolment
Field Name                    Comments
Name                          PoI documents required
Date of Birth                 Approximate / Declared / Verified
Gender                        M/F/T
Address                       PoA documents required
Parent/Spouse/Guardian Name   Optional (mandatory in case of child below 5 years)
Introducer UID                Where PoI/PoA not provided
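The field rules in this table can be expressed as a small validation routine. The sketch below is illustrative only: the class `DemographicRecord`, the helper fields `age_years`, `has_poi` and `has_poa`, and the sample values are assumptions, not part of the enrolment software.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DemographicRecord:
    name: str
    date_of_birth: str                    # kept as text; the slide fixes no format
    dob_status: str                       # Approximate / Declared / Verified
    gender: str                           # M / F / T
    address: str
    age_years: int                        # hypothetical helper for the under-5 rule
    has_poi: bool                         # proof-of-identity document supplied
    has_poa: bool                         # proof-of-address document supplied
    guardian_name: Optional[str] = None   # parent / spouse / guardian
    introducer_uid: Optional[str] = None

def validate(rec: DemographicRecord) -> List[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    problems = []
    if not rec.name.strip():
        problems.append("name is required")
    if rec.dob_status not in {"Approximate", "Declared", "Verified"}:
        problems.append("date of birth must be Approximate, Declared or Verified")
    if rec.gender not in {"M", "F", "T"}:
        problems.append("gender must be M, F or T")
    if rec.age_years < 5 and not rec.guardian_name:
        problems.append("guardian name is mandatory for a child below 5 years")
    if not (rec.has_poi and rec.has_poa) and not rec.introducer_uid:
        problems.append("introducer UID is required when PoI/PoA is not provided")
    return problems

# Illustrative sample records (made-up values).
adult = DemographicRecord("xxx", "1980-01-01", "Declared", "F", "puducherry", 32, True, True)
child = DemographicRecord("yyy", "2011-06-01", "Verified", "M", "puducherry", 1, True, True)
print(validate(adult))
print(validate(child))
```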
12
Capture Demographic & Biometric Data
Optional data:
• Introducer data for verification, and/or
• Data of a relative who has a UID number or an enrolment number
• Phone no., email address
Biometric Data:
• Resident’s Photograph
• Resident’s Finger Prints
• Resident’s Iris
13
Aadhaar Letter
14
UID ecosystem – A symbiotic network
[Diagram labels: Oil Ministry, Oil Cos., LPG agencies; States, Food & Supplies, Rural Development, Social welfare, Ration shops; LIC & Banks, Branches, Field agencies; Income Tax. Tiers: Registrar, Sub-registrar, Enrolling agency. Continuous monitoring and feedback.]
15
UID Agencies
16
Financial Inclusion - Aadhaar enabled bank account
[Diagram: six numbered steps among Resident, Enrollment Agencies, Registrars, UIDAI, and Banks / POSB. Flow labels: KYR, Biometrics, KYR+; KYR, Biometrics; Aadhaar; Aadhaar, KYR, Photo; Aadhaar Bank A/c.]
17
Process Workflow
Preparation Activities
• Certifications: devices, operators
• Registrar Readiness: MoU, committees etc.; process & technology alignment
• Prep Enrolments: introducers; operators, supervisors
• Setup Enrolment Centers: devices, hardware, software, connectivity; people, admin support, logistics etc.
Enrolment Activities
• Verification procedures
• Demographic & biometric data capture
• Data transfer to CIDR
• CIDR: rejections identified, biometric de-duplication, UID assignment
• Letter printing & delivery
Ongoing Activities
• Data updation
• Authentication
18
UID System
19
UID Architecture
20
ER Diagram Flowchart
21
Enrolment Data
• 600 to 800 million UIDs in 4 years
• 1 million a day with transaction, durability guarantees
• 350+ trillion matches every day
• ~5 MB per resident
• Maps to about 10-15 PB of raw data (2048-bit PKI encrypted)
• About 30 TB I/O every day
• Replication and backup across DCs of about 5+ TB of incremental data every day
• Lifecycle updates and new enrolments will continue forever
• Enrolment data moves from very hot to cold, needing multi-layered storage architecture
• Additional process data
• Several million events on an average moving through async channels (some persistent and some transient)
• Needing insert and update guarantees across data stores
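A quick back-of-the-envelope check shows how these figures relate to each other. The sketch uses only numbers from this slide and decimal units (1 MB = 10^6 bytes). Two readings are assumptions: that the 5+ TB of daily incremental data corresponds to one day of new enrolments, and that the 350+ trillion matches come from comparing each new enrolment against the existing gallery.

```python
MB, TB, PB = 10**6, 10**12, 10**15

residents_low, residents_high = 600_000_000, 800_000_000
bytes_per_resident = 5 * MB
daily_enrolments = 1_000_000
matches_per_day = 350 * 10**12

# One copy of the raw data for the whole target population.
raw_low_pb = residents_low * bytes_per_resident / PB
raw_high_pb = residents_high * bytes_per_resident / PB

# New raw data arriving per day at the stated enrolment rate.
daily_new_tb = daily_enrolments * bytes_per_resident / TB

# Comparisons implied per new enrolment by the daily match count.
comparisons_per_enrolment = matches_per_day // daily_enrolments

print(raw_low_pb, raw_high_pb)       # single-copy size in PB
print(daily_new_tb)                  # TB of new data per day
print(comparisons_per_enrolment)     # gallery size implied per enrolment
```

A single copy works out to 3 to 4 PB, so the slide's 10-15 PB is roughly three to four times that; the source does not say what accounts for the difference. The daily increment of 5 TB matches the replication figure, and 350 trillion matches a day corresponds to each of 1 million enrolments being compared against about 350 million existing records.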
22
Authentication Data
• 100+ million authentications per day (10 hrs)
• Possible high variance on peak and average
• Sub second response
• Guaranteed audits
• Multi-DC architecture
• All changes need to be propagated from enrolment data stores to all authentication sites
• Authentication request is about 4 KB
• 100 million authentications a day
• 1 billion audit records in 10 days (30+ billion a year)
• 4 TB encrypted audit logs in 10 days
• Audit write must be guaranteed
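The authentication figures can be cross-checked the same way. This sketch uses only the numbers on the slide, with decimal units and the request size read as 4,000 bytes; the average rate assumes the load is spread evenly over the 10-hour window, which the slide itself warns is unlikely ("high variance on peak and average").

```python
auths_per_day = 100_000_000
window_seconds = 10 * 3600              # the 10-hour service window
request_bytes = 4_000                   # "about 4 K" read as 4 KB (decimal)
audit_records_10_days = 1_000_000_000
audit_bytes_10_days = 4 * 10**12        # 4 TB

avg_auths_per_second = auths_per_day / window_seconds
request_gb_per_day = auths_per_day * request_bytes / 10**9
audit_records_per_auth = audit_records_10_days / (10 * auths_per_day)
bytes_per_audit_record = audit_bytes_10_days // audit_records_10_days
audit_records_per_year = audit_records_10_days // 10 * 365

print(round(avg_auths_per_second))      # average requests per second
print(request_gb_per_day)               # GB of request payload per day
print(audit_records_per_auth)           # audit records written per authentication
print(bytes_per_audit_record)           # bytes per audit record
print(audit_records_per_year)           # audit records per year
```

The numbers are mutually consistent: one audit record of about 4 KB per authentication, roughly 2,800 requests per second on average, and 36.5 billion audit records a year, in line with the "30+ billion" on the slide.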
23
Aadhaar Data Stores
• Mongo cluster (all enrolment records/documents – demographics + photo)
  – Shards 1 to 5
  – Low latency indexed read (documents per sec), high latency random search (seconds per read)
• MySQL (all UID generated records – demographics only, track & trace, enrolment status)
  – UID master (sharded), Enrolment DB
  – Low latency indexed read (milliseconds per read), high latency random search (seconds per read)
• Solr cluster (all enrolment records/documents – selected demographics only)
  – Shards 0, 2, 6, 9, a, d, f
  – Low latency indexed read (documents per sec), low latency random search (documents per sec)
• HDFS (all raw packets)
  – Data Nodes 1, 10, .., 20
  – High read throughput (MB per sec), high latency read (seconds per read)
• HBase (all enrolment biometric templates)
  – Region Servers 1, 10, .., 20
  – High read throughput (MB per sec), low-to-medium latency read (milliseconds per read)
• NFS (all archived raw packets)
  – LUN 1, LUN 2, LUN 3, LUN 4
  – Moderate read throughput, high latency read (seconds per read)
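The slide does not say how a record is assigned to a shard. One common approach, shown here purely as an assumption, is to hash a record key and use part of the hash as the shard name; hex-digit shard names such as 0, 2, 6, 9, a, d and f in the diagram would fit a 16-way split of this kind. The function `shard_for` and the sample enrolment IDs are hypothetical.

```python
import hashlib

SHARDS = "0123456789abcdef"   # assumed 16-way split named by hex digit

def shard_for(enrolment_id: str) -> str:
    """Map a record key to a shard using the first hex digit of its SHA-1 hash."""
    digest = hashlib.sha1(enrolment_id.encode("utf-8")).hexdigest()
    return digest[0]

# The same key always lands on the same shard, so indexed reads hit one node.
for key in ("enrolment-0001", "enrolment-0002", "enrolment-0003"):
    print(key, "->", shard_for(key))
```

Hashing spreads keys evenly without a lookup table, at the cost of making range scans across keys expensive, which matches the slide's contrast between fast indexed reads and slow random search on some stores.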
24
Systems Architecture
• Work distribution using SEDA & Messaging
• Ability to scale within JVM and across
• Recovery through check-pointing
• Sync Http based Auth gateway
• Protocol Buffers & XML payloads
• Sharded clusters
• Near Real-time data delivery to warehouse
• Nightly data-sets used to build dashboards, data marts and reports
• Real-time monitoring using Events
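SEDA (staged event-driven architecture) splits work into stages connected by queues, so each stage can be scaled or restarted on its own. The production system described here runs on the JVM; the following is only a minimal Python sketch of the idea, with hypothetical names (`stage`, `run_pipeline`) and toy stage functions.

```python
import queue
import threading

STOP = object()   # sentinel that tells a stage to shut down

def stage(worker, inbox, outbox):
    """Run one stage: take an item from the inbox, process it, pass it on."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)      # propagate shutdown to the next stage
            break
        outbox.put(worker(item))

def run_pipeline(items, workers):
    """Wire one thread per stage with a queue between each pair of stages."""
    queues = [queue.Queue() for _ in range(len(workers) + 1)]
    threads = [
        threading.Thread(target=stage, args=(w, queues[i], queues[i + 1]))
        for i, w in enumerate(workers)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)
    results = []
    while True:
        out = queues[-1].get()
        if out is STOP:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results

# Two toy stages standing in for, say, validation and enrichment.
print(run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 2]))
```

With one thread per stage, items leave the pipeline in the order they entered; adding more threads to a slow stage would raise throughput but would no longer guarantee ordering.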
25
Enrolment Biometric Middleware
• Distribute, Reconcile biometric data extraction and de-dup requests across multiple vendors (ABISs)
• Biometric data de-referencing/read service (Http) over sharded HDFS and NFS
• Serves bulk of the HDFS read requests (25 TB per day)
• Locate data from multiple HDFS clusters
– Sharded by read/write patterns: New, Archive, Purge
• Calculates and maintains Volume allocation, SLA breach thresholds of ABISs
• Thresholds stored in ZK and pushed to middleware nodes
26
Event Streams & Sinks
• Event framework supporting different interaction/data durability patterns
• P2P, Pub-Sub
• Intra-JVM and Queue destinations - Durable / Non-Durable
• Fire & Forget, Ack. after processing
• Event Sinks
• Ephemeral data consumed by counters, metrics (dashboard)
• Rolling file appenders that push data to HDFS
– Primary mechanism for delivering raw fact data from transactional systems to the warehouse staging area
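The difference between "fire & forget" and "ack. after processing" can be shown with a small in-process publish/subscribe sketch. It is a simplification under stated assumptions: delivery is synchronous, there is no durable queue, and the class `EventBus`, the topic name and the handlers are all hypothetical.

```python
from collections import Counter

class EventBus:
    """Minimal in-process pub-sub: topics map to lists of handler callables."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, event, require_ack=False):
        """Deliver to every subscriber; return how many processed the event.

        Fire & forget (default): a failing sink is ignored by the publisher.
        Ack after processing: a failing sink raises back to the publisher.
        """
        processed = 0
        for handler in self._subscribers.get(topic, []):
            try:
                handler(event)
                processed += 1
            except Exception:
                if require_ack:
                    raise
        return processed

bus = EventBus()
counts = Counter()                       # an ephemeral "counter" sink

def count_sink(event):
    counts[event["type"]] += 1

def broken_sink(event):
    raise RuntimeError("sink unavailable")

bus.subscribe("enrolment", count_sink)
bus.subscribe("enrolment", broken_sink)
delivered = bus.publish("enrolment", {"type": "packet_received"})
print(delivered, dict(counts))
```

Metrics and dashboard counters can tolerate the fire-and-forget mode, while anything that must not be lost (such as audit writes) would need the acknowledged mode backed by a durable destination.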
27
Data Analysis
• Statistical analysis from millions of events
• View into quality of enrolments – e.g. Enrolment Agencies, Operators
• Feature introduction – e.g. Based on avg. time taken for biometric capture, demographic data input
• Enrolment volumes – e.g. By Registrar, Agency, Operator etc.
– Useful in fraud detection
• Goal to share anonymized data sets for use by industry and academia – information transparency
• Various reports – Self-serve, Canned, Operational and/or Aggregates
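As a minimal sketch of this kind of analysis, the function below groups enrolment events by a chosen dimension and reports volume and average biometric capture time. The event layout (`operator`, `agency`, `capture_seconds`) and the sample values are made up for illustration and are not real UIDAI data.

```python
from collections import defaultdict
from statistics import mean

def summarize(events, key):
    """Group events by `key`; report enrolment count and average capture time."""
    capture_times = defaultdict(list)
    for event in events:
        capture_times[event[key]].append(event["capture_seconds"])
    return {
        group: {"enrolments": len(times), "avg_capture_seconds": mean(times)}
        for group, times in capture_times.items()
    }

# Illustrative events only (hypothetical operators and timings).
sample_events = [
    {"operator": "op-1", "agency": "agency-A", "capture_seconds": 60},
    {"operator": "op-1", "agency": "agency-A", "capture_seconds": 80},
    {"operator": "op-2", "agency": "agency-A", "capture_seconds": 300},
]
by_operator = summarize(sample_events, "operator")
by_agency = summarize(sample_events, "agency")
print(by_operator)
print(by_agency)
```

An operator whose volume or capture time sits far from the rest of the agency is the kind of outlier the slide has in mind when it mentions fraud detection and enrolment quality.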
28
UID BI Platform
[Diagram: Data Analysis architecture and Data Access Framework. Sources: UIDAI Systems Events (RabbitMQ), Server DB (MySQL). Storage: Hadoop HDFS (event CSV, raw data), Data Warehouse (HDFS/Hive) holding fact data and dimension data, Dimension Data (MySQL), Datamarts (MySQL). Processing: Pig, Hive, Pentaho Kettle, producing datasets and on-demand datasets. Delivery: Pentaho BI and FusionCharts for canned reports, dashboards and self-service analytics, via e-mail/portal/others.]
29
PERSONAL DETAILS 1
FIELD NAME       DATA TYPE   KEY
NAME             VARCHAR     CONSTRAINTS
MARITAL STATUS   VARCHAR     CONSTRAINTS
ADDRESS          VARCHAR     CONSTRAINTS
PHONE NUMBER     NUMBER      CONSTRAINTS
PINCODE          NUMBER      CONSTRAINTS
REGISTER NO      NUMBER      PRIMARY KEY
30
PERSONAL DETAILS 2
FIELD NAME       DATA TYPE   KEY
NAME             VARCHAR     CONSTRAINTS
Y.O.B            DATE        CONSTRAINTS
GENDER           VARCHAR     CONSTRAINTS
REGISTER NO      NUMBER      FOREIGN KEY
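The two tables are linked through REGISTER NO: it is the primary key of the first table and a foreign key in the second. The sketch below builds that relationship in SQLite (via Python's built-in `sqlite3`) as a runnable stand-in for the Oracle database used in the slides. The table names `personal_details_1` and `personal_details_2`, the underscore column names and the gender check are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE personal_details_1 (
        register_no    INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        marital_status TEXT,
        address        TEXT,
        phone_number   INTEGER,
        pincode        INTEGER
    )
""")
conn.execute("""
    CREATE TABLE personal_details_2 (
        register_no INTEGER NOT NULL REFERENCES personal_details_1(register_no),
        name        TEXT NOT NULL,
        yob         TEXT,
        gender      TEXT CHECK (gender IN ('M', 'F', 'T'))
    )
""")

conn.execute("INSERT INTO personal_details_1 (register_no, name) VALUES (?, ?)",
             (356025131913, "xxx"))
conn.execute("INSERT INTO personal_details_2 (register_no, name, gender) VALUES (?, ?, ?)",
             (356025131913, "xxx", "F"))

def insert_orphan():
    """Try to add a child row whose register_no has no parent row."""
    try:
        conn.execute("INSERT INTO personal_details_2 (register_no, name) VALUES (?, ?)",
                     (999, "nobody"))
        return True
    except sqlite3.IntegrityError:
        return False

orphan_accepted = insert_orphan()
print("orphan row accepted:", orphan_accepted)
```

The foreign key guarantees that every row in the second table belongs to a registered person in the first.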
31
FUNCTIONS USED IN THE DATABASE CREATION
• Data Query Language: retrieve (SELECT)
• Data Manipulation Language: insert, update, delete
• Data Definition Language: create
• Transaction Control Language: commit, rollback, savepoint
32
PROCESS OF DATABASE
• CREATION:
CREATE TABLE tablename (columnname datatype(size), columnname datatype(size));
CREATE TABLE personal_details (name VARCHAR(10), marital_status VARCHAR(12), address VARCHAR(50), phone_number NUMBER(10), pincode NUMBER(7), register_no NUMBER(17) PRIMARY KEY);
33
• INSERTION:
INSERT INTO tablename [(columnname, columnname)] VALUES (expression, expression);
INSERT INTO personal_details (name, marital_status, address, phone_number, pincode, register_no) VALUES ('xxx', 'w/o yyy', 'no:14,nehru street kamaraj nagar puducherry', 9123456789, 605011, 356025131913);
34
• UPDATION:
UPDATE tablename SET columnname = expression, columnname = expression, ... WHERE columnname = expression;
UPDATE personal_details SET name = 'www' WHERE pincode = 605011;
35
• RETRIEVAL:
SELECT columnname, columnname FROM tablename;
SELECT name, register_no FROM personal_details;
SELECT name, register_no FROM personal_details WHERE phone_number = 9123456789;
• DELETION:
DELETE FROM tablename;
DELETE FROM personal_details;
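The create, insert, update, retrieve and delete steps can be run end to end with Python's built-in `sqlite3` module, used here as a stand-in for Oracle. Two assumptions: column and table names use underscores, because unquoted SQL identifiers cannot contain spaces, and the address column is widened to 50 characters so the sample address fits. SQLite accepts the `VARCHAR(n)` and `NUMBER(n)` type names but does not enforce the sizes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATION
conn.execute("""
    CREATE TABLE personal_details (
        name VARCHAR(10), marital_status VARCHAR(12), address VARCHAR(50),
        phone_number NUMBER(10), pincode NUMBER(7),
        register_no NUMBER(17) PRIMARY KEY
    )
""")

# INSERTION (parameters keep values separate from the SQL text)
conn.execute(
    "INSERT INTO personal_details "
    "(name, marital_status, address, phone_number, pincode, register_no) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("xxx", "w/o yyy", "no:14,nehru street kamaraj nagar puducherry",
     9123456789, 605011, 356025131913),
)

# UPDATION
conn.execute("UPDATE personal_details SET name = ? WHERE pincode = ?", ("www", 605011))

# RETRIEVAL
row = conn.execute(
    "SELECT name, register_no FROM personal_details WHERE phone_number = ?",
    (9123456789,),
).fetchone()
print(row)

# DELETION (no WHERE clause, so every row is removed)
conn.execute("DELETE FROM personal_details")
remaining = conn.execute("SELECT COUNT(*) FROM personal_details").fetchone()[0]
print(remaining)
```

Note that `DELETE FROM personal_details` without a `WHERE` clause empties the whole table; adding a condition such as `WHERE register_no = ...` removes a single record instead.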
36
ORACLE FUNCTIONS
• AVG - AVG([DISTINCT | ALL] n)
• MIN - MIN([DISTINCT | ALL] expr)
• COUNT(expr) - COUNT([DISTINCT | ALL] expr)
• COUNT(*) - COUNT(*)
• MAX - MAX([DISTINCT | ALL] expr)
• SUM - SUM([DISTINCT | ALL] n)
• INITCAP - INITCAP(char)
• LTRIM - LTRIM(char [, set])
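The effect of DISTINCT on the aggregate functions is easiest to see on a tiny data set. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for Oracle; the table `readings` and its values are made up. SQLite has no INITCAP, so only LTRIM is shown from the character functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (n INTEGER)")
conn.executemany("INSERT INTO readings (n) VALUES (?)", [(10,), (10,), (40,)])

# Without DISTINCT every row counts (the ALL behaviour, which is the default).
avg_all, sum_all, count_rows, min_n, max_n = conn.execute(
    "SELECT AVG(n), SUM(n), COUNT(*), MIN(n), MAX(n) FROM readings"
).fetchone()

# With DISTINCT duplicate values are collapsed before aggregating.
avg_distinct, sum_distinct, count_distinct = conn.execute(
    "SELECT AVG(DISTINCT n), SUM(DISTINCT n), COUNT(DISTINCT n) FROM readings"
).fetchone()

# LTRIM(char, set) strips leading characters that appear in `set`.
trimmed = conn.execute("SELECT LTRIM('xxabc', 'x')").fetchone()[0]

print(avg_all, sum_all, count_rows, min_n, max_n)
print(avg_distinct, sum_distinct, count_distinct)
print(trimmed)
```

With the values 10, 10 and 40, the plain average is 20 while the distinct average is 25, because the duplicate 10 is counted only once.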
37
Challenges in India Identity Card
38
References
• Aadhaar Portal: https://portal.uidai.gov.in/uidwebportal/dashboard.do
• Data Portal: https://data.uidai.gov.in/uiddatacatalog/dataCatalogHome.do
• Analytics whitepaper: http://uidai.gov.in/images/FrontPageUpdates/uid_doc_30012012.pdf