OLAP for heterogeneous socio-economic data – the challenge of integration, analysis and crime
prevention: a Czech case study.
Jiří HORÁK, Igor IVAN, Bronislava HORÁKOVÁ
VSB-Technical University of Ostrava, Intergraph CS Ltd., Czech Republic
European Forum for Geography and Statistics 2015 Conference, Vienna, Austria, 10–12 November 2015
Big Spatial Data
• Features:
  – Volume beyond the limit of usual geo-processing,
  – Velocity higher than usual processes can handle,
  – Variety, combining more diverse geodata sources than usual.
• Traditional methods of geodata collection, storing, processing, controlling, analysing, modelling, validating and visualizing fail to provide effective solutions.
• How to exploit the big spatial data?
Multidimensional database and OLAP
• OLAP (On-line analytical processing) is part of Business Intelligence.
• It provides effective and intuitive access to consolidated data (harmonized and aggregated) stored in multidimensional data structures.
• OLAP operations:
  – Drill-down (moving down the hierarchy, towards more detail),
  – Roll-up (moving up the hierarchy, obtaining more aggregated data),
  – Drill-across (linking several fact tables with the same granularity),
  – Slice-and-dice (splitting the data into subsets by dimension values),
  – Pivot (exchanging dimensions in the designed view).
• Multidimensional database as a Data Warehouse: a subject-oriented, integrated, time-variant and non-volatile collection of data.
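The OLAP operations named on this slide can be illustrated on a toy fact table. The following is a minimal sketch in plain Python, not the project's implementation: the fact rows, the column names and the helper functions `aggregate`, `slice_` and `pivot` are all hypothetical and invented for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, month, municipality, crime_type, count).
# All values are invented for illustration.
FACTS = [
    (2014, 1, "Ostrava", "burglary", 12),
    (2014, 1, "Ostrava", "graffiti", 5),
    (2014, 2, "Ostrava", "burglary", 9),
    (2014, 1, "Kolin", "burglary", 3),
    (2014, 2, "Kolin", "graffiti", 2),
]
COLUMNS = ("year", "month", "municipality", "crime_type", "count")


def aggregate(facts, by):
    """Sum the additive measure 'count' grouped by the given dimension columns."""
    idx = [COLUMNS.index(name) for name in by]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_(facts, **fixed):
    """Slice: keep only rows where each named dimension equals the given value."""
    return [
        row for row in facts
        if all(row[COLUMNS.index(k)] == v for k, v in fixed.items())
    ]


def pivot(facts, rows, cols):
    """Pivot: arrange totals as a nested dict {row_value: {col_value: total}}."""
    table = defaultdict(dict)
    for (r, c), total in aggregate(facts, [rows, cols]).items():
        table[r][c] = total
    return dict(table)


# Drill-down: from year to month (more detail).
by_month = aggregate(FACTS, ["year", "month"])
# Roll-up: from month back to year (more aggregated).
by_year = aggregate(FACTS, ["year"])
# Slice on one municipality, then pivot crime type against month.
ostrava = slice_(FACTS, municipality="Ostrava")
ostrava_pivot = pivot(ostrava, rows="crime_type", cols="month")
```

Roll-up and drill-down are the same grouping call with fewer or more hierarchy levels, which works only because the measure is additive.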
Fact tables and dimensions
• Dimensional modelling.
• Elementary items in fact tables contain aggregated data (counts, sums etc.).
• Facts are organised according to dimensions (features).
• Dimensions usually contain a hierarchical structure.
• Granularity – the level of detail for facts.
• Additivity – the possibility to summarize data according to dimensions.
Source: http://www.code-magazine.com/focus/Article.aspx?QuickID=1103091
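A fact table with its dimensions can be sketched as a small star schema. The example below uses Python's built-in `sqlite3` module; the table names, columns and rows are assumptions made for illustration and do not describe the actual warehouse. It shows granularity (one fact row per day and 100 m cell) and additivity (counts summed up to 1 km cells and months).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables (hypothetical names and columns).
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
    CREATE TABLE dim_square (square_id INTEGER PRIMARY KEY, cell_100m TEXT, cell_1km TEXT);
    -- Fact table: one row per (day, 100 m cell), the finest granularity here.
    CREATE TABLE fact_crime (
        date_id INTEGER REFERENCES dim_date(date_id),
        square_id INTEGER REFERENCES dim_square(square_id),
        crime_count INTEGER NOT NULL
    );
""")

# Invented sample data.
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)", [
    (1, "2014-03-01", "2014-03", 2014),
    (2, "2014-03-02", "2014-03", 2014),
])
conn.executemany("INSERT INTO dim_square VALUES (?, ?, ?)", [
    (1, "A-00", "A"),
    (2, "A-01", "A"),
    (3, "B-00", "B"),
])
conn.executemany("INSERT INTO fact_crime VALUES (?, ?, ?)", [
    (1, 1, 2), (1, 2, 1), (2, 1, 4), (2, 3, 3),
])

# Additivity: daily counts per 100 m cell are summed to month and 1 km cell.
rollup = conn.execute("""
    SELECT s.cell_1km, d.month, SUM(f.crime_count)
    FROM fact_crime AS f
    JOIN dim_date AS d ON d.date_id = f.date_id
    JOIN dim_square AS s ON s.square_id = f.square_id
    GROUP BY s.cell_1km, d.month
    ORDER BY s.cell_1km
""").fetchall()
```

Storing the coarser hierarchy levels as columns of the dimension table keeps the fact table narrow and lets one `GROUP BY` choose the level of detail.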
DWH & OLAP for social environment (crime, human factors)
Data sources:
• population data – grid 1 km, 100 m, Census 2011 (CZSO), municipal IS
• register of land identification, addresses and properties (NMCA) – buildings
• central crime register (Police CZ) – events
• offence register (city police) – local; a central one is planned
• register of schools (Min. of Education, Youth and Sports) – contact
• register of health service providers (Min. of Health) – contact, beds
• register of unemployed (Labour office)
• register of gambling machines (Min. of Finance)
• register of companies (CZSO, or others)
ETL processes for DWH & OLAP for social environment
ETL processes:
• Data differ in quality, formats, access, legal and ethical aspects (license policy, sensitivity), and maintenance.
• Control procedures – integrity constraints, checking the validity of time range, geographical range, referential integrity etc.
• Harmonisation – reference time of an event derived from a time interval, harmonisation of addresses, classification of facilities, buildings etc.
• Geocoding for missing or bad coordinates.
• Aggregation – according to multidimensional modelling.
• Data anonymization – filtering, scrambling, rounding, projection.
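Three of the ETL steps listed on this slide (control, harmonisation of the event time, anonymization by rounding) can be sketched as follows. This is a minimal illustration under stated assumptions: the slide does not say how the reference time is derived, so the interval midpoint is used here as an assumed rule; the bounding box, the record layout and the function names are hypothetical.

```python
from datetime import datetime

# Assumed study-area bounding box in a projected CRS (metres); values are invented.
BBOX = (0.0, 0.0, 10_000.0, 10_000.0)  # (min_x, min_y, max_x, max_y)


def is_valid(record):
    """Control step: check the time range and the geographical range of one event."""
    min_x, min_y, max_x, max_y = BBOX
    return (
        record["start"] <= record["end"]
        and min_x <= record["x"] <= max_x
        and min_y <= record["y"] <= max_y
    )


def reference_time(start, end):
    """Harmonisation step: midpoint of the interval (an assumed rule)."""
    return start + (end - start) / 2


def anonymize_xy(x, y, cell=100.0):
    """Anonymization by rounding: snap a point to the centre of its grid cell."""
    return ((x // cell) * cell + cell / 2, (y // cell) * cell + cell / 2)


# Invented sample events.
events = [
    {"start": datetime(2014, 5, 1, 22, 0), "end": datetime(2014, 5, 2, 6, 0),
     "x": 1234.0, "y": 5678.0},
    # Rejected: end precedes start.
    {"start": datetime(2014, 5, 3, 8, 0), "end": datetime(2014, 5, 3, 7, 0),
     "x": 200.0, "y": 300.0},
    # Rejected: x lies outside the bounding box.
    {"start": datetime(2014, 5, 4, 8, 0), "end": datetime(2014, 5, 4, 9, 0),
     "x": -50.0, "y": 300.0},
]

clean = [
    {"time": reference_time(e["start"], e["end"]), "xy": anonymize_xy(e["x"], e["y"])}
    for e in events
    if is_valid(e)
]
```

Snapping to a cell centre also prepares the record for aggregation, since every event in the same 100 m cell ends up with identical coordinates.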
Structure
Fact tables:
• CRIME
• POPULATION
• UNEMPLOYED
• HEALTH
• BUILDING
• FACILITIES
Dimensional tables:
• DATE
• SQUARE
• ADMIN_UNITS
• AGE
• SEX
• and more
Dimensions and hierarchy
• Grid – 100 x 100 m (4th level of the scale system for communes and urban districts, Bacler), 500 m, 1 km, 5 km
• Administrative units – part of municipality, municipality, MEA, LAU1, NUTS3
• Temporal dimension – one-day unit, week, month, year
• Day-cycle hours – hour unit, morning time, rush hours
• Age – 5-year basic categories, 10-year, 20-year, “30 and more”
• Crime (& offences) – standard 3-level classification system
• Facilities – purpose and the hierarchical structure
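The nested grid and age hierarchies can be derived from the finest value by integer division. The sketch below assumes projected coordinates in metres and a grid origin at (0, 0), neither of which is stated on the slide; the function names are hypothetical, and the open-ended “30 and more” category is not modelled.

```python
GRID_SIZES_M = (100, 500, 1000, 5000)  # grid levels named on the slide


def grid_cells(x, y):
    """Return the lower-left corner of the containing cell at every grid level."""
    return {
        size: (int(x // size) * size, int(y // size) * size)
        for size in GRID_SIZES_M
    }


def age_bands(age):
    """Return the 5-, 10- and 20-year band containing an age as (low, high)."""
    return {
        width: (age // width * width, age // width * width + width - 1)
        for width in (5, 10, 20)
    }


cells = grid_cells(1234.0, 5678.0)
bands = age_bands(37)
```

Because each grid size divides the next one, every 100 m cell falls into exactly one parent cell at each coarser level, which is what makes roll-up along the grid dimension unambiguous.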
OLAP pivoting, selections, relationships
• Pivoting: place of commitment × residence of offenders
• Scatter plot, regression analysis: gambling machines × population
Data grid view
Number of burglaries per 100 flats (2014)
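An indicator of this kind combines measures from two fact tables that share the same grid granularity (a drill-across). The following is a minimal sketch with invented per-cell totals; the dictionaries and the helper `per_100` are hypothetical and only illustrate the arithmetic, including the case of cells without any flats.

```python
# Hypothetical per-cell totals from two fact tables with the same granularity.
burglaries = {"cell_a": 6, "cell_b": 1, "cell_c": 2}
flats = {"cell_a": 300, "cell_b": 50, "cell_c": 0}


def per_100(numerator, denominator):
    """Join two measures on the shared cell key and compute a rate per 100 units."""
    rates = {}
    for cell, count in numerator.items():
        base = denominator.get(cell, 0)
        # Cells without flats have no defined rate; mark them as None.
        rates[cell] = None if base == 0 else 100.0 * count / base
    return rates


rates = per_100(burglaries, flats)
```

Returning `None` rather than zero for cells without flats keeps "no dwellings" distinguishable from "no burglaries" on the resulting map.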
# burglaries to dwellings, # residential buildings (2014)
3 towns:
• CB – Ceske Budejovice
• KO – Kolin
• OV – Ostrava
Differences:
• density of buildings
• density of burglaries
• dependencies
Number of gambling machines per 1 km²
Number of gambling clubs per 100 inhabitants
# sprayer crimes per 1 school (2014)
Classification tree for sprayer crimes
• Dependency – secondary schools + regions; no secondary schools + gambling machines + districts
• No dependency – population, buildings, basic schools, property offences
Thank you for your attention!
Data are provided by courtesy of the Czech Statistical Office, Police of the Czech Republic, Czech Office for Surveying, Mapping and Cadastre, Czech Ministry of Finance, Labour offices, Czech Ministry of Health and Municipal Police departments in Ostrava, Kolín and České Budějovice.
The research is supported by the Czech Ministry of Interior, project “Geoinformatics as a tool to support integrated activities of safety and emergency units”, No. MV-32046-58/VZ-2012.