Building Trustworthy Semantic Webs Dr. Bhavani Thuraisingham The University of Texas at Dallas Lecture #3 Supporting Technologies: Databases, Information Management and Information Security August 2006


Building Trustworthy Semantic Webs

Dr. Bhavani Thuraisingham
The University of Texas at Dallas

Lecture #3
Supporting Technologies: Databases, Information Management and Information Security

August 2006

Database System

Consists of database, hardware, Database Management System (DBMS), and users
Database is the repository for persistent data
Hardware consists of secondary storage volumes, processors, and main memory
DBMS handles all users’ access to the database
Users include application programmers, end users, and the Database Administrator (DBA)
Need: reduced redundancy, avoiding inconsistency, ability to share data, enforce standards, apply security restrictions, maintain integrity, balance conflicting requirements

We have used the definition of a database management system given in C. J. Date’s book (Addison Wesley, 1990)

An Example Database System

[Figure: Users, Application Programs, Database Management System, Database]

Adapted from C. J. Date, Addison Wesley, 1990

Metadata

Metadata describes the data in the database
- Example: Database D consists of a relation EMP with attributes SS#, Name, and Salary
Metadatabase stores the metadata
- Could be physically stored with the database
Metadatabase may also store constraints and administrative information
Metadata is also referred to as the schema or data dictionary
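
As a rough illustration of metadata being stored with the database, the sketch below creates the EMP relation in an in-memory SQLite database and then queries the schema rather than the data. SQLite and the column types are assumptions made for the example; the slide does not name a particular DBMS.

```python
import sqlite3

# In-memory database D holding the EMP relation from the slide.
# The column types are assumptions; the slide only names the attributes.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE EMP ("SS#" INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)')

# SQLite keeps its schema catalog in the same database file as the data.
table_names = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per attribute:
# (cid, name, type, notnull, dflt_value, pk).
attribute_names = [row[1] for row in conn.execute("PRAGMA table_info(EMP)")]

print(table_names)
print(attribute_names)
```

Here the query result is itself data about data: the relation name and its attributes come from the catalog, not from any EMP row.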

Functional Architecture

[Figure: DBMS functional architecture. Components: User Interface Manager, Query Manager, Transaction Manager, Schema (Data Dictionary) Manager (metadata), Security/Integrity Manager, File Manager, Disk Manager. Groupings: Data Management, Storage Management]

DBMS Design Issues

Query Processing
- Optimization techniques
Transaction Management
- Techniques for concurrency control and recovery
Metadata Management
- Techniques for querying and updating the metadatabase
Security/Integrity Maintenance
- Techniques for processing integrity constraints and enforcing access control rules
Storage Management
- Access methods and index strategies for efficient access to the database

Relational Database: Example

Relation S:

S#   SNAME   STATUS   CITY
S1   Smith   20       London
S2   Jones   10       Paris
S3   Blake   30       Paris
S4   Clark   20       London
S5   Adams   30       Athens

Relation P:

P#   PNAME   COLOR   WEIGHT   CITY
P1   Nut     Red     12       London
P2   Bolt    Green   17       Paris
P3   Screw   Blue    17       Rome
P4   Screw   Red     14       London
P5   Cam     Blue    12       Paris
P6   Cog     Red     19       London

Relation SP:

S#   P#   QTY
S1   P1   300
S1   P2   200
S1   P3   400
S1   P4   200
S1   P5   100
S1   P6   100
S2   P1   300
S2   P2   400
S3   P2   200
S4   P2   200
S4   P4   300
S4   P5   400
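
To show how the three relations fit together, the sketch below loads S and SP into an in-memory SQLite database and joins them on S#. SQLite, the column types, and the two sample queries are assumptions chosen for illustration; the data rows are the ones on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S  ("S#" TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT);
CREATE TABLE SP ("S#" TEXT, "P#" TEXT, QTY INTEGER);
""")
conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", [
    ("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"), ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
])
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", [
    ("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
    ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
    ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
    ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400),
])

# Names of suppliers who supply part P2 (join S and SP on S#).
p2_suppliers = [name for (name,) in conn.execute(
    'SELECT S.SNAME FROM S JOIN SP ON S."S#" = SP."S#" '
    'WHERE SP."P#" = ? ORDER BY S."S#"', ("P2",))]

# Total quantity shipped by supplier S1.
(s1_total,) = conn.execute(
    'SELECT SUM(QTY) FROM SP WHERE "S#" = ?', ("S1",)).fetchone()

print(p2_suppliers, s1_total)
```

Supplier S5 (Adams) has no row in SP, so it never appears in a join result; that is the usual behaviour of an inner join.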

Concepts in Object Database Systems

Objects
- Every entity is an object
- Example: Book, Film, Employee, Car
Class
- Objects with common attributes are grouped into a class
Attributes or Instance Variables
- Properties of an object class inherited by the object instances
Class Hierarchy
- Parent-Child class hierarchy
Composite objects
- Book object with paragraphs, sections etc.
Methods
- Functions associated with a class
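
The concepts above map closely onto classes in an object-oriented language. The following is only a sketch in Python, not an object database: the class names Document and Section and the method names are hypothetical, chosen to mirror the Book example on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    """Parent class: attributes declared here are inherited by subclasses."""
    title: str

    def describe(self) -> str:
        # Method: a function associated with the class.
        return f"{type(self).__name__}: {self.title}"


@dataclass
class Section:
    heading: str
    paragraphs: List[str] = field(default_factory=list)


@dataclass
class Book(Document):
    """Child class in the hierarchy and a composite object made of sections."""
    sections: List[Section] = field(default_factory=list)

    def paragraph_count(self) -> int:
        return sum(len(section.paragraphs) for section in self.sections)


book = Book(
    title="Databases",
    sections=[Section("Intro", ["p1", "p2"]), Section("Models", ["p3"])],
)
print(book.describe(), book.paragraph_count())
```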

A Definition of a Distributed Database System

A collection of database systems connected via a network
The software that is responsible for interconnection is a Distributed Database Management System (DDBMS)
Each DBMS executes local applications and should be involved in at least one global application (Ceri and Pelagatti)
Homogeneous environment

Architecture

[Figure: Sites 1, 2, and 3 connected by a Communication Network. Labels: Distributed Processor 1, DBMS 1, Database 1; Distributed Processor 2, DBMS 2, Database 2; Distributed Processor 3, DBMS 3, Database 3]

Data Distribution

SITE 1

EMP1

SS#   Name    Salary   D#
1     John    20       10
2     Paul    30       20
3     James   40       20
4     Jill    50       20
5     Mary    60       10
6     Jane    70       20

DEPT1

D#   Dname     MGR
10   C. Sci.   Jane
30   English   David
40   French    Peter

SITE 2

EMP2

SS#   Name     Salary   D#
9     Mathew   70       50
7     David    80       30
8     Peter    90       40

DEPT2

D#   Dname     MGR
50   Math      John
20   Physics   Paul
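
A global application has to combine fragments held at different sites. The sketch below assumes the slide shows EMP1 and DEPT1 at Site 1 and EMP2 and DEPT2 at Site 2, with the rows as best they can be read from the flattened figure; the helper name `global_relation` is hypothetical, and plain Python lists stand in for the two sites.

```python
# Fragments per site: (SS#, Name, Salary, D#) and (D#, Dname, MGR).
# Row values are read from the slide and should be treated as assumptions.
emp1 = [(1, "John", 20, 10), (2, "Paul", 30, 20), (3, "James", 40, 20),
        (4, "Jill", 50, 20), (5, "Mary", 60, 10), (6, "Jane", 70, 20)]
dept1 = [(10, "C. Sci.", "Jane"), (30, "English", "David"), (40, "French", "Peter")]
emp2 = [(9, "Mathew", 70, 50), (7, "David", 80, 30), (8, "Peter", 90, 40)]
dept2 = [(50, "Math", "John"), (20, "Physics", "Paul")]


def global_relation(*fragments):
    """Union of horizontal fragments, as a global query would see them."""
    return [row for fragment in fragments for row in fragment]


emp = global_relation(emp1, emp2)
dept = global_relation(dept1, dept2)

# Join EMP and DEPT on D# across sites.
dept_name = {d_no: name for d_no, name, _mgr in dept}
employee_department = {name: dept_name[d_no] for _ss, name, _salary, d_no in emp}

print(employee_department["David"], employee_department["Paul"])
```

David's employee row is at Site 2 while his department row is at Site 1 (and the reverse for Paul), so neither site can answer the query alone.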

Interoperability of Heterogeneous Database Systems

[Figure: Database System A (Relational), Database System B (Object-Oriented), and Database System C (Legacy) connected by a Network]

Transparent access to heterogeneous databases - both users and application programs; Query, Transaction processing

Federated Database Management

[Figure: Database System A, Database System B, Database System C; Federation F1, Federation F2]

Cooperating database systems yet maintaining some degree of autonomy

Federated Data and Policy Management

[Figure: Component Data/Policy for Agency A, Agency B, and Agency C; each with an Export Data/Policy; Data/Policy for Federation]

Current Status and Directions

Developments
- Several prototypes and some commercial products
- Tools for schema integration and transformation
- Standards for interoperable database systems
Challenges being addressed
- Semantic heterogeneity
- Autonomy and federation
- Global transaction management
- Integrity and Security
New challenges
- Scale
- Web data management

What is Information Management?

Information management essentially analyzes the data and makes sense of it
Several technologies have to work together for effective information management
- Data Warehousing: Extracting relevant data and putting this data into a repository for analysis
- Data Mining: Extracting previously unknown information from the data
- Multimedia: Managing different media including text, images, video and audio
- Web: Managing the databases and libraries on the web

Data Warehouse

[Figure: Oracle DBMS for Employees, Sybase DBMS for Projects, Informix DBMS for Medical; Data Warehouse; Users Query the Warehouse]

Data Warehouse: Data correlating Employees with Medical Benefits and Projects
Could be any DBMS; usually based on the relational data model

Multidimensional Data Model

[Figure: multidimensional view of project data. Labels: Project Name, Project Leader, Project Sponsor, Project Cost (Dollars, Pounds, Yen), Project Duration (Years, Months, Weeks)]

Data Mining

[Figure: terms related to Data Mining: Knowledge Mining, Knowledge Discovery in Databases, Data Archaeology, Data Dredging, Database Mining, Knowledge Extraction, Data Pattern Processing, Information Harvesting, Siftware]

The process of discovering meaningful new correlations, patterns, and trends, often previously unknown, by sifting through large amounts of data using pattern recognition technologies and statistical and mathematical techniques (Thuraisingham 1998)

Multimedia Information Management

[Figure: Broadcast News Editor (BNE) and Broadcast News Navigator (BNN). Stages: Segregate Video Streams (Video Source into Imagery, Audio, Closed Caption Text); Analyze and Store Video and Metadata; Web-based Search/Browse by Program, Person, Location, ... Component labels: Scene Change Detection, Frame Classifier, Key Frame Selection, Speaker Change Detection, Silence Detection, Commercial Detection, Broadcast Detection, Closed Caption Preprocess, Token Detection, Correlation, Story Segmentation, Named Entity Tagging, Story GIST, Theme, Video and Metadata, Multimedia Database Management System]

Extracting Relations from Text for Mining: An Example

[Figure: Text Corpus, Concept Extraction, Repository, Association Rule Product]

Goal: Find Cooperating/Combating Leaders in a territory

Person1             Person2
Natalie Allen       Linden Soles      117
Leon Harris         Joie Chen         53
Ron Goldman         Nicole Simpson    19
. . .
Mobutu Sese Seko    Laurent Kabila    10
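
One simple way to arrive at a table of person pairs with a number attached is to count how often two names occur in the same document. The sketch below assumes that reading of the slide; the toy corpus and its names are hypothetical, and the concept-extraction step is taken as already done (each document is just a set of extracted person names).

```python
from collections import Counter
from itertools import combinations

# Hypothetical output of concept extraction: one set of person names per document.
documents = [
    {"Alice", "Bob"},
    {"Alice", "Bob", "Carol"},
    {"Carol", "Dave"},
]

# Count co-occurrences of each unordered pair of people.
pair_counts = Counter()
for people in documents:
    for pair in combinations(sorted(people), 2):
        pair_counts[pair] += 1

# Pairs that co-occur most often are candidates for further analysis.
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

Sorting the names before pairing makes ("Alice", "Bob") and ("Bob", "Alice") count as the same pair.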

Image Processing: Example: Change Detection

Trained Neural Network to predict “new” pixel from “old” pixel
- Neural Networks good for multidimensional continuous data
- Multiple nets give range of “expected values”
Identified pixels where actual value is substantially outside range of expected values
- Anomaly if three or more bands (of seven) out of range
Identified groups of anomalous pixels
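
The per-pixel decision rule can be sketched independently of the neural networks. In the example below the expected range for each of the seven bands is simply given as input (in the slide it would come from the trained networks); the function names and the sample numbers are hypothetical.

```python
def out_of_range_bands(actual, low, high):
    """Count bands whose actual value falls outside [low, high]."""
    return sum(1 for a, lo, hi in zip(actual, low, high) if a < lo or a > hi)


def is_anomalous(actual, low, high, min_bands=3):
    """Flag a pixel when at least min_bands of its bands are out of range."""
    return out_of_range_bands(actual, low, high) >= min_bands


# Expected range per band for one pixel (assumed values, seven bands).
low = [10, 10, 10, 10, 10, 10, 10]
high = [20, 20, 20, 20, 20, 20, 20]

unchanged_pixel = [12, 15, 19, 25, 11, 18, 14]  # one band out of range
changed_pixel = [30, 5, 19, 25, 11, 18, 14]     # three bands out of range

print(is_anomalous(unchanged_pixel, low, high), is_anomalous(changed_pixel, low, high))
```

Grouping neighbouring anomalous pixels, the final step on the slide, would follow as a separate pass over the image.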

Semantic Web

Some Challenges: Interoperability between Layers; Security and Privacy cut across all layers; Integration of Services; Composability

[Figure: Semantic Web layers: URI, UNICODE; XML, XML Schemas; RDF, Ontologies; Rules/Query; Logic, Proof and Trust. Additional labels: TRUST, PRIVACY, Other Services]

Adapted from Tim Berners-Lee’s description of the Semantic Web

Semantic Web Technologies

Web Database/Information Management
- Information retrieval and Digital Libraries
XML, RDF and Ontologies
- Representing information
Information Interoperability
- Integrating heterogeneous data and information sources
Intelligent agents
- Agents for locating resources, managing resources, querying resources and understanding web pages
Semantic Grids
- Integrating semantic web with grid computing technologies

Information Management for Collaboration

[Figure: Team A and Team B]

Teams A and B collaborating on a geographical problem

Some Emerging Information Management Technologies

Visualization
- Visualization tools enable the user to better understand the information
Peer-to-Peer Information Management
- Peers communicate with each other, share resources and carry out tasks
Sensor and Wireless Information Management
- Autonomous sensors cooperating with one another, gathering data, fusing data and analyzing the data
- Integrating wireless technologies with semantic web technologies

What is Knowledge Management?

Knowledge management, or KM, is the process through which organizations generate value from their intellectual property and knowledge-based assets

KM involves the creation, dissemination, and utilization of knowledge

Reference: http://www.commerce-database.com/knowledge-management.htm?source=google

Knowledge Management Components

Components of Knowledge Management: Components, Cycle and Technologies

Components: Strategies, Processes, Metrics
Cycle: Knowledge Creation, Sharing, Measurement and Improvement
Technologies: Expert systems, Collaboration, Training, Web

Organizational Learning Process

[Figure: Identification, Creation, Diffusion (Tacit, Explicit), Integration, Modification, Action; Metrics; Incentives]

Source: Reinhardt and Pawlowsky

Also see: Tools in Organizational Learning, http://duplox.wz-berlin.de/oldb/forslin.html

Operating System Security

Access Control
- Subjects are Processes and Objects are Files
- Subjects have Read/Write Access to Objects
- E.g., Process P1 has read access to File F1 and write access to File F2
Capabilities
- Processes must possess certain Capabilities / Certificates to access certain files to execute certain programs
- E.g., Process P1 must have capability C to read file F
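
Both mechanisms can be sketched with small lookup tables. The names P1, F1, F2, F, and C come from the slide's examples; the data structures and function names are hypothetical and far simpler than what a real operating system does.

```python
# Access control: (subject, object) -> set of permitted access modes.
access_matrix = {
    ("P1", "F1"): {"read"},
    ("P1", "F2"): {"write"},
}


def access_allowed(process, file, mode):
    """True if the matrix grants this process the given mode on the file."""
    return mode in access_matrix.get((process, file), set())


# Capabilities: each process holds tokens; each protected operation names
# the capability it requires.
capabilities_held = {"P1": {"C"}, "P2": set()}
capability_required = {("F", "read"): "C"}


def capability_allowed(process, file, mode):
    """True if the process holds the capability required for the operation."""
    needed = capability_required.get((file, mode))
    return needed is not None and needed in capabilities_held.get(process, set())


print(access_allowed("P1", "F1", "read"), access_allowed("P1", "F1", "write"))
print(capability_allowed("P1", "F", "read"), capability_allowed("P2", "F", "read"))
```

The difference is where the rights live: the matrix is consulted per object, whereas a capability is something the process itself carries and presents.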

Mandatory Security

Bell and La Padula Security Policy
- Subjects have clearance levels, Objects have sensitivity levels; clearance and sensitivity levels are also called security levels
- Unclassified < Confidential < Secret < TopSecret
- Compartments are also possible
- Compartments and Security levels form a partially ordered lattice
Security Properties
- Simple Security Property: Subject has READ access to an object if the subject’s security level dominates that of the object
- Star (*) Property: Subject has WRITE access to an object if the subject’s security level is dominated by that of the object
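
The two properties reduce to a single dominance test on (level, compartments) pairs. The sketch below uses the four levels from the slide; the compartment names and the tuple representation are assumptions made for the example.

```python
LEVELS = {"Unclassified": 0, "Confidential": 1, "Secret": 2, "TopSecret": 3}


def dominates(a, b):
    """a dominates b if a's level is at least b's and a's compartments include b's."""
    (level_a, comps_a), (level_b, comps_b) = a, b
    return LEVELS[level_a] >= LEVELS[level_b] and set(comps_a) >= set(comps_b)


def can_read(subject, obj):
    # Simple Security Property: no read up.
    return dominates(subject, obj)


def can_write(subject, obj):
    # Star (*) Property: no write down.
    return dominates(obj, subject)


secret_subject = ("Secret", {"NATO"})          # compartment names are hypothetical
unclassified_file = ("Unclassified", set())
secret_other_file = ("Secret", {"CRYPTO"})

print(can_read(secret_subject, unclassified_file))   # reading down is allowed
print(can_write(secret_subject, unclassified_file))  # writing down is refused
print(can_read(secret_subject, secret_other_file))   # incomparable labels
```

The last case shows why the ordering is only partial: two labels at the same level with different compartments dominate each other in neither direction.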

Covert Channel Example

Trojan horse at a higher level covertly passes data to a Trojan horse at a lower level
Example:
- File Lock/Unlock problem
- Processes at Secret and Unclassified levels collude with one another
- When the Secret process locks a file and the Unclassified process finds the file locked, a 1 bit is passed covertly
- When the Secret process unlocks the file and the Unclassified process finds it unlocked, a 0 bit is passed covertly
- Over time the bits could contain sensitive data
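
The lock/unlock channel can be simulated in a few lines. This is a toy model under stated assumptions: a locked file is read as a 1 and an unlocked file as a 0, the two processes take turns in perfect lockstep, and the class and function names are hypothetical.

```python
class SharedFile:
    """A file whose lock state is visible to processes at both levels."""

    def __init__(self):
        self.locked = False


def secret_send(shared_file, bit):
    # The Secret process never writes data down; it only locks or unlocks.
    shared_file.locked = (bit == 1)


def unclassified_receive(shared_file):
    # The Unclassified process only observes whether the file is locked.
    return 1 if shared_file.locked else 0


def transmit(bits):
    """Pass a bit sequence from the Secret to the Unclassified process."""
    shared_file = SharedFile()
    received = []
    for bit in bits:
        secret_send(shared_file, bit)
        received.append(unclassified_receive(shared_file))
    return received


secret_bits = [1, 0, 1, 1, 0]
leaked_bits = transmit(secret_bits)
print(leaked_bits)
```

No read or write access check is violated at any point, which is what makes the channel covert.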

Network Security

Security across all network layers
- E.g., Data Link, Transport, Session, Presentation, Application
Network protocol security
- Verification and validation of network protocols
Intrusion detection and prevention
- Applying data mining techniques
Encryption and Cryptography
Access control and trust policies
Other Measures
- Prevention from denial of service, Secure routing, - - -

Steps to Designing a Secure System

Requirements, Informal Policy and model
Formal security policy and model
Security architecture
- Identify security critical components; these components must be trusted
Design of the system
Verification and Validation

Product Evaluation

Orange Book
- Trusted Computer System Evaluation Criteria
Classes C1, C2, B1, B2, B3, A1 and beyond
- C1 is the lowest level and A1 the highest level of assurance
- Formal methods are needed for A1 systems
Interpretations of the Orange Book for Networks (Trusted Network Interpretation) and Databases (Trusted Database Interpretation)
Several companion documents
- Auditing, Inference and Aggregation, etc.
Many products are now evaluated using the Federal Criteria

Security Threats to Web/E-commerce

[Figure: Security Threats and Violations, branching into: Access Control Violations; Integrity Violations; Fraud; Denial of Service/Infrastructure Attacks; Sabotage; Confidentiality, Authentication, Nonrepudiation Violations]

Approaches and Solutions

End-to-end security
- Need to secure the clients, servers, networks, operating systems, transactions, data, and programming languages
- The various systems when put together have to be secure
Composable properties for security
Access control rules, enforce security policies, auditing, intrusion detection
Verification and validation
Security solutions proposed by W3C and OMG
Java Security
Firewalls
Digital signatures and Message Digests, Cryptography
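
Of the items listed, a message digest is the easiest to demonstrate. The sketch below uses SHA-256 from Python's standard library to show how a digest exposes tampering; the choice of SHA-256 and the sample messages are assumptions, and a digest alone does not authenticate the sender (that is the role of a digital signature, which is not shown here).

```python
import hashlib
import hmac


def digest(message: bytes) -> str:
    """Return the SHA-256 message digest as a hex string."""
    return hashlib.sha256(message).hexdigest()


def unmodified(message: bytes, expected_digest: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(digest(message), expected_digest)


original = b"Transfer 100 to account 42"   # hypothetical e-commerce message
tampered = b"Transfer 900 to account 42"

published_digest = digest(original)
print(unmodified(original, published_digest), unmodified(tampered, published_digest))
```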

Other Security Technologies

Data and Applications Security
Middleware Security
Insider Threat Analysis
Risk Management
Trust and Economics
Biometrics