J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Lucas Jellema (AMIS) NLJUG JFall 2013 6th November 2013, Nijkerk, The Netherlands On the integrity of data in Java Applications

description

The accuracy, internal quality, and reliability of data are frequently referred to as 'data integrity'. Without it, data is less valuable or even useless. This session takes a close look at what data integrity entails and how it can be enforced in multi-tier application architectures that use distributed data sources and global transactions. The discussion makes clear which elements any robust implementation of data-oriented business rules (aka data constraints) requires, and it explains why most existing solutions are not as watertight as is frequently assumed. Steps for achieving reliable constraint enforcement are demonstrated.

Transcript of J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Page 1: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Lucas Jellema (AMIS), NLJUG JFall 2013

6th November 2013, Nijkerk, The Netherlands

On the integrity of data in Java Applications

Page 2: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Agenda

• What is integrity?
• Enforcing data constraints
  – throughout the application architecture
• Transactions
• Exclusive Access to …
• The Distributed World

Page 3: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Definition of Integrity

• Truth
  – Nothing but the truth
• The Only Truth
• [Degree of] success or completeness of actions is known

Page 4: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Sufficient Integrity

[Figure: integrity illustrated with the labels Corruptible and Uncorrupted and the qualities Complete, Consistent, Reliable, and Correct. Sample values shown: π, 48,23, 7,0, “five”, 4233,0000002]

Page 5: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Conference Application

Page 6: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Conference Application

[Diagram: Client (HTML 5 & JavaScript); Web Tier (JavaServer Faces); Business Tier (EJB, JPA, POJO Domain Model); RDBMS]

Page 7: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Validation at entry time

Page 8: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Validation at entry time: Client and View

Page 9: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Validation at entry time: Client and View

Page 10: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

More validation at entry time – Bean Validation

Page 11: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Validation at entry time: Bean Validation in View

Page 12: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Engage Bean Validation in Web Tier

Page 13: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Record (Type) level rules

• Program should be Kids when age < 18; either Developer or Management when age > 18
• Using JavaScript
  – when either field changes (handle nulls)
  – on submit of the entire record
• Using Bean Validation: custom type validator
  – in either web-tier or JPA
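
The record-level rule above can be sketched in plain Java. The class and method names (`ProgramRule`, `isValid`) are hypothetical, and the slide does not say which programs are allowed at exactly age 18, so this sketch assumes that age is unconstrained. In a Bean Validation setup the same check would typically live inside a custom class-level `ConstraintValidator`; that wiring is left out here to keep the example free of dependencies.

```java
import java.util.Set;

// Hypothetical helper; not taken from the slides.
final class ProgramRule {
    private static final Set<String> ADULT_PROGRAMS = Set.of("Developer", "Management");

    private ProgramRule() {}

    // Null handling: a record with a missing age or program is left to
    // attribute-level (not-null) checks, so the tuple rule lets it pass.
    static boolean isValid(Integer age, String program) {
        if (age == null || program == null) {
            return true;
        }
        if (age < 18) {
            return "Kids".equals(program);
        }
        if (age > 18) {
            return ADULT_PROGRAMS.contains(program);
        }
        return true; // age == 18: not specified on the slide (assumption)
    }
}
```

Because the rule depends on two fields, it has to be re-evaluated when either of them changes, which is why the slide mentions both the field-change and the submit events.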

Page 14: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Type Level Constraints with Bean Validation

Page 15: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Type Level Bean Validation: Custom Validator

Page 16: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Validation Implementation options & considerations

[Diagram: validation options per component. Components: Client (JSF based HTML 5 & JavaScript), Mobile Client, Client (pure HTML 5 & JavaScript), Web Tier (JavaServer Faces), RESTful Services, Business Tier (EJB, JPA, POJO Domain Model), RDBMS. Option labels shown: "Native HTML 5; JavaScript" (twice), "Native", "Custom; JSF Validator; Bean Validation", and "Custom; Bean Validation" (twice)]

Page 17: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

But wait – there is more!

• More User Interfaces
• More Attendee Instances
• More Entities & More types of Constraints
• More Users, Sessions, and Transactions
• More Nodes in the Middle Tier Cluster
• More Data Stores

Page 18: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Domain model

• Attendee
• Speaker
• Session
• Room
• Slot
• Attendance
  – Booked
  – Realized

Page 19: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Multiple-Instances-of-Single-Entity constraints

• Constraints that cover multiple same type objects/instances
  – Attendee’s Registration Id is unique
  – No more than 5 conference attendees from the same company
  – Not more than two sessions by the same speaker
  – At most one session scheduled per room per slot
  – Only one keynote session in a slot
  – Sessions from up to a maximum of three tracks can be scheduled in the same room

Page 20: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Inter entity constraints

• Attendees can only attend one hands-on session during the conference
• A person cannot attend another session in a slot in which the session (s)he is speaker of is scheduled
• No more planned session attendances are allowed than the capacity of the room in which the session is scheduled to take place
• If the room capacity is smaller than 100, then no more than 2 people from the same company may sign up for it
• Attendees from Amsterdam cannot attend sessions in room 010
• Common challenge:
  – Many data change events can lead to constraint violation

Page 21: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Event Analysis for Inter Entity Constraint

• No more planned session attendances are allowed than the capacity of the room in which the session is scheduled to take place

Create, Update (session reference)

Update (capacity [decrease])

Update (room reference)
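
The three triggering events listed above can be sketched as three mutators that each re-check the same rule. This is a minimal in-memory illustration; the classes `Room`, `Session`, and `CapacityRule` and their fields are hypothetical and do not come from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model; names are illustrative only.
final class Room {
    int capacity;
    Room(int capacity) { this.capacity = capacity; }
}

final class Session {
    Room room;
    final List<String> plannedAttendees = new ArrayList<>();
    Session(Room room) { this.room = room; }
}

final class CapacityRule {
    private CapacityRule() {}

    static boolean holds(Session s) {
        return s.plannedAttendees.size() <= s.room.capacity;
    }

    // Event 1: create an attendance (or point an existing one at this session).
    static boolean addAttendance(Session s, String attendee) {
        s.plannedAttendees.add(attendee);
        if (holds(s)) return true;
        s.plannedAttendees.remove(s.plannedAttendees.size() - 1); // undo
        return false;
    }

    // Event 2: change (decrease) the capacity of a room.
    static boolean changeCapacity(Room r, List<Session> sessionsInRoom, int newCapacity) {
        int old = r.capacity;
        r.capacity = newCapacity;
        for (Session s : sessionsInRoom) {
            if (!holds(s)) { r.capacity = old; return false; }
        }
        return true;
    }

    // Event 3: move a session to another room.
    static boolean moveSession(Session s, Room newRoom) {
        Room old = s.room;
        s.room = newRoom;
        if (holds(s)) return true;
        s.room = old; // undo
        return false;
    }
}
```

The point of the event analysis is visible in the shape of the code: one rule, three entry points, and missing any one of them leaves a path along which the constraint can be violated.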

Page 22: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Constraint classification

• Based on event-analysis (when can the constraint get violated) we discern these categories of constraints
  – Attribute
  – Tuple
  – Entity
  – Inter Entity
• Each category has its own implementation methods, options and considerations
  – Multi record instance rules cannot meaningfully be enforced in client/web-tier

Page 23: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

We are not ‘Sans Famille’ (without family)

Page 24: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

We are not ‘Sans Famille’ (without family)

[Diagram: RDBMS; Client (JSF based HTML 5 & JavaScript); Mobile Client; Client (pure HTML 5 & JavaScript); Web Tier (JavaServer Faces); RESTful Services; Business Tier (EJB, JPA, POJO Domain Model)]

Page 25: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Multiple clients for Data Source

[Diagram: one RDBMS accessed by two Java stacks (each with Client (JSF based HTML 5 & JavaScript), Mobile Client, Client (pure HTML 5 & JavaScript), Web Tier (JavaServer Faces), RESTful Services, and Business Tier (EJB, JPA, POJO Domain Model)) as well as by ESB, .NET, Batch, and DBA/Application Admin]

Page 26: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Integrity Enforcement in the Persistent Store

• All data is available
• Persistent store is the final stop: the buck stops here
  – Any alternative data manipulation (channel) has to go to the persistent store
  – Mobile, Batch, DBA, ESB
• Built-in (native) mechanisms for constraint enforcement
  – Productive development, proven robustness, scalable performance
  – For example: Column Type, PK/UK, FK, Check; trigger
• Transactions
• Enforcing integrity is an integral part of persisting data
  – Without final validation, persistent store cannot take responsibility for integrity

Page 27: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Multiple-Instances-of-Single-Entity constraints

• No more than 5 conference attendees from the same company

Page 28: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Implementation Consideration for Multiple-Entity-Instance rule

• Implementation – how and where?
  – Is the entire set of data available?
  – Is all associated info available?
  – Is the data set stable?
  – Can the constraint elegantly be implemented (natively? good framework support?)
  – Are all data access paths covered?

Page 29: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Implementing Multi-Instance constraint ‘5 max per company’

[Diagram: Business Tier (JPA) with POJO Domain Model and L2 Cache, both holding Attendees]

Register New Attendee – method A
- Ensure L2 Cache is up to date in terms of Attendees (fetch all attendees into cache)
- Inspect the collection of attendees for same company
- Persist Attendee if collection does not hold 5 (or more)

Register New Attendee – method B
- Select count of attendees in same company from the Data Store
- Inspect the long value
- Persist Attendee if long is < 5

Page 30: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Max 5 per company: JPA Facade enforcement

Page 31: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Max 5 per Company – Flaws in JPA Enforcement

• Persist does not [always] ‘post to database’
  – When more than one attendee is added in a transaction, prior ones are not counted when the latter are validated

[Diagram: Business Tier (JPA) with Facade, POJO Domain Model, and Attendees]

Thread 1: select count, persist, select count, persist

Page 32: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

One thread persisting two attendees in a row – no flush

Page 33: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Max 5 per Company – Flaws in JPA Enforcement

• Persist does not [always] ‘post to database’
  – When more than one attendee is added in a transaction, prior ones are not counted when the latter are validated

[Diagram: Business Tier (JPA) with Facade, POJO Domain Model, and Attendees]

Thread 1: select count, persist, select count, persist, commit

Page 34: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Flush after persist for complete picture
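
The effect of flushing can be illustrated with a toy model, sketched here under stated assumptions. `ToyPersistenceContext` is a hypothetical stand-in and not the JPA `EntityManager` API: it only mimics the situation the slides describe, where a count query sees rows posted to the database but not entities that were persisted and not yet flushed. Whether a real provider flushes before a query depends on the flush mode and the type of query.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a persistence context; this is NOT the real JPA EntityManager API.
final class ToyPersistenceContext {
    private final List<String> posted = new ArrayList<>();  // rows already posted to the database
    private final List<String> pending = new ArrayList<>(); // persisted, but not yet flushed

    void persist(String company) { pending.add(company); }

    void flush() {
        posted.addAll(pending);
        pending.clear();
    }

    // Mimics a count query that runs in the database and therefore ignores pending rows.
    long countByCompany(String company) {
        return posted.stream().filter(company::equals).count();
    }

    // Commit posts whatever is still pending, then reports the resulting row count.
    long commitAndCount(String company) {
        flush();
        return countByCompany(company);
    }
}

final class ToyAttendeeFacade {
    static final int MAX_PER_COMPANY = 5;

    private final ToyPersistenceContext ctx;
    private final boolean flushAfterPersist;

    ToyAttendeeFacade(ToyPersistenceContext ctx, boolean flushAfterPersist) {
        this.ctx = ctx;
        this.flushAfterPersist = flushAfterPersist;
    }

    boolean register(String company) {
        if (ctx.countByCompany(company) >= MAX_PER_COMPANY) {
            return false; // constraint would be violated
        }
        ctx.persist(company);
        if (flushAfterPersist) {
            ctx.flush(); // make this attendee visible to the next count
        }
        return true;
    }
}
```

With `flushAfterPersist` set to `false`, seven registrations for the same company in one transaction are all accepted, because every count still sees zero posted rows. With `true`, the sixth and seventh are rejected.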

Page 35: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

JPA Facade enforcement in a multi-threaded world

[Diagram: Client Session A and Client Session B (HTML 5 & JavaScript); Web Tier; Business Tier (JPA) with Facade, POJO Domain Model, and Attendees]

Thread 1: select count, persist
Thread 2: select count, persist

Page 36: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

JPA Facade enforcement in a multi-threaded world

[Diagram: Client Session A and Client Session B (HTML 5 & JavaScript); Web Tier; Business Tier (JPA) with Facade, POJO Domain Model, and Attendees]

Thread 1: select count, persist, commit
Thread 2: select count, persist, commit

Page 37: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Two threads interleaving
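
The interleaving can be replayed deterministically without real threads. The following sketch uses a hypothetical `ToyDatabase` in which each transaction sees committed rows plus its own uncommitted rows, but never another transaction's uncommitted rows. All names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Toy database: committed rows are shared, uncommitted rows are private to a transaction.
final class ToyDatabase {
    final List<String> committed = new ArrayList<>();

    final class Tx {
        private final List<String> own = new ArrayList<>();

        long count(String company) {
            long shared = committed.stream().filter(company::equals).count();
            return shared + own.stream().filter(company::equals).count();
        }

        void insert(String company) { own.add(company); }

        void commit() {
            committed.addAll(own);
            own.clear();
        }
    }

    Tx begin() { return new Tx(); }
}

final class InterleavingDemo {
    private InterleavingDemo() {}

    // Replays: T1 count, T2 count, T1 persist, T2 persist, T1 commit, T2 commit.
    static long run() {
        ToyDatabase db = new ToyDatabase();
        for (int i = 0; i < 4; i++) db.committed.add("AMIS"); // 4 attendees already registered

        ToyDatabase.Tx t1 = db.begin();
        ToyDatabase.Tx t2 = db.begin();
        boolean ok1 = t1.count("AMIS") < 5; // sees 4
        boolean ok2 = t2.count("AMIS") < 5; // also sees 4
        if (ok1) t1.insert("AMIS");
        if (ok2) t2.insert("AMIS");
        t1.commit();
        t2.commit();
        return db.committed.stream().filter("AMIS"::equals).count();
    }
}
```

Both transactions pass their own check, yet the company ends up with six attendees, one more than the constraint allows. Neither transaction did anything wrong in isolation; the violation only exists in the combination.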

Page 38: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Database Solution?

Page 39: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Data Trick – Materialized View with Check Constraint

Page 40: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Transactions

• Logically consistent set of data manipulations
  – Atomic units of work
  – Succeed or fail together
  – Any changes inside a transaction are invisible to other sessions/transactions until the transaction completes (commits)
  – Note: during a transaction, constraints may be violated; the only thing that matters: commit [time]
  – Transaction ends with successful commit or rollback; in both cases, transaction-owned locks are released
• ACID (in RDBMS)
  – vs BASE (in NoSQL)
• Note: post vs. commit with RDBMS
  – Post means do [all] data manipulation (insert, update, delete) but do not commit [yet]
  – Only upon commit are changes persisted and published

Page 41: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Perfect Integrity

Page 42: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Fine grained locking

Attendees

Unique Key UK1 on (FirstName, LastName)

Transaction 1 Transaction 2

insert … ('John','Doe',…)

Page 43: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Fine grained locking

Attendees

Unique Key UK1 on (FirstName, LastName)

Transaction 1 Transaction 2

insert … ('John','Doe',…)

update <JANE> set firstname ='John'

insert … ('Jane','Doe',…)

Page 44: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Fine grained locking

Attendees

Unique Key UK1 on (FirstName, LastName)

Transaction 1 Transaction 2

insert … ('John','Doe',…)

update <JANE> set firstname ='John'

commit

insert … ('Jane','Doe',…)

Lock on UK1_JOHN_DOE
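
A rough sketch of the idea behind such a fine-grained lock: the lock is taken on one unique-key value, not on the whole table, so inserts of other names are not held up. `ToyUniqueIndex` is a hypothetical model. A real database would make the second session wait until the first commits or rolls back, whereas this sketch simply reports `WOULD_BLOCK`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Toy model of a unique index that locks individual key values.
final class ToyUniqueIndex {
    enum Result { OK, WOULD_BLOCK, DUPLICATE }

    private final Set<String> committedKeys = new HashSet<>();
    private final Map<String, String> keyLocks = new HashMap<>(); // key value -> owning transaction

    Result insert(String tx, String firstName, String lastName) {
        String key = firstName + "|" + lastName;
        if (committedKeys.contains(key)) return Result.DUPLICATE;
        String owner = keyLocks.get(key);
        if (owner != null && !owner.equals(tx)) return Result.WOULD_BLOCK;
        keyLocks.put(key, tx); // lock only this key value
        return Result.OK;
    }

    // Commit publishes the transaction's keys and releases its locks.
    void commit(String tx) {
        Iterator<Map.Entry<String, String>> it = keyLocks.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> e = it.next();
            if (e.getValue().equals(tx)) {
                committedKeys.add(e.getKey());
                it.remove();
            }
        }
    }
}
```

In this model a second transaction can insert ('Jane','Doe') while the first holds ('John','Doe'), but it cannot claim ('John','Doe') itself: before the first commit it would block, and after the commit it fails as a duplicate.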

Page 45: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

JPA Facade enforcement: Exclusive Constraint Checking

[Diagram: Client Session A and Client Session B (HTML 5 & JavaScript); Web Tier; Business Tier (JPA) with Facade, POJO Domain Model, and Attendees; LockMgr holding lock ATT_MAX]

Thread 1: take lock, select count, persist, commit
Thread 2: take lock, …, select count, rollback

Page 46: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Two threads and Lock on Constraint

Page 47: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Two threads and Lock on Constraint
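
A minimal sketch of exclusive constraint checking with two real threads, assuming a single JVM. The lock field is named after the `ATT_MAX` lock on the slides; the class `LockedAttendeeFacade` and the in-memory list standing in for the database are hypothetical. A `ReentrantLock` only protects one JVM, so a clustered middle tier would need a distributed lock manager instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

final class LockedAttendeeFacade {
    static final int MAX_PER_COMPANY = 5;

    private final List<String> committed = new ArrayList<>(); // stands in for the database
    private final ReentrantLock attMaxLock = new ReentrantLock(); // one lock per constraint

    boolean register(String company) {
        attMaxLock.lock(); // take lock
        try {
            long count = committed.stream().filter(company::equals).count(); // select count
            if (count >= MAX_PER_COMPANY) {
                return false; // roll back
            }
            committed.add(company); // persist and commit
            return true;
        } finally {
            attMaxLock.unlock(); // lock is released when the transaction ends
        }
    }

    long count(String company) {
        attMaxLock.lock();
        try {
            return committed.stream().filter(company::equals).count();
        } finally {
            attMaxLock.unlock();
        }
    }

    // Four attendees exist; two threads each try to add one more.
    static long demo() {
        LockedAttendeeFacade facade = new LockedAttendeeFacade();
        for (int i = 0; i < 4; i++) facade.register("AMIS");
        Thread t1 = new Thread(() -> facade.register("AMIS"));
        Thread t2 = new Thread(() -> facade.register("AMIS"));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return facade.count("AMIS");
    }
}
```

Whichever thread takes the lock first registers the fifth attendee; the other one then counts five and is rejected, so the company never exceeds the maximum.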

Page 48: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Distributed or Global Transaction

• One logical unit of work - involving data manipulations in multiple resources (global transaction composed of local transactions)

[Diagram: Client (JSF based HTML 5 & JavaScript), Mobile Client, and Client (pure HTML 5 & JavaScript); Web Tier (JavaServer Faces) and RESTful Services; Business Tier (EJB, POJO Domain Model); resources: two RDBMS instances, JMS, and ERP (JCA)]

Page 49: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Implementation for Distributed Transaction

• Typical approach: two-phase commit
  – Each resource locks and validates, then reports OK or NOK back to the transaction overseer
  – When all resources have indicated OK, then phase two: all resources commit and release locks
  – When one or more resources signal NOK, then phase two: all resources roll back/undo changes and release locks
• With regard to integrity:
  – With a distributed transaction, the integrity for each participant is handled as before; this will result in ‘constraint-locks’ in multiple separate resources
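
The two phases can be sketched as a small coordinator loop. The interface `Participant` and the classes `TwoPhaseCoordinator` and `RecordingParticipant` are hypothetical names for illustration; real transaction managers add logging, timeouts, and recovery that this sketch leaves out.

```java
import java.util.List;

// Hypothetical participant contract, loosely following the bullets above.
interface Participant {
    boolean prepare(); // phase one: lock and validate, then vote OK (true) or NOK (false)
    void commit();     // phase two after a unanimous OK
    void rollback();   // phase two when any vote was NOK
}

final class TwoPhaseCoordinator {
    private TwoPhaseCoordinator() {}

    static boolean run(List<? extends Participant> participants) {
        boolean allOk = true;
        for (Participant p : participants) {
            allOk = p.prepare() && allOk; // every participant gets to vote
        }
        for (Participant p : participants) {
            if (allOk) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allOk;
    }
}

// Simple participant that records what happened to it.
final class RecordingParticipant implements Participant {
    private final boolean vote;
    String state = "new";

    RecordingParticipant(boolean vote) { this.vote = vote; }

    public boolean prepare() { state = "prepared"; return vote; }
    public void commit() { state = "committed"; }
    public void rollback() { state = "rolled back"; }
}
```

A single NOK vote is enough to roll back every participant, including those that voted OK and were holding their constraint locks in the meantime.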

Page 50: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Distributed (aka global) transaction inside container

• Java EE containers (and various non-EE JTA implementations) support global (distributed) transactions within a JVM
  – JTA (JSR-907) – based on X/Open XA architecture
• Key elements are the Transaction Monitor (the container) and Resource Managers (JDBC, EJB, JMS, JCA)
• One non-XA resource (file system, email, …) can participate in a global transaction:
  – All XA-resources perform Phase One
  – The non-XA resource does its thing
  – Upon success of the non-XA resource: others perform Phase Two by committing
  – Upon failure of the non-XA resource: others roll back
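
The ordering for a single non-XA participant can be sketched as follows. `XaParticipant`, `LastResourceCoordinator`, and `CountingXa` are hypothetical names; real XA resources implement `javax.transaction.xa.XAResource`, which has a richer contract than the three methods assumed here.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical, simplified contract for an XA-capable resource.
interface XaParticipant {
    boolean prepare();
    void commit();
    void rollback();
}

final class LastResourceCoordinator {
    private LastResourceCoordinator() {}

    static boolean run(List<? extends XaParticipant> xaResources, BooleanSupplier nonXaWork) {
        // Phase one for every XA resource.
        boolean allPrepared = true;
        for (XaParticipant p : xaResources) {
            allPrepared = p.prepare() && allPrepared;
        }
        // The non-XA resource runs only after all XA resources voted OK.
        boolean ok = allPrepared && nonXaWork.getAsBoolean();
        // Phase two: the outcome of the non-XA resource decides for everyone.
        for (XaParticipant p : xaResources) {
            if (ok) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return ok;
    }
}

// Test double that counts phase-two calls.
final class CountingXa implements XaParticipant {
    int commits;
    int rollbacks;

    public boolean prepare() { return true; }
    public void commit() { commits++; }
    public void rollback() { rollbacks++; }
}
```

Only one non-XA resource fits this scheme, because its one-shot result takes the place of the coordinator's commit decision; a second one could fail after the first had already done its irreversible work.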

Page 51: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Distributed transactions across/outside containers

[Diagram: "Step 2: Payment". RDBMS; Client (JSF based HTML 5 & JavaScript); Mobile Client; Client (pure HTML 5 & JavaScript); Web Tier (JavaServer Faces); RESTful Services; Business Tier (EJB, JPA, POJO Domain Model)]

Page 52: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Distributed transactions across/outside containers

• Transaction involving remote containers, Web Services, File System or any stateless transaction participant
• There is no actual common, shared vehicle (like a global XA transaction)
  – There is not really a coordinated two-phase commit
• Transaction consists of
  – Each resource does its thing – lock, validate, commit (or rollback), report back
  – If all resources report success: great, done
  – If one resource reports failure, then all other resources should perform ‘compensation’, i.e. rollback/undo effects of a committed transaction

[Diagram: a Transaction in a Container spanning a Local Enterprise Resource and two Remote/Stateless Enterprise Resources; arrows labelled commit, compensate, commit]
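
A minimal sketch of compensation, assuming each resource exposes a local commit and a matching undo operation. The names `CompensableStep`, `CompensatingCoordinator`, and `LoggingStep` are hypothetical, and the choice to compensate in reverse order of commit is an assumption of this sketch, not something the slides prescribe.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical contract: each resource commits on its own and can undo that commit later.
interface CompensableStep {
    boolean commit();   // local transaction; returns false when it fails (and rolls back locally)
    void compensate();  // undoes the effects of an earlier successful commit
}

final class CompensatingCoordinator {
    private CompensatingCoordinator() {}

    static boolean run(List<? extends CompensableStep> steps) {
        Deque<CompensableStep> done = new ArrayDeque<>();
        for (CompensableStep step : steps) {
            if (step.commit()) {
                done.push(step);
            } else {
                while (!done.isEmpty()) {
                    done.pop().compensate(); // most recently committed first
                }
                return false;
            }
        }
        return true;
    }
}

// Step that records what happens to it in a shared log.
final class LoggingStep implements CompensableStep {
    private final String name;
    private final boolean succeeds;
    private final List<String> log;

    LoggingStep(String name, boolean succeeds, List<String> log) {
        this.name = name;
        this.succeeds = succeeds;
        this.log = log;
    }

    public boolean commit() {
        log.add((succeeds ? "commit:" : "fail:") + name);
        return succeeds;
    }

    public void compensate() { log.add("compensate:" + name); }
}
```

Unlike a rollback, compensation runs after other sessions may already have seen the committed data, which is why it offers weaker guarantees than a coordinated two-phase commit.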

Page 53: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Compensation

• How to implement a compensation mechanism?
• How long after the commit can compensation be requested?
• What is the state of the enterprise resource between commit and the compensation expiry time?
• Should the invoker notify the resource that compensation is no longer required (so the ‘logical locks’/‘temporary state’ can be updated)?
  – i.e. the global distributed transaction has successfully completed

[Diagram: Enterprise Resource with commit and compensate operations]

Page 54: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

RESTful transaction is a distributed transaction

[Diagram: Client calling Resource A, Resource B, and Resource C with PUT, POST, and DELETE requests; Domain Model/JPA Cache]

Page 55: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

RESTful transaction is a distributed transaction

[Diagram: Client calling Resource A, Resource B, and Resource C with PUT, POST, and DELETE requests; Domain Model/JPA]

Page 56: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Distributed Constraints

• Constraints that involve data collections in multiple enterprise resources

[Diagram: Client (JSF based HTML 5 & JavaScript), Mobile Client, and Client (pure HTML 5 & JS); Web Tier (JavaServer Faces) and RESTful Services; Business Tier (EJB, POJO Domain Model); resources: two RDBMS instances (Table X, Table Y), JMS, and ERP (JCA)]

Page 57: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Distributed Constraints

• Not more than three attendees (resource A) from the same company may attend a session (resource B)
  – Insert/Update Attendance requires validation – as does update of Attendee.company

[Diagram: three Clients; two stacks of Web Tier and Java EE Business Tier; resources ATTENDANCES and ATTENDEES; Distributed Lock Manager holding lock MAX_3_COMP_ATT]

Page 59: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Distributed Constraints

• Not more than three attendees (resource A) from the same company may attend a session (resource B)
  – Insert/Update Attendance requires validation – as does update of Attendee.company

[Diagram: three Clients; two stacks of Web Tier and Java EE Business Tier; ESB; resources ATTENDANCES and ATTENDEES; Distributed Lock Manager holding lock MAX_3_COMP_ATT]

Page 60: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Java global (distributed) lock managers

• Within JVM: SynchronousQueue
• Across JVMs: Apache ZooKeeper, Hazelcast, Oracle Coherence, …

[Diagram: three JVMs]

Page 61: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Summary

• Which level of integrity is required?
• Change undermines integrity
  – Data change is trigger for constraint validation
• Exclusive lock on multi-record validation
  – released when transaction commits
• Ensure that all data access paths are covered
  – Not all data manipulations may come through the Java middle tier
• Transactions may include multiple enterprise resources
  – That may not be able to participate in a distributed transaction and have to support a compensation mechanism
• True integrity and real robustness are very hard to achieve
  – Much harder than is commonly assumed

Page 62: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Handling Integrity Really Well...

Page 63: J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

Lucas Jellema (AMIS)

Email: [email protected]
Twitter: @lucasjellema

Blog: http://technology.amis.nl
Website: http://www.amis.nl