Batching and Java EE (jdk.io)
![Page 1: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/1.jpg)
BATCHING AND JAVA EE
Ryan Cuprak
![Page 2: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/2.jpg)
What is Batch Processing?
Batch jobs are typically:
• Bulk-oriented
• Non-interactive
• Potentially compute intensive
• Candidates for parallel execution
• Invoked ad hoc, on a schedule, on demand, etc.
![Page 3: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/3.jpg)
Batching Examples
• Monthly reports/statements
• Daily data cleanup
• One-time data migrations
• Data synchronization
• Data analysis
• Portfolio rebalancing
![Page 4: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/4.jpg)
Introducing Java EE Batching
• Introduced in Java EE 7
• JSR 352 - https://jcp.org/en/jsr/detail?id=352
• Reference implementation: https://github.com/WASdev/standards.jsr352.jbatch
• Batch Framework:
  • Batch container for execution of jobs
  • XML Job Specification Language
  • Batch annotations and interfaces
  • Supporting classes and interfaces for interacting with the container
• Depends on CDI
![Page 5: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/5.jpg)
Java EE Batching Overview
[Diagram: JobOperator, Job, Step, and JobRepository. A Job has many Steps (1 to *). A Step has one ItemReader, one ItemProcessor, and one ItemWriter.]
![Page 6: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/6.jpg)
Java EE Batching Overview
• Job: EndOfDayJob
• JobInstance (many per Job): EndOfDayJob for 9/1/2016
• JobExecution (many per JobInstance): first attempt at EndOfDayJob for 9/1/2016
![Page 7: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/7.jpg)
Java EE Batching Features
• Fault-tolerant – checkpoints and job persistence
• Transactions - chunks execute within a JTA transaction
• Listeners – notification of job status/completion/etc.
• Resource management – limits concurrent jobs
• Starting/stopping/restarting – job control API
![Page 8: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/8.jpg)
Java EE Batching Deployment
Deploy batch jobs in: WAR, EAR, JAR
Manage jobs – split application into modules:
• Server A: frontend.war
• Server B: app.war (End of Day Job, Cleanup Job)
• Server C: app2.war (Analytics Job)
![Page 9: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/9.jpg)
Batchlet
![Page 10: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/10.jpg)
Exit Codes
Code: Description
• STARTING: Job has been submitted to the runtime.
• STARTED: Batch job has started executing.
• STOPPING: Job termination has been requested.
• STOPPED: Job has been stopped.
• FAILED: Job has thrown an error or a failure was triggered by <fail>.
• COMPLETED: Job has completed normally.
• ABANDONED: Job cannot be restarted.
![Page 11: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/11.jpg)
Basic Layout
CDI Configuration
Job Configuration
Batchlet
![Page 12: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/12.jpg)
Job Configuration
META-INF/batch-jobs/<job-name>.xml
![Page 13: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/13.jpg)
Batch Runtime
![Page 14: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/14.jpg)
Batchlet with Termination
Batchlets should implement stop() and terminate when requested!
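The original code for this slide is not in the transcript. As a minimal sketch of cooperative termination, the example below uses a hypothetical `StoppableTask` interface that mirrors the `process()`/`stop()` shape of a batchlet, so that it compiles without a batch runtime. The class name `CleanupTask` and the "unit of work" counter are assumptions made for illustration only.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in with the same shape as a batchlet (process/stop),
// declared locally so the sketch compiles without a batch runtime.
interface StoppableTask {
    String process() throws Exception;
    void stop() throws Exception;
}

// A long-running task that checks a stop flag between units of work.
class CleanupTask implements StoppableTask {
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    private final int totalUnits;
    private int unitsDone = 0;

    CleanupTask(int totalUnits) {
        this.totalUnits = totalUnits;
    }

    @Override
    public String process() {
        while (unitsDone < totalUnits) {
            if (stopRequested.get()) {
                return "STOPPED"; // give up promptly once a stop was requested
            }
            unitsDone++; // placeholder for one unit of real work
        }
        return "COMPLETED";
    }

    @Override
    public void stop() {
        // Usually called from a different thread than process().
        stopRequested.set(true);
    }

    int unitsDone() {
        return unitsDone;
    }
}
```

The point of the design is that `stop()` only records the request; `process()` is responsible for noticing it and returning.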
![Page 15: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/15.jpg)
Batching & Resources
![Page 16: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/16.jpg)
Concurrent Resources
![Page 17: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/17.jpg)
IDs and Names
instanceId
• ID that represents an instance of a job.
• Created when the JobOperator start method is invoked.
executionId
• ID that represents the next attempt to run a particular job instance.
• Created when a job is started/restarted.
• Only one execution of a job instance can be running at a time.
stepExecutionId
• ID for an attempt to execute a particular step in a job.
jobName
• Name of the job from the XML (actually the id): <job id="">
jobXMLName
• Name of the config file in META-INF/batch-jobs
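To make the relationship between instanceId and executionId concrete, here is a small in-memory model. It is a hypothetical sketch, not the job repository of any real implementation, and it ignores the status rules that decide whether a restart is allowed. The class `JobIdModel` and its methods are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of how instance and execution IDs relate.
class JobIdModel {
    private long nextInstanceId = 1;
    private long nextExecutionId = 1;
    // executionId -> instanceId
    private final Map<Long, Long> instanceByExecution = new LinkedHashMap<>();

    // Starting a job creates a new instance and its first execution.
    long start() {
        return newExecution(nextInstanceId++);
    }

    // Restarting creates a new execution for the same instance.
    long restart(long executionId) {
        return newExecution(instanceIdOf(executionId));
    }

    long instanceIdOf(long executionId) {
        Long instanceId = instanceByExecution.get(executionId);
        if (instanceId == null) {
            throw new IllegalArgumentException("unknown executionId " + executionId);
        }
        return instanceId;
    }

    // All execution IDs recorded for one instance, oldest first.
    List<Long> executionsOf(long instanceId) {
        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, Long> e : instanceByExecution.entrySet()) {
            if (e.getValue() == instanceId) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    private long newExecution(long instanceId) {
        long executionId = nextExecutionId++;
        instanceByExecution.put(executionId, instanceId);
        return executionId;
    }
}
```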
![Page 18: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/18.jpg)
JobInstance vs. JobExecution
JobInstance (1) to JobExecution (*)
JobInstance:
• instanceId
• jobName
JobExecution:
• batchStatus
• createTime
• endTime
• executionId
• exitStatus
• jobName
• jobParameters
• lastUpdatedTime
• startTime
![Page 19: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/19.jpg)
Managing Jobs
• JobOperator – interface for operating on batch jobs.
• BatchRuntime.getJobOperator()
• JobOperator:
  • Provides information on current and completed jobs
  • Used to start/stop/restart/abandon jobs
  • Security is implementation dependent
  • JobOperator interacts with JobRepository
• JobRepository
  • Implementation is outside the scope of the JSR
  • No API for deleting old jobs
• Reference implementation provides no API for cleanup!
![Page 20: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/20.jpg)
JobOperator Methods
Type Method
void abandon(long executionId)
JobExecution getJobExecution(long executionId)
List<JobExecution> getJobExecutions(JobInstance instance)
JobInstance getJobInstance(long executionId)
int getJobInstanceCount(String jobName)
List<JobInstance> getJobInstances(String jobName, int start, int count)
Set<String> getJobNames()
Properties getParameters(long executionId)
List<Long> getRunningExecutions(String jobName)
List<StepExecution> getStepExecutions(long jobExecutionId)
long restart(long executionId, Properties restartParams)
long start(String jobXMLName, Properties jobParams)
void stop(long executionId)
![Page 21: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/21.jpg)
Listing Batch Jobs
![Page 22: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/22.jpg)
Chunking
• Chunking is the primary pattern for batch processing in JSR-352.
• Encapsulates the ETL pattern:
  • Pieces: Reader/Processor/Writer
  • Reader/Processor are invoked until an entire chunk of data is processed.
  • Output is written atomically
• Implementation:
  • Interfaces: ItemReader/ItemWriter/ItemProcessor
  • Classes: AbstractItemReader/AbstractItemWriter
[Diagram: Reader → Processor → Writer]
![Page 23: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/23.jpg)
Chunking
![Page 24: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/24.jpg)
Chunk Configuration
Parameter: Description
• checkpoint-policy: Possible values: item or custom.
• item-count: Number of items to be processed per chunk. Default is 10.
• time-limit: Time in seconds before taking a checkpoint. Default is 0 (no time limit; checkpoints follow item-count).
• skip-limit: Number of exceptions a step will skip if there are configured skippable exceptions.
• retry-limit: Number of times a step will be retried if it has thrown a configured retryable exception.
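The skip-limit idea can be illustrated with a small sketch in plain Java. This is not how a batch runtime is implemented; it assumes the limit is the number of tolerated skips, so the step fails on the first skippable exception after the limit is used up. `NumberFormatException` plays the role of the configured skippable exception, and `SkipLimitDemo` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of skip-limit handling; not taken from any batch runtime.
class SkipLimitDemo {
    // Parses each input. NumberFormatException is treated as the configured
    // skippable exception. Fails once more than skipLimit items would be skipped.
    static List<Integer> parseAll(List<String> inputs, int skipLimit) {
        List<Integer> parsed = new ArrayList<>();
        int skipped = 0;
        for (String input : inputs) {
            try {
                parsed.add(Integer.parseInt(input));
            } catch (NumberFormatException e) {
                if (skipped >= skipLimit) {
                    throw new IllegalStateException(
                            "skip limit of " + skipLimit + " exceeded", e);
                }
                skipped++; // bad item is dropped and processing continues
            }
        }
        return parsed;
    }
}
```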
![Page 25: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/25.jpg)
Skippable Exceptions
![Page 26: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/26.jpg)
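Chunking
[Sequence diagram: participants Step, ItemReader, ItemProcessor, ItemWriter. After execute(), the Step calls read() on the ItemReader and receives an item, then calls process(item) on the ItemProcessor and receives an item; this repeats for each item in the chunk. The Step then calls write(items) on the ItemWriter. The result is an ExitStatus.]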
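The read/process/write sequence can be sketched as a loop in plain Java. This is a simplified illustration, not the container's algorithm: it has no transactions, checkpoints, or filtering, and it uses `Iterator` and `Function` as stand-ins for the reader and processor. `ChunkLoop` and its parameters are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified chunk loop; the names do not come from the batch API.
class ChunkLoop {
    // Reads and processes items one at a time and hands them to the "writer"
    // in lists of at most itemCount. Returns the chunks in write order.
    static <I, O> List<List<O>> run(Iterator<I> reader,
                                    Function<I, O> processor,
                                    int itemCount) {
        if (itemCount < 1) {
            throw new IllegalArgumentException("itemCount must be >= 1");
        }
        List<List<O>> written = new ArrayList<>();
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next())); // read, then process
            if (chunk.size() == itemCount) {
                written.add(chunk);                    // stands in for write(items)
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            written.add(chunk);                        // final, partial chunk
        }
        return written;
    }
}
```

With seven input items and an item count of 3, this sketch writes two full chunks followed by one chunk holding the last item.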
![Page 27: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/27.jpg)
Chunking: ItemReader
![Page 28: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/28.jpg)
Chunking: ItemProcessor
![Page 29: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/29.jpg)
Chunking: ItemWriter
![Page 30: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/30.jpg)
Demo
![Page 31: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/31.jpg)
Runtime Parameters
Set Property
Retrieve Property
![Page 32: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/32.jpg)
Pre-Defined Properties
Set Property
Property Injected
![Page 33: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/33.jpg)
Step Exceptions
• Parallel running instances (partitions) complete before the job completes.
• Batch status transitions to FAILED
![Page 34: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/34.jpg)
Job Listener Configuration
Listener Config
![Page 35: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/35.jpg)
Job Listener Implementation
![Page 36: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/36.jpg)
Step Listener Configuration
Listener Config
![Page 37: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/37.jpg)
Step Listener Implementation
![Page 38: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/38.jpg)
Partition Configuration
![Page 39: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/39.jpg)
Partition Implementation
![Page 40: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/40.jpg)
Decision Configuration
Decision
What next?
![Page 41: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/41.jpg)
Decision Implementation
Dependency Injection!
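The decision code from this slide is not in the transcript. The sketch below only illustrates the idea of "what next?": a decider returns an exit status string, and transition rules map that status to the next step. All names here (`DecisionDemo`, the status strings, the step names) are hypothetical and are not taken from the talk or from the batch API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a decision: the decider returns an exit status and
// transition rules map that status to the next step.
class DecisionDemo {
    // Stand-in for a decider: picks a status from the number of new records.
    static String decide(int newRecords) {
        return newRecords > 0 ? "PROCESS_NEW" : "NOTHING_TO_DO";
    }

    // Stand-in for the transition rules declared in the job configuration.
    static String nextStep(String exitStatus, Map<String, String> transitions) {
        String target = transitions.get(exitStatus);
        if (target == null) {
            throw new IllegalStateException("no transition for " + exitStatus);
        }
        return target;
    }

    // Example rule set used only for this illustration.
    static Map<String, String> exampleTransitions() {
        Map<String, String> transitions = new HashMap<>();
        transitions.put("PROCESS_NEW", "loadStep");
        transitions.put("NOTHING_TO_DO", "reportStep");
        return transitions;
    }
}
```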
![Page 42: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/42.jpg)
Flow & Splits JSL
• <flow> element is used to implement process workflows.
• <split> element is used to run flows in parallel.
[Diagram labels: retrieveTracking, processDecider, Split (processNewStorms, updateExistingStorms), stormReader, stormProcessor, stormWriter]
![Page 43: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/43.jpg)
Flows & Splits
![Page 44: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/44.jpg)
Checkpoint Algorithm Configuration
![Page 45: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/45.jpg)
Checkpoint Algorithm Implementation
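The implementation shown on this slide is not in the transcript. As a rough illustration of a custom checkpoint policy, the sketch below uses a hypothetical local interface, `CheckpointPolicy`, modeled on the idea of the runtime asking "ready to checkpoint?" after each item. It is a simulation under that assumption and does not use the real batch API.

```java
// Hypothetical stand-in for a custom checkpoint algorithm callback;
// declared locally so no batch runtime is needed.
interface CheckpointPolicy {
    boolean isReadyToCheckpoint();
    void endCheckpoint();
}

// Requests a checkpoint after every `interval` items.
class EveryNItems implements CheckpointPolicy {
    private final int interval;
    private int sinceLast = 0;

    EveryNItems(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.interval = interval;
    }

    @Override
    public boolean isReadyToCheckpoint() {
        sinceLast++; // called once per processed item in this simulation
        return sinceLast >= interval;
    }

    @Override
    public void endCheckpoint() {
        sinceLast = 0;
    }
}

class CheckpointDemo {
    // Simulates a runtime that processes items and consults the policy after each one.
    static int countCheckpoints(int items, CheckpointPolicy policy) {
        int checkpoints = 0;
        for (int i = 0; i < items; i++) {
            if (policy.isReadyToCheckpoint()) {
                checkpoints++;
                policy.endCheckpoint();
            }
        }
        return checkpoints;
    }
}
```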
![Page 46: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/46.jpg)
Hadoop Overview
• Massively scalable storage and batch data processing system
• Written in Java
• Huge ecosystem
• Meant for massive data processing jobs
• Horizontally scalable
• Uses MapReduce programming model
• Handles processing of petabytes of data
• Started at Yahoo! in 2005.
![Page 47: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/47.jpg)
Hadoop
MapReduce (Distributed Computation)
HDFS (Distributed Storage)
YARN Framework
Common Utilities
![Page 48: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/48.jpg)
Hadoop
Typically Hadoop is used when:
• Analysis is performed on unstructured datasets
• Data is stored across multiple servers (HDFS)
• Non-Java processes are fed data and managed
Ex. https://svi.nl/HuygensSoftware
![Page 49: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/49.jpg)
Spring vs. Java EE Batching
• Spring Batch 3.0 implements JSR-352!
• Batch artifacts developed against JSR-352 won’t work within a traditional Spring Batch Job
• Same two processing models as Spring Batch:
  • Item – aka chunking
  • Task – aka Batchlet
![Page 50: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/50.jpg)
Terminology Comparison
JSR-352 Spring Batch
Job Job
Step Step
Chunk Chunk
Item Item
ItemReader ItemReader/ItemStream
ItemProcessor ItemProcessor
ItemWriter ItemWriter/ItemStream
JobInstance JobInstance
JobExecution JobExecution
StepExecution StepExecution
JobListener JobExecutionListener
StepListener StepExecutionListener
![Page 51: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/51.jpg)
Scaling Batch Jobs
• Traditional Spring Batch Scaling:
  • Split – running multiple steps in parallel
  • Multiple threads – executing a single step via multiple threads
  • Partitioning – dividing data up for parallel processing
  • Remote Chunking – executing the processor logic remotely
• JSR-352 Job Scaling:
  • Split – running multiple steps in parallel
  • Partitioning – dividing data up – implementation slightly different.
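Partitioning, in the sense of dividing data up for parallel processing, can be sketched with plain Java threads. This is an illustration of the idea only: a batch runtime would create and manage the partitions itself, and the summing task, the class `PartitionDemo`, and its parameters are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of partitioning with plain threads.
class PartitionDemo {
    // Splits [0, total) into `partitions` contiguous ranges, sums each range
    // on a worker thread, and combines the partial sums.
    static long sumInPartitions(int total, int partitions) {
        if (partitions < 1) {
            throw new IllegalArgumentException("partitions must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            int size = (total + partitions - 1) / partitions; // ceiling division
            for (int p = 0; p < partitions; p++) {
                final int from = Math.min(total, p * size);
                final int to = Math.min(total, from + size);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += i;
                    }
                    return sum;
                };
                partials.add(pool.submit(task));
            }
            long result = 0;
            for (Future<Long> partial : partials) {
                result += partial.get(); // wait for every partition to finish
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```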
![Page 52: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/52.jpg)
JSR-352/Spring/Hadoop
Hadoop
• Massively parallel / large jobs
• Processing petabytes of data (BIG DATA)
JSR-352/Spring
• Traditional batch processing jobs
• Structured data/business processes
JSR-352 vs. Spring
• Java EE versus Spring containers
• Spring has better job scaling capabilities
![Page 53: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/53.jpg)
JSR-352 Implementations
• JBeret (WildFly/JBoss)
  • http://tinyurl.com/z4qx3wo
• jbatch (reference; WebSphere/WebLogic/Payara)
  • http://tinyurl.com/jk6vcb8
• SpringBatch
  • http://tinyurl.com/mt8v3k7
![Page 54: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/54.jpg)
Best Practices
• Package/deploy batch jobs separately
• Implement logic to clean up old jobs
• Implement logic for auto-restart
• Test restart and checkpoint logic
• Configure database to store jobs
• Configure thread pool for batch jobs
• Only invoke batch jobs from logic that is secured (@Role etc.)
![Page 55: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/55.jpg)
Resources
• JSR-352: https://jcp.org/en/jsr/detail?id=352
• Java EE Support: http://javaee.support/contributors/
• Spring Batch: http://docs.spring.io/spring-batch/reference/html/spring-batch-intro.html
• Spring JSR-352 Support: http://docs.spring.io/spring-batch/reference/html/jsr-352.html
![Page 56: Batching and Java EE (jdk.io)](https://reader036.fdocuments.net/reader036/viewer/2022062503/587fe5511a28ab46228b5461/html5/thumbnails/56.jpg)
Resources
• Java EE 7 Batch Processing and World of Warcraft: http://tinyurl.com/gp8yls8
• Three Key Concepts for Understanding JSR-352: http://tinyurl.com/oxe2dhu
• Java EE Tutorial: https://docs.oracle.com/javaee/7/tutorial/batch-processing.htm