Oozie meetup - HA + Cron Scheduling
-
Upload
mona-chitnis -
Category
Engineering
-
view
107 -
download
0
Transcript of Oozie meetup - HA + Cron Scheduling
Oozie MeetupBowen Zhang
Agenda● Oozie Log HA● Oozie cron scheduling
○ Use cases○ Troubleshooting
● Prospective 4.1 release
Oozie Log HA● HA implemented already
○ Server○ HCat○ SLA○ Sharelib
● Remaining piece○ log streaming: if a node is down, all oozie.log
content on that node is not accessible
Proposed Solution● YARN faced the same issue when it
comes to log streaming and retrieval● Currently, YARN puts container log to
HDFS when log aggregation is enabled● Oozie can duplicate logs onto HDFS
○ Log directly to HDFS○ Copy the log onto HDFS during log rotation
Direct logging to HDFS● Pros
○ complete log HA with 100% accuracy of the content
● Cons○ This could be hard to implement and may
need significant oozie logging mechanism changes
○ This introduces strict dependency on HDFS○ Potential server performance issues.
Copy log rotation● Pros
○ Easy to implement without significant changes to oozie logging structure
○ Less performance issue● Cons
○ Always has less than one hour window of log unavailability due to rotation schedule
Other ideas?Other ideas are always welcome.Eg. Putting it into DB?Eg. Integrate Zookeeper to solve this problem?
Coordinator Cron SchedulingVarious Cron syntaxes exist and unix cron syntax is only one of them.● Oozie cron has 5 fields since oozie
operates on per minute base.● Weekday starts at 2 which is Monday● Complicated Overflowing ranges are
discouraged to use
Use cases● A job running at 9am every weekday
○ frequency="0 9 * * 2-6" or "0 9 * * MON-FRI"○ Notice in the first expression, we use 2-6
instead of 1-5● A job running every 15 minutes from 9-
11am every day○ frequency="0/15 9,10 * * *" or "0,15,30,45
9,10 * * *"○ Notice hour field should be 9,10 instead of 9-
11
Use cases continued● A job running at 9am of every last
Friday of the month○ frequency = "0 9 * * 6L"
● A job running at 9am of every 2nd Friday of the month○ frequency = "0 9 * * 6#2"
General mistakes● Oozie timezone by default is UTC. So
your cron syntax should calculate this timezone differences○ If you live in LA and want to run a job at 9am
every day:■ frequency = "0 9 * * *" is wrong!!!■ frequency = "0 16 * * *" is the right one
○ We understand the inconvenience, but that’s the oozie server timezone.
General mistakes continued● Day of month and day of week are
union, not intersection○ "0 10 12-18 * 2-6" DOES NOT MEAN running a
job at 10 am on weekdays between 12 and 18th of the month
○ It MEANS running a job at 10 am on weekdays AND on 12-18th of the month. Only one of the two needs to be satisfied for the job to fire.
General Mistakes Continued● An overflow range goes wild
○ "0 23-1 * * *" is reasonable○ "0 23-1 * DEC-MAR FRI-MON". What does this
mean?■ Cron can produce many different edge
cases in the above scenario. Do it at your own peril!
4.1 ReleaseWe have around 200 patches for this release and it’s been a while!● All HA related work● Cron scheduling for coordinator job● Consolidation of JPAExecutors (not user
facing”● Introduction of SharelibService (some
user facing backward imcompatibilty issue)
4.1 Release continued● Better integration with YARN RM HA
and restart● Oozie Sqoop CLI functionality● Many more versatility of Coordinator
job functionalities● Major overhaul of coordinator job
execution order