© 2010 Tieto Corporation
Internals of Concurrent Managers
UKOUG Conference Series Technology & E-Business Suite 2010
Māris Elsiņš
Senior Oracle Applications DBA
Tieto Latvia, [email protected]
Who am I?
• 8 years in IT
  • 3 years – PL/SQL developer
  • 5 years – Oracle [Apps] DBA (started with 11.5.7 and 8.1.7)
• Certificates
  • 10g OCM
  • 9i / 10g / 11g OCP
  • 11i Applications Database Administrator OCP
  • 11i System Administrator OCE
• Conferences
  • UKOUG 2007/2008/2010
  • LVOUG 2009/2010
  • EMEA Harmony 2010
• Current employer – Tieto Latvia
  • All kinds of Oracle DBA tasks: patching, upgrades, performance tuning, troubleshooting, planning and implementation of backup and recovery procedures, cross-platform migration, etc.
  • Planning of system architecture, design and implementation of HA solutions (RAC, Data Guard, custom cold failover)
  • Implementation of system-specific monitoring, automation of routine tasks
  • Technical project planning and coordination, management of a team of DBAs
2010-11-29
Purpose of this session
• Provide background knowledge for successful troubleshooting of concurrent request problems
• What is this presentation about?
  • Life cycle of a concurrent request
  • Implementation details of the most interesting phases of the life cycle of a concurrent request
  • Internals of Concurrent Managers and Conflict Resolution Managers
  • DB objects that can be used for troubleshooting
• Why is it important?
  • A DBA has to be quick when solving real issues
  • Knowing the process before problems happen shortens the time needed to solve them
  • Querying DB objects is faster than navigating through Forms
  • Knowing the process is important when tuning the setup of Concurrent Managers
Most interesting DB objects
• Objects
  • FND_CONCURRENT_REQUESTS (FCR)
  • FND_CONCURRENT_PROGRAMS (FCP)
  • FND_CONCURRENT_QUEUES (FCQ)
  • FND_CONCURRENT_PROCESSES (FCPROC)
  • FND_CONCURRENT_PROGRAM_SERIAL (FCPS)
  • FND_CONC_RELEASE_CLASSES (FCRC)
  • FND_LOOKUPS (FL)
  • FND_CONCURRENT_WORKER_REQUESTS (FCWR)
  • FND_CRM_HISTORY (FCH)
  • FND_CONC_WAITING_REQUESTS
• Some examples of how these tables are used are shown in this presentation
Types of concurrent managers
• Types of concurrent managers
  • Internal Concurrent Manager
  • Service Manager
  • Concurrent Manager
  • Conflict Resolution Manager
  • Internal Monitor
  • Transaction Manager
  • Scheduler/Prereleaser Manager
  • …
• Today the focus is on
  • Concurrent Manager
  • Conflict Resolution Manager
Life cycle of a concurrent request
INACTIVE (I)
• Disabled (U)
• On Hold (H)
• No Manager (M)
PENDING (P)
• Normal (I)
• Standby (Q)
• Scheduled (P)
• Waiting (Z)
RUNNING (R)
• Normal (R)
• Paused (W)
• Resuming (B)
• Terminating (T)
COMPLETED (C)
• Normal (C)
• Error (E)
• Warning (G)
• Cancelled (D)
• Terminated (X)
FND_LOOKUPS contains the values of Phase/Status Codes
«Oracle® E-Business Suite System Administrator's Guide – Maintenance» describes the meaning of each phase and status
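The code letters above can be kept in a small lookup for decoding PHASE_CODE and STATUS_CODE values in ad hoc scripts. The Python sketch below is only an illustration: it contains just the codes listed on this slide (FND_LOOKUPS remains the authoritative source), and the names PHASES, STATUSES and describe are hypothetical helpers, not part of E-Business Suite.

```python
# Phase and status codes exactly as listed on the slide.
# FND_LOOKUPS holds the complete, authoritative set.
PHASES = {"I": "Inactive", "P": "Pending", "R": "Running", "C": "Completed"}

STATUSES = {
    # Inactive
    "U": "Disabled", "H": "On Hold", "M": "No Manager",
    # Pending
    "I": "Normal", "Q": "Standby", "P": "Scheduled", "Z": "Waiting",
    # Running
    "R": "Normal", "W": "Paused", "B": "Resuming", "T": "Terminating",
    # Completed
    "C": "Normal", "E": "Error", "G": "Warning", "D": "Cancelled", "X": "Terminated",
}


def describe(phase_code, status_code):
    """Return a 'Phase / Status' label for a pair of stored codes."""
    phase = PHASES.get(phase_code, f"Unknown phase {phase_code!r}")
    status = STATUSES.get(status_code, f"Unknown status {status_code!r}")
    return f"{phase} / {status}"
```

For example, describe("P", "Q") yields the label for a request waiting on the Conflict Resolution Manager. Codes missing from the slide are reported as unknown instead of being guessed.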
Typical life cycle of most requests
Pending / Scheduled
• Request is scheduled for execution in the future
Pending / Standby
• Time to execute the request has arrived
• Request waits to be evaluated by the Conflict Resolution Manager
Pending / Normal
• Request is allowed to be executed
• Request is waiting to be picked up by a Concurrent Manager
Running / Normal
• Request is being executed by a concurrent manager
Completed / Normal
Phase/Status representation in tables
• FND_CONCURRENT_REQUESTS
  • PHASE_CODE
  • STATUS_CODE
• Is it that simple? No!
• Phase = Inactive is practically not used
  • Inactive/Disabled: FCR.PHASE_CODE='P', FCP.ENABLED_FLAG='N'
  • Inactive/On Hold: FCR.PHASE_CODE='P', FCR.HOLD_FLAG='Y'
  • Inactive/No Manager: FCR.PHASE_CODE='P' and no matching rows in the FND_CONCURRENT_WORKER_REQUESTS view
• Pending/Scheduled is actually
  • STATUS_CODE='Q' and FCR.REQUESTED_START_DATE > SYSDATE
• Why so complicated?
  • No need to update statuses too often
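The rules on this slide can be read as a decision procedure that turns the stored columns into the phase and status shown to the user. The Python sketch below is one possible reading, not the logic Oracle Forms actually runs. The function name, the has_manager parameter (standing in for "a matching row exists in FND_CONCURRENT_WORKER_REQUESTS") and the order in which the checks are applied are all assumptions.

```python
from datetime import datetime


def displayed_phase_status(phase_code, status_code, hold_flag, enabled_flag,
                           requested_start_date, now, has_manager=True):
    """Derive the user-visible (phase, status) of a request stored as Pending.

    Returns None for requests whose stored PHASE_CODE is not 'P'.
    The precedence of the checks is an assumption made for this sketch.
    """
    if phase_code != "P":
        return None
    if enabled_flag == "N":          # FCP.ENABLED_FLAG
        return ("Inactive", "Disabled")
    if hold_flag == "Y":             # FCR.HOLD_FLAG
        return ("Inactive", "On Hold")
    if not has_manager:              # no row in FND_CONCURRENT_WORKER_REQUESTS
        return ("Inactive", "No Manager")
    if status_code == "Q" and requested_start_date > now:
        return ("Pending", "Scheduled")
    if status_code == "Q":
        return ("Pending", "Standby")
    if status_code == "I":
        return ("Pending", "Normal")
    return ("Pending", status_code)  # any other stored code is passed through
```

The sketch makes the slide's point visible: the stored codes stay the same while the displayed status changes with time or with program and manager setup, so no status updates are needed.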
Pending / Scheduled
Schedule types
• As Soon as Possible / Once: FCR.REQUESTED_START_DATE
• Periodically / On Specific Days: FND_CONC_RELEASE_CLASSES (FCRC)
  • Join with FCR on column RELEASE_CLASS_ID
  • FCRC.DATE1 – «Start at» field in the form
  • FCRC.DATE2 – «End at» field in the form
  • FCRC.CLASS_TYPE – «P» for «Periodically», «S» for «On Specific Days»
  • FCRC.CLASS_INFO – actual schedule data
• FCR.REQUESTED_START_DATE always contains the time of the next execution; only this field is used to determine when the request should be run by the concurrent manager
Periodic schedules
FCRC.CLASS_INFO contents
• Values like «X:Y:Z»
  • X – number of months/weeks/days/hours/minutes by which the request is rescheduled from the prior run
  • Y – time unit: «M» – months, «D» – days, «H» – hours, «N» – minutes
  • Z – rescheduling type: «S» – from the start of the prior run, «C» – from the completion of the prior run
• Samples
  • 30:N:S – Repeat every 30 minutes from the start of the prior run
  • 5:N:C – Repeat every 5 minutes from the completion of the prior run
  • 12:H:S – Repeat every 12 hours from the start of the prior run
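The «X:Y:Z» format is simple enough to decode in a few lines when reporting on schedules. The following Python sketch assumes CLASS_INFO has already been fetched as a string. The function names are hypothetical, and only the unit and rescheduling codes listed on this slide are accepted.

```python
# Unit and rescheduling codes as listed on the slide.
UNIT_NAMES = {"M": "months", "D": "days", "H": "hours", "N": "minutes"}
ANCHOR_NAMES = {"S": "start", "C": "completion"}


def parse_periodic(class_info):
    """Split an 'X:Y:Z' periodic schedule into (interval, unit, anchor)."""
    parts = class_info.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected X:Y:Z, got {class_info!r}")
    interval, unit, anchor = parts
    if unit not in UNIT_NAMES or anchor not in ANCHOR_NAMES:
        raise ValueError(f"unknown unit or rescheduling type in {class_info!r}")
    return int(interval), unit, anchor


def describe_periodic(class_info):
    """Render a periodic schedule in the wording used on the slide."""
    interval, unit, anchor = parse_periodic(class_info)
    return (f"Repeat every {interval} {UNIT_NAMES[unit]} "
            f"from the {ANCHOR_NAMES[anchor]} of the prior run")
```

Unknown unit or rescheduling codes raise ValueError instead of being interpreted, because the slide does not document any other values.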
«On Specific Days» schedules
FCRC.CLASS_INFO contents
• Values like «XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXYZZZZZZZ»
  • Each X, Y, Z position holds 0 or 1, showing whether the option is selected
  • X – 31 positions, one for each date 1 – 31
  • Y – last day of month
  • Z – 7 positions, one for each day of week, Su – Sa
• Samples
  • 000000000000000000000000000000000000001 – Days of week: Sa
  • 111111111000000000000000000000000111110 – Dates: 1 2 3 4 5 6 7 8 9. Days of week: Mo Tu We Th Fr
  • 000000000000000000000000000000010000000 – Last day of month
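The 39-character flag string (31 date flags, one last-day-of-month flag, seven weekday flags) can be decoded mechanically. The Python sketch below follows the layout described on this slide; the function name and the shape of the returned dictionary are hypothetical.

```python
DAY_NAMES = ["Su", "Mo", "Tu", "We", "Th", "Fr", "Sa"]


def decode_specific_days(class_info):
    """Decode an «On Specific Days» CLASS_INFO string of 39 '0'/'1' flags."""
    if len(class_info) != 39 or set(class_info) - {"0", "1"}:
        raise ValueError("expected 39 characters, each 0 or 1")
    # Positions 1-31: dates of the month.
    dates = [i + 1 for i, flag in enumerate(class_info[:31]) if flag == "1"]
    # Position 32: last day of month.
    last_day = class_info[31] == "1"
    # Positions 33-39: days of week, Sunday first.
    weekdays = [DAY_NAMES[i] for i, flag in enumerate(class_info[32:]) if flag == "1"]
    return {"dates": dates, "last_day_of_month": last_day, "weekdays": weekdays}
```

Applied to the second sample above, the result lists dates 1 to 9 and the weekdays Mo to Fr, which matches the slide's own reading of that value.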
Reporting the schedules
• It is easy to schedule requests, but hard to keep track of them
• Check whether there are obsolete schedules that can be removed or tuned
• Several repeating schedules of the same concurrent program can sometimes be consolidated into one
reporting_concurrent_request_schedules.sql
Pending / Standby
Conflict Resolution Manager (CRM)
• The CRM resolves conflicts in the execution of requests, as enforced by incompatibility rules
• Simplified CRM workflow
  • Startup – load all incompatibility rules
    1. Loaded from FND_CONCURRENT_PROGRAM_SERIAL
    2. Stored in memory by FNDCRM (the CRM binary executable)
  • Each iteration
    1. If FCQ.CONTROL_CODE='V', reload incompatibility rules (?)
    2. Lock STATUS_CODE for all Pending/Normal and Pending/Scheduled requests
    3. Check each Pending request against the incompatibility rules and verify that there are running concurrent managers that can process it (FND_CONCURRENT_WORKER_REQUESTS)
       1. «Release» Pending/Standby requests that don't break any rules
       2. «Return» Pending/Normal requests that break some rules (?)
    4. Record statistics in FND_CRM_HISTORY
    5. Sleep for «Sleep Seconds»
Interesting DB objects
• FND_CONCURRENT_WORKER_REQUESTS
  • View returning mappings between requests and the concurrent managers able to execute them
  • The definition of the view contains hardcoded IDs according to the Specialization Rules of the Concurrent Managers
  • Rebuilt by the «Build Concurrent Request Queue View» request when Specialization Rules change
• FND_CRM_HISTORY
  • Good information for tuning and troubleshooting
  • Records statistics for each CRM run; the statistics include:
    • REQUESTS_EXAMINED
    • REQUESTS_STANDBY
    • REQUESTS_RELEASED
    • REQUESTS_RETURNED
  • Can be joined with FCR to find out how many requests were released in each CRM iteration: FCR.CRM_RELEASE_DATE between FCH.WORK_START AND FCH.WORK_END
  • Purged by «Purge Concurrent Request and/or Manager Data», leaving 1 day of history (R12.1.3)
Graphing FND_CRM_HISTORY
Why is my request still pending?
• Check the status!
  • Pending/Scheduled – the time for execution has not come yet
  • Pending/Normal – waiting to be picked up by a concurrent manager
    • The request has just been released
    • All concurrent manager processes are busy executing requests
    • Long request execution queue
  • Pending/Standby – waiting to be released by the CRM
SELECT WREQID, PHASE, STATUS, WHY
FROM FND_CONC_WAITING_REQUESTS
WHERE REQID = &REQUEST_ID -- e.g. 5820658
Pending / Normal
Workflow of «Concurrent Managers»
Building SQL for querying the requests queue
Select R.Rowid
From Fnd_Concurrent_Requests R
Where R.Hold_Flag = 'N'
And R.Status_Code = 'I'
And R.Requested_Start_Date <= Sysdate
And (R.Node_Name1 is null or
(R.Node_Name1 is not null and
FND_DCP.target_node_mgr_chk(R.request_id) = 1))
AND EXISTS
(Select Null
From Fnd_Concurrent_Programs P
Where P.Enabled_Flag = 'Y'
And R.Program_Application_Id = P.Application_Id
And R.Concurrent_Program_Id = P.Concurrent_Program_Id
AND EXISTS
(Select Null
From Fnd_Oracle_Userid O
Where R.Oracle_Id = O.Oracle_Id
AND EXISTS (Select Null
From Fnd_Conflicts_Domain C
Where P.Run_Alone_Flag = C.RunAlone_Flag
And R.CD_Id = C.CD_Id))
…
…
And (P.Execution_Method_Code != 'S' OR
(R.PROGRAM_APPLICATION_ID, R.CONCURRENT_PROGRAM_ID) IN
((0, 98), (0, 100), (0, 31721), (0, 31722), (0, 31757)))
AND ((R.PROGRAM_APPLICATION_ID, R.CONCURRENT_PROGRAM_ID) NOT IN
((510, 40032),
(510, 40033),
(510, 42156),
(510, 42157),
(530, 43793),
(530, 43794),
(535, 42626),
(535, 42627),
(535, 42628)) AND
((R.REQUEST_CLASS_APPLICATION_ID IS NULL AND
R.CONCURRENT_REQUEST_CLASS_ID IS NULL) OR
(R.REQUEST_CLASS_APPLICATION_ID,
R.CONCURRENT_REQUEST_CLASS_ID) NOT IN ((0, 2)))))
ORDER BY NVL(R.priority, 999999999), R.Priority_Request_ID, R.Request_ID
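Notes on the query, top to bottom:
• Status_Code = 'I' – query only «Pending/Normal» requests
• Node_Name1 and FND_DCP.target_node_mgr_chk – Distributed Concurrent Processing implementation
• Fnd_Conflicts_Domain – «Run Alone» flag implementation with conflict domains
• Execution_Method_Code != 'S' – implementation of «Immediate» type executables (subroutines)
• (PROGRAM_APPLICATION_ID, CONCURRENT_PROGRAM_ID) NOT IN (…) – exclusion specialization rules for concurrent programs
• (REQUEST_CLASS_APPLICATION_ID, CONCURRENT_REQUEST_CLASS_ID) NOT IN (…) – exclusion specialization rule for request type
• ORDER BY – execution order for concurrent requests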
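The ORDER BY clause of the queue query determines which pending request a manager picks first. The Python sketch below mimics that ordering on in-memory rows to make the effect of a NULL priority visible. The sample rows are invented for illustration, and the sketch assumes PRIORITY_REQUEST_ID is always populated.

```python
# Same constant the query uses in NVL(R.priority, 999999999).
DEFAULT_PRIORITY = 999999999


def queue_order_key(request):
    """Sort key mirroring ORDER BY NVL(priority, 999999999), priority_request_id, request_id."""
    priority = request["priority"]
    if priority is None:
        priority = DEFAULT_PRIORITY
    return (priority, request["priority_request_id"], request["request_id"])


# Invented sample rows, not real request data.
pending = [
    {"request_id": 103, "priority_request_id": 103, "priority": 50},
    {"request_id": 101, "priority_request_id": 101, "priority": None},
    {"request_id": 102, "priority_request_id": 102, "priority": 10},
    {"request_id": 104, "priority_request_id": 100, "priority": 50},
]

ordered = [r["request_id"] for r in sorted(pending, key=queue_order_key)]
```

Requests with a lower priority number come first, ties are broken by PRIORITY_REQUEST_ID and then REQUEST_ID, and a request without a priority sorts after every request that has one.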
Building SQL for querying the requests queue
• The query is built at the startup of the concurrent manager
• It hardcodes all specialization rules for the manager
• Any change to the specialization rules forces a restart of the concurrent manager processes (and also runs the «Build Concurrent Request Queue View» concurrent program)
  • Be careful! It does restart automatically! What happens if there are long-running requests?
  • Use «request types»!
Locking the STATUS_CODE
SELECT ...
FROM fnd_concurrent_requests R,
fnd_concurrent_programs P,
fnd_application A,
fnd_user U,
fnd_oracle_userid O,
fnd_conflicts_domain C,
fnd_concurrent_queues Q,
fnd_application A2,
fnd_executables E,
fnd_conc_request_arguments X
WHERE R.Status_code = 'I'
And ((R.OPS_INSTANCE is null) or (R.OPS_INSTANCE = -1) or
(R.OPS_INSTANCE =
decode(:dcp_on, 1, FND_CONC_GLOBAL.OPS_INST_NUM, R.OPS_INSTANCE)))
And R.Request_ID = X.Request_ID(+)
And R.Program_Application_Id = P.Application_Id(+)
And R.Concurrent_Program_Id = P.Concurrent_Program_Id(+)
And R.Program_Application_Id = A.Application_Id(+)
And P.Executable_Application_Id = E.Application_Id(+)
And P.Executable_Id = E.Executable_Id(+)
And P.Executable_Application_Id = A2.Application_Id(+)
And R.Requested_By = U.User_Id(+)
And R.Cd_Id = C.Cd_Id(+)
And R.Oracle_Id = O.Oracle_Id(+)
And Q.Application_Id = :q_applid
And Q.Concurrent_Queue_Id = :queue_id
And (P.Enabled_Flag is NULL OR P.Enabled_Flag = 'Y')
And R.Hold_Flag = 'N'
…
And R.Requested_Start_Date <= Sysdate
And (R.Enforce_Seriality_Flag = 'N' OR
(C.RunAlone_Flag = P.Run_Alone_Flag And
(P.Run_Alone_Flag = 'N' OR Not Exists
(Select Null
From Fnd_Concurrent_Requests Sr
Where Sr.Status_Code In ('R', 'T')
And Sr.Enforce_Seriality_Flag = 'Y'
And Sr.CD_id = C.CD_Id))))
And Q.Running_Processes <= Q.Max_Processes
And R.Rowid = :reqname
And ((P.Execution_Method_Code != 'S' OR
(R.PROGRAM_APPLICATION_ID, R.CONCURRENT_PROGRAM_ID) IN
((0, 98), (0, 100), (0, 31721), (0, 31722), (0, 31757))) AND
((R.PROGRAM_APPLICATION_ID, R.CONCURRENT_PROGRAM_ID) NOT IN
((510, 40032),
(510, 40033),
(510, 42156),
(510, 42157),
(530, 43793),
(530, 43794),
(535, 42626),
(535, 42627),
(535, 42628)) AND
((R.REQUEST_CLASS_APPLICATION_ID IS NULL AND
R.CONCURRENT_REQUEST_CLASS_ID IS NULL) OR
(R.REQUEST_CLASS_APPLICATION_ID, R.CONCURRENT_REQUEST_CLASS_ID) NOT IN
((0, 2)))))
FOR UPDATE OF R.status_code NoWait
Locking the STATUS_CODE
• The query that locks the STATUS_CODE reimplements the same validation criteria to make sure the situation has not changed
• All processes of a concurrent manager use the same query to fetch the «cache size» number of requests
• The more processes of the same manager are running, the higher the competition for requests («ORA-00054: resource busy and acquire with NOWAIT specified», or 0 rows updated by the query if the status has been changed already)
• The higher the competition, the faster each manager process runs out of cached request IDs
• The sooner the list of cached request IDs runs out, the more often FCR is queried
  • We want to query FCR as seldom as possible
• It is not hard to reach the point where FCR queries are among the top SQL statements in the DB
• This is even more important if you have RAC
• The key is to minimize the number of concurrent manager processes
• Cache size and sleep seconds have some effect
Cache size and Sleep seconds
• «Sleep seconds»
  • The sleep time of the Conflict Resolution Manager affects how soon a request is passed on for execution
  • Are spent only when no requests are in Pending/Normal status in FCR
  • Should be chosen based on the maximum time a request is allowed to spend in the queue
    • Max time 20s and 5 managers? == 100s sleep seconds?
• «Cache size»
  • Large cache sizes make changes of request priorities less effective (do you use different priorities?)
  • A small cache size is OK for a long-running requests queue
  • Larger cache sizes are OK for short-running requests queues that have few concurrent manager instances
  • A large cache size increases the number of failed attempts to lock the status code
Running / Normal
How to find processes and sessions?
select r.request_id req_id,
r.phase_code p,
r.status_code s,
(select node_name || ':'
from applsys.fnd_concurrent_processes cp
where concurrent_process_id = r.controlling_manager) ||
r.os_process_id cp_process,
gi.INSTANCE_NAME || ':' || ss.sid || ',' || ss.serial# inst_sid_serial#,
gi.HOST_NAME || ':' || pp.spid db_process,
ss.program,
ss.status,
ss.sql_id || ':' || ss.sql_child_number sql_id_chld,
ss.event,
ss.WAIT_TIME,
ss.STATE
from applsys.fnd_concurrent_requests r,
gv$session ss,
gv$process pp,
gv$instance gi,
applsys.fnd_concurrent_processes cp
where request_id = &request_id
and ss.audsid(+) = r.oracle_session_id
and pp.inst_id(+) = ss.inst_id
and pp.addr(+) = ss.paddr
and gi.inst_id(+) = ss.inst_id
and cp.concurrent_process_id(+) = r.controlling_manager
• For completed requests CP_PROCESS field is still visible
Completed / Normal
… and the presentation is also Completed/Normal
Where to get more information?
• OTN – Oracle E-Business Suite System Administrator's Guide Documentation Set
• http://etrm.oracle.com – ER diagrams and information about the DB objects
• http://appsdbalife.wordpress.com – Comment and ask questions, I will answer!
Thank you!
?