KPI Investigation Methodology Process

Transcript of KPI Investigation Methodology Process

  • KPI Investigation Methodology: Processes, methods and tools to troubleshoot

    Alcatel-Lucent - Internal Proprietary - Use pursuant to Company instruction.

    Agenda
      - Introduction
      - KPI Investigation Methodology Overview
      - First Level Investigation
      - Deep Investigation

    * | UA7.1.3 KPI Investigation Methodology | Feb. 2011 | Section 1: Introduction

    * Introduction
      KPI degradation after Swap / Software Upgrade / Hardware Replacement / Feature Activation is a sensitive issue for customers. KPI degradations need to be detected quickly through accurate monitoring.
      Main objectives of this document:
      - Present the recommended processes and tools to efficiently troubleshoot the performance issues observed during Global Market.
      - In particular, clarify the roles of the on-site team and back-office teams during the First Level Investigation and Deep Investigation steps of the investigation.

    * Section 2: KPI Investigation Methodology Overview

    * KPI Investigation Methodology Overview: Investigation process (organizational view)
      Flow: Monitor the network (customer, ALU) -> KPI degradation -> First Level Investigation -> Deep Investigation
      - First Level Investigation:
        - First Level Analysis: alarms, configuration
        - Open an AR that should include all the outputs from the First Level Analysis
      - Deep Investigation (Optimization/Support/GPS troubleshooting experts, subject matter experts (GPS group)):
        - Collect additional traces (RNC and NodeB internal traces, Network Analyzer traces, CTg, Call Failure Trace, etc.)
        - Change parameters and perform the related monitoring
        - Perform drive tests
        - Apply fixes and assess performance recovery
        - Deeply analyze parameter settings and counters
        - Analyze the issue based on all available data (including traces)
        - Maintain a comprehensive document (usually PowerPoint) with investigation results and current action plan

    * KPI Investigation Methodology Overview: Investigation process (functional view)
      Flow: customer feedback / daily monitoring -> KPI degradation -> First Level Investigation -> Deep Investigation
      - First Level Investigation:
        - Check parameter changes
        - Basic counter analysis to identify the TopN most impacted NodeBs or Cells
        - Check alarms & stability
        - Record pre-defined set of Call Trace & product traces
        - Logs are uploaded as recommended, then AR is opened
        - Before considering going deeper (e.g. open CR, trigger more support), check the Release Notes applicable to the current SW load to verify whether the issue is a known issue
      - Deep Investigation (Optimization/Support/GPS and troubleshooting experts):
        - CS CDR Troubleshooting
        - PS R99 CDR Troubleshooting
        - HSxPA CDR Troubleshooting
        - Mobility Troubleshooting
        - Accessibility Troubleshooting
        - Throughput Troubleshooting

    * KPI Investigation Methodology Overview: Weekly performance investigation meetings
      For each identified performance issue, technical meetings should be scheduled on a weekly basis. These meetings are usually chaired by a subject matter expert or an Optimization troubleshooting expert. Their purpose is to:
      - Share the latest investigation results from each involved team
      - Synch up on the recent key events and data logs
      - Agree on potential new axes of investigation
      - Determine a clear action plan (next steps) and assess its feasibility

    * KPI Investigation Methodology Overview: Useful Links: Internal tools (ALU-proprietary) [1/3]
      - WPS (Wireless Provisioning System): https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=63414868
        - Launch WPS installation (file wips-Vxxx_windows.exe)
        - When prompted during installation, load the relevant UMTS plugin (file UMTS_Access-Vxxx.wipsar)
      - Call Failure Trace: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=57344700
        At the above link can be found:
        - Presentations on CFT capabilities and CFT usage examples
        - CFT Reader tool
      - eDAT (Evolved Data Analysis Tool): http://navigator.web.alcatel-lucent.com/index.htm
      - RNC Log Collection: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=61064195
      - NodeB Log Collection: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=60312775

    * KPI Investigation Methodology Overview: Useful Links: Internal tools (ALU-proprietary) [3/3]
      - RFO Product: http://aww.srd.alcatel.com/icr/SDCT/sdct.php
        The License must be requested through the SDCT (Software Delivery Center Tool) homepage accessible at the above link. Once on the SDCT homepage:
        - Download the SDCT User Guide (Online documentation PowerPoint link at the top of the page) and read it
        - Download the W-CDMA RFO License Ordering presentation (W-CDMA_RFO_License link at the bottom-right of the page) and read it
        - Click on the Request link in the middle of the page to initiate the license request procedure
          - If you are not yet registered in the SDC database, you will first need to register (Registering button)
          - When prompted, the template to select is RFO Product license

    * KPI Investigation Methodology Overview: Useful Links: External tools
      - Wireshark:
        - Wireshark: http://www.wireshark.org/download.html
        - WinPcap: http://www.winpcap.org/archive/4.1beta5_WinPcap.exe
          - Packet capture and filtering engine of many open source and commercial network tools
          - Used by Wireshark
      - Iperf: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objid=57980928

    * Section 3: First Level Investigation

    * First Level Investigation: Pre-requisites: Collection of reference data before Upgrade [2/2]
      The KPI Investigation Methodology is fully applicable only if the following reference data are properly collected in case of Swap / SW Upgrade / HW Replacement / Feature Activation (continued):
      - Snapshot (.xml or .xcm file containing the parameter configuration data): daily network Snapshots should be collected for at least 1 week prior to the upgrade, to check configuration (parameter) stability. The Snapshots should be stored in a LiveLink repository for easy access by back-office teams.
      - Hardware Inventory: HW Inventory files should be collected.
      - Drive Test.
      - Call Trace.
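The "at least 1 week of daily Snapshots" pre-requisite can be checked mechanically before the upgrade. The following is a minimal sketch only: it assumes the collection dates of the stored Snapshots are already known (how they are read from the repository is not covered here), and the function name and the dates are hypothetical.

```python
from datetime import date, timedelta


def missing_snapshot_days(snapshot_dates, upgrade_day, days_required=7):
    """Return the days of the pre-upgrade window that have no Snapshot.

    snapshot_dates: dates on which a Snapshot was collected (assumed input).
    upgrade_day: planned upgrade date; the window is the days_required
    calendar days immediately before it.
    """
    collected = set(snapshot_dates)
    window = [upgrade_day - timedelta(days=k) for k in range(days_required, 0, -1)]
    return [day for day in window if day not in collected]


# Hypothetical example: upgrade on 15 Feb, Snapshot of 11 Feb was not collected.
upgrade = date(2011, 2, 15)
collected = [date(2011, 2, d) for d in (8, 9, 10, 12, 13, 14)]
missing = missing_snapshot_days(collected, upgrade)
print(missing)
```

An empty result means the reference week is fully covered; any listed date is a gap in the Snapshot series.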

    * First Level Investigation: Pre-requisites: Tools
      The KPI Investigation Methodology refers to many internal (ALU-proprietary) and external troubleshooting tools.
      - Tools needed for First Level Investigation:
        - WPS (Wireless Provisioning System)
        - NPO (Network Performance Optimizer)
        - UTRAN Call Trace Wizard
        - HFB (Historical Fault Browser)
        - RNC Log Collection
        - NodeB Log Collection
        - BTS Set Traces
      - Tools needed for Deep Investigation (i.e. to collect additional logs and deeply analyze the issue):
        - eDAT, to post-process UE traces and Call Traces
        - RFO Product, to post-process Call Traces
        - WQA, to post-process CTn and CFT traces
        - CFT Reader, to post-process CFT traces
        - Remote Connection to NodeB (through Telnet session or by using TIL tool), to check the NodeB status and collect specific NodeB internal traces (e.g. IMT logs)
        - Audit Tool, to produce the Audit and NodeB Profiling Excel files based on Snapshot and HW Inventory
        - Wireshark, Windump, Iperf, DU Meter, to investigate E2E throughput

    * First Level Investigation: Selection of TopN most impacted network elements
      Flow: customer feedback / daily monitoring -> KPI degradation -> First Level Investigation -> Deep Investigation
      First Level Investigation steps:
      - Check parameter changes
      - Basic counter analysis
        1/ Identify the TopN most impacted NodeBs or Cells. For this, use standard NPO Views.
        2/ Also try to roughly identify the failure causes.
      - Check alarms & stability
        - Focus on the TopN NEs determined through counter analysis
        - Use e.g. HFB tool
      - Record pre-defined set of Call Trace & product traces
        - Launch CTg+CFT and RNC Log Collection on the TopN NEs
        - Launch NodeB Log Collection and BTS Set Traces with NodeB KPI AP template on the Top3 NodeBs
      - Logs are uploaded as recommended, then AR is opened
      - Before considering going deeper, check the Release Notes applicable to the current SW load to verify whether the issue is a known issue
      Rule to select the TopN most impacted (or worst) network elements:
      - If degradation is observed at RNC level: identify the TopN NodeBs
      - If degradation is observed at NodeB level: identify the TopN Cells
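The TopN selection rule can be illustrated with a short script. This is a sketch under stated assumptions, not the NPO procedure: it assumes failure counters have already been exported as rows with hypothetical keys `nodeb`, `cell` and `failures`, and it ranks by absolute failure count (a ratio-based ranking would be an equally valid choice).

```python
from collections import defaultdict


def top_n_impacted(rows, degraded_level, n=10):
    """Rank network elements by summed failure count.

    rows: iterable of dicts with hypothetical keys 'nodeb', 'cell', 'failures'.
    degraded_level: 'RNC' ranks NodeBs, 'NodeB' ranks Cells (rule from the slide).
    """
    if degraded_level == "RNC":
        key = "nodeb"
    elif degraded_level == "NodeB":
        key = "cell"
    else:
        raise ValueError("degraded_level must be 'RNC' or 'NodeB'")

    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["failures"]

    # Highest failure count first; ties broken by name for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical counter export.
rows = [
    {"nodeb": "NB1", "cell": "NB1-C1", "failures": 40},
    {"nodeb": "NB1", "cell": "NB1-C2", "failures": 5},
    {"nodeb": "NB2", "cell": "NB2-C1", "failures": 30},
    {"nodeb": "NB3", "cell": "NB3-C1", "failures": 50},
]
print(top_n_impacted(rows, "RNC", n=2))
print(top_n_impacted(rows, "NodeB", n=2))
```

When the degradation is seen at NodeB level, the input rows would normally be restricted to the cells of the degraded NodeB before ranking.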

    * First Level Investigation: Parameter check [2/2]
      Method to compute all the differences between 2 Snapshots with the WPS tool:
      1/ Import in WPS the earliest available Snapshot after the observed KPI degradation. Use the "Replace initial snapshot and discard existing workorders" option.
      2/ Under the same WPS application, import on top of the previously loaded Snapshot the latest available Snapshot before the observed KPI degradation. Use the "Resynchronize the planned configuration with new snapshot" option.
      3/ Under the Workorders tab you will see the delta workorder to transition from the old to the new Snapshot, showing all the differences.
      4/ You can analyze the computed differences there, or right-click on the table to export them as an .html file and analyze them in Excel (recommended option).
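To make clear what the delta workorder represents, the comparison can be sketched outside WPS. This is an illustration only and does not replace the WPS procedure: it assumes each Snapshot has already been flattened into a mapping of (object identifier, parameter name) to value, and the object and parameter names used are hypothetical. Parsing of the .xml/.xcm files is not shown.

```python
ABSENT = "<absent>"  # marker for a parameter missing from one Snapshot


def snapshot_delta(before, after):
    """List parameters whose value differs between two flattened Snapshots.

    before / after: dicts mapping (object_id, parameter) -> value
    (assumed pre-parsed; the real Snapshot format is not handled here).
    Returns tuples (object_id, parameter, old_value, new_value).
    """
    changes = []
    for key in sorted(set(before) | set(after)):
        old = before.get(key, ABSENT)
        new = after.get(key, ABSENT)
        if old != new:
            changes.append((key[0], key[1], old, new))
    return changes


# Hypothetical content: one changed value, one removed and one added object.
before = {
    ("Cell/1", "paramA"): "10",
    ("Cell/1", "paramB"): "on",
    ("Cell/2", "paramA"): "10",
}
after = {
    ("Cell/1", "paramA"): "12",
    ("Cell/1", "paramB"): "on",
    ("Cell/3", "paramA"): "10",
}
for change in snapshot_delta(before, after):
    print(change)
```

Here `before` stands for the latest Snapshot before the degradation and `after` for the earliest Snapshot after it, matching the old-to-new direction of the delta workorder.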

    * First Level Investigation: Call Trace [1/2]
      CTB/CTg Call Trace logs:
      - In order to collect the relevant data in CTg logs while capturing a sufficiently high number of failures, the CTg session can be configured to capture only a few specific call types.
      CFT Call Trace logs:
      - Since the UA7.1 release, Call Failure Trace (CFT) logs have proven to be a powerful additional type of Call Trace for KPI investigation. A CFT record is captured each time an RRC Connection failure, RAB Assignment failure or RAB Drop occurs.
      - Unlike CTg logs, CFT logs do not contain the whole call flow but are basically a detailed snapshot of the call at failure time; CFT logs are typically smaller than CTg logs.
      - In a typical live environment a CFT session will capture the great majority of call failures, whereas a CTg session usually captures only a relatively small part of the total number of calls.
      - A CFT session will attempt to capture all call failures, regardless of the call types specified in the CTg Call Trace template.
      - CFT is particularly powerful for Multi-RAB types of KPI degradations.
      - CFT logs should be collected each time a CTg session is launched. This can be done by making sure to use a CTg Call Trace template that includes the record called UeCallFailureFull.
      - CFT logs should always be made available together with the CTg logs under the log storage location (e.g. in a folder named CallTrace).

    * First Level Investigation
      - WPS snapshot delta (e.g. .html format) and 2 snapshots (latest snapshot before the degradation, earliest snapshot after the degradation) in folder Snapshots
      - Output of RNC Log Collection in folder RNC logs
      - Output of NodeB Log Collection (at least 1 BTS) in folder NodeB logs
      - CTg traces in folder CallTrace
      - Call Failure Trace data in folder CallTrace
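Before the data is handed over, the folder layout listed above can be verified with a small script. This is a minimal sketch: it only checks that each named folder exists and is not empty, it does not validate the content of the logs, and the root directory used in the example is a throwaway temporary folder.

```python
import tempfile
from pathlib import Path

# Folder names taken from the list above.
EXPECTED_FOLDERS = ("Snapshots", "RNC logs", "NodeB logs", "CallTrace")


def missing_or_empty_folders(root):
    """Return the expected folders that are absent or contain no entries."""
    root = Path(root)
    problems = []
    for name in EXPECTED_FOLDERS:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            problems.append(name)
    return problems


# Hypothetical example: only Snapshots is populated, CallTrace exists but is empty.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Snapshots").mkdir()
    (Path(tmp) / "Snapshots" / "delta.html").write_text("placeholder")
    (Path(tmp) / "CallTrace").mkdir()
    demo_result = missing_or_empty_folders(tmp)

print(demo_result)
```

An empty list indicates that all four folders are present and populated.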

    * Section 4: Deep Investigation

    * Deep Investigation: Overview
      Flow: customer feedback / daily monitoring -> KPI degradation -> First Level Investigation -> Deep Investigation
      - First Level Investigation:
        - First Level Analysis: alarms and configuration
        - Before considering going deeper, check the Release Notes applicable to the current SW load to verify whether the issue is a known issue
      - Deep Investigation (Optimization/Support/GPS troubleshooting experts):
        - CS CDR Troubleshooting
        - PS R99 CDR Troubleshooting
        - HSxPA CDR Troubleshooting
        - Mobility Troubleshooting
        - Accessibility Troubleshooting
        - Throughput Troubleshooting

    * Deep Investigation: Counter-based in-depth analysis
      Deeper counter-based analysis is needed in order to characterize the degradation. The aim is:
      - To narrow down as much as possible where the degradation occurs (product, board, procedure)
      - To identify a common profile of the TopN most impacted cells (HW, configuration, traffic profile, etc.)
      The analysis is done with:
      - Counter correlation: attempt to correlate the general counter showing the degradation with a counter for a specific lower-level procedure
      - Comparison of the distribution of failure types (e.g. which cause of drop has increased the most)
      Counter-based analysis is also used:
      - To quantify patterns identified by CTg analysis
      - To assess the KPI recovery after fix delivery
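Both analysis techniques named above can be sketched in a few lines. This is an illustration under assumptions, not the NPO implementation: it uses a plain Pearson coefficient as one possible way to correlate two counter time series, and it compares failure-cause distributions as shifts in percentage points. The counter values and cause names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equally long counter series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / (sx * sy)


def cause_share_shift(before, after):
    """Change in the share of each failure cause, in percentage points.

    before / after: dicts mapping cause name -> failure count.
    Returns (cause, shift) pairs, largest increase first.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    if total_before == 0 or total_after == 0:
        raise ValueError("both periods need at least one failure")
    shift = {
        cause: 100 * after.get(cause, 0) / total_after
        - 100 * before.get(cause, 0) / total_before
        for cause in set(before) | set(after)
    }
    return sorted(shift.items(), key=lambda item: -item[1])


# Hypothetical daily series: general drop counter vs. a lower-level procedure counter.
drops = [10, 12, 30, 28, 11]
procedure_failures = [2, 3, 20, 19, 2]
print(round(pearson(drops, procedure_failures), 3))

# Hypothetical cause distribution before and after the degradation.
print(cause_share_shift(
    {"radio": 50, "iub": 30, "other": 20},
    {"radio": 50, "iub": 90, "other": 20},
))
```

A coefficient close to 1 suggests the lower-level procedure is a good candidate for where the degradation occurs; the cause with the largest positive shift is the one that has increased the most relative to the others.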

    * Deep Investigation: CS CDR Troubleshooting cookbook
      CS CDR troubleshooting:
      - Analysis of relevant counters for CS CDR degradation
      - Launch CTg+CFT session on TopN cells
      - Analysis of CTg traces
      - Analysis of CFT data
      - Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)
      - Consolidate global analysis (using all tools outputs)

    * Deep Investigation: PS R99 Troubleshooting cookbook
      PS R99 CDR troubleshooting:
      - Analysis of relevant counters for PS CDR degradation
      - Launch CTg+CFT session on TopN cells
      - Analysis of CTg traces
      - Analysis of CFT data
      - Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)
      - Consolidate global analysis (using all tools outputs)

    * Deep Investigation: HSxPA CDR Troubleshooting cookbook
      HSxPA CDR troubleshooting:
      - Analysis of relevant counters for HSxPA CDR degradation
      - Launch CTg+CFT session on TopN cells
      - Analysis of CTg traces
      - Analysis of CFT data
      - Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)
      - If needed, perform Drive tests + UE traces
      - Consolidate global analysis (using all tools outputs)

    * Deep Investigation: Mobility troubleshooting cookbook
      Mobility troubleshooting:
      - Analysis of relevant counters for Mobility degradation
      - Launch CTn session on TopN cells
      - Analysis of CTg traces
      - Perform WQA CTn analysis
      - Consolidate CTg + NPO analysis
      - If needed, perform Drive tests + UE traces
      - Consolidate global analysis (using all tools outputs)

    * Deep Investigation: Accessibility troubleshooting cookbook
      Accessibility troubleshooting:
      - Analysis of relevant counters for Accessibility degradation
      - Launch CTg+CFT session on TopN cells
      - Analysis of CTg traces
      - Analysis of CFT data
      - Consolidate CTg + CFT + NPO analysis
      - In case of RRC degradation: perform drive tests (UE traces)
      - Consolidate global analysis (using all tools outputs)

    * www.alcatel-lucent.com
