KPI Investigation Methodology Process UA7
Uploaded by amit-kumar-raut
KPI Investigation Methodologies: Processes, methods and tools to troubleshoot
Agenda
1. Introduction
2. KPI Investigation Methodology Overview
3. First Level Investigation
4. Deep Investigation
Alcatel-Lucent – InternalProprietary – Use pursuant to Company instruction. 3 | UA7.1.3 KPI Investigation Methodology | Feb. 2011
1. Introduction
KPI degradation after Swap/Software Upgrade / Hardware Replacement / Feature Activation is a sensitive issue for customers. KPI degradations need to be quickly detected through accurate monitoring.
Main objectives of this document:
- Present the recommended processes, methods and tools to efficiently troubleshoot the performance issues observed during Global Market.
- In particular, clarify the roles of the on-site team and back-office teams during the “First Level Investigation” and “Deep Investigation” steps of the investigation.
2. KPI Investigation Methodology Overview
Investigation process (organizational view)
Monitor the network (customer)
Monitor the network (ALU)
KPI degradation
First Level Analysis: Alarms, Configuration
First Level Investigation
- Open an AR that should include all the outputs from the First Level Analysis
Deep Investigation
- Collect additional traces (RNC and NodeB internal traces, Network Analyzer traces, CTg, Call Failure Trace, etc.)
- Change parameters and perform the related monitoring
- Perform drive tests
- Apply fixes and assess performance recovery
- Deeply analyze parameter settings and counters
- Analyze the issue based on all available data (including traces)
- Maintain a comprehensive document (usually PowerPoint) with investigation results and current action plan
Optimization/Support/GPS troubleshooting experts and subject matter experts (GPS group)
KPI Investigation Methodology Overview
Investigation process (functional view)
Customer feedback, Daily monitoring
KPI degradation
First Level Investigation
- Check parameter changes
- Basic counter analysis to identify the TopN most impacted NodeBs or Cells
- Check alarms & stability
- Record pre-defined set of Call Trace & product traces
- Logs are uploaded as recommended, then AR is opened
- Before considering going deeper (e.g. open CR, trigger more support), check the “Release Notes” applicable to current SW load to verify if issue is a known issue
Deep Investigation (Optimization/Support/GPS and troubleshooting experts)
- CS CDR Troubleshooting
- PS R99 CDR Troubleshooting
- HSxPA CDR Troubleshooting
- Mobility Troubleshooting
- Accessibility Troubleshooting
- Throughput Troubleshooting
KPI Investigation Methodology Overview
Weekly performance investigation meetings
For each identified performance issue, technical meetings should be scheduled on a weekly basis; these meetings are usually chaired by a subject matter expert or Optimization troubleshooting expert. The aim of these meetings is to:
- Share the latest investigation results from each involved team
- Sync up on the recent key events and data logs
- Agree on potential new axes of investigation
- Determine a clear action plan (“next steps”) and assess its feasibility
KPI Investigation Methodology Overview
Useful Links – Internal tools (ALU-proprietary) [1/3]
WPS (Wireless Provisioning System): https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=63414868
1. Launch WPS installation (file “wips-Vxxx_windows.exe”)
2. When prompted during installation, load the relevant UMTS plugin (file “UMTS_Access-Vxxx.wipsar”)
Call Failure Trace: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=57344700
The following can be found at the above link:
- Presentations on CFT capabilities and CFT usage examples
- CFT Reader tool
eDAT (Evolved Data Analysis Tool): http://navigator.web.alcatel-lucent.com/index.htm
RNC Log Collection: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=61064195
NodeB Log Collection: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=60312775
KPI Investigation Methodology Overview
Useful Links – Internal tools (ALU-proprietary) [3/3]
RFO Product: http://aww.srd.alcatel.com/icr/SDCT/sdct.php
The license must be requested through the SDCT (Software Delivery Center Tool) homepage, accessible at the above link. Once on the SDCT homepage:
1. Download the SDCT User Guide (“Online documentation PowerPoint” link at the top of the page) and read it.
2. Download the W-CDMA RFO License Ordering presentation (“W-CDMA_RFO_License” link at the bottom-right of the page) and read it.
3. Click on the “Request” link in the middle of the page to initiate the license request procedure.
- If you are not yet registered in the SDC database, you will first need to register (“Registering” button).
- When prompted, the template to select is “RFO Product license”.
KPI Investigation Methodology Overview
Useful Links – External tools
Wireshark: http://www.wireshark.org/download.html
WinPcap: http://www.winpcap.org/archive/4.1beta5_WinPcap.exe
- Packet capture and filtering engine of many open source and commercial network tools
- Used by Wireshark
Iperf: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objid=57980928
3. First Level Investigation
Pre-Requisites: Collection of reference data before Upgrade [2/2]
The KPI Investigation Methodology is fully applicable only if the following reference data are properly collected in case of Swap / SW Upgrade / HW Replacement / Feature Activation (continued):
- Snapshot (.xml or .xcm file containing the parameter configuration data): daily network Snapshots should be collected for at least 1 week prior to the upgrade, to check the stability of the configuration (parameters). The Snapshots should be stored in a LiveLink repository for easy access by back-office teams.
- Hardware Inventory: HW Inventory files should be collected.
- Drive Test.
- Call Trace.
First Level Investigation
Pre-requisites:
The KPI Investigation Methodology refers to many internal (ALU-proprietary) and external troubleshooting tools.
Tools needed for First Level Investigation
- WPS (Wireless Provisioning System)
- NPO (Network Performance Optimizer)
- UTRAN Call Trace Wizard
- HFB (Historical Fault Browser)
- RNC Log Collection
- NodeB Log Collection
- BTS Set Traces
Tools needed for Deep Investigation (i.e. to collect additional logs and deeply analyze the issue)
- eDAT, to post-process UE traces and Call Traces
- RFO Product, to post-process Call Traces
- WQA, to post-process CTn and CFT traces
- CFT Reader, to post-process CFT traces
- Remote Connection to NodeB (through Telnet session or by using TIL tool), to check the NodeB status and collect specific NodeB internal traces (e.g. IMT logs)
- Audit Tool, to produce the Audit and NodeB Profiling Excel files based on Snapshot and HW Inventory
- Wireshark, Windump, Iperf, DU Meter, to investigate E2E Tput
Legend: Tool X: Internal tool (ALU-proprietary); Tool Y: External tool
First Level Investigation
Selection of “TopN” most impacted network elements
Customer feedback, Daily monitoring
KPI degradation
First Level Investigation
- Check parameter changes
- Basic counter analysis
  1/ Identify the TopN most impacted NodeBs or Cells. For this, use standard NPO Views
  2/ Also try to roughly identify the failure causes
- Check alarms & stability
  - Focus on the TopN NEs determined through counter analysis
  - Use e.g. HFB tool
- Record pre-defined set of Call Trace & product traces
  - Launch CTg+CFT and RNC Log Collection on the TopN NEs
  - Launch NodeB Log Collection and BTS Set Traces with “NodeB KPI AP” template on the Top3 NodeBs
- Logs are uploaded as recommended, then AR is opened
- Before considering going deeper, check the “Release Notes” applicable to current SW load to verify if issue is a known issue
Deep Investigation
Rule to select the TopN most impacted (or “worst”) network elements:
- If degradation is observed at RNC level: identify the TopN NodeBs
- If degradation is observed at NodeB level: identify the TopN Cells
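The TopN selection rule for this slide can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the counter fields, the element names and the ranking metric (increase in drop rate between a reference period and the degraded period) are hypothetical, and are neither NPO outputs nor an ALU-defined formula.

```python
from dataclasses import dataclass


@dataclass
class ElementStats:
    """Counters for one network element (NodeB or Cell); all fields are hypothetical."""
    name: str
    ref_drops: int      # drops during the reference period (before degradation)
    ref_releases: int   # total call releases during the reference period
    cur_drops: int      # drops during the degraded period
    cur_releases: int   # total call releases during the degraded period

    def drop_rate_increase(self) -> float:
        """Increase of the drop rate between the reference and degraded periods."""
        ref = self.ref_drops / self.ref_releases if self.ref_releases else 0.0
        cur = self.cur_drops / self.cur_releases if self.cur_releases else 0.0
        return cur - ref


def child_level(degraded_level: str) -> str:
    """Rule from the slide: RNC-level degradation -> rank NodeBs,
    NodeB-level degradation -> rank Cells."""
    return {"RNC": "NodeB", "NodeB": "Cell"}[degraded_level]


def top_n(elements: list[ElementStats], n: int) -> list[str]:
    """Return the names of the n elements with the largest drop-rate increase."""
    ranked = sorted(elements, key=lambda e: e.drop_rate_increase(), reverse=True)
    return [e.name for e in ranked[:n]]


# Made-up example: degradation seen at RNC level, so NodeBs are ranked.
nodebs = [
    ElementStats("NodeB_A", 10, 1000, 12, 1000),
    ElementStats("NodeB_B", 10, 1000, 50, 1000),
    ElementStats("NodeB_C", 5, 500, 20, 500),
    ElementStats("NodeB_D", 8, 800, 8, 800),
]
worst = top_n(nodebs, 3)
print(child_level("RNC"), worst)
```

Ranking on the change in rate rather than on the raw drop count is a design choice of this sketch; it avoids always selecting the busiest elements, but other metrics may suit a given KPI better.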
First Level Investigation
Parameter check [2/2]
Method to compute all the differences between 2 Snapshots with WPS tool:
1. Import in WPS the earliest available Snapshot after the observed KPI degradation. Use the “Replace initial snapshot and discard existing workorders” option.
2. Under the same WPS application, import on top of the previously loaded Snapshot the latest available Snapshot before the observed KPI degradation. Use the “Resynchronize the planned configuration with new snapshot” option.
3. Under the Workorders tab you will see the delta workorder to transition from old to new Snapshot, showing all the differences. You can analyze the computed differences here, or right-click on the table to export them as an .html file and analyze them in Excel (recommended option).
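WPS computes the Snapshot delta itself; the underlying comparison can nevertheless be sketched in Python, for example to cross-check exported data. This is a minimal sketch, assuming each Snapshot has already been flattened into a dictionary keyed by (object, parameter). The object and parameter names are hypothetical and do not reflect the real Snapshot schema.

```python
def snapshot_delta(before: dict, after: dict) -> list[tuple]:
    """List (key, old_value, new_value) for every parameter that differs.

    A key missing from one snapshot is reported with None on that side.
    """
    delta = []
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            delta.append((key, old, new))
    return delta


# Hypothetical flattened snapshots: (object, parameter) -> value.
snap_before = {
    ("Cell_1", "paramA"): "10",
    ("Cell_1", "paramB"): "on",
    ("Cell_2", "paramA"): "10",
}
snap_after = {
    ("Cell_1", "paramA"): "12",   # value changed
    ("Cell_1", "paramB"): "on",   # unchanged, not reported
    ("Cell_3", "paramA"): "10",   # object added; Cell_2 removed
}
changes = snapshot_delta(snap_before, snap_after)
for key, old, new in changes:
    print(key, old, "->", new)
```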
First Level Investigation
Call Trace [1/2]
CTB/CTg Call Trace logs:
- In order to collect the relevant data in CTg logs while capturing a sufficiently high number of failures, the CTg session can be configured to capture only a few specific call types.
CFT Call Trace logs:
- Since the UA7.1 release, Call Failure Trace (CFT) logs have proven to be a powerful additional type of Call Trace for KPI investigation. A CFT record is captured each time an RRC Connection failure, RAB Assignment failure or RAB Drop occurs.
- Unlike CTg logs, CFT logs do not contain the whole call flow but are basically a detailed snapshot of the call at failure time; CFT logs are typically smaller than CTg logs. In a typical live environment, a CFT session will capture the great majority of call failures, whereas a CTg session usually captures only a relatively small part of the total number of calls.
- A CFT session will attempt to capture all call failures, regardless of the call types specified in the CTg Call Trace template.
- CFT is particularly powerful for Multi-RAB types of KPI degradations.
- CFT logs should be collected each time a CTg session is launched. This can be done by making sure to use a CTg Call Trace template that includes the record called “UeCallFailureFull”.
- CFT logs should always be made available together with the CTg logs under the log storage location (e.g. in a folder named “CallTrace”).
First Level Investigation
- WPS snapshot delta (e.g. .html format) and 2 snapshots (latest snapshot before the degradation, earliest snapshot after the degradation) in folder “Snapshots”
- Output of RNC Log Collection in folder “RNC logs”
- Output of NodeB Log Collection (at least 1 BTS) in folder “NodeB logs”
- CTg traces in folder “CallTrace”
- Call Failure Trace data in folder “CallTrace”
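Before the AR is opened, the presence of the folders listed on this slide can be checked with a small script. The sketch below is an assumption-laden illustration: it takes the four folder names from the slide, treats a missing or empty folder as a problem, and does not validate the content of any file. The function name and the demo files are hypothetical.

```python
import tempfile
from pathlib import Path

# Folder names as listed on the slide.
EXPECTED_FOLDERS = ["Snapshots", "RNC logs", "NodeB logs", "CallTrace"]


def missing_or_empty(root: Path) -> list[str]:
    """Return the expected folders that are absent from root or contain no entry."""
    problems = []
    for name in EXPECTED_FOLDERS:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            problems.append(name)
    return problems


# Demo on a temporary directory with placeholder content.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Snapshots").mkdir()
    (root / "Snapshots" / "delta.html").write_text("placeholder")
    (root / "CallTrace").mkdir()  # present but empty
    report = missing_or_empty(root)

print(report)
```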
4. Deep Investigation
Overview
Customer feedback, Daily monitoring
KPI degradation
First Level Analysis: Alarms and Configuration
First Level Investigation
- Before considering going deeper, check the “Release Notes” applicable to current SW load to verify if issue is a known issue.
Deep Investigation (Optimization/Support/GPS troubleshooting experts)
- CS CDR Troubleshooting
- PS R99 CDR Troubleshooting
- HSxPA CDR Troubleshooting
- Mobility Troubleshooting
- Accessibility Troubleshooting
- Throughput Troubleshooting
Deep Investigation
Counter-based in-depth analysis
Deeper counter-based analysis is needed in order to characterize the degradation. The aim is:
- To narrow down as much as possible where the degradation occurs (product, board, procedure)
- To identify a common profile of the TopN most impacted cells (HW, configuration, traffic profile, etc.)
The analysis is done with:
- Counter correlation: attempt to correlate the general counter showing the degradation with a counter for a specific lower-level procedure
- Comparison of the distribution of failure types (e.g. which cause of drop has increased the most)
Counter-based analysis is also used:
- To quantify patterns identified by CTg analysis
- To assess the KPI recovery after fix delivery
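Counter correlation and the comparison of failure-type distributions can both be sketched in a few lines of Python. This is a minimal illustration, not the NPO implementation: the counter names and daily values are made up, Pearson correlation is one possible choice of correlation measure, and the "share shift" helper is a hypothetical way to express which cause has increased the most.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


def share_shift(before: dict, after: dict) -> dict:
    """Change in the share of each failure cause between two periods."""
    total_b, total_a = sum(before.values()), sum(after.values())
    shift = {}
    for cause in set(before) | set(after):
        share_b = before.get(cause, 0) / total_b if total_b else 0.0
        share_a = after.get(cause, 0) / total_a if total_a else 0.0
        shift[cause] = share_a - share_b
    return shift


# Hypothetical daily values: general drop counter and two lower-level counters.
total_drops = [20, 22, 21, 40, 45, 43]
candidates = {
    "radio_link_failure": [10, 11, 10, 28, 33, 31],
    "iu_release": [5, 6, 5, 6, 6, 5],
}
correlations = {name: pearson(total_drops, series) for name, series in candidates.items()}
best_match = max(correlations, key=correlations.get)

# Hypothetical failure-cause distribution before and after the degradation.
causes_before = {"radio_link_failure": 50, "iu_release": 30, "other": 20}
causes_after = {"radio_link_failure": 140, "iu_release": 35, "other": 25}
shift = share_shift(causes_before, causes_after)
most_increased = max(shift, key=shift.get)

print(best_match, most_increased)
```

A high correlation only suggests where to look next; it should be confirmed with traces before any conclusion is drawn.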
Deep Investigation
CS CDR Troubleshooting cookbook
- Analysis of relevant counters for CS CDR degradation
- Launch CTg+CFT session on TopN cells
- Analysis of CTg traces
- Analysis of CFT data
- Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)
- Consolidate global analysis (using all tools outputs)
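The consolidation of failures by cause of drop and by IMSI can be sketched as a simple aggregation. The sketch below assumes the failure records have already been exported from the trace tools into dictionaries; the field names, cause labels, IMSI placeholders and the 50% threshold are all hypothetical and do not reflect the actual CFT record format.

```python
from collections import Counter


def consolidate(records: list[dict]) -> tuple[Counter, Counter]:
    """Count failures per cause and per IMSI."""
    by_cause = Counter(r["cause"] for r in records)
    by_imsi = Counter(r["imsi"] for r in records)
    return by_cause, by_imsi


def dominant_imsis(by_imsi: Counter, threshold: float) -> list[str]:
    """IMSIs that account for at least `threshold` of all recorded failures."""
    total = sum(by_imsi.values())
    if total == 0:
        return []
    return [imsi for imsi, n in by_imsi.most_common() if n / total >= threshold]


# Hypothetical failure records, one per captured drop.
records = [
    {"cell": "Cell_1", "cause": "cause_X", "imsi": "imsi_1"},
    {"cell": "Cell_1", "cause": "cause_X", "imsi": "imsi_1"},
    {"cell": "Cell_2", "cause": "cause_Y", "imsi": "imsi_2"},
    {"cell": "Cell_1", "cause": "cause_X", "imsi": "imsi_1"},
    {"cell": "Cell_3", "cause": "cause_X", "imsi": "imsi_3"},
]
by_cause, by_imsi = consolidate(records)
suspects = dominant_imsis(by_imsi, threshold=0.5)

print(by_cause.most_common(1), suspects)
```

If a handful of IMSIs account for most failures, the degradation may be tied to specific terminals rather than to the network, which is worth checking before going further.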
Deep Investigation
PS R99 CDR Troubleshooting cookbook
- Analysis of relevant counters for PS CDR degradation
- Launch CTg+CFT session on TopN cells
- Analysis of CTg traces
- Analysis of CFT data
- Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)
- Consolidate global analysis (using all tools outputs)
Deep Investigation
HSxPA CDR Troubleshooting cookbook
- Analysis of relevant counters for HSxPA CDR degradation
- Launch CTg+CFT session on TopN cells
- Analysis of CTg traces
- Analysis of CFT data
- Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)
- If needed, perform Drive tests + UE traces
- Consolidate global analysis (using all tools outputs)
Deep Investigation
Mobility troubleshooting cookbook
- Analysis of relevant counters for Mobility degradation
- Launch CTn session on TopN cells
- Analysis of CTg traces
- Perform WQA CTn analysis
- Consolidate CTg + NPO analysis
- If needed, perform Drive tests + UE traces
- Consolidate global analysis (using all tools outputs)
Deep Investigation
Accessibility troubleshooting cookbook
- Analysis of relevant counters for Accessibility degradation
- Launch CTg+CFT session on TopN cells
- Analysis of CTg traces
- Analysis of CFT data
- Consolidate CTg + CFT + NPO analysis
- In case of RRC degradation: perform drive tests (UE traces)
- Consolidate global analysis (using all tools outputs)
www.alcatel-lucent.com