KPI Investigation Methodology Process UA7

Transcript of KPI Investigation Methodology Process UA7

Page 1: KPI Investigation Methodology Process UA7

KPI Investigation Methodologies: Processes, methods and tools to troubleshoot

Page 2: KPI Investigation Methodology Process UA7

Agenda

1. Introduction

2. KPI Investigation Methodology Overview

3. First Level Investigation

4. Deep Investigation

Page 3: KPI Investigation Methodology Process UA7

Alcatel-Lucent – InternalProprietary – Use pursuant to Company instruction. 3 | UA7.1.3 KPI Investigation Methodology | Feb. 2011

1. Introduction

Page 4: KPI Investigation Methodology Process UA7

Introduction

KPI degradation after Swap/Software Upgrade / Hardware Replacement / Feature Activation is a sensitive issue for customers. KPI degradations need to be quickly detected through accurate monitoring.

Main objectives of this document:

- Present the recommended processes and tools to efficiently troubleshoot the performance issues observed during Global Market.

- In particular, clarify the roles of the on-site team and back-office teams during the “First Level Investigation” and “Deep Investigation” steps of the investigation.

Page 5: KPI Investigation Methodology Process UA7

2. KPI Investigation Methodology Overview

Page 6: KPI Investigation Methodology Process UA7

KPI Investigation Methodology Overview

Investigation process (organizational view)

Monitor the network (customer)

Monitor the network (ALU)

KPI degradation

Alarms, Configuration

First Level Analysis

First Level Investigation

- Open an AR that should include all the outputs from the First Level Analysis

Deep Investigation

- Collect additional traces (RNC and NodeB internal traces, Network Analyzer traces, CTg, Call Failure Trace, etc.)

- Change parameters and perform the related monitoring

- Perform drive tests

- Apply fixes and assess performance recovery

- Deeply analyze parameter settings and counters

- Analyze the issue based on all available data (including traces)

- Maintain a comprehensive document (usually PowerPoint) with investigation results and current action plan

Optimization/Support/GPS troubleshooting experts, subject matter experts (GPS group)

Page 7: KPI Investigation Methodology Process UA7

KPI Investigation Methodology Overview

Investigation process (functional view)

Customer feedback, Daily monitoring

KPI degradation

First Level Investigation

- Check parameter changes

- Basic counter analysis to identify the TopN most impacted NodeBs or Cells

- Check alarms & stability

- Record pre-defined set of Call Trace & product traces

- Logs are uploaded as recommended, then AR is opened

- Before considering going deeper (e.g. open CR, trigger more support), check the “Release Notes” applicable to current SW load to verify if issue is a known issue

Deep Investigation (Optimization/Support/GPS and troubleshooting experts)

- CS CDR Troubleshooting

- PS R99 CDR Troubleshooting

- HSxPA CDR Troubleshooting

- Mobility Troubleshooting

- Accessibility Troubleshooting

- Throughput Troubleshooting

Page 8: KPI Investigation Methodology Process UA7

KPI Investigation Methodology Overview

Weekly performance investigation meetings

For each identified performance issue, technical meetings should be scheduled on a weekly basis; these meetings are usually chaired by a subject matter expert or an Optimization troubleshooting expert. The purpose of these meetings is to:

- Share the latest investigation results from each involved team

- Synch up on the recent key events and data logs

- Agree on potential new axes of investigation

- Determine a clear action plan (“next steps”) and assess its feasibility

Page 9: KPI Investigation Methodology Process UA7

KPI Investigation Methodology Overview

Useful Links – Internal tools (ALU-proprietary) [1/3]

WPS (Wireless Provisioning System): https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=63414868

1. Launch WPS installation (file “wips-Vxxx_windows.exe”)

2. When prompted during installation, load the relevant UMTS plugin (file “UMTS_Access-Vxxx.wipsar”)

Call Failure Trace: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=57344700

The above link provides:

- Presentations on CFT capabilities and CFT usage examples

- CFT Reader tool

eDAT (Evolved Data Analysis Tool): http://navigator.web.alcatel-lucent.com/index.htm

RNC Log Collection: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=61064195

NodeB Log Collection: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=60312775

Page 10: KPI Investigation Methodology Process UA7

KPI Investigation Methodology Overview

Useful Links – Internal tools (ALU-proprietary) [3/3]

RFO Product: http://aww.srd.alcatel.com/icr/SDCT/sdct.php

The license must be requested through the SDCT (Software Delivery Center Tool) homepage accessible at the above link. Once on the SDCT homepage:

1. Download the SDCT User Guide (“Online documentation PowerPoint” link at the top of the page) and read it

2. Download the W-CDMA RFO License Ordering presentation (“W-CDMA_RFO_License” link at the bottom-right of the page) and read it

3. Click on the “Request” link at the middle of the page to initiate the license request procedure.

- If you are not yet registered in the SDC database, you will first need to register (“Registering” button)

- When prompted, the template to select is “RFO Product license”

Page 11: KPI Investigation Methodology Process UA7

KPI Investigation Methodology Overview

Useful Links – External tools

Wireshark:

- Wireshark: http://www.wireshark.org/download.html

- WinPcap: http://www.winpcap.org/archive/4.1beta5_WinPcap.exe (packet capture and filtering engine of many open source and commercial network tools; used by Wireshark)

Iperf: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objid=57980928

Page 12: KPI Investigation Methodology Process UA7

3. First Level Investigation

Page 13: KPI Investigation Methodology Process UA7

First Level Investigation

Pre-Requisites: Collection of reference data before Upgrade [2/2]

The KPI Investigation Methodology is fully applicable only if the following reference data are properly collected in case of Swap / SW Upgrade / HW Replacement / Feature Activation (continued):

- Snapshot (.xml or .xcm file containing the parameter configuration data): daily network Snapshots should be collected for at least 1 week prior to the upgrade, to check the stability of the configuration (parameters). The Snapshots should be stored in a LiveLink repository for easy access by back-office teams.

- Hardware Inventory: HW Inventory files should be collected.

- Drive Test.

- Call Trace.
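The one-week Snapshot requirement can be checked with a short script. The following is only a minimal sketch, assuming the collection dates of the stored Snapshots are already known as Python `date` objects; the function name `missing_snapshot_days` and the example dates are hypothetical and not part of any ALU tool.

```python
from datetime import date, timedelta


def missing_snapshot_days(snapshot_dates, upgrade_day, days=7):
    """List the days in the window before the upgrade that have no Snapshot.

    snapshot_dates: iterable of dates on which a Snapshot was stored
                    (assumed input, e.g. derived from file names).
    upgrade_day:    date of the Swap / SW Upgrade / HW Replacement.
    days:           length of the required window (1 week by default).
    """
    available = set(snapshot_dates)
    # Days upgrade_day - days ... upgrade_day - 1, oldest first.
    wanted = [upgrade_day - timedelta(days=i) for i in range(days, 0, -1)]
    return [day for day in wanted if day not in available]


if __name__ == "__main__":
    upgrade = date(2011, 2, 10)  # hypothetical upgrade date
    stored = [date(2011, 2, d) for d in (3, 4, 5, 7, 8, 9)]
    for day in missing_snapshot_days(stored, upgrade):
        print("No Snapshot for", day.isoformat())
```

An empty result means that a Snapshot exists for every day of the week preceding the upgrade.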

Page 14: KPI Investigation Methodology Process UA7

First Level Investigation

Pre-requisites:

Tools needed for First Level Investigation:

- WPS (Wireless Provisioning System)

- NPO (Network Performance Optimizer)

- UTRAN Call Trace Wizard

- HFB (Historical Fault Browser)

- RNC Log Collection

- NodeB Log Collection

- BTS Set Traces

Tools needed for Deep Investigation (i.e. to collect additional logs and deeply analyze the issue):

- eDAT, to post-process UE traces and Call Traces

- RFO Product, to post-process Call Traces

- WQA, to post-process CTn and CFT traces

- CFT Reader, to post-process CFT traces

- Remote Connection to NodeB (through Telnet session or by using TIL tool), to check the NodeB status and collect specific NodeB internal traces (e.g. IMT logs)

- Audit Tool, to produce the Audit and NodeB Profiling Excel files based on Snapshot and HW Inventory

- Wireshark, Windump, Iperf, DU Meter, to investigate E2E Tput

The KPI Investigation Methodology refers to many internal (ALU-proprietary) and external troubleshooting tools

Tool X: Internal tool (ALU-proprietary)

Tool Y: External tool

Page 15: KPI Investigation Methodology Process UA7

First Level Investigation

Selection of “TopN” most impacted network elements

Customer feedback, Daily monitoring

KPI degradation

First Level Investigation

Check parameter changes

Basic counter analysis

1/ Basic counter analysis to identify the TopN most impacted NodeBs or Cells. For this, use standard NPO Views

2/ Also try to roughly identify the failure causes

Check alarms & stability

- Focus on the TopN NEs determined through counter analysis

- Use e.g. HFB tool

Record pre-defined set of Call Trace & product traces

- Launch CTg+CFT and RNC Log Collection on the TopN NEs

- Launch NodeB Log Collection and BTS Set Traces with “NodeB KPI AP” template on the Top3 NodeBs

Deep Investigation

Rule to select the TopN most impacted (or “worst”) network elements:

- If degradation is observed at RNC level: identify the TopN NodeBs

- If degradation is observed at NodeB level: identify the TopN Cells

- Logs are uploaded as recommended, then AR is opened

- Before considering going deeper, check the “Release Notes” applicable to current SW load to verify if issue is a known issue
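The TopN selection rule can be illustrated with a few lines of Python. This is a minimal sketch, assuming the failure counts per network element before and after the degradation have already been exported (for example from NPO Views) into plain dictionaries; the ranking metric (increase in failure count), the function names and the element names are hypothetical choices made for illustration only.

```python
def child_level(observed_level):
    """Level at which TopN elements are picked, per the selection rule."""
    return {"RNC": "NodeB", "NodeB": "Cell"}[observed_level]


def top_n_degraded(before, after, n=3):
    """Rank network elements by increase in failure count.

    before / after: dict mapping element name -> failure count
                    (assumed export format, not an NPO API).
    Returns at most n element names, worst first; elements with no
    increase are left out.
    """
    deltas = {ne: after[ne] - before.get(ne, 0) for ne in after}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return [ne for ne, delta in ranked[:n] if delta > 0]


if __name__ == "__main__":
    # Hypothetical drop counts per NodeB under one degraded RNC.
    before = {"NodeB_A": 10, "NodeB_B": 5, "NodeB_C": 7}
    after = {"NodeB_A": 12, "NodeB_B": 50, "NodeB_C": 7, "NodeB_D": 9}
    print(child_level("RNC"), top_n_degraded(before, after, n=2))
```

Ranking on the absolute increase favours high-traffic elements; ranking on the change in failure rate would be an equally reasonable choice when traffic volumes differ widely.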

Page 16: KPI Investigation Methodology Process UA7

First Level Investigation

Parameter check [2/2]

Method to compute all the differences between 2 Snapshots with WPS tool:

1. Import in WPS the earliest available Snapshot after the observed KPI degradation. Use the “Replace initial snapshot and discard existing workorders” option.

2. Under the same WPS application, import on top of the previously loaded Snapshot the latest available Snapshot before the observed KPI degradation. Use the “Resynchronize the planned configuration with new snapshot” option.

3. Under the Workorders tab you will see the delta workorder to transition from old to new Snapshot, showing all the differences. You can analyze the computed differences here, or right-click on the table to export them as an .html file and analyze them in Excel (recommended option).
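To make clear what such a Snapshot delta contains, the comparison can be sketched outside WPS. This is not how WPS computes its workorder; it is a minimal illustration, assuming each Snapshot has already been flattened into a dictionary keyed by (object identifier, parameter name). The function name `snapshot_delta` and the example objects and parameters are hypothetical.

```python
def snapshot_delta(old, new):
    """Return the parameters that differ between two flattened Snapshots.

    old / new: dict mapping (object_id, parameter_name) -> value
               (assumed flattened form of the Snapshot content).
    Each result row is (key, old_value, new_value); None marks a
    parameter that is absent from that Snapshot.
    """
    delta = []
    for key in sorted(set(old) | set(new)):
        old_value, new_value = old.get(key), new.get(key)
        if old_value != new_value:
            delta.append((key, old_value, new_value))
    return delta


if __name__ == "__main__":
    # Hypothetical content: latest Snapshot before the degradation ...
    before = {("Cell1", "paramA"): "10", ("Cell1", "paramB"): "on"}
    # ... and earliest Snapshot after it.
    after = {("Cell1", "paramA"): "12", ("Cell1", "paramB"): "on",
             ("Cell2", "paramA"): "10"}
    for (obj, param), old_value, new_value in snapshot_delta(before, after):
        print(obj, param, old_value, "->", new_value)
```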

Page 17: KPI Investigation Methodology Process UA7

First Level Investigation

Call Trace [1/2]

CTB/CTg Call Trace logs:

In order to collect the relevant data in CTg logs while capturing a sufficiently high number of failures, the CTg session can be configured to capture only a few specific call types.

CFT Call Trace logs:

Since the UA7.1 release, Call Failure Trace (CFT) logs have proven to be a powerful additional type of Call Trace for KPI investigation. A CFT record is captured each time an RRC Connection failure, RAB Assignment failure or RAB Drop occurs.

Unlike CTg logs, CFT logs do not contain the whole call flow but are basically a detailed snapshot of the call at failure time; CFT logs are typically smaller than CTg logs. In a typical live environment, a CFT session will capture the great majority of call failures, whereas a CTg session usually captures only a relatively small part of the total number of calls.

A CFT session will attempt to capture all call failures, regardless of the call types specified in the CTg Call Trace template. CFT is particularly powerful for Multi-RAB types of KPI degradations.

CFT logs should be collected each time a CTg session is launched. This can be done by making sure to use a CTg Call Trace template that includes the record called “UeCallFailureFull”.

CFT logs should always be made available together with the CTg logs under the log storage location (e.g. in a folder named “CallTrace”).

Page 18: KPI Investigation Methodology Process UA7

First Level Investigation

- WPS snapshot delta (e.g. in .html format) and 2 snapshots (latest snapshot before the degradation, earliest snapshot after the degradation) in folder “Snapshots”

- Output of RNC Log Collection in folder “RNC logs”

- Output of NodeB Log Collection (at least 1 BTS) in folder “NodeB logs”

- CTg traces in folder “CallTrace”

- Call Failure Trace data in folder “CallTrace”
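Before the AR is opened, the presence of these folders can be verified with a small script. The following is a minimal sketch, assuming the logs are gathered under one local root directory that uses the folder names listed above; the function name `missing_log_folders` and the example path are hypothetical.

```python
from pathlib import Path

# Folder names taken from the list above.
EXPECTED_FOLDERS = ["Snapshots", "RNC logs", "NodeB logs", "CallTrace"]


def missing_log_folders(root):
    """Return the expected folders that are absent or empty under root."""
    root = Path(root)
    missing = []
    for name in EXPECTED_FOLDERS:
        folder = root / name
        # An existing but empty folder is reported as missing too.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "ar_logs" is a hypothetical local directory holding the collected logs.
    for name in missing_log_folders("ar_logs"):
        print("Missing or empty folder:", name)
```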

Page 19: KPI Investigation Methodology Process UA7

4. Deep Investigation

Page 20: KPI Investigation Methodology Process UA7

Deep Investigation

Overview

Customer feedback, Daily monitoring

KPI degradation

First Level Investigation

First Level Analysis

- Alarms and Configuration

- Before considering going deeper, check the “Release Notes” applicable to current SW load to verify if issue is a known issue.

Deep Investigation (Optimization/Support/GPS troubleshooting experts)

- CS CDR Troubleshooting

- PS R99 CDR Troubleshooting

- HSxPA CDR Troubleshooting

- Mobility Troubleshooting

- Accessibility Troubleshooting

- Throughput Troubleshooting

Page 21: KPI Investigation Methodology Process UA7

Deep Investigation

Counter-based in-depth analysis

Deeper counter-based analysis is needed in order to characterize the degradation. The aim is:

To narrow down as much as possible where the degradation occurs (product, board, procedure)

To identify a common profile of TopN most impacted cells (HW, configuration, traffic profile, etc)

The analysis is done with:

Counter correlation => attempt to correlate the general counter showing the degradation and a counter for a specific lower level procedure.

Comparison of distribution of failure types (e.g. which cause of drop has increased the most)

Counter-based analysis is also used:

To quantify patterns identified by CTg analysis

To assess the KPI recovery after fix delivery
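Both analysis techniques, counter correlation and comparison of failure-type distributions, can be sketched in plain Python. This is only an illustration, assuming the counters have been exported as equally sampled series and per-cause totals; the function names, cause labels and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally sampled counter series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


def cause_share_shift(before, after):
    """Change in each failure cause's share of all failures, largest first.

    before / after: dict mapping failure cause -> failure count.
    """
    total_before, total_after = sum(before.values()), sum(after.values())
    shifts = {}
    for cause in set(before) | set(after):
        share_before = before.get(cause, 0) / total_before if total_before else 0.0
        share_after = after.get(cause, 0) / total_after if total_after else 0.0
        shifts[cause] = share_after - share_before
    return sorted(shifts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical daily values: general drop counter vs. a lower-level counter.
    general = [20, 22, 21, 40, 45, 43]
    specific = [5, 6, 5, 24, 29, 27]
    print("correlation:", round(pearson(general, specific), 3))
    print(cause_share_shift({"cause_1": 50, "cause_2": 50},
                            {"cause_1": 50, "cause_2": 150}))
```

A correlation close to 1 between the general counter and a procedure-specific counter suggests, but does not prove, that the procedure is where the degradation occurs; the share shift shows which failure cause grew the most relative to the others.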

Page 22: KPI Investigation Methodology Process UA7

Deep Investigation

CS CDR Troubleshooting cookbook

CS CDR troubleshooting

- Analysis of relevant counters for CS CDR degradation

- Launch CTg+CFT session on TopN cells

- Analysis of CTg traces

- Analysis of CFT data

- Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)

- Consolidate global analysis (using all tools outputs)
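Consolidating CTg, CFT and NPO findings by cause of drop, drop scenario and IMSI amounts to counting failure records along each of these dimensions. The following is a minimal sketch, assuming the failures have already been extracted from the traces into simple dictionaries; the record fields, cause labels and IMSI values are hypothetical and do not reflect the actual CFT or CTg record formats.

```python
from collections import Counter


def summarize_drops(records, key):
    """Count failure records per value of one field, most frequent first.

    records: list of dicts, one per dropped call (assumed pre-extracted
             from the traces).
    key:     field to group by, e.g. "cause", "scenario" or "imsi".
    """
    return Counter(record[key] for record in records).most_common()


if __name__ == "__main__":
    # Hypothetical drop records.
    drops = [
        {"cause": "cause_1", "scenario": "scenario_A", "imsi": "001"},
        {"cause": "cause_1", "scenario": "scenario_B", "imsi": "002"},
        {"cause": "cause_2", "scenario": "scenario_A", "imsi": "001"},
    ]
    for field in ("cause", "scenario", "imsi"):
        print(field, summarize_drops(drops, field))
```

A single IMSI or a single scenario dominating the counts points to a terminal-specific or procedure-specific problem rather than a general degradation.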

Page 23: KPI Investigation Methodology Process UA7

Deep Investigation

PS R99 Troubleshooting cookbook

PS R99 CDR troubleshooting

- Analysis of relevant counters for PS CDR degradation

- Launch CTg+CFT session on TopN cells

- Analysis of CTg traces

- Analysis of CFT data

- Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)

- Consolidate global analysis (using all tools outputs)

Page 24: KPI Investigation Methodology Process UA7

Deep Investigation

HSxPA CDR Troubleshooting cookbook

HSxPA CDR troubleshooting

- Analysis of relevant counters for HSxPA CDR degradation

- Launch CTg+CFT session on TopN cells

- Analysis of CTg traces

- Analysis of CFT data

- Consolidate CTg + CFT + NPO analysis (cause of drops, drop scenario, IMSI)

- If needed, perform Drive tests + UE traces

- Consolidate global analysis (using all tools outputs)

Page 25: KPI Investigation Methodology Process UA7

Deep Investigation

Mobility troubleshooting cookbook

Mobility troubleshooting

- Analysis of relevant counters for Mobility degradation

- Analysis of CTg traces

- Launch CTn session on TopN cells

- Perform WQA CTn analysis

- Consolidate CTg + NPO analysis

- If needed, perform Drive tests + UE traces

- Consolidate global analysis (using all tools outputs)

Page 26: KPI Investigation Methodology Process UA7

Deep Investigation

Accessibility troubleshooting cookbook

Accessibility troubleshooting

- Analysis of relevant counters for Accessibility degradation

- Launch CTg+CFT session on TopN cells

- Analysis of CTg traces

- Analysis of CFT data

- Consolidate CTg + CFT + NPO analysis

- In case of RRC degradation: perform drive-tests (UE traces)

- Consolidate global analysis (using all tools outputs)

Page 27: KPI Investigation Methodology Process UA7

www.alcatel-lucent.com