Advanced BSOD Troubleshooting

  • 7/30/2019 Advanced BSOD Troubleshooting

    Dell Confidential

    Understanding and Troubleshooting Blue Screen Errors

    Classroom Deck - XPS

    Expert Tools

    Dell Confidential - DRAFT

    Class Outline

    1. Understanding a Crash

    2. Tools You Can Use

    3. Sysinternals Suite of Tools

    Why Does Windows Crash?

    Microsoft's analysis of crash root causes indicates:

    ~70% caused by third-party driver code

    ~15% caused by unknown (memory is too corrupted to tell)

    ~10% caused by hardware issues

    ~5% caused by Microsoft code

    There are lots of third-party drivers!

    From the online crash analysis database:

    55,000 unique drivers, with 24 new per day (28,000 in 2004)

    220,000 total drivers, with 98 revised per day (130,000 in 2004)

    Many Devices

    Over 1,263,300 distinct Plug and Play (PnP) IDs (680,000 in 2004)

    1,600 PnP IDs added every day

    Random Reboots: IMPORTANT NOTE

    Your customers may be experiencing a Blue Screen Error but not know it, as the system restarts too quickly for them to see it

    Before troubleshooting the Random Reboot symptom, be sure to follow these directions:

    Open My Computer properties

    Click the Advanced tab, then click Settings under Startup and Recovery

    Ensure the box Automatically Restart is NOT CHECKED

    This will ensure that the customer has time to alert you that a BSOD has just occurred instead of an instant reboot

    You do not want to confuse random reboots with Blue Screen errors, as they are often troubleshot differently
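
    The Automatically Restart checkbox described above is stored in the registry as the AutoReboot value under the CrashControl key. As a rough sketch only, the setting could be read (not changed) like this; the function and constant names are made up for illustration, and the code simply returns None when it is not run on Windows or the value cannot be read.

```python
import sys

# Registry location behind the "Automatically restart" checkbox.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"
AUTO_REBOOT_VALUE = "AutoReboot"


def automatic_restart_enabled():
    """Return True or False for the Automatically Restart setting, or None if unknown."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg  # imported here so the module still loads on other platforms

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, AUTO_REBOOT_VALUE)
    except OSError:
        return None  # key or value missing, or access denied
    return bool(value)


print("Automatic restart enabled:", automatic_restart_enabled())
```

    A result of True means the system will still reboot instantly after a Blue Screen, so the checkbox steps above have not been completed yet.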

    Physical Memory Dumps

    Most Blue Screen error messages are actually requested by a driver or application to prevent the system from locking up or losing more data

    Physical Memory Dumps are an attempt by Windows XP to store the conditions that caused the system to give the error

    This data can be very useful for you to determine what driver or application most likely called for the crash

    Once you know what direction to start looking, there are tools available to help you resolve this

    Notice that the previous Blue Screen already told us it was called by the file ATI2Dvag.dll; sometimes we're not that lucky

    What is the issue?

    The initial step in the troubleshooting process is finding what is or is not occurring in the system.

    Once you make contact with the customer you must determine the problem that is being encountered

    Preferred: Connect to the customer's system using DellConnect and witness it first hand

    If Internet or other issues are keeping DellConnect from working, ask the customer for as many details as possible

    With this information in hand, we are able to perform three basic steps to get started:

    Run MSCONFIG to disable any unnecessary startup items and services

    Open the Event Viewer to check for a history of these issues (covered on the next slide)

    Check the Device Manager for any drivers with a yellow (!) or red (x)

    What is the issue: Opening the Event Viewer

    The Event Viewer contains a historical record of all events, errors and crashes on the customer's system

    IMPORTANT: This tool is most important when the customer can only remember the date/time of the error and not the specifics of the BSOD error

    It is accessible by:

    Open the Control Panel (switch to Classic View)

    Select Administrative Tools and click Event Viewer

    You will begin your search by looking under both the Application and System sections

    What is the issue: Using the Event Viewer

    The Event Viewer will always contain much more information than is necessary for our purposes

    Procedure:

    Ask the customer what time and date the error last occurred

    Click Application first

    Look for any red and white X that says ERROR next to it and that occurred on or near the date/time the customer stated

    Double click that item and use DellConnect or the customer to get the information in the gray box

    Repeat this process with the System section until you feel you have the information necessary
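
    The procedure above amounts to filtering the log for Error entries close to the time the customer reported. A minimal sketch of that filter follows; the record layout (time, level, source) and the sample entries are hypothetical stand-ins for whatever you copy out of the Event Viewer, not an actual Event Log API.

```python
from datetime import datetime, timedelta


def errors_near(events, reported, window=timedelta(hours=1)):
    """Return Error-level events within `window` of the reported time, oldest first."""
    matches = [
        event
        for event in events
        if event["level"] == "Error" and abs(event["time"] - reported) <= window
    ]
    return sorted(matches, key=lambda event: event["time"])


# Hypothetical records, as they might be noted down from the Application log.
sample_events = [
    {"time": datetime(2006, 8, 11, 9, 2), "level": "Information", "source": "MsiInstaller"},
    {"time": datetime(2006, 8, 11, 15, 12), "level": "Error", "source": "Application Hang"},
    {"time": datetime(2006, 8, 12, 8, 30), "level": "Error", "source": "Application Error"},
]

nearby = errors_near(sample_events, datetime(2006, 8, 11, 15, 15))
print([event["source"] for event in nearby])
```

    Widening the window (for example to a full day) is a reasonable next step when the customer is unsure of the exact time.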

    What is the issue: Event Log sample

    Scenario: The customer tells you that the system hung on the 11th around 3:15pm when he tried to do something; he can't remember what, but the system is working fine now. This customer wants to know why his system is hanging

    Question 1: Why is this a good time to use the Event Viewer?

    Upon opening the Event Viewer, you see this:

    Question 2: Which items should we double click?

    What is the issue: Event Log sample

    As the Application Hang items seem most relevant to our issue of a frozen computer, you double click one and see this:

    QUESTION 1: What application should we begin troubleshooting?

    QUESTION 2: If you are unsure what file caused this hang, what tools can you use to determine what program this file belongs to?

    NOTE: Blue Screens and other errors will still have the red x notification.

    When did the issue start?

    There are two key reasons to ask the customer "When did this issue start?"

    A customer may actually tell you what is causing the error message

    Example:

    Customer: It started right after I got my DSL installed, actually

    You: What model is your DSL modem, and does it use USB or Networking?

    Customer: Oh, I connected both

    Question for the class: Why was this discussion useful?

    Over time the availability of system restore points decreases

    System Restore Points may become corrupted or deleted

    Example:

    Customer: This has been happening on and off for six months now, figured I'd call in

    You: Do you remember any changes you made during that timeframe?

    Customer: Not really

    You: What do you have on your system?

    Customer: The tower, monitor, and my USB Camera

    You and the customer try a system restore, but there are no available restore points

    Question for the class: What tool can you use here to get more information about this error?

    When does the problem occur?

    Many issues require a specific circumstance to occur.

    In order to troubleshoot effectively you need to be able to consistently re-create the issue

    Use your available Lab

    Create the issue when using DellConnect

    Search Groups.google.com and other online resources for others having the same issue

    Many problems will only occur on the Internet, with a particular browser or only with select web pages.

    The important step is taking ownership and attempting to duplicate the issue.

    EXAMPLE: I recently troubleshot an issue with Outlook Express.

    PROBLEM: A map attachment in an E-Mail was selected to print. The Photo Printing Wizard opened normally; you click Next and are offered an option to select the picture you want to print. At this point the customer was offered several pornographic photos to print in the Picture Selection window.

    DISCUSSION: DellConnect was used to confirm the issue was as described by the customer. Multiple searches had been made of the hard drive without any success identifying the problem files.

    RESOLUTION: After confirming the steps I had previously recommended, I configured a system in the lab using the same application and created a couple of emails with attached photos. When I printed a photo, I discovered that the Wizard scanned the temporary internet files for photos. Deleting the temporary internet files and cookies from Internet Options removed the undesirable files from the Photo Printing Wizard, and the customer was off to discuss web viewing habits with his son.

    The moral of this story is that by re-creating the issue on another system we were able to validate what was actually happening, not what appeared to be happening.

    What changes were made prior to this happening?

    Everyone makes changes to their systems, often several times a day. What is different between when the system was running normally and the current situation?

    Changes most likely to cause OS Errors

    Was hardware installed prior to the problem?

    New printer

    USB keyboard, mouse, game pad, joystick, photo reader

    Cameras and their associated software

    Was a new driver installed?

    Use the Driver Rollback feature in XP

    Uninstall the driver, software and hardware added

    Was a Windows Update installed?

    Check Add/Remove programs for recent XP Updates

    Attempt to perform a system restore to a point before the last update

    Was a program installed?

    Are there new items in Add/Remove Programs?

    Are two versions of the same application installed in Add/Remove Programs?

    Changes are not limited to new software or hardware

    The customer switched to a new Internet Service Provider

    What has already been done to try to resolve the issue?

    This step is important for two reasons:

    Efficiency: Do not repeat steps already tried

    Results: Previous attempts to fix the system that were NOT successful can tell

    you what IS NOT the issue

    Example: A customer receives a Blue Screen error when attempting to play one game; however, they have already tried patching the game, updating video drivers and reinstalling the game.

    Next Step: Since it does not appear to be the game or video drivers, you can safely skip those steps and start troubleshooting using MSCONFIG, Event Viewer, updating the sound card drivers and running DXDiag.

    Resources to check on previous steps:

    Put the customer on hold if necessary and read all relevant case logs

    Ask the customer what specific steps they have performed and their results

    Use DellConnect to see if items such as MSCONFIG or Anti-Virus have been used or installed by the customer

    What type of Internet Connection is present?

    As stated earlier, this information is mainly useful for knowing what tools are available

    DellConnect only operates properly using high speed DSL/Cable Internet

    Dial-up connections limit your ability to download updated drivers, the Win Debug Tool and

    SysInternals tools (covered later)

    High Speed connections have a higher likelihood of spyware or virus infestation

    Question: Why is this?

    High Speed connections make it easier to update Anti-Virus and Anti-Spyware applications

    Naturally it is easier to troubleshoot a system using DellConnect on a high speed connection than relying on the customer's interpretation of the error

    Fundamental Troubleshooting

    There are three basic scenarios for OS Errors or crashes

    During boot

    Remove ALL peripherals except for keyboard, mouse and monitor

    Press F8 between the BIOS and XP Screens, select Last Known Good Configuration

    Press F8 between the BIOS and XP Screens, select Safe Mode with Networking

    If it still crashes: Potentially a memory, hard drive or video card failure; use Dell Diags

    If it boots: Open the Event Viewer in Safe Mode to determine what was causing the error, as well as running MSCONFIG and uninstalling all third party software/drivers

    Upon using a device or opening an application

    Recreate the issue and record the information in the Blue Screen Error

    Research DSN, Support.Microsoft.com and www.google.com

    Update the driver or application

    Run MSCONFIG to remove potential conflicting applications

    Visit the website of the manufacturer to check for known issues similar to yours

    Verify it is compatible with the customer's OS and system configuration, such as compatible sound cards

    Randomly

    Ask the customer for the approximate time/date it happened

    Open the Event Viewer, and check both Application and System sections for an Error that fits the customer's description

    Advanced Troubleshooting: Windows Debugger

    Everyone in this class should be familiar with the Online Crash Analysis (OCA) tool, as it is part of the BSOD Tree in DSN

    Therefore we will skip this tool and use the more advanced Windows Debugger application for cases where you were not able to resolve the issue with the OCA

    Windows Debugger is a tool provided by Microsoft for software developers and for advanced troubleshooting of issues (bugs)

    Instructions are located on the next slide

    Advanced Troubleshooting: Install/Setup of Debugger

    The Windows Debugger is not included by default on Windows XP

    You must visit this location to download it ONTO THE CUSTOMER'S COMPUTER

    http://www.microsoft.com/whdc/devtools/debugging/default.mspx

    Ensure that you select the version appropriate for the system, as there are 32-bit and 64-bit versions

    Click either of the links in the middle of the page starting with "Install Debugging Tools for Windows"

    On the next page, choose the latest version available

    NOTE: The file size is approximately 15MB, so be prepared to wait

    Run the installer application

    Click Start / All Programs / Debugging Tools for Windows / WinDBG

    Click File / Symbol File Path and type "srv*c:\symbols*http://msdl.microsoft.com/download/symbols" without the quotes
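
    The symbol path typed in the last step has three parts separated by asterisks: the keyword srv, a local cache folder, and the symbol server URL. The small sketch below just assembles that string so the parts are easier to see; the function name is made up for illustration and the cache folder is whatever you choose.

```python
MICROSOFT_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir, server=MICROSOFT_SYMBOL_SERVER):
    """Build a symbol path of the form srv*<local cache folder>*<server URL>."""
    return "srv*{0}*{1}".format(cache_dir, server)


print(symbol_path(r"c:\symbols"))
```

    The debugger downloads symbols from the server on first use and keeps them in the cache folder, which is why the first analysis on a customer's system can be slow.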

    Advanced Troubleshooting: Using Debugger

    Every time Windows crashes with a Blue Screen, it will attempt to save a dump file into C:\Windows\minidump

    These dump files are named by the date and the order of crash. For example, Mini080306-1.dmp

    08 = Month of August

    03 = 3rd day

    06 = 2006

    -1 = First crash of this particular day

    In the debugger

    File > Open Crash Dump

    Browse to C:\Windows\minidump

    Open the file closest to the date the customer claims the system crashed

    Wait for the application to run

    Note the file listed after Probably caused by:
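
    The naming scheme above (MMDDYY followed by the crash order for that day) can be decoded mechanically, which helps when a minidump folder holds many files. The sketch below parses a name and picks the dump closest to the date the customer gives; the function names are illustrative, and it assumes two-digit years mean 20YY, as in the 2006 example.

```python
import re
from datetime import date

# Names look like Mini080306-1.dmp: MMDDYY, then the crash order for that day.
MINIDUMP_NAME = re.compile(r"^Mini(\d{2})(\d{2})(\d{2})-(\d+)\.dmp$", re.IGNORECASE)


def parse_minidump_name(filename):
    """Return (crash_date, sequence) decoded from a minidump file name."""
    match = MINIDUMP_NAME.match(filename)
    if match is None:
        raise ValueError("not a MiniMMDDYY-N.dmp name: %r" % filename)
    month, day, year, sequence = (int(part) for part in match.groups())
    # Assumption: two-digit years are 20YY, as in the 2006 example.
    return date(2000 + year, month, day), sequence


def closest_dump(filenames, reported_date):
    """Pick the dump whose date is nearest to the date the customer reported."""
    return min(
        filenames,
        key=lambda name: abs((parse_minidump_name(name)[0] - reported_date).days),
    )


dumps = ["Mini080306-1.dmp", "Mini081506-1.dmp"]
print(parse_minidump_name(dumps[0]))
print(closest_dump(dumps, date(2006, 8, 14)))
```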

    Advanced Troubleshooting: Using !analyze -v

    From the previous slide, the debugger said:

    Probably caused by: bcmw15.sys

    A Google search for the bcmw15.sys file is not very conclusive

    Spend 5 minutes checking various site results

    It appears to be related to a wireless driver in the TCP/IP stack

    Therefore perform the more advanced debugging by typing !analyze -v into the command line at the bottom of the debugger

    This provides a LOT more information

    However, we're looking at one field: Process_Name

    This field tells us what process/program asked for the Blue Screen

    The answer is ccEvtMgr, which we know to be Norton Anti-Virus

    A Google search for ccEvtMgr gives more results

    Conclusion:

    We now know that the Norton ccEvtMgr application caused a Blue Screen error due to bcmw15.sys
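
    Because the analysis output is long, it can help to pull out only the two fields discussed above. The following is a rough sketch that searches pasted debugger text for the "Probably caused by" and PROCESS_NAME lines; the function name is made up, and the sample text is an invented excerpt shaped like those two lines, not real debugger output.

```python
import re


def summarize_analysis(text):
    """Extract the suspected module and the process name from analysis output."""
    patterns = {
        "probably_caused_by": r"^Probably caused by\s*:\s*(\S+)",
        "process_name": r"^PROCESS_NAME\s*:\s*(\S+)",
    }
    summary = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, text, re.MULTILINE | re.IGNORECASE)
        summary[field] = match.group(1) if match else None
    return summary


# Invented excerpt for illustration only.
sample = """\
PROCESS_NAME:  ccEvtMgr.exe
Probably caused by : bcmw15.sys
"""

print(summarize_analysis(sample))
```

    A field that is missing from the text comes back as None, which is a cue to read the full output by hand rather than trust the summary.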

    Let's perform a lab to test this ourselves

    Questions?

    Up until now we have only discussed the preparation and questions necessary to begin the debug process

    The next few slides will outline labs that we will guide you through in order to actually USE the debugger with sample crashes

    Driver Verifier

    Taking it a step further: Driver Verifier

    Driver Verifier: What is it?

    WHAT: Driver Verifier is a tool created by Microsoft for hardware creators to test their drivers, determining that piece of software's ability to operate reliably

    WHEN TO USE: Use this tool when Debug fails to help you for either of these reasons:

    The BSOD errors are different every time

    The Debugger tool lists a Windows file (such as NTOSKRNL.EXE) as the cause

    HOW TO USE: Run the Verifier tool to force the system to crash, and then run the WinDBG program on the latest dump file

    IMPORTANT NOTE: It is very likely this tool will cause the system to crash. This is on purpose, so the tool can determine which driver caused the crash

    Driver Verifier: How to start

    How to run:

    Click Start / Run

    Type Verifier

    Select Create Custom Settings

    Click Next

    Driver Verifier - Setup

    If you are unsure what driver caused the crash:

    Select Individual settings from a full list, click Next

    Check all options EXCEPT Low resources simulation, click Next

    Choose Automatically Select Unsigned drivers, click Next

    Click Finish and reboot

    See if the system Blue Screens

    If you know what driver caused the crash:

    Select Individual settings from a full list, click Next

    Check all options EXCEPT Low resources simulation, click Next

    Choose Select Driver Names from a list, click Next

    Scroll through the list and check the driver you suspect, click Next

    Click Finish and reboot

    See if the system Blue Screens

    Driver Verifier: How to use it

    Run Driver Verifier until you can force the system to Blue Screen

    Either using All Unsigned Drivers or selecting one manually from a list

    Reboot after the Blue Screen

    Disable Driver Verifier

    Open Verifier.exe from the command line

    Choose Delete Existing Settings and click Finish

    Reboot to finalize settings

    Open the WinDBG Application

    Load the latest dump file

    Perform your analysis again
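
    Verifier.exe also accepts command-line switches, which can be easier to dictate to a customer than the wizard. The sketch below only builds the argument lists and does not run anything; the helper names are made up, example.sys is a placeholder driver name, and the /standard switch turns on Microsoft's standard set of checks, which is an approximation of (not necessarily identical to) the checkbox selection described in the slides.

```python
def verifier_enable_args(driver_names):
    """Arguments to turn on the standard checks for the named drivers."""
    if not driver_names:
        raise ValueError("name at least one driver, for example ['example.sys']")
    return ["verifier", "/standard", "/driver"] + list(driver_names)


def verifier_reset_args():
    """Arguments to clear all Driver Verifier settings (takes effect after a reboot)."""
    return ["verifier", "/reset"]


# Printed for the technician to read out or paste; nothing is executed here.
print(" ".join(verifier_enable_args(["example.sys"])))
print(" ".join(verifier_reset_args()))
```

    Either command still requires a reboot before the change applies, matching the reboot steps in the slides.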

    SysInternals - AutoRuns

    WHAT IS IT: AutoRuns is a tool developed by Sysinternals (now owned by Microsoft), available at www.sysinternals.com, that allows you to enable/disable ANY startup items

    Many items are not listed in MSCONFIG

    Use this when you absolutely cannot get rid of a startup application or service causing your Blue Screen errors

    WHERE IS IT: http://download.sysinternals.com/Files/Autoruns.zip

    HOW TO USE IT:

    Download and run the application

    Allow time for it to scan

    Click Options / Verify Digital Signatures, and then Options / Hide Microsoft entries

    Click the refresh button to renew the list

    Uncheck any items you feel may be causing the errors

    Autoruns

    Dell Crash Analysis Tool

    What is the Dell Crash Analysis Tool (CAT)?

    CAT scans systems for suspect drivers and suggests a list of driver files that may need to be updated, repaired, or replaced

    For what kind of errors can I use CAT?

    The Crash Analysis Tool (CAT) helps determine why you are receiving either a blue screen error or a system crash error
