Visualizing Real-Time Errors and Performance Anomalies
2015 Embedded Systems Conference – Silicon Valley
© 2015 – Dave Stewart

Dave Stewart, PhD
Sr. Principal Software Architect – Physio-Control, Inc.
[email protected] ¤ http://davestewart.info
Visualizing Real-Time Errors and Performance Anomalies
• Troubleshooting Tough Bugs using a Logic Analyzer
  • Print Debug Macros
  • Logic Analyzer Debug Macros
• Tracing Code using Logic Analyzer debug macros
  • Displaying Variable Data on the Logic Analyzer
• Visualizing Real-Time Execution
  • Focus on Anomalies
  • Performance Issues
  • Clock or Synchronization Errors
  • Troubleshooting Rare Glitches
• Setup Extras
  • Analog Signals coupled with Digital Debug
  • Debug Clock Bit
  • Serial Protocols
Troubleshooting Tough Bugs using a Logic Analyzer
What are The Tough Bugs to Debug in Real-Time Systems?
• Glitches
• Timing and Synchronization Problems
• Driver Errors
• Misbehaving Interrupts
• Memory Corruption
• Priority Inversion
• Performance Issues
• Hardware Errors
Limitations of Traditional Debugging
• Print Statements
  • There is no console, so Print Statements don’t work
  • Print Statements are too slow thus provide insufficient information
  • Print function significantly affects real-time performance
  • Writing debug output to a serial port changes the timing too much
  • Adding print statements changes program behavior
  • Can’t measure performance at a fine granularity
  • Max 50 to 100 print statements per second
  • The code crashes, but there is insufficient feedback as to where
  • Can’t see the “integrated picture” at a glance
Limitations of Traditional Debugging
• Symbolic Debuggers (e.g. IDEs, via JTAG, ActiveSync, or other comm link)
  • A symbolic debugger or emulator is not available
  • Stepping through the code makes the program behave differently
  • Breakpoints will “break” real-time performance
  • There is real I/O, so it doesn’t work
  • Debugger doesn’t deal with interrupts properly
  • There might be a race condition or other synchronization problem
  • Can’t see the “integrated picture” at a glance
Solution: Use a Logic Analyzer for Real-Time Issues
• Both broad and detailed view of your code
  • Go from the “big picture” to microsecond view of software in seconds
  • Easily 50,000 debug data points per second
  • Info is time-stamped for timing assessment
• Real-Time
  • Can use it for interrupts and I/O drivers
  • Impact on real-time execution is negligible
  • Identify temporal relationships among tasks
  • Monitor interrupts and how they may affect execution
• Visualization of Performance and Anomalies
  • Quickly spot anomalies and different patterns of execution
  • Obtain sufficient proof that a problem is hardware, not software
  • Fine-grain timing measurements to identify performance culprits
Logic Analyzer for Debugging - Should I use it all the time?
• Using a Logic Analyzer is NOT easy
  • Use Print Statements and Symbolic Debuggers to solve easy and non-real-time problems first
  • Add to your repertoire of available tools to solve hard problems
• Solve functional problems using print statements.
  • If necessary, run the functions on the desktop, and debug them there.
  • Only move to embedded environment when it is working well.
• Use Symbolic Debuggers mainly for
  • Tracing through code that fails in a consistent manner
  • Post-Mortem debugging of crashes, to view all variables
Logic Analyzer Features
• Low-End Logic Analyzer features are sufficient
  • Only need about 16 Channels for techniques shown in this presentation
  • Anything faster than 20 MHz is likely fine. Possibly need up to 100 MHz if debugging SPI
• Except! Large memory depth
  • In Mega-Samples, not Kilo-Samples. This is usually found in high-end analyzers.
• Multiple Views
  • Timing Diagrams and State Listings (standard on pretty much any logic analyzer)
  • Decoding of Serial Protocols (not standard)
  • Useful features: Search, Filtering, and Triggering (capabilities vary tremendously)
• High-Speed Interface to a PC
  • Built-in PC, USB Memory Stick, Ethernet all OK
Logic Analyzer Features
• Low cost USB Logic Analyzer Pods are OK
• A $25,000 logic analyzer is nice to have and will have many extras that could be useful
  • Examples in this presentation use the Tektronix TLA700
• What is discussed in this talk can be accomplished with a $400 USB pod.
  • Examples in this presentation use LogicPort or DigiView
Sample Logic Analyzer Output - Timing Diagram
• This output format will be used for examples in this presentation.
• More details on interpreting this diagram to follow.
Sample Logic Analyzer Output - State Listing
• Primarily useful for automating analysis. Not covered in this presentation, thus we won’t use this view in examples.
Target Setup: Interconnections
• Black = Required, Blue = Optional
[Diagram: Logic Analyzer connected to the Real-Time System Under Test. Connections shown: DEBUG_D0 .. D7, DEBUG_CLK, RX/TX, INTR, SPI, Keys, plus analog channels Ch.1 and Ch.2.]
An Aside - Print Statement Debugging
• Forms the basis for logic analyzer methods
  • “Printing” to logic analyzer uses same debug concept as print statements
  • Abstract the logic analyzer prints using same method as debug prints
• First, an example of using macros for Print Statement Debugging
Print Statement Debugging 101
    myfunc() {
        /* code here */
        printf("I got here\n");
        /* more code here */
        printf("Going to call yourfunc()\n");
        result = yourfunc();
        printf("My result is %d\n", result);
        /* etc */
    }

Output:
    I got here
    Going to call yourfunc()
    My result is 384
Print Statement 101 Debugging: Problems
• A lot of typing
• Minimal information per statement
• Hard to separate between debug and normal print statements
• Prone to errors
• Cannot easily disable them
• Ultimately very inefficient
Debug Macros instead of Print Statements - DEBUG_WHERE()
    #define DEBUG_WHERE() \
        fprintf(stderr, \
            "[%s:%u-%s]\n", \
            __FILE__, __LINE__, __FUNCTION__)
Print Statement Debugging 101
    myfunc() {
        /* code here */
        DEBUG_WHERE();
        /* more code here */
        DEBUG_WHERE();
        result = yourfunc();
        DEBUG_INT(result);
        /* etc */
    }

Output:
    [myfile.c:3-myfunc]
    [myfile.c:5-myfunc]
    [myfile.c:7-myfunc] result=384
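One of the problems listed earlier is that raw print statements cannot easily be disabled. A minimal sketch of how macros can help, assuming a hypothetical DEBUG_ENABLED build flag and a stand-in yourfunc() that are not part of the original slides:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch: build with -DDEBUG_ENABLED=0 to compile the macros out. */
#ifndef DEBUG_ENABLED
#define DEBUG_ENABLED 1
#endif

#if DEBUG_ENABLED
#define DEBUG_WHERE() \
    fprintf(stderr, "[%s:%u-%s]\n", __FILE__, (unsigned)__LINE__, __func__)
#define DEBUG_INT(_var) \
    fprintf(stderr, "[%s:%u-%s] %s=%d\n", \
            __FILE__, (unsigned)__LINE__, __func__, #_var, (int)(_var))
#else
#define DEBUG_WHERE()   ((void)0)
#define DEBUG_INT(_var) ((void)0)
#endif

/* Stand-in for the slide's yourfunc(); 384 matches the sample output. */
static int yourfunc(void) { return 384; }

static int myfunc(void)
{
    int result;
    DEBUG_WHERE();
    result = yourfunc();
    DEBUG_INT(result);
    return result;
}
```

With the flag set to 0 the calls stay in the source but generate no code, so the same file serves both debug and release builds.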
Debug Macros instead of Print Statements - DEBUG_INT()
    #define DEBUG_INT(_var) \
        fprintf(stderr, \
            "[%s:%u-%s] %s=%d\n", \
            __FILE__, __LINE__, __FUNCTION__, \
            #_var, _var)

    result = 384;
    DEBUG_INT(result);

Output:
    [myfile.c:7-myfunc] result=384
Debug Macros instead of Print Statements - DEBUG_HEX8()
• Printing in Hex instead of Decimal is the basis for what a logic analyzer can show.

    #define DEBUG_HEX8(_var) \
        fprintf(stderr, \
            "[%s:%u-%s] %s=0x%02X\n", \
            __FILE__, __LINE__, __FUNCTION__, \
            #_var, _var)

    result = 0xA4;
    DEBUG_HEX8(result);

Output:
    [myfile.c:7-myfunc] result=0xA4
Debug Macros instead of Print Statements - DEBUG_HEX16()
• Printing larger size variables can also be helpful.
    #define DEBUG_HEX16(_var) \
        fprintf(stderr, \
            "[%s:%u-%s] %s=0x%04X\n", \
            __FILE__, __LINE__, __FUNCTION__, \
            #_var, _var)

    result = 0x02A4;
    DEBUG_HEX16(result);

Output:
    [myfile.c:7-myfunc] result=0x02A4
Debug Macros instead of Print Statements - DEBUG_HEX32()
• Printing larger size variables can also be helpful.
    #define DEBUG_HEX32(_var) \
        fprintf(stderr, \
            "[%s:%u-%s] %s=0x%08X\n", \
            __FILE__, __LINE__, __FUNCTION__, \
            #_var, _var)

    result = 0x000352A4;
    DEBUG_HEX32(result);

Output:
    [myfile.c:7-myfunc] result=0x000352A4
Debug Macros for Logic Analyzers
• Similar to debugging with Print Statements, except:
• (disadvantage) More difficult to use
  • Learning curve could be days, not minutes (but it’s worth it for the “hard” bugs that could otherwise take weeks to debug)
• (advantage) 1000 times more information
  • Print Statement: 50 Lines/Data Points per second
  • Logic Analyzer: 50000 Data Points per second is easy
• (advantage) Visualize execution to spot trouble points in seconds
  • Don’t need to page through 1000 pages of debug output
• (advantage) Precise real-time view of the system
  • Data points are time-stamped with microsecond or better resolution
• (advantage) Zoom in/out (like Google maps!)
  • Takes just a moment to switch from looking at a minute of execution to 10 usec
Basics of Logic Analyzer Debugging
• Print “Hex” codes to a logic analyzer. Demonstrated here by example:
• Using a print statement:

    int GetVel(int Pos) {
        int Vel = f(Pos);
        printf("Pos=%04X Vel=%04X\n", Pos, Vel);
        return (Vel);
    }

• Using logic analyzer macros:

    int GetVel(int Pos) {
        int Vel = f(Pos);
        LADEBUG_HEX16(0x40, Pos);
        LADEBUG_HEX16(0x41, Vel);
        return (Vel);
    }
Basics of Logic Analyzer Debugging
    Pos = 0x0063;
    LADEBUG_HEX16(0x40, Pos);

Timing diagram of each bit. We’ll see later the value of these.
Basics of Logic Analyzer Debugging
Spot the code pattern. Each code represents a different variable. Correlate with source code. Variable data is displayed in sequence.

    LADEBUG_HEX16(0x40, Pos);
    LADEBUG_HEX16(0x41, Vel);
    LADEBUG_HEX16(0x42, Acc);
    LADEBUG_HEX16(0x43, Time);

[Diagram labels, in order of appearance on the trace: Pos, Vel, Acc, Time]
Basics of Logic Analyzer Debugging
Timescale is microseconds. Easily add debug statements with minimal intrusion on real-time code.
In contrast, each print statement takes 1+ milliseconds.
1 usec
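The "1+ milliseconds" figure for a print statement can be sanity-checked with simple arithmetic. The sketch below assumes a UART at 115200 baud with 10 bits on the wire per character and a 20-character debug line; none of these numbers come from the slides, and uart_print_time_us() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Time to push n_chars characters through a UART, in microseconds.
 * Assumes 10 bits on the wire per character (start + 8 data + stop). */
static uint32_t uart_print_time_us(uint32_t n_chars, uint32_t baud)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)n_chars * 10u * 1000000u) / baud);
}
```

Under those assumptions a 20-character line occupies the serial link for about 1.7 ms, which is consistent with the slide's estimate and roughly three orders of magnitude slower than a single port write.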
Defining Logic Analyzer Macros
• LADEBUG_HEX8 is a hardware-independent abstraction
• It will generally be necessary to create two platform-dependent macros:
  • LADEBUG_INIT() to initialize the hardware
  • LADEBUG_HEX8() to send an 8-bit code
• Other macros can generally be built upon those in a hardware-independent manner
  • E.g. LADEBUG_HEX16()
  • Examples on following slides
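To illustrate the split between platform-dependent and hardware-independent macros, here is a sketch of a host-side "platform" that records codes in a buffer instead of driving pins. The capture buffer, its size, and demo() are assumptions made for illustration; only the macro names come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for the debug port: instead of driving
 * GPIO pins, each code is appended to a capture buffer. */
#define LA_CAPTURE_MAX 64
static uint8_t la_capture[LA_CAPTURE_MAX];
static size_t  la_count;

/* Platform-dependent pair (here: the simulated platform). */
#define LADEBUG_INIT()      do { la_count = 0; } while (0)
#define LADEBUG_HEX8(_val)  do { \
        if (la_count < LA_CAPTURE_MAX) \
            la_capture[la_count++] = (uint8_t)(_val); \
    } while (0)

/* Hardware-independent macro built only on LADEBUG_HEX8(). */
#define LADEBUG_HEX16(code, _val) do { \
        LADEBUG_HEX8(code); \
        LADEBUG_HEX8((_val) >> 8); \
        LADEBUG_HEX8(_val); \
    } while (0)

static void demo(void)
{
    int Pos = 0x0063;
    LADEBUG_INIT();
    LADEBUG_HEX16(0x40, Pos);   /* captured as 0x40, 0x00, 0x63 */
}
```

Swapping in a different header that redefines only LADEBUG_INIT() and LADEBUG_HEX8() would leave LADEBUG_HEX16() and all instrumented code unchanged.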
LADEBUG Macro Definition – Example MSP430
• Assume we want to use Port 4 of an MSP430

    // MSP430 is a simple memory-mapped port
    // Using P4.0 thru P4.7 for our 8 bits
    static volatile uint8_t *datreg = (volatile uint8_t *) 0x001D;
    static volatile uint8_t *dirreg = (volatile uint8_t *) 0x001E;

    #define LADEBUG_INIT() { \
        *dirreg = 0xFF; }              // Initialize as output

    #define LADEBUG_HEX8(_val) { \
        *datreg = (uint8_t)(_val); }   // output code to P4
LADEBUG_HEX8() Macro with Set/Clear Registers
    // Example, ARM, use bits 3 thru 10 of a GPIO bank
    static volatile uint32_t *setreg = (volatile uint32_t *) 0x40E00020;
    static volatile uint32_t *clrreg = (volatile uint32_t *) 0x40E0002C;
    static volatile uint32_t *dirreg = (volatile uint32_t *) 0x40E00014;

    #define LADEBUG_INIT() { \
        *dirreg |= 0x000007F8; /* set bits for output */ }

    #define LADEBUG_HEX8(_hex8) { \
        uint32_t set32, clr32, h32 = (uint32_t)(_hex8); \
        set32 = (  h32  << 3) & 0x000007F8;  /* mark bits to set */ \
        clr32 = ((~h32) << 3) & 0x000007F8;  /* mark bits to clear */ \
        *setreg = set32;  /* write 1 to the set bits register */ \
        *clrreg = clr32;  /* write 1 to the clr bits register */ }
LADEBUG_HEX8() when Scrounging Bits
• On the PXA270, we had the following bits available, by leveraging the Camera Interface bits:
  • GPIO Port2: 84, 85, 93, 94
  • GPIO Port3: 116, 106, 107, 108
• PXA270 has separate set/clear registers.
• Code became too complex for a macro, hence:
  • #define LADEBUG_HEX8(val) ladebugHex8(val)
• Code takes longer to execute
  • 5 usec instead of 0.5 usec
  • Still very usable with minimal intrusion on system
LADEBUG_HEX8() when Scrounging Bits

    void ladebugHex8(uint8_t hex8) {
        uint32_t h32 = (uint32_t) hex8;
        uint32_t set2, set3, clr2, clr3;

        // First determine what to write to each register
        set2 = ((h32 & 0x0C) << (29-2)) | ((h32 & 0x03) << (20-0)); // D3.D2 | D1.D0
        set3 = ((h32 & 0xE0) << (10-5)) | ((h32 & 0x10) << (20-4)); // D7.D6.D5 | D4
        h32 = ~h32;
        clr2 = ((h32 & 0x0C) << (29-2)) | ((h32 & 0x03) << (20-0)); // D3.D2 | D1.D0
        clr3 = ((h32 & 0xE0) << (10-5)) | ((h32 & 0x10) << (20-4)); // D7.D6.D5 | D4

        // Then write out to the GPIOs
        *clrreg2 = clr2;
        *setreg2 = set2;
        *clrreg3 = clr3;
        *setreg3 = set3;
    }
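The bit-scattering logic is easy to get wrong, so it can be worth checking on a desktop before touching hardware. The sketch below pulls the mask computation into two pure functions; bank2_mask() and bank3_mask() are hypothetical names, and the bit positions are read directly off the shift amounts in the slide's code.

```c
#include <assert.h>
#include <stdint.h>

/* Map D0..D3 onto bank 2: D1.D0 -> bits 21.20, D3.D2 -> bits 30.29. */
static uint32_t bank2_mask(uint8_t hex8)
{
    uint32_t h32 = hex8;
    return ((h32 & 0x0Cu) << (29 - 2)) | ((h32 & 0x03u) << (20 - 0));
}

/* Map D4..D7 onto bank 3: D4 -> bit 20, D7.D6.D5 -> bits 12.11.10. */
static uint32_t bank3_mask(uint8_t hex8)
{
    uint32_t h32 = hex8;
    return ((h32 & 0xE0u) << (10 - 5)) | ((h32 & 0x10u) << (20 - 4));
}
```

The set mask is the function applied to the value and the clear mask is the function applied to its complement, so for any input the two masks should be disjoint and together cover every debug bit in the bank.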
Always Test Macros Before Using Them
    LADEBUG_HEX8(0xFF);
    LADEBUG_HEX8(0x01);
    LADEBUG_HEX8(0x02);
    LADEBUG_HEX8(0x04);
    LADEBUG_HEX8(0x08);
    LADEBUG_HEX8(0x10);
    LADEBUG_HEX8(0x20);
    LADEBUG_HEX8(0x40);
    LADEBUG_HEX8(0x80);
    LADEBUG_HEX8(0x55);
    LADEBUG_HEX8(0xAA);
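The same test sequence (all ones, a walking one across each data line, then alternating bits) can be generated in a loop. A small sketch, where la_test_pattern() is a hypothetical helper that fills a buffer rather than driving the port:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill `out` with the test sequence: 0xFF, walking ones, 0x55, 0xAA.
 * Returns the number of codes written (11), or 0 if `cap` is too small. */
static size_t la_test_pattern(uint8_t *out, size_t cap)
{
    size_t n = 0;
    if (cap < 11)
        return 0;
    out[n++] = 0xFF;
    for (unsigned bit = 0; bit < 8; bit++)
        out[n++] = (uint8_t)(1u << bit);
    out[n++] = 0x55;
    out[n++] = 0xAA;
    return n;
}
```

On the target, each value would be passed to LADEBUG_HEX8() in turn; on the analyzer every data line should then be seen going high on its own exactly once during the walking-ones portion.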
LADEBUG_HEX16() Macro
• Other macros can be built off the HEX8 base macro

    // do { } while (0) keeps the macro safe inside if/else without braces
    #define LADEBUG_HEX16(code,_val) do { \
        LADEBUG_HEX8(code); \
        LADEBUG_HEX8( (_val) >> 8); \
        LADEBUG_HEX8( (_val) ); \
    } while (0)

    Pos = 0x0063;
    LADEBUG_HEX16(0x40, Pos);
LADEBUG_HEX32() Macro
    #define LADEBUG_HEX32(code,_val) do { \
        LADEBUG_HEX8(code); \
        LADEBUG_HEX8( (_val) >> 24); \
        LADEBUG_HEX8( (_val) >> 16); \
        LADEBUG_HEX8( (_val) >> 8); \
        LADEBUG_HEX8( (_val) ); \
    } while (0)

    Sum = 0x0009B900;
    Avg = 0x0000CFB8;
    LADEBUG_HEX32(0x44, Sum);
    LADEBUG_HEX32(0x45, Avg);
More Logic Analyzer Macros are Possible
• There are many possibilities for defining macros
  • Use macros to enable a common interface across platforms
  • For each new platform, define the macros in a hardware-dependent manner, and place in a .h file
  • Simply include a different .h file for each platform
• Additional macro examples
  • Depending on what you have to troubleshoot, define the macros to meet your specific needs
Tracing Code using Logic Analyzer Macros
Tracing Driver, Low-Level, and Real-Time Code
• Tracing is typically the starting point for troubleshooting hard bugs
  • Need to understand the program flow, including paths taken within conditionals and loops in the lowest-level code
  • Especially if you can’t use printf() or serial I/O because they are too slow.
  • Or can’t use breakpoints as it breaks functionality.
• The code to troubleshoot was written by someone else
  • The flow of code might be difficult to follow, thus trying a straight code review can be confusing. Instead, get facts on program flow.
Where do we start?
• Consider a model of typical real-time code:
  • Periodic threads wait for Timer Events
  • Aperiodic threads may wait for some kind of message or signal to arrive
  • Interrupt handlers wait for hardware interrupts to trigger
  • In each case, processing is usually the same, per this basic model

[Diagram: Thread loop. Read Inputs/Events → Do Processing → Write Outputs → Wait for Event or Timer → repeat]
Start by Tracing Who Gets Called When
Mark the beginning and end of each key thread or interrupt.

[Diagram: Thread loop. Read Inputs/Events → Do Processing → Write Outputs → Wait for Event or Timer, with "Beginning of Execution" and "End of Execution" marked on the loop]
Codes to Logic Analyzer
• To trace, send codes to the logic analyzer that show when each thread (or function or interrupt handler) starts and finishes
• No specific rules on what codes to use
  • Using codes that establish patterns makes it faster to understand
• My personal convention
  • Two-digit HEX numbers show up on logic analyzer
  • Use first digit to indicate which thread
  • Use second digit to indicate where we are within a thread
    • Use “1” as second digit for start
    • Use “F” as second digit for finish
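The two-digit convention can be captured in a few helper macros so that codes are built and decoded consistently. The LA_* names below are hypothetical and not part of the original slides:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for the two-digit convention:
 * high nibble = thread ID, low nibble = position within the thread. */
#define LA_CODE(thread, pos) \
    ((uint8_t)((((thread) & 0x0Fu) << 4) | ((pos) & 0x0Fu)))
#define LA_START(thread)     LA_CODE((thread), 0x1u)   /* e.g. 0x21 */
#define LA_FINISH(thread)    LA_CODE((thread), 0xFu)   /* e.g. 0x2F */

/* Decoding side, useful when reading a capture. */
#define LA_THREAD_OF(code)   ((uint8_t)(((code) >> 4) & 0x0Fu))
#define LA_IS_FINISH(code)   (((code) & 0x0Fu) == 0x0Fu)
```

For example, LADEBUG_HEX8(LA_START(2)) would emit 0x21, and a code such as 0xBA decodes to thread 0xB at a position that is not a finish marker.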
Start by Tracing Who Gets Called When
We can monitor up to 16 threads/functions/interrupts at a time using this method.

[Diagram: Thread loop. Read Inputs/Events → Do Processing → Write Outputs → Wait for Event or Timer, with LADEBUG_HEX8(0x21) at the beginning of execution and LADEBUG_HEX8(0x2F) at the end]
Start by Tracing Who Gets Called When
Add additional debug macros within the code to monitor branches or function calls.

[Diagram: Thread loop. Read Inputs/Events → if (condition) { Do_A() } else { Do_B() } → Write Outputs → Wait for Event or Timer, with additional codes 0x22, 0x23, and 0x24 marking points around the conditional]
Sample Instrumentation of Code

    while (1) {
        LADEBUG_HEX8(0x2F);
        wait_for_something();
        LADEBUG_HEX8(0x21);
        read_stuff();
        LADEBUG_HEX8(0x22);
        if (condition) {
            LADEBUG_HEX8(0x26);
            Do_A();
        } else {
            LADEBUG_HEX8(0x2A);
            Do_B();
        }
        LADEBUG_HEX8(0x2C);
        write_stuff();
    }

• Intentionally skip a few numbers so that if we want to add more codes later, perhaps inside the function Do_A(), we can keep them in sequence. It is not essential to keep them in sequence, just helpful to keep things less confusing on the logic analyzer output.
• We can put up to 16 codes per thread (second digit 0..F)
Visualizing Real-Time Execution
Now What?
• Once code is instrumented, executed, and captured on the logic analyzer, most real-time code will exhibit repeated patterns
  • Patterns may not be precise, but still recognizable.
• “Issues” (errors, glitches, etc.) typically break the pattern. E.g.
  • Very periodic pattern is broken
  • A non-“xF” code, indicating something not ending in timely manner
  • Long gap between two logic analyzer codes
  • Short period of more activity than usual
  • Specific codes showing up too often
  • Specific codes not showing up often enough, or ever
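Some of these pattern breaks can also be checked mechanically if the capture is exported as a list of timestamped codes. The sketch below looks for the "long gap between two codes" case; the la_sample_t layout and la_find_long_gap() are assumptions for illustration, since export formats differ between analyzers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One captured debug code with its timestamp (hypothetical export format). */
typedef struct {
    uint32_t t_us;   /* timestamp in microseconds, assumed non-decreasing */
    uint8_t  code;   /* value seen on DEBUG_D0..D7 */
} la_sample_t;

/* Return the index of the first sample that is followed by a gap longer
 * than max_gap_us, or n if every gap is within the limit. */
static size_t la_find_long_gap(const la_sample_t *s, size_t n,
                               uint32_t max_gap_us)
{
    for (size_t i = 0; i + 1 < n; i++) {
        if (s[i + 1].t_us - s[i].t_us > max_gap_us)
            return i;
    }
    return n;
}
```

A hit is only a pointer to where to zoom in; whether the gap is a real problem still has to be judged against the expected behavior of the code.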
Run Code, Capture on Logic Analyzer
• Following slides show many examples
• Zoom in or out and scroll as needed
  • Zoom out to see big picture; zoom in to see more detail
  • Scroll left/right to see earlier or later portions of execution
• Look for anomalies. These are POTENTIAL issues
  • An anomaly is not necessarily an error
  • It represents an area that needs additional focus to see if that’s expected or not
• Search for specific codes
  • Use the Logic Analyzer search function to find a specific thread or function
Code Trace Example – High Level View
• At the high level, can’t see any of the codes, but we could see patterns
Code Trace Example – Medium Level View
• Zoom in a bit, and we can start seeing codes
• When code is efficient and the CPU is not overloaded, most codes end in “F”, which indicates the end of functions or threads
Code Trace Example – Anomaly seen in Medium Level View
• An anomaly: something running for an extended period of time!
• Code is 0xBA, which is not the end of a function or thread
Measuring Execution Time
• Use the logic analyzer to measure execution time
  • Measurements are accurate to sub-microsecond
  • Put an LADEBUG_HEX8() before and after any code that you want to measure
• Use it for fine-grain measurements. E.g. how long is a 32-bit division?

    LADEBUG_HEX8(0x24);
    y=x/a; // measuring one line of code
    LADEBUG_HEX8(0x25);

• Use it for coarse-grain measurements. E.g. start and stop of thread.
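On the analyzer the measurement is read off with cursors, but the same calculation can be expressed in code for an exported capture. This sketch assumes a hypothetical la_sample_t record of timestamp plus code; la_elapsed_us() is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One captured debug code with its timestamp (hypothetical export format). */
typedef struct {
    uint32_t t_us;   /* timestamp in microseconds, assumed non-decreasing */
    uint8_t  code;   /* value seen on DEBUG_D0..D7 */
} la_sample_t;

/* Find the first start_code, then the next stop_code after it, and store
 * the time between them. Returns 1 on success, 0 if the pair is not found. */
static int la_elapsed_us(const la_sample_t *s, size_t n,
                         uint8_t start_code, uint8_t stop_code,
                         uint32_t *elapsed_us)
{
    for (size_t i = 0; i < n; i++) {
        if (s[i].code != start_code)
            continue;
        for (size_t j = i + 1; j < n; j++) {
            if (s[j].code == stop_code) {
                *elapsed_us = s[j].t_us - s[i].t_us;
                return 1;
            }
        }
        return 0;   /* start seen, but no matching stop in the capture */
    }
    return 0;
}
```

The result includes the cost of the debug write itself, so for very short code sections it may be worth measuring two back-to-back LADEBUG_HEX8() calls first and subtracting that baseline.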
Measuring Execution Time
Use the logic analyzer to mark the start and end of the code, then read the exact measurement.
Code Trace Example – Low Level View
• When we zoom in more, we can see the interim trace, not just the ends of functions
Code Trace Example – Low Level View
• Good news. This can be fun. Troubleshooting can be a game.
• Let’s play Find the Anomaly
Code Trace Example – Low Level View
• Same code repeated multiple times, indicative of a loop.
• But not the same number of repetitions each time?
• 0x13-0x15 occur 6 times, then 8 times. Why?
Code Trace Example – Searching for Specific Code
• Suppose we want to find the execution of one of the instrumented functions, and it’s instrumented as 0x31-0x3F. Find 0x31 first.
Code Trace Example – Searching for Specific Code
• Search for 0x31. The search found it here.
• Note: the search function is dependent on the logic analyzer software. It works differently on every analyzer.
Code Trace Example – Searching for Specific Code
• Zoom in and see the 0x31.
• See the path within the code that was taken: the branch that prints 0x36 was executed.
Visual Debugging: More Examples of Anomalies
• Let’s look at a few more examples of identifying anomalies
  • The more experience you get, the easier it will be to spot the anomaly
  • As you research each anomaly, you’ll get a better understanding of how the code works
  • Once an anomaly becomes understood, and considered “normal”, it is usually easy to ignore it
  • For anomalies that indicate real issues, add more LADEBUG_HEX8() codes if necessary, to gain a better understanding of that code
Find the Anomaly
Anomaly: Inconsistent Clock
• Very repetitive until we reach here.
Anomaly: Inconsistent Clock
• Repeatedly zoom in more to see what is happening
Anomaly: Inconsistent Clock
• Example of printing data within the stream

    msgID = wait_for_something();
    LADEBUG_HEX16(0x31,msgID);
    LADEBUG_HEX8(0x31);

• The pattern I use to intermix data is to surround the data with the “trace” code.
• Knowing which message came in was the key to understanding why the timing was different. We got an extra unexpected message.
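When reading an exported list of codes, the "surround the data with the trace code" pattern appears as four consecutive bytes: code, high byte, low byte, code. A sketch of a decoder follows; la_find_framed16() is a hypothetical helper, and it assumes each written byte shows up as its own entry in the capture (for instance by sampling on the DEBUG_CLK line), which the slides do not spell out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scan for the pattern: trace_code, hi, lo, trace_code.
 * On a match, store the 16-bit value and return the index of the first
 * byte of the pattern; otherwise return n. */
static size_t la_find_framed16(const uint8_t *codes, size_t n,
                               uint8_t trace_code, uint16_t *value)
{
    for (size_t i = 0; i + 3 < n; i++) {
        if (codes[i] == trace_code && codes[i + 3] == trace_code) {
            *value = (uint16_t)((codes[i + 1] << 8) | codes[i + 2]);
            return i;
        }
    }
    return n;
}
```

The closing code doubles as the normal trace marker, so the rest of the thread's codes (0x32 onward) read exactly as they would without the embedded data.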
Anomaly: Another Inconsistent Clock
• Half a period? Why?
Anomaly: Another Inconsistent Clock
• Extended Period? Why?
Find the Anomaly
Anomaly: Extended Execution Time
• Code after 0x43 runs much longer one time as compared to other times
• Could be an alternate thread of execution
  • Zoom in to see if the others are 0x43 as well.
• Could be a result of preemption.
  • But by what?
  • If preempted, by something NOT instrumented.
• Possible issue if concerned about race conditions.
Find the Anomaly
Anomaly: Burst Pattern
• Thread instrumented with 0x4? executes 4 times in a row at fixed rate, then long delay, then repeats. Why?
Anomaly: Burst Pattern
• 4th pulse is regularly longer than the other 3. Why?
Find the Anomaly
Find the Anomaly
• Don’t see any issues, looks pretty regular
• Zoom in on a segment
Find the Anomaly
• See anything at this zoom level?
Find the Anomaly
• Maybe if we zoom in more?
• Now we’re seeing differences, but still not sure what the pattern is.
• The 0x3F/0x4F codes demarcate the start of a repeated pattern
Find the Anomaly
• Now down to 500 us/Div. See something now?
No Anomaly! This is a GOOD Thing!
• Not an anomaly, but rather quite the opposite
• VERY regular at this zoom level: every thread taking 500 usec
• Task 2 runs consistently every msec
No Anomaly! This is a GOOD Thing!
• This is what we strive for!
  • Very regular at every zoom level
  • Consistent patterns will make it easy to spot any disturbances
  • The most reliable real-time systems are repeatable like this during the steady state!
• This is IDEAL. Few systems will look like this, but if yours does, then great!
Find the Anomaly
Anomaly: Clock Drift
• This looks like clock drift
Anomaly: Clock Drift
• This is what it looks like zoomed in
Clock Drift
• Clock drift is a fact of life
• It will happen whenever threads run at almost the same rate, but not quite
• E.g. a sensor has a built-in timer, generating data at 100 Hz
• Sensor has its own internal oscillator
• Processor reads this with a thread, also running at 100 Hz
• Thread’s timer is based on the processor crystal
• Unless both timers are sourced from the same crystal, there will be clock drift
• 100 vs 101 Hz would result in one “skipped” or “extra” cycle per second
• 100.0 vs 100.1 Hz would be one skipped or extra cycle every 10 seconds
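The slip interval in the two examples above is the reciprocal of the frequency difference. A minimal sketch in C, assuming two nominally equal rates given in Hz; the function name `seconds_per_slip` is illustrative and does not come from the talk:

```c
#include <assert.h>

/* Seconds between one "skipped" or "extra" cycle when two periodic
 * activities run at slightly different rates (both in Hz).
 * Hypothetical helper, not part of the LADEBUG macros. */
double seconds_per_slip(double rate_a_hz, double rate_b_hz)
{
    double diff = rate_a_hz - rate_b_hz;

    if (diff < 0.0) {
        diff = -diff;
    }
    assert(diff > 0.0);   /* identical rates never slip */
    return 1.0 / diff;
}
```

For instance, `seconds_per_slip(100.0, 101.0)` gives 1 second, and `seconds_per_slip(100.0, 100.1)` gives roughly 10 seconds, matching the figures above.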
Clock Drift: One Kind of Glitch
• Since each clock operates within a tolerance, it is not possible to plan the exact time of the skip
• Real-time software needs to take this into account for every data value, otherwise problems such as empty or full queues can happen
• For example, suppose software ignores this.
• Assume system typically runs about 8 hours between power cycles
• Sensor running at 100.01 Hz, thread at 100.00 Hz
• Data buffer is 1024 sensor values
• Every 100 seconds, sensor adds one more value to buffer than gets read
• In 102400 seconds (about 1.2 days), the buffer fills up
• One day, someone runs the system for more than a day. Voilà, glitch!
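The buffer arithmetic above can be checked with a few lines of C. This is a rough sketch that assumes the buffer starts empty and that drift is the only source of imbalance; `seconds_until_full` is a hypothetical helper, not something from the talk:

```c
#include <assert.h>

/* Seconds until a buffer of `capacity` samples overflows when the producer
 * (sensor) is slightly faster than the consumer (reader thread).
 * Returns -1.0 if the producer is not faster, i.e. the buffer never fills
 * from drift alone (it may run empty instead). Hypothetical helper. */
double seconds_until_full(unsigned capacity, double producer_hz, double consumer_hz)
{
    double surplus_per_sec = producer_hz - consumer_hz;

    assert(capacity > 0u);
    if (surplus_per_sec <= 0.0) {
        return -1.0;
    }
    return (double)capacity / surplus_per_sec;
}
```

With `capacity = 1024`, `producer_hz = 100.01` and `consumer_hz = 100.00`, the result is close to 102400 seconds, a little under 1.2 days, which is well past the typical 8-hour power cycle.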
Find the Anomaly
Find the Anomaly
• Swapping back and forth way too fast.
Find the Anomaly
• No RTOS will thread switch that quickly.
• Threads were running in parallel on two cores.
Consider Problem Scenario
• Customer reported that their system failed
• They reported “garbage on screens. Some obviously wrong numbers”.
• They said it’s happened twice over the past month
• Engineers in failure analysis lab unable to replicate it
• What do you do?
Consider Problem Scenario: Rare Glitch
• What do you do?
• Start by doing what we did above, and just trace the code
• Look for the baseline pattern.
• Identify variants, i.e. code that runs occasionally, but not at the same time every cycle relative to the baseline pattern
• For each variant, review any shared resources for possible conflict
• If the variant uses significant execution time at high priority, measure how long it is running, and what happens if any other thread is delayed by that long
Troubleshooting Rare Glitches
• Why does this work to find a rare glitch?
• The conditions behind many rare glitches actually occur all the time, but the “collision” that makes the glitch observable is much rarer.
• Consider root causes of rare glitches:
• Timing or synchronization error (including issues with locks and mutexes)
• Memory corruption
• Hardware error
• Almost any other type of error will be reproducible
• If troubleshooting a rare glitch, look for timing or synchronization issues!
Troubleshooting Rare Glitches: Example
• The following slides provide an example
• Assume we have been unable to reproduce the issue
• But someone demands we fix it!
• We’ll use the LADEBUG output to zero in on the thread(s) that have the highest probability of being the source of a rare glitch.
Troubleshooting Rare Glitches that Can’t be Replicated: Look for the “baseline” pattern
• This code is not fully deterministic, but some patterns can still be observed
Troubleshooting Rare Glitches that Can’t be Replicated: Verify Pattern at Different Levels of Zoom
• One observation when zooming in: thread 3 regularly follows thread 8
Troubleshooting Rare Glitches that Can’t be Replicated: Look at the Anomalies
• In one spot, there is more activity than usual, with thread 14 (E) after it.
Troubleshooting Rare Glitches that Can’t be Replicated: Identify a Potential Variant within the Anomaly
• Zoom in on the additional activity. It’s thread 2; it preempted thread 3!
Troubleshooting Rare Glitches that Can’t be Replicated: Check how variant potentially affects other threads
• Search for other instances of thread 2 – E.g. here it preempts thread 8.
Troubleshooting Rare Glitches that Can’t be Replicated: Give additional scrutiny to the most problematic threads
• Thread 2 is an example of a potentially troublesome variant.
• It inconsistently preempts others, sometimes in the middle of execution
• This thread should get additional scrutiny
• Review all potential shared resources.
• Find all instances of this thread, understand worst-case execution time, potential priority inversion, and effect on other threads whenever this thread runs.
• If it uses locks or mutexes, try to force the glitch by increasing execution time or forcing a context switch while inside a critical section (example follows)
Forcing Rare Glitch to Show Itself
Hypothesis: Mutex is working fine, not cause of glitch. Prove it!
• How do you know that a mutex is working properly?
• What if two threads appear to be using the same mutex, but the mutex was initialized in a different context? There is no obvious way to know that there is a problem.
• More often than not, mutexes are assumed to work; they don’t get tested explicitly
• Months of testing might never catch it. This could become an escaped defect.
Forcing Rare Glitch to Show Itself
Hypothesis: Mutex is working fine, not cause of glitch. Prove it!
• Example of why a mutex error or race condition shows up as a rare glitch:
• Mutex is held by periodic thread A for about 5 usec every 100 msec, or 0.005% of the time.
• Aperiodic thread B runs about once every ten or so seconds, and also holds the mutex for about 5 usec, or 0.00005% of the time.
• If rates are not harmonic, the odds of the two colliding are tiny, roughly 1-in-10-million.
• If rates are harmonic, collision is either never or always!
• With thread A running 10 times per second, it runs 10-million times in about 12 days.
Consider Problem Scenario: Rare Glitch
• Can we test the mutex?
• Yes … but few people know how.
• If there’s concern that maybe the mutex is not working properly, put a “usleep(1000)” in both thread A and thread B when they hold the lock.
• This might mess up the real-time performance, but data integrity should be maintained
• We’ve just changed the 1-in-10-Million odds to about 1-in-10.
• If there’s a problem with the mutex, it might now show up within a few seconds.
• Put a LADEBUG_HEX8() code at start and end of critical section in each thread.
• If mutex is a problem, you’ll also see on the logic analyzer where one thread enters the critical section, even though the other one should have it locked.
Troubleshooting Mutexes
• To test a mutex, insert usleep() while holding the lock

mutex_Lock();
{
    LADEBUG_HEX8(((threadId) << 4) | 0x6);   /* entering critical section */
    usleep(1000);                            /* widen the window for a collision */
    /* do critical section stuff */
    LADEBUG_HEX8(((threadId) << 4) | 0x7);   /* leaving critical section */
}
mutex_Unlock();

• For thread 3, LADEBUG code 0x36 means start of critical section, 0x37 means end
• Once one thread starts, no other thread should be able to enter
Mutex not working
• Sequence 0x36-0x46-0x37 is observed
Mutex Working
• usleep() gives plenty of time for thread 4 to interrupt thread 3 while it held the mutex
• Thread 4 (0x41) interrupts thread 3, but thread 3 holds the lock (0x37)
• 0x37 indicates thread 3 swapped back in and released the lock.
• Since threads 3 and 4 had the same priority in this example, thread 3 continued.
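The difference between the two traces (0x36-0x46-0x37 versus 0x36-0x41-0x37) can also be checked offline on exported codes. A minimal sketch in C, assuming a single mutex, the `(threadId << 4) | code` encoding used above, and that low nibbles 0x6 and 0x7 are reserved for critical-section entry and exit; `find_mutex_violation` is a hypothetical name, not part of the talk:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CS_ENTER 0x6u   /* low nibble written at start of critical section */
#define CS_EXIT  0x7u   /* low nibble written at end of critical section */

/* Scan captured debug codes and return the index of the first code where a
 * thread enters the critical section while another thread is still inside.
 * Returns -1 if no overlap is found. */
long find_mutex_violation(const uint8_t *codes, size_t count)
{
    int owner = -1;   /* thread currently inside, or -1 if none */

    assert(codes != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        int thread = codes[i] >> 4;
        unsigned low = codes[i] & 0x0Fu;

        if (low == CS_ENTER) {
            if (owner != -1 && owner != thread) {
                return (long)i;   /* second thread got in: mutex failed */
            }
            owner = thread;
        } else if (low == CS_EXIT && owner == thread) {
            owner = -1;
        }
    }
    return -1;
}
```

On the failing sequence 0x36, 0x46, 0x37 this flags index 1 (thread 4 entering while thread 3 is inside); on 0x36, 0x41, 0x37, 0x46, 0x47 it reports no violation.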
Avoiding Rare Glitches
• Rare glitches can be the most devastating of all issues
• Usually the source of the most expensive failures
• Catastrophic system failures; recalls; regulatory shutdowns.
• Reduce glitches by maximizing determinism
• Use harmonic periodic threads when possible
• Use sporadic servers if execution is aperiodic
• Limit preemption by executing more threads at same priority in FIFO manner
• Use the logic analyzer to verify the determinism
• Deterministic systems produce the most consistent patterns
• Start by minimizing the variants when possible
• Scrutinize every remaining variant
• Actively test every synchronization mechanism used in the system
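Whether a set of thread periods is harmonic can be checked mechanically before looking at the trace. A small sketch in C, assuming periods are expressed in one common integer unit (for example microseconds); `periods_are_harmonic` is an illustrative name, not something defined in the talk:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every pair of thread periods is harmonic (the longer period
 * is an exact multiple of the shorter one), 0 otherwise. */
int periods_are_harmonic(const uint32_t *periods, size_t count)
{
    /* A zero period is meaningless and would divide by zero below. */
    for (size_t i = 0; i < count; i++) {
        assert(periods[i] > 0u);
    }

    for (size_t i = 0; i < count; i++) {
        for (size_t j = i + 1; j < count; j++) {
            uint32_t longer  = periods[i] > periods[j] ? periods[i] : periods[j];
            uint32_t shorter = periods[i] > periods[j] ? periods[j] : periods[i];

            if (longer % shorter != 0u) {
                return 0;
            }
        }
    }
    return 1;
}
```

Periods of 1000, 2000, 4000 and 8000 usec pass this check; 1000, 2000 and 3000 usec do not, because 3000 is not a multiple of 2000.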
Setup Extras
• Visualizing execution can be augmented by watching other signals at the same time.
• Key presses and other triggers
• Serial communication
• Multiple processors, each with their own 8-bit logic analyzer instrumentation
• Analog signals or power consumption via oscilloscope integration
• Reset or other hardware signals
• Mechanical devices like motors and relays that have significant timing delays
Example: Troubleshooting High Power Consumption
• Next few pages show a logic analyzer setup with:
• LADEBUG_HEX8(): 8-bit debug port (yellow)
• LADEBUG_MODE(): 3-bit mode (green) – indicates CPU speed
• LADEBUG_TRIG(): 1-bit trigger (magenta) – indicates DVFS change
• 2 keys on keypad (cyan)
• serial TX/RX (grey)
• instantaneous current consumption of CPU (green analog)
• input voltage (red).
• Each page is a zoom of the prior page.
• The zoom area is shown by the magenta rectangle.
Example: Troubleshooting High Power Consumption
Example: Troubleshooting High Power Consumption
Lower measurement is higher power usage
Mode=0 is CPU @ 13 MHz
Mode=5 is CPU @ 500 MHz
Expected power usage at 500 MHz
Anomaly: twice as much power as expected from CPU @ 500 MHz
Example: Troubleshooting High Power Consumption
Zoom in around the anomaly
Example: Troubleshooting High Power Consumption
DVFS triggered
Expected 100 usec settling time
Codes that may explain details of DVFS change
Example: Troubleshooting High Power Consumption
Zoom in to see codes
Example: Troubleshooting High Power Consumption
From the code, the decision is captured before 0xE4
Example: Troubleshooting High Power Consumption
Zoom in to see codes
Example: Troubleshooting High Power Consumption
Codes show the exact path through the DVFS state change code. The issue was found: the code changed frequency correctly, but set the voltage too high when going from 100 MHz to 500 MHz. The code was OK going from 13 MHz to 500 MHz.
Practical Issue: Race Condition with DEBUG_HEX8
• When writing multiple bits, there are race conditions due to propagation delay that a logic analyzer can capture
• Logic analyzer runs faster than the GPIO writes, thus may capture transitions as separate events
• Usually easy to ignore when visualizing, as these captures are only a few nanoseconds in length
• For automated analysis, however, this is harder: the tool cannot tell precisely whether the logic analyzer captured a new value or a transition value
Race Condition with DEBUG_HEX8 – Example
• When writing multiple bits, there are race conditions due to propagation delays
• Logic analyzer runs faster than the GPIO writes, thus may capture transitions as separate events
Transition from 0x80 to 0x44
Adding DEBUG_CLK as a 9th GPIO bit
#define LADEBUG_HEX8(_val) {        \
    *clkbit |= 0x01;                \
    *datreg = (uint8_t)(_val);      \
    *clkbit &= ~0x01;               \
}
Clock bit eliminates ambiguity
Example of Needing DEBUG_CLK
• 0x46 is NOT actually there. It was the transition from 0x43 to 0x4E
• Tends to happen most if either debug GPIOs are on different registers, or set/clr registers are used.
• 0b01000011 (0x43)
• 0b01001110 (0x4E)
Bits shown in blue are changing at different rates, and thus subjected to different delays when changing 0 to 1 or 1 to 0
No clock bit, transition is due to propagation delays
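To see why 0x46 can appear between 0x43 and 0x4E, note that it matches the old value on some of the changing bits and the new value on the others, while all unchanged bits stay put. A minimal sketch of that test in C; `is_transition_artifact` is a hypothetical helper for offline analysis, not part of the LADEBUG macros:

```c
#include <stdint.h>

/* Returns 1 if `seen` could be a momentary value captured while the port
 * changes from `from` to `to` with bits switching at different times:
 * every bit of `seen` matches either the old or the new value, and `seen`
 * is neither endpoint. Returns 0 otherwise. */
int is_transition_artifact(uint8_t from, uint8_t to, uint8_t seen)
{
    uint8_t changing = (uint8_t)(from ^ to);   /* bits that differ between endpoints */
    uint8_t stable   = (uint8_t)~changing;     /* bits that must not move */

    if (seen == from || seen == to) {
        return 0;
    }
    return ((seen ^ from) & stable) == 0;
}
```

For the example above, `is_transition_artifact(0x43, 0x4E, 0x46)` returns 1: bits 0 and 2 have already changed, bit 3 has not. A value such as 0x56 returns 0, because bit 4 does not change between the two endpoints. This only says a value is a plausible artifact; a genuine 0x46 code written between the two would look the same, which is the ambiguity the clock bit removes.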
DEBUG_CLK Bit
• Use DEBUG_CLK bit if you have one more available GPIO
• While not absolutely necessary when visualizing data, it can help
• If doing DEBUG_HEX32(), it can help distinguish between values with repeated bytes.
• 0x00001234, 0x00121234, and 0x00123434 all look very similar on logic analyzer when not using DEBUG_CLK to distinguish start of each byte
• Use it especially when exporting data to an external analysis tool
• External analysis tool should walk through every event captured, and discard any that don’t have a DEBUG_CLK bit toggle
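The filtering step for an external analysis tool might look like the following sketch in C. It assumes the clock bit is raised before the data write and dropped after it, as in the LADEBUG_HEX8 definition with DEBUG_CLK shown earlier, so the data is stable on the falling clock edge. The `la_sample` layout and the name `filter_clocked` are hypothetical; real logic analyzer export formats differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One captured logic analyzer event: the clock bit plus the 8 data bits. */
struct la_sample {
    uint8_t clk;    /* DEBUG_CLK level, 0 or 1 */
    uint8_t data;   /* value on the 8-bit debug port */
};

/* Copy into `out` only the data values present on a falling DEBUG_CLK edge,
 * discarding events captured while the data bits were still changing.
 * Returns the number of values written (at most `out_cap`). */
size_t filter_clocked(const struct la_sample *in, size_t count,
                      uint8_t *out, size_t out_cap)
{
    size_t n = 0;

    assert(in != NULL || count == 0);
    for (size_t i = 1; i < count && n < out_cap; i++) {
        if (in[i - 1].clk == 1u && in[i].clk == 0u) {
            out[n++] = in[i].data;
        }
    }
    return n;
}
```

Given a capture in which 0x43 is written and then 0x4E, with a spurious 0x46 recorded while the clock is high, only 0x43 and 0x4E survive the filter. Because every write produces its own clock edge, repeated bytes in a multi-byte value are also kept as separate events.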
Troubleshooting Serial Links
• Green shows I2C.
• Obvious anomaly: I2C stops for an extended period of time
Troubleshooting Serial Links
• Zoom in multiple times to see more
Troubleshooting Serial Links
• I2C signals are more obvious, but we can’t tell what the I2C data is.
Serial Interpreters
• Many logic analyzers have serial interpreters for popular protocols
• RS232/UART, SPI, I2C, SDHC
• They convert the serial data into codes, similar to LADEBUG codes
• E.g. following is an I2C breakout, including address/start/stop bits
Serial Interpreters
• Couple serial lines with LADEBUG codes to see relationship
• Types of issues that can be observed
• Code will “write” register, but serial transmission is usually NOT done when code continues, as operations continue in parallel.
• Observe pre-loading the NEXT byte, while previous byte is being transmitted
• Determine when a status bit gets set relative to the serial transmission
• Byte ordering on serial line
• Whether data on serial line matches data that was sent or received
Summary
• Troubleshooting Tough Bugs using a Logic Analyzer
• Print Debug Macros
• Logic Analyzer Debug Macros
• Tracing Code using Logic Analyzer debug macros
• Displaying Variable Data on the Logic Analyzer
• Visualizing Real-Time Execution
• Focus on Anomalies
• Performance Issues
• Clock or Synchronization Errors
• Troubleshooting Rare Glitches
• Setup Extras
• Analog Signals coupled with Digital Debug
• Debug Clock Bit
• Serial Protocols