System Troubleshooting TCS Network, System, and Load Monitoring TCS for Developers.
System Troubleshooting TCS
Network, System, and Load Monitoring TCS for Developers
LBT TCS Cluster
Networking
VLANs for private networks
6 Gb non-blocking, full duplex backbone
Latency, Throughput, Data Rate
Broadcast, Multicast, TCP/UDP
Bottleneck at the desktop workstations
Diagnostics Theory
Memory bound versus CPU bound
Network throughput versus speed
Multithreading errors
Subsystem Interaction
printf and syslog
Standard Out and Standard Error
Monitoring and Diagnostic Tools
/sbin/tcpdump
/sbin/ifconfig
cacti
top
syslog
vmstat
R
gnuplot
tcpdump
Interactive
-lett -i <device> {limit}
Device can be eth0, or eth0.20 for VLANs
Gather Only
-i <device> -w <file>
Gathers all raw packets and writes them to a file for processing later
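The two invocation styles above can be sketched as argument lists in Python. This is only an illustration: the helper names `interactive_cmd` and `gather_cmd` and the file name `capture.pcap` are hypothetical, and the slide's `{limit}` placeholder is passed through untouched as optional extra arguments rather than interpreted.

```python
import shlex


def interactive_cmd(device, extra=()):
    """Build the interactive form: tcpdump -lett -i <device> {limit}."""
    # -l line-buffers stdout, -e prints link-level headers,
    # -tt prints timestamps as raw seconds since the epoch.
    return ["tcpdump", "-lett", "-i", device, *extra]


def gather_cmd(device, capture_file):
    """Build the gather-only form: tcpdump -i <device> -w <file>."""
    # -w writes raw packets to a file instead of decoding them on screen.
    return ["tcpdump", "-i", device, "-w", capture_file]


# Print the command lines instead of running them; tcpdump itself
# normally needs root privileges.
print(shlex.join(interactive_cmd("eth0.20")))
print(shlex.join(gather_cmd("eth0", "capture.pcap")))
```

A file written by the gather-only form can be read back later with `tcpdump -r <file>`.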
Reflective Memory
[root@lbtmu107 ~]# tcpdump -i eth0
17:51:34.494273 IP 10.10.0.238.5000 > 10.10.0.255.5000: UDP, length 1028
17:51:34.494282 IP 10.10.0.238.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.494397 IP 10.10.0.239.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.494522 IP 10.10.0.240.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.494531 IP 10.10.0.241.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.504062 IP 10.10.0.245.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.504144 IP 10.10.0.248.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.504266 IP 10.10.0.238.5000 > 10.10.0.255.5000: UDP, length 1028
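Output like the capture on this slide is easy to summarize per sender. The following is a minimal sketch in Python, assuming the one-line UDP format shown here; the regular expression and the `summarize` helper are illustrative and would need extending for other protocols or tcpdump options.

```python
import re
from collections import Counter

# Matches lines such as:
# 17:51:34.494273 IP 10.10.0.238.5000 > 10.10.0.255.5000: UDP, length 1028
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): "
    r"UDP, length (?P<length>\d+)$"
)


def summarize(lines):
    """Return (packets per source, payload bytes per source)."""
    packets = Counter()
    octets = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a UDP line in this format
        packets[match["src"]] += 1
        octets[match["src"]] += int(match["length"])
    return packets, octets


SAMPLE = """\
17:51:34.494273 IP 10.10.0.238.5000 > 10.10.0.255.5000: UDP, length 1028
17:51:34.494282 IP 10.10.0.238.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.494397 IP 10.10.0.239.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.494522 IP 10.10.0.240.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.494531 IP 10.10.0.241.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.504062 IP 10.10.0.245.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.504144 IP 10.10.0.248.5000 > 10.10.0.255.5000: UDP, length 60
17:51:34.504266 IP 10.10.0.238.5000 > 10.10.0.255.5000: UDP, length 1028
"""

packets, octets = summarize(SAMPLE.splitlines())
for source in sorted(packets):
    print(source, packets[source], "packets,", octets[source], "bytes")
```

On the sample, every sender broadcasts to 10.10.0.255 on UDP port 5000, and 10.10.0.238 accounts for three of the eight packets.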
ifconfig
[root@lbtmu01 ~]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:11:11:10:04:10
          inet6 addr: fe80::211:11ff:fe10:410/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:402698793 errors:0 dropped:0 overruns:0 frame:0
          TX packets:74367255 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3500999197 (3.2 GiB)  TX bytes:3982146708 (3.7 GiB)
          Base address:0xdf40 Memory:fbee0000-fbf00000
eth0.10   Link encap:Ethernet  HWaddr 00:11:11:10:04:10
          inet addr:10.144.0.131  Bcast:10.144.0.255  Mask:255.255.255.0
          inet6 addr: fe80::211:11ff:fe10:410/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:12609308 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9774513 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2701235204 (2.5 GiB)  TX bytes:1087406483 (1.0 GiB)
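The counters worth checking in this output are the errors, dropped, and overruns fields on the RX and TX lines. Below is a small Python sketch that pulls them out, assuming the older net-tools layout shown on this slide (newer ifconfig versions print a different format). The function names are hypothetical.

```python
import re

IFACE_RE = re.compile(r"^(\S+)\s+Link encap:")
COUNTER_RE = re.compile(
    r"(RX|TX) packets:(\d+) errors:(\d+) dropped:(\d+) overruns:(\d+)"
)


def parse_ifconfig(text):
    """Map interface name -> {'rx': {...}, 'tx': {...}} packet counters."""
    stats = {}
    current = None
    for line in text.splitlines():
        match = IFACE_RE.match(line)
        if match:
            current = match.group(1)
            stats[current] = {}
            continue
        match = COUNTER_RE.search(line)
        if match and current is not None:
            packets, errors, dropped, overruns = map(int, match.groups()[1:])
            stats[current][match.group(1).lower()] = {
                "packets": packets,
                "errors": errors,
                "dropped": dropped,
                "overruns": overruns,
            }
    return stats


def problem_interfaces(stats):
    """Names of interfaces with any non-zero error, drop, or overrun count."""
    return [
        name
        for name, directions in stats.items()
        if any(
            counters[key]
            for counters in directions.values()
            for key in ("errors", "dropped", "overruns")
        )
    ]


SAMPLE = """\
eth0      Link encap:Ethernet  HWaddr 00:11:11:10:04:10
          RX packets:402698793 errors:0 dropped:0 overruns:0 frame:0
          TX packets:74367255 errors:0 dropped:0 overruns:0 carrier:0
eth0.10   Link encap:Ethernet  HWaddr 00:11:11:10:04:10
          RX packets:12609308 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9774513 errors:0 dropped:0 overruns:0 carrier:0
"""

stats = parse_ifconfig(SAMPLE)
print(problem_interfaces(stats))
```

For the sample above, both interfaces report zero errors, drops, and overruns, so the list of problem interfaces is empty.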
Cacti (http://ldap.lbto.arizona.edu/cacti/)
www.cacti.net
LDAP authentication
Customizable views
Full Deployment: September 2006
top
Time lost in system is probably I/O, which includes networking
Sort by memory usage with “M”
top reports its own resource usage inaccurately
vmstat
vmstat is a Linux utility for monitoring virtual memory usage. It can also be used to track down I/O problems, including networking.
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0 626164 533248  12488  64388    1    2     6     5   44   44  9  3 88  0
 0  0 626164 533136  12488  64388    0    0     0     0 1613 1161  5  2 93  0
 0  0 626164 533136  12496  64388    0    0     0    12 1642 1189  5  3 92  0
 0  0 626164 533136  12496  64388    0    0     0     0 1645 1247  4  2 94  0
 0  0 626164 533128  12496  64388    0    0     0     0 1640 1195  5  3 92  0
 0  0 626164 533128  12496  64388    0    0     0     0 1631 1248  4  2 93  0
 1  0 626164 533200  12496  64388    0    0     0     0 1674 1288  5  3 92  0
 0  0 626164 533200  12496  64388    0    0     0     1 1622 1210  4  2 94  0
 0  0 626164 533200  12500  64388    0    0     0    17 1705 1312  6  3 91  0
 0  0 626164 533200  12500  64388    0    0     0     0 1649 1261  5  3 93  0
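To work with these numbers rather than read them by eye, the vmstat output can be parsed into rows keyed by column name. The Python sketch below assumes the 16-column layout shown on this slide; `parse_vmstat` and `column_mean` are hypothetical helper names. Note that the first data row vmstat prints holds averages since boot, so it is usually excluded from interval statistics.

```python
def parse_vmstat(text):
    """Parse vmstat output into a list of {column name: int} rows."""
    rows = []
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "procs":
            continue  # blank line or the group header (procs, memory, ...)
        if fields[0] == "r":
            columns = fields  # column header: r b swpd free ...
            continue
        if columns and len(fields) == len(columns):
            rows.append(dict(zip(columns, map(int, fields))))
    return rows


def column_mean(rows, name):
    """Arithmetic mean of one column over the given rows."""
    return sum(row[name] for row in rows) / len(rows)


SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0 626164 533248  12488  64388    1    2     6     5   44   44  9  3 88  0
 0  0 626164 533136  12488  64388    0    0     0     0 1613 1161  5  2 93  0
 0  0 626164 533136  12496  64388    0    0     0    12 1642 1189  5  3 92  0
 0  0 626164 533136  12496  64388    0    0     0     0 1645 1247  4  2 94  0
"""

rows = parse_vmstat(SAMPLE)
interval_rows = rows[1:]  # drop the since-boot averages in the first row
print("mean idle %:", column_mean(interval_rows, "id"))
print("mean interrupts/s:", column_mean(interval_rows, "in"))
```

In the sample the `wa` column stays at 0 and idle time stays above 90%, which suggests this host was not waiting on I/O when the capture was taken.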
Statistical Analysis
R, gnuplot, and Matlab
All of these packages give you a different view of the data that you gather.
Even if you are not comfortable with them, someone else might be.
Graphs, charts, baselines, etc.
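As one illustration of a baseline, the sketch below uses Python's standard `statistics` module in place of R, gnuplot, or Matlab. It takes the interrupts-per-second column from the vmstat sample elsewhere in this deck, computes a mean and standard deviation, and flags later readings that fall far outside that range. The three-standard-deviation threshold and the helper names are assumptions, not a recommendation from the slides.

```python
import statistics


def baseline(samples):
    """Return (mean, sample standard deviation) for a list of readings."""
    return statistics.mean(samples), statistics.stdev(samples)


def outliers(readings, mean, stdev, k=3.0):
    """Readings more than k standard deviations away from the mean."""
    return [value for value in readings if abs(value - mean) > k * stdev]


# Interrupts per second ("in" column) from the vmstat slide,
# excluding the first row, which reports averages since boot.
INTERRUPTS = [1613, 1642, 1645, 1640, 1631, 1674, 1622, 1705, 1649]

mean, stdev = baseline(INTERRUPTS)
print(f"baseline: {mean:.1f} +/- {stdev:.1f} interrupts/s")

# Hypothetical new readings to compare against the baseline.
print(outliers([1650, 44, 1700], mean, stdev))
```

The same mean and spread could be drawn as a band on a gnuplot or R chart, so that a deviation stands out visually.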
Syslog
/var/log/TCS/?
[telescope@lbtmu01 ~]$ tail -f /var/log/TCS/user
Jul 24 20:55:19 lbtmu105 LBT_ECS: Thermal failed to connect to IP 10.144.0.205 port 50010
Jul 24 20:55:20 lbtmu105 LBT_ECS: Thermal not connected to ThermalBox, Send Cmd failed
Jul 24 20:55:32 lbtmu105 LBT_ECS: Thermal failed to connect to IP 10.144.0.205 port 50010
Jul 24 20:55:33 lbtmu105 LBT_ECS: Thermal not connected to ThermalBox, Send Cmd failed
Jul 24 20:55:43 lbtmu103 last message repeated 58 times
Jul 24 20:55:45 lbtmu105 LBT_ECS: Thermal failed to connect to IP 10.144.0.205 port 50010
Jul 24 20:55:46 lbtmu105 LBT_ECS: Thermal not connected to ThermalBox, Send Cmd failed
Jul 24 20:55:58 lbtmu105 LBT_ECS: Thermal failed to connect to IP 10.144.0.205 port 50010
Jul 24 20:55:59 lbtmu105 LBT_ECS: Thermal not connected to ThermalBox, Send Cmd failed
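When a log repeats the same failures every few seconds, counting messages per host shows the pattern faster than scrolling. The Python sketch below assumes the traditional syslog line layout seen here (month, day, time, host, message); the helper name `count_messages` is hypothetical, and "last message repeated N times" lines are tallied separately because they do not say which message was repeated.

```python
import re
from collections import Counter

# Month, day, time, host, then the rest of the line as the message.
LINE_RE = re.compile(r"^(\w{3}) +(\d+) (\d{2}:\d{2}:\d{2}) (\S+) (.*)$")
REPEAT_RE = re.compile(r"^last message repeated (\d+) times$")


def count_messages(lines):
    """Return (count per (host, message), suppressed repeats per host)."""
    counts = Counter()
    repeats = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a syslog line in the expected layout
        host, message = match.group(4), match.group(5)
        repeated = REPEAT_RE.match(message)
        if repeated:
            repeats[host] += int(repeated.group(1))
        else:
            counts[(host, message)] += 1
    return counts, repeats


SAMPLE = """\
Jul 24 20:55:19 lbtmu105 LBT_ECS: Thermal failed to connect to IP 10.144.0.205 port 50010
Jul 24 20:55:20 lbtmu105 LBT_ECS: Thermal not connected to ThermalBox, Send Cmd failed
Jul 24 20:55:32 lbtmu105 LBT_ECS: Thermal failed to connect to IP 10.144.0.205 port 50010
Jul 24 20:55:33 lbtmu105 LBT_ECS: Thermal not connected to ThermalBox, Send Cmd failed
Jul 24 20:55:43 lbtmu103 last message repeated 58 times
"""

counts, repeats = count_messages(SAMPLE.splitlines())
for (host, message), n in counts.most_common():
    print(n, host, message)
print(dict(repeats))
```

In the excerpt, the connection failure and the failed send recur as a pair roughly every 13 seconds on lbtmu105.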