Some Useful Tools for Programmers – Valgrind, Gprof, GDB, Git Lecturer: Yu-Fan Tung, R02921044...

Post on 28-Dec-2015


Some Useful Tools for Programmers – Valgrind, Gprof, GDB, Git

Lecturer: Yu-Fan Tung, R02921044

Advisor: Tian-Li Yu

Outline

Intro

Valgrind

Gprof

GDB

Git

Other tips

Summary

Intro

Why should programmers use TOOLS?

Programming is not just writing and compiling. It is also about maintaining and debugging.

You can’t solve everything by cout/printf when the program gets large.

Efficiency: a good program is fast, not only to run but also to develop.

Valgrind /ˈvælgrɪnd/

Valgrind includes a number of debugging and profiling tools. The most popular of these is Memcheck.

Memcheck checks for memory problems in C/C++ programs in a very easy way.

Valgrind

Installation is kind of complicated…(For Gentoo)

echo 'FEATURES="$FEATURES splitdebug"' >> /etc/portage/make.conf

emerge -aq glibc

emerge -aq valgrind

Valgrind

When to use Valgrind?

1. When the following trash appears.

2. Anytime. It won’t hurt to use Valgrind. Maybe you’ll find some hidden bugs.

Valgrind

A sample program in C++

#include <iostream>
using namespace std;
void create(int* pr)
{
    pr=new int[10];
}
int sum(int* pr)
{
    int sum;
    for(int i = 1; i <=10; i++) sum+=pr[i];
    return sum;
}
int main(int argc, char** argv)
{
    int* pr=new int[10];
    create(pr);
    cout<<sum(pr)<<endl;
    return 0;
}

Valgrind

A sample program in C++

#include <iostream>
using namespace std;
void create(int* pr)
{
    pr=new int[10];
}
int sum(int* pr)
{
    int sum;
    for(int i = 0; i <12; i++) sum+=pr[i];
    return sum;
}
int main(int argc, char** argv)
{
    int* pr=new int[10];
    create(pr);
    cout<<sum(pr)<<endl;
    return 0;
}

Valgrind

Now we try to execute it.

./a.out 1 2

Output: 49

(What?! The program didn’t crash. And every time we execute it, the output is always a mysterious 49 on my computer.)

In this case we can easily tell that the result is wrong. But in a big program, a wrong result like this could easily fool us.

Valgrind

We try to find some bugs with Valgrind. First we have to prepare our program.

Compile with the flags -g and -O0, e.g.

g++ -g -O0 a.cc

If the program is too slow, we can replace -O0 with -O1, which makes the output slightly harder to read. -O2 applies too many optimizations and is not recommended for debugging.

Now we run

valgrind --leak-check=yes ./a.out 1 2

Valgrind

The output of Valgrind (there is a lot of it, so only part is shown)

Bug1, invalid read.

==17748== Invalid read of size 4

==17748== at 0x40087E: sum(int*) (a.cc:10)

==17748== by 0x4008CB: main (a.cc:17)

==17748== Address 0x59eb068 is 0 bytes after a block of size 40 alloc'd

==17748== at 0x4C2B897: operator new[](unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)

==17748== by 0x4008AF: main (a.cc:15)

A common off-by-one error.

Valgrind

Bug2, uninitialized variable.

==17748== Conditional jump or move depends on uninitialised value(s)

==17748== at 0x4EC0B44: std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, long) const (in /usr/lib64/gcc/x86_64-pc-linux-gnu/4.6.3/libstdc++.so.6.0.16)

==17748== by 0x4EC0E65: std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, long) const (in /usr/lib64/gcc/x86_64-pc-linux-gnu/4.6.3/libstdc++.so.6.0.16)

==17748== by 0x4ECBDDC: std::ostream& std::ostream::_M_insert<long>(long) (in /usr/lib64/gcc/x86_64-pc-linux-gnu/4.6.3/libstdc++.so.6.0.16)

==17748== by 0x4008D7: main (a.cc:17)

Well, the only things we can understand are that something is uninitialized, and that it is not far from a.cc:17.

Valgrind

Bug3, memory leak

==17748== 40 bytes in 1 blocks are definitely lost in loss record 1 of 2

==17748== at 0x4C2B897: operator new[](unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)

==17748== by 0x4008AF: main (a.cc:15)

==17748==

==17748== 40 bytes in 1 blocks are definitely lost in loss record 2 of 2

==17748== at 0x4C2B897: operator new[](unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)

==17748== by 0x400859: create(int*) (a.cc:5)

==17748== by 0x4008BF: main (a.cc:16)

A memory leak can cause serious problems. Losing a little memory will not affect your program, but if the program calls a leaking function many times, the lost memory adds up until something serious happens.

Valgrind

How serious a memory leak can be…

Consider a GA with a leaking fitness function.

Every evaluation leaks 1000 integers; with a 4-byte int (as in the 40-byte block for new int[10] above), that is 4 KB.

If the number of function evaluations (nfe) is about 100000, the total leak is 400 MB.

If we submit a few copies of the GA, it will be a tragedy.

Another example is a service in which a little memory leaks every time a user hits a certain button.

The service will perform well until it runs out of memory, and we would have no idea why it suddenly crashed.

Valgrind

There is a summary at the end of the output.

==17748== LEAK SUMMARY:

==17748== definitely lost: 80 bytes in 2 blocks

==17748== indirectly lost: 0 bytes in 0 blocks

==17748== possibly lost: 0 bytes in 0 blocks

==17748== still reachable: 0 bytes in 0 blocks

==17748== suppressed: 0 bytes in 0 blocks

Valgrind

Advantages of Valgrind:

An easy and fast way to check your program. No need to modify the code.

It works on huge programs. (The developers claim so.)

Disadvantages of Valgrind:

It sometimes causes false positives, but not often.

It sometimes causes false negatives, too, but still not often.

A good tool if your program involves some memory manipulation.

Using the data structures in the STL can prevent some memory leaks. For example, use vector instead of a C-style array.

GDB

The GNU Project debugger.

Allows you to pause, observe, or even play with your program at runtime.

A very powerful tool, but hard to learn.

These slides will only cover a bit of it.

It is preinstalled on almost all UN*X machines. In case you don’t have it, install it with (Gentoo):

emerge -aq gdb

GDB

Four main kinds of things GDB can do:

Start your program, specifying anything that might affect its behavior.

Make your program stop on specified conditions.

Examine what has happened, when your program has stopped.

Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.

GDB

The same bad code again.

#include <iostream>
using namespace std;
void create(int* pr)
{
    pr=new int[10];
}
int sum(int* pr)
{
    int sum;
    for(int i = 0; i <12; i++) sum+=pr[i];
    return sum;
}
int main(int argc, char** argv)
{
    int* pr=new int[10];
    create(pr);
    cout<<sum(pr)<<endl;
    return 0;
}

GDB

Compile your program with -g and -O0, e.g.

g++ -g -O0 …

Launch gdb with the executable.

gdb <yourprogram>

The arguments of your program are specified later, not now.

GDB

There will be a (gdb) prompt.

Run the program with the run command (abbreviated r). Pass the arguments for your program now.

r [para1] [para2] […]

For our bad code, the output is

49
[Inferior 1 (process 17870) exited normally]

Still the mysterious 49. We will find out why.

GDB

Breakpoints: make the program pause when some condition is met.

We think the function sum may go wrong so we type

(gdb) b sum

Breakpoint 1 at 0x400868: file a.cc, line 10.

where b stands for break.

We can also set a breakpoint at any line in a file.

(gdb) b a.cc:10

GDB

Run it again.

(gdb) r
Starting program: /home/yftung/test/a.out

Breakpoint 1, sum (pr=0x602010) at a.cc:10
10          for(int i = 0; i < 12; i++) sum+=pr[i];

The program pauses. Now we want to see where we are.

(gdb) l
5           pr=new int[10];
6       }
7       int sum(int* pr)
8       {
9           int sum;
10          for(int i = 0; i < 12; i++) sum+=pr[i];
11          return sum;
12      }
13      int main(int argc, char** argv)
14      {

GDB

Note that GDB shows the line that will be executed next, i.e. this line has not been executed yet. At this point, we expect the variable sum to be 0. We print it out.

(gdb) p sum

$3 = 0

Nothing looks wrong. (Even GDB can’t reveal the uninitialized variable here, because it happens to hold 0.)

GDB

But what about the mysterious 49? We execute this line and go to the next one.

(gdb) n
11          return sum;

Then, before exiting this function, we print sum.

(gdb) p sum
$2 = 49

Oh, there must be some mistake.

GDB

We inspect the array by printing it.

(gdb) p *pr@12
$4 = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 49, 0}

We print an array with *<arrayname>@<arraysize>

Now we know where 49 comes from: it sits at index 10, just past the end of the 10-element array. We try to overwrite it. We run the program again, and when it pauses at line 10,

(gdb) set *(pr+10)=0
(gdb) p *pr@12
$1 = {0 <repeats 12 times>}

And then continue.

(gdb) c
Continuing.
0
[Inferior 1 (process 773) exited normally]

GDB

There are many other useful commands in GDB, including the following.

continue: runs the program until the next breakpoint.

next: runs the program until the next line.

step: runs the program until the next line. If the current line contains a function call, it steps into that function and pauses.

where: shows where you are, including the call stack.

GDB

Advantage of GDB: very powerful. You can do many things in GDB.

Disadvantage of GDB: for a small program, using cout/printf instead of GDB is surely faster.

For those who are afraid of a text user interface, a GUI front end such as ddd may be a good choice.

Profiling

Find out the performance bottleneck of your program.

A statistical way to analyze the performance of a program.

To profile, compile your program with -pg and -O2.

g++ -pg -O2 …

Since you care about time consumption, you should profile an optimized program.

Profiling

Run the program once after compiling, with the arguments you want to profile. It should generate a gmon.out file.

./SGA 200 1000 2 1 0 100 -1 10

Then print out the result (flat profile).

$ gprof SGA gmon.out -p

Profiling

Advantage: Easy to use.

Disadvantage: gprof doesn’t have many features. And sometimes the result of profiling is frustrating…

In our lab, it seems that we care more about nfe or convergence time than about time consumption…

Git  /ɡɪt/

Distributed revision control and source code management system.

Why not simply back up our files using Dropbox?

Teamwork. Git allows you to contribute to a piece of software even if you know none of the main developers.

Git is very powerful, so there is a lot to learn beyond these slides.

Installation (Gentoo):

emerge -aq git

Git

First you have to configure Git.

$ git config --global user.email "you@yourdomain.com"

$ git config --global user.name "your_username"

You may also want to sign up for an Internet service such as GitHub or Bitbucket, but this is not necessary if you want to keep your code local and secret.

Git

If you have code and you want to push it to the Internet, first create a repository on the website of your Git service and get a URL, like ssh://git@bitbucket.org/Yftung/test.git

Then move to the directory that you want to put under Git.

cd /path/to/my/repo

git init

git remote add origin <your URL here>

git add .

git commit -m 'initial commit'

git push origin master

Git

If your code is already online and you want to start working on it, go to your service and get a URL like git@bitbucket.org:Yftung/test.git

Then move to the directory where you want to place the repository.

cd /path/to/my/repo

git clone git@bitbucket.org:Yftung/test.git

Git

Repository: A complex bucket to store files.

Commit: A snapshot of the program. When we have done something, we commit our program with some explanation, like

git commit -m 'memory leak fix at ga.cpp:131'

Each commit is identified by a SHA-1 checksum.

Branch: You want to modify the program and let others see your work, but you don’t want to ruin the original code. You may create a branch. If everyone, including you, thinks that you have added a good feature to the program, you can then merge it into the main branch. The name of the initial branch is master.

Git

Tracked files: If a file was included in a previous commit, it is tracked. Files that were not included in any previous commit are untracked. If you want Git to manage them, make sure to track them before committing.

Staged files: When a tracked file is modified, the modification is unstaged. We should stage it before committing, otherwise the change won’t be recorded.

To track/stage a file:

git add somefile.cpp

or if you want to simply stage everything:

git add .

To untrack a file (note that git rm also deletes it from the working directory; git rm --cached keeps the file on disk):

git rm objectfile.o

Git

To see the current status

git status

To create a local branch

git branch <branchName>

To create a local branch from existing branch br1

git branch <branchName> br1

To see the local branches

git branch

Git

To delete a local branch

git branch -d <branchName>

To create a remote branch

(you can create it when you push it. See push)

To see all branches including remote ones

git branch -a

Git

To change branch

git checkout <branchName>

To check out the last commit (local modifications will be lost)

git checkout HEAD .

To check out a certain commit

git checkout xxxx .

where xxxx is the name of the commit (the beginning of its checksum).

Git

To commit the staged changes

git commit -m 'commit message'

To assign a tag to a commit, where xxxx is the checksum

git tag <tagName> xxxx

A tag is an alias for a checksum. Note: HEAD acts as a name for the last commit, HEAD^ for the commit before HEAD, and HEAD^^ for …

To see the commit log

git log

To see the difference between the current code and another branch

git diff <branchName>

To see the difference between two commits

git diff xxxx yyyy

Git

To reset to HEAD, discarding all changes

git reset --hard HEAD

To reset to HEAD but keep all changes (pretend to be at HEAD)

git reset --soft HEAD

To push all commits to remote branch

git push origin <remoteBranchName>

To pull all commits from remote branch

git pull origin <remoteBranchName>

Git

To merge with another branch

git merge <branchName>

This means all the modifications on that branch will be merged into the current branch.

After merging, we can delete the branch.

git branch -d <branchName>

Git

There are some files we don’t want to track, like .o files, .a files and .out files. We create a file named .gitignore in the directory to list them.

If you have already committed files you didn’t want to, you should first untrack them, then commit the rest.

Sample of .gitignore

# Comment
[Oo]bj
[Bb]in
[Tt]emp
*.[Cc]ache
*.o
*.log
*.a
*.out

Git

Git has too many features to mention here.

Version control is important, whether you develop on your own or in a team. So Git is important.

Other tips

Good compiler options will help A LOT.

Add this to your Makefile, if you know how to.

DBGFLAGS = -g -Wall -Wextra

or simply

g++ -g -Wall -Wextra …

Writing a Makefile is worth learning.

Other tips

Bash Shell: The Bourne Again Shell, a popular shell on most Linux distributions.

Learning the bash shell would solve a lot of problems. Besides, one probably needs a shell to submit a program to run on the cluster.

Summary

Warnings may seem annoying, but writing clean code that draws no warnings will do you good.

Engineers are not workers. We should use tools well to save our precious coding time.

There are many programming tools because there are many different needs.

You don’t have to try them all, but you should try some if you haven’t!

Thanks for your attention. Any questions or suggestions are welcome.