Software Carpentry

Post on 08-Nov-2015

A Book About Software Carpentry

  • Sunil Mohan Adapa, sunil at medhas dot org

    Software Carpentry

    Some content derived from Software Carpentry Lecture Material

    This work and the original are under the Creative Commons Attribution 3.0 License

  • About the Tutorial
    Introductory
    Hands-on
    Interactive
    Software Carpentry for academics and researchers in any discipline
      Makes software work easier
      Enables new kinds of work
      Gets work done faster

  • Summary
    The Unix Shell
    Regular Expressions
    Make
    Version Control
    Python

  • Unix Shell

  • About Shell
    Why use the command line when we have a GUI?
    Typical shell: bash
    Terminal programs: gnome-terminal, Konsole, xterm, PuTTY

  • Example Use Cases
    Mr. A wishes to retrieve all files modified last week and replace the phrase "this week" with "next week" in those files.
    Every day, Mr. B wishes to automatically retrieve all files modified on that day and back them up to a different location.
    Mr. C likes to rename files so that their extensions are removed.
    Mr. D likes to merge five sets of user lists into a single one.

  • File System
    ls lists the files in the directory
    ls -l lists files with extra information
    pwd shows the current directory
    cd with a directory name switches to that directory
    cd with no argument switches to the home directory
    / is the topmost directory. It is also the path separator
    . is the current directory
    .. is the parent directory. /home/user/work/.. is the same as /home/user
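The path rules on this slide can be checked from Python. The following is a small sketch in Python 3 (the slides themselves use Python 2). It uses the standard posixpath module so that Unix rules apply on any platform, and the example paths are only illustrations.

```python
import posixpath

# ".." steps up to the parent directory.
path = "/home/user/work/.."
normalized = posixpath.normpath(path)
print(normalized)  # /home/user

# "." components are dropped and repeated separators are collapsed.
cleaned = posixpath.normpath("/home/./user//work")
print(cleaned)  # /home/user/work
```

normpath works on the text of the path alone, so its answer can differ from the real file system when symbolic links are involved.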

  • File System Structure
    /root and /home store user data
    /bin, /usr/bin, /sbin and /usr/sbin store executable commands
    /usr stores files related to user applications
    /usr/local contains applications compiled by the user
    /var contains (variable) files that usually grow over time
    /lib contains libraries
    /tmp contains temporary files
    /proc is a virtual file system containing kernel information
    /mnt and /media contain file system mounts

  • Manipulating Files
    cp copies one file to another file or directory
    mv renames a file or moves it to another directory, overwriting any existing file at the destination
    rm deletes files
    rm -rf deletes files and directories recursively, without prompting
    mkdir creates a directory
    rmdir removes an empty directory

  • File Permissions
    ls -l shows ownership and permissions of a file
    chmod changes the permissions
    chown changes the ownership
    su switches the current user by launching a new shell

  • Redirection
    ls > out stores the output of ls in the file out
    cat concatenates files and the input given to it
    cat < out reads the contents of the file out and provides them as input to cat
    sort < out > sorted sorts the contents of the file out and stores the result in the file sorted
    | (a pipe) redirects the output of one command to another: ls | sort > sorted
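To make the three stages of ls | sort > sorted explicit, here is a rough Python 3 analogue. It is a sketch rather than a replacement for the shell: the function name and the sample file names are made up for illustration, and Python orders names by character code while sort follows the current locale.

```python
import os
import tempfile

def list_sorted_to_file(directory, out_path):
    """Rough analogue of `ls | sort > sorted` for one directory."""
    names = os.listdir(directory)                         # stage 1: ls
    names = [n for n in names if not n.startswith(".")]   # ls hides dotfiles
    ordered = sorted(names)                               # stage 2: sort
    with open(out_path, "w") as out:                      # stage 3: > sorted
        for name in ordered:
            out.write(name + "\n")
    return ordered

with tempfile.TemporaryDirectory() as workdir:
    data_dir = os.path.join(workdir, "data")
    os.mkdir(data_dir)
    for name in ("pear.txt", "apple.txt", "mango.txt"):
        open(os.path.join(data_dir, name), "w").close()

    out_path = os.path.join(workdir, "sorted")
    result = list_sorted_to_file(data_dir, out_path)
    with open(out_path) as handle:
        written = handle.read()

print(result)
```

Each stage only sees the output of the previous one, which is the idea behind a pipe.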

  • Some More Commands
    du finds the space occupied by a file on disk
    less and more give paginated display
    find recursively finds files matching complex criteria
    xargs converts input into arguments
    grep matches a pattern/regular expression in a file
    head and tail show part of a file
    sort sorts the data in a file
    uniq finds unique items after sorting
    wc counts the number of characters, words and lines in a file
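For readers who want to see what wc and sort | uniq compute, the sketch below does the same counting in Python 3 on a made-up piece of text. It only illustrates the idea and does not reproduce every option of the real commands.

```python
from collections import Counter

text = "pear\napple\npear\nmango\napple\npear\n"

# wc: count lines, words and characters.
line_count = text.count("\n")
word_count = len(text.split())
char_count = len(text)
print(line_count, word_count, char_count)  # 6 6 33

# sort | uniq -c: count how often each distinct line occurs.
counts = Counter(text.splitlines())
for line in sorted(counts):
    print(counts[line], line)
```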

  • Jobs
    Control-C terminates a program
    Backgrounding a program
      Control-Z and bg
      & at the end of the command
    jobs lists current jobs
    fg foregrounds a program
    ps lists processes
    kill kills a process

  • References
    Bash Manual Page: man bash
    GNU/Linux Man Pages: man
    Learning the Shell:

  • Regular Expressions

  • What are Regular Expressions?
    A concise and flexible means for matching strings of text
    Similar in spirit to *.txt, which means all files with a .txt extension
    Parts of matches can be extracted
    Matched text can be replaced

  • Example Use Cases
    Mr. A has a list of 1000 phrases in a text file. He would like to add a full stop at the end of each line.
    Mr. B has a list of percentages of various categories in Wikipedia and their growth in X (Y) format. He would like to convert it to X/Y format.
    Mr. C would like to find all words in a file containing 3 to 5 letters.

  • Example Use Cases (contd.)
    Mr. D would like to list all hexadecimal numbers in a file.
    Mr. E would like to convert all American formatted dates in a file to ISO date format.
    Mr. F would like to retrieve all the sentences starting with 'Which' from a file.
    Mr. G would like to retrieve all words in a document containing two Hindi consonants joined by a halant.

  • Where are RegExps Used?
    Editors: Vim, Emacs, Eclipse, Notepad++ etc.
    Programming Languages
      Inbuilt: Perl, Ruby, JavaScript etc.
      As library: C, C++, Java, PHP, Python etc.
    Unix command line: rename, grep, sed, perl etc.
    Lot more:
      Configuring Apache Web Server
      Syntax highlighting in editors
      Even Google Search (well... not really. Just code search)

  • Basics
    A normal alphanumeric character in a regex matches that character in the target string
      hello matches the text hello
    . matches any character
    * repeats the previous expression zero or more times


  • Example Applications
    Unix command line: grep
    Editor: Vim
    Programming: Perl

  • Metacharacters
    . matches any character
      a. matches as, ab etc.
    ^ matches the beginning of a line
    $ matches the end of a line
    | alternation
      H|h matches h or H
    () grouping
      H|hello matches H or hello
      (H|h)ello matches Hello or hello
    \ escapes any metacharacter
      Mr. matches Mr. and Mrs
      Mr\. matches Mr. and not Mrs
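The metacharacter rules above can be tried out with Python's re module. The slides use Perl syntax, and for these particular patterns Python 3 behaves the same way. The sample strings are invented for the demonstration.

```python
import re

names = ["Mr. Smith", "Mrs Smith"]

# An unescaped "." matches any character, so "Mr." also matches "Mrs".
unescaped = [n for n in names if re.search(r"Mr.", n)]
# Escaping the dot makes it match only a literal full stop.
escaped = [n for n in names if re.search(r"Mr\.", n)]

words = ["H", "hello", "Hello"]
# Without grouping, H|hello means "H" or "hello".
loose = [w for w in words if re.fullmatch(r"H|hello", w)]
# With grouping, (H|h)ello means "Hello" or "hello".
grouped = [w for w in words if re.fullmatch(r"(H|h)ello", w)]

lines = ["hello world", "say hello"]
# ^ anchors the match to the beginning, $ to the end.
starts = [line for line in lines if re.search(r"^hello", line)]
ends = [line for line in lines if re.search(r"hello$", line)]

print(unescaped, escaped, loose, grouped, starts, ends)
```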

  • Character Classes
    [Hh] means (h|H)
    [0-9] means (0|1|2|3|4|5|6|7|8|9)
    [0-9a-z] means ([0-9]|[a-z])
    [^ab] matches any character except a and b
    \x{0915} matches the Devanagari character with code point U+0915 (क)
    \n matches a new line
    \r matches a return
    \t matches a tab

  • Character Classes (Perl)
    \w matches a word character
    \W matches a non-word character
    \s matches a whitespace character
    \S matches a non-whitespace character
    \d matches a digit
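A few character classes in action, again as a Python 3 sketch. Two differences from the Perl notation on the slides are assumptions of this example: Python writes the code point as \u0915 instead of \x{0915}, and the 0x prefix used for hexadecimal numbers is only one common convention.

```python
import re

# \d matches a digit, + repeats it.
digits = re.findall(r"\d+", "room 42, floor 7")

# A hand-written class for hexadecimal digits (Mr. D's use case).
hex_numbers = re.findall(r"0x[0-9A-Fa-f]+", "masks 0x1F and 0xbeef, not 0xZZ")

# A negated class: every character except a and b.
not_ab = re.findall(r"[^ab]", "cabbage")

# A single Devanagari letter, given by its code point.
devanagari = re.findall("\u0915", "\u0915\u092e\u0932")

# \w matches word characters, so \w+ picks out whole words.
words = re.findall(r"\w+", "tab\tseparated words")

print(digits, hex_numbers, not_ab, devanagari, words)
```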

  • Quantifiers
    * matches 0 or more times
    + matches 1 or more times
    ? matches 0 or 1 time
    {7} matches 7 times
    {5,} matches at least 5 times
    {2,5} matches at least 2 times but no more than 5 times
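As a sketch of how quantifiers combine with character classes, the Python 3 snippet below solves Mr. C's use case (words of 3 to 5 letters). The \b word-boundary anchor is not covered on the slides and is an addition of this example, as are the sample sentences.

```python
import re

text = "A regex can match short words and extraordinarily long ones"

# {3,5}: between 3 and 5 letters; \b keeps the match to whole words.
short_words = re.findall(r"\b[A-Za-z]{3,5}\b", text)

# ?: the "u" may appear zero times or once.
candidates = ["color", "colour", "colouur"]
optional_u = [w for w in candidates if re.fullmatch(r"colou?r", w)]

# {4}: exactly four digits.
years = re.findall(r"\b\d{4}\b", "years 1999 and 2015, not 123 or 12345")

print(short_words, optional_u, years)
```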

  • Greedy vs. Stingy
    In the text "XYZ" to "PQR"
      ".*" will match "XYZ" to "PQR"
      ".*?" will match "XYZ"
    ? applies to all other quantifiers also
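The difference is easy to see by running both patterns on the same text. This Python 3 sketch reuses the slide's example inside a made-up sentence.

```python
import re

text = 'rename "XYZ" to "PQR"'

# Greedy: .* grabs as much as possible, up to the last quote.
greedy = re.search(r'".*"', text).group()

# Stingy (lazy): .*? stops at the first closing quote.
stingy = re.search(r'".*?"', text).group()

# With the stingy form, each quoted item is found separately.
items = re.findall(r'".*?"', text)

print(greedy)  # "XYZ" to "PQR"
print(stingy)  # "XYZ"
print(items)
```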

  • Substitutions
    s/hello/Hello/ will substitute hello with Hello
    s/(H|h)ello/Hi/ will substitute Hello or hello with Hi
    () will extract a match
    \1, \2 etc. hold the value of the match
    s/([0-9])([0-9])/\2\1/ matches two digits and reverses them
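In Python 3 the counterpart of s/.../.../ is re.sub, which replaces every match unless told otherwise. The sketch below repeats the slide's examples and then applies the same idea to two of the earlier use cases. The date pattern assumes dates written as month/day/four-digit year, and the helper name us_date_to_iso is made up for this example.

```python
import re

greeting = re.sub(r"(H|h)ello", "Hi", "Hello there, hello again")

# \1 and \2 refer to the first and second group.
swapped = re.sub(r"([0-9])([0-9])", r"\2\1", "42 and 17")

# Mr. B: convert "X (Y)" to "X/Y".
ratios = re.sub(r"([\d.]+) \(([\d.]+)\)", r"\1/\2", "Science 12.5 (3.1), Arts 8 (1.2)")

def us_date_to_iso(match):
    """Turn month/day/year groups into year-month-day with zero padding."""
    month, day, year = match.groups()
    return "%s-%02d-%02d" % (year, int(month), int(day))

# Mr. E: convert American dates to ISO format.
dates = re.sub(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b", us_date_to_iso,
               "Posted on 11/8/2015 and revised on 1/15/2016.")

print(greeting)
print(swapped)
print(ratios)
print(dates)
```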

  • Modifiers
    i means case-insensitive match
      /Hello/i will match hello, Hello or HELLO
    g means global matching
    m means multi-line string
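Python 3 expresses the same modifiers as flags and arguments rather than trailing letters. The mapping shown here (re.IGNORECASE for i, re.MULTILINE for m, and the count argument of re.sub in place of g) is a sketch of the closest equivalents, not an exact translation of Perl's behaviour.

```python
import re

text = "Hello world\nhello again\nHELLO once more"

# i: ignore case.
case_sensitive = re.findall(r"hello", text)
case_insensitive = re.findall(r"hello", text, flags=re.IGNORECASE)

# m: let ^ match at the start of every line, not just the start of the string.
single_line = re.findall(r"^hello", text, flags=re.IGNORECASE)
multi_line = re.findall(r"^hello", text, flags=re.IGNORECASE | re.MULTILINE)

# g: re.sub is global by default; count=1 replaces only the first match.
first_only = re.sub(r"l", "L", "hello", count=1)
every_match = re.sub(r"l", "L", "hello")

print(case_sensitive, case_insensitive)
print(single_line, multi_line)
print(first_only, every_match)
```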

  • References
    Perl Regular Expressions: man perlre

  • Build Tools

  • Building a Project

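    [Figure: dependency graph of a project. The source files file1.c, file2.c, file3.c and file4.c are compiled to file1.o, file2.o, file3.o and file4.o; the graph also contains library2.so and main.c.]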



  • Make
    Needs a dependency graph
    Operates on files and time stamps
    Executes shell commands
    Other uses
      Any set of tasks with dependency graphs
      Automated testing
      Building documentation
      Even booting an operating system!
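The phrase "operates on files and time stamps" can be made concrete with a short Python 3 sketch. It shows only the core decision (rebuild when the target is missing or older than a dependency). The function name needs_rebuild is made up for this example, and real Make also walks the whole dependency graph and rebuilds dependencies first.

```python
import os
import tempfile

def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_time for dep in dependencies)

with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "hello.c")
    obj = os.path.join(workdir, "hello.o")

    open(source, "w").close()
    missing_target = needs_rebuild(obj, [source])   # hello.o does not exist yet

    open(obj, "w").close()
    os.utime(source, (1000, 1000))   # pretend the source is old
    os.utime(obj, (2000, 2000))      # and the object file was built later
    up_to_date = needs_rebuild(obj, [source])

    os.utime(source, (3000, 3000))   # "edit" the source after the build
    stale_target = needs_rebuild(obj, [source])

print(missing_target, up_to_date, stale_target)  # True False True
```

The time stamps are set explicitly with os.utime so that the outcome does not depend on how fast the files are created.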

  • Writing Makefiles
    hello: hello.o
        gcc hello.o -o hello

    hello.o: hello.c
        gcc hello.c -c -o hello.o

    clean:
        rm -f hello.o hello

    (In a real Makefile each command line must begin with a tab character.)

  • Using Make
    $ make
    $ make clean

  • Basics
    hello: hello.o
        gcc hello.o -o hello

    hello.o: hello.c
        gcc hello.c -c -o hello.o

    clean:
        rm -f hello.o hello




  • Bigger Project
    hello: main.o file1.o file2.o
        gcc main.o file1.o file2.o -o hello

    main.o: main.c file1.h file2.h
        gcc main.c -c -o main.o

    file1.o: file1.c file1.h
        gcc file1.c -c -o file1.o

    file2.o: file2.c file2.h
        gcc file2.c -c -o file2.o

    clean:
        rm -f hello main.o file1.o file2.o

  • Improving: Step 1
    hello: main.o file1.o file2.o
        gcc $^ -o $@

    main.o: file1.h file2.h
    main.o: main.c
        gcc $< -c -o $@

    file1.o: file1.h
    file1.o: file1.c
        gcc $< -c -o $@

    file2.o: file2.h
    file2.o: file2.c
        gcc $< -c -o $@

    clean:
        rm -f hello main.o file1.o file2.o

  • Improving: Step 2
    hello: main.o file1.o file2.o
        gcc $^ -o $@

    main.o: file1.h file2.h
    file1.o: file1.h
    file2.o: file2.h

    %.o: %.c
        gcc $< -c -o $@

    clean:
        rm -f hello main.o file1.o file2.o

  • Improving: Step 3
    TARGET = hello
    OBJECTS = main.o file1.o file2.o

    main.o: file1.h file2.h
    file1.o: file1.h
    file2.o: file2.h

    $(TARGET): $(OBJECTS)
        gcc $^ -o $@

    %.o: %.c
        gcc $< -c -o $@

    clean:
        rm -f $(TARGET) $(OBJECTS)

  • Improving: Step 4
    TARGET = hello
    OBJECTS = main.o file1.o file2.o

    main.o: file1.h file2.h
    file1.o: file1.h
    file2.o: file2.h

    $(TARGET): $(OBJECTS)
        gcc $^ -o $@

    $(OBJECTS): %.o: %.c
        gcc $< -c -o $@

    clean:
        rm -f $(TARGET) $(OBJECTS)

  • Phony Targets
    Try this:
      $ touch clean
      $ make clean
    What happened and why?
    Declaring a target as phony addresses the problem
      .PHONY: clean

  • Improving: Step 5
    TARGET = hello
    OBJECTS = main.o file1.o file2.o

    main.o: file1.h file2.h
    file1.o: file1.h
    file2.o: file2.h

    $(TARGET): $(OBJECTS)
        gcc $^ -o $@

    $(OBJECTS): %.o: %.c
        gcc $< -c -o $@

    .PHONY: clean

    clean:
        rm -f $(TARGET) $(OBJECTS)

  • Even Better Build System
    Autoconf
      Detect system environment and build accordingly
    M4
      Write macros for Autoconf
    Automake
      Automatically generate makefiles
    Libtool
      Automatically handle different library formats in different OSes

  • References
    GNU Make Manual: info make
    GNU Automake Manual: info automake
    GNU Autoconf Manual: info autoconf

  • Version Control

  • Why?
    Keep track of changes
    Release management
    Work as a group
    Identify regressions easily
    Maintain personal changes to code elsewhere

  • Revisions
    Initial Version
    Added feature 1
    Added feature 2
    Fixed bug 1
    Latest version

  • Release Management
    Initial Version
    Added feature 1
    Fixed bug 1
    Version 1.1
    Added feature 2
    Fixed bug 1
    Version 2.0

  • Work as a Group
    Initial Version
    Added feature 1
    A's Feature
    B's Feature
    Latest Version

  • Identify Regressions
    Bug free version
    Latest version contains a bug
    Bug introduced

  • Personal Changes
    Free Software Project on the Internet
    Version 2.0
    Version 3.0
    Version 4.0
    My research work
    Idea 1
    Idea 2
    Version 1.0
    Idea 3

  • Getting Started with Git
    Basic configuration:
      $ git config --global user.name "Your Name Comes Here"
      $ git config --global user.email you@yourdomain.example.com
    Creating a repository:
      $ git init
    Adding files to the repository:
      $ git add file1.c
    Committing the changes:
      $ git commit

  • Editing
    Edit your file
      $ nano file1.c
    Mark for commit
      $ git add file1.c
    Commit the changes
      $ git commit

  • Reviewing Changes
    Edit and review changes
      $ nano file1.c
      $ git diff
    Current status
      $ git status

  • Reviewing Changes (contd.)
    Changes between two revisions
      $ git diff r1..r2
    History of changes
      $ git log

  • Exchanging Patches
    The diff format
    Patch file
    Producing a patch file
      $ git diff r1..r2 > my_feature.patch
    Applying a patch
      $ patch -p1 < my_feature.patch
    Better ways
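To see what the diff format inside a patch file looks like, the Python 3 sketch below builds a small unified diff with the standard difflib module. The file contents are made up, and the a/ and b/ prefixes only imitate the way git labels the two sides. A real patch from git diff carries a few extra header lines.

```python
import difflib

old = ["int main() {\n", "    return 0;\n", "}\n"]
new = ["int main() {\n", '    puts("hello");\n', "    return 0;\n", "}\n"]

patch = list(difflib.unified_diff(old, new,
                                  fromfile="a/file1.c", tofile="b/file1.c"))
print("".join(patch), end="")
```

Lines starting with - belong only to the old version, lines starting with + only to the new one, and the @@ line records where in each file the change sits.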

  • Tagging
    What are tags?
    Creating a tag
      $ git tag VERSION_1
    Deleting a tag
      $ git tag -d VERSION_1
    Retrieving older versions
      $ git checkout VERSION_1

  • More Topics of Interest
    Branching and Merging
    Pushing and Pulling from repositories
    Rebasing
    Bisecting
    Stashing changes

  • Graphical Tools

  • References
    Git:
    Official Git Tutorial:
    ProGit Book:
    Git Manual Pages: man git

  • Python

    Content derived from Official Python Tutorial

  • Why Python?
    Easy for beginners
    Yet powerful
    Rapid development
    Scalable for large and complex projects
    Object oriented
    Cross platform
    Large set of libraries for performing various tasks

  • First Python Program
    $ python
    >>> 2 + 3
    5
    >>>

  • Hello, World!
    $ python
    >>> print "Hello, World!"
    Hello, World!
    >>>

  • Hello, World! in a File
    #!/usr/bin/python

    print "Hello, World!"

  • Python as Calculator
    >>> 2+2
    4
    >>> (50-5*6)/4
    5

  • Variables
    >>> a = 2
    >>> b = 3
    >>> print a * b
    6

  • Strings
    >>> hello = "Hello"
    >>> world = "World"
    >>> print hello
    Hello
    >>> print world
    World
    >>> print hello + world
    HelloWorld
    >>> print hello + ", " + world + "!"
    Hello, World!

  • Lists
    >>> a = ['spam', 'eggs', 100, 1234]
    >>> a
    ['spam', 'eggs', 100, 1234]
    >>> a[0]
    'spam'
    >>> a[3]
    1234
    >>> a[-2]
    100
    >>> a[1:-1]
    ['eggs', 100]
    >>> a[2] = a[2] + 23
    >>> a
    ['spam', 'eggs', 123, 1234]

  • More on Lists
    >>> a = [66.25, 333, 333, 1, 1234.5]
    >>> print a.count(333), a.count(66.25), a.count('x')
    2 1 0
    >>> a.insert(2, -1)
    >>> a.append(333)
    >>> a
    [66.25, 333, -1, 333, 1, 1234.5, 333]
    >>> a.index(333)
    1
    >>> a.remove(333)
    >>> a
    [66.25, -1, 333, 1, 1234.5, 333]
    >>> a.reverse()
    >>> a
    [333, 1234.5, 1, 333, -1, 66.25]
    >>> a.sort()
    >>> a
    [-1, 1, 66.25, 333, 333, 1234.5]

  • More on Lists
    >>> mat = [
    ...     [1, 2, 3],
    ...     [4, 5, 6],
    ...     [7, 8, 9],
    ... ]

  • Tuples
    >>> t = 12345, 54321, 'hello!'
    >>> t[0]
    12345
    >>> t
    (12345, 54321, 'hello!')
    >>> # Tuples may be nested:
    ... u = t, (1, 2, 3, 4, 5)
    >>> u
    ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))

    >>> t = 12345, 54321, 'hello!'
    >>> x, y, z = t

  • Dictionaries
    >>> tel = {'jack': 4098, 'sape': 4139}
    >>> tel['guido'] = 4127
    >>> tel
    {'sape': 4139, 'guido': 4127, 'jack': 4098}
    >>> tel['jack']
    4098
    >>> del tel['sape']
    >>> tel['irv'] = 4127
    >>> tel
    {'guido': 4127, 'irv': 4127, 'jack': 4098}
    >>> tel.keys()
    ['guido', 'irv', 'jack']
    >>> 'guido' in tel
    True

  • If .. else
    >>> x = int(raw_input("Please enter an integer: "))
    Please enter an integer: 42
    >>> if x < 0:
    ...     x = 0
    ...     print 'Negative changed to zero'
    ... elif x == 0:
    ...     print 'Zero'
    ... elif x == 1:
    ...     print 'Single'
    ... else:
    ...     print 'More'
    ...
    More

  • For
    >>> # Measure some strings:
    ... a = ['cat', 'window', 'defenestrate']
    >>> for x in a:
    ...     print x, len(x)
    ...
    cat 3
    window 6
    defenestrate 12

  • For
    >>> range(10)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> a = ['Mary', 'had', 'a', 'little', 'lamb']
    >>> for i in range(len(a)):
    ...     print i, a[i]

  • Break
    >>> for i in range(10):
    ...     if i > 5:
    ...         break
    ...     print i
    ...
    0
    1
    2
    3
    4
    5

  • Continue
    >>> for i in range(10):
    ...     if i == 5:
    ...         continue
    ...     print i
    ...
    0
    1
    2
    3
    4
    6
    7
    8
    9

  • Comments
    >>> # This is a single-line comment
    >>> """ This is a
    ... multiline
    ... comment"""

  • Functions
    >>> def fib(n):    # print Fibonacci series
    ...     """Print a Fibonacci series up to n."""
    ...     a, b = 0, 1
    ...     while a < n:
    ...         print a,
    ...         a, b = b, a+b
    ...
    >>> # Now call the function we just defined:
    ... fib(1000)
    0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

  • References
    Python Programming Language Official Website:
    The Python Tutorial:
    The Python Standard Library:
    The Python Language Reference:

  • Feedback & Further Assistance: sunil at medhas dot org

  • Thank you
