2/6/2013
1
These are slightly modified versions of slides prepared by Steve Gribble
Guessing Game
• C provides some basic facilities
• C libraries help make those primitive facilities useful
• For each library routine prototype that follows, guess how to use it:
• What are the arguments? What is the return value?
• How would an error be indicated?
• Where was memory created? How will it be freed?
These are slightly modified versions of slides prepared by Steve Gribble
void *memcpy(void *dest, const void *src, size_t n);
Copy Memory
2/6/2013
2
These are slightly modified versions of slides prepared by Steve Gribble
SYNOPSIS#include <string.h>
void *memcpy(void *dest, const void *src, size_t n);
DESCRIPTIONThe memcpy() function copies n bytes from memory area src to memory area dest. The memory areas must not overlap. Use memmove(3) if the memory areas do overlap.
RETURN VALUEThe memcpy() function returns a pointer to dest.
These are slightly modified versions of slides prepared by Steve Gribble
char *strcpy(char *dest, const char *src);
char *strncpy(char *dest, const char *src, size_t n);
Copy A String
2/6/2013
3
These are slightly modified versions of slides prepared by Steve Gribble
SYNOPSIS#include <string.h>char *strcpy(char *dest, const char *src);char *strncpy(char *dest, const char *src, size_t n);
DESCRIPTIONThe strcpy() function copies the string pointed to by src,including the terminating null byte ('\0'), to the buffer pointedto by dest. The strings may not overlap, and the destination string dest must be large enough to receive the copy.
The strncpy() function is similar, except that at most n bytesof src are copied.Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.
If the length of src is less than n, strncpy() pads the remainder of dest with null bytes.
These are slightly modified versions of slides prepared by Steve Gribble
char *strcat(char *dest, const char *src);
char *strncat(char *dest, const char *src, size_t n);
Concatenate Two Strings
2/6/2013
4
These are slightly modified versions of slides prepared by Steve Gribble
SYNOPSIS#include <string.h>char *strcat(char *dest, const char *src);char *strncat(char *dest, const char *src, size_t n);
DESCRIPTIONThe strcat() function appends the src string to the dest string, overwriting the terminating null byte ('\0') at the end of dest, and then adds a terminating null byte. The strings may notoverlap, and the dest string must have enough space forthe result.
The strncat() function is similar, except that* it will use at most n characters from src; and* src does not need to be null-terminated if it contains n
or more characters.As with strcat(), the resulting string in dest is always null-terminated.
These are slightly modified versions of slides prepared by Steve Gribble
char *strdup(const char *s);
char *strndup(const char *s, size_t n);
Duplicate a String
2/6/2013
5
These are slightly modified versions of slides prepared by Steve Gribble
SYNOPSIS#include <string.h>char *strdup(const char *s);char *strndup(const char *s, size_t n);
DESCRIPTIONThe strdup() function returns a pointer to a new string which is a duplicate of the string s. Memory for the new string is obtained with malloc(3), and can be freed with free(3).
The strndup() function is similar, but only copies at most n characters. If s is longer than n, only n characters are copied, and a terminating null byte ('\0') is added.
RETURN VALUEThe strdup() function returns a pointer to the duplicated string, or NULL if insufficient memory was available.
These are slightly modified versions of slides prepared by Steve Gribble
double strtod(const char *nptr, char **endptr);
ASCII String to Double
2/6/2013
6
These are slightly modified versions of slides prepared by Steve Gribble
RETURN VALUEThese functions return the converted value, if any.If endptr is not NULL, a pointer to the character after the last character used in the conversion is stored in the location referenced by endptr.
If no conversion is performed, zero is returned and the value of nptr is stored in the location referenced by endptr.
If the correct value would cause overflow, plus or minus HUGE_VAL (HUGE_VALF, HUGE_VALL) is returned (according to the sign of the value), and ERANGE is storedin errno. If the correct value would cause underflow, zero is returned and ERANGE is stored in errno.
These are slightly modified versions of slides prepared by Steve Gribble
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
Get a Line of Text
2/6/2013
7
These are slightly modified versions of slides prepared by Steve Gribble
DESCRIPTIONgetline() reads an entire line from stream, storing the address of the buffer containing the text into *lineptr. The buffer is null-terminated and includes the newline character, if one was found.
If *lineptr is NULL, then getline() will allocate a buffer for storing the line, which should be freed by the user program. (In this case, the value in *n is ignored.)
Alternatively, before calling getline(), *lineptr can contain a pointer to a malloc(3)-allocated buffer *n bytes in size. If the buffer is not large enough to hold the line, getline() resizes it with realloc(3), updating *lineptr and *n as necessary.
In either case, on a successful call, *lineptr and *n will be updated to reflect the buffer address and allocated size respectively.
These are slightly modified versions of slides prepared by Steve Gribble
RETURN VALUEOn success, getline() and getdelim() return the number of characters read, including the delimiter character, but not including the terminating null byte. This value can be used to handle embedded null bytes in the line read.
Both functions return -1 on failure to read a line (including end-of-file condition).
2/6/2013
8
These are slightly modified versions of slides prepared by Steve Gribble
char * readline (const char *prompt);
Read a Line of Text(from the terminal)
(with history)
These are slightly modified versions of slides prepared by Steve Gribble
DESCRIPTIONreadline will read a line from the terminal and return it, using prompt as a prompt. If prompt is NULL or the empty string, no prompt is issued. The line returned is allocated with malloc(3); the caller must free it when finished. The line returned has the final newline removed, so only the text of the line remains.
readline offers editing capabilities while the user is entering the line. By default, the line editing commands are similar to those of emacs. A vi-style line editing interface is also available.
This manual page describes only the most basic use of readline. Much more functionalty is available; see The GNU Readline Library and The GNU History Library for additional information.
RETURN VALUEreadline returns the text of the line read. A blank line returns the empty string. If EOF is encountered while reading a line, and the line is empty, NULL is returned. If an EOF is read with a non-empty line, it is treated as a newline.
2/6/2013
9
These are slightly modified versions of slides prepared by Steve Gribble
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
size_t fwrite(const void *ptr, size_t size, size_t nmemb,FILE *stream);
Read/Write a File
These are slightly modified versions of slides prepared by Steve Gribble
DESCRIPTIONThe function fread() reads nmemb elements of data, each size bytes long, from the stream pointed to by stream, storing them at the location given by ptr.
The function fwrite() writes nmemb elements of data, each size bytes long, to the stream pointed to by stream, obtaining them from the location given by ptr.
RETURN VALUEfread() and fwrite() return the number of items successfully read or written (i.e., not the number of characters). If an error occurs, or the end-of-file is reached, the return value is a short item count (or zero).
fread() does not distinguish between end-of-file and error, and callers must use feof(3) and ferror(3) to determine which occurred.
2/6/2013
10
These are slightly modified versions of slides prepared by Steve Gribble
ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);
Read/Write a File
These are slightly modified versions of slides prepared by Steve Gribble
RETURN VALUE (read)On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or becauseread() was interrupted by a signal. On error, -1 is returned,and errno is set appropriately. In this case it is leftunspecified whether the file position (if any) changes.
RETURN VALUE (write)On success, the number of bytes written is returned (zero indicates nothing was written). On error, -1 is returned, and errno is set appropriately.
2/6/2013
11
These are slightly modified versions of slides prepared by Steve Gribble
Using C streams#include <stdio.h>#include <stdlib.h>#include <errno.h>
#define READBUFSIZE 128int main(int argc, char **argv) {
FILE *f;char readbuf[READBUFSIZE];size_t readlen;
if (argc != 2) {fprintf(stderr, "usage: ./fread_example filename\n");return EXIT_FAILURE; // defined in stdlib.h
}
// Open, read, and print the filef = fopen(argv[1], "rb"); // "rb" --> read, binary modeif (f == NULL) {
fprintf(stderr, "%s -- ", argv[1]);perror("fopen failed -- ");return EXIT_FAILURE;
}
// Read from the file, write to stdout.while ((readlen = fread(readbuf, 1, READBUFSIZE, f)) > 0)
fwrite(readbuf, 1, readlen, stdout);fclose(f);return EXIT_SUCCESS; // defined in stdlib.h
}
stderr is a stream for printing error output to a console
fopen opens a stream to read or write a file
perror writes a string describing the last error to stderr
stdout is for printing non-error output to the console
printf(...) is equivalent to fprintf(stdout, ...)
fread_example.c
These are slightly modified versions of slides prepared by Steve Gribble
Fork exec#include <stdio.h>#include <stdlib.h>#include <unistd.h>#include <string.h>#include <errno.h>
#include <sys/types.h>#include <sys/wait.h>
int main(void){
pid_t pid;int status;
printf("parent : %d\n", getpid());
pid = fork();
if (pid == 0) {char* argv[] = {"/bin/bash", 0};printf("child : %d\n", getpid());execv(argv[0], argv);perror(“exec failed”);// should never get here
}
/* in parent */waitpid(pid, &status, 0);printf("Child exited with status %d\n", status);printf("Parent exiting\n");
}
two identical processes at this point, except that child has pid==0
overwrite my virtual address space with /bin/bash and start it up
2/6/2013
12
CSE 333Lecture 6 - system calls, intro to file I/O
These are slightly modified versions of slides prepared by Steve Gribble
Remember this picture?
hardware
operating systemHW/SW interface(x86 + devices)
CPU memory storage networkGPU clock audio radio peripherals
OS / app interface(system calls)
C standard library(glibc)
C application
C++ STL / boost / standard library
C++ application
JRE
Java application
brief diversion
2/6/2013
13
These are slightly modified versions of slides prepared by Steve Gribble
What’s an OS?
Software that:
1. directly interacts with the hardware
‣ OS is trusted to do so; user-level programs are not
‣ OS must be ported to new HW; user-level programs “are portable”
2. manages (allocates, schedules, protects) hardware resources
‣ decides which programs can access which files, memory locations, pixels on the screen, etc., and when
3. abstracts away messy hardware devices
‣ provides high-level, convenient, portable abstractions
• e.g., files vs. disk blocks
These are slightly modified versions of slides prepared by Steve Gribble
OS as an abstraction provider
The OS is the “layer below”
- a module that your program can call (via system calls)
- provides a powerful API (the OS API)
a process running your program
file
syst
em
netw
ork
stac
k
virt
ual m
emor
y
proc
ess
mgm
t
• •
• et
c. •
••
OSAPI
OS
file system- open( ), read( ), write( ), close( ), ...
network stack- connect( ), listen( ), read( ), write ( ), ...
virtual memory- brk( ), shm_open( ), ...
process management- fork( ), wait( ), nice( ), ...
2/6/2013
14
These are slightly modified versions of slides prepared by Steve Gribble
OS as a protection systemOS isolates processes from each other
- but permits controlled sharing between them
‣ through shared name spaces (e.g., FS names)
OS isolates itself from processes
- and therefore must prevent processes from accessing the hardware directly
OS is allowed to access the hardware
- user-level processes run with the CPU in unprivileged mode
- when the OS is running, the CPU is set to privileged mode
- user-level processes invoke a system call to safely enter the OS
proc
ess
A(u
ntru
sted
)
OS(trusted)
proc
ess
B(u
ntru
sted
)
proc
ess
C(u
ntru
sted
)
proc
ess
D(t
rust
ed)
HW (trusted)
These are slightly modified versions of slides prepared by Steve Gribble
OS as a protection system
proc
ess
A(u
ntru
sted
)
OS(trusted)
proc
ess
B(u
ntru
sted
)
proc
ess
C(u
ntru
sted
)
proc
ess
D(t
rust
ed)
HW (trusted)
a CPU (thread of execution) is running user-level code in process A; that CPU is set to unprivileged mode
2/6/2013
15
These are slightly modified versions of slides prepared by Steve Gribble
OS as a protection system
proc
ess
A(u
ntru
sted
)
OS(trusted)
proc
ess
B(u
ntru
sted
)
proc
ess
C(u
ntru
sted
)
proc
ess
D(t
rust
ed)
HW (trusted)
code in process A invokes a system call; the hardware
then sets the CPU to privileged mode and traps into
the OS, which invokes the appropriate system call
handler
syst
em c
all
These are slightly modified versions of slides prepared by Steve Gribble
OS as a protection system
proc
ess
A(u
ntru
sted
)
OS(trusted)
proc
ess
B(u
ntru
sted
)
proc
ess
C(u
ntru
sted
)
proc
ess
D(t
rust
ed)
HW (trusted)
because the CPU executing the thread that’s in the OS is in privileged mode, it is
able to use privileged instructions that interact directly with hardware
devices like disks
2/6/2013
16
These are slightly modified versions of slides prepared by Steve Gribble
OS as a protection system
proc
ess
A(u
ntru
sted
)
OS(trusted)
proc
ess
B(u
ntru
sted
)
proc
ess
C(u
ntru
sted
)
proc
ess
D(t
rust
ed)
HW (trusted)
once the OS has finished servicing the system call (which might involve long
waits as it interacts with HW) it:
(a) sets the CPU back to unprivileged mode, and
(b) returns out of the system call back to the user-level code in process A sy
stem
cal
l ret
urn
These are slightly modified versions of slides prepared by Steve Gribble
OS as a protection system
proc
ess
A(u
ntru
sted
)
OS(trusted)
proc
ess
B(u
ntru
sted
)
proc
ess
C(u
ntru
sted
)
proc
ess
D(t
rust
ed)
HW (trusted)
the process continues executing whatever code
that is next after the system call invocation
2/6/2013
17
These are slightly modified versions of slides prepared by Steve Gribble
Details on x86 / Linux
A more accurate picture:
- consider a typical Linux process
- its thread of execution can be several places
‣ in your program’s code
‣ in glibc, a shared library containing the C standard library, POSIX support, and more
‣ in the Linux architecture-independent code
‣ in Linux x86-32/x86-64 code
yourprogram
glibc
C standardlibrary POSIX
architecture-dependent code
architecture-independent code
Linux kernel
Linuxsystem calls
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
Some routines your program invokes may be entirely handled by glibc
- without involving the kernel
‣ e.g., strcmp( ) from stdio.h
- ∃ some initial overhead when invoking functions in dynamically linked libraries
- but, after symbols are resolved, invoking glibc routines is nearly as fast as a function call within your program itself
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
2/6/2013
18
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
Some routines may be handled by glibc, but they in turn invoke Linux system calls
- e.g., POSIX wrappers around Linux syscalls
‣ POSIX readdir( ) invokes the underlying Linux readdir( )
- e.g., C stdio functions that read and write from files
‣ fopen( ), fclose( ), fprintf( ) invoke underlying Linux open( ), read( ), write( ), close( ), etc.
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
Your program can choose to directly invoke Linux system calls as well
- nothing forces you to link with glibc and use it
- but, relying on directly invoked Linux system calls may make your program less portable across UNIX varieties
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
2/6/2013
19
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
Let’s walk through how a Linux system call actually works
- we’ll assume 32-bit x86 using the modern SYSENTER / SYSEXIT x86 instructions
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
Remember our process address space picture
- let’s add some details
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
CPU
2/6/2013
20
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
IP
SP
CPUunpriv
process is executing your program code
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
IP
SP
CPUunpriv
process calls into a glibc function (e.g., fopen)
‣ we’ll ignore the messy details of loading / linking shared libraries
2/6/2013
21
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.soIP
SP
CPUunpriv
glibc begins the process of invoking a Linux system call
‣ glibc’s fopen( ) likely invokes Linux’s open( ) system call
‣ puts the system call # and arguments into registers
‣ uses the call x86 instruction to call into the routine __kernel_vsyscall located in linux-gate.so
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.soIP
SP
CPUunpriv
linux-gate.so is a vdso
‣ a virtual dynamically linked shared object
‣ is a kernel-provided shared library, i.e., is not associated with a .so file, but rather is conjured up by the kernel and plunked into a process’s address space
‣ provides the intricate machine code needed to trigger a system call
2/6/2013
22
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
IPSP
CPUpriv
linux-gate.so eventually invokes the SYSENTER x86 instruction
‣ SYSENTER is x86’s “fast system call”instruction
‣ it has several side-effects
- causes the CPU to raise its privilege level
- traps into the Linux kernel by changing the SP, IP to a previously determined location
- changes some segmentation related registers (see cse451)
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
IPSP
CPUpriv
The kernel begins executing code at the SYSENTER entry point
‣ is in the architecture-dependent part of Linux
‣ it’s job is to:
- look up the system call number in a system call dispatch table
- call into the address stored in that table entry; this is Linux’s system call handler
- for open, the handler is named sys_open, and is system call #5
2/6/2013
23
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
IPSP
CPUpriv
The system call handler executes
‣ what it does is system-call specific, of course
‣ it may take a long time to execute, especially if it has to interact with hardware
- Linux may choose to context switch the CPU to a different runnable process
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
IPSP
CPUpriv
Eventually, the system call handler finishes
‣ returns back to the system call entry point
- places the system call’s return value in the appropriate register
- calls SYSEXIT to return to the user-level code
2/6/2013
24
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
IP
SP
CPUunpriv
SYSEXIT transitions the processor back to user-mode code
‣ has several side-effects
- restores the IP, SP to user-land values
- sets the CPU back to unprivileged mode
- changes some segmentation related registers (see cse451)
‣ returns execution back to glibc
These are slightly modified versions of slides prepared by Steve Gribble
yourprogram
glibc
Details on x86 / Linux
architecture-dependent code
architecture-independent code
Linux kernel
C standardlibrary POSIX
0x00000000
0xFFFFFFFF
Linuxkernel
stack
shared libraries
heap (malloc/free)
read/write segment.data, .bss
read-only segment.text, .rodata
kernel stack
linux-gate.so
IP
SP
CPUunpriv
glibc continues to execute
‣might execute more system calls
‣ eventually returns back to your program code
2/6/2013
25
These are slightly modified versions of slides prepared by Steve Gribble
If you’re curiousDownload the Linux kernel source code
- get version 2.6.34.8
- available from http://www.kernel.org/
Take a look at:
- arch/x86/kernel/syscall_table_32.S [system call table]
- arch/x86/kernel/entry_32.S [SYSENTER entry point and more]
- arch/x86/vdso/vdso32/sysenter.S [user-land vdso]
And: http://articles.manugarg.com/systemcallinlinux2_6.html
These are slightly modified versions of slides prepared by Steve Gribble
Also...
man, section 2: Linux system calls
- man 2 intro
- man 2 syscalls (or look online here)
man, section 3: glibc / libc library functions
- man 3 intro (or look online here)
2/6/2013
26
These are slightly modified versions of slides prepared by Steve Gribble
straceA useful Linux utility that shows the sequence of system calls that a process makes:
bash$ strace ls 2>&1 | less
[005c7424] execve("/bin/ls", ["ls"], [/* 47 vars */]) = 0[003caffd] brk(0) = 0x9376000[003cc3c3] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7800000[003cc2c1] access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)[003cc184] open("/etc/ld.so.cache", O_RDONLY) = 3[003cc14e] fstat64(3, {st_mode=S_IFREG|0644, st_size=92504, ...}) = 0[003cc3c3] mmap2(NULL, 92504, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb77e9000[003cc1bd] close(3) = 0[003cc184] open("/lib/libselinux.so.1", O_RDONLY) = 3[003cc204] read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\"..., 512) = 512[003cc14e] fstat64(3, {st_mode=S_IFREG|0755, st_size=122420, ...}) = 0[003cc3c3] mmap2(0x6d6000, 125948, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x6d6000[003cc3c3] mmap2(0x6f3000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c) = 0x6f3000[003cc1bd] close(3) = 0[003cc184] open("/lib/librt.so.1", O_RDONLY) = 3[003cc204] read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200X[\0004\0\0\0"..., 512) = 512... etc.
These are slightly modified versions of slides prepared by Steve Gribble
straceA useful Linux utility that shows the sequence of system calls that a process makes:
bash$ strace ls 2>&1 | less
...
[00110424] open(".", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3[00110424] fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)[00110424] getdents64(3, /* 6 entries */, 32768) = 184[00110424] getdents64(3, /* 0 entries */, 32768) = 0[00110424] close(3) = 0[00110424] fstat64(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0[00110424] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77ff000[00110424] write(1, "bomstrip.py\nmountlaptop.sh\nteste"..., 43
bomstrip.pymountlaptop.shtestertester.c) = 43[00110424] close(1) = 0[00110424] munmap(0xb77ff000, 4096) = 0[00110424] close(2) = 0[00110424] exit_group(0) = ?
2/6/2013
27
These are slightly modified versions of slides prepared by Steve Gribble
Let’s do some file I/O...
We’ll start by using C’s standard library
- these functions are implemented in glibc on Linux
- they are implemented using Linux system calls
C’s stdio defines the notion of a stream
- a stream is a way of reading or writing a sequence of characters from/to a device
‣ a stream can be either text or binary; Linux does not distinguish
‣ a stream is buffered by default; libc reads ahead of you, writes behind
‣ three streams are provided by default: stdin, stdout, stderr
‣ you can open additional streams to read/write to files
These are slightly modified versions of slides prepared by Steve Gribble
Using C streams#include <stdio.h>#include <stdlib.h>#include <errno.h>
#define READBUFSIZE 128int main(int argc, char **argv) {
FILE *f;char readbuf[READBUFSIZE];size_t readlen;
if (argc != 2) {fprintf(stderr, "usage: ./fread_example filename\n");return EXIT_FAILURE; // defined in stdlib.h
}
// Open, read, and print the filef = fopen(argv[1], "rb"); // "rb" --> read, binary modeif (f == NULL) {
fprintf(stderr, "%s -- ", argv[1]);perror("fopen failed -- ");return EXIT_FAILURE;
}
// Read from the file, write to stdout.while ((readlen = fread(readbuf, 1, READBUFSIZE, f)) > 0)
fwrite(readbuf, 1, readlen, stdout);fclose(f);return EXIT_SUCCESS; // defined in stdlib.h
}
stderr is a stream for printing error output to a console
fopen opens a stream to read or write a file
perror writes a string describing the last error to stderr
stdout is for printing non-error output to the console
printf(...) is equivalent to fprintf(stdout, ...)
fread_example.c
2/6/2013
28
These are slightly modified versions of slides prepared by Steve Gribble
Writing is easy too
see cp_example.c
These are slightly modified versions of slides prepared by Steve Gribble
A gotcha
By default, stdio turns on buffering for streams
- data written by fwrite( ) is copied into a buffer allocated by stdio inside your process’s address space
- at some point, the buffer will be drained into the destination
‣ when you call fflush( ) on the stream
‣ when the buffer size is exceeded (often 1024 or 4096 bytes)
‣ for stdout to a console, when a newline is written (“line buffered”)
‣ when you call fclose( ) on the stream
‣ when your process exits gracefully (exit( ) or return from main( ))
2/6/2013
29
These are slightly modified versions of slides prepared by Steve Gribble
Why is this a gotcha?
What happens if...
- your computer loses power before the buffer is flushed?
- your program assumes data is written to a file, and it signals another program to read it?
What are the performance implications?
- data is copied into the stdio buffer
‣ consumes CPU cycles and memory bandwidth
‣ can potentially slow down high performance applications, like a web server or database (“zero copy”)
These are slightly modified versions of slides prepared by Steve Gribble
What to do about itTurn off buffering with setbuf( )
- this, too, may cause performance problems
‣ e.g., if your program does many small fwrite( )’s, each of which will now trigger a system call into the Linux kernel
Use a different set of system calls
- POSIX provides open( ), read( ), write( ), close( ), and others
- no buffering is done at the user level
but...what about the layers below?
‣ the OS caches disk reads and writes in the FS buffer cache
‣ disk controllers have caches too!
2/6/2013
30
These are slightly modified versions of slides prepared by Steve Gribble
Exercise 1
Write a program that:
- uses argc/argv to receive the name of a text file
- reads the contents of the file a line at a time
- parses each line, converting text into a uint32_t
- builds an array of the parsed uint32_t’s
- sorts the array
- prints the sorted array to stdout
‣ hints: use “man” to read about getline, sscanf, realloc, and qsort
bash$ cat in.txt
1213
3231
00000552bash$ ex1 in.txt
5
52
12133231
bash$
These are slightly modified versions of slides prepared by Steve Gribble
Exercise 2Write a program that:
- loops forever; in each loop, it:
‣ prompts the user to input a filename
‣ reads from stdin to receive a filename
‣ opens and reads the file, and prints its contents to stdout, in the format shown on the right
- hints:
‣ use “man” to read about fgets
‣ or if you’re more courageous, try “man 3 readline” to learn about libreadline.a, and google to learn how to link to it
0000000 50 4b 03 04 14 00 00 00 00 00 9c 45 26 3c f1 d50000010 68 95 25 1b 00 00 25 1b 00 00 0d 00 00 00 43 530000020 45 6c 6f 67 6f 2d 31 2e 70 6e 67 89 50 4e 47 0d0000030 0a 1a 0a 00 00 00 0d 49 48 44 52 00 00 00 91 000000040 00 00 91 08 06 00 00 00 c3 d8 5a 23 00 00 00 090000050 70 48 59 73 00 00 0b 13 00 00 0b 13 01 00 9a 9c0000060 18 00 00 0a 4f 69 43 43 50 50 68 6f 74 6f 73 680000070 6f 70 20 49 43 43 20 70 72 6f 66 69 6c 65 00 000000080 78 da 9d 53 67 54 53 e9 16 3d f7 de f4 42 4b 880000090 80 94 4b 6f 52 15 08 20 52 42 8b 80 14 91 26 2a00000a0 21 09 10 4a 88 21 a1 d9 15 51 c1 11 45 45 04 1b00000b0 c8 a0 88 03 8e 8e 80 8c 15 51 2c 0c 8a 0a d8 0700000c0 e4 21 a2 8e 83 a3 88 8a ca fb e1 7b a3 6b d6 bc...etc.
2/6/2013
31
These are slightly modified versions of slides prepared by Steve Gribble
See you on Wednesday!
Top Related