Unix Programming with Perl

Cybozu Labs, Inc.

Kazuho Oku

Writing correct code

tests aren’t enough
    tests don’t ensure that the code is correct

writing correct code requires…
    knowledge of perl and knowledge of the OS

the presentation covers the aspects of unix programming using perl, including
    errno
    fork and filehandles
    Unix signals

Oct 16 2010 Unix Programming with Perl 2

Errno


The right way to “create a dir if not exists”

Is this OK?

    if (! -d $dir) {
        mkdir $dir
            or die "failed to create dir:$dir:$!";
    }


The right way to “create a dir if not exists” (2)

No!

    if (! -d $dir) {
        # what if another process created a dir
        # while we are HERE?
        mkdir $dir
            or die "failed to create dir:$dir:$!";
    }



The right way is to check the cause of the error when mkdir fails



So, is this OK?

    if (mkdir $dir) {
        # ok, dir created
    } elsif ($! =~ /File exists/) {
        # ok, directory exists
    } else {
        die "failed to create dir:$dir:$!";
    }



No! The message stored in $! depends on OS and / or locale.




The right way is to use Errno.

    use Errno ();

    if (mkdir $dir) {
        # ok, created dir
    } elsif ($! == Errno::EEXIST) {
        # ok, already exists
    } else {
        die "failed to create dir:$dir:$!";
    }
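The pattern above can be wrapped in a small helper. This is a minimal sketch, not code from the slides: the name ensure_dir is hypothetical, and the extra -d check is an addition, made because EEXIST is also reported when the path exists as a regular file.

```perl
use strict;
use warnings;
use Errno ();
use File::Temp qw(tempdir);

# ensure_dir is a hypothetical helper name, not part of any module.
sub ensure_dir {
    my ($dir) = @_;
    return 1 if mkdir $dir;             # we created it
    if ($! == Errno::EEXIST) {
        return 1 if -d $dir;            # somebody else created it: fine
        die "path exists but is not a directory: $dir\n";
    }
    die "failed to create dir:$dir:$!";
}

my $base = tempdir(CLEANUP => 1);
ensure_dir("$base/work");   # creates the directory
ensure_dir("$base/work");   # already exists, still succeeds
```

Because the check happens after mkdir has failed, there is no window in which another process can change the outcome between the test and the call.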


$! and Errno

$! is a dualvar
    is a number in numeric context (ex. 17)
        equiv. to the errno global in C
    is a string in string context (ex. “File exists”)
        equiv. to strerror(errno) in C

the Errno module
    list of constants (numbers) that errno ($! in numeric context) may take
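The two faces of $! can be seen by forcing a known failure. This is a small sketch; the variable names are illustrative, and the actual number and message depend on the OS and locale.

```perl
use strict;
use warnings;
use Errno ();
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # the directory already exists...
my $ok  = mkdir $dir;               # ...so this mkdir fails
my $num = $! + 0;                   # numeric context: the errno value
my $str = "$!";                     # string context: the strerror(errno) text

printf "mkdir failed: errno=%d message=%s\n", $num, $str unless $ok;
print "errno is EEXIST\n" if $num == Errno::EEXIST;
```

Copy $! right after the failing call, as done here: any later system call may overwrite it.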


How to find the Errnos

perldoc -f mkdir doesn’t include a list of errnos it might return

see man 2 mkdir
    man mkdir will show the man page of the mkdir command
    specify section 2 for system calls
    specify section 3 for C library calls


How to find the Errnos

errnos on man include those defined by POSIX and OS-specific constants

the POSIX spec. can be found at opengroup.org, etc.
    http://www.opengroup.org/onlinepubs/000095399/


Fork and filehandles


Filehandles aren’t cloned by fork

fork clones the memory image
    uses CoW (Copy-on-Write) for optimization

fork does not clone file handles
    only increments the refcount to the file handle in the OS
    the file is left open until both the parent and the child close the file

seek position and lock states are shared between the processes

the same for TCP / UDP sockets
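The shared seek position can be observed directly. This is a minimal sketch assuming a POSIX system; it uses sysread and sysseek rather than buffered reads so that Perl's own I/O buffering does not hide the effect.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);
use POSIX ();

my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "0123456789";
close $tmp or die "close failed: $!";

open(my $fh, '<', $path) or die "open failed: $!";   # opened BEFORE fork

my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # child: an unbuffered read moves the seek position kept by the OS
    sysread($fh, my $buf, 4);
    POSIX::_exit(0);    # leave without running END blocks or destructors
}
waitpid($pid, 0);

# parent: it never read from $fh, yet the position has moved
my $pos = sysseek($fh, 0, SEEK_CUR);
print "parent sees position $pos\n";
```

The parent reports the position left behind by the child (4 here), because both handles refer to the same file control info in the OS.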

File handles aren’t cloned by fork (2)

[Diagram: the Parent Process and the Child Process created by fork each have their own memory, but share a single File Control Info entry (seek pos. / lock state, etc.) in the Operating System, created by one open(file). Another Process that calls open(file) on the same File gets its own File Control Info entry. The File in the File System holds the lock owner, etc.]

Examples of resource collisions due to fork

FAQ
    “The SQLite database becomes corrupt”
    “MySQL reports malformed packet”

mostly due to sharing a single DBI connection created before calling fork
    SQLite uses file locks for access control
        file lock needed for each process, however after fork the lock is shared between the processes
    in the case of MySQL a single TCP (or unix) connection is shared between the processes
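The shared lock can be shown with plain flock, without a database. This is a sketch under the assumption that Perl's flock maps to flock(2), as on Linux; a build that emulates flock with fcntl locks may behave differently. The child reports its findings through its exit status.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use POSIX ();

my ($fh, $path) = tempfile(UNLINK => 1);
flock($fh, LOCK_EX) or die "flock failed: $!";    # parent locks BEFORE fork

my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # inherited handle: shares the parent's lock, so "locking again" succeeds
    my $got_inherited = flock($fh, LOCK_EX | LOCK_NB) ? 1 : 0;

    # freshly opened handle: a separate open file, so the lock is honored
    open(my $own, '<', $path) or POSIX::_exit(255);
    my $got_fresh = flock($own, LOCK_EX | LOCK_NB) ? 1 : 0;

    POSIX::_exit($got_inherited * 2 + $got_fresh);
}
waitpid($pid, 0);
my $status        = $? >> 8;
my $got_inherited = ($status & 2) ? 1 : 0;
my $got_fresh     = ($status & 1) ? 1 : 0;
print "inherited handle got the lock: $got_inherited\n";
print "fresh handle got the lock:     $got_fresh\n";
```

The inherited handle gives no mutual exclusion between parent and child, which is the situation a forked SQLite connection ends up in.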


Examples of resource collisions due to fork (2)

The wrong code…

    my $dbh = DBI->connect(...);

    my $pid = fork;
    if ($pid == 0) {
        # child process
        $dbh->do(...);
    } else {
        # parent process
        $dbh->do(...);
    }


How to avoid resource collisions after fork

close the file handle in the child process (or in the parent) right after fork
    only the refcount will be decremented. lock states / seek positions do not change
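A pipe makes the refcount behavior visible: the reader only sees EOF once every copy of the write end has been closed. A minimal sketch, in which each process closes the end it does not need right after fork:

```perl
use strict;
use warnings;
use POSIX ();

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # child only reads: drop its reference to the write end right away,
    # otherwise its own copy would keep the pipe open and EOF never arrives
    close $writer;
    my $data = do { local $/; <$reader> };    # returns at EOF
    POSIX::_exit(defined($data) && $data eq "hello\n" ? 0 : 1);
}

# parent only writes: drop its reference to the read end
close $reader;
print {$writer} "hello\n";
close $writer or die "close failed: $!";      # last reference gone: child sees EOF
waitpid($pid, 0);
my $child_status = $? >> 8;
print "child exited with $child_status\n";
```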


How to avoid collisions after fork (DBI)

undef $dbh in the child process doesn’t work
    since the child process will run things such as unlocking and / or rollbacks on the shared DBI connection

the connection needs to be closed, without running such operations


How to avoid collisions after fork (DBI) (2)

the answer: use InactiveDestroy

    my $pid = fork;
    if ($pid == 0) {
        # child process
        $dbh->{InactiveDestroy} = 1;
        undef $dbh;
        ...
    }



if fork is called deep inside a module and can’t be modified, then…

    # thanks to tokuhirom, kazeburo
    BEGIN {
        no strict qw(refs);
        no warnings qw(redefine);
        *CORE::GLOBAL::fork = sub {
            my $pid = CORE::fork;
            # $pid is undef when fork fails; only clean up in the child
            if (defined $pid && $pid == 0) {
                # do the cleanup for child process
                $dbh->{InactiveDestroy} = 1;
                undef $dbh;
            }
            $pid;
        };
    }



other ways to change the behavior of fork
    POSIX::AtFork (gfx)
        Perl wrapper for pthread_atfork
        can change the behavior of fork(2) called within XS
    forks.pm (rybskej)


Close filehandles before calling exec

file handles (file descriptors) are passed to the new process created by exec
    some tools (setlock of daemontools, Server::Starter) rely on the feature

OTOH, it is a good practice to close the file handles that needn’t be passed to the exec’ed process, to keep the child process from accidentally using them


Close file handles before calling exec (2)

    my $pid = fork;
    if ($pid == 0) {
        # child process, close filehandles
        $dbh->{InactiveDestroy} = 1;
        undef $dbh;
        exec ...;
    ...


Close file handles before calling exec (3)Close file handles before calling exec (3)

Some OS’es have O_CLOEXEC flagdesignates the file descriptors to be closed

when exec(2) is being calledis OS-dependent

linux supports the flag, OSX doesn’t

not usable from perl?
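Independently of O_CLOEXEC at open time, the per-descriptor close-on-exec flag can be read and changed from Perl through fcntl and the Fcntl constants F_GETFD, F_SETFD and FD_CLOEXEC. The sketch below uses hypothetical helper names (set_cloexec, is_cloexec). Note also that perlvar describes $^F: descriptors numbered above it get close-on-exec set by perl when they are opened, so it is worth checking the flag rather than assuming a handle will survive exec.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
use File::Temp qw(tempfile);

# set_cloexec / is_cloexec are hypothetical helper names.
sub is_cloexec {
    my ($fh) = @_;
    my $flags = fcntl($fh, F_GETFD, 0);
    defined $flags or die "fcntl(F_GETFD) failed: $!";
    return (($flags + 0) & FD_CLOEXEC) ? 1 : 0;   # "0 but true" means no flags
}

sub set_cloexec {
    my ($fh, $on) = @_;
    my $flags = fcntl($fh, F_GETFD, 0);
    defined $flags or die "fcntl(F_GETFD) failed: $!";
    $flags += 0;
    $flags = $on ? ($flags | FD_CLOEXEC) : ($flags & ~FD_CLOEXEC);
    fcntl($fh, F_SETFD, $flags) or die "fcntl(F_SETFD) failed: $!";
    return;
}

my ($fh, $path) = tempfile(UNLINK => 1);

set_cloexec($fh, 1);    # will be closed automatically on exec
my $after_set = is_cloexec($fh);

set_cloexec($fh, 0);    # will be passed on to the exec'ed program
my $after_clear = is_cloexec($fh);

print "after set: $after_set, after clear: $after_clear\n";
```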


Unix Signals


SIGPIPE

“my network application suddenly dies without saying anything”

often due to not catching SIGPIPE
    a signal sent when failing to write to a filehandle
        ex. when the socket is closed by peer
    the default behavior is to kill the process

solution: $SIG{PIPE} = 'IGNORE';
    downside: you should consult the return value of print, etc. to check if the writes succeeded
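With SIGPIPE ignored, a failed write shows up as an error return with $! set to EPIPE. A minimal sketch, using a pipe whose read end is closed to stand in for a socket closed by the peer:

```perl
use strict;
use warnings;
use Errno ();

$SIG{PIPE} = 'IGNORE';      # without this, the write below kills the process

pipe(my $reader, my $writer) or die "pipe failed: $!";
close $reader;              # simulate the peer going away

my $written = syswrite($writer, "hello");
my $broken  = !defined($written) && $! == Errno::EPIPE;
print "peer closed the connection\n" if $broken;
```

With buffered print the error may only surface when the buffer is flushed, so the return value of close needs checking as well.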

Using alarm

alarm can be used (together with $SIG{ALRM}, EINTR) to handle timeouts

    local $SIG{ALRM} = sub {};
    alarm($timeout);
    my $len = $sock->read(my $buf, $maxlen);
    if (! defined($len) && $! == Errno::EINTR) {
        warn 'timeout';
        return;
    }
    alarm(0);   # the read returned in time: cancel the pending alarm
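A self-contained variant of the timeout pattern, as a sketch. A pipe that never receives data stands in for the socket, and sysread is used instead of a buffered read because, depending on the perl version, buffered reads may be retried internally after EINTR (an assumption worth checking against your perl).

```perl
use strict;
use warnings;
use Errno ();

# nothing is ever written to this pipe, so the read below would block forever
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $timed_out = 0;
{
    local $SIG{ALRM} = sub { };    # a handler must exist, or SIGALRM kills the process
    alarm(1);
    my $len = sysread($reader, my $buf, 1024);
    my $err = $! + 0;              # save errno before anything else can reset it
    alarm(0);                      # cancel the timer in case the read returned first
    $timed_out = 1 if !defined($len) && $err == Errno::EINTR;
}
print "timeout\n" if $timed_out;
```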


Pros and cons of using alarm

+ can be used to timeout almost all system calls (that may block)

− the timeout set by alarm(2) is a process-wide global (and so is $SIG{ALRM})
    use of select (or IO::Select) is preferable for network access
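For comparison, a timeout implemented with IO::Select touches no process-wide state. A minimal sketch, again with a pipe standing in for a socket:

```perl
use strict;
use warnings;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe failed: $!";
my $sel = IO::Select->new($reader);

# nothing written yet: can_read returns an empty list once the timeout expires
my @ready_before = $sel->can_read(0.2);

defined(syswrite($writer, "ping")) or die "write failed: $!";

# data is pending: can_read returns the handle without waiting
my @ready_after = $sel->can_read(0.2);

my ($buf, $len);
$len = sysread($reader, $buf, 1024) if @ready_after;
print "timed out first, then read $len bytes\n" if !@ready_before && $len;
```

The timeout belongs to the single can_read call, so several modules in one process can each use their own without interfering.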


Writing cancellable code

typical use-case: run forever until receiving a signal, and gracefully shutdown
    ex. Gearman::Worker


Writing cancellable code (2)

make your module cancellable

    - my $len = $sock->read(my $buf, $maxlen);
    + my $len;
    + {
    +     $len = $sock->read(my $buf, $maxlen);
    +     if (! defined($len) && $! == Errno::EINTR) {
    +         return if $self->{cancel_requested};
    +         redo;
    +     }
    + }
    ...
    + sub request_cancel {
    +     my $self = shift;
    +     $self->{cancel_requested} = 1;
    + }


Writing cancellable code (3)

The code that cancels the operation on SIGTERM

    $SIG{TERM} = sub { $my_module->request_cancel };
    $my_module->run_forever();

Or the caller may use alarm to set timeout

    $SIG{ALRM} = sub { $my_module->request_cancel };
    alarm(100);
    $my_module->run_forever();
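Putting the pieces together, a sketch of a cancellable module and its caller. My::Worker, its constructor and the return values of run_forever are hypothetical names made up for this example; a pipe stands in for the socket, and sysread is used so that the interrupted read is visible as EINTR.

```perl
use strict;
use warnings;
use Errno ();

# My::Worker is a hypothetical module used only for illustration.
package My::Worker;

sub new {
    my ($class, $sock) = @_;
    return bless { sock => $sock, cancel_requested => 0 }, $class;
}

sub request_cancel {
    my $self = shift;
    $self->{cancel_requested} = 1;
}

sub run_forever {
    my $self = shift;
    while (1) {
        my $len = sysread($self->{sock}, my $buf, 1024);
        if (! defined $len) {
            if ($! == Errno::EINTR) {
                return 'cancelled' if $self->{cancel_requested};
                next;               # interrupted by an unrelated signal: retry
            }
            die "read failed: $!";
        }
        return 'eof' if $len == 0;
        # ... handle $buf here ...
    }
}

package main;

pipe(my $reader, my $writer) or die "pipe failed: $!";   # stands in for a socket
my $worker = My::Worker->new($reader);

local $SIG{ALRM} = sub { $worker->request_cancel };
alarm(1);
my $result = $worker->run_forever;    # blocks until the alarm interrupts the read
print "run_forever returned: $result\n";
```

The signal handler only sets a flag; the decision to stop is made in the module's own loop, at a point where it is safe to return.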


Proc::Wait3

built-in wait() and waitpid() do not return when receiving signals
    use Proc::Wait3 instead

