a) interactive shell - Yale University
Compilers, linkers and makefiles…. Oh, no…
Here are some quick pointers to producing executable code (and lots of other stuff you never thought
you needed to know).
First, very few of you will be programming directly in assembly code/machine language, so your human
readable program needs to get translated into something useable for the machine.
There are currently three main models/methods for doing this.
a) Interpreted languages and interactive shell (command) programs. Here the idea is that you are
having a conversation with the computer, which is constantly “interpreting” what you are saying as
you tell it what to do next. This is exactly what the “shell” program we’ve been talking about so
much recently does, and a “shell script” is nothing but a list of the commands you would normally
type in with your keyboard but that are stored in a file instead so you can send them (repeatedly) to
the “standard input” channel (“stdin” in C language notation) of the shell program. The
advantage/disadvantage of an interpreted language, like IDL and Matlab in interactive mode, is that
everything is done on the fly and all variables are dynamically allocated. This is actually a good and
a bad thing. Commands are interpreted line by line, and not all at once (as in the compiled execution
model below). So if you declare a variable “A,” the language first has to grab the memory from the
operating system, which can be very slow (considering the billions of operations you can now do in 1
sec). That’s bad, but the plus side is that you can change everything “on the fly” as you get new
ideas, and even though setting up the memory for a new variable is painfully slow by the computer’s
clock, it seems almost instantaneous on human timescales. So the feedback you get from the
program feels quick – if you change something in the program, just send the new program back to
the interpreter and see if it works now. The premium here (for an interpreted language) is on the
speed and flexibility of translating your wishes into computer actions. So an interpreted language is
ideal for tasks like interactive data analysis, where you have to rescale and replot things, and what
exactly you decide to do next will depend on what you see in the data.
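To make the shell-script idea concrete, here is a tiny, made-up example (the file name and commands are placeholders, not part of any assignment):

```shell
#!/bin/sh
# replay.sh -- the same commands you would otherwise type at the prompt,
# stored in a file so the shell can replay them on demand.
echo "starting analysis run"
date               # timestamp this run
ls | wc -l         # count the files in the current directory
```

Running “sh replay.sh” has the same effect as typing the three commands yourself, and you can rerun them as often as you like.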
b) Tokenized/semi-interpreted languages. The distinction between the first (interpreted) and the third
model (compilation) has vanished somewhat as compilers have gotten much faster. Back in the bad
“old” days, though, compiling a relatively simple program to produce machine code could take 2-3
minutes, even if you only changed one letter of the code. If you have to wait 2-3 minutes every time
you change your mind, this will clobber your efficiency and eventually drive you crazy. (Trust me.)
So in the old days, compiled languages were only used for serious, complicated and well-defined
tasks where computing efficiency was at a premium instead of responsiveness. If you’re wondering
what “tokenized” at the top means, it is an approach that is half-way between the
interpreted and compiled language models. (The “Basic” language started out as a tokenized
language.) It is designed for writing computer programs that are fairly sophisticated and intended to
be run multiple times, but in a way that avoids most of the pain of that expensive compilation
operation. Instead of looking over the entire program to try to find efficiencies, the tokenizer takes
your input line by line and immediately converts all the operations on that line to a set of “tokens,”
where a token is a numerical value that points to a standard (pre-compiled) subroutine that is to be
executed when that token is encountered. So a program is turned into a string of pre-compiled
tokens, and if you decide to change something on a given line, you re-interpret the new line and
update the associated token list, which is fast since tokenization is essentially a look-up
operation. (Again, all the machine instructions associated with a token are pre-compiled, before the
tokenizer program runs.) The “tokenized” version of your program is an intermediate level
translation of your human program, where each token represents a generalized operation. The cool
thing about this is that once you have a tokenized program, it really doesn’t matter what language
you used to generate those tokens. Also, to the extent you have access to a token “interpreter” for
every machine type of interest, you now have a “machine independent” language. This is how Java
achieves its “machine independence” yet still manages to produce fairly compact code files. [But
see below for how to improve Java performance on any given machine…]
c) Compilers – the “right” way to do things. In this model, you are forced to spell out everything that you
want the computer to do beforehand. You then feed it to a compiler, which translates it into an object
code (machine language) “stub.” Since the computer science compiler wizards can see everything
that you want to do up front and you are not allowed to change your mind, they can do a much
better job of figuring out how to “optimally” translate your ASCII text instructions into the
instructions the microprocessor actually understands. For example, if the compiler knows you will
be performing the same set of operations on a bunch of array elements, it can arrange the data and
the operations to be done in such a way that the multiple pipeline units on the CPU can be
continually fed and we don’t “bust” caches. Again, in case you just got confused, the difference
between the output of a tokenizer and a compiler is that the compiler doesn’t produce tokens but
rather real machine (assembly) language that can be interpreted directly by the CPU. Since we’re
explicitly dealing with the lowest-level details of the CPU, we can presumably do the best possible
job of mapping our human instructions to machine instructions, and a very big part of why we can
do things more efficiently is because we have the entire program to look at (so in principle we know
what the programmer is up to) and the programmer can’t change his mind mid-stream. This means we
can look for optimizations and, e.g., think about the best way to store or order the program’s data.
And more importantly, if we know we’re going to need 1 MB of array data, we can allocate it
statically, before the program runs so we don’t have to wait for the operating system to finally give
us 1MB at some random time. And we can also do things like initialize that array data and have it
ready to go before the program is run. One consequence of this is that a simple “hello world”
program that happens to declare a 1 MB array can be very small in size in an interpreted/tokenized
language (after all there are only a few lines of code) yet the compiled program that allocates the
1MB array and initializes it is guaranteed to have a size > 1MB. This is again good and bad. It’s bad
because it takes a lot more space to store the program code (and is why people developed “DLLs –
dynamically linked libraries” of code that are shared between programs and can be loaded into
memory at the last minute.) BUT it’s very fast and efficient. Yes, still we have to wait for the
operating system to free 1MB of memory, but (i) it’s done at the beginning and doesn’t cause
random program execution behavior once the program starts, and (ii) the array loading and
initialization is a simple memory copy, which is a very fast, common operation. And one final big
advantage of the compiled program: once you’ve gone through all that effort, you don’t have to do it
again. You simply copy your code into memory and let the CPU have at it. With an interpreted
program, in principle you have to reinterpret it from the beginning, set up all the variables, etc.
Bottom line: compiled programs can be 100x faster than interpreted programs. If you want speed,
you don’t use IDL…
d) Just-in-time compilers … a hybrid scheme. O.K. Let’s say you’ve worked hard on debugging your
tokenized program, and you swear you will never touch it again. Well, if you’re not going to alter
your program on the fly, and I know what operations your tokens stand for, I basically have all the
information I need to figure out what you really intend to do, just as in the case of an ASCII text file
that gets compiled. So in the case of Java, the web browser or operating system provides a “just-in-
time” (JIT) compiler. In the background, while you are online looking at YouTube, the JIT compiler
scans your downloaded Java applets and figures out an optimal way to convert the tokens into
object (directly executable) machine code. In this case, your Java applications will eventually start to
run at close to compiled speeds. The point here is that you are usually just a consumer of the Java
applications and thus don’t need the speed/flexibility in changing an application that is provided by
an interpreter/tokenizer.
e) O.K. I finally understand this ASCII to machine code translation business, but what does a linker do
and why do I need it? After all, didn’t you just tell me a compiler, in particular, produces machine
code that can be directly fed to the CPU? Yup, if your text program is complete, i.e., contains all the
function and subroutine definitions you might need, AND your operating system is very simple so that it
only runs one program at a time, i.e., you know exactly where in memory your program will live,
then indeed you don’t need a linker. An example of this is the early IBM computers that operated in
“batch” mode. The computer operator would literally feed the IBM punch card reader a “batch” of
cards (the complete text version of the program). These would get translated and loaded by the
assembler or primitive compiler into machine code that was directly copied into memory. When the
computer operator was satisfied everything was loaded properly and no punch cards had been
eaten or missed, he would press the “go” button and the CPU instruction pointer would get set to
the address of the first memory location, and that particular “batch job” would start executing.
Usually, it would produce a printout on a big line printer that the operator would then collect and put
together with the punch cards, which would then all be handed back to the computer user who had
submitted that particular “batch job,” so the user could figure out why his program had failed (or as
occasionally happened, actually worked!). The computer operator would then grab the next batch
of cards, feed it into the punch card reader, and the process would start all over. The reason I have
gone into such excruciating detail is that the “batch mode” of execution still exists today (e.g., on
Yale’s big clusters), and I wanted you to understand how a computer operates at the most basic
level. But it didn’t take long for things to change…
First of all, the human operator was replaced by an “autoloader” that continually fed punch cards
into the reader and hit the “go” button, etc. without human intervention. Next people decided that
it would be cool if the autoloader could respond differently to different situations, so that it became
a punch card program that had to be loaded into memory first, and depending on what exactly was
going on, it might take up different amounts of the machine memory. That complicated the story
because a user’s code now had to be “relocatable,” i.e., written in such a way that the autoloader
could quickly figure out to change it so it would still run if it were copied to a different chunk of
memory (with a different starting address). This led to the notion of “relative addressing” (where
the pointer to a particular memory location was computed by adding an offset to a base value
stored in a register on the CPU) and “entry points” or “code hooks” where the autoloader had to
insert specific values computed at run time (after it had decided where to copy the next program to
be executed). The need to be able to load a program into different memory and still have it run
became particularly necessary once people discovered/decided that a computer could appear to do
multiple things at once (“multi-tasking”) by loading several programs in memory at the same time,
and then “time slicing,” e.g., by executing the next 100 instructions of the first program, then
executing the next 100 instructions of the second program and so on. Programs came in different
sizes and lengths, so it was impossible to predict where any given program would end up in
memory. The output of the compiler therefore turned into an “executable image” that had to be
processed by the autoloader before it would do anything useful, and eventually the autoloader got
so complicated that it began to be called an “operating system.”
Just around the time computers started to multi-task, the first random access, non-punch card
storage appeared! One could store the content of thousands, even millions, of punch cards on a
magnetic drum and load any of them into system memory on demand. In particular, because one
could store the compiled, executable image of a program on the magnetic drum too, it suddenly
made sense to separate the process of compilation (translation to machine code) from that of
execution. You could compile your program and then wait weeks to execute your image, or you
could ship the compiled image to your friend via a magnetic tape, and he could copy it onto his
drum without ever having to run a compiler on the source (text) code. (And maybe you didn’t want
him or her to, because you didn’t trust them to keep your source code private.) And so the
compilation stage of the old autoloader turned into just another program (the “compiler”) that you
could call up on demand to produce executable images.
From this point, it was only a small step to the notion of a linker program. For you see, that compiler
program was often abominably slow. Back then, a thousand-line program could even take hours to
compile, and as I noted above, this made it extremely unwieldy to debug. So computer scientists
came up with the idea of breaking up the executable image into chunks. After all, we were already
familiar with the notion of breaking the text version of the program up into subroutines and
functions, so why not do the same for the executable image? Let’s say a program consisted of one
thousand lines distributed fairly evenly over ~10 subroutines. Now if I forget a semi-colon
somewhere and the machine code is broken up into chunks corresponding to the subroutines, I
would only have to recompile the subroutine with the mistake, and not the entire program, which
would be ~ten times faster. This breaking up of a program’s machine code is particularly essential
now that our codes are millions of lines long and can contain thousands of subroutines. Breaking up
the executable image into chunks corresponding to separate program functions (which we can then
mix and match as desired) is indeed a brilliant idea – but only as long as we can eventually put all
those pieces back together again!
There are typically two ways this is accomplished. The first is to let the autoloader (the operating
system) do it every time the program is run. This may sound like a lot of extra work, and it is. Why
not assemble the complete program once and for all, and then use this complete image from then
on? This is in fact exactly what a linker does, but…. In our example program with 10 subroutines,
the total program size is 10 times larger than the size of the individual chunks, which might be quite
large. Since storage space on magnetic drums is limited, is there any way to save some space? Not
really if all the subroutines/chunks are written by you and not in common with those used by other
programs on the machine. But what if your program is a “hello world”-type program:
main()
{ printf("hello world!\n");
}
How much storage space could that possibly take? Unfortunately, if you manage to produce the full
executable image that needs to be loaded into memory, you will be shocked to discover that it could
take several or tens of megabytes! How could this possibly be? Your program only prints out a few
characters!? But there’s a catch. How do you get those characters to appear on a screen? You’re
not actually doing this yourself, but you’re actually calling the “printf” subroutine to do it. And
remember that this printf subroutine has to know how to handle all kinds of special formats and
potentially send characters over the internet. It’s actually quite complicated, and there’s a
fair amount of behind-the-scenes stuff that happens for most programs. So your “trivial” program is
also quite complicated in its totality and takes up a lot of storage space.
Can we find any way to save space this time? Yes, because not just your program uses the printf
subroutine. In general, almost every program on your Mac or PC calls the same printf or windows
graphics routines, so in fact there is a massive duplication of information. So researchers came up
with the concept of “dynamic” (run time) linking. Instead of having to include the printf subroutine
in every compiled image, what if we stored a single copy of a compiled printf subroutine on the
magnetic drum, and only when we are ready to run our program, do we load this into memory too
and somehow join it to the rest of our program? Then suddenly the compiled version of our “hello
world” program needs only a few bytes of storage space since it just basically loads “hello world”
into memory and then immediately calls the printf routine. If we have 100 programs that use the
printf routine, and the compiled printf routine requires 1MB of storage, then we have just saved
ourselves 99 MB of storage! That doesn’t sound like much saved space by today’s standards, but it
was in the past, and this practice of “dynamic linking” for commonly used system subroutines still
continues today since the size of subroutines tends to bloat/grow with time as more features are
added. And actually, a modern operating system contains thousands of sub-programs that call the
same basic routines, so the saving can be quite significant. And… there is still one area of storage
where a few MBs are important. That is your RAM. Even if we do late-time linking of common
subroutines, in the end everything has to be copied and joined together into memory, so we won’t
see any space savings. Indeed, try opening a “trivial” terminal window on your computer and
depending on your operating system, you’ll see that 20 MB of space may vanish. A browser window
may take even more. Students these days tend to have windows open all over the place on their
screen, and the memory they consume quickly adds up. If you have 20 windows open, 400 MB of
your 2GB of main memory is then immediately gone. Some of that can be shuffled back to disk if
some windows are in the background and not being used, but in general you have lost a lot of space
to common subroutines, and continually loading in these subroutines also costs execution time. So
modern operating systems can “cache” these special, commonly used subroutine chunks by keeping
a copy of them in memory. This way if one of them needs to be copied and joined (linked) to some
particular program, it can be done quickly, via a RAM to RAM copy instead of having to pull the
information off the slow hard disk. Moreover, some subroutines are so critical and common that a
lot of work goes into making them re-entrant, that is, callable by multiple programs, without having
to create multiple copies of the subroutine code. Subroutines like this (e.g., the one that knows
how to store characters on a hard disk) are defined to be special and are often part of the so-called
system “kernel,” a collection of core “system calls” or system functions that enable your programs
to interact with the hardware and the operating system autoloader and program scheduler, i.e.,
make a modern multi-tasking computer “run.” For these subroutines, we never have to keep
multiple copies around, and we do save space.
So why did I go into this long explanation of something that most people never notice? Because it is
an example of the linking process we’ll discuss in more detail soon, AND because if you don’t know
this is going on and you download binary code from friends or the web, you can easily get into trouble. For
example, your friend is very proud he finally got his “hello world” program to work, so you ask him if
you can copy the compiled program image to your machine so you can play with it. The file transfer
goes very quickly because it is a simple program (right?). Then you try to run it. Rats. It fails. The
system complains about an “incompatible executable file type.” What’s the problem? Both of your
machines are running Intel processors, so the machine code should be identical? Yes, but you forgot
that your friend compiled his program on a Scientific Linux machine, and you are trying to run it on a
Mac. Guess what. Macs and Linux machines have slightly different conventions on how a program
image should be formatted and structured so that their autoloaders can figure out how to properly
install and relocate the full program into main system memory. Moreover, if any subroutines are
dynamically linked in at the last minute, there’s no guarantee that those subroutines exist or do the
same thing on both systems. So you are stuck, and in general, the same “binary” (compiled program
image files) cannot be used on different operating systems. Slightly frustrated, you log into a
departmental Linux machine, copy the program binary over to there, and try to run it again. Again
the program fails! This time you get the message “shared library libX11.so.2 cannot be loaded.” But
it’s the same operating system, you complain loudly. libX11 is a collection of X-windows routines,
the ones you need to draw graphics on your screen, so it’s kind of fundamental and surely must
exist on the system you just logged into. Yup, it does – sort of. When you go to /usr/lib (the
directory usually containing all the “shared” routines which will be dynamically linked in at run time)
you indeed find a libX11.so. But it is a symbolic link to a file libX11_v3.so.4 – a different,
incompatible file from the one your friend’s program needs. This is because the departmental
machine is running Scientific Linux 6.1, a new version, while your friend is still running the older
Scientific Linux 5.4 . So even if the two versions of the operating system are in principle highly
compatible, they are often not in practice because the versions of the dynamically linked
libraries/subroutines are different. Is there a quick way to fix this? If the operating system versions
are not so different, you can ask your friend for the libX11.so.2 from his computer so you can install
it on the newer Scientific Linux system. (Using the “ldconfig” program to tell the operating system where to
find this new dynamic/shared library – exactly how to do this is starting to enter the world of system
managers, so I won’t go into it more here.) Ta, da. When the autoloader scans the program image
file and discovers it needs to copy in a subroutine that is sitting in the file libX11.so.2, it now knows
where to find that file, and everything works! You need to know a bit about this dynamic linking
business, so you know what help to ask for when you start seeing error messages like this.
Summary: Modern “executable” compiled files are actually “stubs,” chunks of code that cannot be
run by themselves and need to be attached (linked) to other code and then initialized so that they
will run properly at a specific memory location in RAM. So, executable files in general cannot be
shared between different operating systems, even if they are running on identical hardware. If you
do want to share compiled code (e.g., because you don’t want to give your friend or customers
access to the source code), you have to be very clear about which subroutines need to be linked in
dynamically and then give some hints to your friend/customers on how to get those dynamically
linked routines onto their systems if they happen not to be installed. This dependence on “dynamically
linked” files can be very sneaky since using shared libraries is often the default behavior of compilers
(which in fact try to use as many shared routines as possible) and you will not be automatically told
that a particular set of shared files is implicitly being used.
Because dynamic linking can become a pain when sharing programs and you sometimes absolutely
want to have an executable file run correctly on another otherwise compatible system, you can
luckily turn off the dynamic linking mechanism (at the expense of increased file and memory space
usage) and force the compiler (translator program) to produce a program image that has no
“external dependencies,” i.e., force it to take whatever dynamically linked routines you might need and
explicitly copy them into your final machine code file so that you don’t need to depend on the
operating system to provide them to you. The technical jargon for this is that you “statically link”
your program, which is probably what you thought you would be doing all along when you started
reading this section. Note that in general, if you don’t set any special options the compiler produces
a hybrid machine code file, where all the non-generic subroutines that you wrote are included in the
code file but the system-shared subroutines are not (and are dynamically linked in).
So without further ado, on to static linking: how do you create code chunks and then put them all
together (before the autoloader gets its hands on your code)? The best way to see is to examine the
compiler output for our “hello world” program. Written in generic assembly language, it looks
something like this:
Set BASE IP XXXX
Set BASE DP YYYY
PUSH "Hello World!\n" onto stack.
JSR _printf [ZZZZ]
RET
The XXXX, YYYY, ZZZZ are blank memory addresses (“pointers” that have not been initialized yet).
XXXX and YYYY need to be set by operating system autoloader at run time so they point to the start
of the code and data memory segments that the operating system copied the program image to.
(Remember these aren’t known until run time and depend, for example, on what other programs
are already loaded into memory, and so where the free memory slots are.) The ZZZZ is the memory
address that we have to “JUMP” to so we can start executing the printf subroutine. Unfortunately,
our hello_world.c program (as typed above) contains no definition of the printf subroutine, so the
poor compiler is stuck. It could simply quit and complain with the error message, “Hey, I can’t
possibly produce a working piece of code since you didn’t tell me what the printf routine is. I quit.”
However, because we are trying to break up our code into smaller, functional chunks that live in
separate files, the compiler will give you the benefit of the doubt, and simply insert a blank “hook”
(the ZZZZ memory value) that you will need to update to point to a piece of real code if you ever
want your program to run.
To make the final working program, we then need to run what has come to be called the “linker
program” or the “link phase” of the compiler. Usually the compiler automatically does this for us,
unless we give it the “-c” option, or you manually do this by executing a command that looks
something like “ld hello_world.o printf.o”. (We can typically add an argument like -static to this
command line to force all the dynamic subroutines to be copied and “linked in” too.) What
does the “ld” program do? It decides on the final memory configuration of the image, e.g., where
the hello_world and printf code chunks sit in memory, and then it scans all the code chunks for
“unresolved external references” – the ZZZZ memory address/hook found in the hello_world.o code
file – and since it now knows where all the subroutines will be sitting in memory, it fills in the ZZZZ’s
with the appropriate values. (It “resolves the links.” N.B. in computer science, “resolving” a generic
name or link means to assign it a specific, appropriate value. For example, if you want to talk to the
machine behind www.google.com, you first have to contact a name server, which then “resolves”
the generic name www.google.com by replacing it with an actual IP address that you can use to
direct data packets to the google server.) As a final step, the “ld” program then creates the
“memory image” of the program by creating a disk file where all the subroutines and data arrays are
properly located and copied, with all “external references” resolved. This memory image can then
be directly (quickly) copied into memory, and except for the last-minute fixes the autoloader does,
the program is ready to go.
If for some reason, the linker scanned the subroutines you are using and found a hook that it
couldn’t resolve (e.g., because it wasn’t given the name of a file containing the code chunk for a
particular subroutine), then it will indeed complain “Hey bozo, error: unresolved external reference
found to _printf.” and quit. It will not produce the memory image, because it would not be
complete. This is a common error message that novices encounter, and it means you either forgot
to include some of your code chunk files on the command line, or you have to talk to your professor
or system manager about where to find the “right library” that contains the routine that appears to
be missing.
One more thing about external references. We’ll be talking later about “common” or “global”
variables that exist outside of subroutines and can act as central directories/memo boards to keep
track of information we want to share between subroutines. In a program, a particular global
variable can obviously be defined only once. So if we then decide to break our program into chunks,
with subroutines in separate files, only one of the files can contain the definition of that global
variable. How do the other subroutines, which now don’t contain that variable definition, then
figure out how to access that global variable? In those other subroutines, the trick is to define the
variable as “extern.” So we do something like this:
extern double global_variable;
double spit_global_variable_back()
{ return global_variable;
}
Here, the “extern” qualifier in the global_variable declaration tells the compiler to give you the
benefit of the doubt again, and insert a hook/external reference (another ZZZZ-type memory
address/pointer) that needs to be filled in correctly once the location of global_variable in memory
is known. That doesn’t happen until the “link phase,” and actually the job of the linker is not just to
fill in the blanks/hooks for subroutines defined elsewhere but those for variables too.
II. Enough talk, a practical example!
The assignment today is to generate and print out 1000 random numbers. The subject of how to
generate (pseudo) random numbers is an important one we will talk more about later, but for now
let’s assume your professor hands you the source code for a function, ran3.c, that you can call to
generate a random number.
Here’s the skeleton of ran3.c (technically known as a “function prototype” – it contains all the
information you need to know on how to use it):
double ran3(long *idum) <- just this line is the prototype, and in ANSI C it is always required if you
plan to use the function in your file (even if its definition is external).
{
Stuff
}
This tells us that we need to feed ran3 a pointer to a long integer (that had better be defined
somewhere) as an input parameter, and that we should expect a double precision number in return.
The main.c file that will call this routine and print out the random numbers then looks like this:
#include <stdio.h> /* necessary to include the function prototypes for the standard input/output
routines like printf */

double ran3(long *idum); /* the prototype for our external random number generator */

int main(void)
{
    double ans;
    long idum;
    int count;

    idum = -23;
    for (count = 0; count < 1000; ++count)
    {
        ans = ran3(&idum);
        printf("random number %d -- %f \n", count, ans);
    }
    return 0;
}
O.K. So how do we compile and link everything? Well, if you don’t specify any extra options, the
compiler will do everything (translation to machine code + linking) without asking you.
So try this,
gcc –o myprog.x main.c ran3.c. The –o argument tells the compiler to generate an output (memory
image file) with the name that follows the argument. So this will compile the two source code files you
gave it (main.c and ran3.c) and produce an executable file, myprog.x, which you can then run by
typing “./myprog.x” at the shell prompt. (Question: why do I always type “./” before a program that
I’ve written?)
What does the compiler actually do to create your executable file? Lots of things actually. First it runs
something called the CPP (C preprocessor) on each .c (C source code file) to do things like replace the
#include <stdio.h> by the contents of the file stdio.h. (In general, lines that start with “#” are commands
to the C preprocessor, a program that does something called “macro substitutions.”) The outputs of the
C preprocessor are then stored as temporary source code files, say, temp_file1.c and temp_file2.c.
The compiler is then run on each source code file to produce the code chunks for the subroutines and
variables defined in those source code files. Because the linker hasn’t been run yet, those code chunks
necessarily contain unresolved external references. Such code chunks are called “object files,” and in
Unix, by convention, end with the suffix .o. [So a file ending in .c is a C source code file, and the same
file name, but ending in .o, is the compiled object code file.] Object files are again examples of code
stubs that cannot be loaded directly into memory and executed. The output of the compiler at this
stage will be the two object files, temp_file1.o and temp_file2.o. Finally, the linker gets called! -> ld –o
myprog.x temp_file1.o temp_file2.o, which copies the machine language for all the routines into the
master memory image, resolves all the links/hooks, and writes the image out to the file myprog.x . Along
the way, though, the compiler realized you didn’t bother to define any of the C language subroutines
like printf! It therefore goes fishing for the definition of printf. It’s not in any of your files, but when
ld is called by the C compiler, it is actually called as “ld –o myprog.x temp_file1.o temp_file2.o -lc”. The
last argument tells the linker to check the C library file, libc.a, for the definition of any missing
subroutines. That is indeed where the definition of printf lives, so the linker grabs it and copies it into the
image too and resolves the link in your main routine to point to it. [Or on some systems, it replaces that
hook with a special dynamic link flag, which tells the dynamic loader to load the appropriate routine into
memory too and then resolve the links to it.] If you check the right temporary storage directories, you can
catch the compiler creating all the temporary files I’ve discussed, and then deleting all evidence of this
after it has finished its job…
Phew. Do you ever actually want to deal individually with all these steps? Yes, especially if we have a
really big program that calls lots of subroutines and we want to recompile as little as possible, so we can
rapidly debug our program.
So in this case, we explicitly tell the compiler to create the object files and keep them around. We do
this by using the –c argument:
gcc –c ran3.c produces a file ran3.o, and gcc –c main.c produces a file, main.o. We then link them using
the shorthand command (which knows which language libraries, like libc.a, to include by default),
gcc –o myprog.x main.o ran3.o .
The compiler is smart enough to realize it is dealing with two already pre-compiled object code files, so
it actually just calls the linker (“ld”) as above to produce the final image.
What’s the advantage here? Well, hopefully all the bugs are in the program you wrote, so you don’t ever
have to compile ran3.c again (which could take a really long time). After I fix the typo in main.c, I
simply redo gcc –c main.c, and then redo the link step gcc –o myprog.x main.o ran3.o. All done.
This is a very simple example, but I’ll show you more complicated examples later, where we can use a
program called “make” and “makefiles” to automate all the steps above. The source code for the Linux
operating system kernel, for example, consists of thousands of files and subroutines. Imagine that a
couple of friends came over and fixed a few random files for you but forgot to tell you which ones. How
would you ever figure out which ones they changed and then properly re-compile (if necessary) and
link all those thousands of files together? Stay tuned for the next exciting installment ….