To answer this question in depth, one needs to know the C
run-time environment, i.e., at least the basic difference between a
- Hosted environment (where all C standard libraries are available, program starts at main()). E.g. GNU/Linux, Windows.
- Freestanding environment (no libraries available, how to start/load is up to the environment). E.g. Embedded systems.
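As a small illustration of the two environments: the C standard (C99 and later) defines the macro __STDC_HOSTED__, which is 1 in a hosted implementation and 0 in a freestanding one. The helper name environment_kind() is made up for this sketch; only the macro comes from the standard.

```c
/* Reports which kind of environment this translation unit was compiled for.
 * __STDC_HOSTED__ is a standard predefined macro (C99 and later):
 * 1 means hosted, 0 means freestanding.
 * environment_kind() is a hypothetical helper name used only in this sketch. */
const char *environment_kind(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
    return "hosted";
#else
    return "freestanding";
#endif
}
```

With a plain gcc build this returns "hosted"; compiling the same file with gcc -ffreestanding makes it return "freestanding".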
Now, when we do a
$ gcc <
it goes through all the stages of compilation and linking to yield an executable (in ELF format). In that process the default entry point of the program is set: it lives in the start-up code (crt0.o/crt1.o etc., which the GNU linker [ld] links in), and after setting things up that code calls main().
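Whatever entry point the linker ends up choosing is recorded in the e_entry field of the ELF header. A minimal sketch for reading it, assuming a Linux system with <elf.h>, a 64-bit ELF file, and a file whose byte order matches the host; read_entry_point() is a name invented for this example:

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Reads the entry-point address (e_entry) from a 64-bit ELF file.
 * Assumes the file has the same byte order as the host.
 * Returns 0 on success, -1 if the file cannot be read or is not ELF64.
 * read_entry_point() is a hypothetical helper, not a library function. */
int read_entry_point(const char *path, unsigned long long *entry)
{
    Elf64_Ehdr hdr;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t got = fread(&hdr, 1, sizeof hdr, f);
    fclose(f);
    if (got != sizeof hdr)
        return -1;

    /* Check the "\177ELF" magic and the 64-bit class byte. */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    *entry = hdr.e_entry;
    return 0;
}
```

The same value is shown as "Entry point address" by readelf -h.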
If we want to establish a separate entry point, we have to use the linker option (to ld), and that is -e.
$ cat tmp.c
#include <stdio.h>
#include <stdlib.h>
void start(void)
{
    printf ("Hello World.\n");
    exit (0);
}
Note: we use exit() instead of return. We can't return, because we won't link with the C run-time start-up code that would handle the return. Here we need stdio.h for printf() and stdlib.h for exit(); both functions live in libc.
Let's compile it:
$ gcc -c tmp.c
We got tmp.o. To link this and get a.out, we need the path to the run-time dynamic linker (that is, the path to ld-linux). On my RHEL it happens to be /lib64/ld-linux-x86-64.so.2
(since I have a 64-bit AMD; if you want to find out yours, just run gcc -v on any program and look at the link stage output).
Note: ld is the GNU linker; ld-linux is the dynamic linker (or loader), which the kernel loads first and which is responsible for loading the actual executable and all the required dynamic libraries.
$ ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -e start tmp.o -lc
Here, -lc links libc. Once this runs, we get a.out, and we're done. We can examine a.out using ldd.
$ ldd a.out
libc.so.6 => /lib64/tls/libc.so.6 (0x000000328b700000)
If we don't pass the -dynamic-linker option to ld, it picks a default (/lib/ld64.so.1 for me).
At this point we are done with our agenda of changing the entry point, but let's go a bit deeper on ld-linux, the dynamic linker (part of the OS) which actually loads the executable.
Why did we have to give the path to ld-linux in the link step? That's because the dynamic linker is a separate binary and, maybe for modularity reasons, is not part of the kernel. When we give the path of the dynamic linker to ld, it is recorded in a section called .interp in the ELF file. When the ELF binary is run, the kernel looks up the path to ld-linux stored in .interp (it finds it through the PT_INTERP program header), loads ld-linux first, and then hands the program over to it (we can see the ELF headers with the objdump or readelf utilities).
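To make the .interp lookup concrete, here is a rough sketch of how a program could pull that path out of an ELF file by walking the program headers until it finds PT_INTERP. It assumes Linux with <elf.h>, a 64-bit ELF file in host byte order, and the function name read_interp() is invented for illustration; it is not how the kernel itself is written.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Copies the interpreter path stored in the PT_INTERP segment into buf.
 * Returns 0 on success, 1 if the file is ELF64 but has no PT_INTERP
 * (e.g. a static executable), -1 on I/O error or non-ELF64 input.
 * read_interp() is a hypothetical helper name for this sketch. */
int read_interp(const char *path, char *buf, size_t buflen)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    int rc = -1;
    FILE *f;

    if (buflen == 0 || (f = fopen(path, "rb")) == NULL)
        return -1;

    if (fread(&eh, 1, sizeof eh, f) != sizeof eh ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        goto done;

    rc = 1; /* valid ELF64, no PT_INTERP seen yet */
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 ||
            fread(&ph, 1, sizeof ph, f) != sizeof ph) {
            rc = -1;
            break;
        }
        if (ph.p_type != PT_INTERP)
            continue;

        /* Read at most buflen - 1 bytes so there is room for a NUL. */
        size_t n = ph.p_filesz < buflen ? (size_t)ph.p_filesz : buflen - 1;
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0 ||
            fread(buf, 1, n, f) != n) {
            rc = -1;
            break;
        }
        buf[n] = '\0';
        rc = 0;
        break;
    }
done:
    fclose(f);
    return rc;
}
```

The quick way to see the same information is readelf -l a.out, which prints a "Requesting program interpreter" line for the PT_INTERP entry.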
This is similar to shebang lines in script executables, i.e., if the first line of a shell (bash) script is a #! line naming the shell, and we call it hello.sh, chmod +x and execute it, the kernel does something similar: it loads the shell and hands the script over to it. We can as well try this on any dynamic executable (one that depends on shared libraries):
$ /lib64/ld-linux-x86-64.so.2 /bin/echo hello
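The shebang analogy above can be sketched in a few lines of C: take the first line of a script, check for "#!", and extract the interpreter path that follows. This is only a conceptual sketch; parse_shebang() is a made-up name, and the kernel's real script handling covers more cases (such as the optional argument after the path).

```c
#include <string.h>

/* Extracts the interpreter path from the first line of a script.
 * Returns 0 and fills interp on success, -1 if the line does not start
 * with "#!", names no path, or the path does not fit in interp.
 * parse_shebang() is a hypothetical helper for illustration only. */
int parse_shebang(const char *line, char *interp, size_t len)
{
    if (strncmp(line, "#!", 2) != 0)
        return -1;

    line += 2;
    line += strspn(line, " \t");          /* skip blanks after "#!" */
    size_t n = strcspn(line, " \t\r\n");  /* path ends at blank or newline */
    if (n == 0 || n >= len)
        return -1;

    memcpy(interp, line, n);
    interp[n] = '\0';
    return 0;
}
```

Once the path is known, running the script amounts to executing that interpreter with the script's file name as an argument, just as the ld-linux command above is given /bin/echo.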
See also: the ld man page; the ELF specification (see the .interp section).