Thursday, April 3, 2014

system booting

 system booting (PC)

Any system administrator worth their weight in silicon knows that you need to understand how the boot process works. From the moment you press the power button until you are presented with the login prompt, a number of things will have occurred:

Note: Most of the information below has been obtained from my cpsc550 (system admin class), researching online, and man pages. For a quick intro on the booting process I would advise a 'man boot'.

Hardware Boot (POST)

When power is turned on or a reset occurs, the CPU begins fetching instructions from a hardwired address, the reset vector. On the original PC this is physical address 0xFFFF0 (F000:FFF0 in segment:offset form), which is mapped to non-volatile memory (ROM or flash) rather than RAM. The program stored there is referred to in the PC world as the BIOS ( Basic Input/Output System ) and usually occupies the memory space 0xF0000 : 0xFFFFF. Since there is not much room left on the runway (only 16 bytes before the 1 MB boundary), the first instruction usually consists of a jump back into lower BIOS memory. The BIOS keeps its configuration settings in a small battery-backed memory that is usually called CMOS memory, which stands for "Complementary Metal Oxide Semiconductor". The BIOS program now conducts a POST ( Power On Self Test ) where it makes sure that the memory and other peripherals are there and functioning correctly. If there are errors at this point, your system lets you know with a series of beep codes. Your motherboard manufacturer's website should have details on what these codes mean. A minimal set of instructions in the BIOS then begins probing a set of system devices for bootable code. This bootable code should begin loading the OS.

OS Loader and Master Boot Record

Once the system has located a bootable device, in most cases your hard disk, the BIOS loads the first sector of that device, a.k.a. the Master Boot Record (MBR), into memory and transfers control to it. On a PC this loader has some limitations due to the BIOS; the most notable is that the boot program and the partition table must both fit in the first 512 bytes. The format of the MBR is as follows:

  • 446 byte program code starts at 0x0000
  • 64 byte partition table starts at 0x01be ( decimal 446)
  • 2 byte signature ( bytes 0x55 0xaa ) starts at 0x01fe ( decimal 510 )

This all sounds good in theory, but as a programmer I want to know a little more about the mysterious MBR. I remember when I was a youngster that boot sector viruses were prevalent. It makes sense to put one there, seeing as how the OS is not loaded and the system has no protection at this point. I believe that these have all but faded away since the internet has come along, providing a better medium of transport. In order to get a better look at the contents of the boot sector we can run the 'dd' command as root. If we specify 'if=/dev/hda' and 'of=mbr_contents' and put a 'count' on it we should get what we are looking for ('count=1' copies a single block, which is 512 bytes by default). The 'count' part is important; without it you will attempt to read the whole disk, and on a modern system you will generate a very large file in a few seconds. Once we have these bytes we can make the contents into something a little more readable. For this I used the command 'hexdump mbr_contents > mbr_hexdump'. Here are the results from my Debian machine.

0000000 ebfa 0120 01b5 494c 4f4c 0516 6b4c 413f
0000010 0000 0000 0fe8 404f b836 b836 8080 5560
0000020 13c0 b800 07c0 d08e 00bc fb08 5352 5606
0000030 8efc 31d8 60ed 00b8 b312 cd36 6110 0db0
0000040 68e8 b001 e80a 0163 4cb0 5ee8 6001 071e
0000050 fa80 75fe 8802 bbf2 0200 768a 891d 80d0
0000060 80e4 e030 0a78 103c 0673 46f6 401c 2c75
0000070 f288 8b66 187e 0966 74ff 5221 08b4 80b2
0000080 13cd 5572 9892 ba91 007f 6642 c031 e840
0000090 0071 3b66 b8bf 7401 e203 5aef 8a53 1e76
00000a0 1fbe e800 004b 99b4 8166 fc7f 494c 4f4c
00000b0 2775 685e 0880 3107 e8db 0035 fb75 06be
00000c0 8900 b9f7 000a a6f3 0d75 02b0 75ae 0608
00000d0 b055 e849 00d5 b4cb b09a e820 00cd bae8
00000e0 fe00 004e 0874 e8bc 6107 e960 ff60 ebf4
00000f0 66fd 66ad c009 0a74 0366 1046 04e8 8000
0000100 02c7 60c3 5555 5066 5306 016a 106a e689
0000110 f653 60c6 5874 c6f6 7420 bb14 55aa 41b4
0000120 13cd 0b72 fb81 aa55 0575 c1f6 7501 524a
0000130 b406 cd08 0713 5872 c051 06e9 e986 cf89
0000140 c159 08ea 4092 e183 f73f 93e1 448b 8b08
0000150 0a54 da39 3873 f3f7 f839 3277 e4c0 8606
0000160 92e0 f1f6 e208 d189 5a41 c688 06eb 5066
0000170 5859 e688 01b8 eb02 b402 5b42 05bd 6000
0000180 13cd 0f73 744d 3109 cdc0 6113 f1eb 40b4
0000190 46e9 88ff 1f64 648d 6110 c1c3 04c0 03e8
00001a0 c100 04c0 0f24 0427 14f0 6040 07bb b400
00001b0 cd0e 6110 00c3 6344 b836 b836 0000 fe00
00001c0 ffff fe05 ffff d222 0704 ca12 0023 0180
00001d0 0001 fe83 ffff 003f 0000 d1e3 0704 0000
00001e0 0000 0000 0000 0000 0000 0000 0000 0000
00001f0 0000 0000 0000 0000 0000 0000 0000 aa55
0000200 0000 0000 0000 0000 0000 0000 0000 0000

There are the aa55 bytes right before the 512 byte boundary ... Sweet! (hexdump prints 16-bit words in the machine's little-endian byte order, so the on-disk bytes 55 aa show up as aa55.) That means we found what we were looking for... but unless your hex op-code reading is better than mine... we still have a problem. My first thought was to take our original file and run it through gdb (the GNU debugger), but this turned out not to work at all. It just complains about it not being an executable file. Next I gave 'ndisasm', the Netwide Disassembler that ships with NASM, a try. This one ate the whole file and spat back the nasm I was looking for. So let's see what the lilo MBR boot loader does:

Click here to have a look at the MBR assembly code.

Kernel Initialization

Once the kernel code has been located and control has been passed over to it, the kernel first initializes the devices through the device drivers on the system. Once this is complete the kernel sets up its first process, process 0, which is traditionally called the scheduler or the swapper. (On Linux, kswapd is a separate kernel thread that handles paging.) The kernel then mounts the root file system and starts its first user process, PID 1, called 'init' ( /sbin/init ). Init is often referred to as the mother of all processes. This can be verified by the reader by typing the command 'pstree'. When init loads it looks in '/etc/inittab' for instructions on what to do. The inittab specifies the default run level for the system and calls another script, passing that run level as an argument, for example '/etc/init.d/rc 6'. At this point you really should go check out the 'rc' script. It is a small script and the syntax is easy to read. To check what run level your system is in, run the 'runlevel' command. The last thing the inittab does on my system is start up a bunch of virtual consoles that I can access with 'Ctrl+Alt+F1' through 'Ctrl+Alt+F6'.

Run Levels & Init Scripts

The scripts that actually start and stop the services on my system are located in ( /etc/rc[0-6].d ). Once you have looked at the '/etc/init.d/rc' script, you know that it gets passed the run level and attempts to call Start and Kill scripts in the appropriate run level directory. My system default runlevel is 6 ( X-graphical mode ), so if I go into '/etc/rc6.d/' and do an 'ls -al' we will see that the files in this directory are nothing more than symbolic links back into ( /etc/init.d ). (Check your own default with 'runlevel'; on many SysV-style systems runlevel 6 is reserved for reboot.) This is where all the real scripts reside. The run level directories are just there to provide a convenient mechanism to talk to these scripts. The convenience comes from the way the links are named. They all have an 'S' or 'K' in front of them and then a number. If you go back and look at the ( /etc/init.d/rc ) script you can see that the 'S' starts a service and the 'K' will kill a service. The numbers allow you to specify the order in which the system will execute the scripts. This allows a service that relies on another service to be started after the service that it requires.

So if we would like to add a new service to be started at runlevel 6 we would do the following things:

  1. Create an init script using the example provided here.
  2. Put your new script in ( /etc/init.d/ ).
  3. Create a symbolic link in ( /etc/rc6.d/ ) called S[00-99]scriptname that points to your new script.

That is all it takes. Finally, if your service needs a configuration file when it starts, the usual place to put one is the ( /etc/sysconfig ) directory on Red Hat style systems or ( /etc/default ) on Debian. I hope that you have learned as much and had as much fun as I did playing around with this stuff.

link editing

 link editing (ld)

The link-editor, ld(1), concatenates one or more input files (relocatable objects, shared objects or archive libraries) to produce one output file ( relocatable object, exe, or shared object ). It is most commonly invoked as part of the compilation ( cc, gcc ).

Link Editing (ld)

The link editor takes input files from cc, as, or ld and produces one output file in one of the following formats: relocatable object, static exe, dynamic exe or shared object. All input files to ld are in the Executable and Linking Format (ELF). It is therefore crucial that we understand the ELF file format in order to understand link editing. First we shall examine the types of ELF files one can have and their purpose.

  • Relocatable Objects - concatenation of relocatable object input files into one output that can be used again in link-editing. These files contain data telling the linker how to link them to other relocatable objects, shared objects, and executables.
  • Static exe - all symbol references get bound to the exe, which thus represents a ready to run process. Both forms of executable files contain the data necessary for the operating system to produce an executable image.
  • Dynamic exe - concatenation of relocatable objects that requires intervention by the runtime linker to produce the runnable process. The symbols in the symtab might need binding at runtime. The dynamic executable may also be dependent on shared objects (so). Dynamic executables are the default output of a compilation.
  • Shared Objects - concatenation of relocatable objects that provides services to dynamic executables bound at runtime by the runtime linker ld.so.1. Shared objects might also be dependent on other shared objects. Think of shared objects as dynamic executables that have not been assigned any virtual address space.

The graphic below demonstrates how to create the various file formats discussed above.

Executable and Linking Format (ELF)

The ELF file format was created by Unix System Laboratories as a better alternative to the a.out and COFF binary formats. Some capabilities of the ELF format include: dynamic linking, dynamic loading, imposing runtime control on a program, and an improved method for creating shared libraries. ELF files are made up of five kinds of parts, each of which may or may not be present in a given file. The five parts are:

  1. The ELF header.
  2. The Program header table.
  3. The Section header table.
  4. ELF sections. (linker view)
  5. ELF segments. (executable view)

Each of the ELF file formats described above can be looked at in 2 ways (called views). The first view is the linker view and the second is the executable view. The views are summarized in the figure below:

The linker view of ELF files is partitioned into sections while the executable view is partitioned into segments. Sections represent the smallest indivisible unit that can be processed in the ELF file. A segment is a collection of sections and is the smallest unit that can be mapped (mmap) to memory by (exec) or (ld.so.1). These two views allow us to look at information that is specific to linking, such as the symbol table and relocation information, separately from information specific to creating the process image, like text and data segments. The bulk of the data is therefore stored in sections and segments, with the rest of the file (headers) devoted to the organization and access of those sections/segments. The following is a brief description of each of the five file parts.



ELF Header

This is the only fixed portion of the ELF file, always occurring at the start. It provides information such as: ELF version, target architecture, location of the program header table, location of the section header table, the index of the string table section that stores the section names (e_shstrndx), along with the entry size and entry count of each table, and lastly the location of the first instruction that is going to be executed (e_entry).

#define EI_NIDENT 16

typedef struct {
   unsigned char e_ident[EI_NIDENT];
   uint16_t e_type;
   uint16_t e_machine;
   uint32_t e_version;
   ElfN_Addr e_entry;
   ElfN_Off e_phoff;
   ElfN_Off e_shoff;
   uint32_t e_flags;
   uint16_t e_ehsize;
   uint16_t e_phentsize;
   uint16_t e_phnum;
   uint16_t e_shentsize;
   uint16_t e_shnum;
   uint16_t e_shstrndx;
} ElfN_Ehdr;



Program Header Table

The program header table is only useful to executables and shared objects. It provides organizational information on the array of segments in the file. Each entry in the program header table contains the type, file offset, physical address, virtual address, file size, memory image size, and alignment for a segment in the program. A segment is copied into memory if its p_type is PT_LOAD. ?? Question: how do we know the physical address ?? The ELF specification's answer is that we usually don't: p_paddr only matters on systems where physical addressing is relevant, and its contents are unspecified for ordinary executables and shared objects.

typedef struct {
   uint32_t p_type;
   Elf32_Off p_offset;
   Elf32_Addr p_vaddr;
   Elf32_Addr p_paddr;
   uint32_t p_filesz;
   uint32_t p_memsz;
   uint32_t p_flags;
   uint32_t p_align;
} Elf32_Phdr;



Section Header Table

Provides organization information on the array of sections in the ELF file. These entries provide the name, type, memory image starting address (if loadable), file offset, the section's size in bytes, alignment, and how the information in the section should be interpreted.

typedef struct {
   uint32_t sh_name;
   uint32_t sh_type;
   uint32_t sh_flags;
   Elf32_Addr sh_addr;
   Elf32_Off sh_offset;
   uint32_t sh_size;
   uint32_t sh_link;
   uint32_t sh_info;
   uint32_t sh_addralign;
   uint32_t sh_entsize;
} Elf32_Shdr;



ELF Sections

Sections can hold executable code, data, dynamic linking information, debugging data, symbol tables, relocation information, comments, string tables, and notes. Some sections provide information on linking, others are loaded into the process image, while others provide information on building an executable.

ELF Segments

Segments are groupings of like sections ( text segment, data segment ). A process image is created by loading segments into the virtual memory regions described by the program header table.

Tools readelf

readelf is a tool for viewing ELF files. Click here to view an example elfdump. Make sure to view the sections in the example file and return to the example when needed. I found that having an example ELF file handy gave me a better understanding of the material.

Sections of Interest to us

So the basic idea from here is that the link editor concatenates program .text, .data, and .bss sections into the new output file. The rest of the relocation and symbol information is modified or generated to the output file.

ld Execution

Here is the program flow for the linker:

  • Verify options passed to it.
  • Concatenate like sections (type, attribute, name) from input relocatable objects to form sections within the output file.
  • Read symbol tables from relocatable objects and shared objects and apply the info to the output file by updating other input sections. In addition an output relocation section might be generated.
  • Generate program headers that describe all the segments created.
  • Generate a dynamic linking info section providing shared object dependencies and symbol bindings to the runtime linker.

You can change how these sections get mapped by creating a mapping file and using the -M option with the Solaris (ld); GNU ld uses a linker script passed with -T for the same purpose. More on this later.

Your Compiler

In practice you rarely invoke ld yourself and it is generally good practice not to. This is because the linker on its own will not attach the initialization and termination code to your program. But we will run some tests on our example program to better understand this (example test.c - the simplest C program).


int main(void)
{
    return 0;
}

Then we can ask gcc nicely to compile our test program but not to link it. Once that is done we can try to link the file manually:

gcc -c test.c

ld test.o
ld: warning: cannot find entry symbol _start; defaulting to 0000000008048094

Click here to view the "readelf -a" of the resulting file

The normal way is to have the compiler driver invoke the linker as follows:

gcc test.o

Click here to view the readelf of the resulting file. The difference is rather substantial, to the tune of a lot of extra crap getting included in my simple little program. There is actually more stuff added than there is stuff in my program. At this point it could be said that gcc is the author of my program and not me. So what is all this extra crap that is being added? Let's find out.

One of the only times that it is acceptable to invoke the linker on your own is when you are creating another relocatable object. This is done with the -r option for ld.

ld -r test.o

The moral of the story is that during compilation there is a bunch of extra stuff that gets included in your file. Upon realizing this, a good question is: what is it? On a Solaris box we can use the -# option to have the compiler display these mysterious files that are included into our code. On Linux with gcc you can get the same kind of output with a call to gcc --verbose.

cc -# -o prog test.c

Here are the results on Solaris.

/opt/SUNWspro/bin/../WS6U1/bin/acomp -i test.c -y-fbe -y/opt/SUNWspro/bin/../WS6U1/bin/fbe -y-xarch=generic -y-o -ytest.o -y-s -y-verbose -y-xmemalign=4s -Qy -D__SunOS_5_8 -D__SUNPRO_C=0x520 -D__SVR4 -D__unix -D__sun -D__sparc -D__BUILTIN_VA_ARG_INCR -D__SUN_PREFETCH -Xa -D__PRAGMA_REDEFINE_EXTNAME -Dunix -Dsun -Dsparc -D__RESTRICT -I/opt/SUNWspro/WS6U1/include/cc "-g/opt/SUNWspro/bin/../WS6U1/bin/cc -c "
### Note: LD_LIBRARY_PATH = <null>
### Note: LD_RUN_PATH = <null>
/usr/ccs/bin/ld /opt/SUNWspro/WS6U1/lib/crti.o
/opt/SUNWspro/WS6U1/lib/crt1.o
/opt/SUNWspro/WS6U1/lib/values-xa.o -o prog test.o -Y "P,/opt/SUNWspro/WS6U1/lib:/usr/ccs/lib:/usr/lib" -Qy -lc /opt/SUNWspro/WS6U1/lib/crtn.o

gcc --verbose test.c

Here are the results under Debian Linux.

Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.4 (Debian)
/usr/lib/gcc-lib/i486-linux/3.3.4/cc1 -quiet -v -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=4 test.c -quiet -dumpbase test.c -auxbase test -version -o /tmp/ccSbXIgh.s
GNU C version 3.3.4 (Debian) (i486-linux)
compiled by GNU C version 3.3.4 (Debian)
GGC heuristics: --param ggc-min-expand=98 --param ggc-min-heapsize=129048
ignoring nonexistent directory "/usr/i486-linux/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc-lib/i486-linux/3.3.4/include
/usr/include
End of search list.
as -V -Qy -o /tmp/ccWmNHhp.o /tmp/ccSbXIgh.s
GNU assembler version 2.15 (i386-linux) using BFD version 2.15
/usr/lib/gcc-lib/i486-linux/3.3.4/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/gcc-lib/i486-linux/3.3.4/../../../crt1.o /usr/lib/gcc-lib/i486-linux/3.3.4/../../../crti.o /usr/lib/gcc-lib/i486-linux/3.3.4/crtbegin.o -L/usr/lib/gcc-lib/i486-linux/3.3.4 -L/usr/lib/gcc-lib/i486-linux/3.3.4/../../.. /tmp/ccWmNHhp.o -lgcc -lgcc_eh -lc -lgcc -lgcc_eh /usr/lib/gcc-lib/i486-linux/3.3.4/crtend.o /usr/lib/gcc-lib/i486-linux/3.3.4/../../../crtn.o


Initialization and Termination Sections

Dynamic Objects provide code for runtime initialization and termination. This code may be in the form of function pointers or one entire block. Each of these sections is built from like section types given by input relocatable objects. Sections:

  • .preinit_array
  • .init_array
  • .fini_array

When creating dynamic objects the link editor identifies these arrays with the .dynamic tags DT_PREINIT_ARRAY and DT_PREINIT_ARRAYSZ, DT_INIT_ARRAY and DT_INIT_ARRAYSZ, and DT_FINI_ARRAY and DT_FINI_ARRAYSZ.

The sections .init and .fini provide the runtime initialization and termination code for your dynamic executable. Compiler drivers usually supply these sections as files that are tacked onto the beginning and end of the input file list. These sections provide the required code in the form of two reserved functions named _init and _fini. When creating a dynamic object the link editor identifies these symbols with the .dynamic tags DT_INIT and DT_FINI. One thing that is very kewl is that you can add your own functions to the .init_array and the .fini_array.

Refer back to our ELF file to locate these symbols.

Symbol Processing and Resolution

During input file processing the link editor passes any local symbols straight through to the output file, while global symbols are accumulated internally. The internal symbol table is searched for each new global symbol entry to determine if two are the same and some form of resolution needs to occur.

Basic types of symbol resolution

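  • Undefined - referenced in a file but not yet assigned a storage address
  • Tentative - created in a file but not yet sized or allocated storage (for example an uninitialized C global); occupies storage at runtime
  • Defined - created and assigned a storage address and space within the file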