Loading of programmer's self-cultivation notes

Loading of programmer's self-cultivation notes

Executable file loading and process

Introduce the loading process of ELF files under Linux, and explore the essence of executable file loading

  • What is the virtual address space of a process
  • Why does a process have its own independent virtual address space
  • Several loading methods
  • Distribution of process virtual address space

Process virtual address space

The 32-bit hardware platform determines the address of the virtual address space from 0 to 2^32 -1, that is 0x00000000 ~ 0xFFFFFFFF, the virtual space size of 4GB; while the 63-bit hardware platform has 64-bit addressing capabilities, and its virtual address space has reached 2^64 bytes, that is 0x0000000000000000 ~ 0xFFFFFFFFFFFFFFFF, total 17179869184GB.

From a program point of view, the space occupied by pointers in C language can be used to calculate the size of the virtual address space. Generally, the number of bits in the size of C language pointers is the same as the number of bits in virtual space, such as pointers under 32-bit platforms. It is 32 bits and 4 bytes.

The following uses 32 as the address space as the main address space, and 64-bit as the extension.

By default, the Linux system allocates the virtual address space of the process as follows:

The space used by the operating system is not allowed to be accessed by the process, and the process cannot fully use the remaining 3GB of virtual space, part of which is reserved for other purposes.

PAE

Intel Pentium Pro CPU under Linux began to use 36-bit physical address in 1995, that is, it can access 64GB of physical memory. At this time, the operating system can only have 4GB of virtual address space and cannot read all of the 64GB of physical memory. PAE is designed to solve this problem.

PAE (Physical Address Extension) is an address extension method. After Inter modified the page mapping method, the new mapping method can access more physical memory. The operating system provides a window mapping method to map additional memory into the address space. The application chooses application and mapping as needed. For example, the 256MB virtual address space of 0x10000000~0x20000000 in the application is used as the window. The program applies for multiple physical spaces of 256MB from the physical space higher than 4GB, numbered A, B, C, and then the window is set as needed Map to different physical space blocks, map 0x10000000~0x20000000 to A when A is used, and map to the past when B and C are used, and so on. Under Windows, this memory operation method is AWE (Address Windows Extensions) . UNIX systems such as Linux use the mmp() system call to achieve this .

Loading method

Overwrite loading

Virtual storage was used widely before it was invented, and it has been almost eliminated.

The overwrite loading method gives the programmer the task of tapping the potential of the memory. When writing the program, the programmer must manually divide the program into several blocks, and then write small auxiliary codes to manage when these modules reside in memory and when they are replace. This auxiliary code is called the Overlay Manager , as shown in the figure below

There is no call dependency between modules A and B, so the two modules share the memory area, when

The memory is overwritten when A is used, the memory is overwritten when B is used, and the coverage manager is used as resident memory.

The multi-module is as follows, the programmer needs to manually organize the modules into a tree structure according to the call dependency between them .

Therefore, the coverage manager needs to ensure some highlights.

  • In the tree structure, the call path from any module to the root of the tree (main) is called the call path. When the module is called, the module on the call path must be in memory. For example, when the C module is executing, both B and main need to be in the memory to ensure that E can correctly return to the module B and main after the execution is completed.

  • Prohibit calling across trees

    Any module is not allowed to call across the tree structure, for example, A cannot call B, E, F. But in many cases, two modules depend on the same module. For example, module E and module C need another module G. The most convenient way is to incorporate module G into the main module, so that G is called between E and C. On the path.

Page mapping

Page mapping is part of the virtual storage mechanism, which was born with the invention of virtual storage.

Page mapping divides the data and instructions in the memory and all disks into several pages according to the unit of **page (Page)**, and all subsequent loading and operation units are pages.

Assuming that all instructions and data of the program are 32KB in total, the program is divided into 8 pages and numbered P0~P7. But 16KB memory cannot load 32KB programs, and the loading process will be carried out according to the principle of dynamic loading at this time. If the program execution entry is at P0, the load manager finds that the P0 of the program is not in the memory, and allocates the memory F0 to P0, and purchases the memory of P0 into F0. Use P5 after running, then load P4 into F1, and so on, as shown below:

If the program continues to run and needs to access P4, the load manager must choose to abandon one of the 4 memory pages currently in use to load P4. There are many abandonment algorithms, such as:

  • FIFO first in first out, then give up F0, P4 is loaded into F0
  • LUR is the least used, then give up F2, P4 into F2

And other algorithms. The so-called load manager here is the modern operating system, and to be precise, the movie is the storage manager of the operating system.

From the operating system perspective, the executable file is loaded and executed in the process

Process establishment

The key feature of the process is that it has an independent virtual address space . When a program is executed, three things usually need to be done at the beginning:

  • Create an independent virtual address space

    That is, to create the corresponding data structure required by the mapping function, and under the Linux of i386, creating a virtual address space is actually just assigning a page directory, and there is even no need to set the mapping relationship. That is, the mapping relationship between virtual space and physical memory is completed.

  • Read the executable file header and establish the mapping relationship between the virtual space and the executable file

    To complete the mapping relationship between the virtual space and the executable file, this step is the most important step in the entire loading process, which is "loading" in the traditional sense.

    As shown in the figure, consider the simplest example, the virtual address is shown in the figure, the file size is 0x000e1, and the alignment is 0x1000. Since the size of the .text segment is less than 0x1000, it needs to be aligned.

    This mapping relationship is a data structure stored inside the operating system . Linux calls a segment in the process virtual space a virtual memory area (VMA.Virtual Memory Area) . Windows is called a virtual section (Virtual Section) . In the above example, a VMA of the .text segment will be set in the corresponding data structure of the process. Its address in the virtual space is 0x08048000~0x08049000the .text offset of 0 in the corresponding ELF file, and the attribute is read-only.

  • Set the instruction register of the CPU to the entry address of the executable file and start the operation

Page fault

After the above steps are executed, the mapping relationship between the executable file and the virtual memory of the process is only established through the header information of the executable file, and the instructions and data of the executable file are not loaded into the memory.

Assuming that in the above example, the program entry address is 0x08048000, which happens to be the actual address of the .text section. When the CPU finds an empty page when it intends to execute, it will consider this to be a page fault (Page Fault) . The CPU transfers control to the operating system. The operating system will query the data structure mentioned above, find the VMA where the empty page is located, calculate the offset of the corresponding page in the executable file, and then allocate a physical page in the physical memory. Establish a mapping relationship between the virtual page in the process and the allocated physical page, and then return the control to the process. The process restarts from the position of the page error just now. As shown in the figure below, it is an executable file. The relationship between physical memory:

Process virtual memory space distribution

ELF file link view and execution view

In the actual scene, the number of ELF file segments is relatively large, but due to the need to perform page alignment and other operations, if the page is allocated in one segment, it will inevitably cause greater waste.

But from the perspective of the operating system to load executable files (the operating system does not need to know which section name is, what is the role, etc.), it is found that it does not care about the actual content of the executable file, but only cares about the problems related to loading , The most important issue is the authority of the segment.

The permission combination of the segment is basically the following three

  • Readable and executable segment represented by code segment
  • Readable and writable segment represented by data segment and BSS segment
  • The permission represented by the read-only data segment is a read-only segment

Therefore, we find a scheme that divides the types of permissions and merges the segments with the same permissions for mapping . The merged data is called Segment . If the .textsegment and the .initsegment are combined as a Segment , then it can be loaded when loading. Think of them as a whole to load, so that you can significantly reduce the problem of internal fragmentation of the page, thereby saving space.

Compare with the following figure, the left side is loading by segment, and the side is loading by segment after merging.

Write an example program below:

SectionMapping.c
#include <stdlib.h>

int main(){
	while(1){
		sleep(1000);
	}
	return 0;
}
gcc -static SectionMapping.c -o SectionMapping.elf
 

Then usereadelf

readelf -S SectionMapping.elf
 

Get the following information

There are 33 section headers, starting at offset 0xc4d50:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .note.ABI-tag     NOTE             0000000000400190  00000190
       0000000000000020  0000000000000000   A       0     0     4
  [ 2] .note.gnu.build-i NOTE             00000000004001b0  000001b0
       0000000000000024  0000000000000000   A       0     0     4
readelf: Warning: [ 3]: Link field (0) should index a symtab section.
  [ 3] .rela.plt         RELA             00000000004001d8  000001d8
       00000000000001f8  0000000000000018  AI       0    24     8
  [ 4] .init             PROGBITS         00000000004003d0  000003d0
       0000000000000017  0000000000000000  AX       0     0     4
  [ 5] .plt              PROGBITS         00000000004003e8  000003e8
       00000000000000a8  0000000000000000  AX       0     0     8
  [ 6] .text             PROGBITS         0000000000400490  00000490
       00000000000874fc  0000000000000000  AX       0     0     16
  [ 7] __libc_freeres_fn PROGBITS         0000000000487990  00087990
       00000000000024e9  0000000000000000  AX       0     0     16
  [ 8] __libc_thread_fre PROGBITS         0000000000489e80  00089e80
       00000000000003d7  0000000000000000  AX       0     0     16
  [ 9] .fini             PROGBITS         000000000048a258  0008a258
       0000000000000009  0000000000000000  AX       0     0     4
  [10] .rodata           PROGBITS         000000000048a280  0008a280
       000000000001bd7c  0000000000000000   A       0     0     32
  [11] __libc_subfreeres PROGBITS         00000000004a6000  000a6000
       0000000000000048  0000000000000000   A       0     0     8
  [12] __libc_IO_vtables PROGBITS         00000000004a6060  000a6060
       00000000000006a8  0000000000000000   A       0     0     32
  [13] __libc_atexit     PROGBITS         00000000004a6708  000a6708
       0000000000000008  0000000000000000   A       0     0     8
  [14] .stapsdt.base     PROGBITS         00000000004a6710  000a6710
       0000000000000001  0000000000000000   A       0     0     1
  [15] __libc_thread_sub PROGBITS         00000000004a6718  000a6718
       0000000000000010  0000000000000000   A       0     0     8
  [16] .eh_frame         PROGBITS         00000000004a6728  000a6728
       0000000000009c38  0000000000000000   A       0     0     8
  [17] .gcc_except_table PROGBITS         00000000004b0360  000b0360
       0000000000000085  0000000000000000   A       0     0     1
  [18] .tdata            PROGBITS         00000000006b0b40  000b0b40
       0000000000000020  0000000000000000 WAT       0     0     8
  [19] .tbss             NOBITS           00000000006b0b60  000b0b60
       0000000000000040  0000000000000000 WAT       0     0     8
  [20] .init_array       INIT_ARRAY       00000000006b0b60  000b0b60
       0000000000000010  0000000000000008  WA       0     0     8
  [21] .fini_array       FINI_ARRAY       00000000006b0b70  000b0b70
       0000000000000010  0000000000000008  WA       0     0     8
  [22] .data.rel.ro      PROGBITS         00000000006b0b80  000b0b80
       0000000000000464  0000000000000000  WA       0     0     32
  [23] .got              PROGBITS         00000000006b0fe8  000b0fe8
       0000000000000008  0000000000000008  WA       0     0     8
  [24] .got.plt          PROGBITS         00000000006b1000  000b1000
       00000000000000c0  0000000000000008  WA       0     0     8
  [25] .data             PROGBITS         00000000006b10c0  000b10c0
       0000000000001af0  0000000000000000  WA       0     0     32
  [26] .bss              NOBITS           00000000006b2bc0  000b2bb0
       0000000000001718  0000000000000000  WA       0     0     32
  [27] __libc_freeres_pt NOBITS           00000000006b42d8  000b2bb0
       0000000000000028  0000000000000000  WA       0     0     8
  [28] .comment          PROGBITS         0000000000000000  000b2bb0
       0000000000000025  0000000000000001  MS       0     0     1
  [29] .note.stapsdt     NOTE             0000000000000000  000b2bd8
       0000000000001408  0000000000000000           0     0     4
  [30] .symtab           SYMTAB           0000000000000000  000b3fe0
       000000000000a6c8  0000000000000018          31   693     8
  [31] .strtab           STRTAB           0000000000000000  000be6a8
       0000000000006532  0000000000000000           0     0     1
  [32] .shstrtab         STRTAB           0000000000000000  000c4bda
       0000000000000176  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)
 

View the ELF Segmentinformation, just as the structure of the Section attribute is called the segment table, the structure of the segment is described as the program header (Program Header) , which describes how the ELF file should be mapped into the virtual space of the process by the operating system:

readelf -l SectionMapping.elf
 

The results are as follows:

Elf file type is EXEC (Executable file)
Entry point 0x400a00
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000b03e5 0x00000000000b03e5  R E    0x200000
  LOAD           0x00000000000b0b40 0x00000000006b0b40 0x00000000006b0b40
                 0x0000000000002070 0x00000000000037c0  RW     0x200000
  NOTE           0x0000000000000190 0x0000000000400190 0x0000000000400190
                 0x0000000000000044 0x0000000000000044  R      0x4
  TLS            0x00000000000b0b40 0x00000000006b0b40 0x00000000006b0b40
                 0x0000000000000020 0x0000000000000060  R      0x8
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x00000000000b0b40 0x00000000006b0b40 0x00000000006b0b40
                 0x00000000000004c0 0x00000000000004c0  R      0x1

 Section to Segment mapping:
  Segment Sections...
   00     .note.ABI-tag .note.gnu.build-id .rela.plt .init .plt .text __libc_freeres_fn __libc_thread_freeres_fn .fini .rodata __libc_subfreeres __libc_IO_vtables __libc_atexit .stapsdt.base __libc_thread_subfreeres .eh_frame .gcc_except_table 
   01     .tdata .init_array .fini_array .data.rel.ro .got .got.plt .data .bss __libc_freeres_ptrs 
   02     .note.ABI-tag .note.gnu.build-id 
   03     .tdata .tbss 
   04     
   05     .tdata .init_array .fini_array .data.rel.ro .got 
 

From the perspective of loading, we only need to care about two "LOAD" type segments, and the others only play a supporting role in loading. As you can see here, the file is re-divided into three parts:

  • Readable and executable LOAD segment
  • Read and write LOAD segment
  • Unmapped segment

Therefore, all Sections with the same attributes are classified into the same Segment and mapped to the same VMA. Therefore, Segment and Section divide the ELF file from different angles.

  • Link view: Divide from the perspective of Section
  • Execution view: Divided from the perspective of Segment

The following figure shows the mapping relationship between the executable file segment and the process virtual space:

ELF executable files and shared library files have a special data structure called Program Header Table (Program Header Table) to save segment information, because ELF files do not need to be loaded, so there is no program header table. The specific structure is as follows: And correspond to the readelf -lread data one-to-one

typedef struct{
    Elf32_Word        p_type;
    Elf32_Off            p_offset;
    Elf32_Addr         p_vaddr;
    Elf32_Addr         p_paddr;
    Elf32_Word        p_filesz;
    Elf32_Word        p_memsz;
    Elf32_Word        p_flags;
    Elf32_Word        p_align;
}Elf32_Phdr;
 

The basic meaning is as follows

Among them, p_memsz >= p_fileszif p_memsz <= p_fileszit means that the size of the space allocated by the segment segment in the memory exceeds the actual size of the file, this part of the extra space is all filled with "0". In this way, when we construct the ELF executable file, we don't need to set up an additional BSS segment, and can expand the p_memsz of the data segment, and those extra parts are the BSS. The difference between the data segment and the BSS is that the data segment initializes the content from the file, while the BSS is all initialized to 0. Therefore, it can be seen that the previous BSS has actually been incorporated into the data type segment, but has not been displayed.

Heap and stack

View the virtual space distribution of the process as shown in the figure:

The meaning can be seen in my other article, "Analysis of the File Structure of/proc/{pid}/maps"

Among them, the major device number, the minor device number, and the file node number are all 0, indicating that it is not mapped to a file. This kind of VMA is called an anonymous virtual memory area (Anonymous Virtual Memory Area) . We are currently concerned that the two VMAs Heap and Stack exist in almost all processes. The malloc() function memory allocation is allocated from the heap. The heap is managed by the system library and vsyscallis located in the kernel space. The specific role is to be studied in detail. Judging from the name, it is guessed that it should be VMA related to kernel communication.

A process can be divided into the following areas:

  • Code VMA, permission is read-only, executable; there are image files
  • Data VMA, permissions can be read and written; there are image files
  • Heap VMA, readable, writable and non-executable; anonymous, scalable
  • Stack VMA, readable, writable and non-executable; anonymous, expandable downward

The details are shown in the figure:

Linux kernel loading ELF process

  • The bash process calls the fork()system call to create a new process

  • Call the execve()system call to execute the specified ELF file, the original bash process continues to return to wait for the end of the new process just started, and then wait for the user to enter a command. execve()Is defined in the unsitd.hfile, the prototype is as follows

    int execve(const char * filename,char * const argv[ ],char * const envp[ ]);

    The three parameters are the file name of the executed program, the execution parameters, and the environment variables

Ah~~~ The steps here are quite many and complicated, and I don t understand it without actual experiments, so let s just paste the image and watch it;

Then the final result is that the return address is changed to the entry address of the loaded ELF program:

  • Static link: the address pointed to by e_entry in the header file of the ELF file
  • Dynamic link: dynamic linker address

So far, the loading part of the executable file has been introduced