Basic knowledge of programming

Editor's note

In mid-September the OpsDev team started a technology-sharing series in the form of "open classes", in which team members who are strong in a particular field systematically walk everyone through its technical points and details. The first open class, "Basics of Applied Programming", is taught by Mr. Li Gang and is divided into 5 sessions. The syllabus and key takeaways of each session will be compiled and shared so that more people can learn together.

PS: For more first-line technology in a variety of formats, follow the "HULK first-line technology talk" account!


I have been doing web programming on Linux for many years. Recently I had the honor of giving a training course to colleagues in the group, and I hope to pass on the application programming experience I have gathered over the years. Today I will introduce some basic knowledge of programming.




Operating system introduction

Let's first look at the architecture diagram of a Unix system:

From the inside out, the Unix system architecture is divided into:

  1. Kernel: Controls hardware resources and provides an environment for running applications

  2. System call: the programming interface of the kernel

  3. Shell and library functions: provide programming and running interfaces for applications

  4. Applications: the programs we write ourselves


System calls and library functions

Applications can call both system calls and library functions, and many library functions call system calls themselves. The following picture shows the difference between the two:

As mentioned above, the kernel controls hardware resources. Reading and writing files on disk, for example, means driving the hard disk hardware to perform IO, and that has to be done by the kernel. How do we ask the kernel to do it? That is the job of system calls: they are the set of programming interfaces exposed by the kernel, and calling one of them executes code inside the kernel.

Library functions are not kernel code but a higher-level wrapper. Take the commonly used printf function: we call it to output content to the display, but actually controlling the display is the kernel's job. The kernel provides the write system call, and printf wraps write for the application, offering a friendlier way to use it.

In summary, a system call is the entry point into kernel code, and a library function is a friendly wrapper around functionality the application needs.


User mode and kernel mode

There will be a difference between user mode and kernel mode when the program is running, please see the figure below:

In a nutshell, while the application code we wrote is executing, the program is running in user mode. When that code makes a system call, the kernel's code is executed next, and during that time the program is running in kernel mode.


How the program is executed

We use the most classic hello world to illustrate how the program is executed.

Program source code hello.c:

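The hello.c we wrote is stored on disk. We first compile it to get the executable file hello, for example with gcc -o hello hello.c.

Compilation actually goes through several stages. With a typical toolchain such as gcc these are: preprocessing (expanding #include and macros), compiling to assembly, assembling into an object file, and linking with library code such as printf to produce the executable.

When we run the hello program, it prints hello, world to the terminal.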

The actual execution process is as follows:

We type ./hello on the keyboard. The shell reads each character into a register and then stores it in main memory.

When we press Enter, the shell knows that we have finished typing the command, and it loads the hello program from disk into main memory.

This uses DMA (direct memory access), which moves data directly from disk to main memory without passing through the CPU.

Once the code is in main memory, the CPU starts executing the program's instructions. These instructions copy the hello, world\n string from memory to registers and then on to the display device, which is responsible for showing the result.


Process, thread, coroutine

Let's first look at the layout of a classic program in memory:

  1. The text section holds our program code

  2. The initialized data section holds global and static variables that are explicitly initialized in our code

  3. The bss section holds global and static variables that are not initialized in the code

  4. The heap is the memory our program requests dynamically, for example by calling malloc; this space can be very large

  5. The stack is used by local variables and function calls in the program


What is a process? As I understand it, a process is the concrete form a running program takes in the operating system.

Each process has its own copy of the layout above in memory; these copies are independent and are not shared between processes.


So what is a thread? As I understand it, each thread represents an independent flow of execution.

For example, if the hello program needs to execute the printf statement 10 times, the 10 calls can be executed one after another, which is a single execution flow.

We can also create 10 execution flows, each of which executes printf once. If we have at least 10 CPU cores, the 10 printf calls can run in parallel.

Therefore, the thread is the smallest unit of operating system scheduling.

Among the threads of one process, each thread has its own stack, while all other data is shared. This is why the first problem to solve in high-performance multi-threaded programming is locking.


So what is a coroutine? Some modern programming languages have this concept.

As I understand it, a coroutine is a user-mode concept.

Threads are scheduled by the operating system kernel, and we cannot intervene. Coroutines are scheduled in user mode, which means the application does the scheduling itself.

Because the scheduling happens in user mode, it amounts to multiple coroutines running inside one thread.

Note that only the kernel's scheduling of threads can spread work across multiple CPU cores and make a program run in parallel, so multiple coroutines inside one thread cannot run in parallel.

Sharing between parent and child processes

Let's look at a picture:

This picture describes the data structures the operating system maintains when our program opens a file with the open system call. The key point is that the leftmost part, the process table entry, can be thought of as belonging to the process itself (strictly speaking the descriptor table is also kept by the kernel, but each process has its own), while the file table entries and v-node table entries are shared structures inside the kernel.

When we use fork to create a new process, the per-process part is copied to the child process, and the kernel part stays as it is, as shown in the following figure:

Therefore, parent and child share the same kernel data structures through the same file descriptors. This feature is the key to implementing a smooth restart of a service later on.


Concluding remarks

The above is the basic knowledge that I have found most useful in my years of programming. With a correct understanding of it, you can work out the answers to many questions on your own.

I am still learning too. If you find a mistake, please point it out so that we can make progress together. Thank you!

HULK first-line technology talk

A technology-sharing official account created by the 360 cloud platform team. It covers cloud computing, databases, big data, monitoring, pan-front-end, automated testing and many other technical fields. Drawing on solid technical accumulation and rich first-line practical experience, it aims to bring you the most forward-looking technology sharing.