How a computer program works
A computer program is a sequence of instructions for a computer to do a particular task.
A task can be as simple as calculating a value or something more sophisticated, like playing music or browsing the web.
All computer programs work in the same way; they receive an input and return an output.
For instance, when you type a URL into your browser’s address bar, you’re providing input. Needless to say, the web page you see is the output.
This mutual interaction is called input and output, a.k.a. I/O.
In this guide, we’ll discuss what exactly happens – in plain English – from the time you open a program until it sends you back some output.
This article is the first part of the article series The journey of a program to the CPU – A friendly guide for the curious coders.
What will be covered:
- A bit of history on running programs
- Device drivers
- Compilers
- Processes
- A bit of computer memory
When you open a program
When you open a program, the operating system allocates a portion of memory (as large as needed) to the program.
This space will contain a data structure that holds all the information about the program for as long as it’s running.
Next, the OS loads the program’s instructions, as well as its data, into that space.
What’s loaded into memory isn’t the program file itself, but an instance of the program, transformed into an executable unit.
If programs had a soul, this would be it.
This unit is called a process.
Since we usually have tens of programs running simultaneously, the operating system uses scheduling algorithms to fairly distribute system resources among them.
From here, the CPU (which is the brain of the computer) takes over and executes the instructions of the process that’s scheduled to run, one after another.
We’ll cover these steps in more detail in the coming sections.
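If you’re curious what asking the operating system to start a program looks like from a programmer’s point of view, here’s a minimal sketch in C, assuming a Unix-like system: fork() asks the OS to create a new process, and execv() loads a program into it (the path /bin/ls is just an example).

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();              /* ask the OS to create a new process */

    if (pid == 0) {
        /* Child process: replace its memory with the program we want to run. */
        char *args[] = {"ls", "-l", NULL};
        execv("/bin/ls", args);      /* the OS loads the program and runs it */
        perror("execv failed");      /* only reached if the program couldn't be loaded */
        exit(1);
    } else if (pid > 0) {
        /* Parent process: wait until the child is scheduled, runs, and finishes. */
        waitpid(pid, NULL, 0);
        printf("Child process %d finished\n", (int)pid);
    } else {
        perror("fork failed");
    }
    return 0;
}
```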
How it started
Operating systems make it easy to run tens of programs with a few clicks of a mouse.
This wasn’t always the case, though.
There was a time when there was no operating system at all.
In the early days of computing, computers were only able to run one program at a time, loaded into memory by a human operator.
Programmers gave their program as a deck of punched cards to an operator and came back a few hours later (probably after having a coffee) for the printed results.
Over time, computers got faster and faster and could execute way more than what bare hands could feed them.
Operating systems were invented to make this slow, human-driven workflow as automatic as possible, and to use the available computing resources to the fullest.
Device Drivers
Back in the day, a program written for a certain device model couldn’t be used with another model.
Imagine a program that could only print its output to one particular model of printer, but not to any other.
There was no specification requiring manufacturers to implement standardized hardware APIs.
Programmers had to know everything about a certain model of hardware to program for it.
And not every device worked the same way.
At the same time, sharing software was becoming a thing; software companies were selling software products to universities, banks, department stores, etc.
However, hardware inconsistencies were still a nightmare for many programmers; testing their code on every single device simply wasn’t feasible.
Operating systems fixed this problem as well, by abstracting away the complexities associated with computer hardware.
They provided a set of contracts manufacturers had to implement.
As a result, any manufacturer had to supply a computer program alongside the actual device.
These programs are called device drivers.
Device drivers are installed on the operating system and are the interface between computer programs and the underlying device.
Nowadays, device drivers are provided by the hardware manufacturers – either on a CD that ships with the device or already included in the operating system’s driver library.
You’ve probably installed a driver or two for your modem, graphics card, sound card, or printer when setting up a fresh operating system.
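To see what this abstraction buys us, here’s a minimal sketch in C, assuming a Linux-like system where a USB printer is exposed by its driver as the device file /dev/usb/lp0 (that path is purely illustrative and varies between systems). The program only talks to a generic file interface and never touches the printer hardware directly.

```c
#include <stdio.h>

int main(void) {
    /* The driver exposes the printer as an ordinary file; the path below
       is hypothetical and differs between systems. */
    FILE *printer = fopen("/dev/usb/lp0", "w");
    if (printer == NULL) {
        perror("could not open printer device");
        return 1;
    }

    /* We just "write to a file"; the driver translates this into the
       printer-specific commands behind the scenes. */
    fprintf(printer, "Hello from a hardware-agnostic program!\n");
    fclose(printer);
    return 0;
}
```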
How a bunch of electronic circuits can understand a program
Although computer programs are instructions for computers, these instructions are initially written in a human-friendly – high-level – programming language.
That said, a written program cannot be instantly executed by a machine.
To be executed by a machine, a computer program has to be translated into a set of low-level instructions that a machine can understand.
This low-level equivalent is called machine code, or native code.
Every application installed on your computer is already translated to machine-ready code.
The machine code is in the form of primitive instructions hardwired into the CPU’s architecture.
These instructions are known as the CPU’s instruction set.
CPU instructions are basic operations, such as memory access, bit manipulation, increment, decrement, addition, subtraction, multiplication, division, comparison, and so on.
The CPU is extremely fast at executing these operations, though; a modern CPU can execute more than two billion instructions per second.
A single instruction doesn’t necessarily mean much on its own, but instructions become meaningful when they all come together at a higher level.
Imagine a 3D printer that extrudes plastic bit by bit, layer after layer, until the final model appears. You can think of each bit of plastic as the result of a single instruction – it only makes sense once everything is put together at the end.
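As a rough illustration, here’s a one-line C statement with comments sketching the kind of primitive instructions it might be broken down into (the exact instructions depend on the CPU’s instruction set and the compiler):

```c
#include <stdio.h>

int main(void) {
    int a = 5;
    int b = 7;

    /* A single high-level statement... */
    int c = a + b;

    /* ...roughly becomes a handful of primitive instructions, something like:
         1. load the value stored at a's address into a CPU register
         2. load the value stored at b's address into another register
         3. add the two registers
         4. store the result at c's address */

    printf("c = %d\n", c);
    return 0;
}
```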
Introducing compilers
Now, the question is: how is a high-level program translated into machine code?
Enter Compilers!
Compilers are programs that translate a program written in a high-level programming language into a set of primitive instructions a CPU can execute.
Each programming language has its own set of compilers; for instance, a program written in C++ cannot be compiled to machine code with a Python compiler, and vice versa.
Additionally, a C++ program that has been compiled to run on Windows cannot run on Linux.
So programmers need to pick a compiler based on both the programming language and the target machine.
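For example, here’s a tiny C program, with comments showing how you might typically compile it for different targets. The compiler invocations are illustrative – the exact tool names depend on the toolchain installed on your machine.

```c
#include <stdio.h>

int main(void) {
    printf("Hello, world!\n");
    return 0;
}

/* Typical (illustrative) compiler invocations:
     gcc hello.c -o hello                          -> native Linux binary
     x86_64-w64-mingw32-gcc hello.c -o hello.exe   -> Windows binary via a cross-compiler
   The resulting files contain different machine code, so a binary built
   for one operating system can't simply run on another. */
```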
Types of compilers
Compilers basically pass through the source code line by line, until a few hundred lines of high-level code have been translated into thousands of low-level, machine-ready instructions.
Depending on the features of the source language, some compilers might need to pass through the code more than once.
This categorizes compilers into single-pass and multi-pass compilers.
In the early days of compiling, compiler design was mostly shaped by resource limitations.
At the time, compiling a whole program in a single pass often wasn’t possible due to hardware constraints.
This led computer scientists to split the compilation process into multiple passes – each pass going over the source code for a different purpose.
Nowadays, hardware is no longer the limiting factor, but some modern language features are complicated enough to require a multi-pass approach anyway.
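A simple way to see why a single pass isn’t always enough: in the C sketch below, is_even calls is_odd before the compiler has seen its definition. C handles this with a forward declaration; languages that don’t require one typically resolve such references with an extra pass over the code. The snippet is just an illustration.

```c
#include <stdio.h>
#include <stdbool.h>

/* Forward declaration: tells a single-pass compiler that is_odd exists
   before it encounters the call inside is_even. */
bool is_odd(unsigned int n);

bool is_even(unsigned int n) {
    return n == 0 ? true : is_odd(n - 1);
}

bool is_odd(unsigned int n) {
    return n == 0 ? false : is_even(n - 1);
}

int main(void) {
    printf("4 is even? %s\n", is_even(4) ? "yes" : "no");
    return 0;
}
```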
Compiling phases
Designing a compiler for a programming language is a complicated job.
That’s why compiler creators split the compiler’s functionality into multiple independent sub-systems, also known as phases.
Each phase has its own responsibility and might involve multiple passes over the source code.
The phases are:
- The front end
- The middle end
- The back end
In the front-end phase, the source program is translated into an intermediate representation, known as the IR.
The IR is no longer the source program, but it isn’t the target machine code yet either.
The IR is reused across the remaining phases until the final target code is eventually generated.
Lexical analysis, syntax analysis, semantic analysis, and type checking (in type-safe languages) are done during this phase to make sure the program is written correctly.
The middle end, however, is more involved in code optimization, including dead-code elimination (removing code that has no effect on the program’s output).
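As a rough illustration of dead-code elimination, here’s a sketch in C – an optimizing compiler would typically drop the parts marked below, since they can never affect the program’s output:

```c
#include <stdio.h>

int square(int x) {
    int unused = x * 100;   /* computed but never read: the optimizer can drop it */

    if (0) {
        printf("unreachable\n");   /* condition is always false: the whole branch can go */
    }

    return x * x;
}

int main(void) {
    printf("%d\n", square(6));
    return 0;
}
```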
The back-end phase takes the output of the middle end and performs another round of optimization, this time specific to the target machine.
Finally, the target representation is generated, such as machine code or bytecode.
This representation contains the instructions that are eventually decoded and executed by the CPU.
Although the front end, middle end, and back end are connected through well-defined interfaces, they work independently of each other.
This enables compiler creators to mix & match various phases to make special compilers for different source programs and target machines.
For instance, the same front end can be mixed with different back ends to make C++ compilers for macOS, Windows, and Linux.
Developing compilers as decoupled phases means smaller programs to maintain, which eventually results in more reliable compilers.
Processes
As mentioned in the introduction, an instance of a program loaded into the memory is called a process.
When you run a program, the operating system allocates a portion of the main memory (RAM) to the process.
This space contains every piece of information the process needs to run, as well as everything it will generate at run time.
This information is kept in a data structure called the Process Control Block (PCB).
A PCB contains the following information about the respective process:
- The program’s instructions in machine-ready code.
- The program’s inputs and outputs
- The data generated during the run time – this section is called the heap
- A call stack to keep track of called functions (subroutines) in the program
- A list of descriptors (a.k.a. handles) the process is using, including open files, database connections, etc.
- The program’s security settings, like who is allowed to run the program
Any time you open a program, the operating system creates a PCB for it in memory.
This means if you run the same program three times, you’ll have three separate processes loaded into memory.
For example, Google Chrome manages each tab as an independent process.
This means once you open a new tab in Google Chrome, a new process is created.
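Here’s a minimal sketch in C to see this for yourself, assuming a Unix-like system: the program below just prints its own process ID, and every time you run it, the OS reports a different one, because a brand-new process is created for each run.

```c
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Each run of this program gets its own process, and therefore its own ID. */
    printf("I am process %d\n", (int)getpid());

    /* Keep the process alive for a bit so you can start several copies
       side by side and compare their IDs. */
    sleep(10);
    return 0;
}
```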
Process security
For security reasons, a process cannot access the PCB of another process.
This memory-level isolation ensures that a buggy or malicious process can’t tamper with another process.
This strategy is called Memory Protection.
In Google Chrome’s case, an unresponsive tab wouldn’t affect the other tabs as they are managed in independent processes.
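As a deliberately broken sketch in C, the snippet below tries to write to an arbitrary address that doesn’t belong to the process; on a typical modern OS, memory protection stops it and the process is killed with a segmentation fault instead of being allowed to corrupt anyone else’s memory.

```c
#include <stdio.h>

int main(void) {
    /* An arbitrary address this process never asked the OS for. */
    int *not_ours = (int *)0xDEADBEEF;

    printf("About to touch memory that isn't ours...\n");
    *not_ours = 42;   /* memory protection kicks in: typically a segmentation fault */

    /* On a typical OS this line is never reached. */
    printf("This should not be printed.\n");
    return 0;
}
```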
We’ll discuss memory protection in more detail in the next sections.
Computer’s main memory
Memory – a.k.a. the main memory or RAM – is one of the main components of a computer.
The memory stores process instructions and process data.
Memory is the only storage system the CPU can directly access to store and retrieve data.
That’s why it’s called the Main Memory.
Memory consists of many physical blocks to store data.
Each block has a unique address for reference, e.g. 1, 2, 3, 4, and so forth.
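You can actually peek at these addresses from a C program: the & operator gives you the address where a variable lives. (The numbers you’ll see are virtual addresses and differ between runs and machines.)

```c
#include <stdio.h>

int main(void) {
    int x = 10;
    int y = 20;

    /* &x and &y are the addresses of the memory blocks holding x and y. */
    printf("x lives at address %p and holds %d\n", (void *)&x, x);
    printf("y lives at address %p and holds %d\n", (void *)&y, y);
    return 0;
}
```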
Each time the CPU needs to fetch something (like process instructions or data) from memory, it sends the memory a set of signals over groups of wires called the address bus and the control bus.
These signals include the memory address (in binary format) sent over the address bus and a read signal over the control bus.
Once the memory receives the read request from the CPU, it fetches the data and returns it over the data bus – another set of wires used to transmit data.
Address bus, control bus, and data bus are the components of a larger signal transmission network, called the system bus. I’ll cover the system bus in the next chapters of this series.
Alright, I think that does it for this article.
So far, you know how a computer program is transformed into a process and placed into the main memory to be executed by the CPU.
In the next article, I’ll cover how these memory-based instructions & data are executed by the CPU.