What happens when I run a program on my computer?

My job for about a year now has been writing embedded C on bare metal – meaning there is no operating system on the computer my software runs on.
Doing this kind of work has really pushed me to learn more about the inner workings of computers, and I've really enjoyed it 😊

I thought it would be fun to write a wee guide to what happens under the hood when you run a program on your PC. I hope it can help anyone interested get started with understanding all this cool stuff which is easy to ignore when you don't need to know about it.

The CPU (Central Processing Unit)

Whenever you perform any kind of action on a computer, whether that be running a Python script or browsing the web, somewhere down the line this will result in your CPU doing something. The CPU is where software happens. Some people find it helpful to think of it as the computer's brain: it is the part of a computer that performs calculations and runs programs.

Modern CPUs are almost always microprocessors, meaning the whole thing is on a single small silicon chip.

Instructions

The CPU takes a well-defined set of inputs, called instructions, that tell it what to do. When you compile a C program, you are transforming it into a set of these instructions to be executed by the CPU.

These instructions are delivered to the CPU from memory.

Memory

Computing requires data; otherwise, what are we computing? We need somewhere to store this data, which is where memory comes in.

There are different types of memory on a computer, but perhaps the most important distinction is that between disk and main memory.

Main memory is where stuff that the computer is currently using is stored. RAM (Random Access Memory) is used for this as it's quick to access. Although RAM encompasses a few different types of data storage, some of them non-volatile, the term is often used interchangeably with main memory.

Disk, on the other hand, refers to the permanent storage on your computer. It is slower to access but is non-volatile and allows for larger storage capacity.

Even though reading and writing RAM is faster than the same operations to disk, most read/writes to RAM go through a small cache that lives on the CPU itself, which is even faster.

So: let's say we have a compiled C program saved on our disk. From above, we know that running it will result in some instructions being run on our CPU.
But what steps have to happen in between?

The Program Counter

Your compiled C program is stored on your disk, but if we had to grab each CPU instruction sequentially from disk, it would be really slow. Instead, your operating system copies it to RAM, where it can be accessed much faster. Next, your OS has to tell the CPU where your program is. It does this by loading the memory address of your program's first instruction into a special register on the CPU.

CPUs all have a set of registers, which are small sections of very fast storage that the CPU accesses directly. There's a special register called the Program Counter, which stores a memory address. Hardwired logic in the CPU causes it to read this memory address, where a CPU instruction will be stored. The CPU then decodes and executes the instruction. The Program Counter then increments, and the CPU reads the next instruction. This is how our program is run 😊
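To make the fetch, decode, execute cycle a bit more concrete, here is a minimal sketch of a toy CPU in C. Everything in it is made up for illustration: the two-byte instruction format, the opcodes `OP_LOAD`, `OP_ADD` and `OP_HALT`, and the single accumulator register are assumptions, not how any real processor encodes its instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction set: each instruction is two bytes,
 * an opcode followed by an operand. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2 };

/* Run the program stored in `memory` and return the accumulator
 * when a halt instruction is reached (-1 on an unknown opcode). */
int32_t run_toy_cpu(const uint8_t *memory, size_t size)
{
    size_t pc = 0;    /* the Program Counter: address of the next instruction */
    int32_t acc = 0;  /* a single general-purpose register */

    for (;;) {
        assert(pc + 1 < size);            /* stay inside our toy memory */

        uint8_t opcode = memory[pc];      /* fetch */
        uint8_t operand = memory[pc + 1];
        pc += 2;                          /* move on to the next instruction */

        switch (opcode) {                 /* decode and execute */
        case OP_LOAD: acc = operand;  break;
        case OP_ADD:  acc += operand; break;
        case OP_HALT: return acc;
        default:      return -1;
        }
    }
}
```

Running it on the six bytes `{ OP_LOAD, 2, OP_ADD, 3, OP_HALT, 0 }` loads 2, adds 3, and halts with 5 in the accumulator. A real CPU does the same loop in hardware rather than in a `switch` statement.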

Nice and simple, right? In reality there are a few more steps in between.

Virtual memory and the MMU

The above description assumes that your computer program has been loaded into a contiguous section of memory. In reality, this wouldn't be very efficient and could result in lots of gaps of unused RAM. Your computer's operating system copies your program into multiple different sections of RAM, and then maps that fragmented memory onto a contiguous virtual address space.

The CPU has address translation hardware called the Memory Management Unit (MMU) that translates virtual addresses to physical ones.
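Here is a rough sketch of what that translation looks like, written as C rather than hardware. The 4096-byte page size, the four-entry `page_table` and the frame numbers in it are all assumptions picked for the example; a real MMU uses multi-level tables set up by the OS and caches recent translations.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed page size in bytes */
#define NUM_PAGES 4u     /* size of our tiny virtual address space, in pages */

/* page_table[virtual page number] = physical frame number.
 * The frames are deliberately scattered to mimic fragmented RAM. */
static const uint32_t page_table[NUM_PAGES] = { 7, 2, 9, 4 };

/* Translate a virtual address into a physical one. */
uint32_t translate(uint32_t virtual_address)
{
    uint32_t page = virtual_address / PAGE_SIZE;    /* which virtual page */
    uint32_t offset = virtual_address % PAGE_SIZE;  /* position inside it */

    assert(page < NUM_PAGES);  /* a real MMU would raise a fault here */

    return page_table[page] * PAGE_SIZE + offset;
}
```

With this table, virtual address 4112 (page 1, offset 16) lands at physical address 8208 (frame 2, offset 16). The program only ever sees the tidy virtual addresses 0 to 16383, even though the frames behind them are scattered.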

Virtual memory solves another problem: often the amount of RAM on your computer simply won't be enough to hold all the different programs you want to use at once. To handle this, the OS copies areas of RAM that haven't been used recently back to disk, freeing up space in RAM for other programs. This technique is called swapping, and virtual memory hides the complexity of this from application code.

Swapping creates the illusion of much more RAM than you actually have, though it does come with a performance penalty.
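As a sketch of how the "which area gets swapped out?" decision might be made, here is a tiny least-recently-used policy in C. The choice of LRU, the three-frame RAM and the `access_page` helper are assumptions for illustration; real operating systems use cheaper approximations, and the actual copying to and from disk is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_FRAMES 3  /* pretend RAM only has room for three pages */

struct frame {
    int page;            /* page held in this frame, or -1 if free */
    uint64_t last_used;  /* tick of the most recent access */
};

static struct frame frames[NUM_FRAMES] = { { -1, 0 }, { -1, 0 }, { -1, 0 } };
static uint64_t clock_ticks;

/* Record an access to `page`. Returns the page that had to be swapped
 * out to make room, or -1 if nothing was swapped out. */
int access_page(int page)
{
    size_t victim = 0;

    assert(page >= 0);
    clock_ticks++;

    for (size_t i = 0; i < NUM_FRAMES; i++) {
        if (frames[i].page == page) {       /* already in RAM */
            frames[i].last_used = clock_ticks;
            return -1;
        }
        if (frames[i].last_used < frames[victim].last_used) {
            victim = i;                     /* oldest frame seen so far */
        }
    }

    /* Not in RAM: reuse the least recently used (or a free) frame. */
    int evicted = frames[victim].page;
    frames[victim].page = page;
    frames[victim].last_used = clock_ticks;
    return evicted;
}
```

Accessing pages 1, 2 and 3 fills the three frames. Touching page 1 again and then asking for page 4 swaps out page 2, since it is now the one that has gone longest without being used.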

Another advantage virtual memory offers is increased security, as it isolates the memory of separate processes.

Quick summary

Our computer program lives on our disk, where it is stored permanently. When we want to run the program, the OS copies the contents of the program to memory, where it can be accessed quickly. In order to use the available memory efficiently, the OS creates a contiguous virtual address space to hide the reality of fragmented physical memory. This also allows us to swap data back and forth between disk and memory depending on whether it's currently being used, allowing for more programs to run at once.

Once our program is loaded into memory, we point the CPU to its first location in memory, which contains a CPU instruction. The CPU's Memory Management Unit translates the virtual memory addresses to real physical ones. Once we have executed the first instruction, we move to the next address, and execute the instruction there, until the program is done.

Thank you

I hope you enjoyed this little tour through your computer. The engineering that has gone into modern computers is amazing, particularly the microelectronics! Although there are some other details I have left out, and I'm sure there are plenty I don't know about, I hope this is a nice overview. Things look a little different on many embedded targets, particularly without an OS. Maybe I will write something about that next 😊