Learning Embedded Part 1: Initialization and Vector Table

Created: 2023-11-04 | Last edited: 2023-11-10

Introduction

In this post we will get the microcontroller fully initialized and discuss the linker, memory sections and the vector table. I will start by going over how a basic C program is laid out in memory, for those who do not know. If I were not doing this bare metal, this part of the series would not exist. When you use a HAL or any starter code, the startup code, a linker script and similar pieces are usually included. That is helpful when creating a new project, but to truly understand what is happening behind the scenes we need to write them ourselves. That is the focus of this post: understanding these very low level concepts.

Memory Sections

First off, let's go over the different sections in a normal C program. The job of the linker script is to create these sections and place them in the correct places in the microcontroller's Flash and SRAM.

.text: This section contains all of the machine code that is generated during compilation.
.rodata: This section contains any read-only data, such as const globals and string literals. Sometimes (and in my case) this section is a part of the .text section.
.data: This section contains variables that persist for the lifetime of the program and that are initialized. These include initialized global and static variables.
.bss: This section contains variables that persist for the lifetime of the program and that are NOT initialized. These will be set to 0 by the startup code.
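
To make the four sections concrete, here is a minimal sketch of where typical C objects usually end up with a GCC-style toolchain. All of the names (lookup_table, boot_count, rx_buffer, call_count, count_call) are made up for illustration, and the exact placement can vary with the compiler and its flags.

```c
#include <stdint.h>

/* Read-only data: usually .rodata (or merged into .text, as in this project). */
const uint8_t lookup_table[4] = {1, 2, 4, 8};

/* Initialized global: .data (the initial value 5 is stored in flash). */
uint32_t boot_count = 5;

/* Uninitialized global: .bss (expected to be zeroed by the startup code). */
uint32_t rx_buffer[16];

/* The machine code of this function lives in .text. */
uint32_t count_call(void)
{
    /* Uninitialized static local: also .bss, so it starts at 0. */
    static uint32_t call_count;

    call_count++;
    return call_count;
}
```

On a hosted system the C runtime does the zeroing and copying for us. On bare metal, boot_count only reads as 5 and rx_buffer only reads as 0 once our own startup code has run.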

Linker Script

The job of the linker is to take all of the object files and link them together into a single executable. This was shown when I went over the Makefile: main.o and init.o were built and then linked together into main.elf. The job of the linker script is to tell the linker where in memory to put the different sections of the object files. The linker script used in this project is shown below.

MEMORY
{
    flash : origin = 0x08000000, len = 256k
    sram1 : origin = 0x20000000, len = 48k
    sram2 : origin = 0x10000000, len = 16k
}

SECTIONS
{
    .text : {
        *(.vector_table);
        *(.text);
    } >flash

    .data : {
        _data_values = LOADADDR(.data);
        _data_start = .;
        *(.data);
        _data_end = .;
    } >sram1 AT>flash

    .bss : {
        _bss_start = .;
        *(.bss);
        _bss_end = .;
    } >sram1
}

Memory

Let's start with the MEMORY part of the file. This block lets us describe the different regions of memory in our microcontroller. The STM32L432KC has 256k of flash and 64k of SRAM, with the SRAM split into two regions (48k of SRAM1 and 16k of SRAM2). This information can be found in the datasheet. Specifying the regions, their start addresses and their lengths allows us to place specific parts of the program into specific parts of memory.
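
As a quick sanity check on those numbers, the regions can be mirrored in a few lines of C. This is only a sketch for doing the address arithmetic on a PC; the struct and the helper names (mem_region, region_end, region_contains) are my own and are not part of the project.

```c
#include <stdbool.h>
#include <stdint.h>

/* One entry of the MEMORY block: a named region with a start and a length. */
struct mem_region {
    const char *name;
    uint32_t origin;
    uint32_t length;
};

/* Values copied from the linker script above (k = 1024 bytes). */
static const struct mem_region regions[] = {
    { "flash", 0x08000000u, 256u * 1024u },
    { "sram1", 0x20000000u,  48u * 1024u },
    { "sram2", 0x10000000u,  16u * 1024u },
};

/* First address past the end of a region. */
uint32_t region_end(const struct mem_region *r)
{
    return r->origin + r->length;
}

/* True if addr lies inside the region. */
bool region_contains(const struct mem_region *r, uint32_t addr)
{
    return addr >= r->origin && addr < region_end(r);
}
```

The end of sram1 works out to 0x2000C000, which is the value used for the initial stack pointer in the vector table at the end of this post.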

As you can see in the SECTIONS part of the linker script, there is no .rodata section. The read-only data is kept with the .text section for now. The *(.text) syntax tells the linker to take the .text sections from all of the input object files and place them at that point in the output, in this case inside the output .text section. The *(.vector_table) line has to do with the microcontroller startup code and will be covered later. The ">flash" at the end of the .text section tells the linker to put the section into the flash region defined in the MEMORY block.

The .data section gets a bit more interesting. Here I have defined some extra symbols. The "." is the location counter, a variable that holds the current output address and is updated automatically by the linker. The _data_start and _data_end symbols therefore record where the .data section begins and ends in memory.

There are two kinds of addresses that are important here. The first is the LMA (load memory address), which is where the section's contents are stored in the programmed image; for us that is an address in flash. The second is the VMA (virtual memory address), which is the address the section has while the program runs, and the address the code uses to access it. For the previous section, .text, the ">flash" assigns the VMA to the flash region, and because there is no AT> the LMA is the same address. The .text section therefore sits in flash starting at 0x08000000 (since .text is the first section) and runs from there.

The .data section is different because it holds initialized variables: the initial values must be saved in flash so they survive power-off, but the variables themselves must live in RAM so they can be modified. Therefore we use the ">sram1 AT>flash" syntax, which sets the VMA to sram1 and the LMA to flash. In other words, the .data contents are stored in flash and must be copied into sram1 at startup. The LOADADDR() function gives us the LMA of a section. Having the LMA of .data gives us direct access to the stored initial values, so we can initialize the global and static variables ourselves, since this does not happen by default. This is what the next section of this post will cover.
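
The address bookkeeping can be sketched in C to see how the two kinds of addresses relate. This is a simplified model, not what the linker actually runs: it ignores alignment padding, the helper names (place_text, place_data, bss_vma) are hypothetical, and the .text size passed in below is an invented number.

```c
#include <stdint.h>

#define FLASH_ORIGIN 0x08000000u
#define SRAM1_ORIGIN 0x20000000u

struct section_addrs {
    uint32_t vma;  /* address the program uses at run time */
    uint32_t lma;  /* address where the bytes are stored in the image */
};

/* .text : { ... } >flash          (no AT>, so LMA == VMA) */
struct section_addrs place_text(void)
{
    struct section_addrs s = { FLASH_ORIGIN, FLASH_ORIGIN };
    return s;
}

/* .data : { ... } >sram1 AT>flash (VMA in sram1, LMA right after .text) */
struct section_addrs place_data(uint32_t text_size)
{
    struct section_addrs s = { SRAM1_ORIGIN, FLASH_ORIGIN + text_size };
    return s;
}

/* .bss : { ... } >sram1           (VMA follows .data; nothing stored in flash) */
uint32_t bss_vma(uint32_t data_size)
{
    return SRAM1_ORIGIN + data_size;
}
```

With an assumed .text size of 0x1000 bytes, place_data(0x1000) gives an LMA of 0x08001000 and a VMA of 0x20000000. With a .data size of 4 bytes, bss_vma(4) gives 0x20000004, which is consistent with the _bss_start entry in the symbol table sample shown later.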

Initialization

Since everything here is being done bare metal, everything has to be done manually. The linker script shown above only describes where things will be located in memory. We will use this information to manually initialize the memory sections.

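#include <stdint.h>

/* Symbols defined in the linker script; only their addresses are meaningful. */
extern uint32_t _data_values;
extern uint32_t _data_start;
extern uint32_t _data_end;
extern uint32_t _bss_start;
extern uint32_t _bss_end;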

With the declarations shown above we are able to access the symbols that are defined in the linker script. The extern keyword in C tells the compiler that the symbol exists even though it is not defined in this file or any included header files, so the compiler will not complain about an undeclared identifier. The linker then resolves each name to the address assigned in the linker script.

Initialization Code

Below is the code block that I will spend the majority of this section talking about. It is what will initialize the necessary memory sections. If you are paying attention you may realize that this code does not deal with the .text section at all. Before you read on, think for a minute about why this may be the case.

Answer

Have you figured it out? Well, if you recall, the .text section contains the program code.
The machine code is stored in flash, and the processor executes it directly from there.
Because the load address and the run-time address of .text are the same, nothing has
to be copied at startup; the section is already where it needs to be.

int main(void);

void
init(void)
{
    // copy the initial values of .data from flash (LMA) to sram1 (VMA)
    uint32_t* src = &_data_values;
    uint32_t* dest = &_data_start;
    uint32_t len = &_data_end - &_data_start; // length in 32-bit words

    while (len--)
        *dest++ = *src++;

    // zero out the uninitialized global/static variables
    dest = &_bss_start;
    len = &_bss_end - &_bss_start;
    while (len--)
        *dest++ = 0;

    main();
}

As you can see this code is fairly simple. To start, we simply declare variables that hold the addresses of the linker symbols. But wait, why are we taking the address of the linker symbols? In the linker script we set these symbols to specific memory addresses, so we should be able to just read those values, right? Unfortunately this logic is incorrect, and I made the same incorrect assumption myself. Let's do the same thing here: try to think about this and check yourself below (I could not figure this one out on my own).

Answer

Moral of the story: linker symbols don't work like normal C variables, they are
just symbols. A symbol table entry holds only a name and an address, and NO memory
is allocated to store a value for the symbol. Reading _data_start as a normal variable
would therefore return whatever happens to be stored at that address, not the address
itself. The address is the only information the linker gives us, so we take it
with the & operator. Dereferencing that address then gives us the value stored there.

Here is a small sample of the symbol table from our program.
The column before the name is the symbol's size. As you can see, _bss_start,
one of the linker symbols, has a size of 0, meaning it cannot hold a value.

08000ef9 g     F .text  0000001c systick_handler
08000ca1 g     F .text  00000050 blinky
20000004 g       .bss   00000000 _bss_start

NOTE: You can ignore the other symbols I am showing for now; we will get to
those in a later post. I put them here for comparison with a linker symbol.
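
The difference between "the value of the symbol" and "the address of the symbol" can be reproduced on a PC with a small stand-in. This is only an illustration: fake_ram and fake_data_start are hypothetical names, and a macro plays the role that the linker script plays on the target.

```c
#include <stdint.h>

/* Stand-in for RAM: on the target this memory would be .data in sram1. */
static uint32_t fake_ram[2] = { 0xCAFEF00Du, 0x12345678u };

/*
 * Stand-in for a linker symbol. A real one is declared as
 * `extern uint32_t _data_start;` and defined by the linker script.
 * Here the name is simply bound to the first word of fake_ram.
 */
#define fake_data_start (fake_ram[0])

/* Wrong for our purpose: reads the word stored AT the symbol's address. */
uint32_t symbol_as_value(void)
{
    return fake_data_start;
}

/* Right: the address itself is the information the linker gave us. */
uint32_t *symbol_as_address(void)
{
    return &fake_data_start;
}
```

Here symbol_as_value() returns 0xCAFEF00D, the contents of memory, while symbol_as_address() returns the location of that word. Some projects declare these symbols as arrays instead, for example `extern uint32_t _data_start[];`, so that the bare name already evaluates to the address; that is a style choice and not what this project does.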


Linker manual reference

Now that we have that covered, let's move on. We simply take the .data values stored in flash and copy them to the location in sram1 (the VMA) that is allocated for the .data section. The .bss section is even simpler because, if you recall, it holds uninitialized variables that persist for the lifetime of the program. C expects such variables to start out as 0, so we simply set them to 0. In terms of initialization that is basically it. As you can see, at the end of the init() function we make a call to main(). As you may assume, this means that init() is called before main(). That leads us into the last topic of the post: the vector table.
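
If you want to convince yourself that the two loops do what is described, they can be tried on a PC with ordinary arrays in place of flash and SRAM. The sketch below is an assumption-laden stand-in: copy_words, zero_words, fake_init and the fake_* arrays are hypothetical names that do not exist in the project, and the array contents are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `words` 32-bit words from src (LMA side) to dest (VMA side). */
void copy_words(uint32_t *dest, const uint32_t *src, size_t words)
{
    while (words--)
        *dest++ = *src++;
}

/* Set `words` 32-bit words starting at dest to zero. */
void zero_words(uint32_t *dest, size_t words)
{
    while (words--)
        *dest++ = 0;
}

/* Stand-ins for the regions that the linker symbols delimit on the target. */
static const uint32_t fake_flash_data[3] = { 10u, 20u, 30u }; /* at _data_values */
static uint32_t fake_ram_data[3] = { 0xFFu, 0xFFu, 0xFFu };   /* at _data_start  */
static uint32_t fake_ram_bss[2]  = { 0xAAu, 0xAAu };          /* at _bss_start   */

/* Same two steps as init(), minus the call to main(). */
void fake_init(void)
{
    copy_words(fake_ram_data, fake_flash_data, 3);
    zero_words(fake_ram_bss, 2);
}
```

The RAM arrays start with junk values on purpose, since SRAM contents after power-up are not something the program can rely on. After fake_init() runs, fake_ram_data holds 10, 20, 30 and fake_ram_bss holds zeros.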

Vector Table

The vector table, also known as the interrupt vector table, stores the addresses of the ISRs as well as a few other important pieces of data that get the system going. Those of you with a keen eye will have noticed that in the linker script the very first thing placed into flash is the vector table. Below is part of the vector table that is being used in this project.

const void* vectors[] __attribute__((section(".vector_table"))) = {
    (void*)0x2000C000, /* Stack pointer at top of sram1 48k */ 
    init,              /* Reset Handler */
    default_handler,   /* NMI */
    default_handler,   /* Hard Fault */
    /* ... */
};

Now that you can see the beginning of the vector table, it should make some sense why it is the first thing placed in flash. Upon reset the processor reads the vector table. The first entry is the address of the top of the stack, which the processor loads into the stack pointer; this sets up the environment C needs to run. The stack pointer has to be loaded first because the next entry, the reset handler, is the address of the first function that is run. This will be the init() function, for the reasons we discussed earlier, and at the end of init(), main() is called. So finally, after all of this, we are at a point where we can start writing code in main() that will run on the microcontroller. The vector table has many more entries, as it stores the addresses of all of the ISRs. As we go over different peripherals and need to use their interrupts, we will modify the vector table to let the system know where the interrupt handlers are located.
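
The reset sequence described above can be modelled in a few lines of C. This is a toy model for a PC, not real startup code: fake_cpu, fake_reset and fake_vectors are hypothetical names, and 0x08000101 is an invented stand-in for the address of init().

```c
#include <stdint.h>

/* Minimal model of the two registers that matter at reset. */
struct fake_cpu {
    uint32_t sp;  /* stack pointer */
    uint32_t pc;  /* program counter */
};

/*
 * What the core does with the first two vector table words at reset:
 * word 0 becomes the stack pointer, word 1 is the reset handler address.
 * Bit 0 of a Cortex-M handler address is the Thumb bit, so it is masked
 * off here to get the address of the first instruction.
 */
struct fake_cpu fake_reset(const uint32_t *table)
{
    struct fake_cpu cpu;

    cpu.sp = table[0];
    cpu.pc = table[1] & ~1u;
    return cpu;
}

/* Hypothetical table: 0x08000101 stands in for the address of init(). */
static const uint32_t fake_vectors[2] = { 0x2000C000u, 0x08000101u };
```

Calling fake_reset(fake_vectors) leaves sp at 0x2000C000 and pc at 0x08000100. The odd function addresses in the symbol table sample earlier (for example 08000ef9 for systick_handler) appear to be this same Thumb bit, since a function cannot actually start at an odd address.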