Executing programs stored in external SPI flash memory on an ARM processor


Problem description

I have an ARM processor that is able to interface with an external flash memory chip. Written to the chip are programs compiled for the ARM architecture ready to be executed. What I need to know how to do is get this data from the external flash onto the ARM processor for execution.

Can I run some sort of copy routine ahead-of-time where the data is copied into executable memory space? I suppose I could, but the ARM processor is running an operating system and I don't have a ton of space left over in flash to work with. I'd also like to be able to schedule the execution of two or even three programs at once, and copying multiple programs into internal flash at one time isn't feasible. The operating system can be used to launch the programs once they're within accessible memory space, so anything that needs to be done beforehand can be.

Solution

From reading the existing answers by @FiddlingBits and @ensc I think that I can offer a different approach.

You said that your Flash chip can not be memory mapped. This is a pretty big limitation but we can work with it.

Yes, you can run a copy routine ahead of time. As long as you copy the code into RAM, you can execute it.
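A minimal sketch of such a copy routine, written so it can be tried on a host machine. The array `fake_spi_flash` and the driver call `spi_flash_read` are hypothetical stand-ins for the real chip and its SPI driver, and `CHUNK_SIZE` is an arbitrary choice; none of these names come from the question.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the external SPI flash chip (hypothetical: a real chip is
 * reached through SPI read commands, not through a C array). */
uint8_t fake_spi_flash[4096];

/* Hypothetical driver call: read `len` bytes starting at flash offset `addr`.
 * Returns 0 on success, -1 if the request runs past the end of the chip. */
int spi_flash_read(uint32_t addr, void *dst, size_t len)
{
    if (addr > sizeof fake_spi_flash || len > sizeof fake_spi_flash - addr)
        return -1;
    memcpy(dst, &fake_spi_flash[addr], len);
    return 0;
}

#define CHUNK_SIZE 64u  /* bytes per SPI transaction (assumed value) */

/* Copy a program image from external flash into a RAM buffer, one chunk
 * at a time. Returns 0 on success, -1 if the image does not fit or a
 * read fails. */
int load_program(uint32_t flash_addr, uint8_t *ram, size_t ram_size,
                 size_t image_size)
{
    size_t done = 0;

    if (image_size > ram_size)
        return -1;

    while (done < image_size) {
        size_t n = image_size - done;
        if (n > CHUNK_SIZE)
            n = CHUNK_SIZE;
        if (spi_flash_read(flash_addr + (uint32_t)done, ram + done, n) != 0)
            return -1;
        done += n;
    }
    return 0;
}
```

On the real target, the last step would be to call into the copied image through a function pointer; that part depends on how the image was linked and is left out here.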

DMA to make it faster:

If you have a Peripheral DMA Controller (like the one available on the Atmel SAM3N family) then you can use the DMA Controller to copy out chunks of memory while your processor does actually useful things.

MMU to make it simpler:

If you have an MMU available then you can do this easily by just picking out a region of RAM where you want your code to execute, copying the code into it and on every page fault, reloading the correct code into the very same region. However, this was already proposed by @ensc so I'm not adding anything new yet.

Note: In case it's not clear, an MMU is not the same as an MPU

No MMU solution but an MPU is available:

Without an MMU the task is a little trickier but it is still possible to do. You will need to understand how your compiler generates code and read up about Position Independent Code (PIC). Then you will need to allocate a region in RAM that you will execute your external flash chip code from and copy parts of it in there (making sure that you start executing it from the correct location). The MPU will need to be configured to generate a fault any time that task tries to access memory outside of its assigned region and you will then need to fetch the correct memory (this could become a complicated process), reload and continue execution.
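The reload-on-fault idea can be imitated in plain C on a host machine. This is only a software simulation under stated assumptions: `ext_flash`, `ram_window`, `handle_fault` and `window_read` are hypothetical names, the window size is arbitrary, and on real hardware the range check would be done by the MPU raising a fault rather than by an `if`.

```c
#include <stdint.h>
#include <string.h>

#define WINDOW_SIZE 256u             /* size of the RAM execution region (assumed) */

uint8_t  ext_flash[2048];            /* stand-in for the external flash chip */
uint8_t  ram_window[WINDOW_SIZE];    /* the one RAM region the task may use */
uint32_t window_base = UINT32_MAX;   /* flash offset currently loaded; none yet */
unsigned reload_count;               /* how many times we had to reload */

/* What the fault handler would do: load the window-sized block of
 * external flash that contains `addr` into the RAM region. */
void handle_fault(uint32_t addr)
{
    window_base = addr - (addr % WINDOW_SIZE);
    memcpy(ram_window, &ext_flash[window_base], WINDOW_SIZE);
    reload_count++;
}

/* Read one byte at flash offset `addr` through the RAM window.
 * The caller must keep `addr` below sizeof ext_flash. */
uint8_t window_read(uint32_t addr)
{
    if (window_base == UINT32_MAX ||
        addr < window_base || addr >= window_base + WINDOW_SIZE)
        handle_fault(addr);          /* outside the assigned region */
    return ram_window[addr - window_base];
}
```

Two reads inside the same 256-byte block cost one reload; crossing into the next block costs another, which is where the time goes on a slow SPI bus.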

No MMU and no MPU available:

If you have neither an MMU nor an MPU, this task becomes very difficult. In both cases you have a severe restriction on how big the external code can be. Basically, the code that is stored on the external Flash chip must now fit entirely inside the allocated region in RAM that you will execute it from. If you can split that code up into separate tasks that don't interact with each other, then you can do it; otherwise you cannot.

If you are generating PIC then you can just compile the tasks and place them in memory sequentially. Otherwise, you will need to use the linker script to control the code generation such that each compiled task that will be stored in external flash will execute from the same predefined location in RAM (which will either require you to learn about ld overlays or compile them separately).

Summary:

To answer your question more completely I would need to know what chip and what operating system you are using. How much RAM is available would also help me better understand your constraints.

However, you asked if it was possible to load more than one task at a time to run. If you use PIC like I suggested, it should be possible to do so. If not, then you would need to decide ahead of time where each of the tasks will run, and that would let you load and run some combinations of them simultaneously.
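For the non-PIC case, deciding ahead of time where each task runs could look roughly like the sketch below. The header layout `task_header_t`, the `TASK_MAGIC` value and the function names are all hypothetical; the idea is only that each image records the RAM address it was linked for, and the loader refuses combinations whose ranges collide.

```c
#include <stddef.h>
#include <stdint.h>

#define TASK_MAGIC 0x4B534154u   /* arbitrary marker value (hypothetical) */

/* Hypothetical header stored in front of each task image in external flash. */
typedef struct {
    uint32_t magic;
    uint32_t load_addr;   /* RAM address the task was linked to run from */
    uint32_t size;        /* bytes the task occupies in RAM */
    uint32_t entry_off;   /* offset of the entry point within the image */
} task_header_t;

/* Returns 1 if [a, a+alen) and [b, b+blen) share at least one byte. */
int ranges_overlap(uint32_t a, uint32_t alen, uint32_t b, uint32_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Returns 1 if task `t` fits in the RAM pool and does not collide with
 * any task that is already resident, 0 otherwise. */
int can_load(const task_header_t *t, uint32_t pool_start, uint32_t pool_size,
             const task_header_t *resident, size_t n_resident)
{
    size_t i;

    if (t->magic != TASK_MAGIC || t->entry_off >= t->size)
        return 0;
    if (t->load_addr < pool_start || t->size > pool_size ||
        t->load_addr - pool_start > pool_size - t->size)
        return 0;
    for (i = 0; i < n_resident; i++)
        if (ranges_overlap(t->load_addr, t->size,
                           resident[i].load_addr, resident[i].size))
            return 0;
    return 1;
}
```

Two tasks linked for adjacent ranges can be resident together; two tasks linked for the same range can only take turns.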

And finally, depending on your system and chip this could be easy or hard.

Edit 1:

Additional information given:

  1. The chip is SAM7S (Atmel)
  2. It does have a Peripheral DMA Controller.
  3. It doesn't have a MMU or MPU.
  4. 8K of internal RAM, which is a limitation for us.
  5. It has roughly 28K of flash left over after the operating system, which is custom-written, has been installed.

Additional questions posed:

  1. Ideally, I'd like to copy the programs over into flash memory space and execute them from there. Theoretically this is possible. Would it be impossible to execute the programs instruction by instruction?

Yes it is possible to execute a program instruction by instruction (but there is a limitation with that approach too that I will get to in a sec). You would start by allocating a (4 byte aligned) address in memory where your single instruction would go. It is 32 bits (4 bytes) wide and immediately following it you would place a second instruction that you would never change. This second instruction would be a supervisor call (SVC) that would raise an interrupt allowing you to fetch the next instruction, place it in memory and start again.
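To make the two-word slot concrete, here is a host-side sketch of the staging step only. It assumes ARM-state 32-bit instructions; `step_slot`, `ext_program` and `stage_next_instruction` are hypothetical names, and the array stands in for words fetched from the external flash. It does not handle branches or PC-relative instructions.

```c
#include <stdint.h>

#define SVC_ALWAYS 0xEF000000u   /* ARM-state encoding of SVC #0, condition AL */

/* The slot: [0] is the instruction to run, [1] is the SVC that never changes. */
uint32_t step_slot[2];

/* Stand-in for the program stored in external flash (example ARM words). */
uint32_t ext_program[] = {
    0xE3A00001u,   /* MOV r0, #1     */
    0xE2800002u,   /* ADD r0, r0, #2 */
    0xE1A00000u    /* MOV r0, r0     */
};
uint32_t next_index;             /* index of the next word to stage */

/* What the SVC handler would do on every trap: place the next instruction
 * in the slot, followed by the fixed SVC. Returns 1 if an instruction was
 * staged, 0 when the program is exhausted. */
int stage_next_instruction(void)
{
    if (next_index >= sizeof ext_program / sizeof ext_program[0])
        return 0;
    step_slot[0] = ext_program[next_index++];
    step_slot[1] = SVC_ALWAYS;
    return 1;
}
```

Every instruction of the external program costs one full exception entry and exit, which is why the overhead dominates.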

Though possible, it isn't recommended: you will spend more time context switching than executing code, you can't actually use variables (you need RAM for that), you can't use function calls (unless you manually process branch instructions, ouch!) and your flash will be written to so much that it will be made useless very fast. Because of that last point, about Flash being made useless, I will assume that you wanted to execute instruction by instruction from RAM. On top of all of these restrictions you will still have to use some RAM for your stack, heap and globals (see my Appendix for details). This area can be shared by all the tasks running from external flash, but you will need to write a custom linker script for this, otherwise you will waste your RAM.

What will make this clearer for you is understanding how C code is compiled. Even if you're using C++, start by asking yourself this: where are the variables and instructions on my device compiled to?

Basically what you MUST know before attempting this is:

  • where the code will execute (Flash/RAM)
  • how this code is linked to its stack, heap and globals (you would allocate a separate stack for this task, and separate space for globals but you can share the heap).
  • where this external code's stack, heap and globals reside (with this I'm trying to hint at how much control you will need to have over your C code)

Edit 2:

How to utilize the Peripheral DMA Controller:

For the microcontroller I'm working with, the DMA controller is actually not connected to the Embedded Flash for either reading or writing. If this is the case for you too you cannot use it. However, your datasheet is unclear in this regard and I suspect that you will need to run a test using the Serial Port to see if it can actually work.

In addition to this, I am concerned that the write operation when using the DMA controller may be more complicated than you doing it manually because of cached page writes. You will need to ensure that you only do the DMA transfers within pages and that a DMA transfer never crosses the page boundary. Also, I'm not sure what happens when you tell the DMA controller to write from flash back into the same location (which you might need to do to ensure you only overwrite the correct parts).

Concerns about the available flash and RAM:

I am concerned with your earlier question about executing it one instruction at a time. If that is the case, then you might as well write an interpreter. If you don't have enough memory to contain the entire code of the task you need to execute then you will need to compile the task as PIC with the Global Offset Table (GOT) being placed in ram along with all the required memory for that task's globals. That's the only way to get around not having enough space for the whole task. You will also have to allocate enough space for its stack too.

If you don't have enough RAM (which I suspect you won't), you can swap your RAM out and dump it into Flash every time you need to change between tasks on the external Flash chip, but again I would strongly advise against writing to your flash memory many times. That way you can make the tasks on the external flash share a piece of RAM for their globals.

For all other cases you will be writing an interpreter. I have even done the unthinkable: I have tried to think of a way to use the Abort Status of your microcontroller's memory controller (section 18.3.4 Abort Status in the datasheet) as an MPU, but have failed to find even a remotely clever way to use it.

Edit 3:

I would suggest reading section 40.8.2 Non-volatile Memory (NVM) Bits in the datasheet, which suggests that your flash has a maximum of 10,000 write/erase cycles (it took me a while to find it). That means that once you have written and erased the flash region used for context switching the tasks 10,000 times, that part of Flash will be rendered useless.
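To put that number in perspective, here is a rough back-of-the-envelope calculation. It assumes that every context switch costs exactly one write/erase cycle on the same flash region; the function name and the switch rates are made up for illustration.

```c
#include <stdint.h>

#define FLASH_ENDURANCE_CYCLES 10000u   /* write/erase limit quoted above */

/* Seconds until one flash region reaches its rated cycle count, assuming
 * each context switch consumes one cycle on that region. */
uint32_t seconds_until_worn(uint32_t switches_per_second)
{
    if (switches_per_second == 0)
        return UINT32_MAX;              /* never switching, never wearing */
    return FLASH_ENDURANCE_CYCLES / switches_per_second;
}
```

At 10 switches per second the region reaches its rated limit after 1,000 seconds, which is under 17 minutes. Even at one switch per second it lasts only 10,000 seconds, which is under three hours.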

APPENDIX

Please have a short read of this blog entry before continuing to read my comments below.

Where C variables live on an embedded ARM chip:

I learn best not from abstract concepts but concrete examples so I will try and give you code to work with. Basically all the magic happens in your linker script. If you read and understand it you will see what happens to your code. Let's dissect one now:

OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm", "elf32-littlearm")
OUTPUT_ARCH(arm)
SEARCH_DIR(.)

/* Memory Spaces Definitions */

MEMORY
{
  /* Here we are defining the memory regions that we will be placing
   * different sections into. Different regions have different properties,
   * for example, Flash is read only (because you need special instructions
   * to write to it and writing is slow), while RAM is read write.
   * In the brackets after the region name:
   *   r - denotes that reads are allowed from this memory region.
   *   w - denotes that writes are allowed to this memory region.
   *   x - means that you can execute code in this region.
   */

  /* We will call Flash rom and RAM ram */
  rom (rx)  : ORIGIN = 0x00400000, LENGTH = 0x00040000 /* flash, 256K */
  ram (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00006000 /* sram, 24K */
}

/* The stack size used by the application. NOTE: you need to adjust  */
STACK_SIZE = DEFINED(STACK_SIZE) ? STACK_SIZE : 0x800 ;

/* Section Definitions */
SECTIONS
{
    .text :
    {
        . = ALIGN(4);
        _sfixed = .;
        KEEP(*(.vectors .vectors.*))
        *(.text .text.* .gnu.linkonce.t.*)
        *(.glue_7t) *(.glue_7)
        *(.rodata .rodata* .gnu.linkonce.r.*)  /* This is important, .rodata is in Flash */
        *(.ARM.extab* .gnu.linkonce.armextab.*)

        /* Support C constructors, and C destructors in both user code
           and the C library. This also provides support for C++ code. */
        . = ALIGN(4);
        KEEP(*(.init))
        . = ALIGN(4);
        __preinit_array_start = .;
        KEEP (*(.preinit_array))
        __preinit_array_end = .;

        . = ALIGN(4);
        __init_array_start = .;
        KEEP (*(SORT(.init_array.*)))
        KEEP (*(.init_array))
        __init_array_end = .;

        . = ALIGN(0x4);
        KEEP (*crtbegin.o(.ctors))
        KEEP (*(EXCLUDE_FILE (*crtend.o) .ctors))
        KEEP (*(SORT(.ctors.*)))
        KEEP (*crtend.o(.ctors))

        . = ALIGN(4);
        KEEP(*(.fini))

        . = ALIGN(4);
        __fini_array_start = .;
        KEEP (*(.fini_array))
        KEEP (*(SORT(.fini_array.*)))
        __fini_array_end = .;

        KEEP (*crtbegin.o(.dtors))
        KEEP (*(EXCLUDE_FILE (*crtend.o) .dtors))
        KEEP (*(SORT(.dtors.*)))
        KEEP (*crtend.o(.dtors))

        . = ALIGN(4);
        _efixed = .;            /* End of text section */
    } > rom /* All the sections in the preceding curly braces are going to Flash in the order that they were specified */

    /* .ARM.exidx is sorted, so has to go in its own output section.  */
    PROVIDE_HIDDEN (__exidx_start = .);
    .ARM.exidx :
    {
      *(.ARM.exidx* .gnu.linkonce.armexidx.*)
    } > rom
    PROVIDE_HIDDEN (__exidx_end = .);

    . = ALIGN(4);
    _etext = .;

    /* Here is the .relocate section please pay special attention to it */
    .relocate : AT (_etext)
    {
        . = ALIGN(4);
        _srelocate = .;
        *(.ramfunc .ramfunc.*);
        *(.data .data.*);
        . = ALIGN(4);
        _erelocate = .;
    } > ram  /* All the sections in the preceding curly braces are going to RAM in the order that they were specified */

    /* .bss section which is used for uninitialized but zeroed data */
    /* Please note the NOLOAD flag, this means that when you compile the code this section won't be in your .hex, .bin or .o files but will be just assumed to have been allocated */
    .bss (NOLOAD) :
    {
        . = ALIGN(4);
        _sbss = . ;
        _szero = .;
        *(.bss .bss.*)
        *(COMMON)
        . = ALIGN(4);
        _ebss = . ;
        _ezero = .;
    } > ram

    /* stack section */
    .stack (NOLOAD):
    {
        . = ALIGN(8);
        _sstack = .;
        . = . + STACK_SIZE;
        . = ALIGN(8);
        _estack = .;
    } > ram

    . = ALIGN(4);
    _end = . ;

    /* heap extends from here to end of memory */
}

This is an automatically generated linker script for the SAM3N (your linker script should only differ in the memory region definitions). Now, let's go through what happens when your device boots after being powered off.

The first thing that happens is that the ARM core reads the address stored in the FLASH memory's vector table that points to your reset vector. The reset vector is just a function and for me it is also autogenerated by Atmel Studio. Here it is:

void Reset_Handler(void)
{
    uint32_t *pSrc, *pDest;

    /* Initialize the relocate segment */
    pSrc = &_etext;
    pDest = &_srelocate;

    /* This code copies all of the memory for "initialised globals" from Flash to RAM */
    if (pSrc != pDest) {
        for (; pDest < &_erelocate;) {
            *pDest++ = *pSrc++;
        }
    }

    /* Clear the zero segment (.bss). Since it is in RAM it could be anything after a reset, so zero it. */
    for (pDest = &_szero; pDest < &_ezero;) {
        *pDest++ = 0;
    }

    /* Set the vector table base address */
    pSrc = (uint32_t *) & _sfixed;
    SCB->VTOR = ((uint32_t) pSrc & SCB_VTOR_TBLOFF_Msk);

    if (((uint32_t) pSrc >= IRAM_ADDR) && ((uint32_t) pSrc < IRAM_ADDR + IRAM_SIZE)) {
        SCB->VTOR |= 1 << SCB_VTOR_TBLBASE_Pos;
    }

    /* Initialize the C library */
    __libc_init_array();

    /* Branch to main function */
    main();

    /* Infinite loop */
    while (1);
}

Now, bear with me for a little longer while I explain how C code that you write fits into all of this.

Consider the following code example:

int UninitializedGlobal; // Goes to the .bss segment (RAM)
int ZeroedGlobal[10] = { 0 }; // Goes to the .bss segment (RAM)
int InitializedGlobal[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 11 }; // Goes to the .relocate segment (RAM and FLASH)
const int ConstInitializedGlobal[10] = { 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 }; // Goes to the .rodata segment (FLASH)

void function(int parameter)
{
    static int UninitializedStatic; // Same as UninitializedGlobal above.
    static int ZeroedStatic = 0; // Same as ZeroedGlobal above.
    static int InitializedStatic = 7; // Same as InitializedGlobal above.
    static const int ConstStatic = 18; // Same as ConstInitializedGlobal above. Might get optimized away though, let's assume it doesn't.

    int UninitializedLocal; // Stacked. (RAM)
    int ZeroedLocal = 0; // Stacked and then initialized (RAM)
    int InitializedLocal = 7; // Stacked and then initialized (RAM)
    const int ConstLocal = 91; // Not actually sure where this one goes. I assume optimized away.

    // Do something with all those lovely variables...
}
