An automatic code overlaying technique for multicores with explicitly-managed memory hierarchies
Explicitly-managed memory hierarchies, in which a hierarchy of distinct memories is
exposed to the programmer and managed explicitly by software, are found not only in typical
embedded processors but also in a class of high-performance multicore architectures.
Code overlay techniques have been widely used to execute a program whose code is
larger than the code memory available in the system. To generate an efficient overlaid
executable with maximum storage savings as well as minimum performance overhead, the …
SRC: an automatic code overlaying technique for multicores with explicitly-managed memory hierarchies
C Jang - Proceedings of the international conference on …, 2011 - dl.acm.org
In this paper, we propose an efficient code overlay technique that automatically generates
an overlay structure for a given memory size for multicores with explicitly-managed memory
hierarchies. We observe that finding an efficient overlay structure with minimum memory
copying overhead is similar to the problem of finding a code placement with minimum
conflict misses in the instruction cache. Our algorithm exploits the temporal-ordering
information between functions during program execution. Experimental results on the Cell …
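
The analogy between overlay placement and conflict-miss-aware code placement can be illustrated with a small sketch. This is not the algorithm from the paper, whose details are not given in the snippet above. It is a simple greedy heuristic under stated assumptions: the function trace, the function sizes, a fixed number of overlay regions, and all helper names (`interleaving_counts`, `greedy_overlay`, `copy_cost`, `footprint`) are hypothetical. Functions that frequently alternate in the trace are kept in different regions so that they do not evict each other.

```python
from collections import defaultdict


def interleaving_counts(trace):
    """Count direct control switches between each unordered pair of functions.

    This is a crude stand-in for temporal-ordering information (assumption).
    """
    counts = defaultdict(int)
    for prev, cur in zip(trace, trace[1:]):
        if prev != cur:
            counts[frozenset((prev, cur))] += 1
    return counts


def greedy_overlay(trace, sizes, num_regions):
    """Map each function to a region, placing the most conflict-prone first."""
    counts = interleaving_counts(trace)

    def total_conflict(f):
        return sum(c for pair, c in counts.items() if f in pair)

    region_of = {}
    for f in sorted(sizes, key=lambda name: -total_conflict(name)):
        def cost_if_placed(region):
            # Estimated bytes recopied if f shares this region with its occupants.
            return sum(
                counts.get(frozenset((f, g)), 0) * (sizes[f] + sizes[g])
                for g, rg in region_of.items()
                if rg == region
            )
        region_of[f] = min(range(num_regions), key=cost_if_placed)
    return region_of


def copy_cost(trace, sizes, region_of):
    """Bytes copied into local memory while replaying the trace."""
    resident = {}  # region -> function currently loaded there
    cost = 0
    for f in trace:
        region = region_of[f]
        if resident.get(region) != f:
            cost += sizes[f]
            resident[region] = f
    return cost


def footprint(sizes, region_of):
    """Code memory needed: each region is as large as its largest function."""
    largest = defaultdict(int)
    for f, region in region_of.items():
        largest[region] = max(largest[region], sizes[f])
    return sum(largest.values())


if __name__ == "__main__":
    trace = ["main", "a", "b", "a", "b", "a", "b", "main", "c", "main"]
    sizes = {"main": 100, "a": 40, "b": 60, "c": 50}
    placement = greedy_overlay(trace, sizes, num_regions=2)
    single_region = {f: 0 for f in sizes}
    print("placement:", placement)
    print("copy cost (greedy):", copy_cost(trace, sizes, placement))
    print("copy cost (one region):", copy_cost(trace, sizes, single_region))
    print("footprint (greedy):", footprint(sizes, placement))
```

In this toy trace, `a` and `b` alternate repeatedly, so the heuristic separates them, much as a cache-conscious placement would avoid mapping two interleaved functions to the same cache lines. A real overlay generator would also have to respect the given memory size and choose the number and sizes of regions itself.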