Imposing coarse-grained reconfiguration to general purpose processors

M Duric, M Stanic, I Ratkovic, O Palomar, O Unsal, A Cristal, M Valero, A Smith
2015 International Conference on Embedded Computer Systems …, 2015 - ieeexplore.ieee.org
Mobile devices execute applications with diverse compute and performance demands. This paper proposes a general purpose processor that adapts the underlying hardware to a given workload. Existing mobile processors rely on increasingly complex heterogeneous substrates, combining different cores and specialized accelerators, to deliver the demanded performance. In contrast, our processor uses only modest homogeneous cores and dynamically provides an execution substrate suited to accelerating a particular workload. Instead of incorporating dedicated accelerators, the processor reconfigures one or more cores into accelerators on the fly, improving performance with minimal hardware additions. Each accelerator consists of general purpose ALUs reconfigured into a compute fabric, together with the general purpose pipeline, which streams data through the fabric. To enable this reconfiguration, the floorplan of a 4-core processor is changed to place the ALUs in close proximity on the chip. A configurable switched network is added to couple the ALUs and dynamically reconfigure them to compute frequently repeated regions instead of executing general purpose instructions. Through this reconfiguration, the mobile processor specializes its substrate for a given workload and maximizes the performance of existing resources. Our results show that reconfiguration accelerates a set of selected compute-intensive workloads by 1.56×, 2.39×, and 3.51× when configuring an accelerator from 1, 2, or 4 cores, respectively.
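To put the reported speedups in perspective, the short sketch below computes how much additional speedup each doubling of the accelerator size brings. It uses only the three figures quoted in the abstract. The function name `incremental_gains` is a hypothetical helper introduced for illustration, and the abstract does not state the baseline the speedups are measured against, so the ratios are meaningful only relative to one another.

```python
# Speedups reported in the abstract, keyed by the number of cores
# configured into the accelerator.
reported_speedup = {1: 1.56, 2: 2.39, 4: 3.51}


def incremental_gains(speedups):
    """Return the ratio of each speedup to that of the next smaller configuration.

    Hypothetical helper for illustration; not part of the paper.
    """
    cores = sorted(speedups)
    return {
        (small, large): speedups[large] / speedups[small]
        for small, large in zip(cores, cores[1:])
    }


gains = incremental_gains(reported_speedup)
for (small, large), ratio in gains.items():
    print(f"{small} -> {large} cores: {ratio:.2f}x additional speedup")
```

On these numbers, each doubling of the core count yields roughly a further 1.5× (about 1.53× from 1 to 2 cores and about 1.47× from 2 to 4), which suggests the gains scale sublinearly with the number of cores reconfigured.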