Yasir Qadri, Muhammad and Qadri, Nadia N and Fleury, Martin and McDonald-Maier, Klaus D (2017) Energy-efficient data prefetch buffering for low-end embedded processors. Microelectronics Journal, 62. pp. 57-64. DOI https://doi.org/10.1016/j.mejo.2017.01.014
Abstract
An energy-efficient architecture should jointly optimize energy consumption and throughput, as captured by the Energy-Delay-Square Product (ED2P) metric. This paper introduces a prefetch data buffer micro-architecture, which achieves that goal with the aid of software-inserted control words that govern the prefetch process. The proposed architecture is aimed at low-end embedded processors which, in order to reduce energy consumption, lack a cache-based memory hierarchy. Identifying after compilation which data should be prefetched, and modifying the object code accordingly, reduces the rate of prefetch misses. Pre-computing memory addresses with auxiliary software after compilation, again by modifying the object code, avoids address computation by hardware at run time, which reduces pipeline stalls and thus improves throughput. Additionally, in the case of branches, prefetching two data items at any one time anticipates the alternative instruction outcomes. The paper contains results from running a range of well-known and representative benchmarks on the proposed architecture. Across the seven benchmarks tested, execution times improved by 6-20% compared with an unbuffered architecture. Furthermore, the average ED2P for the buffered architecture, when normalized against the same architecture without buffering, was found to vary between 54% and 90% depending on the benchmark, though there is a cost in increased code size. That is to say, for the benchmarks tested there was a net energy efficiency improvement of between 10% and 46% in comparison with the equivalent unbuffered architecture with a lower area overhead.
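The relationship between the normalized ED2P figures and the quoted improvement can be illustrated with a minimal sketch. It assumes the standard definition of ED2P as energy multiplied by the square of the delay; the function names and the input values are hypothetical and are not taken from the paper.

```python
def ed2p(energy: float, delay: float) -> float:
    """Energy-Delay-Square Product: energy multiplied by delay squared."""
    return energy * delay ** 2


def normalized_ed2p(energy_buf: float, delay_buf: float,
                    energy_base: float, delay_base: float) -> float:
    """ED2P of the buffered design as a fraction of the unbuffered baseline."""
    return ed2p(energy_buf, delay_buf) / ed2p(energy_base, delay_base)


def improvement(normalized: float) -> float:
    """Net improvement implied by a normalized ED2P (lower is better)."""
    return 1.0 - normalized


# Hypothetical inputs: equal energy, 10% shorter execution time with buffering.
ratio = normalized_ed2p(energy_buf=1.0, delay_buf=0.9,
                        energy_base=1.0, delay_base=1.0)
print(f"normalized ED2P: {ratio:.2f}, improvement: {improvement(ratio):.0%}")
```

Under this reading, the normalized ED2P range of 54% to 90% reported in the abstract corresponds to the stated improvement of 46% down to 10%. Because delay enters squared, a reduction in execution time weighs more heavily in the metric than an equal proportional reduction in energy.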
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Control words; Data prefetch; Embedded processor; Micro-architecture |
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
| Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 24 Feb 2017 15:07 |
| Last Modified: | 30 Oct 2024 20:27 |
| URI: | http://repository.essex.ac.uk/id/eprint/19146 |
Available files
Filename: microelectronicsJournalSubmission_RevisedFinalCopy.pdf