Research Repository

A Hierarchical and Distributed Fault Tolerant Proposal for NoC-Based MPSoCs

Wachter, Eduardo and Fochi, Vinicius and Barreto, Francisco and Amory, Alexandre and Moraes, Fernando (2018) 'A Hierarchical and Distributed Fault Tolerant Proposal for NoC-Based MPSoCs.' IEEE Transactions on Emerging Topics in Computing, 6 (4). pp. 524-537. ISSN 2168-6750

Full text not available from this repository.

Abstract

Aggressive scaling of CMOS process technology allows the fabrication of highly integrated chips such as NoC-based MPSoCs. However, fault probability increases as device sizes shrink. Hence, fault-tolerant design plays an important role in current nanometric technologies, motivating research on fault mitigation techniques for NoC-based MPSoCs. Most state-of-the-art papers present partial solutions for designing a fault-tolerant MPSoC, i.e., they present fault-tolerant mechanisms for either NoCs or processing elements (PEs). The goal of this paper is to propose a comprehensive integration of previously defined recovery mechanisms. The main novelty is the system-level integration itself, which is organized in a hierarchical and distributed manner, ensuring the correct execution of applications in the presence of multiple transient or permanent faults in the NoC, the PEs, or both. The combination of NoC and PE recovery methods enables the proposed system to tolerate a large number of faults. Depending on the severity of a fault in the NoC, it may operate in degraded mode or require a search for fault-free paths. In both cases, communication is reestablished in less than 50 microseconds. Faults detected in the PEs trigger a lightweight and fast task relocation protocol, which executes in less than one millisecond.

Item Type: Article
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of
SWORD Depositor: Elements
Depositing User: Elements
Date Deposited: 19 Mar 2019 16:43
Last Modified: 06 Jan 2022 13:58
URI: http://repository.essex.ac.uk/id/eprint/24083
