Fault-tolerant software design techniques include the recovery block (RB), built from a primary and one or more alternates, and N-version programming (NVP), in which N independent program variants execute in parallel on identical input. Fault-avoidance and fault-removal features of the computer. The use of cause-effect graphing for software specification and validation was investigated. Fault tolerance design for surviving component failures is becoming a necessity for a growing number of companies, far beyond its traditional application areas such as aerospace and telecommunications. Software fault tolerance is the ability of software to detect and recover from a fault that is happening or has already happened. The MRP approach can be used for modeling fault-tolerant software systems. Fault tolerance is a product-oriented concept: it accepts faults in a limited capacity and masks their manifestation. A fault-tolerant design enables a system to continue its intended operation, possibly at a reduced level, rather than failing completely, when some part of the system fails.
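The recovery block scheme named above (a primary followed by alternates) can be illustrated with a minimal sketch. This is an illustration under stated assumptions, not an implementation from any of the cited works: the names recovery_block, RecoveryBlockFailure, buggy_sort and is_sorted_permutation are hypothetical, and the checkpoint is simulated by handing each variant a fresh deep copy of the input.

```python
import copy
from typing import Any, Callable, Sequence


class RecoveryBlockFailure(Exception):
    """Raised when no variant produces a result that passes the acceptance test."""


def recovery_block(variants: Sequence[Callable[[Any], Any]],
                   acceptance_test: Callable[[Any, Any], bool],
                   data: Any) -> Any:
    """Try the primary, then each alternate, until one result is accepted."""
    for variant in variants:
        # Simulated checkpoint: every variant starts from an untouched copy.
        snapshot = copy.deepcopy(data)
        try:
            result = variant(snapshot)
        except Exception:
            continue  # a crash counts as a failed variant; try the next one
        if acceptance_test(data, result):
            return result
    raise RecoveryBlockFailure("all variants failed the acceptance test")


def buggy_sort(xs):
    """Hypothetical faulty primary: returns the input unsorted."""
    return list(xs)


def is_sorted_permutation(original, result):
    """Acceptance test: result is ordered and contains the same items."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and sorted(original) == sorted(result)


print(recovery_block([buggy_sort, sorted], is_sorted_permutation, [3, 1, 2]))
```

The acceptance test is the weak point of the scheme: a variant that returns a wrong answer the test cannot distinguish from a right one is accepted.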
It can also be an error, flaw, failure, or fault in a computer program. As infrastructure-related fault tolerance is discussed in the coming section, here the software aspect of fault tolerance is discussed. In general, fault tolerance is always based on various assumptions concerning the degree of perfection with which certain work items are carried out.
Intended readers: reliability analysts, software reliability engineers, software system designers, designers of fault-tolerant software. Abstract: the effect of failure correlation is to reduce the output space in which a voter makes decisions. Runtime techniques are used to ensure that system faults do not. Both schemes are based on software redundancy, assuming that the events of coincidental software failures are rare. Reliability is a popular aspect of software dependability, which relies, in particular, on fault forecasting and fault removal.
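To see why the assumption that coincidental failures are rare matters, the following sketch computes the probability that at least k of n versions are correct when each version is correct with probability r and failures are fully independent. The independence assumption and the function name k_of_n_reliability are illustrative only; with correlated failures, as the abstract above notes, the real figure is lower than this estimate.

```python
from math import comb


def k_of_n_reliability(r: float, n: int, k: int) -> float:
    """Probability that at least k of n independent versions are correct.

    r is the per-version probability of a correct result. Independence of
    failures is an assumption of this sketch, not a property of real software.
    """
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# Three versions with a 2-out-of-3 majority requirement.
print(k_of_n_reliability(0.9, n=3, k=2))
```

For n = 3 and k = 2 the sum reduces to 3r^2 - 2r^3, which exceeds r only when r is above 0.5, so redundancy helps only if the individual versions are already more likely right than wrong.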
Some of the methods for avoidance and detection of software faults are summarized. We modeled the reliability and the availability of a hot-standby duplex system considering design faults, and we subsequently analyzed the performance. There are two basic techniques for obtaining fault-tolerant software. Reliability in a software system can be achieved using which of the following strategies? Most bugs arise from mistakes and errors made by developers and architects. It is stated in statistical terms as a probability, which reflects the fact that failures occur at unpredictable times. These techniques contribute to system reliability through use of structured. These faults are usually found in either the software or the hardware of the system in which the software is running in order to provide service in accordance with the provided specifications. Various software fault injection and detection models are studied, and the behavior of the models has been summarized. Development techniques are used that either minimize the. Fault intolerance and fault tolerance: the fault-intolerance, or fault-avoidance, approach improves system reliability by removing the source of failures.
Design-diverse software fault tolerance techniques. The following four sections describe fault tolerance strategies that are commonly utilized to improve software reliability [Hech86]. Per-run failure probability and run execution-time distribution for a particular fault-tolerant technique can be. For systems that require high reliability, this may still be a necessity.
Fault-tolerant software has the ability to satisfy requirements despite failures. Lastly, advanced software fault tolerance models were studied to provide alternatives and improvements in situations where simple software fault tolerance strategies break down. Multiversion software reliability through fault avoidance and fault tolerance. The philosophy which attempts to accomplish this goal is known as fault avoidance. Fault avoidance can be utilized on simple systems or when any other approach is physically impossible; at least in complex systems, fault avoidance techniques can also be combined with fault tolerance.
Fault avoidance results from conservative design practices such as the use of high-reliability parts. As software fault tolerance is often measured in terms of system availability, which is a function of reliability, we should include various single-version (SV) software-based approaches to fault tolerance for more effective software fault avoidance, in order to combat latent defects and environment and operational faults. Index terms: design diversity, fault tolerance, multiple computation, N-version programming, N-version software, software reliability, tolerance of design faults. A voting strategy called consensus voting may in part compensate for the problems that arise from this. Fault avoidance, fault removal and fault tolerance represent three.
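The consensus voting strategy mentioned above is commonly described as a relaxation of majority voting: when no strict majority exists, the largest group of agreeing outputs wins, and a tie between equally large groups is broken at random. The sketch below follows that common description; it is an assumption about the strategy rather than a reproduction of the cited paper, and the name consensus_vote is hypothetical.

```python
import random
from collections import Counter
from typing import Hashable, Optional, Sequence


def consensus_vote(outputs: Sequence[Hashable],
                   rng: Optional[random.Random] = None) -> Hashable:
    """Return the output backed by the largest agreement group.

    A strict majority, when present, is always the largest group. Ties
    between equally large groups are broken by a random choice.
    """
    if not outputs:
        raise ValueError("at least one version output is required")
    rng = rng or random.Random()
    counts = Counter(outputs)
    largest = max(counts.values())
    candidates = [value for value, count in counts.items() if count == largest]
    if len(candidates) == 1:
        return candidates[0]
    return rng.choice(candidates)


# Five versions, no majority: two agree on 7, the rest all differ.
print(consensus_vote([7, 7, 3, 4, 5]))
```

A plain majority voter would give no answer for this input, since only two of five versions agree; consensus voting still returns 7.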
Fault forecasting consists of estimating the presence. Work in [45] aims to treat software fault tolerance as a robust supervisory control (RSC) problem and proposes an RSC approach to software fault tolerance. In this approach the software component under consideration is treated as a controlled object that is modeled as a generalized Kripke structure or finite-state concurrent system [44,45]. Fault avoidance alone is rarely used to provide system-level reliability. Fault tolerance is the realization that we will have faults in our system (hardware and/or software) and that we have to design the system in such a way that it will be tolerant of those faults. We have continued collection of data on the relationships between software faults and reliability, and the coverage provided by the testing process as measured by different metrics. Topics covered include fault avoidance, fault removal, and fault tolerance, along with statistical methods for the objective assessment of predictive accuracy. Two approaches to increasing system reliability are fault avoidance and fault tolerance.
Fault avoidance, fault detection, fault tolerance, recovery and repair. This paper provides a conceptual framework for expressing the attributes of what constitutes dependable and reliable computing. Proper design of fault-tolerant systems begins with the requirements specification. This article aims to discuss various issues of software fault avoidance. As more and more complex systems get designed and built, especially safety-critical systems, software fault tolerance and the next generation of hardware fault tolerance will need to evolve to be able to solve the design fault problem. Redundancy underlies all approaches to fault tolerance. Software fault tolerance is an immature area of research. In the period reported here we have worked on the following. If its operating quality decreases at all, the decrease is proportional to the severity of the failure, as compared to a naively designed system, in which even a small failure can cause total breakdown. This is the basic property of a system which we seek to enhance through the concept of fault tolerance. An introduction to the design and analysis of fault.
Reliability and fault tolerance goals: to understand some of the factors influencing the reliability of a hardware system; to understand some of the factors which affect the reliability of a system and how software design faults can be tolerated. Similarly, the software that supports the high-level semantic interface 1. Describes why faults occur and how modern digital systems are fault tolerant. Guest editors' introduction: understanding fault tolerance and. Fault-tolerant software assures system reliability by using protective redundancy at the software level. Under fault avoidance, all software defects are assumed to be eliminated prior to operation.
Factors influencing software reliability (SR) are fault count and operational profile. Dependability means fault avoidance, fault tolerance, fault removal and fault forecasting. That is, it should compensate for the faults and continue to.
Use of information hiding, strong typing, good engineering principles. Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of, or one or more faults within, some of its components.
If some defects remain, the operation is reliable only as long as the defects are not involved in program execution. Reliability-oriented design methods and programming techniques. Textbook: no textbook. Useful references: Software Fault Tolerance Techniques and Implementation, Laura Pullum, Artech House Publishers, 2001, ISBN 1 5805377; Software Reliability Engineering, Michael R.
Motivation for software fault tolerance: the usual method of software reliability is fault avoidance using good software engineering methodologies. For large and complex systems, fault avoidance is not successful. Rule of thumb: fault density in software is 10-50 per 1,000 lines of code for good software, and 1-5 after intensive testing using automated tools. A software application can prevent total loss of functionality by graceful degradation or functionality alternatives. Various methods of software fault mitigation, in case the software fault cannot be avoided, are discussed. We will now consider several methods for dealing with software faults. This course has been developed by the Centre for Software Reliability with funding from the Engineering and Physical Sciences Research Council, grant number 00711eng95, as part of their. Planning to avoid failures: fault avoidance is the most important aspect of fault tolerance. Four papers generated during the reporting period are included as. Failures result from unexpected problems internal to the system that eventually manifest themselves in the system's external behaviour; these problems are called errors, and their mechanical or algorithmic causes are termed faults. A software fault, also known as a defect, arises when the expected result does not match the actual result. N-version approach to fault-tolerant software: if a set of similar erroneous results outnumbers the set of good similar results at a decision point, then the decision algorithm will arrive at an erroneous decision result.
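The decision-point behaviour described above can be reproduced with a small majority voter. This is a minimal sketch, assuming exact-match comparison of version outputs; the name majority_vote and the sample values are hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of versions, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


CORRECT = 42  # hypothetical correct result
WRONG = 41    # hypothetical identical wrong result shared by faulty versions

# Three versions: two similar errors outvote the single good result.
print(majority_vote([CORRECT, WRONG, WRONG]))
# Five versions: three similar errors outvote two good results.
print(majority_vote([CORRECT, CORRECT, WRONG, WRONG, WRONG]))
# No agreement at all: the voter reports no decision.
print(majority_vote([1, 2, 3]))
```

The voter cannot tell a correct majority from an erroneous one, which is why similar errors across versions are the critical failure mode for N-version software.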
Reliability and fault tolerance: N-version programming vs recovery blocks. Fault avoidance: the basic idea is that if you are really careful as you develop the software system, no faults will creep in. Fault avoidance is a technique that is used in an attempt to prevent the occurrence of faults. The fault avoidance and the fault tolerance approaches for increasing the reliability of aerospace and automotive systems 2005014157. Data-diverse software fault tolerance techniques. A designer must analyze the environment and determine the failures that must be tolerated to achieve the desired level of reliability. The fault avoidance or prevention techniques are dependability enhancing. Thus, we observed that system availability and reliability can be increased when our fault avoidance scheme is used in the remaining system component after some of system components are. The study [29] shows that system and applications software can potentially detect and correct some or many of these errors by using different software fault tolerance approaches such as replication, voting, and masking, with a focus on algorithm-based fault tolerance [7, 31, 32, 33, 34, 35, 37], or by using combined software and hardware approaches. (a) Fault avoidance, (b) fault tolerance, (c) fault detection. Software fault-tolerance techniques: software fault tolerance is based on hardware fault tolerance; software fault detection is a bigger challenge; many software faults are of a latent type that shows up later.
In this project we have proposed to investigate a number of experimental and theoretical issues associated with the practical use of multiversion software in providing dependable software through.
Approaches to software fault tolerance: the usual method to attain reliability of software operation is fault avoidance, or fault intolerance. Software designers or system integrators who want an introduction to the problems found in designing for fault tolerance and to the range of design solutions. Topics: reliability, failure and faults; failure modes. Reliability of Computer Systems and Networks offers in-depth and up-to-date coverage of reliability and availability for students, with a focus on important application areas, computer systems, and networks. A survey of software fault tolerance techniques, Jonathan M. Smith, Computer Science Department, Columbia University, New York, NY 10027, CUCS-325-88. Abstract: this report examines the state of the field of software fault tolerance. Software fault tolerance is the ability of computer software to continue its normal operation despite the presence of system or hardware faults. Introduction: the transfer of the concepts of fault tolerance to computer software, that is discussed in this paper, began about 20 years after the first systematic discussion of fault. Though the goal of fault avoidance is to reduce the likelihood of failure, even after the most careful application of fault avoidance techniques, failures will occur.
In this work we discuss the fault avoidance and the fault tolerance approaches for increasing the reliability of aerospace and automotive systems. If its operating quality decreases at all, the decrease is proportional to the severity of the failure, as compared to a naively designed system, in which even a small failure can cause total breakdown. Reliability: the probability that a device or system will perform a required function under stated conditions for a stated period of time. Summary: software reliability is defined as the probability of failure-free operation of a software system for a specified time in a specified environment. Diversity and fault avoidance for dependable replication.
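The definition of reliability as a probability over a stated period can be made concrete with the simplest textbook model. The sketch below assumes a constant failure rate, which gives the exponential law R(t) = exp(-lambda * t), and uses the standard steady-state availability ratio MTTF / (MTTF + MTTR). Neither model is taken from the sources quoted above, and the function names are hypothetical.

```python
import math


def reliability(failure_rate: float, t: float) -> float:
    """Probability of failure-free operation over time t.

    Assumes a constant failure rate (exponential model), an assumption of
    this sketch that real software often violates.
    """
    return math.exp(-failure_rate * t)


def steady_state_availability(mttf: float, mttr: float) -> float:
    """Long-run fraction of time the system is up: MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)


failure_rate = 0.001         # failures per hour (illustrative value)
mttf = 1.0 / failure_rate    # mean time to failure under the exponential model

print(reliability(failure_rate, 1000.0))     # survival probability at t = MTTF
print(steady_state_availability(mttf, 1.0))  # with a one-hour mean repair time
```

Under this model the probability of surviving to t = MTTF is exp(-1), roughly 0.37, a reminder that mean time to failure is not a guaranteed lifetime.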
Fault avoidance: the primary purpose of fault avoidance and detection techniques is to identify and repair incorrect program operation prior to releasing a system. Software fault avoidance aims to produce fault-free software through various approaches having the common objective of reducing the number of latent defects in software programs. Terminology, techniques for building reliable systems, and fault tolerance are discussed. For example, two similar errors will outweigh one good result in the three-version case, and a set of three similar errors will prevail over a set of two similar good results when n = 5. For most other systems, eventually you give up looking for faults and ship it.