Intelligent Measurement Data Processing 
for the Construction of Dependable IT Systems


Sponsors:

Hungarian-Portuguese Bilateral Scientific and Technology Development Cooperation Agreement
Period:
2004-2005
Participants:
CISUC - Center of Informatics and Systems, University of Coimbra, Portugal
Department of Measurement and Information Systems, Budapest University of Technology and Economics, Hungary
Motivation:
Dependability is, by definition, the complex ability to guarantee that a user can justifiably rely on the services delivered by an IT system. The factors influencing the reliability of microelectronics-based equipment have changed drastically during the last decade. Improved technologies have reduced the rate of occurrence of catastrophic faults originating from circuit defects. Transient faults caused by power and signal noise and by natural background radiation have become more and more dominant, as miniaturization reduces both the physical size and the energy levels in the circuitry.

Proper protection of the QoS (Quality of Service) against the effects of such faults necessitates the use of fault tolerance (FT) techniques. Many mission-critical systems use coarse-grained redundancy, such as modular replication and voting, to compensate for the effects of faults. However, a wide spectrum of applications, including the majority of embedded systems, cannot tolerate the large cost overhead resulting from a high level of redundancy. These systems have to exploit FT schemes based on fine-grained redundancy.
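As an illustration of the coarse-grained replication-and-voting scheme mentioned above, the following minimal Python sketch shows triple modular redundancy (TMR) with a majority voter. The function names are illustrative only; this is not part of the project's tooling.

```python
def tmr_vote(replica_outputs):
    """Majority vote over three replica outputs: a single faulty
    replica is outvoted (masked) by the two agreeing ones."""
    a, b, c = replica_outputs
    if a == b or a == c:
        return a
    if b == c:
        return b
    # Two or more replicas disagree with everyone: fault not maskable.
    raise RuntimeError("no majority: more than one replica failed")

# A transient fault corrupts one replica; the voter masks it.
print(tmr_vote([42, 42, 41]))  # -> 42
```

The cost of this scheme, roughly three times the hardware plus the voter, is exactly the overhead that fine-grained FT schemes try to avoid.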

The key factor in assuring proper fault coverage is a good fit between the FT measures and the fault occurrence profile, i.e., protecting the system against the most frequently occurring faults. This harmonization of faults and protective FT measures is usually achieved by using benchmarks that support the reuse of observations made in previous systems of an architecture and technology similar to those of the target system.

The trustworthiness of the fault model is a crucial factor in the design-for-dependability process. The main source of difficulty in creating a realistic fault model is the lack of direct observability of faults, as only their manifestations in the form of failures can be measured. Failures result from complex, time-dependent interactions between the faults, the hardware and software architecture, the workload, etc. Accordingly, a faithful characterization of faults poses a highly demanding measurement problem due to the large number of factors and the complexity of their interactions.
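This indirect observability is what fault-injection experiments exploit: a fault (e.g., a transient bit flip) is deliberately inserted into the system state, and the experiment records whether it manifests as a failure or remains latent. The sketch below is a simplified illustration of this idea, not the project's actual measurement setup; all names and the toy workload are assumptions.

```python
import random

def inject_bit_flip(word, width=32):
    """Flip one randomly chosen bit of a machine word,
    emulating a transient (soft) fault."""
    bit = random.randrange(width)
    return word ^ (1 << bit)

def failure_rate(compute, reference_input, trials=1000):
    """Inject a fault into the input on each trial and classify the
    outcome: the fault either manifests as a failure (result differs
    from the golden run) or stays latent (masked by the workload)."""
    golden = compute(reference_input)
    failures = 0
    for _ in range(trials):
        faulty_input = inject_bit_flip(reference_input)
        if compute(faulty_input) != golden:
            failures += 1
    return failures / trials

# Toy workload that discards the low-order 16 bits: flips in those
# bits never manifest, so roughly half of the injected faults stay
# latent -- the same fault, different manifestation.
rate = failure_rate(lambda x: x >> 16, reference_input=0x1234)
```

The point of the sketch is that the measured failure rate depends on the workload, not only on the fault: changing `compute` changes `rate` even though the injection process is identical, which is precisely why characterizing faults from observed failures is hard.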

Obviously, fault models have to be incorporated into the design and analysis process of IT systems. This can be done by simply publishing statistics for the system designer on the types, effects, and frequency of the faults to be anticipated. A more efficient way is to extend the design database of the system engineering tool with the fault model, thus supporting automated dependability analysis and the incorporation of FT measures.

Thus, an effective design-for-dependability technology requires an experimental assessment of the potential faults, the development of FT measures and their experimental validation, and the formulation of the fault models and FT measures in a form that is reusable in the design process.

Project aim:
The main objective of the research is to explore the usefulness and efficiency of intelligent data processing methods in the field of fault modelling, with special emphasis on the comparison of heuristic and automatically generated models. The practical usefulness of the models will be demonstrated by applying and validating them in design-for-dependability pilot applications.

In order to assess the feasibility of the approach, a complete design-for-dependability round trip will be carried out on pilot examples.

Further information:
István Majzik, Ph.D.
András Pataricza, Ph.D.