ABSTRACT
This paper develops a methodology for determining and characterizing error latency. The method is based on real workload data, gathered by an experiment instrumented on a VAX 11/780 during the normal workload cycle of the installation. This is the first attempt at jointly studying error latency and workload variations in a full production system. Distributions of error latency were generated by simulating the occurrence of faults under varying workload conditions. The resulting family of error latency distributions illustrates that error latency is not so much a function of when in time a fault occurred as a function of the workload that followed the failure. The study finds that the mean error latency varies by a ratio of 1 to 8 (in hours) between high and low workloads. The method is general and can be applied to any system.
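The idea of generating latency distributions by simulated fault injection can be illustrated with a minimal sketch. It is not the authors' instrumentation or simulator: the per-hour probabilities, the trace lengths, and the function names `error_latency` and `latency_distribution` are all assumptions made for illustration. Each trace entry is taken to be the probability that the faulty location is exercised during that hour, so a heavier workload means a higher probability.

```python
import random
from statistics import mean


def error_latency(trace, fault_time, rng):
    """Hours from fault occurrence until the faulty location is first used.

    trace[i] is the ASSUMED probability that the faulty location is
    exercised during hour i (higher workload -> higher probability).
    Returns None if the fault is never exercised within the trace.
    """
    for hour in range(fault_time, len(trace)):
        if rng.random() < trace[hour]:
            return hour - fault_time
    return None


def latency_distribution(trace, fault_time, trials=2000, seed=0):
    """Repeat the simulated fault many times and collect observed latencies."""
    rng = random.Random(seed)
    samples = (error_latency(trace, fault_time, rng) for _ in range(trials))
    return [s for s in samples if s is not None]


# Hypothetical 48-hour workload traces (illustrative values, not measured data).
high_workload = [0.8] * 48
low_workload = [0.1] * 48

mean_high = mean(latency_distribution(high_workload, fault_time=0))
mean_low = mean(latency_distribution(low_workload, fault_time=0))
print(f"mean latency, high workload: {mean_high:.2f} h")
print(f"mean latency, low workload:  {mean_low:.2f} h")
```

Under these assumptions the latency depends only on the workload that follows the fault, so the low-workload trace yields a longer mean latency than the high-workload trace. The specific numbers printed carry no meaning beyond that qualitative contrast.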
Index Terms
- The effect of system workload on error latency: an experimental study