(NFF) occurrences in electronic products and systems. NFF implies that a failure (fault) occurred or was reported to have occurred during a product’s use. The product was analyzed or tested to confirm the failure, but no failure or fault could be found. A common example of the NFF phenomenon occurs when a computer “hangs”. Clearly, a “failure” has occurred. However, if the computer is rebooted, it often works again. The impact of NFF and intermittent failures can be profound. Because of these characteristics, manufacturers may assume a cause rather than spend the time and money to determine the root cause. For example, a hard drive supplier claimed NFFs were not failures and allowed all NFF products to be returned to the field. Later it was determined that these products had a significantly higher return rate, suggesting that the NFF condition was actually a result of intermittent failures in the product. The result was increased maintenance costs, decreased equipment availability, increased customer inconvenience, reduced customer confidence, damaged company reputation, and in some cases potential safety hazards.
avoided by using techniques such as dynamic instruction delaying. This is a type of algorithm that calculates the scheduling priorities during the execution of the system. The objective is to respond dynamically to the changing conditions and form a self-sustained, optimized configuration. Another approach to mitigating delay is core frequency scaling, which scales down the performance of the CPU to a lower frequency when less is needed and scales it up to a higher frequency when more is needed. Thread migration is another technique used to overcome intermittent failure. A thread is an ordered set of instructions that tells a computer exactly what to do. When a specific thread encounters failures, the content of the thread within the faulty computer core is transferred to another thread within an idle core, where the problem is addressed and solved.
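The core frequency scaling idea described above can be sketched as a simple threshold rule. This is only an illustrative sketch: the frequency steps, the utilization thresholds, and the names `next_frequency` and `simulate` are assumptions made for the example, not values or interfaces taken from any particular processor.

```python
# Hypothetical frequency steps (MHz) and thresholds; real processors differ.
FREQ_STEPS_MHZ = [800, 1600, 2400, 3200]
UP_THRESHOLD = 0.80    # assumed: scale up above 80% utilization
DOWN_THRESHOLD = 0.30  # assumed: scale down below 30% utilization


def next_frequency(current_mhz: int, utilization: float) -> int:
    """Return the frequency step to use for the next interval."""
    i = FREQ_STEPS_MHZ.index(current_mhz)
    if utilization > UP_THRESHOLD and i < len(FREQ_STEPS_MHZ) - 1:
        return FREQ_STEPS_MHZ[i + 1]  # more performance is needed
    if utilization < DOWN_THRESHOLD and i > 0:
        return FREQ_STEPS_MHZ[i - 1]  # less performance is needed
    return current_mhz                # demand is within the current band


def simulate(loads, start_mhz=1600):
    """Apply the rule to a sequence of utilization samples (0.0 to 1.0)."""
    freq = start_mhz
    history = []
    for utilization in loads:
        freq = next_frequency(freq, utilization)
        history.append(freq)
    return history


print(simulate([0.9, 0.95, 0.5, 0.1, 0.1, 0.1]))
```

The rule moves one step at a time, so a short burst of load does not swing the frequency from one extreme to the other.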
that occur over a period of time (or are sometimes instantaneous). They have a specific failure site (location of failure), mode (how the failure manifests itself), and mechanism, and there is no unpredictable recovery for the failed system. Since intermittent faults are not easily repeatable, it is more difficult to conduct a failure analysis for them, understand their root causes, or isolate their failure site than it is for permanent failures.
benefit from continuous and complete test coverage while any environmental stressing of the system is performed. This type of testing cannot be performed by scanning test technology; it needs some form of electronic neural network that can perform these tests without any scanning or digital averaging. This testing regime is covered by the DoD's
") because no individual factor creates the problem alone, so the factors can only be identified while the malfunction is actually occurring. The person capable of identifying and solving the problem is seldom the usual operator. Because the timing of the malfunction is unpredictable, and both device or system
In complex, multiple-channel systems, where the fault(s) might be in an interconnection, the ideal method of finding an intermittent fault is to monitor, detect and isolate all channels or electrical paths continuously and simultaneously. This methodology allows the system under test to
Intermittent faults are not easily repeatable because of their complicated behavioral patterns. These are also sometimes referred to as “soft” failures, since they do not manifest themselves all the time and disappear in an unpredictable manner. In contrast, “hard” failures are permanent failures
In electrical systems and cable systems, time domain reflectometry techniques can be used: pulses are sent down electric wiring and the pulses reflected back are examined for anomalies, for example intermittent leakage during the stresses of aircraft operation; this can only be done for one test
Three main methodologies to mitigate intermittent behavior in integrated circuits are dynamic instruction delaying, core frequency scaling, and thread migration. When the processor takes longer than expected to execute a process, time delays and timing violations occur. This fault may be
can be changed as a routine measure, without bothering to troubleshoot the fault at all. Connectors can be disconnected and reseated. This is sometimes a measure of desperation; things are changed until the fault stops happening, and it is hoped that it is actually resolved rather than
, often called simply an "intermittent" (or anecdotally "interfailing"), is a malfunction of a device or system that occurs at intervals, usually irregular, in a device or system that functions normally at other times. Intermittent faults are common to all branches of
Changing operating circumstances while the fault is present to see if the fault temporarily clears or changes. For example, tapping components, cooling them with freezer spray, heating them. Striking the cabinet may temporarily clear the
, which need not be identified) a minor change in temperature, vibration, orientation, voltage, etc. (Sometimes this is described as an "intermittent connection" rather than "fault".) In computer software a program may (
Automatic logging of relevant parameters over a long enough time for the fault to manifest can help; parameter values at the time of the fault may identify the cause so that appropriate remedial action can be
, the fault is often simply tolerated if not too frequent unless it causes unacceptable problems or dangers. For example, some intermittent faults in critical equipment such as medical
published in March 2015 and it calls for testing technology to operate in the Class 1 category in order to combat intermittent faults effectively.
A simple example of an effectively random cause in a physical system is a borderline electrical connection in the wiring or a component of a
, which occur simultaneously. The more complex the system or mechanism involved, the greater the likelihood of an intermittent fault.
equipment could kill a patient, and in aeronautics a fault could cause a flight to be aborted or, in some cases, to crash.
a variable which is required to be initially zero; if the program is run in circumstances such that memory is
. An intermittent fault is caused by several contributing factors, some of which may be effectively
, the cause that must be identified and rectified) two conductors may touch subject to (
a database of similar faults which have been resolved in identical or similar equipment
channel at a time and is generally limited to intermittent faults longer than 100 milliseconds.
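As a rough illustration of how a reflected pulse is turned into a fault location, the sketch below converts the round-trip delay of a reflection into a distance along the cable using distance = velocity × time / 2. The function name `fault_distance_m` and the default velocity factor of 0.66 are assumptions for the example; the actual velocity factor depends on the cable.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def fault_distance_m(round_trip_s: float, velocity_factor: float = 0.66) -> float:
    """Estimate the distance to a reflection from its round-trip delay.

    velocity_factor is the signal speed in the cable as a fraction of the
    speed of light (assumed 0.66 here; it varies by cable type).
    """
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity factor must be in (0, 1]")
    # The pulse travels to the anomaly and back, so halve the path length.
    return velocity_factor * SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


# A reflection arriving 100 ns after the pulse was sent:
print(round(fault_distance_m(100e-9), 2), "m")
```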
always clear before it starts, it will malfunction on the rare occasions that (
precautionary changes, without attempting to pinpoint the fault. For example,
) the memory where the variable is stored happens to be non-zero beforehand.
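The uninitialised-variable case can be imitated in a few lines. Python has no uninitialised memory, so in this sketch a one-element list stands in for a memory cell that may hold a leftover value; `buggy_count` and `fixed_count` are hypothetical names used only for illustration.

```python
def buggy_count(memory, items):
    """Count items, wrongly assuming memory[0] already holds zero."""
    for _ in items:
        memory[0] += 1  # BUG: the counter is never initialised
    return memory[0]


def fixed_count(memory, items):
    """Count items after explicitly initialising the counter."""
    memory[0] = 0  # explicit initialisation removes the dependence on old contents
    for _ in items:
        memory[0] += 1
    return memory[0]


print(buggy_count([0], "abc"))  # memory happened to be clear: correct result
print(buggy_count([7], "abc"))  # leftover value in memory: wrong result
print(fixed_count([7], "abc"))  # correct regardless of prior contents
```

The buggy version gives the right answer whenever the cell happens to start at zero, which is why such a fault shows up only occasionally.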
Intermittent faults are notoriously difficult to identify and repair ("
Some techniques to resolve intermittent faults are:
Intermittent failures can be a cause of
Troubleshooting techniques
and engineers' time incur
electrolytic capacitors
intermittent fault
subject to high
ripple currents
no-fault-found
MIL-PRF-32516
life support
troubleshoot
, including
initialise
) fail to
technology
, where (
dormant.
downtime
software
computer
cause 2
cause 1
cause 2
cause 1
circuit
fault.
taken.
almost
random
cost
An