A pitch detection algorithm (PDA) is an algorithm designed to estimate the pitch or fundamental frequency of a quasiperiodic or oscillating signal, usually a digital recording of speech or a musical note or tone. This can be done in the time domain, the frequency domain, or both.

PDAs are used in various contexts (e.g. phonetics, music information retrieval, speech coding, musical performance systems) and so there may be different demands placed upon the algorithm. There is as yet no single ideal PDA, so a variety of algorithms exist, most falling broadly into the classes given below.

A PDA typically estimates the period of a quasiperiodic signal, then inverts that value to give the frequency.

General approaches

One simple approach would be to measure the distance between zero crossing points of the signal (i.e. the zero-crossing rate). However, this does not work well with complicated waveforms which are composed of multiple sine waves with differing periods or noisy data. Nevertheless, there are cases in which zero-crossing can be a useful measure, e.g. in some speech applications where a single source is assumed. The algorithm's simplicity makes it "cheap" to implement.

More sophisticated approaches compare segments of the signal with other segments offset by a trial period to find a match. AMDF (average magnitude difference function), ASMDF (Average Squared Mean Difference Function), and other similar autocorrelation algorithms work this way. These algorithms can give quite accurate results for highly periodic signals. However, they have false detection problems (often "octave errors"), can sometimes cope badly with noisy signals (depending on the implementation), and, in their basic implementations, do not deal well with polyphonic sounds (which involve multiple musical notes of different pitches).

Current time-domain pitch detector algorithms tend to build upon the basic methods mentioned above, with additional refinements to bring the performance more in line with a human assessment of pitch. For example, the YIN algorithm and the MPM algorithm are both based upon autocorrelation.

Frequency-domain approaches

In the frequency domain, polyphonic detection is possible, usually utilizing the periodogram to convert the signal to an estimate of the frequency spectrum. This requires more processing power as the desired accuracy increases, although the well-known efficiency of the FFT, a key part of the periodogram algorithm, makes it suitably efficient for many purposes.

Popular frequency domain algorithms include: the harmonic product spectrum; cepstral analysis; maximum likelihood, which attempts to match the frequency domain characteristics to pre-defined frequency maps (useful for detecting pitch of fixed tuning instruments); and the detection of peaks due to harmonic series.

To improve on the pitch estimate derived from the discrete Fourier spectrum, techniques such as spectral reassignment (phase based) or Grandke interpolation (magnitude based) can be used to go beyond the precision provided by the FFT bins. Another phase-based approach is offered by Brown and Puckette.

Spectral/temporal approaches

Spectral/temporal pitch detection algorithms, e.g. the YAAPT pitch tracking algorithm, are based upon a combination of time domain processing using an autocorrelation function such as normalized cross correlation, and frequency domain processing utilizing spectral information to identify the pitch. Then, among the candidates estimated from the two domains, a final pitch track can be computed using dynamic programming. The advantage of these approaches is that the tracking error in one domain can be reduced by the process in the other domain.

Speech pitch detection

The fundamental frequency of speech can vary from 40 Hz for low-pitched voices to 600 Hz for high-pitched voices.

Autocorrelation methods need at least two pitch periods to detect pitch. This means that in order to detect a fundamental frequency of 40 Hz, at least 50 milliseconds (ms) of the speech signal must be analyzed. However, during 50 ms, speech with higher fundamental frequencies may not necessarily have the same fundamental frequency throughout the window.

See also

Auto-Tune
Beat detection
Frequency estimation
Linear predictive coding
MUSIC (algorithm)
Sinusoidal model

References

D. Gerhard. Pitch Extraction and Fundamental Frequency: History and Current Techniques, technical report, Dept. of Computer Science, University of Regina, 2003.
de Cheveigné, Alain; Kawahara, Hideki (2002). "YIN, a fundamental frequency estimator for speech and music" (PDF). The Journal of the Acoustical Society of America. 111 (4). Acoustical Society of America (ASA): 1917–1930. Bibcode:2002ASAJ..111.1917D. doi:10.1121/1.1458024. ISSN 0001-4966. PMID 12002874. S2CID 1607434.
P. McLeod and G. Wyvill. A smarter way to find pitch. In Proceedings of the International Computer Music Conference (ICMC’05), 2005.
Hayes, Monson (1996). Statistical Digital Signal Processing and Modeling. John Wiley & Sons, Inc. p. 393. ISBN 0-471-59431-8.
Pitch Detection Algorithms, online resource from Connexions.
A. Michael Noll, “Pitch Determination of Human Speech by the Harmonic Product Spectrum, the Harmonic Sum Spectrum and a Maximum Likelihood Estimate,” Proceedings of the Symposium on Computer Processing in Communications, Vol. XIX, Polytechnic Press: Brooklyn, New York, (1970), pp. 779–797.
A. Michael Noll, “Cepstrum Pitch Determination,” Journal of the Acoustical Society of America, Vol. 41, No. 2, (February 1967), pp. 293–309.
Mitre, Adriano; Queiroz, Marcelo; Faria, Régis. Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates. Proceedings of the 4th AES Brazil Conference. 113-118, 2006.
Brown JC and Puckette MS (1993). A high resolution fundamental frequency determination based on phase changes of the Fourier transform. J. Acoust. Soc. Am. Volume 94, Issue 2, pp. 662–667.
Zahorian, Stephen A.; Hu, Hongbing (2008). "A spectral/temporal method for robust fundamental frequency tracking" (PDF). The Journal of the Acoustical Society of America. 123 (6). Acoustical Society of America (ASA): 4559–4571. Bibcode:2008ASAJ..123.4559Z. doi:10.1121/1.2916590. ISSN 0001-4966. PMID 18537404.
Stephen A. Zahorian and Hongbing Hu. YAAPT Pitch Tracking MATLAB Function.
Huang, Xuedong; Alex Acero; Hsiao-Wuen Hon (2001). Spoken Language Processing. Prentice Hall PTR. p. 325. ISBN 0-13-022616-5.

External links

Alain de Cheveigne and Hideki Kawahara: YIN, a fundamental frequency estimator for speech and music
AudioContentAnalysis.org: Matlab code for various pitch detection algorithms
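As an illustration of the time-domain methods described under General approaches and Speech pitch detection, the following is a minimal sketch of an autocorrelation-based estimator. It is not taken from any of the cited algorithms. The function names, the `peak_ratio` threshold, and the rule of preferring the shortest strong local peak (a simple guard against octave errors) are assumptions made for this example. It uses only the Python standard library and enforces the requirement that the window cover at least two periods of the lowest frequency searched.

```python
import math


def normalized_autocorrelation(x, lag):
    """Correlation between x and a copy of itself shifted by `lag`, in [-1, 1]."""
    num = e0 = e1 = 0.0
    for i in range(len(x) - lag):
        a, b = x[i], x[i + lag]
        num += a * b
        e0 += a * a
        e1 += b * b
    denom = math.sqrt(e0 * e1)
    return num / denom if denom > 0.0 else 0.0


def estimate_f0_autocorr(samples, sample_rate, f_min=40.0, f_max=600.0,
                         peak_ratio=0.9):
    """Estimate the fundamental frequency (Hz) of one analysis window.

    f_min and f_max default to the speech range quoted in the article;
    peak_ratio is a hypothetical tuning parameter for this sketch.
    """
    n = len(samples)
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = int(sample_rate / f_min)
    if n < 2 * max_lag:
        raise ValueError("window must cover at least two periods of f_min")

    # Remove the mean so a constant offset does not dominate the correlation.
    mean = sum(samples) / n
    x = [s - mean for s in samples]

    lags = list(range(min_lag, max_lag + 1))
    scores = [normalized_autocorrelation(x, lag) for lag in lags]
    best = max(scores)
    if best <= 0.0:
        return None  # no periodicity found in the search range

    # Multiples of the true period correlate almost equally well, so take the
    # shortest lag that is a local peak close to the global maximum.
    for j in range(1, len(scores) - 1):
        if (scores[j] >= peak_ratio * best
                and scores[j] > scores[j - 1]
                and scores[j] >= scores[j + 1]):
            return sample_rate / lags[j]
    return sample_rate / lags[scores.index(best)]


# Example: a 50 ms window (400 samples at 8 kHz) of a 200 Hz tone with harmonics.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * t / rate)
        + 0.5 * math.sin(2 * math.pi * 400 * t / rate)
        + 0.25 * math.sin(2 * math.pi * 600 * t / rate)
        for t in range(400)]
print(estimate_f0_autocorr(tone, rate))
```

The estimator returns the inverse of the detected period, so its resolution is limited to whole-sample lags; practical algorithms such as YIN add interpolation and further refinements.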
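The harmonic product spectrum named under Frequency-domain approaches can be sketched in a similar way. The example below is an illustrative assumption rather than Noll's published formulation: it multiplies the magnitude spectrum at integer multiples of each candidate bin and picks the bin with the largest product. To stay within the standard library it uses a naive O(N²) discrete Fourier transform in place of the FFT, and the function names and the `num_harmonics` parameter are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitudes of DFT bins 0..N/2 of a Hann-windowed frame (naive DFT)."""
    n = len(samples)
    windowed = [s * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(samples)]
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, w in enumerate(windowed):
            acc += w * cmath.exp(-2j * math.pi * k * i / n)
        mags.append(abs(acc))
    return mags


def estimate_f0_hps(samples, sample_rate, num_harmonics=3):
    """Estimate the fundamental frequency (Hz) with a harmonic product spectrum."""
    n = len(samples)
    mags = magnitude_spectrum(samples)
    best_bin, best_product = None, 0.0
    # Only bins whose highest harmonic still lies inside the spectrum are tested.
    for k in range(1, (len(mags) - 1) // num_harmonics + 1):
        product = 1.0
        for h in range(1, num_harmonics + 1):
            product *= mags[h * k]
        if product > best_product:
            best_bin, best_product = k, product
    if best_bin is None:
        return None
    # Convert the winning bin index to a frequency; resolution is sample_rate / n.
    return best_bin * sample_rate / n


# Example: the same kind of test tone, 200 Hz with two harmonics, sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * t / rate)
        + 0.5 * math.sin(2 * math.pi * 400 * t / rate)
        + 0.25 * math.sin(2 * math.pi * 600 * t / rate)
        for t in range(400)]
print(estimate_f0_hps(tone, rate))
```

The estimate is quantized to the bin spacing (20 Hz for this window), which is the limitation that techniques such as spectral reassignment and Grandke interpolation are intended to overcome.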