Changing the speed or duration of an audio signal without affecting its pitch

"Timestretch" redirects here. For the album, see Timestretch (album).

Time stretching is the process of changing the speed or duration of an audio signal without affecting its pitch. Pitch scaling is the opposite: the process of changing the pitch without affecting the speed. Pitch shift is pitch scaling implemented in an effects unit and intended for live performance. Pitch control is a simpler process which affects pitch and speed simultaneously by slowing down or speeding up a recording.

These processes are often used to match the pitches and tempos of two pre-recorded clips for mixing when the clips cannot be reperformed or resampled. Time stretching is often used to adjust radio commercials and the audio of television advertisements to fit exactly into the 30 or 60 seconds available. It can be used to conform longer material to a designated time slot, such as a 1-hour broadcast.

Resampling

The simplest way to change the duration or pitch of an audio recording is to change the playback speed. For a digital audio recording, this can be accomplished through sample rate conversion. When using this method, the frequencies in the recording are always scaled at the same ratio as the speed, transposing its perceived pitch up or down in the process. Slowing down the recording to increase duration also lowers the pitch, while speeding it up for a shorter duration respectively raises the pitch, creating the so-called Chipmunk effect. When resampling audio to a notably lower pitch, it may be preferred that the source audio is of a higher sample rate, as slowing down the playback rate will reproduce an audio signal of a lower resolution, and therefore reduce the perceived clarity of the sound. On the contrary, when resampling audio to a notably higher pitch, it may be preferred to incorporate an interpolation filter, as frequencies that surpass the Nyquist frequency (determined by the sampling rate of the audio reproduction software or device) will create usually undesired sound distortions, a phenomenon that is also known as aliasing.

Frequency domain

Phase vocoder

Main article: Phase vocoder

One way of stretching the length of a signal without affecting the pitch is to build a phase vocoder after Flanagan, Golden, and Portnoff.

Basic steps:

1. compute the instantaneous frequency/amplitude relationship of the signal using the STFT, which is the discrete Fourier transform of a short, overlapping and smoothly windowed block of samples;
2. apply some processing to the Fourier transform magnitudes and phases (like resampling the FFT blocks); and
3. perform an inverse STFT by taking the inverse Fourier transform on each chunk and adding the resulting waveform chunks, also called overlap and add (OLA).

The phase vocoder handles sinusoid components well, but early implementations introduced considerable smearing on transient ("beat") waveforms at all non-integer compression/expansion rates, which renders the results phasey and diffuse. Recent improvements allow better quality results at all compression/expansion ratios but a residual smearing effect still remains.

The phase vocoder technique can also be used to perform pitch shifting, chorusing, timbre manipulation, harmonizing, and other unusual modifications, all of which can be changed as a function of time.

Figure: Sinusoidal analysis/synthesis system (based on McAulay & Quatieri 1988, p. 161)

Sinusoidal spectral modeling

See also: Spectral modeling synthesis

Another method for time stretching relies on a spectral model of the signal. In this method, peaks are identified in frames using the STFT of the signal, and sinusoidal "tracks" are created by connecting peaks in adjacent frames. The tracks are then re-synthesized at a new time scale. This method can yield good results on both polyphonic and percussive material, especially when the signal is separated into sub-bands. However, this method is more computationally demanding than other methods.

Figure: Modelling a monophonic sound as observation along a helix of a function with a cylinder domain

Time domain

SOLA

See also: PSOLA

Rabiner and Schafer in 1978 put forth an alternate solution that works in the time domain: attempt to find the period (or equivalently the fundamental frequency) of a given section of the wave using some pitch detection algorithm (commonly the peak of the signal's autocorrelation, or sometimes cepstral processing), and crossfade one period into another.

This is called time-domain harmonic scaling or the synchronized overlap-add method (SOLA) and performs somewhat faster than the phase vocoder on slower machines but fails when the autocorrelation mis-estimates the period of a signal with complicated harmonics (such as orchestral pieces).

Adobe Audition (formerly Cool Edit Pro) seems to solve this by looking for the period closest to a center period that the user specifies, which should be an integer multiple of the tempo, and between 30 Hz and the lowest bass frequency.

This is much more limited in scope than the phase vocoder based processing, but can be made much less processor intensive, for real-time applications. It provides the most coherent results for single-pitched sounds like voice or musically monophonic instrument recordings.

High-end commercial audio processing packages either combine the two techniques (for example by separating the signal into sinusoid and transient waveforms), or use other techniques based on the wavelet transform, or artificial neural network processing, producing the highest-quality time stretching.

Frame-based approach

Figure: Frame-based approach of many TSM procedures

In order to preserve an audio signal's pitch when stretching or compressing its duration, many time-scale modification (TSM) procedures follow a frame-based approach. Given an original discrete-time audio signal, this strategy's first step is to split the signal into short analysis frames of fixed length. The analysis frames are spaced by a fixed number of samples, called the analysis hopsize $H_a \in \mathbb{N}$. To achieve the actual time-scale modification, the analysis frames are then temporally relocated to have a synthesis hopsize $H_s \in \mathbb{N}$. This frame relocation results in a modification of the signal's duration by a stretching factor of $\alpha = H_s / H_a$. However, simply superimposing the unmodified analysis frames typically results in undesired artifacts such as phase discontinuities or amplitude fluctuations. To prevent these kinds of artifacts, the analysis frames are adapted to form synthesis frames, prior to the reconstruction of the time-scale modified output signal.

The strategy of how to derive the synthesis frames from the analysis frames is a key difference among different TSM procedures.

Speed hearing and speed talking

For the specific case of speech, time stretching can be performed using PSOLA.

Time-compressed speech is the representation of verbal text in compressed time. While one might expect speeding up to reduce comprehension, Herb Friedman says that "Experiments have shown that the brain works most efficiently if the information rate through the ears, via speech, is the 'average' reading rate, which is about 200–300 wpm (words per minute), yet the average rate of speech is in the neighborhood of 100–150 wpm."

Listening to time-compressed speech is seen as the equivalent of speed reading.

Pitch scaling

Figure: Pitch shifting (frequency scaling) is provided on Eventide Harmonizer

Figure: Frequency shifting provided by Bode Frequency Shifter does not keep frequency ratio and harmony.

These techniques can also be used to transpose an audio sample while holding speed or duration constant. This may be accomplished by time stretching and then resampling back to the original length. Alternatively, the frequency of the sinusoids in a sinusoidal model may be altered directly, and the signal reconstructed at the appropriate time scale.

Transposing can be called frequency scaling or pitch shifting, depending on perspective.

For example, one could move the pitch of every note up by a perfect fifth, keeping the tempo the same. One can view this transposition as "pitch shifting", "shifting" each note up 7 keys on a piano keyboard, or adding a fixed amount on the Mel scale, or adding a fixed amount in linear pitch space. One can view the same transposition as "frequency scaling", "scaling" (multiplying) the frequency of every note by 3/2.

Musical transposition preserves the ratios of the harmonic frequencies that determine the sound's timbre, unlike the frequency shift performed by amplitude modulation, which adds a fixed frequency offset to the frequency of every note. (In theory one could perform a literal pitch scaling in which the musical pitch space location is scaled, but that is highly unusual, and not musical.)

Time domain processing works much better here, as smearing is less noticeable, but scaling vocal samples distorts the formants into a sort of Alvin and the Chipmunks-like effect, which may be desirable or undesirable. A process that preserves the formants and character of a voice involves analyzing the signal with a channel vocoder or LPC vocoder plus any of several pitch detection algorithms and then resynthesizing it at a different fundamental frequency.

A detailed description of older analog recording techniques for pitch shifting can be found at Alvin and the Chipmunks § Recording technique.

In consumer software

Pitch-corrected audio timestretch is found in every modern web browser as part of the HTML standard for media playback. Similar controls are ubiquitous in media applications and frameworks such as GStreamer and Unity.

See also

- Beatmatching
- Dynamic tonality: real-time changes of tuning and timbre
- Pitch correction
- Scrubbing (audio)
- Nightcore

References

- "Dolby, The Chipmunks And NAB2004". Archived from the original on 2008-05-27.
- "Variable speech". www.atarimagazines.com.
- Jont B. Allen (June 1977). "Short Time Spectral Analysis, Synthesis, and Modification by Discrete Fourier Transform". IEEE Transactions on Acoustics, Speech, and Signal Processing. ASSP-25 (3): 235–238.
- McAulay, R. J.; Quatieri, T. F. (1988), "Speech Processing Based on a Sinusoidal Model" (PDF), The Lincoln Laboratory Journal, 1 (2): 153–167, archived from the original (PDF) on 2012-05-21, retrieved 2014-09-07
- David Malah (April 1979). "Time-domain algorithms for harmonic bandwidth reduction and time scaling of speech signals". IEEE Transactions on Acoustics, Speech, and Signal Processing. ASSP-27 (2): 121–133.
- Jonathan Driedger and Meinard Müller (2016). "A Review of Time-Scale Modification of Music Signals". Applied Sciences. 6 (2): 57. doi:10.3390/app6020057.
- Variable Speech, Creative Computing Vol. 9, No. 7 / July 1983 / p. 122
- "Listen to podcasts in half the time". Archived from the original on 2011-08-29. Retrieved 2008-07-24.
- "Speeding iPods". Archived from the original on 2006-09-02.
- "HTMLMediaElement.playbackRate - Web APIs". MDN. Retrieved 1 September 2021.

External links

- Time Stretching and Pitch Shifting Overview: a comprehensive overview of current time and pitch modification techniques by Stephan Bernsee
- Stephan Bernsee's smbPitchShift C source code: C source code for doing frequency domain pitch manipulation
- pitchshift.js from KievII: a JavaScript pitch shifter based on smbPitchShift code, from the open source KievII library
- The Phase Vocoder: A Tutorial: a good description of the phase vocoder
- New Phase-Vocoder Techniques for Pitch-Shifting, Harmonizing and Other Exotic Effects
- A new Approach to Transient Processing in the Phase Vocoder
- PICOLA and TDHS
- How to build a pitch shifter: theory, equations, figures and performances of a real-time guitar pitch shifter running on a DSP chip
- ZTX Time Stretching Library: free and commercial versions of a popular 3rd party time stretching library for iOS, Linux, Windows and Mac OS X
- Elastique by zplane: commercial cross-platform library, mainly used by DJ and DAW manufacturers
- Voice Synth from Qneo: specialized synthesizer for creative voice sculpting
- TSM toolbox: free MATLAB implementations of various Time-Scale Modification procedures
- PaulStretch at the Wayback Machine (archived 2023-02-02), a well-known algorithm for extreme (>10×) time stretching
- Bungee: open source and commercial libraries for real time audio stretching
- Rubber Band: open source library for time stretching and pitch shifting
- SoundTouch: open-source library for changing the tempo, pitch and playback rate
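Example: resampling. As a rough illustration of the Resampling section, the sketch below changes playback speed by reading the source at a fractional step with linear interpolation. The function names `resample_linear` and `pitch_change_semitones` are hypothetical, chosen for this example only, and no anti-aliasing filter is applied, so this is a minimal sketch rather than a production resampler.

```python
import math

def resample_linear(samples, speed):
    """Return `samples` played back `speed` times faster (linear interpolation)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    out_len = int(len(samples) / speed)
    out = []
    for n in range(out_len):
        pos = n * speed                 # fractional read position in the source
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append((1.0 - frac) * samples[i] + frac * nxt)
    return out

def pitch_change_semitones(speed):
    """Pitch change caused by a speed change, in equal-tempered semitones."""
    return 12.0 * math.log2(speed)

sample_rate = 8000
# One second of a 440 Hz tone.
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

fast = resample_linear(tone, 2.0)   # half the duration, one octave up
slow = resample_linear(tone, 0.5)   # twice the duration, one octave down
```

Duration and pitch move together here: doubling the speed halves the number of samples and raises every frequency by an octave, which is the coupling that time stretching and pitch scaling are designed to break.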
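Example: period detection and crossfading. The time-domain idea described under SOLA (estimate the period from the autocorrelation peak, then crossfade one period into another) can be sketched as follows. The helper names `estimate_period`, `crossfade` and `drop_one_period` are hypothetical, the search range is an assumed parameter, and the test signal is a clean sine wave, for which the estimate is reliable; as the article notes, signals with complicated harmonics can defeat this kind of estimator.

```python
import math

def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def crossfade(a, b):
    """Linearly fade from segment `a` into segment `b` (equal lengths)."""
    n = len(a)
    return [a[i] * (1 - i / n) + b[i] * (i / n) for i in range(n)]

def drop_one_period(x, start, period):
    """Shorten `x` by one period by crossfading two adjacent periods into one."""
    first = x[start:start + period]
    second = x[start + period:start + 2 * period]
    return x[:start] + crossfade(first, second) + x[start + 2 * period:]

# Sine wave with a period of exactly 50 samples.
signal = [math.sin(2 * math.pi * n / 50) for n in range(1000)]
period = estimate_period(signal, min_lag=20, max_lag=200)
shorter = drop_one_period(signal, start=300, period=period)
```

Because the two crossfaded segments are one period apart, the waveform stays continuous across the splice, so the duration shrinks while the pitch is unchanged. Repeating or dropping periods at regular intervals is what stretches or compresses the signal.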
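Example: frame-based time-scale modification. The frame relocation described under Frame-based approach can be sketched with plain overlap-add (OLA), arguably the most basic procedure of this family: analysis frames taken every $H_a$ samples are windowed and re-placed every $H_s$ samples, giving $\alpha = H_s / H_a$. The function names, the Hann window and the default frame sizes are assumptions made for this illustration; plain OLA does not align phases between frames, so it is expected to produce the artifacts the article mentions on strongly harmonic material.

```python
import math

def hann(length):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / length) for n in range(length)]

def ola_time_stretch(x, alpha, frame_len=1024, synthesis_hop=256):
    """Stretch `x` by roughly `alpha` using windowed overlap-add."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    # alpha = Hs / Ha, so the analysis hop is Hs / alpha (rounded to whole samples).
    analysis_hop = max(1, round(synthesis_hop / alpha))
    window = hann(frame_len)
    num_frames = 1 + (len(x) - frame_len) // analysis_hop
    out_len = (num_frames - 1) * synthesis_hop + frame_len
    y = [0.0] * out_len
    norm = [0.0] * out_len          # summed window weights, for normalisation
    for m in range(num_frames):
        a = m * analysis_hop        # where the analysis frame is read
        s = m * synthesis_hop       # where the synthesis frame is written
        for n in range(frame_len):
            y[s + n] += window[n] * x[a + n]
            norm[s + n] += window[n]
    return [v / w if w > 1e-8 else 0.0 for v, w in zip(y, norm)]

signal = [math.sin(2 * math.pi * n / 32) for n in range(1024)]
stretched = ola_time_stretch(signal, alpha=2.0, frame_len=64, synthesis_hop=16)
```

The output is a little shorter than `alpha` times the input because only whole frames are used at the edges. Procedures such as WSOLA or the phase vocoder differ from this sketch mainly in how they adapt each frame before it is added.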
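Example: frequency scaling versus frequency shifting. The perfect-fifth example in the Pitch scaling section can be checked numerically. The sketch below compares multiplying every frequency by 3/2 with adding a fixed offset, and also shows that 7 equal-tempered semitones give a ratio close to, but not exactly, 3/2. The function names and the three-partial "note" are assumptions made for this illustration.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio of an interval in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)

def scale_frequencies(freqs, ratio):
    """Frequency scaling (transposition): multiply every frequency."""
    return [f * ratio for f in freqs]

def shift_frequencies(freqs, offset_hz):
    """Frequency shifting: add the same offset to every frequency."""
    return [f + offset_hz for f in freqs]

# A note at 220 Hz with its second and third harmonics.
harmonics = [220.0, 440.0, 660.0]

just_fifth = 3.0 / 2.0
tempered_fifth = semitones_to_ratio(7)   # about 1.4983

scaled = scale_frequencies(harmonics, just_fifth)   # [330.0, 660.0, 990.0]
shifted = shift_frequencies(harmonics, 110.0)       # [330.0, 550.0, 770.0]

# Ratios of each partial to the lowest one.
scaled_ratios = [f / scaled[0] for f in scaled]     # stays 1 : 2 : 3
shifted_ratios = [f / shifted[0] for f in shifted]  # no longer 1 : 2 : 3
```

Both operations move the lowest partial from 220 Hz to 330 Hz, but only scaling keeps the partials in the 1 : 2 : 3 relationship that determines the timbre.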