The Pangloss Collection is a digital library whose objective is to store and facilitate access to audio recordings in endangered languages of the world. Developed by the LACITO centre of CNRS in Paris, the collection provides free online access to documents of connected, spontaneous speech, in otherwise little-documented languages of all continents.

Principles

A sound archive with synchronized transcriptions

For the science of linguistics, language is first and foremost spoken language. The medium of spoken language is sound. The Pangloss Collection gives access to original recordings simultaneously with transcriptions and translations, as a resource for further research. After being recorded in their cultural context, texts have been transcribed in collaboration with native speakers.

A structured, open architecture

The archived data is structured in accordance with the latest data-processing standards, following an open architecture, in an open format, and may be downloaded under a Creative Commons license. The software used to prepare and disseminate it is open-source. The Pangloss Collection is a member of the OLAC network of archival repositories and of the Digital Endangered Languages and Music Archive Network (DELAMAN).

History

The collection was initially called the LACITO Archive. The project originated in 1996 from the collaboration of Boyd Michailovsky, linguist at LACITO, with John B. Lowe, engineer; they were later joined by Michel Jacobson, engineer, who developed some tools for the project, and brought it online.

The purpose of the archive was “to conserve, and to make available for research, recorded and transcribed oral traditions and other linguistic materials in (mainly) unwritten languages, giving simultaneous access to sound recordings and text annotation.” The earliest archived corpora in the collection were languages from Nepal, from New Caledonia, from eastern Africa and French Guiana.

The archive has grown steadily since the early 2000s, incorporating corpora from various linguists, whether members of LACITO or not. In 2009, the archive had 200 recordings in 45 languages. In 2014, the (newly renamed) Pangloss Collection had 1,400 recordings in 70 languages.

As of April 2021, the Pangloss archive contains 5,038 recordings in 196 languages, totalling 780 hours of audio and video recordings.

Languages in the Pangloss Collection include Mwotlap (Austronesian; Vanuatu), Japhug (Sino-Tibetan; Southwest China), Ersu (Sino-Tibetan; Southwest China), Naxi (or Yongning Na: Sino-Tibetan; Southwest China), and Cèmuhî (Austronesian; New Caledonia).

References

Michailovsky, Boyd, Martine Mazaudon, Alexis Michaud, Séverine Guillaume, Alexandre François & Evangelia Adamou. 2014. Documenting and researching endangered languages: the Pangloss Collection. Language Documentation & Conservation 8, pp. 119-135.
Jacobson, Michel; Michailovsky, Boyd (2002). The LACITO Archive: its purpose and implementation. Int'l Workshop on Resources and Tools in Field Linguistics. Las Palmas, Canary Is., Spain.
Screen capture of LACITO's archive homepage, 27 February 2001.
Jacobson, Michel; Michailovsky, Boyd; Lowe, John B. (2001). "Linguistic documents synchronizing sound and text". Speech Communication. Special issue: “Speech Annotation and Corpus Tools”. 33 (1–2): 79–96. CiteSeerX 10.1.1.467.490. doi:10.1016/S0167-6393(00)00070-4.
Screen capture of LACITO's archive contents, 22 April 2002.
“About us” section of the Pangloss Collection (retrieved 24 April 2021).
Screen capture of LACITO's archive contents, 26 November 2009.
Source: list of all Pangloss resources on the Cocoon homepage (retrieved 10 January 2022).
Source: number of language entries in its list of corpora (retrieved 24 April 2021).
Mwotlap corpus: 564 resources.
Japhug corpus: 551 resources.
Ersu corpus: 363 resources.
Yongning Na corpus: 301 resources.
Cèmuhî corpus: 230 resources.

External links

Homepage of the Pangloss Collection
Sample text from the collection: “The Ogre Kanayongba”, a story in the Limbu language of Nepal, presented in bilingual format.
Access to the Pangloss Collection through its language map
Access to the Pangloss Collection through the CoCoON search interface.
Access to the Pangloss Collection through the OLAC search interface. Archived 2021-04-24 at the Wayback Machine