Knowledge

Word list


For statistical purposes, all these words are counted together under the base word form *possib*, allowing the ranking of a concept and form occurrence. Moreover, other languages may present specific difficulties. Such is the case of Chinese, which does not use spaces between words, and where a given string of several characters can be interpreted either as a phrase of single-character words or as a multi-character word.

The Français fondamental includes the F.F.1 list with 1,500 high-frequency words, completed by a later F.F.2 list with 1,700 mid-frequency words, and the most used syntax rules. It is claimed that 70 grammatical words constitute 50% of communicative sentences, while 3,680 words provide about 95 to 98% coverage. A list of 3,000 frequent words is available.
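As a rough illustration of the word-family grouping described above, the following sketch sums the counts of several surface forms under one base form such as *possib*. The mapping table, the sample sentence and the function name are assumptions made for illustration only; a real word-family list would come from a curated lexical resource.

```python
from collections import Counter

# Hypothetical mapping from surface forms to a base form (illustrative only).
WORD_FAMILIES = {
    "possible": "*possib*",
    "impossible": "*possib*",
    "possibility": "*possib*",
}

def count_by_family(tokens, families=WORD_FAMILIES):
    """Count tokens, folding known surface forms into their base form."""
    counts = Counter()
    for token in tokens:
        form = token.lower()
        # Forms without a known family are counted under their own spelling.
        counts[families.get(form, form)] += 1
    return counts

# Made-up sample text, split naively on whitespace.
tokens = "It is possible but the possibility seems impossible to test".split()
family_counts = count_by_family(tokens)
print(family_counts.most_common(3))
```

Splitting on whitespace is only adequate for this toy sentence; as noted above, forms such as "can't" or unsegmented Chinese text would need a proper tokenizer.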
A corpus of 5 million running words, from written texts used in United States schools (various grades, various subject areas). Its value lies in its focus on school teaching materials, and in its tagging of words by frequency in each school grade and in each subject area (Nation 1997).
The General Service List contains 2,000 headwords divided into two sets of 1,000 words. A corpus of 5 million written words was analyzed in the 1940s. The rates of occurrence (%) of the different meanings and parts of speech of each headword are provided. Various criteria, other than frequency and
Brysbaert & New 2009 made a long critical evaluation of this traditional textual analysis approach, and supported a move toward speech analysis and the analysis of film subtitles available online. This has recently been followed by a handful of follow-up studies, providing valuable frequency count analysis for various
In any case, the basic "word" unit should be defined. For Latin scripts, words are usually one or several characters separated either by spaces or punctuation. But exceptions can arise, such as English "can't", French "aujourd'hui", or idioms. It may also be preferable to group the words of a word family under the representation of its base word.
The Teachers Word Book contains 30,000 lemmas or ~13,000 word families (Goulden, Nation and Read, 1990). A corpus of 18 million written words was analysed by hand. The size of its source corpus increased its usefulness, but its age, and language changes, have reduced its applicability (Nation 1997).
Twentieth-century works all suffer from their age. In particular, recent technology-related words are missing: "blog", which in 2014 ranked #7665 in frequency in the Corpus of Contemporary American English, was first attested in 1999 and does not appear in any of these three lists.
Soares, Ana Paula; Machado, João; Costa, Ana; Iriarte, Álvaro; SimÔes, Alberto; de Almeida, José João; Comesaña, Montserrat; Perea, Manuel (April 2015), "On the advantages of word frequency and contextual diversity measures extracted from subtitles: The case of Portuguese",
521:
of the ratio between its frequency and the frequency of the most frequent item. The most common item belongs to frequency class 0 (zero) and any item that is approximately half as frequent belongs in class 1. In the example list above, the misspelled word
Paul Nation's modern language teaching summary encourages teachers first to "move from high frequency vocabulary and special purposes vocabulary to low frequency vocabulary, then to teach learners strategies to sustain autonomous vocabulary expansion" (Nation 2006).
range, were carefully applied to the corpus. Thus, despite its age, some errors, and its corpus being entirely written text, it is still an excellent database of word frequency, frequency of meanings, and reduction of noise (Nation 1997).
It seems that Zipf's law holds for frequency lists drawn from longer texts of any natural language. Frequency lists are a useful tool when building an electronic dictionary, which is a prerequisite for a wide range of applications in computational linguistics.
1943:
Dimitropoulou, M.; Duñabeitia, Jon Andoni; Avilés, Alberto; Corral, José; Carreiras, Manuel (2010), "SUBTLEX-GR: Subtitle-Based Word Frequencies as the Best Estimate of Reading Behavior: The Case of Greek",
1622:
The corpus of characters and their pedagogical aspect in ancient and contemporary China (fr: Les corpus de caractÚres et leur dimension pédagogique dans la Chine ancienne et contemporaine)
626: 331:) noted the incredible help provided by computing capabilities, making corpus analysis much easier. He cited several key issues which influence the construction of frequency lists: 1784:"Moving beyond Kucera and Francis: a critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English" 733:
In 1944, Edward Thorndike, Irvin Lorge and colleagues hand-counted 18,000,000 running words to provide the first large-scale English-language frequency list, before modern computers made such projects far easier (Nation 1997).
Chinese corpora have long been studied from the perspective of frequency lists. The historical way to learn Chinese vocabulary is based on character frequency (Allanic 2003).
515: 170:), but is mainly intended for course writers, not directly for learners. Frequency lists are also made for lexicographical purposes, serving as a sort of 945:
list of about 8,600 common traditional Chinese words are two other lists displaying common Chinese words and characters. Following the SUBTLEX movement,
These now contain 1 million words from a written corpus representing different dialects of English. These sources are used to produce frequency lists (Nation 1997).
$$N = \left\lfloor 0.5 - \log_{2}\left(\frac{\text{Frequency of this item}}{\text{Frequency of most common item}}\right) \right\rfloor$$
A lexicon sorted by frequency "provides a rational basis for making sure that learners get the best return for their vocabulary learning effort" (
Jean Baudot made a study on the model of the American Brown study, entitled "Fréquences d'utilisation des mots en français écrit contemporain".
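The frequency class formula N = ⌊0.5 − log2(frequency of this item / frequency of most common item)⌋ can be evaluated directly. The sketch below is a minimal illustration; the function name is an assumption, and the counts are the ones quoted in this article (3,789,654 for the most common item, 2,098,762 for the second item, and 76 for the misspelled word).

```python
import math

def frequency_class(item_frequency, max_frequency):
    """Return N = floor(0.5 - log2(item_frequency / max_frequency))."""
    if item_frequency <= 0 or max_frequency <= 0:
        raise ValueError("frequencies must be positive")
    return math.floor(0.5 - math.log2(item_frequency / max_frequency))

most_common = 3_789_654  # count of the most frequent item in the example list

print(frequency_class(most_common, most_common))  # 0: the most common item
print(frequency_class(2_098_762, most_common))    # 1: roughly half as frequent
print(frequency_class(76, most_common))           # 16: ratio 76/3789654
```

Because of the 0.5 offset, the floor acts as rounding of the negated logarithm, so an item only has to be approximately (not exactly) half as frequent as the top item to land in class 1.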
Pham, H.; Bolger, P.; Baayen, R.H. (2011), "SUBTLEX-VIE : A Measure for Vietnamese Word and Character Frequencies on Film Subtitles",
Those lists are not intended to be given directly to students, but rather to serve as a guideline for teachers and textbook authors (Nation 1997).
2172: 1517: 1468:
Laufer, B. (1997), "What's in a word that makes it hard or easy? Some intralexical factors that affect the learning of words.",
1261: 2268: 1518:"The word frequency effect: a review of recent developments and implications for the choice of frequency estimates in German" 1459: 1358: 2057:
Tang, K. (2012), "A 61 million word corpus of Brazilian Portuguese film subtitles as a resource for linguistic research",
683:). Memorization is positively affected by higher word frequency, likely because the learner is subject to more exposures ( 1552:
Rudell, A.P. (1993), "Frequency of word usage and perceived word difficulty : Ratings of Kucera and Francis words",
2363: 1295: 2358: 1506: 1477: 126: 104: 64: 1373: 97: 2348: 182:". While word counting is a thousand years old, with still gigantic analysis done by hand in the mid-20th century, 35: 1387: 2353: 2286: 1875:
Cai, Q.; Brysbaert, M. (2010), "SUBTLEX-CH: Chinese Word and Character Frequencies Based on Film Subtitles",
1783: 934: 605: 2202: 2165: 2067: 1516:
Brysbaert, Marc; Buchmeier, Matthias; Conrad, Markus; Jacobs, Arthur M.; Bölte, Jens; Böhl, Andrea (2011).
1204: 974: 888: 716: 395:
proposed to tap into the large number of subtitles available online to analyse large numbers of speeches.
2331: 183: 175: 1076:
Amenta, Simona; Mandera, PaweƂ; Keuleers, Emmanuel; Brysbaert, Marc; Crepaldi, Davide (7 January 2022).
1278:
Comprendre et aider les enfants en difficulté scolaire: Le Vocabulaire fondamental, 70 mots essentiels
1276: 814: 767:). This list was updated in 2013 by Dr. Charles Browne, Dr. Brent Culligan and Joseph Phillips as the 1182: 942: 487: 190: 174:
to ensure that common words are not left out. Some major pitfalls are the corpus content, the corpus
158:(generally sorted by frequency of occurrence either by levels or as a ranked list) within some given 2334:
1828: 1683:
Gimenes, Manuel; New, Boris (2016), "Worldlex: Twitter and blog word frequencies for 66 languages",
2324: 2226: 1655: 768: 91: 42: 2181: 2158: 1672: 1335: 186:
of large corpora such as movie subtitles (SUBTLEX megastudy) has accelerated the research field.
1771: 163: 108: 2150: 1321: 2261: 930: 688: 2220: 1884: 820:
The French Ministry of Education also provides a ranked list of the 1,500 most frequent word families, provided by the lexicologist Étienne Brunet.
755: 644: 960:
Most frequently used words in different languages based on Knowledge or combined corpora.
8: 1924: 1019:
Boada, Roger; Guasch, Marc; Haro, Juan; Demestre, Josep; Ferré, Pilar (1 February 2020).
1005: 984: 400:
languages. Indeed, the SUBTLEX movement completed in five years full studies for French (
1888: 1447: 2304: 2243: 2208: 2140: 2045: 1994:"SUBTLEX-US : Adding Part of Speech Information to the SUBTLEXus Word Frequencies" 1968: 1907: 1816: 1763: 1638: 1590: 1421: 1307: 1155:"Words and phrases: Frequency, genres, collocates, concordances, synonyms, and WordNet" 1058: 852: 696: 687:). Lexical access is positively influenced by high word frequency, a phenomenon called 500: 2132: 2100: 2089: 2037: 1973: 1912: 1861: 1850: 1808: 1767: 1710: 1702: 1582: 1578: 1537: 1502: 1498: 1473: 1455: 1354: 1237: 1112: 1050: 1042: 938: 1594: 1062: 913:
mentioned its importance for Chinese as a foreign language learning and teaching in
140: 2144: 2124: 2079: 2049: 2029: 1963: 1953: 1902: 1892: 1840: 1820: 1798: 1753: 1745: 1692: 1574: 1529: 1494: 1208: 1104: 1032: 969: 860: 726: 639:, are used to identify the least common, specialized terms to be replaced by their 636: 518: 369: 1154: 2214: 2128: 1897: 1533: 1130: 1093:"The Development of Word Frequency Lists Prior to the 1944 Thorndike-Lorge List" 41:
1562: 1037: 1020: 910: 848: 629: 373: 46: 2033: 1749: 1697: 929:) provided large databases with frequency ranks for characters and words. The 2342: 2196: 2093: 1958: 1829:"SUBTLEX--NL: A new measure for Dutch word frequency based on film subtitles" 1706: 1116: 1108: 1046: 1021:"SUBTLEX-CAT: Subtitle word frequencies and contextual diversity for Catalan" 775:
The American Heritage Word Frequency Book (Carroll, Davies and Richman, 1971)
482: 2014: 2136: 2041: 1977: 1916: 1854: 1812: 1714: 1541: 1286:- Citing V.A.C Henmon (dead link, no Internet Archive copy, 10 August 2023) 1233: 1092: 1054: 987:– shows changes in word/phrase frequency (and relative frequency) over time 730: 2084: 1758: 1586: 205:, where frequency here usually means the number of occurrences in a given 1845: 1803: 1486: 836: 821: 722: 660: 458: 380: 206: 159: 1405: 855:, number of occurrence in the source corpus, frequency rank, associated 1923:
Cuetos, F.; Glez-nosti, Maria; BarbĂłn, AnalĂ­a; Brysbaert, Marc (2011),
1168: 957:
Wiktionary:Frequency lists contains frequency lists in more languages.
949:
Cai & Brysbaert 2010 made a rich study of Chinese word and character frequencies.
1925:"SUBTLEX-ESP : Spanish word frequencies based on film subtitles" 979: 462: 364: 202: 171: 1731:"SUBTLEX-FR: The use of film subtitles to estimate word frequencies" 1620: 1561:
Segui, J.; Mehler, Jacques; Frauenfelder, Uli; Morton, John (1982),
2006: 1942: 840: 640: 425: 1546: 741:
The Teachers Word Book of 30,000 words (Thorndike and Lorge, 1944)
209:, from which the rank can be derived as the position in the list. 2012: 721:
Word counting is an ancient field, with known discussion dating back to Hellenistic times.
445: 155: 2318: 2068:"SUBTLEX- AL: Albanian word frequencies based on film subtitles" 2015:"Subtlex-pl: subtitle-based word frequency estimates for Polish" 2013:
Mandera, P.; Keuleers, E.; Wodniecka, Z.; Brysbaert, M. (2014).
1351:
Fréquences d'utilisation des mots en français écrit contemporain
1212: 469:
Thus, possible, impossible and possibility are words of the same word family, represented by the base word *possib*.
448:) and Catalan (2019). SUBTLEX-IT (2015) provides raw data only. 2180: 1922: 894: 879:
New et al. 2007 made a completely new count based on online film subtitles.
856: 421: 1515: 1075: 676: 1560: 871:
Lexique3 is an ongoing study from which originated the Subtlex movement cited above.
692: 788:
The Brown (Francis and Kucera, 1982) LOB and related corpora
695:). The effect of word frequency is related to the effect of 1472:, Cambridge: Cambridge University Press, pp. 140–155, 893:
There have been several studies of Spanish word frequency (Cuetos et al. 2011).
198: 179: 1728: 876: 401: 392: 2113: 1729:
New, B.; Brysbaert, M.; Veronis, J.; Pallier, C. (2007).
1296:
Liste des "70 mots essentiels" recensés par V.A.C. Henmon
437: 379:
Most currently available studies are based on written text corpora, which are more easily available and easier to process.
1454:, Cambridge: Cambridge University Press, pp. 6–19, 1077: 1018: 933:
list of 8,848 high and medium frequency words in the
608: 535: 503: 16:
Bare list of a language's words in corpus linguistics
1991: 409: 526:has a ratio of 76/3789654 and belongs in class 16. 620: 591: 509: 1470:Vocabulary: Description, Acquisition and Pedagogy 1452:Vocabulary: Description, Acquisition and Pedagogy 1423:Most frequently used words in different languages 675:Word frequency is known to have various effects ( 2340: 2117:The Quarterly Journal of Experimental Psychology 2066:Avdyli, Rrezarta; Cuetos, Fernando (June 2013), 1992:Brysbaert, M.; New, Boris; Keuleers, E. (2012), 1983: 1630: 1448:"Vocabulary size, text coverage, and word lists" 1169:"Corpus of Contemporary American English (COCA)" 813:. An attempt was made in the 1950s–60s with the 429: 2105:: CS1 maint: DOI inactive as of August 2024 ( 1563:"The word frequency effect and lexical access" 1336:"Maitrise de la langue Ă  l'Ă©cole: Vocabulaire" 2166: 670: 383:, more easily available and easy to process. 359: 2065: 1874: 1866:: CS1 maint: multiple names: authors list ( 1781: 946: 615: 609: 441: 417: 405: 396: 2316: 1826: 1489:(2006), "Language Education - Vocabulary", 998: 925:) and the Taiwanese Ministry of Education ( 413: 2173: 2159: 1682: 1491:Encyclopedia of Language & Linguistics 707:Below is a review of available resources. 2323:This article includes a language-related 2305:Japanese-Language Proficiency Test / JLPT 2083: 1967: 1957: 1906: 1896: 1844: 1802: 1757: 1696: 1610: 1274: 1090: 1069: 1036: 918: 699:, the age at which the word was learned. 
344:treatment of idioms and fixed expressions 127:Learn how and when to remove this message 65:Learn how and when to remove this message 859:, etc., available under an open license 363: 90:This article includes a list of general 1618: 1439: 1202: 906: 621:{\displaystyle \lfloor \ldots \rfloor } 139:For word lists used in word games, see 2341: 1670: 1551: 1485: 1467: 1445: 1348: 1232: 1006:"Crr Â» Subtitle Word Frequencies" 810: 793: 781: 764: 747: 734: 684: 680: 664: 656: 328: 184:natural language electronic processing 167: 2269:Test of Chinese as a Foreign Language 2154: 1722: 1654:Taiwan Ministry of Education (1997), 1600: 1338:. MinistĂšre de l'Ă©ducation nationale. 872: 2056: 1827:Keuleers, E, M, B.; New, B. (2010), 1782:Brysbaert, Marc; New, Boris (2009), 835:provides 142,000 French words, with 433: 76: 18: 1653: 926: 386: 13: 1636: 1625:(These de doctorat), Paris: INALCO 922: 410:Brysbaert, New & Keuleers 2012 96:it lacks sufficient corresponding 14: 2375: 1671:New, Boris; Pallier, Christophe, 1091:Bontrager, Terry (1 April 1991). 517:of an item in the list using the 467:possible, impossible, possibility 201:(word types) together with their 2317: 1607:(frequency list of German words) 1556:, vol. 25, pp. 455–463 921:). As a frequency toolkit, Da ( 461:under the representation of its 141:Scrabble § Acceptable words 81: 23: 1450:, in Schmitt; McCarthy (eds.), 1414: 1398: 1380: 1366: 1342: 1328: 1314: 1300: 1289: 1268: 1250: 1226: 635:Frequency lists, together with 451: 1640:Jun Da: Chinese text computing 1499:10.1016/B0-08-044854-2/00678-7 1388:"Spanish word frequency lists" 1196: 1185:. The Economist. 
20 April 2006 1175: 1161: 1147: 1123: 1084: 1012: 824:, provided by the lexicologue 430:Pham, Bolger & Baayen 2011 317: 211: 1: 2287:Test of Proficiency in Korean 1631:Written texts-based databases 1613:Why Johnny can't read Chinese 1434: 915:Why Johnny Can't Read Chinese 576:Frequency of most common item 476: 2203:Simplified Technical English 2129:10.1080/17470218.2014.964271 1898:10.1371/journal.pone.0010729 1579:10.1016/0028-3932(82)90061-6 975:Most common words in English 889:Most common words in Spanish 717:Most common words in English 702: 493:German linguists define the 154:) is a list of a language's 7: 2262:Hanyu Shuiping Kaoshi / HSK 2088:(inactive 28 August 2024), 2072:ILIRIA International Review 1677:(in French) (3.01 ed.) 1353:, Presses de L'UniversitĂ©, 963: 831:More recently, the project 650: 436:) and Portugal Portuguese ( 10: 2380: 1407:Wiktionary:Frequency lists 1392:Vocabularywiki.pbworks.com 1038:10.3758/s13428-019-01233-1 939:Republic of China (Taiwan) 935:People's Republic of China 900: 886: 882: 809:A review has been made by 714: 710: 671:Effects of words frequency 360:Traditional written corpus 354: 341:treatment of word families 322: 138: 2364:Computational linguistics 2296: 2278: 2253: 2236: 2189: 2034:10.3758/s13428-014-0489-4 2001:Behavior Research Methods 1833:Behavior Research Methods 1791:Behavior Research Methods 1750:10.1017/s014271640707035x 1738:Applied Psycholinguistics 1698:10.3758/s13428-015-0621-0 1685:Behavior Research Methods 1619:Allanic, Bernard (2003), 1534:10.1027/1618-3169/a000123 1275:Ouzoulias, AndrĂ© (2004), 1258:"Le français fondamental" 1025:Behavior Research Methods 799: 488:computational linguistics 426:Dimitropoulou et al. 
2010 335:corpus representativeness 191:computational linguistics 178:, and the definition of " 162:, serving the purpose of 2359:Quantitative linguistics 2227:New General Service List 1959:10.3389/fpsyg.2010.00218 1611:DeFrancis, John (1966), 1603:Deutsche Sprachstatistik 1183:"It's the links, stupid" 1109:10.1080/0270271910120201 991: 952: 947:Cai & Brysbaert 2010 769:New General Service List 442:Avdyli & Cuetos 2013 418:Cai & Brysbaert 2010 406:Brysbaert & New 2009 397:Brysbaert & New 2009 338:word frequency and range 2349:Lists of language lists 2182:Word lists by frequency 1946:Frontiers in Psychology 1522:Experimental Psychology 1322:"PDF 3000 French words" 1203:Merholz, Peter (1999). 909:). American sinologist 414:Keuleers & New 2010 111:more precise citations. 1601:Meier, Helmut (1967), 622: 593: 573:Frequency of this item 511: 432:), Brazil Portuguese ( 376: 350:various other criteria 164:vocabulary acquisition 2085:10.21113/iir.v3i1.112 2059:UCL Work Pap Linguist 715:Further information: 689:word frequency effect 677:Brysbaert et al. 2011 623: 594: 512: 404:), American English ( 367: 2354:Language acquisition 2221:General Service List 1846:10.3758/brm.42.3.643 1804:10.3758/brm.41.4.977 1440:Theoretical concepts 815:Français fondamental 805:Traditional datasets 756:General Service List 645:semantic compression 606: 533: 501: 347:range of information 197:is a sorted list of 2184:and number of words 1889:2010PLoSO...510729C 1674:Manuel de Lexique 3 1493:, Oxford: 494–499, 1446:Nation, P. (1997), 1349:Baudot, J. (1992), 985:Google Ngram Viewer 446:Mandera et al. 2014 2244:Academic Word List 1605:, Hildesheim: Olms 1236:(26 August 2003). 1097:Reading Psychology 895:Cuetos et al. 2011 697:age-of-acquisition 618: 589: 507: 497:(frequency class) 438:Soares et al. 2015 422:Cuetos et al. 
2011 377: 2314: 2313: 2022:Behav Res Methods 1461:978-0-521-58551-4 1360:978-2-7606-1563-2 811:New & Pallier 637:semantic networks 578: 577: 574: 510:{\displaystyle N} 495:HĂ€ufigkeitsklasse 370:personal pronouns 315: 314: 137: 136: 129: 75: 74: 67: 2371: 2335: 2321: 2175: 2168: 2161: 2152: 2151: 2147: 2110: 2104: 2096: 2087: 2062: 2053: 2019: 2004: 1998: 1988: 1980: 1971: 1961: 1952:(December): 12, 1939: 1929: 1919: 1910: 1900: 1871: 1865: 1857: 1848: 1823: 1806: 1788: 1778: 1776: 1770:. Archived from 1761: 1735: 1723:SUBTLEX movement 1717: 1700: 1678: 1666: 1665: 1664: 1649: 1648: 1647: 1637:Da, Jun (1998), 1626: 1615: 1606: 1597: 1567:Neuropsychologia 1557: 1545: 1511: 1482: 1464: 1428: 1427: 1418: 1412: 1411: 1402: 1396: 1395: 1384: 1378: 1377: 1370: 1364: 1363: 1346: 1340: 1339: 1332: 1326: 1325: 1318: 1312: 1311: 1304: 1298: 1293: 1287: 1285: 1283: 1272: 1266: 1265: 1260:. Archived from 1254: 1248: 1247: 1245: 1244: 1230: 1224: 1223: 1221: 1220: 1211:. Archived from 1209:Internet Archive 1200: 1194: 1193: 1191: 1190: 1179: 1173: 1172: 1165: 1159: 1158: 1151: 1145: 1144: 1142: 1141: 1127: 1121: 1120: 1088: 1082: 1081: 1073: 1067: 1066: 1040: 1016: 1010: 1009: 1002: 970:Letter frequency 873:Subtlex movement 727:Edward Thorndike 643:in a process of 627: 625: 624: 619: 598: 596: 595: 590: 588: 584: 583: 579: 575: 572: 571: 562: 561: 519:base 2 logarithm 516: 514: 513: 508: 387:SUBTLEX movement 305:transducionalify 212: 132: 125: 121: 118: 112: 107:this article by 98:inline citations 85: 84: 77: 70: 63: 59: 56: 50: 27: 26: 19: 2379: 2378: 2374: 2373: 2372: 2370: 2369: 2368: 2339: 2338: 2337: 2336: 2329: 2328: 2315: 2310: 2292: 2274: 2249: 2232: 2215:Special English 2185: 2179: 2098: 2097: 2017: 1996: 1927: 1859: 1858: 1786: 1774: 1733: 1725: 1662: 1660: 1645: 1643: 1633: 1509: 1480: 1462: 1442: 1437: 1432: 1431: 1420: 1419: 1415: 1404: 1403: 1399: 1386: 1385: 1381: 1372: 1371: 1367: 1361: 1347: 1343: 1334: 1333: 1329: 1320: 1319: 1315: 1306: 1305: 1301: 1294: 
1290: 1281: 1273: 1269: 1256: 1255: 1251: 1242: 1240: 1231: 1227: 1218: 1216: 1201: 1197: 1188: 1186: 1181: 1180: 1176: 1167: 1166: 1162: 1153: 1152: 1148: 1139: 1137: 1135:psycnet.apa.org 1129: 1128: 1124: 1089: 1085: 1074: 1070: 1017: 1013: 1004: 1003: 999: 994: 966: 955: 903: 891: 885: 877:New et al. 2007 802: 725:time. In 1944, 719: 713: 705: 673: 653: 607: 604: 603: 570: 566: 557: 553: 546: 542: 534: 531: 530: 502: 499: 498: 479: 454: 428:), Vietnamese ( 402:New et al. 2007 393:New et al. 2007 389: 362: 357: 325: 320: 144: 133: 122: 116: 113: 103:Please help to 102: 86: 82: 71: 60: 54: 51: 40: 34:has an unclear 28: 24: 17: 12: 11: 5: 2377: 2367: 2366: 2361: 2356: 2351: 2322: 2312: 2311: 2309: 2308: 2300: 2298: 2294: 2293: 2291: 2290: 2282: 2280: 2276: 2275: 2273: 2272: 2265: 2257: 2255: 2251: 2250: 2248: 2247: 2240: 2238: 2234: 2233: 2231: 2230: 2224: 2218: 2212: 2206: 2200: 2193: 2191: 2187: 2186: 2178: 2177: 2170: 2163: 2155: 2149: 2148: 2123:(4): 680–696, 2111: 2078:(1): 285–292, 2063: 2054: 2028:(2): 471–483. 2010: 1989: 1981: 1940: 1920: 1872: 1839:(3): 643–650, 1824: 1797:(4): 977–990, 1779: 1777:on 2016-10-24. 1759:1854/LU-599589 1724: 1721: 1720: 1719: 1691:(3): 963–972, 1680: 1668: 1651: 1632: 1629: 1628: 1627: 1616: 1608: 1598: 1573:(6): 615–627, 1558: 1549: 1528:(5): 412–424. 1513: 1507: 1483: 1478: 1465: 1460: 1441: 1438: 1436: 1433: 1430: 1429: 1413: 1410:, 21 July 2024 1397: 1379: 1365: 1359: 1341: 1327: 1313: 1308:"Generalities" 1299: 1288: 1267: 1264:on 2010-07-04. 1249: 1225: 1195: 1174: 1160: 1146: 1122: 1083: 1068: 1031:(1): 360–375. 
1011: 996: 995: 993: 990: 989: 988: 982: 977: 972: 965: 962: 954: 951: 919:DeFrancis 1966 911:John DeFrancis 902: 899: 887:Main article: 884: 881: 869: 868: 849:part of speech 826:Étienne Brunet 807: 806: 801: 798: 790: 789: 777: 776: 760: 759: 743: 742: 712: 709: 704: 701: 672: 669: 652: 649: 630:floor function 617: 614: 611: 600: 599: 587: 582: 569: 565: 560: 556: 552: 549: 545: 541: 538: 506: 481:It seems that 478: 475: 453: 450: 388: 385: 374:Serbo-Croatian 361: 358: 356: 353: 352: 351: 348: 345: 342: 339: 336: 324: 321: 319: 316: 313: 312: 309: 306: 302: 301: 299: 297: 294: 293: 290: 287: 283: 282: 280: 278: 275: 274: 271: 268: 264: 263: 260: 257: 253: 252: 250: 248: 245: 244: 241: 238: 234: 233: 230: 227: 223: 222: 219: 216: 195:frequency list 135: 134: 89: 87: 80: 73: 72: 36:citation style 31: 29: 22: 15: 9: 6: 4: 3: 2: 2376: 2365: 2362: 2360: 2357: 2355: 2352: 2350: 2347: 2346: 2344: 2333: 2332:internal link 2326: 2325:list of lists 2320: 2306: 2302: 2301: 2299: 2295: 2288: 2284: 2283: 2281: 2277: 2270: 2266: 2263: 2259: 2258: 2256: 2252: 2245: 2242: 2241: 2239: 2235: 2228: 2225: 2222: 2219: 2216: 2213: 2210: 2207: 2204: 2201: 2198: 2197:Basic English 2195: 2194: 2192: 2188: 2183: 2176: 2171: 2169: 2164: 2162: 2157: 2156: 2153: 2146: 2142: 2138: 2134: 2130: 2126: 2122: 2118: 2112: 2108: 2102: 2095: 2091: 2086: 2081: 2077: 2073: 2069: 2064: 2061:(24): 208–214 2060: 2055: 2051: 2047: 2043: 2039: 2035: 2031: 2027: 2023: 2016: 2011: 2008: 2002: 1995: 1990: 1987: 1982: 1979: 1975: 1970: 1965: 1960: 1955: 1951: 1947: 1941: 1937: 1933: 1926: 1921: 1918: 1914: 1909: 1904: 1899: 1894: 1890: 1886: 1882: 1878: 1873: 1869: 1863: 1856: 1852: 1847: 1842: 1838: 1834: 1830: 1825: 1822: 1818: 1814: 1810: 1805: 1800: 1796: 1792: 1785: 1780: 1773: 1769: 1765: 1760: 1755: 1751: 1747: 1743: 1739: 1732: 1727: 1726: 1716: 1712: 1708: 1704: 1699: 1694: 1690: 1686: 1681: 1676: 1675: 1669: 1659: 1658: 1657:ć…«ćć…­ćčŽćžžç”šèȘžè©žèȘżæŸ„ć ±ć‘Šæ›ž 1652: 1642: 1641: 1635: 1634: 1624: 
1623: 1617: 1614: 1609: 1604: 1599: 1596: 1592: 1588: 1584: 1580: 1576: 1572: 1568: 1564: 1559: 1555: 1550: 1548: 1543: 1539: 1535: 1531: 1527: 1523: 1519: 1514: 1510: 1508:9780080448541 1504: 1500: 1496: 1492: 1488: 1484: 1481: 1479:9780521585514 1475: 1471: 1466: 1463: 1457: 1453: 1449: 1444: 1443: 1425: 1424: 1417: 1409: 1408: 1401: 1393: 1389: 1383: 1375: 1369: 1362: 1356: 1352: 1345: 1337: 1331: 1323: 1317: 1309: 1303: 1297: 1292: 1280: 1279: 1271: 1263: 1259: 1253: 1239: 1235: 1234:Kottke, Jason 1229: 1215:on 1999-10-13 1214: 1210: 1206: 1205:"Peterme.com" 1199: 1184: 1178: 1170: 1164: 1156: 1150: 1136: 1132: 1131:"APA PsycNet" 1126: 1118: 1114: 1110: 1106: 1103:(2): 91–116. 1102: 1098: 1094: 1087: 1079: 1072: 1064: 1060: 1056: 1052: 1048: 1044: 1039: 1034: 1030: 1026: 1022: 1015: 1007: 1001: 997: 986: 983: 981: 978: 976: 973: 971: 968: 967: 961: 958: 950: 948: 944: 940: 936: 932: 928: 924: 920: 916: 912: 908: 898: 896: 890: 880: 878: 875:cited above. 874: 866: 865: 864: 862: 858: 854: 850: 846: 842: 838: 834: 829: 827: 823: 822:word families 818: 816: 812: 804: 803: 797: 795: 787: 786: 785: 783: 774: 773: 772: 770: 766: 757: 753: 752: 751: 749: 740: 739: 738: 736: 732: 728: 724: 718: 708: 700: 698: 694: 690: 686: 682: 678: 668: 666: 662: 658: 648: 646: 642: 638: 633: 631: 612: 585: 580: 567: 563: 558: 554: 550: 547: 543: 539: 536: 529: 528: 527: 525: 520: 504: 496: 491: 489: 484: 474: 472: 468: 464: 460: 449: 447: 443: 440:), Albanian ( 439: 435: 431: 427: 423: 419: 415: 411: 407: 403: 398: 394: 384: 382: 375: 371: 368:Frequency of 366: 349: 346: 343: 340: 337: 334: 333: 332: 330: 310: 307: 304: 303: 300: 298: 296: 295: 291: 288: 285: 284: 281: 279: 277: 276: 272: 269: 266: 265: 261: 258: 255: 254: 251: 249: 247: 246: 242: 239: 236: 235: 231: 228: 225: 224: 220: 217: 214: 213: 210: 208: 204: 200: 196: 192: 187: 185: 181: 177: 173: 169: 165: 161: 157: 153: 149: 142: 131: 128: 120: 117:December 2023 110: 106: 100: 99: 93: 88: 79: 78: 69: 66: 58: 48: 44: 38: 
37: 32:This article 30: 21: 20: 2120: 2116: 2075: 2071: 2058: 2025: 2021: 2000: 1985: 1949: 1945: 1935: 1931: 1880: 1876: 1836: 1832: 1794: 1790: 1772:the original 1741: 1737: 1688: 1684: 1673: 1661:, retrieved 1656: 1644:, retrieved 1639: 1621: 1612: 1602: 1570: 1566: 1553: 1525: 1521: 1490: 1469: 1451: 1422: 1416: 1406: 1400: 1391: 1382: 1368: 1350: 1344: 1330: 1316: 1302: 1291: 1277: 1270: 1262:the original 1252: 1241:. Retrieved 1238:"kottke.org" 1228: 1217:. Retrieved 1213:the original 1198: 1187:. Retrieved 1177: 1163: 1149: 1138:. Retrieved 1134: 1125: 1100: 1096: 1086: 1078:"SUBTLEX-IT" 1071: 1028: 1024: 1014: 1000: 959: 956: 914: 907:Allanic 2003 904: 892: 870: 861:CC-by-sa-4.0 830: 819: 808: 791: 778: 761: 758:(West, 1953) 744: 720: 706: 693:Segui et al. 674: 654: 634: 601: 523: 494: 492: 480: 470: 466: 455: 452:Lexical unit 420:), Spanish ( 416:), Chinese ( 390: 378: 326: 194: 188: 151: 147: 145: 123: 114: 95: 61: 52: 33: 1932:PsicolĂłgica 845:syllabation 837:orthography 794:Nation 1997 782:Nation 1997 765:Nation 1997 748:Nation 1997 735:Nation 1997 731:Irvin Lorge 723:Hellenistic 685:Laufer 1997 681:Rudell 1993 665:Nation 2006 661:Paul Nation 657:Nation 1997 459:word family 444:), Polish ( 381:text corpus 329:Nation 1997 318:Methodology 218:Occurrences 168:Nation 1997 160:text corpus 109:introducing 2343:Categories 1744:(4): 661. 1663:2010-08-21 1646:2010-08-21 1487:Nation, P. 
1435:References 1243:2008-06-05 1219:2008-06-05 1189:2008-06-05 1140:2023-05-15 937:, and the 524:outragious 483:Zipf's law 477:Statistics 424:), Greek ( 412:), Dutch ( 311:123,567th 92:references 55:March 2021 47:footnoting 2303:list for 2285:list for 2267:list for 2260:list for 2094:2365-8592 2007:databases 1938:: 133–143 1768:145366468 1707:1554-3528 1374:"Lexique" 1117:0270-2711 1047:1554-3528 980:Long tail 703:Languages 641:hypernyms 616:⌋ 613:… 610:⌊ 564:⁡ 551:− 463:base word 434:Tang 2012 391:However, 292:34,589th 286:stringyfy 240:2,098,762 229:3,789,654 203:frequency 172:checklist 148:word list 2297:Japanese 2289:(10,635) 2137:25263599 2101:citation 2042:24942246 1978:21833273 1917:20532192 1883:(6): 8, 1877:PLOS ONE 1862:citation 1855:20805586 1813:19897807 1715:26170053 1595:39694258 1547:database 1542:21768069 1426:, ezglot 1063:84843788 1055:30895456 964:See also 927:TME 1997 841:phonetic 833:Lexique3 651:Pedagogy 586:⌋ 544:⌊ 471:*possib* 465:. Thus, 327:Nation ( 273:1,357th 262:1,356th 176:register 43:citation 2271:(~8600) 2254:Chinese 2237:Add-ons 2229:(~2800) 2217:(~1500) 2209:Globish 2190:English 2145:5376519 2050:2334688 1969:3153823 1908:2880003 1885:Bibcode 1821:4792474 1587:7162585 923:Da 1998 901:Chinese 883:Spanish 867:Subtlex 857:lexemes 711:English 628:is the 355:Corpora 323:Factors 156:lexicon 152:lexicon 105:improve 2330:If an 2307:(8009) 2279:Korean 2264:(8848) 2223:(2000) 2211:(1500) 2205:(~875) 2143:  2135:  2092:  2048:  2040:  2003:: 1–22 1976:  1966:  1915:  1905:  1853:  1819:  1811:  1766:  1713:  1705:  1593:  1585:  1540:  1505:  1476:  1458:  1357:  1284:, Retz 1115:  1061:  1053:  1045:  853:gender 800:French 602:where 270:56,975 259:57,897 207:corpus 94:, but 2246:(570) 2199:(850) 2141:S2CID 2046:S2CID 2018:(PDF) 1997:(PDF) 1928:(PDF) 1817:S2CID 1787:(PDF) 1775:(PDF) 1764:S2CID 1734:(PDF) 1591:S2CID 1282:(PDF) 1059:S2CID 992:Notes 953:Other 221:Rank 199:words 2133:PMID 2107:link 2090:ISSN 2038:PMID 1986:ACOL 1974:PMID 
1913:PMID 1868:link 1851:PMID 1809:PMID 1711:PMID 1703:ISSN 1583:PMID 1554:Most 1538:PMID 1503:ISBN 1474:ISBN 1456:ISBN 1355:ISBN 1113:ISSN 1051:PMID 1043:ISSN 754:The 256:king 243:2nd 232:1st 215:Type 193:, a 180:word 150:(or 45:and 2125:doi 2080:doi 2030:doi 1964:PMC 1954:doi 1903:PMC 1893:doi 1841:doi 1799:doi 1754:hdl 1746:doi 1693:doi 1575:doi 1530:doi 1495:doi 1105:doi 1033:doi 943:TOP 941:'s 931:HSK 897:). 796:). 784:). 750:). 667:). 659:). 555:log 548:0.5 372:in 267:boy 226:the 189:In 2345:: 2327:. 2139:, 2131:, 2121:68 2119:, 2103:}} 2099:{{ 2074:, 2070:, 2044:. 2036:. 2026:47 2024:. 2020:. 1999:, 1972:, 1962:, 1948:, 1936:32 1934:, 1930:, 1911:, 1901:, 1891:, 1879:, 1864:}} 1860:{{ 1849:, 1837:42 1835:, 1831:, 1815:, 1807:, 1795:41 1793:, 1789:, 1762:. 1752:. 1742:28 1740:. 1736:. 1709:, 1701:, 1689:48 1687:, 1589:, 1581:, 1571:20 1569:, 1565:, 1536:. 1526:58 1524:. 1520:. 1501:, 1390:. 1207:. 1133:. 1111:. 1101:12 1099:. 1095:. 1057:. 1049:. 1041:. 1029:52 1027:. 1023:. 863:. 851:, 847:, 843:, 839:, 771:. 729:, 679:; 647:. 632:. 490:. 408:; 237:he 146:A 2174:e 2167:t 2160:v 2127:: 2109:) 2082:: 2076:3 2052:. 2032:: 2009:) 2005:( 1956:: 1950:1 1895:: 1887:: 1881:5 1870:) 1843:: 1801:: 1756:: 1748:: 1718:. 1695:: 1679:. 1667:. 1650:. 1577:: 1544:. 1532:: 1512:. 1497:: 1394:. 1376:. 1324:. 1310:. 1246:. 1222:. 1192:. 1171:. 1157:. 1143:. 1119:. 1107:: 1080:. 1065:. 1035:: 1008:. 917:( 780:( 746:( 691:( 581:) 568:( 559:2 540:= 537:N 505:N 308:1 289:5 143:. 130:) 124:( 119:) 115:( 101:. 68:) 62:( 57:) 53:( 49:. 39:.
