Overcategorization

Overcategorization, overcategorisation or category clutter is the process of assigning too many categories, classes or index terms to a given document. It is related to the Library and Information Science (LIS) concepts of document classification and subject indexing.

In LIS, the ideal number of terms to assign when classifying an item is assessed with the measures precision and recall. Assigning only the few category labels most closely related to the content of the item results in searches with high precision, i.e., searches in which a high proportion of the results are closely related to the query. Assigning more category labels to each item reduces the precision of each search but increases the recall, retrieving more of the relevant results. Related LIS concepts include exhaustivity of indexing and information overload.

Basic principles

If too many categories are assigned to a given document, the implications for users depend on how informative the links are. If the user is able to distinguish between useful and not useful links, the damage is limited: the user only wastes time selecting links. In many cases, however, the user cannot judge whether a given link will turn out to be fruitful. In that case he or she has to follow the link and read or skim another document. The worst-case scenario is, of course, that even after reading the new document the user is unable to decide whether it might be useful without investigating its subject matter thoroughly.

Overcategorization also has another unpleasant implication: it makes the system (for example in Knowledge) difficult to maintain in a consistent way. If the system is inconsistent, a user who considers the links in a given category will not find all documents relevant to that category.

Basically, the problem of overcategorization should be understood from the perspective of relevance and the traditional measures of recall and precision. If too few relevant categories are assigned to a document, recall may decrease. If too many non-relevant categories are assigned, precision becomes lower. The hard job is to say which categories are fruitful or relevant for future use of the document.

See also

Exhaustivity
Information overload
Information pollution
Relevance
Subject (documents)
Subject indexing
Overfitting
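
The trade-off between precision and recall described in this article can be illustrated with a small calculation. The following is only a minimal sketch: the document identifiers, the sets `RELEVANT`, `SPARSE` and `HEAVY`, and the helper functions are hypothetical and invented for the illustration. It compares a category assigned only to the most closely related documents with the same category assigned far more liberally.

```python
# Hypothetical toy data (an assumption for illustration only):
# the documents that are truly relevant to one category.
RELEVANT = {"d1", "d2", "d3", "d4"}

# Sparse labeling: the category is assigned only to the most closely
# related documents, so a category search retrieves just these.
SPARSE = {"d1", "d2"}

# Heavy labeling: the category is also assigned to loosely related
# documents, so a category search retrieves all of these.
HEAVY = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}


def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


for name, retrieved in (("sparse", SPARSE), ("heavy", HEAVY)):
    print(name, precision(retrieved, RELEVANT), recall(retrieved, RELEVANT))
# prints:
# sparse 1.0 0.5
# heavy 0.5 1.0
```

In this toy case the sparse labeling returns only relevant documents but misses half of them, while the heavy labeling finds every relevant document at the cost of half the results being non-relevant.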

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.