Quantile normalization

In statistics, quantile normalization is a technique for making two distributions identical in statistical properties. To quantile-normalize a test distribution to a reference distribution of the same length, sort the test distribution and sort the reference distribution. The highest entry in the test distribution then takes the value of the highest entry in the reference distribution, the next highest entry takes the value of the next highest entry in the reference distribution, and so on, until the test distribution is a permutation of the reference distribution.

To quantile normalize two or more distributions to each other, without a reference distribution, sort as before, then set each entry to the average (usually, the arithmetic mean) of the distributions. So the highest value in all cases becomes the mean of the highest values, the second highest value becomes the mean of the second highest values, and so on.

Generally a reference distribution will be one of the standard statistical distributions such as the Gaussian distribution or the Poisson distribution. The reference distribution can be generated randomly or by taking regular samples from the cumulative distribution function of the distribution. However, any reference distribution can be used.

Quantile normalization is frequently used in microarray data analysis. It was introduced as quantile standardization and then renamed as quantile normalization.

Example

A quick illustration of such normalizing on a very small dataset:

Arrays 1 to 3, genes A to D

A    5    4    3
B    2    1    4
C    3    4    6
D    4    2    8

For each column, determine a rank from lowest to highest and assign the numbers i to iv:

A    iv     iii    i
B    i      i      ii
C    ii     iii    iii
D    iii    ii     iv

These rank values are set aside to use later. Go back to the first set of data. Rearrange that first set of column values so that each column is in order from lowest to highest value. (The first column consists of 5, 2, 3, 4 and is rearranged to 2, 3, 4, 5. The second column, 4, 1, 4, 2, is rearranged to 1, 2, 4, 4, and the third column, 3, 4, 6, 8, stays the same because it is already in order from lowest to highest value.) The result is:

A    5    4    3    becomes    A    2    1    3
B    2    1    4    becomes    B    3    2    4
C    3    4    6    becomes    C    4    4    6
D    4    2    8    becomes    D    5    4    8

Now find the mean of each row to determine the value assigned to each rank:

A    (2 + 1 + 3)/3 = 2.00 = rank i
B    (3 + 2 + 4)/3 = 3.00 = rank ii
C    (4 + 4 + 6)/3 = 4.67 = rank iii
D    (5 + 4 + 8)/3 = 5.67 = rank iv

Now take the ranking order and substitute in the new values:

A    iv     iii    i
B    i      i      ii
C    ii     iii    iii
D    iii    ii     iv

becomes:

A    5.67    4.67    2.00
B    2.00    2.00    3.00
C    3.00    4.67    4.67
D    4.67    3.00    5.67

These are the new normalized values.

However, note that when, as in column two, values are tied in rank, they should instead be assigned the mean of the values corresponding to the ranks they would normally represent if they were different. In the case of column 2, they represent ranks iii and iv. So we assign the two tied rank iii entries the mean of 4.67 for rank iii and 5.67 for rank iv, which is 5.17. And so we arrive at the following set of normalized values:

A    5.67    5.17    2.00
B    2.00    2.00    3.00
C    3.00    5.17    4.67
D    4.67    3.00    5.67

The columns now have nearly identical distributions and can be easily compared; column 2 differs slightly because of the tied values. Here are the summary statistics for each of the three columns:

           Column 1    Column 2    Column 3
Min.       2.000       2.000       2.000
1st Qu.    2.750       2.750       2.750
Median     3.833       4.083       3.833
Mean       3.833       3.833       3.833
3rd Qu.    4.917       5.167       4.917
Max.       5.667       5.167       5.667

References

Amaratunga, D.; Cabrera, J. (2001). "Analysis of Data from Viral DNA Microchips". Journal of the American Statistical Association. 96 (456): 1161. doi:10.1198/016214501753381814. S2CID 18154109.
Bolstad, B. M.; Irizarry, R. A.; Astrand, M.; Speed, T. P. (2003). "A comparison of normalization methods for high density oligonucleotide array data based on variance and bias". Bioinformatics. 19 (2): 185–193. doi:10.1093/bioinformatics/19.2.185. PMID 12538238.

External links

Normalization of Affymetrix Chips (archived 2016-04-23 at the Wayback Machine)
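The two procedures described in this article (normalizing against a reference distribution, and normalizing several distributions to each other with tied ranks averaged) can be sketched in a few lines of code. What follows is a minimal pure-Python sketch, not a reference implementation: the function names `normalize_to_reference` and `quantile_normalize` are illustrative and do not come from the cited sources, the input is assumed to be a list of equal-length columns, and in `normalize_to_reference` ties in the test data are simply broken by position, which is an assumption made here for brevity.

```python
from statistics import mean


def normalize_to_reference(test, reference):
    """Give each test entry the reference value of the same rank.

    Assumes both inputs have the same length. Ties in `test` are
    broken by position (an assumption of this sketch).
    """
    sorted_ref = sorted(reference)
    # Indices of `test`, ordered from lowest to highest value.
    order = sorted(range(len(test)), key=lambda i: test[i])
    out = [None] * len(test)
    for rank, idx in enumerate(order):
        out[idx] = sorted_ref[rank]
    return out


def quantile_normalize(columns):
    """Quantile-normalize equal-length columns to each other.

    Each rank is assigned the mean of the values holding that rank
    across all columns. Entries tied within a column share the mean
    of the rank values they jointly occupy.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # rank_means[k] is the mean of the k-th smallest value of every column.
    rank_means = [mean(col[k] for col in sorted_cols) for k in range(n)]

    result = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        start = 0
        while start < n:
            # Extend `end` over the run of entries tied with col[order[start]].
            end = start
            while end + 1 < n and col[order[end + 1]] == col[order[start]]:
                end += 1
            shared = mean(rank_means[start:end + 1])
            for pos in range(start, end + 1):
                out[order[pos]] = shared
            start = end + 1
        result.append(out)
    return result


# The worked example: arrays 1 to 3 as columns, genes A to D as rows.
arrays = [[5, 2, 3, 4], [4, 1, 4, 2], [3, 4, 6, 8]]
normalized = quantile_normalize(arrays)
for gene, row in zip("ABCD", zip(*normalized)):
    print(gene, [round(v, 2) for v in row])
```

Run on the example data, the sketch should reproduce the tie-adjusted table of the worked example, with the two tied entries in the second array both receiving the mean of the rank iii and rank iv values. Working on plain lists keeps the sketch free of third-party dependencies; an array library would be the more usual choice for real microarray data.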


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.