Gower's distance

In statistics, Gower's distance between two mixed-type objects is a similarity measure that can handle different types of data within the same dataset and is particularly useful in cluster analysis or other multivariate statistical techniques. Data can be binary, ordinal, or continuous variables. It works by normalizing the differences between each pair of variables and then computing a weighted average of these differences. The distance was defined in 1971 by Gower and it takes values between 0 and 1, with smaller values indicating higher similarity.

Definition

For two objects $i$ and $j$ having $p$ descriptors, the similarity $S$ is defined as:

$$S_{ij} = \frac{\sum_{k=1}^{p} w_{ijk}\, s_{ijk}}{\sum_{k=1}^{p} w_{ijk}},$$

where the $w_{ijk}$ are non-negative weights usually set to $1$ and $s_{ijk}$ is the similarity between the two objects regarding their $k$-th variable. If the variable is binary or ordinal, the values of $s_{ijk}$ are 0 or 1, with 1 denoting equality. If the variable is continuous, $s_{ijk} = 1 - \frac{|x_i - x_j|}{R_k}$, with $R_k$ being the range of the $k$-th variable, thus ensuring $0 \leq s_{ijk} \leq 1$. As a result, the overall similarity $S_{ij}$ between two objects is the weighted average of the similarities calculated for all their descriptors.

In its original exposition, the distance does not treat ordinal variables in a special manner. In the 1990s, first Kaufman and Rousseeuw and later Podani suggested extensions where the ordering of an ordinal feature is used. For example, Podani obtains relative rank differences as $s_{ijk} = 1 - \frac{|r_i - r_j|}{\max\{r\} - \min\{r\}}$, with $r$ being the ranks corresponding to the ordered categories of the $k$-th variable.
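As an illustration, the weighted-average similarity and the rank-based variant attributed to Podani can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names, the `kinds` labels, and the convention of passing each variable's range explicitly are hypothetical choices made for this example. Returning 1 minus the similarity as the distance is a common convention that is assumed here rather than stated in the text, as is treating a zero range as full similarity.

```python
def gower_similarity(a, b, kinds, ranges, weights=None):
    """Weighted average of per-variable similarities between objects a and b.

    kinds[k] is one of (labels are hypothetical, chosen for this sketch):
      "categorical": s = 1 if the two values are equal, else 0
      "continuous":  s = 1 - |a_k - b_k| / R_k, with R_k = ranges[k]
      "rank":        same formula applied to ranks, with
                     ranges[k] = max rank - min rank (rank-based variant)
    """
    if weights is None:
        weights = [1.0] * len(kinds)  # weights are usually set to 1
    numerator = 0.0
    denominator = 0.0
    for a_k, b_k, kind, r_k, w_k in zip(a, b, kinds, ranges, weights):
        if kind == "categorical":
            s_k = 1.0 if a_k == b_k else 0.0
        elif kind in ("continuous", "rank"):
            # Assumption: a zero range means all values coincide, so s = 1.
            s_k = 1.0 if r_k == 0 else 1.0 - abs(a_k - b_k) / r_k
        else:
            raise ValueError(f"unknown variable kind: {kind!r}")
        numerator += w_k * s_k
        denominator += w_k
    if denominator == 0:
        raise ValueError("at least one weight must be positive")
    return numerator / denominator


def gower_distance(a, b, kinds, ranges, weights=None):
    """Distance taken as 1 - similarity (assumed convention)."""
    return 1.0 - gower_similarity(a, b, kinds, ranges, weights)


# Two objects described by age (continuous), colour (categorical)
# and a three-level ordinal grade encoded as ranks 1..3.
kinds = ["continuous", "categorical", "rank"]
ranges = [20.0, None, 2]          # age range in the data set, unused, 3 - 1
obj_i = (30.0, "red", 1)
obj_j = (40.0, "red", 3)

# Per-variable similarities: 1 - 10/20 = 0.5, 1 (equal), 1 - 2/2 = 0
print(gower_similarity(obj_i, obj_j, kinds, ranges))  # 0.5
print(gower_distance(obj_i, obj_j, kinds, ranges))    # 0.5
```

Passing the ranges in explicitly keeps the sketch independent of any particular data container; in practice they would be computed once per variable over the whole data set.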
Software implementations

Many programming languages and statistical packages, such as R, Python, etc., include implementations of Gower's distance.

Language/program    Function
R                   StatMatch::gower.dist(X)
Python              gower.gower_matrix(X)

References

Gower, John C (1971). "A general coefficient of similarity and some of its properties". Biometrics. 27 (4): 857–871. doi:10.2307/2528823. JSTOR 2528823. Retrieved 2024-06-03.
Borg, Ingwer; Groenen, Patrick J. F. (2005). Modern multidimensional scaling: theory and applications (2 ed.). New York: Springer. pp. 124–125. ISBN 978-0387-25150-9.
Legendre, Pierre; Legendre, Louis (2012). Numerical ecology (Third English ed.). Amsterdam: Elsevier. pp. 278–280. ISBN 978-0-444-53868-0.
Kaufman, Leonard; Rousseeuw, Peter J. (1990). Finding groups in data: an introduction to cluster analysis. New York: Wiley. pp. 35–36. ISBN 9780471878766.
Podani, János (May 1999). "Extending Gower's general coefficient of similarity to ordinal characters". Taxon. 48 (2): 331–340. doi:10.2307/1224438. JSTOR 1224438.


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
