Knowledge

Spinlock

Source 📝

164:
happens, other threads will be left "spinning" (repeatedly trying to acquire the lock), while the thread holding the lock is not making progress towards releasing it. The result is an indefinite postponement until the thread holding the lock can finish and release it. This is especially true on a single-processor system, where each waiting thread of the same priority is likely to waste its quantum (allocated time where a thread can run) spinning until the thread that holds the lock is finally finished.
25: 163:
often use spinlocks. However, spinlocks become wasteful if held for longer durations, as they may prevent other threads from running and require rescheduling. The longer a thread holds a lock, the greater the risk that the thread will be interrupted by the OS scheduler while holding the lock. If this
438:
instruction sets serve to replace locks in most cases. Although locks are still required as a fallback, they have the potential to greatly improve performance by having the processor handle entire blocks of atomic operations. This feature is built into some mutex implementations, for example in
643:
A few multi-core processors have a "power-conscious spin-lock" instruction that puts a processor to sleep, then wakes it up on the next cycle after the lock is freed. A spin-lock using such instructions is more efficient and uses less energy than spin locks with or without a back-off loop.
443:. The Hardware Lock Elision (HLE) in x86 is a weakened but backwards-compatible version of TSE, and we can use it here for locking without losing any compatibility. In this particular case, the processor can choose to not lock until two threads actually conflict with each other. 674:
to a different thread while waiting. This typically involves attaching the current thread to a queue of threads waiting for the lock, followed by switching to another thread that is ready to do some useful work. This scheme also has the advantage that it guarantees that
183:
operations and cannot be easily implemented in programming languages not supporting truly atomic operations. On architectures without such operations, or if high-level language implementation is required, a non-atomic locking algorithm may be used, e.g.
145:. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on (the one that holds the lock) blocks or "goes to sleep". 404:
systems) will do the wrong thing and data protected by the lock could be corrupted. On most non-x86 architectures, explicit memory barrier or atomic instructions (as in the example) must be used. On some systems, such as
679:
does not occur as long as all threads eventually relinquish locks they acquire and scheduling decisions can be made about which thread should progress first. Spinlocks that never entail switching, usable by
424:
bus traffic while a CPU waits for the lock. This optimization is effective on all CPU architectures that have a cache per CPU, because MESI is so widespread. On Hyper-Threading CPUs, pausing with
958: 832: 141:("spin") while repeatedly checking whether the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of 1039: 167:
Implementing spinlocks correctly is challenging because programmers must take into account the possibility of simultaneous access to the lock, which could cause
954: 1057: 1013: 367:
The simple implementation above works on all CPUs using the x86 architecture. However, a number of performance optimizations are possible:
825: 431: 1087: 416:, code trying to acquire a lock should loop reading without trying to write anything until it reads a changed value. Because of 1009: 708:". The idea is to use a spinlock when trying to access a resource locked by a currently-running thread, but to sleep if the 428:
gives additional performance by hinting to the core that it can work on the other thread while the lock spins waiting.
743: 192:
than a spinlock, be slower to allow progress after unlocking, and may not be implementable in a high-level language if
89: 937: 791: 108: 61: 727: 633:, such a test-and-test-and-set lock (TTAS) performs much better than the simple test-and-set lock (TAS) approach. 1092: 996: 656:
to acquire a lock, it wastes time that might be productively spent elsewhere. There are two ways to avoid this:
68: 46: 871: 906: 661: 1019: 75: 1002: 753: 420:
caching protocols, this causes the cache line for the lock to become "Shared"; then there is remarkably
693: 681: 462:; In C: while (!__sync_bool_compare_and_swap(&locked, 0, 1)) while (locked) __builtin_ia32_pause(); 159:, spinlocks are efficient if threads are likely to be blocked for only short periods. For this reason, 401: 134: 57: 42: 730:
behaviour, however this resulted in more CPU usage in the kernel and larger applications, such as
653: 185: 160: 130: 35: 1067: 204:
The following example uses x86 assembly language to implement a spinlock. It will work on any
193: 152: 1062: 1025: 859: 435: 122: 1072: 660:
Do not acquire the lock. In many situations it is possible to design data structures that
8: 1053: 884: 676: 637: 374:
can safely use an unlocked MOV instead of the slower locked XCHG. This is due to subtle
1029: 709: 1035: 82: 933: 787: 172: 409:, there are special "unlock" instructions which provide the needed memory ordering. 1043: 705: 176: 149: 413: 394: 375: 189: 671: 379: 168: 156: 999:
from The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition
1081: 630: 1026:
The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors
180: 142: 1047: 858:
Maurice Herlihy and Nir Shavit. "The Art of Multiprocessor Programming".
763: 723: 390: 1036:
Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors
579:; work on the other thread now. Also written as the "pause". 138: 807: 748: 665: 24: 898: 697: 640:
delay before re-checking the lock performs even better than TTAS.
171:. Generally, such an implementation is possible only with special 902: 758: 731: 719: 701: 519:; XACQUIRE hints to the processor that we are acquiring a lock. 16:
In computing, a lock which causes a thread to loop continuously
980: 440: 406: 387: 383: 208: 205: 927: 781: 576:; Tell the CPU that we are waiting in a spinloop, so it can 417: 398: 457:
With the optimizations applied, a sample would look like:
664:, e.g. by using per-thread or per-CPU data and disabling 294:; Otherwise, EAX is 1 and we didn't acquire the lock. 615:; Assuming the memory ordering rules apply, release the 516:; atomically decide: if locked is zero, write ECX to it. 378:
rules which support this, even though MOV is not a full
885:"Parallelism and the ARM Instruction Set Architecture" 652:
The primary disadvantage of a spinlock is that, while
618:; lock variable with a "lock release" hint. 528:; If we locked it (old value equal to EAX: 0), return. 303:; Jump back to the MOV instruction if the Zero Flag is 1073:
Operating Systems: Three Easy Pieces (Chapter: Locks)
786:(Fourth ed.). Addison-Wesley. pp. 176–179. 498:; Zero out EAX, because cmpxchg compares against EAX. 282:; Test EAX with itself. Among other things, this will 1010:
User-Level Spin Locks - Threads, Processes & IPC
315:; The lock has been acquired, return to the calling 188:. However, such an implementation may require more 49:. Unsourced material may be challenged and removed. 636:With large numbers of processors, adding a random 370:On later implementations of the x86 architecture, 306:; not set; the lock was previously locked, and so 285:; set the processor's Zero Flag if EAX is 0. 1079: 932:(Fourth ed.). Addison-Wesley. p. 198. 928:Silberschatz, Abraham; Galvin, Peter B. (1994). 782:Silberschatz, Abraham; Galvin, Peter B. (1994). 896: 264:; This will always store 1 to the lock, leaving 981:"tedu comment on Locking in WebKit - Lobsters" 955:"src/lib/librthread/rthread.c - Revision 1.71" 222:; The lock variable. 1 = locked, 0 = unlocked. 288:; If EAX is 0, then the lock was unlocked and 1048:2006 Dijkstra Prize in Distributed Computing 629:On any multi-processor system that uses the 362: 309:; we need to spin until it becomes unlocked. 978: 952: 872:"Boost.Fiber Tuning: Exponential back-off" 826:"New Technologies in the Arm Architecture" 446:A simpler version of the test can use the 267:; the previous value in the EAX register. 854: 852: 712:is not currently running. (The latter is 704:) use a hybrid approach called "adaptive 199: 137:trying to acquire it to simply wait in a 109:Learn how and when to remove this message 432:Transactional Synchronization Extensions 716:the case on single-processor systems.) 348:; Atomically swap the EAX register with 258:; Atomically swap the EAX register with 1080: 849: 1068:Interlocked Variable Access(Windows) 1063:Austria C++ SpinLock Class Reference 722:attempted to replace spinlocks with 47:adding citations to reliable sources 18: 1003:Variety of spinlock Implementations 897:Jonathan Corbet (9 December 2009). 883:John Goodacre and Andrew N. Sloss. 13: 808:"gcc - x86 spinlock using cmpxchg" 692:Most operating systems (including 558:; Perform the zero-test as before. 386:processors, some revisions of the 382:. However, some processors (some 14: 1104: 990: 567:; If it's zero, we can retry. 148:Because they avoid overhead from 454:built into many Unix compilers. 23: 997:pthread_spin_lock documentation 972: 961:from the original on 2021-02-27 909:from the original on 7 May 2013 838:from the original on 2019-04-02 647: 34:needs additional citations for 1088:Concurrency control algorithms 946: 921: 890: 877: 865: 818: 800: 775: 1: 769: 624:; The lock has been released. 357:; The lock has been released. 480:; Set the ECX register to 1. 452:__sync_bool_compare_and_swap 336:; Set the EAX register to 0. 246:; Set the EAX register to 1. 7: 860:"Spin Locks and Contention" 754:Deadlock (computer science) 737: 682:real-time operating systems 450:instruction on x86, or the 393:(due to bugs), and earlier 10: 1109: 1046:. This paper received the 979:Ted Unangst (2016-05-06). 953:Ted Unangst (2013-06-01). 899:"Spinlock naming resolved" 1020:Spin Lock Example in Java 930:Operating System Concepts 784:Operating System Concepts 363:Significant optimizations 734:, becoming much slower. 631:MESI contention protocol 459: 213: 179:(i.e. un-interruptible) 161:operating-system kernels 684:, are sometimes called 543:; Read locked into EAX. 1093:Programming constructs 1040:John M. Mellor-Crummey 662:do not require locking 211:compatible processor. 200:Example implementation 194:out-of-order execution 175:instructions, such as 588:; Keep check-pausing. 351:; the lock variable. 291:; we just locked it. 261:; the lock variable. 1005:from Concurrency Kit 436:transactional memory 412:To reduce inter-CPU 186:Peterson's algorithm 153:process rescheduling 123:software engineering 43:improve this article 677:resource starvation 638:exponential backoff 434:and other hardware 1030:Thomas E. Anderson 728:first-in-first-out 173:assembly language 157:context switching 119: 118: 111: 93: 1100: 1044:Michael L. Scott 985: 984: 976: 970: 969: 967: 966: 950: 944: 943: 925: 919: 918: 916: 914: 894: 888: 881: 875: 869: 863: 856: 847: 846: 844: 843: 837: 830: 822: 816: 815: 804: 798: 797: 779: 625: 622: 619: 616: 613: 610: 607: 604: 601: 598: 595: 592: 589: 586: 583: 580: 577: 574: 571: 568: 565: 562: 559: 556: 553: 550: 547: 544: 541: 538: 535: 532: 529: 526: 523: 520: 517: 514: 511: 508: 505: 502: 499: 496: 493: 490: 487: 484: 481: 478: 475: 472: 469: 466: 463: 453: 449: 427: 358: 355: 352: 349: 346: 343: 340: 337: 334: 331: 328: 325: 322: 319: 316: 313: 310: 307: 304: 301: 298: 295: 292: 289: 286: 283: 280: 277: 274: 271: 268: 265: 262: 259: 256: 253: 250: 247: 244: 241: 238: 235: 232: 229: 226: 223: 220: 217: 150:operating system 114: 107: 103: 100: 94: 92: 51: 27: 19: 1108: 1107: 1103: 1102: 1101: 1099: 1098: 1097: 1078: 1077: 1058:Jeffrey Richter 993: 988: 977: 973: 964: 962: 951: 947: 940: 926: 922: 912: 910: 895: 891: 882: 878: 870: 866: 857: 850: 841: 839: 835: 828: 824: 823: 819: 806: 805: 801: 794: 780: 776: 772: 744:Synchronization 740: 726:which enforced 650: 627: 626: 623: 620: 617: 614: 611: 608: 605: 602: 599: 596: 593: 590: 587: 584: 581: 578: 575: 572: 569: 566: 563: 560: 557: 554: 551: 548: 545: 542: 539: 536: 533: 530: 527: 524: 521: 518: 515: 512: 509: 506: 503: 500: 497: 494: 491: 488: 485: 482: 479: 476: 473: 470: 467: 464: 461: 451: 447: 425: 376:memory ordering 365: 360: 359: 356: 353: 350: 347: 344: 341: 338: 335: 332: 329: 326: 323: 320: 317: 314: 311: 308: 305: 302: 299: 296: 293: 290: 287: 284: 281: 278: 275: 272: 269: 266: 263: 260: 257: 254: 251: 248: 245: 242: 239: 236: 233: 230: 227: 224: 221: 218: 215: 202: 169:race conditions 115: 104: 98: 95: 52: 50: 40: 28: 17: 12: 11: 5: 1106: 1096: 1095: 1090: 1076: 1075: 1070: 1065: 1060: 1054:Spin-Wait Lock 1051: 1032: 1022: 1016: 1006: 1000: 992: 991:External links 989: 987: 986: 971: 945: 938: 920: 889: 876: 864: 848: 817: 812:Stack Overflow 799: 792: 773: 771: 768: 767: 766: 761: 756: 751: 746: 739: 736: 690: 689: 669: 649: 646: 460: 380:memory barrier 364: 361: 216:; Intel syntax 214: 201: 198: 133:that causes a 117: 116: 31: 29: 22: 15: 9: 6: 4: 3: 2: 1105: 1094: 1091: 1089: 1086: 1085: 1083: 1074: 1071: 1069: 1066: 1064: 1061: 1059: 1055: 1052: 1049: 1045: 1041: 1037: 1033: 1031: 1027: 1023: 1021: 1017: 1015: 1014:Gert Boddaert 1011: 1007: 1004: 1001: 998: 995: 994: 982: 975: 960: 956: 949: 941: 939:0-201-59292-4 935: 931: 924: 908: 904: 900: 893: 886: 880: 873: 868: 861: 855: 853: 834: 827: 821: 813: 809: 803: 795: 793:0-201-59292-4 789: 785: 778: 774: 765: 762: 760: 757: 755: 752: 750: 747: 745: 742: 741: 735: 733: 729: 725: 721: 717: 715: 711: 707: 703: 699: 695: 687: 686:raw spinlocks 683: 678: 673: 670: 667: 663: 659: 658: 657: 655: 645: 641: 639: 634: 632: 458: 455: 444: 442: 437: 433: 429: 423: 419: 415: 410: 408: 403: 400: 396: 392: 389: 385: 381: 377: 373: 368: 212: 210: 207: 197: 195: 191: 187: 182: 178: 174: 170: 165: 162: 158: 154: 151: 146: 144: 140: 136: 132: 128: 124: 113: 110: 102: 91: 88: 84: 81: 77: 74: 70: 67: 63: 60: –  59: 55: 54:Find sources: 48: 44: 38: 37: 32:This article 30: 26: 21: 20: 974: 963:. Retrieved 948: 929: 923: 911:. Retrieved 892: 879: 867: 840:. Retrieved 820: 811: 802: 783: 777: 724:ticket locks 718: 713: 691: 685: 651: 648:Alternatives 642: 635: 628: 600:spin_unlock: 456: 445: 430: 421: 411: 371: 369: 366: 321:spin_unlock: 318:; function. 203: 196:is allowed. 181:test-and-set 166: 147: 143:busy waiting 126: 120: 105: 99:October 2012 96: 86: 79: 72: 65: 53: 41:Please help 36:verification 33: 764:Ticket lock 597:; All done. 414:bus traffic 391:Pentium Pro 372:spin_unlock 1082:Categories 965:2022-01-25 842:2019-09-26 770:References 666:interrupts 465:spin_lock: 231:spin_lock: 69:newspapers 58:"Spinlock" 1008:Article " 749:Busy spin 300:spin_lock 1018:Article 959:Archived 907:Archived 887:. p. 47. 833:Archived 738:See also 698:Mac OS X 603:XRELEASE 501:XACQUIRE 127:spinlock 1034:Paper " 1024:Paper " 903:LWN.net 759:Seqlock 732:Firefox 720:OpenBSD 702:FreeBSD 694:Solaris 654:waiting 507:cmpxchg 448:cmpxchg 426:rep nop 395:Pentium 219:locked: 83:scholar 936:  913:14 May 790:  714:always 710:thread 672:Switch 531:pause: 483:retry: 190:memory 177:atomic 135:thread 85:  78:  71:  64:  56:  1038:" by 1028:" by 1012:" by 836:(PDF) 829:(PDF) 706:mutex 585:pause 564:retry 441:glibc 407:IA-64 388:Intel 384:Cyrix 209:80386 206:Intel 129:is a 90:JSTOR 76:books 1042:and 934:ISBN 915:2013 788:ISBN 700:and 591:out: 546:test 504:lock 418:MESI 399:i486 397:and 339:xchg 270:test 249:xchg 139:loop 131:lock 125:, a 62:news 1056:by 621:ret 606:mov 594:ret 582:jmp 573:nop 570:rep 555:eax 549:eax 537:eax 534:mov 525:out 513:ecx 495:eax 489:eax 486:xor 471:ecx 468:mov 402:SMP 354:ret 342:eax 333:eax 327:eax 324:xor 312:ret 297:jnz 279:eax 273:eax 252:eax 237:eax 234:mov 155:or 121:In 45:by 1084:: 957:. 905:. 901:. 851:^ 831:. 810:. 696:, 561:jz 522:je 422:no 225:dd 1050:. 983:. 968:. 942:. 917:. 874:. 862:. 845:. 814:. 796:. 688:. 668:. 612:0 609:, 552:, 540:, 510:, 492:, 477:1 474:, 345:, 330:, 276:, 255:, 243:1 240:, 228:0 112:) 106:( 101:) 97:( 87:· 80:· 73:· 66:· 39:.

Index


verification
improve this article
adding citations to reliable sources
"Spinlock"
news
newspapers
books
scholar
JSTOR
Learn how and when to remove this message
software engineering
lock
thread
loop
busy waiting
operating system
process rescheduling
context switching
operating-system kernels
race conditions
assembly language
atomic
test-and-set
Peterson's algorithm
memory
out-of-order execution
Intel
80386
memory ordering

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.