164:
happens, other threads will be left "spinning" (repeatedly trying to acquire the lock), while the thread holding the lock is not making progress towards releasing it. The result is an indefinite postponement until the thread holding the lock can finish and release it. This is especially true on a single-processor system, where each waiting thread of the same priority is likely to waste its quantum (allocated time where a thread can run) spinning until the thread that holds the lock is finally finished.
25:
163:
often use spinlocks. However, spinlocks become wasteful if held for longer durations, as they may prevent other threads from running and require rescheduling. The longer a thread holds a lock, the greater the risk that the thread will be interrupted by the OS scheduler while holding the lock. If this
438:
instruction sets serve to replace locks in most cases. Although locks are still required as a fallback, they have the potential to greatly improve performance by having the processor handle entire blocks of atomic operations. This feature is built into some mutex implementations, for example in
643:
A few multi-core processors have a "power-conscious spin-lock" instruction that puts a processor to sleep, then wakes it up on the next cycle after the lock is freed. A spin-lock using such instructions is more efficient and uses less energy than spin locks with or without a back-off loop.
443:. The Hardware Lock Elision (HLE) in x86 is a weakened but backwards-compatible version of TSE, and we can use it here for locking without losing any compatibility. In this particular case, the processor can choose to not lock until two threads actually conflict with each other.
674:
to a different thread while waiting. This typically involves attaching the current thread to a queue of threads waiting for the lock, followed by switching to another thread that is ready to do some useful work. This scheme also has the advantage that it guarantees that
183:
operations and cannot be easily implemented in programming languages not supporting truly atomic operations. On architectures without such operations, or if high-level language implementation is required, a non-atomic locking algorithm may be used, e.g.
145:. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on (the one that holds the lock) blocks or "goes to sleep".
404:
systems) will do the wrong thing and data protected by the lock could be corrupted. On most non-x86 architectures, explicit memory barrier or atomic instructions (as in the example) must be used. On some systems, such as
679:
does not occur as long as all threads eventually relinquish locks they acquire and scheduling decisions can be made about which thread should progress first. Spinlocks that never entail switching, usable by
424:
bus traffic while a CPU waits for the lock. This optimization is effective on all CPU architectures that have a cache per CPU, because MESI is so widespread. On Hyper-Threading CPUs, pausing with
958:
832:
141:("spin") while repeatedly checking whether the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of
1039:
167:
Implementing spinlocks correctly is challenging because programmers must take into account the possibility of simultaneous access to the lock, which could cause
954:
1057:
1013:
367:
The simple implementation above works on all CPUs using the x86 architecture. However, a number of performance optimizations are possible:
825:
431:
1087:
416:, code trying to acquire a lock should loop reading without trying to write anything until it reads a changed value. Because of
1009:
708:". The idea is to use a spinlock when trying to access a resource locked by a currently-running thread, but to sleep if the
428:
gives additional performance by hinting to the core that it can work on the other thread while the lock spins waiting.
743:
192:
than a spinlock, be slower to allow progress after unlocking, and may not be implementable in a high-level language if
89:
937:
791:
108:
61:
727:
633:, such a test-and-test-and-set lock (TTAS) performs much better than the simple test-and-set lock (TAS) approach.
1092:
996:
656:
to acquire a lock, it wastes time that might be productively spent elsewhere. There are two ways to avoid this:
68:
46:
871:
906:
661:
1019:
75:
1002:
753:
420:
caching protocols, this causes the cache line for the lock to become "Shared"; then there is remarkably
693:
681:
462:; In C: while (!__sync_bool_compare_and_swap(&locked, 0, 1)) while (locked) __builtin_ia32_pause();
159:, spinlocks are efficient if threads are likely to be blocked for only short periods. For this reason,
401:
134:
57:
42:
730:
behaviour, however this resulted in more CPU usage in the kernel and larger applications, such as
653:
185:
160:
130:
35:
1067:
204:
The following example uses x86 assembly language to implement a spinlock. It will work on any
193:
152:
1062:
1025:
859:
435:
122:
1072:
660:
Do not acquire the lock. In many situations it is possible to design data structures that
8:
1053:
884:
676:
637:
374:
can safely use an unlocked MOV instead of the slower locked XCHG. This is due to subtle
1029:
709:
1035:
82:
933:
787:
172:
409:, there are special "unlock" instructions which provide the needed memory ordering.
1043:
705:
176:
149:
413:
394:
375:
189:
671:
379:
168:
156:
999:
from The Open Group Base
Specifications Issue 6, IEEE Std 1003.1, 2004 Edition
1081:
630:
1026:
The
Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors
180:
142:
1047:
858:
Maurice
Herlihy and Nir Shavit. "The Art of Multiprocessor Programming".
763:
723:
390:
1036:
Algorithms for
Scalable Synchronization on Shared-Memory Multiprocessors
579:; work on the other thread now. Also written as the "pause".
138:
807:
748:
665:
24:
898:
697:
640:
delay before re-checking the lock performs even better than TTAS.
171:. Generally, such an implementation is possible only with special
902:
758:
731:
719:
701:
519:; XACQUIRE hints to the processor that we are acquiring a lock.
16:
In computing, a lock which causes a thread to loop continuously
980:
440:
406:
387:
383:
208:
205:
927:
781:
576:; Tell the CPU that we are waiting in a spinloop, so it can
417:
398:
457:
With the optimizations applied, a sample would look like:
664:, e.g. by using per-thread or per-CPU data and disabling
294:; Otherwise, EAX is 1 and we didn't acquire the lock.
615:; Assuming the memory ordering rules apply, release the
516:; atomically decide: if locked is zero, write ECX to it.
378:
rules which support this, even though MOV is not a full
885:"Parallelism and the ARM Instruction Set Architecture"
652:
The primary disadvantage of a spinlock is that, while
618:; lock variable with a "lock release" hint.
528:; If we locked it (old value equal to EAX: 0), return.
303:; Jump back to the MOV instruction if the Zero Flag is
1073:
Operating
Systems: Three Easy Pieces (Chapter: Locks)
786:(Fourth ed.). Addison-Wesley. pp. 176–179.
498:; Zero out EAX, because cmpxchg compares against EAX.
282:; Test EAX with itself. Among other things, this will
1010:
User-Level Spin Locks - Threads, Processes & IPC
315:; The lock has been acquired, return to the calling
188:. However, such an implementation may require more
49:. Unsourced material may be challenged and removed.
636:With large numbers of processors, adding a random
370:On later implementations of the x86 architecture,
306:; not set; the lock was previously locked, and so
285:; set the processor's Zero Flag if EAX is 0.
1079:
932:(Fourth ed.). Addison-Wesley. p. 198.
928:Silberschatz, Abraham; Galvin, Peter B. (1994).
782:Silberschatz, Abraham; Galvin, Peter B. (1994).
896:
264:; This will always store 1 to the lock, leaving
981:"tedu comment on Locking in WebKit - Lobsters"
955:"src/lib/librthread/rthread.c - Revision 1.71"
222:; The lock variable. 1 = locked, 0 = unlocked.
288:; If EAX is 0, then the lock was unlocked and
1048:2006 Dijkstra Prize in Distributed Computing
629:On any multi-processor system that uses the
362:
309:; we need to spin until it becomes unlocked.
978:
952:
872:"Boost.Fiber Tuning: Exponential back-off"
826:"New Technologies in the Arm Architecture"
446:A simpler version of the test can use the
267:; the previous value in the EAX register.
854:
852:
712:is not currently running. (The latter is
704:) use a hybrid approach called "adaptive
199:
137:trying to acquire it to simply wait in a
109:Learn how and when to remove this message
432:Transactional Synchronization Extensions
716:the case on single-processor systems.)
348:; Atomically swap the EAX register with
258:; Atomically swap the EAX register with
1080:
849:
1068:Interlocked Variable Access(Windows)
1063:Austria C++ SpinLock Class Reference
722:attempted to replace spinlocks with
47:adding citations to reliable sources
18:
1003:Variety of spinlock Implementations
897:Jonathan Corbet (9 December 2009).
883:John Goodacre and Andrew N. Sloss.
13:
808:"gcc - x86 spinlock using cmpxchg"
692:Most operating systems (including
558:; Perform the zero-test as before.
386:processors, some revisions of the
382:. However, some processors (some
14:
1104:
990:
567:; If it's zero, we can retry.
148:Because they avoid overhead from
454:built into many Unix compilers.
23:
997:pthread_spin_lock documentation
972:
961:from the original on 2021-02-27
909:from the original on 7 May 2013
838:from the original on 2019-04-02
647:
34:needs additional citations for
1088:Concurrency control algorithms
946:
921:
890:
877:
865:
818:
800:
775:
1:
769:
624:; The lock has been released.
357:; The lock has been released.
480:; Set the ECX register to 1.
452:__sync_bool_compare_and_swap
336:; Set the EAX register to 0.
246:; Set the EAX register to 1.
7:
860:"Spin Locks and Contention"
754:Deadlock (computer science)
737:
682:real-time operating systems
450:instruction on x86, or the
393:(due to bugs), and earlier
10:
1109:
1046:. This paper received the
979:Ted Unangst (2016-05-06).
953:Ted Unangst (2013-06-01).
899:"Spinlock naming resolved"
1020:Spin Lock Example in Java
930:Operating System Concepts
784:Operating System Concepts
363:Significant optimizations
734:, becoming much slower.
631:MESI contention protocol
459:
213:
179:(i.e. un-interruptible)
161:operating-system kernels
684:, are sometimes called
543:; Read locked into EAX.
1093:Programming constructs
1040:John M. Mellor-Crummey
662:do not require locking
211:compatible processor.
200:Example implementation
194:out-of-order execution
175:instructions, such as
588:; Keep check-pausing.
351:; the lock variable.
291:; we just locked it.
261:; the lock variable.
1005:from Concurrency Kit
436:transactional memory
412:To reduce inter-CPU
186:Peterson's algorithm
153:process rescheduling
123:software engineering
43:improve this article
677:resource starvation
638:exponential backoff
434:and other hardware
1030:Thomas E. Anderson
728:first-in-first-out
173:assembly language
157:context switching
119:
118:
111:
93:
1100:
1044:Michael L. Scott
985:
984:
976:
970:
969:
967:
966:
950:
944:
943:
925:
919:
918:
916:
914:
894:
888:
881:
875:
869:
863:
856:
847:
846:
844:
843:
837:
830:
822:
816:
815:
804:
798:
797:
779:
625:
622:
619:
616:
613:
610:
607:
604:
601:
598:
595:
592:
589:
586:
583:
580:
577:
574:
571:
568:
565:
562:
559:
556:
553:
550:
547:
544:
541:
538:
535:
532:
529:
526:
523:
520:
517:
514:
511:
508:
505:
502:
499:
496:
493:
490:
487:
484:
481:
478:
475:
472:
469:
466:
463:
453:
449:
427:
358:
355:
352:
349:
346:
343:
340:
337:
334:
331:
328:
325:
322:
319:
316:
313:
310:
307:
304:
301:
298:
295:
292:
289:
286:
283:
280:
277:
274:
271:
268:
265:
262:
259:
256:
253:
250:
247:
244:
241:
238:
235:
232:
229:
226:
223:
220:
217:
150:operating system
114:
107:
103:
100:
94:
92:
51:
27:
19:
1108:
1107:
1103:
1102:
1101:
1099:
1098:
1097:
1078:
1077:
1058:Jeffrey Richter
993:
988:
977:
973:
964:
962:
951:
947:
940:
926:
922:
912:
910:
895:
891:
882:
878:
870:
866:
857:
850:
841:
839:
835:
828:
824:
823:
819:
806:
805:
801:
794:
780:
776:
772:
744:Synchronization
740:
726:which enforced
650:
627:
626:
623:
620:
617:
614:
611:
608:
605:
602:
599:
596:
593:
590:
587:
584:
581:
578:
575:
572:
569:
566:
563:
560:
557:
554:
551:
548:
545:
542:
539:
536:
533:
530:
527:
524:
521:
518:
515:
512:
509:
506:
503:
500:
497:
494:
491:
488:
485:
482:
479:
476:
473:
470:
467:
464:
461:
451:
447:
425:
376:memory ordering
365:
360:
359:
356:
353:
350:
347:
344:
341:
338:
335:
332:
329:
326:
323:
320:
317:
314:
311:
308:
305:
302:
299:
296:
293:
290:
287:
284:
281:
278:
275:
272:
269:
266:
263:
260:
257:
254:
251:
248:
245:
242:
239:
236:
233:
230:
227:
224:
221:
218:
215:
202:
169:race conditions
115:
104:
98:
95:
52:
50:
40:
28:
17:
12:
11:
5:
1106:
1096:
1095:
1090:
1076:
1075:
1070:
1065:
1060:
1054:Spin-Wait Lock
1051:
1032:
1022:
1016:
1006:
1000:
992:
991:External links
989:
987:
986:
971:
945:
938:
920:
889:
876:
864:
848:
817:
812:Stack Overflow
799:
792:
773:
771:
768:
767:
766:
761:
756:
751:
746:
739:
736:
690:
689:
669:
649:
646:
460:
380:memory barrier
364:
361:
216:; Intel syntax
214:
201:
198:
133:that causes a
117:
116:
31:
29:
22:
15:
9:
6:
4:
3:
2:
1105:
1094:
1091:
1089:
1086:
1085:
1083:
1074:
1071:
1069:
1066:
1064:
1061:
1059:
1055:
1052:
1049:
1045:
1041:
1037:
1033:
1031:
1027:
1023:
1021:
1017:
1015:
1014:Gert Boddaert
1011:
1007:
1004:
1001:
998:
995:
994:
982:
975:
960:
956:
949:
941:
939:0-201-59292-4
935:
931:
924:
908:
904:
900:
893:
886:
880:
873:
868:
861:
855:
853:
834:
827:
821:
813:
809:
803:
795:
793:0-201-59292-4
789:
785:
778:
774:
765:
762:
760:
757:
755:
752:
750:
747:
745:
742:
741:
735:
733:
729:
725:
721:
717:
715:
711:
707:
703:
699:
695:
687:
686:raw spinlocks
683:
678:
673:
670:
667:
663:
659:
658:
657:
655:
645:
641:
639:
634:
632:
458:
455:
444:
442:
437:
433:
429:
423:
419:
415:
410:
408:
403:
400:
396:
392:
389:
385:
381:
377:
373:
368:
212:
210:
207:
197:
195:
191:
187:
182:
178:
174:
170:
165:
162:
158:
154:
151:
146:
144:
140:
136:
132:
128:
124:
113:
110:
102:
91:
88:
84:
81:
77:
74:
70:
67:
63:
60: –
59:
55:
54:Find sources:
48:
44:
38:
37:
32:This article
30:
26:
21:
20:
974:
963:. Retrieved
948:
929:
923:
911:. Retrieved
892:
879:
867:
840:. Retrieved
820:
811:
802:
783:
777:
724:ticket locks
718:
713:
691:
685:
651:
648:Alternatives
642:
635:
628:
600:spin_unlock:
456:
445:
430:
421:
411:
371:
369:
366:
321:spin_unlock:
318:; function.
203:
196:is allowed.
181:test-and-set
166:
147:
143:busy waiting
126:
120:
105:
99:October 2012
96:
86:
79:
72:
65:
53:
41:Please help
36:verification
33:
764:Ticket lock
597:; All done.
414:bus traffic
391:Pentium Pro
372:spin_unlock
1082:Categories
965:2022-01-25
842:2019-09-26
770:References
666:interrupts
465:spin_lock:
231:spin_lock:
69:newspapers
58:"Spinlock"
1008:Article "
749:Busy spin
300:spin_lock
1018:Article
959:Archived
907:Archived
887:. p. 47.
833:Archived
738:See also
698:Mac OS X
603:XRELEASE
501:XACQUIRE
127:spinlock
1034:Paper "
1024:Paper "
903:LWN.net
759:Seqlock
732:Firefox
720:OpenBSD
702:FreeBSD
694:Solaris
654:waiting
507:cmpxchg
448:cmpxchg
426:rep nop
395:Pentium
219:locked:
83:scholar
936:
913:14 May
790:
714:always
710:thread
672:Switch
531:pause:
483:retry:
190:memory
177:atomic
135:thread
85:
78:
71:
64:
56:
1038:" by
1028:" by
1012:" by
836:(PDF)
829:(PDF)
706:mutex
585:pause
564:retry
441:glibc
407:IA-64
388:Intel
384:Cyrix
209:80386
206:Intel
129:is a
90:JSTOR
76:books
1042:and
934:ISBN
915:2013
788:ISBN
700:and
591:out:
546:test
504:lock
418:MESI
399:i486
397:and
339:xchg
270:test
249:xchg
139:loop
131:lock
125:, a
62:news
1056:by
621:ret
606:mov
594:ret
582:jmp
573:nop
570:rep
555:eax
549:eax
537:eax
534:mov
525:out
513:ecx
495:eax
489:eax
486:xor
471:ecx
468:mov
402:SMP
354:ret
342:eax
333:eax
327:eax
324:xor
312:ret
297:jnz
279:eax
273:eax
252:eax
237:eax
234:mov
155:or
121:In
45:by
1084::
957:.
905:.
901:.
851:^
831:.
810:.
696:,
561:jz
522:je
422:no
225:dd
1050:.
983:.
968:.
942:.
917:.
874:.
862:.
845:.
814:.
796:.
688:.
668:.
612:0
609:,
552:,
540:,
510:,
492:,
477:1
474:,
345:,
330:,
276:,
255:,
243:1
240:,
228:0
112:)
106:(
101:)
97:(
87:·
80:·
73:·
66:·
39:.
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.