Prefetch input queue

Fetching the instruction opcodes from program memory well in advance is known as prefetching, and it is served by using a prefetch input queue (PIQ). The pre-fetched instructions are stored in a queue. Fetching opcodes well in advance, prior to their need for execution, increases the overall efficiency of the processor, boosting its speed. The processor no longer has to wait for the memory access operations for the subsequent instruction opcode to complete. This architecture was prominently used in the Intel 8086 microprocessor.

Introduction

Pipelining was brought to the forefront of computing architecture design during the 1960s due to the need for faster and more efficient computing. Pipelining is the broader concept, and most modern processors load their instructions some clock cycles before they execute them. This is achieved by pre-loading machine code from memory into a prefetch input queue.

This behavior only applies to von Neumann computers (that is, not Harvard architecture computers) that can run self-modifying code and have some sort of instruction pipelining. Nearly all modern high-performance computers fulfill these three requirements.

Usually, the prefetching behavior of the PIQ is invisible to the programming model of the CPU. However, there are some circumstances where the behavior of the PIQ is visible and needs to be taken into account by the programmer.

When an x86 processor changes mode from real mode to protected mode or vice versa, the PIQ has to be flushed, or else the CPU will continue to translate the machine code as if it were written in its last mode. If the PIQ is not flushed, the processor might translate its codes wrongly and generate an invalid instruction exception.

When executing self-modifying code, a change in the processor code immediately in front of the current location of execution might not change how the processor interprets the code, as it is already loaded into its PIQ. It simply executes its old copy already loaded in the PIQ instead of the new and altered version of the code in its RAM and/or cache.

This behavior of the PIQ can be used to determine whether code is being executed inside an emulator or directly on the hardware of a real CPU. Most emulators will probably never simulate this behavior. If the PIQ size is zero (changes in the code always affect the state of the processor immediately), it can be deduced that either the code is being executed in an emulator or the processor invalidates the PIQ upon writes to addresses loaded in the PIQ.

Performance evaluation based on queuing theory

It was A. K. Erlang (1878-1929) who first conceived of a queue as a solution to congestion in telephone traffic. Different queueing models are proposed in order to approximately simulate real-time queuing systems so that those can be analysed mathematically for different performance specifications.

Queuing models can be represented using Kendall's notation:

    A1/A2/A3/A4

where:

    A1 is the distribution of time between two arrivals
    A2 is the service time distribution
    A3 is the total number of servers
    A4 is the capacity of system

M/M/1 Model (Single Queue Single Server/Markovian): In this model, elements of the queue are served on a first-come, first-served basis. Given the mean arrival and service rates, the actual rates vary randomly around these average values and hence have to be determined using a cumulative probability distribution function.

M/M/r Model: This model is a generalization of the basic M/M/1 model where multiple servers operate in parallel. This kind of model can also model scenarios with impatient users who leave the queue immediately if they are not receiving service. This can also be modeled using a Bernoulli process having only two states, success and failure. The best example of this model is the regular land-line telephone system.

M/G/1 Model (Takacs' finite input Model): This model is used to analyze advanced cases. Here the service time distribution is no longer a Markov process. This model considers the case of more than one failed machine being repaired by a single repairman. Service time for any user is going to increase in this case.

Generally, in applications like the prefetch input queue, the M/M/1 model is popularly used because of the limited use of queue features. In this model, applied to microprocessors, the user takes the role of the execution unit and the server is the bus interface unit.

Instruction queue

The processor executes a program by fetching the instructions from memory and executing them. Usually the processor execution speed is much faster than the memory access speed. The instruction queue is used to prefetch the next instructions into a separate buffer while the processor is executing the current instruction. With a four-stage pipeline, the rate at which instructions are executed can be up to four times that of sequential execution.

The processor usually has two separate units for fetching the instructions and for executing the instructions. The implementation of a pipeline architecture is possible only if the bus interface unit and the execution unit are independent. While the execution unit is decoding or executing an instruction which does not require the use of the data and address buses, the bus interface unit fetches instruction opcodes from the memory.

This process is much faster than sending out an address, reading the opcode and then decoding and executing it. Fetching the next instruction while the current instruction is being decoded or executed is called pipelining.

The 8086 processor has a six-byte prefetch instruction pipeline, while the 8088 has a four-byte prefetch. As the execution unit is executing the current instruction, the bus interface unit reads up to six (or four) bytes of opcodes in advance from the memory. The queue lengths were chosen based on simulation studies.

An exception is encountered when the execution unit encounters a branch instruction, i.e. either a jump or a call instruction. In this case, the entire queue must be dumped and the contents pointed to by the instruction pointer must be fetched from memory.

Drawbacks

Processors implementing the instruction queue prefetch algorithm are rather technically advanced. The CPU design-level complexity of such processors is much higher than for regular processors. This is primarily because of the need to implement two separate units, the BIU and the EU, operating separately.

As the complexity of these chips increases, the cost also increases. These processors are relatively costlier than their counterparts without the prefetch input queue.

However, these disadvantages are greatly offset by the improvement in processor execution time. After the introduction of the prefetch instruction queue in the 8086 processor, all successive processors have incorporated this feature.

x86 example code

    code_starts_here:
      mov bx, ahead
      mov word ptr cs:[bx], 9090h
    ahead:
      jmp near to_the_end
      ; Some other code
    to_the_end:

This self-modifying program will overwrite the jmp to_the_end with two NOPs (which are encoded as 0x9090). The jump jmp near to_the_end is assembled into two bytes of machine code, so the two NOPs will just overwrite this jump and nothing else. (That is, the jump is replaced with do-nothing code.)

Because the machine code of the jump is already read into the PIQ, and probably also already executed by the processor (superscalar processors execute several instructions at once, but they "pretend" that they don't because of the need for backward compatibility), modifying the code will not change the execution flow.

Example program to detect size

This is an example NASM-syntax self-modifying x86 assembly language algorithm that determines the size of the PIQ:

    code_starts_here:
      xor bx, bx                  ; zero register bx
      xor ax, ax                  ; zero register ax

      mov dx, cs
      mov [code_segment], dx      ; "calculate" codeseg in the far jump below (edx here too)

    around:
      cmp ax, 1                   ; check if ax has been altered
      je found_size
                                  ; 0x90 = opcode "nop" (NO oPeration)
      mov byte [nop_field+bx], 0x90
      inc bx

      db 0xEA                     ; 0xEA = opcode "far jump"
      dw flush_queue              ; should be followed by offset (rm = "dw", pm = "dd")
    code_segment:
      dw 0                        ; and then the code segment (calculated above)
    flush_queue:
                                  ; 0x40 = opcode "inc ax" (INCrease ax)
      mov byte [nop_field+bx], 0x40
    nop_field:
      times 256 nop
      jmp around
    found_size:
      ;
      ;    register bx now contains the size of the PIQ
      ;    this code is for real mode and 16-bit protected mode, but it could easily be changed into
      ;    running for 32-bit protected mode as well. just change the "dw" for
      ;    the offset to "dd". you need also change dx to edx at the top as
      ;    well. (dw and dx = 16 bit addressing, dd and edx = 32 bit addressing)
      ;

This code changes the execution flow and determines by brute force how large the PIQ is: "How far away do I have to change the code in front of me for it to affect me?" If the change is too near (the code is already in the PIQ), the update will not have any effect. If it is far enough, the change of the code will affect the program, and the program has then found the size of the processor's PIQ. If this code is being executed under a multitasking OS, a context switch may lead to the wrong value.

See also

Sequence point
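The brute-force idea behind the size-detection program can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not a description of real hardware: `ToyCPU`, `detect_piq_size`, the one-opcode-per-cell memory, and the rule that the queue is completely refilled before the write lands are all hypothetical simplifications introduced here for illustration.

```python
from collections import deque

NOP, INC = "nop", "inc"  # hypothetical one-cell opcodes


class ToyCPU:
    """Toy CPU whose bus unit prefetches opcodes into a FIFO (the PIQ)."""

    def __init__(self, code, piq_size):
        self.memory = list(code)   # code memory, one opcode per cell
        self.piq_size = piq_size   # capacity of the prefetch queue
        self.piq = deque()         # opcodes already fetched from memory
        self.fetch_addr = 0        # next address the bus unit will read
        self.counter = 0           # incremented by every executed INC

    def flush(self, target):
        """Dump the queue and restart fetching at target, as a jump would."""
        self.piq.clear()
        self.fetch_addr = target

    def prefetch(self):
        """Fill the queue from memory until it is full or memory ends."""
        while len(self.piq) < self.piq_size and self.fetch_addr < len(self.memory):
            self.piq.append(self.memory[self.fetch_addr])
            self.fetch_addr += 1

    def write(self, addr, opcode):
        """Modify code in memory; copies already in the queue stay stale."""
        self.memory[addr] = opcode

    def step(self):
        """Execute one opcode; return False once the code is exhausted."""
        self.prefetch()
        if self.piq:
            opcode = self.piq.popleft()
        elif self.fetch_addr < len(self.memory):
            opcode = self.memory[self.fetch_addr]  # no queue: read memory directly
            self.fetch_addr += 1
        else:
            return False
        if opcode == INC:
            self.counter += 1
        return True


def detect_piq_size(piq_size, code_len=32):
    """Return the smallest write distance that affects execution, or None."""
    for distance in range(code_len):
        cpu = ToyCPU([NOP] * code_len, piq_size)
        cpu.flush(0)              # start from an empty queue, like the far jump
        cpu.prefetch()            # assumption: queue is full before the write
        cpu.write(distance, INC)  # patch the code `distance` cells ahead
        while cpu.step():
            pass
        if cpu.counter == 1:      # the patched opcode was really executed
            return distance
    return None


if __name__ == "__main__":
    for size in (0, 4, 6):
        print(size, detect_piq_size(size))
```

In this model a write closer than the queue length hits memory only, so the stale queued copy runs; the first distance at which the patched opcode executes equals the queue length. A result of zero corresponds to the case where every code change takes effect immediately.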

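For the M/M/1 model discussed under queuing theory, the standard steady-state results can be written down directly. The sketch below is a minimal illustration, assuming an infinite-capacity queue with a mean arrival rate and a mean service rate; the function name `mm1_metrics` and the example rates are hypothetical, and a real prefetch queue has a small finite capacity, so these figures are only a first approximation.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of an M/M/1 queue (infinite capacity assumed).

    arrival_rate: mean number of requests per unit time (lambda)
    service_rate: mean number of requests served per unit time (mu)
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable unless arrival_rate < service_rate")

    rho = arrival_rate / service_rate  # server utilization
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),        # L  = rho / (1 - rho)
        "mean_in_queue": rho ** 2 / (1 - rho),    # Lq = rho^2 / (1 - rho)
        "mean_time_in_system": 1 / (service_rate - arrival_rate),   # W
        "mean_wait_in_queue": rho / (service_rate - arrival_rate),  # Wq
    }


if __name__ == "__main__":
    # Hypothetical rates: 2 requests per time unit arriving, 4 served.
    for name, value in mm1_metrics(2.0, 4.0).items():
        print(f"{name}: {value}")
```

Following the mapping given in the text, the arrival rate would describe requests from the execution unit and the service rate would describe the bus interface unit; both numbers in the usage example are made up for illustration.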

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.