67:. He first takes the "Scheduling Problem" and reduces it to a machine which has to perform jobs and has a set time period, every hour or day for example, to finish each job in. The machine is given a reward value, based on finishing or not within the time period, and a probability value of whether it will finish or not for each job is calculated. The problem is "to decide which job to process next at each stage so as to maximize the total expected reward." He then moves on to the "Multiβarmed bandit problem" where each pull on a "
2264:
significantly. A problem setting to prevent such early terminations consists of defining the optimization as maximization of the future ratio seen by each state. An indexation is conjectured to exist for this problem, be computable as simple variation on existing restart-in-state or state elimination algorithms and evaluated to work well in practice.
425:
99:
with a reward function and with a probability of termination. It is a measure of the reward that can be achieved by the process evolving from that state on, under the probability that it will be terminated in future. The "index policy" induced by the
Gittins index, consisting of choosing at any time
2255:
While conventional
Gittins indices induce a policy to optimize the accrual of a reward, a common problem setting consists of optimizing the ratio of accrued rewards. For example, this is a case for systems to maximize bandwidth, consisting of data over time, or minimize power consumption, consisting
199:
algorithm. Cowan, W. and
Katehakis (2014), provide a solution to the problem, with potentially non-Markovian, uncountable state space reward processes, under frameworks in which, either the discount factors may be non-uniform and vary over time, or the periods of activation of each bandit may be not
172:
could be used to calculate the index of a state without requiring its calculation for all states with higher index values. LCM Kallenberg provided a parametric LP implementation to compute the indices for all states of a Markov chain. Further, Katehakis and
Veinott demonstrated that the index is
106:
such as the one of dynamic allocation, where a decision-maker has to maximize the total reward by distributing a limited amount of effort to a number of competing projects, each returning a stochastic reward. If the projects are independent from each other and only one project at a time may evolve,
40:
To illustrate the theory we can take two examples from a developing sector, such as from electricity generating technologies: wind power and wave power. If we are presented with the two technologies when they are both proposed as ideas we cannot say which will be better in the long run as we have no
2259:
This class of problems is different from the optimization of a semi-Markov reward process, because the latter one might select states with a disproportionate sojourn time just for accruing a higher reward. Instead, it corresponds to the class of linear-fractional markov reward optimization problem.
44:
If we were to make a judgment call at that early time in development we could be condemning one technology to being put on the shelf and the other would be developed and put into operation. If we develop both technologies we would be able to make a judgment call on each by comparing the progress of
75:
and has an unknown probability of success. There are multiple "bandits" and the distribution of successful pulls is calculated and different for each machine. Gittins states that the problem here is "to decide which arm to pull next at each stage so as to maximize the total expected reward from an
24:
with certain properties, namely: the process has an ultimate termination state and evolves with an option, at each intermediate state, of terminating. Upon terminating at a given state, the reward achieved is the sum of the probabilistic expected rewards associated with every state from the actual
189:
algorithm. This approach also has the advantage of calculating the index for one specific state without having to calculate all the greater indexes and it is valid under more general state space conditions. A faster algorithm for the calculation of all indices was obtained in 2004 by Sonin as a
2263:
However, a detrimental aspect of such ratio optimizations is that, once the achieved ratio in some state is high, the optimization might select states leading to a low ratio because they bear a high probability of termination, so that the process is likely to terminate before the ratio drops
194:
for the optimal stopping of a Markov chain. In this algorithm the termination probability of the process may depend on the current state rather than being a fixed factor. A faster algorithm was proposed in 2007 by NiΓ±o-Mora by exploiting the structure of a parametric simplex to reduce the
2246:
In queueing theory, Gittins index is used to determine the optimal scheduling of jobs, e.g., in an M/G/1 queue. The mean completion time of jobs under a
Gittins index schedule can be determined using the SOAP approach. Note that the dynamics of the queue are intrinsically Markovian, and
136:
and his collaborators demonstrated in a
Markovian framework that the optimal solution of the general case is an index policy whose "dynamic allocation index" is computable in principle for every state of each project as a function of the single project's dynamics. In parallel to Gittins,
970:
219:
41:
data, as yet, to base our judgments on. It would be easy to say that wave power would be too problematic to develop as it seems easier to put up many wind turbines than to make the long floating generators, tow them out to sea and lay the cables necessary.
1338:
1872:
131:
Questions about the optimal stopping policies in the context of clinical trials have been open from the 1940s and in the 1960s a few authors analyzed simple models leading to optimal index policies, but it was only in the 1970s that
1746:
1619:
79:
Gittins says that "Both the problems described above involve a sequence of decisions, each of which is based on more information than its predecessors, and these both problems may be tackled by dynamic allocation indices."
653:
200:
be fixed or uniform, subject instead to a possibly stochastic duration of activation before a change to a different bandit is allowed. The solution is based on generalized restart-in-state indices.
2247:
stochasticity is due to the arrival and service processes. This is in contrast to most of the works in the learning literature, where stochasticity is explicitly accounted through a noise term.
2175:
168:
was first addressed by
Varaiya and his collaborators with an algorithm that computes the indexes from the largest first down to the smallest and by Chen and Katehakis who showed that standard
815:
420:{\displaystyle \nu (i)=\sup _{\tau >0}{\frac {\left\langle \sum _{t=0}^{\tau -1}\beta ^{t}R\right\rangle _{Z(0)=i}}{\left\langle \sum _{t=0}^{\tau -1}\beta ^{t}\right\rangle _{Z(0)=i}}}}
565:
1969:
1038:
766:
2118:
1204:
532:
2207:
1904:
1522:
1473:
1435:
1075:
457:
2236:
1998:
1752:
804:
1388:
1176:
1126:
676:
2056:
2027:
1361:
486:
1493:
1196:
1149:
1099:
506:
1632:
45:
each technology at a set time interval such as every three months. The decisions we make about investment in the next stage would be based on those results.
2482:
1530:
2936:
2900:
2848:
2791:
577:
71:" lever is allocated a reward function for a successful pull, and a zero reward for an unsuccessful pull. The sequence of successes forms a
119:
and the
Gittins index policy is a known good heuristic but no optimal solution exists in general. In fact, in general this problem is
2865:
Scully, Ziv and
Harchol-Balter, Mor and Scheller-Wolf, Alan (2018). "SOAP: One Clean Analysis of All Age-Based Scheduling Policies".
2756:
Scully, Ziv and
Harchol-Balter, Mor and Scheller-Wolf, Alan (2018). "SOAP: One Clean Analysis of All Age-Based Scheduling Policies".
2960:
2920:
2806:
965:{\displaystyle v(i,k)=\sup _{\tau >0}\left\langle \sum _{t=0}^{\tau -1}\beta ^{t}R+\beta ^{t}k\right\rangle _{Z(0)=i}}
3007:
2603:
2571:
2542:
2509:
Varaiya, P.; Walrand, J.; Buyukkoc, C. (May 1985). "Extensions of the multiarmed bandit problem: The discounted case".
2124:
2678:
Ni, Mora J (2007). "A (2/3)^n Fast-Pivoting Algorithm for the Gittins Index and Optimal Stopping of a Markov Chain".
2569:
Kallenberg, Lodewijk C. M. (1986). "A Note on M. N. Katehakis' and Y.-R. Chen's Computation of the Gittins Index".
2389:
Gittins, J. C.; Jones, D. M. (1979). "A Dynamic Allocation Index for the Discounted Multiarmed Bandit Problem".
537:
3124:
1909:
2952:
2477:
981:
700:
145:
115:
problems) and the Gittins index policy is optimal. If multiple projects can evolve, the problem is called
2064:
3129:
1333:{\displaystyle h(i)=\sup _{\pi }\left\langle \sum _{t=0}^{\tau -1}\beta ^{t}R\right\rangle _{Z(0)=i}}
3119:
2908:
2842:
2692:
3005:; Veinott, Arthur F. Jr. (1987). "The multi-armed bandit problem: decomposition and computation".
2601:; Veinott, Arthur F. Jr. (1987). "The multi-armed bandit problem: decomposition and computation".
511:
174:
3075:
2651:
Sonin I (2008). "A generalized Gittins index for a Markov chain and its recursive calculation".
2183:
1882:
1498:
1081:
and Veinott (1987) associates to every state the action of restarting from one arbitrary state
2687:
1867:{\displaystyle Q^{\tau }(i)=\left\langle 1-\prod _{t=0}^{\tau -1}\beta \right\rangle _{Z(0)=i}}
1449:
1396:
1051:
433:
56:
2894:
2785:
2287:
Cowan, Robin (July 1991). "Tortoises and Hares: Choice among technologies of unknown merit".
2212:
1974:
112:
774:
160:
and conjectured that the same index would be a good heuristic in a more general setup named
2864:
2755:
1366:
1154:
1104:
679:
661:
196:
92:
29:
2032:
2003:
1346:
694:
The dynamic programming formulation in terms of retirement process, given by Whittle, is:
462:
8:
195:
computational effort of the pivot steps and thereby achieving the same complexity as the
153:
149:
100:
the stochastic process with the currently highest Gittins index, is the solution of some
3081:
Cowan, Robin (1991). "Tortoises and Hares: Choice Among Technologies of Unknown Merit".
3098:
3032:
3024:
2990:
2930:
2882:
2816:
2773:
2705:
2628:
2620:
2495:
2459:
2451:
2406:
2352:
2344:
2340:
2304:
1741:{\displaystyle R^{\tau }(i)=\left\langle \sum _{t=0}^{\tau -1}R\right\rangle _{Z(0)=i}}
1478:
1181:
1134:
1084:
491:
169:
108:
96:
60:
21:
55:
suggests a solution for problems such as this. He takes the two basic functions of a "
3045:
3002:
2956:
2916:
2886:
2777:
2709:
2598:
2537:
1078:
72:
2463:
2370:
Mitten L (1960). "An Analytic Solution to the Least Cost Testing Sequence Problem".
2356:
3090:
3057:
3016:
2982:
2970:
2874:
2765:
2736:
2697:
2660:
2612:
2580:
2551:
2518:
2491:
2441:
2433:
2398:
2336:
2296:
102:
3036:
2632:
2944:
1614:{\displaystyle \alpha (i)=\sup _{\tau >0}{\frac {R^{\tau }(i)}{Q^{\tau }(i)}}}
138:
133:
120:
52:
2915:. Monographs on Statistics and Applied Probability. London: Chapman & Hall.
2826:
2812:
3061:
2986:
2741:
2724:
2664:
3113:
2522:
25:
terminating state to the ultimate terminal state, inclusive. The index is a
2804:
2701:
2540:(1986). "Linear programming for finite state multi-armed bandit problems".
165:
68:
3020:
2616:
2584:
2555:
2327:
Gittins, J. C. (1979). "Bandit Processes and Dynamic Allocation Indices".
3048:(2014). "Multi-armed Bandits under General Depreciation and Commitment".
648:{\displaystyle \langle X\rangle _{c}\doteq \sum _{x\in \chi }xP\{X=x|c\}}
89:
26:
2446:
1495:, a generalization introduced by Sonin (2008) defines the Gittins index
3102:
3028:
2994:
2867:
Proceedings of the ACM on Measurement and Analysis of Computing Systems
2758:
Proceedings of the ACM on Measurement and Analysis of Computing Systems
2624:
2455:
2424:
Weitzman, Martin L. (1979). "Optimal Search for the Best Alternative".
2410:
2348:
2308:
534:
is the probability that the stochastic process does not terminate, and
2951:. Wiley-Interscience Series in Systems and Optimization. foreword by
2821:
488:
is the utility (also called reward) associated to the discrete state
123:
and it is generally accepted that no feasible solution can be found.
3094:
2878:
2769:
2437:
2402:
2300:
2907:
2329:
Journal of the Royal Statistical Society. Series B (Methodological)
1524:
as the maximum discounted total reward per chance of termination.
3077:
Matlab/Octave implementation of the index computation algorithms
2973:(November 1992). "On the Gittins index for multiarmed bandits".
1178:
if one can always choose to continue or restart from that state
20:
is a measure of the reward that can be achieved through a given
2805:
Di Gregorio, Lorenzo and Frascolla, Valerio (October 1, 2019).
2725:"Multi-armed bandits under general depreciation and commitment"
181:
and can be calculated exactly by solving that problem with the
3043:
164:. The question of how to actually calculate the index for
63:" problem and shows how these problems can be solved using
3050:
Probability in the Engineering and Informational Sciences
2729:
Probability in the Engineering and Informational Sciences
141:
established the same result in the economics literature.
2508:
1077:
is a Markov chain with rewards, the interpretation of
2913:
Bandit problems: Sequential allocation of experiments
2723:
Cowan, Wesley; Katehakis, Michael N. (January 2015).
2480:(1980). "Multi-armed Bandits and the Gittins Index".
2215:
2186:
2127:
2067:
2035:
2006:
1977:
1912:
1885:
1755:
1635:
1533:
1501:
1481:
1452:
1399:
1369:
1349:
1207:
1184:
1157:
1151:
is the highest total reward which can be achieved on
1137:
1107:
1087:
1054:
984:
818:
777:
703:
664:
580:
540:
514:
494:
465:
436:
222:
567:
is the conditional expectation operator given
2483:Journal of the Royal Statistical Society, Series B
2230:
2201:
2169:
2112:
2050:
2021:
1992:
1963:
1898:
1866:
1740:
1613:
1516:
1487:
1467:
1429:
1382:
1355:
1332:
1190:
1170:
1143:
1120:
1093:
1069:
1032:
964:
798:
760:
670:
647:
559:
526:
500:
480:
451:
419:
2170:{\displaystyle \alpha (i)\neq k\nu (i),\forall k}
1101:, thereby constructing a Markov decision process
689:
88:In applied mathematics, the "Gittins index" is a
3111:
3001:
2597:
1550:
1224:
975:with the same notation as above. It holds that
841:
719:
239:
1043:
213:The classical definition by Gittins et al. is:
177:constructed over the Markov chain and known as
50:Bandit Processes and Dynamic Allocation Indices
2722:
2646:
2644:
2642:
2180:this observation leads Sonin to conclude that
2943:
2808:Handover Optimality in Heterogeneous Networks
2535:
2935:: CS1 maint: multiple names: authors list (
2899:: CS1 maint: multiple names: authors list (
2847:: CS1 maint: multiple names: authors list (
2790:: CS1 maint: multiple names: authors list (
2388:
2238:is the "true meaning" of the Gittins index.
755:
722:
642:
622:
588:
581:
548:
541:
2969:
2639:
208:
2955:. Chichester: John Wiley & Sons, Ltd.
2677:
2568:
560:{\displaystyle \langle \cdot \rangle _{c}}
203:
2820:
2740:
2691:
2445:
148:demonstrated that the index emerges as a
144:Soon after the seminal paper of Gittins,
2423:
2369:
1964:{\displaystyle \prod _{j=0}^{t-1}\beta }
2650:
2476:
2326:
1033:{\displaystyle \nu (i)=(1-\beta )w(i).}
761:{\displaystyle w(i)=\inf\{k:v(i,k)=k\}}
3112:
2511:IEEE Transactions on Automatic Control
2322:
2320:
2318:
2250:
3080:
2949:Multi-armed bandit allocation indices
2286:
2282:
2280:
2278:
2276:
185:algorithm, or approximately with the
2113:{\displaystyle \alpha (i)=h(i)=w(i)}
1441:
2315:
95:value associated to the state of a
13:
3008:Mathematics of Operations Research
2653:Statistics and Probability Letters
2604:Mathematics of Operations Research
2572:Mathematics of Operations Research
2543:Mathematics of Operations Research
2496:10.1111/j.2517-6161.1980.tb01111.x
2341:10.1111/j.2517-6161.1979.tb01068.x
2273:
2241:
2161:
156:formulation of the problem called
14:
3141:
3069:
2975:The Annals of Applied Probability
2372:Journal of Industrial Engineering
1131:The Gittins Index of that state
2798:
2749:
2716:
2671:
2591:
1446:If the probability of survival
2562:
2529:
2502:
2470:
2417:
2382:
2363:
2225:
2219:
2196:
2190:
2155:
2149:
2137:
2131:
2107:
2101:
2092:
2086:
2077:
2071:
2045:
2039:
2016:
2010:
1987:
1981:
1958:
1955:
1949:
1943:
1853:
1847:
1835:
1832:
1826:
1820:
1772:
1766:
1727:
1721:
1709:
1706:
1700:
1694:
1652:
1646:
1605:
1599:
1584:
1578:
1543:
1537:
1511:
1505:
1462:
1456:
1424:
1418:
1409:
1403:
1319:
1313:
1301:
1298:
1292:
1279:
1217:
1211:
1064:
1058:
1024:
1018:
1012:
1000:
994:
988:
951:
945:
917:
914:
908:
902:
834:
822:
793:
781:
746:
734:
713:
707:
690:Retirement process formulation
635:
475:
469:
446:
440:
404:
398:
335:
329:
317:
314:
308:
302:
232:
226:
35:
1:
2858:
83:
76:infinite sequence of pulls."
2680:INFORMS Journal on Computing
1044:Restart-in-state formulation
7:
2911:and Fristedt, Bert (1985).
527:{\displaystyle \beta <1}
10:
3146:
2202:{\displaystyle \alpha (i)}
1899:{\displaystyle \beta ^{t}}
1517:{\displaystyle \alpha (i)}
126:
65:Dynamic allocation indices
48:In a paper in 1979 called
3062:10.1017/S0269964814000217
2742:10.1017/S0269964814000217
2665:10.1016/j.spl.2008.01.049
1468:{\displaystyle \beta (i)}
1430:{\displaystyle h(i)=w(i)}
1070:{\displaystyle Z(\cdot )}
459:is a stochastic process,
452:{\displaystyle Z(\cdot )}
173:the expected reward of a
2523:10.1109/TAC.1985.1103989
2267:
1363:indicates a policy over
209:Dynamic allocation index
2987:10.1214/aoap/1177005588
2231:{\displaystyle \nu (i)}
1993:{\displaystyle \nu (i)}
204:Mathematical definition
175:Markov decision process
2702:10.1287/ijoc.1060.0206
2232:
2203:
2171:
2114:
2052:
2023:
1994:
1971:in the definitions of
1965:
1939:
1900:
1868:
1816:
1742:
1690:
1615:
1518:
1489:
1469:
1431:
1384:
1357:
1334:
1265:
1192:
1172:
1145:
1122:
1095:
1071:
1034:
966:
888:
800:
799:{\displaystyle v(i,k)}
762:
672:
649:
561:
528:
502:
482:
453:
421:
378:
288:
107:the problem is called
3125:Design of experiments
3021:10.1287/moor.12.2.262
3003:Katehakis, Michael N.
2829:on September 28, 2020
2617:10.1287/moor.12.2.262
2599:Katehakis, Michael N.
2585:10.1287/moor.11.1.184
2556:10.1287/moor.11.1.180
2538:Katehakis, Michael N.
2256:of energy over time.
2233:
2204:
2172:
2115:
2058:, then it holds that
2053:
2024:
1995:
1966:
1913:
1901:
1869:
1790:
1743:
1664:
1616:
1519:
1490:
1475:depends on the state
1470:
1432:
1385:
1383:{\displaystyle M_{i}}
1358:
1335:
1239:
1193:
1173:
1171:{\displaystyle M_{i}}
1146:
1123:
1121:{\displaystyle M_{i}}
1096:
1072:
1035:
967:
862:
801:
763:
673:
671:{\displaystyle \chi }
650:
562:
529:
503:
483:
454:
422:
352:
262:
192:elimination algorithm
113:Stochastic scheduling
3083:The Economic Journal
2289:The Economic Journal
2213:
2184:
2125:
2065:
2051:{\displaystyle h(i)}
2033:
2022:{\displaystyle w(i)}
2004:
1975:
1910:
1883:
1753:
1633:
1531:
1499:
1479:
1450:
1397:
1367:
1356:{\displaystyle \pi }
1347:
1205:
1182:
1155:
1135:
1105:
1085:
1052:
982:
816:
775:
701:
662:
578:
538:
512:
492:
481:{\displaystyle R(i)}
463:
434:
220:
197:Gaussian elimination
2251:Fractional problems
190:consequence of his
154:dynamic programming
150:Lagrange multiplier
3130:Sequential methods
2228:
2199:
2167:
2110:
2048:
2019:
1990:
1961:
1896:
1864:
1738:
1611:
1564:
1514:
1485:
1465:
1427:
1380:
1353:
1330:
1232:
1188:
1168:
1141:
1118:
1091:
1067:
1030:
962:
855:
796:
758:
668:
645:
615:
557:
524:
498:
478:
449:
417:
253:
158:retirement process
109:multi-armed bandit
97:stochastic process
61:multi-armed bandit
22:stochastic process
2962:978-0-471-92059-5
2922:978-0-412-24810-8
2659:(12): 1526β1533.
1609:
1549:
1488:{\displaystyle i}
1442:Generalized index
1390:. It holds that
1223:
1191:{\displaystyle i}
1144:{\displaystyle i}
1094:{\displaystyle i}
840:
600:
501:{\displaystyle i}
415:
238:
103:stopping problems
73:Bernoulli process
3137:
3106:
3089:(407): 801β814.
3065:
3040:
2998:
2981:(4): 1024β1033.
2966:
2940:
2934:
2926:
2909:Berry, Donald A.
2904:
2898:
2890:
2853:
2852:
2846:
2838:
2836:
2834:
2825:. Archived from
2824:
2802:
2796:
2795:
2789:
2781:
2753:
2747:
2746:
2744:
2720:
2714:
2713:
2695:
2675:
2669:
2668:
2648:
2637:
2636:
2595:
2589:
2588:
2566:
2560:
2559:
2533:
2527:
2526:
2506:
2500:
2499:
2474:
2468:
2467:
2449:
2421:
2415:
2414:
2386:
2380:
2379:
2367:
2361:
2360:
2324:
2313:
2312:
2295:(407): 801β814.
2284:
2237:
2235:
2234:
2229:
2208:
2206:
2205:
2200:
2176:
2174:
2173:
2168:
2119:
2117:
2116:
2111:
2057:
2055:
2054:
2049:
2028:
2026:
2025:
2020:
1999:
1997:
1996:
1991:
1970:
1968:
1967:
1962:
1938:
1927:
1905:
1903:
1902:
1897:
1895:
1894:
1873:
1871:
1870:
1865:
1863:
1862:
1842:
1838:
1815:
1804:
1765:
1764:
1747:
1745:
1744:
1739:
1737:
1736:
1716:
1712:
1689:
1678:
1645:
1644:
1620:
1618:
1617:
1612:
1610:
1608:
1598:
1597:
1587:
1577:
1576:
1566:
1563:
1523:
1521:
1520:
1515:
1494:
1492:
1491:
1486:
1474:
1472:
1471:
1466:
1436:
1434:
1433:
1428:
1389:
1387:
1386:
1381:
1379:
1378:
1362:
1360:
1359:
1354:
1339:
1337:
1336:
1331:
1329:
1328:
1308:
1304:
1291:
1290:
1275:
1274:
1264:
1253:
1231:
1197:
1195:
1194:
1189:
1177:
1175:
1174:
1169:
1167:
1166:
1150:
1148:
1147:
1142:
1127:
1125:
1124:
1119:
1117:
1116:
1100:
1098:
1097:
1092:
1076:
1074:
1073:
1068:
1039:
1037:
1036:
1031:
971:
969:
968:
963:
961:
960:
940:
936:
932:
931:
898:
897:
887:
876:
854:
805:
803:
802:
797:
767:
765:
764:
759:
677:
675:
674:
669:
654:
652:
651:
646:
638:
614:
596:
595:
566:
564:
563:
558:
556:
555:
533:
531:
530:
525:
507:
505:
504:
499:
487:
485:
484:
479:
458:
456:
455:
450:
426:
424:
423:
418:
416:
414:
413:
393:
389:
388:
387:
377:
366:
345:
344:
324:
320:
298:
297:
287:
276:
255:
252:
183:policy iteration
179:Restart in State
69:one armed bandit
59:Problem" and a "
3145:
3144:
3140:
3139:
3138:
3136:
3135:
3134:
3120:Decision theory
3110:
3109:
3095:10.2307/2233856
3072:
2963:
2928:
2927:
2923:
2892:
2891:
2879:10.1145/3179419
2861:
2856:
2843:cite conference
2840:
2839:
2832:
2830:
2803:
2799:
2783:
2782:
2770:10.1145/3179419
2754:
2750:
2721:
2717:
2676:
2672:
2649:
2640:
2596:
2592:
2567:
2563:
2536:Chen, Yih Ren;
2534:
2530:
2507:
2503:
2475:
2471:
2438:10.2307/1910412
2422:
2418:
2403:10.2307/2335176
2387:
2383:
2368:
2364:
2325:
2316:
2301:10.2307/2233856
2285:
2274:
2270:
2253:
2244:
2242:Queueing theory
2214:
2211:
2210:
2185:
2182:
2181:
2126:
2123:
2122:
2066:
2063:
2062:
2034:
2031:
2030:
2005:
2002:
2001:
1976:
1973:
1972:
1928:
1917:
1911:
1908:
1907:
1906:is replaced by
1890:
1886:
1884:
1881:
1880:
1843:
1805:
1794:
1783:
1779:
1778:
1760:
1756:
1754:
1751:
1750:
1717:
1679:
1668:
1663:
1659:
1658:
1640:
1636:
1634:
1631:
1630:
1593:
1589:
1588:
1572:
1568:
1567:
1565:
1553:
1532:
1529:
1528:
1500:
1497:
1496:
1480:
1477:
1476:
1451:
1448:
1447:
1444:
1398:
1395:
1394:
1374:
1370:
1368:
1365:
1364:
1348:
1345:
1344:
1309:
1286:
1282:
1270:
1266:
1254:
1243:
1238:
1234:
1233:
1227:
1206:
1203:
1202:
1183:
1180:
1179:
1162:
1158:
1156:
1153:
1152:
1136:
1133:
1132:
1112:
1108:
1106:
1103:
1102:
1086:
1083:
1082:
1053:
1050:
1049:
1046:
983:
980:
979:
941:
927:
923:
893:
889:
877:
866:
861:
857:
856:
844:
817:
814:
813:
776:
773:
772:
702:
699:
698:
692:
663:
660:
659:
634:
604:
591:
587:
579:
576:
575:
551:
547:
539:
536:
535:
513:
510:
509:
493:
490:
489:
464:
461:
460:
435:
432:
431:
394:
383:
379:
367:
356:
351:
347:
346:
325:
293:
289:
277:
266:
261:
257:
256:
254:
242:
221:
218:
217:
211:
206:
187:value iteration
162:Restless bandit
139:Martin Weitzman
129:
117:Restless bandit
86:
53:John C. Gittins
38:
12:
11:
5:
3143:
3133:
3132:
3127:
3122:
3108:
3107:
3078:
3071:
3070:External links
3068:
3067:
3066:
3046:M.N. Katehakis
3044:Cowan, W. and
3041:
3015:(2): 262β268.
2999:
2967:
2961:
2941:
2921:
2905:
2873:(1). ACM: 16.
2860:
2857:
2855:
2854:
2813:5G World Forum
2797:
2764:(1). ACM: 16.
2748:
2715:
2693:10.1.1.77.5127
2686:(4): 596β606.
2670:
2638:
2611:(2): 262β268.
2590:
2579:(1): 184β186.
2561:
2550:(1): 180β183.
2528:
2517:(5): 426β439.
2501:
2490:(2): 143β149.
2478:Whittle, Peter
2469:
2432:(3): 641β654.
2416:
2397:(3): 561β565.
2381:
2362:
2335:(2): 148β177.
2314:
2271:
2269:
2266:
2252:
2249:
2243:
2240:
2227:
2224:
2221:
2218:
2198:
2195:
2192:
2189:
2178:
2177:
2166:
2163:
2160:
2157:
2154:
2151:
2148:
2145:
2142:
2139:
2136:
2133:
2130:
2120:
2109:
2106:
2103:
2100:
2097:
2094:
2091:
2088:
2085:
2082:
2079:
2076:
2073:
2070:
2047:
2044:
2041:
2038:
2018:
2015:
2012:
2009:
1989:
1986:
1983:
1980:
1960:
1957:
1954:
1951:
1948:
1945:
1942:
1937:
1934:
1931:
1926:
1923:
1920:
1916:
1893:
1889:
1877:
1876:
1875:
1874:
1861:
1858:
1855:
1852:
1849:
1846:
1841:
1837:
1834:
1831:
1828:
1825:
1822:
1819:
1814:
1811:
1808:
1803:
1800:
1797:
1793:
1789:
1786:
1782:
1777:
1774:
1771:
1768:
1763:
1759:
1748:
1735:
1732:
1729:
1726:
1723:
1720:
1715:
1711:
1708:
1705:
1702:
1699:
1696:
1693:
1688:
1685:
1682:
1677:
1674:
1671:
1667:
1662:
1657:
1654:
1651:
1648:
1643:
1639:
1622:
1621:
1607:
1604:
1601:
1596:
1592:
1586:
1583:
1580:
1575:
1571:
1562:
1559:
1556:
1552:
1548:
1545:
1542:
1539:
1536:
1513:
1510:
1507:
1504:
1484:
1464:
1461:
1458:
1455:
1443:
1440:
1439:
1438:
1426:
1423:
1420:
1417:
1414:
1411:
1408:
1405:
1402:
1377:
1373:
1352:
1341:
1340:
1327:
1324:
1321:
1318:
1315:
1312:
1307:
1303:
1300:
1297:
1294:
1289:
1285:
1281:
1278:
1273:
1269:
1263:
1260:
1257:
1252:
1249:
1246:
1242:
1237:
1230:
1226:
1222:
1219:
1216:
1213:
1210:
1187:
1165:
1161:
1140:
1115:
1111:
1090:
1066:
1063:
1060:
1057:
1045:
1042:
1041:
1040:
1029:
1026:
1023:
1020:
1017:
1014:
1011:
1008:
1005:
1002:
999:
996:
993:
990:
987:
973:
972:
959:
956:
953:
950:
947:
944:
939:
935:
930:
926:
922:
919:
916:
913:
910:
907:
904:
901:
896:
892:
886:
883:
880:
875:
872:
869:
865:
860:
853:
850:
847:
843:
839:
836:
833:
830:
827:
824:
821:
808:value function
795:
792:
789:
786:
783:
780:
769:
768:
757:
754:
751:
748:
745:
742:
739:
736:
733:
730:
727:
724:
721:
718:
715:
712:
709:
706:
691:
688:
667:
656:
655:
644:
641:
637:
633:
630:
627:
624:
621:
618:
613:
610:
607:
603:
599:
594:
590:
586:
583:
554:
550:
546:
543:
523:
520:
517:
497:
477:
474:
471:
468:
448:
445:
442:
439:
428:
427:
412:
409:
406:
403:
400:
397:
392:
386:
382:
376:
373:
370:
365:
362:
359:
355:
350:
343:
340:
337:
334:
331:
328:
323:
319:
316:
313:
310:
307:
304:
301:
296:
292:
286:
283:
280:
275:
272:
269:
265:
260:
251:
248:
245:
241:
237:
234:
231:
228:
225:
210:
207:
205:
202:
128:
125:
85:
82:
37:
34:
9:
6:
4:
3:
2:
3142:
3131:
3128:
3126:
3123:
3121:
3118:
3117:
3115:
3104:
3100:
3096:
3092:
3088:
3084:
3079:
3076:
3074:
3073:
3063:
3059:
3055:
3051:
3047:
3042:
3038:
3034:
3030:
3026:
3022:
3018:
3014:
3010:
3009:
3004:
3000:
2996:
2992:
2988:
2984:
2980:
2976:
2972:
2968:
2964:
2958:
2954:
2953:Peter Whittle
2950:
2946:
2945:Gittins, J.C.
2942:
2938:
2932:
2924:
2918:
2914:
2910:
2906:
2902:
2896:
2888:
2884:
2880:
2876:
2872:
2868:
2863:
2862:
2850:
2844:
2828:
2823:
2818:
2814:
2810:
2809:
2801:
2793:
2787:
2779:
2775:
2771:
2767:
2763:
2759:
2752:
2743:
2738:
2734:
2730:
2726:
2719:
2711:
2707:
2703:
2699:
2694:
2689:
2685:
2681:
2674:
2666:
2662:
2658:
2654:
2647:
2645:
2643:
2634:
2630:
2626:
2622:
2618:
2614:
2610:
2606:
2605:
2600:
2594:
2586:
2582:
2578:
2574:
2573:
2565:
2557:
2553:
2549:
2545:
2544:
2539:
2532:
2524:
2520:
2516:
2512:
2505:
2497:
2493:
2489:
2485:
2484:
2479:
2473:
2465:
2461:
2457:
2453:
2448:
2443:
2439:
2435:
2431:
2427:
2420:
2412:
2408:
2404:
2400:
2396:
2392:
2385:
2377:
2373:
2366:
2358:
2354:
2350:
2346:
2342:
2338:
2334:
2330:
2323:
2321:
2319:
2310:
2306:
2302:
2298:
2294:
2290:
2283:
2281:
2279:
2277:
2272:
2265:
2261:
2257:
2248:
2239:
2222:
2216:
2193:
2187:
2164:
2158:
2152:
2146:
2143:
2140:
2134:
2128:
2121:
2104:
2098:
2095:
2089:
2083:
2080:
2074:
2068:
2061:
2060:
2059:
2042:
2036:
2013:
2007:
1984:
1978:
1952:
1946:
1940:
1935:
1932:
1929:
1924:
1921:
1918:
1914:
1891:
1887:
1859:
1856:
1850:
1844:
1839:
1829:
1823:
1817:
1812:
1809:
1806:
1801:
1798:
1795:
1791:
1787:
1784:
1780:
1775:
1769:
1761:
1757:
1749:
1733:
1730:
1724:
1718:
1713:
1703:
1697:
1691:
1686:
1683:
1680:
1675:
1672:
1669:
1665:
1660:
1655:
1649:
1641:
1637:
1629:
1628:
1627:
1626:
1625:
1602:
1594:
1590:
1581:
1573:
1569:
1560:
1557:
1554:
1546:
1540:
1534:
1527:
1526:
1525:
1508:
1502:
1482:
1459:
1453:
1421:
1415:
1412:
1406:
1400:
1393:
1392:
1391:
1375:
1371:
1350:
1325:
1322:
1316:
1310:
1305:
1295:
1287:
1283:
1276:
1271:
1267:
1261:
1258:
1255:
1250:
1247:
1244:
1240:
1235:
1228:
1220:
1214:
1208:
1201:
1200:
1199:
1185:
1163:
1159:
1138:
1129:
1113:
1109:
1088:
1080:
1061:
1055:
1027:
1021:
1015:
1009:
1006:
1003:
997:
991:
985:
978:
977:
976:
957:
954:
948:
942:
937:
933:
928:
924:
920:
911:
905:
899:
894:
890:
884:
881:
878:
873:
870:
867:
863:
858:
851:
848:
845:
837:
831:
828:
825:
819:
812:
811:
810:
809:
790:
787:
784:
778:
752:
749:
743:
740:
737:
731:
728:
725:
716:
710:
704:
697:
696:
695:
687:
685:
681:
665:
639:
631:
628:
625:
619:
616:
611:
608:
605:
601:
597:
592:
584:
574:
573:
572:
570:
552:
544:
521:
518:
515:
495:
472:
466:
443:
437:
410:
407:
401:
395:
390:
384:
380:
374:
371:
368:
363:
360:
357:
353:
348:
341:
338:
332:
326:
321:
311:
305:
299:
294:
290:
284:
281:
278:
273:
270:
267:
263:
258:
249:
246:
243:
235:
229:
223:
216:
215:
214:
201:
198:
193:
188:
184:
180:
176:
171:
167:
166:Markov chains
163:
159:
155:
151:
147:
146:Peter Whittle
142:
140:
135:
124:
122:
118:
114:
111:(one type of
110:
105:
104:
98:
94:
91:
81:
77:
74:
70:
66:
62:
58:
54:
51:
46:
42:
33:
31:
28:
23:
19:
18:Gittins index
3086:
3082:
3053:
3049:
3012:
3006:
2978:
2974:
2948:
2912:
2895:cite journal
2870:
2866:
2831:. Retrieved
2827:the original
2822:1908.09991v2
2807:
2800:
2786:cite journal
2761:
2757:
2751:
2735:(1): 51β76.
2732:
2728:
2718:
2683:
2679:
2673:
2656:
2652:
2608:
2602:
2593:
2576:
2570:
2564:
2547:
2541:
2531:
2514:
2510:
2504:
2487:
2481:
2472:
2447:1721.1/31303
2429:
2426:Econometrica
2425:
2419:
2394:
2390:
2384:
2375:
2371:
2365:
2332:
2328:
2292:
2288:
2262:
2258:
2254:
2245:
2179:
1878:
1623:
1445:
1342:
1130:
1047:
974:
807:
770:
693:
683:
657:
568:
429:
212:
191:
186:
182:
178:
161:
157:
143:
130:
116:
101:
87:
78:
64:
49:
47:
43:
39:
17:
15:
2971:Weber, R.R.
121:NP-complete
36:Terminology
3114:Categories
2859:References
2391:Biometrika
678:being the
84:Definition
57:scheduling
3056:: 51β76.
2931:cite book
2887:216145213
2833:April 18,
2778:216145213
2710:122785013
2688:CiteSeerX
2217:ν
2188:α
2162:∀
2147:ν
2141:≠
2129:α
2069:α
1979:ν
1941:β
1933:−
1915:∏
1888:β
1818:β
1810:−
1807:τ
1792:∏
1788:−
1762:τ
1684:−
1681:τ
1666:∑
1642:τ
1595:τ
1574:τ
1555:τ
1535:α
1503:α
1454:β
1351:π
1288:π
1268:β
1259:−
1256:τ
1241:∑
1229:π
1079:Katehakis
1062:⋅
1010:β
1007:−
986:ν
925:β
891:β
882:−
879:τ
864:∑
846:τ
666:χ
612:χ
609:∈
602:∑
598:≐
589:⟩
582:⟨
549:⟩
545:⋅
542:⟨
516:β
444:⋅
381:β
372:−
369:τ
354:∑
291:β
282:−
279:τ
264:∑
244:τ
224:ν
2947:(1989).
2464:32530881
2378:(1): 17.
2357:17724147
2209:and not
1840:⟩
1781:⟨
1714:⟩
1661:⟨
1306:⟩
1236:⟨
938:⟩
859:⟨
682:of
391:⟩
349:⟨
322:⟩
259:⟨
3103:2233856
3029:3689689
2995:2959678
2625:3689689
2456:1910412
2411:2335176
2349:2985029
2309:2233856
806:is the
152:from a
134:Gittins
127:History
3101:
3037:656323
3035:
3027:
2993:
2959:
2919:
2885:
2776:
2708:
2690:
2633:656323
2631:
2623:
2462:
2454:
2409:
2355:
2347:
2307:
1624:where
1343:where
771:where
680:domain
430:where
93:scalar
30:scalar
3099:JSTOR
3033:S2CID
3025:JSTOR
2991:JSTOR
2883:S2CID
2817:arXiv
2774:S2CID
2706:S2CID
2629:S2CID
2621:JSTOR
2460:S2CID
2452:JSTOR
2407:JSTOR
2353:S2CID
2345:JSTOR
2305:JSTOR
2268:Notes
658:with
2957:ISBN
2937:link
2917:ISBN
2901:link
2849:link
2835:2020
2792:link
2029:and
1558:>
849:>
519:<
247:>
90:real
27:real
16:The
3091:doi
3087:101
3058:doi
3017:doi
2983:doi
2875:doi
2766:doi
2737:doi
2698:doi
2661:doi
2613:doi
2581:doi
2552:doi
2519:doi
2492:doi
2442:hdl
2434:doi
2399:doi
2337:doi
2297:doi
2293:101
1879:If
1551:sup
1225:sup
1048:If
842:sup
720:inf
240:sup
3116::
3097:.
3085:.
3054:29
3052:.
3031:.
3023:.
3013:12
3011:.
2989:.
2977:.
2933:}}
2929:{{
2897:}}
2893:{{
2881:.
2869:.
2845:}}
2841:{{
2815:.
2811:.
2788:}}
2784:{{
2772:.
2760:.
2733:29
2731:.
2727:.
2704:.
2696:.
2684:19
2682:.
2657:78
2655:.
2641:^
2627:.
2619:.
2609:12
2607:.
2577:11
2575:.
2548:11
2546:.
2515:30
2513:.
2488:42
2486:.
2458:.
2450:.
2440:.
2430:47
2428:.
2405:.
2395:66
2393:.
2376:11
2374:.
2351:.
2343:.
2333:41
2331:.
2317:^
2303:.
2291:.
2275:^
2000:,
1198:.
1128:.
686:.
571::
508:,
170:LP
32:.
3105:.
3093::
3064:.
3060::
3039:.
3019::
2997:.
2985::
2979:2
2965:.
2939:)
2925:.
2903:)
2889:.
2877::
2871:2
2851:)
2837:.
2819::
2794:)
2780:.
2768::
2762:2
2745:.
2739::
2712:.
2700::
2667:.
2663::
2635:.
2615::
2587:.
2583::
2558:.
2554::
2525:.
2521::
2498:.
2494::
2466:.
2444::
2436::
2413:.
2401::
2359:.
2339::
2311:.
2299::
2226:)
2223:i
2220:(
2197:)
2194:i
2191:(
2165:k
2159:,
2156:)
2153:i
2150:(
2144:k
2138:)
2135:i
2132:(
2108:)
2105:i
2102:(
2099:w
2096:=
2093:)
2090:i
2087:(
2084:h
2081:=
2078:)
2075:i
2072:(
2046:)
2043:i
2040:(
2037:h
2017:)
2014:i
2011:(
2008:w
1988:)
1985:i
1982:(
1959:]
1956:)
1953:j
1950:(
1947:Z
1944:[
1936:1
1930:t
1925:0
1922:=
1919:j
1892:t
1860:i
1857:=
1854:)
1851:0
1848:(
1845:Z
1836:]
1833:)
1830:t
1827:(
1824:Z
1821:[
1813:1
1802:0
1799:=
1796:t
1785:1
1776:=
1773:)
1770:i
1767:(
1758:Q
1734:i
1731:=
1728:)
1725:0
1722:(
1719:Z
1710:]
1707:)
1704:t
1701:(
1698:Z
1695:[
1692:R
1687:1
1676:0
1673:=
1670:t
1656:=
1653:)
1650:i
1647:(
1638:R
1606:)
1603:i
1600:(
1591:Q
1585:)
1582:i
1579:(
1570:R
1561:0
1547:=
1544:)
1541:i
1538:(
1512:)
1509:i
1506:(
1483:i
1463:)
1460:i
1457:(
1437:.
1425:)
1422:i
1419:(
1416:w
1413:=
1410:)
1407:i
1404:(
1401:h
1376:i
1372:M
1326:i
1323:=
1320:)
1317:0
1314:(
1311:Z
1302:]
1299:)
1296:t
1293:(
1284:Z
1280:[
1277:R
1272:t
1262:1
1251:0
1248:=
1245:t
1221:=
1218:)
1215:i
1212:(
1209:h
1186:i
1164:i
1160:M
1139:i
1114:i
1110:M
1089:i
1065:)
1059:(
1056:Z
1028:.
1025:)
1022:i
1019:(
1016:w
1013:)
1004:1
1001:(
998:=
995:)
992:i
989:(
958:i
955:=
952:)
949:0
946:(
943:Z
934:k
929:t
921:+
918:]
915:)
912:t
909:(
906:Z
903:[
900:R
895:t
885:1
874:0
871:=
868:t
852:0
838:=
835:)
832:k
829:,
826:i
823:(
820:v
794:)
791:k
788:,
785:i
782:(
779:v
756:}
753:k
750:=
747:)
744:k
741:,
738:i
735:(
732:v
729::
726:k
723:{
717:=
714:)
711:i
708:(
705:w
684:X
643:}
640:c
636:|
632:x
629:=
626:X
623:{
620:P
617:x
606:x
593:c
585:X
569:c
553:c
522:1
496:i
476:)
473:i
470:(
467:R
447:)
441:(
438:Z
411:i
408:=
405:)
402:0
399:(
396:Z
385:t
375:1
364:0
361:=
358:t
342:i
339:=
336:)
333:0
330:(
327:Z
318:]
315:)
312:t
309:(
306:Z
303:[
300:R
295:t
285:1
274:0
271:=
268:t
250:0
236:=
233:)
230:i
227:(
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.