Freecell.net - play online competitive Freecell solitaire


About Elo Ratings

Elo ratings were introduced in December 2020. We use a customized version of the Elo rating model widely used in chess and sporting events for predicting outcomes between two competitors.

The most distinctive aspect of this implementation is that freecell players don’t compete head-to-head over a freecell game as they would under traditional Elo, so our model handles this indirectly by rating games as well as players. In other words, each specific deal is assigned a rating, and players and games exchange points based on how expected or unexpected the outcome was. In this way players compete against each other using the games as a proxy.

Although we refer to it as a "total awesomeness rating," it's just another stat for viewing a player's performance. This is a fairly complex stat that looks at a player's strength in solving deals (their winning percentage coupled with the difficulty of the deals they play), and as such may be somewhat influenced by the variants they choose (see Power Rankings below). At some point in the future the system will have properly normalized all the variants and there will be little advantage to playing a particular variant.

Elo is not an acronym but the name of the man who invented the rating system, Arpad Elo. To read more about the rating system, search the Internet for "Elo rating system".

How it works

Only streak play is rated. Separate ratings have been developed for tournament and HotStreak play. Rankings are for current players only. Names are removed from published rankings after 14 days of inactivity. While most players favor the standard 8x4 game, the site offers a vast array of variants, each with its own lists of best streaks. Elo ratings bring this all together with their ability to account for differences in difficulty between variants and levels, to provide a single ranking for overall player performance. The idea here is that we can now start to compare streak players across variants and difficulty levels, and we can now compare game variants as well (see chart below).

The ratings are designed to answer a simple question: given a particular deal in streak play, how likely is this player to win? The premise of the Elo model is that we can quantify this likelihood based on the difference in ratings between game and player, and then use the actual outcome to improve our prediction for the next event.

Average Elos for each variant


	4x4	4x5	4x6	4x7	4x8	4x9	4x10
5	2155	2069	2015	1972	1823	1689	1536
6	2215	2129	2123	2016	1876	1766	1615
7	2270	2184	2192	2050	1929	1919	1717
8	2325	2239	2252	2106	1961	1991	1830
9	2380	2294	2307	2161	2028	2087	1910
10	2435	2349	2362	2216	2119	2156	2004
11	3037	2404	2417	2271	2233	2170	2054
12	3114	2459	2472	2326	2303	2164	2015

	5x3	5x4	5x5	5x6	5x7	5x8	5x9	5x10
5	2252	2108	1975	1783	1604	1400	1240	1086
6	2316	2154	2051	1923	1670	1480	1336	1099
7	2366	2258	2080	2004	1713	1588	1401	1182
8	2416	2313	2106	2079	1736	1683	1473	1274
9	2466	2368	2161	2151	1828	1768	1544	1291
10	2516	2409	2247	2224	1890	1815	1605	1316
11	2566	2464	2308	2252	2049	1824	1716	1421
12	2616	2519	2397	2358	2068	1782	1743	1478

	6x2	6x3	6x4	6x5	6x6	6x7	6x8	6x9	6x10
5	1961	1958	1815	1569	1340	1187	1046	900	847
6	2017	1999	1869	1632	1434	1250	1069	861	792
7	2068	2054	1907	1718	1493	1306	1132	858	733
8	2117	2103	1941	1784	1617	1393	1221	857	687
9	2195	2180	1988	1904	1708	1454	1263	858	632
10	2236	2221	2017	1949	1806	1469	1267	856	590
11	2261	2246	2048	1984	1844	1580	1373	987	645
12	2286	2273	2086	2018	1831	1590	1379	1025	684

	7x1	7x2	7x3	7x4	7x5	7x6	7x7	7x8	7x9
5	2119	1955	1687	1366	1175	928	855	805	711
6	2284	2017	1719	1451	1255	997	866	781	646
7	2306	2072	1881	1524	1303	1053	868	757	595
8	2350	2122	1980	1627	1373	1140	843	743	539
9	2421	2162	2055	1715	1428	1195	852	704	482
10	2467	2236	2113	1707	1461	1209	825	657	405
11	2503	2247	2140	1841	1566	1319	888	753	496
12	2522	2280	2095	1830	1617	1321	915	792	497

	8x0	8x1	8x2	8x3	8x4	8x5	8x6	8x7	8x8
5	1955	1784	1627	1292	965	862	737	750	679
6	1975	1833	1668	1382	976	869	736	686	627
7	2022	1893	1712	1465	1023	932	728	625	582
8	2072	1953	1764	1537	1065	988	710	546	528
9	2122	1980	1817	1651	1124	997	673	495	460
10	2152	2026	1864	1659	1159	928	636	414	400
11	2222	2104	1919	1794	1255	981	659	476	419
12	2272	2157	1926	1793	1272	1024	639	522	445

	9x0	9x1	9x2	9x3	9x4	9x5	9x6	9x7
5	1973	1643	1296	1025	950	569	601	540
6	2053	1676	1388	1058	929	473	577	558
7	2071	1769	1450	1111	945	487	605	611
8	2105	1854	1500	1174	964	510	629	636
9	2143	1923	1577	1214	993	542	645	658
10	2188	2009	1611	1213	869	610	486	440
11	2211	1992	1691	1365	1053	897	664	610
12	2228	1993	1696	1386	1133	935	762	659

	10x0	10x1	10x2	10x3	10x4	10x5	10x6
5	1754	1325	992	802	790	665	615
6	1773	1420	1053	801	744	655	543
7	1796	1478	1134	809	736	648	528
8	1833	1529	1175	834	717	598	509
9	1865	1554	1220	839	694	556	503
10	1902	1587	1207	773	532	413	231
11	1926	1702	1322	836	692	491	461
12	1962	1688	1343	848	704	513	495

	11x0	11x1	11x2	11x3	11x4	11x5
5	1510	1024	781	721	645	674
6	1613	1067	843	731	627	614
7	1732	1138	873	754	614	612
8	1674	1194	916	788	599	592
9	1799	1225	947	818	528	565
10	1826	1230	733	445	386	516
11	1844	1303	1034	582	461	599
12	1820	1347	1060	584	474	619

	12x0	12x1	12x2	12x3	12x4
5	1254	621	645	627	595
6	1326	633	629	548	526
7	1348	679	616	510	504
8	1406	716	610	494	477
9	1426	776	593	483	435
10	1411	683	374	334	310
11	1395	800	598	387	398
12	1447	833	622	387	433

	13x0	13x1	13x2	13x3
5	930	593	471	496
6	932	578	456	467
7	981	581	455	464
8	1017	600	432	445
9	1047	590	420	435
10	995	498	373	254
11	971	700	433	322
12	1025	734	448	333

Getting started

Every new player begins with a rating of 1500, a common starting point in Elo systems. Games could have also been assigned 1500 to start, but this would have ignored information we already have about individual games and the set they belong to. We also have far more games than players to rate, thousands of players versus millions of games, so a better starting point was needed.

To do this, the win/loss record of each game in December 2020 was used to assign an initial rating. Note that this was a one-time event, and game stats no longer play any role in Elo ratings. Game stats differ in important ways from ratings, because we don’t know who played those games and we don’t know if it was streak play or a timed event. But as an historical note here’s how ratings were assigned.

Basically we took the game's play history, adjusted it slightly toward the mean for that level, and then assigned ratings straight from the Elo formula that corresponds with that win%, assuming a 1500-level opponent. Ratings for levels 6–12 were scaled up based on a calculation of how a player's rating increases after winning ten games in the level below. Finally, whole-level adjustments were made in almost every level based on play testing to bring them into parity with each other, and the Elo model now continues to fine-tune things.

As an example of how ratings were assigned, let’s say a 7x4-5 game had been beaten 1 time out of 10 plays, for a 10% player win rate. Before assigning its initial rating we adjusted to account for the fact that we don't know that much about a game after ten plays. For instance one more win would have doubled its win%, which is significant.

So since the cumulative player win% for 7x4-5 is 64% we add five more fictitious plays at the average win rate for this set of games, meaning we pretend it was beaten 3.2 times out of the next 5, giving an adjusted win rate of 4.2 wins out of 15 plays = 28%. In other words the fewer plays a game has the more we assume it’s a typical game for that variant and rate it accordingly.

If the same game showed 10 wins out of 100 plays, it has the same win% but now we know more about it. This time adding the 5-game adjustment has much less impact, and the game is rated as 13.2/105=12.6%. The 1 out of 10 game would be rated 1668 and the 10 out of 100 game would be rated 1837. This method was used for all levels 5 through 12 where streak play is possible, with an additional bump in the ratings of games beyond level 5 to account for the presumably higher average rating of the players there. New games being played for the first time are assigned the average rating for their respective variant and difficulty level.
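The two-step calculation above can be sketched in a few lines. This is only an illustration of the arithmetic described here, not the site's actual code: the function names are made up, the 5 extra plays and 64% level win rate come from the 7x4-5 example, and the level 6–12 scaling and whole-level adjustments are left out.

```python
import math


def adjusted_win_rate(wins, plays, level_win_rate, extra_plays=5):
    """Blend a game's record with fictitious plays at the level's average win rate."""
    return (wins + extra_plays * level_win_rate) / (plays + extra_plays)


def rating_from_win_rate(win_rate, opponent_rating=1500):
    """Invert the Elo formula: the game rating that gives a 1500 player this win rate."""
    return opponent_rating + 400 * math.log10(1 / win_rate - 1)


few = adjusted_win_rate(wins=1, plays=10, level_win_rate=0.64)     # 4.2 / 15
many = adjusted_win_rate(wins=10, plays=100, level_win_rate=0.64)  # 13.2 / 105

print(f"{few:.1%} -> {rating_from_win_rate(few):.0f}")
print(f"{many:.1%} -> {rating_from_win_rate(many):.0f}")
```

With these inputs the 10-out-of-100 game comes out at 1837, matching the figure above. The 1-out-of-10 game comes out at about 1664, a few points under the 1668 quoted; the gap probably comes from rounding in the 64% figure, though that is a guess.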

The math

Elo ratings represent the likelihood that a player will win or lose a particular game. The formula for expected win% is:

1 / (1 + 10**((game rating - player rating) / 400))

So if a 1500-rated player is dealt a 1000-rated 8x4 game, we would say there’s a

1 / (1 + 10**((1000-1500) / 400)) = 94.68%

chance the player wins and only a 5.32% chance she loses.
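For anyone who wants to try other ratings, here is the same calculation as a small function. The function and parameter names are just for illustration.

```python
def expected_win_probability(player_rating, game_rating):
    """Chance the player beats the deal, from the rating difference."""
    return 1.0 / (1.0 + 10 ** ((game_rating - player_rating) / 400))


# The example from the text: a 1500-rated player dealt a 1000-rated game.
p_win = expected_win_probability(player_rating=1500, game_rating=1000)
print(f"win: {p_win:.2%}, lose: {1 - p_win:.2%}")  # win: 94.68%, lose: 5.32%
```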

These percentages also define how ratings adjust based on the actual outcome. We use a constant K of 8 points, which is the max point exchange between player and game. If the result was expected, and the player above wins, her rating increases by 5.32% of 8 or 0.43 points. If she loses, her rating decreases by 94.68% of 8 or 7.57 points. Points gained by a player are taken by the game, and vice versa. So her new rating will be 1500.43 if she wins or 1492.43 if she loses.
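The point exchange described above might look like this in code. It is a sketch of the player-versus-game exchange only; the name exchange_points is hypothetical, and the level-wide adjustment to games is not included.

```python
K = 8  # maximum points exchanged between player and game, per the text


def expected_win_probability(player_rating, game_rating):
    return 1.0 / (1.0 + 10 ** ((game_rating - player_rating) / 400))


def exchange_points(player_rating, game_rating, player_won):
    """Return (new player rating, new game rating) after one game.

    Hypothetical helper: whatever the player gains, the game loses.
    """
    expected = expected_win_probability(player_rating, game_rating)
    actual = 1.0 if player_won else 0.0
    delta = K * (actual - expected)
    return player_rating + delta, game_rating - delta


after_win, game_after_win = exchange_points(1500, 1000, player_won=True)
after_loss, game_after_loss = exchange_points(1500, 1000, player_won=False)
print(round(after_win, 2), round(after_loss, 2))  # 1500.43 1492.43
```

Note that the two possible outcomes are always exactly K points apart, which is why an unexpected loss costs so much more than an expected win pays.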

That’s the whole story in terms of the player ratings. There’s more going on behind the scenes though when it comes to the games, as we needed to create some leverage to balance the impact on players and games. We do this by taking a few hundredths of every point gain or loss on an individual game and applying it to all 32,768 games in that variant/difficulty level. In other words all the games in 8x3-8 get a small boost up or down based on what happens to any individual 8x3-8 that gets played. This gives us years’ worth of adjustment in days, which is not too much given how many more games there are than players. This extra “boost,” either up or down, is scaled to the frequency of play for that level so as long as a variant gets some play we’re able to get enough adjustment to bring it in line with the others.
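To get a feel for how much leverage this creates, here is a rough sketch. The exact share and the frequency scaling are not published, so SPREAD_FRACTION below is a made-up stand-in for "a few hundredths", and the scaling by frequency of play is ignored entirely.

```python
GAMES_PER_LEVEL = 32_768   # games in one variant/difficulty level, per the text
SPREAD_FRACTION = 0.03     # assumption: stands in for "a few hundredths"


def level_wide_boost(game_delta, spread_fraction=SPREAD_FRACTION):
    """Points applied to every game in the level when one game moves by game_delta."""
    return spread_fraction * game_delta


game_delta = 7.57  # the game's gain when the 1500-rated player above loses
boost = level_wide_boost(game_delta)

# Shift in the level's average rating, without and with the boost.
shift_without = game_delta / GAMES_PER_LEVEL
shift_with = shift_without + boost
print(f"without boost: {shift_without:.5f}, with boost: {shift_with:.5f}")
```

Under these assumed numbers a single result moves the level average roughly a thousand times further than it would from the one game alone, which is the sense in which the model gets years’ worth of adjustment in days.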

What does a rating mean?

Elo ratings are a self-correcting predictive tool and not a score. If this were a head-to-head competition like chess, a 200-point difference means the higher rated player would expect to win 76% of the time. Some top players are 600 or more points above the starting rating, meaning they'd expect to outsolve an average player 97% of the time.
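Those percentages come straight from the Elo formula applied to the rating gap. A quick check, using an illustrative function name:

```python
def win_chance_from_gap(rating_gap):
    """Expected win rate for the higher-rated side, given the rating difference."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400))


for gap in (0, 200, 600):
    print(gap, f"{win_chance_from_gap(gap):.0%}")
# 0 50%
# 200 76%
# 600 97%
```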

A rating is also focused on recent performance. You can think of it like a thermometer: it’s always adjusting based on the current temperature. The previous temperature is the starting point, but once it moves it doesn’t remember the old reading. A rating provides an interesting measure of overall solving ability, but may frustrate players who make it their primary focus. Ideally you check out your rating to see how you stack up and to be amazed at the talented field of players we have here, and then go back to running up streaks in your favorite variants.

Individual game ratings are only an approximation, and except perhaps in 8x4 will never reach their true level. That’s fine, as long as the average for the whole level reaches its true rating, since presumably players will face a large sample of games and some will be rated too high and others too low. Also, at this point the ratings don’t know the difference between really hard games and unwinnable games, so variants with lots of unwinnables will tend to have higher average ratings to compensate.

On the player side, ratings reach their true level much faster. To get there fastest some may opt for what a chess player might call “sharp” play, choosing variants with ratings close to their own where something, good or bad, is bound to happen. Opponents with close ratings push apart like magnets with like charges. Others will choose to protect a rating by only playing specific variants.

Eventually it won’t matter where you play because the variants will naturally move toward parity with each other. And since player ratings are set relative to game ratings, over time it will become impossible to maintain a rating built on play in specific variants that were previously overrated. In the meantime, if you want to know that your rating is an accurate representation of your ability, the best bet is to play in a variety of variants and difficulty levels. This has the added benefit of speeding along the process of getting all the variants into parity with each other. Feel free to look for variants you feel are overrated, though; your play will help bring them in line.

Strategy

There’s nothing you have to do to improve your rating, except play better obviously. Good and bad streaks will happen, and it’s normal to see a rating fluctuate even by dozens of points if you play a lot. Note that if a player wins exactly the number of games predicted by their rating during a day the rating will be unchanged. If you lose one more game than expected your rating will drop by 8 points. Players are human and deals are random. Performance can vary by a lot more than one game, even if the ratings were perfect. So if you get down, keep playing. Ratings have no memory, they’re free floating and not held back by previous performance.
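The "one game equals 8 points" rule of thumb falls out of the update formula: over a day, the net change is roughly K times the difference between actual and expected wins. The sketch below assumes the player's rating stays fixed during the day (in reality it moves a little after each game), and the ten evenly matched games are an invented example.

```python
K = 8  # maximum point exchange per game, per the text


def expected_win_probability(player_rating, game_rating):
    return 1.0 / (1.0 + 10 ** ((game_rating - player_rating) / 400))


def approximate_daily_change(player_rating, game_ratings, wins):
    """Approximate net rating change, holding the player's rating fixed all day."""
    expected_wins = sum(
        expected_win_probability(player_rating, g) for g in game_ratings
    )
    return K * (wins - expected_wins)


# Hypothetical day: ten games rated the same as the player, so 5 wins are expected.
games = [1500] * 10
print(approximate_daily_change(1500, games, wins=5))  # 0.0
print(approximate_daily_change(1500, games, wins=4))  # -8.0
```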

One point of caution: Elo ratings do not care if this is the first game of your streak or the hundredth, so play every game like it matters and don’t let your guard down on those early ones. Also, where Winnable versions of a variant exist, it’s marginally preferable to play these over the regular version of the same variant, where you might risk losing points to a game that other players won’t have to face. This difference is minor and transient, since most unwinnable games in these variants have been assigned very high ratings already and any points they take you’ll begin to get back with your next game, but this may help you add a few Elo points.

Power Rankings

Here are the current best returns for playing for Elo. The "power ranking" is simply the average game win percentage multiplied by the average Elo for the variant. These are ever-changing and of course your mileage may vary.

Rank	Variant	Difficulty	Win%	Elo	Power Ranking
1.	5x7	7	72.16%	1713.337	1236.398
2.	5x10	10	92.57%	1316.196	1218.425
3.	4x5	11	50.00%	2404.000	1202.000
4.	5x9	6	89.28%	1336.347	1193.097
5.	9x2	6	84.52%	1387.951	1173.072
6.	6x6	6	81.45%	1433.607	1167.673
7.	7x6	12	88.26%	1321.283	1166.118
8.	5x7	8	66.07%	1736.291	1147.192
9.	6x8	10	90.20%	1267.221	1143.031
10.	6x7	6	91.29%	1249.577	1140.706
11.	10x1	6	80.01%	1419.829	1135.999
12.	6x8	12	81.79%	1379.136	1128.061
13.	7x5	6	89.76%	1254.555	1126.084
14.	6x7	12	70.49%	1589.500	1120.374
15.	5x7	10	58.93%	1889.709	1113.578
16.	8x3	6	79.98%	1382.438	1105.658
17.	5x10	11	77.78%	1421.479	1105.556
18.	5x6	8	52.63%	2079.168	1094.299
19.	5x10	8	85.55%	1274.157	1089.994
20.	12x0	6	82.10%	1326.397	1089.004
21.	6x8	8	88.92%	1220.655	1085.410
22.	6x7	10	73.66%	1468.986	1082.101
23.	7x4	6	74.46%	1450.836	1080.305
24.	4x7	9	50.00%	2160.500	1080.250
25.	7x5	12	66.67%	1617.282	1078.209
26.	5x10	12	72.55%	1477.906	1072.154
27.	6x8	9	84.80%	1263.411	1071.422
28.	6x8	11	78.00%	1372.726	1070.709
29.	5x9	12	61.28%	1742.534	1067.902
30.	5x9	5	85.99%	1239.891	1066.195
31.	8x4	10	91.69%	1159.016	1062.753
32.	11x1	8	88.45%	1194.036	1056.075
33.	5x10	7	89.28%	1182.175	1055.417
34.	11x1	10	85.75%	1229.598	1054.409
35.	4x7	8	50.00%	2105.500	1052.750
36.	7x6	10	86.99%	1209.022	1051.764
37.	5x10	9	81.43%	1291.220	1051.474
38.	10x2	10	86.55%	1206.873	1044.599
39.	11x1	9	84.89%	1225.226	1040.089
40.	6x7	11	65.75%	1579.994	1038.793
41.	6x7	9	71.39%	1453.830	1037.920
42.	5x6	6	53.92%	1922.830	1036.884
43.	5x8	5	74.01%	1399.960	1036.092
44.	5x10	6	94.23%	1099.470	1036.077
45.	6x8	7	91.32%	1132.051	1033.777
46.	6x7	8	74.06%	1392.799	1031.462
47.	5x7	9	56.41%	1827.935	1031.143
48.	11x1	7	90.59%	1137.879	1030.779
49.	6x7	7	78.70%	1306.134	1027.903
50.	9x3	12	73.88%	1386.276	1024.115
51.	10x2	12	75.83%	1343.045	1018.472
52.	9x2	12	59.72%	1695.899	1012.707
53.	8x4	9	90.00%	1123.759	1011.391
54.	6x7	5	85.20%	1186.802	1011.207
55.	6x6	5	75.09%	1340.481	1006.524
56.	5x9	7	71.69%	1400.902	1004.355
57.	6x8	6	93.97%	1068.630	1004.181
58.	9x4	12	88.38%	1132.982	1001.369
59.	10x2	8	85.17%	1174.505	1000.303
60.	9x2	5	77.17%	1295.941	1000.131
...	...	...	...	...	...
502.	5x4	5	3.78%	2108.299	79.631
503.	6x3	6	3.96%	1998.500	79.149
504.	9x0	9	3.58%	2142.667	76.720
505.	7x2	12	3.31%	2280.231	75.380
506.	8x1	8	3.85%	1952.829	75.109
507.	6x4	12	3.46%	2085.952	72.175
508.	6x2	6	3.57%	2017.005	72.036
509.	10x0	8	3.66%	1832.534	67.140
510.	9x0	8	2.69%	2104.600	56.677
511.	8x1	12	2.24%	2156.781	48.358
512.	9x0	7	2.29%	2070.636	47.347
513.	4x4	10	1.53%	2435.006	37.270
514.	6x2	5	1.67%	1961.043	32.709
515.	4x5	5	1.31%	2069.043	27.198
516.	9x0	10	1.23%	2187.750	27.009
517.	9x0	12	1.20%	2227.822	26.706
518.	9x0	6	0.95%	2052.665	19.549
519.	4x6	6	0.86%	2122.667	18.299
520.	5x3	5	0.80%	2251.627	18.101
521.	4x4	5	0.27%	2155.222	5.864
522.	9x0	11	0.21%	2211.082	4.635
523.	4x4	12	0.00%	3114.405	0.000
524.	4x5	12	0.00%	2459.000	0.000
525.	4x6	12	0.00%	2471.658	0.000
526.	4x7	12	0.00%	2325.500	0.000
527.	5x3	12	0.00%	2616.000	0.000
528.	5x4	12	0.00%	2519.000	0.000
529.	5x5	12	0.00%	2397.300	0.000
530.	6x2	12	0.00%	2286.000	0.000
531.	6x3	12	0.00%	2272.662	0.000
532.	4x4	6	0.00%	2215.000	0.000
533.	4x5	6	0.00%	2129.000	0.000
534.	4x4	7	0.00%	2270.000	0.000
535.	4x5	7	0.00%	2184.000	0.000
536.	5x3	7	0.00%	2366.000	0.000
537.	5x4	7	0.00%	2258.000	0.000
538.	6x2	7	0.00%	2068.000	0.000
539.	4x4	8	0.00%	2325.000	0.000
540.	4x5	8	0.00%	2239.000	0.000
541.	4x6	8	0.00%	2251.625	0.000
542.	5x3	8	0.00%	2416.000	0.000
543.	5x4	8	0.00%	2313.000	0.000
544.	6x2	8	0.00%	2117.000	0.000
545.	4x4	9	0.00%	2380.000	0.000
546.	4x5	9	0.00%	2294.000	0.000
547.	4x6	9	0.00%	2306.625	0.000
548.	5x3	9	0.00%	2466.000	0.000
549.	5x4	9	0.00%	2368.000	0.000
550.	8x1	9	0.00%	1980.000	0.000
551.	4x5	10	0.00%	2349.075	0.000
552.	4x6	10	0.00%	2361.625	0.000
553.	4x7	10	0.00%	2215.500	0.000
554.	5x3	10	0.00%	2516.000	0.000
555.	6x2	10	0.00%	2236.000	0.000
556.	4x4	11	0.00%	3037.000	0.000
557.	4x7	11	0.00%	2270.500	0.000
558.	5x3	11	0.00%	2566.000	0.000
559.	5x4	11	0.00%	2464.000	0.000
560.	6x2	11	0.00%	2261.000	0.000
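
The Power Ranking column above can be recomputed from the Win% and Elo columns. The sketch below uses the first three rows of the table; the function name is just for illustration. Results can differ from the published column by a fraction of a point, presumably because the displayed win percentages are rounded.

```python
def power_ranking(win_percent, average_elo):
    """Win percentage (as shown in the table, e.g. 72.16) times average Elo."""
    return win_percent / 100 * average_elo


# (variant, difficulty, win%, Elo) copied from the top of the table.
rows = [
    ("5x7", 7, 72.16, 1713.337),
    ("5x10", 10, 92.57, 1316.196),
    ("4x5", 11, 50.00, 2404.000),
]

for variant, level, win_percent, elo in rows:
    print(f"{variant}-{level}: {power_ranking(win_percent, elo):.1f}")
```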

Note: to play these specific variants and difficulty levels, use the Custom mode and leave the game number selection on Random and check Streak mode so the game will count. Read here for more information on selecting a particular variant and difficulty level.

How we got here

Before he passed away SlowPoker (part of the original Ratings Crew) imagined devising a rating system for streak play here. He wanted to use the Elo system but he wanted to give each game a rating, sort of a man against machine approach. So basically each game would develop its own rating over time as would each player. These ratings represent the fruition of that idea. After the initial launch extensive play testing was done and manual adjustments made to the games level by level. Then more adjustments were made based on anomalies players uncovered, and finally the “secret sauce” part of the algorithm was fully implemented to let the machine do the work of boosting game averages up or down. We continue to monitor the adjustments the model is making to game averages, and it’s working very well.

Keep branching out, everyone. Play those odd variants and higher difficulty levels if you aren’t worried about protecting a streak. It all helps. Don’t worry, none of you are breaking the rating system. If you choose to play up instead of starting at level 5 that actually helps us get some coverage in lesser played games. Just know that if a rating is built on games that seem to be rated too high, you'll find that playing anything else will bring it back down.

Frequently Asked Questions

Can a player improve their rating by winning lots and lots of easy games?: If a 1900-level player played and won about 600 level 10 10x6s his rating would go up by one point. To gain another point he’d have to win about 1,000 more. Another 1,800 games gets him a third rating point. The average player would have lost 2 games at that point based on the 99.95% win rate for this variant, so this player would have shown he deserved those three hard-earned points by not losing. So yes, it’s possible in theory, but there are diminishing returns to doing this, and only so many hours in a day.
What is the impact of losing a game that should have been rated higher?: It won't matter at all after a few days’ play. First of all there are as many underrated games as overrated, and you’re as likely to encounter one as the other. But regardless, a bigger than expected drop has no permanent impact. It doesn’t just get averaged away, it’s eventually erased. The player’s rating will go up more for wins and down less for losses until she ends up in the same place.
I won a hard game, shouldn’t I have gained more points?: Maybe. No game’s rating is exactly right. But remember that game stats can be deceiving. Many level 5 players are not especially skilled. Many games appeared in tournaments where there was no penalty for playing fast and loose. And finally remember the ratings know who you are, so they expect more from highly rated players. Consider it a compliment and trust that you’ll also play games that were overrated and on average things will work out.
I want to play a certain variant, but it seems underrated. What should I do?: Your call, but playing it is how we fix this. For aligning across variants and difficulty levels to work the model needs data, meaning someone needs to branch out and spend time playing variants and difficulty levels they normally wouldn’t. It may mean using the Custom option to reach lesser played levels as well. Doing this occasionally can also be a reality check on your rating, since it should come back up when you return to other variants.
What about unwinnable games?: There’s nothing to be gained from playing an unwinnable game, whether for your streak or your rating. In terms of impact, they actually matter a lot less for Elo ratings than for streaks. An unwinnable game resets your streak to zero; it might set your Elo rating back a few points. But if your rating takes a hit from an unwinnable game don’t sweat it, the model will give you more points for your next win and take fewer for your next loss until you're right back where you belong.
Those games with a high number of plays and no wins have already been assigned ratings near 3000 to minimize their impact. We used this number so as not to distort the averages too much since we're already near parity and averages are important for assigning new ratings. This will continue to be refined. Meanwhile it’s helpful to remember that every game’s rating is off to one degree or another, the unwinnables only stand out because we can tell when it’s off. The system is designed to work despite that.
Why is there such a wide range of ratings between level 5 and 12 of the same variant?: Generally speaking the level 5 game ratings started out moderately underrated, and the 10, 11, and 12 games were clearly overrated. The range for some of these has compressed to a difference of 200–300 points between level 5 and 12, especially in 8x4 (300-point difference between level 5 and 12) and the easier variants. Then it widens as variants get more difficult, then shrinks again when you get to the impossibly difficult ones. We'll learn more as the ratings get better over time, but at this moment it looks like difficulty level makes the biggest difference in 4x9, 4x10, 5x8, 6x5, 7x3, 8x3, and 10x1.
How can I see how my recent play has impacted my rating?: WTF happened?

Elo calculator

Copy the game and player Elo data directly from your Win/Lose dialog and paste it here to see game by game Elo changes.
I see other games with identical stats to the one I played but with different ratings. What’s happening?: Now that ratings are up and running we’re no longer assigning ratings based on game stats, so comparing ratings based on game stats is not going to get you anywhere. Game ratings now adjust based on point exchanges between 0 and 8 points, and not on the ratings of games with similar records. The initial ratings are not the model, they're just a starting point. Game stats are not ratings.
