1Stirling
Account Details
SteamID64 76561198007723620
SteamID3 [U:1:47457892]
SteamID32 STEAM_0:0:23728946
Country United Kingdom
Signed Up May 8, 2016
Last Posted April 17, 2019 at 1:31 PM
Posts 14 (0 per day)
#14 Simulating a TF2 World Championship in Projects
Zesty: I'm interested in the specifics of the model you're using, but I don't see it anywhere on your site. Is there any more information you can give on the model you're using to predict match results?

It works through my ranking system, which is player-based and derived from results and stats from officials. Recent matches, and those featuring high-profile players, are weighted the highest and have the most influence on players' statuses. When predicting a match result, the players on each team are looked up and their average scores are calculated, and then whatever disparity exists between the two teams is used to estimate how likely it would be for either of them to win the match as a percentage.
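
As a rough illustration of that last step, here is a minimal Python sketch. The post doesn't say how the score disparity is converted into a percentage, so the logistic curve, the `scale` value, and the example player scores below are all assumptions made purely for illustration.

```python
from statistics import mean


def team_average(scores):
    """Average the ranking scores of the players on one team."""
    return mean(scores)


def win_probability(team_a, team_b, scale=100.0):
    """Estimate the chance that team A wins from the gap in average scores.

    The logistic curve and `scale` are assumptions, not the actual formula.
    """
    gap = team_average(team_a) - team_average(team_b)
    return 1.0 / (1.0 + 10 ** (-gap / scale))


# Hypothetical ranking scores for two six-player rosters.
blue = [820, 790, 805, 760, 775, 810]
red = [700, 720, 690, 735, 710, 705]

p_blue = win_probability(blue, red)
print(f"Blue {p_blue:.0%}, Red {1 - p_blue:.0%}")
```

Two evenly matched rosters come out at 50% each, and the favourite's chance approaches 100% as the gap grows.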

It's at this point the simulator steps in. It's actually quite simple and not at all fancy. It uses these percentages as a base for its match transcripts. The percentage values are applied not on a whole-map basis but on a capture-point-by-capture-point basis (although they're modified so that the difference between them is reduced, because a 10% chance of winning a match isn't the same as a 10% chance of winning a midfight, for example). To summarise, RNG is used to determine which team is the next to make a capture based on these percentage figures. An RNG-based system also determines how long it will be until the next capture happens, aka duration.
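
A minimal sketch of that per-point roll might look like the following. The `damping` factor, the 60-second mean, and the exponential distribution for durations are assumptions; the post only says that the gap between the percentages is reduced and that the duration is RNG-based.

```python
import random


def point_probability(match_prob, damping=0.5):
    """Shrink a match-level win chance toward 50% for a single point.

    `damping` is a hypothetical factor, not a value from the post.
    """
    return 0.5 + (match_prob - 0.5) * damping


def next_capture(match_prob, rng, mean_seconds=60.0):
    """Return (capturing team, seconds until that capture)."""
    team = "A" if rng.random() < point_probability(match_prob) else "B"
    # Assumed: waiting times are exponentially distributed.
    duration = rng.expovariate(1.0 / mean_seconds)
    return team, duration


rng = random.Random(7)  # seeded only so the sketch is repeatable
for _ in range(5):
    team, seconds = next_capture(0.10, rng)
    print(f"{team} captures after {seconds:.0f}s")
```

With these assumed numbers, a team given a 10% chance of winning the match gets a 30% chance of winning any individual point.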

However, lots of smaller aspects influence what I've described above to produce more realistic outcomes. For example, certain maps are determined to be more conducive to longer stalemates, so longer duration numbers are more likely to pop up on them. The percentage figures mentioned above are actually dynamic: they change point-by-point during the simulation based on context. If the duration following the previous capture is, by chance, sufficiently low, the chance that the team that won that capture also wins the next one is proportionately increased to emulate a sense of momentum. This is what leads to events in simulations where, for example, one team sweeps across the map relatively quickly, something that would be somewhat rare if it were just left to raw chance. Another example is on last, where the team on the offensive has a slightly reduced chance of progressing and longer stalemates are more likely. Durations are much shorter on average in matches featuring teams perceived to have a big performance gap, while longer durations are more likely in close matches. Factors such as these, I think, lead to simulations that appear to be much more organic and realistic.
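
The momentum and last-point adjustments could be sketched roughly as below. Every threshold and size here (`quick_threshold`, `max_boost`, `last_penalty`) is a made-up placeholder; the post describes the direction of each effect but not its magnitude.

```python
def adjusted_probability(base_prob, last_winner, last_duration,
                         attacking_last=None, quick_threshold=30.0,
                         max_boost=0.10, last_penalty=0.05):
    """Chance that team A takes the next point, after context adjustments.

    All numeric defaults are hypothetical placeholders.
    """
    prob = base_prob
    # Momentum: the quicker the previous capture, the bigger the boost
    # for the team that made it.
    if last_winner is not None and last_duration < quick_threshold:
        boost = max_boost * (1.0 - last_duration / quick_threshold)
        prob += boost if last_winner == "A" else -boost
    # Pushing last: the attacking team's chance is slightly reduced.
    if attacking_last == "A":
        prob -= last_penalty
    elif attacking_last == "B":
        prob += last_penalty
    # Keep the result a usable probability.
    return min(max(prob, 0.01), 0.99)


# Team A won the previous point in 10 seconds and is now pushing last.
print(adjusted_probability(0.55, "A", 10.0, attacking_last="A"))
```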

The RNG elements also mean that lower-ranked teams can in fact beat higher-ranked ones within these simulations to what I think is a realistic degree. The lower-ranked team's chance of winning drops off rapidly as the score gulf between the two teams in the rankings widens. There turned out to be a number of examples of such upsets in the simulated world championship.

This whole system generates row after row of point captures following this simple procedure until either one team wins or time runs out.
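
Put together, the row-generating loop might be sketched like this for a 5CP map. The 30-minute time limit, the five-round win limit, the fixed per-point probability, and the exponential durations are all assumptions chosen for the example rather than details taken from the simulator.

```python
import random


def simulate_match(point_prob_a, rng, time_limit=1800.0, win_limit=5,
                   mean_seconds=60.0):
    """Generate capture rows until a team hits win_limit or time runs out.

    Each row is (clock in seconds, capturing team, score A, score B).
    """
    rows, score, clock = [], {"A": 0, "B": 0}, 0.0
    a_points = None  # None means the middle point is still neutral
    while max(score.values()) < win_limit:
        clock += rng.expovariate(1.0 / mean_seconds)
        if clock >= time_limit:
            break  # time ran out before the next capture
        team = "A" if rng.random() < point_prob_a else "B"
        if a_points is None:
            # Winning the midfight leaves the winner holding three points.
            a_points = 3 if team == "A" else 2
        else:
            a_points += 1 if team == "A" else -1
        if a_points in (0, 5):
            # One team now holds all five points: round won, reset to mid.
            score[team] += 1
            a_points = None
        rows.append((round(clock), team, score["A"], score["B"]))
    return rows, score


rows, final = simulate_match(0.6, random.Random(1))
for row in rows[:5]:
    print(row)
print("Final score:", final)
```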

So the simulator really isn't especially clever, but it is specifically designed to produce realistic-looking events.

posted about 5 years ago
#10 Simulating a TF2 World Championship in Projects

In case anyone's interested in the final results of this side-project, the final post in the series is now up with Day 11. This one documents the final leg of the playoffs (which began on Day 9).

posted about 5 years ago
#1 Simulating a TF2 World Championship in Projects

I thought I'd make some noises here about a TF2Metrics side-project I'm doing. Long story short, curating my stats-based ranking system led to my throwing together a basic match simulator which, after much tinkering, I'm now putting to use simulating an entire tournament among the world's finest 6v6 teams in a sort-of virtual super-LAN. A real-life world championship like this isn't entirely feasible, of course, so maybe this project can serve to scratch that itch a little bit. While I won't for a minute pretend that any events that transpire should be taken as completely accurate representations of what would happen in a real-life tournament of this scale, I do at least hope this will be of some entertainment value to some of you. I'm going through the format according to an eleven-day timetable, which brings us right to the eve of an actual real-life LAN, the Copenhagen Games.

I actually started this with an introductory post yesterday, which I'd suggest starting with, but I thought I'd withhold from mentioning it here until I actually had some completed virtual matches (which each manifest as a transcript of point captures) to show, which you can peruse in today's post.

Cheers!

posted about 5 years ago
#47 International Ranking System in Projects

It's been a year since I last drew attention to this thread before i61, so on the eve of i63 I thought I might trouble you all with another cheeky bump.

I've just completed a preview post for i63 here. It's quite different from what I did last year, focusing on statistical trends and speculation rather than the backstories of the teams involved. A few questionable conclusions were certainly reached, but I suppose that's where the fun lies. The last noteworthy thing this ranking model did in the run-up to i63 was to predict that France would win the ETF2L Nations Cup, and, well... if it's as inaccurate about i63 as it was about the Nations Cup, Svift are gonna win it.

Anyway, thanks for stopping by.

posted about 5 years ago
#17 TF2Elo Predictions in Projects

Fellow who runs TF2Metrics here.

Seeing this inspired me to start making dedicated weekly match predictions again. Fight me.

For the sake of comparison, here's my own crystal ball's assessment of what Week 1 of Invite might look like, including most likely score outcomes and win chances. I think my system is quite a bit more cynical than yours by the looks of things.

I'll also point out that my system has the unfair advantage of about a year's worth of data on most of the players involved in Invite, plus it already has a little bit of knowledge about most of the new arrivals, too. This means if your system proves to already be more accurate than mine I'm gonna be grumpy.

posted about 6 years ago
#45 International Ranking System in Projects
laz: http://i.imgur.com/ceeTCNm.png
http://i.imgur.com/vgJQQjN.png

you hate to see it happen

There are certainly big inaccuracies in the list. It can only go off the data available of course, and for these two there's only a small amount of it from officials in the past year. T0m's a great example of how misleading the data can sometimes be, even when there's a lot to go off of - one of the best roamers in Europe and he's only ranked 270th. Often the numbers can't give the full picture.

You might notice Delpo used to be ranked much much higher, peaking at 42nd. That was back in his EVL days. Since then he's only had fleeting appearances in Invite, and this system rewards consistent prominence at the top level. That's why his status is reduced for now. I certainly shan't claim it's an accurate representation of ability in this case.

ymRaisin: rktman was an alias cody used in a couple of ESEA matches this season

I'll be sure to amend that. It's certainly not the first time I've fallen foul of aliases.

cirlo: That was an amazing write-up!

Hope you don't mind if i add it to i61 page on comp.tf

Not at all.

posted about 6 years ago
#34 International Ranking System in Projects

Hopefully it's not too impolite of me to bump this...

I've made a bunch of additions to the TF2Metrics project over the months since I started it. For one, the rankings list has some much-needed visual aids now, namely an automatically-generated class indicator next to everyone's names (based on whatever they last played) and deltas which show any positional or score changes to have happened over the course of the previous week.

I've also made a way for me to generate score predictions for matches based on the rankings scores of the individual participants involved. I call this the projection machine. Usually its predictions are broadly accurate, sometimes they're spot on, and sometimes they're completely barmy (in fact the same applies to the rankings list itself). It's all good fun either way.

I've also started publishing little graphics that show the data from matches that went 'on-record' and so affect the rankings. Update posts, such as this one, show these for every on-record match that happened since the previous update.

The blog has north of 50 posts now, and with i61 just around the corner I thought now might be a good time to invite people to revisit the project if they fancy it. I have a preview post for i61, which looks into the background of the Invite teams, and I also have the projection machine's impressions of the i61 group stage published, too.

Cheers!

posted about 6 years ago
#30 International Ranking System in Projects
muppet: Nice work with the playoff articles !

Thanks, dude! ESEA one is coming tonight.

posted about 7 years ago
#21 International Ranking System in Projects
nuze: Everyone seems to be posting about how some player that they know is ranked incorrectly compared to other players.. its like you forget that its a system of statistical analysis that obviously has flaws and uses a very unique and rather interesting metric in 'gilding' - I think this is very cool and a good proportion of the rankings seem to be pretty accurate to me.
Nice work my man!

You were always my favourite, Nuze <3

The criticism is completely justified, though. There are clear weaknesses. Heck, even the concept of a ranking system for TF2 players at all has flaws. With Drackk and Blaze, for example, is one really better than the other at all? Or are they just different?

I think I should have badged this as an attempt at a ranking system rather than a full-blooded proper ranking system.

I'll jump on the blog tomorrow and talk properly about the motivation behind why this system works the way it does, because there are specific purposes there. Chief among these is that I wanted a system that allowed players to stand out from among their teammates. If it were based purely on team results, Muuki and Uubers would be level. This way, though, Uubers has a means to excel within the team itself. In the match that just finished, he was the only one who successfully prevented his counterpart on nR from getting gilded.

posted about 7 years ago
#16 International Ranking System in Projects
sandblast: in no world should delpo be rank 262 LOL

This one's down to the limited scope of the rankings. When he played in ESEA-I with EVL a couple of seasons ago he hovered around the 40s.

posted about 7 years ago
#13 International Ranking System in Projects
bearodactyl: who cares about the stats if team a beats team b twice and has a better record should they not be higher on the list?
like ok cool maybe x player got kritzd and then y player baited super hard so their team got better stats, but it's entirely possible that their team lost even though as a whole they got more damage and/or kills than their opponents

This is completely true and a very valid criticism. In the end, this is just another metric by which teams can be ranked, and I'll be the first to admit it's miles away from being totally authoritative, and you're right to say that a victorious team with modest stats is better than a losing one with flashy stats. There's a great deal to disagree with in the rankings as they currently are. I have a great deal of reservation about its conclusion that Se7en are ahead of Froyo. That conclusion came about because Se7en, unlike Froyo, haven't had a tough playmate this season, meaning they've been free to stomp around getting gilded left, right and centre.

In the end this is really nothing more than an experiment, to see how well this particular interpretation of statistics matches up with reality. Sometimes it's right, and indeed often it's not right. This particular series of equations thinks that Six Apes is inferior to an Asian team called P00tis is Kill, but I don't think anyone would consider that to be a reasonable statement to make without evidence. It thinks Lemmings are behind Nunya, even though it was the former that came within reach of playoffs. This one can be explained because Cold Heart and Zesty never got gilded this season, which I think many would agree is rather unfair, especially for Zesty.

There's a long list of inaccuracies beyond this, of course.

gemm: how on earth are you able to compare two teams who play in different leagues when you can count on one hand the number of times matches between teams in those leagues have occurred

it makes no sense to put etf2l high teams in with esea IM teams when there's no results linking them

They're complete guesses, that's the fairest way of putting it. It generally puts teams with similar-ish records within their own region near each other. For example, it thinks Lowpander and Nature Walk would produce a good match if they got to play each other; that's all it boils down to, really.

posted about 7 years ago
#7 International Ranking System in Projects
bearodactyl: http://i.imgur.com/7xn5h6x.png
http://i.imgur.com/MorMJkA.png
http://i.imgur.com/SFm6cFM.png
http://i.imgur.com/ymiTuwV.png
http://emojipedia-us.s3.amazonaws.com/cache/80/c3/80c3d87224a20373f0b73f27d6f3ce04.png

And that's a prime example of this system being clearly wrong. This system rather undervalues you and Jarrett in my view, and that's part of the reason why MP4 is ahead. Dingo's run with EVL last season also boosts him to perhaps a debatable level. Usually with time these errors correct themselves, so we'll see if this has changed come season's end.

posted about 7 years ago
#5 International Ranking System in Projects
Alexandros: I feel like this is very biased toward current players.

You're absolutely right, the purpose of the 300-match window is to keep the rankings current. This is to help with accuracy for the current teams and to account for rust should an old player make a comeback. It also means that today's really good players aren't held back by poor performances they may have had perhaps a year ago.
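
One simple way to picture a rolling window like this is a fixed-length queue, sketched below in Python. Whether the 300 matches are counted globally or per player isn't stated in the post, so a single global window and the integer match IDs are assumptions for the example.

```python
from collections import deque

WINDOW = 300  # window size mentioned in the post

# A deque with maxlen silently drops the oldest entry once it is full.
recent_matches = deque(maxlen=WINDOW)


def record_match(match_id):
    """Add a match to the window, pushing out the oldest if needed."""
    recent_matches.append(match_id)


# Hypothetical match IDs 1..350: only the latest 300 remain in scope.
for match_id in range(1, 351):
    record_match(match_id)

print(len(recent_matches), "matches in window, oldest is", recent_matches[0])
```

Anything that falls out of the window simply stops contributing, which is how a strong showing from long ago fades from a player's current standing.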

posted about 7 years ago
#1 International Ranking System in Projects

I'm curating a stats-based ranking system that incorporates top-level ETF2L, ESEA, OzFortress, and AsiaFortress players and ranks them all in a single international list. It currently lists over 400 players and 27 teams. It's far from 100% accurate of course but I feel on the whole it works quite well. So far its claims to fame include correctly predicting the finishing order of i58 and ESA Rewind. The full table can be viewed via Dropbox here. I've been tinkering with it for ages and now feel it's ready to share.

I've also started a blog called TF2 Metrics where I plan on posting commentary and analysis about these rankings regularly while keeping the tables updated. The first post explains exactly how the system works.

Feel free to laugh at some of the more controversial conclusions this system reaches, but hopefully there's enough reasonableness here to at least be of interest to some of you. Cheers!

posted about 7 years ago