🎮Matchmaking in PvP Titles - Why AI May Save Us

Matchmaker Make Me a Match

Matchmaking in video games is defined as the grouping of players to fill a lobby or session. Developers achieve this through a variety of methods towards varying goals, but in all cases they must consider the universal experiences of queue times and network quality/latency. 343 Industries put together this great write-up on the vagaries of net code and importance of connectivity in Halo Infinite. Individual dev teams will set their own tolerances for queue duration and connectivity, but beyond that we see five major patterns of matchmaking emerge.

EA Patent 16/813454
Random Matchmaking - After connectivity constraints, players are grouped together randomly from the basket of available players in the queue.
Sequential Matchmaking - Instead of placing players randomly from the pool, the system fills lobbies sequentially, adding each player immediately within the constraints of connection quality. It is not truly random, as the order in which players enter the queue determines their lobby assignment. This minimizes queue times but may be "lumpy" in terms of skill or other properties.
Skill Based Matchmaking (SBMM) - Under this system, the game attempts to score or rank players against each other based on previous performance. This value is often referred to as a Matchmaking Rating, or "MMR". The game may surface this value to the user or keep it hidden. Players within a certain numerical distance of one another are grouped into a lobby. Ranked systems are a form of skill based matchmaking where Elo or divisional points are awarded based on in-game performance against other players. Over longer periods of time, a system of awarding points for wins and losses has the merit of converging on a true player rating based only on human feedback. In practice, most ranked systems also include a synthetic matchmaking variable, either because of lower populations at high tiers of play or as a mechanism for awarding outsize gains to speed overly proficient players through the lower ranks. In the end, the goal of a Skill Based Matchmaking system is to minimize the difference in player skill within the session or between the two competing teams.
Engagement Optimized Matchmaking (EOMM) - Often conflated with SBMM, Engagement Optimized Matchmaking uses statistical or other mathematical models to try to group players in a way that drives some given KPIs. Large studios like Epic Games and Blizzard Entertainment infamously focus on using EOMM methods to optimize for overall playtime. Other outcomes discussed include converging towards a certain win/loss ratio or optimizing the propensity to buy premium offerings or otherwise spend money. With EOMM and SBMM the developer must decide how to balance the algorithm against queue times and connection quality, typically with the outcome that EOMM or SBMM loosens as queue times go up or connectivity degrades.
Social Matchmaking - Players are matched with others they have in-game relationships with (e.g. friends lists, guilds, or people they have previously played or partied with). These systems pop up more often in cooperative or PvE based games to try to build community or cultivate player relationships over time.
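
To make the idea of awarding points for wins and losses concrete, here is a minimal sketch of the classic Elo update. The K-factor of 32 is just an illustrative assumption, and real ranked systems layer their own adjustments on top of something like this.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Chance that player A beats player B under the standard Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one head-to-head result.

    k is an assumed K-factor; it controls how far a single game moves a rating.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    # Zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two evenly rated players: the winner takes 16 points from the loser.
print(update_elo(1500.0, 1500.0, a_won=True))  # (1516.0, 1484.0)
```

An upset win against a much higher rated opponent moves both ratings further than an expected win does, which is what lets the numbers converge over many games.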

In practice, matchmaking engines likely utilize a variety of methods and parameters to group players, potentially combining some facets of all of the above.
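
As a rough sketch of how skill matching might loosen as queue times go up, the snippet below widens the allowed MMR gap the longer a player has waited. Every name and number here (QueuedPlayer, allowed_gap, the 50 point base window, the growth rate) is hypothetical, and connection quality checks are left out entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueuedPlayer:
    name: str
    mmr: float
    wait_seconds: float


def allowed_gap(wait_seconds: float, base_gap: float = 50.0,
                growth_per_second: float = 2.0, max_gap: float = 400.0) -> float:
    """MMR window that widens with time in queue (all values are assumptions)."""
    return min(max_gap, base_gap + growth_per_second * wait_seconds)


def can_match(a: QueuedPlayer, b: QueuedPlayer) -> bool:
    # Design choice: use the more patient player's (wider) window.
    tolerance = max(allowed_gap(a.wait_seconds), allowed_gap(b.wait_seconds))
    return abs(a.mmr - b.mmr) <= tolerance


def build_lobby(queue: List[QueuedPlayer], lobby_size: int) -> Optional[List[QueuedPlayer]]:
    """Greedy fill: anchor on the longest waiter, then add the closest MMRs."""
    if len(queue) < lobby_size:
        return None
    anchor = max(queue, key=lambda p: p.wait_seconds)
    candidates = sorted(
        (p for p in queue if p is not anchor and can_match(anchor, p)),
        key=lambda p: abs(p.mmr - anchor.mmr),
    )
    if len(candidates) < lobby_size - 1:
        return None  # not enough players inside the window yet; keep waiting
    return [anchor] + candidates[: lobby_size - 1]
```

Using the wider of the two windows favors short queues over tight skill bands. Swapping max for min in can_match would make the opposite trade.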

Matchmaking: Players or Teams

While matching individual players according to performance can be difficult, the challenge grows exponentially when applying the same outcome expectations to groups. In many popular titles, particularly Battle Royales, teams of 2-4 players come together to fight up to 50 other squads. Does the matchmaking score become an average or weighted measure of the players? What do you do in cases where the internal variance in skill is extreme? In the case where a pro-level player plays with three low skill players, do you still put them in a bottom skill lobby, given that the pro is likely to have more interactions by an order of magnitude and "farm" the lobby? There are a variety of math and methodology decisions that can be made to mitigate these scenarios, but teaming adds a large amount of complexity that is difficult to tune out of a machine algorithm manually.

Anecdotally, I can say that high EOMM games have huge issues with team matchmaking. In my friend group, we have a lot of examples where someone has stopped playing a game because the lobbies assigned to the team greatly outmatched their individual skill. Games like Mario Kart with progressive player power-up/aid systems mitigate this by trying to help lower skilled players, but similar systems in FPS games have been wildly unpopular even in theoretical discussions. I find the thought of changing game mechanics on behalf of some players and not others to be morally questionable, as it undermines any pretense of integrity within the system. More popular is changing the game environment to introduce more randomness or variability. Lowering the time to kill, enabling camping, increasing random events, or expanding asynchronous plays (those which do not have counterplay) all serve to lower the skill gap within a game, flattening the outcome distribution and allowing lower skilled players to take a larger share of the game outcomes. This drives a more consistent player experience across skill levels and pairings of skill levels, but takes away from the competitiveness of a title. Expect to see a need for different casual and competitive rulesets in games which attempt to achieve a more homogeneous player experience through gameplay design.
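
One way to picture how added randomness flattens the outcome distribution is a toy model in which each player's result in an engagement is their skill plus Gaussian noise. That model, the function name, and the numbers are my assumptions for illustration only, not how any particular game works.

```python
import math


def win_probability(skill_gap: float, noise_sd: float) -> float:
    """P(the stronger player comes out ahead) in a toy model.

    Assumes each player's performance = skill + Normal(0, noise_sd^2),
    drawn independently, so the difference in performance has standard
    deviation noise_sd * sqrt(2). noise_sd must be greater than zero.
    """
    z = skill_gap / (noise_sd * math.sqrt(2.0))
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Same 100 point skill gap, increasingly random game environment.
for noise_sd in (50.0, 100.0, 200.0, 400.0):
    print(noise_sd, round(win_probability(100.0, noise_sd), 3))
```

As noise_sd grows, the stronger player's edge shrinks toward a coin flip. That is the consistency gain and the competitiveness loss described above, in one number.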

Weighted Scoring - Combine individual player skill measures, likely weighted by individual KDR, number of games played, or some other metric. This is the easiest way to bridge scores, but may not reflect the impact of skills such as communication and teamwork, which have outsize impact in some games. Additionally, getting good data on individual performance is difficult when many games do not allow for solo play or many players do not engage in solo play.
Team Performance - Give each group of players a rating for the team as a whole. This is likely the most accurate way to score a team, applying the same individual calculation to the group. The disadvantage is that building a history and resulting score for each combination of players can be storage intensive and leaves large stretches of games where a team's matchmaking is not optimized during the many learning periods.
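
To see why the choice of weighting matters, here is a small sketch comparing a plain weighted mean with a blend that leans on the best player in the squad. The function names, the 0.5 blend weight, and the example ratings are all made up for illustration.

```python
from statistics import mean
from typing import Sequence


def weighted_rating(ratings: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of player ratings (weights could be games played, etc.)."""
    total = sum(weights)
    return sum(r * w for r, w in zip(ratings, weights)) / total


def top_weighted_rating(ratings: Sequence[float], top_weight: float = 0.5) -> float:
    """Blend the best player's rating with the squad mean.

    top_weight is an assumed tuning knob: 0 is a plain average,
    1 rates the squad entirely by its strongest member.
    """
    return top_weight * max(ratings) + (1.0 - top_weight) * mean(ratings)


# One pro-level player queueing with three low skill friends.
squad = [2400.0, 1000.0, 1000.0, 1000.0]
print(weighted_rating(squad, [1, 1, 1, 1]))  # 1350.0
print(top_weighted_rating(squad))            # 1875.0
```

The plain average drops this squad into a fairly low lobby for the pro to farm, while the top-weighted blend pushes it up at the cost of a harder night for the other three. Neither number captures teamwork or communication.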

Controversy with Matchmaking

There is a natural tension of incentives between the developer and the player as it relates to matchmaking in competitive titles. PvE or other collaborative games have less of a burden to equalize teams, and many parties will be made socially without any algorithmic intervention from the game. If difficulty levels are implemented, players will often self-segregate into their own talent pools under these terms. Contrast this with competitive titles like Call of Duty or League of Legends, where a player's success in the game depends entirely on who else is in their lobby. Players want to feel that matchmaking is "fair" and gives them a reasonable chance to succeed. Developers want to ensure that newer or lower skilled players do not become discouraged and stop playing the game entirely, thus reducing the probability of that player making an in-game purchase to zero. We'll often see the following themes:

Artificial Difficulty - Skilled players will complain that the game does not reflect the reality of their skill or reward the work taken to cultivate that skill level. Strict SBMM may preclude a skilled player from "chilling" or playing at anything less than a peak competitive difficulty level and intensity. EOMM may intentionally place players in intermittently hard or impossible games if frustration has favorable impacts on retention and playtime. For a high skilled player, casual gaming may not be possible.
Scripted - Games with significant EOMM in particular may give the players the feeling that their in-game performance is divorced from outcomes or mechanical realities. Is the game just giving me an easy game because I lost three straight? Am I any good at this? Why do I play if I don't have any sense of real progression? In EOMM playlists, player statistics potentially mean very little.
Lack of Transparency - Developers are understandably reluctant to get into the nuts and bolts of their matchmaking algorithms. Often the matchmaking process is a series of highly technical systems, combining connection based considerations with one or more of the above factors. It may be difficult for the average player to understand everything that goes into matchmaking, and in the case of sophisticated models (see machine learning) there may be too many parameters, or the math may be too complex, for simple human expression or reduction. Ranked playlists do the best job with public point scores but often still hide other MMR values. It's almost unheard of for a developer to publish a matchmaking rating for "casual" playlists. It is also very uncommon for a publisher to make matchmaking data available for third party analysis, so there is little documentation on best practices or peer review. Call of Duty published this write-up for multiplayer, but has beaten around the bush when talking about Battle Royale matchmaking. Halo has similarly put out a statement on multiplayer matchmaking, mostly within its Ranked playlists, but these two articles are rare exceptions to what is almost always a black box process.

Update 7/27/24 - SBMM Blog - Unsurprisingly, Activision has conducted multiple tests on the impact of removing or loosening Skill Based Matchmaking, and they found in every case that lower skill players quit in higher numbers. More startling to me, they revealed that they have conducted large scale A/B tests on up to 50% of the player base. This seems to show that there is no real way to effectively promote a casual PvP experience between real human actors.

Sandbagging/Manipulation - Players may reverse boost or intentionally game the algorithms to get into less difficult lobbies at the expense of their player stats. Some streamers or other clip farmers have been known to either throw a large sequence of games to reduce their matchmaking rating or use VPNs or other network manipulation to put themselves in easier lobbies.

Can the Robots Save us?

Having written at length about the challenges of matchmaking, I see a few ways in which recent advances in compute, transformers, and attention based models may be able to help us in the future. A quick breakdown of terms:

Compute - AI is very compute intensive and requires being able to parallel process large matrices or vectors quickly. Such hardware may enable complex calculation to quantify gameplay elements such as positioning/movement which are very hard to assess currently. Positioning is an example of a key gameplay element that doesn't always reflect well in event based statistics.
Attention Models - A subset of deep learning models which specialize in breaking datasets into their most critical components and learning how to weight them according to their significance in the dataset. By using deep learning approaches such as attention, developers may be able to better quantify player skill or other desired KPIs using previously unknown attributes or relationships.
Transformer Architecture - Allows for more rapid training of models.
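
For a feel of what weighting components by significance means mechanically, here is a bare-bones sketch of scaled dot-product attention for a single query. The idea of feeding it per-event feature vectors from a match is purely my illustration. In a real model the query, keys, and values are learned rather than hand-written.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Sequence[float],
           keys: Sequence[Sequence[float]],
           values: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Scaled dot-product attention for one query vector.

    Returns (weights, pooled), where weights says how much each item
    mattered and pooled is the weighted average of the value vectors.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, pooled


# Hypothetical example: three in-game events, each with a 2-number key and value.
weights, pooled = attend(
    query=[1.0, 0.0],
    keys=[[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]],
)
print(weights, pooled)
```

The event whose key lines up best with the query gets the largest weight, so it dominates the pooled summary.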

The bottom line is that we're getting better at quickly training human-like models and that advances in compute are allowing for increasingly complex calculations in less time. This gives us three opportunities:

Better Matchmaking - Especially for teams, being able to traverse more data and establish better relationships within that data could allow for better matchmaking or KPI scoring of teams, leading to higher quality matches.
Casual Robot Punching Bags - If it moves and reacts like a human, it might as well be a human. In the age of EOMM, we're already not playing an organic game. Better and less obvious bots that move and react like humans of different skill levels can make more real-time decisions and provide competition at differing levels. They can then be mixed into casual playlists to act as a pressure valve for continuous competition. Bots of various difficulties can be trained on various levels of player performance to curate an experience in casual playlists while still factoring in overall player engagement. Fortnite already does this, though the current bot implementation is horrifically obvious. Obviously, non-human actors have no place in ranked or competitive playlists, as the sole purpose of those systems is to compare human skill.
SCUMP vs KARRIGAN - Equally interesting, being able to train AI opponents on pro players or high skill opponents unlocks some new opportunities. As a practice tool, having high quality competition could be a great aid to player performance. From an entertainment perspective, having AI models create historical matchups between teams of different eras could be a great spectacle. Also in play: monetizing the above for image and likeness could give players, developers, and organizations another potential product for revenue sharing.
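
As a very rough sketch of mixing bots into a casual lobby, the snippet below fills the empty slots with whichever bot tier sits closest to the humans' average rating. The tier names, their ratings, and both function names are invented for the example. A real system would presumably weigh engagement and plenty of other signals as well.

```python
from typing import Dict, List, Sequence

# Hypothetical bot tiers, each imagined as trained on players near this rating.
BOT_TIERS: Dict[str, float] = {"rookie": 800.0, "regular": 1200.0, "veteran": 1600.0}


def pick_bot_tier(human_mmrs: Sequence[float],
                  tiers: Dict[str, float] = BOT_TIERS) -> str:
    """Choose the bot tier whose rating is nearest the humans' average."""
    if not human_mmrs:
        raise ValueError("need at least one human player in the lobby")
    lobby_mean = sum(human_mmrs) / len(human_mmrs)
    return min(tiers, key=lambda name: abs(tiers[name] - lobby_mean))


def fill_with_bots(human_mmrs: Sequence[float], lobby_size: int,
                   tiers: Dict[str, float] = BOT_TIERS) -> List[str]:
    """Return one bot tier label per open slot (casual playlists only)."""
    open_slots = max(0, lobby_size - len(human_mmrs))
    return [pick_bot_tier(human_mmrs, tiers)] * open_slots


print(fill_with_bots([1500.0, 1700.0], lobby_size=4))  # ['veteran', 'veteran']
```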

In any case, the debate on what constitutes "fairness" or "equity" will rage till the end of time, whether in politics or video games. My hope is that through conscientious design and greater transparency the industry can move to a place where players can experience both the thrills of competition and the chills of relaxing with friends, regardless of skill level.

🥃 Review #40: Kirkland Islay Single Malt Scotch Whisky (2024)

Like Costco's other Scotches, the Kirkland Signature Islay Single Malt is bottled for Alexander Murray and imported by MISA Imports after being distilled and aged in Scotland (TTB.gov plant registry TX-I-1277). The isle of Islay is one of the southernmost islands in Scotland and is one of the five whisky regions enshrined in law. There are only nine active distilleries on the island, and the Islay style is typified by strong peat or smoky flavors. As a single malt, we know that the juice in this bottle comes entirely from one of those nine! All of the distilleries are significantly smaller than Glenlivet and many of the mainland producers. Taste testing has people split between Caol Ila, Bruichladdich (Port Charlotte), and Bunnahabhain as being the source. Caol Ila and Laphroaig both have done deals for private brand scotch without rights to name the source distillery, but Laphroaig does not match the flavor profile for this bottle. Realistically, Caol Ila is the most likely cand...

Castle & Cairn
