The SC2API package is an R wrapper for Blizzard’s StarCraft II API. As an introduction to the package, it is an interesting and useful exercise to visualize the distribution of Matchmaking Rating (MMR) among the top StarCraft II players. For this exercise, we will limit our scope to the European region and investigate its Grandmaster league (i.e., the top 200 players in the region, although there may be fewer than 200 at any given time depending on player activity).
To begin, we will load the SC2API package and ggplot2, another package that is extremely useful for plotting and visualization.
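A minimal sketch of that setup, assuming both packages are already installed; this is the conventional way to attach them rather than a quotation of the original chunk:

```r
# Attach the API wrapper and the plotting package used in the rest of the walkthrough.
library(SC2API)
library(ggplot2)
```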
To obtain the MMRs of players in the European Grandmaster league, we will first use the get_league_data function to obtain the ladder ID of the Grandmaster ladder for season 43, the season preceding the current one (season 44).
data <- get_league_data(season_id = 43,
                        queue_id = 201,
                        team_type = 0,
                        league_id = 6,
                        host_region = "eu")
The queue_id value 201 selects 1v1 in the current expansion (Legacy of the Void); team_type = 0 means that the teams are arranged beforehand (not particularly useful here, since this only applies to team games); and league_id = 6 refers to the Grandmaster league. To see other choices for these arguments, see the help documentation in ?get_league_data.
Since leagues are often separated into tiers and further separated into divisions, we must specify which ladder ID we are actually looking for. Of course, there is only one tier and one division for Grandmaster league and so finding the appropriate ladder ID is relatively easy. For other leagues, finding a particular ladder ID may be slightly more difficult but can be completed using list indexing.
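The list indexing mentioned above might look like the following sketch. The nested list here is a hypothetical stand-in for the object returned by get_league_data: the field names (tier, division, ladder_id) and the ID value 12345 are assumptions made for illustration, and the real object may be shaped differently (for example, partly simplified into data frames), so it is worth inspecting it with str(data) first.

```r
# Hypothetical stand-in for the object returned by get_league_data().
# Field names and the ladder ID value are illustrative assumptions only.
league <- list(
  tier = list(
    list(
      id = 0,
      division = list(
        list(id = 1, ladder_id = 12345, member_count = 200)
      )
    )
  )
)

# Grandmaster has a single tier and a single division,
# so we take the first element at each level.
ladder_id <- league$tier[[1]]$division[[1]]$ladder_id
ladder_id
```

For a league with several tiers or divisions, the same pattern applies with different indices in place of the two [[1]] selections.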
Now that we have a ladder ID, we can supply it as an argument to the get_ladder_data function to discover the MMRs of players within the league. The host_region argument must be set to “eu”, since that is the region with which we are currently concerned.
Then, we can extract players’ MMR from ladder_data.
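As a rough sketch of these two steps: the real request needs API credentials, so it is shown only as a comment, and the small ladder_data object below is a hypothetical stand-in. That ladder_data has a team element follows from how it is used later in this walkthrough; the column name rating and the position of the ladder ID argument are assumptions, so check names(ladder_data$team) and ?get_ladder_data against the real thing.

```r
# Real call (requires API credentials, so it is left commented out).
# Passing the ladder ID as the first argument is an assumption.
# ladder_data <- get_ladder_data(ladder_id, host_region = "eu")

# Hypothetical stand-in with the assumed shape: a `team` data frame
# holding one row per player. The three ratings are the first three
# MMR values printed later in this walkthrough.
ladder_data <- list(team = data.frame(rating = c(7464, 7151, 7075)))

# Extract the MMR column as a plain numeric vector.
mmr <- ladder_data$team$rating
head(mmr)
```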
To retrieve the MMRs of players in past seasons, the above functions are necessary. However, since the Grandmaster league is, in some sense, special, there is an alternative function we can use to retrieve the MMRs of players currently in Grandmaster (that is, in the current season):
ladder_data_current <- get_gm_leaderboard(2, host_region = "us")
mmrCurrent <- ladder_data_current$mmr
head(mmrCurrent)
#> [1] 7355 7229 6970 6813 6813 6805
Notice that for this function, it doesn’t matter which host region we use, as long as the region_id is set to the region from which we would like to retrieve data. That is,
ladder_data_current2 <- get_gm_leaderboard(2, host_region = "kr")
mmrCurrent2 <- ladder_data_current2$mmr
identical(mmrCurrent, mmrCurrent2)
#> [1] TRUE
For this reason, it is important to refer to the documentation. If the region_id argument is not available, it is likely that the host_region affects the data retrieved.
We will continue the rest of the vignette using the MMRs of players from the last season.
Since we now have the MMRs of players from season 43, we can plot the distribution as a histogram using the ggplot2 package.
ggplot() +
  geom_histogram(aes(x = mmr)) +
  theme_light()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Although this is useful, it would also be interesting to look at the distribution by player race.
First, we will obtain the race of each player in the league. This is accomplished using the sapply function. Then, we create a simple data frame with the MMRs and races. Since there are so few Random players, we will filter them out of the dataset (at the time of writing, there is a single Random player in the European Grandmaster league).
races <- sapply(ladder_data$team$member,
                function(x) x$played_race_count[[1]]$race$en_US)
df <- data.frame(mmr, races)
df <- df[df$races != 'Random', ]
head(df)
#> mmr races
#> 1 7464 Zerg
#> 2 7151 Protoss
#> 3 7075 Terran
#> 4 7004 Terran
#> 5 7002 Terran
#> 6 6882 Zerg
Now, we will once again plot a histogram and look at the separate race distributions using ggplot2 with some arguments to make the plot a little more visually appealing.
ggplot(df, aes(x = mmr, group = races, fill = races)) +
  geom_histogram() +
  scale_fill_manual(values = c('Orange', 'Blue', 'Red')) +
  facet_wrap(~races) +
  theme_light()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Another way of visualizing these distributions is to plot them on the same graph and use kernel density estimation to smooth each one. The plot below shows a smoothed density estimate using a Gaussian kernel and an adjusted bandwidth.
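One way such a plot could be produced is sketched below with geom_density from ggplot2. To keep the example runnable without API access, it uses synthetic placeholder data (df_demo) drawn from normal distributions; these values are not real ladder data, and the bandwidth multiplier adjust = 1.5 and the transparency alpha = 0.4 are illustrative choices rather than the settings behind the original plot.

```r
library(ggplot2)

# Synthetic placeholder data, NOT real ladder data: 60 made-up MMR values
# per race, used only so that the sketch runs on its own.
set.seed(1)
df_demo <- data.frame(
  mmr = c(rnorm(60, mean = 6400, sd = 250),
          rnorm(60, mean = 6450, sd = 250),
          rnorm(60, mean = 6500, sd = 250)),
  races = rep(c("Protoss", "Terran", "Zerg"), each = 60)
)

# Overlay one Gaussian kernel density estimate per race on a single panel.
# `adjust` scales the default bandwidth; values above 1 give smoother curves.
p <- ggplot(df_demo, aes(x = mmr, group = races, fill = races)) +
  geom_density(kernel = "gaussian", adjust = 1.5, alpha = 0.4) +
  scale_fill_manual(values = c("orange", "blue", "red")) +
  theme_light()
print(p)
```

With the real data, replacing df_demo with the df built earlier should give the per-race comparison described above.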