# Pokemon GO shiny rates: a Bayesian perspective

The Silph Road is the largest online and in-person network of Pokemon GO players and researchers. We investigate the question of how accurate their proposed shiny rates are by putting on our Bayesian hat, setting the "consensus" shiny rate as our prior, and using Silph field studies as observed data.

## Background: Silph, shinies, and statistics

The Silph Road organizes regional groups of Pokemon GO players, sets up in-person tournaments, and
conducts field studies to learn about game mechanics of Pokemon GO. Of particular interest to us here is
the *shiny rate*, which is the probability that a Pokemon found in the wild will be shiny (for
non-Pokemon players, this just means it's rare and specially coloured; it's like a trophy). Though not
publicized by the game developer Niantic, this rate has been of great interest to Pokemon GO players
(after all, shinies are not too far off from loot boxes).

Silph publishes field studies to determine shiny rates, and these studies have resulted in two consensus rates: one "standard" rate of 1/450 (used for the vast majority of Pokemon), and one "boosted" rate of 1/150 (used during certain events). Recently, however, those rates have been called into question on the Silph subreddit, saying that they are not consistent with the collected data. I am going to re-examine these findings from a Bayesian perspective.

## Methodology

I went through the Silph archives looking for their shiny rate publications posted this year, and
gathered them into a file `rates.csv`. The null rows in this file are the result of Silph not
reporting exact numbers (e.g., see Spoink ("over 16,500 Spoink") and Adventure Week
("over 30,000 encounters each")). I chose to keep these in the dataset in case someone asks "what
happened?" Additionally, the two rows for the Gligar event reflect an apparent change in the shiny
rate after ~24 hours, which I am taking to be fact.

```
import pandas as pd
rates = pd.read_csv("rates.csv")
rates.sample(5)
```

Let's compute the "rarity", defined as `n_encounters / n_shiny`. A rarity of R means that we
saw shinies at a rate of 1 in R.

```
rates["rarity"] = rates["n_encounters"] / rates["n_shiny"]
rates = rates.dropna()
rates.sample(5)
```

Domain knowledge tells us that there are three classes of shiny rates here: a highly boosted rate (around 1 in 60, for Alolan Exeggutor and Meltan), a boosted rate (which Silph claims is 1 in 150), and a normal rate (which Silph claims is 1 in 450). We can use this to partition the dataset manually, discarding the highly boosted samples because they're not relevant to this debate.

```
boosted = rates[rates["rarity"].between(70, 200)].sort_values("date").reset_index(drop=True)
unboosted = rates[rates["rarity"] > 200].sort_values("date").reset_index(drop=True)
```

```
boosted
```

```
unboosted
```

Let's start with the proposed boosted shiny rate of 1 in 150. We'll come back to the standard one later.

## The boosted shiny rate: the Bayesian way

Frequentist statistics would construct a confidence interval on these rates--it's a simple proportions test--and call it a day. Indeed, that's what both Silph (see every publication they put out) and critics of Silph have done. After constructing this confidence interval, we simply check if 1/150 lies within it.
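For concreteness, here is a minimal sketch of that frequentist check, applied to Sneasel's counts (1,588 encounters, 13 shinies, from the data above). The Wald normal-approximation interval stands in for whatever exact proportions test a given study uses:

```
import math

# Sneasel counts from the Silph data: 13 shinies in 1588 encounters
n, k = 1588, 13
p_hat = k / n

# 95% Wald (normal-approximation) confidence interval for a proportion
z = 1.96
half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - half_width, p_hat + half_width

print(f"observed rate: 1 in {n / k:.0f}")
print(f"95% CI for p: [{lo:.5f}, {hi:.5f}]")
print(f"contains 1/150? {lo <= 1/150 <= hi}")
```

A single event like this one does contain 1/150 in its interval, which is exactly why the yes/no framing is unsatisfying: it can't accumulate evidence across events.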

But we can do better than this yes/no response. Given that we believe that the boosted shiny rate is 1 in 150, the Bayesian way of thinking provides us with a natural way of incorporating this into our analysis: as a prior.

```
import arviz as az
import pymc3 as pm
az.style.use("fivethirtyeight")
```

### Setting priors

Let's use a Beta prior over p, since a Beta can be used as a distribution over probabilities. Using the success rate interpretation of the Beta, our prior will be fairly weak: equivalent to having seen 10 shinies in 1500 encounters. Put otherwise, our prior is that anything between 1 in 100 and 1 in 300 is plausible.

We'll add a second variable, rarity, which is 1 / p as defined before. This makes it easier to use phrases like "1 in 150" or "1 in N," and is more intuitive when talking about extremely small probabilities. Through the rest of this document, we'll mostly focus on the plots of the rarity.

```
with pm.Model() as model:
    p = pm.Beta("p", alpha=10, beta=1490)
    rarity = pm.Deterministic("rarity", 1. / p)
    prior_samples = pm.sample_prior_predictive(samples=10000, model=model)
```

```
axes = az.plot_density(
    prior_samples,
    var_names=["p", "rarity"],
    point_estimate=None,
    credible_interval=0.99,
    shade=0.5,
    figsize=(12, 4),
)
```

From this, we can see that while 1/150 is at the center of our prior beliefs, we wouldn't be surprised
with a rarity of 1 in 100 or 1 in 200 either. This is without having collected any data--if *all*
we had heard was "the shiny rate is 1 in 150," but we weren't sure about that 150 number, this plot
represents a plausible range of values.
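We can double-check this reading of the plot in closed form, assuming `scipy` is available alongside PyMC3:

```
from scipy import stats

# Beta(10, 1490) prior: equivalent to having seen 10 shinies in 1500 encounters
prior = stats.beta(10, 1490)

# central 99% interval for p, and the corresponding rarities
lo, hi = prior.ppf(0.005), prior.ppf(0.995)
print(f"99% of prior mass: p in [{lo:.5f}, {hi:.5f}]")
print(f"...i.e. rarity between 1 in {1/hi:.0f} and 1 in {1/lo:.0f}")
```

Both 1 in 100 and 1 in 300 fall inside this central interval, matching the "plausible but not certain" stance we wanted the prior to encode.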

### Adding data

One advantage of the Bayesian approach is that it lets us add as much or as little data as we have. We will demonstrate how our beliefs in the shiny rate change over time as we show our model more data (i.e., as we progress through time and have more shinies released).
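The mechanics of "adding data as it arrives" are easiest to see with the conjugate Beta-Binomial update, which the sampler below approximates: k shinies in n encounters turns a Beta(alpha, beta) prior into a Beta(alpha + k, beta + n - k) posterior, and updating event by event gives the same answer as updating with everything at once. The two event counts below are illustrative, not Silph data:

```
def update(alpha, beta, n_encounters, n_shiny):
    """Conjugate Beta-Binomial update: shinies add to alpha, non-shinies to beta."""
    return alpha + n_shiny, beta + (n_encounters - n_shiny)

prior = (10, 1490)

# two illustrative events (made-up counts), processed one at a time...
a, b = update(*prior, 2000, 14)
a, b = update(a, b, 1500, 12)
posterior_seq = (a, b)

# ...or both at once: same posterior either way
posterior_batch = update(*prior, 3500, 26)

print(posterior_seq, posterior_batch)  # (36, 4964) (36, 4964)
```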

```
from typing import Tuple

def encounters_and_shiny(df: pd.DataFrame, species_name: str) -> Tuple[float, float]:
    """Given a species name, retrieve the number of encounters and number of shinies."""
    row = df[df.name == species_name].iloc[0]
    return (row["n_encounters"], row["n_shiny"])

assert encounters_and_shiny(boosted, "sneasel") == (1588, 13)
assert encounters_and_shiny(unboosted, "sentret") == (19297, 54)
```

Because each encounter is independently shiny with probability p, a binomial distribution is appropriate for modeling the number of shinies we see. We will use Markov chain Monte Carlo to learn the likely distributions over our parameters (shiny rate and rarity). In lay terms, we will infer a distribution of the most probable values for those parameters, little by little, as we see more data. We'll start with just Bronzor.

```
with model:
    n_encounters, n_shiny = encounters_and_shiny(boosted, "bronzor")
    bronzor = pm.Binomial("bronzor", n=n_encounters, p=p, observed=n_shiny)
    trace = pm.sample(1000, chains=4)
```

```
_ = az.plot_trace(trace)
```

This plot represents what we might have believed in February 2019, after seeing 15 shinies in 2,479 Bronzor encounters. The left curves represent the likely ranges for the shiny rate p and the rarity 1-in-N. For those unfamiliar with MCMC, ignore the fuzzy-caterpillar-like plots on the right; for those familiar with it, this model exhibits excellent sampling behavior.

Notice how we're already seeing that these distributions are a little bit tighter. We see virtually no likelihood of the rate being 1 in 300 now, but earlier we did. Meanwhile, 1 in 150 remains a highly likely shiny rate given our limited data.
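Since the Beta prior is conjugate to the binomial likelihood, this particular posterior also has a closed form we can check the sampler against, using Bronzor's 15 shinies in 2,479 encounters:

```
# Conjugate check: Beta(10, 1490) prior + 15 shinies in 2479 Bronzor encounters
alpha_post = 10 + 15
beta_post = 1490 + (2479 - 15)

posterior_mean = alpha_post / (alpha_post + beta_post)
print(f"posterior mean rate: 1 in {1 / posterior_mean:.0f}")  # 1 in 159
```

A posterior mean around 1 in 159 agrees with the MCMC plot: still consistent with 1 in 150, but already pulling away from the tails of the prior.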

Let's add the next Pokemon we had an event for, Horsea.

```
with model:
    n_encounters, n_shiny = encounters_and_shiny(boosted, "horsea")
    horsea = pm.Binomial("horsea", n=n_encounters, p=p, observed=n_shiny)
    trace = pm.sample(1000, chains=4)
```

```
_ = az.plot_trace(trace)
```

Because we observed a rate of 1 in 114 for Horsea, the posterior for the rarity has now shifted much further left. It is now almost entirely implausible for the shiny rate to be any rarer than 1 in 200, and even 1 in 150 is starting to look unlikely.

The next shiny released was Nidoran M.

```
with model:
    n_encounters, n_shiny = encounters_and_shiny(boosted, "nidoran_m")
    nidoran_m = pm.Binomial("nidoran_m", n=n_encounters, p=p, observed=n_shiny)
    trace = pm.sample(1000, chains=4)
```

```
_ = az.plot_trace(trace)
```

Nidoran's observed rarity was 1 in 107 over 5,700 encounters, shifting our rarity curve even further left, and it's becoming clearer that 1 in 150 is a pretty unlikely shiny rate. Let's do this one more time for Sneasel.

```
with model:
    n_encounters, n_shiny = encounters_and_shiny(boosted, "sneasel")
    sneasel = pm.Binomial("sneasel", n=n_encounters, p=p, observed=n_shiny)
    trace = pm.sample(1000, chains=4)
```

```
_ = az.plot_trace(trace)
```

At this point (perhaps earlier) I would feel confident saying that the shiny rate, whatever it is, is not 1 in 150. The Sneasel event happened in July 2019, and I'm writing this in October, so clearly that wasn't enough for the Pokemon GO community. Fortunately, four more events happened between then and now, and we can pass them all at once.

```
with model:
    n_encounters, n_shiny = encounters_and_shiny(boosted, "poliwag")
    poliwag = pm.Binomial("poliwag", n=n_encounters, p=p, observed=n_shiny)
    n_encounters, n_shiny = encounters_and_shiny(boosted, "gligar_later")
    gligar = pm.Binomial("gligar", n=n_encounters, p=p, observed=n_shiny)
    n_encounters, n_shiny = encounters_and_shiny(boosted, "yanma")
    yanma = pm.Binomial("yanma", n=n_encounters, p=p, observed=n_shiny)
    n_encounters, n_shiny = encounters_and_shiny(boosted, "oddish")
    oddish = pm.Binomial("oddish", n=n_encounters, p=p, observed=n_shiny)
    trace = pm.sample(1000, chains=4)
```

```
_ = az.plot_trace(trace)
```

We can confidently say that **it is extremely unlikely that the boosted shiny rate is 1 in
150.** It is much more plausible that the rate is in the neighborhood of 1 in 120; 1 in 150
doesn't even register on our posterior plot of the rarity.

Notice how natural a fit the Bayesian way of thinking was: we have some prior beliefs (that the rate is 1 in 150), and some data (the Silph studies), and we can marry the two together to get a posterior (the plot we see above). It's clear that the data do not support our prior beliefs, but that's okay; we're researchers, and that's how this is supposed to work.

## The normal shiny rate (supposedly 1 in 450)

Let's look next at the normal shiny rate, which is supposedly 1 in 450. For brevity's sake, I won't take us through the step-by-step process again, but rather pass all the data at once.

```
with pm.Model() as model:
    p = pm.Beta("p", alpha=10, beta=4490)
    rarity = pm.Deterministic("rarity", 1. / p)
    prior_samples = pm.sample_prior_predictive(samples=10000, model=model)
```

```
axes = az.plot_density(
    prior_samples,
    var_names=["p", "rarity"],
    point_estimate=None,
    credible_interval=0.99,
    shade=0.5,
    figsize=(12, 4),
)
```
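Before sampling against the full unboosted dataset, a rough single-species preview hints at where the posterior will land. Using Sentret's counts from the helper-function assertion earlier (54 shinies in 19,297 encounters), the conjugate closed-form update gives:

```
# Conjugate check: Beta(10, 4490) prior + 54 shinies in 19297 Sentret encounters
alpha_post = 10 + 54
beta_post = 4490 + (19297 - 54)

posterior_mean = alpha_post / (alpha_post + beta_post)
print(f"posterior mean rate: 1 in {1 / posterior_mean:.0f}")  # 1 in 372
```

Even one species pulls the posterior mean noticeably away from 1 in 450, so it will be worth watching whether the full dataset tells the same story.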