### Developing a Sport Trading Strategy Part 2: The Law of Total Probability

In the last post we focused on identifying a market to build our strategy around. We also set up the framework for a strict and coherent management system, consisting of a money management plan and a limitation process with clearly defined limitation rules (our statute of limitations... hur hur).

Now we need to apply the framework to past data to get an understanding of its viability as a strategy. To do this we take historical data and convert it into the probability variables that we need. Once we have these values, we can run them through different probability distributions.

### Poisson Distribution

The Poisson distribution equation is a great choice for working out the likely outcome of each event based on historical data. We can feed the equation two teams and ask it for the probability of the final score being x, or y. Let's delve a little deeper.

The Poisson distribution can be used to measure the probability of independent events occurring a certain number of times within a set period, such as the number of goals scored in a football match. Suppose we expect an independent event to occur λ times over a specified time interval. The probability of it occurring exactly x times is then given by the Poisson distribution equation.

The first step is working out what to apply the equation to. In our case the end goal is to work out the probability of games going over 2.5 goals.

x represents the number of goals you want to find the probability for, and the λ parameter is the expected number of goals. In words, the equation is:

probability of (exactly x goals when λ goals are expected) = λ raised to the power x, multiplied by e raised to the power negative λ, divided by x factorial:

$$P(x; \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$
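
The equation translates almost word for word into code. Here is a minimal Python sketch using only the standard library. The function name `poisson_pmf` and the expected-goals value of 1.5 are illustrative assumptions, not figures from this post:

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x goals when lam goals are expected."""
    if x < 0:
        return 0.0  # a negative goal count is impossible
    # lam^x * e^(-lam) / x!
    return (lam ** x) * math.exp(-lam) / math.factorial(x)


# Hypothetical example: a team expected to score 1.5 goals.
for goals in range(5):
    print(goals, round(poisson_pmf(goals, 1.5), 4))
```

Note that x has to be a whole number here, because the factorial is only defined for whole numbers.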

Right, we have our equation for working out the probability of exactly x goals being scored. What we need to do now is work out what the values of x and λ should be.

x is the number of goals we want to find the probability for. The equation only works with whole numbers of goals, so "over 2.5 goals" means three or more: we find the probability for x = 0, 1 and 2 and subtract their sum from 1. λ is the number of goals expected for that game. To work this out we need that lovely historical data. We will be using last season's goals scored/conceded to work out the number of expected goals. (If you have Football Manager with last season's database, it is also good to cross-reference last season's data with it.)
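
A minimal Python sketch of the over 2.5 goals calculation, treating it as one minus the probability of 0, 1 or 2 goals. It assumes the total goals in a game follow a single Poisson distribution. The names `prob_over` and `expected_total` and the value 2.4 are hypothetical and for illustration only:

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x goals when lam goals are expected."""
    return (lam ** x) * math.exp(-lam) / math.factorial(x)


def prob_over(line: float, lam: float) -> float:
    """Probability that the goal count is strictly above the line.

    For line = 2.5 this is P(3 or more) = 1 - P(0) - P(1) - P(2).
    """
    highest_under = math.floor(line)  # largest goal count not above the line
    p_not_over = sum(poisson_pmf(k, lam) for k in range(highest_under + 1))
    return 1.0 - p_not_over


# Hypothetical expected total goals for a game (not a figure from this post).
expected_total = 2.4
print(f"P(over 2.5 goals) = {prob_over(2.5, expected_total):.3f}")
```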

### Working out the probable score

In this example I'll be using two Bundesliga teams to show the working out and the result:

Hoffenheim (Home) vs Wolfsburg (Away)

Before we delve into this, we need to discuss the data we're using. It can't be too old: league stats have a half-life and their value decays over time. This is the result of players ageing, new players coming in, managers changing, financial changes and so on. It also can't be too new, because the probability works best with lots of data. In our case we will be using last season's data plus whatever is available for this season. If you have the latest version of Football Manager, it's a good idea to cross-reference the database (they are pretty accurate at depicting a team's attack/defence).

We need to work out the average number of goals each team scores per game:

To work out the average home goals scored by a team (ah), we divide their total home goals by the total number of home games they've played. For Hoffenheim:
(Note we are explicitly using goals scored at home stats)
ah = (38 / 17) = 2.235

We do the same for the away team (aw), using their away goals/games stats:
(Note we are explicitly using goals scored away stats)
aw = (13 / 17) = 0.765

Now we need to work out the average number of goals conceded per game by each team. To do this, we take the total number of goals conceded and divide it by the total number of games. Let's start with the goals conceded by the home team (ch):
(Note we are explicitly using goals conceded at home stats)

ch = (17 / 17) = 1

And for the away team (ca), using their away goals conceded and total away games:
(Note we are explicitly using goals conceded away stats)
ca = (17 / 17) = 1
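
The four averages above can be reproduced with a small helper. This is a minimal Python sketch: the helper name `per_game` is an assumption for illustration, while the goal and game counts are the ones quoted above:

```python
def per_game(goals: int, games: int) -> float:
    """Average goals per game; refuses to divide by zero games."""
    if games <= 0:
        raise ValueError("games must be a positive number")
    return goals / games


ah = per_game(38, 17)  # Hoffenheim: goals scored at home / home games
aw = per_game(13, 17)  # Wolfsburg: goals scored away / away games
ch = per_game(17, 17)  # Hoffenheim: goals conceded at home / home games
ca = per_game(17, 17)  # Wolfsburg: goals conceded away / away games

print(f"ah={ah:.3f} aw={aw:.3f} ch={ch:.3f} ca={ca:.3f}")
# prints: ah=2.235 aw=0.765 ch=1.000 ca=1.000
```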

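We can now use this data to calculate a team's attack and defence by comparing their average home stats with the league average home stats. Here the league averages are roughly 1.6 goals per game for home teams and 1.2 goals per game for away teams, so a home team concedes about 1.2 per game on average:

Hoffenheim home attack (ha) = (ah / 1.6) = 1.396
Hoffenheim home defence (hd) = (ch / 1.2) = 0.838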

The same can be done with the away team:

Wolfsburg away attack (aa) = (aw / 1.2) = 0.641
Wolfsburg away defence (ad) = (ca / 1.6) = 0.624
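
The attack and defence figures can be sketched in Python as a simple ratio against the league average. The helper name `strength` is an assumption, and the league averages are taken as the rounded 1.6 and 1.2 used in the formulas above, so the third decimal place can differ slightly from the figures in the text:

```python
def strength(team_avg: float, league_avg: float) -> float:
    """Team average relative to the league average (1.0 means league average)."""
    return team_avg / league_avg


# Per-game averages from the worked example above.
ah, aw = 38 / 17, 13 / 17  # goals scored: Hoffenheim at home, Wolfsburg away
ch, ca = 17 / 17, 17 / 17  # goals conceded: Hoffenheim at home, Wolfsburg away

# League averages as quoted in the formulas (rounded values, an assumption here).
LEAGUE_HOME_GOALS = 1.6  # goals per game scored by home teams
LEAGUE_AWAY_GOALS = 1.2  # goals per game scored by away teams

ha = strength(ah, LEAGUE_HOME_GOALS)  # Hoffenheim home attack
hd = strength(ch, LEAGUE_AWAY_GOALS)  # Hoffenheim home defence
aa = strength(aw, LEAGUE_AWAY_GOALS)  # Wolfsburg away attack
ad = strength(ca, LEAGUE_HOME_GOALS)  # Wolfsburg away defence

print(f"ha={ha:.3f} hd={hd:.3f} aa={aa:.3f} ad={ad:.3f}")
```

An attack value above 1 means the team scores more than the league average. A defence value below 1 means it concedes fewer than the league average.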

With the attack and defence statistics for each team we can predict the number of goals expected to be scored by each team using the following formula:

ha * ad * average league home goals per game

The away team's expected goals use the same formula with different variables:

aa * hd * average league away team goals per game

This gives us:

Hoffenheim = 2.000
Wolfsburg = 0.411

This prediction is the average goals expected for this particular game.
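
The two formulas share the same shape, so one function covers both. In the minimal Python sketch below, the function name `expected_goals` is an assumption and the input numbers are made up for illustration. They are not the figures from this post:

```python
def expected_goals(attack: float, defence: float, league_avg: float) -> float:
    """Attack strength * opposing defence strength * league average goals per game."""
    return attack * defence * league_avg


# Hypothetical inputs for illustration only.
ha, ad, league_home_avg = 1.20, 0.90, 1.50  # home attack, away defence, league home goals/game
aa, hd, league_away_avg = 0.80, 1.10, 1.10  # away attack, home defence, league away goals/game

home_goals = expected_goals(ha, ad, league_home_avg)
away_goals = expected_goals(aa, hd, league_away_avg)
print(f"home: {home_goals:.3f}, away: {away_goals:.3f}")
```

The two results are the λ values that the Poisson equation needs for each team.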

I said earlier that we want the probability of more than 2.5 goals being scored but, really, we can find the probability of every score, so let's do that instead. We've covered quite a lot in this post, so I think this is a good point to stop. In the next post we will use the Poisson distribution equation on this data to find value bets, by creating a goals probability table that gives the probability of different scoreline outcomes up to 100% of the book. (If you're not sure what the 'book' is, take a look at the betting terminology post. Hint: 'the book' is discussed in the overrounds section.)

Part 3 is now available