Building a probabilistic programming interpreter

Rob Zinkov
2015-08-25

Very often, interpreters for probabilistic programming languages (PPLs) can seem a little mysterious. In actuality, if you know how to write an interpreter for a simple language, writing one isn’t that much more work.

Using Haskell as the host language, I’ll show how to write a simple PPL which uses importance sampling as the underlying inference method. Nothing specific to Haskell is used here other than pattern-matching, so this example should be pretty easy to port to other languages.

To start, let’s import some things and set up some basic types.
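Here’s a minimal sketch of that setup. The module layout, the Env type, and the choice of Data.Map and System.Random are just one option; the snippets in this post build up a single module, so Val is defined a little further down.

```haskell
module Main where

import qualified Data.Map.Strict as M
import           System.Random   (randomRIO)

-- Variable names, and environments mapping names to values
type Name = String
type Env  = M.Map Name Val
```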

Our language will have functions, doubles, bools, and pairs of those as its values.
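A sketch of that value type (the constructor names are my own shorthand):

```haskell
-- Values: doubles, booleans, functions, and pairs of values
data Val
  = D Double
  | B Bool
  | F (Val -> Val)
  | P Val Val

-- Functions have no sensible Show, so print a placeholder for them
instance Show Val where
  show (D x)   = show x
  show (B b)   = show b
  show (F _)   = "<function>"
  show (P a b) = "(" ++ show a ++ ", " ++ show b ++ ")"
```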

This language will have expressions for these values, conditionals and arithmetic.
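One plausible expression type, consistent with the values above (the exact set of arithmetic and comparison operators doesn’t matter much):

```haskell
-- Expressions: literals, variables, conditionals, arithmetic,
-- comparisons, pairs, and functions
data Expr
  = Lit Val
  | Var Name
  | If   Expr Expr Expr
  | Add  Expr Expr
  | Sub  Expr Expr
  | Mul  Expr Expr
  | Less Expr Expr
  | Pair Expr Expr
  | Lam  Name Expr
  | App  Expr Expr
```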

We can evaluate expressions in this language without doing anything special.
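The evaluator is just a recursive function over Expr. A sketch that matches the types above:

```haskell
-- Evaluate an expression to a value in the given environment
eval :: Env -> Expr -> Val
eval _   (Lit v)    = v
eval env (Var x)    = env M.! x
eval env (If c t e) = case eval env c of
                        B True  -> eval env t
                        B False -> eval env e
                        _       -> error "If expects a boolean condition"
eval env (Add a b)  = arith (+) (eval env a) (eval env b)
eval env (Sub a b)  = arith (-) (eval env a) (eval env b)
eval env (Mul a b)  = arith (*) (eval env a) (eval env b)
eval env (Less a b) = case (eval env a, eval env b) of
                        (D x, D y) -> B (x < y)
                        _          -> error "Less expects doubles"
eval env (Pair a b) = P (eval env a) (eval env b)
eval env (Lam x e)  = F (\v -> eval (M.insert x v env) e)
eval env (App f a)  = case eval env f of
                        F g -> g (eval env a)
                        _   -> error "only functions can be applied"

-- Helper for binary arithmetic on doubles
arith :: (Double -> Double -> Double) -> Val -> Val -> Val
arith op (D x) (D y) = D (op x y)
arith _  _     _     = error "arithmetic expects doubles"
```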

Of course this isn’t a probabilistic programming language. So now we extend our language to include measures.

Let’s take a moment to explain what makes something a measure. Measures can be considered un-normalized probability distributions. If you take the sum of the probabilities of each disjoint outcome from an un-normalized probability distribution, the answer may not be 1.

This is relevant as we will be representing measures as a list of weighted draws from the underlying distribution. Those draws will need to be normalized to be understood as a probability distribution.
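Normalizing just means dividing each weight by the total weight. A small helper makes this concrete (nothing later depends on it):

```haskell
-- Rescale weighted draws so the weights sum to one
normalize :: [(Double, a)] -> [(Double, a)]
normalize draws = [ (w / total, v) | (w, v) <- draws ]
  where total = sum (map fst draws)
```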

We can construct measures in one of three ways. We may simply have the continuous uniform distribution whose bounds are defined as expressions. We may have a weighted distribution which only ever returns the value of its second argument, with the weight given by its first argument. This is only a probability distribution when the first argument evaluates to one; we’ll call this case dirac.

The final form is what lets us build measure expressions. What Bind does is take a measure as input, and a function from draws of that measure to another measure.

Because I don’t feel like defining measurable functions in their own form, Bind also takes a name to set which variable will hold the draws, so the last argument to Bind may simply use that variable when it wants to refer to them. As an example, suppose I wish to take a draw from a uniform distribution and then square that value.
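Putting those three ways of constructing a measure into a datatype, a sketch might look like this (the constructor names follow the prose above):

```haskell
-- Measures
data Meas
  = Uniform Expr Expr    -- continuous uniform; the bounds are expressions
  | Dirac   Expr Expr    -- always returns the second argument, weighted by the first
  | Bind Name Meas Meas  -- draw from one measure, name the draw, continue with another
```

The uniform-then-square example would then be written as:

```haskell
-- Draw x from uniform(0, 1), then return x * x with weight 1
squareIt :: Meas
squareIt = Bind "x" (Uniform (Lit (D 0)) (Lit (D 1)))
                    (Dirac (Lit (D 1)) (Mul (Var "x") (Var "x")))
```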

Measures are evaluated by producing a weighted sample from the measure space they represent. This is also called importance sampling.
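A sketch of that sampler, which returns a single weighted draw per call:

```haskell
-- Importance sampling: produce one weighted draw (weight, value) from a measure
evalM :: Env -> Meas -> IO (Double, Val)
evalM env (Uniform lo hi) =
  case (eval env lo, eval env hi) of
    (D a, D b) -> do x <- randomRIO (a, b)
                     return (1, D x)
    _          -> error "uniform bounds must be doubles"
evalM env (Dirac w e) =
  case eval env w of
    D p -> return (p, eval env e)
    _   -> error "the weight must be a double"
evalM env (Bind x m k) = do
  (w1, v) <- evalM env m                 -- draw from the first measure
  (w2, r) <- evalM (M.insert x v env) k  -- the draw is visible under its name
  return (w1 * w2, r)                    -- weights multiply
```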

We may run these programs as follows:
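For example, something along these lines collects and prints a handful of weighted draws (sample and main are just illustrative names):

```haskell
-- Collect n weighted draws from a measure, starting from an empty environment
sample :: Int -> Meas -> IO [(Double, Val)]
sample n m = mapM (const (evalM M.empty m)) [1 .. n]

main :: IO ()
main = sample 10 squareIt >>= mapM_ print
```

Every weight printed here is 1, since nothing in squareIt reweights its draws.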

Evaluating this program repeatedly will allow you to produce as many draws from this measure as you need. This is great in that we can represent any unconditioned probability distribution. But how do we represent conditional distributions?

For that we will introduce another datatype.
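One way to write it is to mirror Meas, letting each primitive case optionally carry the value it is conditioned on:

```haskell
-- Conditioned measures: each primitive case may carry an observed value
data Cond
  = CUniform Expr Expr (Maybe Expr)  -- Nothing = unconditioned, Just v = observed to be v
  | CDirac   Expr Expr (Maybe Expr)
  | CBind    Name Cond Cond
```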

This is just an extension of Meas, except now a measure is either unconditioned, or, for each case, we may additionally specify which value it is conditioned on. To draw from a conditioned measure, we convert it into an unconditioned measure.
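Here is a sketch that performs this conversion on the fly while sampling: unconditioned cases fall back to evalM, and an observed case simply returns its observation, weighted by evalC (explained next):

```haskell
-- Draw from a conditioned measure: observations are returned as-is,
-- weighted by how likely the underlying measure was to produce them
evalCond :: Env -> Cond -> IO (Double, Val)
evalCond env (CUniform lo hi Nothing)   = evalM env (Uniform lo hi)
evalCond env (CUniform lo hi (Just ob)) =
  let v = eval env ob
  in  return (evalC env (Uniform lo hi) v, v)
evalCond env (CDirac w e Nothing)       = evalM env (Dirac w e)
evalCond env (CDirac w e (Just ob))     =
  let v = eval env ob
  in  return (evalC env (Dirac w e) v, v)
evalCond env (CBind x m k) = do
  (w1, v) <- evalCond env m
  (w2, r) <- evalCond (M.insert x v env) k
  return (w1 * w2, r)
```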

What evalC does is determine what weight to assign to a measure when we know it will produce a particular value. This weight is the probability of getting that value from the measure.
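A sketch of evalC for the two primitive measures (for the uniform this is its density on the interval; conditioning directly on a Bind is not handled):

```haskell
-- evalC: the weight a measure assigns to producing a particular value
evalC :: Env -> Meas -> Val -> Double
evalC env (Uniform lo hi) (D x) =
  case (eval env lo, eval env hi) of
    (D a, D b) | a <= x && x <= b -> 1 / (b - a)  -- density of uniform(a, b)
               | otherwise        -> 0            -- the value lies outside the bounds
    _          -> error "uniform bounds must be doubles"
evalC env (Dirac w e) v =
  case eval env w of
    D p | eval env e `sameVal` v -> p             -- the observation matches the point mass
        | otherwise              -> 0
    _   -> error "the weight must be a double"
evalC _ _ _ = 0                                   -- anything else gets no weight

-- Structural equality on first-order values (functions are never equal)
sameVal :: Val -> Val -> Bool
sameVal (D x)   (D y)   = x == y
sameVal (B x)   (B y)   = x == y
sameVal (P a b) (P c d) = sameVal a c && sameVal b d
sameVal _       _       = False
```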

And that’s all you need to express probabilistic programs. Take the following example. Suppose we have two random variables x and y, where the value of y depends on x:

x <~ uniform(1, 5)
y <~ uniform(x, 7)

What’s the conditional distribution on x given y is 3?
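In the little Cond language sketched above, this program can be written by attaching the observation 3 to y’s uniform and returning x:

```haskell
-- x ~ uniform(1, 5); y ~ uniform(x, 7), observed to be 3; return x
condExample :: Cond
condExample =
  CBind "x" (CUniform (Lit (D 1)) (Lit (D 5)) Nothing)
            (CBind "y" (CUniform (Var "x") (Lit (D 7)) (Just (Lit (D 3))))
                       (CDirac (Lit (D 1)) (Var "x") Nothing))

-- Weighted draws from the conditional distribution of x given y = 3
posterior :: IO [(Double, Val)]
posterior = mapM (const (evalCond M.empty condExample)) [1 .. (10 :: Int)]
```

Running posterior >>= mapM_ print prints each draw of x alongside its weight.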

As you can see, any value of x above 3 gets a weight of 0, because it would then be impossible to observe y as 3.

Further reading

For small problems, this implementation is actually fairly capable. It can be extended to support more probability distributions in a straightforward way.
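For instance, with the sketch above, adding a normal distribution would mean a new Meas constructor (say Normal Expr Expr for the mean and standard deviation, a hypothetical extension), one sampling case for evalM, and one density case for evalC. The two pieces of arithmetic involved might look like this:

```haskell
-- Sampling a normal via the Box-Muller transform, for an evalM case
sampleNormal :: Double -> Double -> IO Double
sampleNormal mu sigma = do
  u1 <- randomRIO (1e-12, 1)  -- avoid taking the log of zero
  u2 <- randomRIO (0, 1)
  let z = sqrt (-2 * log u1) * cos (2 * pi * u2)
  return (mu + sigma * z)

-- Density of a normal, for an evalC case
normalDensity :: Double -> Double -> Double -> Double
normalDensity mu sigma x =
  exp (negate ((x - mu) * (x - mu)) / (2 * sigma * sigma)) / (sigma * sqrt (2 * pi))
```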

If you are interested in more advanced interpreters I suggest reading the following.