:e ImportQualifiedPost
:e FlexibleContexts
:e BlockArguments
:e TupleSections
:e FlexibleContexts
:e OverloadedStrings
:e LambdaCase
import Control.Monad.Bayes.Class
import Control.Monad.Bayes.Enumerator
import Control.Monad.Bayes.Weighted
import Control.Monad.Bayes.Sampler.Strict
import Control.Monad
import Data.List
import Data.Ord
import Control.Arrow (first)
import Data.Text (pack, Text)
import Numeric.Log
import Control.Arrow (second)
:l ../plotting.hs
Before discussing inference, we should understand how to sample from models. This notebook explains how to do that.
We'll start with a very simple model, namely:
model :: MonadDistribution m => m Bool
model = bernoulli 0.7
To take a sample, do:
sampler model
False
Or with a fixed seed:
sampleIOfixed model
True
To take multiple samples, you could rerun sampler
many times, but it's somewhat more in the probabilistic programming spirit to instead define a distribution over multiple draws from model
and then just sample once, as follows:
multipleDraws :: MonadDistribution m => m [Bool]
multipleDraws = replicateM 10 model
draws <- sampler multipleDraws
draws
[False,True,True,True,True,True,False,False,True,True]
We convert a list of samples to an empirical distribution, where toEmpirical
function returns list of distribution values with share of their occurence in the :
emp = toEmpirical draws :: [(Bool, Double)]
emp
[(True,0.7),(False,0.3)]
plot $ first (pack . show) <$> emp
In fact, we could lean even further into the spirit of probabilistic programming, and transform model
into a distribution over plots, and sample from that:
distributionOverPlots :: MonadDistribution m => m VegaLiteLab -- the type of plots
distributionOverPlots = plot . fmap (first (pack . show)) . toEmpirical <$> replicateM 10 model
sampler distributionOverPlots
sampler $ toEmpirical <$> replicateM 10 model
:t toEmpirical <$> replicateM 10 model
[(True,0.7),(False,0.3)]
Consider normal distribution:
model2 :: MonadDistribution m => m Double
model2 = normal 0 1
Sampling is no different to before:
sampler model2
-4.521959900261888e-2
And as before, to obtain multiple draws:
multipleDraws2 :: MonadDistribution m => m [Double]
multipleDraws2 = replicateM 10 model2
draws2 <- sampler multipleDraws2
draws2
[-0.6216690402229845,-0.47608570111519477,0.4670018245275328,-0.731326261945339,1.8293339635456327,-0.31691850879028954,-0.1522034252473077,-0.4871801689979811,0.4312750347264313,0.6641920998609923]
We'd like to view a histogram of samples, which in the limit of many samples should tend to the PDF of a normal distribution.
Again, we could generate a histogram from a drawn list of samples:
constWeight = 1 :: Log Double
samples <- sampler $ replicateM 100000 ((,constWeight) <$> model2)
plot . histogramToList . histogram 100 $ samples
Note, that the type of histogram is:
histogram :: Int -> [(Double, Log Double)] -> Histogram
because it is used in more general context for weighted samples (see next chapter).\
That is why we need to introduce a constant weight for our simple example by mapping samples to a pair (value, score) with (,constWeight) <$>
.
sampler $ replicateM 3 model2
[0.40485488235651435,1.1348389537359198,-1.092972060021726]
but it's nicer to apply a histogram
function to multiple draws to define a distribution over histograms from which we'll sample.\
In that approach a distribution is composed and transformed first and the sampling is clearly split from the distribution definition.
sampler $ plot . histogramToList . histogram 100 <$> replicateM 100000 ((,constWeight) <$> model2)
Mixture model:
model3 = do
p <- bernoulli 0.7
if p then normal 0 1 else normal 3 1
sampler $
plot . histogramToList . histogram 200
<$> replicateM 100000 (weighted model3)
For models with factor statements, we cannot just sample. For example, consider a distribution which is conditioned to positive values only:
model4 = do
p <- bernoulli 0.7
out <- if p then normal 0 1 else normal 3 1
condition (out > 0)
return out
condition
is a special case of factor
statement for binary factors.
Here, we use weighted
to convert model4
into a distribution over pairs of numbers and their weights. We then sample from that. To view
sampler $
plot . histogramToList . histogram 200
<$> replicateM 100000 (weighted model4)
We could ignore the condition
statement using unweighted
, as in:
sampler . unweighted $ plot . histogramToList . histogram 200 <$> replicateM 100000 (weighted $ unweighted model4)
But obviously, we'd like to take samples from the posterior. We can do so as follows:
draws4 <- sampler $ replicateM 10 $ weighted model4
draws4
[(4.22455827914325,1.0),(0.906234848690838,1.0),(2.6860029976528135,1.0),(-0.17404924439421926,0.0),(-1.1044111213397145,0.0),(-0.5604681812378783,0.0),(-0.1850630831580971,0.0),(3.354382098966137,1.0),(2.641529507690727,1.0),(0.7142981214125107,1.0)]
model5 :: MonadMeasure m => m Double
model5 = do
x <- normal 0 5
(factor . Exp) (cos x)
return x
sampler $
plot . histogramToList . histogram 200
<$> replicateM 100000 (weighted model5)