# init repo notebook
!git clone https://github.com/rramosp/ppdl.git > /dev/null 2> /dev/null
!mv -n ppdl/content/init.py ppdl/content/local . 2> /dev/null
!pip install -r ppdl/content/requirements.txt > /dev/null

LAB 05.02.02

In this laboratory you’ll implement a Bayesian model for an unfair die.

## Ignore this cell
!pip install ppdl==0.1.5 rlxmoocapi==0.1.0 --quiet
import inspect
from rlxmoocapi import submit, session
course_id = "ppdl.v1"
endpoint = "https://m5knaekxo6.execute-api.us-west-2.amazonaws.com/dev-v0001/rlxmooc"
lab = "L05.02.02"

Log in with your username and password:

session.LoginSequence(
    endpoint=endpoint,
    course_id=course_id,
    lab_id=lab,
    varname="student"
    );

First, let us import the required libraries:

import numpy as np
import matplotlib.pyplot as plt
from itertools import product
from collections import Counter

Suppose you bought a die from a store and rolled it. The results make you suspect that the die is loaded and has some kind of bias.

You decided to record which face appeared in each of 100 trials:

results = {1: 16, 2: 16, 3: 16, 4: 16, 5: 16, 6: 20}
fig, ax = plt.subplots(figsize=(10, 7))
ax.bar(results.keys(), results.values())
ax.set_xlabel("Face")
ax.set_ylabel("Counts")
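
To put the counts on the same scale as the probabilities used in the rest of the lab, they can be normalized into relative frequencies and compared with the fair-die value of 1/6. This is a minimal sketch; the names `frequencies` and `deviation` are illustrative and not part of the lab API.

```python
# Observed counts from the 100 recorded rolls (copied from the cell above).
results = {1: 16, 2: 16, 3: 16, 4: 16, 5: 16, 6: 20}

total = sum(results.values())

# Relative frequency of each face.
frequencies = {face: count / total for face, count in results.items()}

# Difference between each frequency and the fair-die probability 1/6.
deviation = {face: freq - 1 / 6 for face, freq in frequencies.items()}

for face in results:
    print(f"face {face}: frequency={frequencies[face]:.3f}, "
          f"deviation={deviation[face]:+.3f}")
```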

You noted that your die could be biased. After you called the manufacturer, they told you that it may be a problem with the manufacturing machine and provided the following information:

  • The dice are assembled from two different materials: material A, which is twice as heavy as material B.

  • Each die is assembled from independent faces, and each face is produced with different processes and machines.

  • The manufacturer said that the probability of failure (at least one face is made from a different material than the rest) is 0.05.

Using this information, you decided to build a Bayesian model for the die. First, the prior distribution:

Task 1

Implement the prior function to compute the prior value for different combinations of materials.

HINT: your prior function must be a valid probability distribution, so it must satisfy:

\[ \sum_{\mathbf{m}} P(\mathbf{m}) = 1 \]

Where \(\mathbf{m}\) is a combination of materials for each face, e.g., \(\mathbf{m} = ("A", "B", "A", "B", "A", "A")\)
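
One possible reading of the manufacturer's information, sketched below, splits the mass \(1 - 0.05\) equally between the two single-material dice and spreads the failure mass \(0.05\) uniformly over the remaining \(2^6 - 2 = 62\) mixed combinations. Both the equal split and the uniform spread are assumptions, although they agree with the prior values printed in Task 4. The names `sketch_prior` and `P_FAILURE` are hypothetical and this is not the official solution.

```python
from itertools import product

P_FAILURE = 0.05  # manufacturer's stated probability of a mixed-material die


def sketch_prior(materials):
    """Hypothetical prior P(m): one possible reading of the problem."""
    kinds = set(materials.values())
    n_faces = len(materials)
    if len(kinds) == 1:
        # The two single-material dice (all "A" or all "B") are assumed
        # to share the non-failure mass equally.
        return (1 - P_FAILURE) / 2
    # The remaining 2**n_faces - 2 mixed combinations are assumed to
    # share the failure mass uniformly.
    return P_FAILURE / (2 ** n_faces - 2)


# Enumerate all 64 material assignments as {face: material} dictionaries.
combos = [dict(enumerate(c, start=1)) for c in product("AB", repeat=6)]
total = sum(sketch_prior(m) for m in combos)
print(total)
```

The final sum is the normalization condition from the hint, evaluated over every combination.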

def prior(materials):
    ...

You can use the following cell to validate if your prior implementation is correct:

prior_sum = sum(
        prior({
            face + 1: material
            for face, material in enumerate(combination)
            })
        for combination in product(*("AB" for _ in range(6)))
        )
assert np.allclose(1, prior_sum)
student.submit_task(namespace=globals(), task_id="T1");

Task 2

Implement the likelihood function \(P(\mathbf{x} | \mathbf{m})\) as a categorical distribution with the following conditions:

  • If all the faces have the same material, then the distribution is uniform.

  • If a face is from material A, then its probability is two times the probability of any face from material B.
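
Both conditions can be met by giving each face a relative weight (2 for material A, 1 for material B) and normalizing by the total weight. The sketch below is one way to do that; `WEIGHTS` and `sketch_likelihood` are hypothetical names, and this is not necessarily the reference solution.

```python
# Assumed relative weights: a face of material A is twice as likely
# as a face of material B.
WEIGHTS = {"A": 2.0, "B": 1.0}


def sketch_likelihood(x, materials):
    """Hypothetical categorical likelihood P(x | m) from normalized weights."""
    total_weight = sum(WEIGHTS[material] for material in materials.values())
    return WEIGHTS[materials[x]] / total_weight


# One A face and five B faces: face 1 has weight 2 out of a total of 7.
one_a = {1: "A", 2: "B", 3: "B", 4: "B", 5: "B", 6: "B"}
print(sketch_likelihood(x=1, materials=one_a))
```

When every face has the same material, all weights are equal and the normalization yields the uniform distribution.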

def likelihood(x, materials):
    ...

You can test your code with the following cases:

materials = {1: "A", 2: "B", 3: "B", 4: "B", 5: "B", 6: "B"}
likelihood(x=1, materials=materials)
❱ likelihood(x=1, materials=materials)
0.2857142857142857
materials = {1: "A", 2: "A", 3: "A", 4: "A", 5: "A", 6: "A"}
likelihood(x=5, materials=materials)
❱ likelihood(x=5, materials=materials)
0.16666666666666666
materials = {1: "B", 2: "B", 3: "B", 4: "A", 5: "A", 6: "A"}
likelihood(x=1, materials=materials)
❱ likelihood(x=1, materials=materials)
0.1111111111111111
student.submit_task(namespace=globals(), task_id="T2");

Task 3

Compute the evidence \(P(x)\) using the likelihood and the prior.
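
Written out, the evidence marginalizes the likelihood over every material combination, weighted by the prior. With six faces and two materials, the sum has \(2^6 = 64\) terms:

\[ P(x) = \sum_{\mathbf{m}} P(x | \mathbf{m}) \, P(\mathbf{m}) \]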

def evidence(x):
    acum = 0
    for combination in product(*("AB" for _ in range(6))):
        materials = {
                face + 1: material
                for face, material in enumerate(combination)
                }
        prior_i = prior(materials)
        likelihood_i = likelihood(x=x, materials=materials)
        acum += likelihood_i * prior_i
    return acum
evidence(x=6)
student.submit_task(namespace=globals(), task_id="T3");

Task 4

Compute the posterior distribution \(P(\mathbf{m} | \mathbf{x})\), that is, the probability of a given material combination \(\mathbf{m}\) after observing face \(\mathbf{x}\).
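
The posterior follows from Bayes' rule, which combines the three quantities from the previous tasks (prior, likelihood, and evidence):

\[ P(\mathbf{m} | \mathbf{x}) = \frac{P(\mathbf{x} | \mathbf{m}) \, P(\mathbf{m})}{P(\mathbf{x})} \]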

def posterior(m, x):
    ...

Use the following cells to evaluate your code (you should see a difference between the prior and the posterior for the mixed-material combination):

m = {1: "A", 2: "B", 3: "A", 4: "B", 5: "A", 6: "B"}
print(prior(m))
print(posterior(m=m, x=1)) # posterior must change since the die is biased
❱ print(prior(m))
0.0008064516129032258

❱ print(posterior(m=m, x=1))
0.0010752688172042998
m = {1: "A", 2: "A", 3: "A", 4: "A", 5: "A", 6: "A"}
print(prior(m))
print(posterior(m=m, x=1)) # posterior must not change since the die is fair
❱ print(prior(m))
0.475

❱ print(posterior(m=m, x=1))
0.4749999999999995
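
As a hand check of the first printed posterior, the arithmetic can be reproduced from numbers already shown in this lab. The sketch assumes the Task 2 weighting (three A faces with weight 2 and three B faces with weight 1, so \(P(x=1 | \mathbf{m}) = 2/9\)) and infers \(P(x=1) = 1/6\) from the all-A case, where the likelihood is \(1/6\) and the posterior equals the prior. The variable names are illustrative.

```python
# Values taken from the printed outputs and the Task 2 rules.
prior_m = 0.0008064516129032258  # printed prior for m = A, B, A, B, A, B
likelihood_m = 2 / 9             # face 1 is A: weight 2 over 3*2 + 3*1 = 9
evidence_x = 1 / 6               # inferred from the all-A case (assumption)

# Bayes' rule: P(m | x) = P(x | m) * P(m) / P(x)
posterior_m = likelihood_m * prior_m / evidence_x
print(posterior_m)
```

The result should agree with the printed posterior for the mixed combination up to floating-point rounding.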