Unit 6: Uncertainty (Bayesian Networks and Markov Models)

Discussion Assignment 6
 

Probability

All Possible Worlds

Possible World Instance
A possible world, written ω, is one complete outcome that an uncertain situation can take.
Example: the possible results of throwing dice, where each combination of faces is one possible world
(figure: the possible worlds of a dice throw)
Probability of Possible Worlds
Each possible world ω is assigned a probability P(ω).
Rules
  • 0 ≤ P(ω) ≤ 1, where 0 marks an impossible event and 1 a certain event
  • ∑ over all possible worlds ω of P(ω) = 1: the probabilities of all possible worlds sum to 1
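These rules can be checked by direct enumeration. Below is a minimal Python sketch (mine, not from the lecture); the names worlds and p are illustrative.

```python
# A minimal sketch: enumerate the 36 possible worlds of throwing two
# dice and verify the two probability rules numerically.
from itertools import product

# Each possible world is a (first die, second die) pair; all 36 worlds
# are equally likely, so P(world) = 1/36.
worlds = list(product(range(1, 7), repeat=2))
p = {world: 1 / 36 for world in worlds}

assert all(0 <= prob <= 1 for prob in p.values())  # 0 <= P(w) <= 1
assert abs(sum(p.values()) - 1) < 1e-9             # all P(w) sum to 1

# The probability of an event is the total probability of the worlds
# where it holds, e.g. P(sum of the dice = 7):
print(sum(prob for world, prob in p.items() if sum(world) == 7))  # ≈ 1/6
```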
 

Unconditional Probability

Unconditional probability is the degree of belief in a proposition in the absence of any other evidence. All the questions that we have asked so far were questions of unconditional probability, because the result of rolling a die is not dependent on previous events.
 

 

Conditional Probability

Degree of belief in a proposition given some evidence that has already been revealed
P(a | b): the probability that a is true given that we already know that b is true
a: the event we want the probability of
b: the information we already know for certain about the world
 
Example
P(rain today | rain yesterday): the probability that it is raining today given that it was raining yesterday
 
Mathematical Relation
P(a | b) = P(a ∧ b) / P(b)
Equivalently: P(a ∧ b) = P(b) P(a | b)
 
Example
What is the probability that the sum of two dice equals 12, given that the first roll was a 6?
P(sum = 12 | first = 6) = P(sum = 12 ∧ first = 6) / P(first = 6)
The only world where both hold is (6, 6), so P(sum = 12 ∧ first = 6) = 1/36, while P(first = 6) = 1/6.
P(sum = 12 | first = 6) = (1/36) / (1/6) = 1/6
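The same answer can be confirmed by brute-force enumeration of the possible worlds; a small illustrative sketch (not from the notes):

```python
# Verify P(sum = 12 | first = 6) by counting equally likely worlds.
from itertools import product

worlds = list(product(range(1, 7), repeat=2))

p_first = sum(1 for w in worlds if w[0] == 6) / 36                   # P(first = 6) = 1/6
p_joint = sum(1 for w in worlds if w[0] == 6 and sum(w) == 12) / 36  # P(sum = 12 ∧ first = 6) = 1/36
print(p_joint / p_first)  # ≈ 1/6
```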
 

Random Variable

A variable in probability theory with a domain of possible values it can take on
Example
  • Variable: Dice Roll
  • Domain: {1,2,3,4,5,6}
 
Example
  • Variable: Flight
  • Domain: {on time, delayed, cancelled}
 
Often, we are interested in the probability with which each value occurs. We represent this using a probability distribution. For example,
  • P(Flight = on time) = 0.6
  • P(Flight = delayed) = 0.3
  • P(Flight = cancelled) = 0.1
A probability distribution can be represented more succinctly as a vector. For example, P(Flight) = ⟨0.6, 0.3, 0.1⟩, where the values keep the order of the domain: on time 0.6, delayed 0.3, cancelled 0.1.
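In code, such a distribution is naturally a mapping from domain values to probabilities; the name flight below is illustrative:

```python
# The distribution of the random variable Flight as a mapping from
# each value in its domain to that value's probability.
flight = {"on time": 0.6, "delayed": 0.3, "cancelled": 0.1}

# The probabilities across the whole domain must sum to 1.
assert abs(sum(flight.values()) - 1) < 1e-9
print(flight["delayed"])  # 0.3
```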
 

Independence

Independence is the knowledge that the occurrence of one event does not affect the probability of the other event.
 
Examples
Independent Events
When rolling two dice, the result of each die is independent from the other. Rolling a 4 with the first die does not influence the value of the second die that we roll.
 
Dependent Events
This is opposed to dependent events, like clouds in the morning and rain in the afternoon. If it is cloudy in the morning, it is more likely that it will rain in the afternoon, so these events are dependent.
 
Mathematical Relation
Independence can be defined mathematically: events a and b are independent if and only if the probability of a and b is equal to the probability of a times the probability of b: P(a ∧ b) = P(a) P(b)
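The condition can be verified numerically for the two-dice example; an illustrative sketch:

```python
# Check that the two dice rolls are independent:
# P(first = 4 ∧ second = 2) should equal P(first = 4) * P(second = 2).
from itertools import product

worlds = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event: fraction of equally likely worlds where it holds."""
    return sum(1 for w in worlds if event(w)) / len(worlds)

p_a = prob(lambda w: w[0] == 4)                 # P(first = 4)  = 1/6
p_b = prob(lambda w: w[1] == 2)                 # P(second = 2) = 1/6
p_ab = prob(lambda w: w[0] == 4 and w[1] == 2)  # P(first = 4 ∧ second = 2) = 1/36

print(abs(p_ab - p_a * p_b) < 1e-12)  # True: the events are independent
```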

Bayes’ Rule

P(b | a) = P(a | b) P(b) / P(a)
Knowing P(a | b), in addition to P(a) and P(b), allows us to calculate P(b | a). This is helpful, because:
  • Knowing the conditional probability of a visible effect given an unknown cause, P(visible effect | unknown cause), allows us to calculate the probability of the unknown cause given the visible effect, P(unknown cause | visible effect).
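A one-line function captures the rule. The numbers in the example are hypothetical (not from the notes): suppose 80% of rainy afternoons start with cloudy mornings, 40% of mornings are cloudy, and 10% of afternoons are rainy.

```python
def bayes(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """Bayes' rule: P(b | a) = P(a | b) * P(b) / P(a)."""
    return p_a_given_b * p_b / p_a

# Hypothetical numbers: P(clouds | rain) = 0.8, P(rain) = 0.1, P(clouds) = 0.4
print(bayes(0.8, 0.1, 0.4))  # P(rain | clouds) = 0.2 (up to rounding)
```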
 

Joint Probability

Joint probability is the likelihood of multiple events all occurring.
Example
Let us consider the following example, concerning the probabilities of clouds in the morning and rain in the afternoon.
| C = cloud | C = ¬cloud |
| --- | --- |
| 0.4 | 0.6 |

| R = rain | R = ¬rain |
| --- | --- |
| 0.1 | 0.9 |
Looking at these data, we can’t say whether clouds in the morning are related to the likelihood of rain in the afternoon. To be able to do so, we need to look at the joint probabilities of all the possible outcomes of the two variables. We can represent this in a table as follows:
|  | R = rain | R = ¬rain |
| --- | --- | --- |
| C = cloud | 0.08 | 0.32 |
| C = ¬cloud | 0.02 | 0.58 |

  • Using joint probabilities, we can deduce conditional probability.
    • For example, if we are interested in the probability distribution of clouds in the morning given rain in the afternoon: P(C | rain) = P(C, rain) / P(rain)
    • In words, we divide the joint probability of rain and clouds by the probability of rain.
      • A side note: in probability, commas and ∧ are used interchangeably. Thus, P(C, rain) = P(C ∧ rain).
    • It is possible to view 1 / P(rain) as some constant α by which P(C, rain) is multiplied. Thus, we can rewrite P(C | rain) = α P(C, rain), or α ⟨0.08, 0.02⟩ = ⟨0.8, 0.2⟩, based on the tables above.
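The normalization trick translates directly into code; a small sketch using the joint values from the table above:

```python
# Conditional distribution from a joint table: P(C | rain) = α P(C, rain).
joint_with_rain = {"cloud": 0.08, "no cloud": 0.02}  # the R = rain column

alpha = 1 / sum(joint_with_rain.values())  # 1 / P(rain) = 1 / 0.1 = 10
p_c_given_rain = {c: alpha * p for c, p in joint_with_rain.items()}
print(p_c_given_rain)  # ≈ {'cloud': 0.8, 'no cloud': 0.2}
```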

Probability Rules

  • Negation: P(¬a) = 1 − P(a)
  • Inclusion-Exclusion: P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
    • Subtracting P(a ∧ b) excludes the double-counted cases
  • Marginalization: P(a) = P(a, b) + P(a, ¬b)
    • It allows us to go from joint distributions to individual probabilities
    • For random variables with larger domains: P(X = xᵢ) = ∑ⱼ P(X = xᵢ, Y = yⱼ)
      Example, from the joint table above: P(C = cloud) = P(C = cloud, R = rain) + P(C = cloud, R = ¬rain) = 0.08 + 0.32 = 0.4
 
  • Conditioning: P(a) = P(a | b) P(b) + P(a | ¬b) P(¬b)
    • Similar to marginalization, but this rule uses conditional probabilities instead of joint probabilities
    • For random variables: P(X = xᵢ) = ∑ⱼ P(X = xᵢ | Y = yⱼ) P(Y = yⱼ)
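Both rules can be sanity-checked against the cloud/rain joint table above; an illustrative sketch:

```python
# Joint distribution P(C, R) from the table above.
joint = {
    ("cloud", "rain"): 0.08, ("cloud", "no rain"): 0.32,
    ("no cloud", "rain"): 0.02, ("no cloud", "no rain"): 0.58,
}

# Marginalization: P(C = cloud) = P(cloud, rain) + P(cloud, no rain)
p_cloud = sum(p for (c, r), p in joint.items() if c == "cloud")
print(p_cloud)  # ≈ 0.4

# Conditioning: P(cloud) = P(cloud | rain)P(rain) + P(cloud | ¬rain)P(¬rain)
p_rain = sum(p for (c, r), p in joint.items() if r == "rain")       # 0.1
p_cloud_given_rain = joint[("cloud", "rain")] / p_rain              # 0.8
p_cloud_given_no_rain = joint[("cloud", "no rain")] / (1 - p_rain)  # ≈ 0.356
print(p_cloud_given_rain * p_rain
      + p_cloud_given_no_rain * (1 - p_rain))  # ≈ 0.4, matching marginalization
```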
 

Bayesian network

  • A data structure that represents the dependencies among random variables.
  • It is one of the common Probability Models
  • Bayesian networks have the following properties:
    • They are directed graphs.
    • Each node on the graph represents a random variable.
    • An arrow from X to Y represents that X is a parent of Y. That is, the probability distribution of Y depends on the value of X.
    • Each node X has a probability distribution P(X | Parents(X)).
      • Parents can be thought of as causes of their children (the effects)
Example
(figure: the Bayesian network for this example, with edges Rain → Maintenance, Rain → Train, Maintenance → Train, and Train → Appointment)
  • The graph illustrates the dependencies among the random variables and how their joint probability factorizes via the chain rule
  • Bayesian network from the top down:
    • Rain is the root node in this network. This means that its probability distribution is not reliant on any prior event. In our example, Rain is a random variable that can take the values {none, light, heavy} with the following probability distribution:
      | none | light | heavy |
      | --- | --- | --- |
      | 0.7 | 0.2 | 0.1 |
    • Maintenance, in our example, encodes whether there is train track maintenance, taking the values {yes, no}. Rain is a parent node of Maintenance, which means that the probability distribution of Maintenance is affected by Rain.
      | R | yes | no |
      | --- | --- | --- |
      | none | 0.4 | 0.6 |
      | light | 0.2 | 0.8 |
      | heavy | 0.1 | 0.9 |
    • Train is the variable that encodes whether the train is on time or delayed, taking the values {on time, delayed}. Note that Train has arrows pointing to it from both Maintenance and Rain. This means that both are parents of Train, and their values affect the probability distribution of Train.
      | R | M | on time | delayed |
      | --- | --- | --- | --- |
      | none | yes | 0.8 | 0.2 |
      | none | no | 0.9 | 0.1 |
      | light | yes | 0.6 | 0.4 |
      | light | no | 0.7 | 0.3 |
      | heavy | yes | 0.4 | 0.6 |
      | heavy | no | 0.5 | 0.5 |
    • Appointment is a random variable that represents whether we attend our appointment, taking the values {attend, miss}. Note that its only parent is Train. This point about Bayesian networks is noteworthy: parents include only direct relations. It is true that maintenance affects whether the train is on time, and whether the train is on time affects whether we attend the appointment. However, in the end, what directly affects our chances of attending the appointment is whether the train came on time, and this is what is represented in the Bayesian network. For example, even if the train came on time, there could have been heavy rain and track maintenance, but that has no effect on whether we made it to our appointment.
      | T | attend | miss |
      | --- | --- | --- |
      | on time | 0.9 | 0.1 |
      | delayed | 0.6 | 0.4 |
  • To find, for example, the probability that we miss the appointment on a day with light rain, no maintenance, and a delayed train, we compute the joint probability with the chain rule (a code sketch follows below):
    • P(light, no, delayed, miss) = P(light) P(no | light) P(delayed | light, no) P(miss | delayed) = 0.2 × 0.8 × 0.3 × 0.4 = 0.0192
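A minimal sketch of that chain-rule computation (the variable names are mine; the numbers come from the tables above):

```python
# Chain rule for the network: P(r, m, t, a) = P(r) P(m|r) P(t|r,m) P(a|t).
# The values below are read straight from the tables above.
p_light = 0.2               # P(Rain = light)
p_no_given_light = 0.8      # P(Maintenance = no | light)
p_delayed_given = 0.3       # P(Train = delayed | light, no)
p_miss_given_delayed = 0.4  # P(Appointment = miss | delayed)

print(p_light * p_no_given_light * p_delayed_given * p_miss_given_delayed)
# ≈ 0.0192 (up to floating-point rounding)
```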
 

Inference

  • Query X: the variable for which we want to compute the distribution
  • Evidence variables E: the observed variables, with observed event e
  • Hidden variables Y: variables that aren't the query and also haven't been observed
    • Similar in spirit to hidden layers in an ANN
  • Goal: calculate P(X | e)
    • That can be done through marginalization over the hidden variables
 
Inference by Enumeration
Inference by enumeration is the process of finding the probability distribution of variable X given observed evidence e and some hidden variables Y:
P(X | e) = α P(X, e) = α ∑ P(X, e, y), summing over all values y that the hidden variables Y can take, with α the normalization constant.
 
📌
Check the lecture notes for a coding example
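The lecture notes contain their own coding example; as a dependency-free illustration, here is a hedged sketch of inference by enumeration for this specific network, computing P(Appointment | Rain = light, Maintenance = no) with Train as the hidden variable. The dictionary layout and names are mine; the probabilities come from the tables above.

```python
# Conditional probability tables for the network, from the tables above.
P_rain = {"none": 0.7, "light": 0.2, "heavy": 0.1}
P_maint = {"none": {"yes": 0.4, "no": 0.6},
           "light": {"yes": 0.2, "no": 0.8},
           "heavy": {"yes": 0.1, "no": 0.9}}
P_train = {("none", "yes"): {"on time": 0.8, "delayed": 0.2},
           ("none", "no"): {"on time": 0.9, "delayed": 0.1},
           ("light", "yes"): {"on time": 0.6, "delayed": 0.4},
           ("light", "no"): {"on time": 0.7, "delayed": 0.3},
           ("heavy", "yes"): {"on time": 0.4, "delayed": 0.6},
           ("heavy", "no"): {"on time": 0.5, "delayed": 0.5}}
P_appt = {"on time": {"attend": 0.9, "miss": 0.1},
          "delayed": {"attend": 0.6, "miss": 0.4}}

rain, maint = "light", "no"  # the observed evidence e

# P(X, e): sum the full joint probability over the hidden variable Train.
unnormalized = {}
for appt in ("attend", "miss"):
    unnormalized[appt] = sum(
        P_rain[rain] * P_maint[rain][maint]
        * P_train[(rain, maint)][train] * P_appt[train][appt]
        for train in ("on time", "delayed"))

# Normalize: P(X | e) = alpha * P(X, e).
alpha = 1 / sum(unnormalized.values())
print({appt: alpha * p for appt, p in unnormalized.items()})
# ≈ {'attend': 0.81, 'miss': 0.19}
```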
 
