Question
(15%) Problem 13.9 from R&N. It is quite often useful to consider the effect of some specific proposition in the context of some general background evidence that remains fixed, rather than the complete absense of information.
-
Prove the conditionalized version of the general product rule:
. -
Prove the conditionalized version of Bayes’ rule:
Solution
1
But,
Hence,
2
As shown above,
But, following the same reasoning as above,
Hence,
So,
Question
(15%) Problem 13.15 from R&N. Suppose you are a witness to a nighttime hit-and-run accident involving a taxi in Athens. All taxis in Athens are blue or green. You swear, under oath, that the taxi was blue. Extensive testing shows that, under the dim lighting conditions, discrimination between blue and green is 75% reliable. Is it possible to calculate the most likely color of the taxi? (Hint: distinguish between the proposition that the taxi is blue and the proposition that it appears blue.) What about now, given that 9 out of 10 Athenian taxis are green?
Solution
Let (T=b) indicate the event where the taxi is blue. Also, (T=g) indicates the event where the taxi is green.
Let (A=b) indicate the event where the taxi appears blue. Also, (A=g) indicates the event where the taxi appears green.
We are told that
We are asked to find
Hence, we can answer the question if we are given
Using the values given, \(P(T=b|A=b)=\frac{.750.1}{.75.1+.25*.9}=0.25\)
And,
Question
(25%) Consider the following Bayesian network:

-
Suppose that all the variables are Boolean. How many parameters (real numbers) are needed to specify an arbitrary joint probability distribution over 8 variables? How many parameters are needed to define the joint probability distribution using the above Bayesian network?
-
Which of the following probabilistic relations are implied by the structure of the above Bayesian network? Explain briefly.
(a)
(b)
(c)
(d)
(e)
(f)
(g)
- Express
in terms of the probabilities directly available in the network. You will need to use Bayes’ rule and conditioning on one or more other variables.
Solution
1
2
(a)
(b)
(c)
(d)
(e)
(f)
(g)
3
Question
(20%) Suppose that in a polytree network, X is an ancestor of Y and we wish to compute

Solution
Let the node between X and Y be named B. Let B’s unlabelled parent be given the label A.
Then,
This is because, when we calculate the value of
Hence,
Question
(25%) You are an AI consultant for an auto insurance company. Your task is to construct a Bayesian network that will allow the company to decide how much financial risk they run from various policy holders, given certain data about the policy holders. In order to design a Bayesian network, you need output variables and evidence variables. The output variables are those for which the insurance company is trying to get a probability distribution for their possible values. The evidence variables are those variables for which you can obtain information, on the basis of which it is legal to make decisions, and that are relevant to the decision. Output variables represent the cost of various catastrophic events that the insurance company might have to reimburse. In the automobile insurance domain, the major output variables are the medical cost (MedCost), property cost (PropCost), and intangible liability cost (ILiCost). Medical and property costs are those incurred by all individuals involved in an accident; auto theft or vandalism might also incur property cost. Intangible liability costs are legal penalties for things like pain and suffering, punitive damages, and so forth, that a driver might incur in an accident in which he or she is at fault. Evidence variables for this domain include the driver’s age and record; whether or not he or she owns another car; how far he or she drives per year; the vehicle’s make, model and year; whether it has safety equipment such as airbag and antilock brakes; where it is garaged and whether it has an antitheft device.
-
Build a possible belief network for this problem. You will need to decide on suitable domains for the variables, bearing in mind the need to discretize. You will also need to add intermediate nodes such as DrivingSkill and AutoRuggedness.
-
Give reasonable conditional probability tables to a few (but not all!) nodes.
-
How many independent values are necessary to define the joint probability distribution of your model, assuming no conditional independence relations hold among the variables? How many independent values are needed to complete your model using the conditional independence relations?
Solution
1
A possible belief network (at least in the universe I imagine) is shown below:

I briefly describe the domains of the variables in the network (Note the domain of (Accident).):
-
VehicleMakeModelYear: The set of all Vehicle Make-Model-Year combinations.
-
Airbag: Boolean.
-
AntilockBrakes: Boolean.
-
AutoRuggedness: Integral scale ranging from 1 to n.
-
Age: suitably defined age intervals.
-
DrivingRecord: Integral scale ranging from 1 to n.
-
DrivingSkill: Integral scale ranging from 1 to n.
-
Accident: Integral scale ranging from 0 to n, encoding the severity of the accident.
-
DistanceDrivenAnnually: suitably defined distance intervals.
-
OwnsManyCars: Boolean
-
AutoTheft: Boolean
-
AntitheftDevice: Boolean
-
Garage: Boolean
-
Vandalism: Boolean
-
DriverFault: Boolean
-
MedCost: suitably defined cost intervals.
-
PropCost: suitably defined cost intervals.
-
ILICost: suitably defined cost intervals.
2
I provide conditional probability tables for the Vandalism, MedCost and ILICost variables.
Garage | (P(Vandalism=true|Garage) ) |
true | 0.01 |
false | 0.1 |
Note:
Accident (6 severity levels) | c)\) |
|
0 | 0 | 0 |
1 | 0.2 | 0.1 |
2 | 0.2 | 0.2 |
3 | 0.3 | 0.3 |
4 | 0.2 | 0.4 |
5 | 0.1 | 0.4 |
Note:
DriverFault | |
||
False | 0 | 0 |
True | 0.2 | 0.3 |
3
Let us assume that the variables’ domains, described above, are descretized so as to be \textit{binary}. I have defined 18 variables. If there were no conditional independence relationships, then I would need
But, since I recognize the conditional independence relationships indicated by the network, I only need to store 58 values.