Homework 3: Continuous Variables & Classification

UC Irvine CS177: Applications of Probability in Computer Science

Due on November 10, 2020 at 11:59pm

Question 1: (20 points)

The time that a TA spends helping an individual student in office hours is exponentially

distributed with a mean of 8 minutes, and independent of the time spent with other students.

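For an exponential distribution with parameter $\lambda$,

$$f_X(x) = \lambda e^{-\lambda x} \ \text{for } x \geq 0; \qquad E[X] = \frac{1}{\lambda}; \qquad \mathrm{Var}[X] = \frac{1}{\lambda^2}.$$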

Suppose there is a homework due tomorrow, and there are 4 people ahead of you in line.

a) The total time that it takes all 4 students ahead of you to receive help from the TA is a

random variable. What is the mean of this total time?

b) What is the standard deviation of the total time taken by the 4 students ahead of you?

c) What is the probability that all 4 people ahead of you will each take at most 10 minutes?

d) Suppose that the TA has finished helping the first 3 students, and has already spent 10

minutes with the fourth student. What are the mean and standard deviation of the amount

of additional time you need to wait for help?
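
The exponential formulas quoted in Question 1 can be sanity-checked numerically. The sketch below is a minimal illustration in plain Python. The rate `lam = 0.5` and the helper names `exp_pdf` and `exp_cdf` are illustrative assumptions and are not part of the assignment. The closed-form CDF $1 - e^{-\lambda x}$ follows from integrating the density, and the last lines check the memoryless property of the exponential distribution.

```python
import math

def exp_pdf(x, lam):
    """Density of an exponential distribution with rate lam (zero for x < 0)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x, lam):
    """P(X <= x) for an exponential distribution with rate lam."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

lam = 0.5                      # illustrative rate (assumption, not from the question)
mean = 1.0 / lam               # E[X] = 1 / lambda
std = math.sqrt(1.0 / lam**2)  # sqrt(Var[X]) = 1 / lambda

# Memoryless property: P(X > s + t | X > s) equals P(X > t).
s, t = 3.0, 2.0
cond_tail = (1.0 - exp_cdf(s + t, lam)) / (1.0 - exp_cdf(s, lam))
tail = 1.0 - exp_cdf(t, lam)
```

Independent waiting times add in both mean and variance, so the same helpers can be reused when reasoning about a sum of several such times.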

Question 2: (20 points)

An artificial intelligence class has an assignment to write a program that generates the next

move in a game of chess. Suppose that the runtimes of student programs follow a normal

distribution with mean $\mu = 13$ seconds, and standard deviation $\sigma = 2.0$ seconds. Hint: The

Python commands scipy.stats.norm.cdf and scipy.stats.norm.ppf may be useful.

a) What is the probability that a random program has a runtime greater than 18 seconds?

b) What is the probability that a random program has a runtime between 10 and 16 seconds?

c) The TAs want to help the students complete their work faster. What would they have

to lower the average runtime to so that only 1.0% of students have runtimes over 13

seconds? Assume the standard deviation remains fixed at $\sigma = 2.0$ seconds.
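
The two SciPy functions named in the hint are inverses of each other: `norm.cdf` maps a value to a probability, and `norm.ppf` maps a probability back to a value. Below is a minimal sketch of how they are called. It deliberately uses an illustrative distribution with mean 0 and standard deviation 1 (an assumption) rather than the runtimes in the question.

```python
from scipy.stats import norm

mu, sigma = 0.0, 1.0  # illustrative parameters (assumption, not the question's values)

# P(X <= 1) and the upper tail P(X > 1).
p_below = norm.cdf(1.0, loc=mu, scale=sigma)
p_above = 1.0 - p_below

# P(-1 <= X <= 1) as a difference of two CDF values.
p_between = norm.cdf(1.0, loc=mu, scale=sigma) - norm.cdf(-1.0, loc=mu, scale=sigma)

# ppf inverts cdf: the value q with P(X <= q) = 0.975.
q = norm.ppf(0.975, loc=mu, scale=sigma)
```

Passing `loc` and `scale` explicitly keeps the mean and standard deviation visible at each call, which makes it harder to mix up the standard normal with a shifted or rescaled one.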

Question 3: (20 points)

You’ve been asked to test the performance of a batch of newly fabricated processors. If the

processors were correctly manufactured (class Y = 0), the time X to complete your test suite

is exponentially distributed with mean 1. If the equipment at the factory malfunctions (class

Y = 1), the time X is exponentially distributed with mean 50. You must decide whether or

not this batch of processors was correctly manufactured.

For the scenarios in the three parts below, it is possible to show that the optimal Bayesian

classifier predicts Y = 0 if $x \leq c$, and predicts Y = 1 if $x > c$, for some constant c. The

value of c depends on the test time distributions, the prior probabilities of the two classes,

and the assumed loss function. You need to determine the optimal c in each case.

a) Suppose that a new fabrication process has just been deployed, and the probability that

the factory manufactures correctly functioning processors is only P(Y = 0) = 0.5. What

threshold c of the observed test suite time X = x maximizes the probability that your

prediction is correct?

b) Suppose that after some improvements to the new fabrication process, the probability that

the factory manufactures correctly functioning processors increases to P(Y = 0) = 0.99.

What threshold c of the observed test suite time X = x maximizes the probability that

your prediction is correct?

c) Market research suggests that the loss (or cost) of a missed detection (predicting Y = 0

when the processor is actually defective) is 500 times greater than the loss of a false alarm

(predicting Y = 1 when the processor was correctly manufactured). Assuming again that

P(Y = 0) = 0.99, what threshold c of the observed test suite time X = x minimizes the

expected loss?
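
For two exponential class-conditional densities, a threshold of this kind can be found by setting the loss-weighted joint densities equal and solving for x. The sketch below is one way to organize that calculation under stated assumptions. The function name `bayes_threshold`, its parameter names, and the example values (means 2 and 10, equal priors, equal losses) are hypothetical and differ from the scenarios above.

```python
import math

def bayes_threshold(mean0, mean1, p0, loss_fa=1.0, loss_md=1.0):
    """Threshold c such that predicting Y = 1 for x > c minimizes expected loss.

    Assumes X | Y=0 is exponential with mean mean0, X | Y=1 is exponential
    with mean mean1, and mean0 < mean1. loss_fa is the cost of a false alarm
    and loss_md the cost of a missed detection (hypothetical parameter names).
    """
    lam0, lam1 = 1.0 / mean0, 1.0 / mean1
    p1 = 1.0 - p0
    # Solve loss_fa * p0 * lam0 * exp(-lam0 * c) = loss_md * p1 * lam1 * exp(-lam1 * c).
    c = math.log((loss_fa * p0 * lam0) / (loss_md * p1 * lam1)) / (lam0 - lam1)
    return max(c, 0.0)  # test times are nonnegative, so clip at zero

# Illustrative call with hypothetical values.
c = bayes_threshold(mean0=2.0, mean1=10.0, p0=0.5)
```

Raising `loss_md` relative to `loss_fa` lowers the threshold, so more observations are flagged as Y = 1. Raising the prior `p0` has the opposite effect.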

Question 4: (40 points)

For a given day i, we let $Y_i = 1$ if the ground-level ozone concentration near some city

(Houston, in our data) is at a dangerously high level. This is called an "ozone day". We let

$Y_i = 0$ if the ozone concentration is low enough to be considered safe.

We want to predict $Y_i$ from more easily measured "features" describing atmospheric

pollutant levels and meteorological conditions (temperature, humidity, wind speed, etc.).

There are a total of M = 72 of these features collected each day, which we denote by

$X_i = \{X_{ij} \mid j = 1, \ldots, M\}$. Each feature $X_{ij} \in \mathbb{R}$ is a real number, and we will thus use a

Gaussian distribution to model these continuous random variables.

We will build a "naive Bayes" classifier, which predicts observation i to be an ozone day

if $P(Y_i = 1 \mid X_i) > P(Y_i = 0 \mid X_i)$, and a non-ozone day otherwise. Using Bayes' rule, this

classifier is equivalent to one that chooses $Y_i = 1$ if and only if

$$\frac{p_Y(1) \, f_{X \mid Y}(x_i \mid 1)}{f_X(x_i)} > \frac{p_Y(0) \, f_{X \mid Y}(x_i \mid 0)}{f_X(x_i)},$$

$$\ln p_Y(1) + \ln f_{X \mid Y}(x_i \mid 1) > \ln p_Y(0) + \ln f_{X \mid Y}(x_i \mid 0). \tag{1}$$

In this equation, $p_Y(y_i)$ is the probability mass function that defines the prior probability

of ozone and non-ozone days. The conditional probability density function $f_{X \mid Y}(x_i \mid y_i)$

describes the distribution of the M = 72 environmental features, which we assume depends

on the type of day. We make two simplifying assumptions about these densities: the features

$X_{ij}$ are conditionally independent given $Y_i$, and their distributions are Gaussian. Thus:

$$f_{X \mid Y}(x_i \mid 1) = \prod_{j=1}^{M} \frac{1}{\sqrt{2\pi\sigma_{1j}^2}} \exp\left\{ -\frac{(x_{ij} - \mu_{1j})^2}{2\sigma_{1j}^2} \right\}, \tag{2}$$

$$f_{X \mid Y}(x_i \mid 0) = \prod_{j=1}^{M} \frac{1}{\sqrt{2\pi\sigma_{0j}^2}} \exp\left\{ -\frac{(x_{ij} - \mu_{0j})^2}{2\sigma_{0j}^2} \right\}. \tag{3}$$

Given $Y_i = 1$, $X_{ij}$ is Gaussian with mean $\mu_{1j}$ and variance $\sigma_{1j}^2$. Given $Y_i = 0$, $X_{ij}$ is Gaussian

with mean $\mu_{0j}$ and variance $\sigma_{0j}^2$. There are a total of 2M mean parameters and 2M variance

parameters, since every feature $X_{ij}$ has a distinct distribution for each of the two classes.
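
A minimal sketch of how Equation (1) can be evaluated with the densities in Equations (2) and (3), working in log space so that the product over features becomes a sum. The function names, array shapes, and the tiny synthetic inputs are assumptions for illustration and are not the course's demo code.

```python
import numpy as np

def gaussian_log_density(X, mu, var):
    """Sum over features of the Gaussian log density for each row of X.

    X has shape (N, M); mu and var have shape (M,). Returns shape (N,).
    """
    return np.sum(-0.5 * np.log(2.0 * np.pi * var) - (X - mu) ** 2 / (2.0 * var), axis=1)

def classify(X, mu0, var0, mu1, var1, prior1=0.5):
    """Predict 1 where ln p_Y(1) + ln f(x|1) > ln p_Y(0) + ln f(x|0), else 0."""
    score1 = np.log(prior1) + gaussian_log_density(X, mu1, var1)
    score0 = np.log(1.0 - prior1) + gaussian_log_density(X, mu0, var0)
    return (score1 > score0).astype(int)

# Synthetic example with M = 2 features (hypothetical values).
X = np.array([[0.1, -0.2], [2.9, 3.2]])
mu0, mu1 = np.zeros(2), np.full(2, 3.0)
var0 = var1 = np.ones(2)
pred = classify(X, mu0, var0, mu1, var1)
```

Summing log densities avoids the numerical underflow that can occur when many small density values are multiplied together.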

a) Derive equations for $\ln f_{X \mid Y}(x_i \mid 1)$ and $\ln f_{X \mid Y}(x_i \mid 0)$, the (natural) logarithms of the

conditional probability density functions in Equations (2) and (3). For numerical robustness,

simplify your answer so that it does not involve the exponential function.

Because ozone days are relatively rare, a classifier that always predicts $Y_i = 0$ would be

correct over 95% of the time, but would obviously not be practically useful for reducing

ozone hazard. To evaluate our classifiers, we will thus separately compute the numbers

of false alarms (predictions of ozone days when in reality $Y_i = 0$) and missed detections

(predictions of non-ozone days when in reality $Y_i = 1$). We are willing to allow some false

alarms as long as there are very few missed detections.
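
The three evaluation quantities defined above can be counted directly from the true and predicted labels. A minimal sketch, assuming both are 0/1 arrays of equal length. The function name `error_counts` and the toy labels are hypothetical.

```python
import numpy as np

def error_counts(y_true, y_pred):
    """Return (accuracy, false alarms, missed detections) for 0/1 labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    false_alarms = int(np.sum((y_pred == 1) & (y_true == 0)))  # predicted ozone, actually safe
    missed = int(np.sum((y_pred == 0) & (y_true == 1)))        # predicted safe, actually ozone
    accuracy = float(np.mean(y_pred == y_true))
    return accuracy, false_alarms, missed

# Toy labels for illustration only.
y_true = [0, 0, 1, 1, 0]
y_pred = [0, 1, 1, 0, 0]
accuracy, false_alarms, missed = error_counts(y_true, y_pred)
```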

For all parts below, assume that the mean parameters $\mu_{1j}, \mu_{0j}$ are set to match the mean

of the empirical distribution of the training data. The demo code computes these means.

b) Start by assuming the classes are equally probable ($p_Y(1) = p_Y(0) = 1/2$), and have

unit variance ($\sigma_{1j}^2 = \sigma_{0j}^2 = 1$). Write code to compute the log conditional densities from

part (a). Then using Equation (1), classify each test example. Report your classification

accuracy, and the numbers of false alarms and missed detections.

Hint: Your classifier should have fewer than 10 missed detections.

c) Rather than assuming features have variance one, set the variance parameters $\sigma_{1j}^2, \sigma_{0j}^2$

equal to the variance of the empirical distribution of the training data. Classify each

test example using Equation (1) with these variance estimates. Report your classification

accuracy, and the numbers of false alarms and missed detections.

d) Rather than assuming the classes are equally probable, estimate $p_Y(1)$ as the fraction of

training examples that are ozone days. Classify each test example using Equation (1) with

this informative class prior, and the variances from part (c). Report your classification

accuracy, and the numbers of false alarms and missed detections.
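
The parameter estimates used in parts (b) through (d) are simple summaries of the training data. The sketch below shows one way to compute them with NumPy, assuming the training features are an (N, M) array and the labels a length-N array of 0/1 values. The function name `fit_parameters` and the toy data are hypothetical and do not reproduce the course's demo code.

```python
import numpy as np

def fit_parameters(X_train, y_train):
    """Per-class feature means and variances, plus the fraction of class-1 examples."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    params = {}
    for c in (0, 1):
        Xc = X_train[y_train == c]
        # np.var divides by N by default, matching the empirical distribution.
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0))
    prior1 = float(np.mean(y_train == 1))
    return params, prior1

# Toy data for illustration only: 4 examples, M = 2 features.
X_train = np.array([[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [5.0, 6.0]])
y_train = np.array([0, 0, 1, 0])
params, prior1 = fit_parameters(X_train, y_train)
```

In this toy data the single class-1 example yields zero variance, which would make a Gaussian log density undefined. Real training data with many examples per class is unlikely to have this problem, but it is worth checking before taking logarithms.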
