Homework 7 Solutions
Adapted by Ashley Van Reynolds from previous solutions
Updated in March 2021 by Ana Gomez
Scoring:
40 points total: Completion Points: 21, Accuracy Points: 19, on all problems except 5 & 6
1. Program a function for $|\chi| = \sum \frac{|\text{expected} - \text{observed}|}{\text{expected}}$ (“chi-abs”) and use it
to repeat the PrEP analysis you did in lab. Are the results of
the two analyses consistent? 2 completion points, 2 accuracy points
As given in Lab 7, the data is:
|              | Drug | Placebo |
|--------------|------|---------|
| Infected     | 36   | 64      |
| Not Infected | 1215 | 1184    |
In [4]:
import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
$$|\chi| = \sum \frac{|\text{expected} - \text{observed}|}{\text{expected}}$$
In [2]:
# Define chi-abs function
# Purpose: compute chi-abs for any arrays of observed and expected data
# Inputs: 2 arrays of same dimensions (observed, expected data for same study)
# Outputs: chi-abs value
def chi_abs(obs, exp):
    result = np.sum(np.abs(obs - exp) / exp)
    return result
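As a quick sanity check, the function can be tried on a small table before using the PrEP data. This is a minimal sketch: the arrays `toy_obs` and `toy_exp` are made-up values for illustration only, not data from the lab.

```python
import numpy as np

def chi_abs(obs, exp):
    # Sum of |observed - expected| / expected over every cell
    return np.sum(np.abs(obs - exp) / exp)

# Hypothetical toy tables (not from Lab 7), chosen so the arithmetic is easy to follow
toy_obs = np.array([[10, 20],
                    [30, 40]])
toy_exp = np.array([[12, 18],
                    [28, 42]])

# Each cell differs from its expected value by 2, so the statistic is
# 2/12 + 2/18 + 2/28 + 2/42
print(chi_abs(toy_obs, toy_exp))

# Identical observed and expected tables give a statistic of zero
print(chi_abs(toy_exp, toy_exp))
```

Larger values of chi-abs mean the observed table sits further from what the null hypothesis predicts.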
In [5]:
# From Lab 7:
# Enter observed data
obs = np.array([[36, 64],
                [1215, 1184]])
# Calculate column totals
total_drug = np.sum(obs[:,0]) # sum column 0
total_placebo = np.sum(obs[:,1]) # sum column 1
# Calculate row totals
total_infected = np.sum(obs[0,:]) # sum row 0
total_uninfected = np.sum(obs[1,:]) # sum row 1
# Calculate total number of patients
n = np.sum(obs)
# Expected value is (sum of row)/n x (sum of column)
# For example, overall probability of infection = total_infected/n,
# so number of expected patients with drug & infected =
# overall probability of infection * # of patients with drug
# = total_infected/n*total_drug
expected = np.array([[total_infected/n*total_drug,
                      total_infected/n*total_placebo],
                     [total_uninfected/n*total_drug,
                      total_uninfected/n*total_placebo]])
expected
Out[5]:
array([[  50.06002401,   49.93997599],
       [1200.93997599, 1198.06002401]])
In [9]:
# Perform 10,000 simulations and calculate p-value
# (same as in Lab 7, but calculating chi-abs)
obschiabs = chi_abs(obs, expected)
results = np.zeros(10000)
sim = np.zeros([2,2])
# Box combines all infected and uninfected from both drug and placebo groups
# so each individual has same overall probability of infection
for i in range(10000):
    box = ["I"]*total_infected + ["NI"]*total_uninfected
    drug_resample = np.random.choice(box, total_drug)
    sim[0,0] = np.sum(drug_resample == "I")
    sim[1,0] = np.sum(drug_resample == "NI")
    placebo_resample = np.random.choice(box, total_placebo)
    sim[0,1] = np.sum(placebo_resample == "I")
    sim[1,1] = np.sum(placebo_resample == "NI")
    results[i] = chi_abs(sim, expected)
p = sns.displot(results, kde=False)
plt.axvline(obschiabs, color="mediumvioletred")
count = np.sum(results >= obschiabs)
pval = count/10000
pval
Out[9]:
0.0088
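The resampling loop can also be sketched in vectorized form. Drawing a group with replacement from the pooled box is equivalent to drawing its infected count from a binomial distribution with probability `total_infected/n`, so all 10,000 tables can be generated at once. This is an alternative sketch, not the Lab 7 method: the helper `chi_abs_batch`, the use of `np.outer` for the expected counts, and the seed are assumptions made here for illustration.

```python
import numpy as np

obs = np.array([[36, 64],
                [1215, 1184]])

row_totals = obs.sum(axis=1)  # [infected, not infected]
col_totals = obs.sum(axis=0)  # [drug, placebo]
n = obs.sum()

# Expected counts under independence: (row total x column total) / n
expected = np.outer(row_totals, col_totals) / n

def chi_abs_batch(tables, exp):
    # Hypothetical helper: chi-abs for a stack of tables with shape (n_sims, 2, 2)
    return np.sum(np.abs(tables - exp) / exp, axis=(1, 2))

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
n_sims = 10_000
p_infected = row_totals[0] / n  # pooled probability of infection

# Infected counts for the drug and placebo groups in every simulation
infected = rng.binomial(col_totals, p_infected, size=(n_sims, 2))
not_infected = col_totals - infected

# Stack into shape (n_sims, 2, 2), indexed as [simulation, row, column]
sims = np.stack([infected, not_infected], axis=1)

obs_stat = chi_abs_batch(obs[np.newaxis], expected)[0]
sim_stats = chi_abs_batch(sims, expected)

# Fraction of simulated tables at least as extreme as the observed one
pval = np.mean(sim_stats >= obs_stat)
print(pval)
```

Because the draws are random, the p-value will differ slightly from the loop version and from run to run when the seed changes.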
# In Lab 7, we calculated a p-value using chi-squared of about 0.0178.
# The results of the two analyses are fairly consistent, although we couldn't
# reject the null hypothesis using chi-squared if alpha = 0.01. With a p-value
# of 0.0088, we can reject the null hypothesis even at alpha = 0.01 and say that
# the drug has a significant effect on the odds of being infected with HIV.
# In combination with other research and expert opinions, we can recommend
# that patients at risk of HIV infection use PrEP drugs.
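To see the two statistics side by side on the observed table, chi-squared can be written in the same shape as `chi_abs`. This is a sketch that assumes the standard Pearson form, sum of (observed − expected)² / expected; the name `chi_squared` is hypothetical, and the Lab 7 implementation itself is not reproduced here.

```python
import numpy as np

def chi_abs(obs, exp):
    # Sum of |observed - expected| / expected
    return np.sum(np.abs(obs - exp) / exp)

def chi_squared(obs, exp):
    # Hypothetical helper, assuming the standard Pearson form:
    # sum of (observed - expected)^2 / expected
    return np.sum((obs - exp) ** 2 / exp)

obs = np.array([[36, 64],
                [1215, 1184]])

# Expected counts under independence: (row total x column total) / n
expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / obs.sum()

print("chi-abs:    ", chi_abs(obs, expected))
print("chi-squared:", chi_squared(obs, expected))
```

Squaring weights large deviations more heavily than taking absolute values does, which is one reason the two simulated p-values need not match exactly.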