## 401077 Introduction to Biostatistics, Autumn 2017 Assignment 2

Due Sunday April 23, 2017
Please answer each question in the template document provided and submit via Turnitin on or before the due date. The marks allocated to each question are shown in the
assignment. A total of 30 marks are available and this assignment is worth 30% of your overall grade.
Some of the questions in this assignment ask you to analyse the data set assigned to you for assignments. This is the same data set which you used for Assignment 1.
Read ‘Description of your data set.docx’ for the descriptions of the variables.
Question 1 (8 marks)
Research question: Does the average minutes of moderate to vigorous physical activity (MVPA) differ between overweight and non-overweight University students?
Use the assignment data set assigned to you: Variables to analyse: ‘MVPA’ and ‘overweight’
Note: Each student will get different answers as the data sets differ.
a) Draw histograms of MVPA by overweight status. Add reasonable axis labels. (1 mark)

b) Name the statistical test you would use to address the above research question using the data set provided. Explain why this test is appropriate. (3 marks)

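c) Address the research question using the test you named above, formatting your answer following the 5 step method. (4 marks)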
Question 2 (3 marks)
Consider the research question “Is the mean weight of Australian university students higher at graduation than at enrolment?” Suppose the following data were collected:

| Student ID | Weight at enrolment (kg) | Weight at graduation (kg) |
|---|---|---|
| 1 | 86 | 89 |
| 2 | 64 | 73 |
| 3 | 51 | 55 |
| 4 | 73 | 72 |
| 5 | 63 | 80 |
| 6 | 69 | 70 |

Identify the most appropriate statistical test for addressing the research question using the given data. Explain why this test is appropriate. (Note: Please do not
run any analyses or conduct the statistical test.)

Question 3 (4 marks)
Research question: How different are the average GPAs of overweight and non-overweight Australian University students?
Use the assignment data set assigned to you: Variables to analyse: ‘GPA’ and ‘overweight’
Note: Each student will get different answers as the data sets differ.

a) Using R Commander, calculate the 95% confidence interval for the difference in mean GPA between the overweight and non-overweight groups. (1 mark)

b) Carefully explain in words what this confidence interval is telling us. (2 marks)
c) From the results in part a), can we conclude that there is a statistically significant difference in mean GPA between overweight and non-overweight Australian university students? Explain why or why not. (1 mark)
Question 4 (3 marks)
Estimate the minimum sample size required to detect a difference of 0.5 in mean GPA between overweight and non-overweight Australian university students with α=0.05 and power=0.80
(β=0.20). (That is, this difference could be between 4 and 4.5 or between 4.5 and 5.0. You can choose any means as long as they are 0.5 apart.) Assume the population
standard deviation is σ=0.9 and equal group sizes. Present your answer as a sentence which summarises the required sample size to achieve what power subject to what
conditions.
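
As a rough illustration of the kind of calculation involved, the sketch below uses the standard normal-approximation formula for comparing two means with equal group sizes, n = 2σ²(z₁₋α/₂ + z_power)² / Δ² per group. The function name `n_per_group_two_means` and the input values are hypothetical and are not the values from this question; statistical software that works with the t distribution may report a slightly larger n.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two means.

    Normal-approximation formula (assumes equal group sizes and a common,
    known standard deviation):
        n = 2 * sigma^2 * (z_{1-alpha/2} + z_{power})^2 / delta^2
    The result is rounded up to the next whole person.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = NormalDist().inv_cdf(power)          # quantile matching the desired power
    n = 2 * sigma ** 2 * (z_alpha + z_power) ** 2 / delta ** 2
    return math.ceil(n)


# Hypothetical inputs for illustration only (not the values in Question 4).
print(n_per_group_two_means(delta=1.0, sigma=2.0))  # 63
```

The returned value is the size of each group, so the total sample is twice that number.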
Question 5 (10 marks)
Research question: Does the proportion of overweight students differ by sex in the population of Australian university students?
Use the assignment data set assigned to you: Variables to analyse: ‘overweight’ and ‘sex’
Note: Each student will get different answers as the data sets differ.
a) Show the relationship between overweight and sex using a two-way contingency table. Include either row or column percentages. Type and label the table yourself: an R Commander screenshot will not be accepted. (2 marks)
b) Looking at the results in part a) only, is there any evidence of an association between sex and overweight status in this sample of Australian university students? Explain why or why not. (2 marks)

c) Are the requirements for a Chi-square test met? Explain why. (1 mark)
d) Irrespective of your answer in part c), address the research question using a Chi-square test on the provided data. Please use R Commander but format your answer following the 5 step method. (5 marks)
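
For readers who want to see what sits behind a two-way table and a Chi-square test, the following is a minimal sketch of the arithmetic: row percentages, expected counts under no association, and the uncorrected Pearson statistic. The counts and function names are hypothetical and are not taken from any assignment data set. The assignment still requires R Commander, and software may apply a continuity correction to 2×2 tables that this sketch omits.

```python
def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]


def expected_counts(table):
    """Expected counts under no association: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Pearson Chi-square statistic without continuity correction."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Hypothetical counts: rows are sex, columns are overweight (yes, no).
observed = [[30, 20],
            [25, 25]]

print(row_percentages(observed))   # [[60.0, 40.0], [50.0, 50.0]]
print(expected_counts(observed))   # [[27.5, 22.5], [27.5, 22.5]]
print(round(chi_square_statistic(observed), 4))  # 1.0101
```

A common rule of thumb is to check that the expected counts (not the observed counts) are all at least 5 before relying on the Chi-square approximation.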
Question 6 (2 marks)
Estimate the minimum sample size required to produce a 95% confidence interval with a margin of error of ±10% for the proportion of Australian university students who are overweight.
Assume that 40% of this population will be overweight. Present your answer as a sentence which summarises the required sample size to achieve what confidence level
subject to what conditions.
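
The usual normal-approximation formula for this kind of problem is n = z² p(1 − p) / E², where p is the assumed proportion and E is the margin of error. The sketch below illustrates it with hypothetical inputs that are not the values from this question; the function name `n_for_proportion` is likewise only illustrative.

```python
import math
from statistics import NormalDist


def n_for_proportion(p, margin, confidence=0.95):
    """Approximate sample size for estimating a proportion to within +/- margin.

    Normal-approximation formula: n = z^2 * p * (1 - p) / margin^2,
    rounded up to the next whole person.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical inputs for illustration only (not the values in Question 6).
print(n_for_proportion(p=0.5, margin=0.05))  # 385
```

Both `p` and `margin` are entered as proportions (for example 0.05 rather than 5%).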