A study of social media use, undertaken in early 2019, examined whether there were any patterns by gender, living arrangements, parents’ education and age in the time spent online per weekend on three social media types, namely Facebook, Instagram and Others (YouTube, Twitter etc.). The purpose of the study is to understand the habits of the younger population in relation to different forms of media use. An example of the first 10 observations is seen below.
Please use the data in the Excel document.
A key to understanding the data is as follows:
1. Instagram: time spent in minutes per weekend on Instagram
2. Facebook: time spent in minutes per weekend on Facebook
3. Other: time spent in minutes per weekend on other social media platforms
4. Gender: a dummy variable where Female=1 and Male=0
5. Age: the age of the person who agreed to participate in the study
6. Lives at Home: whether the person lives at home, where Yes=1 and No=0
7. Year 10: highest level of education of parents is Year 10
8. Year 12: highest level of education of parents is Year 12
9. University: parents had completed a bachelor’s degree
10. Postgraduate: the reference category for the Year 10, Year 12 and University dummy variables, so all comparisons of parents’ education are made against this category
Please note that I have already sorted the numerical allocations for the dummy variables. We will discuss how to interpret these in the lectures and in the Research report help sessions, which begin on Thursday 23rd April.
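As a rough illustration of the coding described in the key, the sketch below decodes the three parents' education dummies into a category. It assumes that at most one of Year 10, Year 12 and University equals 1 for any person and that all three equal 0 for Postgraduate; the function name `parent_education` is hypothetical and is not part of the dataset or of Excel.

```python
# Hypothetical helper illustrating the dummy coding in the key above.
# Assumption: at most one of the three dummies equals 1 for any person.

def parent_education(year10: int, year12: int, university: int) -> str:
    """Return the parents' highest education level implied by the dummies."""
    flags = {"Year 10": year10, "Year 12": year12, "University": university}
    active = [level for level, value in flags.items() if value == 1]
    if len(active) > 1:
        raise ValueError("at most one education dummy should equal 1")
    # All three dummies equal to 0 identifies the reference category.
    return active[0] if active else "Postgraduate"


print(parent_education(0, 1, 0))
print(parent_education(0, 0, 0))
```

Because Postgraduate is the omitted category, the coefficient on each education dummy in a regression is read as a difference relative to people whose parents hold a postgraduate qualification.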
Task 1 (Boxplots and t-tests: Investigating the Data)
1. Construct separate boxplots, by gender, of time spent online for Facebook, Instagram and other social media platforms. What can you say about the distributional features (central location, spread and skewness) shown in each of the boxplots?
2. (a) Considering Facebook, Instagram and other social media, test whether there is a significant difference in the usage time between males and females at a 5% level of significance.
(b) Considering Facebook, Instagram and other social media, test whether there is a significant difference in the usage time for those who live at home versus those who do not live at home at a 5% level of significance.
3. Write a short summary of the results of Questions 1 & 2 outlining the results that you have found and how these results better help to explain the purpose of the study. What happens if you were willing to decrease the significance levels?
Note: In all two-sample tests, you should discuss briefly whether it is a one- or two-tailed test, the test statistic and any assumptions made, and draw a conclusion based on the Excel output.
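For reference, the arithmetic behind a two-sample t-test can be sketched as follows. This is only an illustration under stated assumptions: it uses the unequal-variances (Welch) form, which the brief does not prescribe, and the minutes shown are invented values rather than observations from the study. The p-value and critical values should still come from the Excel output.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom), unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variances (n - 1 denominator) divided by the sample sizes.
    va = variance(sample_a) / n_a
    vb = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t_stat, df


# Invented minutes-per-weekend values, NOT taken from the study's data.
female_minutes = [120, 150, 90, 200, 170]
male_minutes = [100, 80, 130, 60, 110]

t_stat, df = welch_t(female_minutes, male_minutes)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A test for "a significant difference" in either direction is two-tailed, so the t statistic would be compared with the two-tailed critical value at the chosen significance level.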
Task 2 (Regression Analysis and investigating relationships between the variables)
You plan to develop a regression model to investigate how various factors influence time spent on social media types.
4. Before you conduct any regression analysis, you use Excel to construct a correlation matrix of all the quantitative variables in the dataset. Based on the correlation matrix, comment briefly on the associations between each of the dependent variables (Facebook, Instagram and other social media) and the quantitative variables. Write a summary of your findings.
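To make clear what each cell of a correlation matrix contains, here is a minimal sketch of the Pearson correlation coefficient computed for every pair of columns. The column values are invented for illustration and the helper names `pearson` and `correlation_matrix` are hypothetical; the assignment itself expects the matrix to be produced in Excel.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Invented values, NOT taken from the study's data.
data = {
    "Instagram": [120, 150, 90, 200, 170],
    "Facebook": [60, 45, 80, 30, 40],
    "Age": [19, 18, 22, 17, 18],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(row[other], 2) for other in data])
```

Each coefficient lies between -1 and 1; the sign gives the direction of the linear association and the magnitude gives its strength.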
5. You conduct a stepwise regression according to the following procedure for each of the three (3) dependent variables:
Step 1: Gender only
Step 2: Gender and Age
Step 3: Gender, Age and Living at Home
Step 4: Gender, Age, Living at Home, Year 10, Year 12, University
Present the regression output for each of the four regressions in tabular form for each dependent variable.
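One way to write out the four regressions is as a sequence of nested linear models. The notation below is an assumption made for illustration: Y stands for any one of the three dependent variables, LivesAtHome abbreviates the Lives at Home dummy, and the coefficients are re-estimated at every step, so a given beta will generally differ from one step to the next.

```latex
\begin{align*}
\text{Step 1: } Y &= \beta_0 + \beta_1\,\text{Gender} + \varepsilon \\
\text{Step 2: } Y &= \beta_0 + \beta_1\,\text{Gender} + \beta_2\,\text{Age} + \varepsilon \\
\text{Step 3: } Y &= \beta_0 + \beta_1\,\text{Gender} + \beta_2\,\text{Age}
                    + \beta_3\,\text{LivesAtHome} + \varepsilon \\
\text{Step 4: } Y &= \beta_0 + \beta_1\,\text{Gender} + \beta_2\,\text{Age}
                    + \beta_3\,\text{LivesAtHome} + \beta_4\,\text{Year10}
                    + \beta_5\,\text{Year12} + \beta_6\,\text{University} + \varepsilon
\end{align*}
```

With three dependent variables and four steps, this gives twelve regressions in total. Postgraduate does not appear in Step 4 because it is the reference category for the education dummies.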
6. Based on the regression output obtained in Step 4, answer the following:
(a) Which summary measure in the regression output is used to assess the overall adequacy of the model? Comment on the overall adequacy of the model obtained in Step 4 for each of the three dependent variables.
(b) For each of the independent variables, fully interpret the regression coefficients and comment on their statistical significance. (In discussing the statistical significance of a regression coefficient, you have to justify your choice of a one- or two-tailed test.)
7. Considering the correlations found in Question 4 and all the coefficients in the Step 4 regressions (one for each dependent variable), are there any issues regarding the signs or the statistical significance of the coefficients? Discuss fully.
Task 3 (Summary Report)
Present your findings of all components of the statistical analysis in the form of a professional report that is to be presented to a board of educators. The educators are interested in whether the factors in your data can be used to understand the habits of the younger population in relation to different forms of media use. It is expected that you will outline all your findings in a clear and coherent fashion.
Use 1.5 line spacing and a font size of 11.
You can and are encouraged to include relevant charts and Excel objects in your summary report (Task 3).
No referencing is required in your summary report. However, if you wish to include, and refer to, additional information, you can use any referencing system as long as it is used consistently.
There is no word limit for Tasks 1 and 2.
The word limit of 500 (with a tolerance of 10%) applies only to the summary report, and is exclusive of words in tables, appendices and reference list (if any).
You should submit your response to all three tasks as a single pdf document saved in the format:
After uploading your research report on Blackboard, it is your responsibility to go back to the Assignment Upload page to check that your report was properly uploaded.
Due: 11:59 pm May 25 (Monday) 2020 via Blackboard