1. When one variable causes change in another, we call the first variable the ____ variable. The affected variable is called the ____ variable.
Answer: explanatory; response. In a randomized experiment, the researcher manipulates values of the explanatory variable and measures the resulting changes in the response variable. The different values of the explanatory variable are called treatments. An experimental unit is a single object or individual to be measured.

2. A two-way table, also known as a two-way frequency table or contingency table, is used to show the relationship between two ____ variables (C→C); the rows show the categories of one variable, and the columns show the categories of the other variable.
Answer: categorical variables (C→C)

3. The cells in yellow show ____. These represent the total number of instances that fall in both the corresponding row and column. The data in the green cells show ____.
Answer: The cells in yellow show joint frequencies. For example, the cell in the "Male" row and "With Autism" column counts the number of males with autism. The cells in green show marginal frequencies. These are equal to the sum of the number of individuals in the corresponding row or column. For example, the cell in the "Totals" column and "Female" row shows the total number of females in the study. It may be helpful to remember that marginal frequencies appear in the margins of the table. The bottom-right cell (in both the "Totals" column and the "Totals" row) gives the total number of individuals in the study.

4. The relationship between two variables that are both quantitative can be displayed in a ____.
Answer: scatterplot

5. As we've seen earlier, every point on a coordinate plane can be represented by an ordered pair, (x, y). Here, the x-value is typically the ____ variable's value for a piece of data, and the y-value is the corresponding value for the ____ variable. A simple way to remember this fact is that the term "explanatory" has an "x" in it.
Answer: explanatory variable; response variable

6. Side-by-side box plots are a good choice for two-variable data where the explanatory variable is ____ data and the response variable is ____ data.
Answer: categorical; quantitative

7. Which variable, explanatory or response, is displayed on the x-axis on side-by-side boxplots?
Answer: Side-by-side boxplots can be horizontal or vertical, so either variable (explanatory or response) can be displayed on the x-axis.

8. A scatterplot is a good choice to display two-variable data that are both ____ variables.
Answer: quantitative

9. The relationship between the x-variable and the y-variable is called ____.
Answer: correlation

10. What determines the location of a dot on a scatterplot?
Answer: A dot is placed on a scatterplot according to its x- and y-values.

11. When analyzing a possible relationship for two-variable data, if both variables are categorical, what is the most appropriate choice to display the data?
a) Side-by-side boxplots
b) Scatterplot
c) Bar chart
d) Two-way frequency table
e) Histogram
Answer: D. A two-way frequency table is the most appropriate way to display a possible relationship for two-variable data when both variables are categorical.

12. A hospital hires an independent consulting firm to perform a study about patients with high blood pressure and the medicine they are being prescribed. The study is examining the relationship between a patient's starting blood pressure when they entered the treatment program and the dosage of blood pressure medicine they are prescribed during their treatment. For this study: What is the explanatory variable? Is the explanatory variable categorical or quantitative? What is the response variable? Is it categorical or quantitative? What graphical display should be used to show the results of the study?
Answer: The explanatory variable is the patient's starting blood pressure, which is a quantitative variable. The response variable is the dosage of blood pressure medicine they are prescribed, which is also a quantitative variable. As both the explanatory and response variables are quantitative (Q→Q), a scatterplot would be an appropriate graphical display.

13. When working with two-variable data, if the explanatory variable is categorical and the response variable is quantitative, what is the most appropriate choice to display the data?
a) Side-by-side boxplots
b) Scatterplot
c) Bar chart
d) Two-way frequency table
e) Histogram
Answer: A. When working with two-variable data, if one variable is categorical and the other is quantitative, a side-by-side boxplot is the most appropriate way to display the data.

14. When working with two-variable data, if both variables are quantitative, what is the most appropriate choice to display the data?
a) Side-by-side boxplots
b) Scatterplot
c) Bar chart
d) Two-way frequency table
e) Histogram
Answer: B. When working with two-variable data, if both variables are quantitative, a scatterplot is the most appropriate choice to display the data.

15. In a two-way table, what does the sum of the joint frequencies in one row equal?
a) The quantitative variable
b) A marginal frequency
c) The correlation coefficient
d) The number of individuals in the placebo group
Answer: b. In a two-way table, the sum of the joint frequencies in one row equals a marginal frequency.

16. If both variables are categorical, a ____ is used to display the data.
Answer: two-way frequency table

17. There are several ways we can analyze the data presented in this table. If we calculate the percentage that each cell is of the total, the results are called relative frequencies. When the relative frequencies are calculated from the row total or the column total, they are called ____.
Answer: conditional percentages

18. Each row is a different gender. If we are trying to see if gender influences the choice of exercise program, then gender is the explanatory variable. In this case, we are calculating the relative frequency by rows; that is, we are calculating the relative frequency by gender. To determine the relative frequency for women, we divide the data in the top row by the total number of women. To determine the relative frequency for men, we divide the data in the second row by the total number of men. The percentages obtained are called ____.
Answer: conditional row percentages

19. Each column is a different exercise program. If we are trying to determine how each exercise program appeals to different genders, then the exercise program becomes the explanatory variable. In this case, the explanatory variable is in the columns, so we calculate the relative frequency by columns; that is, we calculate the relative frequency by exercise program. To determine the relative frequency for each cell, we divide the data by the corresponding column's total number of individuals. The percentages obtained are called ____.
Answer: conditional column percentages

20. Calculating Overall Percentages

21. 5 Box Plot Summary

22. If the explanatory variable is categorical and the response variable is quantitative, we can use descriptive statistics, namely the ____ for the quantitative variable, and compare the statistics for each of the categories.
Answer: five-number summary

23. For the majority of the data points, there is a linear relationship indicating a positive correlation, meaning:
Answer: The two quantitative variables move in the same direction: as the explanatory variable increases, the response variable also increases.

24. If the relationship is linear, the strength of the correlation (linear relationship) can be measured using a statistic called the ____.
Answer: correlation coefficient. A correlation coefficient is a number that falls somewhere from -1 to 1. It is a measure of the linear relationship between two attributes; the numerical value demonstrates how closely the attributes vary together. Correlation coefficients near -1 and +1 indicate strong linear correlation, while a correlation coefficient near 0 indicates weak (or no) linear correlation.

25. A correlation coefficient at or near -1 represents a strong ____. A correlation coefficient at or near +1 represents a strong ____.
Answer: negative linear correlation, i.e., the two variables move in opposite directions in a linear fashion: as the explanatory variable increases, the response variable decreases. Positive linear correlation, i.e., the two variables move in the same direction in a linear fashion: as the explanatory variable increases, the response variable increases.

26. To solve for the variation in data you take the ____ between the maximum and minimum.
Answer: difference. For the treatment group this is equal to 2.24 - (-1.19) = 3.43; for the placebo group this is equal to 0.78 - (-2.29) = 3.07. Since 3.07 is less than 3.43, the placebo group has less variation in its data.

27. A scatterplot is useful for which type of data?
a. Both variables are categorical.
b. One variable is categorical, one variable is quantitative.
c. Both variables are quantitative.
d. One variable is discrete, one variable is continuous.
Answer: c. Both variables are quantitative.

28. True or False? A scatterplot always shows the explanatory variable on the horizontal, or x-axis.
a. True
b. False
Answer: a. True. The explanatory variable goes on the x-axis (hint: X for explanatory).

29. A hospital is studying the effectiveness of two different heartburn treatments (Treatment A and Treatment B) administered daily for a week. The results are measured after one week of treatment by placing a patient into one of two groups: heartburn subsided OR heartburn remained. Which numerical measure could be used to analyze the data?
a. Correlation coefficient
b. Five-number summary
c. Median
d. Conditional percentages
Answer: d. Both variables are categorical (C→C), so we will use a two-way table. Therefore, our numerical measure will be conditional percentages.

30. One of the challenges with young patients who suffer from acute asthma is delivering medication during asthma attacks. Young patients often have difficulty managing traditional delivery methods, such as inhalers. A study was done to determine whether using a nebulizer to deliver medication reduced the duration of asthma attacks in pediatric patients. Patients were split into two groups. One group was given traditional inhalers as a method to deliver medication during an attack. The other group was given the nebulizer. The effectiveness of both delivery methods was measured by comparing the time it took for an attack to subside. Which numerical measure would present the maximum and minimum amount of time that it took for attacks to subside for both delivery methods?
a. Relative frequency
b. Five-number summary
c. Correlation
d. Conditional percentages
Answer: b. In this study one variable is categorical and the other is quantitative (C→Q). The five-number summary shows five important statistical values: minimum, first quartile, median, third quartile, and maximum. Therefore, the five-number summary would be the best choice to look at the minimum and maximum values for both groups.

31. An exercise physiologist was interested in how age affects reaction time. He sampled a population ranging in age from 10 years old to 70 years old. Subjects were asked to complete a simple reaction time test. The test consisted of subjects sitting in front of a computer and clicking a mouse when the image of a circle on a screen changed color from black to red. Subjects were asked to attempt the task 5 times. The times from the five trials were averaged to give each subject an overall reaction time score. Which numerical measure could be used to analyze the strength of the relationship between age and reaction time?
a. Correlation coefficient
b. Conditional percentages
c. Mean
d. Five-number summary
Answer: a. In this study both variables are quantitative (Q→Q) and form paired data. The strength of a correlation can be measured by calculating the correlation coefficient.

32. Consider a study on whether a certain medication improves kidney function in patients with chronic kidney disease. The Glomerular Filtration Rate (GFR) was collected for two groups: a treatment group and a placebo group. The researchers want to compare the middle 50% of the data for both groups. Which numerical measure would identify the two values that 50% of the data falls between for both groups?
a. Median
b. Five-number summary
c. Mode
d. Joint frequencies
Answer: b. In this study one variable is categorical and the other is quantitative (C→Q). The five-number summary shows five important statistical values: minimum, first quartile, median, third quartile, and maximum. 50% of the data falls between the first quartile (Q1) and the third quartile (Q3), so the researchers should look at the five-number summary to determine the values of Q1 and Q3 for both groups, as these values define the middle 50% of the data.

33. Summarizing Distributions of Two Variables: When we have two quantitative variables that are from paired data (that is, each x-value is paired with a particular y-value), we can use a scatterplot to display the data. The values of the explanatory variable appear on the x-axis, and the values of the response variable appear on the y-axis. Each pair of (x, y) values appears as a point on the scatterplot, and a completed scatterplot forms a picture of the data that shows us the relationship between the variables. When we look at the picture, we look for an overall pattern and whether there are any deviations (outliers) from the pattern. We can also describe the scatterplot by the direction, the form, and the strength of the relationship. There are four examples of scatterplots, representing (Q→Q): ____ correlation, ____ correlation, ____ correlation, ____ relationship.
Answer: positive correlation, negative correlation, no correlation, nonlinear relationship

34. Scatterplot (a) shows a ____ correlation between the variables because as the x-variable increases, the y-variable increases.
Answer: positive

35. Scatterplot (b) shows a ____ correlation because as the x-variable increases, the y-variable decreases.
Answer: negative

36. Scatterplot (c) shows ____; there is no apparent overall trend between the two variables.
Answer: no correlation

37. Scatterplot (d) shows a ____, or ____, relationship.
Answer: nonlinear, or curvilinear

38. The relationship between two quantitative variables (Q→Q) can be described by looking at a scatterplot of the two variables. We use three characteristics to describe the relationship: ____, ____, and ____.
Answer: direction, form, and strength

39. If a scatterplot shows a pattern of points that increase from the lower-left corner of the graph to the upper-right corner, we say that there is a ____ correlation between the two variables: when the x-variable increases, the y-variable increases. When the pattern goes from the upper-left corner of the graph to the lower-right corner, we say that there is a ____ correlation between the two variables: when the x-variable increases, the y-variable decreases.
Answer: positive; negative

40. If a scatterplot has a pattern of points that form a reasonably straight line, we describe it as linear. If the points form a pattern that is more curved than straight, we say it is ____, or ____.
Answer: nonlinear, or curvilinear

41. The pattern of points tells us about the strength of the correlation between the variables. If the points form a ____ pattern, we say there is a strong correlation between the variables. If the points are scattered and are not tightly grouped, we say there is a ____ or no correlation between the variables.
Answer: tightly grouped; weak correlation

42. 1) Strong positive linear correlation (lower-left corner to upper-right corner)
2) Strong negative linear correlation

43. 3) Very weak correlation
4) Nonlinear or curvilinear association

44. An ____ in a scatterplot is a point that does not fit the overall trend of the other points. It is usually far away from the other points and is easy to spot. There may be more than one outlier, but if several points are grouped together away from the majority of points, we call them a ____, not outliers. Consider the following scatterplot:
Answer: outlier; cluster

45. Weak negative linear relationship

46. No correlation

47. Weak positive linear correlation

48. Strong positive linear correlation

49. The larger the sample size, the greater the effect an outlier will have on correlation. True or False?
Answer: False. The smaller the sample size, the greater the effect of the outlier.

50. Side-by-side box plots are useful for which type of data?
Answer: b) Explanatory variable is categorical and response variable is quantitative. Side-by-side boxplots are useful when the explanatory variable is categorical and the response variable is quantitative.

51 to 57. Review
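Cards 11 to 14 pair each combination of variable types with a display. As a study aid, that rule can be sketched as a small lookup. The function name `choose_display` and the "C"/"Q" codes are illustrative choices, not part of the course material, and the response-categorical, explanatory-quantitative case is left out because the cards do not cover it.

```python
def choose_display(explanatory: str, response: str) -> str:
    """Return the display the cards recommend for a pair of variable types.

    Each argument is "C" (categorical) or "Q" (quantitative).
    """
    displays = {
        ("C", "C"): "two-way frequency table",  # card 11
        ("C", "Q"): "side-by-side boxplots",    # card 13
        ("Q", "Q"): "scatterplot",              # card 14
    }
    key = (explanatory.upper(), response.upper())
    if key not in displays:
        # The cards give no recommendation for this combination.
        raise ValueError(f"no display listed for {key[0]}->{key[1]}")
    return displays[key]


if __name__ == "__main__":
    # Card 12: starting blood pressure (Q) and dosage (Q).
    print(choose_display("Q", "Q"))
```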
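The joint frequencies, marginal frequencies, and conditional percentages in cards 3 and 15 to 20 can be checked with a short sketch. The counts below are invented for illustration (the original table is not reproduced in these cards), and the helper names are hypothetical.

```python
# Invented joint frequencies: rows are gender, columns are exercise program.
table = {
    "Women": {"Yoga": 30, "Running": 10, "Weights": 10},
    "Men": {"Yoga": 10, "Running": 20, "Weights": 20},
}


def row_totals(t):
    """Marginal frequency of each row (sum of its joint frequencies)."""
    return {row: sum(cells.values()) for row, cells in t.items()}


def column_totals(t):
    """Marginal frequency of each column."""
    totals = {}
    for cells in t.values():
        for col, count in cells.items():
            totals[col] = totals.get(col, 0) + count
    return totals


def overall_percentages(t):
    """Each cell as a percentage of the grand total (relative frequencies)."""
    grand = sum(row_totals(t).values())
    return {r: {c: 100 * n / grand for c, n in cells.items()} for r, cells in t.items()}


def conditional_row_percentages(t):
    """Each cell as a percentage of its row total."""
    rt = row_totals(t)
    return {r: {c: 100 * n / rt[r] for c, n in cells.items()} for r, cells in t.items()}


def conditional_column_percentages(t):
    """Each cell as a percentage of its column total."""
    ct = column_totals(t)
    return {r: {c: 100 * n / ct[c] for c, n in cells.items()} for r, cells in t.items()}


if __name__ == "__main__":
    print(row_totals(table))
    print(conditional_row_percentages(table)["Women"])
    print(conditional_column_percentages(table)["Women"])
```

With these made-up counts, 60% of women chose yoga (row percentage), while 75% of yoga participants were women (column percentage), which is why the choice of explanatory variable matters.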
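Cards 22, 26, 30, and 32 rely on the five-number summary and on the range (maximum minus minimum). A minimal sketch follows. Only the minimum and maximum of each group come from card 26; the middle values are invented, and the "inclusive" quartile method is an assumption, since textbooks and software use several quartile conventions that can give slightly different Q1 and Q3.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3, max (quartiles via the 'inclusive' method)."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


def data_range(values):
    """Variation measured as maximum minus minimum, as in card 26."""
    return max(values) - min(values)


# Extremes taken from card 26; the interior values are hypothetical.
treatment = [-1.19, 0.10, 0.50, 1.00, 2.24]
placebo = [-2.29, -1.00, -0.50, 0.00, 0.78]

if __name__ == "__main__":
    for name, group in (("treatment", treatment), ("placebo", placebo)):
        print(name, five_number_summary(group), "range:", round(data_range(group), 2))
```

The middle 50% of each group lies between the `q1` and `q3` entries, which is the comparison card 32 asks for.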
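Cards 24, 25, and 49 describe the correlation coefficient and how an outlier affects it. The sketch below computes Pearson's r by hand and compares one outlier added to a small sample with the same kind of outlier added to a larger one. The data are invented for illustration, and a single example like this only illustrates the claim in card 49; it does not prove it.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data, a value from -1 to 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def line_with_outlier(n):
    """n points on the line y = x, plus one outlier at (n + 1, 0)."""
    xs = list(range(1, n + 1)) + [n + 1]
    ys = list(range(1, n + 1)) + [0]
    return xs, ys


if __name__ == "__main__":
    print("perfect positive:", pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
    print("perfect negative:", pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))
    r_small = pearson_r(*line_with_outlier(5))
    r_large = pearson_r(*line_with_outlier(50))
    # The single outlier pulls r down much further in the small sample.
    print("small sample with outlier:", round(r_small, 3))
    print("large sample with outlier:", round(r_large, 3))
```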