{"id":98,"date":"2019-06-30T19:36:53","date_gmt":"2019-06-30T19:36:53","guid":{"rendered":"http:\/\/blogs.ubalt.edu\/jboettinger\/?p=98"},"modified":"2020-02-19T20:45:08","modified_gmt":"2020-02-20T01:45:08","slug":"statistics-choosing-a-test","status":"publish","type":"post","link":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/2019\/06\/30\/statistics-choosing-a-test\/","title":{"rendered":"Statistics: Choosing a Test"},"content":{"rendered":"<p>The following post is about breaking down the uses for different types of tests. More importantly, it&#8217;s designed to help you know what test to use based on the question being asked. This is not a comprehensive list of all the statistical tests out there, so if you feel that there is something missing which you would like to be included, please leave a comment below. All formulas for the tests presented here can be found in the <a href=\"http:\/\/blogs.ubalt.edu\/mathsupportcenter\/2019\/06\/30\/statistics-formula-glossary\/\">Statistics Formula Glossary<\/a> post. At the bottom is a decision tree which may be helpful in visualizing the purpose of this post.<!--more--><\/p>\n<h1>Tests Involving Means<\/h1>\n<h2>Z-Tests<\/h2>\n<ul>\n<li><span style=\"font-weight: 400\">Manipulated variables (IV): 0<\/span><\/li>\n<li><span style=\"font-weight: 400\">Measured variables (DV): 1<\/span><\/li>\n<li><span style=\"font-weight: 400\">Population Standard Deviation Known?: Yes<\/span><\/li>\n<\/ul>\n<p>This is a type of single sample comparison of means. 
It is important that there is one sample, the population mean is known, and &#8211; for a z-test specifically &#8211; that the population standard deviation is known.<\/p>\n<p>Example word problem:<\/p>\n<p><em><span style=\"font-weight: 400\">The <\/span><span style=\"font-weight: 400\">price of gas<\/span><span style=\"font-weight: 400\"> is normally distributed across the country, \u03bc = 2.60, <\/span><span style=\"font-weight: 400\">\u03c3 = .42<\/span><span style=\"font-weight: 400\">. You want to know if our <\/span><span style=\"font-weight: 400\">prices<\/span><span style=\"font-weight: 400\"> here in Johnstown are significantly different from the population. The mean price of 40 gas stations in Johnstown is 2.29. Is the <\/span><span style=\"font-weight: 400\">price of gas<\/span><\/em><span style=\"font-weight: 400\"><em> in Johnstown significantly different from the national gas price?<\/em>*<\/span><\/p>\n<p><span style=\"font-weight: 400\">*http:\/\/www.pitt.edu\/~bertsch\/Z%20test%20practice.pdf<\/span><\/p>\n<h2>Single sample t-test<\/h2>\n<ul>\n<li><span style=\"font-weight: 400\">Manipulated variables (IV): 0<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Measured variables (DV): 1<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Population Standard Deviation Known?: No<\/span><\/li>\n<\/ul>\n<p>It is important that there is one sample, the population mean is known, and &#8211; for a t-test &#8211; that the population standard deviation is unknown.<\/p>\n<p><span style=\"font-weight: 400\">Example word problem:<\/span><\/p>\n<p><em><span style=\"font-weight: 400\">A coffee shop relocates to Italy and wants to make sure that all lattes are consistent. They believe that each latte has an average of 4 <\/span><span style=\"font-weight: 400\">oz of espresso<\/span><span style=\"font-weight: 400\">. 
If this is not the case, they must increase or decrease the <\/span><span style=\"font-weight: 400\">amount.<\/span><span style=\"font-weight: 400\"> A random sample of 25 lattes shows a mean of 4.6 oz of espresso and a standard deviation of .22 oz.*<\/span><\/em><\/p>\n<p><span style=\"font-weight: 400\">Remember, we\u2019re still working with a sample mean, so we need <\/span><b><i>standard error<\/i><\/b><span style=\"font-weight: 400\">. However, we don\u2019t have the population standard deviation, so we calculate the\u00a0<\/span><b><i>estimated standard error<\/i><\/b><b>.<\/b><\/p>\n<p><span style=\"font-weight: 400\">*http:\/\/www.mathandstatistics.com\/learn-stats\/hypothesis-testing\/one-sample-t-test-hypothesis-test-by-hand<\/span><\/p>\n<h2>Independent Samples t-test<\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manipulated variables (IV): 1<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Number of levels: 2<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Measured variables (DV): 1<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Independent samples vs repeated or matched participants: Independent<\/span><\/li>\n<\/ul>\n<p>In this case, we&#8217;re no longer working with a population mean. Instead, we&#8217;re looking at the differences between two samples. Each sample represents a different level for the same manipulated (or independent) variable. If you&#8217;re unsure of whether to use a pooled-variances t-test or separate variances t-test, it would be important to first run an F-test to check for homogeneity of variance. If you run the F-test and the results show that the p-value is greater than your level of significance (in other words, you fail to reject the null hypothesis), that means your variances are not significantly different and you should use the pooled-variances t-test. 
If the F-test shows a p-value less than your level of significance (in other words, you reject the null hypothesis), that means your variances are significantly different, you do not have homogeneity, and you should use a separate variances t-test. They are both types of independent t-tests, though, and so a word problem for each may look the same.<\/p>\n<p><span style=\"font-weight: 400\">Example word problem:<\/span><\/p>\n<p><em><span style=\"font-weight: 400\">Let\u2019s say you\u2019re curious about whether <\/span><span style=\"font-weight: 400\">New Yorkers<\/span> <span style=\"font-weight: 400\">and <\/span><span style=\"font-weight: 400\">Kansans <\/span><span style=\"font-weight: 400\">spend a different <\/span><span style=\"font-weight: 400\">amount of money per month on movies<\/span><span style=\"font-weight: 400\">. It\u2019s impractical to ask every New Yorker and Kansan about their movie spending, so instead you ask a sample of each\u2014maybe 300 New Yorkers and 300 Kansans\u2014and the averages are $14 and $18. Is there a significant statistical difference between the two samples?*<\/span><\/em><\/p>\n<p><span style=\"font-weight: 400\">*http:\/\/docs.statwing.com\/examples-and-definitions\/t-test\/<\/span><\/p>\n<h2>Repeated Measures or Matched Pairs t-test<\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manipulated variables (IV): 1<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Number of levels: 2<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Measured variables (DV): 1<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Independent samples vs repeated or matched participants: Repeated<\/span><\/li>\n<\/ul>\n<p>Just like with the independent samples t-test, we&#8217;re comparing the means of two different samples which represent two different levels of an independent variable. 
The difference here is that these samples are somehow related, either because they use the same participants or because the participants are matched together in some way.<\/p>\n<p><span style=\"font-weight: 400\">Example word problem:<\/span><\/p>\n<p><em><span style=\"font-weight: 400\">Researchers were interested in whether a new<\/span><span style=\"font-weight: 400\"> depression medication <\/span><span style=\"font-weight: 400\">could increase mood. Participants\u2019 <\/span><span style=\"font-weight: 400\">moods were measured on a standard PHQ-9 scale<\/span><span style=\"font-weight: 400\"> (in which a greater score means more severe depression)<\/span> <span style=\"font-weight: 400\">before taking the drug <\/span><span style=\"font-weight: 400\">and the average score was 14 with a standard deviation of 2. <\/span><span style=\"font-weight: 400\">After 6 weeks of taking the new drug<\/span><span style=\"font-weight: 400\">, the same participants filled out the PHQ-9 again and the new average score was an 11 with a standard deviation of 2.2. Was there a significant difference in the PHQ-9 scores<\/span> <span style=\"font-weight: 400\">before and after treatment<\/span><span style=\"font-weight: 400\">?<\/span><\/em><\/p>\n<h2>Independent One-Way ANOVA<\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manipulated variables (IV): 1<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Number of levels: More than 2; In this case 3<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Measured variables (DV): 1<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Independent samples vs repeated or matched participants: Independent<\/span><\/li>\n<\/ul>\n<p>Now, instead of comparing two samples which represent two levels of an independent variable, we&#8217;re working with more than two. 
However, we are still working with one independent variable.<\/p>\n<p><span style=\"font-weight: 400\">Example word problem:<\/span><\/p>\n<p><em><span style=\"font-weight: 400\">The owner of a plant nursery wanted to know which <\/span><span style=\"font-weight: 400\">fertilizer type<\/span><span style=\"font-weight: 400\"> to recommend to her customers in order to help their <\/span><span style=\"font-weight: 400\">azaleas produce more flowers<\/span><span style=\"font-weight: 400\">. She decided she wanted to test <\/span><span style=\"font-weight: 400\">chemical fertilizer<\/span><span style=\"font-weight: 400\">, <\/span><span style=\"font-weight: 400\">composted fertilizer<\/span><span style=\"font-weight: 400\">, and <\/span><span style=\"font-weight: 400\">manure<\/span><span style=\"font-weight: 400\">. <\/span><span style=\"font-weight: 400\">She tested 10 different azalea plants per condition<\/span><span style=\"font-weight: 400\">&#8230;<\/span><\/em><\/p>\n<h2>Repeated Measures One-Way ANOVA<\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manipulated variables (IV): 1<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Number of levels: More than 2; In this case 4<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Measured variables (DV): 1<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Independent samples vs repeated or matched participants: Repeated<\/span><\/li>\n<\/ul>\n<p>This is the same idea as the independent one-way ANOVA except that, just like with the repeated measures t-test, the samples are somehow related to one another.<\/p>\n<p><span style=\"font-weight: 400\">Example word problem:<\/span><\/p>\n<p><em><span style=\"font-weight: 400\">Researchers were interested in seeing how the <\/span><span style=\"font-weight: 400\">season <\/span><span style=\"font-weight: 400\">affects <\/span><span 
style=\"font-weight: 400\">overall mood<\/span><span style=\"font-weight: 400\"> in those who are not necessarily diagnosed with depression, dysthymia, or seasonal affective disorder. Participants were each asked to fill out a <\/span><span style=\"font-weight: 400\">PHQ-9<\/span><span style=\"font-weight: 400\"> once in the <\/span><span style=\"font-weight: 400\">fall<\/span><span style=\"font-weight: 400\">, once in the <\/span><span style=\"font-weight: 400\">winter<\/span><span style=\"font-weight: 400\">, once in the <\/span><span style=\"font-weight: 400\">spring<\/span><span style=\"font-weight: 400\">, and one final time in the <\/span><span style=\"font-weight: 400\">summer<\/span><span style=\"font-weight: 400\">.\u00a0<\/span><\/em><\/p>\n<h2>Two-Factor (or N-Way) ANOVA<\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manipulated variables (IV): More than 1; in this case 2<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Number of levels for IV1: 2<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Number of levels for IV2: 3<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Measured variables (DV): 1<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Example word problem:<\/span><\/p>\n<p><em><span style=\"font-weight: 400\">The plant nursery owner decided she wanted to change her experiment. Not only was she curious about <\/span><span style=\"font-weight: 400\">fertilizer type<\/span><span style=\"font-weight: 400\">, but she was wondering if maybe the amount of <\/span><span style=\"font-weight: 400\">water<\/span><span style=\"font-weight: 400\"> she gives to her azaleas affects the <\/span><span style=\"font-weight: 400\">bloom<\/span><span style=\"font-weight: 400\">. 
She kept her original three fertilizers (<\/span><span style=\"font-weight: 400\">chemical fertilizer<\/span><span style=\"font-weight: 400\">, <\/span><span style=\"font-weight: 400\">composted fertilizer<\/span><span style=\"font-weight: 400\">, and <\/span><span style=\"font-weight: 400\">manure<\/span><span style=\"font-weight: 400\">), but she also gave half of the plants in each condition <\/span><span style=\"font-weight: 400\">more water<\/span><span style=\"font-weight: 400\"> than she normally would and the other half the <\/span><span style=\"font-weight: 400\">normal amount of water<\/span><span style=\"font-weight: 400\">.\u00a0<\/span><\/em><\/p>\n<p>It&#8217;s important to remember that in this example, there are 2 levels for one independent variable and 3 for the other. This means there are 6 conditions in total: (1) chemical fertilizer and normal water, (2) chemical fertilizer and more water, (3) composted fertilizer and normal water, (4) composted fertilizer and more water, (5) manure and normal water, and (6) manure and more water.<\/p>\n<h2>Correlation<\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manipulated variables (IV): 0<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Measured variables (DV): 2<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Are we just relating stuff or are we trying to predict stuff?: Relate<\/span><\/li>\n<\/ul>\n<p>Remember, we&#8217;re not using correlations to compare means. 
We only want to find the relationship between two continuous variables.<\/p>\n<p><span style=\"font-weight: 400\">Example word problem:<\/span><\/p>\n<p><em><span style=\"font-weight: 400\">The following (imaginary) table provides data collected regarding <\/span><span style=\"font-weight: 400\">temperature in Fahrenheit<\/span><span style=\"font-weight: 400\"> and the <\/span><span style=\"font-weight: 400\">number of ice creams sold from the ice cream trucks in town<\/span><span style=\"font-weight: 400\">. Determine whether there is any sort of <\/span><span style=\"font-weight: 400\">relationship<\/span><span style=\"font-weight: 400\"> between the two variables. Make sure to describe the strength and direction of the correlation.\u00a0<\/span><\/em><\/p>\n<h2>Regression<\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manipulated variables (IV): 0<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Measured variables (DV): 2<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Are we just relating stuff or are we trying to predict stuff?: Predict<\/span><\/li>\n<\/ul>\n<p>We can use the same data from correlations to do regressions. 
The difference is that there are some added steps for the purpose of predicting values using a linear equation.<\/p>\n<p><span style=\"font-weight: 400\">Example word problem:<\/span><\/p>\n<p><em><span style=\"font-weight: 400\">The following (imaginary) table provides data collected regarding <\/span><span style=\"font-weight: 400\">temperature in Fahrenheit<\/span><span style=\"font-weight: 400\"> and the <\/span><span style=\"font-weight: 400\">number of ice creams sold from the ice cream trucks in town<\/span><span style=\"font-weight: 400\">.\u00a0 Does knowledge about temperature <\/span><span style=\"font-weight: 400\">predict <\/span><span style=\"font-weight: 400\">ice cream sales?<\/span><\/em><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-176\" src=\"http:\/\/blogs.ubalt.edu\/jboettinger\/wp-content\/uploads\/sites\/1114\/2019\/06\/Decision-Tree-300x144.jpg\" alt=\"\" width=\"790\" height=\"379\" srcset=\"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-content\/uploads\/sites\/1114\/2019\/06\/Decision-Tree-300x144.jpg 300w, https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-content\/uploads\/sites\/1114\/2019\/06\/Decision-Tree-768x368.jpg 768w, https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-content\/uploads\/sites\/1114\/2019\/06\/Decision-Tree-1024x490.jpg 1024w, https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-content\/uploads\/sites\/1114\/2019\/06\/Decision-Tree-624x299.jpg 624w, https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-content\/uploads\/sites\/1114\/2019\/06\/Decision-Tree.jpg 1473w\" sizes=\"(max-width: 790px) 100vw, 790px\" \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The following post is about breaking down the uses for different types of tests. More importantly, it&#8217;s designed to help you know what test to use based on the question being asked. 
This is not a comprehensive list of all the statistical tests out there, so if you feel that there is something missing which [&hellip;]<\/p>\n","protected":false},"author":1347,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[151,153,148,152,158,155,156,150,159,160],"tags":[4,3,5,95,74,8,82,7,6,88,84,79,71,87,116],"_links":{"self":[{"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/posts\/98"}],"collection":[{"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/users\/1347"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/comments?post=98"}],"version-history":[{"count":7,"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/posts\/98\/revisions"}],"predecessor-version":[{"id":596,"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/posts\/98\/revisions\/596"}],"wp:attachment":[{"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/media?parent=98"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/categories?post=98"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.ubalt.edu\/mathsupportcenter\/wp-json\/wp\/v2\/tags?post=98"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}