Tag Archives: proportion

Statistics: Sampling Distribution of the Proportion

Sampling Distribution of the Proportion

In this section, we’ll be talking about finding the probability of a sample proportion. You may remember that a sampling distribution is a distribution not of scores, but all possible sample outcomes which can be drawn given that we’re working with a specific n. In this case, the following equations can help us figure out, based on a population proportion, how likely it is that we’ll draw a sample that has a chosen proportion. A question which would require these equations may sound like the following:

“A nation-wide survey was conducted about the perception of a brand. People were asked whether they liked the brand or disliked the brand and the results showed that 65% of people liked the brand. If we were to draw a random sample of 200 people, what is the probability that 80% of people within that sample will say they like the brand?”

The population proportion, denoted as ?, is the proportion of items in the entire population with the particular characteristic that we’re interested in investigating. The sample proportion, denoted by p, is the proportion of items in the sample with the characteristic that we’re interested in investigating. The sample proportion, which by definition is a statistic, is used to estimate the population proportion, which by definition is a parameter. Continue reading

Statistics: Probability and Sampling

Introduction to Probability and Sampling

Probabilities

A probability is a fraction or a proportion of all the possible outcomes. So it’s the number of classified outcomes classified as X divided by the total number of possible outcomes (N). It’s generally reported as a decimal, but it can also be reported as a fraction or a percentage. 

What is the role of probability in populations, samples, and inferential statistics? As we discussed before, because it’s usually impossible for researchers to draw data from the entirety of a population, they draw samples. The size of the sample affects how comparable the sample population is to the general population. Probability is used to predict what kind of samples are likely to be obtained from a population. Thus, probability establishes a connection between samples and populations; we know from looking at the population how likely it is for a specific sample to be drawn. We also use proportions that exist within samples to infer the probabilities that exist within a population. Inferential statistics rely on this connection when they use sample data as the basis for making conclusions about populations. Continue reading

Statistics: Z-score Basics

Z-Score Introduction

Standardized Distributions

Sometimes when working with data sets, we want to have the scores on the distribution standardized. Essentially, this means that we convert scores from a distribution so that they fit into a model that can be used to compare and contrast distributions from different works. For example, if you have a distribution of scores that show the temperature each day over the summer in Boston, it may be recorded in Fahrenheit. Someone else in Paris may have recorded their summer temperatures as well but in Celcius. If we wanted to compare these distributions of scores based on their descriptive statistics, we may want to convert them to the same standardized unit of measurement. 

Standardized distributions have one single unit of measurement. Raw scores are transformed into this standardized unit of measurement to be compared to one another. Ultimately, they should look just like the original distribution, the only difference is that the scores have been placed on a different unit of measurement. Continue reading

Statistics: Frequency Distributions

Frequency Distributions

In statistics, a lot of tests are run using many different points of data and it’s important to understand how those data are spread out and what their individual values are in comparison with other data points. A frequency distribution is just that – an outline of what the data look like as a unit. A frequency table is one way to go about this. It’s an organized tabulation showing the number of individuals located in each category on the scale of measurement. When used in a table, you are given each score from highest to lowest (X) and next to it the number of times that score appears in the data (f). A table in which one is able to read the scores that appear in a data set and how often those particular scores appear in the data set. Here’s a link to a Khan Academy video we found to be helpful in explaining this concept.

Organizing Data into a Frequency Distribution

  1. Find the range
  2. Order the table from highest score to lowest score, not skipping scores that might not have shown up in the data set.
  3. In the next column, document how many times this score shows up in the data set

Organizing data into a group frequency table

  1. The grouped frequency table should have about 10 intervals. A good strategy is to come up with some widths according to Guideline 2 and divide the total range of numbers by that width to see if there are close to 10 intervals.
  2. The width of the interval should be a relatively simple number (like 2, 5, or 10)
  3. The bottom score in each class interval should be a multiple of the width (0-9, 10-19, 20-19, etc.)
  4. All intervals should be the same width.

Continue reading