In this chapter, we will complete our discussion of descriptive statistics with a look at standard scores. We will see how standard scores play an important role in describing the relative location of a given score, and we will describe how standard scores are related to another measure of location, percentiles.
From standard scores, we will move to a distribution known as the normal probability distribution and to a special case of it, the standard normal distribution. In later chapters, you will see that probability distributions play an important role in statistical inference.
We will conclude the chapter with a discussion of normality plots, which are used to provide a visual indication of whether or not a set of scores follows a normal distribution. Normality plots and standard scores are usually a routine part of data checks.
Your R objectives for this chapter are to: (a) transform a data set into a standard score known as a z score; (b) transform the z scores into T scores, the most commonly used standard score; (c) create a normality plot; and (d) create a Q-Q summary table.
A standard score is any score expressed in standard deviation units. Such scores are also called standardized deviates or standard deviates. The z score is a standard score that is commonly used in statistics. It is found as:
\[ \begin{equation} z_X = \frac {X-M_X} {s_X} \tag{9-1} \end{equation} \] Here, X is a score, \(M_X\) is the sample mean of the X scores, and \(s_X\) is the sample standard deviation of the scores. (Where there is no potential confusion with a second variable, z, M, and s are written without subscripts.) A z score is found using a transformation in which the mean of the scores is first subtracted from a given score and the result is then divided by the standard deviation of the scores. This transformation of X into z yields a new set of scores (z scores) whose mean is 0 and whose standard deviation is 1. (Note that if the population mean and standard deviation are known, they should be used in place of M and s to find z scores. R uses the sample statistics M and s to find z scores.)
We can illustrate the z-transformation of Equation (9-1) with a set of data where the mean of the scores is 65 and the standard deviation of the scores is 10. The z score corresponding to the score of 75 is 1:
\[ z = \frac {75-65} {10} = \frac {10}{10} = 1.0 \] The z score corresponding to the score of 60 is -.50:
\[ z = \frac {60-65} {10} = \frac {-5}{10} = -0.50 \] #### Standard Deviation Units
These transformed scores are expressed in terms of standard deviation units. What does having a score written in standard deviation units mean? If you had 24 oranges, how many dozen oranges would you have? The answer is two dozen oranges. You arrive at this answer by dividing the number of oranges, 24, by the unit for a dozen oranges, 12. In other words, you transform 24 oranges into units in terms of dozens.
When we transform a score into a z score, we are expressing the distance that the score lies from its mean in terms of standard deviation units. Thus, in the example, the score of 75 was 10 points, or 1 standard deviation unit, above its mean and was expressed as a z score of 1. Similarly, the score of 60 was 5 points, or 1/2 of a standard deviation, below the mean and was expressed in standard deviation units as a z score of -0.50. The minus sign indicates that the score is below its mean.
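The arithmetic of Equation (9-1) can be sketched in a few lines. The chapter performs this step in R; the snippet below is a minimal Python illustration using the worked example's mean of 65 and standard deviation of 10, with a hypothetical helper name `z_score`.

```python
def z_score(x, m, s):
    """Equation (9-1): express the distance of x from the mean m
    in standard deviation units."""
    return (x - m) / s

# The worked example: M = 65, s = 10.
print(z_score(75, 65, 10))  # 1.0  (10 points = 1 standard deviation above)
print(z_score(60, 65, 10))  # -0.5 (5 points = 1/2 standard deviation below)
```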
Other standard scores are generally based on the z score. Given a z score, we can then use another transformation to find a standard score with any mean and standard deviation that we want. The general equation used to arrive at these standard scores is:
\[ \begin{equation} G = \mu_G + \sigma_G z \tag{9-2} \end{equation} \] Here, G (for general standard score) is the new standard score, \(\mu_G\) is the desired mean, \(\sigma_G\) is the desired standard deviation, and z is a z score found using Equation (9-1).
We use a second transformation to transform a z score to another scale of measurement because a z score may be negative and/or contain several decimal digits. By transforming z scores to standard scores with a mean of 50 or 500 and a standard deviation of 10 or 100, we preserve two or three digits of accuracy in whole-number form and avoid negative numbers. Thus, this second transformation yields numbers that are generally more palatable to the layperson.
The most commonly used standard-score scale is the T score, which has a mean of 50 and a standard deviation of 10. The T score is defined as:
\[ \begin{equation} T=50+10z \tag{9-3} \end{equation} \] When Equation (9-3) yields a decimal number, the number is rounded to the nearest whole number. Using this transformation, a z score of -.60 on a test would be transformed into a T score of:
\[ T=50+10(-0.60) = 44 \] Here, we have removed both the minus sign and the decimal point, but we have retained the relative position of the score to its mean; that is, 44 is 6 points, or .60 of a standard deviation (10), below the mean of 50. T scores are used to report performance on standardized tests such as the Differential Aptitude Tests, the Strong Vocational Interest Blank, and the Metropolitan Achievement Tests.
Other examples of the use of standard scores to report performance are: the Graduate Record Examination (GRE), \(\mu\) = 500 and \(\sigma\) = 100; the College Entrance Examination Board (CEEB), \(\mu\) = 500 and \(\sigma\) = 100; the American College Testing Program (ACT), \(\mu\) = 20 and \(\sigma\) = 5; Wechsler deviation IQ, \(\mu\) = 100 and \(\sigma\) = 15; and the Stanford-Binet deviation IQ, \(\mu\) = 100 and \(\sigma\) = 16.
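Equations (9-2) and (9-3) are simple rescalings, so they can be sketched directly. The function names below are illustrative, not part of any package; the chapter performs the equivalent steps in R.

```python
def standard_score(z, mean, sd):
    """Equation (9-2): rescale a z score to a scale with the given mean and sd."""
    return mean + sd * z

def t_score(z):
    """Equation (9-3): T scores have mean 50 and sd 10, rounded to a whole number."""
    return round(standard_score(z, 50, 10))

print(t_score(-0.60))                 # 44, as in the text
print(standard_score(2.0, 500, 100))  # 700.0 on a GRE/CEEB-style scale
```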
R easily finds z scores and T scores using two approaches. The steps for obtaining these standardized scores will be included in the SUMMARY section. We will use the data from exercise 1 of chapter 6 (repeated in Figure 9a) as an example in this chapter.
There are several major reasons for transforming a set of scores into a set of standard scores. We will consider each of these uses of standard scores in more detail in the sections that follow.
If you were told that your score of 60 corresponded to a z score of 0.00, you would know immediately that you scored at the mean of the students in your class. If you received a z score of -2.00, you would be unhappy because your score would be two standard deviations below the mean of the class. If your z score was +2.50, you would be very happy because you would have scored 2-1/2 standard deviations above the mean of the class.
Figure 9c shows the distribution of grades, in terms of z scores, that you might expect to find in a large, undergraduate class. You can see that a person with a z score of 2.50 would receive an A, a person with a z score of 0 would receive a C, and a person with a z score of -2.00 would receive an F.
If you received a score of 90 on a mathematics test and a score of 60 on an English test, on which test would you have the relatively higher score? To answer this question, you must know what the mean and standard deviation are for each test. Without knowledge of the mean and standard deviation, you have no idea what these scores mean relative to each other.
For example, 90 may be the lowest score on the mathematics test and 60 may be the highest score on the English test, or vice versa. If these scores are stated as z scores, however, they have the same mean (0.00) and standard deviation (1.00) and are therefore directly comparable. For example, if the mathematics score of 90 is transformed into a z score of 0.00 and the English score of 60 is transformed into a z score of +1.00, you know that you did relatively better in English than in mathematics. However, if the mathematics score of 90 transforms into a z score of 2.00 and the English score of 60 transforms into a z score of -1.65, you know that you did relatively better in mathematics than in English.
The data in exercise 1 of chapter 6 provide us with an excellent example of the use of z scores to identify outliers. Figure 9b shows the original data from this exercise along with the corresponding z scores and T scores. In Figure 9b, the outlier is easily spotted as the score 189, whose z score is 3.9045 and whose T score is 89.0448. Because z scores enable you to easily spot potential outliers, they should be routinely used in data analyses involving interval or ratio data.
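Because the exercise data are not reproduced here, the sketch below uses a small hypothetical data set with one extreme value; flagging scores with |z| greater than about 2.5 (or 3) is a common convention, not a fixed standard.

```python
from statistics import mean, stdev

# Hypothetical scores containing one extreme value.
scores = [12, 15, 14, 10, 13, 11, 16, 14, 13, 189]
m, s = mean(scores), stdev(scores)

# Flag any score lying more than 2.5 standard deviations from the mean.
flagged = [x for x in scores if abs((x - m) / s) > 2.5]
print(flagged)  # [189]
```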
To understand the use of z scores with the normal distribution, we will discuss the normal distribution in more detail in the next section. We will also show how you can use z scores when your data follow a normal distribution or when you can assume that your data was sampled from a normal distribution.
The normal distribution, an example of which is shown in Figure 9c, is also called the Gaussian or the bell-shaped distribution. It is a mathematical curve attributed to Abraham De Moivre (1667-1754). De Moivre developed the normal distribution in his search for a mathematical curve that would describe probabilities found in games of chance.
The equation for the normal distribution is:
\[ \begin{equation} y = P(X) = \frac {e^{-{(X-\mu)}^2 / {2 \sigma^2}}} {\sigma \sqrt{2 \pi}} = \frac {1} {\sqrt{2 \pi \sigma ^2}}\, e^{{-(X-\mu)^2} / {2 \sigma^2}} \tag{9-4} \end{equation} \] In Equation (9-4), y is referred to as the probability density, that is, the height of the curve for a given value of X. Any normal distribution is determined by the fixed values of its mean, \(\mu\), and its variance, \(\sigma^2\). This equation may look complex, but with a little practice you can have R plot different normal distributions (see appendix G for further information).
From Equation (9-4), you can see that a given normal distribution is completely determined by its mean (\(\mu\)), which fixes the curve's location, and its variance (\(\sigma^2\)), which fixes its spread. For any normal distribution, the value of the mean is independent of the value of the variance, and vice versa. That is, knowing the value of a normal distribution’s mean does not give you any information about its variance, and vice versa. The various normal distributions shown in figures 9d, 9e, and 9f illustrate these points.
Figure 9d shows two normal distributions with different means but the same variances. Figure 9e shows two normal distributions with the same means but with different variances. Figure 9f shows two normal distributions with different means and different variances. As we shall see, all of these normal distributions can be transformed, using Equation (9-1), to the standard normal distribution, which has a mean of 0 and a variance of 1.
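Equation (9-4) can be evaluated directly to draw curves like those in figures 9d through 9f; the helper below is a hand-rolled sketch of the density, not a library routine.

```python
from math import exp, pi, sqrt

def normal_density(x, mu=0.0, var=1.0):
    """Equation (9-4): height of the normal curve with mean mu and variance var."""
    return exp(-(x - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)

print(round(normal_density(0), 4))                     # 0.3989, peak of the standard normal
print(round(normal_density(100, mu=100, var=225), 4))  # 0.0266, a flatter curve (sd = 15)
print(normal_density(1) == normal_density(-1))         # True: the curve is symmetric
```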
The normal distribution is described as a theoretical probability distribution. The term theoretical applies because the normal distribution is a mathematical function. A probability distribution is a frequency distribution whose y-axis represents probabilities (or, for continuous distributions, probability densities). In the following section, we present examples of discrete and continuous probability distributions.
Consider a class that is made up of 15 Education students, 10 Business students, and 5 Communication students. If you were to select one student at random from this class, what is the probability that this student would be a Business student? If we define probability as the number of favorable outcomes divided by the total number of equally likely outcomes, the answer is 10/30, or 1 out of 3, or .3333, because there are 10 Business students among the 30 students in the class. The probability distribution for these students is shown in Figure 9g.
Figure 9g shows that the probability of selecting an Education student is 15/30 or 1/2, and the probability of selecting a Communication student is 5/30 or 1/6. The distribution shown in figure 9g is a discrete probability distribution because only one of three outcomes is possible when we select a student at random from the class under consideration. That is, the classification system forces each student into only one category and no students are allowed to be between the categories, for example, no one can be classified as a double-major.
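The class example can be written out as a small discrete distribution; using exact fractions makes it easy to confirm that the probabilities sum to 1.

```python
from fractions import Fraction

counts = {"Education": 15, "Business": 10, "Communication": 5}
total = sum(counts.values())

# p(major) = number of students in that major / total number of students
probs = {major: Fraction(n, total) for major, n in counts.items()}

print(probs["Business"])   # 1/3
print(probs["Education"])  # 1/2
print(sum(probs.values())) # 1
```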
In figure 9h, we have the discrete probability distribution for a single throw of a pair of dice. (A discussion of how one finds these probabilities is not necessary here. If you are interested, however, consult appendix F where the binomial distribution and various probability rules are presented.) In Figure 9h, the probability of rolling a pair of dice whose total is 12 (that is, 6 and 6) is 1 out of 36, and the probability of throwing a 7 is 6 out of 36. You can see that if you throw a pair of dice just one time, the most likely event is a 7; a 7 can show up as 1 and 6, 2 and 5, 3 and 4, 4 and 3, 5 and 2, or 6 and 1. Seven has the highest probability of any event; any total between 2 and 12 is possible, but the totals are not equally probable.
An important feature of the probability distribution shown in Figure 9h is that we can find the probability of randomly obtaining a number less than or equal to a given number by adding the probability of the given number to the probabilities of the numbers that are less than it. For example, the probability of throwing a pair of dice and having the result be a number that is less than or equal to 4 is 6/36, that is, the sum of 3/36 (the probability of throwing a 4) plus 2/36 (the probability of throwing a 3) plus 1/36 (the probability of throwing a 2). This probability is commonly written as:
\[ p(X \le 4) = p(X=4) + p(X=3) + p(X=2)= \frac{3}{36} + \frac{2}{36} + \frac{1}{36} = \frac{6}{36} = \frac{1}{6} \] Here, p stands for probability, X is the event of interest, and p(X ≤ 4) is read as “the probability that X is less than or equal to 4.”
Given that we know the probability of rolling two dice and randomly obtaining a number that is less than or equal to 4, we can then easily find the probability of rolling two dice and randomly obtaining a number that is greater than 4. The probability of obtaining a number greater than 4 is found as 1 minus the probability of randomly obtaining a number less than or equal to 4:
\[ p(X \gt 4) = 1 - p(X \le 4) = 1-\frac{1}{6} = \frac{5}{6} \] This works because the probability of one of the listed events happening when we throw two dice is 1: the sum of all of the probabilities in a probability distribution is always 1.
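The dice distribution of Figure 9h and the two rules just used (adding probabilities, and taking the complement) can be sketched by enumerating all 36 equally likely rolls.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls give each total (Figure 9h).
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
p = {total: Fraction(n, 36) for total, n in counts.items()}

print(p[7])            # 1/6, the most likely total
p_le_4 = p[2] + p[3] + p[4]
print(p_le_4)          # 1/6, as computed in the text
print(1 - p_le_4)      # 5/6, by the complement rule
print(sum(p.values())) # 1
```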
We can also find the probability that a number will occur that is between two numbers. For example, what is the probability that when we throw two dice, the result will be a number between (and including) 9 and 5? To answer this question, we must add the probabilities associated with the numbers (i.e., events) 9, 8, 7, 6, and 5. Written symbolically we have:
\[ \begin{align} p(5 \le X \le 9) &= p(X=5) + p(X=6) + p(X=7) + p(X=8) + p(X=9) \\ &= \frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36} + \frac{4}{36} \\ &= \frac{24}{36} \\ &= \frac{2}{3} \end{align} \] ### Example 3: A Continuous Probability Distribution
The normal distribution shown in Figure 9i is an example of a continuous probability distribution. It is a special normal distribution called the standard normal distribution because it has a mean of 0 and a standard deviation of 1; that is, it is a continuous distribution of z scores. In a continuous probability distribution, the probability of a single event occurring is 0. This happens because in a continuous distribution, a probability is found as the area between two points. Given only one point, there is no area between that point and itself.
In a continuous probability distribution, however, we are able to find the area between two distinct points, that is, for an interval. To illustrate this property, intervals, in terms of standard deviation units, have been marked on the x-axis of figure 9i. Associated with each interval in figure 9i is the probability of randomly obtaining a score from that interval by chance. For example, the probability of drawing a score at random that falls between 0 and 1 is .3413.
We can use the intervals and their associated probabilities to find the probability of choosing a z score at random from a variety of intervals. For example, the probability of selecting a z score at random that is less than or equal to 0 is .50, that is, p(z < 0) = .50.
We find this probability in several ways. Perhaps the easiest way is to observe that the normal distribution is symmetrically distributed and that 0 is the score at the center of this distribution; that is, a vertical line through 0 cuts the normal distribution in half. Then, since the area under a probability distribution always equals 1, we know that half of 1 is .50. Another method is to add up the probabilities in the intervals to the left of 0.
Then we have:
\[
\begin{align}
p(-1.0 \lt z \lt 0.0) &= .3413 \\
p(-2.0 \lt z \lt -1.0) &= .1359 \\
p(-3.0 \lt z \lt -2.0) &= .0214 \\
p(-\infty \lt z \lt -3.0) &= .0013 \\
\text{Sum} &= .4999 \approx .5000 = p(z \lt 0)
\end{align}
\]
### Figure 9g A discrete probability distribution based on major
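Rather than summing table intervals by hand, these areas can be checked with Python's standard-library `statistics.NormalDist` (the chapter obtains the same values in R via `pnorm()`). Note that four-decimal table entries can differ from these results by a unit in the last digit.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area(a, b):
    """p(a < z < b): area under the standard normal curve between a and b."""
    return z.cdf(b) - z.cdf(a)

print(round(area(-1.0, 0.0), 4))   # 0.3413
print(round(area(-2.0, -1.0), 4))  # 0.1359
print(round(area(-3.0, -2.0), 4))  # 0.0214
print(round(z.cdf(0.0), 4))        # 0.5, the left half of the curve
print(round(area(-1, 1), 4))       # 0.6827 (the familiar "68%")
```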
Other probabilities that will be of interest to us as we study statistical inference are: The probability of randomly obtaining a z score within one standard deviation of the mean; that is,
\[ p(-1 \le z \le 1)= .6826, \text{or approximately 68%} \]
The probability of obtaining a z score within two standard deviations of the mean; that is,
\[ p(-2 \le z \le 2)= .9544, \text{or approximately 95%} \]
The probability of obtaining a z score within three standard deviations of the mean; that is,
\[ p(-3 \le z \le 3)= .9974, \text{or approximately 99.7%} \] ### The Importance of Probability Distributions
Probability distributions play their most important role in inferential statistics because when we use inferential statistics, we set an a priori probability that a sample will fall into, or out of, a given interval. In the following chapters, we will discuss inferential statistics, using one or more probability distributions in each chapter.
In descriptive statistics, probability distributions allow us to determine the percentage of scores that fall below a given score. That is, they allow us to determine the location of a given score with respect to the other scores. It is this property of probability distributions that we will discuss in this chapter.
In this section, we will bring together the statistical terms that are of interest in this chapter: z scores, percentiles, percentile rank, and the standard normal probability distribution. An important point to remember, however, is that z scores do not transform scores into scores that are normally distributed. In fact, the z score transformation does not alter the original distribution of the scores. If your original set of scores is rectangular, then their associated z scores will also have a rectangular distribution.
If you have a set of scores that follows a normal distribution or can be assumed to have been sampled from a normal distribution, however, you can use the properties of the standard normal probability distribution to obtain additional information. This is possible because all scores that follow a normal distribution can be transformed into z scores that have a standard normal distribution.
A percentile is the score below which a given percentage of scores fall. The percentile rank is the percentage of scores below a given score or percentile.
Given a probability distribution, the percentile rank of a given score is found by changing the probability into a percentage. To accomplish this, you multiply the probability by 100 and append % to the answer. For example, we have found that the probability of randomly obtaining a score below 0.00 is .5000. If we multiply this probability by 100 and then append %, the score 0.00 has a percentile rank of 50%, and the 50th percentile is 0.00. Remember that in a distribution, the median is that score below which 50% of the scores fall. Therefore, 0.00 is the median of the standard normal distribution.
In statistical work, two of the most common problems are: (a) to find the probability associated with a given z score interval and (b) to find the z score interval associated with a given probability. In appendix A, Table 9.1(a) can be used to solve problem (a) and Table 9.1(b) can be used to solve problem (b). In using these tables, we will adopt the notation z[p] where p represents the proportion of scores below a given z score. Therefore, z[p] is the percentile below which p*100% of the scores fall. For example, z[.50] = 0.00. Here, 50% of the scores fall below the z score of 0.00. The z scores in Table 9.1(a) and Table 9.1(b) follow a normal distribution and are often referred to as standard normal deviates.
Table 9.1(a) contains probabilities associated with intervals between the mean z score of 0.00 and any z score, to two decimal places and up to a z score of 3.09. We can use Table 9.1(a) to find the probability of randomly drawing a z score from intervals under the standard normal distribution between -3.09 and 3.09. For example, the probability of randomly selecting a z score between 0.00 and 1.64 is .4495. You find this probability in Table 9.1(a) by first locating the z score of 1.6 in the table’s left-most column and then moving to the right along the row associated with 1.6 until you are below the column labeled .04. Then at the coordinates of 1.6 and .04 (the sum of which yields 1.64), you will find the probability .4495.
### Table 9.1(a) Areas Under the Standard Normal Curve Between 0 and z

z | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09
---|---|---|---|---|---|---|---|---|---|---
0 | 0.0000 | 0.0040 | 0.0080 | 0.0120 | 0.0160 | 0.0199 | 0.0239 | 0.0279 | 0.0319 | 0.0359 |
0.1 | 0.0398 | 0.0438 | 0.0478 | 0.0517 | 0.0557 | 0.0596 | 0.0636 | 0.0675 | 0.0714 | 0.0753 |
0.2 | 0.0793 | 0.0832 | 0.0871 | 0.0910 | 0.0948 | 0.0987 | 0.1026 | 0.1064 | 0.1103 | 0.1141 |
0.3 | 0.1179 | 0.1217 | 0.1255 | 0.1293 | 0.1331 | 0.1368 | 0.1406 | 0.1443 | 0.1480 | 0.1517 |
0.4 | 0.1554 | 0.1591 | 0.1628 | 0.1664 | 0.1700 | 0.1736 | 0.1772 | 0.1808 | 0.1844 | 0.1879 |
0.5 | 0.1915 | 0.1950 | 0.1985 | 0.2019 | 0.2054 | 0.2088 | 0.2123 | 0.2157 | 0.2190 | 0.2224 |
0.6 | 0.2257 | 0.2291 | 0.2324 | 0.2357 | 0.2389 | 0.2422 | 0.2454 | 0.2486 | 0.2517 | 0.2549 |
0.7 | 0.2580 | 0.2611 | 0.2642 | 0.2673 | 0.2704 | 0.2734 | 0.2764 | 0.2794 | 0.2823 | 0.2852 |
0.8 | 0.2881 | 0.2910 | 0.2939 | 0.2967 | 0.2995 | 0.3023 | 0.3051 | 0.3078 | 0.3106 | 0.3133 |
0.9 | 0.3159 | 0.3186 | 0.3212 | 0.3238 | 0.3264 | 0.3289 | 0.3315 | 0.3340 | 0.3365 | 0.3389 |
1 | 0.3413 | 0.3438 | 0.3461 | 0.3485 | 0.3508 | 0.3531 | 0.3554 | 0.3577 | 0.3599 | 0.3621 |
1.1 | 0.3643 | 0.3665 | 0.3686 | 0.3708 | 0.3729 | 0.3749 | 0.3770 | 0.3790 | 0.3810 | 0.3830 |
1.2 | 0.3849 | 0.3869 | 0.3888 | 0.3907 | 0.3925 | 0.3944 | 0.3962 | 0.3980 | 0.3997 | 0.4015 |
1.3 | 0.4032 | 0.4049 | 0.4066 | 0.4082 | 0.4099 | 0.4115 | 0.4131 | 0.4147 | 0.4162 | 0.4177 |
1.4 | 0.4192 | 0.4207 | 0.4222 | 0.4236 | 0.4251 | 0.4265 | 0.4279 | 0.4292 | 0.4306 | 0.4319 |
1.5 | 0.4332 | 0.4345 | 0.4357 | 0.4370 | 0.4382 | 0.4394 | 0.4406 | 0.4418 | 0.4429 | 0.4441 |
1.6 | 0.4452 | 0.4463 | 0.4474 | 0.4484 | 0.4495 | 0.4505 | 0.4515 | 0.4525 | 0.4535 | 0.4545 |
1.7 | 0.4554 | 0.4564 | 0.4573 | 0.4582 | 0.4591 | 0.4599 | 0.4608 | 0.4616 | 0.4625 | 0.4633 |
1.8 | 0.4641 | 0.4649 | 0.4656 | 0.4664 | 0.4671 | 0.4678 | 0.4686 | 0.4693 | 0.4699 | 0.4706 |
1.9 | 0.4713 | 0.4719 | 0.4726 | 0.4732 | 0.4738 | 0.4744 | 0.4750 | 0.4756 | 0.4761 | 0.4767 |
2 | 0.4772 | 0.4778 | 0.4783 | 0.4788 | 0.4793 | 0.4798 | 0.4803 | 0.4808 | 0.4812 | 0.4817 |
2.1 | 0.4821 | 0.4826 | 0.4830 | 0.4834 | 0.4838 | 0.4842 | 0.4846 | 0.4850 | 0.4854 | 0.4857 |
2.2 | 0.4861 | 0.4864 | 0.4868 | 0.4871 | 0.4875 | 0.4878 | 0.4881 | 0.4884 | 0.4887 | 0.4890 |
2.3 | 0.4893 | 0.4896 | 0.4898 | 0.4901 | 0.4904 | 0.4906 | 0.4909 | 0.4911 | 0.4913 | 0.4916 |
2.4 | 0.4918 | 0.4920 | 0.4922 | 0.4925 | 0.4927 | 0.4929 | 0.4931 | 0.4932 | 0.4934 | 0.4936 |
2.5 | 0.4938 | 0.4940 | 0.4941 | 0.4943 | 0.4945 | 0.4946 | 0.4948 | 0.4949 | 0.4951 | 0.4952 |
2.6 | 0.4953 | 0.4955 | 0.4956 | 0.4957 | 0.4959 | 0.4960 | 0.4961 | 0.4962 | 0.4963 | 0.4964 |
2.7 | 0.4965 | 0.4966 | 0.4967 | 0.4968 | 0.4969 | 0.4970 | 0.4971 | 0.4972 | 0.4973 | 0.4974 |
2.8 | 0.4974 | 0.4975 | 0.4976 | 0.4977 | 0.4977 | 0.4978 | 0.4979 | 0.4979 | 0.4980 | 0.4981 |
2.9 | 0.4981 | 0.4982 | 0.4982 | 0.4983 | 0.4984 | 0.4984 | 0.4985 | 0.4985 | 0.4986 | 0.4986 |
3 | 0.4987 | 0.4987 | 0.4987 | 0.4988 | 0.4988 | 0.4989 | 0.4989 | 0.4989 | 0.4990 | 0.4990 |
3.1 | 0.4990 | 0.4991 | 0.4991 | 0.4991 | 0.4992 | 0.4992 | 0.4992 | 0.4992 | 0.4993 | 0.4993 |
3.2 | 0.4993 | 0.4993 | 0.4994 | 0.4994 | 0.4994 | 0.4994 | 0.4994 | 0.4995 | 0.4995 | 0.4995 |
3.3 | 0.4995 | 0.4995 | 0.4995 | 0.4996 | 0.4996 | 0.4996 | 0.4996 | 0.4996 | 0.4996 | 0.4997 |
Note that Table 9.1(a) only contains probabilities for intervals from the right half of the normal curve, that is, the half with positive z scores. Because of the symmetry of the normal curve, however, we can find the probabilities associated with intervals in the left half of the curve by using the intervals on the right half of the curve. For example, the probability associated with the interval from 0 to -1.64 is .4495, the same as that for 0.00 to 1.64. Therefore, the probabilities associated with intervals having negative boundaries are found using Table 9.1(a) and ignoring the signs of the boundaries.
We can use Table 9.1(a) to find the percentile rank of the score 1.64. Since Table 9.1(a) provides us with the probability of obtaining a z score between the mean and a given score, we know that the probability of randomly obtaining a z score between 0.00 and 1.64 is .4495. We also know the probability of obtaining a score below 0.00 is .5000. Therefore, the probability of randomly obtaining a score below 1.64 is the sum of these two probabilities, or .9495. Then, if we multiply .9495 by 100, we have that the score 1.64 has a percentile rank of 94.95%, or approximately 95%, and that the 95th percentile is 1.64. Using our new notation, we have z[.9495] = 1.64. Similarly, we have z[.0505] = -1.64, since the area below -1.64 is .0505, which is .5000 - .4495.
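These table lookups correspond to evaluating the cumulative distribution function directly. A sketch with Python's standard-library `statistics.NormalDist` (the chapter uses R's `pnorm()` for the same computation):

```python
from statistics import NormalDist

z = NormalDist()

# Percentile rank of the z score 1.64: the area below it, times 100.
print(round(z.cdf(1.64), 4))   # 0.9495, so the percentile rank is about 95%

# By symmetry, the area below -1.64 is the complement of .9495.
print(round(z.cdf(-1.64), 4))  # 0.0505
```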
### Table 9.1(b) Cumulative Normal Distribution – Values of \(z_p\)

p | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09
---|---|---|---|---|---|---|---|---|---|---
0 | -Inf | -2.3263 | -2.0537 | -1.8808 | -1.7507 | -1.6449 | -1.5548 | -1.4758 | -1.4051 | -1.3408 |
0.1 | -1.2816 | -1.2265 | -1.1750 | -1.1264 | -1.0803 | -1.0364 | -0.9945 | -0.9542 | -0.9154 | -0.8779 |
0.2 | -0.8416 | -0.8064 | -0.7722 | -0.7388 | -0.7063 | -0.6745 | -0.6433 | -0.6128 | -0.5828 | -0.5534 |
0.3 | -0.5244 | -0.4959 | -0.4677 | -0.4399 | -0.4125 | -0.3853 | -0.3585 | -0.3319 | -0.3055 | -0.2793 |
0.4 | -0.2533 | -0.2275 | -0.2019 | -0.1764 | -0.1510 | -0.1257 | -0.1004 | -0.0753 | -0.0502 | -0.0251 |
0.5 | 0.0000 | 0.0251 | 0.0502 | 0.0753 | 0.1004 | 0.1257 | 0.1510 | 0.1764 | 0.2019 | 0.2275 |
0.6 | 0.2533 | 0.2793 | 0.3055 | 0.3319 | 0.3585 | 0.3853 | 0.4125 | 0.4399 | 0.4677 | 0.4959 |
0.7 | 0.5244 | 0.5534 | 0.5828 | 0.6128 | 0.6433 | 0.6745 | 0.7063 | 0.7388 | 0.7722 | 0.8064 |
0.8 | 0.8416 | 0.8779 | 0.9154 | 0.9542 | 0.9945 | 1.0364 | 1.0803 | 1.1264 | 1.1750 | 1.2265 |
0.9 | 1.2816 | 1.3408 | 1.4051 | 1.4758 | 1.5548 | 1.6449 | 1.7507 | 1.8808 | 2.0537 | 2.3263 |
p | 0.0005 | 0.0010 | 0.0050 | 0.0100 | 0.0250 | 0.0500 | 0.1000
---|---|---|---|---|---|---|---
z | -3.2905 | -3.0902 | -2.5758 | -2.3264 | -1.9600 | -1.6448 | -1.2815

p | 0.9995 | 0.9990 | 0.9950 | 0.9900 | 0.9750 | 0.9500 | 0.9000
---|---|---|---|---|---|---|---
z | 3.2905 | 3.0902 | 2.5758 | 2.3264 | 1.9600 | 1.6448 | 1.2815
This table contains z scores associated with probabilities for the intervals bounded by the z score and minus infinity (-∞). That is, the probabilities found in the margins of Table 9.1(b) represent the areas below the z scores found in the body of the table. We can use Table 9.1(b) to find z[p]. Here, z[p] can be interpreted as the z score below which a given proportion of z scores fall, or the z score percentile for a given percentile rank. For example, the z score with a percentile rank of 95% is found as 1.64, that is, z[.95] = 1.64. You find this z score in Table 9.1(b) by first locating the probability of p = .90 in the table’s left-most column, and then moving to the right along the row associated with .90 until you are below the column labeled .05. Then at the coordinates of .90 and .05 (the sum of which yields .95), you will find the z score 1.64.
Note that the lower half of Table 9.1(b) contains more accurate z scores for special probability values. For example, in this special table we find that the z score whose percentile rank is 95.0% is 1.645, or z[.950] = 1.645. Thus, this part of the table contains probabilities and z scores with accuracy to the thousandths place. These special values will be helpful when we discuss statistical inference in later chapters.
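The inverse lookup of Table 9.1(b), finding z[p] for a given proportion p, corresponds to the inverse cumulative distribution function. A sketch using the standard-library `statistics.NormalDist.inv_cdf` (R's `qnorm()` performs the same computation):

```python
from statistics import NormalDist

z = NormalDist()

print(round(z.inv_cdf(0.95), 4))    # 1.6449, i.e., z[.95]
print(round(z.inv_cdf(0.975), 4))   # 1.96
print(round(z.inv_cdf(0.005), 4))   # -2.5758
print(round(z.inv_cdf(0.0005), 4))  # -3.2905
```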
Cumulative standard normal probabilities (area below z):

z | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09
---|---|---|---|---|---|---|---|---|---|---
0 | 0.5000 | 0.5040 | 0.5080 | 0.5120 | 0.5160 | 0.5199 | 0.5239 | 0.5279 | 0.5319 | 0.5359 |
0.1 | 0.5398 | 0.5438 | 0.5478 | 0.5517 | 0.5557 | 0.5596 | 0.5636 | 0.5675 | 0.5714 | 0.5753 |
0.2 | 0.5793 | 0.5832 | 0.5871 | 0.5910 | 0.5948 | 0.5987 | 0.6026 | 0.6064 | 0.6103 | 0.6141 |
0.3 | 0.6179 | 0.6217 | 0.6255 | 0.6293 | 0.6331 | 0.6368 | 0.6406 | 0.6443 | 0.6480 | 0.6517 |
0.4 | 0.6554 | 0.6591 | 0.6628 | 0.6664 | 0.6700 | 0.6736 | 0.6772 | 0.6808 | 0.6844 | 0.6879 |
0.5 | 0.6915 | 0.6950 | 0.6985 | 0.7019 | 0.7054 | 0.7088 | 0.7123 | 0.7157 | 0.7190 | 0.7224 |
0.6 | 0.7257 | 0.7291 | 0.7324 | 0.7357 | 0.7389 | 0.7422 | 0.7454 | 0.7486 | 0.7517 | 0.7549 |
0.7 | 0.7580 | 0.7611 | 0.7642 | 0.7673 | 0.7704 | 0.7734 | 0.7764 | 0.7794 | 0.7823 | 0.7852 |
0.8 | 0.7881 | 0.7910 | 0.7939 | 0.7967 | 0.7995 | 0.8023 | 0.8051 | 0.8078 | 0.8106 | 0.8133 |
0.9 | 0.8159 | 0.8186 | 0.8212 | 0.8238 | 0.8264 | 0.8289 | 0.8315 | 0.8340 | 0.8365 | 0.8389 |
1 | 0.8413 | 0.8438 | 0.8461 | 0.8485 | 0.8508 | 0.8531 | 0.8554 | 0.8577 | 0.8599 | 0.8621 |
1.1 | 0.8643 | 0.8665 | 0.8686 | 0.8708 | 0.8729 | 0.8749 | 0.8770 | 0.8790 | 0.8810 | 0.8830 |
1.2 | 0.8849 | 0.8869 | 0.8888 | 0.8907 | 0.8925 | 0.8944 | 0.8962 | 0.8980 | 0.8997 | 0.9015 |
1.3 | 0.9032 | 0.9049 | 0.9066 | 0.9082 | 0.9099 | 0.9115 | 0.9131 | 0.9147 | 0.9162 | 0.9177 |
1.4 | 0.9192 | 0.9207 | 0.9222 | 0.9236 | 0.9251 | 0.9265 | 0.9279 | 0.9292 | 0.9306 | 0.9319 |
1.5 | 0.9332 | 0.9345 | 0.9357 | 0.9370 | 0.9382 | 0.9394 | 0.9406 | 0.9418 | 0.9429 | 0.9441 |
1.6 | 0.9452 | 0.9463 | 0.9474 | 0.9484 | 0.9495 | 0.9505 | 0.9515 | 0.9525 | 0.9535 | 0.9545 |
1.7 | 0.9554 | 0.9564 | 0.9573 | 0.9582 | 0.9591 | 0.9599 | 0.9608 | 0.9616 | 0.9625 | 0.9633 |
1.8 | 0.9641 | 0.9649 | 0.9656 | 0.9664 | 0.9671 | 0.9678 | 0.9686 | 0.9693 | 0.9699 | 0.9706 |
1.9 | 0.9713 | 0.9719 | 0.9726 | 0.9732 | 0.9738 | 0.9744 | 0.9750 | 0.9756 | 0.9761 | 0.9767 |
2 | 0.9772 | 0.9778 | 0.9783 | 0.9788 | 0.9793 | 0.9798 | 0.9803 | 0.9808 | 0.9812 | 0.9817 |
2.1 | 0.9821 | 0.9826 | 0.9830 | 0.9834 | 0.9838 | 0.9842 | 0.9846 | 0.9850 | 0.9854 | 0.9857 |
2.2 | 0.9861 | 0.9864 | 0.9868 | 0.9871 | 0.9875 | 0.9878 | 0.9881 | 0.9884 | 0.9887 | 0.9890 |
2.3 | 0.9893 | 0.9896 | 0.9898 | 0.9901 | 0.9904 | 0.9906 | 0.9909 | 0.9911 | 0.9913 | 0.9916 |
2.4 | 0.9918 | 0.9920 | 0.9922 | 0.9925 | 0.9927 | 0.9929 | 0.9931 | 0.9932 | 0.9934 | 0.9936 |
2.5 | 0.9938 | 0.9940 | 0.9941 | 0.9943 | 0.9945 | 0.9946 | 0.9948 | 0.9949 | 0.9951 | 0.9952 |
2.6 | 0.9953 | 0.9955 | 0.9956 | 0.9957 | 0.9959 | 0.9960 | 0.9961 | 0.9962 | 0.9963 | 0.9964 |
2.7 | 0.9965 | 0.9966 | 0.9967 | 0.9968 | 0.9969 | 0.9970 | 0.9971 | 0.9972 | 0.9973 | 0.9974 |
2.8 | 0.9974 | 0.9975 | 0.9976 | 0.9977 | 0.9977 | 0.9978 | 0.9979 | 0.9979 | 0.9980 | 0.9981 |
2.9 | 0.9981 | 0.9982 | 0.9982 | 0.9983 | 0.9984 | 0.9984 | 0.9985 | 0.9985 | 0.9986 | 0.9986 |
3 | 0.9987 | 0.9987 | 0.9987 | 0.9988 | 0.9988 | 0.9989 | 0.9989 | 0.9989 | 0.9990 | 0.9990 |
3.1 | 0.9990 | 0.9991 | 0.9991 | 0.9991 | 0.9992 | 0.9992 | 0.9992 | 0.9992 | 0.9993 | 0.9993 |
3.2 | 0.9993 | 0.9993 | 0.9994 | 0.9994 | 0.9994 | 0.9994 | 0.9994 | 0.9995 | 0.9995 | 0.9995 |
3.3 | 0.9995 | 0.9995 | 0.9995 | 0.9996 | 0.9996 | 0.9996 | 0.9996 | 0.9996 | 0.9996 | 0.9997 |
z | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0.5000 | 0.4960 | 0.4920 | 0.4880 | 0.4840 | 0.4801 | 0.4761 | 0.4721 | 0.4681 | 0.4641 |
0.1 | 0.4602 | 0.4562 | 0.4522 | 0.4483 | 0.4443 | 0.4404 | 0.4364 | 0.4325 | 0.4286 | 0.4247 |
0.2 | 0.4207 | 0.4168 | 0.4129 | 0.4090 | 0.4052 | 0.4013 | 0.3974 | 0.3936 | 0.3897 | 0.3859 |
0.3 | 0.3821 | 0.3783 | 0.3745 | 0.3707 | 0.3669 | 0.3632 | 0.3594 | 0.3557 | 0.3520 | 0.3483 |
0.4 | 0.3446 | 0.3409 | 0.3372 | 0.3336 | 0.3300 | 0.3264 | 0.3228 | 0.3192 | 0.3156 | 0.3121 |
0.5 | 0.3085 | 0.3050 | 0.3015 | 0.2981 | 0.2946 | 0.2912 | 0.2877 | 0.2843 | 0.2810 | 0.2776 |
0.6 | 0.2743 | 0.2709 | 0.2676 | 0.2643 | 0.2611 | 0.2578 | 0.2546 | 0.2514 | 0.2483 | 0.2451 |
0.7 | 0.2420 | 0.2389 | 0.2358 | 0.2327 | 0.2296 | 0.2266 | 0.2236 | 0.2206 | 0.2177 | 0.2148 |
0.8 | 0.2119 | 0.2090 | 0.2061 | 0.2033 | 0.2005 | 0.1977 | 0.1949 | 0.1922 | 0.1894 | 0.1867 |
0.9 | 0.1841 | 0.1814 | 0.1788 | 0.1762 | 0.1736 | 0.1711 | 0.1685 | 0.1660 | 0.1635 | 0.1611 |
1 | 0.1587 | 0.1562 | 0.1539 | 0.1515 | 0.1492 | 0.1469 | 0.1446 | 0.1423 | 0.1401 | 0.1379 |
1.1 | 0.1357 | 0.1335 | 0.1314 | 0.1292 | 0.1271 | 0.1251 | 0.1230 | 0.1210 | 0.1190 | 0.1170 |
1.2 | 0.1151 | 0.1131 | 0.1112 | 0.1093 | 0.1075 | 0.1056 | 0.1038 | 0.1020 | 0.1003 | 0.0985 |
1.3 | 0.0968 | 0.0951 | 0.0934 | 0.0918 | 0.0901 | 0.0885 | 0.0869 | 0.0853 | 0.0838 | 0.0823 |
1.4 | 0.0808 | 0.0793 | 0.0778 | 0.0764 | 0.0749 | 0.0735 | 0.0721 | 0.0708 | 0.0694 | 0.0681 |
1.5 | 0.0668 | 0.0655 | 0.0643 | 0.0630 | 0.0618 | 0.0606 | 0.0594 | 0.0582 | 0.0571 | 0.0559 |
1.6 | 0.0548 | 0.0537 | 0.0526 | 0.0516 | 0.0505 | 0.0495 | 0.0485 | 0.0475 | 0.0465 | 0.0455 |
1.7 | 0.0446 | 0.0436 | 0.0427 | 0.0418 | 0.0409 | 0.0401 | 0.0392 | 0.0384 | 0.0375 | 0.0367 |
1.8 | 0.0359 | 0.0351 | 0.0344 | 0.0336 | 0.0329 | 0.0322 | 0.0314 | 0.0307 | 0.0301 | 0.0294 |
1.9 | 0.0287 | 0.0281 | 0.0274 | 0.0268 | 0.0262 | 0.0256 | 0.0250 | 0.0244 | 0.0239 | 0.0233 |
2 | 0.0228 | 0.0222 | 0.0217 | 0.0212 | 0.0207 | 0.0202 | 0.0197 | 0.0192 | 0.0188 | 0.0183 |
2.1 | 0.0179 | 0.0174 | 0.0170 | 0.0166 | 0.0162 | 0.0158 | 0.0154 | 0.0150 | 0.0146 | 0.0143 |
2.2 | 0.0139 | 0.0136 | 0.0132 | 0.0129 | 0.0125 | 0.0122 | 0.0119 | 0.0116 | 0.0113 | 0.0110 |
2.3 | 0.0107 | 0.0104 | 0.0102 | 0.0099 | 0.0096 | 0.0094 | 0.0091 | 0.0089 | 0.0087 | 0.0084 |
2.4 | 0.0082 | 0.0080 | 0.0078 | 0.0075 | 0.0073 | 0.0071 | 0.0069 | 0.0068 | 0.0066 | 0.0064 |
2.5 | 0.0062 | 0.0060 | 0.0059 | 0.0057 | 0.0055 | 0.0054 | 0.0052 | 0.0051 | 0.0049 | 0.0048 |
2.6 | 0.0047 | 0.0045 | 0.0044 | 0.0043 | 0.0041 | 0.0040 | 0.0039 | 0.0038 | 0.0037 | 0.0036 |
2.7 | 0.0035 | 0.0034 | 0.0033 | 0.0032 | 0.0031 | 0.0030 | 0.0029 | 0.0028 | 0.0027 | 0.0026 |
2.8 | 0.0026 | 0.0025 | 0.0024 | 0.0023 | 0.0023 | 0.0022 | 0.0021 | 0.0021 | 0.0020 | 0.0019 |
2.9 | 0.0019 | 0.0018 | 0.0018 | 0.0017 | 0.0016 | 0.0016 | 0.0015 | 0.0015 | 0.0014 | 0.0014 |
3 | 0.0013 | 0.0013 | 0.0013 | 0.0012 | 0.0012 | 0.0011 | 0.0011 | 0.0011 | 0.0010 | 0.0010 |
3.1 | 0.0010 | 0.0009 | 0.0009 | 0.0009 | 0.0008 | 0.0008 | 0.0008 | 0.0008 | 0.0007 | 0.0007 |
3.2 | 0.0007 | 0.0007 | 0.0006 | 0.0006 | 0.0006 | 0.0006 | 0.0006 | 0.0005 | 0.0005 | 0.0005 |
3.3 | 0.0005 | 0.0005 | 0.0005 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0003 |
A plot of “special” z scores against raw scores, known as a normality plot or Q-Q (quantile-quantile) plot, provides a visual check of whether a set of scores follows a normal distribution. Shortly, you will see why these z scores are called special.
The normality plot for the data labeled NORMAL-Z in Figure 5k is shown in Figure 9j. These data illustrate a normality plot based on a sample of scores drawn at random from a normal distribution. The normality plot in Figure 9j would be a straight line if the NORMAL-Z data followed a normal distribution exactly. Because these data are a random sample, however, Figure 9j does not contain a straight line but only a rough approximation of one.
Contrast Figure 9j with Figure 9k, which contains a normality plot based on the leptokurtic data, labeled LEPTO-X, in Figure 5k. The approximation to a straight line is much poorer in Figure 9k than in Figure 9j. To facilitate the comparison, we added to Figure 9k a straight line representing how the points would fall if they followed a normal distribution. The normality plot for the leptokurtic data bows above and below this line. You can construct a normality plot by following these steps:
1. Order the scores from smallest to largest.
2. Assign each ordered score its rank, i = 1, 2, …, n.
3. Use the rank of each ordered score to estimate the probability of sampling a score below it.
4. Compute each probability level as p = (i − .5)/n.
5. Find the standard normal quantile, Q(i), associated with each probability level.
6. Plot the ordered scores against their quantiles.
We followed these steps with the NORMAL-Z data from Figure 5k to produce the results shown in Table 9.2. The normality plot of the ordered observations against the quantiles is the one we discussed in Figure 9j.
Consider the set of z scores -3, -2, -1, 0, 1, 2, and 3, selected from a standard normal distribution. The probability of randomly sampling a z score below each of these z scores is .0013, .0228, .1587, .5000, .8413, .9772, and .9987, respectively. The quantiles associated with these probabilities are the original z scores of -3, -2, -1, 0, 1, 2, and 3. In terms of the preceding steps, these data would look as follows:
You can see that if you were to plot the ordered observations against their quantiles, the points would fall on a straight line. This is a trivial example, but it illustrates what is happening when you construct a normality plot. If your ordered scores are normally distributed, the probabilities in column 2 will lead to quantiles that are directly related to the ordered scores, and the resulting plot will yield points that fall on a straight line.
Ordered_Observations | Probability_Levels | Quantiles |
---|---|---|
-3 | 0.0013 | -3 |
-2 | 0.0228 | -2 |
-1 | 0.1587 | -1 |
0 | 0.5000 | 0 |
1 | 0.8413 | 1 |
2 | 0.9772 | 2 |
3 | 0.9987 | 3 |
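The values in this example can be checked with R's normal distribution functions: pnorm() returns the probability of sampling a score below a given z score, and qnorm() inverts it, returning the quantile associated with a probability.

```r
# Probabilities below the z scores -3 through 3 in a standard normal distribution
z <- -3:3
p <- pnorm(z)
round(p, 4)        # 0.0013 0.0228 0.1587 0.5000 0.8413 0.9772 0.9987

# qnorm() recovers the original z scores (the quantiles) from the probabilities
round(qnorm(p))    # -3 -2 -1  0  1  2  3
```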
What differs in this example is the manner in which the probabilities are estimated. Here they were found directly from Table 9.1(a); in a normality plot (step 3), they are estimated from the rank order of the observations.
To use the rank order of the scores to estimate the proportion of scores below a given score (remember that here proportion and probability are the same), it is recommended that you have 20 or more scores (Johnson & Wichern, 1982). The simplest way of finding the proportion of scores below a given score is to divide the rank of the score by the number of scores, p = i/n. However, some statisticians prefer to think of a rank order number as standing for all numbers between i – .5 and i + .5. For example, the rank of 5 stands for the ranks between 4.5 and 5.5. In this case, the proportion of scores below a given rank would be found as the proportion below i – .5 or as (i – .5)/n. The equation p = (i – .5)/n is the basis for the probability in step 4 of the preceding steps. The subtraction of .5 from i is called the correction for continuity because it assumes the ranks derive from a continuous distribution.
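The rank-based computations just described can be sketched in R. The data vector here is hypothetical (and, for brevity, smaller than the recommended 20 scores); qnorm() converts each probability level into a standard normal quantile.

```r
# Manual Q-Q computations for a small set of scores (hypothetical data)
x <- c(12, 15, 9, 22, 18, 14, 20, 11, 16, 19)
n <- length(x)

ordered_obs <- sort(x)    # order the observations from smallest to largest
i <- 1:n                  # ranks of the ordered observations
p <- (i - 0.5) / n        # probability levels, with the correction for continuity
q <- qnorm(p)             # standard normal quantile for each probability level

# Plot ordered observations against quantiles; points near a straight
# line suggest the scores follow a normal distribution
plot(q, ordered_obs,
     xlab = "Standard Normal Quantiles", ylab = "Ordered Observations")
```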
where \[
\begin{align}
\text {Ordered Observations} &= X \\
\text {Probability Levels} &= (I-0.5)/N \\
\text {Standard Normal Quantiles of I} &= Q(I) \\
\text {Cross Products} &= X*Q(I) \\
\end{align}
\]

### Figure 9j A normality plot of scores that were sampled at random from the normal data in Table 9.2
Note that some statistics programs will invert the axes, placing the theoretical quantiles on the x-axis and the observed values on the y-axis. Some programs will plot raw (original) observed scores and some will plot standardized observed scores.
Some programs will also provide a “detrended” Q-Q Plot that shows the
deviations from normal (the 0 line represents normal).
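Base R can also produce such a plot; its qqnorm() function follows the inverted-axes convention just described, with the theoretical quantiles on the x-axis.

```r
# Base R normality plot: theoretical quantiles on the x-axis
set.seed(1)        # for a reproducible random sample
x <- rnorm(50)     # 50 scores sampled from a standard normal distribution
qqnorm(x)          # the Q-Q (normality) plot
qqline(x)          # reference line through the first and third quartiles
```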
jmv::descriptives(
data = data,
vars = Ordered_Observations,
qq = TRUE,
n = FALSE,
missing = FALSE,
mean = FALSE,
median = FALSE,
sd = FALSE,
min = FALSE,
max = FALSE)
DESCRIPTIVES
jmv::descriptives(
data = data,
vars = LEPTO_X,
qq = TRUE,
n = FALSE,
missing = FALSE,
mean = FALSE,
median = FALSE,
sd = FALSE,
min = FALSE,
max = FALSE)
DESCRIPTIVES
Figure 9m shows a normality plot of the positively skewed scores, labeled XRATING, from Figure 5f. A normality plot will frequently bow above or below the straight line representing a normal distribution at locations where the ordered observations have more scores (higher frequencies) than would be found in a normal distribution. For example, when a straight line is placed on the normality plot in Figure 9m, the scores bow below the line because the lower scores have high frequencies.
jmv::descriptives(
data = data,
vars = X_RATING,
qq = TRUE,
n = FALSE,
missing = FALSE,
mean = FALSE,
median = FALSE,
sd = FALSE,
min = FALSE,
max = FALSE)
DESCRIPTIVES
Figure 9n shows a normality plot of the negatively skewed scores, labeled YRATING, from Figure 5f. When a straight line is placed on this normality plot, the scores bow above the line because the higher scores have higher frequencies.
jmv::descriptives(
data = data,
vars = Y_RATING,
qq = TRUE,
n = FALSE,
missing = FALSE,
mean = FALSE,
median = FALSE,
sd = FALSE,
min = FALSE,
max = FALSE)
DESCRIPTIVES
Figure 9o is a normality plot of the platykurtic set of scores, labeled PLATY-Y, from Figure 5k. In this normality plot, slight bows are found in the tails because the extreme scores of a platykurtic distribution are less extreme than those expected from a normal distribution.
jmv::descriptives(
data = data,
vars = PLATY_Y,
qq = TRUE,
n = FALSE,
missing = FALSE,
mean = FALSE,
median = FALSE,
sd = FALSE,
min = FALSE,
max = FALSE)
DESCRIPTIVES
The large number of scores in the middle of the distribution caused the bows in the normality plot of the leptokurtic distribution in Figure 9l. The high frequencies of scores at 24 and 25 caused the normality plot to run horizontally at these points on the y-axis, resulting in bows in the normality plot.
SUMMARY
This chapter explained how standard scores can help us determine the location of a score above or below the mean in standard deviation units. It also discussed the use of standard scores to compare scores based on different scales (that is, scores with different means and variances) and to identify outliers in a data set.
In descriptive statistical analyses, standard scores called z scores (which have a mean of 0.00 and a standard deviation of 1.00) are frequently used, but when standard score information is reported to the public, it is generally in terms of standard scores that do not have decimal points and negative signs. For example, the T score, which has a mean of 50 and a standard deviation of 10, is commonly used to report performance on standardized tests.
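Because the transformation from z to T is linear, it is easy to carry out in R. The scores below are hypothetical, and scale() computes z scores using the sample mean and standard deviation.

```r
# Convert raw scores to z scores, then to T scores (mean 50, SD 10)
x <- c(85, 90, 70, 95, 80)     # hypothetical raw scores
z <- as.numeric(scale(x))      # z = (x - mean(x)) / sd(x)
T_scores <- 50 + 10 * z        # T scores have mean 50 and standard deviation 10
round(T_scores, 1)
```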
This chapter also discussed the mathematical curve called the normal, or Gaussian, distribution, as well as discrete and continuous probability distributions, particularly the normal probability distribution. Remember that a normal probability distribution is a theoretical distribution that allows you to find the probability that a score selected at random falls in a given interval of scores. A normal frequency distribution, on the other hand, is not a theoretical distribution but simply a frequency distribution of observed scores; the former is theoretical, and the latter is observed, or empirical.
This chapter linked standard scores to another measure of location, the percentile. We are able to do this because if we know the probability of observing a score below a given score and if we multiply this probability by 100 and append %, we then have the percentage of scores that fall below the given score. We refer to this percentage as the percentile rank of the score, and to the score as the percentile. The percentile rank of a score is easy to find when the score comes from a normal distribution because all normal distributions can be transformed into the standard normal distribution. Tabled probabilities (or proportions) are available for the standard normal distribution.
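In R, pnorm() supplies the tabled probability directly, so the percentile rank of a score from a normal distribution can be found without the table.

```r
# Percentile rank of a score with z = 1.28 in a normal distribution
z <- 1.28
pnorm(z) * 100    # percentage of scores below z; about 90, so the 90th percentile
```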
This chapter concluded by discussing normality, or Q-Q, plots. If a set of scores is normally distributed, its normality plot will be a positively sloped straight line. We considered the shapes of normality plots for several non-normal distributions. In future chapters, we will use normality plots as a routine part of our data analyses.
Standardized (Z) Scores and Normal Distribution Probabilities
Analyses to Run

* Use a SCALE variable Y
* Use a SCALE variable X
* Run descriptive statistics
* Create standardized scores for X and Y
* Report
  * Y
  * Standardized Y (\(z_Y\))
  * X
  * Standardized X (\(z_X\))
Respond to the following items
Pick a case and respond to the items below. Use z scores with the standard normal distribution to answer the following questions
Provide the case’s Y score, and show or explain how its \(z_Y\) is calculated
Provide the case’s X score, and show or explain how its \(z_X\) is calculated
Report and interpret the proportion or percentage of the normal distribution that falls BETWEEN that Y score and the mean of Y (i.e., between \(z_Y\) and 0)
Report and interpret the proportion or percentage of the normal distribution that falls in the area ABOVE and BELOW that Y score (i.e., above and below that \(z_Y\) score)
Report and interpret the proportion or percentage of the normal distribution that falls in the area ABOVE and BELOW that X score (i.e., above and below that \(z_X\) score)
Report and interpret the proportion or percentage of the normal distribution that falls in the area MORE EXTREME (i.e., farther from the mean in either direction) than that Y score
Report and interpret the proportion or percentage of the normal distribution that falls BETWEEN the Y and X scores (i.e., between \(z_Y\) and \(z_X\))
Multiply \(z_X\) by -1 and report and interpret the proportion or percentage of the normal distribution that falls BETWEEN \(z_X\) and -1*\(z_X\)
Report whether, relative to other cases, the case had a higher score on Y or X (use z scores as evidence)
Show or explain how to calculate what original Y score is associated with \(z_Y\) = 1.5, using Y = \(M_Y\) + (\(s_Y\) * \(z_Y\))
Report the \(z_Y\) score for the case in the data that is farthest BELOW the mean on variable Y
Report the \(z_X\) score for the case in the data that is farthest ABOVE the mean on variable X
Report which cases (if any) have either Y or X scores that are more than 2 standard deviations away from the mean (in either direction)
Report what \(z_Y\) score is the cut-off value for the TOP 10% of the normal distribution (hint, it is the same Z score regardless of what the original variable is)
Report what Y score is associated with the \(z_Y\) in the previous item
Report what \(z_X\) score is the cut-off value for the MOST EXTREME 10% of the normal distribution (i.e., 5% in the TOP tail and 5% in the BOTTOM tail)
Report what two X scores are associated with the \(z_X\) cut-off in the previous item (that is, both the –\(z_X\) and the +\(z_X\))
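The normal distribution functions needed for items like these are pnorm(), for areas, and qnorm(), for cut-off z scores. The mean and standard deviation below are hypothetical values for illustration.

```r
# z score that cuts off the TOP 10% of the normal distribution
qnorm(0.90)             # about 1.28

# z scores that cut off the MOST EXTREME 10% (5% in each tail)
qnorm(c(0.05, 0.95))    # about -1.64 and 1.64

# Converting a z score back to the original scale: Y = M_Y + (s_Y * z_Y)
M_Y <- 100              # hypothetical mean of Y
s_Y <- 15               # hypothetical standard deviation of Y
M_Y + s_Y * 1.5         # Y score associated with z_Y = 1.5, here 122.5
```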
Please cite as:
Barcikowski, R. S., & Brooks, G. P. (2025). The Stat-Pro book:
A guide for data analysts (revised edition) [Unpublished manuscript].
Department of Educational Studies, Ohio University.
https://people.ohio.edu/brooksg/Rmarkdown/
This is a revision of an unpublished textbook by Barcikowski (1987).
This revision updates some text and uses R and JAMOVI as the primary
tools for examples. The textbook has been used as the primary textbook
in Ohio University EDRE 7200: Educational Statistics courses for
most semesters 1987-1991 and again 2018-2025.