INTRODUCTION

In this chapter, we will try to discover what John Tukey, a well-known statistician, meant when he said: “A summary picture is worth a thousand summary words.” Our objective in this chapter is to learn graphical techniques that will help us further understand our data. We will concentrate on creating and interpreting several types of graphs. It’s a good idea to use one or more of these graphics as a routine part of any data analysis you perform. That is, while you may not include every graph you produce in your written report of results, they are all valuable in helping you, as an analyst, understand your data and your results better.

GRAPHS THAT COME FROM STATISTICAL PROCEDURES

A number of jamovi functions provide relevant graphs as part of their output (sometimes by default, but sometimes only when you request the appropriate option). Among the procedures we will discuss most often, the following graphs are available:

Types of Graphs Provided

  • Bar Chart
  • Pie Chart
  • Boxplot
  • Histogram
  • Normal Q-Q Plot
  • Residual Scatterplot
  • Residual Histogram
  • Residual Normal Probability (P-P) Plot
  • Partial Plots

In this chapter we will look at graphs that are available in basic jamovi (and also available in R). Additional modules can be added to jamovi that provide many more graphical options, such as vijPlots, JJStatsPlot, FlexPlot, and surveymv. We have included some plots from esci in this chapter, but others are available. One very nice plot, called a raincloud plot, is available in surveymv within jamovi, but is not available when using that package in R.

With charts and graphs, just like in your statistical analyses, you need to pay attention to the level of measurement of your variables. Some graphs make no sense for some kinds of variables. Some graphs in this section are most appropriate when you have nominal variables, some for ordinal variables with relatively few categories (e.g., not an ordinal variable that provides the ranking of 450 students in a high school graduating class), and some for scale variables.
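As a rough illustration of matching the graph to the level of measurement, the following base R sketch suggests a graph type from how a variable is stored. It is not a jamovi function: the helper name suggest_graph, the cutoff of 10 distinct values, and the toy vectors are all assumptions made up for this example.

```r
# Hypothetical helper (not part of jamovi or jmv): suggest a graph type
# based on how a variable is stored and how many distinct values it has.
suggest_graph <- function(x, max_categories = 10) {
  if (is.factor(x) || is.character(x)) {
    "bar chart"                       # nominal or ordinal categories
  } else if (is.numeric(x) && length(unique(x)) <= max_categories) {
    "bar chart or boxplot"            # scale variable with few distinct values
  } else if (is.numeric(x)) {
    "histogram or boxplot"            # scale variable with many distinct values
  } else {
    "no default suggestion"
  }
}

# Toy data (invented values, not the chapter's data set)
pill_group  <- factor(c("Yes", "No", "Yes", "No"))
height      <- c(64, 66, 66, 68, 70, 70)
cholesterol <- c(146, 162, 175, 189, 193, 204, 219, 223, 238, 255, 271, 330)

suggest_graph(pill_group)
suggest_graph(height)
suggest_graph(cholesterol)
```

The cutoff is arbitrary; the point is only that the choice of graph should follow from the type of variable, not the other way around.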

With several of the graphing procedures, you must pay attention to whether you are graphing variables or groups. That is, there is an option for many of the graph procedures to choose between “groups of cases” and “separate variables.” Even if you have just one variable, you will typically need to choose “separate variables” in order for the graph to work.

Many of the procedures also have options for “simple” or “clustered” data. The clustering provides additional complexity to the graph, but may be useful in some circumstances. Similarly, several of the procedures allow “paneling” that shows multiple copies of the same graph across multiple groups. Such graphs can usually be paneled by rows or columns. The nice thing about paneling is that the axes match automatically, making it much easier to make comparisons across groups.
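To see why matching axes matter when paneling, here is a minimal base R sketch that panels two histograms by rows and forces both panels to share the same bins and the same x-axis. jamovi does this automatically; in base R we have to do it by hand. The data values and variable names are invented for illustration.

```r
# Toy data (invented values, not the chapter's data set)
chol  <- c(150, 175, 190, 210, 230, 260, 180, 200, 220, 245, 270, 320)
group <- rep(c("No-Pill", "Yes-Pill"), each = 6)

shared_xlim <- range(chol)                # one x-axis range for every panel
shared_bins <- seq(140, 340, by = 20)     # one set of bins for every panel

old_par <- par(mfrow = c(2, 1))           # panel by rows: 2 rows, 1 column
for (g in unique(group)) {
  hist(chol[group == g],
       breaks = shared_bins,
       xlim   = range(shared_bins),
       main   = g,
       xlab   = "Cholesterol")
}
par(old_par)                              # restore the previous layout
```

Using c(1, 2) instead of c(2, 1) in mfrow would panel by columns rather than rows.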

We will use the Chapter 4 Cholesterol data for many of the examples below. Recall that CHOL was the cholesterol score, HT was height measured in inches, and GROUP indicated whether participants were prescribed an herbal supplement pill to help lower their cholesterol. In the syntax below, these variables appear as Cholesterol, Height, and Pill_Group. We will also use the X RATING and Y RATING variables from the student teacher rating data for two student teachers (student teacher X and student teacher Y) in Chapter 5. The headings for each of the graphs displayed below indicate what types of variables were used in the example. Each example also provides the commands used.

Table 6.1

Bar Chart for a categorical variable (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    vars = Pill_Group,
    freq = TRUE,
    bar = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

 FREQUENCIES

 Frequencies of Pill_Group                              
 ────────────────────────────────────────────────────── 
   Pill_Group    Counts    % of Total    Cumulative %   
 ────────────────────────────────────────────────────── 
   1 Yes-Pill        20        50.000          50.000   
   0 No-Pill         20        50.000         100.000   
 ────────────────────────────────────────────────────── 
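The columns of the frequency table above can be reproduced by hand. The following base R sketch rebuilds the counts, percentages, and cumulative percentages for two groups of 20, as in the table; the object names are our own, and this is not how jmv computes the table internally.

```r
# Rebuild the grouping variable from the counts shown in the table (20 per group)
pill_group <- factor(rep(c("1 Yes-Pill", "0 No-Pill"), each = 20),
                     levels = c("1 Yes-Pill", "0 No-Pill"))

counts  <- table(pill_group)              # frequency of each category
percent <- 100 * prop.table(counts)       # percent of total

freq_table <- data.frame(
  Pill_Group     = names(counts),
  Counts         = as.vector(counts),
  Pct_of_Total   = as.vector(percent),
  Cumulative_Pct = cumsum(as.vector(percent))
)
print(freq_table)

barplot(counts, ylab = "Counts")          # the matching bar chart
```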

Bar Chart for a scale variable with relatively few values in the data (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    vars = Height,
    bar = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

Clustered Bar Chart for a scale variable by a nominal variable (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    formula = Height ~ Pill_Group,
    bar = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

jmv::descriptives(
    data = data,
    formula = Pill_Group ~ Height,
    bar = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

Note that when two interval or ratio variables have very different scales of measurement, it may not be appropriate to graph them together. Sometimes doing so results in confidence intervals like those for the HT variable. This will be true for many graphs of variables measured on very different scales.

Bar Chart for a single scale variable to show mean and confidence interval across groups (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    formula = Cholesterol ~ Pill_Group,
    desc = "rows",
    bar = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

Boxplot (sometimes called Box-and-Whisker plot) for one scale variable (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    vars = Cholesterol,
    box = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES
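A boxplot is built from the quartiles. As a sketch, the code below takes the quartiles, minimum, and maximum for Cholesterol reported in the esci Describe output later in this chapter and computes the interquartile range and the outlier "fences". It assumes the common 1.5 × IQR convention; the exact rule and quartile method jamovi uses are not stated in this chapter, so treat the result as an approximation.

```r
# Summary values for Cholesterol, taken from the esci Describe output
q1    <- 193.75
q3    <- 255.00
x_min <- 146
x_max <- 330

iqr         <- q3 - q1                 # height of the box
lower_fence <- q1 - 1.5 * iqr          # assumed 1.5 x IQR rule
upper_fence <- q3 + 1.5 * iqr

print(c(IQR = iqr, lower_fence = lower_fence, upper_fence = upper_fence))

# TRUE if the smallest or largest value would be flagged under this rule
any_flagged <- (x_min < lower_fence) || (x_max > upper_fence)
print(any_flagged)
```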

Boxplot for one scale variable with Violin Plot and Stacked Data with case numbers of outliers labeled (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    vars = Cholesterol,
    box = TRUE,
    dot = TRUE,
    dotType = "stack",
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

Boxplot for one scale variable with Violin Plot and Jittered Data with both median and mean (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    vars = Cholesterol,
    box = TRUE,
    violin = TRUE,
    dot = TRUE,
    boxMean = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

Boxplot for a single scale variable across groups (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    formula = Cholesterol ~ Pill_Group,
    box = TRUE,
    violin = TRUE,
    dot = TRUE,
    dotType = "stack",
    boxMean = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

Boxplot for two scale variables in the same plot (jamovi Exploration | Descriptives with some data manipulation)

jmv::descriptives(
    data = data,
    formula = Scores ~ Rating,
    box = TRUE,
    dot = TRUE,
    dotType = "stack",
    boxMean = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES
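The "data manipulation" mentioned in the heading above is most likely a reshape from wide format (one column per student teacher) to long format (one column of scores plus one column identifying the teacher), so that the formula Scores ~ Rating can be used. The following base R sketch shows one way to do that; the column names X_Rating and Y_Rating and the data values are invented for illustration.

```r
# Hypothetical wide-format data: one column of ratings per student teacher
wide <- data.frame(
  X_Rating = c(4, 5, 3, 4, 5),
  Y_Rating = c(3, 4, 2, 4, 3)
)

# stack() puts all values in one column and the source column name in another
long <- stack(wide)
names(long) <- c("Scores", "Rating")      # match the formula Scores ~ Rating

print(long)
boxplot(Scores ~ Rating, data = long)     # both variables in one plot
```

Note that stacking treats the two sets of ratings as if they came from separate groups, which is fine for a descriptive plot.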

Error Bar Plot for one scale variable in one group (jamovi T-Tests | One Sample T Test)

jmv::ttestOneS(
    data = data,
    vars = Cholesterol,
    students = FALSE,
    testValue = 200,
    plots = TRUE)

 ONE SAMPLE T-TEST

 One Sample T-Test 
 ───────────────── 
  
 ───────────────── 
  
 ───────────────── 
   Note. Hₐ μ ≠ 200

Note that an Error Bar Plot for one variable in one group is not very informative on its own.

Error Bar Plot for one scale variable in one group (jamovi esci | Single Group)

esci::jamovimagnitude(
    data = data,
    outcome_variable = Cholesterol,
    alpha = "0.05")

 MEANS AND MEDIANS: SINGLE GROUP

 Overview                                                                                              
 ───────────────────────────────────────────────────────────────────────────────────────────────────── 
   Outcome variable    M           LL        UL        Mdn           s           N           Missing   
 ───────────────────────────────────────────────────────────────────────────────────────────────────── 
   Cholesterol           223.65    209.24    238.06        219.00      45.064          40          0   
 ───────────────────────────────────────────────────────────────────────────────────────────────────── 
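The error bars in this plot run from LL to UL, the lower and upper limits of the confidence interval for the mean. As a check on what these limits represent, the sketch below recomputes them from the mean, standard deviation, and sample size in the table above, using the usual t-based formula for a 95% interval (matching alpha = 0.05 in the syntax). The object names are our own.

```r
# Summary statistics for Cholesterol, taken from the Overview table
m <- 223.65      # mean
s <- 45.064      # standard deviation
n <- 40          # sample size
alpha <- 0.05    # 95% confidence interval

se     <- s / sqrt(n)                         # standard error of the mean
t_crit <- qt(1 - alpha / 2, df = n - 1)       # two-sided critical t value
margin <- t_crit * se                         # half-width of the interval

ll <- m - margin
ul <- m + margin
print(round(c(LL = ll, UL = ul), 2))
```

Rounded to two decimals, these limits agree with the LL and UL values of 209.24 and 238.06 in the table.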


Error Bar Plot for two scale variables in one group (jamovi T-Tests | Paired Samples T Test)

jmv::ttestPS(
    data = data,
    pairs = list( list(
            i1="Cholesterol",
            i2="Height")),
    students = FALSE,
    plots = TRUE)

 PAIRED SAMPLES T-TEST

 Paired Samples T-Test 
 ───────────────────── 
  
 ───────────────────── 
  
 ───────────────────── 
   Note. Hₐ μ_(Measure 1 - Measure 2) ≠ 0

Error Bar Plot for one scale variable across two groups (jamovi T-Tests | Independent Samples T Test)

jmv::ttestIS(
    data = data,
    formula = Cholesterol ~ Pill_Group,
    vars = Cholesterol,
    students = FALSE,
    plots = TRUE)

 INDEPENDENT SAMPLES T-TEST

 Independent Samples T-Test 
 ────────────────────────── 
  
 ────────────────────────── 
  
 ────────────────────────── 
   Note. Hₐ μ_(1 Yes-Pill) ≠ μ_(0 No-Pill)

Histogram for one scale variable (jamovi Exploration | Descriptives)

jmv::descriptives(
    data = data,
    vars = Cholesterol,
    hist = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

Note that there are many ways to create these graphs, and the results do not always look the same. Compare this histogram with the esci version that follows.
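One reason two histograms of the same variable can look different is the choice of bins. The base R sketch below bins the same toy values two ways and plots the results side by side. The data values are invented for illustration, and the bin widths of 20 and 50 are arbitrary choices.

```r
# Toy data (invented values, not the chapter's data set)
chol <- c(146, 160, 175, 190, 195, 205, 215, 219, 225, 240, 255, 270, 300, 330)

# Same data, two different sets of bins
narrow <- hist(chol, breaks = seq(140, 340, by = 20), plot = FALSE)
wide   <- hist(chol, breaks = seq(140, 340, by = 50), plot = FALSE)

print(narrow$counts)     # counts per bin with 20-unit bins
print(wide$counts)       # counts per bin with 50-unit bins

old_par <- par(mfrow = c(1, 2))
plot(narrow, main = "Bin width 20", xlab = "Cholesterol")
plot(wide,   main = "Bin width 50", xlab = "Cholesterol")
par(old_par)
```

Both versions account for every case; only the grouping of cases into bars changes, and with it the apparent shape of the distribution.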

Histogram and Dot Plot for one scale variable (jamovi esci | Describe)

esci::jamovidescribe(
    data = data,
    outcome_variable = Cholesterol,
    show_details = FALSE,
    mark_mean = FALSE,
    mark_median = FALSE,
    mark_sd = FALSE,
    mark_quartiles = FALSE,
    mark_z_lines = FALSE,
    mark_percentile = "0",
    histogram_bins = "12",
    es_plot_width = "500",
    es_plot_height = "400",
    ymin = "auto",
    ymax = "auto",
    breaks = "auto",
    xmin = "auto",
    xmax = "auto",
    xbreaks = "auto",
    ylab = "auto",
    xlab = "auto",
    axis.text.y = "14",
    axis.title.y = "15",
    axis.text.x = "14",
    axis.title.x = "15",
    fill_regular = "#008DF9",
    fill_highlighted = "#E20134",
    color = "black"
  )

 DESCRIBE

 Overview                                                                                                                    
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   Outcome variable    M           Mdn           s           Minimum    Maximum    25th      75th      N           Missing   
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
   Cholesterol           223.65        219.00      45.064     146.00     330.00    193.75    255.00          40          0   
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 


Histogram Paneled by Rows for one scale variable across one nominal variable

jmv::descriptives(
    data = data,
    formula = Cholesterol ~ Pill_Group,
    hist = TRUE,
    n = FALSE,
    missing = FALSE,
    mean = FALSE,
    median = FALSE,
    sd = FALSE,
    min = FALSE,
    max = FALSE)

 DESCRIPTIVES

Histogram for one variable (vijPlots jamovi module)

  vijPlots::histogram(
    data = data,
    aVar = Cholesterol,
    group = NULL,
    facet = NULL,
    colorPalette = "jmv")

Histogram for one variable with adjusted bin widths and normal curve superimposed (vijPlots jamovi module)

  vijPlots::histogram(
    data = data,
    aVar = Cholesterol,
    group = NULL,
    facet = NULL,
    normalCurve = TRUE,
    binWidth = 20,
    binBoundary = 0,
    colorPalette = "jmv")
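When a normal curve is superimposed on a histogram of counts, the density has to be rescaled so that the curve and the bars are on the same vertical scale. A common approach, sketched below in base R, multiplies the normal density by the sample size and the bin width. We have not verified that vijPlots scales its curve exactly this way, so treat this as an assumption. The mean, standard deviation, and sample size come from the esci output for Cholesterol earlier in this chapter, and the bin width matches binWidth = 20 in the syntax above.

```r
# Summary statistics for Cholesterol (from the esci Overview table)
m <- 223.65
s <- 45.064
n <- 40
bin_width <- 20                       # same as binWidth in the syntax above

# Normal density rescaled to the count scale of the histogram
x <- seq(m - 3 * s, m + 3 * s, length.out = 200)
expected_count <- n * bin_width * dnorm(x, mean = m, sd = s)

# Height of the curve at the mean (its highest point)
peak <- n * bin_width * dnorm(m, mean = m, sd = s)
print(peak)

plot(x, expected_count, type = "l",
     xlab = "Cholesterol", ylab = "Expected count per bin")
```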

Histogram for one variable paneled by Groups (vijPlots jamovi module)

  vijPlots::histogram(
    data = data,
    aVar = Cholesterol,
    group = NULL,
    facet = Pill_Group,
    normalCurve = FALSE,
    binWidth = 20,
    binBoundary = 0,
    colorPalette = "jmv")

Histogram for one variable paneled by Groups with normal curve (vijPlots jamovi module)

  vijPlots::histogram(
    data = data,
    aVar = Cholesterol,
    group = NULL,
    facet = Pill_Group,
    histtype = "density",
    normalCurve = TRUE,
    binWidth = 20,
    binBoundary = 0,
    colorPalette = "jmv")

Raincloud plot (jamovi Exploration | surveymv)

surveymv::surveyPlot(
    data = data,
    vars = Cholesterol)

 SURVEY PLOTS

Histogram Paneled for one scale variable across one nominal variable (jamovi Exploration | surveymv)

surveymv::surveyPlot(
    data = data,
    vars = Cholesterol,
    group = Pill_Group)

 SURVEY PLOTS

Scatterplot for two scale variables

scatr::scat(
    data = data,
    x = 'Weight',
    y = 'Cholesterol')

Scatterplot for two scale variables with regression “linear fit” line added, standard error for line, and marginal densities

scatr::scat(
    data = data,
    x = 'Weight',
    y = 'Cholesterol',
    line = 'linear',
    se = TRUE,
    marg = 'dens')  # marginal plots: 'dens' for densities or 'box' for boxplots

Scatterplot for two scale variables marked differently for different levels of one nominal variable with boxplots on the margins

scatr::scat(
    data = data,
    x = 'Weight',
    y = 'Cholesterol',
    group = 'Pill_Group',
    marg = 'box',
    line = 'linear')

Scatterplot with one binary nominal variable and one scale variable

scatr::scat(
    data = data,
    x = 'Pill_Grp',
    y = 'Cholesterol',
    line = 'linear')

Note that this nominal-by-scale variable scatterplot ONLY makes sense when the nominal variable is numeric and binary (i.e., dichotomous, two-group). This is also the only time a correlation calculated for nominal and scale variables makes sense.
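The reason this works for a binary variable is that the correlation between a 0/1 variable and a scale variable (often called a point-biserial correlation) carries the same information as the independent samples t test comparing the two groups. The base R sketch below demonstrates the link with invented toy data, using the pooled-variance t test and the identity r² = t² / (t² + df).

```r
# Toy data (invented values, not the chapter's data set)
group <- rep(c(0, 1), each = 6)                       # numeric binary variable
chol  <- c(250, 231, 244, 262, 219, 238,
           214, 229, 201, 240, 222, 208)

# Pearson correlation between the binary variable and the scale variable
r <- cor(group, chol)

# Pooled-variance independent samples t test on the same data
tt     <- t.test(chol ~ group, var.equal = TRUE)
t_stat <- unname(tt$statistic)
df     <- unname(tt$parameter)

# Recover the size of the correlation from t and its degrees of freedom
r_from_t <- abs(t_stat) / sqrt(t_stat^2 + df)

print(c(abs_r = abs(r), r_from_t = r_from_t))
```

With three or more unordered groups there is no single numeric coding that preserves this relationship, which is why the scatterplot and the correlation stop making sense.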

Scatterplot with one ordinal variable (or scale variable without much variation) and one scale variable

scatr::scat(
    data = data,
    x = 'Height',
    y = 'Weight',
    line = 'linear')

Note that, if you look carefully, this ordinal-like-by-scale variable scatterplot shows vertical “lines” of dots at each of the discrete values of the more ordinal-like variable (Height). When you see lines like that in your graphs, discrete values are probably the reason. In some scatterplots the lines will show up diagonally or horizontally instead of vertically, and they may be more or less obvious.

Matrix of Scatterplots for multiple variables

jmv::corrMatrix(
    data = data,
    vars = vars(Height, Cholesterol, Weight, Pill_Grp),
    sig = FALSE,
    plots = TRUE,
    plotDens = TRUE,
    plotStats = TRUE)

 CORRELATION MATRIX

 Correlation Matrix                                                 
 ────────────────────────────────────────────────────────────────── 
                  Height      Cholesterol    Weight      Pill_Grp   
 ────────────────────────────────────────────────────────────────── 
   Height                —                                          
   Cholesterol    -0.01077              —                           
   Weight          0.25085        0.68236           —               
   Pill_Grp        0.13194       -0.09551    -0.05055           —   
 ────────────────────────────────────────────────────────────────── 

Note that here we have a numeric binary variable (Pill_Grp) and three scale variables. The same warning as above applies: a nominal-by-scale variable scatterplot ONLY makes sense when the nominal variable is binary (i.e., dichotomous, two-group). This is also the only time a correlation calculated for nominal and scale variables makes sense.

But when we use the binary variable as a factor (where we have told jamovi it is a categorical nominal/ordinal variable), we get a different look to the matrix: paneled results by group (and this will work for multiple groups).

Matrix of Scatterplots for multiple variables (with one nominal/ordinal variable)

jmv::corrMatrix(
    data = data,
    vars = vars(Weight, Cholesterol, Pill_Group),
    sig = FALSE,
    plots = TRUE,
    plotDens = TRUE,
    plotStats = TRUE)

 CORRELATION MATRIX

 Correlation Matrix                                        
 ───────────────────────────────────────────────────────── 
                  Weight       Cholesterol    Pill_Group   
 ───────────────────────────────────────────────────────── 
   Weight               —                                  
   Cholesterol    0.68236              —                   
   Pill_Group         NaN ᵃ          NaN ᵃ             —   
 ───────────────────────────────────────────────────────── 
   ᵃ Pearson correlation cannot be calculated for
   non-numeric values
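The NaN entries above reflect the difference between a numeric variable and a factor. The base R sketch below shows the same distinction outside jamovi: a Pearson correlation can be computed for the numeric 0/1 coding, while base R's cor() refuses a factor with an error (jmv reports NaN instead). The data values are invented, and the names pill_grp and pill_group are lower-case stand-ins for the chapter's Pill_Grp and Pill_Group.

```r
# Toy data (invented values, not the chapter's data set)
pill_grp <- rep(c(1, 0), each = 3)                 # numeric 0/1 coding
weight   <- c(150, 162, 171, 148, 180, 166)

# The same information stored as a labeled factor
pill_group <- factor(pill_grp, levels = c(1, 0),
                     labels = c("1 Yes-Pill", "0 No-Pill"))

# Works: both variables are numeric
print(cor(pill_grp, weight))

# Fails: cor() requires numeric input, so we catch the error
result <- tryCatch(cor(pill_group, weight),
                   error = function(e) "error")
print(result)

# Recover the numeric coding from the factor labels when a correlation is needed
recovered <- ifelse(pill_group == "1 Yes-Pill", 1, 0)
print(cor(recovered, weight))
```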

SUMMARY

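This chapter presented a large number of common graphs that can be produced in jamovi, including bar charts, boxplots, error bar plots, histograms, raincloud plots, and scatterplots, along with the syntax used to create each one.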