```r
# install jrnold's package from github
# devtools::install_github("jrnold/ZeligData")  # just run once

# load packages
library(tidyverse)

# load data
turnout <- ZeligData::turnout  # see ?ZeligData::turnout for details

# quick look
glimpse(turnout)
```
We then fit a logit model that models the binary vote variable as a function of age, educate, income, and race.
```r
# fit logit model
f <- vote ~ age + educate + income + race
fit <- glm(f, data = turnout, family = binomial)

# print table
modelsummary::modelsummary(fit, gof_map = NA)
```
|             | (1)     |
|:------------|:--------|
| (Intercept) | -3.034  |
|             | (0.326) |
| age         | 0.028   |
|             | (0.003) |
| educate     | 0.176   |
|             | (0.020) |
| income      | 0.177   |
|             | (0.027) |
| racewhite   | 0.251   |
|             | (0.146) |
These coefficients are not particularly interpretable, so we might want to compute quantities of interest.1
1 In political science, King, Tomz, and Wittenberg (2000) prompted the focus on more interpretable quantities, and I often borrow their language.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44: 341–55. http://gking.harvard.edu/files/abs/making-abs.shtml.
We have learned that we can use the invariance property and the delta method as a very general way to compute almost any quantity of interest. The {marginaleffects} package in R exploits this generality to compute many quantities of interest, especially those that involve \(E(Y)\), like the commonly used expected value and first difference.
```r
library(marginaleffects)
```
We’ll focus on two main functions from {marginaleffects}: predictions() and comparisons(), as well as their avg_*() variants. Let’s take a quick look at each.
## 13.1 Expected value → predictions()
Reminder: expected value = \(E(y \mid X_c)\), where \(X_c\) is a carefully chosen covariate vector.
By default, predictions() generates an expected value for every row in the observed dataset. These expected values are stored in the estimate column of the output, which also includes other commonly used values like the upper and lower bounds of the 95% confidence interval.
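As a quick sketch of the default behavior (assuming the `fit` object from the logit model above):

```r
# one expected value (predicted probability) per row of the observed data;
# the output also carries std.error, conf.low, and conf.high columns
predictions(fit) |> glimpse()
```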
## 13.2 First difference → comparisons()

Reminder: first difference = \(E(y \mid X_{hi}) - E(y \mid X_{lo})\), where \(X_{hi}\) and \(X_{lo}\) are carefully chosen covariate vectors that fix one covariate of interest at a high value and a low value, respectively.

comparisons() computes these first differences.
Q: But what variable does it vary by default? A: All of them, one at a time! You’ll usually want to pick one.
Q: And what are the high and low values? A: It depends, but something somewhat reasonable. You’ll almost always want to specify the one you want.
By default, comparisons() computes a first difference varying each variable from a low to a high value, while fixing every other variable to the observed values.
These first differences are stored in the estimate column of the output, which also includes other commonly used values like the upper and lower bounds of the 95% confidence interval.
For the numeric variables like age, comparisons() generates a first difference moving age from its observed value to a high value of age + 1. The contrast column in the output records this comparison as +1.
For the factor variable race, comparisons() generates a first difference moving race from the reference level to each other factor level. The contrast column in the output records this comparison as white - others.
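To see these defaults, we can store the output of comparisons() in an object and tabulate the term and contrast columns (a sketch assuming the `fit` object from above; the name `fd` matches the tabulation that follows):

```r
# default: one first difference per row, for each variable in turn
fd <- comparisons(fit)
glimpse(fd)
```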
```r
table(fd$term, fd$contrast)
#>             +1 white - others
#>   age     2000              0
#>   educate 2000              0
#>   income  2000              0
#>   race       0           2000
```
Thus, by default, comparisons() gives us a data frame with the observed data repeated several times for several different first differences. This is a reasonable default behavior, but we’ll almost always want to compute something more specific (and avoid computing things we don’t need).
## 13.3 Framework
When we call predictions() or comparisons(), we are always making three choices: (1) the quantity we want, (2) the grid of predictor values where we want it, and (3) whether or not to aggregate across units (the idea of “aggregation” is new to us). These choices determine the estimand, and it is important to be explicit in writing and thinking.
Here’s how Arel-Bundock, Greifer, and Heiss (2024) describe these choices:
Arel-Bundock, Vincent, Noah Greifer, and Andrew Heiss. 2024. “How to Interpret Statistical Models Using marginaleffects for R and Python.” Journal of Statistical Software 111 (9). https://doi.org/10.18637/jss.v111.i09.
Quantity: What is the quantity of interest? Do we want to report an expected value or a comparison of expected values (difference, ratio, derivative, etc.)?
Grid: What predictor values are we interested in? Do we want to report estimates for the units in our dataset, or for hypothetical or representative individuals?
Aggregation: Do we report estimates for every observation in the grid or a global summary?
### 13.3.1 Quantity
First, we must choose our quantity of interest. Most generally, it can be “whatever we want.” But {marginaleffects} is slightly restrictive.
#### 13.3.1.1 What scale?
First, we must choose the scale of the quantity of interest. For a logistic regression, we almost always want the probability scale (i.e., "response" scale). But we could also choose the linear predictor scale (i.e., "link" scale).
We make this choice with the type= argument:
```r
# probability scale; default; almost always want this one
predictions(fit, type = "response")
```
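The other option mentioned above is the linear predictor scale; a minimal sketch (again assuming the `fit` object from above):

```r
# linear predictor (log-odds) scale; rarely what we want for interpretation
predictions(fit, type = "link")
```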
Second, we must choose whether we want a prediction or a comparison. We make this choice by using predictions() or comparisons().
Arel-Bundock et al. (2025) define a prediction as:
> Predictions are the outcomes predicted by a fitted model on a specified scale for a given combination of values of the predictor variables, such as their observed values, their means, or factor levels.
Borrowing language from King et al. (2000), I refer to {marginaleffects}’s “prediction” as an “expected value.”
They define a comparison as:
> Comparisons are functions of two or more predictions. Examples of comparisons include contrasts, differences, risk ratios, odds, lift, etc.
The “first difference” from King et al. (2000) is an example of a comparison, a “difference” in the quote above.
#### 13.3.1.3 What to compare?
Third, if we choose to compare, we need to choose what to compare. This involves two choices.
First, we need to choose a focal variable.
Second, we need to choose a “high” scenario and a “low” scenario.
We specify this focal variable and the scenarios with the variables argument to comparisons(). The variables argument is a list of focal variables and a specification of the high and low values. There are numerous convenient ways to specify the comparisons you want; see ?marginaleffects::comparisons for details. I put a few common examples below for numeric variables.
```r
# focal variable: `age`; high = age + 10; low = age
comparisons(fit, variables = list(age = 10)) |> glimpse()
```
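A few other common specifications for numeric focal variables, sketched under the same assumptions (these shortcuts are documented in ?marginaleffects::comparisons):

```r
# high = 90; low = 18
comparisons(fit, variables = list(age = c(18, 90))) |> glimpse()

# contrast across one standard deviation of age
comparisons(fit, variables = list(age = "sd")) |> glimpse()

# high = maximum observed age; low = minimum observed age
comparisons(fit, variables = list(age = "minmax")) |> glimpse()
```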
We can also supply our own function, \(hi, lo) {...}, via the comparison argument. I think the percent change, \(hi, lo) (hi / lo) - 1, is a good one. This is also known as “lift,” and we can use comparison = "lift" instead.
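A sketch of both versions, custom function and built-in shortcut (assuming the `fit` object from above):

```r
# percent change ("lift") via a custom comparison function
comparisons(fit,
            variables = list(age = c(18, 90)),
            comparison = \(hi, lo) (hi / lo) - 1) |> glimpse()

# the equivalent built-in shortcut
comparisons(fit,
            variables = list(age = c(18, 90)),
            comparison = "lift") |> glimpse()
```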
### 13.3.2 Grid

Next, we need to decide where to compute these predictions or comparisons. In most cases, the values of the non-focal variables matter, so we have to make a choice about these other variables.
By default, predictions() and comparisons() use the observed data, returning one value for each row. However, this isn’t always what we want.
To help us create the grid we want, {marginaleffects} has a powerful datagrid() function.
For example, we can supply newdata = datagrid(type = "mean_or_mode") to predictions(). This computes the prediction for a “typical case” that has numeric variables equal to their means and factor variables equal to their modes.
If we instead supply age = 18:90 to datagrid() then it will create a grid that varies age from 18 to 90 while fixing the other variables at their mean_or_mode.
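A sketch of that call (assuming the `fit` object from above):

```r
# expected values for ages 18 through 90,
# other variables fixed at their mean or mode
predictions(fit, newdata = datagrid(age = 18:90)) |> glimpse()
```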
You can always inspect the output of predictions() to see the grid that you are using.
We can use the same logic with comparisons. We can compute the first difference of a focal variable while systematically varying another variable (e.g., educate or race) and fixing the other variables at their "mean_or_mode".
```r
# first difference of moving age from 18 to 90 for respondents with
# 12, 14, and 16 years of education
comparisons(fit,
            variables = list(age = c(18, 90)),
            newdata = datagrid(educate = c(12, 14, 16))) |> glimpse()
```
```r
# first difference of moving age from 18 to 90 for white respondents
# and for non-white respondents
comparisons(fit,
            variables = list(age = c(18, 90)),
            newdata = datagrid(race = unique)) |> glimpse()
```
### 13.3.3 Aggregation

Finally, we need to decide what to do with the collection of estimates from the grid. For example, by default, comparisons() returns a first difference for every row in the dataset.
It’s really useful to summarize the heterogeneity of the effects from your model. This is one of the great strengths of parametric modeling—it allows a much richer set of quantities of interest.2
2 There’s also some cost—you do need to assume that your parametric model is a good model.
But ultimately it’s helpful to combine these many estimates into a single summary of “the effect.” It’s common to average the comparisons. If you have a collection of comparisons that you want to average, then you can use avg_comparisons() rather than comparisons(). avg_comparisons() will average the comparisons produced by comparisons() and compute the correct SE (using the delta method, again).
When trying to find a representative summary of “the effect” from a non-linear model in which the effects vary, Hanmer and Ozan Kalkan (2013) note that there are two approaches.
Hanmer, Michael J., and Kerem Ozan Kalkan. 2013. “Behind the Curve: Clarifying the Best Approach to Calculating Predicted Probabilities and Marginal Effects from Limited Dependent Variable Models.” American Journal of Political Science 57 (1): 263–77. https://doi.org/10.1111/j.1540-5907.2012.00602.x.
The average case approach tries to identify a typical case and compute the effect for that case. The observed value approach computes the effect for all cases and then averages those effects.
In many cases the two approaches give similar estimates, but these are different estimands and sometimes the estimates meaningfully diverge. Radean and Beger (2025) discuss potential differences.
Radean, Marius, and Andreas Beger. 2025. “Not-so-Average After All: Individual Vs. Aggregate Effects in Substantive Research.”Journal of Peace Research. https://repository.essex.ac.uk/40373/.
It is easy to compute either estimand using {marginaleffects}.
```r
# avg. case approach
avg_comparisons(fit,
                # set the "other" values to their means or modes
                newdata = datagrid(type = "mean_or_mode"),
                variables = list(age = c(18, 90))) |> glimpse()
```
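And a sketch of the observed-value counterpart: compute the first difference for every observed row, then average (again assuming the `fit` object from above):

```r
# observed-value approach: average the unit-level first differences,
# with the SE computed via the delta method
avg_comparisons(fit,
                variables = list(age = c(18, 90))) |> glimpse()
```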