10  Formulas

10.1 Basics of Formulas

How can we construct a design matrix in R? R has formulas to make this easy. See ?formula for details.

As a working example, suppose a normal linear model with mean \[ \mu_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2}. \]

We use formulas to specify the columns of the design matrix in the normal linear model.

A formula has a LHS (left-hand side) and a RHS separated by a ~. The LHS indicates the outcome and the RHS indicates the predictors. The RHS determines the columns of the linear predictor. The operator + adds predictors. By default, an intercept (a column of 1’s) is included.

For the linear regression above, we would use the formula y ~ x1 + x2.
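A formula is an ordinary R object that we can create and inspect; here is a minimal sketch (y, x1, and x2 are placeholder names, not variables in any data set):

```r
# create a formula object; the variables need not exist yet
f <- y ~ x1 + x2

class(f)     # "formula"
all.vars(f)  # "y" "x1" "x2"
```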

10.1.1 model.matrix() examples

Throughout these examples, I use the penguins data set from the {palmerpenguins} R package. For more, see ?palmerpenguins::penguins or the website for the package.

# load packages
library(dplyr)          # for glimpse()
library(modelsummary)
library(palmerpenguins) # example data

# quick look
glimpse(penguins)
Rows: 344
Columns: 8
$ species           <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
$ island            <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
$ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
$ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
$ body_mass_g       <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
$ sex               <fct> male, female, female, NA, female, male, female, male…
$ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…
# penguins has missing values, let's remove those rows for these examples.
penguins <- na.omit(penguins)

10.1.1.1 Single numeric variable

For a linear regression model with a single predictor, we can use the formula below to construct the design matrix. Notice that the LHS is empty. The LHS is irrelevant for the design matrix, so I leave it empty in these examples.

f <- ~ flipper_length_mm

To construct the design matrix X, we can use the model.matrix() function. We give model.matrix() a formula and a data set. It uses the RHS of the formula to construct the design matrix using the variables in the data set.

X <- model.matrix(f, data = penguins)

We can use the head() function to inspect the first six rows of the design matrix.

head(X)
  (Intercept) flipper_length_mm
1           1               181
2           1               186
3           1               195
4           1               193
5           1               190
6           1               181

10.1.1.2 Two numeric variables

We can add a second predictor and we get the expected design matrix.

f <- ~ flipper_length_mm + body_mass_g
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) flipper_length_mm body_mass_g
1           1               181        3750
2           1               186        3800
3           1               195        3250
4           1               193        3450
5           1               190        3650
6           1               181        3625

10.1.1.3 Removing the intercept

An intercept (a column of ones) is added to the design matrix by default. You can remove the intercept by adding -1 or + 0 to the formula.

f <- ~ -1 + flipper_length_mm + body_mass_g
X <- model.matrix(f, data = penguins)
head(X)
  flipper_length_mm body_mass_g
1               181        3750
2               186        3800
3               195        3250
4               193        3450
5               190        3650
6               181        3625
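The + 0 spelling produces the same no-intercept design matrix; a quick check (using the same cleaned penguins data):

```r
library(palmerpenguins) # example data
penguins <- na.omit(penguins)

f <- ~ 0 + flipper_length_mm + body_mass_g
X <- model.matrix(f, data = penguins)
colnames(X)  # no "(Intercept)" column
```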

10.1.2 lm() examples

To see formulas from a more familiar perspective, let’s also use lm() to fit several linear models. We’re using the penguins data with bill_length_mm as the outcome.

fit1 <- lm(bill_length_mm ~ flipper_length_mm, data = penguins)
fit2 <- lm(bill_length_mm ~ body_mass_g, data = penguins)
fit3 <- lm(bill_length_mm ~ flipper_length_mm + body_mass_g, data = penguins)
fit4 <- lm(bill_length_mm ~ flipper_length_mm - 1, data = penguins)

modelsummary(list(fit1, fit2, fit3, fit4), gof_map = NA)
                     (1)       (2)       (3)       (4)
(Intercept)        -7.219    27.151    -3.981
                   (3.272)   (1.292)   (4.722)
flipper_length_mm   0.255               0.227     0.219
                   (0.016)             (0.033)   (0.001)
body_mass_g                   0.004     0.001
                             (0.000)   (0.001)

10.2 Factors

The examples above show numerical variables. But formulas also handle factors. Factors are automatically expanded into indicator variables in the design matrix. By default, R uses treatment coding: one category is the baseline (reference) and the others are coded as 0/1 indicators.1

1 This is the standard approach in political science, but there are other ways. R also supports other contrast codings, such as contr.sum or contr.helmert. These can be useful to make regression tables informative about specific hypotheses, but they are not commonly used and won’t be familiar to your readers.
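As a sketch of one alternative, model.matrix() accepts a contrasts.arg argument that swaps in a different coding; here sum (deviation) coding for species:

```r
library(palmerpenguins) # example data
penguins <- na.omit(penguins)

# sum coding: levels are coded 1/0/-1 rather than 0/1 indicators
X <- model.matrix(~ species, data = penguins,
                  contrasts.arg = list(species = "contr.sum"))
head(X)
```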

10.2.1 model.matrix() examples

10.2.1.1 A single factor variable

The penguins data set contains a factor variable species.

levels(penguins$species)
[1] "Adelie"    "Chinstrap" "Gentoo"   

A formula with species on the right-hand side makes a design matrix with an intercept and two indicators. One indicator variable is for Chinstrap penguins; the other is for Gentoo penguins. Adelie penguins are the reference (i.e., left-out) category.

The coefficients for speciesChinstrap and speciesGentoo are the changes in \(\mu_i\) for Chinstrap and Gentoo penguins relative to Adelie penguins, respectively.

f <- ~ species
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) speciesChinstrap speciesGentoo
1           1                0             0
2           1                0             0
3           1                0             0
4           1                0             0
5           1                0             0
6           1                0             0

We can change the reference category from Adelie with the relevel() function. The code below makes Gentoo the reference category.

penguins$species <- relevel(penguins$species, "Gentoo")
f <- ~ species
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) speciesAdelie speciesChinstrap
1           1             1                0
2           1             1                0
3           1             1                0
4           1             1                0
5           1             1                0
6           1             1                0

Now the coefficients for speciesAdelie and speciesChinstrap are the changes in \(\mu_i\) for Adelie and Chinstrap penguins relative to Gentoo penguins, respectively.

10.2.1.2 A factor and numeric variable

We can include factors and continuous predictors together. For example, we combine species with flipper_length_mm.

f <- ~ species + flipper_length_mm
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) speciesAdelie speciesChinstrap flipper_length_mm
1           1             1                0               181
2           1             1                0               186
3           1             1                0               195
4           1             1                0               193
5           1             1                0               190
6           1             1                0               181

The resulting design matrix includes columns for the intercept, the indicator variables for species, and the continuous predictor. The coefficients on the indicators still represent differences between species, but now “holding flipper length constant.”

10.2.2 lm() examples

As familiar examples, here are a few lm() fits with factors species and island as predictors of bill_length_mm.

# one factor
fit1 <- lm(bill_length_mm ~ species, data = penguins)

# two factors
fit2 <- lm(bill_length_mm ~ species + island, data = penguins)

# one factor, one numeric
fit3 <- lm(bill_length_mm ~ species + flipper_length_mm, data = penguins)

# one factor, two numeric
fit4 <- lm(bill_length_mm ~ species + flipper_length_mm + body_mass_g, data = penguins)

modelsummary(list(fit1, fit2, fit3, fit4), gof_map = NA)
                     (1)       (2)       (3)       (4)
(Intercept)        47.568    47.568     0.369    11.223
                   (0.272)   (0.273)   (4.661)   (4.436)
speciesAdelie      -8.744    -8.593    -2.849    -2.021
                   (0.367)   (0.525)   (0.664)   (0.612)
speciesChinstrap    1.266     1.721     5.918     7.344
                   (0.452)   (0.753)   (0.605)   (0.577)
islandDream                  -0.455
                             (0.602)
islandTorgersen               0.063
                             (0.624)
flipper_length_mm                       0.217     0.099
                                       (0.021)   (0.024)
body_mass_g                                       0.003
                                                 (0.000)

10.3 Polynomials

Sometimes we suspect the effect of a variable on the outcome is not linear. To allow nonlinear effects, we can include polynomial terms of the predictor in the design matrix. For example, we can use the quadratic model

\[ \mu_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2. \]

Here the effect of \(x_i\) depends on its value—\(\frac{d \mu_i}{d x_i} = \beta_1 + 2\beta_2 x_i\). We can build the design matrix easily with formulas and the poly() function.

10.3.1 Orthogonal and raw polynomials

The function poly() creates polynomial terms. By default, it produces orthogonal polynomials, which are linear combinations of \(x, x^2, x^3, \dots\) that are uncorrelated.2 If we set raw = TRUE, poly() returns the raw powers of \(x\) (e.g., \(x = 3\) becomes \(x^2 = 9\)). The raw powers are easier to interpret directly when they appear in regression tables but can sometimes cause problems for computation.

2 These uncorrelated variables are created using the Gram–Schmidt process. Orthogonal polynomials are helpful because they reduce multicollinearity. Less multicollinearity makes numerical optimization routines (e.g., Newton–Raphson) more robust and helps MCMC samplers mix more efficiently. Raw powers can lead to high multicollinearity and cause problems for our computational procedures. However, these transformed variables do not have the same interpretation.

Recommendation: I recommend using raw = TRUE if you are reporting results in a regression table, unless it causes numerical problems. In practice, though, the choice usually doesn’t matter. First, we usually process our results in a way that makes the interpretation the same with raw = FALSE and raw = TRUE. Second, numerical routines are really good, so raw = TRUE doesn’t usually lead to problems.

10.3.2 model.matrix() examples

Let’s look at orthogonal and raw polynomials using flipper_length_mm.

# orthogonal
f <- ~ poly(flipper_length_mm, 2)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) poly(flipper_length_mm, 2)1 poly(flipper_length_mm, 2)2
1           1                 -0.07818550                 0.089041829
2           1                 -0.05860679                 0.030341919
3           1                 -0.02336511                -0.038291958
4           1                 -0.03119659                -0.027153981
5           1                 -0.04294382                -0.006039163
6           1                 -0.07818550                 0.089041829
# raw
f <- ~ poly(flipper_length_mm, 2, raw = TRUE)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) poly(flipper_length_mm, 2, raw = TRUE)1
1           1                                     181
2           1                                     186
3           1                                     195
4           1                                     193
5           1                                     190
6           1                                     181
  poly(flipper_length_mm, 2, raw = TRUE)2
1                                   32761
2                                   34596
3                                   38025
4                                   37249
5                                   36100
6                                   32761

With orthogonal polynomials, the columns are not \(x\) and \(x^2\) but uncorrelated combinations of them. With raw polynomials, the columns are exactly \(x\) and \(x^2\).
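Because the raw columns are literally \(x\) and \(x^2\), we can recover the marginal effect \(\beta_1 + 2\beta_2 x\) directly from the raw coefficients; a small sketch:

```r
library(palmerpenguins) # example data
penguins <- na.omit(penguins)

fit <- lm(bill_length_mm ~ poly(flipper_length_mm, 2, raw = TRUE),
          data = penguins)
b <- coef(fit)  # intercept, beta_1, beta_2

# marginal effect d mu / d x = beta_1 + 2 * beta_2 * x
slope_at <- function(x) unname(b[2] + 2 * b[3] * x)
slope_at(c(180, 200, 220))
```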

Here’s an example of a 4th-order polynomial.

# raw
f <- ~ poly(flipper_length_mm, 4, raw = TRUE)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) poly(flipper_length_mm, 4, raw = TRUE)1
1           1                                     181
2           1                                     186
3           1                                     195
4           1                                     193
5           1                                     190
6           1                                     181
  poly(flipper_length_mm, 4, raw = TRUE)2
1                                   32761
2                                   34596
3                                   38025
4                                   37249
5                                   36100
6                                   32761
  poly(flipper_length_mm, 4, raw = TRUE)3
1                                 5929741
2                                 6434856
3                                 7414875
4                                 7189057
5                                 6859000
6                                 5929741
  poly(flipper_length_mm, 4, raw = TRUE)4
1                              1073283121
2                              1196883216
3                              1445900625
4                              1387488001
5                              1303210000
6                              1073283121

10.3.3 lm() examples

As familiar examples, here are a few lm() fits with both orthogonal and raw polynomials.

fit1 <- lm(bill_length_mm ~ poly(flipper_length_mm, 2), data = penguins)
fit2 <- lm(bill_length_mm ~ poly(flipper_length_mm, 3), data = penguins)
fit3 <- lm(bill_length_mm ~ poly(flipper_length_mm, 2, raw = TRUE), data = penguins)
fit4 <- lm(bill_length_mm ~ poly(flipper_length_mm, 3, raw = TRUE), data = penguins)

modelsummary(list(fit1, fit2, fit3, fit4), gof_map = NA)
                                           (1)       (2)       (3)        (4)
(Intercept)                              43.993    43.993   -46.928   -182.782
                                         (0.227)   (0.228)  (50.349)  (613.541)
poly(flipper_length_mm, 2)1              65.077
                                         (4.150)
poly(flipper_length_mm, 2)2              -3.280
                                         (4.150)
poly(flipper_length_mm, 3)1                        65.077
                                                   (4.156)
poly(flipper_length_mm, 3)2                        -3.280
                                                   (4.156)
poly(flipper_length_mm, 3)3                         0.923
                                                   (4.156)
poly(flipper_length_mm, 2, raw = TRUE)1                       0.647
                                                             (0.497)
poly(flipper_length_mm, 2, raw = TRUE)2                      -0.001
                                                             (0.001)
poly(flipper_length_mm, 3, raw = TRUE)1                                  2.665
                                                                        (9.097)
poly(flipper_length_mm, 3, raw = TRUE)2                                 -0.011
                                                                        (0.045)
poly(flipper_length_mm, 3, raw = TRUE)3                                  0.000
                                                                        (0.000)

Note that while the raw and orthogonal polynomial terms have different coefficients, they produce exactly the same estimates of \(\mu_i\). They also produce exactly the same estimates of \(d\mu_i / dx_i\), though with orthogonal polynomials we have to transform the estimates back to the original scale of \(x_i\) (R can do this for us).
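We can verify the first claim directly: refitting the quadratic with orthogonal and raw polynomials gives identical fitted values.

```r
library(palmerpenguins) # example data
penguins <- na.omit(penguins)

fit_orth <- lm(bill_length_mm ~ poly(flipper_length_mm, 2),
               data = penguins)
fit_raw  <- lm(bill_length_mm ~ poly(flipper_length_mm, 2, raw = TRUE),
               data = penguins)

# different coefficients, same fitted values
all.equal(fitted(fit_orth), fitted(fit_raw))  # TRUE
```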

10.4 Interactions

We can also use formulas to create interactions (Brambor, Clark, and Golder 2006). In scalar form, we typically use the specification \[ \mu_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1}x_{i2} \] to model interactions so that the effect of \(x_{i1}\) depends on \(x_{i2}\) (and vice versa).

Brambor, Thomas, William Roberts Clark, and Matt Golder. 2006. “Understanding Interaction Models: Improving Empirical Analyses.” Political Analysis 14 (1): 63–82. https://doi.org/10.1093/pan/mpi014.

R provides three operators for interactions:

  • : includes only the product of two variables.
  • * expands to the main effects and their interaction.
  • ^ expands to all interactions up to a given order.
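To see ^ at work, the sketch below expands three numeric predictors into all main effects plus every two-way interaction:

```r
library(palmerpenguins) # example data
penguins <- na.omit(penguins)

f <- ~ (flipper_length_mm + body_mass_g + bill_depth_mm)^2
X <- model.matrix(f, data = penguins)
colnames(X)  # intercept, 3 main effects, 3 two-way interactions
```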

10.4.1 model.matrix() examples

10.4.1.1 Two numeric variables

First, consider two numeric predictors.

f <- ~ flipper_length_mm * body_mass_g
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) flipper_length_mm body_mass_g flipper_length_mm:body_mass_g
1           1               181        3750                        678750
2           1               186        3800                        706800
3           1               195        3250                        633750
4           1               193        3450                        665850
5           1               190        3650                        693500
6           1               181        3625                        656125

This creates columns for flipper_length_mm, body_mass_g, and their product. The coefficient on the product term shows how the slope of one variable depends on the other.

We wouldn’t (usually) want to do this, but we can also include only the product, leaving out the main effects.

f <- ~ flipper_length_mm:body_mass_g
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) flipper_length_mm:body_mass_g
1           1                        678750
2           1                        706800
3           1                        633750
4           1                        665850
5           1                        693500
6           1                        656125

Or we could include only one of the main effects.

f <- ~ flipper_length_mm + flipper_length_mm:body_mass_g
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) flipper_length_mm flipper_length_mm:body_mass_g
1           1               181                        678750
2           1               186                        706800
3           1               195                        633750
4           1               193                        665850
5           1               190                        693500
6           1               181                        656125

10.4.1.2 Numeric and factor variable

We can also interact factor and numeric variables. For example, we can interact flipper_length_mm with species.

levels(penguins$species)
[1] "Gentoo"    "Adelie"    "Chinstrap"
f <- ~ flipper_length_mm * species
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) flipper_length_mm speciesAdelie speciesChinstrap
1           1               181             1                0
2           1               186             1                0
3           1               195             1                0
4           1               193             1                0
5           1               190             1                0
6           1               181             1                0
  flipper_length_mm:speciesAdelie flipper_length_mm:speciesChinstrap
1                             181                                  0
2                             186                                  0
3                             195                                  0
4                             193                                  0
5                             190                                  0
6                             181                                  0

Here the design matrix allows separate slopes for each species. The dummy variables for species adjust the intercepts, and the product terms adjust the slopes.

10.4.1.3 Two factor variables

Finally, we can interact two factors, such as species and island.

levels(penguins$species)
[1] "Gentoo"    "Adelie"    "Chinstrap"
levels(penguins$island)
[1] "Biscoe"    "Dream"     "Torgersen"
f <- ~ species * island
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) speciesAdelie speciesChinstrap islandDream islandTorgersen
1           1             1                0           0               1
2           1             1                0           0               1
3           1             1                0           0               1
4           1             1                0           0               1
5           1             1                0           0               1
6           1             1                0           0               1
  speciesAdelie:islandDream speciesChinstrap:islandDream
1                         0                            0
2                         0                            0
3                         0                            0
4                         0                            0
5                         0                            0
6                         0                            0
  speciesAdelie:islandTorgersen speciesChinstrap:islandTorgersen
1                             1                                0
2                             1                                0
3                             1                                0
4                             1                                0
5                             1                                0
6                             1                                0

This creates dummy variables for both main effects and for combinations of levels across the two factors.

Remember: When the models get complicated, you can take a first derivative to understand the meaning of the coefficients. For example, the effect of speciesChinstrap is \(\beta_2 + \beta_6 \texttt{islandDream} + \beta_8 \texttt{islandTorgersen}\). This means that the effect is \(\beta_2\) for penguins on Biscoe (the reference island), \(\beta_2 + \beta_6\) for penguins on Dream, and \(\beta_2 + \beta_8\) for penguins on Torgersen.

10.4.2 lm() examples

As familiar examples, here are a few lm() fits with interactions predicting bill_length_mm.

# numeric × numeric
fit1 <- lm(bill_length_mm ~ flipper_length_mm * body_mass_g, data = penguins)

# numeric × factor
fit2 <- lm(bill_length_mm ~ flipper_length_mm * species, data = penguins)

# factor × factor
fit3 <- lm(bill_length_mm ~ species * island, data = penguins)

# richer example
fit4 <- lm(bill_length_mm ~ (flipper_length_mm + body_mass_g + bill_depth_mm)^2, data = penguins)

modelsummary(list(fit1, fit2, fit3, fit4), gof_map = NA)
                                       (1)       (2)       (3)       (4)
(Intercept)                         -26.752   -20.488    47.568   139.367
                                    (21.149)   (7.757)   (0.273)  (59.166)
flipper_length_mm                     0.339     0.313              -0.902
                                     (0.107)   (0.036)            (0.343)
body_mass_g                           0.006                         0.025
                                     (0.005)                       (0.008)
flipper_length_mm × body_mass_g      -0.000                        -0.000
                                     (0.000)                       (0.000)
speciesAdelie                                  33.524    -8.593
                                               (9.920)   (0.525)
speciesChinstrap                               26.081     1.721
                                              (11.559)   (0.753)
flipper_length_mm × speciesAdelie              -0.178
                                               (0.048)
flipper_length_mm × speciesChinstrap           -0.092
                                               (0.056)
islandDream                                              -0.455
                                                         (0.602)
islandTorgersen                                           0.063
                                                         (0.624)
bill_depth_mm                                                     -10.376
                                                                   (3.328)
flipper_length_mm × bill_depth_mm                                   0.075
                                                                   (0.020)
body_mass_g × bill_depth_mm                                        -0.001
                                                                   (0.000)

10.5 Transformations

Transformations change how predictors enter the design matrix. In general, we can write \[ \mu_i = \beta_0 + \beta_1 g(x_i), \] where \(g(\cdot)\) is a transformation such as \(\log\), \(\sqrt{\cdot}\), or squaring. Transformations can linearize nonlinear relationships, stabilize variance, or change interpretation.

10.5.1 Transformations that don’t conflict with formula operators

Some functions can be applied directly in a formula. For example, we can use the log or square-root functions.

The log is especially common. For the model \[ \mu_i = \beta_0 + \beta_1 \log(x_i), \] the marginal effect is \(\tfrac{d\mu_i}{dx_i} = \tfrac{\beta_1}{x_i}\), which decreases in magnitude as \(x_i\) grows.

f <- ~ log(flipper_length_mm)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) log(flipper_length_mm)
1           1               5.198497
2           1               5.225747
3           1               5.273000
4           1               5.262690
5           1               5.247024
6           1               5.198497

The square root can also be used directly. For the model \[ \mu_i = \beta_0 + \beta_1 \sqrt{x_i}, \] the marginal effect is \(\tfrac{d\mu_i}{dx_i} = \tfrac{\beta_1}{2\sqrt{x_i}}\).

f <- ~ sqrt(flipper_length_mm)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) sqrt(flipper_length_mm)
1           1                13.45362
2           1                13.63818
3           1                13.96424
4           1                13.89244
5           1                13.78405
6           1                13.45362

10.5.2 Transformations that require I()

Importantly, though, some symbols in formulas have special meanings. In formulas, + adds predictors, - removes predictors, ^ expands interactions up to a given order, and : forms interactions.

If we want the arithmetic operation instead of the formula operator, we must wrap the expression in I().

For example, to include a quadratic term, we write

f <- ~ I(flipper_length_mm^2)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) I(flipper_length_mm^2)
1           1                  32761
2           1                  34596
3           1                  38025
4           1                  37249
5           1                  36100
6           1                  32761
f <- ~ flipper_length_mm + I(flipper_length_mm^2)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) flipper_length_mm I(flipper_length_mm^2)
1           1               181                  32761
2           1               186                  34596
3           1               195                  38025
4           1               193                  37249
5           1               190                  36100
6           1               181                  32761
Warning

If not wrapped in I(), flipper_length_mm^2 means “all second-order interactions of these variables and their main effects.” With a single variable there are no interactions, so the formula reduces to just flipper_length_mm. This is a pretty confusing result!

f <- ~ flipper_length_mm^2
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) flipper_length_mm
1           1               181
2           1               186
3           1               195
4           1               193
5           1               190
6           1               181

The I() function can also force R to treat sums and differences literally. For example, the formula below creates one coefficient for the sum of flipper length and body mass.

f <- ~ I(flipper_length_mm + body_mass_g)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) I(flipper_length_mm + body_mass_g)
1           1                               3931
2           1                               3986
3           1                               3445
4           1                               3643
5           1                               3840
6           1                               3806

We can include multiple transformations at once.

f <- ~ log(flipper_length_mm) + I(flipper_length_mm^2) + sqrt(body_mass_g)
X <- model.matrix(f, data = penguins)
head(X)
  (Intercept) log(flipper_length_mm) I(flipper_length_mm^2) sqrt(body_mass_g)
1           1               5.198497                  32761          61.23724
2           1               5.225747                  34596          61.64414
3           1               5.273000                  38025          57.00877
4           1               5.262690                  37249          58.73670
5           1               5.247024                  36100          60.41523
6           1               5.198497                  32761          60.20797

10.5.3 lm() examples

As familiar examples, here are a few lm() fits with transformations of predictors.

fit1 <- lm(bill_length_mm ~ log(flipper_length_mm), data = penguins)
fit2 <- lm(bill_length_mm ~ I(flipper_length_mm^2), data = penguins)
fit3 <- lm(bill_length_mm ~ sqrt(flipper_length_mm), data = penguins)
fit4 <- lm(bill_length_mm ~ log(flipper_length_mm) + I(flipper_length_mm^2) + sqrt(body_mass_g), data = penguins)

modelsummary(list(fit1, fit2, fit3, fit4), gof_map = NA)
                          (1)       (2)       (3)       (4)
(Intercept)           -230.169    18.649   -58.945  -345.868
                       (17.441)   (1.637)   (6.556) (244.273)
log(flipper_length_mm)  51.721                        75.246
                        (3.290)                      (50.714)
I(flipper_length_mm^2)             0.001              -0.000
                                  (0.000)             (0.001)
sqrt(flipper_length_mm)                      7.266
                                            (0.462)
sqrt(body_mass_g)                                      0.097
                                                      (0.077)

10.6 Updating Formulas

Sometimes we want to adjust an existing formula instead of rewriting it from scratch. The update() function makes this easy. It is especially useful when experimenting with changes to a baseline specification.

For example, suppose we start with a model of bill_length_mm on flipper_length_mm.

f <- bill_length_mm ~ flipper_length_mm
fit1 <- lm(f, data = penguins)

We can use update() to add or remove predictors. The special symbol . means “keep everything else.”

# add body_mass_g
fit2 <- update(fit1, . ~ . + body_mass_g)

# remove flipper_length_mm
fit3 <- update(fit1, . ~ . - flipper_length_mm)

modelsummary(list(fit1, fit2, fit3), gof_map = NA)
                     (1)       (2)       (3)
(Intercept)        -7.219    -3.981    43.993
                   (3.272)   (4.722)   (0.300)
flipper_length_mm   0.255     0.227
                   (0.016)   (0.033)
body_mass_g                   0.001
                             (0.001)
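update() also works on a bare formula, not just a fitted model; a small sketch:

```r
f <- bill_length_mm ~ flipper_length_mm

# add a predictor to the formula itself
f2 <- update(f, . ~ . + body_mass_g)
f2  # bill_length_mm ~ flipper_length_mm + body_mass_g
```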