What's new in R 4.5.0?

Published: April 10, 2025

tags: r

R 4.5.0 (“How About a Twenty-Six”) was released on 11th April, 2025. Here we summarise some of the interesting changes that have been introduced. In previous blog posts we have discussed the new features introduced in R 4.4.0 and earlier versions (see the links at the end of this post).

The full changelog can be found at the r-release ‘NEWS’ page. If you want to keep up to date with developments in base R, have a look at the r-devel ‘NEWS’ page.

penguins

Who doesn’t love a new dataset?

One of the great things about learning R for data science is that there is a collection of datasets available to work with, built into the base installation of R. The Palmer Penguins dataset has been available via an external package since 2020, and has now been added to R 4.5.0 as a base dataset.

This dataset is useful for clustering and classification tasks and was originally highlighted as an alternative to the iris dataset.

In addition to the penguins dataset, there is a related penguins_raw dataset. This may prove useful when teaching or learning data cleaning.
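As a quick first look at the two datasets, something like the following could be run in R 4.5.0 or later. This is only a sketch: it uses base R functions, and the choice of summaries (missing-value counts and mean bill length per species) is our own illustration rather than anything prescribed by the dataset.

```r
# Requires R >= 4.5.0, where both datasets are available without any package
dim(penguins)       # rows and columns of the cleaned dataset
dim(penguins_raw)   # the raw version has the same rows but more columns
names(penguins)

# Count missing values per column: a typical first data-cleaning check
colSums(is.na(penguins))

# Mean bill length by species (rows with missing values are dropped)
aggregate(bill_len ~ species, data = penguins, FUN = mean)
```

Comparing names(penguins) with names(penguins_raw) is a reasonable starting point for a data-cleaning exercise.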

use()

If you have worked in languages other than R, its approach to importing code from packages may seem strange. In a Python module, you would either import a package and then use functions from within the explicit namespace for the package:

import numpy
numpy.array([1, 2, 3])
# array([1, 2, 3])

Or you would import a specific function by name, prior to its use:

from numpy import array
array([1, 2, 3])
# array([1, 2, 3])

In an R script, we either use explicitly-namespaced functions (without loading the containing package):

penguins |>
  dplyr::filter(bill_len > 40)

Or we load a package, adding all its exported functions to our namespace, and then use the specific functions we need:

library("dplyr")
penguins |>
  filter(bill_len > 40)

The latter form can cause some confusion. If you load multiple packages, there may be naming conflicts between their exported functions. Indeed, there is a filter() function in the base package {stats} that is masked when we load {dplyr}, so the behaviour of filter() differs before and after loading {dplyr}.
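One way to see this masking in action is to ask R where it finds filter() on the search path. The sketch below uses find() from {utils}. The {dplyr} step is guarded because we are assuming, not requiring, that {dplyr} is installed.

```r
# In a fresh session, filter() is found only in {stats}
before <- find("filter")
before

# After attaching {dplyr} (if it is installed), filter() is found in both
# packages; the one listed first is the one that gets called
if (requireNamespace("dplyr", quietly = TRUE)) {
  suppressPackageStartupMessages(library("dplyr"))
  after <- find("filter")
  print(after)
}

# The masked version is still reachable with an explicit namespace
stats::filter(1:5, rep(1 / 3, 3))
```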

R 4.5.0 introduces a new way to load objects from a package: use(). This allows us to be more precise about which functions we load, and from where:

# R 4.5.0 (New session)
use("dplyr", c("filter", "select"))

# Attaching package: ‘dplyr’
# 
# The following object is masked from ‘package:stats’:
#
#     filter
#

penguins |>
  filter(bill_len > 40) |>
  select(species:bill_dep) |>
  head()

#   species    island bill_len bill_dep
# 1  Adelie Torgersen     40.3     18.0
# 2  Adelie Torgersen     42.0     20.2
# 3  Adelie Torgersen     41.1     17.6
# 4  Adelie Torgersen     42.5     20.7
# 5  Adelie Torgersen     46.0     21.5
# 6  Adelie    Biscoe     40.6     18.6

Note that only those objects that we use() get imported from the package:

# R 4.5.0 (Session continued)

n_distinct(penguins)

# Error in n_distinct(penguins) : could not find function "n_distinct"
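The same behaviour can be tried without installing anything, using a package that ships with R. The sketch below attaches a single function from {tools}. The package and the file name are just our choices for illustration.

```r
# Requires R >= 4.5.0
# Attach only file_ext() from {tools}
use("tools", "file_ext")

file_ext("penguins.csv")
# "csv"

# Other exports of {tools} were not attached ...
exists("file_path_sans_ext")

# ... but they remain reachable through the namespace operator
tools::file_path_sans_ext("penguins.csv")
```

Under the hood, use() is a thin wrapper around library() with the include.only argument, so library("tools", include.only = "file_ext") attaches the same function in earlier versions of R.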

A feature similar to use() has been available in the {box} and {import} packages for a while. {box} is a particularly interesting project, as it allows more fine-grained control over the import and export of objects from specific code files.

Parallel downloads

Historically, the install.packages() function worked sequentially: packages were downloaded and installed one at a time. This meant it could be slow to install many packages.

We often recommend the {pak} package for installing packages because it can download and install packages in parallel.

As of R 4.5.0, however, install.packages() (and the related download.packages() and update.packages()) can download packages in parallel. This may speed up the whole download-and-install process. As described in a post on the R-project blog by Tomas Kalibera, the typical speed-up expected is around 2-5x (although this is highly variable).
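If you want a rough feel for the download time on your own machine, a small timing helper along these lines could be used. This is a sketch only: time_downloads() is a hypothetical helper of our own, and the package names and CRAN mirror are arbitrary choices. Running it under an older R version and under R 4.5.0 would give a crude comparison, although network conditions will add a lot of noise.

```r
# Hypothetical helper: time how long it takes to download some source packages
time_downloads <- function(pkgs, repos = "https://cloud.r-project.org") {
  dest <- tempfile("pkg-downloads-")
  dir.create(dest)
  on.exit(unlink(dest, recursive = TRUE), add = TRUE)

  elapsed <- system.time(
    files <- download.packages(pkgs, destdir = dest, repos = repos,
                               type = "source")
  )[["elapsed"]]

  # download.packages() returns one row per downloaded file
  list(n_downloaded = nrow(files), seconds = elapsed)
}

# Needs network access, so only run it in an interactive session
if (interactive()) {
  time_downloads(c("cli", "glue", "rlang"))
}
```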

C23

C23 is the current standard for the C language. Much of base R, and many R packages, are written in C and must be compiled. If a C23 compiler is available on your machine, R will now use it in preference to older standards.
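To see which C compiler your own R installation is configured to use, you can query R CMD config from within R. The sketch below makes some assumptions: r_config() is a hypothetical helper of our own, it expects a Unix-alike system with the usual build tools, and we are assuming that the CC23 variable is recognised by your version of R. If the query fails, the helper returns an empty string.

```r
# Hypothetical helper: ask 'R CMD config' for a build variable
r_config <- function(var) {
  r_bin <- file.path(R.home("bin"), "R")
  out <- tryCatch(
    suppressWarnings(
      system2(r_bin, c("CMD", "config", var), stdout = TRUE, stderr = FALSE)
    ),
    error = function(e) character(0)
  )
  paste(out, collapse = " ")
}

r_config("CC")     # the default C compiler and its flags
r_config("CC23")   # the compiler used when C23 is requested (assumed name)
```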

grepv()

For pattern matching in base R, grep() and related functions are the main tools. By default, grep() returns the indices of the entries in a vector that match some pattern.

penguins_raw$Comments |> grep(pattern = "Nest", x = _)
#  [1]   7   8  29  30  39  40  69  70 121 122 131 132 139 140 163 164 193 194 199
# [20] 200 271 272 277 278 293 294 299 300 301 302 303 304 315 316 341 342

It has long been possible to extract the matching values of the input vector, rather than their indices, by specifying value = TRUE in the arguments to grep():

penguins_raw$Comments |>
  grep(pattern = "Nest", x = _, value = TRUE) |>
  head()
# [1] "Nest never observed with full clutch."                               
# [2] "Nest never observed with full clutch."                               
# [3] "Nest never observed with full clutch."                               
# [4] "Nest never observed with full clutch."                               
# [5] "Nest never observed with full clutch."                               
# [6] "Nest never observed with full clutch. Not enough blood for isotopes."

Now, in R 4.5.0, a new function grepv() has been introduced. It behaves like grep(), but returns the matching values rather than their indices by default:

penguins_raw$Comments |>
  grepv(pattern = "Nest", x = _) |>
  head()
# [1] "Nest never observed with full clutch."                               
# [2] "Nest never observed with full clutch."                               
# [3] "Nest never observed with full clutch."                               
# [4] "Nest never observed with full clutch."                               
# [5] "Nest never observed with full clutch."                               
# [6] "Nest never observed with full clutch. Not enough blood for isotopes."
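The relationship between the two functions can be checked on a small vector. In this sketch, the comments vector is made up for illustration, and the fallback definition of grepv() is our own addition so that the code also runs on versions of R before 4.5.0.

```r
# Fallback for R < 4.5.0 (our own stand-in, not the official definition)
if (!exists("grepv")) {
  grepv <- function(pattern, x, ...) grep(pattern, x, value = TRUE, ...)
}

# A made-up vector of comments
comments <- c(
  "Nest never observed with full clutch.",
  "Not enough blood for isotopes.",
  "Adult not sampled."
)

# Indices from grep(), then subsetting by hand
idx <- grep("Nest", comments)
comments[idx]

# grepv() returns the matching values directly
grepv("Nest", comments)

# Both approaches give the same result
identical(grepv("Nest", comments), comments[idx])
# TRUE

# Other grep() arguments, such as invert, work as usual
grepv("Nest", comments, invert = TRUE)
```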

Contributions from R-Dev-Days

Many of the changes described in the “R News” for the new release came about as contributions from “R Dev Days”. These are regular events that aim to expand the number of people contributing code to the core of R. In 2024, Jumping Rivers staff attended these events in London and Newcastle (prior to “SatRDays” and “Shiny In Production”, respectively). Dev days are often attached to a conference and provide an interesting challenge to anyone interested in keeping R healthy and learning some new skills.

Trying out R 4.5.0

To take away the pain of installing the latest development version of R, you can use Docker. The following commands pull and run the devel version of R:

docker pull rstudio/r-base:devel-jammy
docker run --rm -it rstudio/r-base:devel-jammy

Once R 4.5 is the released version of R and the r-docker repository has been updated, you should use the following commands to test out R 4.5:

docker pull rstudio/r-base:4.5-jammy
docker run --rm -it rstudio/r-base:4.5-jammy

An alternative way to install multiple versions of R on the same machine is to use rig.
