
broom - Convert Statistical Objects into Tidy Tibbles
Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots and consistently work with large numbers of models at once. Broom provides three verbs that each provide different types of information about a model. tidy() summarizes information about model components such as coefficients of a regression. glance() reports information about an entire model, such as goodness of fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.
Last updated
modeling tidy-data
21.88 score 1.5k stars 1.7k dependents 61k scripts 696k downloads
recipes - Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Last updated
18.30 score 614 stars 425 dependents 9.5k scripts 178k downloads
tidymodels - Easily Install and Load the 'Tidymodels' Packages
The tidy modeling "verse" is a collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.
Last updated
17.06 score 815 stars 15 dependents 77k scripts 131k downloads
rsample - General Resampling Infrastructure
Classes and functions to create and summarize different types of resampling objects (e.g. bootstrap, cross-validation).
Last updated
16.89 score 340 stars 100 dependents 7.1k scripts 65k downloads
parsnip - A Common API to Modeling and Analysis Functions
A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc).
Last updated
16.57 score 652 stars 83 dependents 3.6k scripts 48k downloads
infer - Tidy Statistical Inference
The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.
Last updated
16.27 score 792 stars 18 dependents 4.7k scripts 32k downloads
yardstick - Tidy Characterizations of Model Performance
Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).
Last updated
16.09 score 401 stars 69 dependents 5.3k scripts 48k downloads
hardhat - Construct Modeling Packages
Building modeling packages is hard. A large amount of effort generally goes into providing an implementation for a new method that is efficient, fast, and correct, but often less emphasis is put on the user interface. A good interface requires specialized knowledge about S3 methods and formulas, which the average package developer might not have. The goal of 'hardhat' is to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input.
Last updated
14.88 score 108 stars 493 dependents 248 scripts 165k downloads
dials - Tools for Creating Tuning Parameter Values
Many models contain tuning parameters (i.e. parameters that cannot be directly estimated from the data). These tools can be used to define objects for creating, simulating, or validating values for such parameters.
Last updated
14.78 score 116 stars 70 dependents 653 scripts 37k downloads
tune - Tidy Tuning Tools
The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, preprocessing methods, and post-processing steps.
Last updated
14.73 score 334 stars 50 dependents 1.2k scripts 38k downloads
corrr - Correlations in R
A tool for exploring correlations. It makes it possible to easily perform routine tasks when exploring correlation matrices such as ignoring the diagonal, focusing on the correlations of certain variables against others, or rearranging and visualizing the matrix in terms of the strength of the correlations.
Last updated
14.43 score 591 stars 19 dependents 3.4k scripts 17k downloads
workflows - Modeling Workflows
Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.
Last updated
13.72 score 210 stars 55 dependents 1.4k scripts 39k downloads
stacks - Tidy Model Stacking
Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.
Last updated
12.40 score 302 stars 2 dependents 1.1k scripts 5.8k downloads
probably - Tools for Post-Processing Predicted Values
Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.
Last updated
12.35 score 121 stars 1 dependents 19k scripts 7.4k downloads
butcher - Model Butcher
Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.
Last updated
12.30 score 138 stars 17 dependents 194 scripts 12k downloads
lime - Local Interpretable Model-Agnostic Explanations
When building complex models, it is often difficult to explain why the model should be trusted. While global measures such as accuracy are useful, they cannot be used for explaining why a model made a specific prediction. 'lime' (a port of the 'lime' 'Python' package) is a method for explaining the outcome of black box models by fitting a local model around the point in question and perturbations of this point. The approach is described in more detail in the article by Ribeiro et al. (2016) <doi:10.48550/arXiv.1602.04938>.
Last updated
caret model-checking model-evaluation modeling cpp
12.13 score 492 stars 3 dependents 962 scripts 5.3k downloads
tidypredict - Run Predictions Inside the Database
It parses a fitted 'R' model object, and returns a formula in 'Tidy Eval' code that calculates the predictions. It works with several databases back-ends because it leverages 'dplyr' and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently supports lm(), glm(), randomForest(), ranger(), rpart(), earth(), xgb.Booster.complete(), lgb.Booster(), catboost.Model(), cubist(), and ctree() models.
Last updated
dbplyr dplyr purrr rlang
12.05 score 263 stars 2 dependents 294 scripts 1.4k downloads
workflowsets - Create a Collection of 'tidymodels' Workflows
A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse as well as to train them and visualize the results.
Last updated
11.94 score 97 stars 19 dependents 393 scripts 30k downloads
modeldata - Data Sets Useful for Modeling Examples
Data sets used for demonstrating or testing model-related packages are contained in this package.
Last updated
11.22 score 24 stars 17 dependents 3.0k scripts 38k downloads
bonsai - Model Wrappers for Tree-Based Models
Bindings for additional tree-based model engines for use with the 'parsnip' package. Models include gradient boosted decision trees with 'LightGBM' (Ke et al, 2017.), conditional inference trees and conditional random forests with 'partykit' (Hothorn and Zeileis, 2015. and Hothorn et al, 2006. <doi:10.1198/106186006X133933>), and accelerated oblique random forests with 'aorsf' (Jaeger et al, 2022 <doi:10.5281/zenodo.7116854>).
Last updated
11.06 score 54 stars 1 dependents 24k scripts 1.4k downloads
themis - Extra Recipes Steps for Dealing with Unbalanced Data
A dataset with an uneven number of cases in each class is said to be unbalanced. Many models produce subpar performance on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE 2011 <doi:10.48550/arXiv.1106.1813>, BorderlineSMOTE 2005 <doi:10.1007/11538059_91> and ADASYN 2008 <https://ieeexplore.ieee.org/document/4633969>, or by decreasing the number of majority cases using NearMiss 2003 <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal 1976 <https://ieeexplore.ieee.org/document/4309452>.
Last updated
10.60 score 142 stars 2 dependents 1.7k scripts 18k downloads
rules - Model Wrappers for Rule-Based Models
Bindings for additional models for use with the 'parsnip' package. Models include prediction rule ensembles (Friedman and Popescu, 2008) <doi:10.1214/07-AOAS148>, C5.0 rules (Quinlan, 1992 ISBN: 1558602380), and Cubist (Kuhn and Johnson, 2013) <doi:10.1007/978-1-4614-6849-3>.
Last updated
10.31 score 42 stars 1 dependents 23k scripts 5.2k downloads
textrecipes - Extra 'Recipes' for Text Processing
Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.
Last updated
10.06 score 164 stars 1 dependents 1.0k scripts 1.4k downloads
finetune - Additional Functions for Model Tuning
The ability to tune models is important. 'finetune' enhances the 'tune' package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <doi:10.48550/arXiv.1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.
Last updated
9.75 score 65 stars 2 dependents 1.1k scripts 5.6k downloads
tidyclust - A Common API to Clustering
A common interface for specifying clustering models, in the same style as 'parsnip'. Creates a unified interface across different functions and computational engines.
Last updated
9.63 score 113 stars 276 scripts 4.3k downloads
censored - 'parsnip' Engines for Survival Models
Engines for survival models from the 'parsnip' package. These include parametric models (e.g., Jackson (2016) <doi:10.18637/jss.v070.i08>), semi-parametric (e.g., Simon et al (2011) <doi:10.18637/jss.v039.i05>), and tree-based models (e.g., Buehlmann and Hothorn (2007) <doi:10.1214/07-STS242>).
Last updated
parsnip tidymodels
9.61 score 124 stars 1 dependents 467 scripts 5.2k downloads
spatialsample - Spatial Resampling Infrastructure
Functions and classes for spatial resampling to use with the 'rsample' package, such as spatial cross-validation (Brenning, 2012) <doi:10.1109/IGARSS.2012.6352393>. The scope of 'rsample' and 'spatialsample' is to provide the basic building blocks for creating and analyzing resamples of a spatial data set, but neither package includes functions for modeling or computing statistics. The resampled spatial data sets created by 'spatialsample' do not contain much overhead in memory.
Last updated
cpp
9.52 score 77 stars 4 dependents 273 scripts 4.3k downloads
tailor - Iterative Steps for Postprocessing Model Predictions
Postprocessors refine predictions outputted from machine learning models to improve predictive performance or better satisfy distributional limitations. This package introduces 'tailor' objects, which compose iterative adjustments to model predictions. A number of pre-written adjustments are provided with the package, such as calibration. See Lichtenstein, Fischhoff, and Phillips (1977) <doi:10.1007/978-94-010-1276-8_19>. Other methods and utilities to compose new adjustments are also included. Tailors are tightly integrated with the 'tidymodels' framework.
Last updated
9.30 score 16 stars 51 dependents 37 scripts 29k downloads
embed - Extra Recipes for Encoding Predictors
Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <doi:10.48550/arXiv.1611.09477> or nonlinear models <doi:10.48550/arXiv.1604.06737> can be used. There are also functions for dimension reduction and other approaches.
Last updated
9.26 score 144 stars 1.2k scripts 1.6k downloads
discrim - Model Wrappers for Discriminant Analysis
Bindings for additional classification models for use with the 'parsnip' package. Models include flavors of discriminant analysis, such as linear (Fisher (1936) <doi:10.1111/j.1469-1809.1936.tb02137.x>), regularized (Friedman (1989) <doi:10.1080/01621459.1989.10478752>), and flexible (Hastie, Tibshirani, and Buja (1994) <doi:10.1080/01621459.1994.10476866>), as well as naive Bayes classifiers (Hand and Yu (2001) <doi:10.1111/j.1751-5823.2001.tb00465.x>).
Last updated
8.69 score 31 stars 1.7k scripts 5.3k downloads
multilevelmod - Model Wrappers for Multi-Level Models
Bindings for hierarchical regression models for use with the 'parsnip' package. Models include longitudinal generalized linear models (Liang and Zeger, 1986) <doi:10.1093/biomet/73.1.13>, and mixed-effect models (Pinheiro and Bates) <doi:10.1007/978-1-4419-0318-1_1>.
Last updated
8.39 score 74 stars 368 scripts 419 downloads
poissonreg - Model Wrappers for Poisson Regression
Bindings for Poisson regression models for use with the 'parsnip' package. Models include simple generalized linear models, Bayesian models, and zero-inflated Poisson models (Zeileis, Kleiber, and Jackman (2008) <doi:10.18637/jss.v027.i08>).
Last updated
7.60 score 23 stars 1 dependents 571 scripts 939 downloads
modeldb - Fits Models Inside the Database
Uses 'dplyr' and 'tidyeval' to fit statistical models inside the database. It currently supports KMeans and linear regression models.
Last updated
database dbplyr dplyr ggplot2 modeling rlang sql tidyeval visualization
7.59 score 78 stars 63 scripts 582 downloads
applicable - A Compilation of Applicability Domain Methods
A modeling package compiling applicability domain methods in R. It combines different methods to measure the amount of extrapolation new samples can have from the training set. See <doi:10.4018/IJQSPR.2016010102> for an overview of applicability domains.
Last updated
7.40 score 48 stars 1 dependents 68 scripts 4.3k downloads
orbital - Predict with 'tidymodels' Workflows in Databases
Turn 'tidymodels' workflows into objects containing the sequential equations sufficient to perform predictions. These smaller objects allow for low-dependency prediction locally or directly in databases.
Last updated
7.38 score 48 stars 45 scripts 615 downloads
modelenv - Provide Tools to Register Models for Use in 'tidymodels'
A developer-focused, low-dependency package in 'tidymodels' that provides functions to register how models are to be used. Functions to register models are complemented with accessor functions to retrieve registered model information to aid in model fitting and error handling.
Last updated
7.35 score 4 stars 57 dependents 1 scripts 33k downloads
baguette - Efficient Model Functions for Bagging
Tree- and rule-based models can be bagged (<doi:10.1007/BF00058655>) using this package, and their prediction equations are stored in an efficient format to reduce the size of the model objects and improve speed.
Last updated
7.19 score 28 stars 964 scripts 1.3k downloads
filtro - Feature Selection Using Supervised Filter-Based Methods
Tidy tools to apply filter-based supervised feature selection methods. These methods score and rank feature relevance using metrics such as p-values, correlation, and importance scores (Kuhn and Johnson (2019) <doi:10.1201/9781315108230>).
Last updated
quarto
6.96 score 7 stars 1 dependents 18 scripts 236 downloads
desirability2 - Desirability Functions for Multiparameter Optimization
In-line functions for multivariate optimization via desirability functions (Derringer and Suich, 1980, <doi:10.1080/00224065.1980.11980968>) with easy use within 'dplyr' pipelines.
Last updated
6.81 score 14 stars 2 dependents 62 scripts 600 downloads
usemodels - Boilerplate Code for 'Tidymodels' Analyses
Code snippets to fit models using the tidymodels framework can be easily created for a given data set.
Last updated
6.78 score 87 stars 174 scripts 302 downloads
agua - 'tidymodels' Integration with 'h2o'
Create and evaluate models using 'tidymodels' and 'h2o' <https://h2o.ai/>. The package enables users to specify 'h2o' as an engine for several modeling methods.
Last updated
6.76 score 25 stars 114 scripts 970 downloads
shinymodels - Interactive Assessments of Models
Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.
Last updated
shiny
5.95 score 50 stars 51 scripts 282 downloads
plsmod - Model Wrappers for Projection Methods
Bindings for additional regression models for use with the 'parsnip' package, including ordinary and sparse partial least squares models for regression and classification (Rohart et al (2017) <doi:10.1371/journal.pcbi.1005752>).
Last updated
mixomics
5.95 score 14 stars 90 scripts 446 downloads
tabpfn - Prior-Data Fitted Network Foundational Model for Tabular Data
Provides a consistent API for classification and regression models based on the 'TabPFN' model of Hollmann et al. (2025), "Accurate predictions on small data with a tabular foundation model," Nature, 637(8045) <doi:10.1038/s41586-024-08328-6>. The calculations are served via 'Python' to train and predict the model.
Last updated
5.88 score 33 stars 19 scripts 721 downloads
important - Supervised Feature Selection
Interfaces for choosing important predictors in supervised regression, classification, and censored regression models. Permuted importance scores (Biecek and Burzykowski (2021) <doi:10.1201/9780429027192>) can be computed for 'tidymodels' model fits.
Last updated
5.09 score 19 stars 26 scripts 199 downloads

