R packages by tidymodels

broom - Convert Statistical Objects into Tidy Tibbles

Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots, and work consistently with large numbers of models at once. broom provides three verbs, each returning a different type of information about a model: tidy() summarizes model components, such as the coefficients of a regression; glance() reports information about an entire model, such as goodness-of-fit measures like AIC and BIC; augment() adds information about individual observations to a dataset, such as fitted values or influence measures.

Last updated 5 days ago

modeling tidy-data

21.58 score 1.5k stars 1.5k dependents 37k scripts 770k downloads
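
As a rough illustration of the three broom verbs described above, the sketch below fits a small linear model and calls each verb once. The built-in mtcars data and the formula are assumptions chosen for the example, not part of the package listing.

```r
library(broom)

# Fit an ordinary linear model on a built-in data set (illustrative choice).
fit <- lm(mpg ~ wt + hp, data = mtcars)

coefs       <- tidy(fit)     # one row per model term: intercept, wt, hp
model_stats <- glance(fit)   # a single row of model-level statistics (AIC, BIC, ...)
obs         <- augment(fit)  # one row per observation, with .fitted, .resid, ...

print(coefs)
print(model_stats)
print(head(obs))
```

Each result is a tibble, so the three outputs can be filtered, joined, or plotted with the usual data-frame tools.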

recipes - Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as input for statistical or machine learning models.

Last updated 2 hours ago

18.81 score 586 stars 384 dependents 7.2k scripts 211k downloads
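
The "estimate from one data set, apply to another" idea in the recipes description can be sketched as follows. The train/new split of mtcars and the single normalization step are assumptions made for illustration.

```r
library(recipes)

# Hypothetical split: the first 20 rows stand in for the initial data set.
train    <- mtcars[1:20, ]
new_rows <- mtcars[21:32, ]

# Declare the steps; nothing is estimated yet.
rec <- recipe(mpg ~ ., data = train) |>
  step_normalize(all_numeric_predictors())

# prep() estimates the means and standard deviations from the training rows.
prepped <- prep(rec, training = train)

# bake() applies those stored statistics to other data.
baked_new   <- bake(prepped, new_data = new_rows)
baked_train <- bake(prepped, new_data = NULL)  # the processed training set

print(head(baked_new))
```

Because the statistics come only from the training rows, the new rows are centered and scaled with the training means and standard deviations rather than their own.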

rsample - General Resampling Infrastructure

Classes and functions to create and summarize different types of resampling objects (e.g. bootstrap, cross-validation).

Last updated 21 days ago

16.72 score 341 stars 79 dependents 5.2k scripts 55k downloads
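
A minimal sketch of the two resampling schemes named above, assuming the built-in mtcars data and arbitrary values for the number of folds and bootstrap samples:

```r
library(rsample)

set.seed(123)  # resampling is random; fix the seed for reproducibility

folds <- vfold_cv(mtcars, v = 5)          # 5-fold cross-validation
boots <- bootstraps(mtcars, times = 10)   # 10 bootstrap resamples

# Each row holds a split object; pull the two partitions out of the first fold.
first_fold <- folds$splits[[1]]
train_part <- analysis(first_fold)     # rows used for fitting
test_part  <- assessment(first_fold)   # held-out rows

print(folds)
print(c(nrow(train_part), nrow(test_part)))
```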

tidymodels - Easily Install and Load the 'Tidymodels' Packages

The tidy modeling "verse" is a collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.

Last updated 1 month ago

16.52 score 783 stars 15 dependents 66k scripts 49k downloads

parsnip - A Common API to Modeling and Analysis Functions

A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc.).

Last updated 21 days ago

16.37 score 612 stars 69 dependents 3.4k scripts 46k downloads
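
The separation between a model specification and its computational engine might look like the sketch below. The "lm" engine, the formula, and the mtcars data are assumptions picked for the example.

```r
library(parsnip)

# Describe the model type first, then choose the engine that will fit it.
spec <- linear_reg() |>
  set_engine("lm")

# Fit the specification with a formula (illustrative data and predictors).
fitted_model <- fit(spec, mpg ~ wt + hp, data = mtcars)

# Predictions come back as a tibble with a standardized .pred column.
preds <- predict(fitted_model, new_data = mtcars[1:3, ])
print(preds)
```

Swapping engines should only require changing the set_engine() call, provided the package behind the chosen engine is installed.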

infer - Tidy Statistical Inference

The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.

Last updated 7 months ago

15.75 score 736 stars 18 dependents 3.5k scripts 31k downloads
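
One way to read the "expressive statistical grammar" is as a pipeline of verbs. The sketch below tests a hypothetical null value for the mean of mtcars$mpg; the null value of 20 and the number of replicates are assumptions for illustration.

```r
library(infer)

set.seed(1)  # the bootstrap step is random

# Build a null distribution for the mean under the hypothesis mu = 20.
null_dist <- mtcars |>
  specify(response = mpg) |>
  hypothesize(null = "point", mu = 20) |>
  generate(reps = 200, type = "bootstrap") |>
  calculate(stat = "mean")

# Compute the observed statistic with the same verbs.
obs_mean <- mtcars |>
  specify(response = mpg) |>
  calculate(stat = "mean")

p_val <- get_p_value(null_dist, obs_stat = obs_mean, direction = "two-sided")
print(p_val)
```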

yardstick - Tidy Characterizations of Model Performance

Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).

Last updated 21 days ago

15.47 score 387 stars 60 dependents 2.2k scripts 51k downloads
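
To make the metric types above concrete, the sketch below computes a regression metric and two classification summaries on small made-up data frames. All values are invented for illustration.

```r
library(yardstick)

# Hypothetical regression results: observed values and model predictions.
reg <- data.frame(
  truth    = c(1, 2, 3, 4),
  estimate = c(1.5, 2, 2.5, 5)
)
reg_rmse <- rmse(reg, truth = truth, estimate = estimate)
print(reg_rmse)

# Hypothetical classification results; both columns must be factors.
cls <- data.frame(
  truth     = factor(c("a", "a", "b", "b")),
  predicted = factor(c("a", "b", "b", "b"))
)
cls_acc <- accuracy(cls, truth = truth, estimate = predicted)
print(conf_mat(cls, truth = truth, estimate = predicted))
print(cls_acc)
```

Metric functions return a tibble with .metric, .estimator, and .estimate columns, so results from several metrics can be stacked and compared.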

hardhat - Construct Modeling Packages

Building modeling packages is hard. A large amount of effort generally goes into providing an implementation for a new method that is efficient, fast, and correct, but often less emphasis is put on the user interface. A good interface requires specialized knowledge about S3 methods and formulas, which the average package developer might not have. The goal of 'hardhat' is to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input.

Last updated 2 months ago

14.86 score 104 stars 437 dependents 175 scripts 194k downloads

dials - Tools for Creating Tuning Parameter Values

Many models contain tuning parameters (i.e. parameters that cannot be directly estimated from the data). These tools can be used to define objects for creating, simulating, or validating values for such parameters.

Last updated 2 months ago

14.31 score 114 stars 52 dependents 426 scripts 40k downloads
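
A short sketch of how parameter objects can generate candidate values, assuming the penalty() and mixture() parameters as examples and arbitrary grid sizes:

```r
library(dials)

# A parameter object carries a range and, where relevant, a transformation.
pen <- penalty()
print(pen)

# Regular grid: every combination of 3 levels per parameter (3 x 3 = 9 rows).
reg_grid <- grid_regular(penalty(), mixture(), levels = 3)

# Random grid: 5 candidate values simulated from the parameter range.
set.seed(42)
rand_grid <- grid_random(penalty(), size = 5)

print(reg_grid)
print(rand_grid)
```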

tune - Tidy Tuning Tools

The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, pre-processing methods, and post-processing steps.

Last updated 29 days ago

14.27 score 293 stars 39 dependents 756 scripts 34k downloads

workflows - Modeling Workflows

Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.

Last updated 1 month ago

13.97 score 207 stars 43 dependents 876 scripts 45k downloads
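
Bundling a preprocessor and a model in one object could look like the sketch below. It assumes a formula preprocessor, a 'parsnip' linear regression with the "lm" engine, and the mtcars data, all chosen for illustration.

```r
library(workflows)
library(parsnip)

# The workflow holds both the preprocessor (a formula here) and the model.
wf <- workflow() |>
  add_formula(mpg ~ wt + hp) |>
  add_model(linear_reg() |> set_engine("lm"))

# Fitting and predicting go through the single workflow object.
wf_fit <- fit(wf, data = mtcars)
wf_preds <- predict(wf_fit, new_data = mtcars[1:3, ])
print(wf_preds)
```

A recipe from 'recipes' can take the place of the formula via add_recipe(), leaving the rest of the code unchanged.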

corrr - Correlations in R

A tool for exploring correlations. It makes it possible to easily perform routine tasks when exploring correlation matrices such as ignoring the diagonal, focusing on the correlations of certain variables against others, or rearranging and visualizing the matrix in terms of the strength of the correlations.

Last updated 1 year ago

13.82 score 593 stars 7 dependents 2.9k scripts 13k downloads
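
The routine tasks listed above map onto a handful of functions. The sketch below applies them to the built-in mtcars data, which is an assumption made for the example.

```r
library(corrr)

# correlate() returns a correlation data frame with the diagonal set to NA.
cors <- correlate(mtcars)

# Focus on the correlations of every other variable against mpg.
mpg_cors <- focus(cors, mpg)

# Reorder variables so that strongly correlated ones sit together,
# then blank out the upper triangle for easier reading.
tidy_cors <- shave(rearrange(cors))

print(mpg_cors)
print(tidy_cors)
```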

probably - Tools for Post-Processing Predicted Values

Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.

Last updated 6 months ago

12.09 score 115 stars 1 dependent 21k scripts 2.2k downloads

workflowsets - Create a Collection of 'tidymodels' Workflows

A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse as well as to train them and visualize the results.

Last updated 5 months ago

12.04 score 94 stars 19 dependents 294 scripts 25k downloads

butcher - Model Butcher

Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.

Last updated 14 days ago

11.66 score 132 stars 13 dependents 146 scripts 5.6k downloads

stacks - Tidy Model Stacking

Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.

Last updated 5 months ago

11.46 score 298 stars 840 scripts 1.8k downloads

tidypredict - Run Predictions Inside the Database

It parses a fitted 'R' model object and returns a formula in 'Tidy Eval' code that calculates the predictions. It works with several database back-ends because it leverages 'dplyr' and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently supports lm(), glm(), randomForest(), ranger(), earth(), xgb.Booster.complete(), cubist(), and ctree() models.

Last updated 3 months ago

dbplyr dplyr purrr rlang

11.05 score 262 stars 2 dependents 241 scripts 1.5k downloads

textrecipes - Extra 'Recipes' for Text Processing

Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.

Last updated 14 days ago

10.86 score 160 stars 1 dependent 964 scripts 1.2k downloads

modeldata - Data Sets Useful for Modeling Examples

Data sets used for demonstrating or testing model-related packages are contained in this package.

Last updated 5 months ago

10.72 score 23 stars 17 dependents 2.2k scripts 29k downloads

themis - Extra Recipes Steps for Dealing with Unbalanced Data

A dataset with an uneven number of cases in each class is said to be unbalanced. Many models perform poorly on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE 2011 <doi:10.48550/arXiv.1106.1813>, BorderlineSMOTE 2005 <doi:10.1007/11538059_91> and ADASYN 2008 <https://ieeexplore.ieee.org/document/4633969>, or by decreasing the number of majority cases using NearMiss 2003 <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal 1976 <https://ieeexplore.ieee.org/document/4309452>.

Last updated 2 months ago

10.37 score 143 stars 2 dependents 1.3k scripts 12k downloads

bonsai - Model Wrappers for Tree-Based Models

rules - Model Wrappers for Rule-Based Models

embed - Extra Recipes for Encoding Predictors

censored - 'parsnip' Engines for Survival Models

tidyposterior - Bayesian Analysis to Compare Models using Resampling Statistics

discrim - Model Wrappers for Discriminant Analysis

finetune - Additional Functions for Model Tuning

spatialsample - Spatial Resampling Infrastructure

baguette - Efficient Model Functions for Bagging

multilevelmod - Model Wrappers for Multi-Level Models

modeldb - Fits Models Inside the Database

brulee - High-Level Modeling Functions with 'torch'

applicable - A Compilation of Applicability Domain Methods

poissonreg - Model Wrappers for Poisson Regression

tidyclust - A Common API to Clustering

modelenv - Provide Tools to Register Models for Use in 'tidymodels'

agua - 'tidymodels' Integration with 'h2o'

usemodels - Boilerplate Code for 'Tidymodels' Analyses

plsmod - Model Wrappers for Projection Methods

orbital - Predict with 'tidymodels' Workflows in Databases

shinymodels - Interactive Assessments of Models

modeldatatoo - More Data Sets Useful for Modeling Examples

desirability2 - Desirability Functions for Multiparameter Optimization