broom - Convert Statistical Objects into Tidy Tibbles
Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots and consistently work with large numbers of models at once. Broom provides three verbs that each provide different types of information about a model. tidy() summarizes information about model components such as coefficients of a regression. glance() reports information about an entire model, such as goodness of fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.
Last updated 2 months ago
modeling tidy-data
21.67 score 1.5k stars 1.4k dependents 35k scripts 820k downloads
recipes - Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Last updated 1 month ago
18.48 score 577 stars 365 dependents 6.5k scripts 160k downloads
rsample - General Resampling Infrastructure
Classes and functions to create and summarize different types of resampling objects (e.g. bootstrap, cross-validation).
Last updated 3 months ago
16.85 score 341 stars 77 dependents 4.8k scripts 81k downloads
parsnip - A Common API to Modeling and Analysis Functions
A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc).
Last updated 18 days ago
16.22 score 602 stars 66 dependents 3.2k scripts 32k downloads
tidymodels - Easily Install and Load the 'Tidymodels' Packages
The tidy modeling "verse" is a collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.
Last updated 2 months ago
16.07 score 771 stars 12 dependents 44k scripts 39k downloads
infer - Tidy Statistical Inference
The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.
Last updated 3 months ago
15.74 score 728 stars 14 dependents 3.4k scripts 30k downloads
yardstick - Tidy Characterizations of Model Performance
Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).
Last updated 2 months ago
15.40 score 378 stars 55 dependents 2.1k scripts 39k downloads
hardhat - Construct Modeling Packages
Building modeling packages is hard. A large amount of effort generally goes into providing an implementation for a new method that is efficient, fast, and correct, but often less emphasis is put on the user interface. A good interface requires specialized knowledge about S3 methods and formulas, which the average package developer might not have. The goal of 'hardhat' is to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input.
Last updated 16 days ago
14.76 score 103 stars 419 dependents 180 scripts 158k downloads
tune - Tidy Tuning Tools
The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, pre-processing methods, and post-processing steps.
Last updated 23 days ago
14.25 score 285 stars 36 dependents 748 scripts 28k downloads
dials - Tools for Creating Tuning Parameter Values
Many models contain tuning parameters (i.e. parameters that cannot be directly estimated from the data). These tools can be used to define objects for creating, simulating, or validating values for such parameters.
Last updated 10 days ago
14.13 score 114 stars 49 dependents 418 scripts 29k downloads
workflows - Modeling Workflows
Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.
Last updated 24 days ago
13.75 score 207 stars 40 dependents 820 scripts 30k downloads
corrr - Correlations in R
A tool for exploring correlations. It makes it possible to easily perform routine tasks when exploring correlation matrices such as ignoring the diagonal, focusing on the correlations of certain variables against others, or rearranging and visualizing the matrix in terms of the strength of the correlations.
Last updated 12 months ago
13.73 score 590 stars 8 dependents 2.5k scripts 11k downloads
workflowsets - Create a Collection of 'tidymodels' Workflows
A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse as well as training them and visualizing the results.
Last updated 2 months ago
12.32 score 92 stars 17 dependents 304 scripts 25k downloads
probably - Tools for Post-Processing Predicted Values
Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.
Last updated 2 months ago
11.57 score 115 stars 21k scripts 1.8k downloads
stacks - Tidy Model Stacking
Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.
Last updated 2 months ago
11.56 score 295 stars 860 scripts 1.9k downloads
tidypredict - Run Predictions Inside the Database
It parses a fitted 'R' model object, and returns a formula in 'Tidy Eval' code that calculates the predictions. It works with several database back-ends because it leverages 'dplyr' and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently supports lm(), glm(), randomForest(), ranger(), earth(), xgb.Booster.complete(), cubist(), and ctree() models.
Last updated 7 days ago
dbplyr dplyr purrr rlang
11.19 score 259 stars 2 dependents 251 scripts 1.4k downloads
butcher - Model Butcher
Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.
Last updated 4 months ago
11.13 score 132 stars 13 dependents 148 scripts 5.0k downloads
modeldata - Data Sets Useful for Modeling Examples
Data sets used for demonstrating or testing model-related packages are contained in this package.
Last updated 2 months ago
10.88 score 22 stars 14 dependents 2.1k scripts 27k downloads
textrecipes - Extra 'Recipes' for Text Processing
Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.
Last updated 2 months ago
cpp
10.84 score 160 stars 1 dependents 1.0k scripts 1.4k downloads
themis - Extra Recipes Steps for Dealing with Unbalanced Data
A dataset with an uneven number of cases in each class is said to be unbalanced. Many models produce a subpar performance on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE 2011 <arXiv:1106.1813>, BorderlineSMOTE 2005 <doi:10.1007/11538059_91> and ADASYN 2008 <https://ieeexplore.ieee.org/document/4633969>, or by decreasing the number of majority cases using NearMiss 2003 <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal 1976 <https://ieeexplore.ieee.org/document/4309452>.
Last updated 2 months ago
9.78 score 141 stars 1 dependents 756 scripts 8.4k downloads
rules - Model Wrappers for Rule-Based Models
Bindings for additional models for use with the 'parsnip' package. Models include prediction rule ensembles (Friedman and Popescu, 2008) <doi:10.1214/07-AOAS148>, C5.0 rules (Quinlan, 1992 ISBN: 1558602380), and Cubist (Kuhn and Johnson, 2013) <doi:10.1007/978-1-4614-6849-3>.
Last updated 2 months ago
9.51 score 40 stars 1 dependents 20k scripts 1.1k downloads
bonsai - Model Wrappers for Tree-Based Models
Bindings for additional tree-based model engines for use with the 'parsnip' package. Models include gradient boosted decision trees with 'LightGBM' (Ke et al, 2017.), conditional inference trees and conditional random forests with 'partykit' (Hothorn and Zeileis, 2015. and Hothorn et al, 2006. <doi:10.1198/106186006X133933>), and accelerated oblique random forests with 'aorsf' (Jaeger et al, 2022 <doi:10.5281/zenodo.7116854>).
Last updated 2 months ago
9.36 score 52 stars 1 dependents 644 scripts 1.1k downloads
embed - Extra Recipes for Encoding Predictors
Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <arXiv:1611.09477> or nonlinear models <arXiv:1604.06737> can be used. There are also functions for dimension reduction and other approaches.
Last updated 1 month ago
9.30 score 142 stars 1.1k scripts 1.5k downloads
censored - 'parsnip' Engines for Survival Models
Engines for survival models from the 'parsnip' package. These include parametric models (e.g., Jackson (2016) <doi:10.18637/jss.v070.i08>), semi-parametric (e.g., Simon et al (2011) <doi:10.18637/jss.v039.i05>), and tree-based models (e.g., Buehlmann and Hothorn (2007) <doi:10.1214/07-STS242>).
Last updated 5 months ago
parsnip tidymodels
8.96 score 123 stars 1 dependents 248 scripts 1.8k downloads
tidyposterior - Bayesian Analysis to Compare Models using Resampling Statistics
Bayesian analysis is used here to answer the question: "when looking at resampling results, are the differences between models 'real'?" To answer this, a model can be created where the performance statistic is the resampling statistics (e.g. accuracy or RMSE). These values are explained by the model types. In doing this, we can get parameter estimates for each model's effect on performance and make statistical (and practical) comparisons between models. The methods included here are similar to Benavoli et al (2017) <https://jmlr.org/papers/v18/16-305.html>.
Last updated 2 months ago
8.38 score 103 stars 234 scripts 434 downloads
finetune - Additional Functions for Model Tuning
The ability to tune models is important. 'finetune' enhances the 'tune' package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <arXiv:1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.
Last updated 4 months ago
8.32 score 62 stars 708 scripts 1.5k downloads
discrim - Model Wrappers for Discriminant Analysis
Bindings for additional classification models for use with the 'parsnip' package. Models include flavors of discriminant analysis, such as linear (Fisher (1936) <doi:10.1111/j.1469-1809.1936.tb02137.x>), regularized (Friedman (1989) <doi:10.1080/01621459.1989.10478752>), and flexible (Hastie, Tibshirani, and Buja (1994) <doi:10.1080/01621459.1994.10476866>), as well as naive Bayes classifiers (Hand and Yu (2007) <doi:10.1111/j.1751-5823.2001.tb00465.x>).
Last updated 2 months ago
8.27 score 28 stars 1 dependents 1000 scripts 1.8k downloads
spatialsample - Spatial Resampling Infrastructure
Functions and classes for spatial resampling to use with the 'rsample' package, such as spatial cross-validation (Brenning, 2012) <doi:10.1109/IGARSS.2012.6352393>. The scope of 'rsample' and 'spatialsample' is to provide the basic building blocks for creating and analyzing resamples of a spatial data set, but neither package includes functions for modeling or computing statistics. The resampled spatial data sets created by 'spatialsample' do not contain much overhead in memory.
Last updated 3 months ago
cpp
8.18 score 72 stars 2 dependents 117 scripts 786 downloads
multilevelmod - Model Wrappers for Multi-Level Models
Bindings for hierarchical regression models for use with the 'parsnip' package. Models include longitudinal generalized linear models (Liang and Zeger, 1986) <doi:10.1093/biomet/73.1.13>, and mixed-effect models (Pinheiro and Bates) <doi:10.1007/978-1-4419-0318-1_1>.
Last updated 2 months ago
8.06 score 74 stars 207 scripts 307 downloads
modeldb - Fits Models Inside the Database
Uses 'dplyr' and 'tidyeval' to fit statistical models inside the database. It currently supports KMeans and linear regression models.
Last updated 12 months ago
database dbplyr dplyr ggplot2 modeling rlang sql tidyeval visualization
7.59 score 79 stars 62 scripts 276 downloads
baguette - Efficient Model Functions for Bagging
Tree- and rule-based models can be bagged (<doi:10.1007/BF00058655>) using this package, and their prediction equations are stored in an efficient format to reduce the size of the model objects and improve speed.
Last updated 2 months ago
7.58 score 25 stars 566 scripts 1.1k downloads
brulee - High-Level Modeling Functions with 'torch'
Provides high-level modeling functions to define and train models using the 'torch' R package. Models include linear, logistic, and multinomial regression as well as multilayer perceptrons.
Last updated 2 months ago
7.47 score 67 stars 212 scripts 694 downloads
applicable - A Compilation of Applicability Domain Methods
A modeling package compiling applicability domain methods in R. It combines different methods to measure the amount of extrapolation new samples can have from the training set. See Netzeva et al (2005) <doi:10.1177/026119290503300209> for an overview of applicability domains.
Last updated 2 years ago
7.43 score 47 stars 1 dependents 48 scripts 694 downloads
poissonreg - Model Wrappers for Poisson Regression
Bindings for Poisson regression models for use with the 'parsnip' package. Models include simple generalized linear models, Bayesian models, and zero-inflated Poisson models (Zeileis, Kleiber, and Jackman (2008) <doi:10.18637/jss.v027.i08>).
Last updated 1 month ago
7.39 score 22 stars 1 dependents 368 scripts 422 downloads
tidyclust - A Common API to Clustering
A common interface to specifying clustering models, in the same style as 'parsnip'. Creates a unified interface across different functions and computational engines.
Last updated 6 months ago
7.35 score 109 stars 136 scripts 1.9k downloads
modelenv - Provide Tools to Register Models for Use in 'tidymodels'
A developer-focused, low-dependency package in 'tidymodels' that provides functions to register how models are to be used. Functions to register models are complemented with accessor functions to retrieve registered model information to aid in model fitting and error handling.
Last updated 2 months ago
7.34 score 4 stars 41 dependents 1 scripts 30k downloads
agua - 'tidymodels' Integration with 'h2o'
Create and evaluate models using 'tidymodels' and 'h2o' <https://h2o.ai/>. The package enables users to specify 'h2o' as an engine for several modeling methods.
Last updated 7 months ago
6.97 score 22 stars 69 scripts 1.0k downloads
usemodels - Boilerplate Code for 'Tidymodels' Analyses
Code snippets to fit models using the tidymodels framework can be easily created for a given data set.
Last updated 2 months ago
6.90 score 84 stars 136 scripts 220 downloads
shinymodels - Interactive Assessments of Models
Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.
Last updated 2 months ago
shiny
6.38 score 46 stars 50 scripts 145 downloads
orbital - Predict with 'tidymodels' Workflows in Databases
Turn 'tidymodels' workflows into objects containing the sufficient sequential equations to perform predictions. These smaller objects allow for low dependency prediction locally or directly in databases.
Last updated 5 days ago
6.12 score 22 stars 10 scripts 185 downloads
plsmod - Model Wrappers for Projection Methods
Bindings for additional regression models for use with the 'parsnip' package, including ordinary and sparse partial least squares models for regression and classification (Rohart et al (2017) <doi:10.1371/journal.pcbi.1005752>).
Last updated 2 months ago
mixomics
5.97 score 14 stars 55 scripts 344 downloads
modeldatatoo - More Data Sets Useful for Modeling Examples
More data sets used for demonstrating or testing model-related packages are contained in this package. The data sets are downloaded and cached, allowing for more and bigger data sets.
Last updated 9 months ago
4.85 score 7 stars 34 scripts 161 downloads
desirability2 - Desirability Functions for Multiparameter Optimization
In-line functions for multivariate optimization via desirability functions (Derringer and Suich, 1980, <doi:10.1080/00224065.1980.11980968>) with easy use within `dplyr` pipelines.
Last updated 2 months ago
4.53 score 10 stars 17 scripts 183 downloads