Package: recipes 1.1.0.9000

Max Kuhn

recipes: Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. The package provides an extensible framework for pipeable sequences of feature engineering steps that apply preprocessing tools to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as input for statistical or machine learning models.
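As a minimal sketch of the estimate-then-apply workflow described above: the split of the built-in `mtcars` data into `train` and `new_data` is an assumption made purely for illustration and is not taken from the package documentation. The functions used (`recipe()`, `step_normalize()`, `all_numeric_predictors()`, `prep()`, `bake()`, `fully_trained()`) all appear in the export list on this page.

```r
library(recipes)

# Illustrative split only: the first 20 rows play the role of the
# "initial data set", the remaining rows stand in for "other data sets".
train <- mtcars[1:20, ]
new_data <- mtcars[21:32, ]

# Declare the outcome and predictors, then add a centering/scaling step.
rec <- recipe(mpg ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())

# prep() estimates the step parameters (means and standard deviations)
# from the training rows only.
trained <- prep(rec, training = train)

# bake() applies those already-estimated parameters to other data.
baked <- bake(trained, new_data = new_data)

fully_trained(trained)
head(baked)
```

Because the means and standard deviations come from `train`, the normalized columns of `baked` are generally not exactly centered at zero; that is the intended behavior when parameters learned on one data set are reused on another.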

Authors: Max Kuhn [aut, cre], Hadley Wickham [aut], Emil Hvitfeldt [aut], Posit Software, PBC [cph, fnd]

recipes_1.1.0.9000.tar.gz
recipes_1.1.0.9000.zip (r-4.5) recipes_1.1.0.9000.zip (r-4.4) recipes_1.1.0.9000.zip (r-4.3)
recipes_1.1.0.9000.tgz (r-4.4-any) recipes_1.0.10.9000.tgz (r-4.3-any)
recipes_1.1.0.9000.tar.gz (r-4.5-noble) recipes_1.1.0.9000.tar.gz (r-4.4-noble)
recipes_1.1.0.9000.tgz (r-4.4-emscripten) recipes_1.1.0.9000.tgz (r-4.3-emscripten)
recipes.pdf | recipes.html
recipes/json (API)
NEWS

# Install recipes in R:
install.packages('recipes', repos = c('https://tidymodels.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/tidymodels/recipes/issues

198 exports 549 stars 10.54 score 53 dependencies 335 dependents 124.9k downloads

Last updated 17 hours ago from: 7aa83c2c23

Exports: .get_data_types %>% add_check add_role add_step all_date all_date_predictors all_datetime all_datetime_predictors all_double all_double_predictors all_factor all_factor_predictors all_integer all_integer_predictors all_logical all_logical_predictors all_nominal all_nominal_predictors all_numeric all_numeric_predictors all_ordered all_ordered_predictors all_outcomes all_predictors all_string all_string_predictors all_unordered all_unordered_predictors are_weights_used averages bake check check_class check_cols check_missing check_name check_new_data check_new_values check_range check_type correlations covariances current_info denom_vars detect_step discretize dummy_extract_names dummy_names ellipse_check estimate_yj extract_fit_time extract_parameter_dials extract_parameter_set_dials fixed format_ch_vec format_selectors frequency_weights fully_trained get_case_weights get_keep_original_cols has_role has_type imp_vars importance_weights is_trained juice medians names0 pca_wts prep prepare prepper print_step printer prof rand_id recipe recipes_eval_select recipes_extension_check recipes_names_outcomes recipes_names_predictors recipes_pkg_check recipes_ptype recipes_ptype_validate recipes_remove_cols remove_original_cols remove_role required_pkgs sel2char step step_arrange step_bagimpute step_bin2factor step_BoxCox step_bs step_center step_classdist step_classdist_shrunken step_corr step_count step_cut step_date step_depth step_discretize step_dummy step_dummy_extract step_dummy_multi_choice step_factor2string step_filter step_filter_missing step_geodist step_harmonic step_holiday step_hyperbolic step_ica step_impute_bag step_impute_knn step_impute_linear step_impute_lower step_impute_mean step_impute_median step_impute_mode step_impute_roll step_indicate_na step_integer step_interact step_intercept step_inverse step_invlogit step_isomap step_knnimpute step_kpca step_kpca_poly step_kpca_rbf step_lag step_lincomb step_log step_logit step_lowerimpute step_meanimpute step_medianimpute step_modeimpute step_mutate step_mutate_at step_naomit step_nnmf step_nnmf_sparse step_normalize step_novel step_ns step_num2factor step_nzv step_ordinalscore step_other step_pca step_percentile step_pls step_poly step_poly_bernstein step_profile step_range step_ratio step_regex step_relevel step_relu step_rename step_rename_at step_rm step_rollimpute step_sample step_scale step_select step_shuffle step_slice step_spatialsign step_spline_b step_spline_convex step_spline_monotone step_spline_natural step_spline_nonnegative step_sqrt step_string2factor step_time step_unknown step_unorder step_window step_YeoJohnson step_zv terms_select tidy tunable tune_args update update_role update_role_requirements variances yj_transform

Dependencies: class cli clock codetools cpp11 data.table diagram digest dplyr fansi future future.apply generics globals glue gower hardhat ipred KernSmooth lattice lava lifecycle listenv lubridate magrittr MASS Matrix nnet numDeriv parallelly pillar pkgconfig prodlim progressr purrr R6 Rcpp rlang rpart shape SQUAREM stringi stringr survival tibble tidyr tidyselect timechange timeDate tzdb utf8 vctrs withr

Handling categorical predictors

Rendered from Dummies.Rmd using knitr::rmarkdown on Jul 05 2024.

Last update: 2023-12-07
Started: 2017-11-17

Introduction to recipes

Rendered from recipes.Rmd using knitr::rmarkdown on Jul 05 2024.

Last update: 2022-06-13
Started: 2022-02-18

On skipping steps

Rendered from Skipping.Rmd using knitr::rmarkdown on Jul 05 2024.

Last update: 2022-02-18
Started: 2017-12-21

Ordering of steps

Rendered from Ordering.Rmd using knitr::rmarkdown on Jul 05 2024.

Last update: 2022-02-18
Started: 2017-06-15

Roles in recipes

Rendered from Roles.Rmd using knitr::rmarkdown on Jul 05 2024.

Last update: 2022-06-13
Started: 2019-03-19

Selecting variables

Rendered from Selecting_Variables.Rmd using knitr::rmarkdown on Jul 05 2024.

Last update: 2023-11-01
Started: 2017-03-29

Readme and manuals

Help Manual

Help page | Topics
Get types for use in recipes | .get_data_types .get_data_types.character .get_data_types.Date .get_data_types.default .get_data_types.double .get_data_types.factor .get_data_types.hardhat_case_weights .get_data_types.integer .get_data_types.list .get_data_types.logical .get_data_types.numeric .get_data_types.ordered .get_data_types.POSIXct .get_data_types.Surv .get_data_types.textrecipes_tokenlist
Add a New Operation to the Current Recipe | add_check add_step
Apply a trained preprocessing recipe | bake bake.recipe
Using case weights with recipes | case_weights
Helpers for steps with case weights | are_weights_used averages case-weight-helpers correlations covariances get_case_weights medians pca_wts variances
Check variable class | check_class
Check if all columns are present | check_cols
Check for missing values | check_missing
Check for new values | check_new_values
Check range consistency | check_range
Detect if a particular step or check is used in a recipe | detect_step
Developer functions for creating recipes steps | developer_functions
Discretize Numeric Variables | discretize discretize.default discretize.numeric predict.discretize
Create a formula from a prepared recipe | formula.recipe
Check to see if a recipe is trained/prepared | fully_trained
Role Selection | all_date all_datetime all_datetime_predictors all_date_predictors all_double all_double_predictors all_factor all_factor_predictors all_integer all_integer_predictors all_logical all_logical_predictors all_nominal all_nominal_predictors all_numeric all_numeric_predictors all_ordered all_ordered_predictors all_outcomes all_predictors all_string all_string_predictors all_unordered all_unordered_predictors current_info has_role has_type
Extract transformed training set | juice
Naming Tools | dummy_extract_names dummy_names names0
Estimate a preprocessing recipe | prep prep.recipe
Wrapper function for preparing recipes within resampling | prepper
Print a Recipe | print.recipe
Create a recipe for preprocessing data | recipe recipe.data.frame recipe.default recipe.formula recipe.matrix
Evaluate a selection with tidyselect semantics specific to recipes | recipes_eval_select
Checks that steps have all S3 methods | recipes_extension_check
Manually alter roles | add_role remove_role roles update_role
Methods for selecting variables in step functions | selection selections
Sort rows using dplyr | step_arrange
Create a factor from a dummy variable | step_bin2factor
Box-Cox transformation for non-negative data | step_BoxCox
B-spline basis functions | step_bs
Centering numeric data | step_center
Distances to class centroids | step_classdist
Compute shrunken centroid distances for classification models | step_classdist_shrunken
High correlation filter | step_corr
Create counts of patterns using regular expressions | step_count
Cut a numeric variable into a factor | step_cut
Date feature generator | step_date
Data depths | step_depth
Discretize Numeric Variables | step_discretize
Create traditional dummy variables | step_dummy
Extract patterns from nominal data | step_dummy_extract
Handle levels in multiple predictors together | step_dummy_multi_choice
Convert factors to strings | step_factor2string
Filter rows using dplyr | step_filter
Missing value column filter | step_filter_missing
Distance between two locations | step_geodist
Add sin and cos terms for harmonic analysis | step_harmonic
Holiday feature generator | step_holiday
Hyperbolic transformations | step_hyperbolic
ICA signal extraction | step_ica
Impute via bagged trees | imp_vars step_impute_bag
Impute via k-nearest neighbors | step_impute_knn
Impute numeric variables via a linear model | step_impute_linear
Impute numeric data below the threshold of measurement | step_impute_lower
Impute numeric data using the mean | step_impute_mean
Impute numeric data using the median | step_impute_median
Impute nominal data using the most common value | step_impute_mode
Impute numeric data using a rolling window statistic | step_impute_roll
Create missing data column indicators | step_indicate_na
Convert values to predefined integers | step_integer
Create interaction variables | step_interact
Add intercept (or constant) column | step_intercept
Inverse transformation | step_inverse
Inverse logit transformation | step_invlogit
Isomap embedding | step_isomap
Kernel PCA signal extraction | step_kpca
Polynomial kernel PCA signal extraction | step_kpca_poly
Radial basis function kernel PCA signal extraction | step_kpca_rbf
Create a lagged predictor | step_lag
Linear combination filter | step_lincomb
Logarithmic transformation | step_log
Logit transformation | step_logit
Add new variables using dplyr | step_mutate
Mutate multiple columns using dplyr | step_mutate_at
Remove observations with missing values | step_naomit
Non-negative matrix factorization signal extraction | step_nnmf
Non-negative matrix factorization signal extraction with lasso penalization | step_nnmf_sparse
Center and scale numeric data | step_normalize
Simple value assignments for novel factor levels | step_novel
Natural spline basis functions | step_ns
Convert numbers to factors | step_num2factor
Near-zero variance filter | step_nzv
Convert ordinal factors to numeric scores | step_ordinalscore
Collapse infrequent categorical levels | step_other
PCA signal extraction | step_pca
Percentile transformation | step_percentile
Partial least squares feature extraction | step_pls
Orthogonal polynomial basis functions | step_poly
Generalized Bernstein polynomial basis | step_poly_bernstein
Create a profiling version of a data set | step_profile
Scaling numeric data to a specific range | step_range
Ratio variable creation | denom_vars step_ratio
Detect a regular expression | step_regex
Relevel factors to a desired level | step_relevel
Apply (smoothed) rectified linear transformation | step_relu
Rename variables by name using dplyr | step_rename
Rename multiple columns using dplyr | step_rename_at
General variable filter | step_rm
Sample rows using dplyr | step_sample
Scaling numeric data | step_scale
Select variables using dplyr | step_select
Shuffle variables | step_shuffle
Filter rows by position using dplyr | step_slice
Spatial sign preprocessing | step_spatialsign
Basis splines | step_spline_b
Convex splines | step_spline_convex
Monotone splines | step_spline_monotone
Natural splines | step_spline_natural
Non-negative splines | step_spline_nonnegative
Square root transformation | step_sqrt
Convert strings to factors | step_string2factor
Time feature generator | step_time
Assign missing categories to "unknown" | step_unknown
Convert ordered factors to unordered factors | step_unorder
Moving window functions | step_window
Yeo-Johnson transformation | step_YeoJohnson
Zero variance filter | step_zv
Summarize a recipe | summary.recipe
Tidy the result of a recipe | tidy.check tidy.check_class tidy.check_cols tidy.check_missing tidy.check_new_values tidy.check_range tidy.recipe tidy.step tidy.step_arrange tidy.step_bin2factor tidy.step_BoxCox tidy.step_bs tidy.step_center tidy.step_classdist tidy.step_classdist_shrunken tidy.step_corr tidy.step_count tidy.step_cut tidy.step_date tidy.step_depth tidy.step_discretize tidy.step_dummy tidy.step_dummy_extract tidy.step_dummy_multi_choice tidy.step_factor2string tidy.step_filter tidy.step_filter_missing tidy.step_geodist tidy.step_harmonic tidy.step_holiday tidy.step_hyperbolic tidy.step_ica tidy.step_impute_bag tidy.step_impute_knn tidy.step_impute_linear tidy.step_impute_lower tidy.step_impute_mean tidy.step_impute_median tidy.step_impute_mode tidy.step_impute_roll tidy.step_indicate_na tidy.step_integer tidy.step_interact tidy.step_intercept tidy.step_inverse tidy.step_invlogit tidy.step_isomap tidy.step_kpca tidy.step_kpca_poly tidy.step_kpca_rbf tidy.step_lag tidy.step_lincomb tidy.step_log tidy.step_logit tidy.step_mutate tidy.step_mutate_at tidy.step_naomit tidy.step_nnmf tidy.step_nnmf_sparse tidy.step_normalize tidy.step_novel tidy.step_ns tidy.step_num2factor tidy.step_nzv tidy.step_ordinalscore tidy.step_other tidy.step_pca tidy.step_percentile tidy.step_pls tidy.step_poly tidy.step_poly_bernstein tidy.step_profile tidy.step_range tidy.step_ratio tidy.step_regex tidy.step_relevel tidy.step_relu tidy.step_rename tidy.step_rename_at tidy.step_rm tidy.step_sample tidy.step_scale tidy.step_select tidy.step_shuffle tidy.step_slice tidy.step_spatialsign tidy.step_spline_b tidy.step_spline_convex tidy.step_spline_monotone tidy.step_spline_natural tidy.step_spline_nonnegative tidy.step_sqrt tidy.step_string2factor tidy.step_time tidy.step_unknown tidy.step_unorder tidy.step_window tidy.step_YeoJohnson tidy.step_zv
Update role-specific requirements | update_role_requirements
Update a recipe step | update.step