Package: recipes 1.1.0.9001
recipes: Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. The package provides an extensible framework for pipeable sequences of feature engineering steps that apply preprocessing tools to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as input for statistical or machine learning models.
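The estimate-then-apply workflow described above can be sketched as follows. This is a minimal illustration, not taken from the package documentation: the use of `mtcars`, the row split, and the object names (`train`, `test`, `rec`, `rec_trained`) are assumptions chosen for the example.

```r
library(recipes)

# Assumed example data: treat the first 24 rows of mtcars as the
# "initial data set" and the remaining 8 rows as "other data".
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]

# Define a pipeable sequence of steps; nothing is estimated yet.
rec <- recipe(mpg ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())

# Estimate the step parameters (here, means and standard deviations)
# from the initial data set.
rec_trained <- prep(rec, training = train)

# Apply the trained steps: new_data = NULL returns the processed
# training set, while passing test applies the same parameters to it.
train_processed <- bake(rec_trained, new_data = NULL)
test_processed  <- bake(rec_trained, new_data = test)
```

Because the centering and scaling values come only from `train`, the rows in `test` are transformed with parameters they did not contribute to, which is the separation the description refers to.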
Authors:
recipes_1.1.0.9001.tar.gz
recipes_1.1.0.9001.zip (r-4.5), recipes_1.1.0.9001.zip (r-4.4), recipes_1.1.0.9001.zip (r-4.3)
recipes_1.1.0.9001.tgz (r-4.4-any), recipes_1.1.0.9001.tgz (r-4.3-any)
recipes_1.1.0.9001.tar.gz (r-4.5-noble), recipes_1.1.0.9001.tar.gz (r-4.4-noble)
recipes_1.1.0.9001.tgz (r-4.4-emscripten), recipes_1.1.0.9001.tgz (r-4.3-emscripten)
recipes.pdf | recipes.html
recipes/json (API)
NEWS
# Install 'recipes' in R:
install.packages('recipes', repos = c('https://tidymodels.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/tidymodels/recipes/issues
Pkgdown site: https://recipes.tidymodels.org
Last updated 30 days ago from: e738967d28. Checks: OK: 3, NOTE: 4. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Dec 26 2024 |
R-4.5-win | OK | Dec 26 2024 |
R-4.5-linux | OK | Dec 26 2024 |
R-4.4-win | NOTE | Dec 26 2024 |
R-4.4-mac | NOTE | Dec 26 2024 |
R-4.3-win | NOTE | Dec 26 2024 |
R-4.3-mac | NOTE | Dec 26 2024 |
Exports: .get_data_types %>% add_check add_role add_step all_date all_date_predictors all_datetime all_datetime_predictors all_double all_double_predictors all_factor all_factor_predictors all_integer all_integer_predictors all_logical all_logical_predictors all_nominal all_nominal_predictors all_numeric all_numeric_predictors all_ordered all_ordered_predictors all_outcomes all_predictors all_string all_string_predictors all_unordered all_unordered_predictors are_weights_used averages bake check check_class check_cols check_missing check_name check_new_data check_new_values check_range check_type correlations covariances current_info denom_vars detect_step discretize dummy_extract_names dummy_names ellipse_check estimate_yj extract_fit_time extract_parameter_dials extract_parameter_set_dials fixed format_ch_vec format_selectors frequency_weights fully_trained get_case_weights get_keep_original_cols has_role has_type imp_vars importance_weights is_trained juice medians names0 pca_wts prep prepare prepper print_step printer prof rand_id recipe recipes_eval_select recipes_extension_check recipes_names_outcomes recipes_names_predictors recipes_pkg_check recipes_ptype recipes_ptype_validate recipes_remove_cols remove_original_cols remove_role required_pkgs sel2char step step_arrange step_bagimpute step_bin2factor step_BoxCox step_bs step_center step_classdist step_classdist_shrunken step_corr step_count step_cut step_date step_depth step_discretize step_dummy step_dummy_extract step_dummy_multi_choice step_factor2string step_filter step_filter_missing step_geodist step_harmonic step_holiday step_hyperbolic step_ica step_impute_bag step_impute_knn step_impute_linear step_impute_lower step_impute_mean step_impute_median step_impute_mode step_impute_roll step_indicate_na step_integer step_interact step_intercept step_inverse step_invlogit step_isomap step_knnimpute step_kpca step_kpca_poly step_kpca_rbf step_lag step_lincomb step_log step_logit step_lowerimpute step_meanimpute step_medianimpute step_modeimpute step_mutate step_mutate_at step_naomit step_nnmf step_nnmf_sparse step_normalize step_novel step_ns step_num2factor step_nzv step_ordinalscore step_other step_pca step_percentile step_pls step_poly step_poly_bernstein step_profile step_range step_ratio step_regex step_relevel step_relu step_rename step_rename_at step_rm step_rollimpute step_sample step_scale step_select step_shuffle step_slice step_spatialsign step_spline_b step_spline_convex step_spline_monotone step_spline_natural step_spline_nonnegative step_sqrt step_string2factor step_time step_unknown step_unorder step_window step_YeoJohnson step_zv terms_select tidy tunable tune_args update update_role update_role_requirements variances yj_transform
Dependencies: class cli clock codetools cpp11 data.table diagram digest dplyr fansi future future.apply generics globals glue gower hardhat ipred KernSmooth lattice lava lifecycle listenv lubridate magrittr MASS Matrix nnet numDeriv parallelly pillar pkgconfig prodlim progressr purrr R6 Rcpp rlang rpart shape sparsevctrs SQUAREM stringi stringr survival tibble tidyr tidyselect timechange timeDate tzdb utf8 vctrs withr
Handling categorical predictors
Rendered from Dummies.Rmd using knitr::rmarkdown on Dec 26 2024.
Last update: 2023-12-07
Started: 2017-11-17
Introduction to recipes
Rendered from recipes.Rmd using knitr::rmarkdown on Dec 26 2024.
Last update: 2022-06-13
Started: 2022-02-18
On skipping steps
Rendered from Skipping.Rmd using knitr::rmarkdown on Dec 26 2024.
Last update: 2022-02-18
Started: 2017-12-21
Ordering of steps
Rendered from Ordering.Rmd using knitr::rmarkdown on Dec 26 2024.
Last update: 2022-02-18
Started: 2017-06-15
Roles in recipes
Rendered from Roles.Rmd using knitr::rmarkdown on Dec 26 2024.
Last update: 2022-06-13
Started: 2019-03-19
Selecting variables
Rendered from Selecting_Variables.Rmd using knitr::rmarkdown on Dec 26 2024.
Last update: 2023-11-01
Started: 2017-03-29
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Get types for use in recipes | .get_data_types .get_data_types.character .get_data_types.Date .get_data_types.default .get_data_types.double .get_data_types.factor .get_data_types.hardhat_case_weights .get_data_types.integer .get_data_types.list .get_data_types.logical .get_data_types.numeric .get_data_types.ordered .get_data_types.POSIXct .get_data_types.Surv .get_data_types.textrecipes_tokenlist |
Add a New Operation to the Current Recipe | add_check add_step |
Apply a trained preprocessing recipe | bake bake.recipe |
Using case weights with recipes | case_weights |
Helpers for steps with case weights | are_weights_used averages case-weight-helpers correlations covariances get_case_weights medians pca_wts variances |
Check variable class | check_class |
Check if all columns are present | check_cols |
Check for missing values | check_missing |
Check for new values | check_new_values |
Check range consistency | check_range |
Detect if a particular step or check is used in a recipe | detect_step |
Developer functions for creating recipes steps | developer_functions |
Discretize Numeric Variables | discretize discretize.default discretize.numeric predict.discretize |
Create a formula from a prepared recipe | formula.recipe |
Check to see if a recipe is trained/prepared | fully_trained |
Role Selection | all_date all_datetime all_datetime_predictors all_date_predictors all_double all_double_predictors all_factor all_factor_predictors all_integer all_integer_predictors all_logical all_logical_predictors all_nominal all_nominal_predictors all_numeric all_numeric_predictors all_ordered all_ordered_predictors all_outcomes all_predictors all_string all_string_predictors all_unordered all_unordered_predictors current_info has_role has_type |
Extract transformed training set | juice |
Naming Tools | dummy_extract_names dummy_names names0 |
Estimate a preprocessing recipe | prep prep.recipe |
Wrapper function for preparing recipes within resampling | prepper |
Print a Recipe | print.recipe |
Create a recipe for preprocessing data | recipe recipe.data.frame recipe.default recipe.formula recipe.matrix |
Evaluate a selection with tidyselect semantics specific to recipes | recipes_eval_select |
Checks that steps have all S3 methods | recipes_extension_check |
Manually alter roles | add_role remove_role roles update_role |
Methods for selecting variables in step functions | selection selections |
Using sparse data with recipes | sparse_data |
Sort rows using dplyr | step_arrange |
Create a factor from a dummy variable | step_bin2factor |
Box-Cox transformation for non-negative data | step_BoxCox |
B-spline basis functions | step_bs |
Centering numeric data | step_center |
Distances to class centroids | step_classdist |
Compute shrunken centroid distances for classification models | step_classdist_shrunken |
High correlation filter | step_corr |
Create counts of patterns using regular expressions | step_count |
Cut a numeric variable into a factor | step_cut |
Date feature generator | step_date |
Data depths | step_depth |
Discretize Numeric Variables | step_discretize |
Create traditional dummy variables | step_dummy |
Extract patterns from nominal data | step_dummy_extract |
Handle levels in multiple predictors together | step_dummy_multi_choice |
Convert factors to strings | step_factor2string |
Filter rows using dplyr | step_filter |
Missing value column filter | step_filter_missing |
Distance between two locations | step_geodist |
Add sin and cos terms for harmonic analysis | step_harmonic |
Holiday feature generator | step_holiday |
Hyperbolic transformations | step_hyperbolic |
ICA signal extraction | step_ica |
Impute via bagged trees | imp_vars step_impute_bag |
Impute via k-nearest neighbors | step_impute_knn |
Impute numeric variables via a linear model | step_impute_linear |
Impute numeric data below the threshold of measurement | step_impute_lower |
Impute numeric data using the mean | step_impute_mean |
Impute numeric data using the median | step_impute_median |
Impute nominal data using the most common value | step_impute_mode |
Impute numeric data using a rolling window statistic | step_impute_roll |
Create missing data column indicators | step_indicate_na |
Convert values to predefined integers | step_integer |
Create interaction variables | step_interact |
Add intercept (or constant) column | step_intercept |
Inverse transformation | step_inverse |
Inverse logit transformation | step_invlogit |
Isomap embedding | step_isomap |
Kernel PCA signal extraction | step_kpca |
Polynomial kernel PCA signal extraction | step_kpca_poly |
Radial basis function kernel PCA signal extraction | step_kpca_rbf |
Create a lagged predictor | step_lag |
Linear combination filter | step_lincomb |
Logarithmic transformation | step_log |
Logit transformation | step_logit |
Add new variables using dplyr | step_mutate |
Mutate multiple columns using dplyr | step_mutate_at |
Remove observations with missing values | step_naomit |
Non-negative matrix factorization signal extraction | step_nnmf |
Non-negative matrix factorization signal extraction with lasso penalization | step_nnmf_sparse |
Center and scale numeric data | step_normalize |
Simple value assignments for novel factor levels | step_novel |
Natural spline basis functions | step_ns |
Convert numbers to factors | step_num2factor |
Near-zero variance filter | step_nzv |
Convert ordinal factors to numeric scores | step_ordinalscore |
Collapse infrequent categorical levels | step_other |
PCA signal extraction | step_pca |
Percentile transformation | step_percentile |
Partial least squares feature extraction | step_pls |
Orthogonal polynomial basis functions | step_poly |
Generalized Bernstein polynomial basis | step_poly_bernstein |
Create a profiling version of a data set | step_profile |
Scaling numeric data to a specific range | step_range |
Ratio variable creation | denom_vars step_ratio |
Detect a regular expression | step_regex |
Relevel factors to a desired level | step_relevel |
Apply (smoothed) rectified linear transformation | step_relu |
Rename variables by name using dplyr | step_rename |
Rename multiple columns using dplyr | step_rename_at |
General variable filter | step_rm |
Sample rows using dplyr | step_sample |
Scaling numeric data | step_scale |
Select variables using dplyr | step_select |
Shuffle variables | step_shuffle |
Filter rows by position using dplyr | step_slice |
Spatial sign preprocessing | step_spatialsign |
Basis splines | step_spline_b |
Convex splines | step_spline_convex |
Monotone splines | step_spline_monotone |
Natural splines | step_spline_natural |
Non-negative splines | step_spline_nonnegative |
Square root transformation | step_sqrt |
Convert strings to factors | step_string2factor |
Time feature generator | step_time |
Assign missing categories to "unknown" | step_unknown |
Convert ordered factors to unordered factors | step_unorder |
Moving window functions | step_window |
Yeo-Johnson transformation | step_YeoJohnson |
Zero variance filter | step_zv |
Summarize a recipe | summary.recipe |
Tidy the result of a recipe | tidy.check tidy.check_class tidy.check_cols tidy.check_missing tidy.check_new_values tidy.check_range tidy.recipe tidy.step tidy.step_arrange tidy.step_bin2factor tidy.step_BoxCox tidy.step_bs tidy.step_center tidy.step_classdist tidy.step_classdist_shrunken tidy.step_corr tidy.step_count tidy.step_cut tidy.step_date tidy.step_depth tidy.step_discretize tidy.step_dummy tidy.step_dummy_extract tidy.step_dummy_multi_choice tidy.step_factor2string tidy.step_filter tidy.step_filter_missing tidy.step_geodist tidy.step_harmonic tidy.step_holiday tidy.step_hyperbolic tidy.step_ica tidy.step_impute_bag tidy.step_impute_knn tidy.step_impute_linear tidy.step_impute_lower tidy.step_impute_mean tidy.step_impute_median tidy.step_impute_mode tidy.step_impute_roll tidy.step_indicate_na tidy.step_integer tidy.step_interact tidy.step_intercept tidy.step_inverse tidy.step_invlogit tidy.step_isomap tidy.step_kpca tidy.step_kpca_poly tidy.step_kpca_rbf tidy.step_lag tidy.step_lincomb tidy.step_log tidy.step_logit tidy.step_mutate tidy.step_mutate_at tidy.step_naomit tidy.step_nnmf tidy.step_nnmf_sparse tidy.step_normalize tidy.step_novel tidy.step_ns tidy.step_num2factor tidy.step_nzv tidy.step_ordinalscore tidy.step_other tidy.step_pca tidy.step_percentile tidy.step_pls tidy.step_poly tidy.step_poly_bernstein tidy.step_profile tidy.step_range tidy.step_ratio tidy.step_regex tidy.step_relevel tidy.step_relu tidy.step_rename tidy.step_rename_at tidy.step_rm tidy.step_sample tidy.step_scale tidy.step_select tidy.step_shuffle tidy.step_slice tidy.step_spatialsign tidy.step_spline_b tidy.step_spline_convex tidy.step_spline_monotone tidy.step_spline_natural tidy.step_spline_nonnegative tidy.step_sqrt tidy.step_string2factor tidy.step_time tidy.step_unknown tidy.step_unorder tidy.step_window tidy.step_YeoJohnson tidy.step_zv |
Update role specific requirements | update_role_requirements |
Update a recipe step | update.step |