Package: themis 1.0.2.9000

Emil Hvitfeldt

themis: Extra Recipes Steps for Dealing with Unbalanced Data

A dataset with an uneven number of cases in each class is said to be unbalanced. Many models perform poorly on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE 2011 <arxiv:1106.1813>, BorderlineSMOTE 2005 <doi:10.1007/11538059_91>, or ADASYN 2008 <https://ieeexplore.ieee.org/document/4633969>, or by decreasing the number of majority cases using NearMiss 2003 <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal 1976 <https://ieeexplore.ieee.org/document/4309452>.
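As a minimal sketch of the oversampling route described above, the following uses step_smote inside a recipe. It assumes the recipes package is installed alongside themis and relies on the bundled circle_example data, taking only its x, y, and class columns; the over_ratio value is chosen here for illustration.

```r
library(recipes)
library(themis)

# circle_example ships with themis; keep the numeric predictors and the outcome.
circle_numeric <- circle_example[, c("x", "y", "class")]

# Class counts before balancing.
table(circle_numeric$class)

# Oversample the minority class with SMOTE.
# over_ratio = 1 requests as many minority cases as majority cases.
rec <- recipe(class ~ x + y, data = circle_numeric) |>
  step_smote(class, over_ratio = 1)

# prep() estimates the step; bake(new_data = NULL) returns the processed training set.
balanced <- bake(prep(rec), new_data = NULL)

# Class counts after balancing.
table(balanced$class)
```

The undersampling steps (for example step_downsample, step_nearmiss, or step_tomek) slot into a recipe the same way, removing majority cases instead of synthesizing minority ones.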

Authors: Emil Hvitfeldt [aut, cre], Posit Software, PBC [cph, fnd]

themis_1.0.2.9000.tar.gz
themis_1.0.2.9000.zip (r-4.5), themis_1.0.2.9000.zip (r-4.4), themis_1.0.2.9000.zip (r-4.3)
themis_1.0.2.9000.tgz (r-4.4-any), themis_1.0.2.9000.tgz (r-4.3-any)
themis_1.0.2.9000.tar.gz (r-4.5-noble), themis_1.0.2.9000.tar.gz (r-4.4-noble)
themis_1.0.2.9000.tgz (r-4.4-emscripten), themis_1.0.2.9000.tgz (r-4.3-emscripten)
themis.pdf | themis.html
themis/json (API)
NEWS

# Install 'themis' in R:
install.packages('themis', repos = c('https://tidymodels.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/tidymodels/themis/issues

Pkgdown: https://themis.tidymodels.org

Datasets: circle_example

On CRAN:

9.78 score, 141 stars, 1 package, 756 scripts, 8.4k downloads, 18 exports, 57 dependencies

Last updated 1 month ago from: 95e578bbe9. Checks: OK: 5, NOTE: 2. Indexed: yes.

Target           Result  Date
Doc / Vignettes  OK      Dec 09 2024
R-4.5-win        NOTE    Dec 09 2024
R-4.5-linux      NOTE    Dec 09 2024
R-4.4-win        OK      Dec 09 2024
R-4.4-mac        OK      Dec 09 2024
R-4.3-win        OK      Dec 09 2024
R-4.3-mac        OK      Dec 09 2024

Exports: adasyn, bsmote, nearmiss, required_pkgs, smote, smotenc, step_adasyn, step_bsmote, step_downsample, step_nearmiss, step_rose, step_smote, step_smotenc, step_tomek, step_upsample, tidy, tomek, tunable

Dependencies: class, cli, clock, codetools, cpp11, data.table, diagram, digest, dplyr, fansi, future, future.apply, generics, globals, glue, gower, hardhat, ipred, KernSmooth, lattice, lava, lifecycle, listenv, lubridate, magrittr, MASS, Matrix, nnet, numDeriv, parallelly, pillar, pkgconfig, prodlim, progressr, purrr, R6, RANN, Rcpp, recipes, rlang, ROSE, rpart, shape, sparsevctrs, SQUAREM, stringi, stringr, survival, tibble, tidyr, tidyselect, timechange, timeDate, tzdb, utf8, vctrs, withr

Readme and manuals

Help Manual

Help page                                          Topics
Adaptive Synthetic Algorithm                       adasyn
borderline-SMOTE Algorithm                         bsmote
Synthetic Dataset With a Circle                    circle_example
Remove Points Near Other Classes                   nearmiss
SMOTE Algorithm                                    smote
SMOTENC Algorithm                                  smotenc
Apply Adaptive Synthetic Algorithm                 step_adasyn tidy.step_adasyn
Apply borderline-SMOTE Algorithm                   step_bsmote tidy.step_bsmote
Down-Sample a Data Set Based on a Factor Variable  step_downsample tidy.step_downsample
Remove Points Near Other Classes                   step_nearmiss tidy.step_nearmiss
Apply ROSE Algorithm                               step_rose tidy.step_rose
Apply SMOTE Algorithm                              step_smote tidy.step_smote
Apply SMOTENC algorithm                            step_smotenc tidy.step_smotenc
Remove Tomek’s Links                               step_tomek tidy.step_tomek
Up-Sample a Data Set Based on a Factor Variable    step_upsample tidy.step_upsample
Remove Tomek's links                               tomek
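
The algorithm-level topics in the table (adasyn, bsmote, nearmiss, smote, smotenc, tomek) are exported functions that can be called directly on a data frame, without building a recipe. The sketch below is a hedged illustration rather than a reference: it assumes the var argument takes the name of the class column as a string and that all other arguments keep their defaults, and it uses the bundled circle_example data.

```r
library(themis)

# Keep the two numeric predictors and the factor outcome.
circle_numeric <- circle_example[, c("x", "y", "class")]

# Oversampling: add synthetic minority cases with SMOTE.
oversampled <- smote(circle_numeric, var = "class")

# Undersampling: drop observations that form Tomek links.
cleaned <- tomek(circle_numeric, var = "class")

# Compare class counts across the original and the two results.
table(circle_numeric$class)
table(oversampled$class)
table(cleaned$class)
```

The step_* counterparts wrap the same algorithms for use in a recipes pipeline, and each has a tidy method for inspecting the step.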