Package: themis 1.0.3.9000

Emil Hvitfeldt

themis: Extra Recipes Steps for Dealing with Unbalanced Data

A dataset with an uneven number of cases in each class is said to be unbalanced. Many models perform poorly on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE (2011) <doi:10.48550/arXiv.1106.1813>, BorderlineSMOTE (2005) <doi:10.1007/11538059_91>, and ADASYN (2008) <https://ieeexplore.ieee.org/document/4633969>, or by decreasing the number of majority cases using NearMiss (2003) <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal (1976) <https://ieeexplore.ieee.org/document/4309452>.
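
As a rough illustration of the over-sampling route described above, the sketch below balances a small synthetic data frame with step_smote() inside a recipe. The data frame `toy` and its column names are made up for this example; the 100/20 class split is an arbitrary assumption, and the snippet assumes the recipes and themis packages are installed.

```r
library(recipes)
library(themis)

# Hypothetical unbalanced data: 100 "major" rows and 20 "minor" rows.
set.seed(1)
toy <- data.frame(
  x = rnorm(120),
  y = rnorm(120),
  class = factor(rep(c("major", "minor"), times = c(100, 20)))
)

# Add a SMOTE step for the outcome; over_ratio = 1 asks for the minority
# class to be brought up to the size of the majority class.
rec <- recipe(class ~ x + y, data = toy) |>
  step_smote(class, over_ratio = 1)

# prep() estimates the step; bake(new_data = NULL) returns the processed
# training set, which is where sampling steps are applied.
balanced <- rec |>
  prep() |>
  bake(new_data = NULL)

table(balanced$class)
```

Sampling steps such as this one are skipped by default when baking new data, so only the training set is rebalanced.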

Authors: Emil Hvitfeldt [aut, cre], Posit Software, PBC [cph, fnd]

# Install 'themis' in R:
install.packages('themis', repos = c('https://tidymodels.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/tidymodels/themis/issues

Pkgdown/docs site: https://themis.tidymodels.org

10.60 score, 142 stars, 2 packages, 1.7k scripts, 18k downloads, 19 exports, 65 dependencies

Last updated from: 76cf8d0add. Checks: 9 OK. Indexed: yes.

Target                 Result  Time
linux-devel-x86_64     OK      191
source / vignettes     OK      197
linux-release-x86_64   OK      224
macos-release-arm64    OK      123
macos-oldrel-arm64     OK      287
windows-devel          OK      155
windows-release        OK      131
windows-oldrel         OK      146
wasm-release           OK      117

Exports: adasyn, bsmote, nearmiss, required_pkgs, rose, smote, smotenc, step_adasyn, step_bsmote, step_downsample, step_nearmiss, step_rose, step_smote, step_smotenc, step_tomek, step_upsample, tidy, tomek, tunable

Dependencies: class, cli, clock, codetools, cpp11, data.table, diagram, digest, dplyr, farver, future, future.apply, generics, ggplot2, globals, glue, gower, gtable, hardhat, ipred, isoband, KernSmooth, labeling, lattice, lava, lifecycle, listenv, lubridate, magrittr, MASS, Matrix, nnet, numDeriv, parallelly, pillar, pkgconfig, prodlim, progressr, purrr, R6, RANN, RColorBrewer, Rcpp, recipes, rlang, ROSE, rpart, S7, scales, shape, sparsevctrs, SQUAREM, stringi, stringr, survival, tibble, tidyr, tidyselect, timechange, timeDate, tzdb, utf8, vctrs, viridisLite, withr

Readme and manuals

Help Manual

Help page                                           Topics
Adaptive Synthetic Algorithm                        adasyn
borderline-SMOTE Algorithm                          bsmote
Synthetic Dataset With a Circle                     circle_example
Remove Points Near Other Classes                    nearmiss
ROSE Algorithm                                      rose
SMOTE Algorithm                                     smote
SMOTENC Algorithm                                   smotenc
Apply Adaptive Synthetic Algorithm                  step_adasyn, tidy.step_adasyn
Apply borderline-SMOTE Algorithm                    step_bsmote, tidy.step_bsmote
Down-Sample a Data Set Based on a Factor Variable   step_downsample, tidy.step_downsample
Remove Points Near Other Classes                    step_nearmiss, tidy.step_nearmiss
Apply ROSE Algorithm                                step_rose, tidy.step_rose
Apply SMOTE Algorithm                               step_smote, tidy.step_smote
Apply SMOTENC algorithm                             step_smotenc, tidy.step_smotenc
Remove Tomek’s Links                                step_tomek, tidy.step_tomek
Up-Sample a Data Set Based on a Factor Variable     step_upsample, tidy.step_upsample
Remove Tomek's links                                tomek
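
The help topics list both recipe steps (step_smote, step_nearmiss, ...) and standalone functions (smote, nearmiss, ...). The sketch below shows how the standalone versions might be called directly on a data frame, outside a recipe. The data frame `toy` is hypothetical, and the argument names used here (`var`, `k`, `over_ratio`, `under_ratio`) reflect my reading of the package manual; check the help pages for the installed version before relying on them.

```r
library(themis)

# Hypothetical unbalanced data: 100 "major" rows and 20 "minor" rows.
set.seed(2)
toy <- data.frame(
  x = rnorm(120),
  y = rnorm(120),
  class = factor(rep(c("major", "minor"), times = c(100, 20)))
)

# Over-sample: synthesize minority rows until the classes are the same size.
up <- smote(toy, var = "class", k = 5, over_ratio = 1)
table(up$class)

# Under-sample: keep only the majority rows closest to the minority class.
down <- nearmiss(toy, var = "class", k = 5, under_ratio = 1)
table(down$class)
```

The standalone functions are convenient for quick experiments; inside a modelling workflow the step_*() versions are usually the better fit, since a recipe applies them to the training data only.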