                     Changes in version 0.6.0.9001

New models for tabular data:

  - Regularization Learning Networks (brulee_rln()) use a conventional
    MLP architecture but each weight learns its own adaptive
    regularization coefficient.

  - ResNet (brulee_resnet()) fits a multilayer neural network with
    skip (i.e. residual) connections and batch normalization.

  - AutoInt (brulee_auto_int()) uses residual connections and columnwise
    attention mechanisms to create embeddings that encourage in-context
    learning of features.

  - All modeling functions now support GPU acceleration via the device
    parameter. Users can specify device = "cpu", device = "cuda", or
    device = "mps" (Apple Silicon). When device = NULL (default), the
    package automatically selects CUDA if available, otherwise defaults
    to CPU. Note: MPS is not auto-selected because it does not support
    the float64 dtype that brulee requires. See ?training_efficiency for
    related notes.
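
The default device rule described above can be sketched in a few lines
of R. This is only an illustration under stated assumptions:
pick_device() and its cuda_available argument are hypothetical names
introduced here and are not part of brulee, and the package's internal
logic may differ. torch::cuda_is_available() is the torch function
assumed for detecting CUDA.

```r
# Hypothetical helper illustrating the documented default; this is not
# brulee's internal code.
pick_device <- function(device = NULL,
                        cuda_available = torch::cuda_is_available()) {
  if (is.null(device)) {
    # MPS is deliberately never chosen automatically.
    return(if (cuda_available) "cuda" else "cpu")
  }
  # An explicit request is validated and returned unchanged.
  match.arg(device, c("cpu", "cuda", "mps"))
}

pick_device(cuda_available = FALSE)         # no CUDA detected
pick_device("mps", cuda_available = FALSE)  # explicit request
```

Passing cuda_available explicitly keeps the sketch usable without torch
installed; in practice the argument would be left at its default.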

                 Changes in version 0.6.0 (2025-09-02)                  

  - Transition from the magrittr pipe to the base R pipe.

  - To help avoid numeric overflow in the loss functions:
    
      - Tensors are stored as a 64-bit float instead of 32-bit.
    
      - Starting values are now drawn from a Gaussian distribution
        (instead of uniform) with a smaller standard deviation.
    
      - The results always contain the initial results to use as a
        fallback if there is overflow during the first epoch.
    
      - brulee_mlp() has two additional parameters, grad_value_clip and
        grad_norm_clip, that clip gradients to help prevent overflow.
    
      - The warning was changed to "Early stopping occurred at epoch {X}
        due to numerical overflow of the loss function."

  - Several new SGD optimizers were added: "ADAMw", "Adadelta",
    "Adagrad", and "RMSprop".

  - Non-zero values of the mixture parameter cannot be used with
    several optimizers since they require L2 penalties.

                 Changes in version 0.5.0 (2025-04-07)                  

  - Removed a unit test for numerical overflow since overflow now occurs
    less frequently and has become increasingly challenging to
    reproduce.

                 Changes in version 0.4.0 (2025-01-30)                  

  - Added a convenience function, brulee_mlp_two_layer(), to more easily
    fit two-layer networks with parsnip.

  - Various changes and improvements to error and warning messages.

  - Fixed a bug that occurred when linear activation was used for neural
    networks (#68).

                 Changes in version 0.3.0 (2024-02-14)                  

  - Fixed a bug where coef() would error if used on a
    brulee_logistic_reg() that was trained with a recipe. (#66)

  - Fixed a bug where SGD was always being used as the optimizer (#61).

  - Additional activation functions were added (#74).

                 Changes in version 0.2.0 (2022-09-19)                  

  - Several learning rate schedulers were added to the modeling
    functions (#12).

  - An optimizer was added to brulee_mlp(), with a new default being
    LBFGS instead of stochastic gradient descent.

                 Changes in version 0.1.0 (2022-02-02)                  

  - Modeling functions gained a mixture argument for the proportion of
    L1 penalty that is used. (#50)

  - Penalization was not occurring when quasi-Newton optimization was
    chosen. (#50)

                 Changes in version 0.0.1 (2021-12-15)                  

First CRAN release.