Releases: mlr-org/mlr3

mlr3 0.13.1

20 Jan 12:16
19eddd8
  • Improved performance for many operations on ResampleResult and
    BenchmarkResult.
  • resample() and benchmark() got a new argument clone to control which
    objects to clone before performing computations.
  • Tasks are checked for infinite values during the conversion from data.frame
    to Task in as_task_classif() and as_task_regr(). A warning is signaled
    if any column contains infinite values.
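
A minimal sketch of the new clone argument and the infinite-value check, assuming the built-in iris task and the rpart learner are available. The toy data frame df is made up for illustration, and the exact wording of the warning is not shown here.

```r
library(mlr3)

task = tsk("iris")
learner = lrn("classif.rpart")
resampling = rsmp("cv", folds = 3)

# Clone only the learner and the resampling; the task is used as passed in,
# which can avoid copying a large object.
rr = resample(task, learner, resampling, clone = c("learner", "resampling"))

# Hypothetical toy data with one infinite value in a numeric column.
df = data.frame(
  x = c(1, 2, Inf, 4),
  y = factor(c("a", "b", "a", "b"))
)

# The conversion is expected to signal a warning about the infinite value.
task_inf = as_task_classif(df, target = "y")
```

Leaving clone at its default clones the task, learner, and resampling, which is the safer choice when the input objects are reused afterwards.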

mlr3 0.13.0

16 Nov 14:16
96008d0
  • Learners which are capable of resuming/continuing (e.g.,
    learner (classif|regr|surv).xgboost with hyperparameter nrounds updated)
    can now optionally store a stack of trained learners to be used to hotstart
    their training. Note that this feature is still somewhat experimental.
    See HotstartStack and #719.
  • New measures to score similarity of selected feature sets:
    sim.jaccard (Jaccard Index) and sim.phi (Phi coefficient) (#690).
  • predict_newdata() now also supports DataBackend as input.
  • New function install_pkgs() to install required packages. This generic works
    for all objects with a packages field as well as ResampleResult and
    BenchmarkResult (#728).
  • New learner regr.debug for debugging.
  • New Task method $set_levels() to control how data with factor columns
    is returned, independent of the used DataBackend.
  • Measures now return NA if prerequisites are not met (#699).
    This makes it convenient to score experiments with multiple measures
    that have different requirements.
  • Feature names may no longer contain the special character %.

mlr3 0.12.0

05 Aug 18:04
0df584c
  • New method to assign labels to columns in tasks: Task$label().
    These will be used in visualizations in the future.
  • New method to add stratification variables: Task$add_strata().
  • New helper function partition() to split a task into a training and test
    set.
  • New standardized getter loglik() for class Learner.
  • New measures "aic" and "bic" to compute the Akaike Information Criterion
    or the Bayesian Information Criterion, respectively.
  • New Resampling method: ResamplingCustomCV. Creates a custom resampling split
    based on the levels of a user-provided factor variable.
  • New argument encapsulate for resample() and benchmark() to conveniently
    enable encapsulation and also set the fallback learner to the
    featureless learner. This is simply for convenience; configuring each learner
    individually is still possible and allows more fine-grained control (#634,
    #642).
  • New field parallel_predict for Learner to enable parallel predictions via
    the future backend. This is currently only enabled while calling the
    $predict() or $predict_newdata() methods and is disabled during resample()
    and benchmark(), where you have other means to parallelize.
  • Deprecated public (and already documented as internal) field $data in
    ResampleResult and BenchmarkResult to simplify the API and avoid
    confusion. The converter as.data.table() can be used instead to access the
    internal data.
  • Measures now have formal hyperparameters. A popular example where this is
    required is the F1 score, now implemented with customizable beta.
  • Changed default of argument ordered in Task$data() from TRUE to FALSE.
  • Fixed getter ResamplingRepeatedCV$folds() (#643).
  • Fixed hashing of some measures.
  • Removed experimental column role uri. This role will be split up into multiple
    roles by the mlr3keras package.
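
A minimal sketch combining partition() with a measure hyperparameter, assuming the built-in sonar task and the rpart learner. The ratio of 0.8 and beta = 2 are arbitrary illustrative choices.

```r
library(mlr3)

task = tsk("sonar")
learner = lrn("classif.rpart")

# Split the row ids into a training and a test set.
split = partition(task, ratio = 0.8)

learner$train(task, row_ids = split$train)
pred = learner$predict(task, row_ids = split$test)

# F-beta score with a customized beta, set as a measure hyperparameter.
f2 = msr("classif.fbeta", beta = 2)
score = pred$score(f2)
```

With beta = 1 the measure corresponds to the usual F1 score; larger values weight recall more heavily than precision.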

mlr3 0.11.0

05 Mar 14:21
  • Added an as.data.table.Resampling method.
  • Renamed column "row_id" to "row_ids" in the as.data.table() methods
    for PredictionClassif and PredictionRegr (#547).
  • Added converters as_prediction_classif() and as_prediction_regr() to
    reverse the operation of as.data.table.PredictionClassif() and
    as.data.table.PredictionRegr().
  • Specifying a weight column during learner$predict_newdata() is not mandatory
    anymore (#563).
  • Task$data() now returns only active rows and columns by default, instead of
    asserting that only active rows and columns are requested. As a result, the
    $data() method can now also be used to query inactive rows and columns from
    the DataBackend.
  • New (experimental) column role uri which is intended to point to external
    resources, e.g. images on the file system.
  • New helper set_threads() to control the number of threads during calls to
    external packages. All objects will be migrated to have threading disabled in
    their defaults to avoid conflicting parallelization techniques (#605).
  • New option mlr3.debug: avoid calls to future in resample() and
    benchmark() to improve the readability of tracebacks.
  • New experimental option mlr3.allow_utf8_names: allow non-ascii characters in
    column names in tasks.

mlr3 0.10.0

21 Jan 13:51
  • Result containers ResampleResult and BenchmarkResult now optionally remove
    the DataBackend of the Tasks in order to reduce file size and memory
    footprint after serialization. To remove the backends from the containers,
    set store_backends to FALSE in resample() or benchmark(),
    respectively. Note that this behaviour will eventually be the default in
    future releases.
  • Prediction objects generated by Learner$predict_newdata() now have row ids
    starting from 1 instead of auto-incrementing the row ids of the training task.
  • as.data.table.DictionaryTasks now returns an additional column properties.
  • Added flag conditions to ResampleResult$score() and
    BenchmarkResult$score() to make it more convenient to work with failing
    learners.
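
A minimal sketch of store_backends and the conditions flag, assuming the built-in iris task and the rpart learner. The holdout resampling is an arbitrary choice for illustration.

```r
library(mlr3)

# Drop the DataBackend from the result container to reduce its size.
rr = resample(
  tsk("iris"),
  lrn("classif.rpart"),
  rsmp("holdout"),
  store_backends = FALSE
)

# Request the recorded warnings and errors as extra columns of the score table.
scores = rr$score(msr("classif.ce"), conditions = TRUE)
```

Measures that need access to the task data may not be computable once the backend has been removed, so it seems safest to keep the backends when such measures are planned.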

mlr3 0.9.0

06 Dec 19:48
  • New methods for Task: $set_col_roles and $set_row_roles as a replacement
    for the deprecated and less flexible $set_col_role and $set_row_role.
  • Learners can now have a timeout (#556).
  • Removed S3 method friedman.test.BenchmarkResult() in favor of the new
    mlr3benchmark package.

mlr3 0.8.0

21 Oct 09:07
  • MeasureOOBError now has the property minimize set to TRUE.
  • New learner property "featureless" to tag learners which can operate on
    featureless tasks.
  • Fixed ResampleResult ignoring argument predict_sets for returned
    Prediction objects.
  • Compatibility with the new version of lgr.

mlr3 0.7.0

07 Oct 12:39
  • Updated properties of featureless learners so that they can be applied to
    all feature types (they did not work on POSIXct columns).
  • Fixed measures being calculated as NaN for BenchmarkResult for resamplings
    with a single iteration (#551).
  • Fixed a bug where a broken heuristic disabled nested parallelization via
    package future (mlr3tuning#270).
  • ResampleResult and BenchmarkResult now share a common interface to store
    the experiment results. Manual construction is still possible with the helper
    function as_result_data().
  • Fixed deep cloning of ResamplingCV and ResamplingRepeatedCV.
  • New measure classif.prauc (area under precision-recall curve).
  • Removed dependency on orphaned package bibtex.

mlr3 0.6.0

13 Sep 15:01
  • Compact in-memory representation of R6 objects to save space when
    saving objects via saveRDS() or serialize().
  • Objects in containers like ResampleResult or BenchmarkResult are now
    de-duplicated for an optimized serialization.
  • Fixed data set breast_cancer: all factor features are now
    correctly stored as ordered factors.
  • Added a new utility function convert_task().

mlr3 0.5.0

07 Aug 08:44
26c80f6
  • Added classification task breast_cancer.
  • Added ResamplingLOO for leave-one-out resampling.
  • Regression now supports predict type "distr" using the distr6 package.
  • Fixed ResamplingBootstrap in combination with grouping (#514).
  • Fixed plot method of TaskGeneratorMoons.
  • Added hyperparameter keep_model to learners "classif.rpart" and
    "regr.rpart".