Releases: mlr-org/mlr3
mlr3 0.13.1
- Improved performance for many operations on `ResampleResult` and `BenchmarkResult`.
- `resample()` and `benchmark()` got a new argument `clone` to control which objects to clone before performing computations.
- Tasks are checked for infinite values during the conversion from `data.frame` to `Task` in `as_task_classif()` and `as_task_regr()`. A warning is signaled if any column contains infinite values.
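
The infinite-value check can be observed with a small sketch. The data frame `df` and its column names are made up for illustration, and the wording of the warning is not assumed; the code only records whether a warning was signaled.

```r
library(mlr3)

# Hypothetical toy data: column `x` contains an infinite value.
df = data.frame(x = c(1, 2, Inf, 4), y = c(0.5, 1.0, 1.5, 2.0))

# Capture the warning message instead of letting it print;
# `msg` stays NA if no warning is signaled.
msg = tryCatch(
  {
    as_task_regr(df, target = "y")
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
print(msg)
```

Cleaning or imputing such columns before the conversion avoids the warning.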
mlr3 0.13.0
- Learners which are capable of resuming/continuing (e.g., learner `(classif|regr|surv).xgboost` with hyperparameter `nrounds` updated) can now optionally store a stack of trained learners to be used to hotstart their training. Note that this feature is still somewhat experimental. See `HotstartStack` and #719.
- New measures to score similarity of selected feature sets: `sim.jaccard` (Jaccard Index) and `sim.phi` (Phi coefficient) (#690).
- `predict_newdata()` now also supports `DataBackend` as input.
- New function `install_pkgs()` to install required packages. This generic works for all objects with a `packages` field as well as `ResampleResult` and `BenchmarkResult` (#728).
- New learner `regr.debug` for debugging.
- New `Task` method `$set_levels()` to control how data with factor columns is returned, independent of the used `DataBackend`.
- Measures now return `NA` if prerequisites are not met (#699). This allows you to conveniently score your experiments with multiple measures having different requirements.
- Feature names may no longer contain the special character `%`.
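
As a rough illustration of scoring with measures that have different requirements, the sketch below scores a response-only prediction with one measure that works on responses and one that needs probabilities. The choice of task, learner, and measures is an assumption made for the example; the second score is expected to come back as a missing value rather than raising an error.

```r
library(mlr3)

task = tsk("sonar")               # binary classification task shipped with mlr3
learner = lrn("classif.rpart")    # default predict type is "response"
learner$train(task)
prediction = learner$predict(task)

# classif.ce works on response predictions; classif.auc needs probabilities,
# which this prediction does not contain.
scores = prediction$score(msrs(c("classif.ce", "classif.auc")))
print(scores)
```

Setting `learner$predict_type = "prob"` before training would make both measures applicable.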
mlr3 0.12.0
- New method to assign labels to columns in tasks: `Task$label()`. These will be used in visualizations in the future.
- New method to add stratification variables: `Task$add_strata()`.
- New helper function `partition()` to split a task into a training and test set.
- New standardized getter `loglik()` for class `Learner`.
- New measures `"aic"` and `"bic"` to compute the Akaike Information Criterion or the Bayesian Information Criterion, respectively.
- New Resampling method: `ResamplingCustomCV`. Creates a custom resampling split based on the levels of a user-provided factor variable.
- New argument `encapsulate` for `resample()` and `benchmark()` to conveniently enable encapsulation and also set the fallback learner to the featureless learner. This is simply for convenience; configuring each learner individually is still possible and allows more fine-grained control (#634, #642).
- New field `parallel_predict` for `Learner` to enable parallel predictions via the future backend. This currently is only enabled while calling the `$predict()` or `$predict_newdata()` methods and is disabled during `resample()` and `benchmark()`, where you have other means to parallelize.
- Deprecated public (and already documented as internal) field `$data` in `ResampleResult` and `BenchmarkResult` to simplify the API and avoid confusion. The converter `as.data.table()` can be used instead to access the internal data.
- Measures now have formal hyperparameters. A popular example where this is required is the F1 score, now implemented with customizable `beta`.
- Changed default of argument `ordered` in `Task$data()` from `TRUE` to `FALSE`.
- Fixed getter `ResamplingRepeatedCV$folds()` (#643).
- Fixed hashing of some measures.
- Removed experimental column role `uri`. This role will be split up into multiple roles by the `mlr3keras` package.
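
Two of the additions above, `partition()` and measure hyperparameters, can be combined in a short sketch. The task, learner, split ratio, and the measure id `classif.fbeta` are choices made for this example and are not taken from the release notes.

```r
library(mlr3)
set.seed(42)

task = tsk("sonar")
split = partition(task, ratio = 0.8)   # list with row ids in $train and $test

learner = lrn("classif.rpart")
learner$train(task, row_ids = split$train)
prediction = learner$predict(task, row_ids = split$test)

# F-beta score with beta = 2, i.e. recall is weighted more heavily than precision
measure = msr("classif.fbeta", beta = 2)
print(prediction$score(measure))
```

With the default `beta = 1` the same measure reduces to the usual F1 score.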
mlr3 0.11.0
- Added an `as.data.table.Resampling` method.
- Renamed column `"row_id"` to `"row_ids"` in the `as.data.table()` methods for `PredictionClassif` and `PredictionRegr` (#547).
- Added converters `as_prediction_classif()` and `as_prediction_regr()` to reverse the operation of `as.data.table.PredictionClassif()` and `as.data.table.PredictionRegr()`.
- Specifying a weight column during `learner$predict_newdata()` is not mandatory anymore (#563).
- `Task$data()` defaults to return only active rows and columns, instead of asserting that only active rows and columns are requested. As a result, the `$data()` method can now also be used to query inactive rows and cols from the `DataBackend`.
- New (experimental) column role `uri` which is intended to point to external resources, e.g. images on the file system.
- New helper `set_threads()` to control the number of threads during calls to external packages. All objects will be migrated to have threading disabled in their defaults to avoid conflicting parallelization techniques (#605).
- New option `mlr3.debug`: avoid calls to `future` in `resample()` and `benchmark()` to improve the readability of tracebacks.
- New experimental option `mlr3.allow_utf8_names`: allow non-ascii characters in column names in tasks.
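
The round trip between a prediction object and its tabular form might look like the following sketch. Task and learner are arbitrary choices for illustration; `data.table::as.data.table()` is called with its namespace so the example does not depend on which packages are attached.

```r
library(mlr3)

task = tsk("iris")
learner = lrn("classif.rpart")
learner$train(task)
prediction = learner$predict(task)

# Convert to a data.table; the row id column is named "row_ids"
tab = data.table::as.data.table(prediction)
print(names(tab))

# Convert the table back into a PredictionClassif object
restored = as_prediction_classif(tab)
```

This is handy when predictions are post-processed or stored as plain tables and need to be scored later.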
mlr3 0.10.0
- Result containers `ResampleResult` and `BenchmarkResult` now optionally remove the DataBackend of the Tasks in order to reduce file size and memory footprint after serialization. To remove the backends from the containers, set `store_backends` to `FALSE` in `resample()` or `benchmark()`, respectively. Note that this behaviour will eventually be the default in future releases.
- Prediction objects generated by `Learner$predict_newdata()` now have row ids starting from 1 instead of auto incrementing row ids of the training task.
- `as.data.table.DictionaryTasks` now returns an additional column `properties`.
- Added flag `conditions` to `ResampleResult$score()` and `BenchmarkResult$score()` to make it more convenient to work with failing learners.
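
The row id behaviour of `predict_newdata()` can be checked with a minimal sketch. The split into odd and even rows is an arbitrary choice for illustration.

```r
library(mlr3)

task = tsk("iris")
train_ids = seq(1, task$nrow, by = 2)   # odd rows for training
new_ids = seq(2, task$nrow, by = 2)     # even rows play the role of new data

learner = lrn("classif.rpart")
learner$train(task, row_ids = train_ids)

# Hand the unseen rows over as a plain data.frame
newdata = as.data.frame(task$data(rows = new_ids))
prediction = learner$predict_newdata(newdata)

# Row ids refer to positions in `newdata`, not to ids of the training task
print(head(prediction$row_ids))
```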
mlr3 0.9.0
- New methods for `Task`: `$set_col_roles` and `$set_row_roles` as a replacement for the deprecated and less flexible `$set_col_role` and `$set_row_role`.
- Learners can now have a timeout (#556).
- Removed S3 method `friedman.test.BenchmarkResult()` in favor of the new `mlr3benchmark` package.
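
A minimal sketch of `$set_col_roles()`, assuming the `remove_from` argument as found in recent mlr3 versions; the task and column are picked only for illustration.

```r
library(mlr3)

task = tsk("iris")

# Remove "Petal.Width" from the feature role; the column stays in the backend
task$set_col_roles("Petal.Width", remove_from = "feature")
print(task$feature_names)
```

Because only the role changes, the column can be reactivated later without rebuilding the task.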
mlr3 0.8.0
- `MeasureOOBError` now has property `minimize` set to `TRUE`.
- New learner property `"featureless"` to tag learners which can operate on featureless tasks.
- Fixed `ResampleResult` ignoring argument `predict_sets` for returned `Prediction` objects.
- Compatibility with new version of `lgr`.
mlr3 0.7.0
- Updated properties of featureless learners to apply it on all feature types (did not work on POSIXct columns).
- Fixed measures being calculated as `NaN` for `BenchmarkResult` for resamplings with a single iteration (#551).
- Fixed a bug where a broken heuristic disabled nested parallelization via package `future` (mlr3tuning#270).
- `ResampleResult` and `BenchmarkResult` now share a common interface to store the experiment results. Manual construction is still possible with helper function `as_result_data()`.
- Fixed deep cloning of `ResamplingCV` and `ResamplingRepeatedCV`.
- New measure `classif.prauc` (area under precision-recall curve).
- Removed dependency on orphaned package `bibtex`.
mlr3 0.6.0
- Compact in-memory representation of R6 objects to save space when saving objects via `saveRDS()` or `serialize()`.
- Objects in containers like `ResampleResult` or `BenchmarkResult` are now de-duplicated for an optimized serialization.
- Fixed data set `breast_cancer`: all factor features are now correctly stored as ordered factors.
- Added a new utility function `convert_task()`.
mlr3 0.5.0
- Added classification task `breast_cancer`.
- Added `ResamplingLOO` for leave-one-out resampling.
- Regression now supports predict type `"distr"` using the `distr6` package.
- Fixed `ResamplingBootstrap` in combination with grouping (#514).
- Fixed plot method of `TaskGeneratorMoons`.
- Added hyperparameter `keep_model` to learners `"classif.rpart"` and `"regr.rpart"`.
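
Leave-one-out resampling can be sketched as follows. The dictionary key `"loo"` and the task are assumptions of this example rather than details stated in the notes above.

```r
library(mlr3)

task = tsk("iris")
resampling = rsmp("loo")
resampling$instantiate(task)

# One iteration per observation; each test set holds a single row id
print(resampling$iters)
print(resampling$test_set(1))
```

Since the number of iterations equals the number of rows, this scheme is mostly practical for small tasks.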