
Feature selection step using background level
Source: R/select_background.R
Select features that exceed a background level in at least a defined number of samples.
Arguments
- recipe
A recipe object. The step will be added to the sequence of operations for this recipe.
- ...
One or more selector functions to choose variables for this step. See recipes::selections() for more details.
- role
Not used by this step since no new variables are created.
- trained
A logical to indicate if the quantities for preprocessing have been estimated.
- background_level
Background level to exceed.
- n_samples, prop_samples
Count or proportion of samples in which a feature must exceed background_level to be retained.
- res
This parameter is only produced after the recipe has been trained.
- skip
A logical. Should the step be skipped when the recipe is baked by recipes::bake()? While all operations are baked when recipes::prep() is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using skip = TRUE as it may affect the computations for subsequent operations.
- id
A character string that is unique to this step to identify it.
- x
A step_select_background object.
Value
An updated version of recipe with the new step added to the sequence of any existing operations.
Examples
rec <-
  iris %>%
  recipe(formula = Species ~ .) %>%
  step_select_background(all_numeric_predictors(),
    background_level = 4, prop_samples = 0.5
  ) %>%
  prep()
rec
#>
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#>
#> ── Inputs
#> Number of variables by role
#> outcome: 1
#> predictor: 4
#>
#> ── Training information
#> Training data contained 150 data points and no incomplete rows.
#>
#> ── Operations
#> • Background filtering on: Sepal.Length Sepal.Width, ... | Trained
tidy(rec, 1)
#> # A tibble: 4 × 3
#> terms kept id
#> <chr> <lgl> <chr>
#> 1 Sepal.Length TRUE select_background_fdVHI
#> 2 Sepal.Width FALSE select_background_fdVHI
#> 3 Petal.Length TRUE select_background_fdVHI
#> 4 Petal.Width FALSE select_background_fdVHI
bake(rec, new_data = NULL)
#> # A tibble: 150 × 3
#> Sepal.Length Petal.Length Species
#> <dbl> <dbl> <fct>
#> 1 5.1 1.4 setosa
#> 2 4.9 1.4 setosa
#> 3 4.7 1.3 setosa
#> 4 4.6 1.5 setosa
#> 5 5 1.4 setosa
#> 6 5.4 1.7 setosa
#> 7 4.6 1.4 setosa
#> 8 5 1.5 setosa
#> 9 4.4 1.4 setosa
#> 10 4.9 1.5 setosa
#> # ℹ 140 more rows
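
The selection rule in the example above can be approximated in base R as a sanity check. This is only a sketch, not the step's actual implementation: it assumes that "exceed" means strictly greater than background_level and that a feature is kept when the proportion of such samples is at least prop_samples. The names predictors, prop_above and kept are illustrative and not part of the package.

```r
# Base-R sketch of the selection rule (an approximation, not the package's code).
background_level <- 4
prop_samples <- 0.5

# Numeric predictors of iris, i.e. everything except the outcome.
predictors <- iris[, setdiff(names(iris), "Species")]

# Proportion of samples in which each feature is strictly above the background level.
# Assumption: "exceed" is a strict comparison (>).
prop_above <- vapply(
  predictors,
  function(x) mean(x > background_level),
  numeric(1)
)

# Keep features whose proportion reaches the required threshold.
# Assumption: the threshold itself counts as sufficient (>=).
kept <- prop_above >= prop_samples
kept
```

On iris, this keeps Sepal.Length and Petal.Length and drops Sepal.Width and Petal.Width, which agrees with the kept column returned by tidy(rec, 1).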