filter_scoped_df
subsets rows of a data frame based on its grouping structure (see group_by). Filtering statements are provided in a separate tibble, where each row pairs a logical expression with a list of groups to which the expression should be applied. Groups are identified by their indices (see cur_group_id).
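As a brief illustration of the group indices mentioned above, the following sketch lists the index that dplyr's cur_group_id() reports for each group of iris when grouped by Species. The object name group_ids is only illustrative and is not part of datacleanr.

```r
library(dplyr)

# One row per group, with the index reported by cur_group_id()
group_ids <- iris %>%
  group_by(Species) %>%
  summarise(group_index = cur_group_id(), n_rows = n())

print(group_ids)
```

These indices are the kind of values that would go into the scoping list-column of condition_df.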
filter_scoped_df(dframe, condition_df)
dframe: A grouped or ungrouped tibble or data.frame.
condition_df: A tibble with two columns: condition_df[, 1] holds character strings that evaluate to valid logical expressions applicable in subset or filter, and condition_df[, 2] is a list-column with group scoping levels (numeric), or NULL for unscoped filtering. If all groups are given for a statement, the operation is the same as calling filter on a grouped data.frame.
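To make the idea of scoping more concrete, here is a rough sketch in plain dplyr of what a single scoped statement might amount to. It is not the package's implementation, and it rests on one assumption that the text above does not spell out: rows in groups outside the scope are left untouched. The names statement, scope, stmt_expr, .idx and scoped are made up for this illustration.

```r
library(dplyr)

statement <- "Petal.Length > quantile(Petal.Length, 0.8)"
scope <- c(1, 2)  # group indices the statement should apply to

# Turn the character string into an expression that filter() can evaluate
stmt_expr <- rlang::parse_expr(statement)

scoped <- iris %>%
  group_by(Species) %>%
  mutate(.idx = cur_group_id()) %>%
  # ASSUMPTION: groups outside `scope` are kept as they are;
  # inside the scope, the statement is evaluated per group
  filter(!(.idx %in% scope) | (!!stmt_expr))

# Rows remaining per group
count(scoped, Species)
```

Because the data frame is grouped, quantile() is computed within each group, which is why scoping by group can give a different result from unscoped filtering.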
An object of the same type as dframe. The output is a subset of the input, with groups and rows appearing in the same order, and an additional column .dcrindex representing the group indices. The output may have fewer groups than the input, depending on subsetting.
This function is applied in the "Filtering" tab of the datacleanr app, and it appears in the reproducible code recipe in the "Extract" tab.
Note that multiple checks for valid statements are performed in the app (and only valid operations are printed in the "Extract" tab). It is therefore not advisable to manually alter this code or to use this function interactively.
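For readers who want a feel for what such a validity check could look like, the sketch below tests whether a statement string can be used in filter on a given data frame. The helper is_valid_statement is hypothetical and is not the check performed by the datacleanr app; it simply relies on filter raising an error for unknown columns, unparsable strings, or non-logical results.

```r
library(dplyr)

# Hypothetical helper: TRUE if `statement` parses and filter() accepts it on `df`
is_valid_statement <- function(df, statement) {
  result <- tryCatch(
    dplyr::filter(df, !!rlang::parse_expr(statement)),
    error = function(e) NULL
  )
  !is.null(result)
}

ok  <- is_valid_statement(iris, "Sepal.Width > 3")   # existing column
bad <- is_valid_statement(iris, "Spec == 'setosa'")  # column 'Spec' does not exist
```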
# set-up condition_df
cdf <- dplyr::tibble(
  statement = c(
    "Sepal.Width > quantile(Sepal.Width, 0.1)",
    "Petal.Width > quantile(Petal.Width, 0.1)",
    "Petal.Length > quantile(Petal.Length, 0.8)"
  ),
  # NULL: unscoped filtering; c(1, 2): scope the statement to groups 1 and 2
  scope_at = list(NULL, NULL, c(1, 2))
)

# apply the statements to iris grouped by Species
fdf <- filter_scoped_df(
  dplyr::group_by(iris, Species),
  condition_df = cdf
)
# Example of invalid expression:
# column 'Spec' does not exist in iris
# "Spec == 'setosa'"