This function matches two data frames by their unique identifier (usually "time" or "datetime" in a time series) and returns a new data frame containing only the rows that have changed compared to the previous data. It does not return any new rows.

Usage

catch(df_current, df_previous, datetime_variable)

Arguments

df_current

data.frame, the newest/current version of dataset x.

df_previous

data.frame, the old version of the dataset, for example x - t1.

datetime_variable

character, the name of the variable to use as the unique ID when joining df_current and df_previous. Usually a "datetime" variable.

Value

A data frame containing only the rows of df_current that have changed from df_previous, excluding new rows. Also returns a waldo object, as in loupe().

Details

The underlying functionality is handled by create_object_list().
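
The matching idea described above can be illustrated with a minimal base R sketch. This is not the package's implementation and does not show how create_object_list() works internally. The two data frames are toy data invented for illustration, and the intermediate names (in_both, previous_count, changed) are hypothetical. It assumes a single value column named count and an ID column named time.

```r
# Toy data, invented for illustration (not the butterflycount dataset)
df_previous <- data.frame(
  time  = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03")),
  count = c(10, 20, 30)
)
df_current <- data.frame(
  time  = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04")),
  count = c(10, 25, 30, 40)  # 2024-01-02 was revised; 2024-01-04 is a new row
)

# Keep only rows of df_current whose ID also exists in df_previous,
# which drops the new row
in_both <- df_current[df_current$time %in% df_previous$time, ]

# Look up the previous value for each matched ID
previous_count <- df_previous$count[match(in_both$time, df_previous$time)]

# Keep only the matched rows whose value differs from the previous version
changed <- in_both[in_both$count != previous_count, ]
changed
```

Here changed holds only the revised 2024-01-02 row: the new 2024-01-04 row is excluded because its ID has no match in df_previous, and the unchanged rows are dropped by the comparison.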

Examples

# Returning only matched rows which contain changes
df_caught <- butterfly::catch(
  butterflycount$march, # This is your new or current dataset
  butterflycount$february, # This is the previous version you are comparing it to
  datetime_variable = "time" # This is the unique ID variable they have in common
)
#> The following rows are new in 'df_current': 
#>         time count
#> 1 2024-03-01    23
#> 
#>  The following values have changes from the previous data.
#> old vs new
#>            count
#>   old[1, ]    17
#>   old[2, ]    22
#>   old[3, ]    55
#> - old[4, ]    18
#> + new[4, ]    11
#> 
#> `old$count`: 17 22 55 18
#> `new$count`: 17 22 55 11
#> 
#>  Only these rows are returned.

df_caught
#>         time count
#> 1 2023-11-01    18