I need to extract a year corresponding to a time period and put it into a new column in a data frame. The tricky part is that the years I need to extract are not calendar years. My 'standardized' years start the 1st of July of a given calendar year and end up the 30th of June of the subsequent calendar year. So, if an event occurred anytime within this time period, the standardized year is the first calendar year in this time period involving two calendar years. For example, if events occurred the 1st of July 2019, the 25th of December 2019, and the 30th of June 2020, the 'standardized' year for all these three events is 2019 (i.e. the first year within the time period between 1st of Jul 2019 and 30th June 2020). How can I extract such standardized years in R and assigned it to a new column in the data frame?
My data file is very large, but as a simplified example, here are some events happening in specific dates
dat <- as.Date(c("2-Feb-18", "24-May-10", "30-Dec-19","1-Jul-20"),"%d-%b-%y")
dat <- as.data.frame(dat)
names(dat)[1] <- "events"
dat
events
1 2018-02-02
2 2010-05-24
3 2019-12-30
4 2020-07-01
In this case the column I want to create 'standardized_year' should look like this
events standardized_year
1 2018-02-02 2017
2 2010-05-24 2009
3 2019-12-30 2019
4 2020-07-01 2020
In the first row, the standardized year is 2017 because 2-Feb-18 is within 1st of Jul 2017 and 30th Jun 2018, so the first year is extracted within the 'standardized' year. The same criteria for all other values.
Is there a way to do this automatically in R for a large amount of events in a dataframe?
Any help would be much appreciated. Thanks