I have seen examples of using lapply across columns, but not one that considers (1) a timestamp (2) groups based on timestamp (3) detects when values change
I'm looking for a way to do the follow for an arbitrary number of Panels per Sensor (there could be Panel 3, Panel 4, etc.):
- for each
year,month,hour, I'm looking for thesumandCounts Turn Onwhich is a count of the # times a value changes from 0 to a non zero number. To simplify, non-zero values at the start of thehourshould not be counted towards this value (even if the previous value is 0).
take the df:
cols <- c("Timestamp","1000 Sensor 2 Panel 1","1000 Sensor 2 Panel 2")
tstmp <- seq(as.POSIXct("2018-08-13 00:00:00", tz="US/Eastern"),
as.POSIXct("2018-08-13 03:30:00", tz="US/Eastern"),
by="15 min") %>% as.data.frame()
stage1 <- c(rep(c(0,.7,1),5)) %>% as.data.frame()
stage2 <- c(0,1,rep(c(0,.5),5),0,1,1) %>% as.data.frame()
df = cbind(tstmp,stage1,stage2)
colnames(df) = cols
I'd like the result to be results_1:
ID Year Month Hour Sum Count Turn On
1000 Sensor 2 Panel 1 2018 8 0 1.7 1
1000 Sensor 2 Panel 1 2018 8 1 2.4 1
1000 Sensor 2 Panel 1 2018 8 2 2.7 1
1000 Sensor 2 Panel 1 2018 8 3 1.7 1
1000 Sensor 2 Panel 2 2018 8 0 1.5 2
1000 Sensor 2 Panel 2 2018 8 1 1 2
1000 Sensor 2 Panel 2 2018 8 2 1 2
1000 Sensor 2 Panel 2 2018 8 3 2 1
For those more ambitious, I'd like to see a solution that is able to determine whether the last reading in the previous hour was 0 and the first reading in the next hour is non-zero, and is able to count that towards Count Turns On -- the solution would look like below in results_advanced:
ID Year Month Hour Sum Count Turn On
1000 Sensor 2 Panel 1 2018 8 0 1.7 1
1000 Sensor 2 Panel 1 2018 8 1 2.4 2
1000 Sensor 2 Panel 1 2018 8 2 2.7 1
1000 Sensor 2 Panel 1 2018 8 3 1.7 1
1000 Sensor 2 Panel 2 2018 8 0 1.5 2
1000 Sensor 2 Panel 2 2018 8 1 1 2
1000 Sensor 2 Panel 2 2018 8 2 1 2
1000 Sensor 2 Panel 2 2018 8 3 2 1
I'd like a solution for at least results_1, but would appreciate solutions for both results_1 and results_advanced. Please provide any detail you can as to your thought process and this will help me (and others) learn more.
I believe there are both data.table and dplyr solutions so I'll tag both.