My data looks like this:
df1 <- data.frame(
  Date = as.Date(c('8/27/2001', '8/27/2001', '8/27/2001', '11/13/2001', '11/13/2001', '11/13/2001', '8/3/2012', '8/3/2012'), format = "%m/%d/%Y"),
  Name = c('Joe', 'Joe', 'Joe', 'Billy', 'Billy', 'Billy', 'Emma', 'Emma'),
  Sample = c('Pre', 'Post', 'Discard', 'Pre', 'Post', 'Discard', 'Bone', 'Pre'),
  Cells = c(15, 7, 3, 12, 5, 2, 14, NA)
)
        Date  Name  Sample Cells
1 2001-08-27   Joe     Pre    15
2 2001-08-27   Joe    Post     7
3 2001-08-27   Joe Discard     3
4 2001-11-13 Billy     Pre    12
5 2001-11-13 Billy    Post     5
6 2001-11-13 Billy Discard     2
7 2012-08-03  Emma    Bone    14
8 2012-08-03  Emma     Pre    NA
I would like to add a calculated column called "Yield", computed within each unique combination of Date and Name (rows 1-3, 4-6, and 7-8 are three distinct groups). Real data can be incomplete: a group may lack a "Post" row or have a missing Cells value (see rows 7-8).
Within each group, the "Yield" column should be:
Cells where Sample == "Post" divided by Cells where Sample == "Pre", reported on the "Post" row and NA on every other row.
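To make the rule concrete, here is a minimal sketch of the calculation for a single group. The helper name `group_yield` is hypothetical (it is not part of any package), and the sketch assumes each group has at most one "Pre" row and at most one "Post" row:

```r
# Hypothetical helper: yield for ONE Date/Name group.
# Assumes at most one "Pre" and at most one "Post" row per group.
group_yield <- function(sample, cells) {
  post <- cells[sample == "Post"]
  pre  <- cells[sample == "Pre"]
  # An incomplete group (missing Pre or Post row) has no defined yield.
  if (length(post) != 1 || length(pre) != 1) {
    return(NA_real_)
  }
  post / pre  # also NA if either count is NA
}

# Joe's group: 7 / 15
group_yield(c("Pre", "Post", "Discard"), c(15, 7, 3))

# Emma's group: no "Post" row, so the result is NA
group_yield(c("Bone", "Pre"), c(14, NA))
```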
Desired output (Yield rounded to two decimals):
        Date  Name  Sample Cells Yield
1 2001-08-27   Joe     Pre    15    NA
2 2001-08-27   Joe    Post     7  0.47
3 2001-08-27   Joe Discard     3    NA
4 2001-11-13 Billy     Pre    12    NA
5 2001-11-13 Billy    Post     5  0.42
6 2001-11-13 Billy Discard     2    NA
7 2012-08-03  Emma    Bone    14    NA
8 2012-08-03  Emma     Pre    NA    NA
I am new to R, and would like to use it efficiently (e.g. with dplyr). The above can be done through loops, but I am looking for a more elegant solution. I've consulted the following threads for guidance, but so far haven't found a solution:
Assign value to group based on condition in column
R create column from another column, depending on row
Conditional calculation in R based on Row values and categories
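For reference, the loop-based approach mentioned above might look roughly like the sketch below. It uses base R only and assumes each Date/Name group has at most one "Pre" and at most one "Post" row. This is the version I would like to replace with something more idiomatic:

```r
# Rebuild the example data so the sketch is self-contained.
df1 <- data.frame(
  Date = as.Date(c('8/27/2001', '8/27/2001', '8/27/2001', '11/13/2001',
                   '11/13/2001', '11/13/2001', '8/3/2012', '8/3/2012'),
                 format = "%m/%d/%Y"),
  Name = c('Joe', 'Joe', 'Joe', 'Billy', 'Billy', 'Billy', 'Emma', 'Emma'),
  Sample = c('Pre', 'Post', 'Discard', 'Pre', 'Post', 'Discard', 'Bone', 'Pre'),
  Cells = c(15, 7, 3, 12, 5, 2, 14, NA)
)

# Start with NA everywhere; only "Post" rows of complete groups get a value.
df1$Yield <- NA_real_

# One row per unique Date/Name combination.
groups <- unique(df1[, c("Date", "Name")])

for (i in seq_len(nrow(groups))) {
  in_group  <- df1$Date == groups$Date[i] & df1$Name == groups$Name[i]
  pre       <- df1$Cells[in_group & df1$Sample == "Pre"]
  post_rows <- which(in_group & df1$Sample == "Post")

  # Assumption: skip groups that do not have exactly one Pre and one Post row.
  if (length(pre) == 1 && length(post_rows) == 1) {
    df1$Yield[post_rows] <- df1$Cells[post_rows] / pre
  }
}

df1
```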