I'm trying to create a long tibble with a date sequence. I tried to adapt the example below. It works on the example data, but when I apply it to my own data it fails with: Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument. I can't figure out why the code throws an error on my tibble. All help is much appreciated.

This example works:

library(tidyverse)

example <- structure(list(idnum = c(17L, 17L, 17L), start = structure(c(8401, 
8401, 8401), class = "Date"), end = structure(c(8765, 8765, 8765
), class = "Date")), class = "data.frame", .Names = c("idnum", 
"start", "end"), row.names = c(NA, -3L)) 

example %>%
  as_tibble() %>%
  nest(start, end) %>%
  mutate(data = map(data, ~seq(unique(.x$start), unique(.x$end), 1))) %>%
  unnest(data)

That's kind of what I'm looking for.

The same code on my own data gives an error message.

df <- structure(list(nieuw = c("Nieuw", "Nieuw", "Nieuw"), jaar = c(NA, 
2013, 2014), aow_jaar = c("65", "65", "65"), aow_maanden = c(NA, 
"1", "2"), vanaf = structure(c(-8036, -8036, -7701), class = "Date"), 
    tot_en_met = structure(c(-8037, -7702, -7367), class = "Date")), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -3L))

df %>%
  nest(vanaf, tot_en_met) %>%
  mutate(data = map(data, ~seq(unique(.x$vanaf), unique(.x$tot_en_met), 1))) %>%
  unnest(data)

Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument

The error message says it has to do with the by argument, but I can't understand why it's not working.

Tdebeus

1 Answer

The issue is that in one of the rows (the 1st) the end date is earlier than the start date, so seq cannot step from start to end with a positive by. One option is to take the min and max of each pair and then call seq:

library(dplyr)
library(purrr)
df %>% 
   mutate(out = map2(vanaf, tot_en_met, 
          ~ seq(min(.x, .y), max(.x, .y), by = 1))) # %>%
   # unnest # if needed
# A tibble: 3 x 7
#  nieuw  jaar aow_jaar aow_maanden vanaf      tot_en_met out         
#  <chr> <dbl> <chr>    <chr>       <date>     <date>     <list>      
#1 Nieuw    NA 65       <NA>        1948-01-01 1947-12-31 <date [2]>  
#2 Nieuw  2013 65       1           1948-01-01 1948-11-30 <date [335]>
#3 Nieuw  2014 65       2           1948-12-01 1949-10-31 <date [335]>
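
To see which rows trigger the error before working around them, a minimal base R sketch is shown below. It assumes the same three date pairs as in the question's df; the names bad_rows and res are only illustrative and do not come from the original post.

```r
# Same start/end dates as the question's df (vanaf = start, tot_en_met = end)
vanaf      <- as.Date(c("1948-01-01", "1948-01-01", "1948-12-01"))
tot_en_met <- as.Date(c("1947-12-31", "1948-11-30", "1949-10-31"))

# Indices of rows where the end date precedes the start date
bad_rows <- which(tot_en_met < vanaf)
bad_rows

# seq() with a positive step cannot count down, so the first pair alone
# reproduces the failure; tryCatch captures the message instead of stopping
res <- tryCatch(
  seq(vanaf[1], tot_en_met[1], by = 1),
  error = function(e) conditionMessage(e)
)
res
```

Whether such rows should be swapped (as above) or treated as data errors and dropped depends on what the dates mean in your data.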

Alternatively, instead of calling min/max on each row, we can do this in a vectorized way with pmin/pmax:

df %>%
   mutate(out = map2(pmin(vanaf, tot_en_met), 
                     pmax(vanaf, tot_en_met), seq, by = 1))
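
Since the question asks for a long tibble, the list column can then be unnested. The following is a sketch under the assumption that tidyr 1.0 or later is installed (for the cols argument of unnest); it rebuilds only the two date columns of df so that it runs on its own, and the name long is illustrative.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Only the two date columns from the question's df, rebuilt for a standalone run
df <- tibble(
  vanaf      = as.Date(c("1948-01-01", "1948-01-01", "1948-12-01")),
  tot_en_met = as.Date(c("1947-12-31", "1948-11-30", "1949-10-31"))
)

# Build one date sequence per row, then expand to one row per date
long <- df %>%
  mutate(out = map2(pmin(vanaf, tot_en_met),
                    pmax(vanaf, tot_en_met), seq, by = 1)) %>%
  unnest(cols = out)

nrow(long)
```

The row count should equal the sum of the lengths of the per-row sequences shown in the list column above.
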
akrun
  • @Tdebeus I think the error message is not very informative, and without checking the data the cause is hard to pinpoint. – akrun Aug 08 '19 at 15:18