Generate random date after a date

I have a dataset like this:

library(dplyr)

set.seed(123)
date_entry <- sample(seq(as.Date('2000-01-01'), as.Date('2010-01-01'), by = "day"), 1000)
df <- data.frame(date_entry)
df <- df %>% mutate(id = row_number())

I want to generate a random date_end column for each id such that date_end is later than date_entry. For instance, for these dates, I want date_end to be later than 2006 for id = 1:3 and 2002 for id = 4.

    date_entry  id
1   2006-09-28   1
2   2006-11-15   2
3   2006-02-04   3
4   2001-06-09   4
5   2000-07-13   5


Solution 1:[1]

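Create a daily sequence from date_entry to today's date (i.e., Sys.Date()), then sample one date from it as date_end. Because the sequence starts at date_entry itself, date_end can occasionally equal date_entry; start the sequence at date_entry + 1 if it must be strictly later.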

library(tidyverse)

df %>% 
  rowwise() %>% 
  mutate(date_end = sample(seq(date_entry, Sys.Date(), by = "day"), 1)) %>% 
  ungroup()  # drop the rowwise grouping so later steps work on the whole column

Output

   date_entry    id date_end  
   <date>     <int> <date>    
 1 2006-09-28     1 2016-01-08
 2 2006-11-15     2 2019-04-27
 3 2006-02-04     3 2016-02-17
 4 2001-06-09     4 2012-12-26
 5 2000-07-13     5 2008-11-12
 6 2008-03-04     6 2011-12-27
 7 2005-01-15     7 2015-01-04
 8 2003-02-15     8 2020-07-28
 9 2009-03-24     9 2014-11-01
10 2003-06-06    10 2004-03-22
# … with 990 more rows
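
Building a full daily sequence per row can get slow on larger data. As a rough sketch (not part of the original answer), the same uniform draw can be done in vectorised base R by sampling a day offset instead of a date. The helper name random_date_after and its upper argument are made up here for illustration, and the fixed upper date is only there to keep the example reproducible.

```r
# Hypothetical helper: one random date strictly after each entry,
# and no later than `upper`.
random_date_after <- function(date_entry, upper = Sys.Date()) {
  span <- as.integer(upper - date_entry)  # whole days between entry and upper
  stopifnot(all(span >= 1))               # each entry needs at least one later day
  # floor(runif * span) is uniform on 0..(span - 1); adding 1 shifts it to 1..span
  date_entry + floor(runif(length(date_entry)) * span) + 1
}

set.seed(123)
entries <- as.Date(c("2006-09-28", "2006-11-15", "2006-02-04", "2001-06-09"))
ends <- random_date_after(entries, upper = as.Date("2022-01-01"))
data.frame(date_entry = entries, date_end = ends)
```

The offset starts at 1, so date_end is always at least one day after date_entry.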

Solution 2:[2]

Pick a random number of days to add to each date_entry. Here I sample uniformly from 1 to 100,000 days (roughly 274 years, hence the far-future dates in the output); choose whatever range or distribution suits you.

df %>%
  mutate(date_end = date_entry + sample(1:1e5, size = n(), replace = TRUE))
#     date_entry  id   date_end
# 1   2006-09-28   1 2104-02-13
# 2   2006-11-15   2 2199-06-24
# 3   2006-02-04   3 2042-08-30
# 4   2001-06-09   4 2153-04-10
# 5   2000-07-13   5 2140-04-28
# 6   2008-03-04   6 2106-07-06
# 7   2005-01-15   7 2169-06-14
# ...
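
To illustrate the "whatever range or distribution" remark, here is a small base R sketch with two alternatives: a tighter uniform range and a skewed draw that favours short gaps. The 3650-day cap and the prob = 1/365 setting are arbitrary assumptions chosen for the example, not values from the question.

```r
set.seed(1)
entries <- as.Date(c("2006-09-28", "2006-11-15", "2006-02-04", "2001-06-09"))
n <- length(entries)

# Uniform: between 1 and 3650 days (about ten years) after each entry
end_uniform <- entries + sample(1:3650, size = n, replace = TRUE)

# Skewed: geometric draws start at 0, so add 1 to stay strictly after the entry;
# prob = 1/365 puts the average gap at roughly one year
end_skewed <- entries + rgeom(n, prob = 1 / 365) + 1

data.frame(date_entry = entries, end_uniform, end_skewed)
```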

If you want date_end to fall no earlier than the following year (maybe somewhat implied in your question?), round date_entry up to the next January 1 before adding random days:

df %>%
  mutate(date_end = 
    lubridate::ceiling_date(date_entry, unit = "year") + 
      sample(0:1e5, size = n(), replace = TRUE)
  )

Solution 3:[3]

In a function f we may use as.POSIXlt, whose year element counts years since 1900, so year + 1901 gives the calendar year after date_entry. ISOdate builds January 1st of that year; after converting with as.Date we add a random integer from zero up to a defined dmax (default 3652 days, about ten years), so the resulting random date is no earlier than January 1st of the following year.

f <- \(x, dmax=3652) with(as.POSIXlt(x), as.Date(ISOdate(year + 1901, 1, 1)) + 
                            sample(0:dmax, length(x), replace=TRUE))

set.seed(42)
transform(dat, date_end=f(date_entry))
#   date_entry id   date_end
# 1 2006-09-28  1 2014-02-21
# 2 2006-11-15  2 2013-06-26
# 3 2006-02-04  3 2010-03-22
# 4 2001-06-09  4 2005-01-02
# 5 2000-07-13  5 2004-06-05
# 6 2008-03-04  6 2017-10-23

Data:

dat <- structure(list(date_entry = structure(c(13419, 13467, 13183, 
11482, 11151, 13942), class = "Date"), id = 1:6), class = "data.frame", row.names = c(NA, 
-6L))
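
The year offset used in f can be checked on a single date. This is only a small sanity-check sketch; the variable names x, lt and next_jan1 are made up for the example.

```r
x  <- as.Date("2006-09-28")
lt <- as.POSIXlt(x)

lt$year          # years since 1900, here 106
lt$year + 1900   # the calendar year, 2006

# January 1st of the following year, the earliest date f can return
next_jan1 <- as.Date(ISOdate(lt$year + 1901, 1, 1))
next_jan1        # "2007-01-01"
```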

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   AndrewGB
Solution 2   Gregor Thomas
Solution 3   jay.sf