R - Fill missing date values in a column by adding a delivery interval to another date column
Data:
db1 <- data.frame(
  orderitemid  = 1:10,
  orderdate    = c("2013-01-21", "2013-03-31", "2013-04-12", "2013-06-01", "2014-01-01",
                   "2014-02-19", "2014-02-27", "2014-10-02", "2014-10-31", "2014-11-21"),
  deliverydate = c("2013-01-23", "2013-03-01", "na", "2013-06-04", "2014-01-03",
                   "na", "2014-02-28", "2014-10-04", "2014-11-01", "2014-11-23"))
Expected outcome:
db1 <- data.frame(
  orderitemid  = 1:10,
  orderdate    = c("2013-01-21", "2013-03-31", "2013-04-12", "2013-06-01", "2014-01-01",
                   "2014-02-19", "2014-02-27", "2014-10-02", "2014-10-31", "2014-11-21"),
  deliverydate = c("2013-01-23", "2013-03-01", "2013-04-14", "2013-06-04", "2014-01-03",
                   "2014-02-21", "2014-02-28", "2014-10-04", "2014-11-01", "2014-11-23"))
Hey guys, it's me again ;) Unfortunately, I (think I) have a pretty difficult question... As you can see above, I have some missing values in the delivery dates, and I want to replace them with another date. That date should be the order date of the specific item + the average delivery time in (full) days (in the example, 1.75 days, rounded to 2 days). The average delivery time is calculated as the average over the samples that do not contain missing values = (2 days + 1 day + 3 days + 2 days + 1 day + 2 days + 1 day + 2 days) / 8 = 1.75 days.
So in a first step the average delivery time needs to be calculated, and in a second step the order date + average delivery time (in full days) needs to be entered in place of the NAs.
I tried a little with [is.na(db1$deliverydate)], but unfortunately I have no idea how to solve the problem...
Hope you got the idea.
You want to do date arithmetic, and fill the NAs in the deliverydate column by adding a date interval of 2 days to the orderdate column. lubridate supplies convenience functions for time intervals, such as days(), weeks(), months(), years(), hours(), minutes() and seconds(), for exactly this purpose. But first, you have to parse the date strings into R date objects, using a format that matches them (the sample above is ISO, YYYY-MM-DD, not European DD.MM.YYYY).
Something like the following, using lubridate for the date arithmetic and dplyr for the dataframe manipulation:
library(dplyr)
library(lubridate)

# The sample data uses ISO dates (YYYY-MM-DD); entries such as "na" parse to NA
db1$orderdate    <- as.POSIXct(db1$orderdate,    format = "%Y-%m-%d", tz = "UTC")
db1$deliverydate <- as.POSIXct(db1$deliverydate, format = "%Y-%m-%d", tz = "UTC")

# Typical delivery time, ignoring rows with a missing delivery date
db1 %>%
  mutate(delivery_time = difftime(deliverydate, orderdate, units = "days")) %>%
  summarize(median(delivery_time, na.rm = TRUE))
# for the sample data above this is 2 days
delivery_days <- 2

# Fill only the missing delivery dates and keep all other rows unchanged
db1 <- db1 %>%
  mutate(deliverydate = if_else(is.na(deliverydate),
                                orderdate + days(delivery_days),
                                deliverydate))
# rows 3 and 6 are now filled in:
# orderitemid orderdate  deliverydate
# 3           2013-04-12 2013-04-14
# 6           2014-02-19 2014-02-21
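For comparison, the two steps described in the question (average delivery time first, then order date + full days) can also be sketched in base R without any packages. This is only a minimal sketch under some assumptions: the missing values are written as real NA rather than the string "na", the average is rounded with round(), and row 2's delivery date is taken to be "2013-04-01", because the question lists a 1-day interval for that row while the posted "2013-03-01" would give -30 days. The names delivery_time, avg_days, full_days and missing are just illustrative.

```r
# Sample data from the question.
# ASSUMPTION: row 2's delivery date is "2013-04-01" (1-day interval, as listed
# in the question); the posted value "2013-03-01" would give -30 days.
db1 <- data.frame(
  orderitemid  = 1:10,
  orderdate    = c("2013-01-21", "2013-03-31", "2013-04-12", "2013-06-01",
                   "2014-01-01", "2014-02-19", "2014-02-27", "2014-10-02",
                   "2014-10-31", "2014-11-21"),
  deliverydate = c("2013-01-23", "2013-04-01", NA, "2013-06-04",
                   "2014-01-03", NA, "2014-02-28", "2014-10-04",
                   "2014-11-01", "2014-11-23"),
  stringsAsFactors = FALSE
)

# Parse the ISO-format strings into Date objects
db1$orderdate    <- as.Date(db1$orderdate,    format = "%Y-%m-%d")
db1$deliverydate <- as.Date(db1$deliverydate, format = "%Y-%m-%d")

# Step 1: mean delivery time in days over the rows without missing values
delivery_time <- as.numeric(db1$deliverydate - db1$orderdate)
avg_days  <- mean(delivery_time, na.rm = TRUE)  # 14 / 8 = 1.75
full_days <- round(avg_days)                    # 2 full days

# Step 2: fill only the missing delivery dates, keeping all ten rows
missing <- is.na(db1$deliverydate)
db1$deliverydate[missing] <- db1$orderdate[missing] + full_days

# Show the rows that were filled in
print(db1[missing, ])
```

Adding a plain number to a Date adds that many days, so lubridate is not strictly needed here; the days() helper mainly makes the intent more explicit.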