if statement - How to use ifelse inside the for loop in R
I have to check the names of the variables in a data.frame: if a match is found, I need to replace the NA values in that variable with the median; otherwise, I replace the NAs with the mean.
The data.frame cyl_spec has 11 variables, and I have to replace NA as below:
- viscosity: impute median
- wax: impute median
- others: impute mean
I can do this by picking variables one at a time, but I am trying the following code:
attach(cyl_spec)
var <- colnames(cyl_spec)
for (val in var) {
  if (val == 'viscosity') {
    viscosity[is.na(viscosity == T)] <- median(viscosity, na.rm = T)
  } else if (val == 'wax') {
    wax[is.na(wax == T)] <- median(wax, na.rm = T)
  } else {
    val[is.na(val == T)] <- mean(val, na.rm = T)
  }
}
detach(cyl_spec)
Somehow the code is not doing this, and I am still getting the same number of NAs in the variable when I check with this command:
sum(is.na(cyl_spec$viscosity))
Also, when I run the code I get the following warning messages:
Warning messages:
1: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
2: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
3: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
4: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
5: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
6: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
7: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
8: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
9: In mean.default(val, na.rm = T) : argument is not numeric or logical: returning NA
Could you please help me find a solution for this? I am stuck! Thanks in advance!!
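You do not need a loop for this. Moreover, the usual way to test for NA values is is.na(var), not is.na(var == TRUE). Finally, if you want to avoid typing the name of the data frame, you need to use a function that evaluates inside it (like with or the dplyr functions). Here, after attach(), an assignment to viscosity creates a modified copy named viscosity in the global environment; the column inside cyl_spec is left unchanged, which is why the NA count does not move (the same applies to the other variable names). In the else branch, val is only the column name as a character string, so mean(val) returns NA with the warning shown above.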
cyl_spec$viscosity[is.na(cyl_spec$viscosity)] <- median(cyl_spec$viscosity, na.rm = TRUE)
cyl_spec$wax[is.na(cyl_spec$wax)] <- median(cyl_spec$wax, na.rm = TRUE)
# `val` is a placeholder: repeat this line for each remaining column name
cyl_spec$val[is.na(cyl_spec$val)] <- mean(cyl_spec$val, na.rm = TRUE)
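If a loop is still preferred, the sketch below shows one way to make the original approach work. The key change is indexing the data frame by name with cyl_spec[[val]], since val only holds the column name as a string. The small cyl_spec built here, the density column, and the median_vars vector are made up for illustration; the real data has 11 variables.

```r
# Hypothetical stand-in for cyl_spec (assumption: the real one has 11 variables).
cyl_spec <- data.frame(
  viscosity = c(10, NA, 30, 100),
  wax       = c(1, 2, NA, 9),
  density   = c(0.5, NA, 1.5, 1.0)   # invented "other" column
)

# Columns imputed with the median; every other numeric column gets the mean.
median_vars <- c("viscosity", "wax")

for (val in colnames(cyl_spec)) {
  x <- cyl_spec[[val]]               # fetch the column by its name
  if (!is.numeric(x)) next           # skip non-numeric columns

  if (val %in% median_vars) {
    fill <- median(x, na.rm = TRUE)
  } else {
    fill <- mean(x, na.rm = TRUE)
  }

  x[is.na(x)] <- fill
  cyl_spec[[val]] <- x               # write back into the data frame itself
}

sum(is.na(cyl_spec))                 # total NAs left in the data frame
```

Writing the result back with cyl_spec[[val]] <- x is what makes the change stick; without attach(), there is no second copy of the columns to modify by accident.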
If you need to deal with a data.frame with only 3 variables, I recommend you stick to the base-R solution. If, however, you are looking at a data frame with more variables and want to automate it, you can use dplyr::mutate_each. Here is an example with simulated data.
We create a data.frame with 7 variables and assign NA values.
library(dplyr)
set.seed(10)
df <- data.frame(n = runif(100), m = runif(100), d = runif(100), o = runif(100),
                 e = runif(100), f = runif(100), g = runif(100))
df <- mutate_each(df, funs(ifelse(. > .8, NA, .)))
head(df)
           n          m         d           o         e          f         g
1 0.50747820 0.34434350 0.2230884 0.347860110        NA         NA        NA
2 0.30676851 0.06132255 0.5358950 0.007992606 0.6855115         NA 0.7478783
3 0.42690767 0.36897981 0.6625291 0.401344915 0.6296311         NA 0.7225419
4 0.69310208 0.40759356        NA 0.588350693 0.7508252 0.29063776 0.5457709
5 0.08513597         NA 0.1491831          NA        NA 0.07203601 0.2641231
6 0.22543662         NA 0.6700994 0.708542599 0.3600703 0.55888842 0.3057243
Now, we apply to each variable a function that imputes the NA values with either the mean or the median:
df <- df %>%
  ## variables to be recoded with the mean; here, n and m
  mutate_each(funs(ifelse(is.na(.), mean(., na.rm = TRUE), .)), n, m) %>%
  ## variables to be recoded with the median; here, d, o, e, f, g
  mutate_each(funs(ifelse(is.na(.), median(., na.rm = TRUE), .)), d, o, e, f, g)
head(df)
           n          m         d           o         e          f         g
1 0.50747820 0.34434350 0.2230884 0.347860110 0.3602354 0.39956699 0.4499041
2 0.30676851 0.06132255 0.5358950 0.007992606 0.6855115 0.39956699 0.7478783
3 0.42690767 0.36897981 0.6625291 0.401344915 0.6296311 0.39956699 0.7225419
4 0.69310208 0.40759356 0.4407363 0.588350693 0.7508252 0.29063776 0.5457709
5 0.08513597 0.40892568 0.1491831 0.378731867 0.3602354 0.07203601 0.2641231
6 0.22543662 0.40892568 0.6700994 0.708542599 0.3600703 0.55888842 0.3057243
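Note that mutate_each() and funs() have been deprecated in later dplyr releases. As a rough equivalent, the same idea can be sketched with mutate() and across(), assuming dplyr 1.0.0 or newer is installed. The three-column df below is a cut-down version of the simulated data, chosen only to keep the example short.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, which provides across()

set.seed(10)
df <- data.frame(n = runif(100), m = runif(100), d = runif(100))

# Blank out every value above 0.8, as in the simulated data above.
df[df > 0.8] <- NA

df <- df %>%
  mutate(
    # columns imputed with the mean; here, n and m
    across(c(n, m), ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x)),
    # columns imputed with the median; here, d
    across(c(d), ~ ifelse(is.na(.x), median(.x, na.rm = TRUE), .x))
  )

head(df)
```

Each across() call takes a column selection and a function, so the split between mean-imputed and median-imputed columns stays visible in one place.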