R - Add selection criteria to read.table
Let's take the following simplified version of a dataset that I import using read.table:
    a <- as.data.frame(c("m", "m", "f", "f", "f"))
    b <- as.data.frame(c(25, 22, 33, 17, 18))
    df <- cbind(a, b)
    colnames(df) <- c("sex", "age")
In reality my dataset is extremely large, and I'm interested in only a small proportion of the data, i.e. the data concerning females aged 18 or under. In the example above, that would be just the last 2 observations.
My question is: can I import these observations directly, without importing the rest of the data and then using subset to refine the database? My computer's capacities are limited, so I have been using scan to import the data in chunks, but this is extremely time consuming.
Is there a better solution?
Some approaches that might work:
1 - Use a package such as ff, which can help with RAM issues.
2 - Use other tools/languages to clean the data before you load it into R.
3 - If the file is not too big (i.e., you can load it without crashing), you could save it to a .rdata file once and then always read from that file (instead of calling read.table):
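    # Save each txt file once...
    save.rdata = function(filepath, filebin) {
      dataset = read.table(filepath)
      # save() needs the output path passed as the named 'file' argument
      save(dataset, file = paste(filebin, ".rdata", sep = ""))
    }

    # Read the .rdata file (filebin is the full path, including ".rdata")
    get.dataset = function(filebin) {
      load(filebin)  # restores the object named 'dataset' inside this function
      return(dataset)
    }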
This is faster than reading the txt file, but I'm not sure if it applies to your case.
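If the file is too big to load even once, another option is to keep reading in chunks but filter each chunk as it comes in, so that only the matching rows stay in memory. Below is a minimal sketch of that idea in base R, not a definitive implementation. The helper name read.filtered, its keep and chunk.size arguments, and the temporary demo file are all made up for illustration. It assumes a whitespace-delimited file with a header line and no row-names column.

```r
# Hypothetical helper (not from any package): read a whitespace-delimited
# file chunk by chunk and keep only the rows for which keep(chunk) is TRUE.
read.filtered = function(filepath, keep, chunk.size = 10000, ...) {
  con = file(filepath, open = "r")
  on.exit(close(con))
  # Read the header line once to get the column names
  header = scan(text = readLines(con, n = 1), what = "", quiet = TRUE)
  kept = list()
  repeat {
    # read.table continues from the current position of the open connection.
    # It raises an error once no lines are left, which is treated as the end
    # of the file here (note: this also hides genuine parsing errors).
    chunk = tryCatch(
      read.table(con, nrows = chunk.size, col.names = header, ...),
      error = function(e) NULL
    )
    if (is.null(chunk) || nrow(chunk) == 0) break
    # which() drops rows where the condition evaluates to NA
    kept[[length(kept) + 1]] = chunk[which(keep(chunk)), , drop = FALSE]
    if (nrow(chunk) < chunk.size) break  # a short chunk means end of file
  }
  do.call(rbind, kept)
}

# Demo on the example data from the question, written to a temporary file
tmp = tempfile(fileext = ".txt")
write.table(data.frame(sex = c("m", "m", "f", "f", "f"),
                       age = c(25, 22, 33, 17, 18)),
            tmp, row.names = FALSE, quote = FALSE)

# Fixing colClasses keeps the column types identical across chunks
young.f = read.filtered(tmp,
                        function(d) d$sex == "f" & d$age <= 18,
                        chunk.size = 2,
                        colClasses = c("character", "numeric"))
print(young.f)
```

Passing colClasses is worth doing for any chunked read: without it, read.table guesses the types separately for every chunk, and the guesses can differ from one chunk to the next. The whole file is still scanned once, so this limits memory use rather than reading time.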