java - Does POI XSSF still have crazy bad memory issues?
A couple of years ago, I ran into issues creating large Excel files using jXLS and POI XSSF. If my memory is correct, XSSF would create 1GB+ temp files on disk in order to create 10MB Excel files. So I stopped using jXLS and instead used SXSSF to create the Excel files. Today I have new reasons to use jXLS or JETT.
Both the jXLS and JETT websites seem to allude that performance is much better now, but POI's XSSF website still says, generically, that XSSF requires a higher memory footprint. I am wondering whether this higher memory footprint is a reasonable 10% overhead these days, or whether it is still the 10,000% overhead it was a couple of years ago.
Are the crazy bad memory issues fixed in POI 3.9 XSSF? Should I not worry about using it with jXLS or JETT? Or are there gotchas to avoid? I am already careful about reusing cell styles.
To answer your question: yes, POI will use a very large amount of memory when working on large XLSX files, much larger than the size of the XLSX files themselves. I don't think this will change anytime soon, and there are pretty obvious reasons for that: an XLSX file is just a bunch of zipped XML files, and XML compresses very well (around 10x). Getting this XML to sit in memory uncompressed would already increase memory consumption tenfold, so once you add the overhead of the in-memory data structures, there is no way you should expect memory consumption to be only 10% above the XLSX file size.
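The compression argument can be checked with nothing but the JDK. The sketch below builds XML that is only loosely shaped like a worksheet part (the element layout is a simplification, and the class and method names are made up for this illustration), zips it, and prints both sizes. The exact ratio depends on the data, so treat the output as an order-of-magnitude indication rather than a measurement of real XLSX files.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class XlsxCompressionSketch {

    /** Builds XML loosely shaped like a worksheet part (simplified, not a valid sheet). */
    static byte[] buildSheetXml(int rows) {
        StringBuilder sb = new StringBuilder("<worksheet><sheetData>");
        for (int r = 1; r <= rows; r++) {
            sb.append("<row r=\"").append(r).append("\">");
            for (char col = 'A'; col <= 'E'; col++) {
                sb.append("<c r=\"").append(col).append(r).append("\"><v>")
                  .append(r * 31 % 1000).append("</v></c>");
            }
            sb.append("</row>");
        }
        sb.append("</sheetData></worksheet>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Zips the bytes into a single entry, the way an .xlsx container stores a sheet. */
    static byte[] zip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.putNextEntry(new ZipEntry("xl/worksheets/sheet1.xml"));
            zos.write(raw);
            zos.closeEntry();
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = buildSheetXml(50_000);
        byte[] zipped = zip(raw);
        System.out.printf("raw=%d bytes, zipped=%d bytes, ratio=%.1fx%n",
                raw.length, zipped.length, (double) raw.length / zipped.length);
    }
}
```

Whatever ratio this prints is only the cost of holding the raw XML; an object model such as XSSF adds per-cell objects on top of it.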
Now, the good news is that, as mentioned in the comments, Apache POI has introduced SXSSF for streaming large amounts of data into a spreadsheet with very good performance and low memory usage. XLSX files generated that way are still streamed to the hard disk, where they can end up taking quite a bit of space, but at least you don't risk an OutOfMemoryError when writing hundreds of thousands of rows.
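A minimal sketch of SXSSF streaming, assuming Apache POI (the poi-ooxml artifact) is on the classpath. The class name, the sheet name, the window size of 100 rows, and the two-column layout are arbitrary choices for the illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

public class SxssfStreamingSketch {

    /** Streams rowCount rows into target while keeping only a small window of rows in memory. */
    static void writeRows(File target, int rowCount) throws IOException {
        // At most 100 rows stay in memory; older rows are flushed to a temp file.
        SXSSFWorkbook wb = new SXSSFWorkbook(100);
        try (OutputStream out = new FileOutputStream(target)) {
            Sheet sheet = wb.createSheet("data");
            for (int r = 0; r < rowCount; r++) {
                Row row = sheet.createRow(r);
                row.createCell(0).setCellValue(r);          // numeric cell
                row.createCell(1).setCellValue("row " + r); // text cell
            }
            wb.write(out);
        } finally {
            wb.dispose(); // deletes the temp files that back the flushed rows
        }
    }

    public static void main(String[] args) throws IOException {
        writeRows(new File("streamed.xlsx"), 200_000);
    }
}
```

Rows that have left the window can no longer be read or modified, which is the price of the low memory footprint.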
The problem is that you won't be able to have JETT work directly with SXSSF, as it needs the whole document loaded in memory when performing template filling. The JETT author has discussed this limitation.
I had the same problem, and ended up doing a two-step XLSX creation:
1. A standard JETT XLSX template generates the headers and formatting. The last row of the first sheet contains cells with $$TOKENS$$, one per cell. I don't use JETT to insert large numbers of rows.
2. Once JETT has done its work, I reopen the workbook, read and then delete the $$TOKENS$$ on the last row of the first sheet, and start streaming data with SXSSF, row by row.
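The second step could look roughly like the sketch below, assuming Apache POI is on the classpath and that the JETT output has already been written to a file. The class and method names are made up for this illustration, and `valueFor` is a hypothetical stand-in for whatever lookup maps a token to real data; none of this is taken from the original code.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.TreeMap;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class TokenRowStreamingSketch {

    /** Reads the tokens of the last row (column index -> token), then removes that row. */
    static Map<Integer, String> takeTokens(XSSFSheet sheet) {
        Row tokenRow = sheet.getRow(sheet.getLastRowNum());
        Map<Integer, String> tokens = new TreeMap<>();
        for (Cell cell : tokenRow) {
            tokens.put(cell.getColumnIndex(), cell.getStringCellValue());
        }
        sheet.removeRow(tokenRow);
        return tokens;
    }

    /** Hypothetical lookup: a real implementation would fetch the value behind the token. */
    static String valueFor(String token, int rowIndex) {
        return token.replace("$$", "") + "-" + rowIndex;
    }

    /** Reopens the JETT output, consumes the token row, and streams rowCount data rows. */
    static void fill(File jettOutput, File target, int rowCount) throws IOException {
        try (InputStream in = new FileInputStream(jettOutput)) {
            XSSFWorkbook filled = new XSSFWorkbook(in);
            XSSFSheet first = filled.getSheetAt(0);
            int firstDataRow = first.getLastRowNum(); // data starts where the tokens were
            Map<Integer, String> tokens = takeTokens(first);

            // Wrap the in-memory workbook; rows created from here on are streamed.
            SXSSFWorkbook streaming = new SXSSFWorkbook(filled, 100);
            try (OutputStream out = new FileOutputStream(target)) {
                Sheet sheet = streaming.getSheetAt(0);
                for (int r = 0; r < rowCount; r++) {
                    Row row = sheet.createRow(firstDataRow + r);
                    for (Map.Entry<Integer, String> e : tokens.entrySet()) {
                        row.createCell(e.getKey()).setCellValue(valueFor(e.getValue(), r));
                    }
                }
                streaming.write(out);
            } finally {
                streaming.dispose(); // removes SXSSF temp files
            }
        }
    }
}
```

The token row has to be removed before the workbook is wrapped, because SXSSF refuses to create a row at an index that already exists in the underlying XSSF sheet.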
Of course, there are limitations to this approach:
- You cannot use JETT on any of the streamed rows during row insertion (but you can use it beforehand, to dynamically pick the order of the $$TOKENS$$, for example).
- Cell formats won't be copied unless you take care of this with the POI API. I prefer to format whole columns in the XLSX file, so that the formatting applies to the streamed data.
This also works if you want to show charts using the data inserted with SXSSF: you can define a named range with the OFFSET and COUNTA functions, then create a pivot table and pivot chart that will be refreshed when the XLSX file is opened in Excel.
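Such a named range would normally be defined in the template itself, in Excel. As an alternative sketch, assuming Apache POI is on the classpath, it could also be created through the POI API. The range name `StreamedData`, the anchor cell `$A$1`, and the assumption that the sheet name needs no quoting are all choices made for this illustration.

```java
import org.apache.poi.ss.usermodel.Name;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class DynamicRangeSketch {

    /**
     * Defines a named range that grows with the number of non-empty cells in column A.
     * Assumes sheetName contains no spaces or special characters (no quoting is done).
     */
    static Name defineDynamicRange(XSSFWorkbook wb, String sheetName, int columnCount) {
        Name name = wb.createName();
        name.setNameName("StreamedData");
        // OFFSET(anchor, 0, 0, height, width) with the height taken from COUNTA on column A.
        name.setRefersToFormula("OFFSET(" + sheetName + "!$A$1,0,0,COUNTA("
                + sheetName + "!$A:$A)," + columnCount + ")");
        return name;
    }

    public static void main(String[] args) {
        XSSFWorkbook wb = new XSSFWorkbook();
        wb.createSheet("data");
        Name range = defineDynamicRange(wb, "data", 3);
        System.out.println(range.getNameName() + " = " + range.getRefersToFormula());
    }
}
```

Because the height is computed by COUNTA when Excel opens the file, the range covers however many rows were streamed, without the writer having to know the row count in advance.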