java - Does POI XSSF still have crazy bad memory issues?


A couple of years ago, I ran into issues creating large Excel files using jXLS and POI XSSF. If my memory is correct, I think XSSF would create 1GB+ temp files on disk in order to create 10MB Excel files. So I stopped using jXLS and instead used SXSSF to create the Excel files, but today I have new reasons to use jXLS or JETT.

Both the jXLS and JETT websites seem to allude that performance is better, but POI's XSSF website still says generically that XSSF requires a higher memory footprint. I am wondering whether this higher memory footprint is a reasonable 10% overhead these days, or whether it is still the 10,000% overhead it was a couple of years ago.

Are the crazy bad memory issues fixed in POI 3.9 XSSF? Should I not worry about using it with jXLS or JETT? Or are there gotchas to avoid, such as being careful about reusing cell styles?

To answer your question: yes, POI will always use a very large amount of memory when working on large XLSX files, much larger than the size of the XLSX files themselves. I don't think this will change anytime soon, and there are pretty obvious reasons for that: an XLSX file is basically a bunch of zipped XML files, and XML compresses very well (around 10x). Just getting this XML to sit in memory uncompressed would already increase memory consumption tenfold, so once you add the overhead of the data structures, there is no way you should expect a mere 10% increase in memory consumption over the XLSX file size.
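To get a feel for that compression argument, here is a minimal sketch using only `java.util.zip`. The class `XlsxCompressionSketch` and its methods are made up for this illustration, and the generated markup only imitates a worksheet part (it is not a valid sheet), so the ratio it prints is indicative at best:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of POI: compares raw vs. zipped size of
// repetitive, worksheet-like XML.
class XlsxCompressionSketch {

    // Builds markup that imitates the rows of a worksheet part.
    // Assumption: two numeric cells per row; real sheets vary a lot.
    static String fakeSheetXml(int rows) {
        StringBuilder sb = new StringBuilder("<sheetData>");
        for (int r = 1; r <= rows; r++) {
            sb.append("<row r=\"").append(r).append("\">")
              .append("<c r=\"A").append(r).append("\"><v>").append(r).append("</v></c>")
              .append("<c r=\"B").append(r).append("\"><v>").append(r * 2).append("</v></c>")
              .append("</row>");
        }
        return sb.append("</sheetData>").toString();
    }

    // Returns the size in bytes of a zip archive holding the data as one entry.
    static int zippedSize(byte[] raw) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("xl/worksheets/sheet1.xml"));
            zip.write(raw);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        byte[] raw = fakeSheetXml(100_000).getBytes(StandardCharsets.UTF_8);
        int zipped = zippedSize(raw);
        System.out.printf("raw=%d bytes, zipped=%d bytes, ratio=%.1fx%n",
                raw.length, zipped, (double) raw.length / zipped);
    }
}
```

The uncompressed bytes are only the starting point: an in-memory object model of the same cells will typically be larger still.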

Now, the good news is that, as mentioned in the comments, Apache POI has introduced SXSSF for streaming large amounts of data into a spreadsheet with good performance and low memory usage. XLSX files generated this way are still streamed to the hard disk, where they can end up taking quite a bit of space, but at least you don't risk an OutOfMemoryError (OOME) when writing hundreds of thousands of rows.
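The idea behind this kind of streaming can be illustrated without POI. The sketch below is not SXSSF's code; it is a toy `RowWindowSketch` class (a hypothetical name) that keeps at most `windowSize` rows in memory and spills older rows to a `Writer`, which stands in for the temp file on disk:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy illustration of a sliding row window; NOT how POI implements SXSSF.
class RowWindowSketch {
    private final int windowSize;
    private final Writer spill;                 // stands in for a temp file
    private final Deque<String> window = new ArrayDeque<>();
    private int flushed = 0;

    RowWindowSketch(int windowSize, Writer spill) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
        this.spill = spill;
    }

    // Adds a row; the oldest rows are written out once the window is full.
    void addRow(String row) {
        window.addLast(row);
        while (window.size() > windowSize) {
            writeOut(window.removeFirst());
        }
    }

    // Writes out whatever is still held in memory.
    void finish() {
        while (!window.isEmpty()) {
            writeOut(window.removeFirst());
        }
        try {
            spill.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int rowsInMemory() { return window.size(); }

    int rowsFlushed() { return flushed; }

    private void writeOut(String row) {
        try {
            spill.write(row);
            spill.write('\n');
            flushed++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        RowWindowSketch rows = new RowWindowSketch(100, new StringWriter());
        for (int i = 0; i < 500_000; i++) {
            rows.addRow("row " + i);
        }
        System.out.println("in memory: " + rows.rowsInMemory()
                + ", flushed: " + rows.rowsFlushed());
        rows.finish();
    }
}
```

The trade-off is the one described above: memory stays bounded by the window, but rows that have been flushed can no longer be modified, and the spilled data takes space on disk.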

The problem is that you won't be able to have JETT work directly with SXSSF, as it needs the whole document loaded in memory to perform its template filling. The JETT author discussed this topic here.

I had the same problem, and ended up doing a two-step XLSX creation:

  1. A standard JETT XLSX template generates the headers and formatting. The last row of the first sheet contains cells with $$TOKENS$$, one per cell. I don't use JETT to insert large numbers of rows.

  2. Once JETT has done its work, I reopen the workbook, read and then delete the $$TOKENS$$ on the last row of the first sheet, and start streaming data with SXSSF, row by row.

Of course, there are limitations to this approach:

  - You cannot use JETT on any of the streamed rows during row insertion (but you can before, for example to dynamically pick the order of the $$TOKENS$$).

  - Cell formats won't be copied unless you take care of that with the POI API. I prefer to format whole columns in the XLSX file, so that the formatting applies to the streamed data.

This also works if you want to show charts using the data inserted with SXSSF: you can define a named range with the functions OFFSET and COUNTA, then create a pivot table and a pivot chart that will be refreshed when the XLSX is opened in Excel.
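As a rough sketch of what such a named range formula can look like, the hypothetical helper below builds an OFFSET/COUNTA expression as a string. It assumes the data starts directly under `headerRows` non-empty header rows, that the counted column has no blank cells, and that the sheet name needs no quoting; none of these details come from the original answer:

```java
// Hypothetical helper: builds a formula for a named range that grows with the data.
class DynamicRangeSketch {

    // sheet      - sheet name (assumed to need no quoting)
    // firstCol   - letter of the first data column, also the column COUNTA counts
    // headerRows - number of non-empty header rows above the data
    // width      - number of columns in the range
    static String dynamicRange(String sheet, String firstCol, int headerRows, int width) {
        String anchor = sheet + "!$" + firstCol + "$" + (headerRows + 1);
        String counted = sheet + "!$" + firstCol + ":$" + firstCol;
        // Height = non-empty cells in the column minus the header rows.
        return "OFFSET(" + anchor + ",0,0,COUNTA(" + counted + ")-" + headerRows
                + "," + width + ")";
    }

    public static void main(String[] args) {
        // prints OFFSET(Data!$A$2,0,0,COUNTA(Data!$A:$A)-1,3)
        System.out.println(dynamicRange("Data", "A", 1, 3));
    }
}
```

The resulting string can be typed into Excel's name manager in the template, or, if you would rather define the name from code, passed to POI's `Name.setRefersToFormula`.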

