uBreckner February 2016

Reading zipped vs unzipped files into memory

I have a single file, delivered as a .zip, and I want to read it into memory. Zipped, it is about 50 MB; unzipped, about 700 MB. I am wondering whether I should unzip the file first and then read it, or whether it makes no difference and I can read the data directly from the zip file.

For a normal file I use an InputStreamReader wrapped around a FileInputStream.
For a zip file I use a java.util.zip.ZipFile to get the InputStream for a ZipEntry and then again wrap an InputStreamReader around it.
So in the end I work with an InputStreamReader in both cases.
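
For illustration, the two setups described above could be sketched roughly like this. The class and method names, the UTF-8 charset, and the assumption that the archive holds exactly one entry are all made up for the example and are not taken from the real code:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, only meant to show the two ways of getting a Reader.
class ReaderSources {

    // Plain file: the reader decodes bytes that come straight from the file.
    static Reader openPlain(String path) throws IOException {
        return new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8);
    }

    // Zipped file: the entry's stream decompresses the bytes as they are read.
    // Close the returned reader before closing the ZipFile.
    static Reader openFirstEntry(ZipFile zip) throws IOException {
        // Assumption: the archive contains exactly one entry.
        ZipEntry entry = zip.entries().nextElement();
        InputStream in = zip.getInputStream(entry);
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }

    // Counts characters without storing them, so it works for large inputs too.
    static long countChars(Reader reader) throws IOException {
        char[] buf = new char[8192];
        long total = 0;
        int n;
        while ((n = reader.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    // Usage: java ReaderSources <plainFile> <zipFile>
    public static void main(String[] args) throws IOException {
        try (Reader plain = openPlain(args[0])) {
            System.out.println("plain:  " + countChars(plain));
        }
        try (ZipFile zip = new ZipFile(args[1]);
             Reader zipped = openFirstEntry(zip)) {
            System.out.println("zipped: " + countChars(zipped));
        }
    }
}
```

If the zip contains the same data as the plain file, both counts should come out equal, since the reader sees the same decoded characters either way.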

I tried to test it, but locally I can't read such a large file without running out of memory. On the server where the process runs, other processes interfere, so I couldn't tell whether there is any difference.

Does anybody know whether one of the options uses significantly more memory than the other, or is it just a design question which one to use?

Greetings, Uwe

Answers


vanje February 2016

The only difference is a small performance hit for decompressing the data while reading. In both cases your InputStreamReader ends up reading the same unpacked 700 MB of data, so the memory needed to hold it is the same.

The next question you should ask is: why do you need to read this large file completely into memory? Is it really necessary? Maybe you can process it line by line without holding all lines in memory.
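
A minimal sketch of what line-by-line processing straight from the zip could look like. The class name, the UTF-8 charset, and the single-entry archive are assumptions, and counting lines just stands in for whatever you actually do with each line:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical example class: streams a zipped text file one line at a time.
class ZipLineCounter {

    // Reads the first entry of the archive line by line and returns the line count.
    // Only the current line is kept in memory, not the whole file.
    static long countLines(String zipPath) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath)) {
            // Assumption: the archive contains exactly one entry.
            ZipEntry entry = zip.entries().nextElement();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(zip.getInputStream(entry), StandardCharsets.UTF_8))) {
                long lines = 0;
                String line;
                while ((line = reader.readLine()) != null) {
                    lines++; // placeholder for the real per-line processing
                }
                return lines;
            }
        }
    }

    // Usage: java ZipLineCounter <zipFile>
    public static void main(String[] args) throws IOException {
        System.out.println(countLines(args[0]));
    }
}
```

This only helps if each line can be handled on its own or reduced to some smaller result; if you really need all 700 MB available at once, streaming will not reduce the memory you need.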

Post Status

Asked in February 2016
Viewed 2,851 times
Voted 9
Answered 1 times
