Dynamically Generating Zip Files in Jersey

We often need to pull a large number of rows from a database table, split those rows up into n groups, and write each group out to a separate text file.  These text files are then processed by another application.  Each text file starts with the number of rows in the file on the first line, and then contains each row on its own line after that.  It became a pain to generate these files by hand, so I added a new resource to our Jersey-based web service that would generate all of the files and wrap them all up into a single .zip file.  The processing app also used to be run by hand, but it’s now totally automated, so it obtains the .zip file, unzips it, and processes each individual file.

Here is the resource we ended up with, except that for this post I’m generating a list of 100 random strings instead of pulling rows from a real database:

[sourcecode language="java"]
import java.io.*;
import java.util.*;
import java.util.zip.*;

import javax.ws.rs.*;

@Path("/files")
public class FilesResource
{
  @GET
  @Produces("application/zip")
  public InputStream getZipFile(@QueryParam("per_file") @DefaultValue("25") final int perFile)
    throws IOException
  {
    //we write to the PipedOutputStream
    //that data is then available in the PipedInputStream which we return
    final PipedOutputStream sink = new PipedOutputStream();
    PipedInputStream source = new PipedInputStream(sink);

    //the pipe has a small fixed-size buffer, so we write to the
    //PipedOutputStream in a separate thread to avoid blocking this one
    Runnable runnable = new Runnable()
    {
      public void run()
      {
        //PrintStream => BufferedOutputStream => ZipOutputStream => PipedOutputStream
        ZipOutputStream zip = new ZipOutputStream(sink);
        PrintStream writer = new PrintStream(new BufferedOutputStream(zip));

        try
        {
          //break the strings up into multiple files
          List<String> strings = getStrings();
          int stringCount = strings.size();
          int fileCount = (int) Math.ceil((double) stringCount / (double) perFile);
          for (int file = 0; file < fileCount; file++)
          {
            zip.putNextEntry(new ZipEntry("file" + (file + 1) + ".txt"));

            //first line is the number of strings in this file
            int first = file * perFile;
            int last = Math.min((file + 1) * perFile, stringCount);
            int stringsInFile = last - first;
            writer.println(stringsInFile);
            for (int i = first; i < last; i++)
              writer.println(strings.get(i));

            //flush the buffered text into the entry before closing it
            writer.flush();
            zip.closeEntry();
          }

          //also include a single file with all strings
          zip.putNextEntry(new ZipEntry("file.txt"));
          writer.println(stringCount);
          for (int i = 0; i < stringCount; i++)
            writer.println(strings.get(i));
          writer.flush();
          zip.closeEntry();
        }
        catch (IOException e)
        {
          //don't fail silently; the client will receive a truncated zip
          e.printStackTrace();
        }
        finally
        {
          //closing the writer also closes the zip and the pipe,
          //which signals end-of-stream to the reader
          writer.close();
        }
      }
    };
    Thread writerThread = new Thread(runnable, "FileGenerator");
    writerThread.start();

    return source;
  }

  private List<String> getStrings()
  {
    List<String> strings = new ArrayList<String>();
    for (int i = 0; i < 100; i++)
      strings.add(String.valueOf(Math.random()));
    return strings;
  }
}
[/sourcecode]

The getZipFile() method is called in response to a GET /files request.  I also registered the .zip URI extension, so it responds to GET /files.zip as well.  By default it splits the strings into groups of 25, but the per_file query parameter can be specified in the request to change that: for example, GET /files.zip?per_file=50 splits them into groups of 50.
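To make the grouping arithmetic concrete, here is a minimal standalone sketch of how per_file turns into index ranges. The ChunkRanges class and its ranges() method are hypothetical names used only for illustration. It uses an integer ceiling instead of the Math.ceil() call in the resource (the two agree for a non-negative total and a positive per_file), and the guard against a non-positive per_file is an addition that the resource above does not have.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the resource: computes which rows go in which file.
class ChunkRanges {
    /** Returns {first, last} index pairs (last is exclusive), one pair per output file. */
    static List<int[]> ranges(int total, int perFile) {
        if (perFile <= 0) {
            // Assumption: reject values that would never make progress through the rows.
            throw new IllegalArgumentException("perFile must be positive");
        }
        // Integer ceiling of total / perFile.
        int fileCount = (total + perFile - 1) / perFile;
        List<int[]> result = new ArrayList<>();
        for (int file = 0; file < fileCount; file++) {
            int first = file * perFile;
            int last = Math.min(first + perFile, total); // the last file may be short
            result.add(new int[] {first, last});
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] r : ranges(100, 30)) {
            System.out.println("rows " + r[0] + " to " + (r[1] - 1) + ", " + (r[1] - r[0]) + " rows");
        }
    }
}
```

With 100 rows and per_file=30, this yields three full groups and a final group holding the remaining 10 rows.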

The easiest way to create a .zip file in Java is with ZipOutputStream.  Once you have one, you call its putNextEntry() method to start a new file within the .zip file, write that file’s contents, and call closeEntry() when the file is done.  We also wrapped the ZipOutputStream in a PrintStream and write the contents of the text files by calling println() on the PrintStream (there’s also a BufferedOutputStream in between for good measure), which is why the PrintStream gets flushed before each closeEntry() call.
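Stripped of the web service and the piped streams, the ZipOutputStream part looks roughly like the sketch below. It writes to an in-memory ByteArrayOutputStream purely so it can run on its own; the ZipEntriesSketch class, its method names, and the explicit UTF-8 encoding are assumptions for illustration (the resource uses the platform default encoding).

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical standalone sketch: one zip entry per group, count line first.
class ZipEntriesSketch {
    static byte[] zipGroups(List<List<String>> groups) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes);
             PrintStream writer = new PrintStream(new BufferedOutputStream(zip), false, "UTF-8")) {
            for (int g = 0; g < groups.size(); g++) {
                zip.putNextEntry(new ZipEntry("file" + (g + 1) + ".txt"));
                writer.println(groups.get(g).size());
                for (String row : groups.get(g)) {
                    writer.println(row);
                }
                writer.flush();   // push buffered text into the current entry
                zip.closeEntry(); // the next putNextEntry() starts a new file
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The streams are closed at this point, so the zip is complete.
        return bytes.toByteArray();
    }

    /** Lists entry names in order, to check what was written. */
    static List<String> entryNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        byte[] zipped = zipGroups(Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c")));
        System.out.println(entryNames(zipped));
    }
}
```

One thing to keep in mind with this design: PrintStream never throws IOException from println(), so a caller that cares about write failures would have to poll checkError().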

Jersey is smart enough to read the data from an InputStream and use it as the HTTP response body, so ultimately we need to return an InputStream.  We hook the ZipOutputStream up to a PipedOutputStream, and then connect the PipedOutputStream to a PipedInputStream, so that Jersey can read all of the zip file data from the PipedInputStream.  The pipe only has a small fixed-size buffer (1024 bytes by default), and a write blocks once the buffer is full until something reads from the other end.  Writing and reading in the same thread can therefore deadlock, so we start a new writing thread and return the PipedInputStream right away.
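The piped-stream handoff can be tried in isolation with a sketch like the following, which assumes Java 8 or later for the lambda. PipedStreamSketch, produce(), and readAll() are hypothetical names; readAll() stands in for whatever consumes the stream (Jersey, in the resource above). The producer writes far more than the pipe's default buffer holds, which only works because the reading happens on a different thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical standalone sketch of the writer-thread / returned-InputStream pattern.
class PipedStreamSketch {
    /** Starts a background writer and immediately returns the readable end of the pipe. */
    static InputStream produce(int lines) {
        PipedOutputStream sink = new PipedOutputStream();
        PipedInputStream source;
        try {
            source = new PipedInputStream(sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread writerThread = new Thread(() -> {
            // Closing the sink (via try-with-resources) is what lets the reader see end-of-stream.
            try (OutputStream out = sink) {
                for (int i = 0; i < lines; i++) {
                    out.write(("line " + i + "\n").getBytes(StandardCharsets.UTF_8));
                }
            } catch (IOException e) {
                e.printStackTrace(); // e.g. the reader closed its end early
            }
        }, "PipeWriter");
        writerThread.start();
        return source;
    }

    /** Drains the stream on the calling thread, playing the role of the consumer. */
    static String readAll(InputStream in) {
        try (InputStream source = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = readAll(produce(5000));
        System.out.println(text.length() + " characters read");
    }
}
```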

This ends up working perfectly: the resource at http://site.com/files.zip feels like a static zip file, but is really generated dynamically on every request.  It can also now be accessed over HTTP from any machine and processed automatically.  If it were really expensive to generate we could cache it and regenerate it only when one of the rows in the database changed, but for our purposes it’s a cheap .zip file to generate.
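For the consuming side, here is a rough sketch of what reading the zip back could look like. It is not the actual processing app: ZipConsumerSketch, readFiles(), and sampleZip() are hypothetical, UTF-8 is assumed for the text, and sampleZip() builds a tiny in-memory zip in the same format (count on the first line, then one row per line) only so the example runs without a server.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical consumer: unzips the stream and checks each file's count line.
class ZipConsumerSketch {
    /** Maps each entry name to its rows, verifying the count on the first line. */
    static Map<String, List<String>> readFiles(InputStream zipData) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // A fresh reader per entry; it is deliberately not closed, because
                // closing it would close the whole zip stream.
                BufferedReader reader =
                    new BufferedReader(new InputStreamReader(zip, StandardCharsets.UTF_8));
                String header = reader.readLine();
                if (header == null) {
                    throw new IOException("empty entry: " + entry.getName());
                }
                int expected = Integer.parseInt(header.trim());
                List<String> rows = new ArrayList<>();
                String line;
                while ((line = reader.readLine()) != null) {
                    rows.add(line);
                }
                if (rows.size() != expected) {
                    throw new IOException(entry.getName() + ": expected " + expected
                        + " rows but found " + rows.size());
                }
                files.put(entry.getName(), rows);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    /** Builds a small zip in the same format, for demonstration only. */
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("file1.txt"));
            zip.write("2\nalpha\nbeta\n".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("file2.txt"));
            zip.write("1\ngamma\n".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(readFiles(new ByteArrayInputStream(sampleZip())));
    }
}
```

In a real client the InputStream would come from the HTTP response for /files.zip rather than from a byte array; how that request is made is left out here.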

