Bob Buzzard Blog: Zip Handling in Apex GA

Image generated by ChatGPT 4o based on a prompt from Bob Buzzard

Introduction

Anyone who's developed custom solutions in the Salesforce world is likely to have come up against the challenge of processing zip files. You could use a pure Apex solution, such as Zippex, leverage the Metadata API, or push it all to the front end and use JavaScript. What these options all had in common was leaving you dissatisfied. Apex isn't the best language for CPU intensive activities like compression, so the maximum file size was relatively small. The Metadata API isn't officially supported for this purpose and introduces asynchronicity, while forcing the user to the front end simply to extract the contents of a file isn't the greatest experience and might introduce security risks.

One year ago, in the Spring '24 release of Salesforce, we got the developer preview of Zip Handling in Apex, which looked very cool. I encountered a few teething issues, but the preview was more than enough to convince me this was the route forward once it became Generally Available. Fast forward a week or so from now (31st Jan) and it will be GA in the Spring '25 release. That being the case, I felt I should revisit it.

The Sample

I built the sample in a scratch org before the release preview window opened, and all I had to specify was the ZipSupportInApex feature. The sample is based on my earlier attempts with the developer preview: a Lightning Web Component front end that allows you to select a zip file from Salesforce Files, an Apex controller to extract the entries from the zip file, and another Lightning Web Component to display the details to the user:

The key differences are (a) I could get the zip entries back successfully this time and (b) I captured how much CPU and heap were consumed processing the zip file. The heap was slightly larger than the size of the zip file in this case, but the CPU was a really nice surprise - just 206 milliseconds consumed.

One expected fun side effect was the ability to breach the heap size limit quite dramatically. As we all know, this limit is enforced through periodic checking rather than a finite heap size, so you can go over as long as it's only for a short while. Processing the entries in a zip file clearly satisfies the "short while" requirement, as I was able to extract the contents of a 24MB zip file, pushing the heap close to 25MB:

It was gratifying to see that once again the CPU wasn't too onerous - even when handling a zip file that is theoretically far too large, only 1,375 milliseconds of CPU were consumed.
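
Figures like these come from the standard Limits class. As a minimal sketch of how a measurement might be taken, the following reads the entries and reports the CPU and heap consumed by that step alone, alongside the governor limits. The ZipMeasure class and its measure method are hypothetical names for illustration, and unlike the sample (which reports the running totals for the transaction) this version reports deltas:

```apex
// Hypothetical helper, for illustration only - not part of the sample
public with sharing class ZipMeasure
{
    // Reads the entries of a zip file and returns the number of entries plus
    // the CPU (milliseconds) and heap (bytes) consumed by the read
    public static Map<String, Integer> measure(Blob zipBody)
    {
        Integer cpuBefore = Limits.getCpuTime();
        Integer heapBefore = Limits.getHeapSize();

        Compression.ZipReader reader = new Compression.ZipReader(zipBody);
        Integer entryCount = reader.getEntries().size();

        Map<String, Integer> result = new Map<String, Integer>();
        result.put('entries', entryCount);
        result.put('cpuMs', Limits.getCpuTime() - cpuBefore);
        result.put('heapBytes', Limits.getHeapSize() - heapBefore);

        // Show the consumption against the limits for this transaction
        System.debug('CPU ' + result.get('cpuMs') + 'ms of ' + Limits.getLimitCpuTime() + 'ms');
        System.debug('Heap ' + Limits.getHeapSize() + ' of ' + Limits.getLimitHeapSize() + ' bytes');

        return result;
    }
}
```

The heap delta excludes the zip file body itself, as that Blob is already on the heap before the reader is created.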

Show Me The Code!

You can find the code in my Spring '25 GitHub repository. The key Apex code is as follows:

    // Returns the entries in the zip file held in the given ContentVersion,
    // plus the CPU and heap consumed so far in the transaction
    @AuraEnabled(cacheable=true)
    public static ZipResponse GetZipEntries(Id contentVersionId)
    {
        ZipResponse response=new ZipResponse();
        List<Entry> entries=new List<Entry>();
        try 
        {
            ContentVersion contentVer=[select Id, ContentSize, VersionData, Title, Description, PathOnClient
                                        from ContentVersion
                                        where Id=:contentVersionId];

            // The zip file body is loaded onto the heap as a Blob
            Blob zipBody = contentVer.VersionData;
            Compression.ZipReader reader = new Compression.ZipReader(zipBody);
            for (Compression.ZipEntry zipEntry : reader.getEntries())
            {
                System.debug('Entry = ' + zipEntry.getName());

                // Copy the details to show the user into the AuraEnabled wrapper
                Entry entry=new Entry();
                entry.name=zipEntry.getName();
                entry.method=zipEntry.getMethod().name();
                entry.compressedSize=zipEntry.getCompressedSize();
                entry.uncompressedSize=zipEntry.getUncompressedSize();
                entries.add(entry);
            }

            // Capture the zip file size and resource consumption
            response.entries=entries;
            response.cpu=Limits.getCpuTime();
            response.size=contentVer.ContentSize;
            response.heap=Limits.getHeapSize();
        }
        catch (Exception e)
        {
            // Errors are only logged, so the caller receives an empty response
            System.debug(e);
        }

        return response;
    }
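
The method returns a ZipResponse containing a list of Entry objects. Their definitions aren't reproduced in this post, so the following is only a guess at a minimal shape that would satisfy the assignments above. The outer class name ZipController and the field types are assumptions (Long is used for the entry sizes so that either an Integer or a Long return value can be assigned):

```apex
// Hypothetical outer class name; the controller in the repository may be organised differently
public with sharing class ZipController
{
    // The details of a single zip entry, exposed to the Lightning Web Component
    public class Entry
    {
        @AuraEnabled public String name;
        @AuraEnabled public String method;
        @AuraEnabled public Long compressedSize;
        @AuraEnabled public Long uncompressedSize;
    }

    // The entries plus the zip file size and resource consumption
    public class ZipResponse
    {
        @AuraEnabled public List<Entry> entries;
        @AuraEnabled public Integer cpu;   // milliseconds, from Limits.getCpuTime()
        @AuraEnabled public Integer size;  // bytes, from ContentVersion.ContentSize
        @AuraEnabled public Integer heap;  // bytes, from Limits.getHeapSize()
    }
}
```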

ZipResponse is a custom class to return the information in a friendly format to the front end, including the CPU and heap consumption, while Entry is another custom class that wraps the information I want to show the user in an AuraEnabled form. The actual amount of code required to process a zip file is very small - once you have the zip file body as a Blob, simply create a new ZipReader on it and iterate the entries:

    Blob zipBody = contentVer.VersionData;
    Compression.ZipReader reader = new Compression.ZipReader(zipBody);
    for (Compression.ZipEntry zipEntry : reader.getEntries())
    {
        // your code here!
    }
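
The loop above only touches the entry metadata. If you also need the contents of an entry, my understanding of the Compression namespace is that ZipReader offers an extract method that takes a ZipEntry and returns the uncompressed data as a Blob. A minimal sketch under that assumption, using a hypothetical ZipExtractSketch class that records the extracted size of each entry:

```apex
// Hypothetical helper, for illustration only - not part of the sample
public with sharing class ZipExtractSketch
{
    // Returns the extracted size in bytes of each entry, keyed by entry name
    public static Map<String, Integer> extractSizes(Blob zipBody)
    {
        Map<String, Integer> sizes = new Map<String, Integer>();
        Compression.ZipReader reader = new Compression.ZipReader(zipBody);
        for (Compression.ZipEntry zipEntry : reader.getEntries())
        {
            // Assumed API: extract returns the uncompressed contents as a Blob
            Blob content = reader.extract(zipEntry);

            // Guard against entries with no content, such as directories
            sizes.put(zipEntry.getName(), content == null ? 0 : content.size());
        }

        return sizes;
    }
}
```

Each extracted Blob sits on the heap while it is referenced, so for larger zip files it is worth processing one entry at a time rather than holding on to all of the contents.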

Performance

Regular readers of this blog will know that I'm always keen to compare performance, so here's the CPU (in milliseconds) and heap (in bytes) consumption for the new on-platform solution versus Zippex for my test zip files:

Zip Size (bytes)	On Platform CPU (ms)	On Platform Heap (bytes)	Zippex CPU (ms)	Zippex Heap (bytes)
66,408	30	90,306	19	142,149
123,746	57	143,425	17	253,086
3,992,746	48	4,022,537	208	8,000,354
8,237,397	58	8,256,426	387	16,480,442
24,014,156	973	24,431,186	Heap size exception at 48,028,743

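Zippex uses less CPU for the two smallest files, but falls well behind as the zip file size increases: at around 8MB it needs 387 milliseconds against 58. Heap is where the on-platform solution really scores though. Zippex consumes around twice the zip file size, so the on-platform solution needs roughly 35-50% less heap (put another way, Zippex needs 60-100% more), and it can process larger files: Zippex hit a heap size exception on the 24MB file.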

More Information

Sunday, 2 February 2025
