Saturday, 3 April 2021

Unbounded Queries [Featuring Spring 21 FIELDS()]


Introduction

FIELDS() is a new SOQL function in the Spring 21 Salesforce release that allows you to query the standard, custom or all fields for an sObject without having to know their names. Could it be? Are we finally getting the equivalent of the SELECT * so beloved of developers working with an SQL database? Can we retire our Apex code that figures out all of the field names using describe calls? Digging into the docs, the answer is no. You can pull back the standard fields, but not every field on the sObject.
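As a minimal sketch of what this looks like from Apex, assuming the docs' position that the standard-fields variant is the one available to Apex code, the query below pulls back the standard Contact fields without naming any of them. The variable names and the LIMIT value are illustrative choices, not anything mandated by the feature.

```apex
// Sketch only: retrieve every standard field on Contact without listing them.
// The LIMIT is there to bound the depth of the result, not because
// FIELDS(STANDARD) requires it.
List<Contact> contacts = [SELECT FIELDS(STANDARD) FROM Contact LIMIT 10];

for (Contact cont : contacts) {
    // Each record carries all of the standard fields, wanted or not.
    System.debug(cont);
}
```

Any custom fields still have to be named explicitly alongside FIELDS(STANDARD), which is exactly the constraint the rest of this post argues in favour of.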

Disappointing, right? Maybe at first glance, but from the perspective of future-proofing your Apex code, absolutely not. It's actually saving you from having unbounded queries lurking like a time bomb in your Salesforce instance, waiting to blow the heap limit on an unsuspecting user.

Unbounded Queries

Simply put, an unbounded query doesn't limit the results that may be returned from the database. 

Most of the time, we (and the rest of the developer world) think of unbounded queries as those without a restrictive WHERE clause, or with no WHERE clause at all. These queries start out life doing their best to be helpful - backing custom pages with search capabilities where the user can choose some criteria to restrict the number of records, but can click search straight away if they want to. When there are 20 records in the system, this isn't a problem. Once you've been live for 5 years and you have 15,000 records, it's an accident waiting to happen. I think of this as vertically unbounded, as something needs to be done to limit the depth of the record list retrieved.

So adding a LIMIT clause to your query solves the problem and future proofs your application? Not quite. Queries can be unbounded horizontally as well as vertically - each record can become so wide that the total number that can be retrieved is massively reduced. I view this as a problem that impacts Salesforce implementations more than other technologies, as it's so easy to add fields to an sObject and change the database schema. No need to involve a DBA, just a few clicks in Setup. The odd field here and there racks up over the years, and if your query is retrieving all fields on a record, the size of each record will become a problem, even if you are restricting the number of records that you retrieve.

Some Numbers

To reflect the 'SELECT *' concept, I threw together some Apex that dynamically creates a query by iterating the field map returned by the schema describe for an sObject. I created 500 Contact sObjects with a variety of fields present, then executed some anonymous Apex to query them back, iterate them, and debug both the records and the heap consumed. I took the simplistic view that the entire heap was down to the records retrieved, so all calculations around the maximum records will probably be over-optimistic, but I'm sure you get the point.
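The approach described above can be sketched roughly as follows. This is not the code used to produce the numbers in this post, just a minimal anonymous Apex reconstruction under the same assumptions; the variable names are made up for illustration.

```apex
// Build a 'SELECT *' style query from the describe information for Contact.
Map<String, Schema.SObjectField> fieldMap =
    Schema.SObjectType.Contact.fields.getMap();
List<String> fieldNames = new List<String>(fieldMap.keySet());

// Deliberately no WHERE or LIMIT clause: this is the unbounded query.
String soql = 'SELECT ' + String.join(fieldNames, ', ') + ' FROM Contact';

List<Contact> contacts = Database.query(soql);

// Iterate and debug the records, then report the heap consumed so far.
for (Contact cont : contacts) {
    System.debug(cont);
}
System.debug('Records: ' + contacts.size());
System.debug('Heap used (bytes): ' + Limits.getHeapSize());
```

Dividing the reported heap by the record count gives the rough bytes-per-record figures used below, on the simplistic assumption that the records account for the whole heap.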

With no contacts, the heap was a svelte 1968 bytes. 

With 500 contacts with the First and Last Name fields defined as 'Keir Bowden', the heap was 250,520 bytes, so approx 500 bytes per record. This sounds like a lot for a couple of fields, but there are quite a few standard fields that will be retrieved too - Id, IsDeleted, OwnerId, HasOptedOutOfEmail, HasOptedOutOfFax, DoNotCall, CreatedById, CreatedDate, LastModifiedById, LastModifiedDate, SystemModstamp, IsEmailBounced, PhotoUrl. A whole bunch of stuff that I probably don't care about. Not too much to worry about though, as at that rate I can retrieve approximately 12,000 contacts before blowing the 6MB heap limit. This is a good example of why adding a LIMIT clause to a query is a good idea - 12,000 contacts really isn't that many for a decent sized organisation that has been around for a few years!

Adding a Salutation of 'Mr' to all of the records caused the heap to creep up to 253,864 bytes - 508 per record. A small increase, but one that has a material impact on my max records - I'm down to 11,811 - a drop of 189 records through a single field that maybe I don't need for my processing.

I then added a unique Email Address and a Lead Source of Purchased List - moving the heap needle to 288,454 bytes - an increase of around 70 bytes per record over the previous run. I'm now down to 10,380 records, so I'll have to process 1,431 fewer records in my transaction to avoid the limit. All because of fields that are present on the record, not because of the fields that I want.

I upped the ante on my next change - I decided my Sales Reps wanted a notes style field on the record where they could record their unstructured thoughts about the customer. So I added a Text Area (Long) field, and added 1,000 characters to each of my records, which isn't a lot for free text - if you are doubting that, consider that at this point in this blog post I am around 5,000 characters. The impact on the heap was much as expected, and my heap use was now at 794,946 bytes, a hefty 1,600 bytes per record. I'm now down to 3,750 records per transaction! I think it's highly unlikely that I have any interest in the Sales Rep's random musings, but I'm down to around 31% of the records in no small part because of it.

Hopefully there's no need for me to labour the point - specifying a LIMIT clause is only part of the solution, and it's unlikely to help if the size of each record is something that you are exerting no control over.
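The arithmetic behind the maximum record counts above is simple enough to sketch in a few lines of anonymous Apex. It assumes a heap limit of 6,000,000 bytes (the figure the numbers above imply) and that the retrieved records take up the entire heap; the run labels and variable names are made up for illustration.

```apex
// Assumption: the whole heap is consumed by the retrieved records, and the
// limit is 6,000,000 bytes.
Integer heapLimitBytes = 6000000;

// Approximate bytes per record for each run described above.
Map<String, Integer> bytesPerRecord = new Map<String, Integer>{
    'Name only' => 500,
    'Plus Salutation' => 508,
    'Plus Email and Lead Source' => 578,
    'Plus 1,000 character notes' => 1600
};

for (String run : bytesPerRecord.keySet()) {
    // Integer division rounds down, which is right for whole records.
    Integer maxRecords = heapLimitBytes / bytesPerRecord.get(run);
    System.debug(run + ': ' + maxRecords + ' records');
}
```

This works out to 12,000, 11,811, 10,380 and 3,750 records respectively, matching the figures quoted above. In real code, Limits.getLimitHeapSize() would be a safer source for the limit than a hard-coded number.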

Other Problems With Unbounded Queries

Even when they aren't breaking limits, unbounded queries bring unwanted behaviour:

  • If you are sending the records to the front end, you could end up transmitting way more data than you need to, taking longer and putting more stress on the client who has to manage the bloated records.
  • It makes the code harder to read - if the only query is to retrieve all fields, you need to dig into how the results are used to figure out which fields are actually needed in specific scenarios.
  • You lose static bindings to field names. This is unlikely to be a problem, as you'll almost certainly be referring to the fields you want to process by name elsewhere in your Apex code. But if you are processing fields using dynamic DML, then if an admin deletes one of the fields you are relying on, the first anyone will know about it is when you try to access the field and it's not there any more.

Even if you absolutely, definitely, have to have all of the fields that are currently defined on an sObject, before using an unbounded query ask yourself if you absolutely, definitely, have to have all the fields that could possibly be defined in the future - including those that really don't belong on the contact but it was easier than making some fundamental changes to the data model! Yes, you'll save yourself some typing, but you'll be storing up problems for the future. Maybe not today, maybe not tomorrow, but some day it's going to go pop!
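By way of contrast, here is a minimal sketch of a query that is bounded in both directions. The choice of fields, the filter and the LIMIT value are hypothetical, standing in for whatever a given piece of processing actually needs.

```apex
// Bounded horizontally: only the fields this processing needs.
// Bounded vertically: a WHERE clause and a LIMIT cap the number of records.
List<Contact> contacts = [
    SELECT Id, FirstName, LastName, Email
    FROM Contact
    WHERE Email != null
    ORDER BY LastName
    LIMIT 200
];

for (Contact cont : contacts) {
    // Fields are referenced by name, so the compiler checks they exist.
    System.debug(cont.LastName + ' <' + cont.Email + '>');
}
```

Because the field names are statically bound, the query reads as documentation of what the code uses, and a new field added to Contact next year has no effect on the heap this transaction consumes.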

So Never All Fields?


As always, never say never. For example, I'll sometimes use this technique in debug Lightning Web Components where I want to dump everything out about an object that is in use on a page, even the fields that aren't currently being retrieved. In pretty much every case where I do use this approach, the query will return a maximum of 1 record!
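A debug helper of that kind might look something like the sketch below. The class and method names are hypothetical, and it assumes the component passes in the Id of the record on the page; the important part is that the query is pinned to a single record, so the width of the sObject can't multiply up.

```apex
// Hypothetical Apex controller for a debug-only Lightning Web Component.
public with sharing class DebugRecordController {
    @AuraEnabled(cacheable=true)
    public static SObject getAllFields(Id recordId) {
        // Work out the sObject type from the Id, then describe its fields.
        Schema.DescribeSObjectResult describe =
            recordId.getSObjectType().getDescribe();
        List<String> fieldNames =
            new List<String>(describe.fields.getMap().keySet());

        // All fields, but only ever for the one record identified by recordId.
        String soql = 'SELECT ' + String.join(fieldNames, ', ')
            + ' FROM ' + describe.getName()
            + ' WHERE Id = :recordId LIMIT 1';

        List<SObject> results = Database.query(soql);
        return results.isEmpty() ? null : results[0];
    }
}
```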
