Saturday 10 October 2020

Salesforce CLI Scanner Custom XPath Rules - Part 1


Earlier this week (6th October 2020) a tweet popped up in my feed about the Salesforce CLI Scanner, which obviously caught my attention as I seem to have some kind of tunnel vision on the CLI and plug-ins right now. I'd seen something about this before, but at the time it felt like it was aimed more at ISVs to help them pass the security review. This wasn't something I had any immediate need for, so it went on my never ending mental list of things to look at when I have time. 

The tweet linked to a Salesforce Developers Blog post and quick skim read of that suggested there was plenty more to it, so I set aside some time to read it properly and take a look at the code, which was well worth the investment. The Plug-in docs cover installation and running, so I won't go into them here.

Static Code Analysis

Simply put, a static code code analysis tool examines source code before the code is run, evaluating it against a set of rules. Assuming you've crafted your rules correctly, it will find every occurrence of code that breaks a rule. Essentially a code review carried out by a Terminator:

    "It can’t be bargained with. It can’t be reasoned with. It doesn’t feel pity, or remorse, or fear."

The CLI Scanner combines a bunch of static code analysis tools into a single plug-in. The power of the CLI is that this can be installed in a single command using the CLI itself. As the CLI is pretty ubiquitous, it's straightforward for most of us to install the scanner and start using it immediately. 


While the scanner supports other languages, I'm going to stick to Apex for this post, as it's where most of my code lives right now.

The scanner evaluates Apex using the Apex rules for PMD originally created by Robert Sösemann, part of the open-source PMD source code analyser. I'd looked at PMD several years ago and got the basics working, but things weren't quite as slick back then so I didn't proceed with it. I was also under the impression that I had to write any custom rules in Java, which I wasn't particularly interested in at the time.

Once I had the scanner working I decided to check out the custom rules again, as I was definitely going to need some to apply the BrightGen house rules. The documentation again suggests that Java is the way to go, so I went to the PMD site to learn more about it. Reading through the documentation I was delighted to stumble on the following:

XML rule definition

New rules must be declared in a ruleset before they’re referenced. This is the case for both XPath and Java rules. To do this, the rule element is used, but instead of mentioning the ref attribute, it mentions the class attribute, with the implementation class of your rule.

  • For Java rules: this is the class extending AbstractRule (transitively)
  • For XPath rules: this is net.sourceforge.pmd.lang.rule.XPathRule

My old friend XPath - the mechanism I've used many times in the past with Selenium automated testing. As long ago as March 2019 I was telling people to learn about this, as evidenced by Don Robin's tweet from my London's Calling session:

It now felt like I had a fighting chance of coming up with a custom rule in fairly short order, although obviously in PMD I'm not working with HTML. Instead the source code is parsed into an Abstract Syntax Tree, which I can navigate in a similar fashion.

One of the major helpers for Selenium is being able to run XPath expressions in the Chrome inspector and gradually build up a complex rule from simple steps. Reading more on the PMD documentation I was delighted to find the designer - a GUI where I could write some Apex code, execute an XPath expression against that code and see the matches live. It also has an excellent interactive reference where I can click on one of the tree elements and see its properties.  Well worth spending some time with:

Sample Rule

My sample rule was to check that SOQL statements are only present inside test methods or accessor classes, where accessor classes have the naming convention <item>Accessor.

The rule is somewhat lengthy - here it is in it's entirety before I break it down into sections:

//UserClass[not(ends-with(@Image, 'Accessor'))]/Method/ModifierNode[@Test=false()]/..//(SoqlExpression | MethodCallExpression[lower-case(@FullMethodName)='database.query'])

The first part checks that the class name doesn't end with Accessor - if it is, I am in an accessor class and we can stop now:

//UserClass[not(ends-with(@Image, 'Accessor'))]

if I'm not in an accessor, I look for any method that isn't a test method:


The @Test property is part of the magic of PMD - the method modifier tree element knows whether a method is a test or not, so I don't have to worry about annotations or keywords.

I then back up from the ModifierNode to the method and consider all the child nodes of all non-test methods:


Then comes the meat of the rule - I look for the union of nodes that are SOQL expressions or have a database.query method call - note that I convert the FullMethodName property to lower case so that I don't have to care about capitalisation:

(SoqlExpression | MethodCallExpression[lower-case(@FullMethodName)='database.query'])

Executing this rule against an Apex class snippet:

class MyClass
   void SoqlQuery()
       List<Contact> contacts=[select id from Contact];

   void SoqlQuery2()
        List<Contact> contacts=database.query('select id from Contact');

matched as expected against the SOQL query on line 5 and the Database.query method call on line 10:

That's as far as I'm going for part 1 - in part 2 I'll look at how to create a ruleset for this rule and install it into the CLI Scanner. You won't have to wait too long as I'm hoping to write it in the next day or so while it's fresh in my mind.


No comments:

Post a Comment