PMD is a static code analyzer for Java. Developers use PMD to comply with coding standards and deliver quality code. Team leaders and Quality Assurance folks use it to change the nature of code reviews. PMD has the potential to transform a mechanical, syntax-check-oriented code review into a dynamic peer-to-peer discussion.
This article looks at PMD as an Eclipse plugin and the ways it can be used to improve the code quality and shorten the code review process. Since every organization has a unique set of coding conventions and quality metrics, it also demonstrates how to customize PMD to meet these needs.
PMD works by scanning Java code and checking for violations in three major areas:
In its current version, PMD comes packaged with 149 rules in 19 rulesets. Most of these rules can be parameterized at runtime by supplying properties or parameters. The standard package offers many well-thought-out rules. In addition, users can add their own rules for a particular coding convention or quality metric. Here are some of the rules distributed with PMD:
PMD started as a standalone application and today it is an open-source project hosted on SourceForge.net. It is still distributed as a standalone application; however, there is now substantial support for most popular Java IDEs. The PMD team has developed plugins for JDeveloper, Eclipse, JEdit, JBuilder, Omnicore's CodeGuide, NetBeans/Sun Java Studio Enterprise/Creator, IntelliJ IDEA, TextPad, Maven, Ant, Gel, JCreator, and Emacs. This article covers PMD's Eclipse plugin only.
PMD binaries and source files can be downloaded from PMD's SourceForge directory. For an open-source project, PMD is relatively well-documented. Among the items on PMD's main site are topics such as installation, PMD-related products and books, best practices, licensing info, usage guides and more. There is also a section for developers interested in joining the project. The following items target the latter: compiling PMD, project info, project reports, and development process.
The easiest way to install PMD is by using the remote update site. Users behind firewalls should check proxy settings before going any further. If these settings are misconfigured the updater will not work. PMD also supplies a zip file for manual installation. Download the file and follow the readme. Demonstrated below is installing PMD via the Eclipse Software Updater.
Before launching Eclipse, make sure you have enough memory for PMD. This is particularly important when analyzing large projects, where PMD tends to be memory-hungry. Start Eclipse with as much memory as you can afford, for example 512M (eclipse.exe -vmargs -Xmx512M).
public class Ying {

    private static final String gOOD = "GOOD";

    public void StartSomething() {
        System.out.println("Hello PMD World!");
    }

    public String thisIsCutAndPaste(String pFirst, int pSecond) {
        System.out.println("New world");
        return "New world";
    }
}
public class Yang extends Thread {
    public Yang(String str) {
        super(str);
    }

    public void run() {
        for (int i = 0; i < 10; i++) {
            System.out.println(i + " " + getName());
            try {
                sleep((long) (Math.random() * 1000));
            } catch (InterruptedException e) {
            }
        }
        System.out.println("DONE! " + getName());
    }

    public void WRITE_SOMETHING(String INPUT_PARAMETER) {
        System.out.println(INPUT_PARAMETER);
    }

    public static void main(String[] args) {
        new Yang("Good").start();
        new Yang("Bad").start();
    }

    public String thisIsCutAndPaste(String pFirst, int pSecond) {
        System.out.println("New world");
        return "New world";
    }
}
// @PMD:REVIEWED:MethodNamingConventions: by Levent Gurses on 3/28/04 5:04 PM
Repeated (cut-and-paste) code generally indicates poor planning or team coordination. Therefore, refactoring classes with repeating code should be given a high priority. PMD can help identify these classes by scanning the code in a way similar to PMD violation checks. The number of lines of similarity (the metric used by PMD to match code patterns) is 25 by default and can be set in PMD's Preferences page.
=====================================================================
Found a 8 line (25 tokens) duplication in the following files:
Starting at line 6 of C:\temp\QA Project\Ying.java
Starting at line 23 of C:\temp\QA Project\Yang.java
        new Yang("Bad").start();
    }

    public String thisIsCutAndPaste(String pFirst, int pSecond) {
        System.out.println("New world");
        return "New world";
    }
}
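To make the idea of matching repeated code concrete, here is a minimal sketch of a duplicate finder. It is not the algorithm PMD ships: the class NaiveDuplicateFinder, its crude tokenizer, and the threshold parameter are all assumptions made for illustration. It splits two sources into tokens and reports the longest run of consecutive tokens they share.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a naive duplicate finder, not the algorithm PMD ships.
final class NaiveDuplicateFinder {
    // Crude tokenizer: a token is a run of word characters or one punctuation character.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\s\\w]");

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(source);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    // Length, in tokens, of the longest run of consecutive tokens present in both lists.
    static int longestSharedRun(List<String> a, List<String> b) {
        int best = 0;
        int[] previous = new int[b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            int[] current = new int[b.size() + 1];
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    // Extend the run that ended at the previous token of both lists.
                    current[j] = previous[j - 1] + 1;
                    best = Math.max(best, current[j]);
                }
            }
            previous = current;
        }
        return best;
    }

    // Flags a pair of sources when the shared run reaches the chosen threshold.
    static boolean isDuplicated(String first, String second, int minimumTokens) {
        return longestSharedRun(tokenize(first), tokenize(second)) >= minimumTokens;
    }

    public static void main(String[] args) {
        String first = "int x = 1; foo(x);";
        String second = "int y = 2; foo(x);";
        System.out.println(longestSharedRun(tokenize(first), tokenize(second)));
    }
}
```

The quadratic comparison is fine for two small files, but a real tool has to compare every file against every other, so it would need a far more efficient matching strategy.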
After running a lengthy violation check it may be desirable to share the findings with peers or store them for later reference. For situations of this kind PMD provides a handy reporting tool capable of generating reports in multiple formats. Currently PMD can generate reports in HTML, XML, plain text as well as Comma Separated Value (CSV) formats.
# | File | Line | Problem |
---|---|---|---|
1 | Yang.java | 8 | System.out.print is used |
2 | Yang.java | 14 | System.out.print is used |
3 | Yang.java | 17 | Method name does not begin with a lower case character. |
4 | Yang.java | 17 | Method names should not contain underscores |
5 | Yang.java | 17 | Parameter 'INPUT_PARAMETER' is not assigned and could be declared final |
6 | Yang.java | 18 | System.out.print is used |
7 | Yang.java | 21 | Parameter 'args' is not assigned and could be declared final |
8 | Yang.java | 27 | System.out.print is used |
9 | Ying.java | 1 | Each class should declare at least one constructor |
10 | Ying.java | 3 | Avoid unused private fields such as 'gOOD' |
11 | Ying.java | 5 | Method name does not begin with a lower case character. |
12 | Ying.java | 6 | System.out.print is used |
13 | Ying.java | 9 | Parameter 'pFirst' is not assigned and could be declared final |
14 | Ying.java | 9 | Parameter 'pSecond' is not assigned and could be declared final |
15 | Ying.java | 10 | System.out.print is used |
The easiest way to begin customizing PMD is by playing with existing rules. Adding new rules and removing unnecessary ones is also possible, but both require more knowledge, so it makes sense to start by experimenting with the rules that are already there.
Each PMD rule has six attributes:
Of these six attributes the first two are immutable - they cannot be customized by users. While Message, Description and Example are text-based properties and can accept any String data, Priority is an integer field ranging from 1 to 5.
PMD stores rule configuration in a special repository referred to as the Ruleset XML file. This configuration file carries information about currently installed rules and their attributes. Changes made through the Eclipse Preferences page are also stored in this file. In addition, the PMD Preferences page allows exporting and importing Rulesets which makes them a convenient vehicle for sharing rules and coding conventions across the enterprise.
Before starting the customization, it is a good idea to back up the existing configuration.
PMD allows new rules in two formats: Java-based rules, where the rule is written as a Java implementation class, and XPath rules, where the rule is defined in an XML file. Both ways are equally applicable and the choice is probably a matter of preference. This article will demonstrate creating a new rule with both Java and XPath. The rule, ParameterNameConvention, will be a simple coding convention checker making sure all method parameters start with a lowercase "p". To accomplish its task the rule will take advantage of regular expressions and will enforce the following expression on all method parameters: [p][a-zA-Z]+. Later this expression will be made into a rule property, which will allow further customization by rule users.
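Before involving PMD at all, it can help to see which names the expression [p][a-zA-Z]+ accepts. The following is a small standalone sketch using only java.util.regex; the class ParameterNameCheck and its method are hypothetical helpers, not part of PMD.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of PMD: checks names against the convention.
final class ParameterNameCheck {
    // The pattern from the article: a lowercase "p" followed by one or more letters.
    static final Pattern CONVENTION = Pattern.compile("[p][a-zA-Z]+");

    static boolean followsConvention(String parameterName) {
        // matches() requires the whole name to match, not just a prefix.
        return CONVENTION.matcher(parameterName).matches();
    }

    public static void main(String[] args) {
        String[] names = {"pFirst", "pSecond", "INPUT_PARAMETER", "args", "p", "pCount2"};
        for (String name : names) {
            System.out.println(name + " -> " + followsConvention(name));
        }
    }
}
```

Note that a bare "p" is rejected because the expression demands at least one letter after it, and that a name containing digits or underscores is rejected even when it starts with "p".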
This is all good, but before going any further into the custom PMD rules let's see how this PMD thing actually works.
PMD relies on the concept of an Abstract Syntax Tree (AST), a finite, labeled tree where nodes represent the operators and the edges represent the operands of the operators. PMD creates the AST of the source file being checked and executes each rule against that tree. The violations are collected and presented in a report. PMD executes the following steps when invoked from Eclipse (based on PMD's documentation):
Since PMD works on a tree data structure, its unit of operation is Node. Every construct in a Java source file examined by PMD is assigned a node in the Abstract Syntax Tree. These nodes are then visited by all rules in the ruleset and the method public Object visit(SimpleNode node, Object data) of the rule implementation class gets invoked. It is inside this method where the rule logic is defined.
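The visiting mechanism can be pictured with a toy model. None of the types below are PMD's own classes: ToyAstDemo, its Node and Rule types, and the sample tree are assumptions chosen only to show how an engine might hand every node of a tree to every rule and collect the violations.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; none of these types are PMD's own classes.
final class ToyAstDemo {
    static final class Node {
        final String type;   // kind of construct, e.g. "MethodDeclaration"
        final String image;  // the name taken from the source text
        final int line;
        final List<Node> children = new ArrayList<>();

        Node(String type, String image, int line) {
            this.type = type;
            this.image = image;
            this.line = line;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    interface Rule {
        // Called once for every node; violations are appended to the report.
        void visit(Node node, List<String> report);
    }

    // Depth-first walk: every rule sees the node, then the walk descends.
    static void walk(Node node, List<Rule> rules, List<String> report) {
        for (Rule rule : rules) {
            rule.visit(node, report);
        }
        for (Node child : node.children) {
            walk(child, rules, report);
        }
    }

    static List<String> check(Node root, List<Rule> rules) {
        List<String> report = new ArrayList<>();
        walk(root, rules, report);
        return report;
    }

    static Node sampleTree() {
        return new Node("ClassDeclaration", "Yang", 1)
                .add(new Node("MethodDeclaration", "run", 6))
                .add(new Node("MethodDeclaration", "WRITE_SOMETHING", 17));
    }

    static Rule lowerCaseMethodNames() {
        return (node, report) -> {
            if (node.type.equals("MethodDeclaration")
                    && !Character.isLowerCase(node.image.charAt(0))) {
                report.add("line " + node.line + ": method '" + node.image
                        + "' does not begin with a lower case character");
            }
        };
    }

    public static void main(String[] args) {
        List<Rule> rules = new ArrayList<>();
        rules.add(lowerCaseMethodNames());
        for (String violation : check(sampleTree(), rules)) {
            System.out.println(violation);
        }
    }
}
```

In this sketch a rule inspects only the node it is handed, which is why the node type matters so much when writing a real rule.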
In PMD every node extends net.sourceforge.pmd.ast.SimpleNode. This concrete class implements net.sourceforge.pmd.ast.Node and has 104 first-level children as well as an additional 8 second-level children coming from net.sourceforge.pmd.ast.AccessNode.
This screenshot captures a portion of SimpleNode's children. The entire list is available with the source distribution of PMD. This class hierarchy is important for writing custom PMD rules because each custom rule will rely on one or more of PMD's SimpleNodes to accomplish its task.
This is a good place to see the Abstract Syntax Tree in action.
<?xml version="1.0" encoding="UTF-8"?>
<CompilationUnit beginColumn="1" beginLine="13" endColumn="3" endLine="13">
  <TypeDeclaration beginColumn="1" beginLine="1" endColumn="1" endLine="13">
    <ClassOrInterfaceDeclaration abstract="false" beginColumn="8" beginLine="1" endColumn="1" endLine="13" final="false" image="Ying" interface="false">
      ...........
      ...........
      <MethodDeclarator beginColumn="23" beginLine="9" endColumn="67" endLine="9" image="thisIsCutAndPaste" parameterCount="2">
        <FormalParameters beginColumn="40" beginLine="9" endColumn="67" endLine="9" parameterCount="2">
          <FormalParameter abstract="false" array="false" arrayDepth="0" beginColumn="41" beginLine="9" endColumn="53" endLine="9" final="false">
            <Type array="false" arrayDepth="0" beginColumn="41" beginLine="9" endColumn="46" endLine="9">
              <ReferenceType array="false" arrayDepth="0" beginColumn="41" beginLine="9" endColumn="46" endLine="9">
                <ClassOrInterfaceType beginColumn="41" beginLine="9" endColumn="46" endLine="9" image="String"/>
              </ReferenceType>
            </Type>
            <VariableDeclaratorId array="false" arrayDepth="0" beginColumn="48" beginLine="9" endColumn="53" endLine="9" exceptionBlockParameter="false" image="pFirst" typeNameNode="ReferenceType" typeNode="Type"/>
          </FormalParameter>
          <FormalParameter abstract="false" array="false" arrayDepth="0" beginColumn="56" beginLine="9" endColumn="66" endLine="9" final="false">
            <Type array="false" arrayDepth="0" beginColumn="56" beginLine="9" endColumn="58" endLine="9">
              <PrimitiveType array="false" arrayDepth="0" beginColumn="56" beginLine="9" boolean="false" endColumn="58" endLine="9" image="int"/>
            </Type>
            <VariableDeclaratorId array="false" arrayDepth="0" beginColumn="60" beginLine="9" endColumn="66" endLine="9" exceptionBlockParameter="false" image="pSecond" typeNameNode="PrimitiveType" typeNode="Type"/>
          </FormalParameter>
        </FormalParameters>
      </MethodDeclarator>
      ...........
      ...........
    </ClassOrInterfaceDeclaration>
  </TypeDeclaration>
</CompilationUnit>
This AST report gives a full synopsis of the examinee, in this case Ying.java. It is possible, for example, to gain insight about the method parameter pFirst by expanding the FormalParameter node. Studying this Abstract Syntax Tree can contribute to a better understanding of how a Java source file is examined and which variables play a role in the process. This in turn could prove useful when crafting custom rules.
Now, going back to ParameterNameConvention, it is time to crank out some code. This article will create a Java class for the rule implementation and package it as a plug-in fragment. A fragment is an extension of a plug-in, and all the classes and resource files it contains are automatically added to the main plug-in's classpath. Since PMD searches the main plug-in's classpath for rule implementation classes, the fragment will be automatically available. In addition, this structure will allow for faster development and easier distribution.
package com.jacoozi.pmd.rules;

import java.util.Iterator;

import net.sourceforge.pmd.AbstractRule;
import net.sourceforge.pmd.RuleContext;
import net.sourceforge.pmd.RuleViolation;
import net.sourceforge.pmd.ast.ASTFormalParameter;
import net.sourceforge.pmd.ast.ASTMethodDeclaration;
import net.sourceforge.pmd.ast.ASTVariableDeclaratorId;

/**
 * @author Levent Gurses
 * Copyright 2005 Jacoozi
 */
public class ParameterNameConvention extends AbstractRule {

    private final static String PATTERN = "[p][a-zA-Z]+";

    public Object visit(ASTMethodDeclaration node, Object data) {
        RuleContext result = (RuleContext) data;
        String rulePattern = (!getStringProperty("rulePattern").equalsIgnoreCase(""))
                ? getStringProperty("rulePattern")
                : PATTERN;
        if (node.containsChildOfType(ASTFormalParameter.class)) {
            Iterator iterator = node.findChildrenOfType(ASTFormalParameter.class).iterator();
            while (iterator.hasNext()) {
                ASTFormalParameter element = (ASTFormalParameter) iterator.next();
                Iterator decIdIterator = element.findChildrenOfType(ASTVariableDeclaratorId.class).iterator();
                while (decIdIterator.hasNext()) {
                    ASTVariableDeclaratorId decElement = (ASTVariableDeclaratorId) decIdIterator.next();
                    if (!decElement.getImage().matches(rulePattern)) {
                        result.getReport().addRuleViolation(new RuleViolation(this, node.getBeginLine(),
                                "Parameter '" + decElement.getImage()
                                + "' should match regular expression pattern '" + rulePattern + "'",
                                result));
                    }
                }
            }
        }
        return result;
    }
}
A couple of points here. First, notice that the class extends net.sourceforge.pmd.AbstractRule. All custom rules must extend this class. Second, notice that the AST traversal starts from the method declaration. It then iterates through its children and looks for ASTFormalParameter nodes. Finally, it compares the getImage() value of each parameter's ASTVariableDeclaratorId against the rule's regular expression. Notice that a mismatch causes the creation of a new RuleViolation. As a side note, notice also that the regular expression is fed into the rule as a property, with the hard-coded PATTERN used when that property is empty.
The rule class is now complete and there are no compilation errors (red Xs). This means it is ready for testing.
Rule name: | ParameterNameConvention |
---|---|
Rule implementation class: | com.jacoozi.pmd.rules.ParameterNameConvention |
Message: | Method parameters should begin with a lowercase "p" |
Description: | Method parameters should always begin with a "p". This is equivalent to the parameters complying with regular expression [p][a-zA-Z]+. This expression can be changed on the Preferences page by adding a property "rulePattern". |
Example: | public void bringItHome(String pName, int pNumber, boolean pDoneThat, List pTr) |
What happens if you want to share a rule with colleagues or make it a common convention for the team? Simple. The fragment project can be exported as a zip file making it easy to distribute. First, export the fragment project; next unzip the file into Eclipse's plugins folder and it's done.
The new rule is now ready for use.
The second way to add custom rules to PMD requires some XPath knowledge. XPath has been out for some time now and has proven to be an effective query language for DOM-based XML. A detailed treatment of XPath would fill an entire book, so for the sake of time and space it is left out. The following Wikipedia definition provides a basic idea of what the language is all about.
XPath |
---|
XPath (XML Path Language) is a terse (non-XML) syntax for addressing portions of an XML document. Originally motivated by a desire to provide a common syntax and behavior model between XPointer and XSL, XPath has rapidly been adopted by developers as a small query language. The most common kind of XPath expression (and the one which gave the language its name) is a path expression. A path expression is written as a sequence of steps to get from one set of nodes to another set of nodes. The steps are separated by "/" (i.e. path) characters. Each step has three components: an axis specifier, a node test, and a predicate. The simplest kind of path expression takes a form such as /A/B/C, which selects C elements that are children of B elements that are children of the A element that forms the outermost element of the document. XPath syntax is designed to mimic URI (Uniform Resource Identifier) syntax or file name syntax. More complex expressions can be constructed by specifying an axis other than the default child axis, a node test other than a simple name, or predicates, which can be written in square brackets after each step. For example, the expression /A/B/following-sibling::*[1] selects all elements (whatever their name) that immediately follow a B element that is a child of the outermost A element. |
From Wikipedia |
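The two path expressions quoted above can be tried out with the XPath support in the standard Java library (javax.xml.xpath). This is a minimal sketch: the class XPathBasics and the tiny sample document are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration only: evaluates XPath expressions against a made-up document.
final class XPathBasics {
    // An outermost A with two B children; each B is immediately followed by a sibling.
    static final String XML = "<A><B><C/></B><D/><B><C/></B><E/></A>";

    // Returns the names of the elements selected by the expression.
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/A/B/C"));
        System.out.println(select("/A/B/following-sibling::*[1]"));
    }
}
```

The first expression selects both C elements; the second selects D and E, the elements that immediately follow each B.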
What is needed here is a way to implement ParameterNameConvention as an XPath rule. The chief advantage of XPath rules in PMD is the power, elegance and simplicity of XPath compared to Java. The main disadvantage is that not many people are familiar with XPath. With that being said, let's see how much time XPath can actually save.
Rule name: | ParameterNameConvention |
---|---|
Message: | Method parameters should begin with a lowercase "p" |
Description: | Method parameters should always begin with a "p". |
Example: | public void bringItHome(String pName, int pNumber, boolean pDoneThat, List pTr) |
That's it. The whole rule takes a single line:
//FormalParameter/VariableDeclaratorId[not (starts-with(@Image, 'p'))]
This one-line XPath rule tells PMD to watch for method parameters whose VariableDeclaratorId name ('Image') does not start with a "p". Very elegant.
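To see what such an expression selects, the same query can be run outside PMD against an XML fragment shaped like the AST dump shown earlier. This is only a sketch under stated assumptions: the class ParameterXPathDemo and the trimmed fragment are made up, the standard javax.xml.xpath API stands in for PMD's own rule engine, and the attribute is written in lower case (@image) because that is how it appears in the XML dump.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration only: runs the rule's query over a hand-written, trimmed AST fragment.
final class ParameterXPathDemo {
    static final String AST_XML =
            "<ClassOrInterfaceDeclaration image='Yang'>"
          + "<MethodDeclarator image='WRITE_SOMETHING'><FormalParameters>"
          + "<FormalParameter><VariableDeclaratorId image='INPUT_PARAMETER'/></FormalParameter>"
          + "</FormalParameters></MethodDeclarator>"
          + "<MethodDeclarator image='thisIsCutAndPaste'><FormalParameters>"
          + "<FormalParameter><VariableDeclaratorId image='pFirst'/></FormalParameter>"
          + "<FormalParameter><VariableDeclaratorId image='pSecond'/></FormalParameter>"
          + "</FormalParameters></MethodDeclarator>"
          + "</ClassOrInterfaceDeclaration>";

    // Same shape as the rule, with the attribute name spelled as in the XML dump.
    static final String EXPRESSION =
            "//FormalParameter/VariableDeclaratorId[not(starts-with(@image, 'p'))]";

    // Returns the names of parameters that do not start with "p".
    static List<String> offendingParameters(String astXml) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    EXPRESSION, new InputSource(new StringReader(astXml)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("image"));
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate " + EXPRESSION, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(offendingParameters(AST_XML));
    }
}
```

Unlike the Java rule, which enforces the full expression [p][a-zA-Z]+, this query only checks the first character, so a name such as p_1 would pass it.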
XPath-based PMD rules offer an efficient alternative to Java-based rules. As more development tools become XPath-compatible it is likely that an investment in this powerful query language will prove valuable for Java developers.
I must admit, I am impressed by PMD's capabilities. It has proven itself as a production-grade tool, capable of handling large volumes of Java source files. There are a couple of things I wish PMD had. The list below is dedicated to what's missing or can be improved.
There is little doubt that code quality is becoming a prime factor in today's software economy. Many companies are taking real steps to transform old-fashioned, paper-based and in many cases simply ineffective QA practices into modern, efficient, value-added software lifecycle practices. In this path, they are realizing the power of software as a tool to check, correct and improve itself. And a new breed of automated code analyzers is slowly emerging. Designed to free developers from repetitive and error-prone manual code checks, these software "robocops" take the burden from developers, thus enabling them to spend more time on real design and performance issues such as patterns and multithreaded execution.
Automated code analyzers are not meant to replace manual QA. Instead, they should be used in tandem with well-designed manual review processes. Research shows that companies using automated QA tools with paperless, manual QA gain competitive advantage by considerably reducing software maintenance costs. This reduction comes mainly from a lower number of defects and less time spent on each defect.
PMD and Eclipse are two great tools for improving personal code quality as well as implementing a consistent company-wide coding convention. PMD operates by applying a set of rules to source files. It comes with a rich suite of rules which can be extended in two ways. The traditional way of adding new rules to PMD is by implementing the rule class in Java and then adding it to the plugin's classpath. PMD has recently added XPath support for custom rules. XPath helps PMD take advantage of the fact that it operates on an XML-based Abstract Syntax Tree. Compared to Java rules, XPath rules are much shorter and look way cooler. XPath is a powerful XML query language capable of small wonders. The elegance and beauty of XPath make it a great choice for developing custom PMD rules.