How Open-Sourced Projects use Checkstyle

  • Post last modified:2021-02-10
  • Post category:Code Review

Table of Contents

  • Overview of Checkstyle
  • Integrating Checkstyle into Integrated Development Environments (IDE)
  • Integrating Checkstyle into Build Tools
  • Default Check Item Categories of Checkstyle
  • Coding Styles Distributed by Default
  • Open-Sourced Projects and Checkstyle
  • How ElasticSearch uses Checkstyle in their project
  • Summary

Overview of Checkstyle

Checkstyle is a development tool which helps programmers write Java code adhering to a coding standard. The following lines are quoted from the official site of Checkstyle:

Checkstyle is a development tool to help programmers write Java code that adheres to a coding standard. It automates the process of checking Java code to spare humans of this boring (but important) task. This makes it ideal for projects that want to enforce a coding standard.

In addition, Checkstyle can be used to check many aspects of your source code, such as issues in class and method design and layout format. For further details of the checks, please refer to the Checks section of the Checkstyle site mentioned above.

Checkstyle can be customized with various parameters. If you already have an existing coding standard you follow, you can define suitable rules for Checkstyle. If you haven’t applied a particular coding style to your project yet, you can use config files for checking conventional coding standards such as “Sun Java Code Conventions” and “Google Java Style”, which are usable by default. By doing so, you can incorporate a coding standard into your project’s workflow with ease.

In Checkstyle, you describe the rules in an XML file. Checkstyle itself ships with a command-line runner (invoked through the java command) and an Ant task for running the checks, but in real projects it is more commonly integrated into an integrated development environment (IDE) or a build tool.
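As a rough illustration of what such a rule file looks like, here is a minimal sketch. It is not taken from any project discussed in this article, and the two checks were chosen arbitrarily. The DOCTYPE shown is the one used in recent Checkstyle documentation; older releases may expect a different identifier.

```xml
<?xml version="1.0"?>
<!DOCTYPE module PUBLIC
    "-//Checkstyle//DTD Checkstyle Configuration 1.3//EN"
    "https://checkstyle.org/dtds/configuration_1_3.dtd">
<!-- Illustrative sketch only: the selection of checks is an assumption. -->
<module name="Checker">
  <!-- TreeWalker hosts the checks that inspect the Java syntax tree. -->
  <module name="TreeWalker">
    <!-- Flags wildcard imports such as "import java.util.*;". -->
    <module name="AvoidStarImport" />
    <!-- Flags empty statements (a stray ";"). -->
    <module name="EmptyStatement" />
  </module>
</module>
```

Every rule is a module element, and its parameters are given as nested property elements.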

In order to incorporate Checkstyle into IDEs, you can use plugins that are provided for various IDEs. Using these, you can run checks in real time while writing source code.

If you want to incorporate Checkstyle into a build tool, you can use Maven or Gradle. Both are widely used and provide a Checkstyle plugin out of the box, which makes integration easy. With Checkstyle in the build, you can verify that the project’s source code adheres to the coding standard at all times.

However, integrating Checkstyle into a build tool creates a very strong constraint: even one small violation prevents the code from being merged. For existing projects that have not adopted Checkstyle yet, or for projects that do not enforce a coding standard so strictly, it may be better to use a web service dedicated to this purpose, such as Sider.

Since a single XML file can drive the checks in both an IDE and a CI process (build tool or Sider), we recommend incorporating both check methods into a project.

Integrating Checkstyle into IDEs

Checkstyle can be integrated into major Java IDEs such as Eclipse, IDEA, and NetBeans using plugins, as well as lighter editors such as Emacs, Vim, Atom, Sublime Text, etc.

An IDE plugin for Checkstyle checks for issues in real time and shows them in the editor. It also provides functions such as listing issues and jumping to their locations with a click. By integrating Checkstyle into IDEs, programmers can write code adhering to a coding standard without stress in their everyday work.

A plugin for integrating Checkstyle into Eclipse. It provides functions such as real-time checking, displaying issues, listing all issues and jumping to their locations. Additionally, you can use this plugin as a full-featured GUI editor for Checkstyle’s config file (checkstyle.xml), so you can describe rules without referring much to Checkstyle’s documentation.

A plugin for integrating Checkstyle into IntelliJ IDEA. It provides functions such as running checks with Checkstyle and jumping to locations with issues. It does not provide functions for editing the config file.

A plugin for integrating Checkstyle into NetBeans. It provides functions such as real-time checking, displaying issues, listing all issues and jumping to their locations. It does not provide functions for editing the config file.

Integrating Checkstyle into Build Tools

Checkstyle itself supports integration with Ant by default. For Maven and Gradle, both widely used build tools, the Checkstyle plugins are provided by the build tools themselves. By integrating Checkstyle into a build tool, you can run its checks in CI (Continuous Integration), ensuring that the coding standard is followed throughout the project at all times.

If you want to enforce a coding standard loosely, for instance if you wish to use it as a guideline instead of following it 100%, integration into a build tool may not be the best solution for you. You may want to consider other methods.
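For the Maven case, the integration might look like the following pom.xml fragment. This is a hedged sketch: the rule file location and the execution id are assumptions, and a real build would normally pin a plugin version.

```xml
<!-- Fragment for the build/plugins section of pom.xml (illustrative sketch). -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <!-- Pin a version element here in a real build. -->
  <configuration>
    <!-- Assumed location of the project's rule file. -->
    <configLocation>checkstyle.xml</configLocation>
    <!-- Print violations to the console as well as to the report. -->
    <consoleOutput>true</consoleOutput>
    <!-- true: violations at the configured severity fail the build. -->
    <failOnViolation>true</failOnViolation>
  </configuration>
  <executions>
    <execution>
      <!-- Hypothetical id; any unique name works. -->
      <id>checkstyle-check</id>
      <phase>validate</phase>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

Setting failOnViolation to false makes the plugin report violations without failing the build, which is one way to approximate looser enforcement while staying inside the build tool.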

Default Check Item Categories of Checkstyle

Checkstyle 8.5, the latest version as of December 2017, has 154 rules classified into 14 categories. You can customize the behavior of each rule with various parameters. Since it would be difficult to go over every rule here, we will give a brief overview of the rules in each category.

Annotations

Rules in this category concern annotations. They define the placement style (writing annotations on the same line as the declaration or on a separate line, allowing multiple annotations on one line or not, etc.) for each annotation target (class, method, arguments, etc.). Additionally, you can enforce the usage of @Override and @Deprecated and limit the usage of @SuppressWarnings with these rules.

Block Checks

They configure rules concerning blocks in Java (sections in curly braces). You can define how new lines around curly braces will be arranged and choose to allow/deny empty code blocks with these rules.

Class Design

Rules in this category concern class design. For example, they cover the visibility of constructors and methods (private~public) and the use of other modifiers (final, abstract, etc.).

Coding

This category contains 43 rules concerning coding style. In general, it seems that coding style rules that don’t fit into the other categories are gathered here. We can’t list them all here, but for reference, rules such as the following belong to this category:

  • Putting a comma after the last element of an array
  • Allowing empty statements or not
  • Limits on the nesting depth of if, for and try statements
  • Prohibiting magic numbers
  • Forcing the usage of default in a switch statement
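The rules listed above map to Checkstyle modules roughly as in the sketch below. The depth limits are arbitrary values chosen for illustration, not recommendations.

```xml
<!-- Illustrative sketch: goes inside the Checker module of checkstyle.xml. -->
<module name="TreeWalker">
  <!-- Requires a comma after the last element of an array initializer. -->
  <module name="ArrayTrailingComma" />
  <!-- Disallows empty statements. -->
  <module name="EmptyStatement" />
  <!-- Limits on nesting depth; the max values are assumptions. -->
  <module name="NestedIfDepth">
    <property name="max" value="2" />
  </module>
  <module name="NestedForDepth">
    <property name="max" value="2" />
  </module>
  <module name="NestedTryDepth">
    <property name="max" value="1" />
  </module>
  <!-- Prohibits magic numbers. -->
  <module name="MagicNumber" />
  <!-- Requires a default clause in every switch statement. -->
  <module name="MissingSwitchDefault" />
</module>
```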

Headers

They configure rules concerning the header section of each source file. When using shared headers in a project, such as descriptions of copyright and license, you can use the rules in this category to describe them.

Imports

Rules in this category concern Java import statements. You can prohibit wildcard imports, prohibit static imports, and set the order of imports. Additionally, if the use of certain classes or packages is prohibited in a project (java.sql, for example), you can define rules that prohibit importing them to ensure that they are not used.
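A minimal sketch of such import rules, using the java.sql example from the paragraph above:

```xml
<!-- Illustrative sketch: goes inside the Checker module of checkstyle.xml. -->
<module name="TreeWalker">
  <!-- Prohibits wildcard imports such as "import java.util.*;". -->
  <module name="AvoidStarImport" />
  <!-- Prohibits static imports. -->
  <module name="AvoidStaticImport" />
  <!-- Prohibits importing anything from the listed packages. -->
  <module name="IllegalImport">
    <property name="illegalPkgs" value="java.sql" />
  </module>
</module>
```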

Javadoc Comments

Rules in this category concern Javadoc comments. You can require Javadoc comments on methods and fields depending on their visibility, and enforce the order of Javadoc tags (@param, @return, etc.).

Metrics

Rules in this category mainly measure metrics such as method/class complexity and class dependency, and detect source code whose complexity or dependencies exceed a certain level. You can use them to find source code that may need refactoring.

Miscellaneous

These are miscellaneous rules that don’t belong to any other category. In the current version, rules such as the following are classified here:

  • Indent styles for source code and comments
  • Array declaration style (putting [] after the type name or the variable name)
  • Putting a new line at the end of a file
  • Forcing the usage of “L” in a long type literal (100L, etc., since the letter “l” is difficult to distinguish from the number “1”)
  • Checking if TODO comments exist
  • Checking that keys in a properties file are not duplicated
  • Prohibiting Unicode escapes (recommended to write with UTF-8)
  • Others
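As a hedged sketch, the miscellaneous rules above correspond to the modules below. Note that some of them work on whole files and sit directly under Checker, while the others inspect Java syntax and sit under TreeWalker.

```xml
<!-- Illustrative sketch of the miscellaneous rules listed above. -->
<module name="Checker">
  <!-- File-level checks are direct children of Checker. -->
  <module name="NewlineAtEndOfFile" />
  <!-- Detects duplicated keys in properties files. -->
  <module name="UniqueProperties" />
  <module name="TreeWalker">
    <!-- Indentation of source code. -->
    <module name="Indentation" />
    <!-- Placement of [] in array declarations. -->
    <module name="ArrayTypeStyle" />
    <!-- Requires "L" rather than "l" in long literals. -->
    <module name="UpperEll" />
    <!-- Reports TODO comments. -->
    <module name="TodoComment" />
    <!-- Restricts Unicode escapes such as \u00e9. -->
    <module name="AvoidEscapedUnicodeCharacters" />
  </module>
</module>
```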

Modifiers

Rules in this category concern modifiers for classes/methods (private~public, final, etc.). You can define the order when multiple modifiers exist and allow/deny redundant modifiers (such as a public modifier on an interface method) with these rules.

Naming Conventions

Rules in this category concern naming conventions. Since you can define a regular expression for each kind of name that appears in Java, such as classes, methods, arguments, and local variables, you can handle most naming conventions with them. The default regular expressions basically follow the general naming conventions of Java, so you can check against those conventions by defining a rule and omitting the regular expression parameter.
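The sketch below shows both styles: rules that rely on the default expression, and one rule with a custom format. The "m" prefix convention for member fields is a hypothetical project rule, used here only to show the format property.

```xml
<!-- Illustrative sketch: goes inside the Checker module of checkstyle.xml. -->
<module name="TreeWalker">
  <!-- No format property: the default Java-style expressions apply. -->
  <module name="TypeName" />
  <module name="MethodName" />
  <module name="LocalVariableName" />
  <!-- Hypothetical convention: member fields look like mCount or mUserName. -->
  <module name="MemberName">
    <property name="format" value="^m[A-Z][a-zA-Z0-9]*$" />
  </module>
</module>
```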

Regexp

Rules in this category use regular expressions to examine Java files and other files (such as properties files). Because you can write any regular expression, these rules are widely used to describe project-specific standards that are difficult to express directly with the default rules.

Size Violations

Rules in this category are based on the size of elements, such as the number of lines in a Java file, the number of lines in each method, and the length of each line. By detecting files/methods that are too large, you can find source code in need of refactoring.
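A rough sketch of size rules follows. All limits are arbitrary values for illustration. LineLength is left out because the module it must be nested under depends on the Checkstyle version.

```xml
<!-- Illustrative sketch; every max value here is an assumption. -->
<module name="Checker">
  <!-- File-level check: flags files longer than 1000 lines. -->
  <module name="FileLength">
    <property name="max" value="1000" />
  </module>
  <module name="TreeWalker">
    <!-- Flags methods longer than 60 lines. -->
    <module name="MethodLength">
      <property name="max" value="60" />
    </module>
    <!-- Flags methods and constructors with more than 5 parameters. -->
    <module name="ParameterNumber">
      <property name="max" value="5" />
    </module>
  </module>
</module>
```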

Whitespace

Rules in this category concern whitespace. You can define in detail whether to put whitespace before and after each element of the Java syntax. Additionally, you can configure rules to check for space characters other than half-width spaces.

Coding Styles Distributed by Default

The Checkstyle site distributes config files with rules for “Sun Java Coding Conventions” and “Google Java Style”, both widely used coding styles in Java projects. When deciding on a coding standard for a project, adopting a common standard as it is, or extracting the portions you need from it, will save time and effort. Also, adopting a proven standard allows you to avoid unnecessary problems.

  • Sun Java Coding Conventions
      • The source code of the Java Foundation Classes (standard classes included in the JDK) is described as adhering to the Sun Java Coding Conventions. Therefore, this coding convention is adopted in projects such as OpenJDK.
      • It was announced that this convention would no longer be maintained after the last version was released in 1997.
      • Conventions concerning Java language features added after 1997 (annotations, lambda expressions, etc.) do not exist.
      • There are outdated rules, such as “80 characters per line” (which originates from the limits of text terminals).
  • Google Java Style
      • Has been maintained continuously since its first release in 2013. The current newest version was released in September 2017.
      • Commonly adopted in relatively new open-sourced projects.

Google Java Style is a standard that most teams can accept easily. If you are choosing a project’s coding standard now, we recommend starting with Google Java Style.

Open-Sourced Projects and Checkstyle

How Open-Sourced Projects use Checkstyle

From the Java projects published on GitHub, we extracted the top-10 starred projects (stars represent the popularity of a project on GitHub) and investigated how they utilize Checkstyle. The results are as follows:

| Project Name | Project Overview | Has checkstyle.xml File | Incorporation into Build Tool | Coding Standard |
| --- | --- | --- | --- | --- |
| ReactiveX/RxJava | API for asynchronous programming | Yes | gradle | Adopts relatively few rules |
| iluwatar/java-design-patterns | Design pattern implementation with Java | Yes | maven | Based on Google Java Style |
| elastic/elasticsearch | Distributed search engine | Yes | gradle | Adopts relatively few rules |
| square/retrofit | Type safe HTTP client library | Yes | maven | Based on Google Java Style |
| square/okhttp | HTTP client library for Android | Yes | maven | Based on Google Java Style |
| google/guava | Google Core Libraries for Java | - | | Google Java Style |
| PhilJay/MPAndroidChart | Graph library for Android | - | | |
| JetBrains/kotlin | Programming Language | - | | |
| JakeWharton/butterknife | View injection library for Android | Yes | gradle | Based on Google Java Style |
| bumptech/glide | Media management library for Android | Yes | gradle | Numerous checks with original rules |

Out of the top-10 Java projects, 7 of them provide a checkstyle.xml file, and all of those projects have Checkstyle incorporated into the build process of maven/gradle. We can see that Checkstyle is widely used in relatively new projects published on GitHub.

On the other hand, it seems that Checkstyle is not used as much in projects with a history as long as that of Checkstyle (2001~), such as Tomcat (1999~), Struts (2000~), and Spring (2003~). It may have been difficult for those long-standing projects to introduce Checkstyle halfway through a large project. Of the repositories mentioned above, only Tomcat provides a checkstyle.xml file, and even Tomcat doesn’t incorporate it into the build tool.

How ElasticSearch uses Checkstyle in their project

In this section, we will go over how ElasticSearch, one of the top-10 starred Java projects on GitHub, uses Checkstyle in its project.

ElasticSearch is a search engine designed for distributed environments, and one of the most actively developed open-sourced projects. It is a relatively new and large project: development started in 2010, and as of December 2017 it had over 29000 commits and contributions from over 900 people.

Configuration items of checkstyle.xml

Judging from its Git history, it seems that Checkstyle was introduced into the ElasticSearch project around the beginning of 2016. Because it was introduced halfway through the project, relatively few rules were checked at first. It seems that the team is gradually expanding and improving the checks while reducing the noise created by violations that can’t be fixed promptly, adding rules and removing them again when unavoidable. The ReactiveX/RxJava project appears to have taken similar steps.

There is no point in running many checks if the detected issues are left unresolved. Leaving large numbers of violations unresolved creates noise, and may result in overlooking more urgent violations. The underlying idea seems to be that if violations are going to be left unresolved, it is better not to check for them in the first place.

On the other hand, many projects that use Checkstyle from the start adopt the Google Java Style with minor changes. Adopting a proven coding standard as it is seems reasonable when the Checkstyle checks run consistently from the start of the project.

Adding Project Specific Checks

The ElasticSearch project uses its own serialization mechanism instead of Java’s default serialization with java.io.Serializable to ensure the performance of distributed processing. Therefore, using the java.io.Serializable interface and the serialVersionUID field is prohibited throughout the project. Rules using regular expressions are defined to check for this prohibition with Checkstyle.

<module name="RegexpSinglelineJava">
  <property name="format" value="serialVersionUID" />
  <property name="message" value="Do not declare serialVersionUID." />
  <property name="ignoreComments" value="true" />
</module>
<module name="RegexpSinglelineJava">
  <property name="format" value="java\.io\.Serializable" />
  <property name="message" value="References java.io.Serializable." />
  <property name="ignoreComments" value="true" />
</module>

Because these regular expression rules are simple searches for matching strings, they are not perfect checks in terms of the meaning of the Java source code. For example, field names such as “serialVersionUIDFake”, which should be allowed semantically, are prohibited. Conversely, Serializable can still be used by sidestepping the rule with import java.io.* and implements Serializable (although this does get caught by the rule prohibiting the use of * in an import statement).
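One possible way to reduce such false positives is to anchor the expressions with word boundaries. The following is a sketch of that idea only, not what the ElasticSearch project uses, and it still cannot catch the wildcard import case.

```xml
<!-- Illustrative sketch: a stricter variant of the two rules, not the project's own. -->
<module name="TreeWalker">
  <module name="RegexpSinglelineJava">
    <!-- \b matches at word boundaries, so serialVersionUIDFake is not flagged. -->
    <property name="format" value="\bserialVersionUID\b" />
    <property name="message" value="Do not declare serialVersionUID." />
    <property name="ignoreComments" value="true" />
  </module>
  <module name="RegexpSinglelineJava">
    <!-- Escaped dots match only a literal "." character. -->
    <property name="format" value="\bjava\.io\.Serializable\b" />
    <property name="message" value="References java.io.Serializable." />
    <property name="ignoreComments" value="true" />
  </module>
</module>
```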

However, excluding those clearly malicious cases, the checks are practical enough. You can run semantically stricter rules by implementing a custom check class for Checkstyle, but in most cases, rules using regular expressions are more than enough.

Configuring Check Exceptions

Even if you choose to make a project follow a coding standard thoroughly, there are still cases where you want exceptions that allow violations.

For example, the source code of ElasticSearch includes sections generated automatically by the ANTLR tool, and checking such code does not make any sense. Exceptions are also needed for source code that uses special APIs such as JNA (Java Native Access). Additionally, many lines in the ElasticSearch project’s source code exceed the line length limit but are handled as exceptions, as they will be fixed later on.

To describe exceptions like the above, the ElasticSearch project uses a checkstyle_suppressions.xml file, a suppression mechanism provided by Checkstyle.

<!-- These files are generated by ANTLR so its silly to hold them to our rules. -->
<suppress files="org[/\\]elasticsearch[/\\]painless[/\\]antlr[/\\]PainlessLexer\.java" checks="." />
<suppress files="org[/\\]elasticsearch[/\\]painless[/\\]antlr[/\\]PainlessParser(|BaseVisitor|Visitor)\.java" checks="." />
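In the suppressions file, suppress elements like these sit inside a suppressions root element. For Checkstyle to apply them, the file has to be registered with the checker. One way to do this is the SuppressionFilter module, sketched below; the file path is an assumption, and this is not necessarily how the ElasticSearch build wires it up.

```xml
<!-- Illustrative sketch of registering a suppressions file in checkstyle.xml. -->
<module name="Checker">
  <!-- Violations matching an entry in this file are filtered out. -->
  <module name="SuppressionFilter">
    <!-- Assumed path, relative to where Checkstyle is run. -->
    <property name="file" value="config/checkstyle_suppressions.xml" />
  </module>
  <module name="TreeWalker">
    <!-- The project's regular checks go here. -->
    <module name="AvoidStarImport" />
  </module>
</module>
```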

The same exceptions can be implemented by writing @SuppressWarnings annotations directly in the source code, but using checkstyle_suppressions.xml keeps the code from being cluttered with extra annotations.
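For reference, the annotation-based approach needs two modules in checkstyle.xml before Checkstyle honors the annotations. A minimal sketch, with MagicNumber standing in as an arbitrary example check:

```xml
<!-- Illustrative sketch: enabling @SuppressWarnings support in checkstyle.xml. -->
<module name="Checker">
  <!-- Filters out violations covered by a @SuppressWarnings annotation. -->
  <module name="SuppressWarningsFilter" />
  <module name="TreeWalker">
    <!-- Collects the @SuppressWarnings annotations found in the source. -->
    <module name="SuppressWarningsHolder" />
    <!-- Example check; in Java code it can then be silenced with
         @SuppressWarnings("checkstyle:magicnumber") on a class or method. -->
    <module name="MagicNumber" />
  </module>
</module>
```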

Either way, it is important to establish a method for configuring exceptions and to avoid creating large amounts of noise by leaving check violations unresolved. This noise could hide more important and dire issues; it would be better not to check at all than to leave the violations unresolved.

Summary

  • Incorporating checks with Checkstyle into both the IDE and the build tool allows you to write code adhering to a coding standard without stress and ensures that the standard is followed throughout the project.
  • When introducing Checkstyle from the start of a project, adopting Google Java Style keeps the introduction easy, since you can make use of predefined rules. It is a proven standard and it will help you avoid unnecessary problems.
  • When introducing Checkstyle in the middle of a project, starting with as few rules as possible and adding rules step by step is an effective approach. If violations of a check are going to be left unresolved for a long period of time, it would be better to remove that check, since the violations will hide more important issues.
  • Even when there are project-specific standards, you can run sufficient checks by describing the rules with regular expressions.

Using Sider allows you to incorporate checks with Checkstyle into your project’s review process with ease. With Sider, you can immediately make use of checks with “Sun Java Coding Conventions” and “Google Java Style”. If needed, you can prepare your project’s unique checkstyle.xml and use it for checking.

In projects newly introducing Checkstyle, you may at first be overwhelmed by the number of warnings. With Sider, you can introduce Checkstyle slowly but surely, running checks only on the code newly written in each Pull Request. You can also immediately start discussions about the issues or about which rules to remove. We highly recommend trying out Sider for free if you haven’t introduced Checkstyle to your project yet.


For more information about Sider, please go to our website.
