By Mike Rozlog, Network World, 07/23/09
In today’s competitive market and economy, developers need every tool they can get to increase productivity, reduce cost and lower maintenance while ensuring proper execution in production. One of the more under utilized developer tools is static software audits.
The concept of static analysis has been around for years, and over the past few years tools to evaluate and diagnose the style of the code have matured. There are hundreds of software audits available to developers today in almost any language. These audits can isolate poor coding practices in various areas like Arrays, Loops, Coding Style, Design Issues, Duplicate Code, Naming Style, Performance, and many others. Inside each one of those top-level classifications is another full set of audits to be used by developers or their teams.
A quick word of caution, using audits can cause audit paralysis where the developers get overwhelmed by the tool reporting too many things to fix. I once looked at a 200,000 line application, small by today’s standards, and ran the full complement of 200+ audits. The report basically stated that I had over 350,000 violations that needed to be fixed. What, where, how… overwhelming!
One key to successful audit usage is defining a limited set of audits. This is usually different for every developer or team. Before embarking on an audit hunt, the developers should come to a common-goal regarding what they are trying to fix or solve. This takes a few minutes or in some cases hours, but the time is well worth it.
I always recommend developers and teams start small, pick a set of limited audits that can be used to fix various common issues and give a great return. It should be noted that before engaging in an audit hunt that the developers already follow common practices like SCM, Unit Testing, and hopefully a QA process.
The great thing about audits is they give information about code and how it is constructed. Just because the report states that line 100323 has an anomaly, it does not necessarily mean that a) it is a true problem, or b) it will actually cause a problem. A good tool should give the ability to mark a reported problem as verified; this can be done in comments or something to let the tool know that the area has been reviewed and is OK to continue without modification.
So when do you use audits? I recommend audits be run every single night or during an integration build to ensure good form, especially if the code is being developed from scratch or is a new project. I believe audits should be run at least once when performing Software Archeology on existing code, and should be run all the time during a refactoring process.
The five audits that I’m going to focus on are Numerical Literal in code, String Literal, god Method, Shotgun Surgery and Duplicate Code. When I present these to companies they always look at me and say the same thing, “Why those five?” There are many reasons but in general, once they are explained, they are easy to understand and at the same time they have real benefit.
Numerical Literal in Code:
For the masses that read Martin Fowler’s awesome book “Refactoring” this audit was named “Magic Number” refactor. However, for me this is one of the simplest refactoring methods to learn and understand.
We have all written code like:
double salaryCalc(double salary){
return salary * 1.34564333721
}
C++
Then two months later, we are asked to update the salaryCalc method and when they review the code they become stumped. Most likely they can not remember or were not involved in the original writing of the method to know what 1.34564333721 represents. This can become a maintenance nightmare, especially if the developers you work with don’t put a lot of comments in the code they are working on.
Instead of using a numeric literal, it would be better to replace that number with a symbolic constant. This way a meaningful name can be given to the number, as an example:
double salaryCalc(double salary){
return salary * BONUS
}
static final double BONUS = 1.34564333721
C++
Now when the developers review the code, at a very minimum they should know what 1.34564333721 is the bonus for the calculation.
Again if you are doing refactoring, the unit test should not have to change and the outcome should be exactly the same when the refactor is complete.
String Literal:
We all know there are times when time is of the essence, and we have to make changes to code and get it into production as soon as possible. String literals just happen. This audit is especially nice when you need to internationalize the software, or have already done that task. It should be run very often and all new occurrences should be remediated as soon as they are found. Just like the audits discussed in the paper, the unit tests should not change and the outcome should stay the same.
This would be incorrect:
public const helloWorldMessage : String = ‘Hello World!’;
Delphi
This would be correct:
Public const helloWorldMessage : String =
resourceManager.GetString(‘msg.helloworld’);
Delphi
This method would retrieve the helloWorldMessage string from a resource bundle that could be changed outside of the application and corrected without having to recompile.
god Method:
The god ‘X’ audits, where ‘X’ is equal to Method, Class, or Package are great checks on good object design. This audit does more than just look at the code and spit out that it should be done a better way, as with to two above audits.
The overview for this audit will be the god Method; the simple way to explain it is that you have a class, the class contains 50 methods, however only 1 method appears to be doing all the work. This method would be marked as a god Method.
By the way, most god ‘X’ situations do not start out that way. In the case of god Methods, they normally start out as simple normal methods but over time as more functionality gets added and the method has more responsibilities assigned to it, it grows and grows and eventually it becomes a god Method.
So what calculations are used to figure out if a method is doing more work than other methods? The usual calculation revolves around three major audits and a set of metrics to determine its status. The three audits used include Long Methods, Long Parameters, and Switch statements. The 4 basic metrics used include; Line of Code (LOC), Number of Parameters (NOP), Number of Local Variables (NOLV), and Maximum Number of Branches (MNOB).
By putting those numbers together you can start to see how the calculation works. Long Methods usually mean more than one operation is being performed, Long Parameters are hard to understand, and Switch statements mean a lot of different paths through the code. By adding in the metrics LOC per method compared to other methods and then verifying the NOP, NOLV, and MNOB, it can calculate what a god Method is and what is not.
So what do you do once you have isolated a god Method? There are a number of refactors that could occur, like Extract Method, Introduce Parameter Object, and many others. The idea is to make sure the method is only doing the work that the method was supposed to do in the first place. As an example, having a method that calculates the salary and updates the salary history, generates salary reports, and calculates taxes all in one single method may not be the best approach. Plus, from a simple maintenance perspective, separation is going to make that job much easier.
Shotgun Surgery:
This audit is one of my favorite audits, not just because of the cool name, but because this audit can save a lot of time. Like the god ‘X’ audits, this method also uses metrics to calculate its results.
I like to explain shotgun surgery this way; ever isolate the one method with one line of code that needs to be modified? You know in your developer heart that when you make the change to that one line of code that you will get to leave the office early, get to have dinner with the family, get to watch your favorite T.V. show, and maybe, just maybe, get to go to bed early! However, you change the one line of code in the one method, run unit tests it passes, put it in production… you don’t get to go home until 3:00 am, your spouse and kids are upset and your manager is really mad for breaking production.
However, if you would have used the shotgun surgery audit, the bad stuff may not have had to happen. This is because the audit basically looks at the number of places in your code that is relying on that one method. Meaning that you will know that one line change is going to affect the entire code base and you better take a step back before actually changing that line.
The two metrics that are used to help calculate a shotgun surgery method are Changing Methods (CM) and Changing Classes (ChC). The changing methods audit is really the number of methods that are associated or relying on the particular method, and the changing classes is the number of classes that is also affected by changing the method. So you can think of the shotgun surgery audit as a mini-method-dependency-checker.
So if you have a method that is reported to have shotgun surgery what should you do? Simple. Slow down… look a little deeper into the code, find the dependencies and review what the proposed change would do to the system. The extra time will be paid back many times over with the re-work that will not have to be done.
Duplicate code:
There are many days I believe Copy/Paste should be banished from our editors. The amount of times we use this feature in our code is astounding. Sometimes I refer to the process as Snarf and Barf, because it takes no time to execute and no real thought behind it. This leads too many of us (including myself) to copy bad code and paste it all over the place.
This leads to many problems, the biggest being a maintenance nightmare. Think of how many times we copy five or 10 lines of code and put that same code in the next method, in another method, and so on. Then we change the second method just a little for this one border case, then the next, how do you keep track?
Using this audit you should be able to find many of the areas that this occurs in. Many tools out there today state that 10 lines or more have to be duplicated before the audit returns a true. That is personally way too high, especially if the tool you use has the ability to check for duplicates in Conditionals and Constructors. I would recommend setting the number down to three to five lines.
Once the developer has isolated the duplicates, a couple of standard refactoring methods can be applied, usually Extract Method, or Pull Up Field will do the trick.
So an example maybe an IF statement like:
if (int x=0; x< 100; x++{
//other statements
someValue = 10;
} else {
//other statements
someValue = 10;
}
Java
Could be written as:
if (int x=0; x < 100; x++{
//other statements
} else {
//other statements
}
someValue = 10;
Java
The over-simplified example highlights it perfectly; the someValue was going to get changed to 10 every time so it may as well be changed only once.
There are a lot of audits out there today; there are most likely audits for almost anything we can think of. Remember, when starting with audits to start slow, pick a couple that you believe will give good feedback and have a potential for an easy payback and you should be ready to go.
Audits are one of those great tools that many developers seem to neglect or forget about. Take the time to find a couple and hopefully you will be able to produce better software in less time.
About the author: Rozlog is the senior director of Java solutions for Embarcadero Technologies. His latest book, a collaboration, is Mastering JBuilder from John Wiley & Sons, Inc. He can be reached at michael.rozlog@embarcadero.com.
All contents copyright 1995-2009 Java World, Inc. http://www.javaworld.com