What to Look for in a Code Review
Contents
JetBrains Technical Series
About this book
Introduction
  What do you look for when reviewing someone else's code?
  What should you look for
Tests
  Ask yourself these questions
  Reviewers can write tests too
  Summary
Performance
  Performance Requirements
  Calls outside of the service/application are expensive
  Using resources efficiently and effectively
  Warning signs a reviewer can easily spot
  Correctness
  Code-level optimisations
  Summary
Data Structures
  Lists
  Maps
  Sets
  Stacks
  Queues
  Why select the right data structure?
  Summary
SOLID Principles
  What is SOLID?
  Single Responsibility Principle (SRP)
Security
  Automation is your friend
  Sometimes It Depends
  Understand your Dependencies
  Summary
Source Code
The source code for the examples in this book can be found in the corresponding JetBrains GitHub
repository:
https://github.com/jetbrains/jetbrains-books-examples/whattolookforinacodereview
JetBrains website:
http://blog.jetbrains.com/upsource/category/practices/
Introduction
Let's talk about code reviews. If you take only a few seconds to search for information about code
reviews, you'll see a lot of articles about why code reviews are a Good Thing (for example, this post
by Jeff Atwood).
You'll also see a lot of documentation on how to use code review tools like our very own Upsource.
What you don't see so much of is a guide to things to look for when you're reviewing someone else's
code.
Probably the reason there's no definitive article on what to look for is that there are a lot of
different things to consider. And, like any other set of requirements (functional or non-functional),
individual organizations will have different priorities for each aspect.
Since this is a big topic to cover, the aim of this chapter is to outline just some of the things a reviewer
could be looking out for when performing a code review. Deciding on the priority of each aspect
and checking them consistently is a sufficiently complex subject to be a chapter in its own right.
ensure that your code is consistently formatted, that standards around naming and the use of the
final keyword are followed, and that common bugs caused by simple programming errors are found.
For example, you can run IntelliJ IDEA's inspections from the command line so you don't have to
rely on all team members having the same inspections running in their IDE.
Design
* How does the new code fit with the overall architecture?
* Does the code follow SOLID principles, Domain Driven Design and/or other design paradigms the team favors?
* What design patterns are used in the new code? Are these appropriate?
* If the codebase has a mix of standards or design styles, does this new code follow the current practices? Is the code migrating in the correct direction, or does it follow the example of older code that is due to be phased out?
* Is the code in the right place? For example, if the code is related to Orders, is it in the Order Service?
* Could the new code have reused something in the existing code? Does the new code provide something we can reuse in the existing code? Does the new code introduce duplication? If so, should it be refactored to a more reusable pattern, or is this acceptable at this stage?
* Is the code over-engineered? Does it build for reusability that isn't required now? How does the team balance considerations of reusability with YAGNI?
Functionality
* Does the code actually do what it was supposed to do? If there are automated tests to ensure correctness of the code, do the tests really test that the code meets the agreed requirements?
* Does the code look like it contains subtle bugs, like using the wrong variable for a check, or accidentally using an "and" instead of an "or"?
Tests
In the previous chapter we talked about a wide variety of things you could look for in a code review.
Now we'll focus on one area: what to look for in the test code.
We're assuming that:
* Your team already writes automated tests for code.
* The tests are regularly run in a Continuous Integration (CI) environment.
* The code under review has been through an automated compile/test process and passed.
You can check the coverage report for the new lines of code (which should be easy to identify,
especially if you're using a tool like Upsource) to make sure it's adequately covered.
In the example above, the reviewer may ask the author to add a test to cover the case where the if
on line 303 evaluates to true, as the coverage tool has marked lines 304-306 in red to show they
aren't tested.
100% test coverage is an unrealistic goal for pretty much any team, so the numbers coming out of
your coverage tool might not be as valuable as insights into which specific areas are covered.
In particular, you want to check that all logic branches are covered, and that complex areas of code
are covered.
It's a fairly simple test, but I'm not entirely sure what it's testing. Is it testing the save method? Or
getMyLongId? And why does it need to do the same thing twice?
The intent behind the test might be clearer as:
The specific steps you take in order to clarify a test's purpose will depend upon your language,
libraries, team and personal preferences. This example demonstrates that by choosing clearer
names, inlining constants and even adding comments, an author can make a test more readable
by developers other than him- or herself.
Can I think of cases that are not covered by the existing tests?
Often our requirements aren't clearly specified. Under these circumstances, the reviewer should
think of edge cases that weren't covered in the original bug/issue/story.
If our new feature is, for example, "Give the user the ability to log on to the system", the reviewer
could be thinking, "What happens if the user enters null for the username?" or "What sort of error
occurs if the user doesn't exist in the system?". If these tests exist in the code being reviewed, then
the reviewer has increased confidence that the code itself handles these circumstances. If the tests
for these exceptional cases don't exist, then the reviewer has to go through the code to see if they
have been handled.
If the code exists but the tests don't, it's up to your team to decide what your policies are: do you
make the author add those tests? Or are you satisfied that the code review proved the edge cases
were covered?
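As a rough illustration of the log-on example, the two edge cases might be pinned down with tests along these lines. This is only a sketch: `LoginService`, its methods and the test names are hypothetical, and plain Java checks stand in for whichever test framework your team uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical login service, invented for this illustration only.
// Passwords are stored in plain text purely to keep the sketch short.
final class LoginService {
    private final Map<String, String> passwordsByUsername = new HashMap<>();

    void register(String username, String password) {
        passwordsByUsername.put(username, password);
    }

    // Returns true only for a known user with a matching password.
    boolean logIn(String username, String password) {
        if (username == null || password == null) {
            throw new IllegalArgumentException("username and password are required");
        }
        String expected = passwordsByUsername.get(username);
        return expected != null && expected.equals(password);
    }
}

final class LoginServiceEdgeCaseTests {
    // "What happens if the user enters null for the username?"
    static void shouldRejectNullUsername() {
        LoginService service = new LoginService();
        try {
            service.logIn(null, "secret");
        } catch (IllegalArgumentException expected) {
            return; // the edge case is handled explicitly
        }
        throw new AssertionError("expected an IllegalArgumentException");
    }

    // "What sort of error occurs if the user doesn't exist in the system?"
    static void shouldNotLogInUnknownUser() {
        LoginService service = new LoginService();
        if (service.logIn("nobody", "secret")) {
            throw new AssertionError("an unknown user must not be logged in");
        }
    }
}
```

With tests like these in the change under review, the reviewer can read the expected behaviour directly instead of tracing it through the implementation.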
Performance Tests
In the previous chapter I talked about performance as being an area a reviewer might be examining.
Automated performance tests are obviously another type of test that I could have explored in this
chapter, but I will leave discussion of these types of tests for a later chapter about looking specifically
at performance aspects in a code review.
Summary
There are many advantages to performing a code review, no matter how you approach the process in
your organization. It's possible to use code reviews to find potential problems with the code before
it is integrated into the main code base, while it's still inexpensive to fix and the context is still in
the developer's head.
As a code reviewer, you should be checking that the original developer has put some thought into
the ways his or her code could be used, under which conditions it might break, and dealt with
edge cases, possibly documenting the expected behavior (both under normal use and exceptional
circumstances) with automated tests.
If the reviewer looks for the existence of tests and checks the correctness of the tests, as a team you
can have pretty high confidence that the code works. Moreover, if these tests are run regularly in a
CI environment, you can see that the code continues to work: they provide automated regression
checking. If code reviewers place a high value on having good quality tests for the code they are
reviewing, the value of this code review continues long after the reviewer clicks the Accept button.
Performance
In this chapter we're going to cover what you can look for in terms of the performance of the code
under review.
As with all architecture/design areas, the non-functional requirements for the performance of a
system should have been set upfront. Whether you're working on a low-latency trading system
which has to respond in nanoseconds, you're creating an online shopping site which needs to be
responsive to the user, or you're writing a phone app to manage a "To Do" list, you should have
some idea about what's considered "too slow".
Let's cover some of the things that affect performance that a reviewer can look for during a code
review.
Performance Requirements
Before deciding on whether we need to undertake code reviews based on performance, we should
ask ourselves a few questions.
Calls to a database
For example, line 19 above might look fairly innocent, but it's inside a for loop, so you have no idea
how many calls to the database this code might result in.
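The warning sign can be sketched as follows. The `OrderDao` interface and both loader methods are hypothetical names invented for this illustration, and the in-memory implementation simply counts "round trips" in place of a real database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical data-access interface; the names are assumptions for this sketch.
interface OrderDao {
    String findOrder(int orderId);                    // one round trip per call
    List<String> findOrders(List<Integer> orderIds);  // one round trip for all IDs
}

// In-memory stand-in that counts round trips so the difference is visible.
final class CountingOrderDao implements OrderDao {
    private final Map<Integer, String> orders = new HashMap<>();
    int roundTrips = 0;

    CountingOrderDao() {
        for (int id = 1; id <= 100; id++) {
            orders.put(id, "order-" + id);
        }
    }

    @Override
    public String findOrder(int orderId) {
        roundTrips++;
        return orders.get(orderId);
    }

    @Override
    public List<String> findOrders(List<Integer> orderIds) {
        roundTrips++;
        List<String> result = new ArrayList<>();
        for (int id : orderIds) {
            result.add(orders.get(id));
        }
        return result;
    }
}

final class OrderLoader {
    // Warning sign: the number of calls grows with the size of the input.
    static List<String> loadOneByOne(OrderDao dao, List<Integer> ids) {
        List<String> result = new ArrayList<>();
        for (int id : ids) {
            result.add(dao.findOrder(id));
        }
        return result;
    }

    // Alternative: ask for everything in a single call.
    static List<String> loadInBatch(OrderDao dao, List<Integer> ids) {
        return dao.findOrders(ids);
    }
}
```

Whether a batch call is available depends on your data layer; the point for the reviewer is simply to ask how many calls a loop can generate.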
data structures that grow indefinitely. If, as a reviewer, you see new values constantly being added
to a list or map, question if and when the list or map is discarded or trimmed.
Infinite Memory
In the code review above, we can see all messages from Twitter being added to a map. If we examine
the class more fully, we see that the allTwitterUsers map is never trimmed, nor is the list of tweets
in a TwitterUser. Depending upon how many users we're monitoring and how often we add tweets,
this map could get very big, very fast.
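One possible way to cap such a map, shown here only as a sketch, is to build on `LinkedHashMap` and override `removeEldestEntry`. The `BoundedMap` class name and the size limit are assumptions for this example, not part of the code under review.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A map that evicts its eldest (first inserted) entry once it holds more than maxEntries.
final class BoundedMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedMap(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true removes the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

Whether eviction by insertion order is the right policy is a question for the author and the team; the reviewer's job is to ask what stops the structure from growing.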
Close Resources
It's very easy for the original code author to miss this problem, as the above code will compile
happily. As the reviewer, you should spot that the connection, statement and result set all need
closing before the method exits. In Java 7, this has become much easier to manage thanks to try-with-resources. The screenshot below shows the result of a code review where the author has
changed the code to use try-with-resources.
https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html
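To show the mechanics without needing a database, the sketch below uses a hypothetical `TrackedResource` in place of a real connection, statement and result set. Anything that implements `AutoCloseable` is closed automatically, in reverse order of declaration, when the try block exits.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a connection, statement or result set; it records when it is closed.
final class TrackedResource implements AutoCloseable {
    private final String name;
    private final List<String> closeLog;

    TrackedResource(String name, List<String> closeLog) {
        this.name = name;
        this.closeLog = closeLog;
    }

    @Override
    public void close() {
        closeLog.add(name);
    }
}

final class TryWithResourcesSketch {
    // Returns the order in which the resources were closed.
    static List<String> run() {
        List<String> closeLog = new ArrayList<>();
        try (TrackedResource connection = new TrackedResource("connection", closeLog);
             TrackedResource statement = new TrackedResource("statement", closeLog);
             TrackedResource resultSet = new TrackedResource("resultSet", closeLog)) {
            // Work with the resources here; they are closed even if this block throws.
        }
        return closeLog;
    }
}
```

With real JDBC types the shape is the same: declare the connection, statement and result set in the try header and let the language close them.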
try-with-resources
Reflection
Reflection in Java is slower than doing things without reflection. If you're reviewing code that
contains reflection, question whether this is absolutely required.
http://docs.oracle.com/javase/tutorial/reflect/index.html
Reflection
The screenshot above shows a reviewer clicking on a method in Upsource to check where it comes
from, and you can see that this method is returning something from the java.lang.reflect package,
which should be a warning sign.
Timeouts
When you're reviewing code, you might not know what the correct timeout for an operation is, but
you should be thinking "what's the impact on the rest of the system while this timeout is ticking
down?". As the reviewer you should consider the worst case: is the application blocking while a
5-minute timeout is ticking down? What's the worst that would happen if this was set to one second?
If the author of the code can't justify the length of a timeout, and you, the reviewer, don't know the
pros and cons of a selected value, then it's a good time to get someone involved who does understand
the implications. Don't wait for your users to tell you about performance problems.
Parallelism
Does the code use multiple threads to perform a simple operation? Does this add more time and
complexity rather than improving performance? With modern Java, this might be more subtle than
creating new threads explicitly: does the code use Java 8's shiny new parallel streams but not benefit
from the parallelism? For example, using a parallel stream on a small number of elements, or on a
stream which does very simple operations, might be slower than performing the operations on a
sequential stream.
Parallel
In the code above, the added use of parallel is unlikely to give us anything: the stream is acting
upon a Tweet, therefore a string no longer than 140 characters. Parallelising an operation that's
going to work on so few words will probably not give a performance improvement, and the cost of
splitting this up over parallel threads will almost certainly be higher than any gain.
Correctness
These things are not necessarily going to impact the performance of your system, but since they're
largely related to running in a multi-threaded environment, they are related to the topic.
Threadsafe
In the code above, the author is using an ArrayList on line 12 to store all the sessions. However, this
code, specifically the onOpen method, is called by multiple threads, so the sessions field needs to be
a thread-safe data structure. For this case, we have a number of options: we could use a Vector, we
could use Collections.synchronizedList() to create a thread-safe List, but probably the best selection
for this case is to use CopyOnWriteArrayList, since the list will change far less often than it will
be read.
ThreadSafe 2
http://docs.oracle.com/javase/8/docs/api/java/util/Vector.html
http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#synchronizedList-java.util.List-
http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CopyOnWriteArrayList.html
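A minimal sketch of the CopyOnWriteArrayList option might look like this. The `SessionRegistry` class is hypothetical and uses `String` as a stand-in for a real session type; only the choice of list implementation is the point.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical session registry; String stands in for a real session type.
final class SessionRegistry {
    // Safe for concurrent onOpen calls. Each write copies the backing array,
    // which is acceptable when writes are rare compared with reads.
    private final List<String> sessions = new CopyOnWriteArrayList<>();

    void onOpen(String session) {
        sessions.add(session);
    }

    int sessionCount() {
        return sessions.size();
    }
}
```

If the list were written to as often as it is read, the copy on every write would become the cost to question instead.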
Race Conditions
Although the increment code is on a single line (line 16), it's possible for another thread to increment
the orders between this code getting it and this code setting the new value. As a reviewer, look out
for get and set combos that are not atomic.
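One way to make the increment atomic, sketched here with a hypothetical `OrderCounter` class, is to replace the separate get and set with an `AtomicInteger`:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter; the class and method names are assumptions for this sketch.
final class OrderCounter {
    private final AtomicInteger orders = new AtomicInteger();

    // The read-modify-write happens as a single atomic step.
    int recordOrder() {
        return orders.incrementAndGet();
    }

    int total() {
        return orders.get();
    }
}
```

Locks or synchronized blocks are alternatives when more than one field has to change together.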
Caching
While caching might be a way to prevent making too many external requests, it comes with its own
challenges. If the code under review uses caching, you should look for some common problems, for
example, incorrect invalidation of cache items.
Code-level optimisations
If you're reviewing code, and you're a developer, the following section may have optimisations
you'd love to suggest. As a team, you need to know up-front just how important performance is to
you, and whether these sorts of optimisations are beneficial to your code.
https://en.wikipedia.org/wiki/Linearizability
https://docs.oracle.com/javase/tutorial/essential/concurrency/locksync.html
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/locks/package-summary.html
https://docs.oracle.com/javase/tutorial/essential/concurrency/atomicvars.html
http://www.developertutorials.com/anatomy-flawed-microbenchmark-050426/
For most organisations that aren't building a low-latency application, these optimisations
probably fall under the category of premature optimisations.
* Does the code use synchronization/locks when they're not required? If the code is always run on a single thread, locks are unnecessary overhead.
* Is the code using a thread-safe data structure where it's not required? For example, can Vector be replaced with ArrayList?
* Is the code using a data structure with poor performance for the common operations? For example, using a linked list but needing to regularly search for a single item in it.
* Is the code using locks or synchronization when it could use atomic variables instead?
* Could the code benefit from lazy loading?
* Can if statements or other logic be short-circuited by placing the fastest evaluation first?
* Is there a lot of string formatting? Could this be more efficient?
* Are the logging statements using string formatting? Are they either protected by an if to check logging level, or using a supplier which is evaluated lazily?
Logging
The code above only logs the message when the logger is set to FINEST. However, the expensive
string format will happen every time, regardless of whether the message is actually logged.
Logging
Performance can be improved by ensuring this code is only run when the log level is set to a value
where the message will be written to the log, like in the code above.
Logging
http://c2.com/cgi/wiki?PrematureOptimization
In Java 8, these performance gains can be obtained without the boilerplate if, by using a lambda
expression. Because of the way the lambda is used, the string format will not be done unless the
message is actually logged. This should be the preferred approach in Java 8.
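A sketch of the lambda form using `java.util.logging` is shown below. The `LazyLoggingSketch` class, the `describe` helper and the `formatCalls` counter are hypothetical; the counter exists only to make it visible whether the formatting ran.

```java
import java.util.logging.Logger;

final class LazyLoggingSketch {
    private static final Logger LOGGER = Logger.getLogger(LazyLoggingSketch.class.getName());

    // Counts how often the expensive formatting actually runs.
    static int formatCalls = 0;

    static String describe(int orderId) {
        formatCalls++;
        return String.format("Processing order %d", orderId);
    }

    static void process(int orderId) {
        // Logger.finest(Supplier<String>) only invokes the supplier
        // if FINEST is loggable for this logger.
        LOGGER.finest(() -> describe(orderId));
    }
}
```

Other logging libraries offer their own lazy forms, so check what your team's library supports before suggesting this in a review.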
Summary
As with my original list of things to look for in code review, this chapter highlights some areas
that your team might want to consistently check for during reviews. This will depend upon the
performance requirements of your project.
Although this chapter is aimed at all developers, many of the examples are Java/JVM specific. I'd
like to finish with some easy things for reviewers of Java code to look for, that will give the JVM
a good chance of optimising your code so that you don't have to:
* Write small methods and classes
* Keep the logic simple: no deeply nested ifs or loops
The more readable the code is to a human, the more chance the JIT compiler has of understanding
your code enough to optimise it. This should be easy to spot during code review: if the code looks
understandable and clean, it also has a good chance of performing well.
When it comes to performance, understand that there are some areas that you may be able to get
quick wins in (for example, unnecessary calls to a database) that can be identified during code review,
and some areas that will be tempting to comment on (like the code-level optimisations) that might
not gain enough value for the needs of your system.
http://www.oracle.com/technetwork/articles/java/architect-evans-pt1-2266278.html
Data Structures
Data structures are a fundamental part of programming, so much so that it's actually one of the areas
that's consistently taught in Computer Science courses. And yet it's surprisingly easy to misuse them
or select the wrong one. In this chapter, we're going to guide you, the code reviewer, on what to look
out for: we're going to look at examples of code and talk about "smells" that might indicate the
wrong data structure was chosen or that it's being used in an incorrect fashion.
Lists
Probably the most common choice for a data structure. Because it is the most common choice, it's
sometimes used in situations it shouldn't be.
Iterating over a list
Iterating over a list is not, in itself, a bad thing of course. But if iteration is
required for a very common operation (like the example above of finding a customer by ID), there
might be a better data structure to use. In our case, because we always want to find a particular item
by ID, it might be better to create a map of ID to Customer.
Remember that in Java 8, and languages which support more expressive searches, this might not be
as obvious as a for-loop, but the problem still remains.
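The map-of-ID-to-Customer suggestion could be sketched like this. The `Customer` fields and the `CustomerIndex` class are hypothetical names chosen for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Customer type for this sketch.
final class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class CustomerIndex {
    private final Map<Integer, Customer> customersById = new HashMap<>();

    void add(Customer customer) {
        customersById.put(customer.id, customer);
    }

    // Direct lookup by key; no iteration over every customer.
    // Returns null when no customer has the given ID.
    Customer findById(int id) {
        return customersById.get(id);
    }
}
```

The lookup no longer depends on how many customers there are, and the intent (find by ID) is visible in the data structure itself.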
Frequent Reordering
Lists are great if you want to stick to their default order, but if as a reviewer you see code that's
re-sorting the list, question whether a list is the correct choice. In the code above, on line 16 the
twitterUsers list is always re-sorted before being returned. Once again, Java 8 makes this operation
look so easy it might be tempting to ignore the signs:
In this case, given that a TwitterUser is unique and it looks like you want a collection that's sorted
by default, you probably want something like a TreeSet.
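A sketch of the TreeSet approach follows. The `TwitterUser` fields, the `Leaderboard` class and the ordering (most tweets first) are assumptions made for the example rather than details taken from the code under review.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical TwitterUser with a handle and a tweet count.
final class TwitterUser {
    final String handle;
    final int tweetCount;

    TwitterUser(String handle, int tweetCount) {
        this.handle = handle;
        this.tweetCount = tweetCount;
    }
}

final class Leaderboard {
    // Most tweets first; the handle breaks ties so that two distinct users
    // with the same count are not treated as duplicates by the set.
    private final SortedSet<TwitterUser> users = new TreeSet<>(
            Comparator.comparingInt((TwitterUser u) -> u.tweetCount).reversed()
                      .thenComparing(u -> u.handle));

    void add(TwitterUser user) {
        users.add(user);
    }

    TwitterUser top() {
        return users.first();
    }
}
```

One caveat worth raising in review: a sorted set only stays correct if the values used for ordering do not change while the element is in the set, which is why the fields here are final.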
Maps
A versatile data structure that provides O(1) access to individual elements, if you've picked the right
key.
Global Map
In the above code, the author has chosen to simply expose the CUSTOMERS map as a global
constant. The CustomerUpdateService therefore uses this map directly when adding or updating
customers. This might not seem too terrible, since the CustomerUpdateService is responsible for
add and update operations, and these have to ultimately change the map. The issue comes when
other classes, particularly those from other parts of the system, need access to the data.
https://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions
Orders Service
Here, the order service is aware of the data structure used to store customers. In fact, in the code
above, the author has made an error: they don't check to see if the customer is null, so line 12
could cause a NullPointerException. As the reviewer of this code, you'll want to suggest hiding
this data structure away and providing suitable access methods. That will make these other classes
easier to understand, and hide any complexity of managing the map in the CustomerRepository,
where it belongs. In addition, if later you change the customers data structure, or you move to using
a distributed cache or some other technology, the changes associated with that will be restricted
to the CustomerRepository class and not ripple throughout the system. This is the principle of
Information Hiding.
https://en.wikipedia.org/wiki/Information_hiding
Hidden
Although the updated code isn't much shorter, you have standardised and centralised core functions:
for example, you know that getting a customer who doesn't exist is always going to give you an
Exception. Or you can choose to have this method return the new Optional type.
Note that this is exactly the sort of issue that should be found during a code review: hiding global
constants is hard to do once their use has propagated throughout the system, but it's easy to catch
this when they're first introduced.
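The Optional variant might be sketched as follows. The method names `addOrUpdate` and `findById`, and the fields on `Customer`, are assumptions for the illustration; only the idea of keeping the map private inside CustomerRepository comes from the text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical Customer type for this sketch.
final class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The map is private; other classes only see the access methods.
final class CustomerRepository {
    private final Map<Integer, Customer> customers = new HashMap<>();

    void addOrUpdate(Customer customer) {
        customers.put(customer.id, customer);
    }

    // A missing customer is an explicit, visible case for the caller
    // rather than a null waiting to cause a NullPointerException.
    Optional<Customer> findById(int id) {
        return Optional.ofNullable(customers.get(id));
    }
}
```

A caller such as an order service then has to decide what to do when the customer is absent, for example with `findById(id).map(c -> c.name).orElse("unknown")`.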
Sets
An often-underused data structure, its strength is that it does not contain duplicate elements.
Sets
The author of this code has changed the initial set that tracks the sites a user has visited from
HashSet to LinkedHashSet. This latter implementation preserves insertion order, so now our set
tracks every URI in the order in which they were visited.
There are a number of signs in this code that this is wrong though. Firstly, the author has had to do
a costly full iteration of the whole set to reach the last element (lines 13-15): sets are not designed
for accessing elements by position, something that lists are perfect for. Secondly, because sets do not
contain duplicate values, if the last page they visited had been visited previously, it will not be in
the last position in the set. Instead, it will be where it was first added to the set.
In this case, a list, a stack (see below), or even just a single field, might give us better access to the
last page visited.
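The "single field" option could look something like the sketch below. `VisitHistory` and its method names are hypothetical; the set is kept only for tracking distinct sites, while a separate field answers the "last page" question.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical history tracker; names are assumptions for this sketch.
final class VisitHistory {
    private final Set<URI> sitesVisited = new LinkedHashSet<>();
    private URI lastVisited; // a single field answers "what was the last page?"

    void visit(URI uri) {
        sitesVisited.add(uri); // no effect on order if the URI was seen before
        lastVisited = uri;
    }

    // Returns null if nothing has been visited yet.
    URI lastVisited() {
        return lastVisited;
    }

    int distinctSites() {
        return sitesVisited.size();
    }
}
```

Revisiting a page now updates the last-visited value correctly, and no iteration is needed to read it.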
Stacks
Stacks are a favourite of Computer Science classes, and yet in the real world are often overlooked.
In Java, maybe this is because Stack is an extension of Vector and therefore somewhat old-fashioned.
Rather than going into a lot of detail here, I'll just cover the key points:
* Stacks support LIFO and should ideally be used with push/pop operations; a stack is not really for iterating over.
* The type you want for a stack implementation in Java (since version 1.6) is Deque, an interface with implementations such as ArrayDeque. It can act as both a queue and a stack, so reviewers should check that deques are used in a consistent fashion in the code.
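As a small sketch of consistent stack usage, the hypothetical `UndoHistory` class below wraps an `ArrayDeque` and exposes only push and pop style operations, so callers cannot accidentally treat it as a queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical undo history; the class and method names are assumptions for this sketch.
final class UndoHistory {
    private final Deque<String> actions = new ArrayDeque<>();

    void record(String action) {
        actions.push(action); // adds to the head of the deque
    }

    // Removes and returns the most recently recorded action (LIFO).
    // Throws NoSuchElementException if there is nothing to undo.
    String undo() {
        return actions.pop();
    }

    boolean isEmpty() {
        return actions.isEmpty();
    }
}
```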
Queues
Another Computer Science favourite. Queues are often spoken about in relation to concurrency
(indeed, most of the Java implementations live in java.util.concurrent), as it's a common way to
pass data between threads or modules.
* Queues are FIFO data structures, generally working well when you want to add elements to the tail of the queue, or remove things from the front of the queue. If you're reviewing code that shows iteration over a queue (in particular accessing elements in the middle of the queue), question if this is the correct data type.
* Queues can be bounded or unbounded. Unbounded queues could potentially grow forever, so if reviewing code with this type of data structure, check out the earlier chapter on performance.
* Bounded queues can come with their own problems too: when reviewing code, you should look for the conditions under which the queue might become full, and ask what happens to the system under these circumstances.
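The "what happens when it is full?" question can be made concrete with a sketch like the one below. `WorkQueue` is a hypothetical wrapper around `ArrayBlockingQueue`; it uses `offer`, which returns false when the queue is full instead of blocking.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical bounded work queue; names are assumptions for this sketch.
final class WorkQueue {
    private final BlockingQueue<String> tasks;

    WorkQueue(int capacity) {
        tasks = new ArrayBlockingQueue<>(capacity);
    }

    // offer() returns false instead of blocking when the queue is full,
    // which forces the caller to decide what "full" means for the system.
    boolean submit(String task) {
        return tasks.offer(task);
    }

    // Returns the oldest task (FIFO), or null if the queue is empty.
    String next() {
        return tasks.poll();
    }
}
```

Whether to reject, block, or drop work when the queue is full is a design decision; the reviewer should check that the code makes that decision deliberately.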
Performance
If you've studied data structures in computer science, you'll often have learnt about the performance
implications of picking one over another. Indeed, we even mentioned Big O Notation in this
chapter, to highlight some of the strengths of particular structures. Using the right data structure in
your code can definitely help performance, but this is not the only reason to pick the right tool for
the job.
Reducing Complexity
The overall goal of any developer, and especially of a reviewer, should be to ensure the code does
what it's supposed to do with the minimal amount of complexity. This makes the code easier to read,
easier to reason about, and easier to change and maintain in the future. In some of the anti-patterns
above, for example the misuse of Set, we can see that picking the wrong data structure forced the
author to write a lot more code. Selecting the right data structure should, generally, simplify the
code.
https://docs.oracle.com/javase/8/docs/api/java/util/Map.html
http://docs.oracle.com/javase/8/docs/api/java/util/Set.html
https://en.wikipedia.org/wiki/Big_O_notation
https://en.wikipedia.org/wiki/Principle_of_least_astonishment
Summary
Picking the right data structure is not simply about gaining performance or looking clever in front
of your peers. It also leads to more understandable, maintainable code. Common signs that the code
author has picked the wrong data structure:
* Lots of iterating over the data structure to find some value or values
* Frequent reordering of the data
* Not using methods that provide key features, e.g. push or pop on a stack
* Complex code either reading from or writing to the data structure
In addition, exposing the details of selected data structures, either by providing global access to the
structure itself, or by tightly coupling your class's interface to the operation of an underlying data
structure, leads to a brittleness of design, and will be hard to undo later. It's better to catch these
problems early on, for example during a code review, than incur avoidable technical debt.
SOLID Principles
In this chapter we'll look more closely at the design of the code itself, specifically checking to see if
it follows good practice Object Oriented Design. As with all the other areas we've covered, not all
teams will prioritise this as the highest value area to check, but if you are trying to follow SOLID
Principles, or trying to move your code in that direction, here are some pointers that might help.
What is SOLID?
The SOLID Principles are five core principles of Object Oriented design and programming. The
purpose of this chapter is not to educate you on what these principles are or go into depth about why
you might follow them, but instead to point those performing code reviews to code smells that might
be a result of not following these principles.
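SOLID stands for:
* Single Responsibility Principle
* Open-Closed Principle
* Liskov Substitution Principle
* Interface Segregation Principle
* Dependency Inversion Principle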
Diff
This side-by-side diff from Upsource shows that a new piece of functionality has been added to
TweetMonitor, the ability to draw the top ten Tweeters in a leaderboard on some sort of user
interface. While this seems reasonable because it uses the data being gathered by the onMessage
method, there are indications that this violates SRP. The onMessage and getTweetMessageFromFullTweet methods are both about receiving and parsing a Twitter message, whereas draw is all
about reorganising that data for displaying on a UI.
The reviewer should flag these two responsibilities, and then work out with the author a better way
of separating these features: perhaps by moving the Twitter string parsing into a different class; or
by creating a different class that's responsible for rendering the leaderboard.
If you were reviewing the code above, it should be clear to you that when a new Event type is added
into the system, the creator of the new event type is probably going to have to add another else to
this method to deal with the new event type.
Another Diff
As always, there's more than one solution to this problem, but the key will be removing the complex
if/else and the instanceof checks.
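One possible shape for such a solution, sketched with hypothetical event types, is to move the per-type behaviour onto the events themselves so that the processing method never has to ask what type it was given:

```java
// Hypothetical event types; every name here is an assumption for this sketch.
interface Event {
    String describe(); // each event type supplies its own behaviour
}

final class OrderPlaced implements Event {
    @Override
    public String describe() {
        return "order placed";
    }
}

final class OrderCancelled implements Event {
    @Override
    public String describe() {
        return "order cancelled";
    }
}

final class EventProcessor {
    // No instanceof checks: adding a new Event type does not change this method.
    static String process(Event event) {
        return "Handling: " + event.describe();
    }
}
```

A new event type is then added by writing a new class, which is the behaviour the Open-Closed Principle is aiming for. Other approaches, such as a map of handlers keyed by event type, can achieve the same effect.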
Order
https://en.wikipedia.org/wiki/Liskov_substitution_principle
Now imagine we introduce the idea of electronic gift cards, which simply add balance to a wallet
but do not require physical inventory. If implemented as a GiftCardOrder, the placeOrder method
would not have to use the warehouse parameter:
Stuff
This might seem like a logical use of inheritance, but in fact you could argue that code that uses
GiftCardOrder could expect similar behaviour from this class as the other classes, i.e. you could
expect this to pass for all subtypes:
Liskov Review 2
But this will not pass, as GiftCardOrders have a different type of order behaviour. If you're reviewing
this sort of code, question the use of inheritance here: maybe the order behaviour can be plugged
in using composition instead of inheritance.
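A rough sketch of the composition approach follows. Everything here is assumed for illustration: the Fulfilment interface, its two implementations and the simplified Warehouse are hypothetical stand-ins, not the classes from the example under review.

```java
// Hypothetical stand-in for the warehouse used by physical orders.
class Warehouse {
    private int stock;

    Warehouse(int stock) {
        this.stock = stock;
    }

    void remove(int quantity) {
        stock -= quantity;
    }

    int stock() {
        return stock;
    }
}

// The behaviour that varies between order types is plugged in.
interface Fulfilment {
    void fulfil(int amount);
}

class WarehouseFulfilment implements Fulfilment {
    private final Warehouse warehouse;

    WarehouseFulfilment(Warehouse warehouse) {
        this.warehouse = warehouse;
    }

    public void fulfil(int amount) {
        warehouse.remove(amount);
    }
}

class WalletFulfilment implements Fulfilment {
    private int balance;

    public void fulfil(int amount) {
        balance += amount;
    }

    int balance() {
        return balance;
    }
}

// A single Order class: no subtype has to ignore a warehouse parameter.
class Order {
    private final int amount;
    private final Fulfilment fulfilment;

    Order(int amount, Fulfilment fulfilment) {
        this.amount = amount;
        this.fulfilment = fulfilment;
    }

    void placeOrder() {
        fulfilment.fulfil(amount);
    }
}
```

Callers of Order no longer make assumptions about inventory, so there is no subtype that can surprise them.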
ISP
In this example, given that there are times when the decode method might not be needed, and also
that a codec can probably be treated as either an encoder or a decoder depending upon where it's
used, it may be better to split the SimpleCodec interface into an Encoder and a Decoder. Some
classes may choose to implement both, but it will not be necessary for implementations to override
methods they do not need, or for classes that only need an Encoder to be aware that their Encoder
instance also implements decode.
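The split described above might look something like this. The method signatures, the Utf8Codec implementation and the MessageSender consumer are assumptions made for the sketch; the original SimpleCodec interface is not reproduced here.

```java
import java.nio.charset.StandardCharsets;

// Assumed signatures: the original SimpleCodec methods may differ.
interface Encoder {
    byte[] encode(String value);
}

interface Decoder {
    String decode(byte[] bytes);
}

// A class may still choose to implement both interfaces.
class Utf8Codec implements Encoder, Decoder {
    public byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    public String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

// Hypothetical consumer that only encodes, so it depends on Encoder alone
// and never sees a decode method.
class MessageSender {
    private final Encoder encoder;

    MessageSender(Encoder encoder) {
        this.encoder = encoder;
    }

    // Returns the number of bytes that would be sent.
    int send(String message) {
        return encoder.encode(message).length;
    }
}
```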
DI
http://www.martinfowler.com/articles/injection.html
https://en.wikipedia.org/wiki/Abstract_factory_pattern
Summary
Some code smells that might indicate one or more of the SOLID Principles have been violated:
As with all design questions, finding a balance between following these principles and knowingly
bending the rules is down to your team's preferences. But if you see complex code in a code review,
you might find that applying one of these principles will provide a simpler, more understandable,
solution.
https://en.wikipedia.org/wiki/Data_access_object
http://martinfowler.com/eaaCatalog/repository.html
Security
How much work you do building a secure, robust system is like anything else on your project: it
depends upon the project itself, where it's running, who's using it, what data it has access to, etc.
Often, if our team doesn't have access to security experts, we go too far in one direction or the other:
either we don't pay enough attention to security issues; or we go through some compliance checklist
and try to address everything in some 20-page document filled with potential issues.
This chapter aims to highlight some areas you might like to look at when reviewing code, but mostly
it aims to get you having discussions within your team or organisation to figure out what it is you
do need to care about in a code review.
Code Inspection
https://www.owasp.org/index.php/Testing_for_SQL_Injection_(OTG-INPVAL-005)
https://www.owasp.org/index.php/Testing_for_Cross_site_scripting
https://www.owasp.org/index.php/OWASP_Dependency_Check
Sometimes It Depends
While there are checks that you can feel comfortable with a "yes" or "no" answer, sometimes you
want a tool to point out potential problems and then have a human make the decision as to whether
this needs to be addressed or not. This is an area where Upsource can really shine. Upsource displays
code inspections that a reviewer can use to decide if the code needs to be changed or is acceptable
under the current situation.
For example, suppose you're generating a random number. If all your security checks are enabled,
you'll see the following warning in Upsource:
The JavaDoc for java.util.Random specifically states "Instances of java.util.Random are not cryptographically secure". This may be fine for many of the occasions when you need an arbitrary random
number. But if you're using it for something like session IDs, password reset links, nonces or salts,
as a reviewer you might suggest replacing Random with java.security.SecureRandom.
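As a minimal sketch of what the suggested replacement could look like for something like a password reset token: the TokenGenerator class and its newToken method are hypothetical names, while SecureRandom and Base64 are standard JDK classes.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper for security-sensitive random values.
class TokenGenerator {
    // SecureRandom is thread-safe and can be shared.
    private static final SecureRandom SECURE_RANDOM = new SecureRandom();

    // Returns a URL-safe token built from the given number of random bytes.
    static String newToken(int numBytes) {
        byte[] bytes = new byte[numBytes];
        SECURE_RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

How many bytes are enough depends on what the token protects, which is exactly the kind of question to raise in the review.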
If you and the code author decide that Random is appropriate for this situation, then it's a good idea
to suppress this inspection for this line of code, and document why it's OK or point to any discussion
on the subject. This way future developers looking at the code can understand this is a deliberate
decision.
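A documented suppression might look roughly like the sketch below. The DiceRoller class is hypothetical, and the inspection ID in the annotation is an assumption: let the IDE's quick-fix generate the exact suppression for your version rather than copying this one.

```java
import java.util.Random;

// Hypothetical example of a deliberate, documented use of java.util.Random.
class DiceRoller {
    // Deliberate: a die roll in a game does not need to be unpredictable to an
    // attacker, so java.util.Random is acceptable here.
    // The inspection ID is an assumption; check what your IDE generates.
    @SuppressWarnings("UnsecureRandomNumberGeneration")
    private final Random random = new Random();

    // Returns a value from 1 to 6 inclusive.
    int roll() {
        return random.nextInt(6) + 1;
    }
}
```

The comment matters as much as the annotation: it records why the decision was made, not just that the warning was switched off.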
Suppress Warning
So while tools can definitely point you at potential issues, part of your job as a code reviewer is to
investigate the results of any automated checks and decide which action to take.
https://www.jetbrains.com/upsource/help/2.0/display_code_inspection.html
https://en.wikipedia.org/wiki/Cryptographic_nonce
https://en.wikipedia.org/wiki/Salt_(cryptography)
Upsource
If you are using Upsource to review your code, you can customise your inspection
settings, including selecting security settings. Do this by opening your project in
IntelliJ IDEA and navigating to the Inspections settings. Select the settings you
want and save them to the Project Default profile. Make sure Project_Default.xml is
checked in with your project code, and Upsource will use this to determine which
inspections to run.
At the time of writing, these are the available security inspections:
Security Inspections
been introduced. If you aren't already automating the check for vulnerabilities, you should check
for known issues in newly-introduced libraries.
You should also try to minimise the number of versions of each library, although this is not always possible if
other dependencies are pulling in additional transitive dependencies. But one of the simplest ways to
minimise your exposure to security problems in other people's code (via libraries or services) is to:
Use as few sources as possible and understand how trustworthy they are
Use the highest quality library you can
Track what you use and where, so if new vulnerabilities do become apparent, you can check
your exposure.
This means:
Understanding your sources (e.g. maven central or your own repo vs arbitrarily downloaded
jar files)
Trying not to use 5 different versions of 3 different logging frameworks (for example)
Being able to view your dependency tree, even if it's simply through Gradle/Maven
Summary
This is just a tiny subset of the sorts of security issues you can be checking in a code review. Security
is a very big topic, big enough that your company may hire technical security experts, or at least
devote some time or resources to this area. However, like other non-coding activities such as getting
to know the business and having a decent grasp of how to test the system, understanding the security
requirements of our application, or at least of the feature or defect we're working on right now, is
another facet of our job as a developer.
We can enlist the help of security experts if we have them, for example inviting them to the code
review, or inviting them to pair with us while we review. Or if this isn't an option, we can learn
enough about the environment of our system to understand what sort of security requirements
we have (internal-facing enterprise apps will have a different profile to customer-facing web
applications, for example), so we can get a better understanding of what we should be looking for
in a code review.
And like many other things we're tempted to look for in code reviews, many security checks can
also be automated, and should be run in our continuous integration environment. As a team, you
need to discuss which things are important to you, whether checking these can be automated, and
which things you should be looking for in a code review.
This post has barely scratched the surface of potential issues. We'd love to hear from you in the
comments: let us know of other security gotchas we should be looking for in our code.
Navigation
It might seem like a trivial thing, but the ability to navigate through the code via the code is
something we simply take for granted when we use an IDE like IntelliJ IDEA. Yet the simple Cmd +
click feature to navigate to a given class, or to find the declaration of a field, is often missing when
we open code in a tool that is not our IDE. Upsource lets you click on a class name, method, field or
variable to navigate to the declaration.
Symbol actions
While this is very useful, something I find myself doing even more in my IDE is pressing Alt + F7
to find usages of a class, method or field. This is especially useful during code review, because if a
method or class has been changed in some way, you as the reviewer want to see what the impact
of this change is, which means locating all the places it's used. You can see from the screenshot
above that this is easily done in Upsource: clicking on a symbol gives you the option to highlight
the usages in the file, or to find usages in the project.
Inspections
Intuitive navigation is great for a reviewer as it lets you browse through the code in a way that's
natural for you, rather than having some arbitrary order imposed on you. It makes it easier to see
the context of the changes under review.
But there's another IDE feature that would be extremely useful during code review: inspections. If
you're already using IntelliJ IDEA, for example, you're probably used to the IDE giving you pointers
on where the code could be simpler, clearer, and generally a bit better. If your code review tool offered
the same kind of advice, you could easily check that all new/updated code doesn't introduce new
obvious issues, and possibly even cleans up long-standing problems.
Upsource uses the IntelliJ IDEA inspections; we actually covered how to enable them for Upsource
in the previous chapter. There are rather a lot of inspections available in IntelliJ IDEA, so we're just
going to give a taste of what's possible: we're going to cover some of the default ones that you may
find useful during code review.
Solutions
It's difficult to think of a time when catching and ignoring an Exception is the right thing to do. A
code reviewer should be suggesting:
https://www.jetbrains.com/idea/help/code-inspection.html
Catching the Exception and wrapping it in a more appropriate one, possibly a RuntimeException, that can be handled at the right level.
Logging the Exception (we also touched on appropriate logging in the chapter on security).
At the very least, documenting why this is OK. If there's a comment in the catch block, it's
no longer flagged by the inspection.
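The first two suggestions in the list above can be sketched as follows. ConfigLoader and readLines are hypothetical names; UncheckedIOException and java.util.logging are standard JDK APIs. The sketch both logs and rethrows to show each option, but in practice a team would usually pick one so the same failure is not reported twice.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical loader that never swallows an exception silently.
class ConfigLoader {
    private static final Logger LOGGER = Logger.getLogger(ConfigLoader.class.getName());

    static List<String> readLines(Path path) {
        try {
            return Files.readAllLines(path);
        } catch (IOException e) {
            // Option: log the exception with enough context to diagnose it.
            LOGGER.log(Level.WARNING, "Could not read " + path, e);
            // Option: wrap it in a runtime exception for a higher level to handle,
            // keeping the original as the cause.
            throw new UncheckedIOException("Could not read " + path, e);
        }
    }
}
```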
ExpectedException
Configuring
"Empty catch block" is enabled in the default set of inspections. This and other related inspections
can be found in IntelliJ IDEA's inspections settings under Java > Error Handling.
Probable Bugs
There are a number of inspections available for probable bugs. These inspections highlight things
that the compiler allows, as they're valid syntax, but are probably not what the author intended.
Examples
String.format() issues like the one above.
Comparing Strings using == not .equals().
Querying Collections before putting anything in them (or vice versa).
https://www.jetbrains.com/idea/help/accessing-inspection-settings.html
Accessing Collections as if they have elements of a different type (sadly possible due to the
way Java implemented generics on collections).
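Two of the examples in the list above are easy to illustrate. The ProbableBugs class and its methods are hypothetical, written only to show code that compiles cleanly but probably does not do what its author meant.

```java
// Hypothetical examples of code the compiler accepts but an inspection flags.
class ProbableBugs {
    // Compiles, but == compares object references, not the characters.
    static boolean isAdminByReference(String role) {
        return role == "admin";
    }

    // Compares contents, and is also safe when role is null.
    static boolean isAdmin(String role) {
        return "admin".equals(role);
    }

    // Two placeholders need two arguments. Passing only name would still
    // compile, but would throw MissingFormatArgumentException at runtime.
    static String describe(String name, int count) {
        return String.format("%s has %d items", name, count);
    }
}
```

A String built at runtime, such as one read from a request, is a different object from the literal, so the reference comparison fails even when the text matches.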
Solution
Not all of these problems are automatically bugs, but they do look suspicious. They'll usually require
you, the code reviewer, to point them out to the author and have a discussion about whether this
code is intentional.
Configuring
Inspections to highlight all the potential problems listed are already selected by default. To find more
inspections in this category, look under Java > Probable Bugs in the inspections settings.
Examples
Using explicit true and false in a boolean expression (in the example above this is unnecessarily
verbose).
Boolean expressions that can be simplified, or re-phrased to be simpler to understand.
if or while expressions that always evaluate to the same value:
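For the first two items in this list, a hypothetical before-and-after might look like the sketch below. The Flags class is invented for illustration and is not the code from the screenshot.

```java
// Hypothetical example of a boolean expression an inspection would simplify.
class Flags {
    // Before: explicit true and false make the reader do extra work.
    static boolean isEmptyVerbose(String text) {
        if (text == null || text.length() == 0) {
            return true;
        } else {
            return false;
        }
    }

    // After: the condition is already the answer.
    static boolean isEmpty(String text) {
        return text == null || text.isEmpty();
    }
}
```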
https://www.jetbrains.com/idea/help/accessing-inspection-settings.html
Solutions
As with the other examples above, you may simply want to flag them in the code review so
the author can use IntelliJ IDEA's inspections to apply the recommended fix.
In some cases, like if statements that can be simplified in equals() methods, the simplified
code is not always easier to read. If this is the case, you may want to suggest the code author
suppresses the inspection for this code so it is no longer flagged.
In other cases, the inspection might be pointing to a different smell. In the if statement above,
the inspection shows this code (which is in a private class) is always called with a particular
set of values so this if statement is redundant. It may be viable to remove the statement, but
as this specific example is only used in test code it implies there's a missing test to show what
happens when the two objects are equal. The code reviewer should suggest the additional test,
or at least have the author document why it wasn't needed.
Configuring
These types of inspections can be found in Java > Control flow issues and Java > Data flow issues.
Unused Code
Upsource highlights all unused code (classes, methods, fields, parameters, variables) in a grey colour,
so you don't even need to click or hover over the areas to figure out what's wrong with it: grey
should automatically be a smell to a code reviewer.
Unused code
Examples
There are a number of reasons a code review might contain unused code:
1) It's an existing class/method/field/variable that has been unused for some time.
2) It's an existing class/method/field/variable that is now unused due to the changes introduced
in the code review.
3) It's new / changed code that is not currently being called from anywhere.
Solutions
As a reviewer, you can check which category the code falls into and suggest steps to take:
Delete the unused code. In the case of 1) or 2) above, this should usually be safe at the
field/variable level, or private classes and methods. At the class and method level, if these
are public they might be used by code outside your project. If you have control over all the
code that would call these and you know the code is genuinely unused, you can safely remove
them. In case 3) above, it's possible that some code is work-in-progress, or that the author
changed direction during development and needs to clean up leftover code. Either way, flag
the code and check if it can be deleted.
Unused code could be a sign the author forgot to wire up some dependencies or call the new
features from the appropriate place. If this is the case, the code author will need to fix the
unused code by, well, using it.
If the code is not safe to delete and is not ready to be used, then unused code is at the very
least telling you that your test coverage is not sufficient. Methods and classes that are used
by other systems, or will be used in the very near future, should have tests that show their
expected behaviour. Granted, test coverage can hide genuinely unused code, but it's better to
have code that looks used because it's tested than have code that is used that is not tested. As
the reviewer, you need to flag the lack of tests. For code that existed before this code review,
you might want to raise a task/story to create tests for the code rather than to slow down the
current feature/bug being worked on with unrelated work. If the unused code is new code,
then you can suggest suitable tests. New code that's untested should not be let off lightly.
http://blog.jetbrains.com/idea/2015/08/why-write-automated-tests/
If you and the code author decide not to address the unused code immediately by deleting it,
using it or writing tests for it, then at least document somehow why this code is unused. If
there's a ticket/issue somewhere to address it later, refer to that.
Gotchas
Inspections are not infallible, hence why they're useful pointers for reviewers but not a fully
automated check with a yes/no answer. Code might be incorrectly flagged as unused if:
It's used via reflection
It's used "magically" by a framework or code generation
You're writing library code or APIs that are used by other systems
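The reflection case can be sketched like this. Greeter and ReflectiveCaller are hypothetical classes; the reflection calls are standard JDK APIs. Nothing calls greet directly, so a static analysis may mark it unused even though it runs.

```java
import java.lang.reflect.Method;

// Hypothetical class whose only method has no direct caller.
class Greeter {
    // An "unused" inspection may flag this, because it is only reached via reflection.
    private String greet(String name) {
        return "Hello, " + name;
    }
}

// Hypothetical stand-in for a framework that looks methods up by name.
class ReflectiveCaller {
    static String call(Object target, String methodName, String argument) {
        try {
            Method method = target.getClass().getDeclaredMethod(methodName, String.class);
            method.setAccessible(true);
            return (String) method.invoke(target, argument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not call " + methodName, e);
        }
    }
}
```

Deleting greet here would compile without complaint and then fail at runtime, which is why a reviewer should ask before accepting the removal.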
Configuring
These types of inspections can be found in Java > Declaration redundancy, Java > Imports and
Java > Probable bugs. Or you can search for the string "unused" in the IntelliJ IDEA inspection
settings.
Summary
The navigation and inspection features are all available in the Upsource application. While it would
be great if the app could provide everything we as developers want, sometimes we just feel more
comfortable in the IDE. So that's why there's also an Upsource plugin for IntelliJ IDEA and other
JetBrains IDEs, so we can do the whole code review from within our IDE. There's also a new "Open
in IDE" feature in Upsource 2.5 which, well, lets you open a code review in your IDE.
While many checks can and should be automated, and while humans are required to think about
bigger-picture issues like design and "but did it actually fix the problem?", there's also a grey area
between the two. In this grey area, what we as code reviewers could benefit from is some guidance
about code that looks dodgy but might be OK. It seems logical that a code review tool should provide
this guidance. Not only this, but we should also expect our code review tool to allow us to navigate
through the code as naturally as we would in our IDE.
https://www.jetbrains.com/idea/help/accessing-inspection-settings.html
https://www.jetbrains.com/upsource/whatsnew/#plugin
http://blog.jetbrains.com/upsource/2015/09/30/upsource-2-5-early-access-is-open/
Upsource aims to make code review not only as painless as possible, but also provide as much help
as a tool can, freeing you up to worry about the things that humans are really good at.
Upsource is free for an unlimited number of projects for teams with up to 10 developers,
and provides a 60-day evaluation period for an unlimited number of users. An evaluation
key is available upon request. If you'd like to learn more about Upsource and how it
can help you perform useful code reviews, visit our page.
https://www.jetbrains.com/upsource/download/
http://jetbrains.com/upsource/