Clean C Sharp
Jason Roberts
This book is for sale at http://leanpub.com/cleancsharp
Contents
Introduction
  What is Clean C#?
  Why Clean C#?
  Using this Book
    Code Samples
    Order
    Prescriptive
  Clean C# is a Spectrum
Comments
  Repeating What the Code Already Says
  Change Control Comments
  Comments as a Substitute for Self Documenting Code
  Commented-Out Code
  Pointless XML Documentation Comments
  Acceptable Use of Comments
Naming Things
  Qualities of Clean Names
    Expressive
    Accurate
    Suitable Length
    Pronounceable Names
  Naming Specific Items
    Namespaces
    Interfaces
    Classes
    Methods
    Properties
    Events
    Fields
    Attributes
    Method Parameters
    Variables
    Booleans
    Generic Type Parameters
    Enums
  Some General Rules
Methods
  Method Size and Clarity
  Cohesive Methods
  Mixing Abstraction Levels
  Action or Answering Methods
  Method Parameters
    Methods with Zero Parameters
    Methods with One Parameter
    Methods with Two Parameters
    Methods with Three Parameters
    Methods with More Than Three Parameters
    Refactoring to Reduce the Number of Parameters
    Params
    Output Parameters
    Named Arguments
    Boolean Switching Arguments
  Multiple Returns
Visual Formatting
  The Principle of Proximity
  The Principle of Similarity
  The Principle of Uniform Connectedness
  The Principle of Symmetry
Clean Tests
  Qualities of Good Tests
    Execution Speed
    Independent and Isolated
    Repeatable and Reliable
    Valuable
    Resilient to Production Code Changes
  The Three Logical Phases of Tests
    The Arrange Phase
    The Act Phase
    The Assert Phase
C# Tips
Write better C#.
¹http://bit.ly/psjason
²https://twitter.com/robertsjason
³http://dontcodetired.com
⁴http://keepingsoftwaresoft.com
About The Author
This book will help you become a better C# programmer. It contains a whole host of
useful tips on using C# and .Net.
C# Tips⁵
Pluralsight Courses
Browse Pluralsight courses by Jason Roberts⁶
⁵http://bit.ly/sharpbook
⁶http://bit.ly/psjason
About this Book
Welcome.
This book will help you (and the readers of your code) be happier and more
productive by writing cleaner, more maintainable, more readable, and generally
more pleasant C#.
I hope you enjoy reading and using the information in this book as much as I
did writing it. May it improve your software development experience and overall
happiness.
Best Wishes,
Jason Roberts
Introduction
What is Clean C#?
Clean C# is C# code that is easily understandable.
Clean code has been described as “…simple and direct… like well-written prose” (Grady
Booch), as code that makes it “hard for bugs to hide” (Bjarne Stroustrup), and as code
that “looks like it was written by someone who cares” (Michael Feathers).
Clean code emphasises the human reader of the source code. Just because the
compiler can easily understand the code, it does not mean the human reader can.
Why Clean C#?
…and outmanoeuvre their competition. Users may also get new features more quickly
and annoying defects removed sooner.
The human brain has limitations on the number of chunks of information it can
hold in short-term working memory. Writing clean C# code therefore offers the
opportunity to reduce the cognitive load on subsequent readers, and also on the
originating programmer. As an example, imagine an overly long method that makes
use of 15 different local variables. As the reader scrolls through the method, they
have to try to retain these 15 variables in their short-term memory while trying to
figure out what the code does.
Happiness is important in all areas of life. In a work setting, happier individuals may
be as much as 10-12% more productive⁷. Constantly working with dirty code may
reduce team morale, reducing the level of happiness in the programmers (and by
extension the wider team). Clean code can reduce this level of unhappiness, thus
increasing productivity. This results in obvious benefits to both the business and
the end-user.
Using this Book
Code Samples
The code samples in this book are generally divided into “clean” and “dirty”. When
viewing a code sample, the namespace will usually indicate one of these, for example:
namespace CleanCSharp.Comments.Dirty.
Order
The book may be read in any order, but reading it sequentially may prove beneficial,
as later chapters may assume that previous chapters have been read.
Prescriptive
Whilst the techniques in this book will help to create cleaner C# code, some of the
suggestions may not suit the reader's preferences or sense of style. Ultimately the
team should decide what clean C# code is and how it will be implemented by all the
developers in the team.
⁷http://www2.warwick.ac.uk/fac/soc/economics/staff/dsgroi/papers/manuscriptandappendix.pdf
Clean C# is a Spectrum
Often, developers can exhibit a boolean mindset: black or white, awesome or rubbish,
dead or new. Most of the time there is a range rather than just two absolutes.
It is tempting to also think of code as either being completely clean or completely
dirty, when it is more accurate to think of clean code as a spectrum or scale of
cleanliness. One poorly named variable in an otherwise beautiful solution does not
mean the entire system is dirty.
Comments
Comments can be a highly useful way of clarifying why code is the way it is. Often they
are not.
Dirty comments adversely affect the readability of the source code.
Repeating What the Code Already Says
namespace CleanCSharp.Comments.Dirty
{
    // This defines a class called BasicCalculator
    public class Calculator
    {
        // Define default constructor
        public Calculator()
        {
        }
        // ...
    }
}
Notice how painful it is to read through this code. Not only does the human
reader have to parse the actual lines of code, they also have to spend mental energy
wading through the repetitive comments.
There is also a subtle inconsistency here:
    // This defines a class called BasicCalculator
    public class Calculator
Notice that the comment is not only pointless but also misleading: it states an
incorrect class name.
Comments can easily become out of sync with the code they describe. Keeping them
in sync also requires additional time for no additional benefit.
Repetitive comments should be deleted.
The clean version of the above code looks like the following (note the redundant
default constructor declaration has also been removed):
namespace CleanCSharp.Comments.Clean
{
    public class Calculator
    {
        public int AddTwoNumbers(int a, int b)
        {
            int result;
            result = a + b;
            return result;
        }
    }
}
Change Control Comments
namespace CleanCSharp.Comments.Dirty
{
    /* 10 Oct 2010 Sarah Smith - Created initial version
     * Edited 20 Oct 2010 Amrit P - change calculation method
     * Edited 20 Nov 2010 Jane Q - fix defect 4286
     */
    public class MyClass
    {
    }
}
Notice all these comments describing why and when the file was changed.
If a capable version control system is being used, these comments are unnecessary
and should be removed.
If a version control system is not being used, one should be implemented, and then
these comments deleted if they are not critical.
Comments as a Substitute for Self Documenting Code
namespace CleanCSharp.Comments.Dirty
{
    public class SimpleCalculator
    {
        // Add two numbers together
        public int Calculate(int a, int b)
        {
            return a + b;
        }
    }
}
Here the comment // Add two numbers together is a substitute for a well-named
method.
In order to make this code self documenting and remove the need for the comment,
the method could be rewritten as follows:
namespace CleanCSharp.Comments.Clean
{
    public class SimpleCalculator
    {
        public int AddNumbers(int a, int b)
        {
            return a + b;
        }
    }
}
Here the well-named method obviates the need for an explanatory comment.
Commented-Out Code
Often, especially in legacy code, blocks of code can exist, but in commented-out form.
Take the following example:
namespace CleanCSharp.Comments.Dirty
{
    public class AnotherSimpleCalculator
    {
        public int AddNumbers(int a, int b)
        {
            // a = a + 42;
            return a + b;
        }
    }
}
Here when the AddNumbers method is read there is some code that is commented
out. What does this code mean? Was it accidentally commented out? Should it be
uncommented?
This introduces uncertainty, disrupts the reader’s flow and harms readability.
This may be left over from a previous change, where the developer making
the change either forgot to remove the code or felt that it might be needed in the
future.
If a version control system is being used, once a logical series of changes is complete,
any temporarily commented-out code should be deleted. The previous version is
available in the version control history if it is ever needed.
Pointless XML Documentation Comments
namespace CleanCSharp.Comments.Dirty
{
    /// <summary>
    ///
    /// </summary>
    public class BasicCalculator
    {
        /// <summary>
        /// Adds two numbers
        /// </summary>
        /// <param name="a"></param>
        /// <param name="b"></param>
        /// <returns></returns>
        public int AddNumbers(int a, int b)
        {
            return a + b;
        }
    }
}
In the preceding example about half of all the lines are taken up with meaningless
comments. These obscure the actual code and increase the time required to visually
process the code.
Compare this with the following clean version:
namespace CleanCSharp.Comments.Clean
{
    public class BasicCalculator
    {
        public int AddNumbers(int a, int b)
        {
            return a + b;
        }
    }
}
If the project is creating a public API to be used by other people then XML comments
on public types and members can be of great use to the consuming developer. In this
case there is a good argument for adding XML comments.
Naming Things
Qualities of Clean Names
• Expressive
• Accurate
• Suitable length
• Pronounceable
The name of something should indicate to the reader why it exists in the first place
and what it may be used for.
Expressive
Names should be expressive: they should clearly convey the intent of the writer.
For example, consider the following three variable declarations (from least to most
expressive):
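As a minimal sketch of such declarations (the customer-count scenario and every name here are assumptions for illustration), the progression might look like the following, wrapped in a method so that it compiles on its own:

```csharp
namespace CleanCSharp.Naming.Expressive
{
    // Hypothetical class used only to host the three declarations
    public class ExpressiveNamesSketch
    {
        public int CountActiveCustomers()
        {
            // Least expressive: a comment has to explain what n means
            int n = 3; // number of active customers

            // Better, but the abbreviation still needs decoding
            int numActCust = n;

            // Most expressive: the name carries the intent on its own
            int activeCustomerCount = numActCust;

            return activeCustomerCount;
        }
    }
}
```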
Notice in the preceding example that a comment has been used as a substitute for
self documenting code, to make up for the poorly named variable n.
Accurate
Names should be accurate: they should not mislead the reader into thinking
they mean something else. The following method is named Add when it performs
multiplication:
Whilst this is clearly wrong, a reader who calls the method without reading
the code that it contains will get unexpected results. A more subtle example of
an inaccurate method name is a method with side effects.
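Both kinds of inaccuracy can be sketched as follows. The class names InaccurateCalculator and VisitCounter are hypothetical and exist only for illustration: the first shows an Add method that multiplies, the second a method whose name suggests a simple query but which also changes state each time it is called.

```csharp
namespace CleanCSharp.Naming.Accurate.Dirty
{
    public class InaccurateCalculator
    {
        // Dirty: the name promises addition but the body multiplies
        public int Add(int a, int b)
        {
            return a * b;
        }
    }

    public class VisitCounter
    {
        private int visits;

        // Dirty: the name reads like a query, yet every call increments the count
        public int GetVisits()
        {
            visits++;
            return visits;
        }
    }
}
```

A caller who trusts the names would expect Add(2, 3) to return 5 and two consecutive calls to GetVisits to return the same value; neither expectation holds.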
Suitable Length
The length of a name should be suitable for the scope and context in which it will
be used. A name should be long enough to convey the intent but
not so long as to tire the reader.
In the following code, try to locate four names with poor lengths:
namespace CleanCSharp.Naming.SuitableLength.Dirty
{
    public class Cal
    {
        public int AddTwoNumbersTogetherAndReturnTheResult(
            int theFirstNumberToAdd, int theSecondNumberToAdd)
        {
            return theFirstNumberToAdd + theSecondNumberToAdd;
        }
    }
}
In the preceding code the public class Cal is available throughout the system (large
scope) but it has an abbreviated name that is too short. What is a Cal and what does
it do? From the name of the class alone there is no way to tell, requiring the reader to
dig into the internals of the class before they can even get a high level idea of what
the class does.
Next the method AddTwoNumbersTogetherAndReturnTheResult is too verbose. If the
name of the class were better (for example Calculator) then simply naming the
method Add would be sufficient.
The final two items are the method parameter names. With a cleanly named class and
method, these names are too verbose. The parameters do not possess any uniqueness
in terms of what they represent: if you call an Add method of a Calculator you expect
to provide some numbers to add together. Because of this, the individual parameters
do not have a high level of semantic meaning when compared against each other, so
using more terse names such as a and b may be acceptable.
The following is a cleaner version:
namespace CleanCSharp.Naming.SuitableLength.Clean
{
    public class Calculator
    {
        public int Add(int a, int b)
        {
            return a + b;
        }
    }
}
If slightly more descriptive parameter names are preferred, the following is also clean:
namespace CleanCSharp.Naming.SuitableLength.Clean
{
    public class Calculator
    {
        public int Add(int firstNumber, int secondNumber)
        {
            return firstNumber + secondNumber;
        }
    }
}
In the following example, terse parameter names have been used when the parameters
do have individual semantic meaning:
namespace CleanCSharp.Naming.SuitableLength.Dirty
{
    public class NewUserValidator
    {
        public bool ValidateName(string a, string b)
        {
            return true; // for demo code purposes
        }
    }
}
When calling ValidateName what do a and b do? We could guess that a represents
the first name and b the last name but the internals of the method would need to
be examined to confirm this assumption. In this example it makes sense to have
more semantically rich and expressive parameter names as in the following cleaner
version:
namespace CleanCSharp.Naming.SuitableLength.Clean
{
    public class NewUserValidator
    {
        public bool ValidateName(string firstName, string lastName)
        {
            return true; // for demo code purposes
        }
    }
}
There are other options for cleanliness here. For example, there could be
separate methods that validate the first name and last name, and the name of
the method could also be improved. See the chapter on clean methods for
more information.
Pronounceable Names
Reading is a natural thing for the brain to do. When reading code we are interpreting
not only the C# language itself (keywords, etc.) but also the names of things. Having
names that are pronounceable helps readability as the brain has less work to do to
interpret the word(s).
The following code has some examples of unpronounceable names:
namespace CleanCSharp.Naming.Pronounceable.Dirty
{
    public class NewUsrValidtr
    {
        public bool ValidateNme(string fstNme, string lstNme)
        {
            return true; // for demo code purposes
        }
    }
}
When reading this there is additional mental effort required for the brain to parse
names such as fstNme into “first name”. Imagine also talking with a fellow developer
about this code: how would fstNme be pronounced? “fust numee”, “fastnu me”? A
cleaner version would be as follows:
namespace CleanCSharp.Naming.Pronounceable.Clean
{
    public class NewUserValidator
    {
        public bool ValidateName(string firstName, string lastName)
        {
            return true; // for demo code purposes
        }
    }
}
If using brand names that feature specific capitalisation as part of the brand
then the above rules may be overridden.
Namespaces
Namespaces should use Pascal Casing. They should accurately describe what the
reader is likely to find contained within.
Generally speaking, catch-all generic namespaces like Helpers or Utilities should
be avoided, though more specific versions such as HtmlHelpers or StringUtilities
are usually more indicative of what may be contained within the namespace.
Namespaces should not usually be versioned. For example “MyCompany.AwesomeLibV1”
and “MyCompany.AwesomeLibV2” would usually be considered bad practice.
MSDN⁹ specifies the following naming convention for namespaces:
[CompanyName].[ProductOrTechnology].[Feature].[Subnamespace]
⁸http://msdn.microsoft.com/en-us/library/ms229002%28v=vs.110%29.aspx
⁹http://msdn.microsoft.com/en-us/library/ms229026%28v=vs.110%29.aspx
The argument for the CompanyName is to prevent conflicts when working with
other libraries. It should be noted that not all authors or projects conform to this
naming convention.
It is acceptable to use plural namespace names where appropriate, for example in the
System.Collections namespace.
A namespace should also not have the same name as a type defined within that
namespace, for example a namespace called Chocolate that contains a class also called
Chocolate.
Interfaces
Interfaces should use Pascal Casing and be prefixed with the letter “I”, for example
IControl.
Interface names should use adjectives or adjective-phrases such as “ITransformable”
and “ISavable” or nouns/noun-phrases such as “IShape” and “IShapeTransformer”. A
noun/noun-phrase name may indicate that the interface would be better defined
as a class or abstract class instead.
In the book “Clean Code”, Robert C. Martin argues for the dropping of
the “I” that precedes interface names. Whilst we should never keep doing
something just because “that’s the way it’s always been done”, the “I”
convention in C# is one that may cause more confusion to the general
reader if it were omitted, than benefit gained by its omission. As always
the team should decide and all the developers should conform to the team’s
expectations.
Classes
Classes should use Pascal Casing and use nouns or noun-phrases such as Customer,
Order, and ProspectiveCustomer.
Class names should never be prefixed with encodings such as “c” or “cls”, as in
cCustomer or clsCustomer.
Methods
The names of methods should use Pascal Casing and be verbs or verb-phrases that
signify the performing of some action. So a method called Customer is poorly named
as it is a noun rather than a verb. On the other hand a method called SaveCustomer
is a verb-phrase that indicates some action will be performed.
Properties
Properties should use Pascal Casing and should use adjectives or nouns/noun-phrases
such as Color or CustomerNumber.
If the property represents a collection of things, rather than simply adding the word
“List” or “Collection” as in OrderList it is usually more readable to simply pluralise
the property name such as: Orders.
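As a small sketch of this guideline, assuming hypothetical Customer and Order classes, a pluralised collection property might look like this:

```csharp
using System.Collections.Generic;

namespace CleanCSharp.Naming.Properties.Clean
{
    // Hypothetical types used only to illustrate property naming
    public class Order
    {
        public int OrderNumber { get; set; }
    }

    public class Customer
    {
        // Noun-phrase property name
        public int CustomerNumber { get; set; }

        // Pluralised name rather than OrderList or OrderCollection
        public List<Order> Orders { get; } = new List<Order>();
    }
}
```

Calling code then reads naturally, for example customer.Orders.Add(order) or customer.Orders.Count.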
Events
Events should use Pascal Casing and be named using verb/verb-phrases such as
Clicked, Opened, and Closed.
If the event describes a concept of something that happens before/after, use meaning-
ful past or present tense verbs; so rather than BeforeClose, use Closing and instead
of AfterClose use Closed.
¹⁰http://msdn.microsoft.com/en-us/library/ms229040%28v=vs.110%29.aspx
In event handlers, favour the parameter naming conventions “sender” and “e”, for
example: object sender, SomeEventArgs e.
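A minimal sketch of these event conventions, assuming a hypothetical Document class that raises Closing before it closes and Closed afterwards, and a hypothetical DocumentLogger whose handlers use the sender and e parameter names:

```csharp
using System;

namespace CleanCSharp.Naming.Events.Clean
{
    // Hypothetical publisher: Closing is raised before the work, Closed after it
    public class Document
    {
        public event EventHandler Closing;
        public event EventHandler Closed;

        public void Close()
        {
            Closing?.Invoke(this, EventArgs.Empty);
            // Any clean-up work would happen here
            Closed?.Invoke(this, EventArgs.Empty);
        }
    }

    // Hypothetical subscriber that simply counts the events it receives
    public class DocumentLogger
    {
        public int EventsSeen { get; private set; }

        public void Subscribe(Document document)
        {
            document.Closing += OnDocumentClosing;
            document.Closed += OnDocumentClosed;
        }

        private void OnDocumentClosing(object sender, EventArgs e)
        {
            EventsSeen++;
        }

        private void OnDocumentClosed(object sender, EventArgs e)
        {
            EventsSeen++;
        }
    }
}
```

The tense of each name tells the subscriber where in the operation it is being called, without needing Before or After prefixes.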
Fields
There are no recommended guidelines from MSDN¹¹ for internal or private fields.
For public static fields and protected fields use Pascal Casing and noun/noun-phrases
or adjectives.
One tradition for private fields is to prefix the identifier with an underscore,
such as int _age;. This is probably an unnecessary form of “encoding”
other information in the names of things. In a small, highly focussed class
the underscore would probably be unnecessary, so just int age; would suffice. In this
case follow Camel Casing rules.
Attributes
When defining custom attributes, use Pascal Casing and add the suffix “Attribute”,
for example use public class ThisIsAwesomeAttribute : Attribute rather than:
public class ThisIsAwesome : Attribute.
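As a brief sketch, the suffix does not need to be typed where the attribute is applied, because the C# compiler resolves [ThisIsAwesome] to ThisIsAwesomeAttribute. The decorated Calculator class here is only a placeholder:

```csharp
using System;

namespace CleanCSharp.Naming.Attributes.Clean
{
    [AttributeUsage(AttributeTargets.Class)]
    public class ThisIsAwesomeAttribute : Attribute
    {
    }

    // The "Attribute" suffix may be omitted at the point of use
    [ThisIsAwesome]
    public class Calculator
    {
    }
}
```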
Method Parameters
Use Camel Casing for method parameters and favour descriptive names for
semantically rich parameters.
Variables
Use Camel Casing for local variables. Loop counters may be single
letters, such as using i as a for loop counter. If the intent can be made clearer by using
a more descriptive (longer) loop counter variable name then this is also acceptable.
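A short sketch of both styles follows; the LoopCounterSketch class and its methods are hypothetical and only illustrate when each style might fit:

```csharp
namespace CleanCSharp.Naming.Variables.Clean
{
    public class LoopCounterSketch
    {
        public int SumOfValues(int[] values)
        {
            int sum = 0;
            // A single-letter counter is fine for a short, simple loop
            for (int i = 0; i < values.Length; i++)
            {
                sum += values[i];
            }
            return sum;
        }

        public int CountCells(int rowCount, int columnCount)
        {
            int cellCount = 0;
            // Longer counter names make the intent of nested loops clearer
            for (int row = 0; row < rowCount; row++)
            {
                for (int column = 0; column < columnCount; column++)
                {
                    cellCount++;
                }
            }
            return cellCount;
        }
    }
}
```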
¹¹http://msdn.microsoft.com/en-us/library/ms229012%28v=vs.110%29.aspx
Booleans
When naming variables, methods, properties, etc. that represent Boolean values,
consider naming them so that they read as the answer to a yes or no
question. Consider how they would read when used in an if statement, for example.
The following are some dirty examples:
namespace CleanCSharp.Naming.Booleans.Dirty
{
    class BooleanRelatedNames
    {
        public void SomeMethodWithBooleanVariables()
        {
            bool close = false;
            if (close)
            {
                // etc.
            }
            bool user = true; // declaration assumed; the value is illustrative
            if (user)
            {
                // etc.
            }
        }
    }
}
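A cleaner version would be as follows:
namespace CleanCSharp.Naming.Booleans.Clean
{
    class BooleanRelatedNames
    {
        public void SomeMethodWithBooleanVariables()
        {
            bool isClosed = false;
            if (isClosed)
            {
                // etc.
            }
            bool loggedIn = true; // declaration assumed; the value is illustrative
            if (loggedIn)
            {
                // etc.
            }
        }
    }
}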
Rather than just prefixing all Booleans with “is”, consider how the statement
would read. Maybe “has” improves the readability, or maybe something like (in
the preceding example) loggedIn reads fine. Notice that even without “is”/“has”,
loggedIn answers a yes/no, true/false question: “is the user logged in?”.
Enums
Use Pascal Casing for enums and use a singular name unless the enum is a
bitwise/flags enum in which case use plural names. Do not add suffixes such as
“Flags” or “Enum” to names.
For example a bitwise/flags enum for display options would be named DisplayOptions
not DisplayOption.
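As a sketch of the two cases, the following pairs a singular, ordinary enum with a plural flags enum. DisplayMode and the individual member names are assumptions for illustration; only the DisplayOptions name comes from the text above.

```csharp
using System;

namespace CleanCSharp.Naming.Enums.Clean
{
    // Singular name: a value is exactly one of these
    public enum DisplayMode
    {
        Windowed,
        FullScreen
    }

    // Plural name: a value may combine several of these
    [Flags]
    public enum DisplayOptions
    {
        None = 0,
        ShowToolbar = 1,
        ShowStatusBar = 2,
        ShowRulers = 4
    }
}
```

The plural name reads well where values are combined, for example DisplayOptions.ShowToolbar | DisplayOptions.ShowRulers.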
Methods
Method Size and Clarity
When trying to reduce the number of lines in methods, this should not
be taken as a simple case of trying to fit as many logical operations on a
single line as possible using cryptic and hard to read code. If a method has
a high level of functional cohesion with each line of code at the same level
of abstraction, then it should not require hundreds of lines of code.
There are other factors (below) that contribute to clean methods, however method
size is one of the key things to look for. This does not mean that a method with fewer
lines of code is automatically clean, it could still be poorly named or have too many
parameters for example.
Some signs that a method may be too large include:
Again these are just approximate guidelines - while method size is a good indicator
of cleanliness, the other key factor is how many different things the method is doing.
Cohesive Methods
There are a number of different types of cohesion (how strongly related things are);
one of these is functional cohesion.
The lines of code inside a method that has a high level of functional cohesion
all relate to performing a single logical task.
It is harder for methods to remain small if they are doing too much or too varied a
task.
In the following code, notice that the Process method is doing two different things:
validation and saving a Customer to a service.
namespace CleanCSharp.Methods.Dirty
{
    class Utils
    {
        public int Process(Customer customer)
        {
            if (string.IsNullOrWhiteSpace(customer.FirstName)
                || string.IsNullOrWhiteSpace(customer.LastName))
            {
                return -1;
            }
            else
            {
                var service = new CustomerService();
                if (!service.Save(customer))
                {
                    return -1;
                }
                else
                {
                    return 1;
                }
            }
        }
    }
}
In the preceding code (dirty naming aside) the method Process is responsible for
doing two separate logical things: validating that neither the first name nor the last
name is null/empty; and saving the customer (using a CustomerService). The number of lines
of code in the method body (excluding blank lines) is 16, which falls into the category
“possibly clean, but take notice”. Even so, this method is clearly not clean because it is
doing too much: validating and saving.
This is a key point. The cleanliness of code is a holistic matter. Just because
the number of lines of code in a method may seem “clean”, the method can
still be dirty in other ways.
The Process method in the preceding code has low functional cohesion. Because
it is doing two different logical things, it is also harder for the method to remain
small.
This method could be refactored into two separate methods, each method now being
more functionally cohesive as shown in the following code:
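namespace CleanCSharp.Methods.Clean
{
    class Utils
    {
        public int Process(Customer customer)
        {
            const int customerNotSaved = -1;
            const int customerSavedSuccessfully = 1;
            if (!IsValidCustomer(customer))
            {
                return customerNotSaved;
            }
            if (!SaveCustomer(customer))
            {
                return customerNotSaved;
            }
            return customerSavedSuccessfully;
        }
        // Validation and saving logic carried over from the dirty version
        private bool IsValidCustomer(Customer customer)
        {
            if (string.IsNullOrWhiteSpace(customer.FirstName)
                || string.IsNullOrWhiteSpace(customer.LastName))
            {
                return false;
            }
            return true;
        }
        private bool SaveCustomer(Customer customer)
        {
            var service = new CustomerService();
            var successfullySaved = service.Save(customer);
            return successfullySaved;
        }
    }
}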
In this cleaner version, the Process method is now only 11 lines of code. Some
constants have been introduced to replace the magic values (-1 and 1) and the nesting
has been reduced by eliminating an unnecessary else. There is more that could
be done here, but for the purpose of exploring functional cohesiveness, notice that
the IsValidCustomer and SaveCustomer methods each now do one well-defined
thing. Both of these methods have high functional cohesion and, because of this,
they are also short: 5 and 3 lines of code respectively.
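Action or Answering Methods
If a method both performs an action and answers a question for the caller, there may
be an opportunity to refactor it.
Take the following IsValidCustomer method:
private bool IsValidCustomer(Customer customer)
{
    if (string.IsNullOrWhiteSpace(customer.FirstName)
        || string.IsNullOrWhiteSpace(customer.LastName))
    {
        return false;
    }
    return true;
}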
This clearly falls into the “answers a question for the caller” category; it allows the
caller to answer the question “is the customer valid?”.
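Contrast this with the following SaveCustomer method:
private bool SaveCustomer(Customer customer)
{
    var service = new CustomerService();
    var successfullySaved = service.Save(customer);
    return successfullySaved;
}
In this example the Boolean return value of the method is being used as an error
flag. Ideally this would be refactored to use an exception-based approach.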
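One possible shape for such a refactoring is sketched below. The CustomerNotSavedException type, the simplified Customer class and the stand-in CustomerService are all assumptions made so that the sketch is self-contained; the real service and failure conditions are not shown in this chapter.

```csharp
using System;

namespace CleanCSharp.Methods.ExceptionSketch
{
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }

    // Hypothetical stand-in: a real service would persist the customer
    public class CustomerService
    {
        public bool Save(Customer customer)
        {
            return customer != null;
        }
    }

    // Hypothetical exception type used to signal a failed save
    public class CustomerNotSavedException : Exception
    {
        public CustomerNotSavedException(string message) : base(message)
        {
        }
    }

    public class Utils
    {
        // A pure "action" method: it returns nothing and throws on failure
        public void SaveCustomer(Customer customer)
        {
            var service = new CustomerService();
            if (!service.Save(customer))
            {
                throw new CustomerNotSavedException(
                    "The customer could not be saved.");
            }
        }
    }
}
```

With this shape the caller no longer needs to inspect a return value after every save; failures surface through a try/catch at whatever level is able to deal with them.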
Like the other aspects that make up clean code, there may be valid reasons to allow
these two concepts to be mixed in a single method. For example, in ORM code
when the Update, SaveChanges, etc. methods are called they may return an integer
representing the number of database records affected.
Method Parameters
The more parameters a method has, the harder it becomes to understand.
Methods can be divided into a number of categories based on the number of
parameters they take:
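Again these are just guidelines; the other factors of method cleanliness
should also be considered.
Methods with One Parameter
private bool IsValidCustomer(Customer customer)
{
    if (string.IsNullOrWhiteSpace(customer.FirstName)
        || string.IsNullOrWhiteSpace(customer.LastName))
    {
        return false;
    }
    return true;
}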
This method takes a single parameter called customer. Even though a Customer
contains multiple properties it is still classed as a single parameter.
Monadic methods generally fall more easily into one of the two method types: “action”
or “answering”.
Methods with Two Parameters
Because there are now two parameters, it can become easier for bugs to be
introduced. For example, if a method has two parameters, both of type string, then it is
easier to get the values passed to these two parameters mixed up.
If it is a simple matter to refactor a dyadic method into method(s) that take only
a single parameter this may increase readability and cleanliness, but only where it
makes sense and is not done in a contrived way. For example, a method to plot a
co-ordinate could naturally require two parameters, x and y, as could a method to add
two numbers together.
class PlotPoint
{
    public int X { get; set; }
    public int Y { get; set; }
}
The method can now be written using the monadic form as follows:
If the original method also contained a parameter to choose the size of the plotted
point then this refactoring would have reduced a triadic method to a monadic one
by also adding a Size property to the PlotPoint class.
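The refactoring described above can be sketched as follows. The Plotter class is hypothetical, and returning a description string (rather than drawing anything) is an assumption made purely so that the sketch can be run and checked.

```csharp
namespace CleanCSharp.Methods.ParameterObjectSketch
{
    public class PlotPoint
    {
        public int X { get; set; }
        public int Y { get; set; }
        public int Size { get; set; }
    }

    public class Plotter
    {
        // Triadic form: three int arguments that are easy to mix up
        public string Plot(int x, int y, int size)
        {
            return $"({x}, {y}) size {size}";
        }

        // Monadic form: the related values travel together in one object
        public string Plot(PlotPoint point)
        {
            return $"({point.X}, {point.Y}) size {point.Size}";
        }
    }
}
```

A call such as plotter.Plot(new PlotPoint { X = 10, Y = 15, Size = 10 }) also names each value at the call site, which the positional triadic call does not.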
Params
The params keyword allows a method parameter to take a variable number of values
when the method is called. Generally speaking a params parameter still counts as
a single parameter. For example a method with a single params parameter is still a
monadic method. However, params should not be used as a hack to reduce the number
of parameters in a method.
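A minimal sketch of a monadic method that uses params, assuming a hypothetical Calculator whose Add method accepts any number of values:

```csharp
namespace CleanCSharp.Methods.ParamsSketch
{
    public class Calculator
    {
        // Still one parameter, even though callers may pass many values
        public int Add(params int[] numbers)
        {
            int total = 0;
            foreach (int number in numbers)
            {
                total += number;
            }
            return total;
        }
    }
}
```

Here the values are all the same kind of thing (numbers to add), so params is a natural fit. Passing unrelated values through a params object[] simply to shorten the signature would be the kind of hack the text warns against.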
Output Parameters
C# allows a parameter to be defined as an out parameter. This means that what looks
like an input parameter can also function as an output from the method, perhaps in
addition to an actual return value.
Output parameters may be occasionally useful, for example the various TryParse
methods in the .NET framework that return a boolean if a value can be parsed from
a string, in addition to the output value in an out parameter.
The natural interpretation of method parameters is that they pass some information
into the method for it to use, rather than as a mechanism for the method to return
a value. Because of this, out parameters should be avoided unless there is a good
reason to use them.
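The TryParse case mentioned above can be sketched with int.TryParse. The Describe helper and its messages are assumptions made for this example:

```csharp
using System;

public static class OutParameterDemo
{
    // Hypothetical helper that reports whether the input could be parsed.
    public static string Describe(string input)
    {
        // int.TryParse returns false rather than throwing when parsing fails;
        // on success the parsed value is delivered through the out parameter.
        int parsed;

        if (int.TryParse(input, out parsed))
        {
            return "Parsed " + parsed;
        }

        return "Could not parse '" + input + "'";
    }

    public static void Main()
    {
        Console.WriteLine(Describe("42"));
        Console.WriteLine(Describe("abc"));
    }
}
```

The Boolean return value answers "did it work?", which leaves the out parameter as the only way to hand back the parsed number. This is one of the few situations where an out parameter reads naturally.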
Named Arguments
If a method with multiple parameters exists but cannot be refactored for some reason,
one way to increase the readability of the calling code is to explicitly state the
parameter names.
Assuming the following Plot method cannot be refactored:
This method takes three parameters, all of type int, and can be called using
positional arguments: Plot(10, 15, 10); This call is not particularly readable
without digging into the method's parameter list. The call could instead be rewritten
using named arguments: Plot(x: 10, y: 15, size: 10); This is slightly more verbose
but removes the ambiguity about what each argument means.
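A runnable sketch of the two calling styles follows. The parameter names x, y and size come from the named call above; the method body and the containing class are assumptions made for illustration:

```csharp
using System;

public static class NamedArgumentsDemo
{
    // Stand-in for the Plot method; the body is assumed.
    public static string Plot(int x, int y, int size)
    {
        return $"Point ({x}, {y}) with size {size}";
    }

    public static void Main()
    {
        // Positional arguments: the reader must know the parameter order.
        Console.WriteLine(Plot(10, 15, 10));

        // Named arguments: each value is labeled at the call site.
        Console.WriteLine(Plot(x: 10, y: 15, size: 10));

        // Named arguments may also be supplied in a different order.
        Console.WriteLine(Plot(size: 10, x: 10, y: 15));
    }
}
```

All three calls pass the same values to the same parameters; only the readability of the calling code differs.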
If a monadic method needs a named argument to improve readability, it may be a
sign that the method itself is not well-named.
namespace CleanCSharp.Methods.Dirty
{
    class BooleanSwitchingArgumentsExample
    {
        public void CallingCode()
        {
            if (DateTime.Now.Hour < 12)
            {
                OutputGreeting(true);
            }
            else
            {
                OutputGreeting(false);
            }
        }
namespace CleanCSharp.Methods.Clean
{
    class BooleanSwitchingArgumentsExample
    {
        public void CallingCode()
        {
            if (DateTime.Now.Hour < 12)
            {
                OutputMorningGreeting();
            }
            else
            {
                OutputDaytimeGreeting();
            }
        }
Multiple Returns
It is acceptable to have multiple return statements in a method if this improves the
clarity. This may also reduce the number of lines of code in the method.
In the following example, a strict “only ever have one return statement” policy has
been implemented:
namespace CleanCSharp.Methods.Dirty
{
class MethodExitPoints
{
public string GenerateAgeAppropriateGreeting(
int customerAgeInYears)
{
string greeting;
return greeting;
}
}
}
Compare this to the following method that uses multiple return statements:
namespace CleanCSharp.Methods.Clean
{
class MethodExitPoints
{
public string GenerateAgeAppropriateGreeting(
int customerAgeInYears)
{
if (customerAgeInYears < 16)
{
return "Yo!";
}
and is clearly bad practice. Not only this, it also means the method is doing more
than one logical thing.
Other side effects may include the unexpected changing of field/property values,
raising unexpected events, and changing input method parameter object values.
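As a minimal sketch of the last of these, the following hypothetical Customer class and validation methods (all names assumed for illustration) contrast a method that quietly changes its argument with one that does not:

```csharp
using System;

public class Customer
{
    public string Name { get; set; }
}

public static class SideEffectDemo
{
    // Dirty: the name suggests a simple question, but the method also
    // silently changes the object that was passed in.
    public static bool IsValidWithSideEffect(Customer customer)
    {
        customer.Name = customer.Name.Trim();
        return customer.Name.Length > 0;
    }

    // Cleaner: answers the question and leaves the argument untouched.
    public static bool IsValid(Customer customer)
    {
        return !string.IsNullOrWhiteSpace(customer.Name);
    }

    public static void Main()
    {
        var customer = new Customer { Name = "  Sarah  " };

        IsValid(customer);
        Console.WriteLine("[" + customer.Name + "]"); // padding still present

        IsValidWithSideEffect(customer);
        Console.WriteLine("[" + customer.Name + "]"); // padding has been removed
    }
}
```

A caller reading only the method name has no reason to expect that the second call alters the customer.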
Structuring Programs for Readability
The overall structure of large codebases can help to enhance readability or reduce it.
One way to improve readability is to think of the reader’s brain as being able to
work at only one level of abstraction at any given time.
For example, consider a method that is named at a higher abstraction level, such
as ValidateCustomer. The reader would expect this method (and the code it contains)
to describe validation/business rules. If the ValidateCustomer method also performs
work at a lower abstraction level, such as setting database fields, then the reader’s
brain has to switch mental abstraction levels when trying to read and understand
the code.
Levels of Abstraction
One metaphor to represent this idea of abstractions is that of traditional paper-based
books in a public lending library.
The following diagram represents the abstraction levels of source code using this
metaphor.
[Diagram: Levels of Abstraction]
While this metaphor is not perfect, it seeks to illustrate the fact that items higher up
in the diagram need a different level of cognitive processing than items lower down.
Imagine a library building that had no shelves or books but rather pages of books
stuck to all the walls of the building. In this case there are greatly reduced levels
of abstraction, essentially reduced to pages and paragraphs. Clearly this harms
readability, even if it might make a good art installation.
Grouping concepts into similar levels of abstraction can help in a number of ways.
First, it can improve navigation around the codebase. For example, in a
traditional book it is easy to gradually “drill down” to the right paragraph
by scanning the chapters first (higher abstraction level, less cognitive load) and
then drilling down into specific paragraphs. Second, it reduces unnecessary
cognitive processing. If, for example, a method operates at multiple
abstraction levels, it takes more mental energy to “read between the lines” and
focus only on the abstraction level that is required for the current task.
if (string.IsNullOrWhiteSpace(
prospectiveCustomer.SecondName))
{
throw new ArgumentException("Invalid SecondName");
}
return newValidCustomer;
}
}
Notice that in the preceding code, apart from the method doing too many different
things, it is also hard to get an overall, higher-level understanding of what is
going on. For example, the exact rules that determine what a valid FirstName is are
at a lower level of abstraction than the overall process of creating a new customer.
Also notice that the logic to determine whether a new customer is a priority customer
includes lower-level detail, such as the annual income being greater than 100000.
Compare the preceding code to the following refactored version:
var validatedCustomer =
CreateNewCustomerFrom(prospectiveCustomer);
SetCustomerPriority(validatedCustomer);
return validatedCustomer;
}
EnsureValidSecondName(prospectiveCustomer);
}
customer.IsPriorityCustomer = true;
}
}
}
In the preceding code, the class has been refactored in an attempt to represent
different levels of abstraction (as noted by the comments that have been included
purely for demo purposes). This means that a reader can more easily choose the
abstraction level they need to perform a particular task. For example, if the reader
just wants a high-level understanding of the steps to convert a ProspectiveCustomer
to a Customer, their brain can operate at that level of abstraction without being
distracted by lower-level details (such as specifics on annual income numbers).
Errors and Exceptions
Error handling is an essential part of most codebases. There are a number of ways of
handling errors and allowing calling code to respond to errors.
namespace CleanCSharp.Errors.Dirty
{
public class SomeClass
{
public int DoSomeProcess(int? id)
{
if (id == null)
{
return -1; // null id
}
if (string.IsNullOrWhiteSpace(data))
{
return -2; // data is corrupt
}
ProcessData(data);
A consumer of this code will need to check the various status codes to know what
has happened as the following code shows.
namespace CleanCSharp.Errors.Dirty
{
public class ConsumerOfSomeClass
{
public void Consume()
{
var sc = new SomeClass();
switch (returnCode)
{
case -1: // null id
// do something
break;
Notice in the preceding code that there is a lot of clutter; the business
logic/application flow is harder to recognize due to all the error handling code.
There are a number of other problems with the error code approach. First, every time
the DoSomeProcess method is called anywhere in the codebase, the calling code must
check the return codes. Assuming the programmer remembers to do this and the
correct error codes are used, code duplication creeps in and readability is reduced.
Second, the magic numbers representing the error codes carry little meaning. For
example, when a reader sees -2 they will need to do further reading to understand
the error that is being handled.
Returning error codes is also limited in other ways. For example, if the method
already returns a value, how is an additional error return code added? The same
limitation exists when accessing properties.
Using Exceptions
Rather than returning error codes, error handling in C# can be better implemented
using exceptions. Exceptions can simplify the calling code and make it easier for the
reader to reason about the error handling that has been implemented. The throw
statement is used to signal an error condition. The try, catch, and finally keywords
can be used to detect and respond to error conditions.
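As a brief, hedged illustration of how these keywords interact, the following sketch uses a hypothetical Run method (the class, method and log format are assumed for this example) that records which blocks execute:

```csharp
using System;

public static class TryCatchFinallyDemo
{
    // Returns a log of the blocks that ran, so the control flow is visible.
    public static string Run(bool fail)
    {
        var log = "";

        try
        {
            log += "try;";

            if (fail)
            {
                // throw signals the error condition.
                throw new InvalidOperationException("Something went wrong");
            }
        }
        catch (InvalidOperationException)
        {
            // Runs only when a matching exception was thrown in the try block.
            log += "catch;";
        }
        finally
        {
            // Runs whether or not an exception was thrown.
            log += "finally;";
        }

        return log;
    }

    public static void Main()
    {
        Console.WriteLine(Run(fail: false));
        Console.WriteLine(Run(fail: true));
    }
}
```

The finally block is the usual place for cleanup that must happen on both the success path and the failure path.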
The following code shows a refactored version of SomeClass that uses exceptions
rather than error return codes. Notice that the DoSomeProcess method appears more
readable than the preceding version.
using System;
using System.IO;
namespace CleanCSharp.Errors.Clean
{
public class SomeClass
{
public void DoSomeProcess(int? id)
{
if (id == null)
{
throw new ArgumentNullException("id");
}
ProcessData(data);
}
if (string.IsNullOrWhiteSpace(demoData))
{
throw new
InvalidDataException(
"The data stream contains no data.");
}
return demoData;
}
using System;
using System.Diagnostics;
using System.IO;
namespace CleanCSharp.Errors.Clean
{
public class ConsumerOfSomeClass
{
public void Consume()
{
var sc = new SomeClass();
try
{
sc.DoSomeProcess(idToProcess);
}
catch (ArgumentNullException ex)
{
// null id
// do something
throw;
}
catch (Exception ex)
{
// any other exceptions that may occur
// do something
throw;
}
Save(idToProcess);
}
Notice that rather than hard-to-understand error codes, the reader can now
understand what types of error may occur by looking at the exception types in the
catch blocks. Also notice that the catch blocks are ordered from the more specific
exception types down to the most general (the Exception class).
Usually, an exception is only caught if it needs to be handled in some way. When
an exception has been caught in a catch block, the problem can either be fixed or,
if it cannot be fixed, propagated (“rethrown”) to higher-level callers. When
rethrowing, using throw; rethrows the same exception with its original stack trace.
If throw ex; is used, the exception is still rethrown, but its stack trace is reset
to start at the rethrow point rather than where the exception was originally thrown.
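The difference between the two rethrow forms can be observed with a short sketch. All names here (RethrowDemo, Fail and the two rethrowing methods) are hypothetical and exist only for this example:

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RethrowDemo
{
    // NoInlining keeps this method as a distinct frame in the stack trace.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Fail()
    {
        throw new InvalidOperationException("Original failure");
    }

    public static void RethrowPreservingStackTrace()
    {
        try
        {
            Fail();
        }
        catch (InvalidOperationException)
        {
            throw; // keeps the stack trace that begins in Fail
        }
    }

    public static void RethrowResettingStackTrace()
    {
        try
        {
            Fail();
        }
        catch (InvalidOperationException ex)
        {
            throw ex; // the stack trace now begins at this line
        }
    }

    // Runs the action and returns the stack trace of whatever it throws.
    public static string StackTraceOf(Action action)
    {
        try
        {
            action();
            return "";
        }
        catch (Exception ex)
        {
            return ex.StackTrace;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StackTraceOf(RethrowPreservingStackTrace));
        Console.WriteLine("----");
        Console.WriteLine(StackTraceOf(RethrowResettingStackTrace));
    }
}
```

In the first trace the Fail method should still appear as the origin of the exception. In the second it should be missing, which makes the real source of the problem harder to find when debugging.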
• System.ArgumentException
• System.ArgumentNullException
• System.ArgumentOutOfRangeException
• System.InvalidOperationException
• System.NotSupportedException
using System;
namespace CleanCSharp.Errors.Clean
{
public class MyCustomException : Exception
{
public MyCustomException()
{
}
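A fuller sketch of a custom exception follows, assuming the conventional set of three constructors (parameterless, message, and message with inner exception). The demo class and its messages are hypothetical:

```csharp
using System;

public class MyCustomException : Exception
{
    public MyCustomException()
    {
    }

    public MyCustomException(string message)
        : base(message)
    {
    }

    // Allows the original exception to be kept when wrapping a lower-level error.
    public MyCustomException(string message, Exception innerException)
        : base(message, innerException)
    {
    }
}

public static class CustomExceptionDemo
{
    public static void Main()
    {
        try
        {
            throw new MyCustomException(
                "Something went wrong",
                new InvalidOperationException("Root cause"));
        }
        catch (MyCustomException ex)
        {
            Console.WriteLine(ex.Message);
            Console.WriteLine(ex.InnerException.Message);
        }
    }
}
```

Providing the inner exception constructor means that callers who wrap another exception do not lose the details of the underlying failure.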
Try Methods
In addition to throwing exceptions, another option that can be provided to
consumers is a Try method. A Try method returns true if the operation
succeeded and false if it failed (and does not throw an exception). The Try method
usually has an out parameter by which the result of the operation can be passed to
the caller.
The following code shows an example of implementing a method (Parse) that throws
an exception and a companion Try version (TryParse).
using System;
namespace CleanCSharp.Errors.Clean
{
public class Color
{
public static Color Parse(string colorName)
{
if (!IsValidColor(colorName))
{
throw new ArgumentOutOfRangeException(
"colorName",
colorName + " is not a valid color");
}
if (!IsValidColor(colorName))
{
color = null;
return false;
}
Calling code can make use of the TryParse method as shown in the following code.
Color c;
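A self-contained sketch of a Parse method, its companion TryParse and some calling code follows. The list of valid color names, the Name property and the private constructor are assumptions made for illustration:

```csharp
using System;

public class Color
{
    // Assumption: a fixed list stands in for the real validation rules.
    private static readonly string[] ValidNames = { "Red", "Green", "Blue" };

    public string Name { get; private set; }

    private Color(string name)
    {
        Name = name;
    }

    // Throws if the name is not valid.
    public static Color Parse(string colorName)
    {
        if (!IsValidColor(colorName))
        {
            throw new ArgumentOutOfRangeException(
                "colorName",
                colorName + " is not a valid color");
        }

        return new Color(colorName);
    }

    // Never throws for an invalid name; reports success through the return value.
    public static bool TryParse(string colorName, out Color color)
    {
        if (!IsValidColor(colorName))
        {
            color = null;
            return false;
        }

        color = new Color(colorName);
        return true;
    }

    private static bool IsValidColor(string colorName)
    {
        return Array.IndexOf(ValidNames, colorName) >= 0;
    }
}

public static class TryParseDemo
{
    public static void Main()
    {
        Color c;

        if (Color.TryParse("Red", out c))
        {
            Console.WriteLine("Parsed " + c.Name);
        }

        if (!Color.TryParse("Mauve", out c))
        {
            Console.WriteLine("Mauve could not be parsed");
        }
    }
}
```

Offering both methods lets the consumer choose: Parse when an invalid value is truly exceptional, TryParse when invalid input is an expected case that should be handled without a try/catch block.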
interface) to a “real” object but means that the consumer does not have to write null
checking logic or possibly catch ArgumentNullExceptions.
The following code shows a simple example where the EmailCustomer method has
to first perform a null check before sending emails.
namespace CleanCSharp.Errors.Dirty
{
public class Customer
{
public string EmailAddress { get; set; }
}
}
}
}
namespace CleanCSharp.Errors.Clean
{
public class Customer
{
public string EmailAddress { get; set; }
c.SendEmail("Hello!");
}
}
}
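The null object approach can be sketched as follows. The ICustomer interface, the NullCustomer class and the CustomerDirectory lookup are hypothetical names introduced for illustration, and this Customer records messages in memory rather than sending real email:

```csharp
using System;
using System.Collections.Generic;

public interface ICustomer
{
    void SendEmail(string message);
}

public class Customer : ICustomer
{
    public string EmailAddress { get; set; }

    // Assumption: messages are recorded instead of actually being sent.
    public List<string> SentMessages { get; } = new List<string>();

    public void SendEmail(string message)
    {
        SentMessages.Add(message);
    }
}

// Null object: satisfies the interface but deliberately does nothing.
public class NullCustomer : ICustomer
{
    public void SendEmail(string message)
    {
    }
}

// Hypothetical lookup that never returns null.
public class CustomerDirectory
{
    private readonly Dictionary<string, Customer> _customers =
        new Dictionary<string, Customer>();

    public void Add(Customer customer)
    {
        _customers[customer.EmailAddress] = customer;
    }

    public ICustomer Find(string emailAddress)
    {
        Customer customer;

        if (_customers.TryGetValue(emailAddress, out customer))
        {
            return customer;
        }

        return new NullCustomer();
    }
}

public static class NullObjectDemo
{
    public static void Main()
    {
        var directory = new CustomerDirectory();
        directory.Add(new Customer { EmailAddress = "sarah@example.com" });

        // Neither call needs a null check.
        directory.Find("sarah@example.com").SendEmail("Hello!");
        directory.Find("unknown@example.com").SendEmail("Hello!");

        Console.WriteLine("Both calls completed without null checks");
    }
}
```

The trade-off is that a missing customer is now silently ignored, so this approach suits cases where doing nothing is an acceptable outcome.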
Visual Formatting
The way that source code is visually formatted can have a great effect on the
readability of programs. Even if the code is otherwise clean, poor visual formatting
can still hurt readability.
The following code shows a version of ProspectiveCustomerValidator that maintains
some of the other clean code practices outlined in this book, but with some
vertical whitespace removed.
namespace CleanCSharp.VisualFormatting.Dirty{
public class ProspectiveCustomerValidator{
public Customer CreateValidatedCustomer(
ProspectiveCustomer prospectiveCustomer){
EnsureValidDetails(prospectiveCustomer);
var validatedCustomer=CreateNewCustomerFrom(
prospectiveCustomer);
SetCustomerPriority(validatedCustomer);
return validatedCustomer;
}
private static void EnsureValidDetails(
ProspectiveCustomer prospectiveCustomer){
EnsureValidFirstName(prospectiveCustomer);
EnsureValidSecondName(prospectiveCustomer);}
private static Customer CreateNewCustomerFrom(
ProspectiveCustomer prospectiveCustomer){
return new Customer{
FirstName=prospectiveCustomer.FirstName,
SecondName=prospectiveCustomer.SecondName,
AnnualIncome=prospectiveCustomer.AnnualIncome};}
private static void EnsureValidFirstName(
ProspectiveCustomer prospectiveCustomer){
if (string.IsNullOrWhiteSpace(
prospectiveCustomer.FirstName)){
throw new ArgumentException("Invalid FirstName");
}}
private static void EnsureValidSecondName(
ProspectiveCustomer prospectiveCustomer){
if (string.IsNullOrWhiteSpace(
prospectiveCustomer.SecondName)){
throw new ArgumentException("Invalid SecondName");
}
}private static void SetCustomerPriority(Customer customer)
{
if (customer.AnnualIncome > 100000){
customer.IsPriorityCustomer = true;
}
}
}
}
Notice that when reading the preceding code, the eye has to work very hard to
distinguish the different logical parts of the code.
There are some accepted design principles that can be applied to source code
formatting. These principles are often referred to as the Gestalt principles.
namespace CleanCSharp.VisualFormatting.ExampleCode
{
public class Class1
{
public void A()
{
//
}
The principle of Proximity can be used when declaring variables as the following
code demonstrates.
string x;
decimal y;
}
In the preceding code there is a sense that isVipCustomer and years are related
(though years should be renamed to something like yearsAtVipStatus rather than
relying solely on Proximity).
Proximity also applies to where variables are declared. Compare the traditional
approach of declaring all variables at the top of the method (lower proximity) with
declaring them throughout the method, close to where they are first needed (higher
proximity).
_firstName = "Sarah";
_lastName = "Smith";
fullName = _firstName + " " + _lastName;
_firstName = "Sarah";
_lastName = "Smith";
Uniform Connectedness
In C#, code blocks contain groups of (hopefully) related code - lines of code are
contained inside sets of braces {}. These braces contain code (much like the line
containing the dots) and can increase the feeling of relatedness.
bool a;
bool b;
bool c;
bool d;
These Booleans feel related due to their proximity and their similarity of names
(single letters, ascending order). If these same variables are contained in additional
braces then this changes the perception of their relatedness as the following code
demonstrates.
{
    bool a;
    bool b;
}
{
    bool c;
    bool d;
}
In the preceding code there are now two strongly distinct groups.
if (true) {
    Console.Write("true");
}
if (true)
{
    Console.Write("true");
}
Cohesion and Coupling
The concepts of cohesion and coupling are important concerns when it comes to
creating clean C# code.
Cohesion
Cohesion is the “relatedness” of different pieces of code; it is the degree to which
pieces of code belong together. The level of cohesion is often described as “low
cohesion” or “high cohesion”, though there are a number of “degrees of cohesion”
rather than it being a binary proposition.
The concept of cohesion can be applied at different levels in the source code. For
example an individual method could exhibit low cohesion if the lines of code within
it do not really belong together. The same thinking can be applied at the class level,
i.e. do all the methods, properties, etc. belong together? At a higher level, cohesion
can also be applied to namespaces and assemblies.
The following code shows a class that would be described as having low cohesion.
namespace CleanCSharp.CohesionAndCoupling.Dirty
{
public static class Utils
{
public static int AddNumbers(int a, int b)
{
return a + b;
}
}
}
}
In the preceding code, the Utils class itself exhibits low cohesion. The methods
inside the class are not highly related: the addition of two numbers has no relation
to validating a customer’s age or processing an Order. The ProcessOrder method
itself also exhibits low cohesion; the method is validating a customer and saving it
to a database. Notice that in both cases the names of the class and method are not
very specific. The class is called Utils, which does not have much meaning, and the
ProcessOrder method is also very non-specific. Often these generalized, non-specific
names are an indication that cohesion may be lacking.
There are various degrees of cohesion, from best to worst:
• Functional Cohesion: code that does highly related and well-defined task(s)
• Sequential Cohesion: code grouped by the (data) output of one thing being the
(data) input to the next, like a car assembly line
• Communicational Cohesion: code grouped together because it uses the same
set/table/entity of data
• Procedural Cohesion: code grouped by being parts of a task that need to be
executed in a particular order
• Temporal Cohesion: code grouped by when it is executed in the program
execution life cycle
Coupling
Whereas cohesion refers to the relatedness of pieces of code, coupling refers to how
strong the connection is between pieces of code. Pieces of code that are strongly
coupled, like two pieces of plastic superglued together, are hard to separate, reuse
and test independently.
Code that is “superglued” together is referred to as highly coupled, strongly coupled,
or tightly coupled. Conversely, non-“superglued” code is referred to using terms like
low coupling, loosely coupled, or weakly coupled.
There are many ways that pieces of code can be coupled. The following code shows
an example of Global Coupling (also known as Common Coupling).
namespace CleanCSharp.CohesionAndCoupling.Dirty
{
public class ClassA
{
public static string SomeSharedData;
}
In the preceding code ClassB and ClassC are coupled to ClassA by accessing the static
SomeSharedData field. The coupling here occurs in the lines ClassA.SomeSharedData
= "xxxx"; and var someVariable = ClassA.SomeSharedData;. The classes ClassB
and ClassC are also harder to test in isolation. In a test ClassA will also need to be
accessed because it contains the shared data.
Another way that tight coupling can manifest itself is by instantiating concrete
dependencies in code rather than relying on abstractions (interfaces, abstract
classes) and having the dependency passed into the class. With the latter approach,
the consuming class does not control the creation of its dependencies; something
external to the class creates a concrete instance and passes it to the class.
The following code shows the SendWelcomeEmails method creating a new concrete
EmailGateway before using it. The line var gateway = new EmailGateway();
creates a tight coupling between the NewCustomerWelcomeEmailSender class and the
EmailGateway class.
namespace CleanCSharp.CohesionAndCoupling.Dirty
{
public class EmailGateway
{
public void SendEmail(string address, string messageBody)
{
// etc.
}
}
namespace CleanCSharp.CohesionAndCoupling.Clean
{
public interface IEmailGateway
{
void SendEmail(string address, string messageBody);
}
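A sketch of how the loosely coupled version might be completed follows. The IEmailGateway interface matches the one shown above, but the constructor, the SendWelcomeEmails signature, the message text and the RecordingEmailGateway test double are all assumptions made for illustration:

```csharp
using System;
using System.Collections.Generic;

public interface IEmailGateway
{
    void SendEmail(string address, string messageBody);
}

public class NewCustomerWelcomeEmailSender
{
    private readonly IEmailGateway _gateway;

    // The dependency is passed in rather than created with 'new' inside the class.
    public NewCustomerWelcomeEmailSender(IEmailGateway gateway)
    {
        if (gateway == null)
        {
            throw new ArgumentNullException("gateway");
        }

        _gateway = gateway;
    }

    public void SendWelcomeEmails(IEnumerable<string> addresses)
    {
        foreach (var address in addresses)
        {
            _gateway.SendEmail(address, "Welcome!");
        }
    }
}

// Hypothetical test double: records addresses instead of sending email.
public class RecordingEmailGateway : IEmailGateway
{
    public List<string> SentTo { get; } = new List<string>();

    public void SendEmail(string address, string messageBody)
    {
        SentTo.Add(address);
    }
}

public static class LooseCouplingDemo
{
    public static void Main()
    {
        var gateway = new RecordingEmailGateway();
        var sender = new NewCustomerWelcomeEmailSender(gateway);

        sender.SendWelcomeEmails(new[] { "a@example.com", "b@example.com" });

        Console.WriteLine(gateway.SentTo.Count);
    }
}
```

Because the sender depends only on the interface, it can be tested in isolation with the recording gateway, and a real gateway can be supplied in production without changing the sender.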
Execution Speed
Unit tests, which exercise only a small part of the overall codebase (perhaps just a
single class), should execute very quickly. While there is no single absolute rule
for the maximum time, if developers are waiting for 30 minutes to run all the unit
tests there may be a problem.
One of the purposes of unit tests is to get quick feedback once changes are made.
Whilst a developer may run a subset of tests, as a general rule the entire suite of
unit tests should ideally execute in seconds rather than minutes. The ideal maximum
time will depend on the size and complexity of the code being tested.
Integration tests may rely on communication with out-of-process resources such as
the file system or a database. These types of tests will naturally run slower than
unit tests.
Valuable
Tests should provide some value. There is a cost to both create them and maintain
them over time. It seems like an obvious statement, but there is little value in testing
auto-implemented C# property setters and getters.
To test the PersonWriter.Write method the following simple test could be written
(using the xUnit.net testing framework):
[Fact]
public void WithoutAutoFixture()
{
// Arrange phase
// Act phase
// Assert phase
Notice in the preceding test that a Person needs to be created (with a name) to be
able to test the PersonWriter.
The test can be refactored to use AutoFixture to create a Person for us and
automatically set the Name property:
[Fact]
public void WithAutoFixture()
{
// Arrange phase
// Act phase
// Assert phase
In the preceding test, the actual Name of the Person is irrelevant; it is now
“anonymous”.
Combining AutoFixture with xUnit.net theories allows the test to be reduced further,
as shown in the following test:
[Theory]
[AutoData]
public void WithAutoData(PersonWriter sut)
{
// Arrange phase performed automatically for us
// Act phase
var result = sut.Write();
// Assert phase
To learn more about AutoFixture, see the project site readme document¹³ or the
Author’s Pluralsight course¹⁴.
¹³https://github.com/AutoFixture/AutoFixture/blob/master/README.md
¹⁴http://bit.ly/psautofixture
Building On Clean Code
There are a number of useful principles that build on top of or complement the other
Clean C# concepts covered in this book. These principles can further improve the
overall cleanliness of the codebase.
• Classes only need to be changed when the “thing” that they are responsible for
changes
• Classes have fewer lines of more highly-related code, making it easier for
developers to understand them
• More individual source code files (one for each class) could result in fewer merge
conflicts in larger teams
• Classes may be more easily testable
LSP can also help to identify where the OO model may be incorrect. For example, if
code starts to look like if (shape is Square) ... else if (shape is Rectangle)
..., it can be a “code smell” and the design should be evaluated against LSP.
Adhering to LSP can help to reduce complexity in calling code so that it can work
generically on any class or subclass without additional conditional code.
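As a hedged illustration, assuming a hypothetical Shape hierarchy with an abstract Area method (none of these members come from the original text), calling code that respects LSP can treat every shape uniformly, with no type checks:

```csharp
using System;
using System.Collections.Generic;

public abstract class Shape
{
    public abstract double Area();
}

public class Rectangle : Shape
{
    public double Width { get; set; }
    public double Height { get; set; }

    public override double Area()
    {
        return Width * Height;
    }
}

public class Square : Shape
{
    public double Side { get; set; }

    public override double Area()
    {
        return Side * Side;
    }
}

public static class AreaCalculator
{
    // Works on any Shape subclass; no 'if (shape is Square)' checks are needed.
    public static double TotalArea(IEnumerable<Shape> shapes)
    {
        double total = 0;

        foreach (var shape in shapes)
        {
            total += shape.Area();
        }

        return total;
    }
}

public static class LspDemo
{
    public static void Main()
    {
        var shapes = new List<Shape>
        {
            new Rectangle { Width = 2, Height = 3 },
            new Square { Side = 2 }
        };

        Console.WriteLine(AreaCalculator.TotalArea(shapes));
    }
}
```

Adding a new Shape subclass later requires no change to TotalArea, which is the reduction in calling-code complexity described above.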
There is a more subtle meaning to the DRY principle: “every piece of knowledge must
have a single, unambiguous, authoritative representation within a system” [Hunt, A.
& Thomas, D. (1999). The Pragmatic Programmer: From Journeyman to Master].
This means that any given concept or abstraction that the code embodies should be
defined only once, in one place, and should be clear to the reader/maintainer.
DRY can also apply to things such as configuration files. For example, a 300-line
configuration file that has 5 copies, one for each deployment environment, where the
copies differ by only a few lines, could be considered a violation of the DRY
principle. Instead there could be only one copy that gets transformed to change the
relevant values for the different environments.
• Bloated code: unused code exists now, and may never be removed
• Increased Cost: the unused code may still need testing and maintaining
• Slipped dates: the unused code takes time away from other features that may
impact overall delivery dates
• Waste: by the time the unused code is actually required, it may no longer be
valid if business requirements have changed in the meantime