Lessons from Clean Code and Refactoring

I’m not an OOP fan (although C++ is more my home language than any other), so I’ve omitted the purely OOP material, which is heavily present in these books. (Indeed, they at times seem to conflate OOP with refactoring/cleaning: OOP/refactored/clean = good, procedural/dirty = bad.)

One difference between a smart programmer and a professional programmer is that the professional understands that clarity is king. Professionals use their powers for good and write code that others can understand.

I am not expecting you to be able to write clean and elegant programs in one pass. If we have learned anything over the last couple of decades, it is that programming is a craft more than it is a science. To write clean code, you must first write dirty code and then clean it.


– Comments are (mostly) bad. …comments are, at best, a necessary evil. …The proper use of comments is to compensate for our failure to express ourself in code. Note that I used the word failure. I meant it. Comments are always failures. We must have them because we cannot always figure out how to express ourselves without them, but their use is not a cause for celebration. …So when you find yourself in a position where you need to write a comment, think it through and see whether there isn’t some way to turn the tables and express yourself in code. Every time you express yourself in code, you should pat yourself on the back. Every time you write a comment, you should grimace and feel the failure of your ability of expression. … Why am I so down on comments? Because they lie. Not always, and not intentionally, but too often. The older a comment is, and the farther away it is from the code it describes, the more likely it is to be just plain wrong. The reason is simple. Programmers can’t realistically maintain them.
++++– they rot very quickly, becoming out-of-date and actively misleading/confusing
++++– Where a comment is needed to describe what a section does, make a function with that name.
++++ i.e. not
++++++++ #this bit draws the picture
++++ but
++++++++ DrawPicture()
++++– comments that can be good: explanations of intent, warnings of consequences, clarification
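
To make the "not a comment, but a function name" point concrete, here is a minimal Python sketch. The names draw_picture and render_report, and the idea of a "picture" being a list of text rows, are my own inventions for illustration; they are not from either book.

```python
from __future__ import annotations


def draw_picture(width: int, height: int) -> list[str]:
    """Return a filled rectangle of '#' characters, one string per row."""
    return ["#" * width for _ in range(height)]


def render_report(title: str) -> str:
    # Before the refactoring, the drawing loop sat inline here, introduced
    # by a comment such as "# this bit draws the picture". The function
    # name now carries that information, so the comment is not needed.
    picture = draw_picture(width=4, height=2)
    return "\n".join([title, *picture])
```

The comment that remains in render_report is there only to explain the refactoring; in real code it would go too.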


– make names of variables, constants, functions etc. as illuminating as possible. Names in software are 90 percent of what makes software readable. You need to take the time to choose them wisely and keep them relevant. Don’t name variables at too low a level. Use long names for long scopes; “i” is fine for a loop variable that only lasts a few lines. Names should describe everything that a function, variable, or class is or does. Don’t hide side effects with a name. Don’t use a simple verb to describe a function that does more than just that simple action.
If you have to look at the implementation (or documentation) of the function to know what it does, then you should work to find a better name or rearrange the functionality so that it can be placed in functions with better names.


– use blank lines to separate the package declaration, the import(s), and each of the functions. Put spaces around = and == etc. Use spacing in maths to accentuate precedence, e.g. a = 3*4 + 5


– if a calculation looks too complex, break it up, introduce extra “explaining variables” with helpful names.
– a temporary variable should only do one thing. Split them into multiple variables if need be.
– replace magic numbers with well-named constants
– get rid of boolean control flags
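
A rough Python sketch of the four points above. The pricing rules and every constant below are made up for illustration; only the shape of the refactoring matters.

```python
# Hypothetical pricing rules, invented purely for illustration.
BULK_THRESHOLD = 500        # units beyond this count earn a discount
BULK_DISCOUNT_RATE = 0.05
SHIPPING_RATE = 0.1
MAX_SHIPPING = 100.0


def order_price(quantity: int, item_price: float) -> float:
    # Before, this was a single expression full of magic numbers:
    #   quantity * item_price
    #     - max(0, quantity - 500) * item_price * 0.05
    #     + min(quantity * item_price * 0.1, 100.0)
    # Each explaining variable below is assigned once and means one thing.
    base_price = quantity * item_price
    discounted_units = max(0, quantity - BULK_THRESHOLD)
    quantity_discount = discounted_units * item_price * BULK_DISCOUNT_RATE
    shipping = min(base_price * SHIPPING_RATE, MAX_SHIPPING)
    return base_price - quantity_discount + shipping


def first_negative(values):
    # Instead of setting a boolean `found` flag and testing it on every
    # iteration, return as soon as the answer is known.
    for value in values:
        if value < 0:
            return value
    return None
```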


Functions should not be 100 lines long. Functions should hardly ever be 20 lines long.

Command Query Separation
Functions should either do something or answer something, but not both. Either your function should change the state of an object, or it should return some information about that object. Doing both often leads to confusion.
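
As a sketch of what this separation might look like in Python (the Settings class and its method names are hypothetical, chosen only for this example):

```python
class Settings:
    """A tiny attribute store with queries and commands kept apart."""

    def __init__(self):
        self._values = {}

    # Query: answers a question and changes nothing.
    def has(self, name: str) -> bool:
        return name in self._values

    # Query: returns information and changes nothing.
    def get(self, name: str) -> str:
        return self._values[name]

    # Command: changes state and returns nothing.
    def set(self, name: str, value: str) -> None:
        self._values[name] = value
```

The confusing alternative would be a set() that also returns a boolean, so that a call like `if settings.set("username", "unclebob"):` leaves the reader guessing whether it asks a question or performs an action.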

The ideal number of arguments for a function is zero (niladic). Next comes one (monadic), followed closely by two (dyadic). Three arguments (triadic) should be avoided where possible. More than three (polyadic) requires very special justification—and then shouldn’t be used anyway.
Flag arguments are ugly. Passing a boolean into a function is a truly terrible practice. [functions should do only one thing – a function that takes a flag does two things, one when the flag is true and another when it is false!]
When a function seems to need more than two or three arguments, it is likely that some of those arguments ought to be wrapped into a class of their own. Consider, for example, the difference between the two following declarations:
++++ Circle makeCircle(double x, double y, double radius);
++++ Circle makeCircle(Point center, double radius);
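
The second declaration could be sketched in Python roughly as follows. This is my own translation using dataclasses, not code from the book, and the area method is added only to give the object something to do.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Circle:
    center: Point
    radius: float

    def area(self) -> float:
        return math.pi * self.radius ** 2


def make_circle(center: Point, radius: float) -> Circle:
    # Two arguments instead of three: x and y travel together as a Point.
    return Circle(center, radius)
```

Usage would be make_circle(Point(1.0, 2.0), 3.0), which also makes it harder to mix up the coordinates and the radius.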

Have No Side Effects
Side effects are lies. Your function promises to do one thing, but it also does other hidden things.
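
A small Python sketch of the idea, assuming a hypothetical Session class and password check (the plain string comparison is for illustration only and is not how real credentials should be checked):

```python
class Session:
    def __init__(self):
        self.active = False

    def initialize(self) -> None:
        self.active = True


def password_matches(stored: str, attempt: str) -> bool:
    # Does exactly what the name promises: it compares, and nothing else.
    # The hidden-side-effect version would also call session.initialize()
    # here, so merely checking a password could wipe or start a session.
    return stored == attempt


def log_in(session: Session, stored: str, attempt: str) -> None:
    # The state change lives in a function whose name announces it.
    if password_matches(stored, attempt):
        session.initialize()
```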

– simplify conditionals until they’re very simple
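
One common way to simplify a conditional is to give the condition and each branch its own well-named function. A sketch in Python; the season dates, rates, and charge are invented for illustration:

```python
from datetime import date

# Hypothetical tariff data.
SUMMER_START = date(2024, 6, 1)
SUMMER_END = date(2024, 8, 31)
SUMMER_RATE = 1.5
WINTER_RATE = 1.0
WINTER_SERVICE_CHARGE = 10.0


def is_summer(day: date) -> bool:
    return SUMMER_START <= day <= SUMMER_END


def summer_charge(quantity: float) -> float:
    return quantity * SUMMER_RATE


def winter_charge(quantity: float) -> float:
    return quantity * WINTER_RATE + WINTER_SERVICE_CHARGE


def charge(day: date, quantity: float) -> float:
    # Before: the date comparison and both formulas were written inline
    # in one if/else. Now the conditional reads almost like a sentence.
    if is_summer(day):
        return summer_charge(quantity)
    return winter_charge(quantity)
```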

bad smells

duplicated code, long function, long parameter list, data clumps (needing to be put together into one object/structure), switch statements, comments (comments often are used as a deodorant. It’s surprising how often you look at thickly commented code and notice that the comments are there because the code is bad.)

from Refactoring:
When you find you have to add a feature to a program, and the program’s code is not structured in a convenient way to add the feature, first refactor the program to make it easy to add the feature, then add the feature.

Before you start refactoring, check that you have a solid suite of tests. These tests must be self-checking.
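
"Self-checking" means the test itself decides pass or fail, rather than printing output for a human to eyeball. A minimal Python sketch, where slugify is a hypothetical function under test and run_tests is a stand-in for a real test runner:

```python
def slugify(title: str) -> str:
    """Hypothetical function under test: lower-case, hyphen-separated."""
    return "-".join(title.lower().split())


def test_slugify_joins_words_with_hyphens():
    # Not self-checking: print(slugify("Clean Code")) and read the output.
    # Self-checking: the assertion fails loudly if the result is wrong.
    assert slugify("Clean Code") == "clean-code"


def test_slugify_collapses_extra_whitespace():
    assert slugify("  Refactoring   Notes ") == "refactoring-notes"


def run_tests() -> int:
    """Run every test; an AssertionError propagates on the first failure."""
    tests = [
        test_slugify_joins_words_with_hyphens,
        test_slugify_collapses_extra_whitespace,
    ]
    for test in tests:
        test()
    return len(tests)
```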

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

When you feel the need to write a comment, first try to refactor the code so that any comment becomes superfluous.

A good time to use a comment is when you don’t know what to do. In addition to describing what is going on, comments can indicate areas in which you aren’t sure. A comment is a good place to say why you did something. This kind of information helps future modifiers, especially forgetful ones.

REFACTORINGS (not including the purely OO ones)
Methods (i.e. functions, procedures)
Rename method
Add parameter
Remove parameter
Separate query from modifier
++++– Methods should do one thing! If they return a value, they shouldn’t change anything (no side-effects) – unless that’s obvious from the method name.
Parameterize method
Replace parameter with explicit methods
Preserve whole object
++++– Instead of passing a parameter list, pass the whole object.
Replace parameter with method
Introduce parameter object
Form template method
Extract Method
Inline Method
Inline Temp
Replace Temp with Query
Introduce Explaining Variable
Split Temp
Remove assignments to parameters
Substitute algorithm
Replace magic number with symbolic constant
Decompose conditional
Consolidate conditional expression
Consolidate duplicate conditional fragments
Remove control flag
Introduce assertions
Replace nested conditional with guard clauses
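
To give a flavour of one item from the list, here is a sketch of "Replace nested conditional with guard clauses" in Python. The Employee fields and the pension fraction are assumptions made up for the example.

```python
from dataclasses import dataclass

PENSION_FRACTION = 0.5  # hypothetical


@dataclass
class Employee:
    salary: float
    is_separated: bool = False
    is_retired: bool = False


def pay_amount(employee: Employee) -> float:
    # Before: if separated ... else: if retired ... else: ..., with a
    # single `result` variable returned at the very end.
    # After: each special case exits early, and the normal case is left
    # unindented at the bottom.
    if employee.is_separated:
        return 0.0
    if employee.is_retired:
        return employee.salary * PENSION_FRACTION
    return employee.salary
```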


The Three Laws of TDD
By now everyone knows that TDD asks us to write unit tests first, before we write production code. But that rule is just the tip of the iceberg. Consider the following three laws:
FIRST LAW You may not write production code until you have written a failing unit test.
SECOND LAW You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
THIRD LAW You may not write more production code than is sufficient to pass the currently failing test.
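
One turn of that cycle might look like this in Python. The Stack class is a hypothetical example of mine, and the closest Python equivalent of "not compiling" is a NameError at the moment the test is called.

```python
# Step 1 (first and second laws): write just enough of a test to fail.
# Called at this point, it fails with NameError because Stack does not
# exist yet.
def test_empty_stack_has_size_zero():
    assert Stack().size() == 0


# Step 2 (third law): write just enough production code to pass.
class Stack:
    def size(self) -> int:
        # Deliberately naive. The next failing test (for example, size
        # after a push) is what would force a real implementation.
        return 0
```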

Test code is just as important as production code. It is not a second-class citizen. It requires thought, design, and care. It must be kept as clean as production code.

In clean tests, the BUILD-OPERATE-CHECK pattern is made obvious by the structure of the tests.
Each of the tests is clearly split into three parts. [by blank lines] The first part builds up the test data, the second part operates on that test data, and the third part checks that the operation yielded the expected results.
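
A minimal Python sketch of a test laid out that way. The function total_with_tax and its numbers are hypothetical; the point is the three blank-line-separated parts.

```python
def total_with_tax(prices, tax_rate: float) -> float:
    """Hypothetical function under test."""
    return sum(prices) * (1 + tax_rate)


def test_total_with_tax_adds_tax_to_the_sum():
    # Build: set up the test data.
    prices = [10.0, 20.0]
    tax_rate = 0.5

    # Operate: call the code under test.
    total = total_with_tax(prices, tax_rate)

    # Check: verify the result.
    assert total == 45.0
```

In real test code the three comments would normally be dropped, since the blank lines alone make the structure visible.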