Rating: 7.2/10.
Code Complete: A Practical Handbook of Software Construction, Second Edition by Steve McConnell
A fairly large book (about 850 pages) about “software construction”, essentially the process of writing code. The book is basically a long list of recommendations on how to write code that is correct and readable, kind of like a style guide, but longer and language-agnostic. A main theme is that you should write code for people first, and not just for the compiler. Although the book has a lot of pages, its scope is quite narrow: it only gives advice on the mechanical aspects of creating source code (eg: variable names, functions, loops, data types) but a lot of harder parts of software engineering are out of scope for this book (eg: design patterns, working with data, deployment, etc).
Overall, I think this book is best for junior programmers just starting out in the industry. As someone with a few years of experience, I found this book to be overly verbose and mostly confirming what I intuitively knew after 1-2 software developer internships. The advice is mostly language-agnostic and the book gives examples in C, C++, Java and Visual Basic, but it has not been updated since it was published in 2004, so some parts of it are obsolete. Examples include the strategies for dealing with strings and arrays in C and for avoiding memory-related problems in C++, where such errors are common; most high-level programming languages no longer have to deal with these problems.
Part 1: Laying the Foundation
Ch1. Construction is any activity involving the program’s source code, including debugging, unit testing, etc, excluding non-programming activities like UI design.
Ch2. Metaphors for understanding the software development process. “Writing” is a poor metaphor because the project is never completed. “Building” is a better metaphor: the more complex a project is, the more planning you need to do, since changes to the existing structure are more expensive.
Ch3. Need to do planning before jumping into coding, since mistakes are more expensive the later in the process they are discovered; thus gather requirements and design the architecture before implementation and QA. At the same time, requirements cannot be fully known from the beginning.
Architecture breaks down the problem into building blocks where it’s clear what are the responsibilities of each block; it should also make decisions on data format, error handling, fault tolerance, etc.
Ch4. Characteristics of different programming languages; the choice is important since it’s difficult to mix different languages together. Tools in an early stage of their lifecycle will require you to spend more time struggling with the tool than on development.
Part 2: Creating High-quality Code
Ch5. Purpose of design is to minimize accidental complexity, and it happens on several levels. At the component level, want to reduce communication between components so modifying one does not require you to understand everything else. At the class level, want to make abstractions that hide the details (especially things that are likely to change), so the user can work with a simpler view of the object. Avoid writing code that assumes knowledge of internals of another component that are not part of its public contract.
Ch6. Abstract data types (ADTs) form the foundation for classes and encapsulate variables and methods to enable abstraction. All methods in a class should be on the same level of abstraction. Keep implementation details and member variables out of the public interface by making them private; also don’t rely on semantic implementation details of classes you use. Inheritance is a different type of public interface, and should only be used when the derived class can be substituted for the base class (Liskov substitution principle); otherwise prefer containment (a “has a” relationship).
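To illustrate the containment advice, here is a minimal sketch in Python (my own example, not taken from the book). The hypothetical Stack class holds a list privately instead of inheriting from list, so callers only see the operations that make sense for a stack.

```python
class Stack:
    """A stack that contains a list instead of inheriting from it."""

    def __init__(self):
        # Private by convention: callers never touch the list directly.
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def __len__(self):
        return len(self._items)


stack = Stack()
stack.push(1)
stack.push(2)
print(stack.pop())  # 2
print(len(stack))   # 1
```

Had Stack inherited from list, it would also expose insert, sort and so on, which would let callers break the last-in-first-out behavior.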
Ch7. Routines are a good tool for reducing complexity; try to put operations that have high cohesion together in a routine. Annotate which arguments are meant to be inputs vs outputs, and keep arguments in a consistent order across routines. Be careful with macros, as their expansion has some gotchas.
Ch8. Often have tradeoff between correctness and robustness, robust application should handle the error in some way rather than crashing (eg: return last known value or neutral value). Safety-critical applications favor correctness but consumer applications favor robustness.
Exceptions are a useful tool to indicate error conditions, they are part of the public interface so make sure they’re at an appropriate abstraction level. Assertions are more for situations that should logically never occur. Don’t always apply the same error handling in dev and prod versions, sometimes you want to crash quickly in dev so it’s easier to debug, but recover gracefully in production.
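A rough sketch of how these three ideas can fit together, in Python. All names here (SensorReadError, parse_temperature, latest_temperature) are hypothetical and chosen only for illustration: the exception is raised at the caller's level of abstraction, the assertion guards a condition that should never happen if the code is correct, and the last function shows the "robust" choice of falling back to the last known value.

```python
class SensorReadError(Exception):
    """Error expressed at the caller's abstraction level (hypothetical name)."""


def parse_temperature(raw):
    # Translate the low-level ValueError into a domain-level exception.
    try:
        return float(raw)
    except ValueError as exc:
        raise SensorReadError(f"not a number: {raw!r}") from exc


def average(readings):
    # Assertion: an empty list here means a bug in the calling code.
    assert len(readings) > 0, "caller must pass at least one reading"
    return sum(readings) / len(readings)


def latest_temperature(raw, last_known):
    # Robustness over correctness: fall back to the last known value.
    try:
        return parse_temperature(raw)
    except SensorReadError:
        return last_known


print(latest_temperature("21.5", 20.0))     # 21.5
print(latest_temperature("garbage", 20.0))  # 20.0
```

A safety-critical system would more likely let SensorReadError propagate than silently reuse an old reading.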
Ch9. Pseudocode is useful as an intermediate step before the actual implementation; you can write pseudocode to discuss and review design choices. This lets you think at the level of interfaces and data structures, leaving implementation details for later.
Part 3: Variables
Ch10. Keep variables “live” for as short a time as possible: declare them close to where they are first used, which minimizes the amount of time you have to mentally keep track of them. Initialize them properly since in some languages the default behavior can be unpredictable. Minimize the variable’s scope to what’s necessary; this is less convenient to write but easier to read. Each variable should have only one purpose; avoid packing multiple entangled meanings into one integer.
Ch11. Optimal variable name length is about 10-16 characters, but short names are fine and signal that the variable is temporary and has small scope, like loop indices. Avoid magic constants and always give them names. Adopt consistent conventions for names of objects vs classes, global variables, named constants, etc; some languages have standard conventions already while some don’t. In older languages with hard limits on variable name length, use standardized abbreviations and comment on their meanings in a table.
Ch12. Characteristics of integers and floating point numbers, best practices for dealing with strings in C where errors are common. Enumerated types and booleans can be simulated in languages that don’t have them. Typedef is useful for communicating intent beyond default types.
Ch13. Structures are useful for grouping data, they’re basically like a class with only public members and no methods. Pointers are tricky, some best practices to prevent accidentally pointing to garbage data, check for corruption, etc. Give names to intermediate variables to avoid complicated pointer expressions. Global data is best avoided since it is hard to refactor and error-prone; consider using access routines to manipulate global data in a more controlled way.
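As a small illustration of naming magic constants (my own Python sketch, with made-up constant names and values):

```python
# Named constants instead of bare 86400 and 3 scattered through the code.
SECONDS_PER_DAY = 24 * 60 * 60
MAX_LOGIN_ATTEMPTS = 3


def days_to_seconds(days):
    return days * SECONDS_PER_DAY


def is_locked_out(failed_attempts):
    return failed_attempts >= MAX_LOGIN_ATTEMPTS


print(days_to_seconds(2))  # 172800
print(is_locked_out(3))    # True
```

If the lockout policy changes, only MAX_LOGIN_ATTEMPTS needs to be edited, and the comparison reads as intent rather than as an unexplained number.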
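One way to picture access routines for global data is the following Python sketch. The setting name and its validation rule are assumptions made up for the example.

```python
# Module-level state, private by convention (leading underscore).
_max_connections = 10


def get_max_connections():
    return _max_connections


def set_max_connections(value):
    """Single place where the global may change, so it can be validated."""
    global _max_connections
    if not isinstance(value, int) or value < 1:
        raise ValueError(f"max connections must be a positive int, got {value!r}")
    _max_connections = value


set_max_connections(5)
print(get_max_connections())  # 5
```

Because every write goes through set_max_connections, it is easy to add logging or locking later without hunting down every place that touches the variable.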
Part 4: Statements
Ch14. Use function arguments and return types to communicate whether a group of statements is ordering-dependent or not. Group related statements together, so that you can draw non-overlapping boxes around groups of related statements.
Ch15. In if-else statements, use the if-case for the correct path and else for error conditions, this way all the error handling is at the end when they’re nested. In multiple if-else statements, use the last else to error in unhandled cases. In case statements, avoid the fallthrough feature (or at least document it carefully).
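A minimal Python sketch of the "last else catches unhandled cases" advice. The function, regions and prices are hypothetical.

```python
def shipping_cost(region):
    if region == "domestic":
        return 5
    elif region == "europe":
        return 15
    elif region == "overseas":
        return 30
    else:
        # Unhandled case: fail loudly instead of silently guessing a price.
        raise ValueError(f"unhandled region: {region!r}")


print(shipping_cost("europe"))  # 15
```

If a new region is added elsewhere in the program but forgotten here, the final else surfaces the omission right away.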
Ch16. While loops are good for flexibility when you don’t know how many times it will loop, for loops (or foreach) are good for rigid situations when the number of iterations is known. Put loop logic (like continue statements, index updates) at the beginning or end of loop. Use break sparingly since you can no longer treat the loop body as a black box.
Ch17. Recursion is useful sometimes but factorial / fibonacci are bad examples. Good idea to use a safety counter to prevent infinite recursion. Gotos are rare in most languages but sometimes useful for error handling logic.
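The safety counter idea could look roughly like this in Python. The find_root function and the MAX_DEPTH limit are my own illustrative choices, not something from the book.

```python
MAX_DEPTH = 100  # assumed limit; pick one that fits the real data


def find_root(parent_of, node, depth=0):
    """Follow parent links upward until a node without a parent is reached."""
    if depth > MAX_DEPTH:
        # Safety counter tripped: most likely a cycle in the parent links.
        raise RecursionError(f"exceeded {MAX_DEPTH} levels, possible cycle")
    parent = parent_of.get(node)
    if parent is None:
        return node
    return find_root(parent_of, parent, depth + 1)


print(find_root({"c": "b", "b": "a"}, "c"))  # a
```

With cyclic input such as {"a": "b", "b": "a"}, the counter turns what would be unbounded recursion into a clear error at a depth you control.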
Ch18. Table lookup is useful when the alternative is a long if-else or switch statement. It can also be used when you want to do something based on what range a value is in.
Ch19. Boolean expressions that avoid negations are easier to read. Name boolean variables so that they indicate a true/false value, and avoid comparing them to true/false explicitly. For numeric values, on the other hand, be explicit and compare against zero, eg: if(var != 0) rather than if(var). Deeply nested if-statements can be refactored by retesting part of the condition, moving code into routines, or using polymorphism.
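A range-based table lookup can be sketched in Python with the standard bisect module. The grade boundaries below are assumptions for the example.

```python
from bisect import bisect_right

# Lower bound of each grade above "F"; both tables are kept side by side.
GRADE_LOWER_BOUNDS = [60, 70, 80, 90]
GRADES = ["F", "D", "C", "B", "A"]


def grade_for(score):
    # bisect_right counts how many lower bounds are <= score,
    # which is exactly the index of the matching grade.
    return GRADES[bisect_right(GRADE_LOWER_BOUNDS, score)]


print(grade_for(59))  # F
print(grade_for(60))  # D
print(grade_for(95))  # A
```

Changing the grading scheme then means editing the two tables, not rewriting a chain of if-else comparisons.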
Part 5: Code Improvements
Ch20. Different measures of code quality (like correctness, efficiency, readability) are at odds with each other, so management should give directions on which to optimize for. Many different ways of detecting defects (code reviews, design inspections, unit tests, system tests, etc), each only has a moderate chance of detecting defects so a combination of multiple ways is best.
Ch21. Pair programming is when two people code on a problem together, this is most helpful for more difficult problems and when both people are actively engaged. Formal inspection is a meeting where multiple people review code with the aim of finding defects, this is more effective than informally walking through the code with a colleague.
Ch22. Developers tend to write easy tests. While it’s impossible to test all possible inputs, aim to cover every branch in the program; inputs that exercise the same branch can be considered equivalent, so one representative per group is usually enough. Some data on types of errors typical in programs and their frequencies. Use tools like mock objects and test data generators, and automate the testing procedure.
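As a sketch of picking test inputs by branch, here is a hypothetical ticket_price function in Python with one group of cases per branch, including the values on each boundary. The pricing rule is invented for the example.

```python
def ticket_price(age):
    """Hypothetical pricing rule used only to illustrate test selection."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 12:
        return 5   # child
    if age < 65:
        return 10  # adult
    return 7       # senior


# Representatives for each equivalence class, at the edges of each range.
CASES = [
    (0, 5), (11, 5),     # child branch
    (12, 10), (64, 10),  # adult branch
    (65, 7), (90, 7),    # senior branch
]

for age, expected in CASES:
    assert ticket_price(age) == expected, (age, expected)
```

The error branch for negative ages needs its own test as well, since no case in the table reaches it.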
Ch23. When debugging, gather information about the problem, make hypotheses, systematically use scientific method to narrow down the root cause, until you have the minimal example that reliably reproduces the error. Use tools like diff tools, profilers, debuggers.
Ch24. Software changes during its development, so you will often end up with “code smells”: code that is hard to understand and should be refactored. Long list of specific ways to refactor code at the statement, method, class, and system level.
Ch25. Before tuning for performance, get a sense of what performance is required. Benchmark to see what the slowest part of the program is, often IO or system calls, then iterate on it. Improving performance often makes the code less readable.
Ch26. A bunch of tricks to improve performance at a low level, with benchmarks in several languages, including loop unrolling, replacing expensive arithmetic operations, caching function return values, rewriting in assembler, etc.
Part 6: System Considerations
Ch27. As team size grows, communication overhead increases, bugs per thousand lines of code increase, and relatively less time is spent on software construction.
Ch28. Managers should encourage good coding practices such as code review and version control. Ways to estimate the work of a project, be willing to reduce scope when falling behind. Measure metrics in software development processes that you want to improve.
Ch29. Integration is putting pieces of software together; incremental integration is better than trying to put it all together at the end. Can write software top-down, bottom-up (ie: write all the low level methods first), riskiest component first, or feature by feature. Have a continuous build and smoke test that’s always passing and don’t let new changes break it.
Ch30. Overview of tools for programmers, such as IDEs, code checkers, build tools, debuggers, profilers, etc. Write your own tools to automate things that are tedious.
Part 7: Software Craftsmanship
Ch31. Layout is important for making code readable for humans, even though different layouts are equivalent to the compiler. Picking an arbitrary style is better than using inconsistent styles. Use spacing and indentation to indicate grouping, although there are disagreements about where to put braces. Use formatting and layout to make the boundaries between methods and classes clear, and put each class in its own file.
Ch32. Some developers argue that code should be written to be self-documenting and avoid comments. Comments should explain the code at a higher level than the code, communicate the intent and not just repeat the code, and be correct. Avoid comment styles that are hard to modify. Using the pseudocode process, you’ll probably end up with the right density of comments.
Avoid using comments to rescue bad code; look for ways to improve the code instead. Useful to write comments for data, such as allowable values and what different values mean. For routines, document the input and output data formats, global effects, high level algorithms.
Ch33. Personality traits that are desirable in software developers. Be humble and write clear and easy programs instead of trying to appear smart. Be curious, read and experiment to know your tools better. Be intellectually honest, own up to your mistakes, be firm in your estimates. Lazy is good when you find a way to do it efficiently, not if you procrastinate on it. Persistence is bad when you insist on running into a wall.
Ch34. Summary of themes in the whole book. Write code primarily to be readable for other people. Establish conventions and processes to manage complexity and avoid hazards. Be flexible in your choice of methodology and notice when things are messy so you can iterate and fix them.