Understanding the DRY (Don't Repeat Yourself) Principle
Jan 6, 2022
A very common feeling while writing code is that of déjà vu. It's the uncanny sensation that you've come across this code before! At some point, while writing an application, you'll need to implement some business logic that you've previously used. This is inevitable as a project scales, and it's a problem that crops up in large codebases. The remedy to this common dilemma is the DRY (Don't Repeat Yourself) principle.
In this post, we'll explore this principle by reviewing common scenarios of duplication. You'll learn about the problems associated with duplication and review tested methodologies to go about solving these problems. At the end of the post, we'll also look at some common criticisms of DRY and discuss similar principles to keep in mind while tackling such problems!
Let's cover the basics first.
What Is DRY?
"Every piece of knowledge must have a single, unambiguous, authoritative representation within a system."
Andy Hunt and Dave Thomas formally defined this principle in 1999 in their book The Pragmatic Programmer. It's one of the few ideas that apply through all phases and levels of software development and even in other domains of knowledge work. The core problem it aims to solve is that of knowledge duplication. Let's take a look at a few examples!
Common Scenarios That Need DRY
A significant amount of duplication happens while designing classes. We end up implementing the same business logic in multiple portions of the code. A simple solution is to abstract the common functionality and only implement the parts that differ. For example, a shared method such as getSalary can be written once and reused by every class that needs it.
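The original listing isn't reproduced here, so the following is only a minimal Python sketch of the idea. The Employee, Engineer, and Manager classes, their fields, and the 10% bonus rule are hypothetical; only the shared salary method (spelled get_salary in Python style) comes from the text.

```python
class Employee:
    """Holds the salary logic that every kind of employee shares."""

    def __init__(self, name: str, base_salary: float) -> None:
        self.name = name
        self.base_salary = base_salary

    def bonus(self) -> float:
        # Hypothetical hook: subclasses override only the part that differs.
        return 0.0

    def get_salary(self) -> float:
        # Written once here; every subclass inherits it unchanged.
        return self.base_salary + self.bonus()


class Engineer(Employee):
    # No bonus in this sketch, so nothing needs to be overridden.
    pass


class Manager(Employee):
    def bonus(self) -> float:
        # Assumed rule for illustration: 10% of the base salary.
        return self.base_salary * 0.10
```

If the salary formula changes, only Employee.get_salary has to be edited, and each subclass states nothing more than what makes it different.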
Another common example is explaining the logic of code in comments. Writing more readable code often eliminates the need to write comments. For example, a more meaningful variable name can remove the need for a comment that explains what the variable is for.
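As a rough illustration (the function names, the 30-day threshold, and the login scenario are all made up for this sketch), compare a version that needs a comment with one whose names carry the same knowledge:

```python
# Before: the comment repeats knowledge that the name "d" fails to convey.
def is_expired_before(d: int) -> bool:
    return d > 30  # d is the number of days since the last login


# After: the names say what the comment used to say, so the knowledge
# lives in exactly one place.
MAX_INACTIVE_DAYS = 30


def is_expired(days_since_last_login: int) -> bool:
    return days_since_last_login > MAX_INACTIVE_DAYS
```

Both functions behave the same way; the second one simply has no comment that can drift out of date when the code changes.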
Another common example of duplication happens while storing data. We end up creating unnecessary attributes in our tables, even though they can be derived from attributes we already store. For example, age is redundant when date_of_birth is stored, because it can be computed from it. Keeping both means that every change to one requires a matching change to the other.
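One way to avoid the redundant attribute is to store only date_of_birth and compute age on demand. The sketch below shows this for an in-memory record rather than a database table; the Person class and the optional today parameter are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    name: str
    date_of_birth: date  # the single stored source of truth

    def age(self, today: Optional[date] = None) -> int:
        """Derive the age instead of storing it as a second attribute."""
        today = today or date.today()
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (
            self.date_of_birth.month,
            self.date_of_birth.day,
        ):
            years -= 1
        return years
```

Because age is never stored, it cannot fall out of sync with date_of_birth.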
The Problems of Duplication
The root cause of all these problems is the creation of duplicate representations of knowledge. This knowledge is mutable—in other words, subject to change. And as soon as it changes, we need to change all its representations!
Making such a change requires due diligence. If even one representation is missed, the system ends up with contradictory representations of the same knowledge.
What are some common reasons for these types of duplications?
Someone made a mistake during design, and that mistake propagated to the later stages of development. It's possible to fix these problems without significant cost if you detect them early enough.
Sometimes projects have strict deadlines that put pressure on the developer. Soon, it becomes more tempting to take shortcuts instead of writing clean code.
There might be unintended duplications where developers don't even realize they're duplicating information. This is more common in large organizations, where entire functionalities might be duplicated. The most common reason for this is a lack of collaboration between teams.
Sometimes the environment demands code duplication, and the developer needs to show ingenuity to avoid such duplication.
Now that you know why duplication happens, what can you do about it?
Approaches to Resolving Duplication by DRY
DRY is only a principle. Developers or managers of the project are free to choose how they implement this principle. The implementations might be diverse depending on their use cases. Let's briefly discuss the three most common methods.
Abstractions
This approach applies most often when designing classes. When multiple classes share some common business logic, we simply abstract the logic into a superclass, which each class then inherits. Standard object-oriented programming (OOP) has many practices that encourage the DRY principle for writing readable, reusable, and maintainable code. Furthermore, there are alternatives like composition that address the limitations of inheritance and give programmers more flexibility.
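A minimal sketch of the composition alternative follows. The TaxCalculator, Invoice, and Payslip classes are hypothetical and do not come from the original post; the point is only that two unrelated classes can share one piece of logic without sharing a superclass.

```python
class TaxCalculator:
    """The shared business rule lives here, and only here."""

    def __init__(self, rate: float) -> None:
        self.rate = rate

    def tax_on(self, amount: float) -> float:
        return amount * self.rate


class Invoice:
    def __init__(self, subtotal: float, tax: TaxCalculator) -> None:
        self.subtotal = subtotal
        self.tax = tax  # composed, not inherited

    def total(self) -> float:
        return self.subtotal + self.tax.tax_on(self.subtotal)


class Payslip:
    def __init__(self, gross: float, tax: TaxCalculator) -> None:
        self.gross = gross
        self.tax = tax  # the same component, reused by an unrelated class

    def net(self) -> float:
        return self.gross - self.tax.tax_on(self.gross)
```

Invoice and Payslip have no "is-a" relationship with each other or with TaxCalculator, so a shared superclass would be an awkward fit. Passing the calculator in keeps the rule in one place and makes it easy to swap in a different one.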
Automation
This approach concerns how aware developers are of work happening elsewhere in the project. Teams should be cross-functional to ensure active communication between developers. Managers should encourage such communication so that developers don't repeat work that is already complete and can discuss the problems they have in common.
Removing duplication as it arises isn't enough. Instead, it must be an active process. Without proper systems and oversight, this problem will persist and continue to plague the organization.
Normalization
This approach applies to database design. Duplication is common in many data representations. However, it creates redundant data, which is harder to maintain.
Instead, extract the duplicates into a separate entity. The source then references this entity. Edgar Codd proposed this concept as part of his model of relational databases.
The goal of normalization is to ensure the data is consistent and properly distributed. It also helps maintain data integrity and creates a single source of truth. Furthermore, data normalization removes redundancies in data, which makes the database more flexible and scalable.
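To make the "extract and reference" step concrete, here is a small sketch using Python's built-in sqlite3 module. The departments and employees tables, their columns, and the sample rows are invented for illustration.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The department name is stored once, in its own entity.
    CREATE TABLE departments (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Employees reference the department instead of repeating its name.
    CREATE TABLE employees (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES departments(id)
    );
    INSERT INTO departments (id, name) VALUES (1, 'Platform');
    INSERT INTO employees (name, department_id) VALUES ('Ada', 1), ('Grace', 1);
    """
)

# Renaming the department touches exactly one row.
conn.execute("UPDATE departments SET name = 'Infrastructure' WHERE id = 1")

# Every employee sees the new name through the reference.
rows = conn.execute(
    """
    SELECT e.name, d.name
    FROM employees AS e
    JOIN departments AS d ON d.id = e.department_id
    ORDER BY e.name
    """
).fetchall()
```

Had the department name been copied into every employee row, the rename would have required updating each of those rows, with the risk of missing one.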
How Plutora Implements DRY
While you're developing software, an important step is to correctly evaluate the value stream of the entire process—the steps in software development that create customer value. Awareness of the value stream helps managers optimize workflows by eliminating waste such as delays or reworks.
Furthermore, Plutora also breaks down operational silos by connecting diverse teams and their processes. To do this, the system uses real-time metrics from multiple platforms by integrating with existing tools and workflows. This creates a single source of truth for the entire software development life cycle! This process requires proper normalization and extraction of different representations of data across multiple teams.
Plutora's value stream flow metrics (VSFM) dashboard not only normalizes the data but also analyzes and displays it in a single, secure, user-friendly platform. This means everyone in the organization has access to the same information. Also, it gives managers visibility of all pipelines from idea to production and complete control to make necessary changes to optimize the process. To get a more comprehensive overview of the power of VSFM, you can refer to this free guide. You can also get a free, personalized demo of Plutora.
Things to Keep in Mind
These aren't strict methodologies. Instead, they're useful points to keep in mind to overcome habitual pitfalls that developers often fall into.
Tip 1: Make It Easy to Reuse
The abstractions must encourage reuse. Developers will most likely ignore abstractions if they find them obscure or harmful to their productivity. We need to make abstractions usable and properly documented so that other developers benefit from them rather than perceiving them as a roadblock.
Tip 2: Shortcuts Make for Long Delays
Much of the duplication introduced during development comes from taking the easy path. For projects with deadlines, it seems much easier to push something that "just works." In almost every case, though, a shortcut introduces technical debt further down the road. If that debt is discovered late in the process, fixing it may cost ten or even a hundred times the labor hours it would have taken when the team first encountered the issue.
Tip 3: Focus on Active Communication and Project Awareness
In a big organization with multiple teams working on different projects in parallel, it's hard to maintain awareness of everything that's going on. To solve this, developers should be aware of other projects within the company, especially ones that are relevant to their own. This allows developers to reuse work that other developers have already pushed. It also gives them a chance to contribute their own work to other teams.
Is it possible to take DRY too far? Yes!
Overuse of DRY
Not every line of code has to be unique. Developers sometimes fear rework and waste and perceive them as the epitome of unproductive behavior. But the overuse of DRY actually stops us from writing clean code.
Every principle needs to be applied depending on the context. Sometimes code duplication suits the environment. Removing that might introduce unnecessary upfront coupling or complexity into the system. In such situations, living with a bit of duplication might not be the worst thing.
Premature optimization is also a common issue with new developers. As a general rule, developers should avoid applying generalizations to functionality before it's repeated. More real and immediate problems take precedence, and it's paramount that we solve those problems first.
Other Schools of Thought
There's also a general principle called the Rule of Three. Martin Fowler popularized it and attributed it to Don Roberts. It basically states that two instances of duplicate code don't require refactoring. But when you need to repeat the code a third time, it's time to refactor the code into an abstraction. This is a more practical way to handle duplicated code than worrying about refactoring and abstractions every time.
AHA (Avoid Hasty Abstractions) is another mental model for abstractions. Kent C. Dodds coined it, and the idea is to wait until real changes show which abstraction is needed instead of optimizing prematurely. There's a similar philosophy by Sandi Metz, who advises us to "prefer duplication over the wrong abstraction." This idea influenced AHA.
Fun fact: Violations of the DRY principle are commonly known as WET (which translates to "Write everything twice," "Write every time," or "We enjoy typing," depending on whom you ask). Programmers and their fun abstractions!
Summing Up and Looking to the Future
It's easy to duplicate code, but the pain comes in maintenance. Avoiding duplication in the earlier stages of development often saves us from technical debt further down the line. However, overuse of principles like DRY often introduces many upfront costs that are unnecessary in some scenarios. There has to be a balance between the two.
DRY isn't just a principle for storing data or writing code. It's something we can apply throughout each stage of our project. DRY is a tool that helps us optimize our processes, collaboration, and products. It's a tool that we must tailor to our needs, apply when it fits, and let go when it doesn't.