How do I do DevOps at Scale? Challenges and Solutions
Feb 22, 2016
DevOps in a startup is defined by developer-driven automation. Instrument everything, establish a culture of automation, and enable your developers to deploy software to production frequently, and you’ll be practicing DevOps. You’ll use any number of accepted tools to achieve these goals, from open source tools such as Puppet and Jenkins to public clouds such as Amazon’s AWS or OpenStack hosted on Rackspace. At a startup or a small company, DevOps is defined by automation and single-team success: developers own operations and the company can deliver software continuously.
At scale DevOps is an entirely different story. Larger companies can often accomplish more because they have more resources available to build larger systems, but scale introduces additional complexities that can make adopting DevOps a challenge. This post discusses some of the challenges faced when attempting to scale DevOps and how to overcome each one.
DevOps Decentralization and Variation at Scale
Challenge
The culture of DevOps revolves around decentralization and self-service. In a large enterprise adopting DevOps it is common to see one or more DevOps organizations emerge over time. While some variation across a large enterprise is healthy, too much variation can create challenges if different groups use incompatible technologies.
Solution: Establish a central group to standardize on common DevOps practices
During the initial ramp-up for DevOps it is often a good idea to watch one or two teams develop different styles of DevOps to see which approaches work and which don’t. When the number of teams engaged in DevOps practices is small, this is an acceptable strategy. Once you start scaling DevOps across tens or hundreds of projects, you’ll want to set up a central function to support DevOps and establish some basic ground rules for teams interested in moving faster and taking more ownership of operations. This central group doesn't "do" DevOps for the teams it supports; it is simply available to help guide teams toward the commonly accepted toolsets.
Who “Owns” Operations at Scale?
Challenge
The question of “who owns operations” is also a common challenge in a large organization. Some teams take a very aggressive approach and own release management, performance testing, and production support, while other teams are unwilling to do anything more than commit code to a source code repository.
Solution: Establish common expectations and staffing models for DevOps
You should develop a common definition of DevOps. What activities fall under DevOps? What does it encompass? At a small startup DevOps might cover all of operations because you are working on systems small enough to manage with one or two developers. At a large company you may need an army of Oracle consultants just to maintain the databases required by a major system. At scale you need to draw clear boundaries for responsibilities for infrastructure to avoid disagreements over who “owns” what.
It is also important to develop a common approach to staffing projects appropriately. Teams adopting DevOps practices require additional staff to support new responsibilities. If you ask teams across your enterprise to adopt DevOps practices without providing adequate resources, don’t be surprised if many of your teams decide to push back on the idea that they “own their own operations.”
DevOps and Production Control
Challenge
When a developer is able to push to production multiple times a day, how does this align with the strict production control requirements present in regulated industries? At large companies, production control and change management can take days or weeks. How is this going to work when developers can just push to production?
Solution: Automate production control and maintain an audit trail
Once you educate your change management professionals about continuous delivery and the benefits of increasing the frequency of software deployment, most will understand that change management needs to adapt to a new reality. Work with your change management team to create automated, auditable systems that track changes to production. As long as production changes can be audited and traced back to a specific individual, change management and production control shouldn't have any objections to moving faster.
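To make the idea of an automated, auditable record of production changes concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a description of any particular change management tool: the names ChangeRecord, AuditTrail, log_deploy, changes_by, and verify are hypothetical, and a real system would persist records to durable storage instead of an in-memory list. Each record names the individual who made the change and carries a hash of the previous record, so later edits to the history become detectable.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeRecord:
    """One production change, attributable to a specific individual (hypothetical schema)."""
    author: str      # who pushed the change
    service: str     # what was deployed
    commit: str      # exact revision that went to production
    timestamp: str   # when it happened (UTC, ISO 8601)
    prev_hash: str   # hash of the previous record, chaining the log together


def record_hash(record: ChangeRecord) -> str:
    """Stable SHA-256 digest of a record's contents."""
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AuditTrail:
    """Append-only, in-memory log of production changes (illustrative only)."""

    def __init__(self):
        self._records = []

    def log_deploy(self, author, service, commit, timestamp=None):
        # Link the new record to the hash of the one before it.
        prev = record_hash(self._records[-1]) if self._records else ""
        record = ChangeRecord(
            author=author,
            service=service,
            commit=commit,
            timestamp=timestamp or datetime.now(timezone.utc).isoformat(),
            prev_hash=prev,
        )
        self._records.append(record)
        return record

    def changes_by(self, author):
        """Trace every recorded change back to one individual."""
        return [r for r in self._records if r.author == author]

    def verify(self):
        """Return True if no earlier record was altered after the fact."""
        prev = ""
        for record in self._records:
            if record.prev_hash != prev:
                return False
            prev = record_hash(record)
        return True
```

A deployment pipeline could call log_deploy as its final step, and change management could run verify and changes_by during an audit instead of approving each release by hand.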
If you have a production control team that fails to see the benefits of increased deployment frequency you’ll need to do some convincing. Point to successes in other organizations. Most production control teams understand that the last thing they want to be is an obstacle to faster execution.
Self-service Deployments vs. Central Budgeting
Challenge
Developers are no longer willing to wait for an operations group to provision hardware. They want access to self-service APIs to provision cloud instances, and they are demanding the ability to provision systems dynamically. How does this work when budgets are fixed and only updated once a year?
Solution: Model cloud infrastructure costs and build in space for DevOps
At a small company it is clear that public cloud charges can be classified as operating expenses (OpEx). At a large company that maintains a private cloud there is always pressure to set capital budgets during an annual or quarterly budgeting process. If you own your own servers (and many large companies still do), you understand that there are times when you can’t just pull a few more servers out of nowhere.
When you practice DevOps at scale in a company that maintains a private cloud you need to make sure that you build in sufficient capacity to create and destroy environments that are necessary for automated testing and dynamic scaling. Large companies often want to pretend that they are still running applications on physical servers and that these requirements can be forecast years in advance. With cloud infrastructure this isn’t the case. You need to stand up Plutora and use the tool to forecast environment demand and plan accordingly.
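As a rough illustration of what forecasting environment demand involves, the Python sketch below computes the peak number of VMs needed at once from a list of planned environment bookings and then adds headroom for short-lived test environments. It is a simplified model under stated assumptions, not how Plutora or any other tool works: the EnvRequest fields, the function names, and the default 25% headroom are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class EnvRequest:
    """A planned environment booking (hypothetical schema)."""
    team: str
    start_day: int  # first day the environment is needed, inclusive
    end_day: int    # last day the environment is needed, inclusive
    vms: int        # number of VMs the environment consumes


def peak_demand(requests):
    """Largest number of VMs in use at the same time across all bookings."""
    events = []
    for r in requests:
        events.append((r.start_day, r.vms))      # capacity claimed
        events.append((r.end_day + 1, -r.vms))   # capacity released the day after
    # Sorting puts releases before claims on the same day (negative deltas first).
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def required_capacity(requests, headroom=0.25):
    """Peak demand plus a safety margin for environments created on the fly."""
    return math.ceil(peak_demand(requests) * (1 + headroom))
```

For example, three bookings of 10, 6, and 8 VMs that only partly overlap may peak at 16 VMs, not 24, so sizing the private cloud from the overlap plus headroom gives a smaller and more defensible number than summing every request.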
Is it even possible to practice DevOps at scale?
Challenge
In large organizations everything takes more time. There are more people involved in meetings and project plans, and there is always some reason why projects are delayed. If you are advocating for DevOps in a large enterprise, you’ll encounter people who tell you that DevOps is impractical at scale.
Solution: Ignore the naysayers.
There are many reasons for resistance to DevOps at scale. The primary reason for pushback is that people often view change negatively, as an unnecessary risk and a disruption to the way things work. In a large company not used to moving quickly you’ll also encounter people who see DevOps as simply “more work.” Another source of pushback for DevOps at scale is operations professionals.
Most ops professionals understand the benefits of DevOps and value their interaction with developers, but a few see DevOps as a threat to operations. These individuals chafe at the idea that developers would provision self-service VMs, and they don’t enjoy explaining network architecture to mere developers. Our advice is to ignore your DevOps skeptics. Bring data to support your changes, but if you encounter someone who doesn’t think DevOps works in your enterprise, turn the tables: either involve them in the effort in order to win them over, or route around them.