How much Dev and how much Ops should there be in DevOps?
DevOps is a very popular modern set of practices for software development and operations. And here comes an interesting question:
How much Development and how much Operations should there be in a DevOps engineer's work?
Is there an ideal balance?
I believe there are no particular numbers for such a balance, but there is an ultimate goal: freeing people from Operations.
There are many recommendations for the split between Development and Operations in DevOps. For example, the Google SRE book mentions a split of roughly 50%/50%. This can be a good starting point, when the level of automation is still low and decisions in critical situations are made by operational staff. I think that today the ratio is different.
Development and Automation
We should remember that “Development” means not only development of the end-user product. It also means development of the system internals, of automation, and of all system components related to reliability, availability, scalability, resiliency, performance, data integrity and so on.
When you build a complex, scalable solution, your ultimate goal is to automate as much as possible.
DevOps phases in the evolution of an application
Phase 1: Dev > Ops
In this phase we start from zero and are fully focused on delivering our application. Because the scale is small, we focus more on the Development part. Of course, we do not want to be completely blind to our application's performance, but the operational part is still small and we can handle it with very simple automation, intervening manually in critical situations.
At this point Operations does not consume much time or effort.
Phase 2: Dev ≤ Ops
Luckily, our application has attracted users' attention and demand is growing significantly. Our first architectural compromises, blind spots and bottlenecks start to appear. To “firefight” and solve the growing problems, the DevOps team takes on more and more operational duties.
This is a critical phase, in which a good product owner or team leader should see that the balance is unhealthy and Operations is starting to outweigh Development. At this point the team should start investing more and more development effort into improving the application's architecture and implementation, which also covers visibility, automation, self-healing, resiliency…
Eventually those efforts pay off, and issues requiring human intervention occur less and less often. Hence, we are moving to Phase 3.
But if the leader is not able to cope with the overwhelming operational pressure, then development of the application/product will stall and there is a high risk of falling into Phase 3’. I would call it “Operational death”.
Phase 3: Dev ≫ Ops
We have successfully grown our application and now face new operational challenges:
- Our application has grown so large that the amount of collected metrics, logs and other usage and system-behavior information is overwhelming.
- While we have automated most operations, critical decisions are still made by operational staff. We are starting to see that, at the current scale of the application, human reaction to critical events is too slow. Even if the operational staff's decisions are right, they come too late. Delays of minutes or even seconds cost a significant amount of money, degrade the user experience and even cause subscriber churn.
The best solution is to let algorithms (AI, ML) decide what to do in critical situations, so we expand our development in that area.
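To make the idea of an automated decision more concrete, here is a minimal sketch in Python. It is not an AI or ML model: it is a simple rule-based stand-in that shows where such a model would plug in. The metric names, the thresholds and the action names ("rollback", "scale_out", "none") are all my assumptions for illustration and do not come from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """A hypothetical point-in-time view of the application's health."""
    error_rate: float       # fraction of failed requests, 0.0..1.0
    p99_latency_ms: float   # 99th percentile latency in milliseconds
    cpu_utilization: float  # average CPU utilization, 0.0..1.0


def decide(snapshot, *, max_error_rate=0.05, max_latency_ms=500.0, max_cpu=0.85):
    """Pick an action without waiting for a human.

    The thresholds are illustrative assumptions. In a real system this
    function is the place where a trained model could replace the rules.
    """
    # A high error rate is treated as the most urgent signal.
    if snapshot.error_rate > max_error_rate:
        return "rollback"
    # Saturation or slow responses suggest that more capacity is needed.
    if snapshot.cpu_utilization > max_cpu or snapshot.p99_latency_ms > max_latency_ms:
        return "scale_out"
    return "none"
```

The point of the sketch is the shape of the loop: metrics go in, an action comes out within milliseconds, and people spend their time improving `decide` instead of being paged to make the same call by hand.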
In this phase most of the activity of the company's DevOps engineers should be in Development, research and knowledge growth, because human operations should become extremely rare.
Phase 3’: Dev ≪ Ops (“Operational death”)
If we are not able to recover in Phase 2 and our DevOps engineers are in constant firefighting mode, then in the best case we will stagnate. The operational burden will consume most of the DevOps team's time and there will be little or no room for further development. In reality, we will most probably lose our competitive advantages and go extinct. This will happen because competitors do not wait, the world keeps moving, and sooner or later alternative solutions will take over the subscriber base.
Summary
My view is that the goal of DevOps is to reach the ultimate state in which:
- Humans do Development and creation:
  - Creation of new products,
  - Creation of new automation tools and solutions,
  - Creation of new AI and ML algorithms,
  - Analysis, research and many other interesting things.
- Robots, Automata, Artificial Intelligence and Machine Learning do Operations.
Hence, the Development vs Operations balance is not a constant. It is one of the KPIs, and it should eventually move towards 100% Development for people, leaving 100% of Operations to automation engines and robots.
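If the balance is treated as a KPI, it has to be measured. Below is a minimal sketch of one way to do it, assuming the team logs its hours and tags each entry as either "dev" or "ops". The record format, the function name and the sample numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkEntry:
    """One logged chunk of work (hypothetical record format)."""
    period: str   # label of a sprint or month, e.g. "2024-01"
    kind: str     # "dev" or "ops"
    hours: float


def dev_share_by_period(entries):
    """Return {period: fraction of logged hours spent on Development}."""
    totals = {}  # period -> (dev_hours, all_hours)
    for entry in entries:
        if entry.kind not in ("dev", "ops"):
            raise ValueError(f"unknown kind: {entry.kind!r}")
        dev, total = totals.get(entry.period, (0.0, 0.0))
        if entry.kind == "dev":
            dev += entry.hours
        totals[entry.period] = (dev, total + entry.hours)
    # Periods without any logged hours are skipped to avoid dividing by zero.
    return {p: dev / total for p, (dev, total) in totals.items() if total > 0}


# Made-up sample data: two months of logged hours.
log = [
    WorkEntry("2024-01", "dev", 60), WorkEntry("2024-01", "ops", 60),
    WorkEntry("2024-02", "dev", 90), WorkEntry("2024-02", "ops", 30),
]
shares = dev_share_by_period(log)
```

With this sample log, the Development share is 0.5 in the first month and 0.75 in the second. The trend over time matters more than any single value: a share that keeps falling is an early warning of “Operational death”.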