This post concludes our bi-weekly blog series on Awesome Chef Paul Comtois’ DevOps Story. You can read the final part below, while part one is here and part two is here. Thank you to Pauly for sharing his tale with us!
The last hurdle was that, even with all we’d accomplished, we still weren’t reaching the sys admins. I had thought they would be my vanguard: we would charge forward together and show all this value. As it turned out, they initially didn’t want to touch Chef at all! Jumpstart, Kickstart and shell scripts were still their preferred methods of managing infrastructure.
At about the same time that the release team was getting up to speed, the database team decided that they wanted a way to get around the sys admin team because it took too long for changes to happen. One guy on the database team knew a guy on the app team who had root access, and that guy began to make the changes for the database team with Chef. The sys admins were cut out, and the app team felt resentful because the sys admins weren’t doing their job.
That started putting pressure on the sys admins. The app team was saying, “Hey, you guys in sys admin, you can’t use that shell script any more to make the change for DNS. Don’t use Kickstart and Jumpstart because they only do it once, and we don’t have access. We need to be able to manage everything going forward across ALL pods, not one at a time and we need to do it together.” It was truly great to see the app team take the lead and strive to help, rather than argue.
So, the app team started to train the sys admins. They’d say, “This is how you make the change. It’s super easy. Bring this file up in Vim or Emacs, make the change, save it up to the repository, and it’ll push it out on its own in the next Chef client run.” The sys admins were amazed. “Really, that’s all I have to do?” “Yep, that’s all you have to do.”
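For readers who haven’t seen this model in action, here is a minimal sketch of the idea behind that conversation. It is written in plain Python, not Chef’s actual Ruby DSL, and it is not how Chef is implemented; the function names, the file name and the nameserver address are all hypothetical. It only illustrates the contrast the app team was drawing: a one-time script sets a file once, while a recurring run compares the desired state with what is on the box and corrects it every time.

```python
import tempfile
from pathlib import Path


def converge_file(path, desired):
    """Make the file match the desired content.

    Returns True only when something had to change, so a run against
    an already-correct file is a harmless no-op.
    """
    current = path.read_text() if path.exists() else None
    if current == desired:
        return False
    path.write_text(desired)
    return True


def demo():
    # Hypothetical desired state, standing in for the file kept in the repository.
    desired = "nameserver 10.0.0.2\n"
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "resolv.conf"
        first_run = converge_file(target, desired)    # file is missing, so it is written
        second_run = converge_file(target, desired)   # already correct, nothing to do
        target.write_text("nameserver 8.8.8.8\n")     # someone edits the box by hand
        third_run = converge_file(target, desired)    # the next run puts it back
        return [first_run, second_run, third_run]


if __name__ == "__main__":
    print(demo())
```

The third run is the part a one-time Kickstart or Jumpstart script can’t give you: a change made by hand on one box gets corrected the next time the run comes around.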
I worked with people to identify the biggest pain points that the systems guys dealt with. We were looking for things they had to manage across all six pods every day: LDAP, DNS, memory tuning and things like that. We showed them how to fix those things with Chef, and after a while they forgot that Chef was even managing it. It became more and more common for them to pop in and make a change, following a lean change management process. They started using the shell scripts less and less. Finally, I said, “Let’s get rid of the shell scripts and let Chef replace the bastion host.”
That was a very long process. It took months, and honestly none of it was really about tools. It was all cultural. It was all about managing people and their emotions and fears about losing their identities and their responsibilities. The inertia of staying with what they knew was countered not by me, but by the other teams and a shifting cultural approach.
Admittedly, a couple of people didn’t make it. They couldn’t see any value in unifying configurations and streamlining our processes. Some called configuration automation and CI a fad. As a result, they were not happy with the new direction. While I don’t ever like losing good people, sometimes they must do what is right for themselves and seek a place where they can be happy.
As a leader, you need to try to mentor individuals and get them to see the value of what you’re doing. I see three stages in these types of cultural changes. At the beginning, you try to get people on board and get them excited, so they can contribute and feel good about it. You engage and empower them and get out of their way. It is critical to communicate openly and be transparent about why the change is needed. Sometimes you will have someone who just doesn’t want to change. In this case I believe in “You don’t have to agree, but I need you to support the idea.” Often they will see the value and benefit over time. If they don’t support the idea, then the message is, “Look, this is just not going to work out.” We were not able to provide them a place to hide; the organization was too small and everyone had to contribute.
Over time the company grew significantly, and it was bought by an even larger firm. Chef helped make that happen. DevOps helped make that happen. Before Chef, all that rolled-back code caused customer attrition and burned out employees. The platform would always go down, and customers would have to wait six months for any new feature because it was so painful for us to get product out to the market.
Chef enabled us to have both frequent, reliable deployments and a stable production environment. This led directly to no more code rollbacks and happier, more satisfied customers.