Recently at work I had to help respond to a Severity 1 issue. This is our worst-case scenario: something major is broken in production and is costing the company money. In the presentation I gave at Twin Cities Code Camp about pair programming, I said that troubleshooting bugs and fixing production issues often aren’t the easiest things to pair on. Reflecting on the past couple of days, I noticed that we were pairing the whole time we were trying to fix the issue. I don’t think we could have accomplished the fix without the entire team using some of the techniques I had outlined in my presentation.
We used two of my local pairing techniques along with two remote pairing tools. Locally we used a mixture of traditional pairing and “Divide and Conquer” pairing.
With traditional pairing, one person “drives” while the other “navigates”. I sat in the driver’s seat with vim open, tailing a log on production, while my manager sat next to me and helped navigate through the code as we figured out what needed to be modified. I used to think that I preferred to work on these high-stress issues alone and troubleshoot with my own process, but I now have a different opinion. Having someone there to limit the amount of thrashing was a major help.
Divide and Conquer
With a major issue there are lots of logs to check, experiments to try, and pieces of a system to update. This is where we brought in another developer so we could divide up the work and conquer the problem. While I updated configs, he updated our applet code. By the time I had the configs ready, he had the applet built and ready to be pushed out. We were sitting next to each other and in constant contact, but we were able to work in parallel to finish the one task.
Working with a team distributed around the world makes troubleshooting major issues difficult. I was able to keep everyone on the same page by using a combination of a screen-sharing tool, http://join.me, and tmux. I used Join.me to share a remote desktop session from London. From the remote desktop I used PuTTY to connect to a tmux session on my own machine. This way I could show code and logs to the entire team. Using tmux also meant I could reboot the machine, then pick up where I left off and be running again very quickly.
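For anyone who hasn’t used tmux this way, here is a minimal sketch of the kind of commands involved. It is not a record of what was run during the incident: the session name `sev1` and the log path are made up for illustration, and the script only prints the tmux commands rather than executing them.

```python
import shlex
from typing import List

SESSION = "sev1"  # hypothetical session name
LOG_PATH = "/var/log/app/production.log"  # hypothetical log path


def attach_or_create(session: str) -> List[str]:
    """tmux command that attaches to `session`, creating it if it is missing."""
    # -A makes new-session behave like attach-session when the session exists.
    return ["tmux", "new-session", "-A", "-s", session]


def open_log_window(session: str, log_path: str) -> List[str]:
    """tmux command that opens a window in `session` tailing `log_path`."""
    # The trailing colon targets the session itself, so tmux picks the
    # next free window index inside it.
    return ["tmux", "new-window", "-t", session + ":", "-n", "logs",
            "tail -f " + shlex.quote(log_path)]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    for argv in (attach_or_create(SESSION), open_log_window(SESSION, LOG_PATH)):
        print(shlex.join(argv))
```

Because the first command either attaches or creates, the same line works whether you are starting fresh or reconnecting after the connection to the session has dropped.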
Using pairing techniques and tools helped us diagnose the problem and solve it.