That first line of defense may be able to solve the problem, but the cause may be a deeper application or network issue, and often it needs to be escalated. But to whom? The senior IT tech? The security person? A developer? The team approach works better here.
ChatOps is the key to that team approach. The issue is escalated -- not to an individual, but to a defined group. Ideally, one member of the group knows how to diagnose and resolve the problem. But more often, understanding and resolution require several people working together. We share our knowledge of the application, the infrastructure and the code to reach a fast and (hopefully) accurate diagnosis and fix.
Second, ChatOps provides for the use of bots or other automated tools for working with servers or the application remotely. These bots can reboot the server, change some configuration settings or do more detailed diagnosis of the problem. The bots are typically activated based on agreements reached by the group on the real-time chat.
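To make the bot idea concrete, here is a minimal sketch of how a chat message might be turned into an action. It assumes a made-up convention in which commands start with "!" and take a single host name. The handler names (reboot, diagnose) and the host web01 are hypothetical and do not come from any particular ChatOps product; a real bot would sit behind the chat tool's own API and would actually reach out to the server instead of returning a string.

```python
from typing import Callable, Dict, Optional


# Hypothetical handlers: a real bot would contact the server here.
def reboot(host: str) -> str:
    return f"reboot requested for {host}"


def diagnose(host: str) -> str:
    return f"collecting diagnostics from {host}"


# Assumed command table: the words the team has agreed the bot understands.
COMMANDS: Dict[str, Callable[[str], str]] = {
    "reboot": reboot,
    "diagnose": diagnose,
}


def handle_message(text: str) -> Optional[str]:
    """Run a command such as '!reboot web01'; ignore ordinary conversation."""
    if not text.startswith("!"):
        return None  # not addressed to the bot
    parts = text[1:].split()
    if len(parts) != 2:
        return "usage: !<command> <host>"
    name, host = parts
    handler = COMMANDS.get(name)
    if handler is None:
        return f"unknown command: {name}"
    return handler(host)


if __name__ == "__main__":
    print(handle_message("!reboot web01"))
```

Because the reply goes back into the same channel, everyone in the group sees what was run and against which server.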
The key is speed. With an increasing number of applications essential for business operations, ecommerce and even safety, any downtime can be unacceptable. But things do go wrong, and unless the problem is simple, no one person is going to resolve it. Instead, when an issue needs to be escalated, we have teams of the best problem-solvers in their respective areas.
In the case of ChatOps, the whole is greater than the sum of its parts. Problems are resolved quickly by people with different skill sets collaborating in real time. When there are one or more possible diagnoses, the team can start deploying bots with specific tasks. Those tasks could be as simple as rebooting the server, or as complex as setting up a data collection agent to obtain more information.
Tools supporting ChatOps
A number of commercial tools support ChatOps. From the standpoint of real-time collaboration, products such as Slack and Atlassian's HipChat make it easy to form groups and create communication channels for planned or ad hoc group discussions. These are group IM tools that let teams have conversations, exchange images, files or other artifacts, and make a wide range of information available to the groups best positioned to use it to solve the problem.
For managing incidents and escalations, applications such as VictorOps and PagerDuty provide a routing mechanism for issue management. These apps can determine the priority of a given alert type and make sure the alert ends up in the hands of those best equipped to resolve it. They integrate with IM systems and alerting engines to provide a fully automated approach to receiving, routing, communicating and resolving application issues.
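The routing idea can be sketched in a few lines. This is an illustration only, not how VictorOps or PagerDuty work internally: the alert kinds, priorities and channel names below are invented for the example, and a real deployment would configure routing in the incident management tool itself.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Alert:
    kind: str      # type of alert reported by the monitoring system
    message: str   # human-readable detail


# Hypothetical routing table: alert kind -> (priority, chat channel).
ROUTES: Dict[str, Tuple[str, str]] = {
    "db_down": ("critical", "#db-oncall"),
    "high_latency": ("warning", "#web-ops"),
}

# Assumed fallback for alert kinds nobody has classified yet.
DEFAULT_ROUTE: Tuple[str, str] = ("info", "#ops-general")


def route(alert: Alert) -> Tuple[str, str, str]:
    """Return the priority, target channel and formatted text for an alert."""
    priority, channel = ROUTES.get(alert.kind, DEFAULT_ROUTE)
    return priority, channel, f"[{priority.upper()}] {alert.message}"


if __name__ == "__main__":
    print(route(Alert("db_down", "primary database unreachable")))
```

The fallback route matters as much as the table: an alert type no one anticipated still lands in front of a group rather than disappearing.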
Back to testing
The question is whether this is a new software testing strategy and whether testers in general have a role to play here. I would argue that they have a vital role in getting an application back into production. Here's why.
In a larger sense, testers are important contributors to a team approach to problem resolution. They are highly focused problem solvers, adept at taking application behavior and deducing the root cause of that behavior. While testers rarely fix a problem, they are often the first ones to the solution.
So astute testers must be on ChatOps teams to contribute application-specific and problem-solving experience. They join with developers, network specialists, systems administrators, security experts and others who hold responsibility for ensuring that an application is developed, tested, delivered and operated as expected. This is a software testing strategy worth pursuing.
The key to DevOps is speed and agility, and agility doesn't end once an application is delivered and in production. And testers are an integral part of this particular Agile team.