Point in Time Exercise: Beginning of Incident
02:30:00 The software release team and the operations team discuss that the release is now impacting service and should become an incident. Nothing formal is decided and no Incident Commander is identified. The release manager (Pete) is still functioning as the lead on the bridge.
02:31:00 An SME is asking to check the analytics. An unidentified person states that there is a problem on Server 24, and that this may be an e-mail problem. The person also says there is a need to call the e-mail application team. “On second thought,” he says, “someone should definitely call them.” The SME reports they are not able to log in and check the dashboard. An unknown voice asks, “Is something else failing?” One of the other SMEs was able to log in.
02:33:45 Neal from customer support is reporting that the dashboard is looking good.
02:34:01 Pete reports that the dashboard is displaying time-delayed analytics. “Is it possible it hasn’t been updated yet?” He asks Jan to check on it. Jan asks, “We are checking on the e-mail issue. Is that right?”
02:35:00 Pete asks Jan, “How do the analytics look?” Jan reports that everything looks good except Server 24. There is a lot of uncontrolled group discussion on database issues. The discussion centers on node 1 being shut down. There is also general discussion on which database team should be contacted.
02:40:47 One of the SMEs who dropped from the call earlier rejoins. Two SMEs from different database teams join and only provide their names, not their function/teams. Pete does not know them, and asks if they see any problems. They ask for the reason they were called, and want to know what is going on. There are several minutes of discussion on the situation. Pete is still the leader of the call but has not assumed the position of Incident Commander. The SMEs announce to the bridge that their analytics are showing some problems. They would like to move some partitions, and there is some discussion on this action. Other SMEs offer some other suggestions.
02:44:27 Customer service is reporting more customer tickets, and they are starting to pile up. Pete asks for more specifics. The tickets seem to be related to Server 24 and Server 28. One of the SMEs asks, “Which servers are the customers on? Does anyone know?” This will have to be investigated, but Pete does not make this an assignment to a specific function or person.