The Problem Resolution Framework

Looking to go from the scientific method to something tightly related to problem resolution in technical support, members of my team at Microsoft and I started to shape what we named the Problem Resolution Framework. Based on the scientific method, it consists of four phases: Defining, Gathering, Analyzing, and Fixing.

Each of these phases, or quadrants, has specific inputs, sets of actions and guidelines, and outputs. It is pretty much common sense, but we believe it helps to have common ground on what needs to be done to resolve problems in a structured way and to avoid randomization of processes. Below is a simple description of each phase - any input is appreciated.

  1. Defining - This is always where the process begins. As we know, a proper problem definition is halfway to having the problem resolved. It is also important to have the problem correctly scoped and narrowed to a single entity that can be resolved; it is definitely not a good idea to handle two problems within the same resolution process. The indispensable input of this phase is the problem description, including the symptom or symptoms, as seen by the customer or whoever is asking for the problem to be resolved. In this phase the troubleshooter explores the problem and uses prior knowledge of the issue at stake to generate hypotheses about why the problem happens. This exploration will surely include good questioning to quickly eliminate unlikely causes and to narrow the problem description. At the end, the troubleshooter should have as outputs a good, narrow problem description and one or more hypotheses about why the problem happens. These outputs are the inputs of the next phase. Note that a bad problem definition leads to wrong hypotheses and wrong data collection, misleading the troubleshooter and consequently either delaying the resolution or blocking it entirely.
  2. Gathering - This is where the data collection happens. Being able to formulate hypotheses is critical to resolving the problem (a good hypothesis will never be found if the person has no prior knowledge of the matter), and it makes it possible to determine what data needs to be collected to either confirm or reject each hypothesis. The data gathering activity takes as inputs the problem description and the hypotheses generated in the previous phase, and then establishes a data collection plan: what needs to be collected, why, and how it should be done. Specific instructions for the person who will actually perform the data collection are critical for its success; even the use of operational definitions must be considered. We commonly assume that all parties involved in the problem resolution are on the same page, only to find out later that some concepts are understood slightly differently because of cultural or other differences. Having a clear dictionary of what each term in the data collection plan means helps to collect the right data. Wrong data will unnecessarily increase the time spent on data analysis, delaying or preventing the resolution. At the end of this stage, the objective is to have meaningful, verified data that can be used for analysis. If this phase has not produced reliable, validated data, it cannot finish: data must be sound before going to the next stage.
  3. Analyzing - After obtaining validated, readable data, it is time to go into the analysis. The troubleshooter needs to determine how the collected data confirms or rejects each hypothesis. As with the first phase, Defining, knowledge of the matter at stake is what allows the analysis to go through. Correct understanding of the architecture, proper knowledge of the tools and how to use them, and careful thought will allow the person to establish what in the data confirms the hypotheses, or to discard them. There may be more than one hypothesis, but the data analysis should produce one single hypothesis to be carried into the next phase. If the data collected is not sufficient to produce one single confirmed hypothesis, it is time to go back to the Defining phase and check what was missed there; the troubleshooter may also need to bring in help from more experienced people. Having one confirmed hypothesis allows action planning for problem resolution, which comes in the next phase.
  4. Fixing - Finally, if the analysis was able to point out the root cause of the problem, or at least what needs to be fixed to eliminate it, in this phase the troubleshooter creates a resolution action plan. This is a list of actions to be performed to resolve the problem, based on the previous analysis of sound data. The resolution action plan must have a specific set of actions, how to perform them, and why. Detailed step-by-step instructions or references to “how to” articles are necessary to avoid mistakes, as are specific recovery procedures and contingency plans in case something goes wrong. “Change control” and trying the resolution action plan in a test environment must also be considered. The expected result is the resolution of the problem scoped at the beginning of the process, but sometimes something goes wrong and the action plan does not resolve the problem; then it is time to go back to the Defining phase and see what was missed. Walk the cycle again: see whether new hypotheses are needed, and check what was missing and what was done incorrectly. This cycle may need to be repeated multiple times, with the problem better defined at each pass until it is resolved.
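
For readers who think better in code, the four phases and the loop back to Defining can be sketched roughly as below. This is only an illustration, not part of the framework itself: the function `resolve`, its parameters (`define`, `gather`, `analyze`, `fix`, `max_cycles`) and the toy login scenario in `demo` are all hypothetical names invented for this sketch.

```python
from typing import Callable, Optional


def resolve(
    description: str,
    define: Callable,   # hypothetical: description -> (narrowed description, hypotheses)
    gather: Callable,   # hypothetical: (description, hypotheses) -> validated data
    analyze: Callable,  # hypothetical: (hypotheses, data) -> one confirmed hypothesis or None
    fix: Callable,      # hypothetical: confirmed hypothesis -> True if the problem is resolved
    max_cycles: int = 5,
) -> Optional[str]:
    """Walk Defining -> Gathering -> Analyzing -> Fixing, repeating the cycle if needed."""
    for _ in range(max_cycles):
        # Defining: narrow the problem description and generate hypotheses.
        description, hypotheses = define(description)
        # Gathering: collect sound data to confirm or reject the hypotheses.
        data = gather(description, hypotheses)
        # Analyzing: the outcome should be a single confirmed hypothesis.
        confirmed = analyze(hypotheses, data)
        if confirmed is None:
            continue  # nothing confirmed: go back to Defining
        # Fixing: run the resolution action plan for the confirmed cause.
        if fix(confirmed):
            return confirmed
        # The plan did not resolve the problem: walk the cycle again.
    return None


def demo():
    """Toy scenario (invented): the first pass misses the real cause."""
    passes = {"count": 0}

    def define(desc):
        passes["count"] += 1
        if passes["count"] == 1:
            return desc, ["disk full"]
        return desc, ["disk full", "expired certificate"]

    def gather(desc, hypotheses):
        # Pretend these facts were collected and validated.
        return {"disk full": False, "expired certificate": True}

    def analyze(hypotheses, data):
        confirmed = [h for h in hypotheses if data.get(h)]
        return confirmed[0] if len(confirmed) == 1 else None

    def fix(hypothesis):
        return True

    cause = resolve("Users cannot log in", define, gather, analyze, fix)
    return cause, passes["count"]
```

In the toy scenario the first pass confirms nothing, so the loop returns to Defining; the second pass adds a new hypothesis, which the data confirms. The `max_cycles` cap is an assumption of this sketch; the framework itself only says the cycle may need to be repeated multiple times.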
