Page 34 - Root Cause Failure Analysis

Root Cause Failure Analysis Methodology   25

                  Group interviews cannot be used in a hostile environment. If the problem or event is
                  controversial or political, this type of interview process is not beneficial. The personal
                  agendas of the participants generally preclude positive results.



                  Collecting Physical Evidence
                  The first priority when investigating an event involving equipment damage or failure
                  is to preserve physical evidence. Figure 3-5 is a flow diagram illustrating the steps
                  involved in an equipment-failure investigation. This effort should include all tasks and
                  activities required to fully evaluate the failure mode and determine the specific bound-
                  ary conditions present when the failure occurred.

                  If possible, the failed machine and its installed system should be isolated from service
                  until a full investigation can be conducted. On removal from service, the failed
                  machine and all its components should be stored in a secure area until they can be
                  fully inspected and appropriate tests conducted.

                  If this approach is not practical, the scene of the failure should be fully documented
                  before the machine is removed from its installation. Photographs and sketches should
                  be taken, and the instrumentation and control settings recorded, to ensure that all
                  data are preserved for the investigating team. All automatic reports, such as those
                  generated by the Level I computer-monitoring system, should be obtained and preserved.
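
                  The choice between isolating the machine and documenting the scene can be
                  sketched as a small decision function. This is only an illustration of the
                  text above, not a reproduction of Figure 3-5; the function name and the
                  `can_isolate` flag are assumptions, and treating the automatic reports as a
                  step common to both branches is one reading of the paragraph.

```python
def preservation_steps(can_isolate):
    """Return the evidence-preservation steps for a failed machine.

    can_isolate is a hypothetical flag: True if the machine and its
    installed system can stay out of service until fully investigated.
    """
    steps = []
    if can_isolate:
        # Preferred path: keep everything untouched and secured.
        steps.append("Isolate the failed machine and its installed system from service")
        steps.append("Store the machine and all its components in a secure area")
    else:
        # Fallback path: document the scene before the machine is removed.
        steps.append("Take photographs and sketches of the failure scene")
        steps.append("Record instrumentation and control settings")
    # Assumed to apply in either case.
    steps.append("Obtain and preserve all automatic reports")
    return steps


if __name__ == "__main__":
    for step in preservation_steps(can_isolate=False):
        print("-", step)
```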

                  The legwork required to collect information and physical evidence for the
                  investigation can be quite extensive. The following is a partial list of the
                  information that should be gathered:

                          • Currently approved standard operating (SOP) and maintenance (SMP)
                            procedures for the machine or area where the event occurred.
                          • Company policies that govern activities performed during the event.
                          • Operating and process data (e.g., strip charts, computer output, and
                            data-recorder information).
                          • Appropriate maintenance records for the machinery or area involved in the
                            event.
                          • Copies of log books, work packages, work orders, work permits, and
                            maintenance records; equipment-test results; quality-control reports; oil
                            and lubrication analysis results; vibration signatures; and other records.
                          • Diagrams, schematics, drawings, vendor manuals, and technical
                            specifications, including pertinent design data for the system or area
                            involved in the incident.
                          • Training records, copies of training courses, and other information that
                            shows skill levels of personnel involved in the event.
                          • Photographs, videotape, or diagrams of the incident scene.
                          • Broken hardware (e.g., ruptured gaskets, burned leads, blown fuses, failed
                            bearings).
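
                  An investigating team might track the items above with a simple checklist.
                  The following is a minimal sketch under stated assumptions: the category keys
                  and the `missing_evidence` helper are hypothetical names chosen for
                  illustration, and the list mirrors only the partial list given here.

```python
# Evidence categories taken from the partial list above.
# The short keys are illustrative names, not terms from the text.
EVIDENCE_CHECKLIST = [
    "procedures",            # approved SOPs and SMPs
    "company_policies",      # policies governing activities during the event
    "operating_data",        # strip charts, computer output, data-recorder info
    "maintenance_records",   # records for the machinery or area involved
    "logs_and_test_results", # log books, work orders, test and analysis results
    "design_documents",      # diagrams, drawings, vendor manuals, specifications
    "training_records",      # evidence of personnel skill levels
    "scene_images",          # photographs, videotape, or diagrams of the scene
    "broken_hardware",       # ruptured gaskets, burned leads, blown fuses, bearings
]


def missing_evidence(collected):
    """Return the checklist categories not yet collected, in checklist order."""
    return [item for item in EVIDENCE_CHECKLIST if item not in collected]


if __name__ == "__main__":
    gathered = {"procedures", "operating_data", "broken_hardware"}
    for item in missing_evidence(gathered):
        print("Still needed:", item)
```

                  Keeping the categories in a fixed order means the outstanding items are
                  always reported in the same sequence as the list in the text.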