Industry studies repeatedly find that 70 percent or more of operations interruptions result directly from the actions of people working in the critical environment.
Lack of training, missing or incorrect procedures, and inadequate staff size are common root causes identified after incidents that directly affect a company’s operation, profit, and reputation. Often, leadership finds that its focus on building a high-availability facility did not extend to creating and implementing a strategy for reliable operation.
Highly successful critical facility operations consistently share five key practices:
- Creating a staff plan to meet the operations objective
- Developing a site-specific training program
- Operating with comprehensive procedures
- Employing Critical Facility Work Rules
- Assessing the operation for continuous improvement
Creating a staff plan to meet the operations objective
Highly critical facilities are designed with redundant systems and concurrent maintainability so no planned interruptions to operations are needed. When one component or module fails, redundant equipment carries the load. If the business truly cannot afford an interruption, two Facilities technicians on each shift are required. This ensures each trained shift team can confidently respond to any emergency, while effectively employing the “pilot/co-pilot” process of reading and repeating back each step of the procedure they will follow.
Organizations for which an operations interruption has a less significant impact can appropriately choose to operate with a smaller, single-shift Facilities team that can respond quickly to a weekday incident. They accept that the response to after-hours events will be delayed. This option provides substantial annual savings, provided a rare downtime event will not threaten the company’s viability.
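The cost difference between the two staffing options can be estimated with simple arithmetic. The sketch below is illustrative only: the three 8-hour shifts, the 40-hour work week, and the two-person weekday team are assumptions rather than figures from any standard, and the result excludes allowances for leave, training time, and turnover.

```python
import math

def technicians_needed(per_shift: int, shifts_per_day: int, shift_hours: float,
                       days_per_week: int, hours_per_tech_per_week: float) -> int:
    """Minimum headcount to fill every seat on every shift, before leave or turnover."""
    seat_hours = per_shift * shifts_per_day * shift_hours * days_per_week
    return math.ceil(seat_hours / hours_per_tech_per_week)

# Two technicians on every shift, around the clock
# (assumed: three 8-hour shifts per day, 40-hour work week).
round_the_clock = technicians_needed(2, 3, 8, 7, 40)

# Single weekday shift (assumed: two technicians, 8 hours, Monday to Friday).
weekday_only = technicians_needed(2, 1, 8, 5, 40)

print(round_the_clock, weekday_only)  # prints: 9 2
```

Under these assumptions, continuous two-person coverage needs at least nine technicians against two for a weekday-only team, which is the scale of saving the single-shift option trades against slower after-hours response.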
The staffing strategy should include:
- Hiring key technical team members early in the new facility planning schedule
- Targeting specific hiring sources for talent and critical facilities experience
- Involving team members in design meetings, construction observation, and commissioning activities
- Defining individual ownership of critical processes and infrastructure systems
- Creating and maintaining a desirable work environment, as continuity (long-term tenure) is a major contributor to achieving uninterrupted systems operation
Developing a site-specific training program
Every critical facility is unique, with infrastructure system configurations the Facilities team will need to learn thoroughly, regardless of previous experience. Team members gain their initial orientation by reviewing the design engineer’s basis of design and by observing construction and commissioning. It is critical that these lessons are reinforced by continued, repetitive in-house training sessions, which provide the needed confidence. One team member should be assigned the role of training program owner.
The training program owner’s role will include:
- Developing an orientation manual for the team that covers the basis of design, written system descriptions, system capacities, redundancy levels, expected longevity, automated failover scenarios, and the scenarios in which team members will need to intervene manually
- Creating and overseeing an expected schedule for a new team member to “shadow” more experienced team members for multiple weeks (or months if the operation is highly critical) prior to operating without supervision
- Designing a monthly training calendar to ensure each team member receives consistent practice performing critical switching activities (isolating or restoring a system) and responding to anticipated emergencies for each system (these sessions are often referred to as drills)
- Creating content for each monthly session by incorporating site-specific procedures
- Involving selected subject matter experts (SMEs) to serve as the initial trainers
- Training fellow team members (those assigned ownership of individual systems) to serve as trainers in future sessions after initial training sessions are complete
- Developing a means of measuring each team member’s knowledge retention
- Documenting training session completion for each team member
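A monthly drill calendar of the kind described above can be generated by rotating through every combination of system and drill type. This is a minimal sketch: the system names, the three drill types, and the one-drill-per-month cadence are hypothetical placeholders that a training program owner would replace with site-specific content.

```python
from itertools import cycle, product

# Hypothetical system names and drill types, for illustration only.
SYSTEMS = ["UPS", "generator", "chilled water"]
DRILL_TYPES = ["isolate", "restore", "emergency response"]

def build_calendar(months: int):
    """Assign one (system, drill type) pair to each month, repeating the cycle as needed."""
    pairs = cycle(product(SYSTEMS, DRILL_TYPES))
    return [(month, *next(pairs)) for month in range(1, months + 1)]

calendar = build_calendar(12)
for month, system, drill in calendar:
    print(f"Month {month:2d}: {drill} drill on {system}")
```

With three systems and three drill types, every combination is practiced once every nine months before the rotation repeats. A real site with more systems would need more than one session per month, or longer cycles, to keep practice consistent.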
Operating with comprehensive procedures
Critical facilities require the consistent use of procedures by the team responsible for their operation. In addition to administrative procedures and non-invasive preventive maintenance (PM) procedures, the Facilities team should have a thorough and quickly accessible emergency operating procedure (EOP) for every anticipated emergency event. Standard operating procedures (SOPs) are likewise required for each activity that involves isolating systems prior to maintenance or repair, or restoring systems to normal operation.
The facility owner may contract with the design engineer or the commissioning agent to create EOPs and SOPs but should recognize that procedures cannot be finalized until shortly after commissioning is complete. Alternatively, the Facilities team can be tasked with creating the procedures. Because more than one hundred documents are needed, this will be a multi-month or multi-year effort that leaves the operation exposed in the interim.
A single individual should be assigned ownership of the procedures program. In addition to a master binder of procedures, smaller binders containing emergency procedures specific to equipment in each room should be located within those rooms for rapid access during an incident in progress.
Procedures should employ a consistent format, to include:
- Clear description of the intended operation and its inherent risks
- Notifications to be made
- Safety equipment needed
- Number of individuals required to perform the work
- Location(s) for the work described
- Space to record initials after each step
- Space to record names of those performing work described
- Approval signatures
- Back-out plan
- No more than one action described per step
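The format elements listed above can be captured as a simple record with a completeness check, so that draft procedures are screened before review. This is a sketch under stated assumptions: the `Procedure` class, its field names, and the example equipment names are hypothetical, the sign-off spaces for initials and names are left to the printed layout, and the "one action per step" test is a crude keyword heuristic that flags steps for a human to re-read rather than a reliable rule.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Procedure:
    title: str
    description_and_risks: str
    notifications: List[str]
    safety_equipment: List[str]
    people_required: int
    locations: List[str]
    steps: List[str]
    back_out_plan: str
    approvals: List[str] = field(default_factory=list)

def format_problems(p: Procedure) -> List[str]:
    """Return a list of format gaps; an empty list means the draft passes this screen."""
    problems = []
    if not p.description_and_risks.strip():
        problems.append("description and risks missing")
    if not p.notifications:
        problems.append("notifications missing")
    if p.people_required < 1:
        problems.append("number of people required not stated")
    if not p.locations:
        problems.append("work location missing")
    if not p.back_out_plan.strip():
        problems.append("back-out plan missing")
    if not p.approvals:
        problems.append("approval signatures missing")
    for number, step in enumerate(p.steps, start=1):
        # Heuristic only: these words often join two actions in one step.
        if any(word in step.lower() for word in (" and ", " then ", ";")):
            problems.append(f"step {number} may describe more than one action")
    return problems

# Hypothetical draft with two deliberate gaps and one compound step.
draft = Procedure(
    title="Transfer UPS A to maintenance bypass",
    description_and_risks="Isolates UPS A for maintenance. Load loses battery backup.",
    notifications=["Operations manager"],
    safety_equipment=["Arc-flash PPE"],
    people_required=2,
    locations=["Electrical room A"],
    steps=["Verify UPS B is carrying load", "Open breaker A1 and close breaker A2"],
    back_out_plan="",
)
problems = format_problems(draft)
for problem in problems:
    print(problem)
```

The draft above is flagged for its missing back-out plan, its missing approvals, and a second step that combines two breaker operations.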
Employing Critical Facility Work Rules
Vendors, contractors, and visitors should be given a thorough Work Rules document when they arrive on site, to review and sign in the presence of the Facilities team member responsible for escorting them. The document must set out the organization’s expectations regarding safety practices, security requirements, and detailed precautions for working in the facility’s critical spaces. Signed and dated copies should be filed and retained. This practice must be in place from the day the critical facility begins operation.
Assessing the operation for continuous improvement
A program of recurring assessments will ensure your critical facility is continually reviewed for risks to continuous operation, so they can be addressed before they affect the business. It will also highlight commendable practices, providing recognition and motivation for the Facilities team. Facility infrastructure systems should be inspected for single points of failure, end-of-life expectancy, load vs. capacity, and energy efficiency. Similarly, operations practices should be evaluated for optimum performance and safety.
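The load vs. capacity inspection can be illustrated with a short calculation that asks whether the remaining modules still carry the load when one is out of service. The function name, parameters, and figures below are hypothetical, and a real assessment would also account for derating, growth projections, and distribution limits.

```python
def redundancy_check(load_kw: float, module_kw: float, modules: int,
                     modules_out: int = 1) -> dict:
    """Compare load with capacity when all modules run and when `modules_out` are unavailable."""
    total_kw = module_kw * modules
    remaining_kw = module_kw * (modules - modules_out)
    return {
        "utilization": load_kw / total_kw,            # share of installed capacity in use
        "survives_failure": load_kw <= remaining_kw,  # can the rest carry the load?
        "headroom_kw": remaining_kw - load_kw,        # margin left after the failure
    }

# Hypothetical example: three 500 kW modules serving a 900 kW load.
result = redundancy_check(load_kw=900, module_kw=500, modules=3)
print(result)
```

In this example the load survives the loss of one module with 100 kW to spare. The same system at 1,100 kW would still look comfortable at 73 percent of installed capacity while having lost its redundancy, which is the kind of finding an assessment is meant to surface.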
Assessors should, at a minimum, have conducted 50 critical facility assessments and have 10 years’ experience in designing and/or managing highly critical facilities. This level of experience will permit them to compare your facility operation with significant peers in the critical facilities industry.
Most enterprise critical facility owners schedule assessments every three years. Companies that manage a portfolio of critical facilities typically assess annually. Findings almost always justify funding for needed improvements and enhancements.
Our industry is renowned for investing significantly in robust and highly redundant facilities systems but failing to allocate adequate resources for successful long-term operation. Hard experience has compelled several organizations to properly fund a Facilities Operations program that matches their facility’s design objective. If downtime avoidance is paramount, employing the strategies outlined above will position your team for success. Should your organization lack the experience or resources to implement these practices effectively, a mission-critical property management firm may be consulted.