E Book

Welcome to the PSM eBook!

An interactive, e-learning book on Process Safety Management!

The electronic, interactive e-learning book consists of study material
for the entire curriculum of undergraduate chemical engineering
The course content is developed by three practicing professionals,
Ronald Cutshall Sr, PE, CEng, Deborah Grubbe, PE, CEng, and
Steven J Swanson, PhD, ChE, who have taught the course and have
a combined industrial experience of about 100 years in process safety!
The easy-to-learn eBook has 34 chapters in three sections titled
Hazard Identification, Analyzing Hazards and Managing Risk. Each
chapter comprises slides, detailed reference pdf files, helpful links
and audio. Homework assignments reinforce the learning and
examinations taken at your pace can pinpoint potential areas for

How to use the eBook?

1. Open the eBook, hover the mouse over the table of contents
and click on whichever chapter you wish to study.
2. The first page will open. You can read the basic points, and
click on the audio icon if you want to listen to the explanation.
3. If you wish to read about the subject in detail, click on the
pdf icon at the top left-hand corner of the first slide of the
4. There are various web-links that illustrate a point or provide
5. Homework assignments can help you understand how well
you have learnt the topic at hand.
6. Examinations interspersed at intervals throughout the eBook
can help you with a practice run before the actual
What is Process Safety Management?

The PSM (Process Safety Management) standard or 29 CFR
1910.119 is a mechanism to manage risk.
Many accidents and disasters are due to human error. Such errors
can prove very costly in terms of human life, the facility and
equipment involved, and the environment. Because these are human
mistakes, they could have been prevented, but one cannot plan in
retrospect. The best way to avoid, or at the least mitigate, such
disasters is to take safety precautions. These precautions
are universal and keep improving as lessons are learnt from
earlier incidents.
Many industries use highly hazardous chemicals that may be toxic,
reactive, flammable, explosive, or may exhibit a combination of
these properties. The potential for accidental release of toxic,
reactive or flammable gases in any such industry is very high,
unless proper measures are taken. The possibility of disaster looms
high in such cases.
When the magnitude of such an incident is great, there is a public
outcry. Today's instant communication media ensure that such
incidents get global attention. However, there are many other
lesser-known releases of highly hazardous chemicals. Hazardous chemical
releases continue to pose a significant threat to employees and may
result in multiple injuries and fatalities, as well as substantial
economic, property, and environmental damage. Such dangers
provide impetus, internationally and nationally, for authorities to
develop or consider developing legislation and regulations to
eliminate or minimize the potential for such events.

The U.S. Occupational Safety and Health Administration (OSHA) proposed a
standard that emphasized the management of hazards associated
with highly hazardous chemicals and established a comprehensive
management program that integrated technologies, procedures,
and management practices. The final OSHA standard was issued on
February 24, 1992.

The standard mainly applies to manufacturing industries, particularly
those pertaining to chemicals, transportation equipment, and
fabricated metal products. Other affected sectors include natural
gas liquids; farm product warehousing; electric, gas, and sanitary
services; and wholesale trade.
It also applies to pyrotechnics and explosives manufacturers
covered under other OSHA rules and it has special provisions for
contractors working in covered facilities.
In each industry, PSM applies to those companies that deal with any
of more than 130 specific toxic and reactive chemicals in listed
quantities; it also includes flammable liquids and gases in quantities
of 10,000 pounds (4,535.9 kg) or more.
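The flammables threshold quoted above can be checked with a few lines of code. This is only a minimal sketch: the function names are hypothetical, and a real applicability review would also have to consider the listed toxic and reactive chemicals and any exemptions in the standard.

```python
# Minimal sketch: screening a flammable inventory against the 10,000 lb figure.
LB_TO_KG = 0.45359237            # exact definition of the pound in kilograms
FLAMMABLE_THRESHOLD_LB = 10_000  # threshold quoted in the text above


def pounds_to_kg(pounds: float) -> float:
    """Convert a mass in pounds to kilograms."""
    return pounds * LB_TO_KG


def meets_flammable_threshold(inventory_lb: float) -> bool:
    """True when the inventory is at or above the quoted threshold."""
    return inventory_lb >= FLAMMABLE_THRESHOLD_LB
```

Calling pounds_to_kg(10_000) and rounding to one decimal place reproduces the 4,535.9 kg figure given in the text.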
Process means any activity involving a regulated substance,
including any use, storage, manufacturing, handling, or on-site
movement of such substances, or combination of these activities. A
"covered process" is a process that contains a regulated substance
in excess of a threshold quantity (40 CFR 68.3).
The key provision of PSM is process hazard analysis (PHA), a
careful review of what could go wrong and what safeguards must be
implemented to prevent releases of hazardous chemicals.

PSM sets out the responsibilities of employers and contractors
involved in work that affects or takes place near covered processes
to ensure that the safety of both plant and contractor employees is
fully taken into consideration. The standard also mandates written
operating procedures; employee training; pre-startup safety
reviews; evaluation of mechanical integrity of critical equipment;
and written procedures for managing change. PSM specifies a
permit system for hot work; investigation of incidents involving
releases or near misses of covered chemicals; emergency action
plans; compliance audits at least every three years; and trade
secret protection.

Process Safety Management

Let us now understand the nature of PSM: Process Safety Management.
The first word is Process.
OSHA defines process as any activity involving a highly hazardous
chemical including using, storing, manufacturing, handling, or
moving such chemicals at the site, or any combination of these
activities. For purposes of this definition, any group of vessels that
are interconnected, and separate vessels located in a way that could
involve a highly hazardous chemical in a potential release, are
considered a single process.
PSM is concerned with process events such as fires, explosions and
the release of toxic gases, caused by process-oriented issues such as
runaway chemical reactions, corrosion and the inadvertent mixing
of hazardous chemicals.

The second word in the term PSM is Safety. Initially most of the
concerned companies were focused on the need to meet the safety
regulations and to reduce safety incidents related to process upsets
and hazardous materials releases. However the role of PSM has now
increased to encompass a much wider canvas. PSM is now
becoming a more and more crucial part of Operational Integrity and
Excellence programs in many companies.
When used in process facilities, safety can have three connotations:
Technical safety, Process safety and Occupational safety.
Technical safety implies safe engineering and design of the facility
and equipment. It is naturally addressed in the initial
stages of a design.
Process safety, as we have seen, is focused on process-related
events that have high consequences. So what is a PSM event? The
Center for Chemical Process Safety (CCPS 2007b) sets four criteria:
It must involve a chemical or have chemical process involvement
It must be above a minimum reporting threshold
It must occur at a process location
The release must be acute, i.e. it must occur over a short
period of time.
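The four criteria can be read as a simple all-or-nothing test. The sketch below is illustrative only: the field names are hypothetical, the reporting threshold is an input that has to come from the applicable reporting scheme, and the one-hour cut-off for "acute" is an assumption made here because the text only says "a short period of time".

```python
from dataclasses import dataclass

# Assumed cut-off for "acute"; the text only says "a short period of time".
ACUTE_LIMIT_HOURS = 1.0


@dataclass
class ReleaseEvent:
    chemical_or_process_involved: bool
    released_quantity: float     # same units as reporting_threshold
    reporting_threshold: float   # value taken from the reporting scheme in use
    at_process_location: bool
    duration_hours: float


def is_process_safety_event(event: ReleaseEvent) -> bool:
    """Apply the four criteria listed above; all of them must hold."""
    return (
        event.chemical_or_process_involved
        and event.released_quantity >= event.reporting_threshold
        and event.at_process_location
        and event.duration_hours <= ACUTE_LIMIT_HOURS
    )
```

A slow leak that exceeds the threshold only over several days would fail the last test and therefore would not count as a process safety event under this reading.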
The third word is Management. Here a manager means any person
who has some degree of control over the process, including
operators, engineers and maintenance workers. Use of the word
management also means that PSM is not just about equipment and
instrumentation, but also covers issues such as Employee
Participation, Operating Procedures and Management of Change.

The PSM Standard

The process safety management standard targets highly hazardous
chemicals with potential to cause a catastrophic incident. The
purpose of the standard as a whole is to aid employers in their
efforts to prevent or mitigate episodic chemical releases that could
lead to a catastrophe in the workplace and possibly in the
surrounding community.
To control these types of hazards, employers need to develop the
necessary expertise, experience, judgment, and initiative within
their work force to properly implement and maintain an effective
process safety management program as envisioned in the
Occupational Safety and Health Administration (OSHA) standard.

Important Features
Some of the more important features of a process safety
management system include the following.
PSM is not a management program designed exclusively by the top
management. Here management implies management of the facility
under consideration by the concerned employees. So all managers,
employees and contract workers are responsible for the successful
implementation of PSM. Top management designs the PSM program
together with representatives of the workers and
operatives concerned. These employees are involved in its implementation and
improvement because they are the people who know the most
about how a process really operates, and they are the ones who
have to implement recommendations and changes. PSM is
fundamentally a line responsibility.

PSM is an on-going activity that never ends; it is a
process, not a project. Because risk can never be zero, there must
always be ways of improving safety and operability. Process safety
management cannot be viewed as being a one-time fix. It is, in
reality, a continuous improvement process.
Process safety management programs are non-prescriptive. The
PSM standard essentially sets out performance expectations, a
frame of reference on which each individual company has to build
its PSM program to achieve the desired outcome. What this means
is that the managers and employees together have to determine
exactly what their particular PSM program should contain. What does
it take to make their facility safety-oriented?
There is no universal PSM standard. Measures that are right for a
particular facility may be inadequate for another. Not all hazards
are caused by the same factors or involve the same degree of
potential damage. The PSM standard simply requires that
companies set their own standards, and then adhere to them.
As there are no universal standards, how does one judge a program's
success? Of course the target is zero accidents. However, that is
nearly impossible to achieve; that is why they are called accidents! The risk
can never be zero, especially if hazardous chemicals and gases are
involved. It is also true that when a unit is run for a long time
without accidents, complacency can set in. Then every action
becomes routine and precautions sometimes get disregarded,
which raises the likelihood of an accident!

Hence, even though the stated PSM goal may be zero accidents, in
practice, management has to determine a level for acceptable
safety and for realistic goals.

The elements of PSM

1. Employee Participation
Requirements: The standard requires employers to:
Develop a Plan of Action for implementation of Employee
Participation
Consult with employees on the conduct and development of
PSM Elements
Provide access to PSM information
Employer and employees need to draft the PSM program
for their facility together. There should be a written action plan on employee
involvement. It is the frontline employee, the operator, who is most
conversant with the equipment and its operation. That is why the
employee's contribution to planning the program is crucial.
The PSM standard also requires employers to provide employees
access to any information regarding analyses of process hazards.

2. Process Safety Information (PSI)

Requirements: The OSHA standard requires compiling technical
information on the process and equipment in the system. This
information supports the PHA and serves as a reference for
operator training. Specifically:
Information pertaining to the hazards of the highly hazardous
chemicals in the process
Information pertaining to the technology of the process
Information pertaining to the equipment in the process
Documentation that equipment complies with recognized and
generally accepted good engineering practices.
Employers are required to develop and maintain written safety
information about a hazardous process.
The PSI areas cover:
Chemical Hazards
o Physical Data
o Toxicity Data
o Permissible Exposure Limits
o Chemical Stability
o Hazardous Effects of Mixing with other chemicals
Process Technology
o Flow Diagrams
o Maximum Intended Inventories
o Safe Operating Limits
o Consequences of Deviation
o Avoidance Procedures
Process Equipment
o Construction Materials
o Piping & Instrument Diagrams (P&IDs)
o Electrical Classification
o Relief and Vent System Information
o Design Codes and Standards
o Material & Energy Balances for new processes
o Safety Systems
o Documentation verifying that equipment follows generally
accepted good engineering practices

3. Process Hazard Analysis (PHA)

Requirements: An initial process hazard analysis must be conducted
by a team with expertise in engineering and process operations,
including at least one employee who has experience and knowledge
of the system.
After Initial PHA
Establish a system to promptly address the team's findings
and recommendations
Assure that the recommendations are resolved in a timely manner
Document resolutions
Document what actions are to be taken
Complete actions as soon as possible
Develop a written schedule of when these actions are to be
completed; Communicate the actions to operating,
The PHA must be updated and revalidated at least every five years
by a qualified team to assure that the process
hazard analysis is consistent with the current process.
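The five-year revalidation interval lends itself to a simple due-date calculation. The following is a rough sketch with hypothetical function names; it assumes the interval is counted from the completion date of the previous PHA and moves a 29 February date to 28 February when the target year is not a leap year.

```python
from datetime import date

PHA_REVALIDATION_YEARS = 5  # interval stated in the text above


def add_years(start: date, years: int) -> date:
    """Shift a date by whole years, moving 29 Feb to 28 Feb if needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return start.replace(year=start.year + years, day=28)


def next_pha_due(last_pha_completed: date) -> date:
    """Latest date by which the PHA should be revalidated."""
    return add_years(last_pha_completed, PHA_REVALIDATION_YEARS)


def pha_overdue(last_pha_completed: date, today: date) -> bool:
    """True once the revalidation due date has passed."""
    return today > next_pha_due(last_pha_completed)
```

A site would normally schedule the revalidation well ahead of this date, since the team study itself takes time to complete.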
The PHA must address:
The hazards of the process
Any previous incident which had a likely potential for
catastrophic consequences in the workplace
Engineering and administrative controls
Detection methods for providing early warning of releases
Consequences of failure of engineering and administrative
controls
Facility siting
Human factors
A qualitative evaluation of a range of the possible safety and
health effects of failure of controls on employees
Potential hazards are identified through a Process Hazard Analysis
conducted by an onsite cross-functional PHA Team.
The PHA will cover
Location of a process area
Hazards of a process
Engineering and administrative controls (safeguards)
Probable outcomes if controls fail
Possibility of Human Error
Previous Incidents or Catastrophes

4. Operating Procedures
Requirements: Develop and implement written operating
procedures that provide clear instructions for safely conducting
operations and maintenance. Operating procedures shall be readily
accessible to employees. The operating procedures shall be
reviewed as often as necessary to assure that they reflect current
operating practice. The employer shall certify annually that these
operating procedures are current and accurate.
Develop and implement safe work practices to provide for the
control of hazards during operations such as lockout/tagout;
confined space entry; opening process equipment or piping; and
control over entrance into a facility by maintenance, contractor,
laboratory, or other support personnel. These safe work practices
shall apply to employees and contractor employees.
It is essential that there are written operating procedures for the
following phases and that they are strictly followed.


Initial startup
Normal operations
Temporary operations
Emergency Shutdown
Conditions when emergency shutdown is required
Assignment of shutdown responsibility
Emergency Operations
Normal shutdown
Startup following a turnaround, or after an emergency
Steps required to correct / avoid deviation
Operating limits and consequences of deviation
Health and Safety Considerations
Built-in Safety Systems
Hazard Control for non-routine tasks (e.g. Line breaking,
Confined Space Entry, Control over entrance into a facility by
support personnel)

5. Training
Initial training: Each operator must be trained in an overview of
the process and in the operating procedures. The training shall
include emphasis on the specific safety and health hazards,
emergency operations including shutdown, and safe work practices
applicable to the employee's job tasks.
Refresher training shall be provided at least every three years,
and more often if necessary, to each employee involved in
operating a process to assure that the employee understands and
adheres to the current operating procedures of the process. The
employer, in consultation with the employees involved in operating
the process, shall determine the appropriate frequency of refresher
training.
Training documentation: The employer shall ascertain that each
employee involved in operating a process has received and
understood the training required by this paragraph. The employer
shall prepare a record that contains the identity of the employee,
the date of training, and the means used to verify that the
employee has understood the training.
Training should cover:
Operating Procedures
Specific Safety and Health Hazards
Emergency operations and shutdown
Safe Work Practices
Refresher Training
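The training documentation requirement names three items that every record must contain. A minimal sketch of a completeness check is shown below; the class and field names are hypothetical, and the example verification method is only an illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class TrainingRecord:
    employee_id: Optional[str]          # identity of the employee
    training_date: Optional[date]       # date of training
    verification_method: Optional[str]  # how understanding was verified


def missing_fields(record: TrainingRecord) -> List[str]:
    """Return the names of required items that are absent or blank."""
    missing = []
    if not record.employee_id:
        missing.append("employee_id")
    if record.training_date is None:
        missing.append("training_date")
    if not record.verification_method:
        missing.append("verification_method")
    return missing
```

An empty result means the record carries the three items named in the text; it says nothing about the quality of the training itself.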

6. Contractors
Obtain and evaluate information regarding the contract
employer's safety performance and programs
Inform contract employers of the known potential fire, explosion, or
toxic release hazards related to the contractor's work and the
process
Explain to contract employers the applicable provisions of the
emergency action plan
Develop and implement safe work practices to control the
entrance, presence and exit of contract personnel
Evaluate the performance of contract employers in fulfilling
their obligations
Maintain a contract employee injury and illness log related to the
contractor's work in process areas
The PSM Standard gives specific instructions to the employer and
contractor concerning specific responsibilities before contractors
may work in or around a hazardous process.
The significance here is that the employer is also responsible for the
contractors on the employer's site. If the employer does not train
the contractor, then the employer must ensure that the contractor
has the right training for the work to be done. The employer also
needs to ensure that the contractor is familiar with the specific
hazards of the site as they pertain to the work of the contractor.

7. Pre-Start-Up Safety Reviews (PSSR)

Requirements: Perform a pre-startup safety review for new facilities
and for modified facilities when the modification is significant
enough to require a change in the process safety information.
Before beginning a new or modified process, the employer must
confirm that:
Construction meets design specifications
Adequate safety, operating, maintenance, and emergency
procedures are in place
PHAs have been performed and recommendations resolved or
implemented for new facilities
Modified facilities meet the requirements of the MANAGEMENT
OF CHANGE element
All necessary training has been completed
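The confirmations above act as a gate: startup should not proceed while any item is open. The sketch below illustrates that idea with hypothetical item names paraphrased from the list. It treats all five items as required, which is a simplification; an item that does not apply (for example, Management of Change for a brand-new facility) would be marked as confirmed.

```python
from typing import Dict, List

# Checklist items paraphrased from the list above; the keys are hypothetical.
PSSR_ITEMS = (
    "construction_meets_design",
    "procedures_in_place",
    "pha_done_and_recommendations_resolved",
    "moc_requirements_met",
    "training_completed",
)


def open_pssr_items(confirmed: Dict[str, bool]) -> List[str]:
    """Return checklist items not yet confirmed (missing keys count as open)."""
    return [item for item in PSSR_ITEMS if not confirmed.get(item, False)]


def startup_allowed(confirmed: Dict[str, bool]) -> bool:
    """Startup is allowed only when no checklist item remains open."""
    return not open_pssr_items(confirmed)
```

Returning the list of open items, rather than a bare yes or no, tells the review team exactly what still has to be closed out.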


8. Mechanical Integrity (MI)

Requirements: Establish and implement written procedures to
maintain the on-going integrity of the equipment. This includes:
Test & Inspections (T&Is) on equipment following recognized
and generally accepted good engineering practices,
manufacturers' recommendations and operating experience for
the conduct and frequency.
Documentation of T&Is, identifying:
o Date
o Name of the person performing T&I
o Serial number or other identifier of the equipment
o Description of the inspection or test performed
o Results
Equipment deficiencies: Rectify deficiencies in equipment
that are outside acceptable limits before further use or in a
safe and timely manner when necessary means are taken to
assure safe operation.
New Equipment: Assure that equipment as it is fabricated is
suitable for the process application for which it will be
used. Additionally, conduct appropriate checks and
inspections to assure that equipment is installed properly and
consistent with design specifications and the manufacturer's
instructions.
Material Control: Assure that maintenance materials, spare
parts and equipment are suitable for the process application
for which they will be used.
A good MI program ensures that vital equipment is designed,
installed, serviced, and operated properly by providing:
Written procedures for on-going maintenance
Training for maintenance workers
Inspection and testing of equipment
Repair or replacement of worn or defective parts
Assurance that all new and replacement parts are in good working order

9. Hot Work Permits (HWP)

Requirements: The employer shall issue a hot work permit for hot
work operations conducted on or near a covered process.
The HWP serves these functions:
Verifies that necessary fire prevention measures have been
implemented
Gives the dates authorized for hot work
Identifies the object on which the hot work will be performed
The permit shall be kept on file until completion of the hot work.

10. Management of Change (MOC)

Requirements: Establish and implement written procedures to
manage changes (except for "replacements in kind") to process
chemicals, technology, equipment, and procedures; and, changes to
facilities that affect a covered process.
Before any changes are made, the written program should ensure
the safety of the planned modification and consider the following:
Technical basis for the change
Safety and Health Effects
Updating PSI and Operating Procedures
Time needed to make the change
Authorization requirements


This is the most important aspect of PSM. The original equipment
and process get a tremendous amount of attention from the original
engineering and construction companies. Then, once the facility is in the hands
of the owner and has run for some time, the owner's operators see
places where they think they can improve the original design. But
the owner's operators may not completely understand why the
facility was originally designed the way it was. Changing the design
requires much care and focus to ensure that no booby
traps are introduced by well-meaning individuals. So, whenever
a change is needed, it should be made with the utmost care and inquiry.

11. Incident Investigations

Investigate each incident that resulted in, or could
reasonably have resulted in, a catastrophic release of a highly
hazardous chemical in the workplace. An incident
investigation shall be initiated as promptly as possible, but
not later than 48 hours following the incident.
Establish an incident investigation team which consists of
at least one person knowledgeable in the process involved,
including a contract employee if the incident involved work of
the contractor, and other persons with appropriate knowledge
and experience to thoroughly investigate and analyze the
incident.
Incident Reports: A report shall be prepared at the
conclusion of the investigation that includes at a minimum:
o Date of incident
o Date investigation began
o Description of the incident
o Factors that contributed to the incident
o Recommendations resulting from the investigation
Corrective Actions: Establish a system to promptly address
and resolve the incident report findings and
recommendations. Resolutions and corrective actions shall be
documented.
Report Review: The report shall be reviewed with all
affected personnel whose job tasks are relevant to the
incident findings, including contract employees where
applicable. Incident investigation reports shall be retained for
five years.
Investigation of accidents and near misses is absolutely crucial.
Give dates of the incident and the investigation
Give a detailed description of the incident
Determine factors that caused or contributed to the incident
Recommend solutions
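The 48-hour limit for starting an investigation is easy to track once the incident time is recorded. The following is a small sketch under the assumption that both timestamps are kept in the same time zone; the function names are hypothetical.

```python
from datetime import datetime, timedelta

INVESTIGATION_START_LIMIT = timedelta(hours=48)  # limit stated in the text


def investigation_deadline(incident_time: datetime) -> datetime:
    """Latest time at which the investigation may begin."""
    return incident_time + INVESTIGATION_START_LIMIT


def started_in_time(incident_time: datetime,
                    investigation_start: datetime) -> bool:
    """True when the investigation began within the 48-hour limit."""
    return investigation_start <= investigation_deadline(incident_time)
```

Both dates appear in the incident report anyway (date of incident and date investigation began), so the check can be run directly on the report data.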
The bottom line is that if you ever get to this point in a facility, your
PSM process has failed you. Determine what, where, and why, and
generate fixes that resolve the root cause of the failure. Learn from

12. Emergency Response and Planning

Requirements: Establish and implement an emergency action
plan for the entire plant in accordance with the provisions of 29 CFR
1910.38(a) and 29 CFR 1910.120(a), (p) and (q). In addition, the
emergency action plan shall include procedures for handling small
releases.
Emergencies will arise, and in such eventualities the correct
response and actions need to be documented. This is the "what if"
question, and answers should be available. These include:
Procedures for handling small releases
Alarms and other methods for alerting workers
Emergency Shutdown
Evacuation Procedures
Accounting for employees after evacuation
How to report emergencies
Rescue and medical duties for employees
Employee Training

13. Compliance Audits

Requirements: Certify compliance with the provisions of the PSM
Standard at least every three years to verify that the procedures
and practices developed under the standard are adequate and are
being followed.
The compliance audit shall be conducted by at least one
person knowledgeable in the process.
A report of the findings of the audit shall be developed.
Promptly determine and document an appropriate response to
each of the findings of the compliance audit, and document
that deficiencies have been corrected.
The employer must certify that their facility complies with the
provisions of this regulation at least once every three years.
A report of the findings must be developed, and the two most
recent reports are to be kept on file.
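Keeping the two most recent audit reports on file can be modeled with a fixed-length queue. The sketch below uses a hypothetical AuditFile class and assumes reports are added in chronological order. The rule sets a minimum, so a site is free to keep more than two; the sketch only shows the smallest file that would satisfy it.

```python
from collections import deque
from typing import Deque, List


class AuditFile:
    """Holds the two most recent compliance audit reports."""

    def __init__(self) -> None:
        # maxlen=2 drops the oldest report when a third one is added
        self._reports: Deque[str] = deque(maxlen=2)

    def add_report(self, report_id: str) -> None:
        """File a new report; reports are assumed to arrive in date order."""
        self._reports.append(report_id)

    def retained(self) -> List[str]:
        """Return the reports currently on file, oldest first."""
        return list(self._reports)
```

With audits at least every three years, the two reports on file span roughly the last six years of compliance history.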


14. Trade Secrets

Requirements: Make all information necessary to comply with the
section available to those persons responsible for compiling the
process safety information, those assisting in the development of
the process hazard analysis, those responsible for developing the
operating procedures, and those involved in incident investigations,
emergency planning and response and compliance audits without
regard to possible trade secret status of such information.
Subject to the rules and procedures set forth in OSHA Standard 1910.1200,
employees and their designated representatives shall have access
to trade secret information contained within the process hazard
analysis and other documents required to be developed by this
standard.
It is the employer's responsibility to make all the necessary
information for complying with this regulation available to those
involved in the process safety management process, even if the
information includes trade secrets.
It is also the employer's right to require those persons to enter into
a confidentiality agreement not to disclose such information.

How do the elements work?

The elements link with one another. For example, an engineer may
wish to change operating conditions. First she must find out what
the current operating limits are (element 2). The proposed change
will then be put through the Management of Change system
(element 10), which may require that a HAZOP be performed
(element 3); then process safety information (element 2), operating
procedures (element 4) and training programs (element 5) must be updated.
Before changing conditions in the field, a Readiness/Pre-startup
Safety Review (element 7) needs to be performed. Finally
the updated program must be audited (element 13).
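The chain of elements in this example can be written down as an ordered workflow. The sketch below is one possible reading, not a prescribed sequence: the element numbers follow the list of 14 elements in this chapter, the step wording is paraphrased, and the names are hypothetical.

```python
from typing import Optional, Tuple

# (element number, step) pairs following the worked example above.
CHANGE_WORKFLOW: Tuple[Tuple[int, str], ...] = (
    (2, "Look up the current operating limits"),
    (10, "Put the proposal through Management of Change"),
    (3, "Perform a HAZOP if Management of Change requires it"),
    (2, "Update the process safety information"),
    (4, "Update the operating procedures"),
    (5, "Update the training programs"),
    (7, "Complete the pre-startup safety review"),
    (13, "Audit the updated program"),
)


def next_step(steps_done: int) -> Optional[Tuple[int, str]]:
    """Return the next (element, step) pair, or None when all are done."""
    if steps_done < 0:
        raise ValueError("steps_done cannot be negative")
    if steps_done >= len(CHANGE_WORKFLOW):
        return None
    return CHANGE_WORKFLOW[steps_done]
```

Listing the steps in order makes the dependency visible: conditions in the field are not changed until the pre-startup safety review, which itself comes after the analysis and the document updates.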
So, we've seen the 14 elements of PSM, but how do we accomplish
the objective of improving Process Safety? It starts with leadership
at the top of every company being committed to doing the right
thing. The top bosses need to set the process safety tone at the
top of the organization and establish appropriate expectations
regarding process safety performance.
No corners can be cut to obtain a short-term gain. The objective is
the long-term goal of safe operations. The leadership of a company
must be visible in the workplace and must ask the right questions to
ensure that momentum toward safe results is maintained every day.
That forms the basis for an organizational commitment that extends
through the ranks from the rookie to the most senior member of the
company. Everyone in the organization needs to understand that safe
operation, and hence success, is the only option and that all are
accountable for their own and their co-workers' success and safe
performance of their duties.


PSM Leadership
This lesson explores:
How leadership, organizational culture, ethics and HROs relate
to PSM. The success of a PSM depends almost entirely on the
quality of the team. For any team to be successful leadership
is most critical.
How organizational constructs in structure, work processes
and systems can have direct effect on safety and profitable
How poor leadership can adversely affect process safety.
Aggressive management too is not ideal and can impinge
upon the safety outcomes.
The concept of leadership vis-à-vis PSM, ethics and culture is
examined. Required behaviors will become clear.
The most important point about PSM leadership is the absolute
necessity of the right leadership. Without it, any PSM program,
however well structured, is bound to fail in time.
The topics covered include:
Leadership interpretation and importance of ethics. Workplace
ethics needs to be the backbone in any organization. Being
ethical can add value to processes and help solve problems.
HRO (High Reliability Organization) and how an organization
can operate and safely manage processes with risk potential.
The relationship between leadership and ethics
Different culture models and how culture can impact safety
Essential facts about culture and how today's business climate
impacts culture.

What exactly is leadership? Over the years the qualities and skills of
leaders have remained almost the same. However, in today's
knowledge economy there is a subtle shift of power to the workers.
The leader has to have people skills to manage the workforce.
Warren Bennis, a contemporary leadership guru, emphasized the
difference between management and leadership. According to him,
managing is conducting, coordinating, being in charge of, having
responsibility for. Managers master routines and create efficiency.
In contrast, leading is influencing: guiding in direction, course,
action or opinion. Leaders acquire vision and judgment and become
effective. He says:
Managers are people who do things right, and
Leaders are people who do the right thing.
A manager has a set of goals (responsibilities) and a set of tools
(authority), and uses the tools to accomplish the goals. A
manager reacts to a situation to correct it, but a leader looks beyond
the immediate recovery, thinking if fundamental changes are
Bennis says there are four strategies for a leader:

A leader sets a vision:

Leaders have compelling visions: visions that maintain the culture
and traditions of the organization while envisioning the
future, interpreted in such a way that the employees too see it
clearly. Employees see the sense of the organization's purpose that
the vision shows, its direction, and the projected future state. This
enables them to understand their own roles. They feel empowered
and motivated.

A leader applies communication strategies:

Believing in one's dreams is not enough. Success requires the
capacity to relate a compelling image of a desired state of affairs
that induces enthusiasm and commitment in others. People must be
aligned with the organization's predominant goals.

A leader establishes trust:

Trust binds followers and leaders together. The buildup of trust is a
measure of the legitimacy of leadership. If vision is the idea, then
positioning is the niche the leader establishes. For this niche to be
achieved, the leader must be respected not only for clarity but also
for constancy and reliability. By establishing the position and
staying the course, leadership establishes trust.

A leader deploys self only 24 hours a day:

A leader is always available. Great leaders know their strengths and
leverage them. Effective leadership has much to do with the
creative use of one's self.
Since a leader is aware of her talents, she is always ready to work
on, develop and improve her skills. It is this capacity that
distinguishes leaders from followers: this constant awareness of
the need to better oneself!

Organizational Ethics
Organizational ethics is not just the code of conduct of people
working in an organization but also the way they behave and
respond to situations. Culture, trust, processes, outcomes,
organizational character all contribute to ethics of an organization.
Sometimes if there is a question of an action that may be legal but
not necessarily ethical, it is up to the leadership to decide which
aspect it values.
What an organization values or cherishes are the core principles that
guide an organization's work. These values may not always be
formally stated but are intrinsically understood and followed by the
A Values statement that outlines the guiding principles of an
organization should be amongst the important policy documents.
An organization's values are an important part of its culture. Such
statements help define the principles and ethics by which
an organization operates and can act as a paradigm for expected
behaviors during challenging situations. They help define what is
right and wrong as well as the behaviors and perspectives that the
organization values.
A written code of ethics may be signed by all the employees. This
document should feature the ethics and standards of the
organization. It should also describe values that govern its
processes and operations. Non-compliance with the values may
invite some penalty. However, a formal document alone is not a
guarantee of ethical behavior, nor are penalties always a deterrent!

Ethics have to be imbibed and the management needs to walk the
The knowledge-based organizations of today are more open to
sharing information. In fact the way an organization shares
information goes a long way in determining how keen it is to
inculcate its values into each employee.
Bad news too needs to be shared readily. This is important, as it is
bad news that tests the mettle of an organization. Such news can
get the stakeholders together to face any problem and take
measures to avoid recurrence of undesirable situations. Team spirit
can be strengthened during crisis and the synergy can solve
At such times employees may offer various opinions and suggest
alternative solutions to the problem. The courage
of an employee to put forth some novel or radical solution is the
measure of the strong character of the organization. Are differing
opinions treated with respect? Are they discussed and considered?
Problems always create pressure on the management. At such
times does the organization buckle under? Or does it focus on
solving the problem successfully first and consider the cause later to
learn from it? How does it deal with people accountable for the
crisis? Does it help them learn and move forward? Or does it dole
out severe punishment?
Accountability is not about punishment or fear! It is about
willingness to assume responsibility for your actions and to accept
ownership of the results of your work. When the work environment
is designed for accountability, it will flourish. This is where a leader
needs to step in. The leader can create the right environment for
accountability, build trust and minimize fear. If a worker is afraid of
repercussions, work will suffer.

HRO: High Reliability Organization

HRO in the context of PSM means High Reliability Organization.
An HRO is one that has a record of a high level of safety over long
periods of time despite a potential of extreme risk and complexity.
Such an organization has succeeded in avoiding catastrophes in an
environment where normal accidents can be expected due to risk
factors and complexity.
Forward-looking safety plans, procedures, expectations as well as
after action reviews are the elements of an HRO. However the basic
reliability of such an organization comes from the facility of its
people to respond to changes and challenges.
An HRO repeatedly accomplishes its high hazard mission while avoiding
catastrophic events, despite significant hazards, dynamic tasks,
time constraints, and complex technologies. It also learns from
mistakes, while taking corrective action.
The organization needs to control the source of risk and be keenly
alert to any changes in the environment. HROs know fully well that
their processes and systems can fail and will fail. They work hard to
avoid all possible causes of failure. They are also ready for the
unexpected and inescapable and make efforts to minimize the
impact of unavoidable accidents, however insignificant.
Any unexpected incidents that may lead to such accidents are
noticed and hence they are able to stop them from escalating. If
such containment does not help, then they concentrate on getting
the system back to work at the earliest.
Researchers have found that successful organizations in high-risk
industries continually reinvent themselves.

Pre-occupied with failure/Continuous Improvement

Even if HROs go accident-free for a long time, they do not rest on
their laurels. Complacency is anathema. In fact failure is what
absorbs them. Errors and lapses, however minor, make them
perceive vulnerabilities in the system. And this perception of danger
helps them react quickly to correct the problem.
In such organizations a sixth sense is not derided. If an employee
has a gut feeling that something is not right, it is immediately
looked into. They are attentive to unexpected occurrences however
minor. Long stretches of success in safety are no cause for
self-satisfaction. The HROs are wary of quiet times and stay even more
tuned to the unexpected!

Defers to lowest level familiar with problem

The frontline workers are the real experts in the actual processes.
They operate the systems, know the procedures and have real-time
knowledge of the operations, strengths and weaknesses. They can
instantly zero in on the problem and suggest practical ways of
dealing with it. Their solutions are based on their experience and
expertise gained on the spot.
HROs therefore defer to the lowest level familiar with the problem
and their decisions are respected. The problem is best solved by
them rather than the top managers who may not have the complete

Has high levels of communication

Communication is the lifeblood of any organization. Smooth
channels of communication are even more important in high-risk
companies than in any other. Only hierarchical downward
communication is not enough; upward, transverse, all channels of
communication need to be open and easily accessible to all,
including frontline workers. That is because in HROs employees
across the unit make decisions, which may be interconnected.
Speedy cross communication is required to ensure safety. Effective
communication amongst different units, management and workers
goes a long way in helping take timely actions in case of emergencies.

Embraces complexity
Shouldn't procedures be simplified? Why make everything complex?
Simply because business is complex, it is unpredictable and
inexplicable. There are no simplistic methods or systems. In fact
HROs do not unthinkingly simplify procedures and operations. They
accept that their work is indeed complex. Technology advances
have added to the complexity factor. As technology becomes
all-pervasive, it can help simplify processes. But the same technology
can have a greater potential to cause unexpected uncommon
Systems can fail in ways that have never before happened. It is
necessary to be alert to the possibility of failures due to unseen,
unpredicted reasons. It is also good to explore and identify reasons
that may lead to failures in future.
Simple interpretations of complex situations can be dangerous.
Simplifications made with a thorough knowledge of all the factors
involved are appreciated. This knowledge is the result of taking into
account diverse dynamics and exploring a variety of explanations,
listening to differing views and ideas and then coming to a proper
solution to the problem. A complex organization is made up of diverse people
with diverse experience. Its complexity fosters adaptability.
Everybody involved is encouraged to think and consider a wide
spectrum of things that can go wrong. Accidents do not happen only
due to some single, simple cause.

Learning organization
Continuous learning is a core competency of HROs. They are
organizations that bounce back from any errors or near misses,
tougher and better. Learning from mistakes and thereby improving
their functioning is their intrinsic strength.
The climate of the organization created by the leadership is such
that people feel confident to reveal mistakes. The mistakes become
lessons, which can point to potential dangers and vulnerabilities
that may have been unobserved. Thus the learning organizations
focus on learning rather than on fault-finding and the blame game.
HROs learn from their own mistakes and also from others' mistakes.
If they find some practices that help achieve better performance,
they do not hesitate to adapt and adopt!
In HROs dealing with hazardous processes it is not possible to learn
by trial and error! So they learn by other methods: observing,
imagining worst-case scenarios and devising methods to deal
with them, and training for such eventualities under controlled
conditions.
Such HROs are continuously learning and moving towards a culture
of safety!

Forward focused
HROs have another quality: forward focus!
The leaders of such organizations take their organizations to the
next level up and more. Their people are encouraged to think of the
future and bring it alive in the present, not just through abstract
thinking but by keeping track of innovations in the field, new
technologies, new legislation and new environments, and using all
these to the advantage of the organization.
Forward focused organizations are stimulated to think creatively
and have systems and processes poised to leap into the future,
ahead of others. They have the foresight and audacity to act today
for a better tomorrow. They keep pace with changing conditions and
are sometimes ahead of them. This adaptability to change qualifies
them as HROs.

How leadership and ethics relate

Michael Josephson, a renowned ethicist, states that character is
based on six core ethical values: trustworthiness, respect,
responsibility, fairness, caring and citizenship. If your decisions are
based on these six core ethical values, you will always make the
right decision.
These core ethical values are what define the character of a person.
As these values are ingrained in an individual, their behavior will not
change when situation or circumstances change. So ethics is the
Leadership is by its very nature imbued with power over others.
Leaders can influence others. Ethical leadership can make everyone
in the organization do the right thing for the right reasons. For this
to happen leadership is required. Only ethical leaders can promote
an ethical organization.

Models of Safety Culture

DuPont Bradley Curve
Keil Centre
o Organizational transformation charts
o Models on Change Management
The Safety Culture of an organization is the deciding factor that
influences the safety of its people and systems. Individual behavior
and group behavior, their values, personal and organizational
attitudes, perceptions, competencies together determine their
commitment to safety. The culture begins with the leadership's
commitment to safety. This needs to be visible. Then it has to
encompass the entire organization. It is essential to create a
corporate culture in which safety is understood to be, and accepted
as, the number one priority.
After Chernobyl and other disasters, the safety of people and the
environment came to the forefront. Many industrial safety models
were developed. Some proved of immense value and helped
organizations consciously develop an enduring safety culture.
The most user-friendly definition came from the Cullen Report into
the Ladbroke Grove rail crash, which suggests that culture is
simply "the way we typically do things around here."
DuPont Bradley Curve

One of the most appropriate and useful safety models was the
DuPont Bradley Curve. The target is zero accidents.
This curve basically maps how the culture of the organization
impacts the safety of people, processes and productivity. The safety
culture depends on the maturity of the people towards safety. The
DuPont Bradley curve describes four stages of culture maturity:
Reactive, Dependent, Independent and Interdependent.
In the Reactive stage, people do not take responsibility for safety.
Safety is attributed to luck and not management. "Accidents are
bound to happen" is the attitude. The Safety Manager looks after safety,
and compliance with rules and regulations. Top management is not
actively involved and safety is relegated to a lesser issue.
Unfortunately such a lax attitude also affects productivity and
profitability, which are not at their best.
The management commitment begins at the Dependent stage.
Safety now becomes a responsibility of the supervisors. However
the emphasis is on discipline, and following rules and procedures.
There is no active involvement though necessary safety training is
provided. Safety compliance is due to fear of reprisal and because it
is an employment condition. However at this stage because of
safety awareness, productivity and profitability improve to an extent.
Accident rates decrease and management believes that safety could
be managed if only people would follow the rules.

The next stage is the Independent stage where individuals
become personally involved in safety. The management ensures
that employees have a thorough knowledge of safety issues and
methods. Individuals become committed to safety and follow safety
standards because they believe that they can make a difference to
safety with their own actions. The accident rates go down further
and profitability and productivity climb higher.
Now the organizations and people are ripe for the Interdependent
stage. Here safety is no longer an individual issue but each person
feels responsible for their own as well as others' safety. They
encourage others to conform to safety initiatives. They have an
active safety network and feel proud about their safety endeavors.
This is when the accident rate approaches zero and the productivity
and profits are at their best!
An organization can follow the DuPont Bradley curve to achieve the
highest rates of safety. By understanding the mindset behind each
successive stage of safety culture, it can build that
culture and sustain improvement in safety and productivity!
Keil Centre
The Keil Centre has a safety model established on five maturity
levels vis-à-vis safety. The maturity level is based on ten elements
that incorporate the most common components of both theoretical
and measurement models. These components may differ from one
organization to another as the factors that signpost safety may be
The ten elements are
1) Visible management commitment
2) Safety communication
3) Production versus safety
4) Learning organisation
5) Supervision
6) Health and safety resources
7) Participation in safety
8) Risk-taking behavior
9) Contractor management


It is possible that an organization is not at the same level on all the
ten elements. At such times the appropriate level is decided on the
An important aspect of the Safety Culture Maturity Model (SCMM) is that it is relevant only when an
organization fulfills some basic criteria of safety culture.
The five levels of maturity progress from the least mature
to the ideal. These are Emerging, Managing, Involving,
Cooperating and Continually improving.
As is evident, the fundamental concept of both these models is quite
similar. Organizations go progressively from emerging-reactive,
managing-dependent and involving-independent to
cooperating-interdependent stages. The continually improving stage
is included in the interdependent stage of the earlier model.
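The correspondence between the two models described above can be written down as a simple lookup. This is only an illustrative sketch: the dictionary and function names are hypothetical, and the pairing is taken from the text above rather than from any official DuPont or Keil Centre material.

```python
# Keil Centre maturity levels mapped to DuPont Bradley Curve stages,
# following the pairing described in the text (names are illustrative).
KEIL_TO_BRADLEY = {
    "Emerging": "Reactive",
    "Managing": "Dependent",
    "Involving": "Independent",
    "Cooperating": "Interdependent",
    # The text folds this level into the Interdependent stage.
    "Continually improving": "Interdependent",
}


def bradley_stage(keil_level: str) -> str:
    """Return the Bradley Curve stage paired with a Keil Centre level."""
    return KEIL_TO_BRADLEY[keil_level]


if __name__ == "__main__":
    for level, stage in KEIL_TO_BRADLEY.items():
        print(f"{level:22s} -> {stage}")
```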
The most important step of SCMM is to measure the present level of
an organization's maturity in order to carry on improvements. There
are different methods to measure the level that are based on safety
attitude surveys, safety management audits, safety culture
workshops, leading safety performance indicators, etc.


Apart from the above two there are a few other models that work
equally well for different industries.
Ford's Health and Safety Program within its Corporate
Sustainability. Ford Blue Print for Sustainability: five key
material issues comprising Ford's sustainability program
Lockheed Martin Energy Environment Safety and Health
Sustainability Report 2007: progress report on meeting long-term
sustainability program goals, including management
approaches to safety and health
United Technologies Commitment Improvement Report:
highlight of the five key commitment areas, including
discussion of safety performance indicators
Pfizer Environment Safety and Health Component of Its
Corporate Responsibility Report: overview of key
performance indicators as measure of performance goals
Dow Chemical's Health and Safety Program within its
Corporate Sustainability and Drive to Zero: Dow Chemical's
Injury Reduction Journey
BP Sustainability Review 2008: includes reporting of safety
indicators from 2004-2008

FAQs for Safety Culture

What do managers do when they see an unsafe condition?
How do managers balance safety & production?
How tolerant are individuals of risk?
How open and honest is safety communication?
How is maintenance executed?
How are procedures kept up to date? Used?


In a culture of safety:
When managers see an unsafe condition they immediately act to
neutralize the condition. They take responsibility for the safety of
the employees involved, the environment and the equipment, in that
order. They do not indulge in the blame game, but take actions to
minimize the problem with immediate effect. The next step is to
analyze the cause of the condition: were there any lapses in the
safety measures, precautions, equipment or procedures?
Then measures are taken to avoid such a condition in future.
Learning from mistakes and mishaps is the norm.
In such a culture managers balance safety and production. In fact
according to the safety models studied, it is evident that the higher
the safety culture, the higher the productivity. When employees
have safety ingrained in them, accidents do not happen. This feeling
of safety helps to increase production. Also time and man-hours are
not wasted. So productivity and safety go hand in hand.
Here individuals have zero tolerance for any risk however minor.
Risks are abhorrent to them. They take care to eliminate every risk
conceivable. Their workplace decisions are made based on zero risk
In safety culture, open and honest communication is essential.
There are clear guidelines on behavior to promote a positive and
safe workplace. Here leaders have a decisive role in promoting
safety and zero tolerance for risks. Their behavior ensures open
Workplace maintenance in a safe organization meets legal
requirements. In addition to this a pro-active maintenance system
is in place. This includes controlling risks and accidents during
maintenance. Written checklists are followed. Maintenance itself is a
high-risk activity. It has to be carried out safely. Only regular
maintenance can keep equipment, machines and the work
environment safe and reliable, and help eliminate workplace hazards.
Work environment in this age is a constantly changing phenomenon.
Competition too is relentless and technology is available to all. So it
is important especially for the leadership to keep in touch with the
latest technology and procedures in the industry. The latest trends
are used towards improving procedures and making them safer.

Other considerations
Right skill mix and staffing for the work?
Right work processes for the business?
Right work systems to support employees?
Right values and policies?
Right reward and promotion systems?
Appropriate board oversight?
Right organizational structure?
These considerations also affect safety. The leadership has to decide
the right skill mix and staffing for a work process. They have to
ensure right work processes and systems.
Values are what decide the culture of an organization. Values and
ethics of an organization impact the policies formulated by the
leadership. In an organization that maintains high standards of safety,
the values of the organization can be seen through its policies.
Just rewards and promotions also contribute to the culture of an
organization. Recognition depends on the company's values and
policies, and on what the key drivers of success are.


Board Risk Oversight emphasizes the role of the board of directors
in risk management. Boards of course need not be involved in
day-to-day management of risks, but their role in enterprise-wide risk
oversight has become increasingly crucial with time. This oversight
practice helps the board ensure that the organization has an
appropriate critical risk management process in place. The board
can encourage continuous improvement in this process as the
business environment changes. Through oversight, the board can know
the risks and the strategies for their management.
Risk management means right supervision and monitoring to
confirm that the policies and processes are carried out as per the
management's performance goals and risk tolerances.
Organizational structure does have influence on safety. A
systematic approach to managing safety, including the necessary
organizational structure, accountabilities, policies and procedures is
necessary. The objective of safety management is to prevent
human injury or loss of life, and to avoid damage to the
environment and to property.

Good Culture requires Leadership

Culture never stands still
Long wavelength issue
Gets worse faster than it gets better
One bad decision can be a setback
Everyone must be working together
No industry does it better than others


Good culture indeed requires leadership. To be able to respond and
take appropriate action at enterprise level needs good leadership. It
is known that leadership is crucial in developing a safety culture.
Business environment, technology, legislation are continually
changing. Naturally culture is in a state of flux most times. It is the
leadership that responds to changes. Different levels of leadership
need to cope with change and sustain the organization.
When a culture needs to move towards a safety-oriented focus, it is
a long-term project. Sudden initiative towards the ideal state is not
advisable, as it will not work. Employees need time to get used to
the new methods and systems.
Culture, like any improvement initiative, can get worse
faster than it gets better. If a drop in culture, and
consequently in values, has to be corrected, it takes much longer.
Even one bad decision can be a setback, which can have
organization-wide impact.
For a safety culture the most important factor is that all have to
work together. As we have seen in the safety-culture models,
working together, cohesively, interdependently makes for better
safety and productivity.
Also each industry has its own standard and benchmark for safety
practices. No one industry can do it better than others, as the
parameters are totally different from industry to industry.

Tying it all together

Bennis said about leaders:

They ask:

What's So?

So What?
Now What?
The leaders want to know, they want to learn. That is why they ask
"What's so?" They are willing to challenge the status quo. They have
a propensity toward action, risk, curiosity, and courage. They want
their people to think "Now what?" Once the status quo is questioned,
what would be the next step? How to improve the present
condition? What actions can be taken?
According to Bennis, Leadership is a function of knowing yourself,
having a vision that is well communicated, building trust among
colleagues, and taking effective action to realize your own
leadership potential.
The leaders are eager and willing to make any relevant changes in
the policies to get better results. That is the quality that helps them
carry out changes if required. That is the key to consistently
execute well.
Difference in results is based on right values, a good plan, the
ability to course-correct, and to consistently execute well.


Texas City - 2005

Multiple causes
Complex incident
Small explosion
Trailers too close
Value for technical
The BP refinery in Texas City suffered one of the worst accidents known to
industry. On March 23, 2005, at about 1:20 p.m. local time, there was a
massive explosion at the plant. 15 people died and over 180 were
injured. Many of the victims were in or around work trailers located
near an atmospheric vent stack. The explosion occurred when a
distillation tower flooded with hydrocarbons was overpressurized,
causing a geyser-like release from the vent stack. Apart from these
human losses, property losses and fines were enormous.
The disaster was due to organizational and safety defects at all
levels of the BP Corporation. BP had failed to implement safety
recommendations made before the blast. There were many warning
signs, which went unheeded.
There were multiple causes of the accident.
The Baker panel report found that the BP management had
not distinguished between occupational safety (e.g., slips,
trips and falls, driving safety, etc.) and process safety
(e.g., design for safety, hazard analysis, material verification,
equipment maintenance, process upset reporting, etc.). The
metrics, incentives, and management systems at BP focused
on measuring and managing occupational safety, while
ignoring process safety.


The Texas City BP plant had a very poor safety culture. Over the
years, the working environment was marked by resistance to
change and a lack of trust and motivation. There was no sense of
purpose. Management and supervisors did not ensure that
safety rules were followed. Individuals did not feel confident
enough to suggest improvements.
There were no definite safety priorities set by the
The organization was huge and complex. There were no clear
roles and accountabilities. Internal communication was poor,
especially during handing over duties.
Individuals had no clear concept of hazard awareness and
process safety. Consequently they took high-level risks.
Temporary trailers were placed too close to the hazards.
Given poor communication and a poor performance management
process, there was neither an adequate early warning system for
problems, nor any independent means of understanding the
deteriorating standards in the plant. For example, the alarms
did not work!
Incremental equipment costs were the reasons not to upgrade
to a safer system or replace unsafe equipment altogether.
Cost-cutting, failure to invest and production pressures from
BP Group executive managers impaired process safety
Eight earlier incidents of flammable vapors issuing from the
blow-down vent had not prompted corrective measures. These
were totally ignored.
The incident was very complex. Many interconnecting factors
amplified the intensity of the disaster. Operators started up the
raffinate tower and began filling it with gasoline components. Timely
discharge of the product was not started. Maintenance orders were

When the lack of draw-down from the tower was noticed, the
discharge valve was opened which worsened the problem. After this
everything went from bad to worse exponentially. A geyser-like
emission of hot flammable vapors and liquids was expelled from the
vent stack.
A contractor's new diesel truck parked nearby provided the source
of ignition for the Vapor Cloud Explosion.
The office trailers were parked too close to the process unit. People
were holding a meeting inside oblivious to the chaos. Those sitting
with their backs to the process unit were killed, due to blunt-force
Safety training was woefully inadequate. That was the reason many
mistakes were committed while operators tried to control the
situation, resulting in compounding the hazard.

Underlying Cultural Issues

Business Context
Motivation: Management commitment to safety culture is a great
motivating factor for employees. When there is a certainty of a safe
working environment, the motivation to work is high.
Morale: Safety culture is an absolute morale booster for workers.
Safe working conditions mean less turnover, less time wasted on
training new employees. Productivity also is higher.


When a company focuses on creating a safer workplace, employees
benefit. Attention to safety management results in higher employee
morale. When employees feel safer at work there is less turnover,
which means the company saves money on having to hire and train
new employees. There is also less absenteeism as well as an
increase in productivity.
PAS Score: The Peril Assessment Score is determined by various
elements in process safety. These may include the number of
process safety incidents, OSHA recordable injuries and lost workdays,
incidence rates for employees and contractors, worker fatalities,
and occupational diseases.
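The text does not say how these elements are combined into a PAS score, so no scoring formula is attempted here. As a hedged illustration of just one input, the sketch below computes an incidence rate using the standard OSHA normalization of 200,000 hours (100 full-time employees working 40 hours a week for 50 weeks). The function name and the example figures are hypothetical.

```python
def incidence_rate(cases: int, hours_worked: float,
                   base_hours: float = 200_000) -> float:
    """Recordable cases per 100 full-time workers per year.

    base_hours = 100 employees x 40 hours/week x 50 weeks/year,
    the standard OSHA normalization.
    """
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    if cases < 0:
        raise ValueError("cases cannot be negative")
    return cases * base_hours / hours_worked


if __name__ == "__main__":
    # Hypothetical figures: 3 recordable cases among employees and
    # 2 among contractors, with separate hours worked for each group.
    print("Employees:  ", incidence_rate(3, 400_000))
    print("Contractors:", incidence_rate(2, 100_000))
```

Computing the rate separately for employees and contractors, as the text suggests, keeps a good employee record from hiding a poor contractor record.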

(Process) Safety as a Priority

Environmental and Occupational safety is a cross-disciplinary
exercise concerned with protecting the safety, health
and welfare of workers. A safe and healthy work environment is
their right. It is also imperative to ensure environmental safety.
Many companies are making a bigger effort toward environmental
Environmental and occupational safety can be important for moral,
legal, and financial reasons. Moral obligations would involve the
protection of employees' lives and health. Legal reasons relate to
the preventative, punitive and compensatory effects of laws that
protect workers' safety and health. Occupational safety and health
(OSH) can also reduce employee
injury and illness related costs, including medical care, sick leave
and disability benefit costs.

Organizational Complexity & Capability

Capability is the ability of an organization. Organizational
capabilities are the collective skills and capabilities of its people, its
processes and structures. Information, knowledge, know-how,
understanding, and know-why all contribute to capability.
Complexity can be the characteristic of unusual problems and the
decisions needed to address the issue. Complexity also refers to
complexity of a job. In hierarchical organizations there are distinct
layers of increasing job complexity. These layers have different
work requirements and no two layers will have same job
requirement. These are layers of increasing complexity or different
complexity. Not all organizations are equally complex. Therefore,
not all companies require the same maximum number of
layers. The world's largest corporations, such as GE or GM, have a
total of eight layers of complexity.
In times of economic turmoil, it is exceptionally crucial for
companies to invest in their people. Technologies advance,
processes upgrade, and customers' demands increase! At such
times if people are motivated, inspired, and trained, if workers are
kept up-to-date on all aspects of their work then they will use their
skills productively.
At such times leadership has to be strong. Restructuring and
redundancies may become inevitable but timely, clear and correct
communication can avoid unpleasantness.
In an organization, each layer of management has a span of control.
Span is the number of people reporting to one
manager. A wider span of control means more people reporting to each
manager; a narrow span means fewer people. Both
types of span have pros and cons. Too many layers add complexity, and
organizational effectiveness may suffer. A correct combination of
layers and spans of control will keep costs in check and increase
organizational and decision effectiveness.
A clear line of communication between the layers will improve
coordination and motivation, since employees know what is expected
of them and when.
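
The trade-off between layers and spans can be made concrete with a little arithmetic. The sketch below is an illustration only: it assumes that every manager has the same span of control, which real organizations do not, and the function name and head counts are hypothetical.

```python
def management_layers(front_line_staff: int, span: int) -> int:
    """Count the management layers needed above the front line.

    Assumption (not from the text): every manager supervises up to
    `span` people, and the same span applies at every layer.
    """
    if front_line_staff < 1 or span < 2:
        raise ValueError("need at least 1 employee and a span of 2 or more")
    layers = 0
    group = front_line_staff
    while group > 1:
        # Number of managers needed for this group, rounded up.
        group = -(-group // span)
        layers += 1
    return layers


if __name__ == "__main__":
    for span in (5, 10):
        print(f"span {span}: {management_layers(1000, span)} layers")
```

With these hypothetical numbers, 1000 front-line employees need five layers of management at a span of 5, but only three layers at a span of 10.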

Inability to See Risk

Hazard Identification Skills
Understanding of PSM
Facility Siting
As happened in the Texas City explosion, the risks were great
but the employees could not see them. Because nothing had happened until
then, they assumed there was no risk!
A hazard can be considered as a dormant potential for harm, which
is present in one form or another within the system or process.
Managers and workers should have the skills to identify, not only
the obvious, but also emerging hazards in their day-to-day work.
In fact, the entire concept of process safety management needs to
be understood by all employees. If this knowledge is lacking,
there is a serious problem!
Safety procedures need to provide practical information and
guidance on achieving a healthy and safe work environment. Every
employee needs to know the correct health and safety procedures,
and all employees, including new employees, need to have
access to information about safety procedures.


Occupational health and safety procedures must be implemented
wherever the work is being conducted, be that in an office, factory,
construction site or home.
After the Texas City disaster, process plant operators around the world
have performed facility-siting studies to evaluate the hazards facing
workers in permanent and portable occupied buildings. Better data
and improved facility siting tools are now available to support
process plant managers and safety personnel in evaluating the
hazards and determining the risk to occupants. Extensive work has
been carried out on the development of effective risk and
consequence mitigation plans, including building relocations and
The safety of vehicles in use also needs to be verified. Not just the
mechanical safety, but driving safety procedures for hazardous
materials, parking and other safety regulations should be clear.

Lack of Early Warning

Depth of Audit
KPIs for Process Safety
Sharing of Learning / Ideas
The Baker Panel found that BP had not implemented an effective
process safety audit system for its U.S. refineries. The issues
involved were auditor qualifications, audit scope, reliance on internal
auditors, and the limited review of audit findings.
The principal focus of the audits was on compliance and verifying
that required management systems were in place to satisfy legal
requirements. It does not appear, however, that BP used the audits
to ensure that the management systems were delivering the desired
safety performance or to assess a site's performance against
industry best practices.
Internal communication and the performance management process
were very poor. This led to the absence of an early warning system.
The lack of communication was obvious when the Day Shift Board
Operator was not informed of the faulty redundant high-level
alarm at the beginning of the shift.
The safety audits need to be in-depth. Perfunctory audits lead to
neglect of important hazard indicators. That was the
condition at Texas City.
Safety audits examine management, employee knowledge, program
responsibilities, records and effectiveness. To conduct in-depth
safety audits, a multi-unit team should be established. No employee
should be part of the safety audit of his or her own unit. During the audit
surveys, regulation compliance and detection of unsafe hazards is
Based on OSHA focus areas, Key Performance Indicators for safety
can be established for a particular organization or process. These
focus areas are: Employee Participation, Process Safety Information,
Process Hazard Analysis, Operating Procedures, Training,
Compliance Audits, Trade Secrets, Mechanical Integrity, Hot Work
Permits, Management of Change, Incident Investigation,
Contractors, Pre-Startup Safety Review, and Emergency Planning and
Response.
Keeping corporate strategy in view, the most relevant KPIs can be
decided. The position and function of employees will decide their
KPIs. Once the relevant KPIs are decided, the metrics associated
with them have to be established. Metrics are applied at all levels
and allow drill-down, such as:
Organization -> Site -> Unit -> Hazard
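
As a rough illustration of this kind of drill-down, the sketch below rolls a single metric up and down the Organization -> Site -> Unit -> Hazard hierarchy. The records, the metric (overdue inspections) and the function name are all hypothetical; they only show the idea of one metric viewed at several levels.

```python
from collections import defaultdict

# Hypothetical records: (site, unit, hazard, overdue_inspections)
RECORDS = [
    ("Site A", "Unit 1", "Overpressure", 2),
    ("Site A", "Unit 1", "Toxic release", 0),
    ("Site A", "Unit 2", "Fire", 1),
    ("Site B", "Unit 1", "Fire", 4),
]


def roll_up(records, depth):
    """Sum the metric over the first `depth` levels of the hierarchy.

    depth=0 gives the organization total, 1 gives totals per site,
    2 per unit and 3 per hazard.
    """
    totals = defaultdict(int)
    for *path, value in records:
        totals[tuple(path[:depth])] += value
    return dict(totals)


if __name__ == "__main__":
    for depth, level in enumerate(["Organization", "Site", "Unit", "Hazard"]):
        print(level, roll_up(RECORDS, depth))
```

The same records answer the question at each level, so a high organization total can be traced down to the site, unit and hazard that drive it.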
Hazards are controlled by many risk control systems, or barriers.
However, barriers too can have weaknesses and thus
can fail. When weaknesses in many barriers emerge
simultaneously, a serious incident can result. Reporting that one or more
barriers have failed is a lagging indicator: it is
retrospective and based on an outcome. Monitoring the
strength of a barrier is a leading indicator: it is forward looking.
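
A simple calculation shows why the strength of each barrier is worth monitoring. The sketch below assumes that barriers fail independently of one another, which is a simplification, and the failure probabilities are made-up numbers chosen only for illustration.

```python
import math


def simultaneous_failure_probability(barrier_failure_probs):
    """Probability that every barrier fails on the same demand.

    Assumption (not from the text): barriers fail independently,
    so the individual probabilities can be multiplied.
    """
    for p in barrier_failure_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return math.prod(barrier_failure_probs)


# Three barriers in good condition (hypothetical values).
healthy = simultaneous_failure_probability([0.1, 0.1, 0.01])
# The same system after the strongest barrier has degraded.
degraded = simultaneous_failure_probability([0.1, 0.1, 0.5])

if __name__ == "__main__":
    print(f"healthy:  {healthy:.6f}")
    print(f"degraded: {degraded:.6f}")
```

In this example no incident has occurred in either case, so a lagging indicator would show nothing, yet the chance of all barriers failing together is about fifty times higher once the third barrier has weakened.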
However, no KPI-based actions can bring benefit unless there is open and
effective communication. There has to be sharing of learning and
give and take of ideas. Team members as well as employees across
the organization should be encouraged to share knowledge and
experience. Building on each other's knowledge
can generate even better ideas.
To learn, people need time and a safe environment. They need time
to think about their experience and its implications and incorporate
new insights into their current mental models. They need safety to
explore new ideas and challenge their own assumptions. When they
develop trust and rapport, people can feel safe enough to share
their thinking, the reasons behind their conclusions, the questions
they have about their conclusions, even their half-baked ideas.
When they take time to collectively reflect on their experience, they
can build on each other's ideas and deepen the richness of their
thinking and insights.

Future of Texas City

Culture measurements in place
Increased management attention
Working on systemic issues
Leadership, engagement, work processes
Increased regulatory attention
As of 2011: Too soon to tell
After the disaster and the investigations that followed, the future of the
BP refinery should now look safer. Culture measurement is in place.
The leadership gives a clear message that process safety is important.
They demonstrate this with improved policies and positive actions.
A positive, trusting, and open process safety culture is in place with
the relevant stakeholders.
Management is paying increasing attention to safety procedures.
The leaders have established an integrated and comprehensive
process safety management system that systematically and
continuously identifies, reduces, and manages process safety risks
at its U.S. refineries.
The systemic failures that contributed to the disaster are getting
attention, and the rectification process is under way.
A system has been developed and implemented to ensure that its executive
management, its refining line management above the refinery level,
and all U.S. refining personnel, including managers, supervisors,
workers, and contractors, possess an appropriate level of
process safety knowledge and expertise.
Employee engagement is actively sought. That makes employees fully
involved in their work and motivated to ensure better performance.
Such employees will follow all the safety precautions and ensure
that all safety measures are in place.
The process safety performance metrics are evolving. BP now
monitors at the corporate level several leading and lagging process
safety metrics. BP also is working with external experts to review
process safety performance indicators across the company and the
Apart from the fact that the organization is paying more attention to
safety and the concerned regulations, even the regulating bodies
are stricter and are giving more attention to fulfilling all the
As of 2011 it was difficult to predict the effect of all the measures

DuPont Established 1802

Black powder manufacturer
Safety definitely a part of every decision
DuPont is an American chemical company that was founded in July
1802 as a gunpowder mill, manufacturing black powder.
The company has an ingrained safety culture and every decision,
small or big, is based on safety considerations.

DuPont Core Values

Safety and Health

Core values are the character of a company. These are fundamental
to what DuPont is and what DuPont does, and are viewed as essential to
the firm's sustainable growth.
Safety and Health
DuPont follows the highest standards to ensure the safety and health of
employees, customers and the people of the communities in which
they operate.
Environmental Stewardship
They are environmentally conscious and protect the
environment. Environmental issues are an integral part of all
business activities. They continuously strive to align their actions
with public expectations.
Highest Ethical Behavior
They conduct their business affairs to the highest ethical standards
and in compliance with all applicable laws. They work diligently to
be a respected corporate citizen worldwide.
Respect for People
They foster an environment in which every employee is treated with
respect and dignity, and is recognized for his or her contributions to
the business.

DuPont 1978
Learning from its own tragedies


Tragedies and hazards are best avoided. However, if they do occur,
they should become a basis for future safety and precautions.
Despite being the global benchmark for safety, the company
suffered four major tragedies:
May; Vinyl acetate explosion - $20MM
June; Distillation explosion - $7.2MM, 1 fatality
August; Chlorine cooling/drying train explosion - $3.7MM
October; Ethylene vapor cloud explosion - $15.0MM
These tragedies wiped out complacency, and the company reiterated
its commitment to safety by becoming more vigilant and alert to
any risks. Learning from these disasters, the company made safety
its priority and ensured that it would remain the number one priority
for all time.

Establishes goal of zero.
DuPont believes that a key aspect of human and worker rights is
the right to work in an environment that is safe and healthy. A
strong safety and health focus is the essential foundation for
successfully implementing a culture that seeks to integrate
sustainable development into the processes of the company. Safety
values are also critical in the successful transferring of new
technologies to developing countries. The company supports and respects
the protection of international human rights within its sphere of
influence, including safe and healthy working conditions.


In 1994, as part of a process to increase transparency around
DuPont policy and operations, DuPont adopted a Safety, Health, and
Environmental (SHE) Commitment which clearly stated the 'Goal is
zero for all injuries, illnesses, and incidents' and that compliance
with the Commitment is the responsibility of every employee and
contractor working on behalf of DuPont. The goal also includes zero
impact on the environment, zero waste, and zero use of fossil energy.
By consciously adopting new technologies and forward-thinking science
to re-engineer manufacturing processes, they have managed to bring
harmful emissions down, cut global water consumption by at least
30% in areas where water is scarce, and increase the use of energy
from renewable sources.
There is no compromise on safety: zero compromise!

DuPont in 2010
Engineering organization of 1000
DuPont's core competency is a strong research and
development base.
For more than 200 years, DuPont has brought world-class science
and engineering to the global marketplace through innovative
products, materials and services. Their market-driven innovation
introduces thousands of new products and patent applications every
year, serving markets as diverse as agriculture, nutrition,
electronics and communications, safety and protection, home and
construction, transportation and apparel.
Today, DuPont is proud to build on this heritage by partnering with
others to tackle the unprecedented challenges in food, energy and
protection now facing our world. With global population expected to
approach nine billion by 2050, DuPont is working with customers,
governments, NGOs and thought leaders to discover solutions to
today's toughest challenges.

Strong SHE record

DuPont's Safety, Health and Environment commitment was a major
achievement. They are committed to reducing safety and health
incidents and their environmental footprint. They have achieved a strong
record in these fields and their effort continues. This has helped to:
Lower worker compensation expenses
Command better rates from contractors and insurers
Enhance productivity and dependability of supply
Enhance reputation as a safe, caring and environmentally
aware corporation.

Always working to improve

DuPont completes a sweeping restructuring, divests its energy
business and draws on biotechnology to realize a new vision of
sustainable growth in its third century. The organization is always
working to improve in all spheres.

Comparing Strong Safety Culture Organization (SSCO)
and Weak Safety Culture Organization (WSCO)

SSCO                        WSCO
Safety first core value     Safety is a first priority
Strong core functions       Building core functions
Long term employees         Some long term people
Ethics policy               Ethics policy
Operations focused          Learning ops focus
Executive experience        Executive experience
Investigates incidents      Investigates incidents



As can be seen from the table, there are major differences in the
approaches of the two companies. The basic distinction is the way
safety is considered.
In SSCO safety is the core value. It is not a job description or
a system in practice; it is the value that governs all actions
in the organization. So safety is the basis of all activities, old and
new, in SSCO. It is the essential tenet and requires no explanation
or change. It happens automatically.
In WSCO safety is the first priority, but it is not ingrained in the
culture. It is considered first for any new activity.
At SSCO the core functions are Safety & Health, Environmental
Stewardship, Highest Ethical Behavior and Respect for People.
These functions are strong and through these the company has
grown and become a benchmark for safety. The commitment to
core functions was always there and has never changed. They are
more important now than ever before.


At WSCO the core functions are being built. It is a difficult task, but
they will get to a stage where strong core functions will support all
their endeavors.
Being safety oriented and people focused, SSCO has employees
who prefer to stay for a long time. Most of the employees
are with the company throughout their active careers. In the case of
WSCO the number of such employees is considerably smaller.
Both have written ethics policies.
SSCO is operations focused. The operations must go on with continuity,
and all efforts are directed towards that. At WSCO the focus is forming on
Both have executives with experience. Both are committed to
investigating incidents, including near misses.
The employees at SSCO are at the most effective stage of the
DuPont-Bradley curve: the interdependent stage. Here every employee is
alert to his/her own safety and the safety of others. They are
always alert to hazards and risks to people, processes, property and
the environment. The WSCO employees are at the calculative stage. They
assess risks and respond to them.

Effects on business
Information moves quickly
Short-term focus on profits
Reduced OJT training and development
MOC of everything an issue


International opportunities
Loss of tacit information
Losing reverence for technical accomplishment
If a company has a good safety culture, the business has, most
likely, some of these characteristics:
Strong EHS performance
Smaller fines and fine levels
Better relationships with regulators and communities
Higher product quality
Reduced waste levels and waste treatment costs
More reliable and predictive operations and product outcomes
Strong cost performance
Time available to train people effectively
Able to attract the best performers
Reduced employee turnover
More satisfied customers due to higher reliability of supply
and quality
Improved profitability


Columbia Space Shuttle disaster

How culture can impact the operations of an organization is shown
by the Columbia space shuttle accident.
Space shuttle Columbia, re-entering Earth's atmosphere at 10,000
mph, disintegrates
All 7 astronauts are killed
$4 billion spacecraft is destroyed
Debris scattered over 2000 sq. miles of Texas
NASA grounds shuttle fleet for 2-1/2 years
The rationale for the apathetic response to safety precautions was
"nothing has happened yet!" Concerns of experts about
safety did not reach the top decision makers, as the NASA culture
prevented it.
Past successes had created a can-do attitude that denied the possibility of failure!
Cultural traits and organizational practices detrimental to safety
were allowed to develop, including: reliance on past success as a
substitute for sound engineering practices (such as testing to
understand why systems were not performing in accordance with
requirements); organizational barriers that prevented effective
communication of critical safety information and stifled professional
differences of opinion; lack of integrated management across
program elements; and the evolution of an informal chain of
command and decision-making processes that operated outside the
organization's rules.
The shuttle safety organization, funded by the programs it was to
oversee, was not positioned to provide independent safety analysis.
The famous quote from 1994 by the NASA administrator, Daniel S. Goldin:

"When I ask for the budget to be cut, I'm told it's going to impact
safety on the Space Shuttle ... I think that's a bunch of crap."

What will you stand for as a manager?

As a manager, safety has to be your core value. No safety
violation, however minor or negligible, should be overlooked.
What does it mean? It means that you will not compromise the
safety code of conduct you live by, by your attitude to safety and
human life.


What is a hazard?
"What is a hazard?" is an important question. Human beings can
instinctively perceive minor daily hazards. We will not dip our
fingers in boiling water! However, when we talk of workplace
hazards, we need to define the word more precisely. That leads to
spotting the hazards and taking measures to mitigate them.
There are various definitions of hazards. The dictionary defines
a hazard as an unavoidable danger, even though often foreseeable.
It can also mean something that can cause danger, peril or difficulty.
Another source says a hazard is exposure or vulnerability to injury or
loss of life or limb. It is something likely to cause injury, or an
accident waiting to happen. In relation to occupational safety and
health, the most commonly used definition is "A hazard is a potential
source of harm or adverse health effect on a person or persons."
Hazard means a situation that has the potential to cause harm. The
situation could involve a task, an operation or handling chemicals or
A hazard can be evident, like a fast-approaching vehicle. However, in a
workplace hazards can be less obvious: for example, exposure to
potentially dangerous substances, or working without proper PPE
around a process that involves dangerous chemicals.
Workplace hazards can be mechanical hazards, noise, bad
ventilation, faulty equipment, lack of proper training to use a
machine, misuse, system failures, chemical spills, etc. Most of the
hazards in a workplace can be and need to be identified. Those that
can cause serious harm or damage to people or organizations are
known as significant hazards. These are the ones that need serious


Why do we look?
We look and observe to prevent accidents or at least mitigate the
risk in case the situation is unavoidable.
We need to look out for hazards for our own safety and safety of
other people, property, and equipment. Most of us have intuitive
hazard sensitivity. For example while driving a car, we scan the
road, the traffic, hazard symbols, traffic signals, speed limits, and
other such elements without paying specific attention to any single
one. Only when one of these elements has a potential for hazard do
we pay extra attention. A good driver can see developing hazards
and take measures to mitigate them.
In fact, even if you are just travelling in a vehicle, you can spot
hazards. It doesn't matter if you are sitting on a bus or as a
passenger in a car; you can observe the constantly changing road
situation. There may be many potential hazards in front of your
vehicle; some may develop into serious situations, some won't. It
could be a cyclist, a bend in the road, jaywalkers, or wet, slippery road
patches. We may not even be conscious of observing something and
reacting to avoid a developing hazard.
Similarly we look both ways before crossing a street. By doing this
we can perceive hazards and avoid them. We can for example see a
speeding vehicle and wait to cross thereby avoiding the potential
hazard. That means we do not let a developing hazard turn into an
actual hazard. It is very important to look and observe. What you
do not see and do not respond to may prove dangerous.
When the situation is something that we can control, for example
developing a chemical operation, we look carefully to make the
operation safe. Here we need to look for hazards and try to
eliminate them. Of course, complete elimination of a hazard may not
always be achievable, but the danger can be considerably reduced.
There are many methods of reducing potential hazards, as we shall
see later in this course.
Similarly, while conducting a laboratory experiment, safety can and
should be considered. Here too the situation is under our control
and looking out for hazards will lead to a safely conducted
While playing your favorite sport the safety response becomes
automatic after a while. Avoiding a ball, avoiding a fall or, if that is
not possible, falling so that the injury is minimal, and avoiding other
players and their equipment (like a bat or a golf club thrown
inadvertently) are safety responses. In water sports, the risks are
great and we stay alert to the dangers of drowning and capsizing, and take
precautions by using the required safety gear.
Dehydration is another major hazard that occurs in sports such as
distance running, hiking and soccer.

Looking for Hazards

Method 1: Look for ENERGY
To avoid workplace energy hazards, we must identify and assess
the likely spots and situations with hazard potential. We need to look
for hazards and take measures to eliminate or mitigate them. How
do we do this? The first method is to identify the various energy sources.
Workplace hazards include practices or conditions that release
uncontrolled energy. So to look for hazards, we look for the sources of
energy in the workplace.
Where do we look for such sources of energy? How many kinds of
energy are there? Once that is known, we can look where each is used.
Potential, kinetic, chemical, radioactive, pressure, thermal,
electrical, electro-chemical, sound and nuclear are types of energy
that can create a hazard.
Potential - lifts, cranes, look up
When looking for potential energy, look up: an object that could
fall from a height (potential or gravitational energy), a lift that may
malfunction, or a crane that may fail. Once such an assessment is
done, measures can be taken to avoid or reduce the associated
risks. Ensure the right lift for various activities, carry out proper
maintenance and keep lifts and cranes in good and safe working
Sources of potential energy can also be energy stored in
machinery, weights and springs, pistons under pressure or hydraulic
controls. Such potential energy can be released during work, causing
injury or death. Look for such sources and prepare.
Overhead storage and stacked items also pose a risk. Look up and
ensure safety.
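
To get a feel for how much energy an elevated object stores, the standard relations E = m g h and v = sqrt(2 g h) can be evaluated directly. The sketch below is only an illustration: the function names and the example of a dropped tool are hypothetical, and air resistance is ignored.

```python
import math

G = 9.80665  # standard gravity, m/s^2


def potential_energy_j(mass_kg: float, height_m: float) -> float:
    """Gravitational potential energy E = m * g * h, in joules."""
    return mass_kg * G * height_m


def impact_speed_m_s(height_m: float) -> float:
    """Speed after free fall from height_m, ignoring air resistance."""
    return math.sqrt(2.0 * G * height_m)


if __name__ == "__main__":
    # Hypothetical example: a 2 kg tool dropped from a 10 m platform.
    print(f"energy: {potential_energy_j(2.0, 10.0):.0f} J")
    print(f"speed:  {impact_speed_m_s(10.0):.1f} m/s")
```

For this example the tool stores about 196 J and would arrive at roughly 14 m/s, which is why overhead work and stacked items deserve attention.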
Thermal - extremes, both hot and cryogenic
Thermal hazards are objects or substances that transfer energy as
heat. Substances or materials that release heat are contact and fire
hazards. In addition, some cold substances will absorb so much
heat that they can be thermal hazards. Dry ice and liquid nitrogen
are such thermal hazards. Substances or materials that absorb heat
are contact hazards.
In workplaces where temperature effects in indoor environments
are a risk, it is vital to control these. Heat stress, cramps and fatigue
can result from high temperatures. Dehydration is another hazard.
To avoid hot thermal burns, look for open flames, boiling liquids and
red-hot coils. Some not so obvious places are equipment that is
indirectly heated by other equipment, exposed light bulbs, metal
casing on equipment, heat sinks, and combustible products. It is
advisable to allow equipment to reach a safe temperature before
starting work.
Cryogenic hazards are associated with extremely cold temperatures.
Health hazards associated with cryogenic liquids are frostbite due to
extreme cold, asphyxiation, and toxicity. Flammable gases such as
hydrogen, methane, liquefied natural gas and carbon monoxide can
burn or explode. CO and nitrogen can cause asphyxiation.
The release of compressed gas or steam (pressure and high
temperature) is a hazard. Without adequate pressure-relief devices
on the containers, enormous pressures can build up. The pressure
can cause an explosion called a "boiling liquid expanding vapor
explosion" (BLEVE). Unusual or accidental conditions such as an
external fire, or a break in the vacuum which provides thermal
insulation, may cause a very rapid pressure rise.
Such hazards can be avoided by having a backup device for the
pressure vent, such as a frangible (bursting) disc.
Kinetic - rotating equipment
This is the energy associated with motion or the potential for
motion. Motion hazards are most commonly linked to mechanical
energy, but other forms of movement are hazards as well. The energy of
moving machinery can cause amputations, lacerations, fractures or
even loss of life.
Setting an object into motion or rotation requires that the object be
accelerated, and the energy it acquires, if hazardous,
must be dissipated. Maintenance activities on equipment in
operation can be the source of a kinetic hazard. Special tools may
permit the operator to stay outside the danger zone.
Shutting down the machine is an option. However, there can be
danger in this situation as well from unanticipated motion of a
component of the machine. Movement of the material being handled
can release residual energy within the machine or equipment and
result in the conversion of potential energy to kinetic energy.
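
The energy that rotating equipment must shed before it is safe to approach can be estimated from E = 1/2 I w^2, where I is the moment of inertia and w the angular speed. The sketch below is a rough illustration only: it treats the rotor as a uniform solid disc, and the mass, radius and speed are hypothetical values.

```python
import math


def solid_disc_inertia(mass_kg: float, radius_m: float) -> float:
    """Moment of inertia of a uniform solid disc, I = 1/2 m r^2."""
    return 0.5 * mass_kg * radius_m ** 2


def rotational_energy_j(inertia_kg_m2: float, rpm: float) -> float:
    """Rotational kinetic energy E = 1/2 I w^2, with w in rad/s."""
    omega = rpm * 2.0 * math.pi / 60.0
    return 0.5 * inertia_kg_m2 * omega ** 2


if __name__ == "__main__":
    # Hypothetical rotor: 50 kg disc, 0.3 m radius, 3000 rpm.
    inertia = solid_disc_inertia(50.0, 0.3)
    print(f"{rotational_energy_j(inertia, 3000.0) / 1000.0:.0f} kJ")
```

For this hypothetical rotor the stored energy is roughly 111 kJ. Because the energy grows with the square of the speed, a machine coasting at half speed still carries a quarter of that energy.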
Chemical - reactivity, toxicity
Chemicals have energy that can start fires, cause skin burns, and
generate harmful gases or fumes. To prevent such hazards, loss of
containment must be prevented by identifying sources before
working on a system. Systems must be released, drained or vented
safely before starting work.
The use of toxic chemicals is never to be taken lightly. Accidentally
released, they are potential and frequently actual dangers to human
life and the environment.
If appropriate measures are taken in the first place, most industrial
accidents can be prevented and their effects minimized. This can
be done by labeling hazardous chemical containers
appropriately and clearly. Chemicals known to react with hazardous
consequences must be stored apart.

Radioactive sources
Radioactive sources are a boon for mankind, but can quickly turn
into a bane if accidents happen. Sources can be damaged,
compromised or lost.
Many types of radiation can be found in the workplace and in the
environment. Some are naturally occurring, for example, radon,
radium, uranium, and the sun (ultraviolet rays). Man-made
sources of radiation include X-rays, CAT scans and magnetic resonance
imaging (MRI).
The human body cannot detect radiation. That is why exposure to
radiation can occur unknowingly and pose a health risk. Radiation
burns, cancer and harmful genetic mutations are some of the
aftereffects. Even the waste is damaging to humans, animals and
the environment.
That is the reason all radioactive sources need to be handled with
extreme care.

Pressure differentials, both high and low

Hazards exist within pressure systems because of the stored energy
of the compressed gas and the chemical nature of that gas.
Workplace hazards in high-pressure systems are mainly due to
leaks, pulsation, vibration, release of high-pressure gases and
whiplash from broken lines. Depending on the type and amount of
gas released, the resulting hazard can be fire, explosion or
poisoning of people in the vicinity.
A high pressure differential can occur across a reactor. If there is
plugging and flow drops off due to the high pressure drop (pressure
differential across the reactor), the residence time would
increase and could result in loss of control of the reactor. Most
tanks and many vessels are not rated for full vacuum, so in a condition
where the differential across the wall is too great, collapse can
While looking for pressure hazards, notice sounds, scents and
corrosion, and use soap solutions and leak detectors. Hazardous gas systems
should have a Hazardous Gas sign displayed, and a written shut
down procedure.
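
The link between reduced flow and longer residence time in a plugged reactor, mentioned above, follows from the simple relation residence time = volume / volumetric flow. The sketch below assumes a constant reactor volume and uses made-up numbers; it is an illustration, not a design calculation.

```python
def residence_time_s(volume_m3: float, flow_m3_per_s: float) -> float:
    """Mean residence time = reactor volume / volumetric flow rate."""
    if flow_m3_per_s <= 0.0:
        raise ValueError("flow must be positive")
    return volume_m3 / flow_m3_per_s


if __name__ == "__main__":
    # Hypothetical 2 m^3 reactor: normal flow, then flow after plugging.
    normal = residence_time_s(2.0, 0.010)
    plugged = residence_time_s(2.0, 0.004)
    print(f"normal:  {normal:.0f} s")
    print(f"plugged: {plugged:.0f} s")
```

In this example a drop in flow from 0.010 to 0.004 m^3/s raises the residence time from about 200 s to about 500 s, giving the reaction far longer to proceed than intended.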
Electricity is unavoidable in any workplace. It can kill.
Engineers and electricians work directly with electricity, while others
use it as a service (lights, ACs, computers, etc.). Anyone may
be exposed to electrical hazards.
So here we look for faulty electric appliances, the correct electric
equipment, and proper cables. We should also ensure that electric
equipment used in a flammable atmosphere is properly rated for
that area classification.
Look for a wrong fuse, loose wires, damaged cables and improper
connectors, and have corrective measures taken by the appropriate
personnel. Ensure regular maintenance with appropriate tools is
carried out by trained staff.
Electricity is not to be taken for granted. Proper care is a must.
Check circuits, lock and tag source breakers, replace worn cords
and faulty equipment, don't overload power points, avoid power
tools on metal ladders, keep power cords and extension leads out of
the way and always be alert.


Looking for Hazards

Method 2: Use the three Ps
This method classifies hazards under three Ps: Process, Plant and
People. We then look for hazards systematically under each P. Let us start
with Process.
The first consideration for Process is process selection. There are
frequently options for which process to choose to make a desired
product. Some technologies or process alignments are inherently
safer than others. This should be the first consideration.
Under this heading, look for chemical process, work process,
maintenance process, ruggedness of, quality of, yield of, source of
design, use of standards.
Chemicals have energy that can start fires, cause skin burns and/or
generate harmful gases or fumes. Even worse accidents and
leakages can happen. So look for the process guidelines, process
safety measures, equipment working, fire safety, storage safety,
and other safety elements. It is a good idea to have a checklist of
hazards and preventive measures and go through it often.
Another place to look for hazards is in the processes of a workplace.
Go through each step of the process meticulously, identifying the
hazards at each step. Look at storage, delivery,
dispatch, various process stages, industry standards, legal
requirements, accident records and near-misses, as per a checklist.
Maintenance safety must be considered from the initial plant design
and equipment layout. The servicing and maintenance process for
every detail, including workspace and equipment, has to be perfected.
Maintenance can sometimes pose more danger, especially when
carried out during operation. The maintenance personnel have to be
alert and take the requisite safety precautions.
Look for correct design of the equipment and use the latest
safety-enhanced equipment. Check for reliability of components and plan
for replacement at the end of their useful service life. Look for
critical components and least reliable components.
Check the ruggedness of the construction of the workplace, and
ensure the best acceptable quality of material. Maintain legal and
industry standards for everything.
Many of these hazards are interrelated. So scrutinize the process,
check the layout of the process area, check for equipment standard
and then look for the likely hazards that may be encountered.
Under the Plant heading, look for materials, equipment, age, location,
how it was constructed and designed, level of maintenance, level of capital
Machinery, equipment, simple tools, power tools, instruments, and
office equipment: all these constitute plant. Each one presents its
own hazards, which can include electrical, mechanical and moving
parts, crushing or cutting, fire and explosion, hot parts of plant,
noise. Minor to serious injuries can result from any of these.
So look for possible health, safety and damage effects from the use
of the plant. Who designed the plant, who constructed it? Was it
done as per the design specifications? Did the design and
construction satisfy the plant requirements for safety in all aspects?


Check the hazards faced by operators, visitors, and others. Are the
materials used right, standard and safe? Is the equipment old and
in need of replacement rather than maintenance? One potential risk
area is the lack of insulation of hot piping at levels where personnel
can get burnt.
Look at commissioning, operation, breakdown, repair and
relocation. What kinds of hazard can arise? Look for the likelihood of
entanglement, crushing, cutting, stabbing and puncturing, shearing,
friction, striking, high-pressure fluid, electrical hazards or explosion.
Ensure proper safety precautions.
With respect to People, look for staffing levels, experience, number
away, level of training, supervisory quality, organizational goals,
incentives, communications, shift turnover.
Hazards that people create include lack of attention, wrong
decisions, incorrect techniques, inappropriate equipment, hurrying
through the task, and attempting a task without proper training.
Check if the staffing level is right. Are there the right numbers of the
right people, in the right place at the right time? Too many people
may be costly, and there may be arguments over decisions and duties.
Too few may create overtime and tensions. How many people are away?
Do the operators have the right experience and expertise for the
assigned task? Do they have proper training, and updates when
Are the supervisors correctly equipped in training and authority to
carry out their expected roles? The traditional supervisor
represents a crucial, final link between planning a job and its
execution. In fact, supervision is extremely important in influencing
the performance of the teams concerned. Look for and ensure the right
supervision. Poor supervision may result in accidents.
What are the organizational goals? Are the people aware of them?
Do they have enough incentives to motivate them to work well? It
must be absolutely clear that Safety, Health and Environment are
top priority!
Spoken and written communication can be critical in maintaining
safety. This can include general communications in the form of
safety information, communications between team members or
between different teams during operations or maintenance work,
and emergency communications.
Communications are very important in a wide range of safety
critical tasks and activities such as lifting operations, emergency
response, entry to confined spaces, as well as coordination of
activities between different parties and organizations.
During shift turnover, between shift and day workers, or between
different functions of an organisation within a shift (e.g. operations
and maintenance), communication is crucial. For continuity and safe
working, relevant information has to be properly communicated.

Here is an incomplete list of workplace hazards:

Pneumatic vs. Hydraulic Testing

Operating Issues
Bypassed interlocks
Improper permits
Lack of discipline
Change & Subtle Change
Procedures not followed
Poor visual signals
Human Element (people)
Too many new
Too many untrained
Too many away
Family problems
Electrical Power


Hazard vs. Risk

Hazard and risk are sometimes used interchangeably. However,
there is a difference between the two terms.
Hazard - something that can cause harm. It is the
Consequence of an event.
Risk - the possibility of incurring loss. It is the Probability an
event will occur.
Hazard is the potential harm that can be caused (the
consequences); risk, on the other hand, is the likelihood of harm (the
probability an event will occur).
Hazard is an existing situation, whereas risk is only an anticipated
situation. For example, a steep cliff is a hazard; only when you
begin to climb it does it pose a risk.
Hazards are all around us. A street is a hazard; if you decide to
cross it, it becomes a risk. Unless there is exposure to a hazard, there
can be no risk. So risk is the probability of a harmful event arising
from exposure to a hazard that can have consequences.
Hazard refers to the inherent properties of a substance or a
situation that make it capable of causing harm to human health or
the environment. However, just because that substance or situation
has potentially harmful properties, it does not automatically pose a
risk. Exposure to that hazard will turn it into a risk.
Factors that influence the degree of risk include:
How much a person is exposed to a hazardous thing or
How the person is exposed (e.g., breathing in a vapor, skin
contact), and
How severe the effects are under the conditions of exposure.

So when one looks at risk, particularly business risk, one must take
into account frequency and severity (consequence).
Risk is typically shown by a risk matrix (more later in the course on
this). Suffice it to say that a risk matrix is a simple graphical tool. It
provides a process for combining:
The chance for an occurrence of an event (usually an estimate)
The consequence if the event occurred (usually an estimate)
Risk = Chance x Consequence
Risk Severity = Probability of Occurrence x Potential Negative
(link to Risk Matrix)
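
As a rough illustration of Risk = Chance x Consequence, the sketch below rates both chance and consequence on an assumed scale of 1 to 5 and bins the product into Low, Medium or High. The 5 x 5 scale, the band thresholds and the function names are assumptions made for this example only; they are not taken from this course or from any standard, and a real site would use its own matrix.

```python
def risk_score(chance: int, consequence: int) -> int:
    """Risk = Chance x Consequence, each rated 1 (lowest) to 5 (highest)."""
    for name, value in (("chance", chance), ("consequence", consequence)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return chance * consequence


def risk_band(score: int) -> str:
    """Map a score to a band. The thresholds are illustrative assumptions."""
    if score >= 15:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"


def risk_matrix() -> list[list[str]]:
    """Build the matrix: rows are chance 1..5, columns are consequence 1..5."""
    return [
        [risk_band(risk_score(chance, consequence)) for consequence in range(1, 6)]
        for chance in range(1, 6)
    ]


# Print one row per chance rating so the matrix can be read at a glance.
for chance, row in enumerate(risk_matrix(), start=1):
    print(chance, row)
```

Both inputs are estimates, so the band is best read as a way to rank events against each other rather than as a precise measurement.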

Hazards in the Chemical Process Industries

Have a look at this 9 minute video of the T-2 Incident developed by
the CSB:
While watching this video, list the hazards that come to
mind
Note that a chemical engineer died in this incident



Process Safety Incidents

Why do we need Process Safety Management?
When things go WRONG
Serious lapses in PSM can create havoc, as we shall see in
this lesson. The results of some grim process safety incidents will be
illustrated with the following details:
Picture of the incident
Consequences: Lives, Injuries, Losses
Description of Incident
What PSM system(s) failed?

Flixborough, UK 1974
The chemical plant, owned by Nypro UK (a joint venture between
Dutch State Mines and the British National Coal Board) and in
operation since 1967, produced caprolactam, a precursor chemical
used in the manufacture of nylon.
The Flixborough Disaster was an explosion at the chemical plant
close to the village of Flixborough, England, on 1 June 1974.
Residents of the village of Flixborough were not happy to have such
a large industrial development so close to their homes and had
expressed concern when the plant was first proposed.

The process involved oxidation of cyclohexane with air in a series of
six reactors to produce a mixture of cyclohexanol and
The inquiry into the incident found that a crack had appeared in
reactor number 5. The reactors were filled with liquid cyclohexane
under pressure at 155 °C, through which compressed air was
bubbled to cause the reaction.
The plant was shut down and the reactor, one of a series of six, was
removed and a bypass installed to link reactor numbers 4 and 6.
The temporary bypass would allow continued operation of the plant
while repairs were made. This 50 cm diameter bypass pipe was
designed by Nypro engineers who were not experienced in
high-pressure pipework.
Description of the Incident
The official inquiry into the accident determined that the bypass
pipe had failed because of unforeseen lateral stresses in the pipe
during a pressure surge. The bypass had been designed by
personnel who were not experienced in high-pressure pipework. No
plans or calculations had been produced, the pipe was not
pressure-tested, it was mounted on temporary scaffolding poles that
allowed it to twist under pressure, and it had not been reviewed by
appropriate chartered engineers.
Bellows were used to join the pipe to the 60 cm reactor flanges and,
crucially, because the gravity-assisted reactor series was built on a
slope, the pipe included a dog-leg bend to accommodate the
change in height.

The by-pass pipe was a smaller diameter (20") than the reactor
flanges (24") and, in order to align the flanges, short sections of
steel bellows were added at each end of the by-pass. Under
pressure such bellows tend to squirm or twist.
These shortcomings led to a widespread public outcry over
industrial plant safety, and significant tightening of the UK
government's regulations covering hazardous industrial processes.
(See COMAH Regulations).
During the late afternoon on 1 June 1974 a 20 inch bypass system
ruptured, which may have been caused by a fire on a nearby 8 inch
pipe. This resulted in the escape of a large quantity of cyclohexane.
The cyclohexane formed a flammable mixture and subsequently
found a source of ignition. At about 16:53 hours there was a
massive vapor cloud explosion, which caused extensive damage and
started numerous fires on the site.
This was an early indication that the US would need similar
regulations. OSHA itself had been established in 1971; its PSM standard
(29 CFR 1910.119) followed in 1992.
Any piping in such service needs to undergo a piping and flexibility
analysis to determine whether expansion with temperature has been
properly accounted for in the design. Additionally, the change in
pipe diameter must be accounted for, so that stresses are
acceptable during operation as well as during heat-up and
cool-down. Finally, the bellows incorporated in the system show an
absolute lack of good engineering judgment. A bellows is intended
to accommodate a change in length, but it is not capable of
significant pressure containment.

28 people were killed in the explosion. The number of fatalities
could have been much higher (over 500) had it happened on a
Despite protests from the local community the plant was re-built
but, as a result of a subsequent collapse in the price of nylon, it
closed down a few years later. The site was demolished in 1981,
although the administration block still remains. The site today is
home to the Flixborough Industrial Estate, occupied by various
businesses and Glanford Power Station.
What's Covered by PSM?
Process Safety Information

Mechanical Integrity

Employee Involvement

Hot Work

Process Hazard Analysis

Management of Change

Operating Procedures

Incident Investigation


Emergency Planning & Response


Compliance Audits

Pre-Startup Safety Review

Trade Secrets

What PSM Elements Were Not Followed?

Management of Change (MOC)
MOC requires qualified staff to review changes.
What OTHER Elements of PSM could have helped prevent this

Focus on Technology
Perspective of PSM implications in Process Technology
The objective of this lesson is to evaluate the implications of PSM in process
technology. The chemical industry uses technology in processes to manufacture
chemicals that other industries need. However, technology is always in a
state of flux, evolving and improving. So a business needs to assess and
implement the right technology for optimum performance. Periodic review of
technological advances while evaluating your current products and processes
is a necessity.
Process documentation and other process safety information (PSI) are
crucial for PSM. Such documentation is a must for OSHA and other
government authorities as well as insurance agencies. It also has to be
up-to-date, giving the current state of material balances and energy balances.
Appropriate reactor design and the most fitting reactive chemistry are to be
considered. Quality control and ensuring the purity of incoming materials
and product streams will go a long way in ensuring product quality and
safety of the entire process.
Once again, when changing or modifying either the product or the process,
reviewing all available technology and choosing the one that best fits your
requirements will help.
Proper risk management focuses on normal operations/conditions as well as
abnormal operations/conditions, equipment design, human factors, standard
operating and contingency procedures, maintenance operations, and facility
design and siting.

Management of Change applies to changes in technology that can potentially
have an adverse effect on a covered process. Many process technology
changes may also be categorized as other types of change, but thinking in
terms of technology changes may trigger one to consider the Management of
Change procedure for situations that might not otherwise be considered.
Process Hazard Analysis is an exceptionally important step for any operation.
This means identifying high-risk hazards associated with a chemical process.

At the end of today, you will be able to:

Improve your understanding of the key items in process technology
to look for when evaluating the status of your unit's PSM health

PSM and Process Technology: Why review?

Why does Process Technology need to be reviewed?
Because the technology's functions and features need to be right for
a particular operation, for a particular unit. The technology may be identical,
but each unit is different and its own variables will influence the technology
for that process. These variables are:
Locations: the local weather, the quality of water used
for the process, the quality of air and also the soil will affect the
process to varying extents.
Operations personnel: the operatives also have an impact on the
process. Skills, perception, training and technique differ from
person to person. The environment in which the person uses the
technology and the individual's characteristics and preferences will also
have some influence.
Methods of operation may differ within a unit with shift variation.
The management and leadership styles can impact technology.
People and skills differ.
Business and customer product requirements vary and may change
periodically, even with the same customer.
Are you using the correct technology for your process and product? Has the
technology improved? Are you using current technology? Has the technology
lowered process costs and/or improved the product?
A technology review will answer these questions.
Once the process is finalized, a safety review must be carried out. This review
too has to be tailored to the unit. Taking into consideration all the above
factors, an informed decision has to be made regarding PSM.

Assessing Your Process Technology

How do you assess your Process Technology? The first step is to
examine your product. Check out the following:
Age of process? Age of product?
Commodity chemical? Or newly invented product?
Number of process steps
Kinds of process steps and phase separations
Recycle streams
Reactions and reactors: transformation of matter
Solids handling?
New catalysts or other new internals?
If you have operating history, use it!
Keep your rating simple: Effect on PSM is Low, Medium, or High
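
One way to keep the rating simple is to record a Low, Medium or High rating for each item in the checklist above and then summarize. The sketch below is only an illustration: the example ratings are invented, and taking the worst individual rating as the overall rating is an assumption, not a rule stated in this course.

```python
from collections import Counter

LEVELS = ("Low", "Medium", "High")  # the three ratings suggested in the text


def summarize(ratings: dict[str, str]) -> dict:
    """Count the ratings per level and list the factors rated High."""
    for factor, level in ratings.items():
        if level not in LEVELS:
            raise ValueError(f"{factor!r}: rating must be one of {LEVELS}")
    counts = Counter(ratings.values())
    return {
        "counts": {level: counts.get(level, 0) for level in LEVELS},
        "high_factors": sorted(f for f, lvl in ratings.items() if lvl == "High"),
        # Assumption: the overall rating is the worst individual rating.
        "overall": max(ratings.values(), key=LEVELS.index) if ratings else None,
    }


# Hypothetical ratings for an imaginary unit.
example = {
    "Age of process": "Medium",
    "Number of process steps": "Low",
    "Recycle streams": "High",
    "Reactions and reactors": "High",
    "Solids handling": "Low",
}
print(summarize(example))
```

The list of High factors is usually more useful in discussion than the overall rating, because it shows where the PSM effort should be focused first.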

Thinking Behind the Rating Process: General

Re-examining your product and process acts as a grounding and
communication tool. All the stakeholders then get on the same page. This
rating process builds a common understanding among all the site
personnel, including managers, supervisors, engineers and site operators.
Another advantage is that some silent issues and needs begin to surface, get
attention and get resolved. This helps in bringing the operation to a comfort
level for all concerned. It also helps management to review their thinking
and make necessary changes.
Once the technological process is well defined and the ratings decided, it
assists in formulating the PSM issue. The level of PSM required, the degree
and the path of PSM can then be determined.

Process Age and Product Age/Profitability

When the process in use is tried and tested, it has history! This can be good
or bad. Familiarity with the process can give rise to complacency. It can
breed contempt: "It's a known process, I can do it easily!" Alternatively,
experience of the process can build expertise. Repeated exposure to the
process could help a capable operative introduce enhancements to improve

However, PSM is not an immediate revenue-generating task. Commodities
usually have increased cost pressure, which means fewer staff available to
do the PSM review. There may be pressure to get the PSM over with quickly
or even to skip it altogether.
Another issue is that hardware/units do not have reinvestment economics;
the plant is not rebuilt in the case of an explosion. Management is certainly
more concerned with finances and can be quite distracted by money
troubles. Incentives for correct outcomes may be wrong.
If new products are introduced, they may hold unrecognized and untested
hazards. This will mean more time spent on non-profit activity like hazard

Numbers and Kinds of Process Steps

The number of steps in a process generally indicates the nature of the

More process steps usually means more complexity, but not

If there is a higher number of separations in a process, it usually means that
there is more energy being fed into the process. Energy usually comes in the
form of steam, Dowtherm or cryogenic materials. Each of these can be
inherently dangerous in itself.
So more process steps can mean increased potential for leaks and additional
corrosion, and also increased thermal stresses with thermal cycling.

Number of Recycle Streams

PSM is harder to execute when there are many recycle streams between
units. A higher number of recycle streams means increased complexity in
startup, shutdown and normal operations. This happens because the units
become interdependent. If even one unit has an upset, it can directly affect
other units in the process.
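
The interdependence created by recycle streams can be pictured as a graph in which each stream points from one unit to the next. The sketch below is a minimal illustration: the unit names and the dictionary layout are hypothetical, and "affected" here simply means "reachable by following streams", which ignores how strongly an upset actually propagates.

```python
from collections import deque


def affected_units(streams: dict[str, list[str]], upset_unit: str) -> set[str]:
    """Return every unit reachable from upset_unit by following streams."""
    seen = {upset_unit}
    queue = deque([upset_unit])
    while queue:
        unit = queue.popleft()
        for downstream in streams.get(unit, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return seen - {upset_unit}


# Hypothetical three-unit train: feed prep -> reactor -> separator.
without_recycle = {"feed_prep": ["reactor"], "reactor": ["separator"]}
# The same train with a recycle stream from the separator back to feed prep.
with_recycle = {**without_recycle, "separator": ["feed_prep"]}

print(affected_units(without_recycle, "separator"))
print(affected_units(with_recycle, "separator"))
```

Without the recycle stream, an upset in the last unit reaches nothing else in this model; with it, the same upset can reach every unit in the train.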
At such times, for such processes, operations communication becomes
critical and alarm management needs more attention. Also, during shift
handover, communication needs to be absolutely

Reactors and Reactions

Process safety requirements around reactors are critical, and the more
reactors one has, usually the more difficult it is to perform a good Process
Hazards Analysis, or PHA. The reasons for such a situation are many.
Reactions have many variables, and the PHA needs to account for the
differences. More reactors simply multiply the variables and the consequent
steps for PHA. Also pressures can build very quickly during a reaction, which
needs to be accounted for.
If heat transfer is interrupted or disrupted in some way, it can
prove very dangerous. Temperatures can then exceed material
tolerances, leading to a catastrophe. Relief system design is complex,
and the complexity increases with multi-phase flow.
Energy is usually generated or consumed in a reaction, and the likelihood of
hazards increases when energy is transformed or generated. Obviously,
more reactors imply more difficulty in keeping energy under control. This is
because the control systems themselves are critical and complex.
Selection of reactor type can affect the PHA, due to the amount of material
present at any given time (CSTR, Plug Flow, Fluidized Bed).

Solids Handling
Process safety requirements are more complex when solids handling is
present in a facility.
Solids can be notoriously difficult to characterize, and
are often looked upon as harmless, yet they can be explosive or poisonous
under some conditions. Solids in the process can cause wear, poor
performance and blockages in the equipment, which may lead to expensive
shutdowns.
See this video of what can happen when common sugar is manufactured and
is mishandled:

New Catalysts and New Internals

The performance of reactors depends upon the catalyst and also the design of
their internals. A new catalyst may require redesigning the internals. The
feasibility of using new catalysts needs to be considered.
Rules of thumb:
You need to verify "new and better" against established PSM protocols. You
also need to verify interrelatedness. Does the new catalyst mean higher yield?
Or is it just a variation of the old one? If you choose to use a new catalyst, the
questions to ask are:
Can the reactor coolant system handle raised temperatures?
Will the metallurgy work?
Will the relief system design still be functional?
Are operating margins eroded?

Operating History
In order to ensure your Process Hazards Assessment is set up for success,
collect records from the current operations. You have to:
Note repeated excursions outside of safe operating limits from your
data historian
Interview operators and ask them which operations are particularly difficult
from their standpoint and why
Examine poor quality product and note conditions under which it
Examine any shift related abnormalities
Verify the status and the correctness of your operating procedures
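
Spotting repeated excursions in historian data can be as simple as counting how many times a reading leaves the safe operating window. The sketch below assumes the data has already been exported as a plain list of readings; the temperatures and limits are made up for illustration, and a real historian export would also carry timestamps and tag names.

```python
def count_excursions(readings: list[float], low: float, high: float) -> int:
    """Count separate excursions outside the window [low, high].

    Consecutive out-of-limit samples are treated as a single excursion.
    """
    excursions = 0
    outside = False
    for value in readings:
        now_outside = value < low or value > high
        if now_outside and not outside:
            excursions += 1  # a new excursion starts at this sample
        outside = now_outside
    return excursions


# Hypothetical reactor temperatures (deg C) exported from a data historian.
temperatures = [150, 152, 158, 161, 163, 157, 149, 144, 151, 162, 155]
print(count_excursions(temperatures, low=145, high=160))
```

Counting excursions rather than out-of-limit samples avoids overstating one long upset as many events, which matters when the aim is to find operations that go wrong repeatedly.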

Key Documentation
Key documentation in the process technologies falls into four distinct groups:
Process flow diagrams: the fundamental material and energy
balances and flow rates in your plant
Piping and instrument diagrams: listing pipe codes, valve and
instrumentation location and types
Single line electrical diagrams: these define power sources to key
equipment pieces
Major equipment written specifications
Before one even starts a Process Hazards Review, this documentation must
be in hand and verified.

Worrying About the Molecules and PSM

If your unit has a reactor, or has reactive chemicals as part of your
inventory, you must pay careful attention to the molecules and their
properties. Additionally, be alert to impurity levels in your raw materials,
product streams and wastes. All of these streams, as well as the recycle
streams, need to be part of your process flow diagrams and material and
energy balances.
If you are not sure about the chemistry, get the detail from a knowledgeable
source and verify with at least a verbal conversation. Two people are dead in
the US State of Florida because they took reactive chemical information off
the Internet and did not appreciate what they were dealing with. While the
Internet is a great source of information, it is a ONE WAY source, and it is
Buyer Beware.

The Process Hazard Analysis - PHA

The identification of hazards and their analysis is the heart of PSM. It is
a detailed procedure to identify, evaluate and control process hazards
involving dangerous chemicals. We will discuss much more about hazard
analysis techniques in later chapters; however, the key points about
Process Hazards Analysis are:
Systematic process to analyze the potential hazards in a given unit.
To effectively perform a PHA, a multi-disciplinary team is needed. The
team should include experts in: the PHA process itself (often a safety





manufacturing, and others (i.e. your customer for a third party sale).
Using the P&ID, the expert team goes from node to node in the
process looking at possible hazards (PHA meeting).
The same methodology is used until the whole process has been
There will be follow-up items (calculations, analyses, additional
information) that must be completed outside of the PHA meeting. The
PHA is complete when the action items are closed.
Everyone on the PHA team signs off on the completed PHA.
PHAs are important safety documents for a facility.
Verify the government regulations around frequency of review and other
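
The rule that a PHA is complete only when the whole process has been reviewed and the action items are closed can be tracked with a very small record. The sketch below is a hypothetical illustration: the class names, fields and node names are invented here and do not represent any particular PHA software or a required record format.

```python
from dataclasses import dataclass, field


@dataclass
class ActionItem:
    description: str
    closed: bool = False


@dataclass
class PHA:
    nodes: list[str]  # P&ID nodes to be reviewed
    reviewed: set[str] = field(default_factory=set)
    actions: list[ActionItem] = field(default_factory=list)

    def review_node(self, node: str, follow_ups: tuple[str, ...] = ()) -> None:
        """Record that a node was reviewed and log any follow-up items."""
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.reviewed.add(node)
        self.actions.extend(ActionItem(text) for text in follow_ups)

    def is_complete(self) -> bool:
        """Complete only when every node is reviewed and every action is closed."""
        all_reviewed = self.reviewed == set(self.nodes)
        return all_reviewed and all(item.closed for item in self.actions)


pha = PHA(nodes=["feed pump", "reactor", "condenser"])
pha.review_node("feed pump")
pha.review_node("reactor", follow_ups=("Check relief valve sizing calculation",))
pha.review_node("condenser")
print(pha.is_complete())  # the follow-up item is still open
pha.actions[0].closed = True
print(pha.is_complete())
```

Team sign-off is deliberately left out of this sketch; it could be added as one more condition in `is_complete`.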

Management of Change - Technology

In the Process Safety Management System, one of the most difficult
elements to understand and to work on is Management of Change.
Not all changes are alike. There are three types of change:
o Technology
o Facilities
o Personnel
The technology elements have been discussed today.

Focus on Facilities
Perspective of PSM implications
Pressure vessel design
Control system
Safety instrumented systems
Relief systems
Pressure vessels are designed to operate safely at a specific
pressure and temperature, technically referred to as the "Design
Pressure" and "Design Temperature". A vessel that is inadequately
designed to handle a high pressure constitutes a very significant
safety hazard. Therefore pressure vessels are designed with great
care because rupture of pressure vessels means an explosion which
may cause loss of life and property.
A control system is a device, or set of devices to manage, command,
direct or regulate the behavior of other device(s) or system(s).
Industrial control systems are used in industrial production.
A Safety Instrumented System (SIS) consists of an engineered set
of hardware and software controls which are used especially on
critical process systems. SISs are specifically designed to protect
personnel, equipment and the environment by reducing the
likelihood (frequency) or the impact severity of an identified
emergency event.
Effective pressure relief and flare system design helps companies
meet risk-management goals, compliance requirements, and sound
business practices.

PSM implications of pressure vessel design, control systems,
SIS, relief systems design, and maintenance are sometimes quite
obvious, but usually subtle. We will cover these areas at a high level
today and bring the connections more clearly into focus.

At the end of today, you will be able to:

Understand the connection of PSM to facility design, operation, and

PSM and profitability

Facility design, maintenance, and normal operating procedures will
make a difference in PSM and profitability, for better or worse.
PSM will impact the profitability of your company. If done well the
cost of process safety will be low and its impact on the bottom line
will be negligible. And, the world will know that your company is
safe. However, if not done properly (and history shows this with
glaring clarity) the impact will be more widely known than you can
imagine and may bankrupt your company. So, no pressure.
Implemented properly, process safety and operations risk
management principles and systems can be effective in increasing
not only the safety of your operation, but its productivity, cost
efficiency, and quality as well. In fact, world-class PSM performance
has become a competitive differentiator in many industries.
Organizations that invest in workplace safety and health can expect
to reduce fatalities, injuries, and illnesses. This will result in cost
savings in a variety of areas, such as lowering workers'
compensation costs and medical expenses, avoiding OSHA penalties,
and reducing costs to train replacement employees and conduct
accident investigations. In addition, employers often find that
changes made to improve workplace safety and health can result in
significant improvements to their organization's productivity and
financial performance.

I am a Chemical Engineer, why should this matter to me?

You may design equipment
You may operate equipment
You may need to maintain equipment
You may re-design equipment
You must understand the operation
So, let's begin. "I'm a chemical engineer; why should I even think
about those mechanical aspects of the business?" During the course
of your career you may have the opportunity to design equipment
and processes. You may also, at some time, operate that equipment.
If you operate it, you certainly will need to maintain that equipment
if it's going to work for you in the long run. At some point in time, if
flaws are found in equipment or a more economical method of
running it becomes apparent, you may well have to RE-design
equipment. That can get tricky if it was put in service years before
and the original design memos have become unavailable.
Bottom line however, in any of these roles where your career takes
you, you MUST understand the operation and fundamentals of the
pots and pans to do well. PSM must become second nature to how
you do your job.

During the course of your career you may find yourself in lots of
different roles. For example, maybe you go on to be a design
engineer. You have a responsibility to ensure that the equipment
you design complies with regulatory laws. You may find yourself in a
Production Team Leader role, in which case you have a responsibility
to operate the equipment within the regulatory requirements. For
example, relief devices are added as a last-resort safety device. This
doesn't mean that, just because the relief device will pop, you can
intentionally run the vessel at pressures higher than regulated, nor
can you bypass the safety devices.
In plants it has actually happened that operators put a blank flange
in front of the rupture disk because they were tired of it popping all
the time. You may find yourself working as a reliability engineer in
the maintenance organization. In this case you are responsible for
ensuring that the equipment is maintained properly. You need to
understand what tests or inspections are required by local, state
and federal agencies and ensure these tests are completed on time
and any deficiencies detected are corrected immediately.
You might also find yourself working as a process engineer. This is
the group, in my opinion, who have to watch out for process safety.
Many times a process engineer doesn't understand or make the
connection to how what seems like a simple re-design or
modification to the process or equipment can impact a regulated
piece of equipment. Simple changes in the process, like pressure or
temperature, can result in operating a piece of equipment outside of
its design and regulated parameters.
Bottom line is that you must understand the operation and that
means more than just the process.
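
A proposed change in operating conditions can be screened against the design basis of the equipment before it goes any further. The sketch below is a simplified illustration: the class, the field names, the units and the numbers are all hypothetical, and it checks only upper limits on pressure and temperature, whereas real equipment has other limits as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignLimits:
    """Design basis of a vessel (hypothetical fields and units)."""
    design_pressure_barg: float
    design_temperature_c: float


def exceeded_limits(
    limits: DesignLimits, pressure_barg: float, temperature_c: float
) -> list[str]:
    """Return the names of the design limits that the proposed conditions exceed."""
    problems = []
    if pressure_barg > limits.design_pressure_barg:
        problems.append("pressure")
    if temperature_c > limits.design_temperature_c:
        problems.append("temperature")
    return problems


# Hypothetical vessel and two proposed operating points.
vessel = DesignLimits(design_pressure_barg=10.0, design_temperature_c=200.0)
print(exceeded_limits(vessel, pressure_barg=8.0, temperature_c=180.0))
print(exceeded_limits(vessel, pressure_barg=8.0, temperature_c=215.0))
```

An empty list does not make the change safe on its own; it only means these two limits are respected, and the change would still go through Management of Change.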


What should you know?

You must understand the concepts and principles that
govern what is going on inside the equipment.
You must anticipate what could go wrong from a people
perspective as well as an equipment perspective
Fundamentals: always look at the issues from a
fundamentals perspective
What you are learning today is a great beginning that should never
stop. That means that to do well, you must understand what is
going on in every piece of equipment, every possible upset or
corrosion mechanism. Not just what could happen, but why? That
includes mechanical failure and its causal factor, which could be a
simple failure of the people that operate it. So, to do well, anticipate
what could go wrong and put systems in place to prevent it. That's
the key: anticipate and be proactive. The best managers and
engineers are the ones who have systems in place that let them
appear to be coasting. The ones who are always fighting fires are
the ones without proactive systems in place.
To get your own systems you must always look at your issues from
a fundamentals perspective, both equipment as well as a people

This is a photo of a reactor agitator shaft coupling. The nuts on the
studs loosened, allowing the coupling to separate. When the
coupling separated, the lower half of the agitator began to whip
inside the vessel, allowing the agitator shaft to contact the baffles
and vessel wall. This was a result of incorrectly sized fasteners
installed in the coupling when it was assembled. So you also have to
anticipate that construction may not assemble it correctly either.

This is the inside of that vessel. You can see the damage to the
Teflon liner caused by what seems like a simple loose bolt. In this
particular case the vessel itself was not a coded vessel but rather
contained a highly hazardous chemical that would have been fatal
to anyone exposed to its leaking contents. So in this case the
equipment itself was not regulated but the process chemical within
it is PSM covered.

Design anticipations
Design should anticipate maintenance
Design should anticipate inspection
Design should anticipate startup/shutdown
Design should anticipate unsteady state operation
When you design equipment, knowing what goes on inside you can
anticipate the high corrosion areas and install inhibitors and
neutralizer addition points. Whoever you work for will have
guidelines to follow, but do not blindly follow them. Ask questions;
understand why you do what you do and think it through based on
the chemical engineering fundamentals. Similarly, you should ensure
that your design anticipates inspection, both on-line and off-line.
You know what you expect the design to do and on-line
inspection can help you get the assurance that, in fact, that is
exactly what is happening.
During start-ups of units, the preparation of the unit and the startup
sequence will necessarily mean that the unit runs differently than at
steady state. Make sure you anticipate that from an equipment as well
as a people perspective. The last thing you want to do is design a
piece of equipment that you can't easily start up. So, again, you
anticipate. Will water be a problem? Where will it move to and from?
How will it be eliminated from the system? How can you verify? Get
the point?
Unsteady state operation is pretty similar to startup. But, and
here's the big difference, it will start from steady state operation.
How does that get recognized by the operators and how can they
recover? If you anticipate that and build that recognition into the
design you will be rewarded by a unit that just may run.

Inspections only occur on vessels that have been in service, right?
There is no need to inspect a brand new vessel, right? This is an
example of a coded pressure vessel that had just been
manufactured and delivered to the site. The vendor was supposed
to perform all preliminary quality assurance checks prior to delivery.
The on-site API inspector chose to inspect the vessel prior to it
being put into service and found numerous spots like the one shown
where the liner had already failed. Had this vessel been placed in
service, not only would a premature failure have occurred but, being
a pressure vessel, the potential for it to have been a catastrophic
failure was very high. So this is an example of where the inspection
before the vessel was actually placed in service was absolutely

This is the same vessel as in the last photo and you can see the
additional contamination to the bottom head. Again this was
detected before the vessel had ever been placed in service.

What goes on and where

What are potential contaminants?
What are potential corrosives?
What effect does water have?
How do they affect the process?

How do you recognize early?

How do you mitigate?
We mentioned what happens where! You need to think about it.
Contaminants are always a problem so you need to anticipate what
happens if they get through. How do you recognize and mitigate?
Build that into your design. The same goes for corrosives and water.
They will at some time be where you don't want them to be.
Here is an example that happened to a chemical engineer while in
operations. She was running a unit where she had re-trayed the
debutanizer tower a few years previously. It worked very well
initially. When she took over the unit the separation was not what
it should have been and the delta p (pressure drop from stage to
stage) was just a little lower than it should have been. Some other
strange unexplainable things were also noticed.
One early morning she was sitting in the control room and saw a
maintenance message come up that instructed the operators to
inject water into some exchangers up stream of the debutanizer
that should never have water intentionally injected while the unit
was on-line. She stopped them from injecting the water and asked
for an explanation. The answer was that a previous boss had
experienced pluggage and thought that an online water wash would
do the trick. Well, it cleared up the pluggage from improper
regenerations of the unit, but the debutanizer was not designed for
water in its feed and the water could only get out of the system by
the partial pressure effect while refluxing inside the tower with trace
chlorides. So, that meant HCl going through the dewpoint and
vaporizing over and over until the partial pressure effect removed it!
Big time corrosion!
So, the learning for this chemical engineer was: anticipate, look
for data that doesn't add up, and then think about how to mitigate
while on stream. In this case she needed to shut down the unit and
re-tray the tower as well as fix corrosion damage.


Acid carryover in piping has led to corrosion of welds and heat
affected zones, leading to several leaks. Pipe code was changed to
P91 to replace stainless steel filler with Hastelloy C276. The pipe /
flange material of construction was changed from 304L to 316L.

Pressure vessel design

o Exothermic
o Endothermic
Distillation tower
Heat exchanger
Settler flash drum etc.
Moving on to pressure vessel design, is it a reactor? Is the reaction
endothermic or exothermic? If endothermic, usually there is no
problem since the reaction rate slows down as the process moves
through the reactor. That is, of course, unless a contaminant causes
the reactor to become exothermic.
But if it's an exothermic reaction, the reaction rate goes up as the
temperature goes up. So, what is in place to
ensure that the reaction is controlled, how do you remove heat,
how do you ensure that the reaction does not become autogenous?
All of these issues need to be addressed in the design phase and
not after the unit has had a process safety incident. So, understand
the reaction kinetics, is the reaction regime stable or at a plateau?
Where the unit will run? Critical knowledge if you are to do your job
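As a rough illustration of why the exothermic case needs attention in the design phase, the sketch below compares the heat a reaction generates with the heat a cooling system removes as temperature rises. The rate constant follows the Arrhenius equation, k = A * exp(-Ea / (R * T)), while heat removal through a jacket grows only linearly with temperature. Every number here (pre-exponential factor, activation energy, heat release, jacket UA, coolant temperature) is a hypothetical value chosen only for illustration, and reactant depletion is ignored.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def heat_generated(temp_k, pre_exp=1.0e10, activation_energy=80_000.0,
                   heat_release_j=1.0e5):
    """Heat release rate in W for an illustrative exothermic reaction.

    The rate constant (1/s) follows Arrhenius: k = A * exp(-Ea / (R*T)).
    All default values are hypothetical.
    """
    k = pre_exp * math.exp(-activation_energy / (R * temp_k))
    return k * heat_release_j


def heat_removed(temp_k, coolant_temp_k=300.0, ua=500.0):
    """Heat removal rate in W through a cooling jacket: UA * (T - T_coolant)."""
    return ua * (temp_k - coolant_temp_k)


def cooling_margin(temp_k):
    """Positive when cooling can keep up, negative when the reaction outruns it."""
    return heat_removed(temp_k) - heat_generated(temp_k)


for temp_k in (320.0, 360.0, 400.0, 420.0):
    status = "cooling keeps up" if cooling_margin(temp_k) > 0 else "reaction outruns cooling"
    print(f"{temp_k:.0f} K: generated {heat_generated(temp_k):.0f} W, "
          f"removed {heat_removed(temp_k):.0f} W, {status}")
```

With these made-up numbers the cooling keeps up at the lower temperatures but falls behind at the highest one, which is the kind of question (where will the unit run, and how close is that to the point where heat removal is lost?) the design has to answer.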
Moving on to a distillation column, this should be simple, but again
remember the contaminants, corrosives, and water. What to do
with them is the key. Even if they are not supposed to be there, at
some point in time they will be, so anticipate and design for it. It's
cheap when it's on paper; when it is steel and concrete the costs
Heat exchangers are mentioned since early last year a refinery in
the Northwest had an explosion due to a failed heat exchanger that
took the lives of an entire crew. Proper inspection and maintenance
would probably have prevented that tragedy.
Settlers and flash drums are included since they have a high probability
of having water at an interface and hence increased chances of
corrosion. Keep those possibilities in mind when designing,
operating, and maintaining them.


Operating procedures
Emergency Shutdown
Startup after an emergency shutdown
Routine operations
This will be an overview of procedures, more later in the PSM
course, but to get you thinking about their importance, detailed sets
of instructions for these areas are the basis for a smooth running
The most critical time in a unit's life from a process safety
standpoint is start up. 80% of the process safety incidents occur
during startup and 80% of the most serious events occur during
quickie startups after an unexpected shut down. So that is where
patience and knowing exactly where the unit stands is the first step
to a successful start up. Everyone is in a hurry to start the unit up,
but the smart managers will make their haste slowly. Never ever
forget those thoughts: when starting up a unit after an unexpected
shut down, make your haste very slowly and methodically.


This is a photo of the oil reservoir of the gearbox. If you look closely
you can see the oil looks very light in color and not as viscous as
you would expect oil to be. In this particular case water made it to
the reservoir because of poor operating practices. The operator
decided to flood the vent system with a high-pressure water hose to
clear a plugged vent line. Needless to say, not following proper
procedures resulted in a very costly failure.
Rust was present on the gear teeth and shaft when inspected. It is
safe to say that when this system was designed the engineer did not
anticipate that an operator would use a high-pressure water hose to unplug a vent

Control systems
Anticipate steady state operation
What about start ups
Fail safe positions
Automatic actions
Safety Instrumented Systems


Control systems are what make the unit run day to day. Automatic
controls are best at maintaining steady state conditions. We will not
be going into how control systems are designed, but address their
function from a high level on what they do.
Some control loops are linked to other control loops in some fashion.
They can be on ratio control, reflux control, and so on. The key
issue is that control loops are designed to run in steady state
operation. That means that during startups the control loops must
be put on manual and adjusted by operators as conditions change
during the startup. Here's the rub: the operators just might get
distracted during startup. All units have alarms in place to let the
operators know when conditions are outside of the expected, but
again, during startups everything is outside of expected conditions.
Clearly then startups are the time when attention to details and
knowing just what goes on and where is critical.
Fail-safe positions are just what they sound like. If all else fails the
unit will go into a shut down and failed safe position. Anyone can
easily shut down a unit by simply removing power to the control
system and all of the valves will safely go to their fail-safe position.
We design units to fail safe. It cannot be emphasized enough that
documenting what the fail-safe position is, and why it is that
position, is sacred. It must always be clearly documented and easily
retrieved if you do your job well.
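One way to picture "clearly documented and easily retrieved" is a simple register that records each valve's fail-safe position together with the reason for it, plus a check that flags anything undocumented. The sketch below is only an illustration: the valve tags, services and reasons are hypothetical, and a real plant would keep this information in its own document control system.

```python
# Hypothetical register: valve tag -> fail-safe position and the reason for it.
FAIL_SAFE_REGISTER = {
    "FV-101": {"service": "reactor feed", "position": "closed",
               "why": "stops feed to the reactor on loss of air or signal"},
    "TV-205": {"service": "reactor cooling water", "position": "open",
               "why": "keeps cooling on the reactor on loss of air or signal"},
    "PV-310": {"service": "vent to flare", "position": "open", "why": ""},
}


def incomplete_entries(valve_tags, register=FAIL_SAFE_REGISTER):
    """Return the tags that are missing from the register or lack a position or reason."""
    missing = []
    for tag in valve_tags:
        entry = register.get(tag)
        if entry is None or not entry.get("position") or not entry.get("why"):
            missing.append(tag)
    return missing


# PV-310 has no documented reason and LV-400 is not in the register at all.
print(incomplete_entries(["FV-101", "TV-205", "PV-310", "LV-400"]))
```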
As control systems and computers become more sophisticated the
control systems can be designed to have automatic actions. Say, if
a piece of equipment shuts down and a spare is available, that
spare can be set to automatically startup. Virtually every unit has
spare pumps set up in this manner. Similarly compressors can be
set up to do the same. This helps with reliability of the unit and
prevents major unit shutdowns.


Having said that, when a piece of equipment is out of its normal
operating range alarms will sound. If you read about the incident
that occurred some time ago at the nuclear facility on Three Mile
Island, an overload of alarms caused operators to miss some critical
alarms and a very serious situation developed. A simple alarm
management technique of having the unit computer group related
alarms could have helped avoid the incident. What is meant by
that is if a certain piece of equipment malfunctions, the alarms that
will be triggered can be easily predicted and defined. Then, if that is
programmed into the unit's computer, when that piece of
equipment malfunctions the computer can offer a group
acknowledgement. Then, when those alarms are acknowledged, the
unit operators can see what other alarms are triggered and the
other malfunctioning equipment can be addressed. This type of
alarm management can make life on the units become considerably
less stressful.
In essence, this is what a safety instrumented system entails.
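The group acknowledgement idea described above can be sketched in a few lines. This is a minimal illustration, not how any particular control system implements it: the malfunction name, the alarm tags and the mapping between them are all hypothetical.

```python
# Hypothetical map: equipment malfunction -> alarms it is expected to trigger.
PREDICTED_ALARMS = {
    "feed_pump_P101_trip": {
        "FAL-101 low feed flow",
        "PAL-102 low discharge pressure",
        "LAL-110 low drum level",
    },
}


def group_acknowledge(active_alarms, malfunction):
    """Acknowledge the alarms predicted for a malfunction in one action.

    Returns (acknowledged, remaining). The remaining alarms are the ones the
    operators still need to look at individually.
    """
    predicted = PREDICTED_ALARMS.get(malfunction, set())
    acknowledged = active_alarms & predicted
    remaining = active_alarms - predicted
    return acknowledged, remaining


active = {
    "FAL-101 low feed flow",
    "PAL-102 low discharge pressure",
    "LAL-110 low drum level",
    "TAH-205 high reactor temperature",
}
acknowledged, remaining = group_acknowledge(active, "feed_pump_P101_trip")
print("Acknowledged as a group:", sorted(acknowledged))
print("Still needs attention:", sorted(remaining))
```

Once the three predicted alarms are acknowledged as a group, the one alarm that does not belong to the pump trip stands out instead of being buried.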

Safety Instrumented Systems

Process Control system
Risk reduction
Logic solver
Support systems
A Safety Instrumented System (SIS) is a form of process
control usually implemented in industrial processes. The SIS
performs specified functions to achieve or maintain a safe state of
the process when unacceptable or dangerous process conditions
are detected. Safety instrumented systems are separate and
independent from regular control systems but are composed of
similar elements, including sensors, logic solvers, actuators and
support systems.
The specified functions, or safety instrumented functions (SIF) are
implemented as part of an overall risk reduction strategy which is
intended to reduce the likelihood of identified hazardous events
involving a catastrophic release. The safe state is a state of the
process operation where the hazardous event cannot occur. Most
SIF are focused on preventing catastrophic incidents.
The correct operation of an SIS requires a series of equipment to
function properly. It must have sensors capable of detecting
abnormal operating conditions, such as high flow, low level, or
incorrect valve positioning. A logic solver is required to receive the
sensor input signal(s), make appropriate decisions based on the
nature of the signal(s), and change its outputs according to
user-defined logic. The logic solver may use electrical, electronic or
programmable electronic equipment, such as relays, or
programmable logic controllers. Next, the change of the logic solver
output(s) results in the final element(s) taking action on the process
(e.g. closing a valve) to bring it to a safe state. Support systems,
such as power, instrument air, and communications, are generally
required for SIS operation. The support systems should be designed
to provide the required integrity and reliability.
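The chain described above (sensor, logic solver, final element) can be sketched as follows. This is a teaching sketch under stated assumptions, not a real SIS design: the valve tag, the level readings, the trip point and the "trip if any sensor is high" logic are hypothetical choices, and real systems are engineered to the applicable standards.

```python
from dataclasses import dataclass


@dataclass
class ShutoffValve:
    """Final element: a valve that the logic solver can drive closed."""
    name: str
    is_open: bool = True

    def close(self):
        self.is_open = False


def logic_solver(level_readings_pct, high_trip_pct, valve):
    """User-defined logic: trip when any level reading reaches the trip point.

    On a trip the final element is driven to its safe state (closed here).
    Returns True if the function tripped.
    """
    tripped = any(reading >= high_trip_pct for reading in level_readings_pct)
    if tripped:
        valve.close()
    return tripped


feed_valve = ShutoffValve("XV-100")  # hypothetical tag
print(logic_solver([70.0, 72.0, 71.0], 90.0, feed_valve), feed_valve.is_open)
print(logic_solver([70.0, 95.0, 71.0], 90.0, feed_valve), feed_valve.is_open)
```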
International standard IEC 61511 was published in 2003 to provide
guidance to end-users on the application of Safety Instrumented
Systems in the process industries. This standard is based on IEC
61508, a generic standard for design, construction, and operation of
electrical/electronic/programmable electronic systems.



Relief Systems
Oxygen free
Worst case scenario
Process in place to ensure open path
When all else fails then you must rely on the unit's relief systems.
Generally this consists of a flare or two. In large facilities this could
involve a number of flares. One unit had two flares for just five units.
The entire refinery had 11 flares in total. When a vessel, be it
a reactor, a settler, a distillation tower, or whatever, is under
excess pressure, rather than overpressure the vessel and risk a
catastrophic release, devices called relief valves open and release
the excessive pressure into a closed system that leads to a device
called a flare, which always has a flame at the point of release to the
atmosphere to harmlessly burn off the offending material.
Since the flare will always be a source of ignition it is essential to
keep the upstream system fuel rich, meaning keep the oxygen OUT.
So a slight positive purge will always be maintained on the system.
Generally the flare system will be designed for a worst case
scenario, meaning every unit lets loose at the same time. This is not
a rare occurrence.
So, with that as a background, things to remember about relief
systems are that they will plug, foul, choke up, and generally try to
not work. So the PSM approach of anticipating what could go wrong
comes up here and is critical. Know your system, what could foul it,
what could plug it, and what could block the relief path. Anticipate
and put systems in place to prevent their occurrence as well as a
verification process to ensure that what you expect is truly what
you get. Finally, when maintenance is done on relief valves either
on-stream or off-stream you also need a process in place to verify
an open path after the work is complete.
Reliable plants are safe plants.
Safe plants are reliable plants.
Safe and Reliable plants are Profitable plants!
For too many years safety and reliability have been
considered two separate elements of the operations system. It is
only in recent years that people have truly begun to understand
just how interrelated they are. For too many years people didn't
correlate maintenance and reliability with regulations and laws. The
first was something you just had to have when you operate the
plant and the latter, well, the government made me do it!

Maintenance Culture


Reactive:
Fire fighting mode. Don't stop to think and plan the work
More likely to miss a key safety
Little to no predictive / preventative technologies
~30% more likely to have an accident or injury

Proactive:
Stop to think and plan the work before executing
More likely to plan for potential
Utilize predictive / preventative
Significantly less likely to have an accident or injury


Too often companies find themselves in a reactive maintenance
organization. In this environment it is not unusual that the urgent
nature of the reactive work also requires maintenance personnel to
take risks they shouldn't be taking. With a proactive maintenance
culture significantly more problems are anticipated and identified
long before they become big problems.

Examples of Reliability Technologies

Infrared Thermography (IR)
Non destructive Testing (NDT)
What you find when you work in the maintenance and reliability
world is that the best operators, maintenance personnel, and engineers
are the ones who learn that if you use your senses (touch, sight,
sound, smell), the equipment tells you how it is running. Really
good operators and mechanics are the ones who can walk into the
middle of the manufacturing floor, close their eyes, and from the
sound of the plant and the vibration around them tell you if the
plant is running well or not. Not everyone is that good though, so for
the rest of us there are reliability technologies such as vibration,
lubrication, IR, NDT, etc.

This is a clip of a fan base weld that had cracked. This could have
been detected and corrected long before it reached this point. This
failure didn't occur overnight either. Sadly, many people probably
walked by this day after day, never noticed it and never reported it.


Here is an example of where proper oil sampling and analysis or
maybe even a simple oil change on a set schedule would have
prevented this pump from burning up. On the right you can see
what fresh oil looks like. The dark brown is what happens when you
run oil beyond its life and it has maxed out on contaminants.


This is an example of what is called CUI (corrosion under insulation).
If the wrong type of insulation is used in the wrong application, the
above is an example of what will occur. CUI is caused by moisture
being trapped between the insulation and the metal, in this case
carbon steel. Over time the water held against the metal will cause
corrosion under the insulation. CUI can be caused by
insulating a vessel that has a tendency to sweat, hosing down
insulated equipment that is not watertight, etc.


The picture on the left shows the half pipe coils that encircle the
reactor that began to leak.
The picture on the right shows the crack pattern inside the reactor,
revealed by dye checking. The cracks are evenly spaced and seem
to initiate at the welds of the half pipe coil to the shell, running
vertically in both directions from the weld until they stop.
A piece of the leaking pipe was removed and sectioned for
metallographic examination. The instantly recognizable pattern of
chloride stress cracking was apparent in the metallographic mounts.
It appeared the process chemists had made a change, essentially
boiling and concentrating chlorides in the reactor. After many
batches, it was a recipe for chloride cracking which manifests itself
at the highest residual stress areas, i.e., opposite the half pipe coil


This is an example of infrared thermography, commonly known
as an IR scan. For those not familiar with IR scans, the
instrumentation detects differences in heat generated. The
technology is not only used for detecting hot spots in the electrical
world but can be used to detect roof leaks or any other problem
where a potential temperature difference occurs. In this particular
case, looking at the photo on the right there is no indication of a
problem, but with the IR scan you can easily see that one fuse
is much hotter than the other two fuses. This was most
likely caused by a loose connection. If not corrected, at best the fuse
may blow, tripping the breaker; at worst a fire could occur.

Regardless of your job function, know what is PSM covered.
Know what is considered coded by the state you are in.
Even the most seemingly small change can cause a major
Safety and reliability go hand in hand. Safety isn't the
responsibility of the safety department and reliability isn't
the responsibility of the maintenance department.


We are all responsible for each other's safety.


Write one page on how better maintenance and ops procedures
could have changed the outcome of the event.


Chapter 6: Mitigating Hazards Via a Process Safety Management System
The aim is to present the concept of a Process Safety Management
System and understand its nuances. We will also see how a good
PSM system is helpful in mitigating hazards in Chemical Process

At the end of this lesson, you will be able to:

Know and understand the various elements of a process safety
management system
Recognize a simple risk matrix
Begin to understand the hierarchy of controlling risk

Quiz learning from homework
What is PSM?
Elements/Examples of a PSM system
How Risk Matrices and PSM work together

We have reviewed the Texas City Hazard in one of our previous
lessons. List out as many hazards as you can possibly remember.
Time: 15 minutes

What is PSM?
As we have seen, PSM is a management system, employing the use
of elements, that, when used correctly, prevents the release of
hazardous materials and energy, thereby safeguarding life, property
and reputation.
PSM applies a management system and controls (programs,
procedures, audits, evaluations) to a manufacturing or chemical
process in a way that process hazards are identified, understood,
and mitigated or controlled, so that process-related injuries and
incidents are prevented.
Organizations that are introducing PSM programs benefit in many
ways. Plant efficiency increases, downtime is reduced, business
processes are streamlined, safety culture is improved, and business
performance is improved.

PSM Models
With its cross-functional character, a PSM system is very complex,
including research, engineering, construction, manufacturing,
maintenance, training and sourcing. The structure of PSM is based
on 14 key elements divided into three groups: Technology, Facilities
and Personnel. To simplify the understanding, PSM is plotted as a
PSM Wheel.
Many companies have graphics to demonstrate PSM elements.
Two models are well known in the industry: the DuPont wheel
and the Suncor wheel.

DuPont PSM Model

DuPont Safety Management Model: This figure illustrates the PSM
system at DuPont graphically.
Management leadership and commitment, which defines the core
value of safety necessary for implementing and maintaining strong
PSM programs, is shown at the center of the PSM Wheel. The main
features of the PSM program are arranged by Technology,
Personnel, and Facilities, separated into the essential 14 elements
around the spokes of the wheel.
Operational excellence is achieved through operational discipline,
which is shown as the rim of the PSM Wheel. This implies that such
discipline connects all of the 14 elements and translates the
required managing systems into real results for preventing injuries
and incidents.
The DuPont PSM Model works mainly because:
The center of the wheel is Management leadership and
commitment. Thus process safety is the Core Value
A robust Managing System that identifies, evaluates and
mitigates process risks at all stages of a facility's life cycle
Operational Discipline encircles all the technical elements
A single governance process
Integrated into all business processes
Flexible and adaptable to many industries

Suncor PSM Model

Suncor's Process Safety model incorporates 14 distinct elements, all
of which need to work in unison for an effective PSM system. Here
too Management Leadership and Commitment are right at the
center of the PSM wheel, as these are crucial for the success of

The 14 elements encompass three key features of any
manufacturing process: people, technology and facilities.
People: includes elements such as training and performance,
managing contractor safety, incident learning and prevention,
emergency planning and response, and conducting operation
integrity audits.
Technology: includes assembling process safety information,
conducting process hazard analysis, and establishing operating
procedures and Safe Work practices.
Facilities: includes quality assurance, mechanical integrity and
conducting safety reviews prior to facility start-up.

Areas of Process Safety Management

Values, Beliefs, Leadership and Management
Contractor management
Safety Instrumented Systems
Preventive and Predictive Maintenance
Operating Envelopes and Parameters
Reactive Chemistry
Training of your entire workforce
Lock, Tag, Clear and Try
Integrity, corrosion, erosion of all equipment
For any Process Safety program to be successful, safety must be
foundational as a core value. The leadership and the
management have to be deeply committed to the PSM program and
believe that such a program only works with operational discipline.

Not only the organizational workers but also the contractors and
their workers have to share the awareness about hazards and
safety management.
A Safety Instrumented System (SIS) has to be in place and
regularly checked. An SIS consists of an engineered set of hardware
and software controls which are especially used on critical process
systems. For such systems, any operational problem that occurs
means the process needs to be put into a "Safe State" to avoid adverse
Safety, Health and Environmental (SH&E) consequences.
A Safe State is a process condition, whether the process is
operating or shutdown, such that a hazardous SH&E event cannot
occur. The safe state must be achieved in a timely manner or within
the "process safety time".
An SIS is designed to respond to conditions in the plant which may
be hazardous in themselves or, if no action is taken, could eventually
give rise to a hazard, and to respond to these conditions by taking
defined actions that either prevent the hazard or mitigate the
hazard consequences.
Both proactive maintenance tasks, preventive (PM) and predictive
maintenance (PDM), have to be carried out regularly.
Preventive maintenance is a scheduled task carried out at a
predetermined time based on the number of hours for which equipment
has operated together with statistics and historical data for
different types of equipment and their need for maintenance. It is
assumed that a machine will degrade within a time period that is
common for its type. Mean-time-to-failure (MTTF) statistics can
determine a preventive maintenance management schedule to
include inspections, repairs and rebuilds.

Preventive maintenance tasks are completed when the machines are
shut down.
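As a small worked example of how MTTF statistics can feed a preventive maintenance schedule, the sketch below averages the observed hours to failure for one type of equipment and sets the PM interval at a fraction of that average. The failure history and the choice of 50% of MTTF are assumptions made purely for illustration; the text does not prescribe a specific fraction.

```python
def mean_time_to_failure(hours_to_failure):
    """MTTF as the average of observed operating hours before each failure."""
    if not hours_to_failure:
        raise ValueError("at least one failure record is needed")
    return sum(hours_to_failure) / len(hours_to_failure)


def pm_interval_hours(hours_to_failure, fraction_of_mttf=0.5):
    """Schedule PM at a chosen fraction of MTTF (the fraction is an assumption)."""
    if not 0 < fraction_of_mttf <= 1:
        raise ValueError("fraction_of_mttf must be in (0, 1]")
    return fraction_of_mttf * mean_time_to_failure(hours_to_failure)


# Hypothetical failure history for one pump type, in operating hours.
history = [8000, 10000, 12000]
print("MTTF (h):", mean_time_to_failure(history))
print("PM interval (h):", pm_interval_hours(history))
```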
Predictive maintenance activities are carried out as the machines
are running in their normal production modes. Direct monitoring
and analysis of a machine's operating condition, efficiency
and other indicators reveal the need for maintenance tasks.
Predictive maintenance uses the actual operating condition of the
plant equipment and systems to optimize total plant operation. Such
maintenance activities can be undertaken when they are most
needed. Improvements in quality, profitability and productivity can
result from predictive maintenance and maintenance costs can also
be potentially reduced.
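To show what "using the actual operating condition" can look like in practice, here is a minimal sketch that watches a running machine's vibration readings and flags maintenance when the recent average reaches an alert limit. The readings, the limit of 4.5 mm/s and the three-reading window are hypothetical values for illustration only.

```python
def needs_maintenance(readings_mm_s, alert_limit_mm_s, window=3):
    """True when the average of the last `window` readings reaches the alert limit.

    Averaging a few readings avoids reacting to a single noisy measurement.
    """
    if len(readings_mm_s) < window:
        return False  # not enough data yet to judge a trend
    recent = readings_mm_s[-window:]
    return sum(recent) / window >= alert_limit_mm_s


healthy = [2.0, 2.1, 2.0, 2.2]
degrading = [2.0, 3.9, 4.6, 5.2]
print(needs_maintenance(healthy, alert_limit_mm_s=4.5))
print(needs_maintenance(degrading, alert_limit_mm_s=4.5))
```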
Operating envelopes and parameters is an innovative approach
to plant monitoring. These outline the technical limits within which a
system or process may be safely operated. These delineate the
maximum operating capability of a system.
An operating envelope is a collection of those operating variables
and parameters of a plant, which when exceeded, affect the
integrity of equipment and pose a risk. When such an eventuality
occurs the process needs to be moved back within the operating
envelope quickly. That mitigates risk.
Any business operating a processing plant wants to maximize asset
uptime and minimize maintenance costs as well as unplanned
outages. Management of a plant's operating envelope supports
this goal.
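An operating envelope check can be sketched as a table of limits and a function that reports any variable outside them. The variable names and limits below are hypothetical; a real envelope comes from the plant's own process safety information.

```python
# Hypothetical operating envelope: variable -> (low limit, high limit).
OPERATING_ENVELOPE = {
    "reactor_temperature_C": (120.0, 180.0),
    "reactor_pressure_barg": (2.0, 8.0),
    "feed_flow_m3_h": (5.0, 20.0),
}


def envelope_excursions(readings, envelope=OPERATING_ENVELOPE):
    """Return {variable: (value, low, high)} for every reading outside its limits."""
    excursions = {}
    for name, value in readings.items():
        low, high = envelope[name]
        if not low <= value <= high:
            excursions[name] = (value, low, high)
    return excursions


current = {
    "reactor_temperature_C": 186.5,
    "reactor_pressure_barg": 6.1,
    "feed_flow_m3_h": 12.0,
}
for name, (value, low, high) in envelope_excursions(current).items():
    print(f"{name} = {value} is outside {low} to {high}: move back inside the envelope")
```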
Reactive chemistry incidents are said to occur when no chemical
reaction is intended, but an incident occurs because of an
unanticipated reaction. The PSM staff and chemistry experts
may have the knowledge to anticipate such reactions and have
measures put in place to prevent them. The operating workers in
the plant probably do not have this knowledge. Absence of
reactivity hazard awareness and recognition is often a major
contributor to reactive chemistry incidents. These can be prevented
by educating the operating personnel about the process and
reactive chemistry. Tools, checklists, and resources for recognizing
and managing reactive chemical hazards need to be made available.
Training your entire workforce for PSM is essential. This means
that workers need to be fully conversant with PSM and its elements.
They need to have safety ingrained into their work process. Process
safety training with an overview of the elements comprising the
PSM mandate must be included in the employee training.
The communication for process safety management needs to be
fast and open. It is imperative to maintain proper communication
between different components of the organization, amongst project
entities and process operatives. Such communication is critical for
process safety. A variety of communication tools can be used to
properly facilitate processes, provide timely notices and assign
Lock, tag, clear and try: This is a technique used to prevent the
release or escape of hazardous energy. For this procedure each
worker places a personally controlled lock on the appropriate energy
isolating device that is in the off or open position, then adds a tag
identifying who locked it, when and why, and performs a test to assure
a Zero Energy State.
The idea behind lock/tag/try is to prevent energy from accidentally
being released while a machine or equipment is being serviced. The
primary goal is of course to protect the safety and health of
employees. The secondary goal is the protection of equipment from

The tag should indicate boldly what is wrong with the equipment,
for example: this machine is tagged for maintenance work. No one
can operate equipment that has been tagged out.
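The lock, tag, clear and try sequence can be pictured as a record that only reports the equipment safe to service once all four steps have been completed in order. This is an illustrative sketch and not a substitute for a site's written procedure. The device name, worker and reason are hypothetical, and "clear" is taken here to mean clearing people and tools from the equipment before the try step, which is an assumption since the text does not define it.

```python
from dataclasses import dataclass, field

REQUIRED_STEPS = ("lock", "tag", "clear", "try")


@dataclass
class IsolationRecord:
    device: str   # energy isolating device being locked
    worker: str   # who placed the lock
    date: str     # when
    reason: str   # why
    steps_done: list = field(default_factory=list)

    def complete(self, step):
        """Record a step, refusing any step taken out of order."""
        if len(self.steps_done) >= len(REQUIRED_STEPS):
            raise ValueError("all steps are already complete")
        expected = REQUIRED_STEPS[len(self.steps_done)]
        if step != expected:
            raise ValueError(f"expected '{expected}' before '{step}'")
        self.steps_done.append(step)

    def safe_to_service(self):
        """True only after lock, tag, clear and try were all done in order."""
        return tuple(self.steps_done) == REQUIRED_STEPS


record = IsolationRecord("MCC breaker for pump P-101", "J. Smith",
                         "2024-01-15", "seal replacement")
for step in REQUIRED_STEPS:
    record.complete(step)
print(record.safe_to_service())
```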
Integrity, corrosion, erosion of all equipment. Mechanical
integrity is a big challenge for PSM. In a plant there is a multitude
of equipment, piping and instrumentation, and other machines that
are vulnerable to erosion and corrosion. All need to be kept in good
operating condition to ensure safe, reliable, and profitable
Corrosion may be defined as the destruction of a metal by chemical
or electro-chemical reaction with its environment. The corroding
substances can be process materials contained in a vessel, pipe, or
other equipment, or materials in the outside environment, for example
water, salt, or contaminants in the atmosphere. Rusting of steel is
an example of corrosion (or in other words oxidation).
There is also another process by which metal is destroyed and this
is known as erosion. Erosion may be defined as the destruction of a
material by the abrasive action of a gas, liquid, or solids. This is a
purely mechanical action. Erosion-corrosion can result in general
corrosion that occurs at a higher rate than would be expected under
stagnant conditions. This process usually occurs in pipelines etc.
where the flow of liquid along with particles in suspension causes
erosion of the pipeline. Erosion can also occur in process piping in
very dirty service.
Management of corrosion and erosion of process piping and
equipment must be a major component of any effective mechanical
integrity program.

PSM Elements to Think About

Why is pro-active management important in each model?
What are the 3 key areas of effort in each model?
Is one area more important than another?
Who works on process safety management?
Who is accountable for PSM in an operating facility?
Why should contractors be involved in PSM?
We have seen two PSM models. In each model management is
placed at the core. It is important that management and leadership
are committed to safety and are visible throughout the organization. It
is the management that has the responsibility to build and nurture a
strong safety culture. They need to follow this up with policies that
ensure consistently good safety performance, and implement them by
providing resources and establishing high priority for safety-centric
activities. The actions of all levels of management must support and
reinforce strong PSM programs and accountability.
The three key areas of effort in each model are personnel (people),
technology and facilities. All the three areas carry equal importance
and one cannot be most effective without the other two.
People at all levels in an organization need to work on process
safety. Under Section 304 of the Clean Air Act Amendments,
employers are to consult with their employees and their
representatives regarding the employers' efforts in the
development and implementation of the process safety
management program elements and hazard assessments. Section
304 also requires employers to train and educate their employees
and to inform affected employees of the findings from incident
investigations required by the process safety management program.


In an operating facility, the facility manager is responsible for their
facility, but overall the corporation is responsible for safety in all of
its facilities. That means it needs to have sufficient management
processes in place to pro-actively manage safety, and those
processes must ensure that all employees know, understand,
and follow the process safety programs in place. So, ultimately it is
the corporation's chief operating officer who is responsible.
Contractors must be involved in PSM as many categories of contract
labor may be present at a jobsite. They may actually operate the
facility or do only a particular aspect of a job because they have
specialized knowledge or skill. Others work only for short periods
when there is need for increased staff quickly, such as in
turnaround operations.
As these workers are present at a facility, they also need to be
aware of PSM for their own safety and that of other people,
equipment and environment around the facility.
PSM includes special provisions for contractors and their employees
to emphasize the importance of everyone taking care that they do
nothing to endanger those working nearby who may work for
another employer.
PSM, therefore, applies to contractors performing maintenance or
repair, turnaround, major renovation, or specialty work on or
adjacent to a covered process. It does not apply, however, to
contractors providing incidental services that do not influence
process safety, such as janitorial, food and drink, laundry, delivery,
or other supply services.


Risk Management: How it all fits together

A risk matrix is used during risk assessment to define various
levels of risk as the product of the harm probability categories and
harm severity categories. This is a simple mechanism to increase
visibility of risks and assist management decision-making. A
consistent process like this gives senior leaders an overview
of gaps and of where corrective actions are needed.
The elements of PSM address safety issues from all levels of an
organization. Feedback on PSM elements from the frontline (the
bottom-up process) gives senior leaders assurance that
hazards are being addressed.

Risk Matrix


The figure illustrates a basic risk matrix. The risk matrix records the
level of risk, which is determined by the relationship between the
likelihood of a hazard occurring, and the consequence of the hazard.
This is recorded as either a numerical or an alphabetical code. The
relationship between likelihood and consequence determines how
dangerous the hazard could be.
In the above matrix the bottom-left corner is the sought-after
position, where both the frequency of the hazard and its
consequences approach zero. The dark red zone indicates the
high-risk area. In the middle is a gray area. This zone is subjective,
and each organization needs to assess and evaluate it.
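
As a minimal sketch of the idea above, the snippet below scores a hazard as the product of a likelihood category and a consequence category and maps the score to a zone. The 1 to 5 scales, the function names, and the zone thresholds are all assumptions made for illustration; since the middle zone is subjective, each organization would set its own boundaries.

```python
# Hypothetical 1-5 category scales; a real matrix uses the organization's own definitions.
def risk_score(likelihood: int, consequence: int) -> int:
    """Return the risk level as the product of the two category ratings."""
    for name, value in (("likelihood", likelihood), ("consequence", consequence)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return likelihood * consequence


def risk_zone(score: int, low_max: int = 4, high_min: int = 15) -> str:
    """Map a score to a zone. The default thresholds are illustrative assumptions."""
    if score <= low_max:
        return "low"
    if score >= high_min:
        return "high"
    return "middle"


# Rate three example hazards given as (likelihood, consequence) pairs.
for likelihood, consequence in [(1, 1), (3, 3), (5, 4)]:
    score = risk_score(likelihood, consequence)
    print(likelihood, consequence, score, risk_zone(score))
```

Keeping the thresholds as parameters reflects the point that the boundaries of the gray area are a management decision rather than a fixed rule.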

Typical Risk Matrix

Here is a typical risk matrix that is largely self-explanatory. The red
zone combines higher severity of hazards with greater likelihood of
incidents. This is not acceptable. Immediate action is required; this
level of risk needs detailed research and planning by senior management.


The green zone has acceptable levels of risk and should not need
specific resource allocation. A part of this zone bordering the yellow
area can be managed by routine procedures and employees under
The yellow zone indicates acceptable risk with mitigation. It requires
management attention in a reasonable timeframe to prevent or
reduce the likelihood and severity of an incident. Short-term control
action may need to be taken immediately so that work can be
carried out, followed by longer-term action to ensure that the
hazard is fully controlled. Consistent use of a risk matrix to
prioritize all risks at a location gives management clear guidance to
utilize available resources in the most effective manner to move the
facility to a lower risk profile of operation.
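
The prioritization described above can be sketched as a sort over a risk register. The register entries, the `Risk` class, and the numeric tie-breaking score are hypothetical; only the zone colors come from the text.

```python
from dataclasses import dataclass

# Red items are handled first, then yellow, then green.
ZONE_ORDER = {"red": 0, "yellow": 1, "green": 2}


@dataclass
class Risk:
    name: str
    zone: str   # "red", "yellow", or "green", as read from the site's risk matrix
    score: int  # hypothetical numeric rating used to break ties within a zone


def prioritize(risks):
    """Order risks by zone, then by descending score within each zone."""
    return sorted(risks, key=lambda r: (ZONE_ORDER[r.zone], -r.score))


# Illustrative register; the entries are invented for this example.
register = [
    Risk("Pump seal leak", "yellow", 9),
    Risk("Housekeeping finding", "green", 2),
    Risk("Corroded elbow on overhead line", "red", 20),
    Risk("Failed area gas monitor", "yellow", 12),
]

for risk in prioritize(register):
    print(risk.zone, risk.score, risk.name)
```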

The Story for Calculating Risks

It is an inexact science
Multiple variables & unknowns
Accurate, not precise
Needs engineering judgment
The press and your boss will want a precise answer for a
business decision
Risk assessment is an imprecise science. It is advisable to examine
carefully what could harm people and to take measures to prevent
or mitigate the consequences should an incident occur. However,
there are multiple variables and imponderables.
Every business decision is subject to similar uncertainties.
You may be able to calculate risks accurately, but not precisely.


Somewhere between the steps of risk assessment and risk
management, the concept of risk estimates as inherently imprecise
has been lost. This is probably due to a number of reasons, one of
which is likely because the risk manager has to communicate with a
public that wants to know with some certainty and precision what
the risks from hazards actually are (and in rather succinct terms),
rather than hearing the risks described more appropriately as
scientific judgments that are, by their very nature, imprecise. Or,
perhaps this is because risk assessors themselves become so
accustomed to using default positions/models to extrapolate risk
that they lose sight of the degree of uncertainty that is introduced
with each extrapolated area.
There are no universally accepted scientific or legal standards for
risk assessment. The outcome of a typical risk assessment depends
on the assumptions and sometimes biases of the assessor.

Investigating a risk matrix

Does a firm's performance change the level of risk that is

Where is the line for BP?


Think about what has happened with BP:

Texas City
Pipeline Spill in Alaska
Trading irregularities
Gulf of Mexico Explosion

PSM System and Risk

How are the PSM elements related to risk?
If you were in charge of an oil refinery, how would you use
the PSM elements to reduce the risk in your plant?
PSM is basically a framework of activities based on 14 elements to
manage safety in the workplace. A key element of PSM systems is
Process Hazard Analysis. Process risks can be managed
only if they are identified and evaluated. Many other PSM elements
include an "understanding/evaluating risk" step, which needs to be
fulfilled properly in order for that element to be robust and "fit for
purpose". These elements include: 1) Work Permit Systems, 2)
Management of Change, 3) Pre-Startup Safety Review.
An effective implementation of the above elements would ensure a
proper management of risks related to normal (stable) operation of
the plant, routine and non-routine activities.

Things to Remember about Risk

Not about numbers, about safety
Not well understood by many business leaders
Risk is always changing; key is to understand the changing
nature of risk
Risk has many forms: financial, technical, personnel,
ethical, asset based


Risk generally results from uncertainty. In organizations this risk
can come from uncertainty in the marketplace (demand, supply
and the stock market), failure of projects, accidents, and natural disasters.
Risk is not about calculating numbers but it is about safety. And it is
not about doing things to avoid sanctions. The primary goal is not
to avoid a legal action, but to stop people becoming unwell, or
being hurt, or being killed by their work. A secondary benefit is
saving equipment and the environment.
Many business leaders do not understand risk. As the saying goes,
"If you think safety is expensive, try an accident." Good risk
management doesn't have to be expensive or time consuming. It
just needs commitment and belief in safety first.
Additionally, executives miss the concept that their personal
interest in and views about risks will affect the firm's risk assessment
and analysis. The more aware they are of risks in their organization,
the more attention they will pay to them, and this attention will
translate into measures to avoid or mitigate potential risks.
Risk in any business is not a static entity. As the environment
changes, so does the risk-profile change. In the process industry,
risk will change with technology, upgrading of equipment or
processes. In the financial market risks change based on myriad
factors. The key to risk management is to understand the changing
nature of risk. Once that is understood, risk can be managed.
Risk has many forms: financial, technical, personnel, ethical, asset based.


Managing Risk via Hierarchy of Controls

Eliminate/remove the hazard or people
Substitute or reduce quantities
Provide engineered controls/barriers
Provide appropriate PPE
Or, just do not do the work until you are assured you can do
it safely.
The hierarchy of controls is a protocol that you use when deciding
what kind of control measures should be used to address a
particular hazard. The rationale underlying the hierarchy of
controls is that an organization should use more reliable control
measures rather than measures that are more likely to fail.
In simple terms, this is a priority order of control measures ranging
from elimination of the hazards and associated risks to providing
people with protective equipment.
Elimination is considered the most reliable control measure
because, if a hazard is eliminated, it no longer poses a risk. If the
hazard cannot be eliminated then remove people from the vicinity
of the hazard.
For specific hazardous chemicals, if a less dangerous substitute is
available then that should be used. If the use of the chemical
cannot be eliminated and there is no suitable less toxic substitute,
then investigate whether reducing the quantity could be
appropriate. Otherwise, explore whether there is an appropriate
engineering control, such as a ventilation system that reduces
exposure, or barriers that prevent workers from going
dangerously close to the installation.


If an engineering control cannot be used (or until it can be
installed), then other controls, such as administrative measures
(for example, warning signs) and PPE, would need to be put in place
to reduce the hazard. However, the wearing of personal protective
equipment (PPE) is considered a less reliable control because it
depends on individuals using the right PPE and wearing it correctly
every time.
The use of the hierarchy of controls is an iterative process: as
individual control measures are put in place, you need to go back
and re-evaluate the risk to see if it has reached an acceptable level
or if additional controls are still needed.
When none of these controls work to your satisfaction, the best
option is not to do the work until you are assured you can do it
safely.
The lower the level of control implemented, the higher is the level of
risk that is accepted.
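
The iterative use of the hierarchy can be sketched as a loop that applies controls in priority order and re-evaluates the residual risk after each one. This is only an illustration: the numeric risk values, the idea of a fractional reduction per control, and all names below are assumptions, not part of any standard.

```python
# Control types in priority order, most reliable first (as listed in the text).
HIERARCHY = ["eliminate", "substitute_or_reduce", "engineered_control", "ppe"]


def select_controls(initial_risk, acceptable_risk, reductions):
    """Walk the hierarchy, re-evaluating residual risk after each control.

    `reductions` maps a control type to the hypothetical fraction of risk it
    removes (0.0 to 1.0); control types that are not feasible are left out.
    Returns (applied_controls, residual_risk, ok_to_proceed).
    """
    residual = initial_risk
    applied = []
    for control in HIERARCHY:
        if residual <= acceptable_risk:
            break  # acceptable level reached, no further controls needed
        if control in reductions:
            residual *= 1.0 - reductions[control]
            applied.append(control)
    return applied, residual, residual <= acceptable_risk


# Substitution and an engineered control are enough here; PPE is never reached.
print(select_controls(80.0, 10.0, {"substitute_or_reduce": 0.5,
                                   "engineered_control": 0.75,
                                   "ppe": 0.5}))

# With PPE alone the residual risk stays too high: do not do the work.
print(select_controls(80.0, 10.0, {"ppe": 0.5}))
```

When the last value returned is `False`, the sketch corresponds to the final rule in the list: the work should not proceed until it can be done safely.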

Risk management involves both top down and bottom up
Better firms look at risk from both directions
PSM is a key component of the risk reduction system
Risk management is inexact and usually misunderstood; risk
is always changing
As we have seen, risk needs to be managed from the top down
through risk matrices and from the bottom up through real-time
feedback and the PSM system.
Best-practice organizations are fully aware of risk
management and look at risk from both directions.


PSM, as we have learned, is the key component to methodically
identify, understand and reduce risks related to hazardous
processes. Its main purpose is to prevent serious incidents like
major fires, explosions or toxic releases that might affect plant
personnel, off-site population, environment or result in significant
material losses.
Even with a great PSM system in place, the nature of risks needs to
be understood. Risks change and the management has to make
changes in their response accordingly. These changes may not
always be apparent and that makes risk management inexact!

Read Chapters 1 and 2 in the RBPS text
Do a Google search and read about James Reason's work on
managing infrequent, yet catastrophic events
In particular, study Reason's Swiss Cheese model, and
think about how this model and the PSM models fit together


The Concept of Risk

Probability & Consequences
Understand Risk/Probability and Hazard/Consequences
Understand Risk Matrix Assessment
Risk/Probability represent the likelihood that an event will occur. In
Quantitative Risk Analysis, you saw how reliability of a system could
be calculated given the KNOWN failure rates of all the elements.
Sometimes the data is not available for a quantitative analysis or
the system is well known by experts in the field so that a
QUALITATIVE analysis is done.
In this type of analysis, a group of knowledgeable people in the
process is gathered together to make their best judgment of the
Risk/Probability AND the Hazard/Consequences. This often leads to
one or more areas of the process that must have further evaluation
and mitigation. To make this assessment, a Risk Matrix is used.

Study, in some detail, one of the several methods of Hazard Analysis
Recognize you may be asked to use a different method

Hazard Analysis:
The world is made up of systems and risks. With any system or
process, there is a risk of hazards and accidents. System safety
implies effective risk management, that is, the identification and
mitigation of hazards. For this, hazards have to be identified and
then a risk analysis done. That is why hazard analysis needs to be
done periodically to systematically evaluate facility and process
hazards. This is to ensure safe operations, teach new workers,
control hazardous materials, and much more.
There is a wide variety of hazard analysis methods. Sometimes a
basic gross analysis needs to be done to choose the most
appropriate method. Here are some methods that appear in OSHA
guidelines. We will be studying most of these methods during this
course.
What-If Checklist: The what-if checklist is a broadly-based
hazard assessment technique that combines the creative thinking of
a selected team of specialists with the methodical focus of a
prepared checklist. The result is a comprehensive process hazards
analysis that is extremely useful in training operating personnel on
the hazards of the particular operation.
Hazard and Operability Study (HAZOP): HAZOP is a formally
structured method of systematically investigating each element of a
system for all of the ways in which important parameters can
deviate from the intended design conditions to create hazards and
operability problems. The hazard and operability problems are
typically determined by a study of the piping and instrument
diagrams (or plant model) by a team of personnel who critically
analyze the effects of potential problems arising in each pipeline
and each vessel of the operation.
Failure Mode and Effect Analysis (FMEA): The failure mode and
effect analysis is a methodical study of component failures. This
review starts with a diagram of the process that includes all
components, which could fail and conceivably affect the safety of
the process.

Fault Tree Analysis: A fault tree analysis is a quantitative
assessment of all of the undesirable outcomes, such as a toxic gas
release or explosion, which could result from a specific initiating
event. It begins with a graphic representation (using logic symbols)
of all possible sequences of events that could result in an incident.
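
To make the logic-symbol idea concrete, here is a small sketch of how AND and OR gates combine event probabilities in a fault tree. It assumes the basic events are statistically independent, and the tree structure and probability values are hypothetical numbers chosen for illustration, not data from any plant.

```python
from math import prod


def or_gate(probabilities):
    """Output occurs if at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def and_gate(probabilities):
    """Output occurs only if all independent input events occur."""
    return prod(probabilities)


# Hypothetical tree: an operator is exposed if containment is lost
# (flange leak OR seal failure) AND the gas monitor fails to alarm.
p_flange_leak = 0.01
p_seal_failure = 0.02
p_monitor_fails = 0.05

p_loss_of_containment = or_gate([p_flange_leak, p_seal_failure])
p_top_event = and_gate([p_loss_of_containment, p_monitor_fails])

print(f"Loss of containment: {p_loss_of_containment:.4f}")
print(f"Top event:           {p_top_event:.5f}")
```

If the basic events share a common cause, the independence assumption fails and the simple products above would understate the risk.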

At the end of today, you will be able to:

Participate in a Probability & Consequences Review
Observe and participate in one of the other types of
reviews, recognizing you will have a different
methodology to learn

What can happen if we do not get this right!

Here is an example of what can happen if the hazard analysis is
either not done or not heeded.

The 1988 explosion at the Shell refinery in Norco, LA is also
referred to as the "big bang". It occurred at 3:40 A.M. on May 4,
1988. An elbow in the depropanizer column piping system of a fluid
catalytic cracking (FCC) unit failed. The reason could have been
corrosion and thinning of an eight-inch diameter pipeline.
Consequently, 20,000 pounds of C-3 hydrocarbons escaped.
The resulting vapor ignited, causing a major explosion. Damage from
the explosion radiated one mile from the center of the explosion
and debris could be found as far as five miles away. The explosion
caused a fire that burned for eight hours at the refinery before it was
brought under control. Chemicals that escaped during the explosion
resulted in cars and homes being covered by a black film. Seven
Shell workers were killed during the explosion and 48 residents and
Shell workers were injured. The explosion released 159 million pounds
of toxic chemicals into the air, which led to widespread damage and the
evacuation of 4,500 people.
A flight over the site the next day showed a LARGE black hole where
the unit had been. All the flare tips were burnt off while trying to
control the releases. The people there will never forget this event.
The studies afterwards identified an area that has been generally
overlooked in the industry.

What's Covered by PSM?

Process Safety Information
Employee Involvement
Process Hazard Analysis
Operating Procedures
Pre-Startup Safety Review
Mechanical Integrity
Hot Work
Management of Change

Incident Investigation
Emergency Planning and Response
Compliance Audits
Trade Secrets
These are the areas where safety assessment can be used: PSI,
PHA, MOC, Incident Investigation, PSSR, Operating Procedures,
Training, Mechanical Integrity, Compliance Audits, Emergency
Planning and Response. We will be learning about these in detail
later during the course.
NOTE: OSHA requires Employee Involvement! This becomes a part
of the planning and organization for the review.

Batch Reactor

This is a typical batch reactor. In a batch reactor, all the necessary
ingredients are placed in the tank and the chemical reaction is
allowed to take place.

Multiple components are loaded into the reactor and the reactor is
sealed. The temperature and pressure increase over time until the
reaction is complete. When finished, the product is removed from
the bottom and the top hatch is removed to wash out the reactor.
An incident occurred when an operator removed the top hatch and
was exposed to a hazardous chemical that was produced by a little
known side reaction that had occurred. A Study Team was
organized to do a HAZOP of this system and understand what
additional safety precautions needed to be taken.
Construction of a Probability and Consequences Review
Probability & Consequences for Operator Exposure to H2S
During Reactor Operation
H2S is very toxic, quickly reactive, and causes serious accidents. It
poses a very serious inhalation hazard. Prolonged exposure (for
several hours or days) to concentrations as low as 50-100 ppm can
lead to rhinal inflammation, cough, hoarseness, and shortness of
breath. Prolonged exposure to higher concentrations can produce
bronchitis, pneumonia and a potentially fatal pulmonary edema.
Consequence modeling refers to the computation of numerical
values (or their graphical representations) that describe the likely
hazards due to unforeseen loss of control over flammable, explosive
and toxic materials, with respect to their potential impact on people,
assets, or safety functions.
To illustrate the point have a look at the spreadsheet that details:
1. Potential problem areas.

When assessing a specific incident, a great deal of effort needs to
be given to generating the steps that could cause an incident and the
elements of those steps that present the highest potential for such
an incident. These must all be addressed during the hazard analysis.
Potential problem areas (each item is rated in two columns: Color Pre, Color Post)

Loading Reactor
o Feed contains H2S
o Proper Ventilation not in Place
o H2S monitor fails: Personal / Area
Reaction Step
o Flange Leak
o Mixer Seal Fails
o Pump Seal Fails
o Other potential Loss of Containment
Emptying Reactor
o Proper Ventilation not in Place
o H2S monitor fails: Personal / Area

A knowledgeable multi-skilled team is assembled to generate this
list and to carry out the analysis.

2. Probability it may occur.
Some companies have standard probability lists to work from. In
some cases, you will be asked to develop your own list of
probabilities that a particular incident may occur.
Spreadsheet 2 gives a sample probability list.
Ways to Express Probability (failure probability increases toward the top of the list)

It has happened more than once a year at the Location
It has happened at the Location or more than once a year in the Company
It has happened in the Company or more than once a year in the Industry
Heard of in the Industry
Never heard of in the Industry

3. Consequences if it does.
Similarly, your company may have a standardized consequences list.
It may include additional categories. You may also be asked to
develop such a list.
Spreadsheet 3 is an example.

Consequences in Various Areas (consequences increase from left to right)

The table rates five areas: People / Health Issues, Environmental
Issues, Product or Service Quality, Asset or Financial Loss, and
Company Reputation. Only fragments of the cells survive, for example:
no health or injury risks, first aid case, lost time injury, disability,
negligible effect, major effect, local TV coverage, slight damage,
and financial loss thresholds of $0.1 mln, $1.0 mln and $10 mln.

4. Overall risk matrix used to assess potential problems.

Spreadsheet 4 shows the total risk matrix when probability and
consequences are plotted against each other. Again, corporate
HS&E may provide this matrix to you, but you need to understand how
it is generated. The color ratings MUST be set (or agreed to) by
senior company management. These rankings represent the amount
of risk the COMPANY is willing to take. As a professional or
experienced location staff member, you are obligated to make sure
the result is the best it can be.
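
A minimal sketch of how such a matrix might be represented follows. The probability categories are taken from the sample list in this chapter; the 5 x 5 color grid, the consequence level numbering, and the example ratings are hypothetical, since the real color ratings must come from senior management.

```python
# Probability categories from the sample list, least likely first.
PROBABILITY = [
    "Never heard of in the Industry",
    "Heard of in the Industry",
    "It has happened in the Company or more than once a year in the Industry",
    "It has happened at the Location or more than once a year in the Company",
    "It has happened more than once a year at the Location",
]

# Hypothetical color grid: rows follow PROBABILITY, columns are consequence
# levels 0 (lowest) to 4 (highest). Real ratings are set by senior management.
COLOR_GRID = [
    ["green", "green", "green", "yellow", "yellow"],
    ["green", "green", "yellow", "yellow", "red"],
    ["green", "yellow", "yellow", "red", "red"],
    ["yellow", "yellow", "red", "red", "red"],
    ["yellow", "red", "red", "red", "red"],
]


def rate(probability: str, consequence_level: int) -> str:
    """Look up the color for a probability category and a consequence level."""
    return COLOR_GRID[PROBABILITY.index(probability)][consequence_level]


# Pre- and post-mitigation rating for one potential problem, for example a
# failed H2S monitor: mitigation lowers the probability, not the consequence.
pre = rate("It has happened at the Location or more than once a year in the Company", 3)
post = rate("Heard of in the Industry", 3)
print("Color Pre:", pre, "| Color Post:", post)
```

Storing the grid as data rather than as a formula keeps the management decision about acceptable risk visible and easy to review.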

Risk Matrix - Probability vs. Consequences

Ways to Express Probability (rows, from most to least likely):

It has happened more than once a year at the Location
It has happened at the Location or more than once a year in the Company
It has happened in the Company or more than once a year in the Industry
Heard of in the Industry
Never heard of in the Industry

Consequences in Various Areas (columns, consequences increase from left to right), rated for each of:

People / Health Issues
Environmental Issues
Product or Service Quality
Asset or Financial Loss
Company Reputation

[Individual cell descriptions and color ratings are not recoverable from the source.]


You should have a good appreciation for systematic review
You should have a good understanding of how to develop a
Probability & Consequences system for identifying overall
Risk / Hazard for a simple situation


Study the What If areas to explore and suggest additional
Categories and / or additional Sub-topics for either your suggested
Category or one of the existing Categories. Target a minimum of 5
to 10 suggestions (there are about 20 more in the full version of
this example).

Chapter 8 Analyzing Hazards

Analyzing Hazards
Task Checklists (S/U, S/D)
Task Specific Checklist (JSA)
Review Hazards What If List
The world is a dangerous place and the workplace even more so.
Hazards lurk everywhere. How do we analyze and manage the
existing hazards that we find? In this session we will see how we
can develop checklists for startups, shutdowns, etc. to help mitigate
the hazards that are expected to be encountered. Also, we can
develop checklists for job safety analysis that will help prevent
incidents on a day-to-day basis. One of the techniques we will look
at is the "what if" scenario. "What if this, what if that" is a good
simple technique that is very useful in helping to prevent "gotchas"
from happening. So, how does this work?
Experienced personnel imagine a series of incidents that can happen
and ask questions that begin, "What if?"
Each question represents a potential failure in the facility or wrong/
faulty operation of the facility.
The engineers/ operators respond by evaluating the scenario and
determining if a potential hazard can possibly occur. If yes, then the
prevalent safeguards are checked to see if these can prevent/
mitigate the potential problem or if modifications are necessary.
Some example questions:
Equipment failures
o What if a valve packing leaks?
o What if an autostart fails?
o What if a furnace burner plugs?
Human error
o What if a step in the procedure is missed?
o What if a limit is exceeded by the operator?
o What if a pump is shut down inadvertently?
External events
o What if the unit floods?
o What if the temperature suddenly drops?
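
The what-if review described above can be recorded in a simple structure. The sketch below is one possible layout, not a prescribed format: the `WhatIf` class, its field names, and the answers given for each question are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WhatIf:
    category: str
    question: str
    hazard_possible: bool                     # team's judgment of the scenario
    safeguards: List[str] = field(default_factory=list)
    safeguards_adequate: bool = False         # can existing safeguards prevent/mitigate?

    def needs_modification(self) -> bool:
        """A modification is needed when a hazard is possible and safeguards fall short."""
        return self.hazard_possible and not self.safeguards_adequate


# Example review entries; the answers are invented for illustration.
review = [
    WhatIf("Equipment failures", "What if a valve packing leaks?", True,
           ["Area gas monitor"], safeguards_adequate=True),
    WhatIf("Human error", "What if a step in the procedure is missed?", True),
    WhatIf("External events", "What if the unit floods?", False),
]

action_items = [item.question for item in review if item.needs_modification()]
print(action_items)
```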

Appreciate the Value of Check Lists
Understand where Check Lists fit into Procedures
Understand Fit for Purpose in terms of who generates and
who approves a Check List
Our objective today is to develop an appreciation of checklists and
how they can be used to reduce the possibility of hazardous events.
Checklists are the simplest yet most effective means of hazard
analysis. Checklists involve using a detailed list of prepared
questions about the design and operation of the facility. The level of
detail is adaptable. The only limiting factor is the expertise of the
author(s) of the checklist! That is why checklists must be
prepared by experts who have conducted many hazard analyses
and who have extensive experience with the design, operation
and maintenance of process facilities. Even checklists backed by
experience and expertise will not be all-inclusive. However,
nothing should be overlooked.

Good checklists are precise and easy to use. They provide
reminders of the most important and critical steps to follow. They
are practical. "Read and Do" checklists expect you to read each item
and then do it. "Do and Confirm" checklists allow you to do a task and
then confirm its correctness against the checklist. Checklists work when
they are well practised. (From The Checklist Manifesto.)
Checklists continue to be effective if they are audited and updated.
Frequent revisiting and continuous improvement are essential.
Speed and efficiency are the chief advantages of a checklist.
We will also look at how the checklist will fit into the procedures
that are used to guide day-to-day operations. They can be used to
ascertain everyday hazards. For example, accidents can be the result
of process equipment failure, human error or external factors. Here the
checklist can have yes/no questions to find out if the right type of
equipment is used, if the procedures are properly followed and are
completed as per requirement, etc.
Checklists can cover day-to-day functions such as alarms, chemical
materials, control systems, documentation and training,
instrumentation, piping, pumps, vessels, etc. The questions could
include, for example, whether each alarm can be recognized by its
cause, and whether alarms differ for different causes.
Finally, we'll look into how to build checklists that are "fit for
purpose". That is, they will do what you want them to do.
So, what is an example of fit for purpose? Example of Fit for
Purpose: "Would you give me directions to the men's room?" 1)
Here in the Forney Building. 2) While visiting a competitor's
laboratory facilities. 3) While visiting the White House. Fit for

In addition to the checklist and what-if methods for process hazard
analysis, there is a combination of these two that can be used
effectively. That is the What-If / Checklist Method.
This approach combines the two methods to benefit from the
advantages of each. The hazard analysis team works
through a checklist. However, they do not just tick boxes or answer
the questions; for each question a what-if scenario is imagined
and discussed. Any important points thus noticed are incorporated
into the checklist.

Today's roadmap
Understand when a procedure might be required.
Know when to stand firm that a Check List be followed.
Understand how to develop a Check List; recognizing you will
likely require help.
Procedures are a fixed, sequential set of instructions to perform a
task or an activity, with definite start and stop points. Procedures
should be written with input from those who will implement them.
The person developing the procedure should be experienced and
must have expertise in the subject.
A checklist, as we have seen, is a list of routine activities for a
task that needs to be carried out again and again. Checklists prove
very useful for doing the task right every time, and for ensuring
consistency and completeness in carrying out a task. Human error
can be avoided by using a checklist.
Procedures may contain checklists. Checklists must have a
designated approval level, as must procedures. If deviations are made
from a checklist, there must be a process in place to ensure that the
deviation will not produce a process safety event.
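
One way to picture the approval and deviation requirements is a checklist object that refuses an unapproved deviation and reports what is still open. This is a sketch under assumptions: the class, the role names, and the example steps are hypothetical, and a real deviation process would involve a documented safety review rather than just a name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Checklist:
    title: str
    approved_by: str                 # designated approval level (hypothetical role)
    steps: List[str]
    completed: Set[str] = field(default_factory=set)
    deviations: Dict[str, str] = field(default_factory=dict)  # step -> approver

    def complete(self, step: str) -> None:
        if step not in self.steps:
            raise ValueError(f"unknown step: {step}")
        self.completed.add(step)

    def deviate(self, step: str, approver: str) -> None:
        """Record a deviation from a step; it must name an approver."""
        if step not in self.steps:
            raise ValueError(f"unknown step: {step}")
        if not approver:
            raise ValueError("a deviation must be approved")
        self.deviations[step] = approver

    def open_items(self) -> List[str]:
        """Steps that are neither completed nor covered by an approved deviation."""
        return [s for s in self.steps
                if s not in self.completed and s not in self.deviations]


startup = Checklist(
    "Routine startup",
    approved_by="Operations Superintendent",
    steps=["Verify blinds removed", "Confirm H2S monitors in service",
           "Line up cooling water"],
)
startup.complete("Verify blinds removed")
startup.deviate("Line up cooling water", approver="Shift Supervisor")
print(startup.open_items())
```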
A checklist is a list of items for consideration. They can be in the
form of questions or actions to be carried out. They can have a
scoring system or they can collect comments. Checklists can speed
up the collection of information by using tick-boxes and rating
scales. They need to be carefully designed to make sure that when
they are completed, the results are reliable and true. Checklists can
act as memory aids to make sure that all the relevant issues have
been considered.
Checklists need to be relevant to whatever you are checking, and
detailed enough to enable you to do a thorough job. A checklist
needs to be constructed as questions and clear steps, in some sort
of logical sequence. The best way to do this is to work through all of
the issues that are likely to be important and prepare a set of
written comments about the product, task or environment. Out of
these written comments you can prepare your checklist.

Startups & Shutdowns:

Initial New Plant Startup
Routine Startups (Restarts)
Routine Shutdowns (Planned)
Emergency Shutdowns
Restarts from Emergency Shutdowns
An initial new plant startup must have a very detailed set of
checklists to ensure the desired goal of a safe startup is
accomplished. To ensure the checklists are complete, they must be
a part of the pre-startup safety review process. An initial startup of
a plant will have many one-time checks to ensure that construction, as
well as the plant process, is where it is expected to be for a
successful and safe operation. Following the initial startup,
subsequent startups will entail many of the same checklists.
Routine shutdowns normally include checklists that ensure that the
plant is ready for whatever maintenance is planned during the
outage. As such, a critical checklist of a routine shutdown is the
blind list. The blind list is used to verify proper isolation of the plant
from the active plant.
After a normal shutdown, a routine startup should use a complete
startup checklist. Many locations will use a special startup checklist
depending on the type of shutdown, although the author believes
that this is a misguided approach. In the author's opinion, one
checklist and procedure is the correct approach.
Emergency shutdown checklists are a post-shutdown review process
to ensure that all is where it should be. If that is not the case, the
checklist gives a systematic approach to getting everything into the
appropriate position.
Restarts after an emergency shutdown must be conducted in a very
systematic fashion. The checklists used MUST verify the status of
the unit every step of the way to ensure a safe startup.

Routine Operations:
Procedure for every unit operation
Procedure for operational changes
o Rates / Conversion / Product specifications
Explicit checklist for maintenance activities
o Hand offs between operations and maintenance
o Job Safety Assessment for each task

Every unit operation must have a certified procedure in place to
be used for that operation. The procedure must be written by
qualified individuals and verified as accurate and current. Hence,
the certified nature of the procedure! This certification must be
completed every year.
Operational changes that are outside of normal operating ranges
need to have a specific process in place to ensure that the change
does not go outside of the safe operating range for the unit.
Similarly, changes such as temperature changes must be within
specific ranges. Temperature changes are part of a specific
procedure to ensure that they do not occur so rapidly that they
lead to discontinuities in the unit's mechanical structure. No unit
is made with the same metallurgy throughout, hence different growth
rates occur during heating. Expansion loops will take this into
account, but must have time to equilibrate.
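
The idea of keeping temperature changes within a permitted rate can be sketched as a simple check on logged readings. The function name, the example readings and the 50 degrees per hour limit below are assumptions chosen for illustration; the real limit comes from the unit's own procedure.

```python
def ramp_rate_violations(readings, max_rate_per_hour):
    """Find intervals where temperature changed faster than allowed.

    readings: list of (time_in_hours, temperature) pairs with strictly
              increasing times.
    Returns a list of (start_time, end_time, rate) for each interval whose
    absolute rate of change exceeds max_rate_per_hour.
    """
    violations = []
    # Compare each reading with the one that follows it.
    for (t0, temp0), (t1, temp1) in zip(readings, readings[1:]):
        rate = (temp1 - temp0) / (t1 - t0)
        if abs(rate) > max_rate_per_hour:
            violations.append((t0, t1, rate))
    return violations


# Hypothetical heat-up log: (hours since start, temperature).
heatup = [(0.0, 25.0), (1.0, 70.0), (2.0, 150.0), (3.0, 195.0)]
print(ramp_rate_violations(heatup, max_rate_per_hour=50.0))
```

Using the absolute rate means the same check covers cool-down as well as heat-up.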
Maintenance activities require specific starting points that are
identified by checklists that ensure the unit will be correctly
positioned for the activity to be completed. Similarly the hand off
back to operations requires a defined condition, and that is clarified
by a checklist.
Finally, before each job is started, a job safety analysis must be
completed to ensure that all conditions are identified to both operations
and the mechanical staff, which will ensure a safe transition
and job completion.

Discussion Topics:
Survey the plant for cap and plug compliance.

Prepare a pump for maintenance.

Swap parallel (spared) 1800 hp compressors.
Prepare a distillation column for maintenance.
Procedure vs. Checklist: Procedure is required for more complex
tasks that require explaining not just the steps to carry out but also
the logic and more detail about HOW to carry out the steps.
Discuss in class who should be involved in developing procedures
and checklists for various activities. Who is ULTIMATELY
accountable but how is that accountability distributed through the

When is a checklist needed and when is a procedure with
checklists required?
Who is accountable for various levels of activities?
Who must set the tone in the organization so we get it right?

Would starting a car be best described by a checklist or a
procedure? (Recognizing we all pretty much have this activity
Develop the required document.
Individual work. Extra credit for completeness.

Basic Required Systems

Before we begin, we should note that the PSM regulation is NOT
prescriptive. Rather, it is performance driven. Very few
"Thou shalt" items exist in the regulation. A PHA (process hazard
analysis) is required to be done (or revalidated) every five years,
and operating procedures are required to be certified once a year.
The required quality of these is not clearly defined, simply stated.
So, today we'll look at:
1. Reactive Hazards
2. Inherently Safer Design
3. PHA & PHA Re-evaluations
4. Pre Start Up Safety Review
5. Operating Procedures
6. Material Safety Data Sheet system
7. Management of Change

At the end of today, you will be able to:

Put in context what these elements do to support the overall PSM

Point 1: Reactive Hazards

CSB video on reactive hazards

Point 2: Inherently Safer Design

Inherently Safer Design (ISD) permanently eliminates or
reduces hazards before a process is built
ISD is a philosophy
ISD is an iterative process
Safe design and operation options cover a wide spectrum
There is no clear boundary between ISD and other strategies
Trevor Kletz, ICI, UK (1977) developed the concept of Inherently
Safer Design (also known as Inherently Safer Technology) in
response to the 1974 Flixborough, UK, explosion. He named the concept
and developed a set of design principles for the chemical industry.
What is inherently safer design?
Inherent - existing in something as a permanent and inseparable
Inherent safety thus is built in, not added on
Inherently safer design is an approach to process design and
operation, which builds in safety, health and environmental
considerations at the start. It tries to avoid or eliminate hazards or
reduce their magnitude, severity or likelihood of occurrence by
careful attention to the fundamental design and layout.
In reality no design can be completely safe, however you can have
an inherently safer design.
Hazards are eliminated or significantly reduced rather than
controlled and managed. The means by which the hazards are
eliminated or reduced are so fundamental to the design of the
process that they cannot be changed or defeated without changing
the process. In many cases this will result in simpler and cheaper
plants, because the extensive safety systems that may otherwise be
required to control major hazards introduce cost and
complexity to a plant.
ISD is more a philosophy and way of thinking than a specific set of
tools and methods. It is a philosophy for the design and operation
of chemical plants, and is applied across the design and operation life
cycle, including manufacture, transport, storage, use, and disposal.
It is generally applicable to any technology. Inherently safer design
is neither a specific technology nor a set of tools and activities at
this point in its development. It continues to evolve, and specific
tools and techniques for application of inherently safer design are in
early stages of development.
ISD is an iterative process which considers options, including
eliminating a hazard, reducing a hazard, substituting a less
hazardous material, using less hazardous process conditions, and
designing a process to reduce the potential for, or consequences of,
human error, equipment failure, or intentional harm.
The reason it is iterative is that ISD is still a developing field. There
may not be any standardized safe processes available. So even if
inherently safer technology is incorporated into the design, it
may still be largely unexplored territory, and revisiting the design and
the process often to determine its safety is essential. Also,
the chemical industry is very complex and there are dependencies
throughout the system, so any change will have cascading effects
throughout the chemical ecosystem. A process that
appears safer initially may, in reality, be less safe when you get into
the details and implications of the design.
Therefore ISD is a way of continually assessing and verifying that
the company is making the best choices in the processes that it
uses. Even after the plant is built, it is necessary to keep
thinking about ISD in order to develop inherently safer operating and
maintenance procedures, or when making changes. ISD
opportunities may also present themselves with technology advances.
Safe design and operation cover a wide spectrum from inherent
through passive, active and procedural risk management strategies.
These are designs where engineers employ a variety of techniques
to achieve classical risk reduction through design. In fact ISD can
be incorporated into PSM activities such as PHA, management of
change, incident investigation, mechanical integrity, etc., which are
normally done at every stage of process life cycle from initial
technology selection through detailed design and operation.
The quantification of inherent safety is challenging because it poses
three important problems:
Subjectivity: many of the factors that must be analyzed
require subjective evaluation and expert judgment.
Uncertainty: factors that are not subjective can present
uncertainties that must be taken into account during the
calculations in order to avoid undesirable results.
Complexity: many factors affect the overall level of
inherent safety; however, it is difficult to evaluate all the
factors at the same time using one comparable scale.
There is no clear boundary between ISD and overall safe design and

Point 2: Inherently Safer Design

ISDs are relative:
Inherently safer designs only have meaning when compared
to a different technology
A technology may be inherently safer than another with
respect to some hazards while being inherently less safe
with respect to others
ISDs are based on an informed decision process
All have some potential to transfer risk from one impacted
population to another
Technical and economic feasibility
A technology can only be described as inherently safer when
compared to a different technology, including a description of the
hazard or set of hazards being considered, their location, and the
potentially affected population. A technology may be inherently
safer than another with respect to some hazards while being
inherently less safe with respect to others, and may not be safe
enough to meet societal expectations. Also chemical processes and
plants have multiple hazards, and different technologies will have
different inherent safety characteristics with respect to each of
those multiple hazards.
ISDs are based on an informed decision process. Because an
option may be inherently safer with regard to some hazards and
inherently less safe with regard to others, decisions about the
optimum strategy for managing risks from all hazards are required.
The decision process must consider the entire life cycle, the full
spectrum of hazards and risks, and the potential for transfer of risk
from one impacted population to another. Technical and economic
feasibility of options must also be considered.
ISD as an informed decision process starts by instructing the
design engineers on the basic principles of ISD as a design
philosophy. This makes the engineers aware of the priorities
and options available to them, so they are more apt to apply them. They
are in the best position to invent, design, and promote inherently
safer alternatives. PHA can give them a clear idea about the
hazards in the process and they can incorporate safety into the
basic design so as to prevent or mitigate at least the known
All ISDs have some potential to transfer risk from one impacted
population to another. That is because a design can be inherently safer
in the context of one particular hazard, or perhaps several. However, it is
only a remote possibility that any technology will be inherently safer
with respect to all possible hazards. Any change in the technology
to reduce one hazard may impact other hazards, positively or
Also, although decision makers must be able to account for local
conditions and concerns in their decision, some technology choices
that are inherently safer locally may actually result in an increased
hazard when considered globally.
In addition to all these considerations, technical and economic
feasibility also need to be considered. If a suitable technology is
found, then ISD is considered to be an economically better choice.
The means by which the hazards are eliminated or reduced are
incorporated in the basic design. Unless the process is changed,
these cannot be changed. This safer design is simpler and will result
in cheaper plants, as the cost and complexity of the hazard control
systems would be minimized. This cost includes both the initial
investment for safety equipment and the ongoing operating
cost for maintenance and operation of safety systems through the
life of the plant.

Point 2: Levels of Inherently Safer Design

First Order
Inherently safer design refers to the identification of alternatives
that completely eliminate a particular hazard. Hazard Elimination is
the first priority.
Second Order
The second priority is consequence reduction where hazards cannot
be completely eliminated. Inherently safer design reduces the
magnitude of a hazard, or makes an accident associated with a
hazard less likely to occur by the design of the equipment. The focus
is to find less hazardous solutions to accomplish the same design
objective by techniques such as reducing exposure to a hazard,
reducing inventory of hazardous materials, and substitution of less
hazardous materials.
Layers of Protection
Likelihood Reduction - reduce the likelihood of events occurring by
techniques such as simplification and clarity (lowering the likelihood
of an initiating event), and layers of protection and redundancy of
safeguards (to reduce the progression of an incident). Include risk
management equipment and management systems often
categorized as
Approaches to inherently safer design fall into these categories:

Minimize - significantly reduce the quantity of hazardous material
or energy in the system, or eliminate the hazard entirely if possible.
Reduce the size of equipment operating under hazardous conditions.
Substitute - replace a hazardous material with a less hazardous
substance, or a hazardous chemistry with a less hazardous
chemistry and process.
Moderate - reduce the hazards of a process by handling materials in
a less hazardous form, or process alternatives that operate at less
hazardous conditions, for example at lower temperatures and
Simplify - eliminate unnecessary complexity to make plants more
user friendly and less prone to human error and incorrect

Point 3: PHA & PHA Re-evaluations

PHA - Process Hazards Analysis
Broad based skill sets
Piping and Instrumentation drawing basis
Regularly re-visited
PHA is an organized and systematic method to identify and analyze
potential hazards and related accidents associated with processing
or handling highly hazardous chemicals. The PSM Rule allows the
use of different analysis methods, but the selected method must be
based on the process being analyzed.
A PHA helps employers and workers to make decisions for
improving safety and reducing the consequences of unwanted or
unplanned releases of hazardous chemicals. It is used to analyze
potential causes and consequences of fires, explosions, releases of
toxic or flammable chemicals, and major spills of hazardous
chemicals. It focuses on equipment, instrumentation, utilities,
routine and non-routine human actions, and external factors that
might impact a process.
The PHA team needs to have broad-based skill sets. This is because
chemical processes are complex and have myriad
aspects. There are various technologies and methods. One or more
established methodologies appropriate to the complexity of the
process should be used. The team members should represent a
cross-section of disciplines and functions, typically including
operations, engineering, maintenance, and process design. This
includes personnel with experience and knowledge specific to the
process being evaluated and the hazard analysis methodology being
used. Having all the disciplines present helps ensure that all types
of hazard scenarios are discussed. Furthermore, the interaction
between team members helps uncover those hazards that may be
created due to communication difficulties or misunderstandings
Block flow diagrams may be used to show major process equipment
and interconnecting process flow lines, flow rates, stream
composition, temperatures, and pressures. Construction materials,
pump capacities, head pressure, net positive suction head required,
compressor horsepower, and vessel design pressures and
temperatures need to be shown when necessary for clarity. Major
components of control loops are usually shown along with key
Piping and instrumentation diagrams (P&IDs), which are required
under process equipment information, may be more appropriate to
show some of these details. These are based on the Process Flow
Diagram and represent the technical process with graphical symbols
for equipment and piping as well as graphical symbols for process
measurement and control functions. They show all of the piping,
including the physical sequence of branches, reducers, valves,
equipment, instrumentation and control interlocks.
A PHA should be regularly revisited, at least every five years, so that
technological advances, safer designs and other safeguards can be
considered. It is suggested that the PHA team should have at least
some members who were not on the first PHA team. This
can give a fresh perspective to the PHA.
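
A revalidation schedule of this kind is easy to track in code. The sketch below computes when the next PHA revalidation falls due and whether it is overdue; the function names and the example dates are hypothetical, and the five-year interval is simply the figure given in the text.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day `years` later."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February landing in a non-leap year: fall back to 28 February.
        return d.replace(year=d.year + years, day=28)


def pha_revalidation_due(last_pha, interval_years=5):
    """Date by which the PHA must be redone or revalidated."""
    return add_years(last_pha, interval_years)


def is_overdue(last_pha, today, interval_years=5):
    """True if `today` is past the revalidation due date."""
    return today > pha_revalidation_due(last_pha, interval_years)


# Hypothetical example: last PHA completed on 15 March 2020.
last = date(2020, 3, 15)
print(pha_revalidation_due(last))
print(is_overdue(last, today=date(2025, 3, 16)))
```

The same helper, called with interval_years=1, could track an annual requirement such as the yearly certification of operating procedures.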

Point 4: Pre Startup Safety Review

MOC action items
HAZOP action items
o Alarm
o Instruments
o Overspeed trips
o Operating Envelopes
Staffing Plans
Startup Procedure
A comprehensive PSSR includes MOC action items, HAZOP items,
alarm checklists, training, overspeed trips, safe operating envelopes,
staffing plans, etc.
So, let's see what they all mean.
MOC (management of change): Since the last time the unit started
up some things will have changed. The fundamental purpose of a
PSSR is to ensure that any changes that are made to a facility or
equipment meet the original design or operating intent. Or else,
after an interval of time, certain things may have changed. The
PSSR aims to review any changes that may have crept into the
system during the detailed engineering and construction phases of a
project. PSSR covers not only equipment, but also soft issues, such
as operating procedures and training.
All these changes MUST be reviewed to ensure they are compatible
with the unit and procedures in place. It may seem redundant to
review the MOC again, but a different set of eyes may find a gap.
PSM is all about catching problems before they become problems.
The operations engineer would develop a proposal for the MOC
process. Even if the proposed change is accepted, the system is not
put into operation until a proper PSSR has been done.
The HAZOP (Hazard and Operability) method is a widely used
technique for identifying the hazards on process facilities. It is a
structured and systematic technique for system examination and
risk management.
Hazard is any operation that could possibly cause a catastrophic
release of toxic, flammable or explosive chemicals or any action
that could result in injury to personnel.
Operability is any operation inside the design envelope that would
cause a shutdown that could possibly lead to a violation of
environmental, health or safety regulations or negatively impact
Essentially the HAZOP procedure involves taking a full description of
a process and systematically questioning every part of it to
establish how deviations from the design intent can arise. Once
identified, an assessment is made as to whether such deviations
and their consequences can have a negative effect upon the safe
and efficient operation of the plant. If considered necessary, action
is then taken to remedy the situation.
A review of the previous HAZOP action items is intended to catch
potential hazards that have been identified and ensure they are
corrected prior to startup of the unit. Many times a HAZOP will
have action items that can only be implemented while a unit is shut
down for a turnaround (TAR) and the PSSR review is intended to
double check that all have been put in place prior to the startup.
Many checklists are a normal part of a unit's life, and reviewing them
prior to a startup is intended to, again, double check that they are
all current and accurate. Among them would be lists of validated
measurement devices (pressure, level, flow), high/low limits, fail-safe
positions on control valves, critical corrective actions, critical
alarms, overspeed trips, and correct operating envelopes (limits).
During a normal run a part of the duty of the unit personnel is to
make sure all checklists are appropriate. If anomalies are found the
correction should be made as quickly as possible, however, as with
the HAZOP action items, sometimes a change must be made only
during the time when the unit is shut down.
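
The review of high/low limits and operating envelopes described above can be pictured as a comparison of current readings against a table of limits. This is only a sketch: the function name, the instrument tags and the limit values are invented for illustration and do not describe any real unit.

```python
def out_of_envelope(readings, envelope):
    """Compare instrument readings with their operating envelope.

    envelope: dict mapping tag -> (low_limit, high_limit)
    readings: dict mapping tag -> current value
    Returns a dict of tag -> "low", "high" or "missing" for every tag that
    is outside its limits or has no reading at all.
    """
    findings = {}
    for tag, (low, high) in envelope.items():
        if tag not in readings:
            findings[tag] = "missing"   # no validated reading available
        elif readings[tag] < low:
            findings[tag] = "low"
        elif readings[tag] > high:
            findings[tag] = "high"
    return findings


# Hypothetical tags and limits.
envelope = {"PI-101": (2.0, 8.0), "LI-102": (20.0, 80.0), "FI-103": (5.0, 50.0)}
readings = {"PI-101": 8.6, "LI-102": 45.0}
print(out_of_envelope(readings, envelope))
```

Flagging a missing reading separately from a high or low one reflects the point that the list of measurement devices itself must be validated before startup.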
Staffing plans for the unit's startup should be reviewed to ensure
adequate personnel are on the unit. Operators, mechanical,
instrument/electrical, and supervision, including the management of
the unit, must be on site 24/7 until stable operations are sustained.
The plan should be in place and all affected personnel should be
clearly informed.
Finally, proper training of all affected personnel should be complete
and include training of the startup procedures with dry runs.


The PSM regulations do not prescribe how the end result should be
obtained, simply that the end result should be a safe operation.


Point 5: Operating Procedures

Operating procedures describe tasks to be performed, data to be
recorded, operating conditions to be maintained, samples to be
collected, and safety and health precautions to be taken. The
procedures need to be technically accurate, understandable to
employees, and revised periodically to ensure that they reflect
current operations. The process safety information package helps to
ensure that the operating procedures and practices are consistent
with the known hazards of the chemicals in the process and that the
operating parameters are correct. Operating procedures should be
reviewed by engineering staff and operating personnel to ensure
their accuracy and that they provide practical instructions on how to
actually carry out job duties safely. Also the employer must certify
annually that the operating procedures are current and accurate.
Comprehensive written operating procedures should have step-by-step
"how to" instructions. These should be generated, where
applicable, to address among other things:
Initial Startup
Normal operations
Normal shutdown
Emergency shutdown
Emergency operations
Startup following emergency shutdown
These instructions should include what the normally expected limits
should be and what to do if any are exceeded (sometimes these are
called critical corrective actions). Normally any procedure will
consist of specific instructions that are signed and dated as they are
completed. The exception is emergency procedures. These
should be "gun drilled" on a regular basis so that the crews can do
them in their sleep. After an emergency has been addressed the
emergency procedures should be pulled out and used as a checklist
to ensure that all vital steps have been completed correctly.
Following procedures seems to be a simple matter, but in practice
knowledge of the unit and how to cope is one of the most
important things to include in these procedures. A unit will be
staffed by a variety of people who have differing ideas on what is
the best way to do things. Although many experienced
operators think they have unique knowledge and that they are
covered, they are limited to their own experience. Thus the unit
engineer needs to ensure that all contingencies are covered in the
procedures, so that the people running the unit are
vested with the most experience available. This also ensures that a
common approach is used when running the unit under a variety of
Each section should be read in detail to gain understanding about
the particular requirements of the activity prior to undertaking the
activity itself and completing the associated checklist. The checklist
will serve as a permanent record of the activity, and can be
reviewed if future modifications are undertaken.

Point 6: MSDS
Product Stewardship
Properties of material
Procedures to handle in a safe manner


A material safety data sheet (MSDS) is a document that contains
information for the safe handling, use, storage and disposal of
potentially hazardous chemicals.
Wikipedia defines MSDS as a form, with data regarding the
properties of a particular substance. An important component of
product stewardship and workplace safety, it is intended to provide
workers and emergency personnel with procedures for handling or
working with that substance in a safe manner, and includes
information such as physical data (melting point, boiling point, flash
point, etc.), toxicity, health effects, first aid, reactivity, storage,
disposal, protective equipment, and spill-handling procedures.
MSDS formats can vary from source to source within a country
depending on national requirements.
MSDS (material safety data sheets) are a widely used system for
cataloging information on chemicals, chemical compounds, and
chemical mixtures. MSDS information may include instructions for
the safe use and potential hazards associated with a particular
material or product. These data sheets can be found anywhere
where chemicals are being used.
Information included in a Material Safety Data Sheet aids in the
selection of safe products, helps you understand the potential
health and physical hazards of a chemical and describes how to
respond effectively to exposure situations. Although there is an
effort currently underway to standardize MSDSs, the quality of
individual MSDSs varies.
Under product stewardship, all participants in the product life cycle
such as designers, suppliers, manufacturers, distributors, retailers,
users, recyclers and disposers share responsibility for the
environmental effects of the products. What is unique about product
stewardship is its emphasis on the entire product system in
achieving sustainable development. Product stewardship extends
manufacturers' responsibility for products to the disposal and
recycling stages. This shift in responsibility provides an incentive for
manufacturers to think differently about resources and materials so
that toxicity reduction, reuse and recycling are considered at the
product design stage.
MSDS contain procedures to handle hazardous chemicals in a safe

Point 7: Management of Change

Most important element of PSM
Change must be understood by all affected
Written procedures in place to manage change
No compromise with Process Safety
No plant or system is ever static. In PSM, change includes all
modifications to equipment, procedures, raw materials, and
processing conditions other than "replacement in kind." These
changes must be properly managed by identifying and reviewing
them prior to implementing them.
MOC is a process to evaluate and properly manage any
modifications to the design, control, or operations (including
staffing) of a covered process. These written procedures must
ensure that the following considerations are addressed prior to any
The technical basis for the proposed change,
Impact of the change on employee safety and health,
Modifications to operating procedures,
Necessary time period for the change, and
Authorization requirements for the proposed change.
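
The five considerations listed above lend themselves to a simple completeness gate on an MOC record. The sketch below assumes a hypothetical record stored as a dictionary; the field names and the example entries are illustrative and are not part of the regulation.

```python
# One key per consideration listed in the text (key names are hypothetical).
REQUIRED_CONSIDERATIONS = (
    "technical_basis",
    "safety_and_health_impact",
    "operating_procedure_modifications",
    "time_period",
    "authorization",
)


def missing_considerations(moc_record):
    """Return the considerations that are absent or left blank."""
    missing = []
    for key in REQUIRED_CONSIDERATIONS:
        value = moc_record.get(key)
        # Only a non-empty string counts as "addressed" in this sketch.
        if not (isinstance(value, str) and value.strip()):
            missing.append(key)
    return missing


def may_implement(moc_record):
    """A change may proceed only when every consideration is addressed."""
    return not missing_considerations(moc_record)


# Hypothetical, partly completed MOC record.
moc = {
    "technical_basis": "Replace carbon steel spool with 316 SS",
    "safety_and_health_impact": "Reviewed, no new exposure routes",
    "operating_procedure_modifications": "",
    "time_period": "Permanent",
}
print(missing_considerations(moc))
print(may_implement(moc))
```

Returning the list of gaps rather than a bare yes or no tells the originator exactly what must be completed before the change is implemented.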

It is a process intended to
Assure no unintended hazards are introduced
Assure risks are properly evaluated & minimized
Keep current
o Process safety information, hazard analyses, operating
procedures, training, mechanical integrity, pre-startup
safety review,
Be completed before changes are implemented
Employees who operate a process, and maintenance and contract
employees whose job tasks will be affected by a change in the
process must be informed of, and trained in, the change prior to
startup of the process or startup of the affected part of the process.
If a change covered by these procedures results in a change in the
required process safety information, such information also must be
updated accordingly. If a change covered by these procedures
changes the required operating procedures or practices, they also
must be updated.

Reactive Hazards
Inherently Safer Design
PHA & PHA Re-evaluations
Pre Start Up Safety Review
Operating Procedures
Material Safety Data Sheet system
Management of Change


Have students read selected overview material from the text for
each topic.


Chapter 10

Safety Instrumented Systems

Some definitions:
What is a Safety Instrumented System (SIS)?
A Safety Instrumented System (SIS) is basically a control system
for critical processes and consists of an engineered set of hardware
and software controls.
When a process or a system encounters conditions that may be
hazardous by themselves, or if not controlled may lead to a hazard,
SIS comes into play. The function of SIS is to take actions
automatically to prevent the hazard or to mitigate its consequences.
An SIS is engineered to perform "specific control functions" to
failsafe or maintain safe operation of a process when unacceptable
or dangerous conditions occur. Safety Instrumented Systems must
be independent from all other control systems that control the same
equipment in order to ensure SIS functionality is not compromised.
SIS is composed of the same types of control elements (including
sensors, logic solvers, actuators and other control equipment) as a
Basic Process Control System (BPCS). However, all of the control
elements in an SIS are dedicated solely to the proper functioning of
the SIS. (Ref: Wikipedia)
What is a Safety Instrumented Function (SIF)?
The specific control functions performed by an SIS are called Safety
Instrumented Functions (SIF). These functions are included in the
basic risk reduction strategy and are meant to reduce the
probability of a recognized SH&E (safety, health and environmental)
event. These known risks could be
minor to catastrophic. A SIF protects against such risks to attain or
maintain a safe state for the process with respect to a specific
hazardous event.
The SIF definition also presupposes a reasonable knowledge of risks
associated with the chemical process, and the exact means that are
utilized to mitigate these risks. Safety Instrumented Functions are
intended to protect against specific and identifiable hazards instead
of general hazards, such as fire and gas explosion. An acceptable
safe failure rate is also normally specified for a SIF.
What is a Safety Integrity Level (SIL)?
Safety Integrity Level (SIL) is a measure of the risk reduction provided
by a SIF, based on four levels. It reflects the probability of a SIF
performing the required safety functions under all stated conditions
within a stated period of time. Each level represents an order of magnitude
of risk reduction. Safety integrity consists of two elements: 1)
hardware safety integrity and 2) systematic safety integrity.
The two standards (IEC 61508 and IEC 61511) define Safety
Integrity as probability of success and then define the Safety
Integrity Level (SIL) as four discrete levels (1 to 4) such that level
4 has the highest safety integrity.
Every SIF has a SIL assigned to it; the SIS and its equipment do not
have a SIL assigned to them.
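
To make the "order of magnitude per level" idea concrete, the sketch below maps an average probability of failure on demand (PFDavg) to a SIL using the low-demand-mode bands commonly tabulated from IEC 61508 and IEC 61511. The function names are hypothetical, and the bands should be checked against the edition of the standard in use before being relied on.

```python
# Low-demand mode bands: SIL -> (lower bound inclusive, upper bound exclusive)
# on PFDavg. Each SIL spans one order of magnitude.
SIL_BANDS = {
    1: (1e-2, 1e-1),
    2: (1e-3, 1e-2),
    3: (1e-4, 1e-3),
    4: (1e-5, 1e-4),
}


def sil_from_pfd(pfd_avg):
    """Return the SIL whose band contains pfd_avg, or None if none does."""
    for sil, (low, high) in SIL_BANDS.items():
        if low <= pfd_avg < high:
            return sil
    return None


def risk_reduction_factor(pfd_avg):
    """Risk reduction factor is the reciprocal of PFDavg."""
    return 1.0 / pfd_avg


# Hypothetical SIF with a calculated PFDavg of 0.005.
print(sil_from_pfd(0.005))
print(risk_reduction_factor(0.005))
```

Note that the SIL is evaluated for a SIF as a whole, consistent with the statement above that the SIL belongs to the function rather than to the SIS or its individual equipment.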

Importance of SIS
Video of what happens with the bypass of an SIS

From this short video what did you learn?

Interplay of people with safety systems
Importance of design
Importance of learning from others (oxidizer issue)
In this lesson we will learn in detail about SIS and SIF. We will
cover:
Context for SIS
Safety Instrumented Systems (SIS)
SIS and Risk
Safety Instrumented Function (SIF)
Safety Integrity Level (SIL)
Design Considerations - SIF and SIS
Integrity Specification of a SIF
Selection of Appropriate Components and Subsystems for
SIF/SIS design

Context for SIS

SIS practice was proposed in 1996 and finalized in 2004. In 1998
the IEC (International Electrotechnical Commission)
published a document, IEC 61508, entitled "Functional safety of
electrical/electronic/programmable electronic safety-related
systems". It is a generic functional safety standard, providing the
framework and core requirements for sector-specific standards.
The key reference standards for managing SIS throughout the
lifecycle, from risk assessment through design, operations and
maintenance, are the international standards IEC 61508 and IEC 61511.
IEC 61508 is a generic functional safety standard that can be
applied across all industries. IEC 61511 is a functional safety
standard that applies specifically to the process industry sector.
In the United States ANSI/ISA 84.00.01-2004 was issued in
September 2004. It primarily mirrors IEC 61511 in content but also
provides support for old established systems and processes that
have a history of SIS. It says:
For existing safety instrumented systems (SIS) designed and
constructed in accordance with codes, standards, or practices prior
to the issuance of this standard (e.g. ANSI/ISA 84.01-1996), the
owner/operator shall determine and document that the equipment
is designed, maintained, inspected, tested, and operated in a safe
manner. (Grandfather clause)
This standard gives requirements for the specification, design,
installation, operation and maintenance of a safety instrumented
system, so that it can be confidently entrusted to place and/or
maintain the process in a safe state. This standard has been
developed as a process sector implementation of IEC 61508.
SIS is regarded as RAGAGEP: "Recognized And Generally Accepted
Good Engineering Practice". RAGAGEPs are engineering,
operation, or maintenance activities based on established codes,
standards, published technical reports, recommended practices
(RP) or similar documents. RAGAGEPs elaborate generally
approved ways to perform specific engineering, inspection or
mechanical integrity activities, such as fabricating a vessel,
inspecting a storage tank, or servicing a relief valve.

(IEC 61511-Mod) "Application of Safety Instrumented Systems (SIS)
for Process Industries" addresses the application of SIS to take a
process to a safe state when predetermined conditions are violated,
such as set points for pressure, temperature, level, etc. The title of
the standard is "Functional safety - Safety instrumented systems for
the process industry sector". Its objective is to define requirements
for SISs.
Scope: initial concept, design, implementation, operation, and
maintenance through to decommissioning. In itself, it is a life cycle
system and defines: SIS, SIL, SIF and SRS (Safety Requirement
Specification).
An SRS documents the requirements detailed in the Safety
Standard IEC 61511. It outlines all the relevant safety requirements
for a product. It lays out the foundation to which a product should
be designed.
Slide 6

Safety Instrumented Systems (SIS)

SIS is one of the many layers to reduce risk in process and energy
industries and protect workers, equipment, environment and
communities around the facility. As technology advances, SIS is
becoming more effective. SIS is used as a protection layer in
the engineering of processes (the worse the potential hazard, the
more layers are required for prevention and/or protection):
o Plant control system and alarms
o Emergency Shutdown system
o Pressure Relief Devices
Deluge systems
Automatic shutdowns

Slide 7

In the process industry, SIS is one of the many layers to
reduce risk and protect workers, equipment, environment and
communities around the facility. If the basic process controls fail,
SIS helps maintain process conditions within a safe operating
There are many protection layers, such as the BPCS (Basic Process
Control System), which is the basic control system. Alarms
should first alert the operator to an escalating temperature or
pressure, but if the operator is unable to address the problem, the
SIS takes over, automatically shutting things down before an
out-of-control process becomes an unsafe one.

Mechanical barriers, isolation of personnel and other such protection
layers or barriers exist to mitigate hazards. The more barriers
there are, the greater the safety factor and the lower the risk.
Slide 8

SIF and SIL*

A SIF is designed and used to reduce risk to "ALARP". ALARP means "as low as
reasonably practicable". Reasonably practicable involves weighing a
risk against the trouble, time and money needed to control it. Thus,
ALARP describes the level to which we expect to see workplace risks
To determine SIL, initially all existing risks are identified. Then it is
determined if reduction is essential for each identified risk. The
identification of risk tolerance is subjective and site-specific.
Following this the required risk reduction must be quantified using
risk analysis methods that deliver results in the form of a SIL
A SIL 1 requirement corresponds to a mildly hazardous scenario and
SIL 4 to an extremely dangerous one. A low
SIL requirement (SIL 1) means that only a comparably low risk
reduction is necessary, whereas a higher SIL (for example SIL 3)
requires a greater degree of risk reduction.
Each SIF is assigned a SIL during the analysis.
SIL 0/none: lowest risk
SIL 1: 95% of the SIFs
SIL 2: 5% of SIFs
SIL 3: < 1% (not likely in refineries, but possible in offshore platforms or nuclear industry)
SIL 4: highest risk (only seen in nuclear industry)

*Courtesy of:

Slide 9

Safety Integrity Level (SIL)*

With each increase in SIL rating, the SIF must be that much more
reliable and available at all times (and it costs more for upkeep).
Reliability and availability are achieved by:
Design using proper safety components
Installation per manufacturer's guidelines
Testing both at initial startup as well as at specified
intervals or after any modification (i.e., via PSSR)

Slide 10

Safety Requirement Specification (SRS)

The safety requirements specification (SRS) is an important SIS
document shaped during the conceptual phase. The design and
verification information is compiled into the SRS document.
Both IEC 61508 and 61511 functional safety standards provide
guidelines on the minimum information that an SRS must contain.
Information that needs to be included:
o Intent of each SIF (the hazard that is mitigated)

o Components of each SIF (sensor, logic solver, final

o Calculations to verify the target (required) SIL can be

The SRS is the document against which all of the safety lifecycle
activities are verified and validated. As such, it is important that this
documentation be simple to use and sustain.
Slide 11

Using SIS in Design

When a process cannot practically be designed to be inherently safe,
an SIS can be used to reduce risks to an acceptable level. An SIS
can be designed to deliver a specified safety integrity level (SIL) of
risk reduction.
It is in the Design phase that the SIF/SIS is developed to achieve
the risk reduction that is determined in the PHA or SIL Analysis
(target SIL).
Risk Reduction = Inherent Risk - Acceptable Risk
SIF design and SIL verification are crucial to the proper functioning
of the SIS. Detailed engineering expertise and experience are essential
for this task.
Design options can include:
Redundancy (initiators, control system, and/or final


Type/style of components (transmitter vs. switch or
modulating valve vs. on/off chop valve)
NOTE: If a SIS already exists, then analysis of the existing system
is done to determine if the target SIL can be achieved with the
current design. (Grandfather Clause)

Slide 12

Design Issues
An SIS comprises three elements: a Sensor, a Logic Solver and a Final
Control Element.
Sensors collect the data required to determine if an emergency situation
exists and if the equipment or process is in a safe state. Sensor
types range from simple pneumatic or electrical switches to smart
transmitters with on-board diagnostics.
Logic Solvers decide the action to be taken based on the
information gathered. Highly reliable logic solvers can provide
fail-safe and fault-tolerant operation.
The Final Control Element implements the action determined by the logic
system. This final control element is typically a pneumatically
actuated on-off valve operated by solenoid valves.
It is absolutely essential that all three components work as
designed to apply the control action required in case of an
emergency. However, by understanding how they can fail, it is
possible to calculate a Probability of Failure on Demand (PFD).
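As a rough illustration of how the three elements combine, the sketch below treats the sensor, logic solver and final control element as a series chain: the SIF fails on demand if any one of them fails. This series model and the PFD values are assumptions made for illustration, not figures from this lesson. For small PFDs the exact result is close to the simple sum of the three subsystem PFDs.

```python
def sif_pfd(sensor_pfd, logic_solver_pfd, final_element_pfd):
    """Return (exact, approximate) PFD of a SIF made of three subsystems in series.

    exact  : 1 - probability that all three subsystems work on demand
    approx : simple sum of the subsystem PFDs (reasonable when each PFD is small)
    """
    all_work = (1.0 - sensor_pfd) * (1.0 - logic_solver_pfd) * (1.0 - final_element_pfd)
    exact = 1.0 - all_work
    approx = sensor_pfd + logic_solver_pfd + final_element_pfd
    return exact, approx


# Hypothetical subsystem PFDs, chosen only to show the arithmetic
exact, approx = sif_pfd(sensor_pfd=0.005, logic_solver_pfd=0.0005, final_element_pfd=0.02)
print(f"exact PFD  = {exact:.6f}")
print(f"approx PFD = {approx:.6f}")
```

In this made-up case the final control element dominates the total, which is consistent with the later remark that mechanical devices can be the weakest link in the SIF.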


Failures around controls

When designing or modifying a SIS, keep in mind there are
different types of issues and failures:
o Safe Failure - FAIL SAFE (Desired)!!
o Dangerous Failure - bad outcome
o Spurious (false - undesired but still safe)
o Inhibited (bypassed - could be safe or bad)
o Missed signal (doesn't trip when it should/needed)
o No signal (need a signal where there is none)
A safe failure is one in which the SIS fails to a safe state (fail-safe).
A dangerous failure has the potential to put the SIS in a hazardous or
fail-to-function state.
A spurious trip results in an emergency process shutdown; it is not
dangerous but it is expensive! More dangerous is the other type of
failure, when mechanisms that need to work do not and the
operation carries on in an unsafe manner. Inhibited failures
occur when the SIS is bypassed; this is not necessarily dangerous but could
become dangerous. A missed signal is when the system should trip
but doesn't, and no signal is when a danger signal is not received.
Both these issues can be hazardous.

Slide 13

SIL Verification
Once the safety system is designed but before any safety functions
are implemented, the performance requirements of each safety
function must be verified against the documented requirements in
the Safety Requirement Specification.

SIL verification involves multiple equations to determine the
achieved SIL.
Some of the components to verify this include:
o MTTFS - Mean Time to Fail Spurious
o PFD - Probability of Failure on Demand
o RRF - Risk Reduction Factor (inverse of PFD, or 1/PFD)
NOTE: SIL 1 achieves an RRF of 10 to 100
MTTFS indicates how often the safety function is likely to trigger
unnecessarily, causing anything from a minor nuisance to a severe
operational or financial loss.
The PFD of each SIF is calculated to determine whether it achieves the
specified SIL. This may be referred to as a probability of failure on
demand (PFD) evaluation of the SIF.
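The relationship between PFD, RRF and SIL can be sketched as follows. The RRF is simply 1/PFD, and each SIL covers one order of magnitude, in line with the note that SIL 1 achieves an RRF of 10 to 100. The band limits used here are the low-demand-mode bands generally quoted from IEC 61508/61511; the function names are hypothetical.

```python
def risk_reduction_factor(pfd_avg):
    """RRF is the inverse of the average probability of failure on demand."""
    if not 0.0 < pfd_avg < 1.0:
        raise ValueError("PFD must be between 0 and 1 (exclusive)")
    return 1.0 / pfd_avg


def achieved_sil(pfd_avg):
    """Return the SIL band for a low-demand SIF (0 means SIL 1 is not achieved)."""
    if pfd_avg >= 1e-1:
        return 0  # RRF below 10: no SIL achieved
    if pfd_avg >= 1e-2:
        return 1  # RRF 10 to 100
    if pfd_avg >= 1e-3:
        return 2  # RRF 100 to 1,000
    if pfd_avg >= 1e-4:
        return 3  # RRF 1,000 to 10,000
    return 4      # SIL 4 is the highest level defined


# Hypothetical PFD values for illustration
for pfd in (0.5, 0.0255, 0.005, 0.0002):
    print(f"PFD = {pfd:<7} RRF = {risk_reduction_factor(pfd):8.1f}  SIL = {achieved_sil(pfd)}")
```

If the achieved SIL returned for a design is lower than the target SIL in the SRS, the design needs to be improved before the function is implemented.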

Slide 14

SIL Verification
If the required SIL cannot be achieved with the initial design, some
options are:
More frequent proof testing
Add redundancy (i.e., initiating device, control system, final
Install smarter device (i.e., HART smart transmitter or
transmitter vs. switch or relay, smart control valve with
diagnostics and feedback and position indication vs. basic
control valve)

Add protection layers (independent), including the following:
BPCS (control system), alarms and operator response,
physical devices (PSVs, dikes, flares, deluges, etc.) and
other human mitigation (emergency response)
Adding redundancy implies using multiple instruments for the same
task. It will provide uninterrupted system operation even though
one or more specific instruments may fail.
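The effect of the first two options (more frequent proof testing and added redundancy) can be sketched with the widely used simplified equations for average PFD. These equations are an assumption brought in for illustration: they ignore common cause failures, diagnostics and repair time, and the failure rate below is hypothetical.

```python
HOURS_PER_YEAR = 8760


def pfd_avg_1oo1(lambda_du, test_interval_h):
    """Single device: PFDavg ~ lambda_DU * TI / 2 (simplified equation)."""
    return lambda_du * test_interval_h / 2.0


def pfd_avg_1oo2(lambda_du, test_interval_h):
    """Two redundant devices, either one can trip: PFDavg ~ (lambda_DU * TI)^2 / 3."""
    return (lambda_du * test_interval_h) ** 2 / 3.0


# Hypothetical dangerous undetected failure rate, per hour
lambda_du = 2e-6

yearly = pfd_avg_1oo1(lambda_du, HOURS_PER_YEAR)
half_yearly = pfd_avg_1oo1(lambda_du, HOURS_PER_YEAR / 2)
redundant = pfd_avg_1oo2(lambda_du, HOURS_PER_YEAR)

print(f"1oo1, yearly test     : {yearly:.5f}")
print(f"1oo1, six-monthly test: {half_yearly:.5f}")
print(f"1oo2, yearly test     : {redundant:.7f}")
```

Halving the test interval halves the PFD of a single device, while adding a second device gives a much larger improvement on paper. In practice common cause failures limit that gain, which is why the design guidance stresses dedicated wiring and independent taps.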
The protection layers must be independent of each other to be

Slide 15

General Concepts for Design

The basic issues to keep in mind when designing SIS are simple.
Sometimes even ordinary common failures can be overlooked
inadvertently. Even substandard components can cause failure of
costly safety shutdown systems.
Other issues while designing:
Transmitter is better than a switch or relay
When using switch, solenoid, or relay (anything on/off or
discrete), verify it is normally energized during operation
(fail safe)
Use dedicated wiring to each device (as much as possible)
Minimize common cause failures (i.e., common wires,
instrument taps including bridles, or same controller or I/O



Slide 16

General Concepts for Design

While designing, ensure that all the mechanical devices are working
well. These can prove to be the weakest link in the SIF. They can
stick if not moved periodically (i.e., PSVs, valves, switches). To
remedy this issue, install double blocks or modulating valves that
can be partially stroked. Also check the metallurgy and
upgrade it if the location has special needs (salt water corrosion,
corrosive atmosphere).

Slide 17

Functional Proof Tests

Unexpected hardware failures can happen to SIS components
during normal operation. These could be spurious or dangerous failures.
The probability of a dangerous undetected failure increases with
time. So the probability that the SIS will not operate as required in
the event of a demand due to a random hardware failure (often
called probability of failure on demand - PFD) also increases over
time.
Proof testing is performed to reveal undetected faults in a SIS, so
that, if necessary, the SIS can be restored to its designed
functionality [BS EN 61511-1:2004 3.2.58].


That is why:
Tests must be performed at the frequency stated in the SRS
to continue the reliability of the SIF.
It should include the following information:
o Test procedure
o Test all bypasses, all individual initiators, and final
o Results of all steps of the procedure
o Verification that process has been restored to normal
Date of test and all personnel performing the test
Control logic version # (if available)
Results of entire test and any abnormalities found


Chapter 11

Failure Mode and Effect Analysis

A Quantitative View of System Failure
Identifies Specific Areas requiring Mitigation
Remember the Tops Down Probability / Consequences analysis
from last week. This is a different kind of analysis: a Bottoms Up
analysis. While Tops Down relies on the judgment of the members
of the study team, the Bottoms Up analysis relies on a mathematical
analysis of all the elements of a system to determine the expected
failure rate of a given system. It is typical to mitigate critical
systems to a calculated failure rate of once in 10,000 years.

To familiarize you with another common hazard analysis technique
that you are likely to experience in Industry
Different systems or problems require different types of analysis.
Critical control and shutdown systems lend themselves well to this
Bottoms Up analysis since the reliability of the components is well
known. Soft issues, which cannot be easily assigned a mathematical
probability, will rely heavily on the Tops Down judgment of the
experienced members of the study team. Typically, large, new
plants will rely on a mix of techniques appropriate for the specific
situations. Management must decide what is appropriate and
typically does this through standards (guides) with input and
challenge from line managers and staff.
Bottom-up approach

The bottom-up approach is used when a system concept has been
decided. Each component on the lowest level of indenture is studied
one-by-one. The bottom-up approach is also called hardware
approach. The analysis is complete since all components are
Top-down approach
The top-down approach is mainly used in an early design phase
before the whole system structure is decided. The analysis is usually
function oriented. The analysis starts with the main system
functions - and how these may fail.
Functional failures with significant effects are usually prioritized in
the analysis. The analysis will not necessarily be complete. The
top-down approach may also be used on an existing system to focus on
problem areas.

At the end of today, you will be able to:

Understand the methodology being applied by skilled, specifically
trained professionals in this technique.
These are typically more complex analyses and you will likely have
a skilled facilitator and one or more experts in the various
technologies being reviewed. This analysis is typically used on small
parts or sections of a unit.

What is FMEA?
Failure modes and effects analysis (FMEA) is a methodical analysis
of a design, a manufacturing or assembly process, or a product or
service to whatever level of detail is required to demonstrate that
no single failure will cause an undesired event.
It is a tool that examines potential product or process failures,
evaluates risk priorities, and helps determine remedial actions to
avoid identified problems.
There are three kinds of FMEA:
Design FMEA is carried out to eliminate failures during equipment
design, taking into account all types of failures during the whole
life-span of the equipment
Process FMEA is focused on problems stemming from how the
equipment is manufactured, maintained or operated
System FMEA looks for potential problems and bottlenecks in larger
processes, such as entire production lines
How to begin FMEA?
The process is started by brainstorming, making a cause and effect
matrix (identify, explore and list all the causes related to a problem
and search for the root cause), looking up the process map and history,
utilizing the expertise and experience of the people concerned, and
applying FMEA. What you will get is a list of actions to prevent causes or
detect failure modes and the history of actions taken.
Failure Mode
Failure modes are the ways, or modes, in which something
might fail. Failures are any errors or defects, especially ones that
affect the customer, and can be potential or actual.
Effects analysis

Effects analysis refers to studying the consequences of those

Failures are prioritized according to consequences, frequency of
occurrence, and ease of detection. The purpose of the FMEA is to
take appropriate actions to eliminate or reduce failures, starting
with the highest-priority ones.
FMEA also can be considered as a continuous quality improvement
tool. A detailed FMEA is expected to document the latest
understanding about risks of failures and the corrective actions.
The best time to catch failures is before they happen. That can be
done ideally at the conceptual stage of design and continue
throughout the life of the product or service or process. The failure
risk can then be minimized through design changes. If that is not
feasible, then operational procedures can be proposed. So FMEA
should be used:
During the design or redesign of a process, product or
For a new or modified process
When improving existing process, product or service, or
examining failures
Throughout the life of the process, product or service at
suitable intervals.

Problem for Today:

The instructor on the Tank Problem asserted that the Nitrogen
pressure makeup system was highly reliable.
Let's check that!

This is a Bottoms Up analysis where we look at the reliability
of each part to estimate the reliability of the system.

Details of Pressure Control System

Same tank with a bit more detail on just the N2 system. This
diagram is in Consultant Font!

What Elements need to be Checked?

Pressure Sensor
Pressure Transmitter
Control System
Control Valve Signal Converter
Control Valve
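A Bottoms Up estimate for this chain can be sketched as below. The five elements are treated as a series system: the nitrogen makeup fails if any one of them fails. The sketch assumes constant failure rates, and the rates shown are hypothetical placeholders, not data for the tank problem. Real values would come from plant records or published reliability data.

```python
import math

# Hypothetical failure rates (failures per year) for each element in the loop
failure_rates_per_year = {
    "Pressure Sensor": 0.05,
    "Pressure Transmitter": 0.02,
    "Control System": 0.01,
    "Control Valve Signal Converter": 0.03,
    "Control Valve": 0.10,
}


def series_reliability(rates, years):
    """Probability that every element survives for the given time.

    For a series system with constant failure rates, the element rates add,
    and reliability is exp(-total_rate * time).
    """
    total_rate = sum(rates.values())
    return math.exp(-total_rate * years)


def mean_time_between_failures(rates):
    """Expected time between system failures, in years."""
    return 1.0 / sum(rates.values())


print(f"Reliability over 1 year: {series_reliability(failure_rates_per_year, 1):.3f}")
print(f"System MTBF: {mean_time_between_failures(failure_rates_per_year):.2f} years")
```

With these made-up rates the loop would be expected to fail about once every five years, far short of the once in 10,000 years typically targeted for critical systems, so a claim of "highly reliable" needs checking against real data.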

Link: http://en.wikipedia.org/wiki/Failure_mode_and_effects_analysis

The aspects to consider are:

Failure Mode:
Failure modes are the ways in which something might
potentially or actually fail to meet the design intent. It could be
complete or partial failure, intermittent failure, or over-performing or
under-performing functions.
Failure Cause and/or Mechanism
Underlying causes have to be identified for a failure mode. A failure
cause could be a design weakness. These should be limited to what
can be controlled. There is often more than one cause of failure for
each failure mode
Failure effect is the immediate consequence of an operational or
functional failure. Effects should be listed as the customer would
describe them. These should include (as appropriate) safety/regulatory
body, end user and internal customers (manufacturing,
assembly, service).
Occurrence looks at the cause and the frequency of the failure mode.
For this, analyses of documented failure modes for similar products
or processes can be carried out. All the potential causes for a failure
mode should be identified and documented.
Severity is the consequence of a failure mode. Severity considers the
worst potential consequence of a failure, determined by the degree
of injury, property damage, system damage and/or time lost to
repair the failure. The severity of the failure effects should be
determined as a rating value.
Severity values may be available from governing bodies. If severity
is based upon internally defined criteria or is based upon standard
with specification modifications, rating tables should be included
with the analysis.
The Severity Rate (S) is the "best guess" of how serious it would be
to the customers, the product, or the service if the failure really
occurred. A rating of 1 would mean the effect of the failure is
considered minor; a rating of 10 would indicate that the effect of
the failure would be very severe.
The Detection Rate (D) is an estimate of how difficult it is to detect
the failure before the customer sees it. A rating of 1 would indicate
that it is obvious right away to anyone that the failure is occurring;
a rating of 10 would indicate that the failure would go undetected
until the effect is felt by the customer.
Ideally, detection values should correspond to any existing standards
Risk priority number (RPN)
RPNs play an important part in the choice of an action against failure
modes. They are threshold values in the evaluation of these actions.
After ranking the severity, occurrence and detectability, the RPN can
be easily calculated by multiplying these three numbers: RPN = S × O × D
(Note: Lowest detection rating is used to determine RPN.)

RPN threshold should not be used as the prime prompt for definition
of recommended actions as the practice of prioritizing work on the
basis of RPN has no theoretical basis.
The FMEA has to be done for the entire process and/or design. Once
this is done it is easy to determine the areas of greatest concern.
The failure modes that have the highest RPN should be given the
highest priority for corrective action. This means it is not always the
failure modes with the highest severity numbers that should be
treated first. There could be less severe failures, but which occur
more often and are less detectable.
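The ranking described above can be sketched in a few lines. The failure modes and ratings below are hypothetical examples for the nitrogen system, invented only to show the arithmetic. Each RPN is the product of severity, occurrence and detection, and the list is sorted with the highest RPN first.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # S, 1 (minor) to 10 (very severe)
    occurrence: int  # O, 1 (rare) to 10 (almost inevitable)
    detection: int   # D, 1 (obvious) to 10 (goes undetected)

    def rpn(self):
        """Risk priority number: RPN = S x O x D."""
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be between 1 and 10")
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes and ratings, for illustration only
modes = [
    FailureMode("Control valve sticks closed", severity=9, occurrence=2, detection=4),
    FailureMode("Pressure transmitter drifts", severity=6, occurrence=5, detection=7),
    FailureMode("Signal converter fails", severity=7, occurrence=3, detection=3),
]

ranked = sorted(modes, key=lambda m: m.rpn(), reverse=True)
for mode in ranked:
    print(f"{mode.rpn():4d}  {mode.name}")
```

Here the drifting transmitter ranks first even though the stuck valve has the higher severity, because the drift occurs more often and is harder to detect.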
Recommended Actions
The RPN is used to identify items that require attention and assign a
priority to them. It is necessary for all critical or significant failures
to have recommended actions associated with them.
Recommended actions should be focused on design, and directed
toward mitigating the cause of failure, or eliminating the failure
Once recommended actions are determined, the next step is to
include targets, assign responsibility for completion of the action to
a specific person, and note dates of implementation. These actions
could be specific inspection, testing or quality procedures, redesign
(such as selection of new components), adding more redundancy
and limiting environmental stresses or operating range.
Once the actions have been applied to the design/process, the new
RPN should be checked to confirm the improvements. Whenever a
design or a process changes, an FMEA should be updated.
Recommended Actions (examples)

Attempt to remove the failure mode (some failures are more
preventable than others)
If not possible then limit the severity of the failure
If that is difficult then reduce the occurrence of the failure
Improve the detection
Action Results
The action results must document what actions were taken, and the
results of those actions. Actions must be completed by the target
completion date.
Update S, O, and D to reflect actions taken
Unless the failure mode has been eliminated, severity should
not change
Occurrence may or may not be lowered based upon the
results of actions
Detection may or may not be lowered based upon the
results of actions
If severity, occurrence or detection ratings are not
improved, additional recommended actions must to be

Tank Nitrogen Vacuum Protection System

This is the basic starting point for an analysis.


Occurrence - Rating Meaning

Rating   Meaning
1        No effect
2/3      Low (relatively few failures)
4/5/6    Moderate (occasional failures)
7/8      High (repeated failures)
9/10     Very high (failure is almost inevitable)


Severity - Rating Meaning

Rating   Meaning
1        No effect
2        Very minor (only noticed by discriminating
3        Minor (affects very little of the system, noticed by average customer)
4/5/6    Moderate (most customers are annoyed)
7/8      High (causes a loss of primary function; customers are
9/10     Very high and hazardous (product becomes inoperative; customers angered; the failure may result in unsafe operation and possible
Determine all failure modes based on the functional requirements
and their effects. Examples of failure modes are: electrical
short-circuiting, corrosion or deformation. A failure mode in one
component can lead to a failure mode in another component;
therefore each failure mode should be listed in technical terms and
for function.
Hereafter the ultimate effect of each failure mode needs to be
considered. A failure effect is defined as the result of a failure mode
on the function of the system as perceived by the user. In this way
it is convenient to write these effects down in terms of what the
user might see or experience. Examples of failure effects are:
degraded performance, noise or even injury to a user.
Each effect is given a severity number (S) from 1 (no danger) to 10
(critical). These numbers help an engineer to prioritize the failure
modes and their effects. If the severity of an effect has a number 9
or 10, actions are considered to change the design by eliminating
the failure mode, if possible, or protecting the user from the effect.
A severity rating of 9 or 10 is generally reserved for those effects
that would cause injury to a user or otherwise result in litigation.

Detection - Rating Meaning




Almost certain



4/ 5/ 6

Moderate - most customers are annoyed


7/ 8


9/ 10

Very remote to absolute uncertainty

When appropriate actions are determined, it is necessary to test
their efficiency. In addition, design verification is needed. The
proper inspection methods need to be chosen. First, an engineer
should look at the current controls of the system that prevent
failure modes from occurring or which detect the failure before it
reaches the customer. Hereafter one should identify testing,
analysis, monitoring and other techniques that can be or have been
used on similar systems to detect failures. From these controls an
engineer can learn how likely it is for a failure to be identified or
Each combination from the previous 2 steps receives a detection
number (D). This ranks the ability of planned tests and inspections
to remove defects or detect failure modes in time. The assigned
detection number measures the risk that the failure will escape
detection. A high detection number indicates that the chances are
high that the failure will escape detection, or in other words, that
the chances of detection are low.

You should understand the basics of Failure Mode and Effect Analysis.
You should be able to apply FMEA to a suitable problem.
You should be able to understand when this method is not
applicable and recommend an appropriate alternative.


Fill out the matrix for the Nitrogen Tank Vacuum Mitigation.
Here is a spreadsheet that you can fill out!



Fault Tree Analysis

Fault Tree Analysis
Introduce and Review FTA
When should FTA be used


An introduction to a more complex method of failure analysis
typically used in safety engineering.
The objective of any root cause analysis is to get to the
ROOT cause, not just the superficial cause of failure.

At the end of today, you will be able to:

Participate in a Fault Tree Analysis facilitated by a trained
Understand the basic methodology and be able to contribute
to a system analysis.
Fault Tree Analysis (FTA) is a deductive, top-down method aimed at
analyzing the effects of initiating faults and events on a complex
system. The causal events are at the bottom of the fault tree, and
are linked via logic symbols (known as gates) to one or more TOP
events. These TOP events represent identified hazards or system
failure modes for which predicted reliability or availability data is
required. Typical TOP events might be:
Loss of feed
Unit shutdown (partial or total)
Off spec product with no change in operating conditions
Toxic emission
Basic events at the bottom of the fault tree generally represent
component and human faults for which statistical failure and repair
data is available. Typical basic events are:
Pump failure
Temperature controller failure
Loss of control over pressure
Operator does not respond

How does FTA work?

An undesired event/hazard is defined. This is the TOP event. For
such an event to occur what could be the various causes? These
causal factors are again resolved till finally basic causes are
identified. A logical diagram called a fault tree is constructed
showing the rational event relationships
The fault tree explicitly shows all the different relationships that are
necessary to result in the top event. During construction of the fault
tree, one can thoroughly understand the basic causes leading to the
top event and the logic behind it.
As we first define the undesirable event and then ascertain the
causes that might lead to it, FTA is a backward looking analysis,
looking backward at the causes of a given event. This backward
tracing process continues until the basic causes are identified.

What are gates?

Gate symbols describe the relationship between input and output
events. The symbols are derived from Boolean logic symbols.



OR gate - the output occurs if any input occurs
AND gate - the output occurs only if all inputs occur (inputs are independent)
Exclusive OR gate - the output occurs if exactly one input occurs
Priority AND gate - the output occurs if the inputs occur in a specific sequence specified by a conditioning event
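The gate definitions above map directly onto Boolean functions. The sketch below is one possible way to express them. The function names are hypothetical, and the priority AND gate is modeled by comparing the times at which the input events occurred.

```python
def or_gate(*inputs):
    """Output occurs if any input occurs."""
    return any(inputs)


def and_gate(*inputs):
    """Output occurs only if all inputs occur."""
    return all(inputs)


def exclusive_or_gate(*inputs):
    """Output occurs if exactly one input occurs."""
    return sum(bool(i) for i in inputs) == 1


def priority_and_gate(occurrence_times, required_order):
    """Output occurs if all inputs occur, and in the required sequence.

    occurrence_times: dict mapping event name to the time it occurred
                      (an event that did not occur is missing or None)
    required_order:   list of event names in the sequence that must be followed
    """
    times = [occurrence_times.get(event) for event in required_order]
    if any(t is None for t in times):
        return False
    return all(earlier < later for earlier, later in zip(times, times[1:]))


print(or_gate(False, True))                                       # True
print(and_gate(True, False))                                      # False
print(exclusive_or_gate(True, True))                              # False
print(priority_and_gate({"A": 1.0, "B": 2.5}, ["A", "B"]))        # True
```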

So we have seen that FTA is carried out to exhaustively identify the
causes of a failure, to identify weaknesses in a system, to assess a
proposed design for its reliability or safety, to identify effects of
human errors, and many other issues.

The 5 Steps of FTA Analysis:

1. Define the undesired event to study
2. Obtain an understanding of the system
3. Construct the Fault Tree
4. Evaluate the Fault Tree
5. Control the hazards identified
Define the undesired event to study
The undesired event can be very hard to define,
although some events are very easy and obvious to observe.
An engineer with a wide knowledge of the design of the system or a
system analyst with an engineering background is the best person
who can help define and number the undesired events. Undesired
events are used then to make the FTA, one event for one FTA; no
two events will be used to make one FTA.

Obtain an understanding of the system

Once the undesired event is selected, all causes with probabilities of
affecting the undesired event of zero or more are studied and
analyzed. Getting exact numbers for the probabilities leading to the
event is usually impossible for the reason that it may be very costly
and time consuming to do so. Computer software is used to study
probabilities; this may lead to less costly system analysis.
System analysts can help with understanding the overall system.
System designers have full knowledge of the system and this
knowledge is very important for not missing any cause affecting the
undesired event. For the selected event all causes are then
numbered and sequenced in the order of occurrence and then are
used for the next step which is drawing or constructing the fault
Construct the fault tree
After selecting the undesired event and having analyzed the system
so that we know all the causing effects (and if possible their
probabilities) we can now construct the fault tree. Fault tree is
based on AND and OR gates which define the major characteristics
of the fault tree.
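When probabilities are available for the basic events, a tree built from AND and OR gates can be evaluated numerically. The sketch below assumes the basic events are independent. The small tree and its probabilities are hypothetical, using basic events of the kind listed earlier in this lesson.

```python
def and_probability(*probabilities):
    """AND gate: all independent inputs must occur, so probabilities multiply."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def or_probability(*probabilities):
    """OR gate: output occurs unless none of the independent inputs occur."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur


# Hypothetical basic event probabilities (per demand), for illustration only
p_pump_failure = 0.01
p_controller_failure = 0.05
p_operator_no_response = 0.10

# Hypothetical tree: the TOP event occurs if the pump fails, OR if the
# temperature controller fails AND the operator does not respond.
p_unmitigated_upset = and_probability(p_controller_failure, p_operator_no_response)
p_top = or_probability(p_pump_failure, p_unmitigated_upset)

print(f"P(controller fails AND operator does not respond) = {p_unmitigated_upset:.4f}")
print(f"P(TOP event) = {p_top:.5f}")
```

The AND gate makes the combined branch much less likely than either of its inputs, while the OR gate lets the single pump failure dominate the top event. This is how a fault tree shows where a system is exposed to single faults.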
Evaluate the fault tree
After the fault tree has been assembled for a specific undesired
event, it is evaluated and analyzed for any possible improvement; in
other words, we study the risk management and find ways to improve
the system. This step is an introduction to the final step, which
will be to control the hazards identified. In short, in this step we
identify all possible hazards affecting in a direct or indirect way the

Control the hazards identified

This step is very specific and differs largely from one system to
another, but the main point will always be that after identifying the
hazards all possible methods are pursued to decrease the
probability of occurrence.


Logic gates

These are the various types of logic gates that can be used.


Simple Fault tree

Simple Fault Tree: When you do your homework assignment, think
of this type of simple analysis. Notice that it uses only AND and OR
gates. For the purposes of illustration, that should be adequate for
the homework.


A more elaborate Fault tree

This is a more complicated example. In branches 2 and 3, the
events with a bar on top are negated (NOT). So the middle branch is
A AND (NOT B) AND C, and the last one is (NOT A) AND B AND C. Three
simple AND gates feed one OR gate that leads to the Top Event.
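
The gate logic described above can be checked with a small truth table. This is only a sketch: the text spells out branches 2 and 3 but not branch 1, so `branch_1` below is an assumption (A AND B AND C) made purely so the example runs, and all function names are hypothetical.

```python
from itertools import product


def branch_1(a, b, c):
    # ASSUMPTION: the first branch is not described in the text;
    # A AND B AND C is used here only as a placeholder.
    return bool(a and b and c)


def branch_2(a, b, c):
    # Middle branch from the text: A AND (NOT B) AND C
    return bool(a and (not b) and c)


def branch_3(a, b, c):
    # Last branch from the text: (NOT A) AND B AND C
    return bool((not a) and b and c)


def top_event(a, b, c):
    # One OR gate combines the three AND-gate branches.
    return branch_1(a, b, c) or branch_2(a, b, c) or branch_3(a, b, c)


# Print the full truth table for the three basic events.
for a, b, c in product([False, True], repeat=3):
    print(f"A={a!s:5} B={b!s:5} C={c!s:5} -> Top Event: {top_event(a, b, c)}")
```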

A complex Fault tree

This is a much more complex example, included so you can see that
fault trees can become quite complex and time consuming to construct.

FTA is the analysis of a single fault in a system or a small portion or
By its nature, it is very detailed and time consuming and the results
apply only to the one single fault being examined.


FTA and FMEA Comparison:

FTA is a deductive, top-down method aimed at analyzing the effects
of initiating faults and events on a complex system. This contrasts
with failure mode and effects analysis (FMEA), which is an inductive,
bottom-up analysis method aimed at analyzing the effects of single
component or function failures on equipment or sub-systems. FTA is
very good at showing how resistant a system is to single or multiple
initiating faults. It is not good at finding all possible initiating faults.
FMEA is good at exhaustively cataloging initiating faults, and
identifying their local effects. It is not good at examining multiple
failures or their effects at a system level. FTA considers external
events, FMEA does not.
(Wikipedia: http://en.wikipedia.org/wiki/Fault_tree_analysis)

When starting a car, you turn the key, but the car does not
start (Top Event). Generate a Fault Tree Analysis.
First row below Top Event (fails to start)
o Engine does not turn over.
o Engine cranks but fails to start.
o Engine tries to start but dies immediately.
Fill in the AND/OR boxes below these.


Analyzing Hazards
Simple Tank Problem
Tough Homework Problem
In this lesson we will learn Risk Analysis for a Simple Tank

The objective of this lesson is to illustrate, using a simple piece
of equipment (a tank), how complicated the safety systems may need
to be, and to heighten awareness that items that appear superficially
simple may require a more in-depth, complex analysis.

At the end of today, you will be able to:

Appreciate the level of attention you must pay to projects you are
The next or second level of protection may require considerable
creativity and effort to implement but still be required.

Industry uses a variety of tanks such as storage tanks, feed tanks,
mixing tanks, etc. Tanks are thus a basic part of the industry.

A simple tank can have a very complex operation and consequently a
variety of safety issues. These could include over-pressurization,
excessively high or low temperatures, overflow, running dry, etc.
Let us study the basic tank.

Basic Tank

A simple tank might have a variable-flow-rate product stream
flowing into it that is controlled by another unit. The product flow
leaving the tank is set and controlled by the Logistics Department.
If the two flows differ too much, the tank runs the chance of
overflowing or running dry.
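
A minimal sketch of this flow imbalance, assuming constant inflow and outflow rates and a simple volume balance. The function name, units and numbers are hypothetical and chosen only for illustration.

```python
def hours_until_limit(level_m3, capacity_m3, inflow_m3_per_h, outflow_m3_per_h):
    """Hours until the tank overflows (net filling) or runs dry (net draining).

    Returns None when the two flows balance and the level stays constant.
    """
    net_flow = inflow_m3_per_h - outflow_m3_per_h
    if net_flow > 0:
        # Tank is filling: remaining free volume divided by net fill rate.
        return (capacity_m3 - level_m3) / net_flow
    if net_flow < 0:
        # Tank is draining: current inventory divided by net drain rate.
        return level_m3 / -net_flow
    return None


# Hypothetical case: a half-full 100 m3 tank.
print(hours_until_limit(50.0, 100.0, 12.0, 10.0))  # production exceeds shipping
print(hours_until_limit(50.0, 100.0, 10.0, 15.0))  # shipping exceeds production
```

Even a small mismatch between the two flows eventually reaches a limit, which is why level alarms and shutdowns are needed.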
Initially, the design called for a High Level Alarm and a Low Level
Alarm. Alarms have certain failure rates that the manufacturer
provides. In addition, an alarm relies on operator intervention.
When the design team calculated the failure rate of this system, it
exceeded the corporate acceptable failure rate for such systems. To
increase the reliability of the system, the engineers added
automatic High-High and Low-Low Shut Down systems. The combined
safety provided by the redundant systems reduced the overall system
failure rate to an acceptable level for LEVEL
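
The effect of adding a redundant shutdown can be sketched as follows. This assumes each value is a probability of failure on demand, that the alarm layer fails if either the alarm hardware or the operator fails, and that the alarm layer and the automatic shutdown fail independently (no common cause). All numbers are hypothetical, not manufacturer data.

```python
def alarm_layer_failure(p_alarm_hardware, p_operator_misses):
    """The alarm layer fails if the alarm OR the operator response fails."""
    return 1.0 - (1.0 - p_alarm_hardware) * (1.0 - p_operator_misses)


def combined_failure(layer_failure_probabilities):
    """All independent protection layers must fail for the hazard to occur."""
    result = 1.0
    for p in layer_failure_probabilities:
        result *= p
    return result


# Hypothetical probabilities of failure on demand.
p_alarm_layer = alarm_layer_failure(p_alarm_hardware=0.01, p_operator_misses=0.1)
p_auto_shutdown = 0.01

p_alarm_only = p_alarm_layer
p_with_shutdown = combined_failure([p_alarm_layer, p_auto_shutdown])

print(f"Alarm only:           {p_alarm_only:.5f}")
print(f"Alarm plus shutdown:  {p_with_shutdown:.5f}")
```

The independence assumption matters: a common cause that disables both layers at once would make the real combined failure probability higher than this product.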

Tank with N2 Blanket

If the tank were to be pumped out at a rate considerably higher
than the product flow into the tank, the tank would crush inward,
since tanks have relatively thin shells and are not rated for vacuum.
To mitigate this, the design team added a nitrogen make-up system to
maintain pressure in that case. Nitrogen systems are very clean and
reliable and the calculated failure rate of that system was
Tank blanketing, also referred to as tank padding, is the process of
applying a gas to the empty space in a storage tank. Tank
blanketing means using a buffer gas to protect products inside the
storage container. Such a measure inhibits evaporation, reduces
runaway emissions, reduces corrosion, contamination, and oxidation,
and considerably reduces fire hazard. It can also provide a safety
blanket when pressure drops.
The most common gas used in blanketing is nitrogen. Nitrogen is
widely used due to its inert properties, as well as its availability and
relatively low cost. The benefits of blanketing include a longer life of
the product in the container, reduced hazards, and longer
equipment life cycles.

Tank with Pressure Hatch:

Pressure hatches are designed to limit the maximum pressure that
can exist in a tank. Direct acting pressure/vacuum relief valves are
special types of relief valves, which are specifically designed for
tank protection.
When a tank is being filled, the gas in the space above the liquid is
compressed. If this pressure were to exceed the design pressure of
the tank, tank rupture could result. A similar consequence can
result if the temperature of the tank increases: this causes
vaporization and expansion, which raise the pressure.
Conversely, a reduction in temperature causes a vacuum.
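
A rough sketch of these pressure effects, assuming the vapor space is sealed (no venting or nitrogen make-up), behaves as an ideal gas, and gains or loses no vapor by evaporation or condensation. Under those assumptions P2 = P1 x (V1/V2) x (T2/T1), with absolute pressures and temperatures. The numbers are hypothetical.

```python
def vapor_space_pressure(p1_kpa_abs, v1_m3, v2_m3, t1_kelvin, t2_kelvin):
    """Ideal-gas pressure of a sealed vapor space after a volume/temperature change.

    Pressures are absolute (kPa) and temperatures are absolute (K).
    """
    return p1_kpa_abs * (v1_m3 / v2_m3) * (t2_kelvin / t1_kelvin)


# Filling: liquid rises and halves the vapor space at constant temperature.
p_fill = vapor_space_pressure(101.325, v1_m3=20.0, v2_m3=10.0,
                              t1_kelvin=300.0, t2_kelvin=300.0)

# Cooling: same vapor volume, temperature drops from 300 K to 270 K.
p_cool = vapor_space_pressure(101.325, v1_m3=10.0, v2_m3=10.0,
                              t1_kelvin=300.0, t2_kelvin=270.0)

print(f"After filling: {p_fill:.1f} kPa abs")
print(f"After cooling: {p_cool:.1f} kPa abs (below atmospheric, i.e. vacuum)")
```

Real tanks are vented or blanketed precisely so that these swings never develop; the sketch only shows the direction and rough size of the effect.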
Fitting a relief valve allows the pressure in the tank to run at a
slightly positive pressure. Such an increase in pressure increases
the boiling point of the liquid, reducing the amount of vapor, which
would otherwise form. When the relief valve does lift it will
predominantly discharge the nitrogen blanket gas rather than
product vapor.

Another similar case is where the tank is not being pumped out
fast enough and the level is rising quickly. While still between the
low and high level alarms, the tank could overpressure and rupture a
seam. This would lead to product spilling out around the tank. The
engineers decided to add an overpressure blow-out hatch to
prevent rupture of the tank.

We should have learned something about tanks: simple, but with
some complexity.
We should recognize that every element on a plant's P&IDs MUST
be examined no matter how simple it appears at first glance.

The tank is continuing to fill since production is exceeding shipping
capacity. The nitrogen blanket system has closed since the pressure
setting has been exceeded. In this case, the manufacturer reliability
for the blow out hatch was not quite good enough to meet company
standards. The engineers COULD have added a secondary hatch but
conditions that cause one hatch to fail (say, freezing weather)
would also affect the second hatch. The Blow Out Hatch FAILS!
What is the next level of protection the design engineer (YOU) has
incorporated to avoid a catastrophic release due to tank rupture
along the vertical seams?

An alternative safety system has to be incorporated to avoid spilling
product through rupture of a lower or vertical seam. What the
design team came up with was to make the welded seam between
the tank walls and the tank roof weaker so that during over
pressure, the highest seam would open first and relieve pressure
without spilling product. This is called a Frangible Roof Seam and
is common practice in the industry.
Due to filling and draining of the tanks, the vapor above the liquid
surface inside the tank may be within its flammability limits. Overpressurization could occur due to the ignition of this vapor and
could exceed the capability of the pressure relief vents specified in
storage tank design.
Emergency venting addresses the effects of an external fire in the
vicinity of the tank. Heat from fire exposure causes an increase in
the internal pressure of the tank that may not be adequately
relieved by normal venting. API 2000 (4.4.3) specifies that
emergency venting may be accomplished either by additional
venting or by a frangible roof-to-shell joint.
A frangible roof-to-shell attachment is designed to be weaker than
either the weakest vertical joint in the shell or the shell-to-bottom
joint.
However, it is essential to know that under certain specific
circumstances a frangible roof cannot be used. They are as follows:
In case of a small tank there usually is not enough weight in
the shell and framing to meet the requirements of
When the design pressure is high enough for the tank to be
anchored, the cone roof would need to have a steeper pitch
to it. The angle is usually too steep for the roof to be
With a self-supporting roof, the roof-to-shell attachment
must be adequate to support the roof. The strength of the
joint prevents it from being frangible.
Emergency venting options other than frangible roofs are specified
in API 2000 (4.4.2) as: open vents (with flame arresters if the flash
point is below 100 °F); PV valves, gauge hatches or manholes with
covers that lift when exposed to abnormal internal pressure;
rupture disk devices; or other forms of construction that can be
proven to be comparable to these devices.

PSM in Laboratories & Pilot Plants

Students should understand that different work environments (labs,
pilot plants) require slightly different PSM approaches but all contain
the same focus: Everyone goes home as healthy as they were
when they arrived.
Review PSM in the Laboratories and Pilot Plants

At the end of the day, you will:

Know key hazards and PSM elements for a laboratory environment.

Today's Roadmap
Hazards in the Lab/Pilot plant
Example & Discussion UCLA lab death

Hazards in the Lab/Pilot Plant

Pilot plant incidents have underlying causes similar to those found
in real process plant accidents and should be addressed by applying
the established guiding principle of OSHA Process Safety
Management (PSM) to these facilities.
Though the volumes of hazardous chemicals in a lab-scale pilot plant
are lower than in a commercial plant, hazards can still be a problem.
Novel operations and processes, a high density of operating
equipment, unproven or changing technology, a lack of safety-related
information at the developmental stage, waste generated by the
operation, and the use of sophisticated instruments together create a
significant hazard that can cause injuries, fatalities and property
damage.
Key strategies for safety in laboratories and pilot plants include
properly conducted preconstruction safety reviews, leak-tight design
and construction, electrical safety measures, proper ventilation,
well-planned storage, proper maintenance, and correct procedures for
the control of change.
In addition to the inherent hazards of the chemicals handled, the
following hazards need to be considered in laboratories and pilot plants:
Elevated temperature and pressure
High density of operating equipment
Untested processes or technology
Switching operations
Nonstandard equipment
Things to worry about:
Flammability and Open flames
Due to some of the inherent hazards in a lab environment,
additional fire safety measures must be observed at all times.
Flammable liquids, compressed gases, oxidizers, and a lengthy list
of other chemicals can prove to be deadly in the event of a
laboratory fire. The best defense against these hazards is
prevention and safe operating procedures.
Controlled Flame
Bunsen burners are part of everyday operations in many labs. Keep
flammables away. Never leave a Bunsen burner lit in an unattended

Combustibles in Labs
Another source of potential fires in labs can be the presence of
relatively large quantities of combustible materials. If combustibles
are required in the lab for daily usage, maintaining them in an
organized and tidy manner will help to reduce the associated risk.
State regulations for storage and handling of flammable and
combustible liquids must be scrupulously followed.
Good Laboratory Practices should be in operation in all research
labs. These good practices include the following:
Good housekeeping and tidiness.
Keep all aisles and exits clear of obstacles.
Reduce all tripping, slipping, and fall hazards.
All tools must have a designated/labeled storage space.
Label ALL equipment, materials, bottles, etc. with chemical
content and responsible person's name.
Material Safety Data Sheets (MSDS) must be available for all
chemicals in use in the pilot plant.
For all materials learn about:
o Flash points
o Auto ignition temperatures
o Explosive limits
Know evacuation routes.
Know where emergency contact numbers are posted.
Have reactive chemicals properly stored and well labeled.
Have appropriate personal protection equipment (PPE)
available and in good condition.
Have hood airflows been checked within the last year?

Properly functioning fume hoods are mandatory for protection against
chemical vapors and other harmful airborne substances. It is important to
remember that a fume hood is not a storage area. Keeping
equipment and chemicals unnecessarily in the hood may cause
airflow blockage.
A gas mask is worn over the face to protect the wearer from
inhaling airborne pollutants and toxic gases. The mask forms a
sealed cover over the nose and mouth, but may also cover the eyes
and other vulnerable soft tissues of the face. Airborne toxic materials
may be gaseous or particulate. Many gas masks include protection
from both types.
A smoke hood is a protective device similar in concept to a gas
mask. A translucent airtight bag seals around the head of the
wearer while an air filter held in the mouth connects to the outside
atmosphere and is used to breathe. Smoke hoods are intended to
protect victims of fire from the effects of smoke inhalation.
Gas Cylinders
Cylinders, no matter what their contents, deserve our
Make sure your cylinders are in good condition
o Rust, condition of the cylinder bottom, valves at
o Every cylinder should have a cap
Watch this 8 minute video:
Compressed gas cylinders are required to be secured in the
upright position by a suitable retaining strap or chain.
Any cylinders that are not in use are required to have a
protective valve stem cap in place.
Cylinders of gases that may react with one another are not
to be stored in the same area.
Cylinders of flammable gases are not to be stored with
oxidizing materials or with cylinders containing gases that
support combustion.
Diffusion of leaking gases may cause rapid contamination of the
atmosphere, giving rise to toxicity, anesthetic effects, asphyxiation,
and rapid formation of explosive concentrations of flammable gases.
The flash point of a flammable gas under pressure is always lower
than ambient or room temperature. Leaking gas can therefore
rapidly form an explosive mixture with air.
The procedures adopted for the safe handling of compressed gases
are mainly centered on containment of the material, to prevent its
escape to the atmosphere, and proper control of pressure and flow.
Emergency procedures are usually only necessary because a basic
rule of handling has been broken. It is far better to observe the
rules and avoid the need for emergency measures.
Glass in laboratories:
When handling glassware, check for cracks and chips before using
it. Damaged glassware must be repaired (if that is an option) before
use or disposed of. Handle glassware with care: avoid impacts,
scratches, and intense heating of glassware.
Flexible hoses:
Pressurized hoses are used to run tools like paint sprayers and nail
guns. These hoses themselves can be dangerous if handled
improperly. The hoses derive power from the liquid or gas that
moves inside them; however, that power also creates a reactive
force. If the force is strong enough, it can cause the hose to whip,
possibly causing serious injury if it strikes a worker, and even
creating additional hazards, such as a chemical spill.
Inspect hoses for torn outer jackets, damaged inner reinforcing, or
soft spots before using them.
Reduce the pressure in the hose to a lower level if possible.
Avoid making sharp bends in the hose, which can damage the
Don't jerk on a hose that has become snagged as this can cause
ruptures. Find the object the hose is caught on, and release it there.
Restrain pressurized hoses that are unavoidably located near other
employees with guards that are strong enough to keep the hoses in
place if a leak or rupture occurs.
Use solid lines with tight fittings if possible instead of flexible hoses
when working near other employees. Solid lines do not whip or leak
as readily as flexible hoses, which can develop leaks from vibration,
pressure cycles and aging.
Pin the two sides of the hose's twist-type fitting together using the
lugs provided. Be sure these fittings are fully secured.
Use the safety device at the air supply to reduce the pressure in the
event of a hose failure. Never connect or disconnect pressurized
hoses; always depressurize first.
Don't stop the airflow in a hose by bending or crimping with pliers
as this could cause major hose damage.

Stand clear of potential rupture points when conducting hose
pressure tests. During testing, the pressure should be increased
gradually with a brief pause between each increase. Instruments for
reading pressures should be arranged so they are clearly visible at
all times.
Key Codes
NFPA 70 National Electrical Code - The NEC addresses the
installation of electrical conductors, equipment, and
raceways; signaling and communications conductors,
equipment, and raceways; and optical fiber cables and
raceways in commercial, residential, and industrial
NFPA 30 Flammable and Combustible Liquids - provides
safeguards to reduce the hazards associated with the
storage, handling, and use of flammable and combustible
NFPA 45 Safety in Labs Using Flammable Materials - Lab
staff should ensure that stock chemicals and other
hazardous materials are stored properly in order to prevent
spills, uncontrolled reactions and minimize worker
exposures. Labs are particularly challenged because of the
number and variety of chemicals that are handled.
NFPA 496 Electrical Enclosures in Hazardous Locations - contains
requirements for the design and operation of
purged and pressurized electrical equipment enclosures. This
protection technique is used in Class I and Class II
hazardous (classified) locations to reduce or prevent the
presence of flammable materials within electrical equipment
enclosures as specified by NFPA 70. It also includes chapters
covering protection of analyzers and rooms housing
analyzers as well as a chapter on pressurized control rooms.

Hazards in the Lab/Pilot Plant

Things to worry about (cont'd):
An MSDS must be readily available to laboratory employees for
each hazardous chemical used in the work area. The MSDS must
contain all the relevant information. The location and availability of
the MSDS collection must be shared with the laboratory employees.
The collection can either be maintained as an electronic or paper
Employees are responsible for understanding the hazards involved
with chemicals they use. They must be familiar with the location
and contents of the MSDS file in their work area.
Conduct regular laboratory audits for all processes, chemicals, and
equipment. Audits should attempt to identify process hazards, safety
measures, safety training, etc.
Do not store food and beverages in laboratories. Consume food and
beverages only in properly designated areas.
Do not use laboratory glassware for food consumption.
Peer inspection of hazardous experimental setup should be
provided. Also peer reviews of all internal safety audits, training
reviews, accident investigations, and other safety related actions
should be done.
Regular Team Meetings to include Safety Discussion
All members to actively participate in Safety Inspection Program

Some guidelines:
Use appropriate personal protective equipment at all times
Use laboratory equipment for its designed purpose
Confine long hair and loose clothing
Use a proper pipetting device, never directly by mouth
Avoid exposure to gases, vapors, aerosols and particulates
by using a properly functioning laboratory fume-hood.
Know the location and correct use of all available safety
Determine potential hazards and appropriate safety
precautions before beginning new operations and confirm
that existing safety equipment is sufficient for this new
Be certain all hazardous agents are stored correctly and
labeled correctly according to Workplace
Consult the material safety data sheet prior to using an
unfamiliar chemical and follow the proper procedures when
handling or manipulating all hazardous agents.
Follow proper waste disposal procedures.

UCLA Laboratory Death

December, 2008: Sheri Sangji, 23, working in a hood, withdrew
t-butyl lithium and the syringe came apart in her hands. The syringe
contained a solution that combusts upon contact with air. The
solution spilled onto Sheri Sangji's hands and torso. A flash fire set
her clothing ablaze and spread second- and third-degree burns over
43% of her body. Her polyester sweater burst into flames. She
wasn't wearing a lab coat; no one had told her she had to.
At the direction of her boss, chemistry professor Patrick Harran,
Sangji had been trying to produce a chemical that held promise as
an appetite suppressant. She was unsupervised.
Eighteen excruciating days later, Sangji died in a hospital burn unit.
"Sheri wasn't out doing something stupid. She was working in a lab
at one of the largest universities in the world."
The accident brought into focus the dangers inside university
laboratories where students and employees, sometimes working
without proper training or supervision, routinely handle toxic,
flammable and explosive compounds.
Causes: Poor training, poor technique, lack of supervision and
improper method.

UCLA Laboratory Death (cont'd)

Two months earlier, UCLA safety inspectors found more than a
dozen deficiencies in the same lab, Molecular Sciences Room 4221,
according to internal investigative and inspection reports reviewed
by The LA Times. Among the findings: Employees were not wearing
requisite protective lab coats, and flammable liquids and volatile
chemicals were stored improperly.


But the required corrective action was not taken, records show, and
on Dec. 29 all that stood between Sangji's torso and the fire that
engulfed her was a highly flammable, synthetic sweater that fueled
the flames.

No matter where you work, Process Safety applies
Two key elements of PS in a lab are:
A. Flammability & toxicity
B. Cylinder and Equipment use/storage


Capital Projects and PSM

All of the PSM elements apply in varying degrees to all capital
projects, including new plant/facility, maintenance,
cost-saving/revenue enhancement, capacity expansion and
regulation-driven projects.
Today, we will focus on a new plant design, the most involved
capital project.
Different safety strategies (inherent, passive, active
and procedural) can be applied to address and manage hazards during
project development and execution.

At the end of the day, you will be able to:

Know what a capital project is and why companies invest.
Understand the safety strategies that are applied to capital
Understand the phases of a capital project and how PSM
integrates into each project phase.

Today's Roadmap
What is a capital project and why do companies invest?
Chemical process safety strategies
Capital project phases and PSM

What is Capital? Why invest?

A company invests to continue to grow and thereby to maximize its
wealth. The company uses its assets for investment, and these
assets are the company's capital. Capital investment is investment
in the company's assets. Before investing, opportunities
need to be thoroughly investigated so that the company can select
the one that provides maximum returns at minimum risk.
The business risk is projected by the rate of return on investment.
This is required to compensate the investors (shareholders,
owners) for the amount of risk they accept. For the company this
is translated as the cost of capital.
In addition to rate of return, companies look at the timing and risk
associated with the return.
Rate of return (RR) depends first on the amount of money expected
back from the investment and is expressed in percentages.
RR also depends on timing. The earlier the return (money) is earned
the better is the RR. So getting more money produces a higher rate
of return, and getting it sooner also produces a higher rate of
return. So an investment's rate of return also depends on when the
company expects to get the money back. Earnings are worth more
today than tomorrow because today's earnings can be invested and
can generate further earnings. In addition, the uncertainty of
earning in the future and inflation make future earnings less
For most capital investments, the amount of money and/or the time
at which the company expects to get it back are uncertain. How the
investment's rate of return is calculated depends on the risk. So the
third important dimension of an investment's rate of return is the
risk connected with the amount of money a company expects to get
back from the investment.
When a company evaluates a capital investment, the amount of
money expected back from the investment is adjusted for its timing
and risk. This is known as the time value of money.
Strong and well-managed companies spend capital judiciously.
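
The adjustment for timing described above is commonly done by discounting each future cash flow to its present value. The sketch below is one simple way to do that; treating risk by choosing a higher discount rate is an assumption made here for illustration, and the cash flows and rates are hypothetical.

```python
def present_value(cash_flow, rate, years):
    """Value today of a cash flow received `years` from now."""
    return cash_flow / (1.0 + rate) ** years


def net_present_value(initial_investment, yearly_cash_flows, rate):
    """Sum of discounted yearly cash flows minus the up-front investment.

    yearly_cash_flows[0] is received at the end of year 1, and so on.
    """
    discounted = sum(
        present_value(cash_flow, rate, year)
        for year, cash_flow in enumerate(yearly_cash_flows, start=1)
    )
    return discounted - initial_investment


# Two hypothetical projects returning the same total, at different times.
early_return = net_present_value(1000.0, [800.0, 400.0, 0.0], rate=0.10)
late_return = net_present_value(1000.0, [0.0, 400.0, 800.0], rate=0.10)

# The same early-return project judged at a higher, risk-adjusted rate.
early_return_risky = net_present_value(1000.0, [800.0, 400.0, 0.0], rate=0.20)

print(f"Early return, 10% rate: {early_return:.2f}")
print(f"Late return, 10% rate:  {late_return:.2f}")
print(f"Early return, 20% rate: {early_return_risky:.2f}")
```

Money received sooner is discounted less, so the early-return project is worth more, and raising the rate to reflect risk lowers the value of the same cash flows.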

Capital Investments
The first and foremost part of the capital investment process is
generating new ideas. Such ideas can emerge anywhere in the
company: bottom-up, top-down and from R&D. A new
product idea can come from either a new technology (discovered by
the technical side of the enterprise) or a new customer need
(discovered by the business side). In either case, the technologists
and business people work together to come up with technical
solutions and goals.
The bottom-up process might start from plant managers or even
operators. Many times a plant manager can see the potential of a
new project or of operating on a different scale or by a more
efficient method. Even plant operators could suggest using better
types of equipment for more efficient operation. After screening out
undesirable ideas, managers send the ones that appear to be
attractive to the divisional level, with supporting documentation.
Division management reviews such proposals and adds ideas of its
own. For example, division management may propose the
introduction of a new product line. Alternatively, management may
want to combine two plants and eliminate the less efficient one.
Such ideas are less likely to come from the plant managers!

This bottom-up process results in ideas percolating upward through
the organization. At each level, ideas submitted by lower-level
managers are assessed, and those with better potential and
workability are sent up to the next higher level. This process works
well because the higher the manager's level, the broader their
vision. They may refine the ideas or discard some part that they
feel is not feasible.
The other process is a top-down process. Strategic planners
generate ideas about new business opportunities the company could
grasp, profitable acquisitions of other companies that could
benefit the business, or modifications to existing business for better
cost-effectiveness. Strategic planning is a critical element in the capital
investment process. The processes complement one another; the
top-down process generates ideas of a broader, more strategic
nature, whereas the bottom-up process generates ideas of a more
project-specific nature.
Apart from these processes, many companies have an R & D unit,
either within a production division or as a separate department. This
unit is mostly involved in research and development of either new
products or more efficient processes. Research and development on
new chemical processes could be conducted by industry, academia,
and for-profit technology developers. The objective of the research
is to develop a technical solution to a known processing problem
These concepts then go to the marketing research department to
assess their feasibility in the market. Once the new chemical process
can demonstrably accomplish the required processing, with known
capital and operating costs, then it becomes usable.
Capital investment categories: Capital investment decisions are
long-term corporate finance decisions relating to fixed
assets and capital structure. Decisions are based on several
inter-related criteria and comprise an investment decision, a
financing decision, and a dividend decision.
Maintenance projects of existing infrastructure or equipment will
certainly need investment. "Capital maintenance" means
maintenance intended to extend the useful life of a facility and
equipment, including upgrades and replacement systems. Such
investment can give benefits by increasing the productivity of the
Cost saving and revenue enhancement projects could be
replacement of obsolete equipment with modern equipment, or
investing in newer technology for safer/ more efficient processes.
The capacity expansion of existing products or target markets
means an expansion of the business. This may happen when there
is more demand for the product.
A new capital investment project is important for the growth and
expansion of a company. It is also important for the economy at
large as it means research and development. This type of project is
one that is either for expansion into a new product line or into a
new product market, often called the target market.
A new business or a new product or a new target market would
transform the business. It should be approved by higher-ups in the
business organization. Such projects need detailed financial
For any investment, supporting information is always required at
a level that is appropriate for the level of investment. The larger the
investment, the more information is required.
Authorization systems and rules are in place to control and monitor
capital expenditures.

Safety Strategies for Capital Projects

Inherent: Inherent strategies often involve incorporating safety
into the basic process chemistry and unit operations, and are best
considered as early in process development as possible.
An inherent strategy is an alternative (new process, materials or
technology) that eliminates or significantly reduces a hazard, so
that safety becomes integral to the product, process or plant (e.g.
using water instead of a flammable solvent).
Passive: This is an alternative that reduces the frequency or
consequence of a hazard through the design of the process or
equipment without add-on safety devices (e.g. containment dikes
or blow-out walls; DuPont MIC tank elimination)
The design conditions should mean that the process cannot move
outside of the safe envelope under any circumstances. However,
external factors such as external damage still exist, and future
design or process changes could invalidate the protection.
Such designs minimize hazards by process and equipment design
features that reduce the frequency or consequences of incidents
without the active functioning of any device.
Active strategy uses safety systems such as the control system,
safety interlocks, automatic shutdown systems, and relief systems
to detect and correct process deviations. These layers are intended
to prevent, control or mitigate a potentially hazardous scenario.
Procedural strategy utilizes operating procedures, administrative
checks, emergency management procedures, design standards,
codes, and training to prevent or minimize effect of an incident.

These are systems that are intended to manage risk by means of the
safety/process management system, through:
Company policy
Site rules
Operating procedures
Training / refresher training
Maintenance and inspection regimes
Test procedures and schedules
Emergency response plans (on- and off- site)
Inherent and passive strategies tend to be the most robust and
However, all strategies will be needed to address all the hazards
associated with a chemical process.
These strategies are not totally distinct, with clear boundaries. They
just represent process safety approaches. People may disagree on
the labeling of some strategies. The separating lines are hazy!

Capital Project Phases

Pre-Authorization/Project Development
Authorization/Go-No Go
Post-Authorization/Project Execution
Operate & Maintain*
*PSM in these phases discussed in previous lectures

Budget Phase
Capital budgeting decisions relate to decisions on whether or not a
long-term project should be undertaken, capital facilities and/or
capital equipment/machinery. Capital budget decisions have a
major effect on a firm's operations for years to come. It is a
complex process and there are five broad phases. These are
planning, analysis, selection, implementation and overview.
Budget phase is when a project has been selected and the
schematic design is proposed.
The primary goal of this phase is to develop a clearly defined design
based upon the project's requirements, as defined by the facility
program developed during Predesign. Project quality, scope,
budget, and schedule will also be confirmed and refined.
Process and technology is reviewed and developed and screening
Preliminary process data is developed such as process flow
diagrams, heat and material balances, and simplified equipment
Preliminary cost estimates (+/- 30%) and simple project
economics are developed and suitable financial arrangements made.
Preliminary Process Hazards Analysis (PHA) completed and
environmental impact (including permitting) assessed for screening
alternatives. Risks are best mitigated by recognizing them upfront
and managing them throughout the entire project life cycle.
This is the phase when an early design PHA is scheduled. Various
safety strategies are explored. At this stage inherently safer design
and technology can be looked into and planned. Passive safety
strategies too are investigated.
PSM concerns/deliverables at this stage are: preliminary (some
inherently safer) design conditions; preliminary construction and
action plans; materials of construction; preliminary PHA and
associated issues; and process safety information, including chemical
hazards, chemical reactivity, hazards of inadvertent mixing,
inventories, and applicable codes and standards. The key PSM elements
at this stage also include baseline information for future PHA and
baseline information for future MI. This is the time to begin the
inherently safer concept, leadership and plans for employee
participation.

Pre-Authorization/Project Development Phase

In this stage the design basis is confirmed. Process design, equipment
design and construction design are all properly established.
As the process and technology are selected, large-scale drawings,
mock-ups and detailed plans are developed to present a
coordinated, clear view of the project's major elements with respect
to process, technology and utility infrastructure.
To this end, process data is developed first. Design and operating
information includes process flow diagrams (PFDs) with heat and
material balances, piping and instrumentation diagrams (P&IDs),
control narratives, interlock descriptions, pressure relief design
bases, facility siting study, dispersion modeling results, plot plans,
electrical area classification (EAC) drawings, equipment specification
sheets, and instrument specification sheets.
Simultaneously, preliminary design (equipment layout, civil,
structural, piping, electrical) and construction estimates are
developed. The focus is on finalizing all drawings and specifications
for building systems, site utilities, and components that will form
the basis for the project's Construction Documents. A final set of
comprehensive documents provides specifications and drawings
sufficiently complete to support the Contractor's guaranteed maximum
price (GMP), obtain necessary permits, and construct the project.
Vendor quotations for major equipment and machinery are received.
Vendors and contractors must be thoroughly scrutinized in order to
ensure that they will be compliant with the expectations of the
owner organization, especially as it pertains to safety, health and
the environment.
Preliminary Process Hazards Analysis (PHA) completed and
environmental impact (including permitting) assessed. The typical
and common PHA methodology at this stage in the project is a
Hazard and Operability (HAZOP) study. The HAZOP type depends on
what is being analyzed. A procedural methodology can be used
when applying HAZOP methodology to operating procedures as well
as modes of operation.
PSM elements include applicable codes and standards, process flow
diagram, thermal/kinetic chemistry information, material and
energy balances, and materials of construction. Here facility siting
basis is set. Emergency response plans and procedures are begun.
The project schedule is developed. Project monitoring and
management of costs and schedules are extremely important. A
detailed project execution timeline is set up.
Further budgetary cost estimate (+/- 10%) and project economics
are developed. The cost estimate has a single total value and may
have identifiable component values. A cost overrun can be avoided
with a credible, reliable, and accurate budgetary cost estimate.
A budget has two sides, income and expenditure; it shows how funds
would be raised and used. An estimate, on the other hand, shows
only the expenditure side.
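The two accuracy bands quoted for the estimates (+/- 30% at the Budget phase, +/- 10% before authorization) can be turned into a cost range with simple arithmetic. The following is a minimal sketch; the function name and the project value of 50 million are hypothetical and chosen only for illustration.

```python
def estimate_range(point_estimate, accuracy_pct):
    """Return (low, high) bounds for a cost estimate with symmetric accuracy."""
    if point_estimate < 0 or accuracy_pct < 0:
        raise ValueError("estimate and accuracy must be non-negative")
    delta = point_estimate * accuracy_pct / 100.0
    return point_estimate - delta, point_estimate + delta


# Hypothetical project with a point estimate of 50 million (any currency).
screening = estimate_range(50.0, 30)  # Budget phase, +/- 30%
budgetary = estimate_range(50.0, 10)  # Pre-authorization phase, +/- 10%
print("Budget phase range:", screening)
print("Pre-authorization range:", budgetary)
```

The narrower band at the later phase is what makes the budgetary estimate usable for an authorization decision.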
Safety strategies (typical): ISD, passive and active strategies are
developed, including an initial list of actions to resolve as the
design project progresses and an initial list of inherently safer
design considerations available for incorporation into the design.
PSM concerns/deliverables: design conditions and materials of
construction; preliminary PHA and associated issues; preliminary
spare parts; accurate estimates for safety devices and equipment;
sufficient time in project schedule for safety reviews, calculations
and follow-up; tie-in points; neighbors; country and local codes;
complete documentation.

Authorization Phase
Once the process and technical requirements are established and
the PHAs developed and strategies finalized, it is time to prepare
and submit project authorization documents. In a large organisation
there are probably written procedures for the analysis and approval
of capital projects. There are forms for a particular kind of projects,
state requirements.
At this stage it may be possible to secure initial/partial funding to
commence project execution

Post-Authorization/Project Execution Phase

Depending upon the project deliverables, the Execution phase can
take a long time. This is also when the bulk of money will be spent.
The resources (people, equipment and materials) must be available
to do the work, and people must know what work needs to be
completed. During this phase the design basis is reconfirmed.
Process and technology are finalized by once again giving due
consideration to various factors such as cost analysis, safety
considerations including ISD options, etc.
Process data developed (process flow diagram and P&ID, heat and
material balances for all operating modes, detailed equipment,
machinery and instrumentation design and specifications)
Firm vendor quotations for major equipment and machinery
Detailed plant design (equipment layout, civil, structural, piping,
electrical) and construction packages developed
Construction contractor bidding and selection
Process Hazards Analysis (PHA) completed and issues addressed in
project. The goal is to ensure that the process safety integrity of the
project is preserved from the completion of the initial rigorous
design PHA to the revalidation of the rigorous design PHA. HAZOP is
usually carried out as a final check when the detailed design has
been completed.
PHA action items are facility siting analysis, dispersion modeling,
and pressure relief analysis. It is also often necessary to perform a
layer of protection analysis (LOPA) to further define the risk of
specific hazard scenarios and identify their safety integrity levels
(SILs). This approach can give more focused guidance regarding
required independent protection layers (IPLs), interlocks, and
safety-instrumented systems (SISs).
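A short numeric sketch may help show how a LOPA leads to a SIL. It assumes the usual simplified LOPA arithmetic: the initiating event frequency is multiplied by the probability of failure on demand (PFD) of each credited IPL, and the remaining gap to a tolerable frequency is the risk reduction factor (RRF) required of the SIS. The SIL bands are the low-demand RRF ranges commonly tabulated from IEC 61508 / IEC 61511. All function names and scenario numbers are hypothetical.

```python
import math

# RRF bands (RRF = 1 / PFDavg) for low-demand mode, as commonly tabulated
# from IEC 61508 / IEC 61511. Treat these as an assumption and confirm
# against the standard your site applies.
SIL_BANDS = [(1, 10, 100), (2, 100, 1_000), (3, 1_000, 10_000), (4, 10_000, 100_000)]


def mitigated_frequency(initiating_freq_per_yr, ipl_pfds):
    """Initiating event frequency multiplied by the PFD of every credited IPL."""
    return initiating_freq_per_yr * math.prod(ipl_pfds)


def required_sil(mitigated_freq_per_yr, target_freq_per_yr):
    """Return (rrf, sil). sil is 0 when no SIL-rated function is needed."""
    rrf = mitigated_freq_per_yr / target_freq_per_yr
    if rrf <= 10:
        return rrf, 0
    for sil, low, high in SIL_BANDS:
        if low < rrf <= high:
            return rrf, sil
    raise ValueError("gap too large for a single SIS; revisit the design")


# Hypothetical scenario; the numbers are illustrative only.
freq = mitigated_frequency(0.5, [0.1, 0.1])  # two IPLs, each with PFD 0.1
rrf, sil = required_sil(freq, 1e-5)          # tolerable frequency 1e-5 per year
print(f"RRF needed: {rrf:.0f}, SIL: {sil}")
```

In this illustrative case the required RRF is about 500, which falls in the SIL 2 band. Results that sit on a band boundary deserve engineering judgment rather than a purely numerical answer.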
Environmental impact assessment requirements met (including
obtaining permits). An environmental impact assessment (EIA) is an
assessment of the possible positive or negative impact that a
proposed project may have on the environment, covering
environmental, social and economic aspects. The purpose of
the assessment is to ensure that decision makers consider the
ensuing environmental impacts when deciding whether to proceed
with a project.
All equipment, machinery, instrumentation, piping, materials, etc.
Construction is undertaken. Fabrication and installation is
Operating procedures developed and MI procedures are completed.
Correct equipment is installed and installation procedure for each
piece of equipment is checked and confirmed. Written procedures for
controlling operations, controlling troubleshooting, controlling
emergencies, and maintaining equipment are formatted. Operations
must write procedures containing the right content (right instruction
for each step, in the right sequence) and format the instructions
(steps and pages) properly to lower the chances of someone
making errors when following the procedures.
Operating plant personnel are hired and proper training undertaken.
This includes training in operating critical equipment, process
training, and structured training on emergency responses, handling
of hazardous chemicals.
Commissioning and start-up activity is meant to validate the
construction integrity and confirm that the facilities are delivered in
a safe, reliable and operational condition. This also provides
valuable baseline or benchmark information that can be used to
evaluate future maintenance decisions.
Project schedule finalized and projected.
Project costs tracked and projected.


PSM elements at this stage include applicable codes and standards,
P&IDs, revised materials of construction, safety interlocks and
controls, equipment design basis and some final equipment
details, and multiple layers of protection. This is the time to compile
detailed information for future PHA and MI and to begin detailed
emergency planning and response.

Safety strategies (typical) are revisited with respect to ISD,
passive, active and procedural. If there are no significant design
changes from the initial rigorous design PHA, then it is a
revalidation of the initial PHA with additional analysis of modes of
operation and maintenance procedures.
To accomplish safe operations equal focus must be given to hazards
and operability/quality issues. A Hazard and Operability (HAZOP)
study is a structured and systematic examination of a planned or
existing process or operation in order to identify and evaluate
problems that may represent risks to personnel or equipment, or
prevent efficient operation. In this phase the HAZOP is reviewed in
detail to ensure that plant emergency and operating procedures
are regularly reviewed and updated as required.
Operator training and procedure modification is undertaken as
required. Various training formats are followed such as procedural
training, on the job training and environmental health and safety
In this phase PSM concerns and deliverables are design conditions
and materials of construction; PHA and associated issues; complete
spare parts list; safety devices and equipment; tie-in points;
neighbors; country and local codes; construction/contractor safety;
operating procedures; operator training; complete documentation.



Electrical Safety in Construction

Special Focus:
Electrical Safety
Confined Space
Hot Work
In electrical safety there are two key areas: electrocution and arc flash.
Exposure to electricity is a major cause of deaths among
construction workers. Among electricians, the concern is working
live or near live wires, instead of de-energizing and using
lockout/tagout procedures. Among non-electricians, failure to avoid
live overhead power lines and an apparent lack of basic electrical
safety knowledge are the major concerns.
What Can Be Done
Following these procedures would prevent most work-related
Contractors should:
Comply with OSHA regulations on electrical safety
Train employees on electrical safety
Contact utility companies in advance to de-energize or
insulate overhead power lines
If asked to work live, verify with owner/client that
de-energizing live electrical circuits/parts is not practical or
would create a greater hazard.
Only allow work on live electrical circuits/parts in accordance
with a permit system with specific procedures.
Electrical workers should:
De-energize and lockout or tag out electrical circuits/parts
you will be working on or near
Work only on live electrical circuits/parts in accordance with
a permit system with specific procedures and only if you are
qualified to do so.
Wear appropriate PPE and use proper tools when
de-energizing/testing live electrical circuits/parts.
All other construction workers should:
Make sure you are trained in electrical safety for the work
you will be doing
Ensure machinery and power tools are properly grounded or
double insulated
Check all extension and power cords for wear and tear
Disconnect the plug on any power tool or machinery before
inspecting or repairing
Keep at least 10 feet from live overhead power lines
Keep metal objects away from live electrical circuits/parts.
Arc Flash while racking a breaker
An arc flash electrical accident occurs when a worker makes
accidental contact with an energized electrical conductor. Here a
flashover of electric current leaves its intended path and travels
through the air from one conductor to another, or to ground. The
results are often violent and when a human is in close proximity to
the arc flash, serious injury and even death can occur.
The Arc Flash can be initiated through accidental contact,
equipment which is underrated for the available short circuit
current, contamination or tracking over insulated surfaces, as well
as other causes including dust, dropping tools, accidental touching,
condensation, material failure, deterioration and corrosion of
equipment, faulty installation.
Factors that determine the severity of an arc flash injury:
Proximity of the worker to the hazard
Time for circuit to break
Protection boundaries can act as safeguards.
Flash Protection Boundary (outer boundary)
Limited Approach
Restricted Approach
Prohibited Approach (inner boundary)
Arc flash can cause the following injuries:
Skin burns by direct heat exposure.
High-intensity flash can also cause damage to eyesight
Large shock waves that can blow personnel off their feet
Loss of memory or brain function from concussion
Hearing loss from ruptured eardrums. The sound associated
with the blast can greatly exceed the sound of a jet engine
Exposure risks from flying debris. For example, shrapnel
wounds from metal parts
Shock hazard due to touching energized conductors


Other physical injuries from being blown off ladders, into
walls, etc.
What Is A Confined Space?
Many workplaces contain spaces that are considered "confined"
because their configurations hinder the activities of employees who
must enter, work in, and exit them. For all employers and
employees a confined space exhibits these types of characteristics:
Is large enough and configured such that an employee can
bodily enter and perform work
Has limited openings for entry and exit
Is not designed for continuous employee occupancy
Has the potential for a hazardous atmosphere that may
include the lack of or too much oxygen, and/or the presence
of toxic or explosive vapors or gases such as hydrogen
sulfide and methane
Has physical safety hazards such as machinery, sources of
electrical shocks, liquids (drowning or fires), steam (burn
hazard), or loose, unstable materials that can cause
employees to be trapped, crushed, or buried.
Examples of confined spaces include but are not limited to: water
and sewer pipes, pumping stations, manholes, boilers, vats, kilns,
vaults, silos, storage bins, meter vaults, tunnels, tanks, wastewater
wetwells, grit chambers, utility tunnels, crawl spaces under floors,
water reservoirs, holding tanks, pits, and sumps.
In general, confined space regulations require all employers to have:
A written confined space plan
Procedures to test and monitor the air inside confined spaces
before and during all employee entries
Procedures to prevent unauthorized entries and to have an
attendant outside the space at all times
Effective controls of all existing atmospheric or safety
hazards inside the confined space
Employee and supervisor training on safe work procedures,
hazard controls, and rescue procedures
Effective rescue procedures immediately available on site
Hot work
Most hot work operations involve a number of parties, all of whom
have responsibilities for ensuring that the work is carried out safely.
Contractors and/or maintenance staff must consult and liaise with
the departmental staff in the area that the hot work is to be
performed. A hot work permit is mandatory for performing hot work.
Hot work means the use of open fires, flames and work involving
the application of heat by means of tools or equipment. This
includes the unintentional application of heat by the use of power
tools, hot rivets or hot particles generated from cutting or welding
The sources of heat most commonly involved include:
Gas/electric welding and cutting apparatus;
Blow torches/blowlamps;
Bitumen/tar boilers;
Grinding wheels and cutting disks.
Two specific workplace hazards are associated with hot work:
Open flames or flying sparks that are able to ignite any
flammable gases and vapors
The hot work itself may produce toxic fumes and gases
Since hot work tools are highly portable ignition sources, improperly
conducted hot work is a major cause of fires and explosions.
When the Potential Hazard is getting burned by fires or explosions
during hot work, the possible solutions are:
Perform hot work in a safe location, or with fire hazards
removed or covered
Use guards to confine the heat, sparks, and slag, and to
protect the immovable fire hazards
Do not perform hot work near flammable vapors or
combustible materials. Work and equipment should be
relocated outside of the hazardous areas, when possible
Make suitable fire-extinguishing equipment immediately available
Firewatchers are required whenever welding or cutting is
performed in locations where anything greater than a minor
fire might develop
When the Potential Hazard is getting burned by a flash fire or
explosion that results from an accumulation of flammable gases,
such as Methane or Hydrogen Sulfide, around the wellhead area,
the possible solutions are:
Monitor the atmosphere with a gas detector. If a flammable
or combustible gas exceeds 10 percent of the lower
explosive limit (LEL), the work must be stopped.
Identify the source of the gas and repair the leakage.
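The 10 percent of LEL rule can be expressed as a simple check on a gas detector reading. This is a minimal sketch: the LEL values below are commonly cited handbook figures and are assumptions here, and the function names are hypothetical. A site should use the LEL data from its own safety data sheets and the alarm settings of its own instruments.

```python
# Lower explosive limits in volume percent. Commonly cited handbook values,
# used here as assumptions; replace with the site's own data.
LEL_VOL_PCT = {"methane": 5.0, "hydrogen sulfide": 4.0}

STOP_WORK_FRACTION = 0.10  # 10 percent of LEL, as stated in the text


def percent_of_lel(gas, concentration_vol_pct):
    """Express a measured concentration as a percentage of the gas's LEL."""
    return 100.0 * concentration_vol_pct / LEL_VOL_PCT[gas]


def must_stop_work(gas, concentration_vol_pct):
    """True when the reading exceeds 10 percent of the LEL."""
    return concentration_vol_pct > STOP_WORK_FRACTION * LEL_VOL_PCT[gas]


# Hypothetical readings near a wellhead, in volume percent methane.
for reading in (0.3, 0.6):
    pct = percent_of_lel("methane", reading)
    action = "stop work" if must_stop_work("methane", reading) else "continue"
    print(f"{reading} vol% methane is {pct:.0f}% of LEL: {action}")
```

Note that hydrogen sulfide is also acutely toxic at concentrations far below its flammable range, so a flammability check alone is not a sufficient control for it.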


In this lesson we have defined capital investment.
We have looked at safety strategies for addressing hazards in
chemical processes.
We have learnt about Capital project phases and associated PSM
concerns and deliverables.
We have understood electrical hazards in construction activity.


Process Safety - Design and Engineering


What is Process Safety Engineering?
Process Safety Engineering means applying a thorough knowledge
of process safety, including PHA, to your engineering techniques
and to mechanical and process design. It involves identifying hazards,
evaluating risks (qualitatively and quantitatively), and identifying
and evaluating cost-effective engineering solutions to avoid or
reduce the risks. These jobs must be performed with complete
knowledge of engineering standards and human involvement and,
most important, a thorough understanding of process safety and all
its elements as defined by OSHA.
Process Engineering Design Management is critical to delivering a
final capital asset that will meet the business objectives, cost
targets (capital, fixed and variable costs), operability,
maintainability and, MOST IMPORTANTLY, Health, Safety and
Environmental performance standards. While the discussion in this
section will focus on major capital projects, the basic concepts apply
to projects of all sizes including location specific minor capital
Technology Selection:
Technology selection is a crucial step in determining the long-term
performance of an operating unit. Performance is measured by
several factors: costs, yields, capacity and HS&E performance.
The focus of this section will be on overall PSM Performance.

Licensed Designs vs. In House Technology:

Mature technologies are frequently available for licensing from
another company using a certified contractor for detailed design and
construction. The licensed technology likely has a well-established
PSM performance record. This must be taken into account when
comparing to In House Technology. While the licensing fee adds
to the cost, it may also provide some underlying inherently safer
technology. These aspects need to be carefully weighed in making
a selection.
New Technology vs. previously practiced technology:
New technology development often brings opportunities for major
step changes in performance and economics. These new
technologies suffer from the lack of commercial implementation
experience to highlight potential PSM risks. A proper Process
Development program will investigate all aspects of the new
technology to fully understand metallurgy requirements, potential
side reactions and undesirable consequences. A thorough
technology and design review is mandatory to ensure the
fundamental technology and design do not contain significant PSM
risks. In comparing new technology with established technology,
the PSM performance of the established technology will have an
advantage in terms of understanding risks. The new technology
may bring simplifications for potentially lower PSM risks.
Demonstrated PSM Performance:
As noted above, previously practiced technologies and licensed
technologies will have established PSM performance. This past
performance can be used as an indicator of what to expect if one of
the existing technologies is implemented. Lessons can be learned
from incidents in those previous installations to allow design
changes to reduce risk. Care must be taken in modifying existing
designs not to compromise inherently safer practices that were
previously incorporated.
Design Premises / Basis of Design:
Development of the Design Premises / Basis of Design (BOD), is a
critical function that must take place early in project development.
The BOD will contain economic and business premises as well as
technology premises. With respect to PSM, the Process Engineer
must ensure that the detailed premises do not box in design
decisions that can negatively impact PSM performance. An example
of this is premises around installed spares and remote activation.
Capital cost can be avoided at some risk to stream factor but the
over-riding factor may be the PSM consequences if an immediate
online spare is not available. Process Engineers, Design Engineers
and Safety professionals all need to participate in the development of
the BOD.
Inherently Safer Design:
Inherent safety is a concept particularly used in the chemical and
process industries. An inherently safe process has a low level of
danger even if things go wrong. It is used in contrast to safe
systems where a high degree of hazard is controlled by protective
systems. As perfect safety cannot be achieved, common practice is
to talk about inherently safer design. An inherently safer design is
one that avoids hazards instead of controlling them, particularly by
reducing the amount of hazardous material and the number of
hazardous operations in the plant.
(Ref: http://en.wikipedia.org/wiki/Inherent_safety)
It is critical that the Process Design and Engineering Design teams
are skilled in applying these principles. This design philosophy
begins early in the process development or specification and must
be carried through the detailed engineering.
(Reference Book):
Process Plants: A Handbook for Inherently Safer Design by Trevor
Project Work Process:
Major Project Teams require a significant number of contributors
with a broad range of skills. Even small projects must account for
the full range of skills required even though the team may be small
with many part-time members. Focusing on major projects, a
well-defined set of work processes is required to manage / lead /
coordinate this wide set of disciplines. Most owner companies and
contractors have internal systems they apply to ensure a smooth
progression of a project from Concept to Start-up / Operation. The
Process Design Engineer is typically the Guardian of the PSM
integrity of the design and implementation.
The following are typical roles/ key stakeholders that are filled in a
major project:
Business Representative: Long-term profitability and
business strategy.
Technology Representative: Primarily technology selection /
Project Management: Manage resources / contractors.
Cost, Schedule, Quality.
Engineering Disciplines: EE, ME, Pressure Systems, Control
Systems, etc.
Process Design Engineers: Implementation of Technology.
Guardian of PSM.
Location Representative: Provide location resources and
needs / inputs.
Working with Contractors:
Early in the development of a major project, work is done to select a
contractor. Many elements are considered when choosing a
contractor. A critical factor for construction contractors is their
OSHA Recordable Rate. All contractors must keep an OSHA log so
these statistics are readily available. This gives an indication of the
construction contractor's commitment to safety on the job site.
Contractors can be changed at various phases of the project so very
complete documentation is required in case this happens. This is
particularly true around the design elements incorporated
specifically for inherently safer design and PSM considerations.
The Project Manager is typically the primary interface between the
owner company and the contractor.
Process Hazards Analysis (PHA) Freeze Design:
The detailed safety review of the final process design (PHA) is a
major milestone in a project. At this point, the design is frozen.
Any significant changes require review by the key stakeholders.
This PHA analysis can take several weeks of intense review by
representatives of all disciplines. Any significant changes can
trigger a partial review of the PHA with the attendant cost and
schedule impacts. This is a crucial step in the project work process
that gives the entire team the assurance that a thorough inspection
and study of the process has been completed. All PSM issues must
have been captured. Detailed HAZOP analysis is performed on
critical sections. Risk analysis with probability and consequences
has been performed as needed. High risk items in the risk matrix
have been addressed to the satisfaction of senior management.
A NO CHANGE mindset is critical from this point on.
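The risk matrix mentioned above combines probability and consequence into a ranking. The following is a minimal sketch of one common form, a 5 x 5 matrix scored as likelihood times severity. The category names, the scoring rule and the thresholds are all hypothetical; each company defines its own matrix and its own criteria for senior management review.

```python
# Hypothetical 5 x 5 risk matrix. Category names and thresholds are
# illustrative assumptions, not taken from any standard.
LIKELIHOOD = ["remote", "unlikely", "possible", "likely", "frequent"]
SEVERITY = ["minor", "moderate", "serious", "major", "catastrophic"]


def risk_score(likelihood, severity):
    """Score from 1 to 25: likelihood rank multiplied by severity rank."""
    return (LIKELIHOOD.index(likelihood) + 1) * (SEVERITY.index(severity) + 1)


def risk_rank(score):
    """Map a score to a band using assumed thresholds."""
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


def high_risk_items(scenarios):
    """Names of scenarios that would need senior management attention."""
    return [name for name, lik, sev in scenarios
            if risk_rank(risk_score(lik, sev)) == "high"]


# Hypothetical PHA scenarios.
scenarios = [
    ("relief valve fails closed", "unlikely", "catastrophic"),
    ("pump seal leak", "likely", "serious"),
    ("runaway reaction", "possible", "catastrophic"),
    ("sample point drip", "frequent", "minor"),
]
print(high_risk_items(scenarios))
```

A multiplicative score is only one convention; many matrices assign each cell a band directly so that a catastrophic consequence is never ranked low.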
Management of Change:
It is recognized that all circumstances could not have been foreseen
during the PHA. Given there will be some changes, a rigorous
Management of Change system MUST be instituted. The goal is for
changes to be small in size and few in number. Nevertheless, ALL
changes must go through MOC. Typically, all changes must be
signed off by:
Project Manager: Cost & Schedule
Engineering Discipline: Technical correctness
Process Design Engineer: Operability & PSM Issues (can
trigger mini-PHA)
Location Representative: Maintainability & Operability
This MOC is typically reviewed monthly by the key stakeholders to
ensure proper project controls are functioning. During the PSSR,
the PHA and the MOC Log are reviewed together to ensure key PSM
principles have been retained.
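The sign-off rule for project changes can be sketched as a small record that is approved only when every listed role has signed. The roles and their areas come from the list above; the class, method names and example change are hypothetical.

```python
from dataclasses import dataclass, field

# Roles and review areas as listed in the text.
REQUIRED_SIGNOFFS = {
    "Project Manager": "Cost & Schedule",
    "Engineering Discipline": "Technical correctness",
    "Process Design Engineer": "Operability & PSM issues",
    "Location Representative": "Maintainability & Operability",
}


@dataclass
class ChangeRequest:
    """Hypothetical entry in a project MOC log."""
    title: str
    signoffs: set = field(default_factory=set)

    def sign(self, role):
        if role not in REQUIRED_SIGNOFFS:
            raise ValueError(f"unknown role: {role}")
        self.signoffs.add(role)

    def missing(self):
        """Roles that have not yet signed, in the order listed above."""
        return [role for role in REQUIRED_SIGNOFFS if role not in self.signoffs]

    def approved(self):
        return not self.missing()


# Hypothetical change raised after the design freeze.
change = ChangeRequest("Relocate level transmitter LT-101")
change.sign("Project Manager")
change.sign("Engineering Discipline")
print(change.approved(), change.missing())
```

Keeping the missing roles visible in the log supports the monthly stakeholder review and the PSSR check described above.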
Integration with existing facilities:
Most projects will have interfaces with existing facilities. This may
be as simple as a tie-in to the biotreater or as complex as feed,
product and heat integration with existing units. These interfaces
must be specifically addressed during the PHA. This can include
taking into account the impact of a process upset in one unit on the
interconnected unit. Once the interface connections and interactions
are properly accounted for in the PHA, the MOC procedure must
include the interface units on ANY change items that could
potentially impact them.
The key to integration with existing facilities is great
communication with the location and their representative. Anything
the project does that could impact the existing facilities must be
subjected to proper analysis to ensure no PSM incidents are caused
at these interfaces.

Management of Change
To understand what management of change really means and how
it fits into the overall PSM requirements

At the end of today, you will be able to:

Know the expectations of the PSM regulation
Know when a change is a replacement in kind and when it
requires an MOC

The 14 PSM elements

Employee Participation
Process Safety Information
Process Hazard Analysis
Operating Procedures
Training
Contractors
Pre-Startup Safety Review
Mechanical Integrity
Hot Work Permit
Management of Change
Incident Investigation
Emergency Planning and Response
Compliance Audits
Trade Secrets

Management of Change (MOC)

A process by definition is something that is going on. It is never
static. Neither is a plant or system ever static. Changes occur or are
made to occur for definite reasons. Managing these changes is a








programs because of its critical role in managing safety.

The purpose of MOC is to ensure that no unexpected hazards are
introduced in the system/ process and risks are assessed and
For managing change:

The employer shall establish and implement written
procedures to manage changes (except for "replacements in
kind") to process chemicals, technology, equipment, and
procedures; and, changes to facilities that affect a covered
process.
The procedures shall assure that the following considerations
are addressed prior to any change:
o Technical basis for the proposed change
o Impact of change on safety and health
o Modifications to operating procedures
o Necessary time period for the change
o Authorization requirements for the proposed change

Management of Change
Employees (operating, maintenance, and contract employees)
affected by a change in the process shall be informed of, and
trained in, the change prior to start-up of the process or
affected part of the process.

If a change results in a change in the process safety
information, such information shall be updated accordingly.
If a change results in a change in the operating procedures,
such procedures or practices shall be updated accordingly.
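The considerations that must be addressed before a change can be treated as a checklist, with "replacements in kind" exempt from the procedure. The following is a minimal sketch; the field names and the example record are hypothetical, and deciding whether a change really is a replacement in kind remains an engineering judgment that code cannot make.

```python
# The five considerations listed above, as hypothetical field names.
CONSIDERATIONS = (
    "technical_basis",
    "safety_health_impact",
    "operating_procedure_changes",
    "time_period",
    "authorization",
)


def moc_gaps(change):
    """Return the considerations a change record has not yet addressed.

    A replacement in kind is exempt from the MOC procedure, so it has no gaps.
    """
    if change.get("replacement_in_kind", False):
        return []
    return [item for item in CONSIDERATIONS if not change.get(item)]


# Hypothetical change record: a pump swapped for a different model.
pump_swap = {
    "replacement_in_kind": False,
    "technical_basis": "higher head required after debottlenecking",
    "safety_health_impact": "reviewed; seal flush plan unchanged",
    "authorization": "unit manager",
}
print(moc_gaps(pump_swap))
```

A change with any remaining gap should not proceed; once the gaps are closed, the training and documentation updates described above still apply before start-up.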

This is the initial paragraph from the Chemical Safety Board's
bulletin regarding an incident that led to 6 fatalities in a coking
unit in 1998.
Note that the MOC applies not just to hardware and maintenance
but also to procedures, operating limits and other soft items that
impact HS&E.
An incident occurred in a Delayed Coker unit at a west coast
refinery. A delayed coker converts heavy tar-like oil to lighter
petroleum products, such as gasoline and fuel oil. Petroleum coke
is a byproduct of the process. Here are some pictures to explain the
basic working of a Delayed Coker Unit.

Figure 1

Figure 2


Figure 3

Figure 4

Management of Change

This picture was taken after the fire that was a result of a failed
MOC situation. The incident occurred in a Delayed Coker unit at a
west coast refinery. A delayed coker processes the 1000+ material
from a crude unit. It heats the material to around 600 deg. F, runs it
through a combination tower that flashes any residual light ends,
and sends the remaining material to a furnace that heats it
to 950+ deg. F. The material then goes to a very
large vessel that provides residence time for the material to cook







dehydrogenation and long chain molecule cracking unit. In most
cokers, the residence time for the hot heavy material in the large coke
drum is about 12 hours for the cooking process to take place. At
the end of that time the light ends that have been generated will be
cooked off, collected and processed (in the top of the combination
Steam is then introduced into the Coke drum which serves as a
cooling media and to induce fractures into the coke. After the out

going steam indicates that the material in the drum is cooled

sufficiently (think, how would you know that?),the pressure is
reduced to atmospheric, the top head is then removed, then the
bottom head is removed, then a drill is inserted into the top and a
bore hole is made. After that another drill bit is inserted and the
contents drilled out. The medium that drills out the coke is very
high pressure water, say 5,000 psi.
In this case an incident occurred in 1996 where the filling of the
drum was interrupted. When the operators attempted to introduce
steam to cool the drum the piping was plugged and they were
unable to steam the drum. They then introduced water to the drum
to effect the cooling, but no formal procedure was in place and they
did not cool the drum sufficiently. When the drum was opened a
torrent of water, heavy oil, and coke spewed out which created a
hazard and required a major cleanup.
An internal investigation team recommended that the procedures be
written for cooling/emptying partially filled drums. However, this
task was not completed. Fast forward to 1998 when a power failure
interrupted the process again. The drum was only partially filled
(about 7%). After power was restored the material in the drum had
partially cooled and the piping between the furnace and the drum
was plugged so the normal route for cooling steam was blocked.
The shift supervisor was aware of the seriousness of the situation,
but NO FORMAL PROCEDURE WAS IN PLACE. Instructions were left
with the night shift to not add any water, but to simply allow the
drum to sit and cool overnight. (What do you think of this? Do you
think that radiant cooling will be of any use?) The next morning the
process supervisor met with the day crew to determine how to
empty the partially filled drum. No engineers were in attendance at
the meeting. Surface temperatures on the bottom flange seemed
cool to the touch and surface measurements of the skin of the drum
indicated temperatures of ~230 deg. F. (Do you think that surface
temperatures might be indicative of the internal temperatures?).
A steam hose was hooked up to a fitting and steam was introduced.
An operator commented that the top of the piping warmed up when
the steam was introduced, but the bottom remained cool. Given this,
the supervisor and process operator directed that the drum be
opened, but with minimum personnel present. Because of the
possibility of toxic gases being present, the mechanics who loosened
the bolts were instructed to wear self-contained breathing apparatus
(SCBA). The top head was loosened and removed from the drum.
The bottom head was unbolted and held in place by a hydraulic
dolly. The operator then activated a switch that lowered the bottom
head. When that occurred a whooshing sound was heard and a
white cloud of vapor came out of the bottom of the drum. The vapor
being released was above its auto-ignition temperature and ignited.
Six people were engulfed in the flames and perished.
Six people died because an MOC on operating procedures
was not completed.

Management of Change







The employer shall establish and implement written
procedures to manage changes (except for "replacements in
kind") to process chemicals, technology, equipment, and
procedures; and, changes to facilities that affect a covered
process.
Various procedures and practices exist that evaluate contemplated
changes to a process. This is an important element that leads to
compliance with regulations but, more importantly, provides for
improved understanding of the facilities and the potential impacts
on the safety and health of employees. Determination of the effect
that modifications may have within a facility is essential to the








necessary, up-to-date, accurate safety information to all parties
responsible for process activities is an essential part of the program
as well.
So what is the simplest change? A replacement in kind. What is a
replacement in kind? To be an "in kind" replacement, the new item
must satisfy the design specifications of the item being replaced.
Other changes in operation to which MOC does NOT apply include those
that are within approved design limits and those due to unforeseen,
short-term excursions outside of established limits. How about a
simple block valve change? Is that a replacement in kind? On the
surface it sure sounds like it is, but what if the valve is from a
different manufacturer than the one in place? How do you know that
the metallurgy is the same internally? Does the metallurgy make a
difference? These are some simple questions that always need to be
asked and answered. If the answers show that the new valve meets the
original specifications, and the valve is the same type (like gate or
quarter turn), then it is a replacement in kind.
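
The valve questions above amount to comparing the new item against the design specification of the item being replaced, attribute by attribute. A minimal sketch of that comparison follows; the attribute names and values are hypothetical examples, and a real specification sheet would carry many more entries.

```python
def spec_mismatches(original_spec: dict, new_item: dict) -> dict:
    """Return {attribute: (required, offered)} for every attribute that differs."""
    return {key: (required, new_item.get(key))
            for key, required in original_spec.items()
            if new_item.get(key) != required}


def is_replacement_in_kind(original_spec: dict, new_item: dict) -> bool:
    """In kind only if the new item satisfies every specified attribute."""
    return not spec_mismatches(original_spec, new_item)


# Hypothetical specification of the valve currently installed.
installed = {"type": "gate", "pressure_class": 300,
             "body_material": "carbon steel", "trim_material": "13Cr"}

# Same type and rating from another manufacturer, but different internals.
offered = dict(installed, trim_material="316 SS")

print(is_replacement_in_kind(installed, offered))
print(spec_mismatches(installed, offered))
```

Any mismatch reported here means the swap is not a replacement in kind and should be routed through the MOC procedure.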
Virtually all companies now have checklists to be followed that give
specific criteria for the level of oversight necessary to sign off on
MOCs. For simple replacements in kind the operator and supervisor
will review and sign off that a formal MOC is not required. The
documentation is then sent to the unit superintendent (or manager)
for their notification. This ensures that a unit manager who does not
agree that an MOC is unnecessary can override the decision and add
additional technical input. The procedures in place must recognize
the complexity of the proposed change and provide appropriate
technical support and management approval.


Management of Change Hazard Reviews

Potential impact on employee health and safety
Potential impact on the environment
HAZOP - Change and addition equipment and piping projects
Modifications that are within the scope of the Management of
Change procedure must receive a "Hazard Review" to address the
potential impact of the change on employee safety and health. The
effect on the environment should be considered as well. A HAZOP
review should be completed for all "change and addition" type
equipment and piping projects, which have an impact on the unit's
process safety. The MOC process should utilize the appropriate
hazard analysis techniques (such as Checklist, FMEA, Procedure
Reviews, etc. that have been previously presented) for reviewing
other types of change. The criteria for utilizing the various
techniques must be clearly spelled out and consistently applied.

Management of Change PSSR

Any change requiring an MOC must undergo a PSSR before the
change is implemented

Management of Change PHA

ALL MOCs since the previous PHA (or PHA revalidation) MUST be
reviewed at the next PHA revalidation

Management of Change
Duration of time when valid
Necessary time period to implement the change
MOCs can be done for a permanent change or for a temporary
change; both need specific procedures and time frames. Temporary
MOCs are needed when, say, a portion of an automatic shutdown system
needs on-line maintenance. During the time of maintenance,
alternative mechanisms must be in place to perform the shutdown
duties of the original system. To do so requires a plan in place,
operators being trained and informed, and appropriate levels of
supervision being informed of the temporary state of the system.
It should be clear that temporary MOCs are just that, temporary. A
specific time period must be a part of the process to ensure that the
temporary MOC does not become a permanent change. If that time
period is exceeded, appropriate management must be informed and
must approve continued operation.
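
The expiry rule for temporary MOCs can be expressed as a small status check. This is a sketch under assumptions: the function name, the status strings, and the idea of a single boolean "extension approved" flag are hypothetical simplifications of whatever tracking system a site actually uses.

```python
from datetime import date


def temporary_moc_status(expires_on: date, today: date,
                         extension_approved: bool = False) -> str:
    """Classify a temporary MOC against its approved time period."""
    if today <= expires_on:
        return "valid"
    if extension_approved:
        # Management was informed and approved continued operation.
        return "extended"
    # Past its time period with no approval: must be escalated.
    return "expired - inform management"


print(temporary_moc_status(date(2024, 3, 31), today=date(2024, 3, 15)))
print(temporary_moc_status(date(2024, 3, 31), today=date(2024, 4, 2)))
```

Running such a check daily against every open temporary MOC is one way to keep a temporary change from quietly becoming permanent.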
Permanent MOCs need a time period for implementation. During
that time adequate controls must be in place to ensure that the
corrective actions that the MOC was intended to implement are
covered by an alternative means.

Management of Change Levels

Simple MOC - local operations approval
MOC with technical impact - area manager with technical input
MOC with complex engineering aspects - plant manager with
technical and engineering input

MOCs are not created equal, nor should they be treated as equals.
Simple changes should require simple review and documentation.
For example on a unit, the night shift finds that one of their
procedures for putting on a pump will not work. The operators who
know the process well propose a modification to the existing
procedure. The shift supervisor reviews the proposal and agrees it
makes sense and approves the change. The change in procedure is
then followed and the pump is put on line. The next day the
superintendent of the unit is informed of the change and agrees it
makes sense, and then he ensures that the changed procedure is
documented, the other crews are informed of the change and each
of the unit operators signs off that they have been informed of the change.
More complex issues for a unit modification require that an
appropriate technical support review provide a solution, and
approve that solution. The local superintendent then should agree
that the change is valid and contains all technical support that is
needed for a complete solution. Once that is satisfied the approval
of the change should be reviewed and agreed upon by the area manager.
Finally the most complex changes should include all of the previous
steps, but also be reviewed by the engineering department for
completeness. Approval for these types of MOC then should be
given by at least the plant manager if not someone higher in the
organization. This is to ensure that should the proposed change









organization be apprised of the issue and requested to seek a
proper solution.
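
The three MOC levels described in this section can be sketched as a simple routing rule. The two yes/no questions used here are a hypothetical simplification; real site checklists give far more specific criteria for choosing the level.

```python
from enum import Enum


class MocLevel(Enum):
    """Approval wording taken from the three levels listed in this section."""
    SIMPLE = "local operations approval"
    TECHNICAL = "area manager with technical input"
    COMPLEX = "plant manager with technical and engineering input"


def moc_level(has_technical_impact: bool,
              has_complex_engineering: bool) -> MocLevel:
    """Pick the highest level that any aspect of the change calls for."""
    if has_complex_engineering:
        return MocLevel.COMPLEX
    if has_technical_impact:
        return MocLevel.TECHNICAL
    return MocLevel.SIMPLE


# A procedure tweak proposed by the night shift, with no technical impact.
print(moc_level(False, False).value)
# Replacing a gas turbine with an electric drive.
print(moc_level(True, True).value)
```

The rule always escalates: if a change has any complex engineering aspect, it goes to the highest level regardless of how simple the rest of it looks.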

Management of Change Areas

Hazard Review
Hazard Review Recommendations Resolved
Update Operating Procedures
Inform/Train Operating Personnel
Inform/Train Maintenance Personnel
Inform Contractor
Update Chemical Information (MSDS)
Update Block Flow Diagram
Update Material and Energy Balance (for units built after
Update P&IDs
Update Operating Limits
Update Relief System Data
Update Safety/Shutdown System Data

What we have learnt so far

Reviewed the PSM regulation
Reviewed the types of equipment covered
Reviewed the types of non-equipment covered
Reviewed the levels of MOC
Reviewed the levels of approval authority

Class Exercise
Read Chemical Safety Board Report on the Coker incident.
List your observations of the report

What did they get right?

What may they have missed?
What is your logic for your comments?
Discuss CSB report on the Coker incident
Questions about the Coker Incident
What is the OSHA definition of a Covered Facility?
Analyze the CSB document. Do you agree with the conclusions? Why
or why not? Do you read anything into it that the CSB did not
highlight or focus on?

Management of Change
Process Chemicals
Controls/Critical Alarms/Instrumentation
Operating Limits
Operating Procedures
Relief/Safety Systems
This list shows areas that will require a formal MOC when a change
is contemplated.

Management of Change Process Chemicals

MOC is required for the following:
New applications

A change in location of the addition/injection point

Discontinuation of an additive/chemical
Replacement with a product that is not "in-kind"
Concentrations outside of established limits
Change of dilution fluid
Vendor supplied equipment







additives/chemicals which are added to highly hazardous processes

or streams, and those which are explicitly mentioned in Appendix














demulsifiers, antifoaming agents, etc.

This seems to be an innocuous requirement, but in reality that is
not the case. Failure to adequately manage a change in chemical
addition directly led to a catastrophic explosion in a gulf coast
refinery. A simple change of adding an inhibitor and a filming amine
through the same additive nozzle led to unexpected corrosion and
loss of containment in the overhead of a cat fractionator. The vapor
cloud exploded and eight workers died. The two additives were not
compatible. Had they been injected through separate nozzles with
appropriate distance between them, the explosion would never have happened.
MOCs are important.





MOC is required for:
Changing/modifying software or hardware, including control
program logic

Changing the sensing point (process variable) input to a control
loop/program or critical alarm
Changing the variable being controlled or the means by which a
process variable is controlled in a control loop/program
Changing a control valve failure action
Disabling, or bypassing control loops/programs or a critical








maintenance covered by a routine procedure)

Changing a critical alarm set point
Adding or deleting a critical alarm
Making critical alarm hardware modifications
Management of Change applies to process hardware, software, and
its associated instrumentation used in the controls and control
strategies for shutdown, interlock, and safety systems, as well as
for Critical Alarms, items associated with upper and lower design
and safe operating limits, and those involved with Critical Corrective

Management of Change Equipment & Piping

MOC is required for:
Adding any equipment
Removing any equipment
Replacing/modifying any equipment, other than an "in-kind"
replacement that meets the original equipment specifications
Operating equipment outside of established design limits
Changing an established design specification or rerating a
Modifying major structural support members in ways that affect
their design loading capacity or fire resistant characteristics

Management of Change applies in general to equipment and piping
that are part of a covered process. Some examples are pressure
vessels, facilities for storage of 'OSHA listed' chemicals in excess of
the Threshold Quantity, rotating equipment, heat exchangers,
furnaces, filters, piping and valves, etc. Not everybody would treat
every change as requiring an MOC, but it is essential to treat even small
changes with respect and look into potential consequences.

Management of Change Operating limits

Changing an upper or lower safe limit
Establishing a new limit
Management of Change applies to established process safety limits
for raw materials (feedstocks, catalysts, etc.), product streams, and
operating conditions (flows, temperatures, pressures, compositions).

Management of Change Operating procedures

Developing a new procedure
Modifying or revising an existing procedure
Remember the Coker incident!!
The Management of Change procedure applies to changes in
process safety related procedures that are used for startup,








temporary operations, or the operation of safety systems

We began this session reviewing an incident that was a result of
NOT modifying a procedure when the limits of the then current
procedures were known. So, this should be absolutely clear now.


Management of Change Relief & Safety systems









safety/shutdown system
Changes which could affect the capacity or design basis of a
safety system
Adding or removing a safety or shutdown system
Bypassing or disabling a relief, safety, or shutdown system
(except when addressed by a routine procedure for startup,
shutdown, or maintenance)
Replacing/changing system components (except for "in-kind"
Management of Change applies to safety systems which are
designed to protect equipment, facilities, and the process such as
those used for shutdown, safe-off, deluge, mitigation, chemical or
hydrocarbon detection, fixed fire protection/suppression systems,
emergency dump systems (deinventory), relief system equipment
or relief systems which are intended to contain/control/mitigate
releases of flammable or toxic material, and building pressurization

Management of Change - Technology

New or improved catalyst or additive (a process chemical
change). New process materials certification
Upgraded control hardware system (a control/instrumentation

Implementation of a new, innovative way of operating an
existing facility (an operating limits change)
Operating the process differently so as to produce a new
product (an operating procedure change)
Upgrading a toxic or hydrocarbon detection system (a safety
system change)
Changes in facility personnel job duties
Management of Change applies to changes in technology that can
potentially have an adverse effect on a covered process. Many
process technology changes may also be categorized as other types
of change, but thinking in terms of technology changes may trigger
one to consider the Management of Change procedure for situations
that might not be otherwise considered. This can become very
serious on reaction systems that are exothermic. If the reaction
regime goes from the stable region to an unstable regime, then an
autogenous reaction condition could develop. This was mentioned in
a previous class, but bears repeating.

Management of Change summary

Reviewed the PSM regulation
Reviewed the types of equipment covered
Reviewed the types of non-equipment covered
Reviewed the levels of MOC
Reviewed the levels of approval authority

Hierarchy for Management of Change


1. Replace a valve in the unit that has failed with one from the
Who would you discuss this with?
What questions would you ask?
Who would be the person you would expect to
approve or decide to escalate further?
Explain your logic for these decisions.

2. An existing Gas Turbine is due for a Major overhaul. The
simple cost of replacing the turbine with an electric drive is
less than the overhaul.

How would the MOC process be carried out and what parties would
have to sign off on it?
This is a complex problem that probably requires a HAZOP analysis.


More Bread Crumbs

Construct a What If list. Should be at least 5 to 10 top level
items. Show your work.
Analyze each item to determine who at what level on the
organization chart is required to put the decision in the Green
(may be more than one person). Show your logic.
Review the sketch of turbine vs. electric driver.
Consider the whole plant.
Metallurgy is not an issue.

Managing Risk - The importance of Operating Procedures

To understand what makes good operating

What should be included?
What should not be included?
Who is the audience?
Why are clear procedures necessary?
How to verify
How to audit

At the end of today, you will be able to:

Understand why procedures are important
Understand what should be included in procedures and why
Understand how to verify that appropriate procedures are in place
On one overseas audit of a condensate plant in the Middle East, all
operators were graduate engineers who knew the plant very well
and had been there a very long time. They were asked to show the
author their startup procedures. The Supervisor pulled out a piece
of paper and wrote about 5 lines like 1) Start up tower A, 2) Start
up tower B, 3) Heat up furnace 4) Commission treater 5) Send
product to storage.
This response was highly unsatisfactory. No matter how well trained
the operators are on a facility, they will each have their own best
way of doing things. That will lead to inconsistency at best, and a
process incident at worst. The best way to start up a unit must be the
same each time. To achieve that, procedures need to be clearly
documented, reviewed before each use, and audited for compliance
with expectations.
A standard operating procedure (SOP) is a set of instructions that
addresses the who, what, where, and when of an activity. It should
also address the why when that adds to the clarity. SOPs are meant to
be a guide to standardize the activity, to aid in producing reliable

Point 1: What elements do you think are important to include?
First list your response and see how it matches with the following.
Preliminary preparations - utilities available, all blinds removed
(isolation devices), mechanical / electrical / supervision / all
crafts scheduled
Preparation - auxiliary equipment and services; notify all
affected units
Elimination of air - use steam / use nitrogen?
Tightness testing - pretty obvious, but maybe not at this
stage of their career
Backing in natural gas or fuel gas - the potential problem
discussed on the next slide
Elimination of water
Bringing the unit on stream
The procedures should comply with all company requirements
for process safety

The procedures should comply with environmental laws and


Point 2: What hazards do you think are important to

First list your response and see how it matches with the following.
Mixing air with hydrocarbons - discuss the fire triangle
Contacting water with hot oil - discuss the flashing of water
when contacted with hot oil (This was the primary cause of
the Whiting 1955 explosion - only about 2 barrels of water
caused the destruction of about half of the refinery)
Freezing of residual water - obvious blockage of flow paths
that would also prevent complete oxygen freeing
Exposure to toxic gases and liquids
Pyrophoric iron sulfide - in the presence of air and
flammables could close the fire triangle and lead to an
explosion or fire
Excessive vacuum - could collapse a vessel; could be caused
by condensing steam
Thermal shock
Mechanical shock

Before the procedure begins!

PSSR complete and signed off
o Should include PHA action item completion
o Should include adequate staffing for the duration

o Should include any changed operations since last time used
Must have been certified as current
A best practice is to have undergone a complete dry run
Pre-startup Safety Review (PSSR) mandates a safety review for new
facilities and significantly modified work sites to confirm that the
construction and equipment of a process are in accordance with
design specifications; to assure that adequate safety, operating,
maintenance and emergency procedures are in place; and to assure
process operator training has been completed. Also, for new
facilities, the PHA must be performed and recommendations
resolved and fully implemented before start up. Modified facilities
must meet management of change requirements.
The PSSR process confirms that action items identified in the PHA
are complete, that adequate staffing will be in place, and that MOCs
are complete. As a best practice, the startup procedure should also
undergo a dry run before commencing operation.

Who is the procedure for?

Operator with one year of experience
It's 3:00 AM
No one to call
Manager of area - clear expectations
Even if you have 30-year operators, everyone can have a bad day.
No startup should suffer because of anyone's bad day. Hence the need
for clear procedures that are signed off, with any anomalies written down.
On this latter point, if an anomaly is found during the execution of
a procedure, even after a dry run, it has to be resolved. If it is simple,
the unit supervision, with all involved agreeing, could change the
point. If it is significant, the operating manager must be involved; hence
during a startup a manager must be assigned 24/7 until stable
All industrial plants require an extensive set of operating procedures
which define the steps required - for example - to start the plant up,
to shut the plant down, to isolate pieces of equipment for
maintenance or to deal with emergency situations.
Thus written operating procedures are meant for all the operators
and workers.

Steps to include
Procedures should be detailed, written as check off points for
date, time, and initials of the operators at each step with an
area for comments
Include a brief unit status report at the beginning of each
major step to help tie multiple, parallel steps together.
Include acceptable limits before moving forward
Check the repair list to verify that all work is complete
Check that all safety related Pre-Startup Safety Review (PSSR)
items and Management of Change (MOC) items have been
resolved and are in place, including operator-training
The first point may raise a concern about recording the initials of the operator.
The reason for this requirement is to have a specific individual to go
to if any corrections are needed or any problems occur.
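
The check-off format described above (date, time, initials and a comment area at each step) maps naturally onto a small record per step. The sketch below is illustrative only; the `Step` class and its field names are hypothetical, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Step:
    """One check-off point of a written procedure (hypothetical layout)."""
    text: str
    initials: Optional[str] = None
    signed_at: Optional[datetime] = None
    comments: str = ""

    def sign_off(self, initials: str, when: datetime, comments: str = "") -> None:
        """Record who completed the step, when, and any anomaly noted."""
        if not initials.strip():
            raise ValueError("initials are required to sign off a step")
        self.initials, self.signed_at, self.comments = initials, when, comments


def first_unsigned(steps):
    """Return the first step that has not been signed off, or None."""
    return next((s for s in steps if s.initials is None), None)


procedure = [Step("Check the repair list"), Step("Verify PSSR and MOC items")]
procedure[0].sign_off("RC", datetime(2024, 5, 1, 3, 0), "Repair list complete")
print(first_unsigned(procedure).text)
```

Because every signed step carries initials and a timestamp, there is always a specific individual to go to if a correction is needed later.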

Steps - continued
Verify that all blinds have been removed or are in proper
startup locations. A master list needs to be maintained
Determine that the vessels are clean and free of debris
immediately prior to closing them. Operators should witness
and verify
Check operability of alarms, trips, MOVs (motor operated
valves), deluge systems, control valves, and the fail safe
position of the control valves
Give notice of startup to Utilities, Oil Movements, and other
units that may be affected. (Advance notice several or more
hours before startup, then notice at the actual time that
startup begins, or when startup will begin affecting other

Steps - continued
Put utility systems in service
Check all isolation block valves for relief valves to positively
verify an open path (Do this before the tightness test.)
Check to be sure all water coolers and condensers are drained
and vented before steaming to oxygen free the shell side.
Check for tube leaks when the shell side is pressured (List all
coolers and condensers.)
The relief valve (RV) is a type of valve used to control or limit the
pressure in a system or vessel, which can build up because of a process
upset, instrument or equipment failure, or fire.
The idea behind a pressure relief valve is that it provides an outlet
for dangerous buildups of pressure. Pressurized gases and liquids
can both be regulated with the assistance of a pressure relief valve.

In the event that pressure in the system becomes too high, instead
of blowing out the entire system, the pressurized liquid or gas will
vent from the pressure relief valve, bringing the pressure back
down and preventing a serious incident.
The pressure is relieved by allowing the pressurized fluid to flow
from an auxiliary passage out of the system. The relief valve is
designed or set to open at a predetermined set pressure to protect
pressure vessels and other equipment from being subjected to
pressures that exceed their design limits. When the set pressure is
exceeded, the relief valve becomes the "path of least resistance" as
the valve is forced open and a portion of the fluid is diverted
through the auxiliary route. The diverted fluid (liquid, gas or
liquid-gas mixture) is usually routed through a piping system known as a
flare header or relief header to a central, elevated gas flare where it
is usually burned and the resulting combustion gases are released
to the atmosphere. As the fluid is diverted, the pressure inside the
vessel will drop. Once it reaches the valve's reseating pressure, the
valve will close.
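
The open-and-reseat behavior just described can be sketched as a two-threshold rule: the valve opens when the set pressure is exceeded and closes again only once the pressure has fallen to the reseating pressure. The pressures below are made-up numbers in arbitrary units, chosen only to show the behavior.

```python
def relief_valve_states(pressures, set_pressure, reseat_pressure):
    """Return True/False (open/closed) for each pressure reading in sequence."""
    is_open = False
    states = []
    for pressure in pressures:
        if not is_open and pressure > set_pressure:
            is_open = True       # set pressure exceeded: valve is forced open
        elif is_open and pressure <= reseat_pressure:
            is_open = False      # pressure has dropped to the reseating pressure
        states.append(is_open)
    return states


readings = [90, 101, 98, 94, 92]
print(relief_valve_states(readings, set_pressure=100, reseat_pressure=93))
# -> [False, True, True, True, False]
```

Note that the valve stays open at 98 and 94 even though both are below the set pressure; it closes only once the reseating pressure is reached.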
As should now be quite clear, relief valves are the safety device of
last resort.

Steps - continued
Meg all electric motors as per existing guidelines in sufficient
time so as not to delay startup (List and check off each
Check all fire monitors, fire extinguishers, self-contained
breathing apparatus (or other respiratory equipment), safety
showers, eye bubblers and other safety equipment.

Review procedures prior to startup and note any special

Include arrow diagrams or other sequence specific aids to
illustrate the safe sequence of events. Include other diagrams
as needed.
Megging motors refers to testing the insulation resistance of the
motor windings with a megohmmeter ("megger") before the motors are
energized. Motors that fail the test may need to be
dried out or otherwise serviced prior to activation.
Regarding the third bullet point: this is a good time to note that a
best practice is to dry run the entire startup procedure to catch any
problems and ensure that operators know what, where, and how
prior to the start of the real thing.
Arrow diagrams are wonderful aids to startup as they show the
sequence of events that can be done in parallel and those that must
be done sequentially.
What are arrow diagrams?
The arrow diagram is a network diagramming technique in which
activities are represented by arrows. The arrows indicate the
required order of tasks in a process, the best schedule for the entire
project, and potential scheduling and resource problems and their
solutions. The arrow diagram allows calculation of the critical path
of the project. This is the flow of critical steps where delays will
affect the timing of the entire project and where addition of
resources can speed up the project.
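
The critical path mentioned above is simply the longest chain of dependent steps. The sketch below computes it for a handful of startup steps; the task names and durations (in hours) are invented for illustration, and for brevity each activity is stored as a named entry with a list of predecessors rather than drawn as an arrow.

```python
from graphlib import TopologicalSorter


def critical_path(durations, predecessors):
    """Return (total_duration, [tasks on the longest dependent chain])."""
    earliest_finish = {}
    came_from = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(predecessors).static_order():
        preds = predecessors.get(task, [])
        # A task can start only when its slowest predecessor has finished.
        slowest = max(preds, key=earliest_finish.get, default=None)
        start = earliest_finish[slowest] if slowest is not None else 0
        earliest_finish[task] = start + durations[task]
        came_from[task] = slowest
    # Walk back from the task that finishes last.
    task = max(earliest_finish, key=earliest_finish.get)
    total = earliest_finish[task]
    path = []
    while task is not None:
        path.append(task)
        task = came_from[task]
    return total, path[::-1]


durations = {"purge": 4, "utilities": 2, "tightness_test": 3,
             "heat_furnace": 5, "feed_in": 1}
predecessors = {"purge": [], "utilities": [],
                "tightness_test": ["purge"],
                "heat_furnace": ["tightness_test", "utilities"],
                "feed_in": ["heat_furnace"]}
print(critical_path(durations, predecessors))
# -> (13, ['purge', 'tightness_test', 'heat_furnace', 'feed_in'])
```

In this example putting utilities in service runs in parallel and has slack, while any delay in purging or tightness testing delays the whole startup.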

Arrow Diagram Example

This is an example of an arrow diagram that depicts various parallel
and sequential steps. For any procedure this is a clear mechanism to
show everyone just where the procedure stands, what has been completed
and what still must be done. This technique was used with various
color-coded markers to show various aspects of the procedure such
as when air free, when pressure tested, when oil was in the various
sections. Of course, when starting up a complex unit the arrow
diagrams could contain ten or more parallel paths and many, many
more sequential steps. This is why the arrow diagrams were and are
indispensable to an uneventful startup.

Steps - continued
When referring to temperatures, pressures, flows, and levels, give
the equipment number as well as name or function (e.g., TRC 4
depropanizer reboiler control). If it is important not to exceed a
certain temperature, pressure, etc., specify with a short explanation
the reason for the maximum value. Such as:
What is the Process Parameter? (It is the current status of a
process under control. Measurement of process parameters
is important in controlling a process. The process parameter is
a variable feature of the process, which may change rapidly.
Accurate measurement of process parameters is important for
the maintenance of accuracy in a process.)
What is the Process Limit? (The safe range lies between the
safe upper limit and the safe lower limit. If the value of the
parameter goes outside this range then the process is, by
definition, unsafe, and action must be taken.)
What are the Deviation Effects? An evaluation of the
consequences of deviations, including those affecting the
safety and health of employees, needs to be included in the
operating procedures.
What are the Recovery Measures? Recovery measures are
also required. Not all failures can be foreseen. Even foreseen
failures cannot always be prevented. So recovery measures
must be anticipated and documented. Sometimes near misses
can lead to formulating effective recovery measures.
These points help ensure that only the correct control point is
addressed during a startup. A very good point to make is that if the
step is directed by radio then both sides of the conversation repeat
the details to ensure clarity.
If there is a possibility of a reaction taking off, the recovery
measures should cover the correct steps to take to avoid an
autogenous reaction. Or similarly if another unit could be affected
by an exceedance of a process parameter, here is the place to note
and put corrective actions in black and white.
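
The four questions above (parameter, limit, deviation effect, recovery measure) can be kept together in one record per control point. The sketch below is a minimal illustration; the tag `TRC 4` comes from the example in the text, but the limit values, the deviation effect and the recovery wording are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafeLimits:
    """Safe operating range for one tagged process parameter."""
    tag: str
    lower: float
    upper: float
    deviation_effect: str
    recovery: str

    def is_safe(self, value: float) -> bool:
        """Safe means between the safe lower and safe upper limit, inclusive."""
        return self.lower <= value <= self.upper

    def check(self, value: float) -> str:
        """Describe the reading and, if unsafe, the effect and recovery step."""
        if self.is_safe(value):
            return f"{self.tag}: {value} is within safe limits"
        return (f"{self.tag}: {value} is OUTSIDE safe limits "
                f"({self.deviation_effect}); action: {self.recovery}")


# Hypothetical numbers for the depropanizer reboiler control point.
trc4 = SafeLimits("TRC 4", lower=180.0, upper=250.0,
                  deviation_effect="tower overpressure possible",
                  recovery="reduce reboiler heat input")
print(trc4.check(230.0))
print(trc4.check(265.0))
```

Keeping the recovery measure next to the limit means the operator reading the value at 3:00 AM does not have to look elsewhere for what to do.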

Steps - continued
Have specific oxygen freeing procedures for each system and
provide purge diagrams. Specify where the purge is to enter
the system and where to check for oxygen
Specify the maximum oxygen content (not more than 1
percent) allowable after purging to be considered
oxygen free
Record oxygen test results (point tested, time and date of
test, oxygen reading, and operator initials)
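
The oxygen test record described above (point tested, time and date, reading, initials) and the 1 percent acceptance limit can be sketched as follows. The class and function names are hypothetical, and the rule that a system with no recorded tests is never considered oxygen free is an assumption added here for safety.

```python
from dataclasses import dataclass
from datetime import datetime

MAX_OXYGEN_PERCENT = 1.0  # "not more than 1 percent" after purging


@dataclass
class OxygenTest:
    point: str              # where the sample was taken
    tested_at: datetime     # time and date of test
    oxygen_percent: float   # oxygen reading
    initials: str           # operator initials

    @property
    def oxygen_free(self) -> bool:
        return self.oxygen_percent <= MAX_OXYGEN_PERCENT


def system_oxygen_free(tests) -> bool:
    """True only if at least one test exists and every test point passes."""
    return bool(tests) and all(t.oxygen_free for t in tests)


tests = [
    OxygenTest("tower overhead vent", datetime(2024, 5, 1, 2, 15), 0.4, "RC"),
    OxygenTest("reboiler low point", datetime(2024, 5, 1, 2, 40), 1.6, "RC"),
]
print(system_oxygen_free(tests))
print([t.point for t in tests if not t.oxygen_free])
```

A single failing point keeps the whole system from being declared oxygen free, which matches the intent of checking at specified locations rather than averaging.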
Use arrow diagrams, shown earlier, that you can color code with
markers to indicate when the procedure is complete.
Just to refresh your knowledge about the fire triangle and
flammability limits.
Fire triangle:There must be something to burna fuel; a source of
oxygen (an oxidizer); AND an ignition source. These three factors
are each at the corners of an equilateral triangle, the fire triangle,
whose overlap is a chain reaction that results in the rapid oxidation
of a fuelfire.
A fire will not always start when the three legs of the fire triangle
meet, unless all three elements are present in the required amounts.
For instance, vapors from a flammable liquid must be mixed with a
certain amount of air in order to ignite and propagate a flame.
Flammability limits are the proportion of combustible gases in a
mixture;within theseboundariesamixture is flammable. Gas
mixtures consisting of combustible, oxidizing, and inert gases are
only flammable under certain conditions. The lower flammable limit
(LFL) describes the mixture with the smallest fraction of

combustible gas, while the upper flammable limit (UFL) gives the
richest flammable mixture.
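A minimal sketch of how the LFL and UFL bracket the flammable range. The limit values below are approximate literature figures for mixtures in air at ambient conditions; they are not from this text and should be checked against the safety data sheet for the actual material.

```python
# Approximate flammable limits in air, volume percent (assumed values).
FLAMMABLE_LIMITS = {
    "methane": (5.0, 15.0),
    "hydrogen": (4.0, 75.0),
}


def classify_mixture(fuel, fuel_vol_percent):
    """Classify a fuel/air mixture relative to the fuel's LFL and UFL."""
    lfl, ufl = FLAMMABLE_LIMITS[fuel]
    if fuel_vol_percent < lfl:
        return "too lean"
    if fuel_vol_percent > ufl:
        return "too rich"
    return "flammable"


for percent in (2.0, 9.5, 20.0):
    print(percent, classify_mixture("methane", percent))
```

A "too rich" mixture is not a safe one: it passes through the flammable range as soon as it is diluted with air.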
"Purging" for personnel entry involves removing contaminants
inside the confined space by displacement with first inerts and then
with air to achieve acceptable atmospheric levels. (Remember the
fire triangle). An acceptable oxygen concentration is required to
provide protection in case of accidental release of chemicals, to
remove contaminants generated by the work performed, or to cool
the enclosure.

Steps - continued
Ensure that all vents and drains are free of pluggage and
ready for use
When steaming, keep all condensate drained (List vents and
low point drains.)
Purge air to the atmosphere not to the flare. Install plugs or
caps in vents after purging is complete and before
hydrocarbons are introduced. All vents and other connections
to the flare system should remain blinded until the process
unit is oxygen free
Specify a vessel tightness test pressure and PRV (pressure
relief valves) settings to avoid popping relief valves
A relief valve is a mechanical device that contains an internal spring
that applies force to a metal seat or piston. This seat seals the
pressure vessel from the atmosphere. If the internal pressure of the
vessel increases to certain limits, the spring force in the valve is
overcome and the pressure is released. The set pressure of the
valve is determined by the vessel's maximum allowable working
pressure. This is based on vessel materials, wall thicknesses, design

temperatures and vessel construction. If the operating temperature

is very high, it could have an impact on the determination of the
relief valve set pressure.
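The point about specifying a tightness test pressure that does not pop the relief valves can be checked on paper before the test. This is a minimal sketch; the valve tags, set pressures and the 10 percent margin are hypothetical illustrations, not values from the text or from any code or standard.

```python
def prvs_at_risk(test_pressure_psig, prv_set_pressures_psig, margin_fraction=0.10):
    """Return tags of relief valves whose set pressure is too close to the test pressure.

    A valve is flagged when the test pressure exceeds its set pressure
    reduced by the chosen margin (an assumed 10 percent by default).
    """
    at_risk = []
    for tag, set_pressure in prv_set_pressures_psig.items():
        if test_pressure_psig > set_pressure * (1.0 - margin_fraction):
            at_risk.append(tag)
    return sorted(at_risk)


# Hypothetical relief valves on the system being tested.
prvs = {"PSV-101": 150.0, "PSV-102": 275.0}
print(prvs_at_risk(140.0, prvs))
print(prvs_at_risk(100.0, prvs))
```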
Why do you think it is important to keep oxygen out of the flare system?
In a flare system the released gases and liquids are routed through
large piping systems called flare headers to a vertical elevated flare.
The released gases are burned as they exit the flare stacks.
Presence of oxygen in the flare system will be highly hazardous, as
the gases/liquids will begin to burn before safe release to the

Steps - continued
If using steam to oxygen free and then tightness testing the
unit, be sure to bring in nitrogen or gas (fuel gas or natural
gas) at a sufficient rate to displace condensing steam to avoid
pulling a vacuum (remember that fuel gas contains hydrogen
sulfide, and the IDLH of hydrogen sulfide is 100 ppm)
Specify in the procedures when to commission any on-stream
analyzers and other instruments
Specify when to install all running blinds. Have a check off
list of all running blinds (steamouts, water connections, etc.)
Specify how to back gas into each system
Immediately Dangerous to Life or Health (IDLH)
An atmosphere that poses an immediate threat to life, would cause
irreversible adverse health effects, or would impair an individual's
ability to escape from a dangerous atmosphere.

The National Institute for Occupational Safety and Health (NIOSH)
defines an immediately dangerous to life or health condition as a
situation "that poses a threat of exposure to airborne contaminants
when that exposure is likely to cause death or immediate or delayed
permanent adverse health effects or prevent escape from such an
environment." The IDLH limit represents the concentration of a
chemical in the air to which healthy adult workers could be exposed
(if their respirators fail) without suffering permanent or
escape-impairing health effects.
Also, if using nitrogen to purge or air free, anyone going into the
area should be required to wear a personal minimum oxygen
monitor. There are units available for toxics such as H2S that should
be employed if the possibility exists for exposure.
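A minimal sketch of the logic a personal monitor applies. The 100 ppm H2S figure is the IDLH quoted in this lesson; the 19.5 percent oxygen alarm point is an assumption (the usual OSHA figure for an oxygen-deficient atmosphere), not something stated in the text.

```python
H2S_IDLH_PPM = 100.0        # IDLH for hydrogen sulfide quoted in the text
MIN_OXYGEN_PERCENT = 19.5   # assumed low-oxygen alarm point


def atmosphere_alarms(oxygen_percent, h2s_ppm):
    """Return the list of alarm conditions for one pair of readings."""
    alarms = []
    if oxygen_percent < MIN_OXYGEN_PERCENT:
        alarms.append("low oxygen")
    if h2s_ppm >= H2S_IDLH_PPM:
        alarms.append("H2S at or above IDLH")
    return alarms


print(atmosphere_alarms(20.9, 0.0))
print(atmosphere_alarms(17.0, 120.0))
```

Real monitors are normally set to alarm well below the IDLH; the IDLH is used here only because it is the figure given in the text.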
Analytical monitoring is indispensable for optimizing the production
of chemicals. It is necessary to consider all the factors involved
(safety, cost, yield, on-stream vs. laboratory analysis, etc.) in
selecting the most appropriate method and apparatus. On-stream
analyzers are computerized devices that monitor processes. Usually
analyzers have many sampling points for different stages of the
process and the data is available with minimum delay.
Running blinds are blinds that the unit will run with, to allow for
certain procedures to be done while the unit is on line (running).
Finally, a specific path and process, including verification points,
should be in place to guard against running into blinds or check
valves that could prevent flow.

Steps - continued
Check all low point drains for water; specify frequency (List.)

Bring in oil to establish levels and start cold oil circulation

Run all spare pumps during cold oil circulation to get rid of
any water. Document all pumps that have been run
Establish temporary flow through all bypass lines during
cold oil circulation to flush out any water
These points are all provided to ensure that all water is removed
from the system prior to heating up. This would be a good time to
mention that at one atmosphere, water at 211 degrees F expands about
1600 times in volume when it becomes steam at 212 degrees F. If the
expansion is instantaneous, it is a problem.
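The 1600 times figure can be checked from the specific volumes of saturated liquid water and saturated steam at one atmosphere. This is a minimal sketch; the two specific volumes are rounded steam-table values supplied here as assumptions, not figures from the text.

```python
# Approximate saturated steam-table values at 1 atm (212 degrees F / 100 degrees C).
SPECIFIC_VOLUME_LIQUID_M3_PER_KG = 0.001043
SPECIFIC_VOLUME_STEAM_M3_PER_KG = 1.673


def expansion_ratio():
    """Volume of steam produced per unit volume of liquid water flashed."""
    return SPECIFIC_VOLUME_STEAM_M3_PER_KG / SPECIFIC_VOLUME_LIQUID_M3_PER_KG


def steam_volume_m3(water_volume_liters):
    """Steam volume (m3) generated if the given volume of trapped water flashes."""
    water_m3 = water_volume_liters / 1000.0
    return water_m3 * expansion_ratio()


print(round(expansion_ratio()))
print(round(steam_volume_m3(10.0), 1))
```

Even a few liters of water left in a low point drain becomes many cubic meters of steam, which is why the drains are listed and checked.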

Process in place to verify accuracy of procedures
Process in place to verify use of procedures
An audit process
Procedures are only good if they are appropriate and correct. Each
time they are to be used, they should be verified as correct before
use. The author liked to gundrill the operators; that means the whole
procedure was run as if it were the real thing. This gave operators
time to walk through each step and ensure that they knew where and
why.
Procedures are only good if they are used, so a process needs to be
in place to verify that they are actually used EVERY TIME!!! The
author knows of several instances where loss of lives took place
because the procedures were not used or not used correctly.
Finally an audit process needs to be in place to provide a separate
set of eyes that the previous two steps were done properly.


What about re-starts?

One procedure for all startups
If restart, then verify section by section
Sign off on verified sections
Look at everything to ensure no mistakes!
Today we've looked at elements of a complete startup of a unit. As
mentioned earlier, most of the process safety incidents occur on
restarts after unscheduled outages when people are anxious to get
the unit going again to make money. It is advisable to have one
startup procedure for each unit. That same procedure was used for
restarts, but if sections were not needed as they were unaffected by
the outage, the essential points were verified, signed off, and the
procedure moved ahead. This is where the arrow diagrams were

In this lesson we have looked at:
Preliminary preparations, units
Elimination of air
Tightness testing
Backing in natural gas or fuel gas
Elimination of water
Bringing the unit on stream
Compliance with all company and regulatory requirements for
process safety
Compliance with environmental laws and restrictions


Conduct of Operations
Safe Ups and Downs
To get an overview of operational moves to shut down and start up
units and how these mesh with PSM requirements.

At the end of today, you will be able to:

Know the expectations of the PSM regulation with respect to
operational moves
Begin to understand the practical aspects of shutting down and
starting up units in covered processes

The 14 PSM elements

Employee Participation
Process Safety Information
Process Hazard Analysis
Operating Procedures
Training
Contractors
Pre-Startup Safety Review
Mechanical Integrity
Hot Work Permit
Management of Change
Incident Investigation
Emergency Planning and Response
Compliance Audits

Trade Secrets

Operating Procedures - Must have and use written
operating procedures for the following phases:
Initial startup
Normal operations
Temporary operations
Emergency Shutdown
Conditions when emergency shutdown is required
Assignment of shutdown responsibility
Emergency Operations
Normal shutdown
Startup following a turnaround, or after an emergency

Operating Procedures should also cover:

Operating limits and consequences of deviation
Steps required to correct / avoid deviation
Health and Safety Considerations
Built-in Safety Systems
Hazard Control for non-routine tasks (i.e. First Line breaking,
Confined Space Entry, Control over entrance into a facility by
support personnel)
What could be the significance of breaking a line or a first line
break? The key element here is the non-routine issues. Particular
attention needs to be paid to the non-routine anything.

First line breaking means the initial opening of process and utility
lines, hoses, fittings and vessels to the atmosphere. It is subject to
all safety procedures. It is an important process that is needed to
clean, repair, and properly maintain the pipes and lines at a facility.
Designing and implementing a First Line Breaking Policy is essential
to ensure health and safety, and reduce potential hazards.
A first line break needs to have absolute assurance that the line is
ready to be opened to the atmosphere, depressured, at the zero
energy state, neutered as it were, to ensure worker and facility
safety. Then, why should you be concerned about mechanical
personnel entering your facility? Do they know the hazards, do they
know escape routes, do they know what not to touch or open?
These are very salient points to remember. Do they know what a
confined space is and where they are on the unit? Again, these
points must be a part of the consciousness of all personnel on the
A confined space has limited or restricted means for entry or exit,
and it is not designed for continuous employee occupancy. Confined
spaces include, but are not limited to underground vaults, tanks,
storage bins, manholes, pits, silos, process vessels, and pipelines.

Shutting down units - elements

Cooling and depressuring
Pumping out
Removal of residual hydrocarbons
Removal of corrosive or hazardous materials
Disposal of water
Blinding and opening
Removal of pyrophoric iron sulfide, if still present

Blinding of sections of unit

Maintenance of a Blind List for each vessel that is entered
Verification of blinds on every vessel entered each shift
Testing and approval for entering
o Verify Oxygen concentration
o Confirm ALL other gas lines are blinded
A pyrophoric material is a liquid or solid that, even in small
quantities and without an external ignition source, can ignite within
five minutes after coming in contact with air. Most commonly,
pyrophoric iron fires occur during shutdowns when equipment and
piping are opened for inspection or maintenance. Instances of fires
in crude columns during turnarounds, explosions in sulfur, crude or
asphalt storage tanks, overpressures in vessels, etc., due to
pyrophoric iron ignition are not uncommon.
Where does pyrophoric iron sulfide come from? If you have a
system that contains sulfur and the metal is steel (iron) the
probability of generating pyrophoric iron sulfide is present. It is not a
problem when it is wet, but when the system dries and comes in
contact with air (assume that flammables are there) the fire triangle
will be closed. The dry iron sulfide is a source of ignition.

Shutting down
- Hazards frequently encountered
Mixing air with hydrocarbons
Contacting water with hot oil
Freezing of residual water
Exposure to toxic gases and liquids
Pyrophoric iron sulfide

Other flammables or explosives when exposed to air

Reactives when exposed to air that compromise metal integrity
Excessive vacuum
Thermal shock
Mechanical shock
Most of these are self evident, but the thermal and mechanical
shock not so much. Some materials are prone to damage if they are
exposed to a sudden change in temperature. Glass and certain
other materials are vulnerable to this process, in part because they
do not conduct thermal energy very well. This is readily observed
when a hot glass is exposed to ice water: the result is a cracked,
broken, or even shattered glass.
Thermal shock is a reaction to a rapid and extreme temperature
fluctuation. The shock is the result of a thermal gradient, which
refers to the fact that temperature change occurs in an uneven
fashion. Temperature change causes expansion of the molecular
structure of an object, due to weakening of the bonds that hold the
molecules in formation. The existence of the thermal gradient
means this expansion occurs unevenly, and glass in particular is
very vulnerable to this process. Cooling too fast is as bad as heating
too fast. Different parts of the plant expand and contract by different
amounts, so they need time to grow (heating) and shrink (cooling) and
settle into their ambient positions without jumping off of the supports.
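To get a feel for how much a line grows or shrinks, the linear expansion relation dL = alpha x L x dT can be used. This is a minimal sketch; the expansion coefficient is an approximate handbook value for carbon steel and the line length and temperature change are hypothetical.

```python
CARBON_STEEL_ALPHA_PER_K = 12e-6  # approximate linear expansion coefficient (assumed)


def length_change_m(length_m, delta_t_k, alpha_per_k=CARBON_STEEL_ALPHA_PER_K):
    """Linear thermal growth (positive) or shrinkage (negative): dL = alpha * L * dT."""
    return alpha_per_k * length_m * delta_t_k


# A hypothetical 100 m steel line cooled by 300 K during shutdown.
print(round(length_change_m(100.0, -300.0), 3))
```

A movement of this size has to be taken up by expansion loops, guides and supports, which is why the cool-down rate is controlled.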
A mechanical or physical shock is a sudden acceleration or
deceleration caused, for example, by impact, drop, kick, earthquake,
or explosion. Shock is a brief physical excitation. Mechanical shock
has the potential for damaging an item (e.g., an entire light bulb) or
an element of the item (e.g., a filament in an incandescent light bulb).

A brittle or fragile item can fracture. A soft ductile material may
sometimes exhibit brittle failure during shock due to
time-temperature superposition. A ductile item can be bent by a shock.
A shock may result in only minor damage that may not be critical
for use. However, cumulative minor damage from several shocks
will eventually result in the item being unusable.
A shock may not produce immediate apparent damage but might
cause the service life of the product to be shortened: the reliability
is reduced.
A shock may cause an item to become out of adjustment. For
example, when a precision scientific instrument is subjected to a
moderate shock, good metrology practice may be required to have
it recalibrated before further use.
Some materials such as primary high explosives may detonate with
mechanical shock or impact.
When glass bottles of liquid are dropped or subjected to shock, the
water hammer effect may cause hydrodynamic glass breakage.
Water hammer on pipes can be very destructive. One location had
over a mile of high-pressure steam lines knocked three feet off the
stations by water hammer. Imagine the amount of energy that must
have taken!

Shutting down - Normal SD

Inform other affected units as early as practical
Inform Utilities, Oil movements, Scheduling, Management
Put up arrow diagrams in control room

Review shut down procedure with crew, if time - dry run critical
Print out current shut down procedure - should be in logical order,
detailed as a check off with date/time, signed off by operators
and (only one copy please)
Rate reduction, cool-down rate, minimum flow rates, trip points
incorporated into the steps - remember cooling metal contracts -
watch expansion areas
Each shift should summarize unit status (use arrow diagram
and words)
Ensure all fire monitors, fire extinguishers, SCBA, etc. are in
working order
Block off access roads as necessary

These points should all be pretty clear. Arrow diagrams are very
useful communication tools to ensure all on the unit are on the
same page. Color highlighting the segments as completed, helps
avoid confusion. More details later.
SCBA means self-contained breathing apparatus.


Shutting down - Normal SD continued

Specify purge medium - steam, nitrogen, etc.
If steam - caution - collapsing steam can create a vacuum
If nitrogen - caution - nitrogen cannot support life
Some metals transition between ductile and brittle and should
be noted in procedures to ensure no failures
Follow procedures step by step

Note when instruments need to be blocked in and isolated

Remember care and nurture of catalyst to prevent deactivation
If acids or bases, remember special precautions for people and
If toxic materials, remember special precautions for people
When finished purging and ready to open to atmosphere -
verify hydrocarbon free prior to admitting air
The transition of metals between the brittle and ductile states is
important and needs to be clearly understood by all on the unit. At low
temperatures many metals, carbon steels in particular, are brittle and
need to be slowly brought
up to the transition temperature before adding the pressure into the

Shutting down - Normal SD continued

Use verified blind list to isolate vessels and equipment
Ensure that the confined space entry procedures are followed
to the letter
Ensure that proper lock out / tag out procedures are followed to
ensure zero energy
Ensure that hot work procedures are followed
Ready for the work to begin
Most are clear, but it is critical that hot work procedures are
followed by all. When many outsiders come to the unit to work, many are
not clear where the possibility of hydrocarbons can be found, thus
hot work procedures properly followed eliminate the possibility of
having a source of ignition and a source of fuel at the same time.
Lockout-tagout (LOTO) or lock and tag is a safety procedure which
is used in industry and research settings to ensure that dangerous

machines are properly shut off and not started up again prior to the
completion of maintenance or servicing work. LOTO includes the
practices and procedures necessary to disable machinery or
equipment, to prevent the release of hazardous energy sources
during servicing and maintenance activities. The procedure requires
that a tag be affixed to the locked device indicating that it should
not be turned on.
LOTO is a big deal! All energy sources MUST be isolated before
they can be worked on. This is usually thought of as electrical
energy but pressurized systems contain energy and must be
properly isolated or relieved before working on these systems.

Shutting down - emergency SD

Automated emergency SD
o Automatically triggered
o Manually triggered
o Post SD checklist to ensure completion of actions
Manual emergency SD
o Each crew member has clearly defined set of sequential
o Post SD the emergency SD checklist to ensure
completion of actions
A key here is to make sure that, after an emergency shutdown, the
expected sequence of events has taken place completely and
the unit is in the expected position. Verification of those positions
(i.e. levels, pressures, temperatures, etc.) is critical to a safe
startup after the reason for the shutdown has been corrected.
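The post-shutdown verification can be pictured as comparing field or board readings against the position the unit is expected to be in. This is a minimal sketch; the tag names, expected ranges and readings are hypothetical.

```python
def verify_unit_position(expected_ranges, readings):
    """Return (tag, reason) pairs for readings that are missing or out of range."""
    deviations = []
    for tag, (low, high) in expected_ranges.items():
        value = readings.get(tag)
        if value is None:
            deviations.append((tag, "no reading"))
        elif not low <= value <= high:
            deviations.append((tag, f"{value} outside {low}-{high}"))
    return deviations


# Hypothetical expected post-shutdown position and actual readings.
expected = {
    "reactor_pressure_psig": (0.0, 50.0),
    "reactor_temp_degF": (60.0, 200.0),
    "separator_level_pct": (20.0, 60.0),
}
readings = {"reactor_pressure_psig": 35.0, "reactor_temp_degF": 450.0}

for tag, reason in verify_unit_position(expected, readings):
    print(tag, "-", reason)
```

A missing reading is reported as a deviation, because an unverified position is not a known position.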

TAR - Turnaround
Ever vigilant to personnel on unit
Special procedures for first line break
Ensure that the confined space entry procedures are followed
to the letter
o https://www.osha.gov/SLTC/confinedspaces/
Ensure that proper lock out / tag out procedures are followed -
isolate all hazardous energy potential
o https://www.osha.gov/SLTC/controlhazardousenergy/
Ensure hot work procedures are followed
o https://www.osha.gov/SLTC/etools/oilandgas/general_s
Turnarounds or TARs are planned, periodic shutdowns (total or
partial) of a process unit or plant to perform maintenance, overhaul
and repair operations and to inspect, test and replace process
materials and equipment. Turnarounds allow for necessary
maintenance and upkeep of operating units and are needed to
maintain safe and efficient operations.
Safety incidents are more likely to occur during these occasions, so
extreme vigilance and care is essential. All the required safety
precautions have to be followed with great care.

Starting up - Safely
In lecture 18-A we covered normal start up procedures and
what to include in that
Today we will look only at startup after an abnormal shutdown
Startups are when incidents are likely to occur - do them by
the book - no short cuts - follow procedures

24/7 technical oversight - a must

OK to pause most of the time
Any unexpected deviations indicate a pause is necessary
Verify each step before proceeding
Take your time to get the job done safely and quickly
The point about OK to pause may not be clear to all. During startup
the sequential aspect of some of the steps may have time
constraints that cannot be overridden, thus the caveat. What this
really means is that it is critical to all involved in a startup to be
clear on what the sequence is, what the alternatives are, and what
corrective actions should be taken if a step does not go as
anticipated. So here, do you stop, go back, take corrective actions,
or do you stop and reassess the position of the unit, meaning some
of the assumptions of where the unit is and where it really is, may
be incorrect. It is vital to know with absolute certainty the position
of the unit at all times to ensure a safe and uneventful startup.
It cannot be overstated that the leadership on the unit must be
continuous to allow for management to do just that - manage the
situation always. Anytime an unusual circumstance is encountered
the unit staff must be clear what the next step is. If that next step
is unclear - STOP. Get the appropriate technical support to make
the correct decision.
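The rule "verify each step before proceeding, and stop if the next step is unclear" can be sketched as a simple sequence runner. The step names and the simulated deviation below are hypothetical; in practice the verification is done by people against the written procedure.

```python
def run_startup(steps):
    """Run steps in order and stop at the first one whose verification fails.

    Each step is a (description, verify) pair, where verify() returns True
    when the unit is where the procedure expects it to be.
    Returns (completed_steps, step_stopped_at_or_None).
    """
    completed = []
    for description, verify in steps:
        if not verify():
            return completed, description  # STOP and get technical support
        completed.append(description)
    return completed, None


# Hypothetical startup sequence with one simulated deviation.
steps = [
    ("Confirm cause of shutdown corrected", lambda: True),
    ("Establish cold oil circulation", lambda: True),
    ("Begin heat-up", lambda: False),
    ("Introduce feed", lambda: True),
]
done, stopped_at = run_startup(steps)
print(done)
print(stopped_at)
```

Nothing after the failed step is attempted, which mirrors the instruction to pause and reassess the position of the unit.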

Starting up - Safely
What should you check before SU?
Table top the SU
Notify affected units
Slow methodical heat up of equipment - why?
If nitrogen is purge medium - wear Oxygen monitors

How does nitrogen kill?

You must check:
Fail safe position of all control valves
Verify cause of Shutdown has been corrected
Verify Shutdown Systems (PLCs) are performing correctly
Anything abnormal - if so, check why and correct before

How Nitrogen kills people:

Being overcome by nitrogen, which is an asphyxiant, will kill you.
The treacherous aspect of being overcome by nitrogen is that
the brain does not trigger faster breathing based on how much oxygen
is in your blood. It responds mainly to the amount of
carbon dioxide in your blood: when CO2 goes up, it causes you to
breathe faster to reduce the level of CO2. In a nitrogen atmosphere
you keep exhaling CO2, so the CO2 level does not rise while your
blood oxygen falls, and your brain thinks all is well. BUT, clearly it is
not. There is no warning sensation of suffocation before you pass
out, and breathing can stop. This means that if you try to rescue
someone who has been overcome by nitrogen and pull them to
safety, you must start cardiopulmonary resuscitation to get some
CO2 into their blood system and the diaphragm's auto response will

Starting up - Safely
During SU most control valves will be in manual - the board
operator must constantly adjust settings until reaching steady
state operations
During this time the unit is vulnerable!!
The operators should move to automatic as soon as possible
Once up and running re-verify all process variables to be within
normal operating range
This includes levels, pressures, temperatures, control valves; all in
automatic mode (no manual overrides permitted without a
temporary MOC in place). During this period when control valves
are in manual the unit conditions must be monitored and controlled
by unit personnel very carefully and according to strict, measured
constraints. As temperatures are increased it must be verified that
areas that could accumulate water are controlled. Any sudden
increase in temperature could cause an explosive increase in
volume (remember water increases 1600 times as it becomes
steam) and that uncontrolled increase could be a disaster.
Once up to the appropriate conditions the control valves must be
systematically put into the auto positions. A checklist should be
used to ensure no lapses.
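The end-of-startup checklist can be sketched as a pass over the control loops, flagging anything still in manual without a temporary MOC and anything outside its normal operating range. The loop tags, values and ranges below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ControlLoop:
    tag: str
    mode: str                    # "auto" or "manual"
    value: float                 # current process variable
    normal_low: float
    normal_high: float
    temporary_moc: bool = False  # True if a temporary MOC covers a manual override


def startup_exceptions(loops):
    """List every loop that is in manual without an MOC or outside its normal range."""
    issues = []
    for loop in loops:
        if loop.mode != "auto" and not loop.temporary_moc:
            issues.append(f"{loop.tag}: in {loop.mode} without temporary MOC")
        if not loop.normal_low <= loop.value <= loop.normal_high:
            issues.append(f"{loop.tag}: {loop.value} outside normal range")
    return issues


# Hypothetical loops at the end of a startup.
loops = [
    ControlLoop("FIC-101", "auto", 120.0, 100.0, 150.0),
    ControlLoop("LIC-205", "manual", 45.0, 30.0, 70.0),
    ControlLoop("TIC-310", "manual", 655.0, 600.0, 650.0, temporary_moc=True),
]
for issue in startup_exceptions(loops):
    print(issue)
```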

Safe Ups and Downs - summary

Reviewed the PSM regulation
Reviewed sequence of SD and SU
Reviewed the logic of SD and SU
Reviewed technical oversight required
Reviewed the practical aspects of unit SD and SU


You are a new process engineer at a small refiner - one week
on the job, no other training available
Assigned to the light distillate Desulfurizer unit (1200 psig
An unscheduled SD has just occurred
The control panel shows that the reactor pressure is slowly
decreasing, then suddenly starts to rise
What do you check (in order)?
What does this mean? Give the logic
How do you correct and get ready for startup?

Chapter 19

Pre Startup Safety Review

Is everything ready to go?
The PSSR is your conscience - LISTEN TO IT.

To show you:
Why do we do them?
What is a PSSR?
What is included?
Who is included?
What is the desired outcome?

At the end of today, you will be able to:

Participate in a PSSR
Contribute actively in a PSSR
Make suggestions about what should be covered in a
particular PSSR
Express concern about an issue with good judgment and
specific examples
What is PSSR?
PSSR means Pre Startup Safety Review.

The basic idea behind a pre-startup safety review is to confirm that

any changes made to a facility or equipment meet the original
design or operating intent. The PSSR aims to catch any changes
that may have crept into the system during the detailed
engineering and construction phases of a project.
PSSR covers not only equipment, but also operating procedures and

Why do we do PSSRs?
It is an OSHA requirement!
It is also good business
o Safe startups save lives
o No unplanned events saves equipment
o Orderly startup makes product quicker

The 14 PSM elements

Employee Participation
Process Safety Information
Process Hazard Analysis
Operating Procedures
Training
Contractors
Pre-Startup Safety Review
Mechanical Integrity
Hot Work Permit
Management of Change
Incident Investigation
Emergency Planning and Response
Compliance Audits

Trade Secrets
We also do PSSR because it is a PSM requirement!

PSSR: OSHA Regulation

As per the OSHA regulation, a PSSR is needed whenever process
safety information is changed. Virtually all changes result in updates
to the facility documentation, particularly P&IDs. So in effect this
requirement means that virtually all changes will have to be
reviewed by a PSSR. There are very few changes that do not
require some information changes to do with topics such as safe
limits, engineering drawings and equipment lists.
The employer shall perform a pre-startup safety review for new
facilities and for modified facilities when the modification is
significant enough to require a change in the process safety
information.
The pre-startup safety review shall confirm that prior to the
introduction of highly hazardous chemicals to a process:
o Construction and equipment is in accordance with
design specifications
o Safety, operating, maintenance, and emergency
procedures are in place and are adequate
o For new facilities, a process hazard analysis has been
performed and recommendations have been resolved or
implemented before startup; and modified facilities
meet the requirements contained in management of
change, paragraph (l) [of this regulation].

Construction and Equipment: PSSR team members can carry out
spot-checks of the installed piping and equipment, and compare it
with the piping lists and equipment data sheets.
Procedures: The PSSR should check that safety, operating and
emergency procedures for the new operation have been written
down, and that they accurately describe what has to be done.
Training: The regulation also requires that training of each employee
involved in operating the process has been completed, so operators and
maintenance workers must be trained in the use of the new
New / Modified Facilities
The PSSR team should check that the PHA was in fact carried out,
and that its recommendations were either resolved or implemented.
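The confirmations quoted above from the regulation can be treated as a gate: highly hazardous chemicals are not introduced while any item is open. This is a minimal sketch; the item wording is paraphrased and the function names are hypothetical.

```python
COMMON_ITEMS = [
    "construction and equipment in accordance with design specifications",
    "safety, operating, maintenance, and emergency procedures in place and adequate",
]
NEW_FACILITY_ITEM = "PHA performed and recommendations resolved or implemented"
MODIFIED_FACILITY_ITEM = "management of change requirements met"


def open_pssr_items(confirmed, new_facility):
    """Return the required confirmations that have not yet been made."""
    extra = NEW_FACILITY_ITEM if new_facility else MODIFIED_FACILITY_ITEM
    required = COMMON_ITEMS + [extra]
    return [item for item in required if item not in confirmed]


def ready_for_hazardous_chemicals(confirmed, new_facility):
    """True only when no required confirmation is still open."""
    return not open_pssr_items(confirmed, new_facility)


confirmed_so_far = set(COMMON_ITEMS)
print(open_pssr_items(confirmed_so_far, new_facility=True))
print(ready_for_hazardous_chemicals(confirmed_so_far, new_facility=True))
```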

OSHA PSM Guidance

For new processes:
The employer will find a PHA helpful in improving the design
and construction of the process from a reliability and quality
point of view.
The safe operation of the new process will be enhanced by
using the PHA recommendations before final installations are
P&IDs are to be completed along with having the operating
procedures in place and the operating staff trained to run the
process before startup.
The initial startup procedures and normal operating procedures
need to be fully evaluated as part of the pre-startup review to

assure a safe transfer into the normal operating mode for

meeting the process parameters.

OSHA PSM Guidance

For existing processes (that have been shut down for turnaround,
or modification, etc.):
The employer must assure that any changes other than
replacement in kind made to the process during shutdown go
through the management of change procedures.
P&IDs will need to be updated as necessary, as well as
operating procedures and instructions.
If the changes made to the process during shutdown are
significant and impact the training program, then operating
personnel as well as employees engaged in routine and non-routine
work in the process area may need some refresher or
additional training in light of the changes.
Any incident investigation recommendations, compliance audits
or PHA recommendations need to be reviewed as well to see
what impacts they may have on the process before beginning
the startup.

How this all fits together:

For Change Items in a Turnaround:
The way in which PSSRs, MOCs and audits link to one another is
shown in the diagram


What is included in a new plant PSSR?

Detailed review of the Start-Up (S/U) plan
Operating Procedures
Operator Training (and their morale)
TOTAL Team Readiness (confidence, rest, understanding of the
S/U plan)
o Team Readiness requires you to interview most of the
S/U team
Logistics plan and contingencies
Here is an anecdote in the author's own words:
I was the Technical Manager for a new plant's startup. I led the
PSSR. At the end of the PSSR, I recommended they send everyone
home to rest for a day. The whole team was fired up and excited
but they had labored hard and long to get the unit right up to the
edge of being ready to start up. Everyone was tired: operators,
engineers, managers, everyone. So, at $100k per hour, I declared
a rest period. Minimum staff, small, final checks, everyone was to
get 24 hours free rest time scheduled. Who knows if I wasted that
money, but we had an orderly, safe startup! One of my most
exciting, challenging and satisfying moments!
I took the unit as unit manager after the startup, another great
assignment with its own stories.

Verify ALL MOCs since final design have been reviewed and
P&IDs As Built are in the Control Room
Arrow diagrams current & verified
Verify all equipment and utilities systems have been pressure
Verify all PHA and HAZOP recommendations have been completed
An effective way of conducting a PSSR is to work through the
elements of the facility's PSM program. Different companies,
professional bodies and regulators have different element lists.
The major components:
Ensure that all action items and recommendations from
Hazards Analyses and ALL other reviews such as Management
of Change have been completed as required.
Ensure that no changes that could affect safety or operability
have crept into the system during the construction phase.
The Piping and Instrumentation Diagrams (P&IDs), that is, the
schematic illustration of the functional relationship of piping,
instrumentation and system equipment components,
represent the actual schema as built and are in the Control
The PSSR Team should do a complete unit walk through to ensure
the facility is ready for Startup
Housekeeping should be excellent
Only essential scaffolding in place
No un-insulated burn hazard piping/vessels

Team should prepare a checklist of items to look for and
complete prior to SU
The review represents the final frontier to catch any problems.
Therefore it should be led by someone who will be required to run
the modified system. Generally, the following issues should be
covered by the review team:
Equipment and instrumentation items that have been changed
are installed and commissioned in accordance with design
Safety, operating, maintenance, and emergency procedures
are in place and are adequate.
All findings from hazards analyses, management of change
evaluations and other types of review have been closed out
All affected personnel have been trained in the new or modified
Insulation is the last skill group on the critical path.
Instrumentation and shut down systems are second last. Both
need to be examined for completion.
Review emergency SD procedures
Identify critical areas of SU - gundrill
Identify 24/7 oversight team
An Emergency Shutdown System (ESD) represents the final layer of
protection that mitigates and prevents a hazardous situation from
occurring. It is the final defense against incident.

Is your ESD system reliable, and does it function on demand? During an
emergency, is it capable of shutting down the process in a safe and
orderly fashion? Finally, verify that it is in place.
After identifying critical areas of start-up procedures, gundrills and
intensive training should be carried out.
An oversight team where every member will be accessible 24x7
should be identified.

Who is included in a PSSR?

Generally, a PSSR is conducted by a team.
The leader represents the operations group because it is they
who are usually the ultimate customer for the changes that
have been made.
Supporting the leader are technical specialists and
representatives from the process safety team.
The leaders should have sufficient authority to delay the
startup if they identify a significant deficiency, even at a cost to
the company.
Various Experienced Technical Staff
o Process Design Engineers
o Process Control, Instrumentation and Shut Down
System engineers
o Unit Process Engineers
o Technology Specialists
o Mechanical/ Hard Engineering Specialists
o Pressure Systems Engineers
o Location Infrastructure Specialist (Utilities, Biotreater)

A process design engineer designs, develops, and optimizes the
processes used in industrial operations. The engineer has expertise
in chemistry and knowledge about machinery, equipment, and
instrumentation. A process engineer oversees many types of
industrial processes such as mechanical, electrical, chemical, and
biological processes.
The typical Control and Instrumentation and Shut down Engineer
will be expected to be fluent in electronics, fluid dynamics,
material selection, control engineering, and systems engineering
amongst all the usual competencies expected of today's professional
A unit process engineer oversees everything related to her/his unit,
including monitoring plant process parameters (temperature,
pressure, level, flow rate, etc.) and utilities (steam, water, gas,
electricity, etc.).
A technology specialist has expertise in information technology. A
technical specialist repairs, monitors, and helps implement new
computer networking systems for a business entity. Computers are
the heart of many businesses, tracking clientele and inventory
levels, for example. Technical specialists must keep the network
running, and improve it periodically, to maintain productivity and
constant computer access for employees.
Mechanical Engineers touch almost every aspect of technology.
They create machines, products and technological systems. Most
mechanical engineers focus on one of three broad areas of
technology: energy, manufacturing and design mechanics.


Operations Representatives:
o A veteran Operations and a veteran Maintenance
Supervisor on the team
o All operators that are part of process
o Operations Manager
Leadership of affected units
Leadership of Utilities needed
PSM Coordinator
Final step is for all appropriate leadership to sign off on final

The Desired Outcome

A safe startup with:
o No injuries or employee exposure
o No damaged equipment
An environmentally sound startup
A timely startup executed as quickly as possible while not
compromising on safety or environmental standards

Pre-startup and Restart Safety Reviews are an important part
of any process safety management program, yet are not
always given the attention that they deserve.
They provide a last chance for everyone associated with a
project to make sure that no unsafe acts or conditions have
slipped through before operations actually start.
Everyone involved in operating the modified facility must have
an opportunity to make sure that conditions are safe, that
effective procedures have been written and that the operators
and maintenance personnel have been properly trained.

You did an MOC on a change of function of a distillation column
from taking C-16= alpha olefins overhead to taking C-16/18= alpha
olefins overhead.
Who should participate in the PSSR before startup and why?
What are the key issues you MUST ensure are addressed and


Chapter 20

Operational Readiness; Operational

A Pre-Startup Safety Review (PSSR) is an element in OSHA's Process
Safety Management (PSM) regulations. These regulations require
that the employer shall perform a PSSR for new facilities and for
modified facilities when the modification is significant enough to
require a change in the process safety information.
Operational Readiness and Operational Discipline are elements that
can be incorporated into the PSSR or can be carried out as a
separate activity. An Operational Readiness/ Operational Discipline
(OR/OD) assessment should be done for ALL start-ups and not just
those that meet the OSHA requirements. This activity can be
tailored to be fit for purpose, but each element should be reviewed
for applicability before deciding the level of detail required for a safe
The operational readiness element ensures that not only new startups, but also processes that were shut down for some reason, are
in a safe condition to start. Here shutdown duration, reason, type of
work performed on the process during shut down are considered.
This procedure is vitally important because the
frequency of incidents is found to be higher during such transitions.
One possible cause is variation in the physical process
conditions that leaves them unfit for safe operation.

Staff, Operations & Management Readiness:

Location management must be prepared for the start-up. They must
have considered the required staffing levels for the activity. This
includes ensuring that adequate safety staffing and appropriate
unit operations staffing levels are in place. Operations management
should ensure adequate experienced staff/ operators are available
for the activity and any training on new facilities is complete.
Specialty discipline staff such as pressure systems, rotating
equipment and instrumentation should be available. In preparation
for start-up, the feed supplier must be notified of the intended
start-up. Similarly, the business and logistics must be aware that
product is about to be available.
Unit Mechanically Prepared for Start-up:
The condition of the unit must be determined in order to make
appropriate start-up plans. If this is a new plant start-up or start-up
after a major turnaround, the unit may be empty and nitrogen
purged. If this is a restart after a brief shutdown or an emergency
shutdown, the exact condition and contents of the unit need to be
determined in order to have an appropriate restart plan. In all
cases, the comprehensive unit start-up procedure is used to
determine what off-normal items must be considered and dealt with
for this particular start-up. A clear understanding of how the unit
was shut down and verification of conditions (temperatures,
pressures, etc.) is required in order to develop a complete and
robust start-up plan.
Instrumentation functionality should be verified in affected areas.
Relief systems alignment should be confirmed. Typically, PLC and
shutdown systems are verified immediately before start-up for full
functionality. All flange connections that were opened must be leak
tested. In some cases, hydro testing is recommended by the
Pressure Systems Discipline. The unit should be clear of
construction materials and open for operators to perform their jobs
with proper egress access. This includes scaffolding, ladders,
welding machines and other equipment no longer needed. All
personnel protection insulation should be in place to protect those
working in the unit.
Location Facilities Prepared for Start-up:
Good communication within the location avoids unwanted surprises
in the other operating units at the location. Start-ups will typically
put demands (or excesses) on various location systems such as
steam, nitrogen, biotreater and hot oil systems. Feed providers and
product storage / customers must also be prepared for the unit
coming online. Neighboring units may not be directly impacted by
the start-up but they may want to avoid particularly hazardous
areas at the interface between the units. Sections of roads are
often barricaded to prevent traffic.
Final Procedures/ Checklist Reviews:
A final review of critical procedures and checklists gives the start-up
team one last chance to close any gaps. Operator readiness is
critical. The operators must have completed any training on
changes or new additions. They must feel confident they are ready to
carry out the start-up. The unit emergency procedures should be
readily available. This is a good time to review key elements and
highlight any changes. A variety of checklists are used in preparing
for a start-up. These include, but are not limited to: Broken flange
leak tests, RV alignment verification, unit blind list, PLC
performance test. Typically, a unit manager or other management
representative verifies that ALL the checklists have been properly
initialed and signed. This includes action items from the PSSR and
any MOC documents.
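The final sign-off check described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the dictionary layout are assumptions made for the example, not part of any company's system. The checklist names are the ones listed in the text.

```python
def unsigned_checklists(signoffs):
    """Return the names of checklists that have no signer recorded.

    `signoffs` maps a checklist name to the signer's name, or to None or
    an empty string when nobody has signed yet (hypothetical layout).
    """
    return sorted(name for name, signer in signoffs.items() if not signer)


def ready_for_startup(signoffs):
    """True only when every checklist has been signed."""
    return not unsigned_checklists(signoffs)


# Example data: the checklist names come from the text, the signers are made up.
signoffs = {
    "Broken flange leak tests": "Unit manager",
    "RV alignment verification": "Unit manager",
    "Unit blind list": None,
    "PLC performance test": "Unit manager",
}

print(unsigned_checklists(signoffs))
print(ready_for_startup(signoffs))
```

With the example data, the unit blind list is reported as outstanding, so the start-up would be held until it is signed.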
The above discussion is one example of how Pre-Startup Safety
Review and Operational Readiness/ Operational Discipline activities
can mesh together to put a start-up team in the best position for a
successful start-up. Different companies deal with these items
using their own systems / work processes. Nevertheless, the
fundamental concepts of PSSR and OR/OD must be addressed in a
thorough manner to ensure the best possible start-up outcome.

Reason's Theory
This lesson will introduce James Reason's Theory of how incidents
happen, describe the kinds of barriers that can be put into place to
prevent incidents, and review one well-known failure

By the end of the lesson, you will be able to:

Use Reason's theory to analyze the broader, multiple failures
behind serious process safety incidents
Have a slightly broader interpretation of what to look at in
incident investigations

Today's Roadmap
Reason's Theory of the Cheese
Example & Discussion

Reason's Theory
To mitigate serious incidents, barriers must be in place
All it takes to stop a serious incident is one barrier
Usually, incidents are caused by multiple barrier
weaknesses, sometimes called precursors.
Identifying hazards helps us put the right barriers into place
Reason described safety as a dynamic non-event. If there are no
incidents or near misses then safety tends to be taken for granted.
This happens especially because the production demands are ever
present. If people see nothing, they presume that nothing is wrong,
and that nothing will go wrong if they continue to act as
before. But this is misleading because it takes a number of dynamic
inputs to create stable outcomes.
When such a state is prevalent, it pays to be proactive and carry
out checks and measures to prevent or mitigate accidents, and to
ensure that defensive barriers are in place.
All of these activities can be said to make up an informed culture:
one where those who manage and operate the system have current
knowledge about the human, technical, organizational and
environmental factors that determine the safety of the system as
Defenses, barriers, and safeguards make up the defensive layers.
Some are engineered, some depend on people, and some rely on
procedures and administrative controls. These controls or barriers
are meant to protect potential victims and assets from local
hazards. Sometimes it may take just one barrier to prevent an
accident or stop an incident from happening.
However, even such a system may have limitations. The barriers
may have weaknesses. Multiple barrier weaknesses, sometimes
called precursors, can lead to accidents under specific circumstances.
The immediate cause of the accident is a failure of people at the
"sharp end" who are directly involved in the regulation of the
process or in the interaction with the technology.
It is essential to identify the Hazards, to assess the associated risks
and proactively put correct hazard-prevention barriers in place.
Continuous efforts are required to control risks arising from various

Types of Barriers
Policies, Standards, Guidelines
People and their behaviors
Equipment and controls
Work instructions and procedures
Physical barriers
Space and distance
In any best practice organization, many layers of defensive barriers
and protective measures are put up against the likelihood of an
These are invariably a mixture of 'hard' and 'soft' defences. The
former include engineered safety features (such as automatic
controls, warning systems and shutdowns) together with various
physical barriers and containments, while the latter comprise a
combination of paper and people: rules and procedures, training,
drills, administrative controls and, most particularly, front-line
operators such as pilots and control room personnel. The result of
these many layers of defence is to make these systems largely
proof against single failures, either human or technical. For an
accident to occur in such a system, it requires the unlikely
combination of several different factors to penetrate the many
protective layers and to allow hazards to come into damaging
contact with plant, personnel and the environment.
The first one is policies, standards and guidelines. These would be
written safety policies, safety standards based on OSHA standards
and guidelines on following elements of the standard.
Administrative controls will promote safe practice through policies,
processes, training and signage.

People and their behavior are also a defence against incidents.
The human factor! For this the safety culture of the organization
plays an important part. Due to their diversity the elements of a
multilayered defensive system will be widely distributed throughout
the organization. It is only the organizational culture that extends to
every part of the organization.
Equipment and controls are also a major safety barrier. Regularly and
thoroughly maintain equipment and ensure that hazard correction
procedures are in place.
The Hierarchy of Control is an approach that involves working
through a prioritized sequence of possible control measures until an
appropriate solution is reached.
Elimination. Remove the hazard completely from the work
Substitution. Replace the material or process with something
less hazardous.
Isolation. Isolate the hazard by controlling or guarding it.
Engineering controls. Redesign equipment or work processes
to reduce or eliminate risk.
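Working through the prioritized sequence can be illustrated with a short Python sketch. It assumes that someone has already judged which measures are feasible for a given hazard; the function name and the input format are hypothetical and exist only for this example. Only the four measures named in the list above are included.

```python
# Order taken from the list above; earlier entries are preferred.
HIERARCHY_OF_CONTROL = [
    "elimination",
    "substitution",
    "isolation",
    "engineering controls",
]


def select_control(feasible_measures):
    """Return the highest-priority measure that is feasible, or None.

    `feasible_measures` is any collection of measure names that a review
    team has judged workable for the hazard (an assumed input).
    """
    feasible = {measure.lower() for measure in feasible_measures}
    for measure in HIERARCHY_OF_CONTROL:
        if measure in feasible:
            return measure
    return None


# Elimination and substitution are not possible here, so isolation is chosen.
print(select_control(["Engineering controls", "Isolation"]))
```

The point of the ordering is that a lower-priority measure is chosen only when every measure above it has been ruled out.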
Work instructions and procedures are another safety net.
Procedures contain the basic process, or paper trail, for performing
a function. For example: what is the procedure for operating a
specific machine. Procedures describe a process and may include
details about the inputs, what conversion takes place (of inputs into
outputs), the outputs, and the feedback necessary to ensure
consistent results.

Work Instructions hold the exact process for performing the
function; they describe how to perform the procedure. A work
instruction is a tool provided to help someone do a job correctly.
The purpose of the work instruction is quality, and the target
user is the worker.
Physical barriers between hazardous process and workers are also a
protective measure. If a hazard cannot be removed or eliminated,
then enclosing the hazard to prevent exposure in normal operations
is advisable. Physical barriers thus placed will help. Where complete
enclosure is not feasible, barriers or local ventilation can be
established to reduce exposure to the hazard in normal operations.
When exposure to hazards cannot be engineered completely out of
normal operations or maintenance work, and when safe work
practices and other forms of administrative controls cannot provide
sufficient additional protection, a supplementary method of control
is the use of protective clothing or equipment known as personal
protective equipment, or PPE. PPE may also be appropriate for
controlling hazards while engineering and work practice controls are
being installed. While using PPE is important, and required in many
work environments, it is not always enough to keep workers safe.

Reason's Theory
Hazards are contained by multiple protective barriers
Barriers may have weaknesses or holes
When holes align, hazard energy is released, resulting in the
potential for harm
Barriers may be physical engineered containment or
behavioral controls dependent on people
Holes can be latent/incipient, or actively opened by people

As we saw earlier, hazards are contained by multilayered defensive
and protective barriers. Despite so many barriers to prevent
incidents, incidents do happen. Why?
James Reason, a British psychologist, analyzed systemic failure in
terms of four levels of human error: unsafe supervision,
preconditions for unsafe acts, the unsafe acts themselves and
organizational influences. When the four levels of potential failure
align, accidents are inevitable.
Reason calls this theory the Swiss Cheese Model. Here an
organization's successive layers of defenses, barriers and
safeguards are considered as cheese slices. Each slice represents
one layer of defence. In an ideal situation all these layers are intact.
However, in reality each slice has holes that represent the weaknesses
in that defence. These are of two types: human errors (active
failures at the human-system interface) and organizational errors
(latent conditions arising from the failure of designers, builders,
managers and maintainers).
The holes due to active failures would be short lived but the latter
may be latent for a long time. Also unlike the holes in cheese slices,
these gaps are not static. They move, open and shut depending on
various factors.
When these holes align, hazard energy is released, resulting in the
potential for harm. They align to allow a brief trajectory of accident
opportunity, so that a hazard passes through holes in all of the
defenses, leading to an accident.
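A rough numerical sketch can show why layered barriers make accidents unlikely but not impossible. The code below assumes that each barrier has a known chance of having a hole at the moment the hazard arrives and that the barriers fail independently of one another. Both are simplifying assumptions introduced for illustration; the text itself notes that holes move, open and shut, and real barriers often share common causes. The barrier names and probabilities are made up.

```python
from math import prod


def probability_hazard_passes(hole_probabilities):
    """Chance that the hazard finds a hole in every barrier.

    Assumes independent barriers (an illustrative simplification), so the
    individual probabilities are simply multiplied together.
    """
    probabilities = list(hole_probabilities)
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must be between 0 and 1")
    return prod(probabilities)


# Hypothetical barriers with made-up chances of a hole being open.
barriers = {"procedures": 0.1, "alarm": 0.05, "relief system": 0.01}

p = probability_hazard_passes(barriers.values())
print(f"Chance that all holes line up: {p:.6f}")
```

If any single barrier is fully intact (probability 0.0), the product is zero, which matches the statement that one sound barrier is enough to stop the incident.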
These barriers are a mixture of 'hard' and 'soft' defences. The
former include engineered safety features (such as automatic
controls, warning systems and shutdowns) together with various
physical barriers and containments, while the latter comprise a
combination of paper and people: rules and procedures, training,
drills, administrative controls and, most particularly, front-line
operators. The result of these many layers of defence is to make
these systems largely proof against single failures, either human or
technical.
For an accident to occur in such a system, it requires the unlikely
combination of several different factors to penetrate the many
protective layers and to allow hazards to come into damaging
contact with plant, personnel and the environment.

In order to reduce the potential for future major
incidents and losses, three layers of protection are
to be considered:
Plant: engineering hardware, control systems, and layouts to
eliminate, control and mitigate potential hazards to people, and
improve productivity
Processes: management systems to identify, control and mitigate
risks, and drive continuous operational improvement
People: capability of our people in terms of leadership skills,
relevant knowledge and experience, and the organizational culture
they create

In layers of protection, hard barriers are more reliable than soft
barriers, though all rely on people.
These lines or layers serve to either prevent an initiating event from
developing into an incident or to mitigate the consequences of an
incident once it occurs.

What is a management system?

Any work process where steps of work can be outlined and
measured, for example:
Training and promotional systems
Distributed Control Systems
Information sharing and retrieval systems
Document and drawing control systems
Maintenance planning and execution
Contractor management
Capital project (design and engineering)
A management system is a proven framework for managing and
continually improving an organization's policies, procedures and
The best businesses work as complete units with a shared vision.
This may encompass information sharing, benchmarking, team
working and working to the highest quality and environmental
A management system helps an organization achieve these goals
through a number of strategies, including process optimization,
management focus and disciplined management thinking.

A management system is thus a work process where steps of work
can be outlined and measured, such as:
Training and promotion systems: Training is an essential element of
any management system. The amount and the kind of training can
be measured. A gap analysis between the skills required for
employees to perform their jobs and their existing skill sets can give
a good idea of the required training. Their performance can be
measured for promotion.
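The gap analysis mentioned above amounts to comparing two sets of skills. The following Python sketch shows one possible way to do it; the function name, the data layout, the roles and the skill names are all hypothetical and chosen only for illustration.

```python
def training_gaps(required_by_role, employees):
    """List, per employee, the required skills they do not yet hold.

    `required_by_role` maps a role to the skills the job requires.
    `employees` maps a name to a (role, current skills) pair.
    Employees with no gap are left out of the result.
    """
    gaps = {}
    for name, (role, skills) in employees.items():
        missing = set(required_by_role[role]) - set(skills)
        if missing:
            gaps[name] = sorted(missing)
    return gaps


# Made-up roles, skills and people.
required_by_role = {
    "operator": {"start-up procedure", "emergency shutdown", "permit to work"},
}
employees = {
    "Operator A": ("operator", {"start-up procedure", "permit to work"}),
    "Operator B": ("operator", {"start-up procedure", "emergency shutdown",
                                "permit to work"}),
}

print(training_gaps(required_by_role, employees))
```

The size of each person's list gives a simple measure of the training still required.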
Control system failure may result in the loss of production and
equipment damage. That is why control system reliability is an
extremely important consideration when choosing a control system.
An appropriate Distributed Control System (DCS) can result in
reduced downtime, improved system availability, enhanced control
reliability, and uninterrupted system access.
DCS (Distributed Control System) is a computerized control system
used to control the production line in an industry. The entire system
of controllers is connected by networks for communication and
monitoring. DCS is a very broad term used in a variety of
industries, to monitor and control distributed equipment.
Information sharing and retrieval systems are an important
component of today's industries. There is a vast amount of
information and knowledge through expertise and experience, which
can be stored in databanks. However, it is important that correct
and relevant information be available whenever required.
For this, Information Retrieval systems become important.
Information retrieval (IR) refers to the systems for identifying and
presenting documents relevant to human information needs.
Document and drawing control systems: Engineering drawings and
supporting engineering data need to be documented because these
drawings describe how to consistently reproduce the design.
Consistent reproduction is essential, since it forms the basis for
product improvements and production efficiency.
Engineering data proves that the product conforms to the original
design goals. Proof of conformance is important to both internal
stakeholders (marketing, accounting, production) and external
groups (distributors, customers, service providers, regulatory
However, simply creating engineering drawings and recording
engineering data is insufficient. To be useful engineering drawings
and data require engineering document control.
Maintenance planning and execution can prolong the useful life of a
system. Maintenance management's functions are to cost effectively
maintain the system to achieve mission objectives with minimal
downtime, and to introduce upgrade and modification programs that
improve operational capability as required. To accomplish this,
maintenance managers must plan for and execute preventive and
corrective maintenance that is based on an in-depth understanding
of how the system is performing when compared to design
limitations. When done correctly, the useful life of a system can be
extended safely and operational readiness and system effectiveness
are more affordable.
Contractor Management is a complex issue that has a number of
variables impacting on employer and contractor obligations.
Contract life cycle management is the process of systematically
and efficiently managing contract creation, execution and analysis
for maximizing operational and financial performance and
minimizing risk.


What types of Engineers Work Process Safety?

Engineers with many different skills work the aspects of Process
Safety Management. It is very rare, if not impossible, for one
person to hold in their heads all the intelligence needed for a
particular problem. Note here the various roles of the process
engineer, the process safety engineer, and the instrument and
controls engineer. Civil engineers are involved in PSM when they
design blast walls, explosion proof buildings, and work on site
layouts and roads to carry emergency vehicles and their weight.
Mechanical engineers work on the design of pressure vessels, piping
systems, and other mechanical designs to reduce risks especially
with high-speed rotating machinery that handle hazardous gases
like compressors. Electrical power and instrument engineers, and
chemical engineers with controls backgrounds, work on the details
of process design to ensure that the feed forward and feedback
control loops are suited to task. Fire safety engineers are also
sometimes involved. Note that most of these subjects listed are not
taught at universities, but are learned on the job. Many of the skills
of the process engineer are introduced in the 4-year university
study of chemical engineering.

Hierarchy of Incident Investigation

The main objective of an incident investigation is prevention. A
good investigation aims to establish a series of events that should
have taken place and compares it to what actually happened to
identify areas that need changing. An incident investigation is the
account and analysis of an incident based on information gathered
by a thorough examination of all contributing factors and causes
Let us consider three levels of precursors. These are organizational
factors, local workplace factors and unsafe acts by individuals.


The organizational factors are a product of technological innovations
that have radically altered the relationship between systems and
their human elements.
Local workplace factors are characteristics of the task or workplace
that, in combination with human error and violation tendencies, lure
people into repeated patterns of unsafe acts or less-than-adequate
These levels of precursors have safety layers that are meant to
prevent incidents. To ensure barriers are real, one needs to
establish proactive diagnostics and metrics for each of these three
levels of precursors.

Texas City Explosion Hazard Management Diagram

Guess what? It's an imperfect world; there are holes in the

How do we know? Either we find them or they find us!


The holes may change size, shape or location; for example, a
strong shift followed back to back by a less experienced shift.
Sooner or later these holes line up so that the hazard gets through
some of the barriers.
Here is a depiction of what occurred in the Texas City Incident in
March 2005
Work in your teams to divide the barriers (yellow) into people, plant
and work process
Also review the failures that occurred in each barrier (grey) and
categorize them into people, plant and process
Closing comment after class discussion: Remember that only one
barrier, any of these barriers, would have saved the lives of 15

Reason's Model
Improved understanding of barriers/weaknesses
Reviewed the Texas City Incident using Reason's method
Learned hierarchy of how incidents are reviewed
In this lesson we have seen James Reason's Swiss cheese model for
accident occurrence. We have seen that multiple barriers can
prevent incidents. However, each defensive barrier can have
inherent weaknesses, which are gaps in the proper defence. Even
one barrier can prevent an accident. Nevertheless, when chance
aligns the holes or gaps in all the barriers, the hazard can get
through and harm can ensue.


In general, these gaps create latent weaknesses in the safety
barriers. These weaknesses, when coming together, can give rise to
an incident.
We have seen the barriers necessary to ensure safety. These
barriers range from hard barriers such as engineering solutions to
soft barriers such as well-trained human operators.
We have reviewed the Texas City incident using Reason's theory and
seen how a weakness in each barrier led to another, and to another,
until there was no barrier left and safety suffered badly.
We have learned about Incident investigation and the proper
hierarchy of steps including organizational factors, local workplace
factors and acts of individuals.

Using Reason's Model, map out how you were able to safely get
from Forney back to your residence without incident.
Include hazard identification for each hazard encountered, barriers that
are in place, precursors, and failures that did not occur. If a failure
did occur, note it and list what barriers prevented a serious travel
incident or your injury. List the hierarchy level for each barrier,
who owns it and who is responsible for improving it.
More detailed maps will get a higher grade/extra credit.


Asset Integrity and Reliability

To understand how mechanical Integrity fits into the overall PSM

At the end of today, you will be able to:

Know the extent of equipment covered
Know how procedures are utilized
Understand how inspection is an integral part of this element
Understand how equipment deficiencies are to be handled and
to what standards
Understand how people fit into the equation

The 14 PSM elements

Employee Participation
Process Safety Information
Process Hazard Analysis
Operating Procedures
Training
Contractors
Pre-Startup Safety Review
Mechanical Integrity
Hot Work Permit
Management of Change
Incident Investigation
Emergency Planning and Response
Compliance Audits
Trade Secrets

Mechanical Integrity
It is important to maintain the mechanical integrity of critical
process equipment to ensure it is designed and installed correctly
and that it operates properly. PSM mechanical integrity
requirements apply to the following process equipment:
Pressure vessels and storage tanks
Piping systems
Relief and vent systems
Emergency shutdown systems
Control systems, including monitoring devices and sensors,
alarms, interlocks

Mechanical Integrity
Written procedures:
o Establish and implement to maintain ongoing integrity
of process equipment
Maintenance procedures:
o Train in an overview of the process, its hazards, and
safe work practices
The employer must establish and implement written procedures to
maintain the ongoing integrity of process equipment. Employees
involved in maintaining the ongoing integrity of process equipment
must be trained in an overview of that process and its hazards and
trained in the procedures applicable to the employee's job tasks.

Mechanical Integrity
Inspection and testing:
Inspection and testing must be performed on process equipment,
using procedures that follow recognized and generally accepted
good engineering practices (RAGAGEP). The frequency of
inspections and tests of process equipment must conform to
manufacturers' recommendations and good engineering practices,
or be more frequent if determined to be necessary by prior operating
experience. Each inspection and test on process equipment must be
documented, identifying the date of the inspection or test, the name
of the person who performed the inspection or test, the serial
number or other identifier of the equipment on which the inspection
or test was performed, a description of the inspection or test
performed, and the results of the inspection or test.
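The documentation items listed above map naturally onto a simple record structure. The Python sketch below is one possible layout, not a prescribed format: the class name, field names and the completeness check are assumptions made for illustration, and the sample values are invented.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class InspectionRecord:
    """One inspection or test, holding the five items named in the text."""
    inspection_date: Optional[date]  # date of the inspection or test
    inspector_name: str              # person who performed it
    equipment_id: str                # serial number or other identifier
    description: str                 # what inspection or test was performed
    results: str                     # outcome of the inspection or test

    def missing_fields(self):
        """Return the names of fields that are empty or not filled in."""
        return [
            f.name for f in fields(self)
            if not str(getattr(self, f.name) or "").strip()
        ]


# Invented example values.
record = InspectionRecord(
    inspection_date=date(2020, 1, 15),
    inspector_name="",
    equipment_id="P-101A",
    description="Ultrasonic wall thickness measurement",
    results="Within acceptable limits",
)

print(record.missing_fields())
```

A record with anything in `missing_fields()` would not yet satisfy the documentation items listed in the paragraph above.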

Mechanical Integrity
Equipment deficiencies:
Equipment deficiencies outside the acceptable limits defined by the
process safety information must be corrected before further use. In
some cases, it may not be necessary that deficiencies be
corrected before further use, as long as deficiencies are corrected in
a safe and timely manner, when other necessary engineering
controls are put in place to ensure safe operation.


Mechanical Integrity
Quality Assurance:
Assure the equipment fabricated is suitable for the process
Assure equipment is properly installed and consistent with
design specifications and manufacturer's instructions
Assure maintenance materials, spare parts, and equipment
are suitable for the process intended

Mechanical Integrity
o Routine maintenance
o Planned maintenance
o Predictive maintenance
o Reactive maintenance
Maintaining the mechanical integrity of any plant requires a
systematic process to perform maintenance. There are many
different kinds of maintenance, which may or may not be obvious.
Routine maintenance is performed on a prescribed periodic basis,
such as withdrawing and adding a bit of oil to the rotating
equipment regularly. This ensures that the oil is not overused and
gives the operator a chance to see first hand what it looks like,
what it feels like, and what it smells like. This simple task has
avoided many failures.

In the author's area there were probably 200 pumps alone. Unless
a systematic process was in place, it was likely that at least one of
those many pumps would fail. If the failed pump did not have a
reliable spare, the unit could have crashed down.
That was and is unacceptable and easily avoidable.
Another example of routine maintenance is keeping the unit
spotless. So what does cleanliness have to do with reliability?
Simple, if the unit is spotless and the machinery is spotless then
any deviation, say a leaking seal, can be spotted immediately;
hence the fix can be done immediately. This would be reactive
maintenance and means that your other types of maintenance have
Planned maintenance is what is termed as turnarounds (TAR).
Prior to a TAR, a list of required maintenance is compiled as the
need becomes evident. Many items are done every TAR and are on
the permanent list. Other items come up based on their history
(predictive) and are also put on the list. An example of predictive
maintenance is when relief valves (RVs) are pulled and
reconditioned. For many years relief valves did not have
block valves to isolate them on-stream, so they had to be maintained
when the whole unit was down. The isolation valves were omitted
because of the difficulty in knowing whether the flow path to and
from the RV was open. Better procedures, such as x-raying the block
valves to verify an open path, have led to many
RVs now having isolation block valves, which also means that
maintenance can be done on-line.
Can you think of a reason to NOT maintain an RV while the unit is
on-line? The answer is that if the RV is needed due to an
overpressure event and the RV is unavailable, the vessel could fail.

The solution would be to have spare RVs. Students may not know
this, but should be able to think it out.
Another example of predictive maintenance would be when the
vibration monitoring on rotating equipment exceeds the normal
range. The vibrations may still be in the acceptable range but they
tell you to put on the spare machine and fix the problem prior to
failure. Failure to do so could cause significantly more damage
including total destruction of the piece of equipment.
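
The vibration example above can be sketched as a simple three-band check. This is only an illustration: the function name, the band limits, and the units (mm/s RMS) are assumptions made for the example, and real limits come from the equipment data and the site's own monitoring program.

```python
def classify_vibration(reading, normal_max, acceptable_max):
    """Suggest an action for one vibration reading.

    normal_max and acceptable_max are hypothetical limits supplied by
    the caller; they are not taken from any standard.
    """
    if reading <= normal_max:
        return "continue monitoring"
    if reading <= acceptable_max:
        # Above normal but still acceptable: act before failure.
        return "switch to spare and schedule repair"
    return "shut down: outside acceptable range"


# Hypothetical limits for one pump, in mm/s RMS.
for value in (2.0, 3.6, 6.1):
    print(value, "->", classify_vibration(value, normal_max=2.8, acceptable_max=4.5))
```

The middle band is the point of the text: the reading is still acceptable, but it already tells you to put on the spare machine.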
We've talked about a few of the processes in place for maintenance,
but the most important part of the equation is the people. The
author learned the importance of people (all people on the unit)
quite a few years ago when he inherited an individual from another
part of the company. He had been with the company for quite some
time and was transferred into the author's area. On this complex
there were eight operator jobs per shift. The pay was the same for
the operators, but if you learned more jobs that meant you were
eligible for more overtime pay. (The overtime lists were by
definition always balanced among qualified operators, so it was in
the financial interest of the operators to learn as many jobs as
possible to increase their annual take-home pay.)
Well, this gentleman only learned and was qualified on one job. He
didn't seem to be very interested in that job, and certainly didn't
want to learn more. Maybe he wasn't very bright and didn't offer
much to the unit, but with so much time in the company there was
no other alternative. He showed up to work and did the minimum to
keep the job.
Then one day he came in to see the author and complained that the
block valves of the gas to the furnace burners were very stiff and he
thought he might get a strain if he had to work them suddenly.
(Each gas burner had two valves: one for the natural gas to the
pilot and the other for the fuel gas to the burner.) In this complex
there were about 20 individual furnaces with on average 12 burners
or more. So, the number of gas valves approached 500. The
worker was asked to describe the gas valves, to see if he really
understood what they were. He did so, and also mentioned there
appeared to be a Zerk grease fitting on each valve. Yes, he was told,
and asked what he thought would happen if those fittings were
greased on a monthly basis. He understood exactly what was
meant. So he was given one additional responsibility: to make a
complete list of gas valves that needed to be greased on a monthly
basis. Within a short time he made the list. He became the gas
valve king and kept them absolutely workable. He began to shine.
Finally he had something he could do and excel in.
There is a valuable lesson in this. Everyone wants to do well and
excel in something. A good leader finds each person's niche and lets
them excel. When that happens, sound processes and motivated
people yield superior performance.
The key element is the people!

Mechanical Integrity
Pressure vessels and storage tanks
There are industry standards for pressure vessel and storage tank
design and inspection (API 650 and API 653).
However, these are the minimum standards. Generally any standard
you see is the minimum expectation. This calls for tank inspections
at least once every ten years. For very benign service that may be
just fine, but the conscientious company will use its own inspection
data to determine the optimal frequency.
An example of what could cause accelerated deterioration of a tank
would be if water separated out in the tank and caused a corrosion
cell to form at the water interface. Of course, to prevent damage an
epoxy liner could be installed at the expected interface area. But,
then you would need to inspect that epoxy liner on a periodic basis
to ensure its integrity. So, you can see although there are standards
to follow, the key element is the human factor to observe the data
and adjust as necessary.

Brittle Failure Characteristics on Surface of Failure

ASME Section VIII Division 1 recommends that, to minimize the
chance of brittle fracture, the metal temperature during hydrostatic
testing be maintained at least 30 deg. F (17 deg. C) above the
minimum design metal temperature, but not exceed 120 deg.
F (49 deg. C).

By comparison, the National Board of Boiler and Pressure Vessel
Inspection Code requires a metal temperature not less than 60 deg. F
(16 deg. C), unless toughness characteristic information indicates
acceptability of a lower test temperature. The maximum metal
temperature should again not exceed 120 deg. F (49 deg. C).
Hydrostatic testing conditions must be considered and clearly
resolved at the design and material selection stage of vessels. This
minimum temperature also applies during the start-up of vessels in
winter conditions. You must stay below the maximum pressure at
any given temperature.
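
The ASME temperature window quoted above can be written as a small check. This is a minimal sketch using only the two figures given in the text (30 deg. F above the minimum design metal temperature, and a 120 deg. F ceiling). The function names and the example vessel are assumptions, and the sketch is no substitute for the code itself.

```python
HYDROTEST_MAX_METAL_TEMP_F = 120.0  # ceiling quoted in the text
MDMT_MARGIN_F = 30.0                # margin above MDMT quoted in the text


def hydrotest_temperature_ok(metal_temp_f, mdmt_f):
    """True if the metal temperature lies inside the quoted test window."""
    lower_limit_f = mdmt_f + MDMT_MARGIN_F
    return lower_limit_f <= metal_temp_f <= HYDROTEST_MAX_METAL_TEMP_F


def fahrenheit_to_celsius(temp_f):
    """Convert an absolute temperature reading from deg. F to deg. C."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Hypothetical vessel with an MDMT of -20 deg. F, tested on a cold day.
for metal_temp in (0.0, 40.0, 130.0):
    print(metal_temp, hydrotest_temperature_ok(metal_temp, mdmt_f=-20.0))
```

For this hypothetical vessel the lower limit is 10 deg. F, so a test at 0 deg. F falls outside the window even though the water itself would still be liquid only with antifreeze.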
The author had a unit that required a significant warm-up in the
winter before increasing pressure. The pressure/temperature curve
was posted on the control board all the time to make sure no one
would ever forget.
The vessel shown here was hydrostatically pressure tested below
the appropriate temperature and you can see the brittle fracture
characteristics in this photo. Not too bad; that's why we test
hydrostatically, since water is essentially incompressible.

Pressure Vessel That Failed on Hydrostatic Test with Water

However when you see the rest of the vessel you can see that this
vessel is history. It was probably never put in service judging by the
condition of the paint, and attachments.
The lesson here is that any steel is subject to brittle fracture if
stressed when it is below the transition temperature.


This incident happened in a facility in Brazil during a pneumatic test
of the tank's associated piping. A blind was NOT installed to isolate
the piping; only block valves were closed. Remember that air is
compressible and thus stores a significant amount of
energy. This is the foundation where the tank was before the test.
The key lesson here is that testing with an incompressible liquid like
water does not store the energy that testing with a compressible
gas like air or nitrogen does. If a failure occurs, the energy released by a
compressed gas is HUGE compared to an incompressible liquid like
water.

This is where the tank ended up after the test. A point to
remember: blinds serve a purpose, and pneumatic testing is usually
not a good idea.
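
To put rough numbers on the difference between pneumatic and hydrostatic testing, the sketch below compares two textbook approximations: the isentropic expansion energy of an ideal gas and the elastic energy stored in a slightly compressible liquid. The formulas, the water compressibility of about 4.6e-10 per pascal, and the 10 cubic metre test volume are assumptions chosen for illustration. They are not taken from the incident described above, and the strain energy in the vessel wall is ignored.

```python
ATMOSPHERE_PA = 101_325.0


def gas_stored_energy_j(p_abs_pa, volume_m3, gamma=1.4):
    """Isentropic expansion energy of an ideal gas down to atmospheric pressure."""
    pressure_ratio = ATMOSPHERE_PA / p_abs_pa
    exponent = (gamma - 1.0) / gamma
    return (p_abs_pa * volume_m3 / (gamma - 1.0)) * (1.0 - pressure_ratio ** exponent)


def liquid_stored_energy_j(p_abs_pa, volume_m3, compressibility_per_pa=4.6e-10):
    """Elastic energy of a liquid: 0.5 * compressibility * (pressure rise)^2 * volume."""
    pressure_rise = p_abs_pa - ATMOSPHERE_PA
    return 0.5 * compressibility_per_pa * pressure_rise ** 2 * volume_m3


# Hypothetical test: 10 m3 of piping at about 10 bar gauge (1.1e6 Pa absolute).
air = gas_stored_energy_j(1.1e6, 10.0)
water = liquid_stored_energy_j(1.1e6, 10.0)
print(f"air:   {air:.3e} J")
print(f"water: {water:.3e} J")
print(f"ratio: {air / water:.0f}")
```

Under these assumptions the air-filled system holds thousands of times more energy than the water-filled one at the same pressure, which is the point of the lesson.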

Another view of the end result.


Yet another perspective! You really don't want something like this to
happen on your watch. It can really be avoided by using sound PSM

Mechanical Integrity
Piping systems
Piping systems are really crucial. To understand piping systems you
have to clearly understand the fluid mechanics of the system.
Some things you should worry about are the obvious ones, such as
erosion, corrosion, and so on. However, consider the failure of a
piping elbow in the reflux drum circuit of a depropanizer. First, you
know that the contents were highly flammable, so a loss of
containment would be a big problem. At the time of the
incident a rigorous inspection program was in place; in fact, this
reflux circuit had been inspected just one month before the incident. The
inspection team carefully measured the wall thickness of the lines at
the outside radius (where you might expect erosion to be the
highest). Seems appropriate? Well, in this case a two-phase flow was
occurring. The phases separated, and the liquid migrated to one side
by centrifugal force and stayed at that same part of the line, while
the vapor occupied the other portion. Down the line, the outside radius
had become the inside of the line at the next bend, and
the liquid was flung off the line, much like cavitation occurs in a
pump with insufficient net positive suction head. The result was that
the line was chewed away, just as an impeller looks in a cavitating
pump. The line failed, the propylene/propane found an ignition
source, and a huge fire ensued, destroying significant parts of two
units. Both units were down for many months, but the good news
was that there were no fatalities.
The learning here is that if you are responsible for piping system
integrity, know the fluid dynamics that take place and validate with
the inspection data that your assumptions of fluid flow are correct.
Any anomalies? Find out why and adjust your process accordingly.

Mechanical Integrity
Relief and vent systems
Relief and vent systems can easily go under the radar, but are the
escape of last resort and they MUST work when needed. There is no
second chance here. So, what do you routinely do? Look at the lines
leading to the relief header: are they what you expect them to be?
Are they warm? If so, what do you think that might mean?
(Probably a leaking RV!) How do you trace back to the source?
Then what do you do if you find it? If the RV is spared (not very
likely), you isolate the leaking RV, have it pulled, have it serviced,
then verify open path. This last step is VITAL and never to be
ignored. What are other issues? Check on the drains to the systems:
do they contain liquid? If so, drain and find the source. If it is winter, is
the system protected by heat tracing, such as steam or electric? Is it
working? And so on. These are mundane issues, but they need to be
verified to maintain functionality.
What is the periodic maintenance? The RV must be periodically
pulled and serviced. Usually this happens during a TAR. The existing
condition of the RV must be documented to determine whether the
frequency of checking the RV is appropriate or needs to be
changed. The as-is condition tells you that.

Mechanical Integrity
Emergency shutdown systems
Emergency shutdown systems are the last resort before the RVs.
These systems are designed to make an orderly, but rapid
shutdown of the system. To ensure operability, the system must be
maintained and tested at a frequency that assures 100% reliability.
To do so online is tricky, but doable, if the system is designed
properly. That's where we chemical engineers come into play with
the computer and instrument folks. We think of the various
scenarios where we want the shutdown to occur. We think of how
to recognize the systems (remember, false shutdowns are really
frowned upon). Then we think of the sequence that makes the most
sense and causes least harm to the system. Then, after we build it,
we test it. The frequency is really dependent on the reliability of the
individual components. To test each component, it must be isolated
from the blowdown system, then the false signals must be
generated, then the actions of the ESD must be observed and
documented to ensure they are correct. Modern programmable logic
controllers (PLCs) offer options to logically test shutdown systems.
This is no substitute for a full functional test, which should be
performed during a shutdown.

Mechanical Integrity
Control systems, including monitoring devices and sensors,
alarms, interlocks
The control systems, monitoring devices, sensors, alarms, etc. must
all be tested and monitored in a fashion similar to the Emergency
Shutdown Systems (ESDs): isolate the device, generate a false
signal, and observe the actions. It may be tedious, but it is a
very necessary process to ensure the health and well-being of your
facility. These tests are typically undertaken when there is an
identified problem or when a unit is being operated for an extended
time between shutdowns.

Mechanical Integrity
Written procedures - Establish and implement to maintain
ongoing integrity of process equipment
Written procedures are required for operating start-ups and shutdowns,
as we have already seen in a previous lecture, but the mechanical
integrity aspect of maintenance also requires detailed procedures to
ensure that the OEM (original equipment manufacturer) guidelines
are met. This makes the presumption (and a good one at that) that
the folks who make the equipment know best how to maintain it.
Much like the car you drive has some minimum guidelines for oil
changes, air cleaner replacements, etc., so does all
the equipment used in industry. A good idea is to have an audit
process in place to ensure that the OEM's recommendations are
being met.

Mechanical Integrity
Maintenance procedures - Train in an overview of the
process, its hazards, and safe work practices
In addition to the requirement to have maintenance procedures, the
maintenance staff is also required to be given knowledge of the
processes they work on.
Where might that be a crucial requirement? Maintenance workers
on an HF alkylation unit, where contact with HF could be fatal, must
know the process and the dangers. Other examples would include
any process that deals with toxics, pyrophoric materials, or strong
acids or bases.
The overview should talk about the hazards, how to recognize that
you have been exposed to the hazard, and what are the best
practices for safely working with the hazards to prevent injury.

Mechanical Integrity
Inspection and testing
Perform on process equipment
o Follow recognized and generally accepted good
engineering practices (RAGAGEP)
Frequency per manufacturers' recommendations, good
engineering practices, and prior experience
Document for each inspection and test performed:
o Date
o Person who performed
o Equipment identification
o Description of inspection or test
o Results of inspection or test
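
The five documentation items listed above map naturally onto a small record type. The sketch below is one possible layout, not a prescribed format: the class name, the field names, and the sample entry are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InspectionRecord:
    """One inspection or test entry holding the five items on the slide."""
    performed_on: date   # date of the inspection or test
    performed_by: str    # person who performed it
    equipment_id: str    # serial number or other identifier
    description: str     # what inspection or test was performed
    results: str         # what was found

    def __post_init__(self):
        # Reject records that leave any text field blank.
        for name in ("performed_by", "equipment_id", "description", "results"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be blank")


# Hypothetical entry for illustration only.
record = InspectionRecord(
    performed_on=date(2024, 3, 1),
    performed_by="J. Smith",
    equipment_id="P-101A",
    description="Ultrasonic wall thickness survey",
    results="All readings above minimum thickness",
)
print(record)
```

Rejecting blank fields at creation time is a design choice made for the example; it keeps an incomplete record from entering the history that later trend analysis depends on.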
Inspection is the window to what is happening on the unit. The units
are built with a lot of assumptions on where wear will occur and
how rapidly it will take place. However, every unit is truly unique;
that is where a bulletproof inspection program comes in. Have
qualified inspectors, give them the right tools, and then listen to
them and follow up accordingly.
Here is the author's experience in his own words:
I still remember taking over a unit complex and having two TARs
within the first five months. On one of the units the inspection
program was less than robust, or the unit leadership did not listen
to the inspectors, or some other lame excuse, but the bottom line is
that once the unit was down we found that a very large line (main
transfer line, ~36-inch diameter) was well below discard thickness. The
line was at elevation, was normally a long lead-time piece of
equipment, and was a secret! I was not a real happy camper to find
this out. First, the thought of an on-line line failure that would have
led to a loss of containment and subsequent fire jumped out at me.
Happily that did not happen. But second, I would not start that unit
up until that line was completely replaced. If the TAR went beyond
its expected duration I would look like a chump and it would cost a
lot of money for the unscheduled downtime. Neither of which was
appealing to me. To make a long story short, we got the pipe, got it
installed, got it hydrostatically tested, and commissioned all within
the original time frame of the TAR, but not without a lot of blood,
sweat, and tears. AND, the worst part is - it was avoidable.

The point is: Just keep a competent inspection program in place.

Of course, the inspection program includes all pieces of process
equipment, vessels, lines, rotating equipment, everything. To do
that well you must have a systematic process that catches all
possible failure modes and does so in a timely fashion. To be sure it
is a daunting task, but taken systematically and consistently it is
The last thing shown on the slide is the documentation process.
Though it is obvious, pay attention to the last bullet. Interpreting
the results of the test and comparing that result to the last test or
tests is the key to knowing where you are. If the rate of corrosion is
steady, then that's fine, but if you see a jump up or down in the
rate, you must understand why and adjust accordingly. This may
mean that you increase corrosion inhibitor, reduce rates, or
accelerate your maintenance program to compensate. In any event,
know why things change. That is a key to success.
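
Comparing each result with the previous one can be sketched as a short corrosion-rate calculation. The approach below (metal loss divided by time between readings, and remaining life as the margin above discard thickness divided by the latest rate) is a common simplification. The function names, the 25% change tolerance, and all of the thickness readings are assumptions made for this example.

```python
def corrosion_rate(prev_thickness, curr_thickness, years_between):
    """Metal loss per year between two thickness readings."""
    return (prev_thickness - curr_thickness) / years_between


def remaining_life_years(curr_thickness, discard_thickness, rate):
    """Years until discard thickness is reached at the given rate."""
    if rate <= 0:
        return float("inf")  # no measurable loss
    return (curr_thickness - discard_thickness) / rate


def rate_changed(old_rate, new_rate, tolerance=0.25):
    """Flag a jump up or down of more than `tolerance` (as a fraction)."""
    if old_rate == 0:
        return new_rate != 0
    return abs(new_rate - old_rate) / abs(old_rate) > tolerance


# Hypothetical readings: (year, wall thickness in mm); discard at 8.0 mm.
readings = [(2015, 12.0), (2018, 11.7), (2021, 11.4), (2024, 10.5)]
rates = [
    corrosion_rate(t0, t1, y1 - y0)
    for (y0, t0), (y1, t1) in zip(readings, readings[1:])
]
for old, new in zip(rates, rates[1:]):
    print(f"{old:.2f} -> {new:.2f} mm/yr, investigate: {rate_changed(old, new)}")
print(f"remaining life: {remaining_life_years(10.5, 8.0, rates[-1]):.1f} years")
```

In this made-up history the rate is steady for two intervals and then roughly triples, which is exactly the kind of change the text says you must understand before deciding how to respond.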

Mechanical Integrity
Equipment deficiencies:
Correct deficiencies outside acceptable limits before further use, or
in a safe and timely manner when necessary means are taken to
assure safe operation
This seems to be an obvious requirement, but you would be
surprised how many companies (or people within those companies),
if left to their own devices, would cut corners to save money and
look good for the immediate timeframe. But companies should not
be in business for the short run. If that is their plan, it will be
self-fulfilling. Go for the long-term solution and you will be in business
for the long haul. If anything is to be done, do it right and do it once,
and the results will be what you can live with, as will the others you
work with.
Most equipment talks to you every day, rotating equipment in
particular. Listen to it, measure the vibrations and temperatures,
and take it out of service before failure, and you will be richly
rewarded.
The author: I take great pride that in my time in operations I never
had to send in a piece of equipment on an E order. An E order is an
emergency order, which would have been necessary only if I hadn't
listened to the equipment. We routinely ran the spare equipment and
repaired the equipment before failure. It's a much better way to run
the business.

Mechanical Integrity
Quality Assurance:
Assure the equipment fabricated is suitable for the process
Assure equipment is properly installed and consistent with
design specifications and manufacturer's instructions
Assure maintenance materials, spare parts, and equipment
are suitable for the process intended
Again, the bottom line here is: do it right the first time! Make sure
you have the appropriate process equipment in the appropriate
place. Do it according to the manufacturer's specifications. No
compromises, ever! You can live with that. That, however, doesn't
mean you don't use your chemical engineering fundamentals to
evaluate the proposals to verify that what you are being told is
correct and makes good engineering sense. The author was once
told that an acoustic vibration damper was needed on the suction
side of a reciprocating pump to avoid damage to the pump. When
asked how the damper worked, he was told it utilized the
compressibility of water and the internals of the device to damp out
suction side vibrations. The ad showed "spring water," meaning that
water was a little springy!! Huh? What?? Water is, for practical
purposes, incompressible. This was simply a ploy to sell unneeded
equipment that would serve no purpose. Use your good engineering
judgment every day on the job.
Slide 27

So far we have assumed that all jobs would always be done. But in
the real world sometimes you have to evaluate what can wait until
the next opportunity for repair. If not everything can be done, you
must risk rank the jobs to decide systematically. This shows a risk matrix
that the author used extensively. There is nothing magical about
this or any other risk matrix. What is magic is that it is used to
prioritize work.
The easy way to do this is to make a list of the jobs with two
columns. The first lists the consequence of NOT doing the work now
and the second lists the probability that the consequence will occur.
Obviously this is a judgment call, so you bring in the affected people
and those who know the work. Get their input and go through the
list. The final result will be a systematic list from the highest risk to the
lowest risk. Then you can make an informed judgment on what you
will defer. The very act of reviewing the list in this manner will let
you know where the work list can reasonably be ended. After you
go through this process you may decide that you must take an
extra few days to complete more work than had been originally
allocated, but you will have a sound basis for that extension. That
extension will cost your company money, but
this exercise may tell you (and your company) what the
cost of not doing the work may ultimately be.
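
The two-column list described above can be sketched in a few lines. Scoring each column from 1 to 5 and multiplying the two is one common convention and an assumption here; it is not necessarily the matrix the author used. The job names and scores are invented for illustration.

```python
from typing import NamedTuple


class Job(NamedTuple):
    name: str
    consequence: int  # 1 (minor) to 5 (severe) if the work is NOT done now
    probability: int  # 1 (unlikely) to 5 (almost certain) that it occurs

    @property
    def risk(self):
        return self.consequence * self.probability


def rank_jobs(jobs):
    """Return the jobs ordered from highest risk to lowest risk."""
    return sorted(jobs, key=lambda job: job.risk, reverse=True)


# Hypothetical turnaround work list.
work_list = [
    Job("Repaint handrails", 1, 3),
    Job("Recondition relief valve", 5, 2),
    Job("Replace thinning transfer line", 5, 4),
    Job("Repack control valve", 2, 4),
]
for job in rank_jobs(work_list):
    print(f"{job.risk:>2}  {job.name}")
```

The ranked list does not decide where to draw the deferral line; as the text says, that remains a judgment made with the people who know the work.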

We've seen the extent of equipment covered
We now know how procedures are utilized both mechanically
as well as operationally
We've seen how inspection is an integral part of managing
our business
We know how equipment deficiencies are handled and to
what standards
We understand how people fit into the equation
We understand how we can use risk ranking to better
manage our work.

Contractor Management
The objective of this lesson is to understand how contractor
management fits into the overall PSM requirements

At the end of the day, you will be able to:

Know the expectations of the PSM regulation
Know the responsibilities of the employer
Know the responsibilities of the contractor employer
See an example of a management system designed to
proactively address contractor safety

The 14 PSM elements

The PSM program has 14 elements, as we have seen. Each of these
elements has standardized procedures. Such standards need to be
strictly adhered to, not just to satisfy legal requirements, but also
for the safety of people, equipment and environment.
Today we will learn about Contractor Management.

Contractor Safety applicability

Who is a contractor? Why are they required?
A contractor is someone who is not a direct employee of the
organization but is brought in to work at its premises. A contractor
is a person or company that is hired by another employer to perform
temporary work. The nature of the work should be specific and
well defined. The contract worker works for a definite time period.
Contractors are required because companies may not have the skills or
experience to work in a certain area of their operations. This work
may not be an ongoing process; it may be of a temporary nature.
So it makes more practical sense to hire workers trained in that
specific type of work.
Contractor selection is an important responsibility of the hiring party.
Typically, the contractor OSHA log is reviewed to understand the
safety record of the contractor. The OSHA recordable record of a
contractor has a big impact in contractor selection.
At any worksite many different types of contract workers may be
present. The main reason for their presence at a specific jobsite is
because they possess a particular skill for the job at hand or some
specialized knowledge. They may run the facility, or do some
skill-specific job. Some may work for a long time; some may be
contracted for a short time. Sometimes there is a need for increased
staff at short notice.
The safety of all contract workers is the responsibility of the
employer. PSM includes special provisions for contractors and their
employees to emphasize the importance of everyone taking care
that they do nothing to endanger those working nearby who may
work for another employer.
PSM, therefore, applies to contractors performing maintenance
or repair, turnaround, major renovation, or specialty work on or
adjacent to a covered process. (A "covered process" is a process
that contains a regulated substance in excess of a threshold
quantity.)
It does not apply, however, to contractors providing incidental
services that do not influence process safety, such as janitorial, food
and drink, laundry, delivery, or other supply services.

Contractor Safety Employer Responsibility

Employers are responsible for the safety of all onsite workers,
permanent and contract. Work can be hired out, but the health and
safety obligations of the employer cannot. Employers need to:
Obtain and evaluate information regarding the contract
employer's safety performance and programs
Inform contract employers of the known potential fire,
explosion, or toxic release hazards related to the contractor's
work and the process
Explain to contract employers the applicable provisions of the
emergency action plan
Develop and implement safe work practices to control
entrance, presence, and exit of contract employers and
contract employees
Periodically evaluate the performance of contract employers
in fulfilling their obligations
Maintain a contract employee injury and illness log related to
the contractor's work in process areas.
Before hiring a contractor, an employer should create a policy
detailing contractor-relevant work and safety standards, contract
procedures, required information from the contractor, as well as the
date when the policy needs to be reviewed.
If there is a separate contractor safety program, that should be
communicated to the contractor. A list of potential hazards and
hazardous substances should be given in writing.
Apart from the responsibilities of the employer, there are expected
health and safety obligations of the contractor too. These should be
communicated in writing.
The employer must explain to contract employers the applicable
provisions of the emergency action plan; develop and implement
safe work practices to control the presence, entrance, and exit of
contract employers and contract employees in covered process
areas; evaluate periodically the performance of contract employers
in fulfilling their obligations; and maintain a contract employee
injury and illness log related to the contractor's work in the process
areas.

Contractor Safety Contractor Employer

The direct employer of the contract workers also has a responsibility
to ensure that the workers' safety is protected. The
contractor employer must assure the client that each contract
employee:
Is trained in work practices to safely perform his/her job
Is instructed in the known potential fire, explosion, or toxic
release hazards related to his/her job and the process, and
the applicable provisions of the emergency action plan
Has received and understood the training required by this
paragraph. The contract employer must record the identity of
the contract employee, the date of training, and the means
used to verify that the employee understood the training
Follows the safety rules of the facility
Advises the employer of any unique hazards presented by
the contract employer's work, or of any hazards found by the
contract employer's work
The contract employer has to instruct workers in safe work practices
and the safety rules of the facility, as well as in the known potential
fire, explosion, or toxic release hazards related to their job and the
process, and must also ensure that each worker is fully aware of the
relevant emergency action plan.
Once the contract employer has confirmed these particulars, s/he
needs to assure the client that the worker
has the necessary expertise and training, and fully comprehends the
job-specific hazards and the emergency action plan.
The contract employer has to confirm that the worker has received and
fully understood the training required by the regulations. The contract
employer must record the identity of the contract employee, the
date of training, and the means used to verify that the employee
understood the training.
The contract employer needs to assure that the safety rules of the
facility are followed.
The contract employer has the responsibility to advise the
client of any unique hazards presented by the contract
employer's work, or of any hazards found by the contract
employer's work. This will safeguard both the contract workers and
other workers close to the potential hazard.

Contractor Safety program elements

The following are the essential elements of a sound contractor
safety program. We will study more about each element.
Contractor Selection
Contractor Safety Committee
Pre-Job Safety planning
Case Management
Reward and Recognition
Drug Screening policy

Contractor Safety
Contractor Selection
The first step in a meaningful contractor safety program is the
selection of the contractor.
Having defined selection and evaluation criteria is essential in hiring
and maintaining contractors with excellent safety performance.
Selection criteria need to be based on OSHA incident rates and
insurance experience modifier rate (EMR) that are consistent with
the safety objectives of your company. In addition, several absolute
criteria must be met to assure that a contractor has a safe work
history. A thorough review of a contractor's safety program and
procedures is another indicator of the culture and commitment of
that company to safety.
The selection process must also contain a de-selection process and
a re-admittance program. Contractors by definition have many
employees so the evaluation should start with the overall safety
history of the contractor in general, then focus on the people
proposed for your individual company. The selection process must
include both steps to be successful.
Selecting a contractor must include a review of their
performance with other similar companies. The contractor should
provide their metrics of OSHA incident rates and their EMR. EMR is
the Experience Modifier Rate. The "experience mod," as it is called
in the insurance industry, is a numerical expression of a company's
accident and injury record compared with the average for the firm's
industry. An experience mod of 1.0 means a company has an
average safety record, while an experience mod of 0.80, for
example, means a company has a good safety record that merits a
20 percent discount. An experience mod of 1.20 means the firm's
accident rate is above the industry norm and raises a company's
costs by 20 percent.
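
The effect of the experience mod on cost can be shown with simple arithmetic. In the sketch below the base premium is a made-up figure and the function names are hypothetical; the only inputs taken from the text are the example mods of 0.80, 1.0, and 1.20.

```python
def adjusted_premium(base_premium, emr):
    """Scale a base insurance premium by the experience mod."""
    return base_premium * emr


def describe_emr(emr):
    """Describe the record relative to the industry average of 1.0."""
    if emr < 1.0:
        return "better than the industry average"
    if emr > 1.0:
        return "worse than the industry average"
    return "average for the industry"


# Hypothetical base premium of 100,000 (currency units are arbitrary).
for mod in (0.80, 1.00, 1.20):
    premium = adjusted_premium(100_000, mod)
    print(f"EMR {mod:.2f}: premium {premium:,.0f} ({describe_emr(mod)})")
```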
This EMR is calculated using payroll and loss data for the oldest
three of the last four years. Loss data includes paid claims as well
as a "reserve" for all outstanding claims. These reserves are usually
well established by the time they are used in these calculations, at
least one year after the policy has expired. However, if claims are
eventually settled for a different amount than reserved, the EMR will
be adjusted accordingly.
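The OSHA rate is the total number of OSHA recordable injuries and
illnesses multiplied by 200,000 and divided by the total number of
hours worked by all covered employees. The 200,000 figure represents
100 full-time employees working 40 hours a week for 50 weeks a year,
so the rate is expressed per 100 full-time workers. So, there are two
metrics to consider, one relative to
the others in their industry, the EMR, and the second, an absolute
number. Again, these are minimum requirements for consideration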
of the contractor. A subtle aspect of getting these numbers from the
contractor is an independent verification that the numbers given are
accurate. Should that not prove to be the case the contractor under
consideration should be a very hot potato and dropped accordingly!
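As a worked example of the second metric, the sketch below uses the standard OSHA incidence rate formula: recordable cases times 200,000, divided by total hours worked. The function names, the sample figures, and the tolerance used for the verification check are assumptions for illustration.

```python
def osha_incidence_rate(recordable_cases: int, hours_worked: float) -> float:
    """Recordable cases per 100 full-time workers per year.

    200,000 = 100 employees x 40 hours/week x 50 weeks/year.
    """
    if hours_worked <= 0:
        raise ValueError("hours worked must be positive")
    return recordable_cases * 200_000 / hours_worked


def matches_reported(reported: float, recomputed: float, tol: float = 0.05) -> bool:
    """Check a contractor's reported rate against an independent recomputation.

    The tolerance is a hypothetical allowance for rounding.
    """
    return abs(reported - recomputed) <= tol


rate = osha_incidence_rate(recordable_cases=3, hours_worked=400_000)
print(rate)
print(matches_reported(reported=1.5, recomputed=rate))
```

Recomputing the rate from the contractor's injury logs and payroll hours is one simple way to carry out the independent verification mentioned above.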
Even after this selection process an incident can occur. If so, an
immediate review of the specifics must be undertaken to determine
the root causes of the incident. If the root cause was a failure of the
safety systems in place, they must be identified and corrected. If
the incident was a result of lack of oversight by the contractor, then,
if warranted, the contractor must be discharged. A follow-up
process may then be undertaken to determine whether the contractor
has addressed and corrected the lack of oversight. If subsequent
metrics show that the steps taken have corrected the problem, a
re-admission process may be undertaken.

Contractor Safety
Financial terms and conditions
Job specifications including quality requirements
o Adherence to appropriate safety regulations
o Accident and near miss reporting requirements
o Employee safety training and certifications
o Safety representatives, safety meeting requirements,
and job safety assessments
o Plans and procedures requirements

Timing, schedule and milestones

An obvious aspect of hiring any contractor is the contract, the legal
document that cements the deal. Minimum elements of any such
contract should include the financial terms and conditions and the job
specifications that clearly spell out the quality requirements. The
minimum safety elements beyond the EMR and OSHA rates should
be spelled out. These include assurance that the contractor will
adhere to all appropriate safety regulations, local, state, federal,
and company. Accidents and near misses may occur and should be
reported promptly. A formal mechanism should be in place for this.
Further, both should be analyzed for appropriate corrective actions.
Documentation of the employees' safety training and certifications
should be included in the contract. Regular safety meetings should
be a normal part of the business, and the responsibilities of the
contractor and employer safety representatives should be clear.
The plans for work and the procedures to be followed should
be spelled out, both for the work itself and for the safety practices.
Finally, the timing of completion of work and the milestones
should be agreed upon up front.

Contractor Safety
Top performing contractors have extensive training programs.
Employees are trained on the safety policies and procedures. Job
skill training such as welding or pipe fitting is also provided by many
contractors. In addition, many companies are starting to train their
supervisors in root cause analysis and accident investigation
techniques. Process unit and facility specific training is a key area.
Most pacesetter organizations are actively involved when training
contract employees on site specific issues.
Active owner involvement in this area is the key to success.
Owners must know that the training program is in place, know that
it is fit for purpose and is continually updated if conditions dictate
the need. This will have been a part of the initial contract, but the
owner needs to have a process in place to ensure it is adhered to.

Contractor Safety
Contractor Safety Committee:
A joint contractor/owner safety committee is an essential element of
the safety program. Joint committees foster open communication
between parties resulting in enhanced safety performance.
Contractors are typically brought in to perform the most dangerous
work. Examples of such work are catalyst change outs in reactors,
hot taps, and a variety of work that is not regularly performed.
Changing catalyst in a desulfurization process means that the
catalyst (typically a nickel/molybdenum or cobalt/molybdenum)
becomes pyrophoric while on stream and must be removed under
IDLH (immediately dangerous to life or health) conditions. In
other words, the work takes place in an oxygen-deficient atmosphere
that is essentially 100% nitrogen.
Since each location is unique the communication between the safety
committees and the workers is necessary to ensure all potential
problems are communicated. Once inside a reactor that is probably
quite warm, and while wearing full breathing equipment (which
limits visibility), any slip could mean problems. So, the
proper procedures must be in place and fully practiced. Both the
contractor and the company need to be fully aware of and following
the procedures.
Company and contractors should ensure that risks are minimized,
when contractors are engaged, by diligent application of proven
standards of risk management policies, work processes, systems,
and procedures which fully integrate health and safety evaluation,
planning and design. All contractors must commit to and abide by
these standards to maintain superior levels of health and safety.

Contractor Safety
Pre-Job Safety Planning:
Careful planning of work assures that the work is performed
efficiently and safely, and safety planning is a critical part of work
planning. Work planning ensures the scope of work is understood,
appropriate materials are available, all hazards have been identified
and mitigating efforts established, and all affected employees
understand what is expected of them.
Pre-planning a job is necessary to performing the job safely. Big
construction projects and turnarounds need to be pre-planned with
safety in mind. Master safety plans that identify potential hazards
related to specific job tasks are essential. For smaller jobs or
day-to-day tasks, job safety analysis or similar techniques are employed
to identify the hazards that can be encountered. Owner job
representative participation in safety pre-planning is required.
All personnel working for contractors must complete an appropriate
safety induction prior to starting work. The organization should
provide a site-specific induction and the contractor is responsible for
providing all other training that may be required. A task-specific Job
Hazard Analysis will help identify unique hazards associated with a
particular task. These should be prepared by the employer.
For long-term or ongoing contracts, refresher training for all
contracted employees must be provided by the contractor at least
Pre-job safety planning also means developing written safety
guidelines for all employees to follow. The safety plan should be
clear, and explained to all the employees AND understood by them.

Contractor Safety
Case Management:
The top performing contractors and pacesetter organizations
aggressively manage all injury cases. Once an injury occurs a
trained person typically follows the case to the end. Top performers
work closely with their medical providers to train them on all
aspects of working in the specific industries. Informed medical
providers will be sensitive to the needs of their client and will have a
better understanding of requirements when it comes to the OSHA
record keeping.
Rehabilitation goals, and the steps to achieve them and return the
injured employee to work, must be established.
A practical injury management plan specific to the workplace could
be developed.
As mentioned earlier, all injuries should be analyzed to determine
the root cause of failure, and corrective steps taken to prevent
reoccurrence.

Contractor Safety
Reward and Recognition:
Most contractors believe that some type of reward and recognition
program is essential to maintain the focus on safety and achieve
good performance. These are considered powerful tools used by
contractors to motivate their employees.
Some of the contractors believe that monetary incentives are critical
to success as long as the program is structured to be separate from
regular pay and provides incentives to maintain good performance
over the long term of a given project. Other contractors prefer a
reward system that is based on non-monetary gifts such as gift
certificates, jackets or other small items. These rewards are
typically given for defined milestones such as safe days worked.
Reward and Recognition (R&R) is a way in which contractors can influence
the major drivers of employee engagement in safety. On the job
you will see many different types of R&R programs, but the goal of
all is to heighten awareness and performance in on-the-job safety.
When reinforced by the owner the program gets enhanced results.

Contractor Safety


Top performing organizations hold their supervision and
management accountable for safety performance. The best
performing companies consider safety as a condition of employment.
Promotions, bonuses, and pay raises are often tied significantly
(25-30%) to the safety performance of the manager's or supervisor's
team. The safety pacesetters do not tolerate repeat or continuous
poor safety performance from supervisors or managers.


Contractor Safety
Drug Screening Policy:
The misuse of alcohol and other drugs is extremely dangerous,
especially in the chemical industry. The risks are many, and
impaired employees can compromise safety. Because contract employees
undertake high-risk jobs, it is all the more important that they
are regularly tested for drugs and alcohol.
The top performing contractors rigorously screen their employees.
This includes 100% drug testing for pre-employment qualification.
Random drug testing programs that have severe penalties for
violations ensure that the intent to keep drugs out of the work place
is followed. A good drug testing policy will encourage early detection
of a substance abuse problem, facilitate early intervention, and,
when appropriate, provide support for the employee to deal with
the problem. It will also ensure a safer workplace.

Contractor Safety
A consistent audit program is a critical element to excellent safety
performance. Top performing contractors typically follow an audit
program regardless of the requirements of the owner. Audits that
address both site conditions and safe behaviors are most beneficial.
Periodic workplace safety audits prevent injuries and accidents.
Audits are important to effective safety management as a
continuous process of workplace safety planning, analysis, and
correction when needed.


Most injuries in the workplace occur due to unsafe behaviors rather
than unsafe conditions. Audits focus on safety programs and
behaviors while safety inspections focus on the facility, equipment,
and tools. Audits help analyze contract workers' understanding of and
compliance with safety procedures and programs.
Audits should include observations of employee working habits
doing a variety of job tasks. Are the contract workers following
safety procedures? Are they wearing required personal protective
equipment? Are they lifting properly and following good ergonomics?
Immediate feedback is an important aspect that many companies
are starting to implement. Feedback on observations and audits
should include both positive items as well as opportunities for
improvement.
The observations and recommended corrective items in a safety
audit should be documented. The results of the audit should be shared
with employees. This includes the positive observed behaviors,
observations that required improvement, and information on what
corrective actions were taken.
Owner participation is beneficial to a successful contractor audit
Formal and informal audits that are documented and followed-up
are key areas to improving contractor safety performance. Audits
are used to monitor behaviors, compliance with safety policies, and
physical conditions of the work site.
It is important to monitor contractor performance and compliance
with safety and health requirements on an ongoing basis. The
frequency of monitoring should depend on the level of risk
associated with the work that is being performed. Regardless of the
type of audit, it is advisable that it be carried out by a qualified
assessor properly trained to carry out the task objectively,
impartially, and effectively.

In this lesson we have:
Reviewed the PSM regulation
Reviewed the responsibilities of the employer
Reviewed the responsibilities of the contractor employer
Reviewed an example of a management system that
proactively addresses contractor safety


Emergency Management and Response

To understand the fundamental elements of an emergency
management and response plan, which must be in place to
meet the requirements of PSM
To begin to build understanding for additional learning and
sensitivities important in the execution of the plan

At the end of the day, you will be able to:

Know the key elements of an emergency management and
response plan.
Understand the philosophy behind and some details of a
typical plan, so that you will be able to actively participate as
a responder.

Today's Roadmap
What could be an Emergency?
Who is involved in an Emergency?
Framework/Philosophy in Support of Emergency Response
Key priorities that must be addressed
Typical Scope of ER Team
What is in the plan?
Role of ER Support Center
Summary and Homework

What could be an Emergency?

Death and multiple injuries
An emergency is a situation that poses an immediate risk to
health, life, property or environment. Most emergencies require
urgent, immediate intervention to prevent a worsening of the
situation, although in some situations, mitigation may not be
possible and agencies may only be able to offer palliative care for
the aftermath.
In order to be defined as an emergency, the incident should be one
of the following:
Immediately threatening to life, health, property or
environment
Have already caused loss of life, health detriments, property
damage or environmental damage
Have a high probability of escalating to cause immediate
danger to life, health, property or environment
An emergency may also be defined as a condition where life,
health or property is in jeopardy and the prompt summoning of aid
is essential.
Emergencies include:
Multiple injuries
Fire Hazards
Hazardous Materials Incidents
Adverse Weather (tornado, floods/rains, winter weather)
Suspicious Packages and Biological Threats
Bomb Threats
Workplace Violence
Power Failure
Working alone
Transportation Incidents
Emergency management is the process to prepare for, mitigate,
respond to, and recover from an emergency.
Who is involved?

Group Exercise 1:
Generate a list of potential events that could activate a
corporate emergency response plan
List, from your perspective, who would be the key people
involved in managing each emergency, from both inside and
outside the company.
Keep your lists for further reference.
Any of the emergencies listed above can activate a corporate
emergency response plan.
The key people involved in managing emergencies will depend on
the nature and severity of the emergency. The emergency action plan
(EAP) should also state the degree of involvement of facility
employees for various types of emergencies. Local emergency response
personnel may handle some emergencies such as fires and explosions.
This should be clear in the written EAP. At such times the corporate
emergency action plan will focus on evacuation and notification.

Framework for EM&R

Based on corporate values and reputation management
Provides tangible management system to manage risk
Is principle based and encourages right behaviors
Provides basis against which actions can be subjected to
monitoring and performance review
Reduces risk or increases opportunity for value
Emergency management is the generic name of an interdisciplinary
field dealing with the strategic organizational management
processes used to protect critical assets of an organization from
hazard risks that can cause events like disasters or catastrophes
and to ensure the resiliency of the organization within their planned
The Emergency Management and Response framework is based on
corporate values and reputation management. With a well-planned
strategy, organizations can have effective emergency risk assessment,
mitigation, preparedness, response and communications. Excellence
in this crucial area will enhance the organization's image and
credibility with employees, customers, suppliers, and the community
as a whole.
The purpose of an EM&R is to facilitate and organize employer and
employee actions during workplace emergencies. Well-developed
emergency plans and proper employee training (such that
employees understand their roles and responsibilities within the
plan) will result in fewer and less severe employee injuries and less
structural damage to the facility during emergencies.
This plan should be principle based and encourage right behaviors.

Emergency management programs tend to be decentralized in
execution. In order to guide the multifarious activities that are
needed to support the program, it is necessary to have a clear
process for coordination. The desired end result of the emergency
management program should be safety of people first, then other
The EM&R provides a benchmark for actions during an emergency.
Such actions can be monitored and subjected to performance
review.
The risks too are reduced. Emergency management and response is
proactive and ensures that the strategic organizational management
processes are used to protect critical assets of an organization from
hazard risks that can cause events like disasters or catastrophes
and to ensure the resiliency of the organization within their planned

Level of Effort in Emergency Response







* Percentage may change based on incident; however, most
effort should be in prevention.
Planning is prevention and preparedness! The first objective of the
plan should be to do everything possible to prevent emergencies. In
fact, in any Emergency Response Plan the majority of effort must be
directed at prevention of emergencies. Prevention means actions
that are cost effective and substantially reduce the risk of
future damage, hardship, loss, or suffering in any area affected by a
major disaster. Prevention saves lives, reduces property damage,
and helps to preserve the economy in the disaster area, thus
reducing disaster assistance costs.
The EAP should next address preparedness in case of emergencies.
Preparedness is planning how to respond should an emergency
occur, and working to increase resources and the ability to
effectively respond. Preparedness involves actions that will improve
the speed and coordination of the response to an emergency.
Emergency pre-planning and training will make employees aware
of, and prepared to implement proper actions. Emergency
preparedness, or the employer's tertiary (third) lines of defense, are
those that will be relied on along with the secondary lines of
defense when the primary lines of defense that are used to prevent
an unwanted release fail to stop the incident.
Response is the period of time shortly before, during and after a
disaster, during which activities are conducted to save lives and
minimize damage. Activation of the local Emergency Operations
Center (EOC), search and rescue, and reception and care of disaster
victims are some of the response actions.
Responders may be working under very hazardous conditions and
therefore they should be competently led, properly equipped to do
their assigned work safely, and fully trained to carry out their duties
safely before they respond to an emergency.
Recovery is the period of time when the immediate threat to life
and property has passed, and cleanup, repair, and restoration
activities become a priority. This stage will continue until the
organization or community is returned to normal or near-normal
operations. Debris cleanup, damage assessment, and reconstruction
are some recovery measures.

Philosophy for EM&R

Over react
Stand Down


Group Exercise 2:
Discuss the 4 philosophical concepts of EM&R and jot down what
your team thinks is a good definition of each of the 4 philosophies.
Be prepared to share. You have 5 minutes.

Priorities during an Emergency

1. People: employees, contractors, suppliers, customers,
communities, responders
2. Environment: air, water, land, spillages, other areas of
3. Property: company, partners, community
4. Business: supply to others, production, reputation
The priorities during an emergency are obviously first people, then
environment, property and business in that order.
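The priority order can be made concrete with a small sketch that sorts pending response actions so that people come first. The enum name, function name, and sample actions are hypothetical and serve only to illustrate the ordering.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Emergency priorities in the order given above (1 is highest)."""
    PEOPLE = 1
    ENVIRONMENT = 2
    PROPERTY = 3
    BUSINESS = 4


def order_actions(actions):
    """Sort (priority, description) pairs by priority.

    sorted() is stable, so actions with equal priority keep their order.
    """
    return sorted(actions, key=lambda action: action[0])


pending = [
    (Priority.BUSINESS, "notify customers of supply interruption"),
    (Priority.PROPERTY, "isolate damaged equipment"),
    (Priority.PEOPLE, "account for all personnel"),
    (Priority.ENVIRONMENT, "contain spill"),
]
for priority, description in order_actions(pending):
    print(priority.name, description)
```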

Levels of Crises

Typical Response Team

A way to think about how this structure works is to relate it to how
your local fire department functions. It has a pre-plan in place for,
say, a structure fire on several levels. If that structure fire is a
detached garage, a single fire apparatus will most likely respond and
put out the fire. If that structure is a garage attached to a single
family dwelling, several fire apparatus will respond and other
responders will be put on alert. If that fire is in close proximity to other homes,
the other responders will roll to help contain and limit the spread of
the fire. If that garage is a part of a large multi-family dwelling
such as an apartment complex, all of the previously mentioned
responders will roll and the mutual aid departments will also be
alerted and they will respond.
This is the logic for the Emergency Response Plan. The local incident
management team responds to local confined issues. If the incident
is broader than simply a local issue the business support team is
also activated and will provide assistance. Finally, if the incident is
broad and could affect the whole corporation, the corporate team
will be activated to provide assistance and support. The key to this
type of structure is to have pre-plans in place that envision what
could go wrong and anticipate the appropriate response and have
the required resources in place. These plans must be periodically
reviewed and verified by independent reviewers to provide
assurance to the local facility, the company, and the corporation
that they are sufficient and will meet the anticipated needs.
So, the activation of each Emergency Response Team depends on
the size of the incident or the level of crisis. Those responsible at
higher levels provide support to the local team. If required, teams
of experts may be formed at the higher levels and dispatched to
provide local support (Incident Management Teams).
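The expanding structure described above can be sketched as a simple lookup: each broader incident scope activates the teams for that scope and every scope below it. The scope names and the function are hypothetical; the team names follow the text.

```python
from enum import IntEnum


class Scope(IntEnum):
    """How far the incident reaches (hypothetical labels)."""
    LOCAL = 1      # confined to the local facility
    BUSINESS = 2   # broader than a local issue
    CORPORATE = 3  # could affect the whole corporation


# Teams in the order they are activated as the scope widens.
TEAMS = [
    (Scope.LOCAL, "local incident management team"),
    (Scope.BUSINESS, "business support team"),
    (Scope.CORPORATE, "corporate team"),
]


def teams_to_activate(scope: Scope) -> list:
    """Return every team whose activation scope is at or below `scope`."""
    return [name for level, name in TEAMS if level <= scope]


for scope in Scope:
    print(scope.name, teams_to_activate(scope))
```

The local team appears in every result, which reflects the point that higher levels support the local team rather than replace it.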
Not all of the teams need to be active in each incident. The
Emergency Response structure is meant to expand and contract as
the scope of the incident requires. For small-scale incidents, only
the incident commander may be assigned. Command of an incident
would likely transfer to the senior on-scene officer of the responding
public agency when emergency services arrive on the scene.
Command transfers back to the business when the public agency

A multilayered emergency communication system should be in place
to keep everyone informed in urgent situations.

What is in the Plan?

An emergency action plan must be in writing, kept in the workplace,
and easily accessible to employees for review. However, an
employer with 10 or fewer employees may communicate the plan
orally to employees.
The Plan must have the following elements with details:
Preplan and practice frequently (outside observers for
practice to evaluate and find gaps)
Who should respond internally? Externally?
What are qualification requirements for all responders?
What triggers escalation to the next level?
What specific equipment will be needed for each scenario?
What is the backup plan for necessary resources, e.g. water,
foam, manpower, communications, food?
Hazard analysis helps to plan/build scenarios
Include natural disaster potential, environmental impact
Include governmental notification process
Need relationships BEFORE the emergency!
Training for each should be a part of the EM&R plan
Drills for training should include: Tabletop Exercises, Drills,
and Limited or Full-Scale Exercises
Full scale exercises should include outside observers to note
effectiveness and identify gaps


Drills should include outside responders when possible, both
governmental as well as contractors
Plan should include recovery from event and internal
Planning and regular practice on emergency response actions
should be part of the safety policy of an organization. Detailed
protocols and procedures should be practiced.
Internal and external responders should be designated in advance
and informed of their roles. The names of those individuals must be
listed in the plan. These responders should have proper training in
the task and should have complete knowledge of the response
expected from them.
The different levels of emergency need to be determined based on
the anticipated severity of an incident. The notification recipients
should be decided beforehand for all emergencies. For example, a small
fire hazard may be classified as a Level 3 emergency when it is
limited to a detached office or workshop. This turns into Level 2
when the fire is likely to spread to nearby structures and may be
dangerous to life and environment. When the hazard of fire can
spread to the entire facility, it may escalate to Level 1.
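The fire example can be written down as a small classification rule, with the notification list for each level decided in advance. This is a sketch under stated assumptions: the function name, the two yes/no inputs, and the recipients in the notification table are hypothetical, and a real plan would define its own triggers and contacts.

```python
# Hypothetical notification lists, decided beforehand for each level.
NOTIFY = {
    3: ["shift supervisor"],
    2: ["shift supervisor", "site emergency response team"],
    1: ["shift supervisor", "site emergency response team", "outside responders"],
}


def fire_emergency_level(may_reach_nearby_structures: bool,
                         may_reach_entire_facility: bool) -> int:
    """Classify a fire using the example in the text (1 is most severe)."""
    if may_reach_entire_facility:
        return 1
    if may_reach_nearby_structures:
        return 2
    return 3  # limited to a detached office or workshop


level = fire_emergency_level(may_reach_nearby_structures=True,
                             may_reach_entire_facility=False)
print(level, NOTIFY[level])
```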
In level 3 the system is at or near the emergency limits, but the
operators and supervisors believe that they are able to return the
plant to normal conditions using normal operating procedures and
techniques. It is critical that they understand the exact nature of
the problem if they are to be successful in this.
The higher level of an emergency occurs when the safety
instrumented system and other high reliability, automated devices
(including relief valves) take over. At this point in time the role of
the operator is simply to secure the unit as it shuts down.


In the most severe stage of an emergency, the situation is out of
control. There may be a large fire or chemical release to contend
with. The full emergency response system is needed to minimize
injuries, environmental damage and loss of equipment.
The equipment needed to contain or fight each type of emergency needs
to be predetermined and its location identified in the preplan so
that every participant knows where it is. The equipment may include fire
protection and suppression equipment, communications equipment,
first aid supplies, emergency supplies, warning systems, emergency
power equipment, and decontamination equipment. Contingency
plans and backup measures for water, foam,
manpower, communications, and food should also be established.
It is important to identify any chemicals that require special
treatment during the course of an emergency. For example, the use
of water on some chemicals may cause them to ignite. In these
cases, they must be controlled with other chemical agents.
Hazard analysis is required to determine potential emergencies.
Hazard initiating events can be identified, listed and analyzed when
conducting hazards analyses and preparing a risk management
plan. Factors to be considered when identifying potential accident
scenarios include, for example in the case of a toxic chemical release,
the location of the release, its magnitude, the wind direction and the
number of people who may be in the area at the time of the release.
It can be useful to model some of the scenarios, particularly the
release of hazardous chemicals so that, if the accident actually does
occur, the emergency responders will have some idea as to the size
of the incident with which they may be expected to cope.
Although the emergency action plan will address all emergencies in
some way, the plan will focus on the most likely events.


The plan should include a description of the alarm system to be used
to notify all people (including disabled employees) to evacuate
and/or take other required actions. The alarms used for different
actions should be distinctive and might include horn blasts, sirens,
or even public address systems.
In case of an evacuation proper procedures and escape routes
should be already known to employees. They should know who is
authorized to order an evacuation, under what conditions an
evacuation would be necessary, how to evacuate, and what routes
to take. Evacuation procedures often describe actions employees
should take before and while evacuating such as shutting windows,
turning off equipment, and closing doors behind them. Exit
diagrams are typically used to identify the escape routes to be
followed by employees from each specific facility location.
There should be procedures for any employees who are required to
remain on site after the evacuation alarm sounds, to be followed
before they evacuate.
Response to natural disasters should be included in the plan, along
with how to prepare for, respond to and recover from such
disasters. Environmental events, such as earthquakes, are prone to
creating multiple, simultaneous emergency situations.
There are established reporting requirements for hazardous
substance releases and oil spills to identify when the federal
government should be notified. States also may have separate
reporting requirements. These should be strictly followed. Identify
applicable federal, state and local regulations, fire codes,
transportation regulations, zoning regulations, and corporate
policies to enable correct procedures for reporting.
Meet periodically with local government agencies and community
organizations. Inform appropriate government agencies that you
are creating an emergency management plan. While their official
approval may not be required, they will likely have valuable insights
and information to offer.
Some disaster responses could benefit from media sources to notify
concerned people, relatives, etc. Such relationships need to be
cemented before any disaster. The press and the public must be
informed of what is going on at the site, particularly if anyone is in
any danger. Facility management should take the initiative when
communicating with the public, and they should be open and as
forthright as possible. Telephone lines and other links for public
communication must be available, and they must have sufficient
capacity so that they do not become jammed when unnecessary
calls occur (and they will).
Training for each emergency response activity should be a part
of the EM&R plan. Drills for training should include: Tabletop
Exercises, Drills, and Limited or Full-Scale Exercises. For example, in
a tabletop exercise held in a conference room setting, describe an
emergency scenario and
have participants discuss their responsibilities and how they would
react to the situation. Based on this discussion, identify areas of
confusion and overlap, and modify the plan accordingly.
The drills should include outside responders when possible, both
governmental as well as contractors. Everyone who works at or
visits the facility requires some form of training. This could include
periodic employee discussion sessions to review procedures,
technical training in equipment use for emergency responders,
evacuation drills and full-scale exercises.



Role of the ER Support Center

During an Emergency: provides facilities, infrastructure,
input to management
During Normal Operation: provides training, runs exercises
and drills, improves infrastructure and keeps it working and
After an incident: may coordinate the corporate work around
investigation and systems fixes
The ER support center serves as a centralized management center
for emergency operations. Here, decisions are made based upon
information provided by the incident commander and other
Regardless of size or process, every facility should have a
designated area where decision makers can gather during an
emergency. The center should be located in an area of the facility
that is safely away from potential hazard areas. An alternate
location should be designated in the event that the primary location
is not usable.
Each facility must determine its requirements for such a center
based upon the functions to be performed and the number of people
involved. Ideally, the emergency response center is a dedicated
area equipped with communications equipment, reference
materials, activity logs and all the tools necessary to respond
quickly and appropriately to an emergency.
During normal operations the ER support center should provide
general training for all employees and could include:
Individual roles and responsibilities
Information about threats, hazards and protective actions
Notification, warning and communications procedures
Emergency response procedures
Evacuation, shelter and accountability procedures
Location and use of common emergency equipment
Emergency shutdown procedures
Apart from these activities it is a good idea to carry out orientation
sessions, tabletop drills, walkthrough drills, functional drills, and
even full-scale exercises.
Conduct sessions at least annually:
For new employees during their orientation period
For existing employees when there is a change in their duties
When new equipment or materials or processes are
When emergency procedures are updated or revised
When exercises show that employee performance needs
Communications are needed to report emergencies, to warn
personnel of danger, to keep families and off-duty employees
informed of events at the facility, to coordinate response actions, and
to keep in contact with customers and suppliers.
After the incident, as soon as the site is secure, and the danger is
over, recovery procedures can start. At this time, the plant may
contain many unexpected hazards, such as the danger of being
struck by falling equipment that has had its foundations weakened
by fire. Or there may be pockets of spilled chemicals in unexpected
places. Some equipment may be contaminated with hazardous
chemicals, and may need to be specially treated before it can be
returned to service, or before the operators or maintenance
personnel can use it.



ER is a system within PSM
An ER system has its own theories and thinking
It must be practiced to be done well
ER extends beyond the walls of the company
May you NEVER have a real one


Incident Investigations
Today's roadmap: Advanced Investigations
Corporate Policy
Theories of Incidents/Accidents
Typical Training
What gets investigated
Incident/Accident Causation
Investigating Process Safety Incidents
Action In Case of Incident/Accident
Reporting & Investigation
Recommend corrective actions (if warranted)
A good investigation is likely to reveal several contributing factors,
and it probably will recommend several preventive actions.
What is an accident?
An accident is an undesired event that results in a personal injury or
illness, or damage to or loss of property, process or environment.
What is an incident?
An incident is an event that disrupts the work process and has the
potential to cause injury, harm, or damage to persons, property or

Near misses describe incidents where no property was damaged
and no personal injury sustained, but where, given a slight shift in
time or position, damage and/or injury could have occurred.

Corporate Policy
Despite PSM, there are accidents and near-misses in all industries.
At such times corporate policy and written guidelines to promptly
address the issue and resolve the incident are essential. In addition
to immediate measures to contain the impact and support the
affected employees, it is essential that the policy includes detailed
instructions to report the findings and give recommendations for
identifying and remedying flaws in the system that can produce
catastrophic results.
It is the responsibility of management to ensure that the strategic system
for incident investigation works as intended. Management is
responsible for establishing a consistent means of recording accident/
incident investigation information and disseminating corrective
actions throughout the organization, which will be used to prevent a
recurrence of the same or similar accidents.
Management is also accountable for ensuring the organization takes
action and LEARNS! Management systems need to be developed
which will recognize operational weaknesses and implement
preventive measures. The incident investigation plan should be
developed before any such occurrence to be of any use. Who should
investigate, when, where, what and how; all issues should be
decided right in the beginning.

It is a good idea to impart basic accident investigation training
beforehand. Investigation tools, policies and procedures need to be
planned in advance.
Who should inform, who will form the investigative team, who will
be the spokesperson for outside agencies; all should be decided in
the plan.

Theories of Incidents/Accidents
Reason's Theory (covered earlier)
ABC: Antecedent, Behavior, Consequence
ABC is a simple formula for understanding why a behavior occurs. It helps us
to understand the relationship between the Antecedent, the Behavior and the
Consequence. The antecedent is something that comes before a
behavior (in this case the incident). The incident needs to be
described in a specific operational sense. The consequence that
follows the incident (behavior) is the reinforcing outcome of the
This is a tool that requires observing the event immediately prior to
the behavior (incident) to determine what triggers the incident. This
knowledge can be used to reduce or eliminate problem behaviors by
intervening before or after they occur.

Typical Training
An incident investigation process is crucial to prevent similar
incidents in the future. It is a learning tool. That is why proper
training is essential for the people doing the investigation. Everyone
involved in the investigation process should have a clear
understanding of their part in the process and how to perform
their assigned responsibilities during an investigation. They
should know how to carry out the investigation and the tools used
to do this. They should be aware of the process and know how to
complete incident reports and provide analysis of information
For this purpose, all members who have the potential to become
involved in an investigation MUST be trained. More serious incidents
require more training. Because training time is limited, the training
should match the level of investigation (e.g., if you could only be
involved in the lowest level investigation, you only need to be
trained to that level). The training and technique must be consistent
and should escalate as sophistication increases.
E.G. ABS Consulting http://www.absconsulting.com/

What gets investigated?

The employer shall investigate each incident, which resulted in, or
could reasonably have resulted in a catastrophic release of highly
hazardous chemical in the workplace.
Accidents, of course, should be thoroughly investigated. But incidents
that could have caused a catastrophe should also be investigated. Any
incident that could have resulted in significant damage to life,
property or the environment should be investigated with the same
thoroughness as an accident.

A company should also investigate any suspicious incident or
circumstance that could have had a hazardous impact.

Incident/Accident Causation
The immediate cause of a workplace accident is mostly easy to
determine. However, zeroing in on the system failure that led to the
cause of the accident is tougher. That is the root cause of the
Causal factors are usually multiple. These can be divided into
immediate and system causes.
Immediate causes: actions and conditions (man, machine, material).
System causes: human factors and job factors (management systems,
methods or environment).
Root cause analysis (RCA) is a technique that aims to find the
real cause of a problem and deal with it, rather than just
dealing with its symptoms. Such a finding is important because
correcting it can prevent recurrence of this and similar occurrences.
Normally precursors or antecedents of an accident/ incident need to
be determined. During accident/incident investigation the state of
barriers should be assessed. Consider using WHY questions as
simple RCA.

Investigating an Incident
It is a good practice to establish your system and train people prior
to any incident.

When an incident occurs, activate the team within 24 hours and enter the
incident into the database even if not all data are available. First deal
with the emergencies and ensure medical aid to those who need it. Immediately
after that begin the investigation. Early action, when the incident is fresh in
people's minds, will provide a clearer picture of the incident.
Gather evidence. Identify potential sources of information such as
witnesses and any physical evidence, gather the facts about the
incident, document and preserve evidence.
Once evidence is gathered, carry out correct analysis. This means
thorough and systematic evaluation of the findings to identify root
causes. Such an analysis should include technical aspects as well as
human and organizational factors.
Identify antecedents; determine causes and root causes via proper
technique. Develop findings and recommendations for action and
complete incident reports with potential solutions to prevent
recurrence. Share lessons learned.
Ensure standards are improved/updated

Finding Root Cause

Asking "why" is one method used to explore the cause/effect
relationships underlying a particular problem. When done
thoughtfully this leads you to the root cause of a defect or problem.
Simply ask "Why?" for each cause that has been noted. Keep on
asking "Why did this happen?" until you cannot go any further, until
you get to the root causes. Generally this goal is achieved in five
"why" questions. However, if needed, nothing stops you from going
on. This method is simple, yet effective. It works by itself and can
also be combined with other methods. If there is an existing Failure
Mode and Effect Analysis (FMEA) or Fault Tree Analysis (FTA), you
can use that to guide the WHY discussion.
You could develop a Fault Tree with AND and OR gates.
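As an illustration of a Fault Tree with AND and OR gates, the sketch below computes the probability of a top event from basic events. It assumes that all basic events are independent; the event names and probabilities are hypothetical and are not taken from this eBook.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float  # assumed probability that the event occurs


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]]


def probability(node):
    """Probability of a node, assuming all basic events are independent."""
    if isinstance(node, BasicEvent):
        return node.probability
    child_probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must occur: multiply the probabilities.
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one input occurs: 1 minus the chance that none occurs.
        none_occur = 1.0
        for p in child_probs:
            none_occur *= (1.0 - p)
        return 1.0 - none_occur
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical example: a tank overfills if the level alarm fails AND the
# operator misses the high level. The alarm fails if the sensor OR the
# annunciator fails.
alarm_fails = Gate("alarm fails", "OR", [
    BasicEvent("level sensor fails", 0.1),
    BasicEvent("annunciator fails", 0.05),
])
overfill = Gate("tank overfill", "AND", [
    alarm_fails,
    BasicEvent("operator misses high level", 0.2),
])
top_probability = probability(overfill)
```

Walking the tree from the top event downward mirrors the WHY discussion: each gate answers "why could this happen?" with the events beneath it.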

Investigating Process Safety Incidents

Not all process safety related incidents are equal in their impact on
people, property or environment. Depending on their severity,
incidents are classified into three broad categories (low,
intermediate, high). The levels relate both to the actual severity and
the potential severity. They also should factor in both dollars and
These levels should be consistent with the risk matrix your company
A typical 5x5 risk matrix is attached at the end of this chapter and
is the one discussed in the chapter on risk. (Worksheet attached
with resources)
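A 5x5 risk matrix can be sketched in a few lines. The example below is only an illustration: the 1 to 5 rating scales, the likelihood-times-severity score and the cut-off values are assumptions, not the matrix attached to this chapter. Use your company's own matrix for real classifications.

```python
ORDER = ["low", "intermediate", "high"]


def risk_score(likelihood: int, severity: int) -> int:
    """Score one cell of a 5x5 matrix; both inputs run from 1 (lowest) to 5."""
    for value in (likelihood, severity):
        if not 1 <= value <= 5:
            raise ValueError("ratings must be between 1 and 5")
    return likelihood * severity


def incident_category(likelihood: int, severity: int) -> str:
    """Map a score to low / intermediate / high using assumed cut-offs."""
    score = risk_score(likelihood, severity)
    if score >= 15:  # assumed threshold, not from the eBook
        return "high"
    if score >= 6:   # assumed threshold, not from the eBook
        return "intermediate"
    return "low"


def classify(actual, potential):
    """Use the worse of the actual and the potential rating."""
    categories = [incident_category(*actual), incident_category(*potential)]
    return max(categories, key=ORDER.index)


# Hypothetical incident: minor actual outcome, but severe potential outcome.
category = classify(actual=(1, 2), potential=(3, 5))
```

Taking the worse of the two ratings reflects the point above that levels relate to both the actual and the potential severity.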

Lowest Priority Incident

Such an incident affects only a small portion of the facility, or does
not go outside of the facility. There are no casualties. As there is
limited impact, the team that investigates will most likely be from
that facility, and include participants from within that facility, but
outside of the affected unit. That ensures that the team members
know the relevant details, but also that an
outside perspective is present.

The report should be disseminated to all similar facilities so that
learning is shared.

Intermediate Level Incident

Such an incident affects the whole facility or minimally extends
outside the facility. In this case too there are no fatalities.
However as the reach of the incident includes the entire facility and
maybe outside of the facility to some extent, the makeup of the
team should include higher level personnel. Including individuals
from outside the facility also helps ensure an objective approach.

Serious Incident
The seriousness of such an incident is due to inclusion of a fatality
or major offsite impact. It may impact the reputation of the
company, or could have such an impact.
The investigating team should include senior people from the
location as well as corporate. The team should reflect the expertise
(technical qualifications as well as human resource qualifications) needed to
understand the incident.
The team should know the right questions to ask and have the wherewithal to
understand the true root cause. If conditions warrant, outside
experts should be brought in to ensure impartiality.
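The three levels described above can be summarized as a small decision rule. The sketch below paraphrases the criteria and team make-up from the text; the function name, the parameter names and the "none"/"minimal"/"major" labels are assumptions made for illustration.

```python
from enum import Enum


class Level(Enum):
    LOWEST = "lowest priority"
    INTERMEDIATE = "intermediate"
    SERIOUS = "serious"


def incident_level(fatality: bool, offsite_impact: str, whole_facility: bool) -> Level:
    """Classify an incident using the criteria described in the text.

    offsite_impact is "none", "minimal" or "major" (labels are assumptions).
    """
    if offsite_impact not in ("none", "minimal", "major"):
        raise ValueError(f"unknown offsite impact: {offsite_impact}")
    if fatality or offsite_impact == "major":
        return Level.SERIOUS
    if whole_facility or offsite_impact == "minimal":
        return Level.INTERMEDIATE
    return Level.LOWEST


# Team make-up paraphrased from the three level descriptions.
TEAM_GUIDANCE = {
    Level.LOWEST: [
        "members from the facility",
        "members from outside the affected unit",
    ],
    Level.INTERMEDIATE: [
        "higher level personnel",
        "members from outside the facility",
    ],
    Level.SERIOUS: [
        "senior people from the location",
        "senior people from corporate",
        "outside experts if conditions warrant",
    ],
}

# Hypothetical incident: no fatality, release minimally extends offsite.
level = incident_level(fatality=False, offsite_impact="minimal", whole_facility=False)
team = TEAM_GUIDANCE[level]
```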

Actions in case of an incident

In case of an incident, put your company's accident investigation
plan into action. Receive notification and determine severity. Notify
individuals according to your plan (perhaps notify legal and/or
The first formal communication should go out within 24 hours.
Depending on the severity of the incident establish level of
investigation and corporate response. Corporate response should be
legally correct, well thought out, responsible and reassuring.
The level of the incident will indicate the composition of the
investigation team.
Extend all required support to the team.
Manage any external investigators/regulators.

Assigning the Investigation Team

An incident investigation team shall be established.
At least one employee knowledgeable in the process involved should
be included in the team. This will give additional expertise and
insight, and will also lend credibility to the results. Employees also
benefit as they learn of potential hazards, and the
experience usually makes them believers in the importance of
safety, thus strengthening the safety culture of the organization.
The incident owner and the supervisor should also be part of the
team. The incident owner may be the line manager in the facility
where the incident occurs. This can provide direction to the
investigation. A contract employee must be included if the incident
involved work of a contractor.
Training may be given where necessary.
Other persons with appropriate knowledge and experience to
thoroughly investigate and analyze the incident could be included.
These are employees in
Maintenance and/or Control Experts
Electrical equipment experts
Transportation experts
Briefings to the management should be done as per the plan and
also as per the level of the incident.

Writing the Report

A report shall be prepared at the conclusion of the investigation to
include at a minimum:
Date of incident
Date investigation began
A description of the incident, which gives the accurate sequence of
what exactly happened. The unsafe act or condition that could
have led to the incident also needs to be described in detail. Any other
factors that were considered to have contributed to the incident
are also to be included.

Any recommendations resulting from the investigation, including
immediate corrective action, long-term corrective action, and follow-up
to check whether the corrective actions are in place and whether they are
effective, should be a part of the report. The follow-up is
not strictly the team's responsibility, but management's.
Such a report should be sent to other sites or facilities that share the
same technology.
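The minimum report content listed above can be captured as a simple record with a completeness check. This is a minimal sketch; the class name, field names and the example data are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class IncidentReport:
    incident_date: date
    investigation_start: date
    description: str
    contributing_factors: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """List the minimum items that are missing or inconsistent."""
        found = []
        if not self.description.strip():
            found.append("no description of the incident")
        if not self.contributing_factors:
            found.append("no contributing factors recorded")
        if not self.recommendations:
            found.append("no recommendations recorded")
        if self.investigation_start < self.incident_date:
            found.append("investigation starts before the incident date")
        return found


# Hypothetical draft report that still lacks recommendations.
report = IncidentReport(
    incident_date=date(2024, 3, 4),
    investigation_start=date(2024, 3, 5),
    description="Hypothetical: flange leak during pump start-up.",
    contributing_factors=["gasket of wrong material installed"],
)
gaps = report.problems()
```

A check like this only confirms that the required items are present. It cannot judge whether the root cause was actually reached.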

Reporting and Investigation

According to the OHSA, injuries and incidents have to be reported if
Result in a death
Cause a worker to be admitted to hospital for more than two
Involve an unplanned or uncontrolled explosion, fire or flood
that causes or has the potential to cause a serious injury
Involve the collapse or upset of a crane, derrick or hoist; or
Involve the collapse or failure of any component of a building
or structure necessary for the structural integrity of the
building or structure.
You should suggest a brief, pithy written report. This may need to
be done under privilege. You also need to assess the level of detail
that could be shared with the public.
When communicating the outcomes be sensitive to the personnel
involved (tell them first and privately). Involve them in the
communication if they agree.

Final Thoughts
The investigation is to learn from mistakes and not to assign blame
for what happened. That is the most important aspect of any such
investigation. Such investigations also provide crucial information
that will help develop methods to prevent future incidents. So it is
important that you DO something concrete with what you have
Your procedures and standards should be written in the blood of
those hurt. Also keep your antennae up for any incidents in other
companies that could have happened in yours, and learn from their
mistakes; do not repeat an error! API has a committee called Best
Practices that reviews incidents to provide a forum to share
learnings. That, in and of itself is a best practice.

More Final Thoughts

An effective investigator understands how people think and behave.
Consequently he or she must be able to communicate with a wide
range of people. She/he should be able to encourage people to tell
stories about what has gone wrong and learnings from what went
wrong. It is how we learn!
These learnings are important and should be retained through
procedural and standards alterations and changes.
Management must be diligent, own the process, and take time
asking questions.


Class Discussion: Investigations

In your small groups for 20 minutes discuss an incident
investigation that you are aware of or participated in
Identify what went well
What could have been improved?
Was root cause achieved? How do you know?

Chapter 26:

Audits and Self Assessments

To understand the process of auditing and how it complies with PSM

Understanding Audits
No system in an organization can be declared successful unless it is
proved to be so. The PSM system too needs feedback to continuously
improve the process and achieve excellence. One of the most
important feedback methods is the audit!
Audits and assessments are standard pieces of work done in many
facilities to help assure that work is being done correctly and
completely. Audits do use metrics to evaluate but also use special
protocols that are normally different from standards.
There are different types of audits:
Occupational Health
Project Management
Safety Culture

In this chapter we will study audits exhaustively on following points:
What is an audit?
Why audit?
Types of audits: four levels of audit
What is the Principle involved?
Purpose and objectives of an audit
Audit guidance
Link to TQM, Total Quality Management
Link to Business Excellence

What is an Audit?
Webster: A methodology to examine with intent to verify
Chemical company: Systemic approach to determine position
relative to goal
CCPS: Systematic, independent review to verify conformance
to established guidelines/standards
An audit employs a well-defined process to ensure consistency. Auditors
must be able to reach defensible conclusions.
An audit is a technique used to gather sufficient facts and
information, including statistical information, to verify compliance
with standards. Field observations yield data for determining
performance against established standards.
A compliance audit is a comprehensive review of an organization's
adherence to regulatory guidelines. Audits provide a crucial
management control for Process Safety Management (PSM). Audits
employ protocols and checklists to validate compliance with
regulatory requirements and industry standards. They help ensure
that programs are properly designed and implemented. Audits
identify program deficiencies so that recommendations can be
developed for corrective action.
The audit is to include an evaluation of the design and effectiveness
of the process safety management system and a field inspection of
the safety and health conditions and practices to verify that the
employer's systems are effectively implemented.

Why Audit?
Why should audits be carried out?
Audits are critical to the implementation of any system. PSM too
profits from audits. Basically audits ensure that the metrics set by
an organization and the industry regulatory standards are being
met. That means people are kept safe.
Audits also are a learning tool, for the organization being audited
and the auditors too. Such detailed examinations help in continuous
improvements in the safety processes.
Audits are essential to satisfy regulatory requirements.
Audits assure that the organization is on the right path of progress.
They verify claims made about safety and systems, giving
assurance that the claims are right!
Audits help improve processes and profitability.

Types of Audits
1st party: you assess yourself and your team every day.
Findings are captured. Items to be corrected are placed on a
local list.
2nd party: another site assesses your operation
3rd party: someone external to your company assesses your
4th party: a management systems audit on a group of
managers to assess progress and effectiveness; real outcomes
vs. stated outcomes
Large organizations can perform both 1st party audits as well as 3rd
party audits. That is because large organizations typically have
groups that are dedicated to the audit process for the corporation.
For informal first party audits leaders can ask a series of questions
to employees with respect to PSM, their knowledge, and their
degree of compliance.
The First Party Auditing can be conducted anytime and many times.
Also as the managers are directly involved they become fully
conversant with the PSM standards and can ensure high standards
in their area. Also as these audits are informal, they can check
deficiencies easily and correct them quickly.
A weakness of First Party Auditing is that it might not be rigorous. It
is human tendency to promote the positive and play down the faults
to the detriment of safety. Auditors need to be impartial and must
display strong leadership and commitment in order to conduct
meaningful First Party Audits.
It is sometimes better to conduct PSM audits across areas or units.
For this, knowledgeable subject matter experts can perform audits
outside their own area. This will bring a new perspective to the
Second Party audits are external audits. They're usually done by
customers or by others on their behalf. However, they can also be
done by regulators or any other external party that has a formal
interest in an organization.
Third Party audits are when a company invites outside organizations
such as registrars (certification bodies) or regulators to conduct an
audit. These audits offer an outsider's view and are considered to be
less biased and more objective. However, outside auditors may be
overly aggressive, looking for
something to justify their presence or to make themselves or their
company look professional.
Fourth party audits are a management systems audit on a group of
managers to assess progress and effectiveness; real outcomes vs.
stated outcomes

Audits are Key

Financial audits are the best known, but audits are done
in every business area: finance,
supply chain, sales, engineering, operations and safety/process
Continuous improvement is possible only when the management
and the employees are fully involved. Dr. W. Edwards Deming, the
famous quality guru, provided a simple yet highly effective
technique that serves as a practical tool to carry out continuous
improvement in the workplace. This technique is called the PDCA Cycle
or simply the Deming Cycle. PDCA is an acronym for Plan, Do, Check and
Act. The Deming Cycle provides a conceptual as well as practical
framework for employees carrying out Kaizen activities. The
CHECK part of the Deming Cycle is auditing!
Sophisticated audits require an audit protocol, which is a written set
of requirements and an agreement on how to score what is

PSM Auditing Principles

The only way one can know and understand how one is really
doing is by observing (in the field) and comparing performance
versus established standards.
Proper auditing includes positive feedback on significant
strengths as well as corrective feedback on areas needing


Management System


Purpose of a PSM Audit

The purpose of a PSM audit is to communicate standards. These
standards are already set by the organization and regulatory
bodies, and the audit is a method to know if these standards are
followed. Set standards by themselves cannot achieve excellence,
but they provide a benchmark for the management to see the
effectiveness of their systems.
Audits identify deficiencies in the processes and can zero in on the
root causes. Strengths in the system are recognized and can be
further reinforced. Audits provide feedback on implementation/
effectiveness of programs.
They measure performance against metrics and make


Audit Objectives
Audits and assessments are standard pieces of work done in many
facilities to help assure that work is being done correctly and completely.
Audits should be viewed as an opportunity for the organization to
learn and to improve. Improvement should be carried out where
there is scope to improve; otherwise only audits would have no
How an organization responds to an audit is usually dependent on
how the leader thinks and talks about the audit and its results.

Regulatory Guidance (USA-OSHA)

Employers shall certify that they have evaluated compliance
with the provisions of this section at least every three years to
verify that the procedures and practices developed under the
standard are adequate and are being followed.
The compliance audit shall be conducted by at least one person
knowledgeable in the process.
A report of the findings of the audit shall be developed.
The employer shall promptly determine and document an
appropriate response to each of the findings of the compliance
audit, and document that deficiencies have been corrected.
Employers shall retain the two (2) most recent compliance
audit reports.
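The timing and record-keeping points above lend themselves to a small check. The sketch below computes the latest date for the next compliance certification, keeps the two most recent reports and lists findings that still lack a documented response or correction. The function and class names are hypothetical, and the dates are example data only.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

AUDIT_INTERVAL_YEARS = 3   # "at least every three years"
REPORTS_TO_RETAIN = 2      # "the two (2) most recent compliance audit reports"


@dataclass
class Finding:
    text: str
    response: str = ""       # documented response to the finding
    corrected: bool = False  # documented that the deficiency was corrected


def next_audit_due(last_certified: date) -> date:
    """Latest date for the next certification, three years after the last."""
    target_year = last_certified.year + AUDIT_INTERVAL_YEARS
    try:
        return last_certified.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return last_certified.replace(year=target_year, day=28)


def reports_to_keep(report_dates: List[date]) -> List[date]:
    """Return the dates of the most recent reports that must be retained."""
    return sorted(report_dates, reverse=True)[:REPORTS_TO_RETAIN]


def unresolved(findings: List[Finding]) -> List[Finding]:
    """Findings without a documented response or documented correction."""
    return [f for f in findings if not f.response or not f.corrected]


due = next_audit_due(date(2021, 5, 10))
kept = reports_to_keep([date(2015, 5, 1), date(2021, 5, 10), date(2018, 5, 3)])
```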

Regulatory Guidance (SEVESO DIRECTIVE)

The competent authorities shall organize inspections or other
measures of control proper to the type of activity concerned, in
accordance with National Regulations.

Trade Association Guidance (USA-CMA)

Measurement of performance, audits of compliance, and
implementation of corrective actions.
Accountability for one's commitment (or lack thereof) to the Guiding
Principles of Responsible Care that address process safety cannot be
achieved unless management measures and reacts to the process
safety performance of the affected individuals. Once measurement
systems are in place, management can perform periodic audits,
prescribe corrective actions for areas that need improvement and
support awards for people who have achieved their performance

Trade Association Guidance (USA-CMA) (cont'd)

3.1 Establish a program to verify operating facilities' compliance
with process safety objectives.
Define the physical and organizational scope of the program.

Commit adequate personnel for performing audits.

Coordinate Process Safety Code audits with other regular
audits (e.g., loss prevention, boiler, environmental) to avoid
Establish a system to measure the effectiveness of the audit
Develop lists of corrective measures.
3.2 Verify that corrective actions have been implemented in a
timely fashion.
Assign specific responsibilities for correcting identified
Establish target completion dates along with a resource plan.
Require documentation of actions that resolve audit

Professional Association Guidance (USA-CCPS)

Guidelines for Technical Management of Chemical Process
Safety (Chapter 13 - Audits & Corrective Action) (1989)
Plant Guidelines for Technical Management of Chemical Process
Safety (Chapter 13 - Audits & Corrective Actions) (1991)
Guidelines for Auditing Process Safety Management Systems

Relationship of Process Safety & Risk Management
to Total Quality Management
Organizations choosing to become ISO certified go through a 9-step
process to registration as follows:

1. Management Decision and Commitment

2. Establish and Train Internal Resources
3. Internal Audits Begin
4. Begin Documentation Efforts
5. Choose Registrar
6. Practices Documented and Implemented 70-80%
7. Pre-assessment
8. Registration Assessment
9. Registration

Relationship of Process Safety & Risk Management
to Total Quality Management (cont'd)
As you can see, the auditing process begins in step three and
proceeds through steps seven and eight. Once registered, the
process is just the beginning. Organizations will continue their
internal audits, management reviews and corrective actions. Also,
the registrar will be conducting surveillance audits on an ongoing
basis. In this way ISO 9000 provides a foundation for continuous
improvement and for other quality or business initiatives.
The attached Quality - Process Safety Matrix provides a summary
cross reference between CMA's Process Safety code, OSHA's PSM
Standard and three quality initiatives. There is a strong relationship
between Process Safety Management and Total Quality
Management with each initiative using auditing to drive
organizations towards continuous improvement. Fortunately, the
efforts expended on PSM contribute to TQM and vice versa.



ISO 9002


1. Management Leadership

Management Responsibility

2. Accountability

Responsibility & Authority

3. Performance Measurement

Internal Quality Audits


Human Resources
Strategic Quality Planning
Problem Solving

4. Incident Investigation

Corrective Action



Management for Quality

Employee Well Being
Quality Assessment


Compliance Audits


Problem Solving


Process Management


Incident Investigation


5. Information Sharing

Non-conforming Material


Trade Secrets


6. Community Input

Process Control, Improvement


Public Responsibility
Customer Relationship Mgt.



Operating Procedures


8. Hazards Documentation
9. Risk Assessment
10. Management of Change

Quality System
Document Control
Document Control
Quality Records
Statistical Techniques


Key Control Characteristics

Process Control, Improvement
Process Control, Improvement

Process Control
Document Control: Changes


Key Control Characteristics

Process Control, Improvement





Senior Executive Leadership

7. Design Documentation



11. Siting Impacts

Public Responsibility

12. Codes & Standards

Inspection, Measuring, Test Stat. (4.10)

Process Safety Information


Process Hazard Analysis


Management of Change



Design & Introduction of Products (5.1)

13. Safety Reviews

14. Maintenance & Inspection

Performance Data & Information (2.1)

Process Safety Information


New/Modified Product Approval (4b)

Pre-startup Safety Review


Process Control, Improvement

Mechanical Integrity


Process Hazard Analysis



15. Multiple Safeguards

16. Control in Emergency

Process Control


Process Control, Improvement


17. Skills Identification

Quality System


Human Resources


18. Work Practices

Quality System




19. Training



Human Resources


20. Proficiency Demonstration



Contract Review


Emergency Planning & Response (n)

Employee Education & Training

Employee Education & Training





Hot Work Permit

Operating Procedures






21. Fitness for Duty

22. Contractor Programs

Supplier Quality



Relationship of Process Safety & Risk Management
to overall Business Excellence
Organizations that are successful in achieving overall Business
Excellence typically achieve excellence in every facet and every
activity of the business. For example, one will find that these
organizations have worked to achieve excellence in safety, quality,
customer supply and service, cost, etc. Upon closer examination of
those organizations that have really achieved business
excellence, one will find a common thread running through each
of the different activities that constitute the business. That
common thread includes:
Sound & up-to-date technology
Trained and qualified personnel
Equipment that is maintained and reliable

Effective management of change

Audits - with feedback and control
A focus on doing each task the right way
Thus, the efforts to establish these threads, which are an integral
part of a process safety management program, are strongly aligned
with efforts to achieve overall Business Excellence.

Common thread
Sound & up-to-date technology
Trained personnel
Equipment - Maintained & reliable
Effective Management of Change
Audits - Control & feedback
Do the job the right way


An example of a management system audit


A Typical Management System Approach

In this approach to audit, the management addresses all three
layers of an Audit Program and Process. This is done by:
Reviewing all Corporate Competencies annually
Reviewing Business or Regional Programs Annually on a Two
Year cycle
Observing Four to Six Facility 2nd party Audits Annually
The goal is of completing Company-Wide Review every 2 years.
Such kind of commitment poses substantial challenges in scheduling
these activities, as these clash with immediate revenue generating
An important aspect of PSM audits is to confirm that a management
system is in place to ensure the expectations of the company are
being met. This means that correct functioning of any process is
NOT dependent on any one individual; rather, the system ensures
correct functioning. A proper audit therefore checks that the
systems in place actually do what is needed to confirm compliance
with company policies, as well as governmental rules and
regulations. When gaps are found, they must be risk ranked to
ensure that the areas of highest concern are fixed in priority order.
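The risk ranking of gaps described above can be sketched in a few lines of Python. This is only an illustration: the text does not prescribe a ranking method, so the 1 to 5 severity and likelihood scales, the multiplication rule, and the sample gaps are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    description: str
    severity: int    # assumed scale: 1 (minor) to 5 (catastrophic)
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)

    @property
    def risk_score(self) -> int:
        # Assumed rule: risk = severity x likelihood (a common risk-matrix convention)
        return self.severity * self.likelihood


def rank_gaps(gaps):
    """Return the gaps sorted so that the highest-risk items come first."""
    return sorted(gaps, key=lambda g: g.risk_score, reverse=True)


# Hypothetical audit gaps, for illustration only
gaps = [
    Gap("Training records incomplete", severity=2, likelihood=4),
    Gap("Relief valve inspection overdue", severity=5, likelihood=3),
    Gap("Procedure revision date missing", severity=1, likelihood=2),
]

for gap in rank_gaps(gaps):
    print(gap.risk_score, gap.description)
```

A facility would normally substitute its own risk matrix for the simple product used here.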
The Competencies that should be reviewed every two years are:

Employee Safety
Fire Protection
Occupational Health
Process safety
Product Stewardship

Second-Party Audit Observations

Site A


What observed


Site B


Site C


Do you notice any patterns or common deficiencies?

Most likely due to issues in how the firm is managed.


Exemplary Practices
Make sure that the good news is mentioned first as one reports out
on a management system audit.

Program Deficiencies
List deficiencies AFTER listing what is going well. However, when
reporting deficiencies, be sure to be specific and use clear language
to ensure understanding.
Straight talk helps to assure action!

Management System Audit Areas

Management Support
Team Staffing and Resourcing (funding, time)
Written Procedures
Audit Processes
Finding Documentation Processes
Corrective Action
Quality Assurance

Internal Audit Team

The Internal Audit Team is a group of individuals designated to
perform an internal audit. The audit team is responsible for auditing
selected departments in its own company. The team should have a
combination of skill sets that include technical and industry-specific
The Team:
Composition must be cross functional
Not constrained by appearances

Not constrained by previous audits

The first item on the agenda for an audit team is to make the
membership of the team cross functional, so that it understands all
aspects of the systems it will audit. This may seem obvious, but it
eludes many team leaders.
The second item to consider is that appearances can be deceiving.
The team needs to get facts, not opinions or conjecture, and verify
the facts to be certain.
Finally, the team should not review previous audits prior to its
investigation, so that it arrives at its own findings. That said, after
the review and before the report is written is the time to look
at previous audits for repeats. Repeats are a sign that the
facility is not doing its job properly and should be a finding in and of
itself.
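The check for repeat findings can be illustrated with a short Python sketch. It assumes findings are recorded as short text strings and that a repeat is an exact match after normalizing case and spacing; real audit records would likely need a more robust matching rule. The function names and sample findings are hypothetical.

```python
def normalize(finding: str) -> str:
    """Lower-case and collapse whitespace so trivially different wordings match."""
    return " ".join(finding.lower().split())


def find_repeats(current, previous):
    """Return the current findings that also appeared in a previous audit."""
    seen = {normalize(f) for f in previous}
    return [f for f in current if normalize(f) in seen]


# Hypothetical findings, for illustration only
previous_audit = ["Hot work permits not signed", "MOC log out of date"]
current_audit = [
    "MOC Log Out Of Date",
    "Emergency drill not held",
    "hot work permits not signed",
]

repeats = find_repeats(current_audit, previous_audit)
print(repeats)
```

The comparison is run only after the team has completed its own findings, in keeping with the order recommended above.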

Internal Audit Management Systems

An important aspect of PSM audits is to confirm that a management
system is in place to ensure the expectations of the company are
being met. This means that correct functioning of any process is
NOT dependent on any one individual; rather, the system ensures
correct functioning. A proper audit therefore checks that the
systems in place actually do what is needed to confirm compliance
with company policies, as well as governmental rules and
regulations. When gaps are found, they must be risk ranked to
ensure that the areas of highest concern are fixed in priority order.
Audits exist to add credibility to the implied assertion by an
organization that it is safety compliant. When the audit is
planned, a checklist and procedure are developed, depending on the
PSM elements and the organization's policy requirements. With this
list, it can be verified whether the requirements of the PSM standard and
those of the relevant corporate policy are fulfilled. Review of
written documents and interviews with employees can help the team
determine training effectiveness, knowledge and awareness of the
safety procedures, duties, rules, emergency response assignments,
etc. The actual procedures followed and practices can also be observed,
and their compliance with the required actions ascertained.
The audits help to:
Confirm the internal (policies) and external (regulations)
control requirements for the facility or organization are in place
Evaluate facility's compliance with corporate policies and
government rules and regulations (Company, Country, State,
Determine the applicability of each element's criteria to the
facility's management system
Identify the areas of highest concerns or risks (prioritize)

Internal Audit Assessment Outcomes

Determine that the facility knows what it needs to know in
order to manage their responsibilities effectively.
Determine if the facility has addressed specific issues through
the development of and the implementation of a policy,
program or procedure, as appropriate.
Determine that if they are doing what they say they are doing,
they will be able to sustain operational excellence over time.
These are the most important aspects of the audit. Do the people
at the facility know what they need to know to manage? What
process is in place to ensure, again, that the structure is not
people dependent, but rather is built on a system that is effective
and evergreen?
Audit assessments provide important information to the
management to help ensure that collected data are regulation
compliant. Audits and assessments can uncover deficiencies in
physical facilities, equipment, project planning, training, operating
procedures, PSM elements as well as quality system aspects
applying to more than one project.
Assessments give directions for corrective actions. That is in fact
the whole raison d'etre of the audit. The identified flaws should of
course be addressed as soon as possible. However, this is also an
opportunity to plan, follow up and document the corrective
actions taken. This becomes a benchmark for future
assessments and audits.
Post audit management review is advisable to decide upon
appropriate actions, prioritize the actions, develop a timeframe
for following up, and allocate resources and responsibilities.
The corrective actions may mean minor maintenance, or small
changes in procedures. Here MOC procedures should be used as
appropriate. Some deficiencies may need a complete overhaul of
the procedures. Sometimes there may not be any response
required. Whichever is the case, documenting what actions were
taken and why is mandatory.
Assessments also help the organization to:
Determine that the facility knows what it needs to know in
order to manage their responsibilities effectively.
Determine if the facility has addressed specific issues through
the development of and the implementation of a policy,
program or procedure, as appropriate.
Determine that if they are doing what they say they are doing,
they will be able to sustain operational excellence over time.
Confirm that what they say they are doing in the office is what
they ARE DOING in the field.
Confirm that the structure of the management systems will
withstand the test of time and remain effective.
A process for continuous improvement is in place and followed.

Internal Audit Feedback: Continuous Improvement

Audits are meant to effect improvements wherever required. The
auditor should not be judgmental or confrontational. The auditing
company must understand that:
An audit is NOT a GOTCHA game
An audit IS a mechanism to improve operational and financial
So the auditing organization shall:
Ensure that appropriate guidance documents and background
information are provided prior to the audit.
Ensure that the assessment process is clearly explained to a
Ensure an atmosphere of team work and cooperation
Ensure gaps identified are either anomalies or systemic
Provide solutions
Ensure feedback from facility to leadership

PSM audits need:

Block Diagram of Process or Operations
List of Raw Materials Used and Products Made
Block Diagram of Waste Water Flows and Discharge Points
Plant Safety and Health Rules (e.g. Safety Manual)
List of Process and Non Process Stacks
Copies of Air and Water Permits
Copy of overall operating Permits
Environmental, Health, and Safety concerns
Local Regulations and Ordinances
PSM audits of different plants and operating systems certainly have
some requirements that are common. However, each plant will have
its own unique requirements and features. Hence each PSM audit
needs to be specifically designed to suit a particular organization or
facility. The proper application of PSM procedures requires
knowledgeable and experienced personnel.

PSM audit
PSM audit should have an evaluation of the design and effectiveness
of the process safety management system and a field inspection of
the safety and health conditions and practices to verify that the
employer's systems are effectively implemented and well
The essential elements of an audit program include review of PSM
program details, review of support documentation, conducting the
audit, interviews, evaluation and corrective action, follow-up and
documentation tracking recommendations to closure.

It incorporates a review of the relevant documentation and process
safety information, inspection of the physical facilities, and
interviews with all levels of plant personnel. Then the auditor should
examine compliance with the provisions of the standard and any
other relevant corporate policies.
Each element of PSM is reviewed to determine if the management
system is in place
Establish Management's expectation for confirmation
Field verify that expectation by walking around (MBWA)
Talk to people in field to verify their understanding of element
and compliance
MBWA is management by walking around.
The last point shown is very important. The people in the field need
to know what compliance means and that it is not OK to not
comply. Compliance is simply good business.

Second Party Audit Team Qualifications

Competency & expertise for audit area
Training for Lead Auditor and Team
Prior audit experience
Team has gravitas
Team is informed on local requirements as well as corporate

Common Audit Program Concerns

Audit Quality

o Focus on higher risk vs. administrative items

o Actionable findings
o Audit report quality
o Auditor quality
o Assuring auditors approach as independent but
o Assuring sites view auditors in this manner
o To score or not to score
Action Item Closure
o Effective (foremost)
o On-time and timely
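The "on-time" half of action item closure lends itself to a simple automated check. The Python sketch below is a hypothetical illustration: the status labels, the dates and the action items are assumptions, and it says nothing about whether a closure was effective, which the list above puts first and which still needs human review.

```python
from datetime import date
from typing import Optional


def closure_status(due: date, closed: Optional[date], today: date) -> str:
    """Classify one audit action item by its due date and closure date."""
    if closed is not None:
        return "closed on time" if closed <= due else "closed late"
    return "overdue" if today > due else "open"


# Hypothetical action items: (description, due date, closure date or None)
items = [
    ("Update relief valve records", date(2024, 3, 1), date(2024, 2, 20)),
    ("Retrain operators on permit procedure", date(2024, 3, 1), date(2024, 4, 5)),
    ("Revise emergency contact list", date(2024, 3, 1), None),
]
today = date(2024, 3, 15)  # assumed review date

statuses = [closure_status(due, closed, today) for _, due, closed in items]
for (name, _, _), status in zip(items, statuses):
    print(f"{status:15} {name}")
```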

Link to TQM
TQM is Total Quality Management, which is a kind of quality
The Deming Cycle is the basis of continuous improvement in
any system
Deming Cycle consists of 4 key steps: Plan, Do, Check, Act
The Audit is the CHECK part of the Deming Cycle
Follow Up on Audit recommendations can be part of Act, Plan
and Do steps

Link to Business Excellence

One rarely achieves business excellence without doing many
audits in many different functional areas
Two key business axioms:
o What gets measured, gets done.

o Trust, but verify, always.


Examples of audits
Foreign facility that did not run pollution abatement equipment
Domestic facility that the workers felt did not care about safety
Foreign facility with sulfur emissions
Were these bad situations?
At a very large and profitable foreign facility, the basis for
fast-tracking an expansion of the facility hinged on installing the latest
pollution abatement equipment. The author was the lead auditor on
the project, and indeed the pollution abatement equipment WAS
installed; however, it was not running. The management at the
facility tried to hide this fact from the team, but when confronted
with it, told the team "Good Catch."
Within 6 months that facility manager was retired, as it should be.
Good companies need to run good facilities. Period.
Point two is subtle, but important! The workers at a facility must
know that their safety and wellbeing are the top priority of the unit's
leadership. At one time an entire complex was shut down to safely
work on a small portion of the flare header. The mechanical folks,
being macho men, said they could work on the section safely if no
discharges occurred to the flare line. However, the author did not
agree, and the entire complex was safely shut down and
put in the safe off position. The mechanical and operational folks
understood clearly that the cost of daily operation was high and
worked non-stop to repair the line. The unit was then safely
restarted and no workers' lives were ever in jeopardy. In the long
term, the unit operators knew that safe operation was expected.

The third example is from a very remote facility with sulfur
emissions that the author thought were excessive. These were
within the operating permit, but quite high in the author's
experience. In conversation, the local staff said that the sulfur
emissions were actually good for the environment, since the
surrounding soil was very basic and the sulfur emissions
did two positive things: they added trace nutrients to the soil and
moved the pH of the soil closer to neutral, which would help the
crops being cultivated. The learning here is that pre-conceived
notions need to be either substantiated or corrected by discovering the truth.

On a related note: Regulatory Inspections

An external audit
External regulator historically came in after an incident, too
late for the facility. Efforts are underway to change this.
2007: OSHA Petroleum Refinery PSM National Emphasis Program (NEP)
Nov 2011: OSHA Chemical Facility NEP
o No expiration date
o PSM-covered facilities will be inspected.
NEP compliance should be the minimum PSM objective!
OSHA Enforcement website
The chemical facility National Emphasis Program (NEP) issued by
OSHA is a focused inspection program that includes policies and
procedures to verify compliance with OSHA's Process Safety
Management (PSM) standard (29 CFR 1910.119) at covered

This new NEP is meant to protect workers from the catastrophic
release of highly hazardous chemicals at chemical facilities.
OSHA will attempt to identify the most hazardous process of units
selected for inspection under the NEP based on several factors. The
factors include quantity of chemicals in the process, age of the
process unit, number of workers and/or contractors present,
incident and near-miss reports and other history, input from the
union or operators, ongoing maintenance activities, and compliance
audit findings.
The chemical NEP has no expiration date. NEP compliance needs to
be the minimum objective of organizations.
Practical issues to consider








importantly implemented!
The management and employees should be aware of NEP
PSM documents, files should be easily accessible
Ensure the earlier audit recommendations are fulfilled
Maintain proper schedule for closing such action issues
Arrange for an external audit/regulator and institute required
remedial measures

Role of regulator
Employers should necessarily select a PSM trained individual or
assemble a PSM trained team of people to audit the process safety
management system and program. An outsider may not be entirely
aware of the process.
The outside regulator/auditor does not know the process well

The outside regulator usually comes in after an incident, too
late for the facility
OSHA can fine
CSB cannot fine
Fines do not fix problems; process can
Non-compliance with the PSM standard, or even with one of its
elements, could invite a penalty. However, penalties or fines cannot fix
a problem.

Learning from Incidents and Audits


Learning Objectives
Incident findings and audits are great learning and improvement
tools. Incident findings can help change processes to prevent incidents
from recurring and to turn potential hazards into safe practices.
Audits help in checking actual practices vis-à-vis ideal or standard
practices. This helps in identifying process, equipment and training
problem areas, which can then be addressed straightaway.
Audit and investigation findings are essential to business
improvement. In today's world, if you cannot learn in an organized
way from your experience, your business will pass from the scene

Three Main Points to Cover

Firms must be ready to accept the learnings from their audits and
incident investigations by:
Creating the learning organization
Continuous improvement Work Processes

Creating the Learning Organization

A learning organization is one that facilitates the learning of its
members and continuously transforms itself. Learning organizations
develop as a result of the pressures facing modern organizations to
enable them to remain competitive in the business environment.
According to Peter Senge (1990: 3) learning organizations are:
organizations where people continually expand their capacity to
create the results they truly desire, where new and expansive
patterns of thinking are nurtured, where collective aspiration is set
free, and where people are continually learning to see the whole
For a learning organization it is necessary to set up
The system to manage and categorize the knowledge your
business needs to be competitive, also known as knowledge
A culture that values learning and rewards people for
collaboration, analysis and insight AND sharing
Enough resource in time and money to support employee
growth; a management ethos that values learning as much
as getting the work done

Knowledge Management Systems

Knowledge management (KM) comprises a range of strategies and
practices used in an organisation to identify, create, represent,
distribute, and enable adoption of insights and experiences.[1] Such
insights and experiences comprise knowledge, either embodied in
individuals or embedded in organizations as processes or practices.

Knowledge management systems refer to any kind of IT system
that stores and retrieves knowledge, improves collaboration, locates
knowledge sources, mines repositories for hidden knowledge,
captures and uses knowledge, or in some other way enhances the
KM process.
Usually includes the following:
Designated standards systems: engineering, safety, process
safety, pipe codes, equipment
Communities of experts that maintain and own each
standard, and update their standard based on what happens
inside and outside
Management that directs the organization to use the internal
A Search engine to retrieve learnings

What is Sustainability?
Today's corporations define sustainability as a business strategy that
directs long-term corporate growth and profitability by including
environmental and social factors in the business model. Sustainability
thereby strives to change the way a company does business,
for the better.
The aim is to enhance company and employee value by managing
environmental and social risks and seize opportunities that emerge.
Corporations, universities and the government are all starting to
embrace and implement the concept of sustainability.
Sustainability is a path of continuous improvement, wherein
the products and services required by society are delivered
with progressively less negative impact upon the Earth.

On the following chart sustainability has seven key elements,
and safety, including process safety, is one of them.
(Defined by AIChE Institute for Sustainability, November 04-July 05 Grassroots


AIChE Sustainability IndexTM

What is the AIChE Sustainability Index?
As the concept of sustainability has grown more important, many
companies have discovered a need to measure, track and compare
their efforts in this area. The AIChE Sustainability Index will enable
you to assess your company's sustainability performance with 7 key
metrics that will help you understand how your company's
sustainability efforts are perceived in the community, by your
shareholders, by your customers and versus your peers.
What makes the Index unique?
The AIChE Sustainability Index was developed by engineering and
scientific experts for both engineering and scientific experts and
enterprise managers. Unlike other indices, the AIChE Sustainability
Index benchmarks well-defined performance metrics and indicators,
including EH&S performance, innovation, and societal measures.
The metrics factor technology and innovation into performance data
and enable your company to:
Benchmark your performance among peers
Assess your performance against well-defined metrics on an
on-going basis
Measure progress toward best practices at regular intervals

Access unbiased, expert interpretation of publicly available
technical data
Better understand public perception of your company's
sustainability efforts
These sustainability criteria are:
Drills down, but remains broad enough
Based on public data
Targeted for managers and corporate executives, not
Focused on
o Environmental performance metrics
o Safety performance metrics (workplace, process)
o Product stewardship mgmt system, history
o Value chain management mgmt system
o Sustainability innovation initiatives, tools, results
Social performance and strategic management also covered
o Less than other indexes
Benchmarked to peers and best practices


Elements of Sustainability Index

[Chart showing the seven elements: Strategic Commitment, Environmental Performance, Safety Performance, Product Stewardship, Sustainability Innovation, Social Responsibility and Value Chain Management. Legend: Net Revenue > $10 Billion USD; Net Revenue < $10 Billion USD]

The AIChE SI is composed of seven critical elements:

1. Strategic Commitment to Sustainability
2. Sustainability Innovation
3. Environmental Performance
4. Safety Performance
5. Product Stewardship
6. Social Responsibility
7. Value-Chain Management
These elements are scored based on either quantitative or
qualitative data. Each metric and indicator area is weighted based
on the relevance to the industry sector concerned. The scoring is
designed to take into consideration subjectivity in a transparent
manner. They are meant for the management to manage company
business lines.
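The idea of weighting each element by its relevance to an industry sector can be illustrated with a generic weighted average. The text does not give the actual AIChE scoring formula, scale or weights, so everything numeric in this Python sketch (the scores, the weights and the function name `weighted_index`) is a hypothetical assumption.

```python
def weighted_index(scores, weights):
    """Weighted average of element scores; the weights need not sum to 1."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical element scores on an assumed 0 to 10 scale
scores = {
    "Strategic Commitment": 6,
    "Sustainability Innovation": 4,
    "Environmental Performance": 5,
    "Safety Performance": 7,
    "Product Stewardship": 5,
    "Social Responsibility": 3,
    "Value-Chain Management": 4,
}

# Hypothetical sector weights: safety and environment count double
weights = {name: 1 for name in scores}
weights["Safety Performance"] = 2
weights["Environmental Performance"] = 2

print(round(weighted_index(scores, weights), 2))
```

Keeping the weights in a visible table like this is one way to make a subjective weighting transparent, which is the stated design goal of the scoring.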

Sustainable Firms
Sustainable firms manage their profits, people and the planet. Such
businesses have healthy financial, social and environmental systems
making them change compliant. They create value for themselves
and for their customers today without compromising the tomorrow
of generations to come.
As stated by Financial Times, for industrial development to be
sustainable, it must address important issues at the macro level,
such as: economic efficiency (innovation, prosperity, productivity),
social equity (poverty, community, health and wellness, human
rights) and environmental accountability (climate change, land use,
Safety is essential to sustainability
People have to go home well and unhurt to have a
sustainable firm
Process safety and mechanical integrity must be supported
and operational for a firm to remain viable, as all firms
operate with the permission of those around them

Continuous Improvement
Continuous Improvement as the name suggests is a quality theory
that believes that more improvements are possible all the time by
reevaluating and improving processes and systems. That is the
Kaizen continuous improvement philosophy! It is an ongoing
effort to achieve excellence in either small increments or big

Continuous Improvement
IS part of PSM
Provides a framework to support learnings from audits and
Select the right kind of CI system for the work you are doing
Totally compatible with the PSM system

Types of CI Work Processes

Major types of CI processes:
Statistical Process Control (SPC)
Six Sigma
These will be covered in detail in a later chapter (32)
The AUDIT or CHECK step is essential to any CI work process

Link to TQM
Total Quality Management (TQM) is also continuous improvement.
It applies to every facet of an organization, from management,
systems, and processes to the culture of the company! Such an
organization ensures that processes are done right with maximum
yield and minimum wastage. The aim is to eliminate defects
totally from the overall operations.
TQM is Total Quality Management, which is a kind of quality
The Deming Cycle is the basis of continuous improvement in
any system
Deming Cycle consists of 4 key steps: Plan, Do, Check, Act
The Audit is the CHECK part of the Deming Cycle
Follow Up on Audit recommendations can be part of Act, Plan
and Do steps

No matter what the audit learnings need to learn
Your Knowledge management system must work with your
PSM audit process in order to help maintain the lessons

Chapter 28

The Role of the Regulator


Learning Objectives
Learn the various roles that a regulator plays in the safe execution
of Process Safety Management
What is a Regulator?
A Regulator is a member of a Regulatory body mandated under the
terms of a legislative act (statute) to ensure compliance with the
provisions of the act, and in carrying out its purpose.
Their task is to codify and enforce rules and regulations and impose
supervision or oversight for the benefit of the public at large.
The chemical industry is a regulated industry, and a Regulator will
secure compliance with and enforcement of statutory requirements. The
regulatory agency promulgates benchmarks created to enforce the
provisions of legislation.

Today's Roadmap
Regulator as part of a system
Different types of regulators
How regulators work
How to manage a visit
Common pitfalls
Making regulator visits work for you


Regulator as Part of the System

Some independent regulatory agencies perform investigations or
audits, and some are authorized to fine the relevant parties and
order certain measures. In many countries the Regulator is the
ultimate auditor.
Regulators determine through records, interviews, and direct
observations that certain personnel, facilities, processes, and
operations are in compliance with regulatory standards, and that
the documented procedures are being correctly followed. The
findings are usually tied to penalties, fines, and publicity. The
regulator has many tools at his disposal for enforcement and
compliance, including notices, enforceable undertakings and
Regulators can assist businesses and communities with compliance
training and guidance. There is a growing movement to use
regulators as cooperative allies rather than as enforcers after the fact.
Regulators use public investigative protocols to check for
compliance with standards. Sometimes a regulator may have limited
expertise/experience in certain areas.
They cannot audit every facility and may need assistance to enforce
standards at some of them. They could be subject to political
pressure and may make mistakes. They are human after all and are
open to human frailties!
Regulators can recommend law changes

Different Roles of Regulators

The Regulators have a very important role in establishing,
controlling, inspecting and enforcing safety regulations. A Regulator
is often the first to be contacted when there is an abnormal
situation or a hazard potential.
Regulators inspect and fine
o US Environmental Protection Agency (EPA)
Regulators educate
o National Energy Board of Canada
o UK HSE (Health and Safety Executive)
Regulators focus on improvement
o US Chemical Safety Board recommendations
o VPPPA Supported by US OSHA

How Regulators Work

Generally Regulators are more reactive than proactive. However
there is a positive effort for regulators to educate and train
company personnel in the requirements of the standards and
compliance. There is a growing need for regulators to adopt a
positive and proactive approach towards ensuring compliance by
helping and encouraging organizations understand and meet
regulatory requirements more easily.
They usually respond to regulatory breaches, complaints and calls
and always respond to a workplace fatality.
Regulators give presentations and speeches that outline focus and
targeted areas for improvement or action. They are authorized to
issue fines and play politics.

They engage companies directly for support if need is indicated,
such as when developing a new regulation. They can advise and make
recommendations to the authorities and monitor and enforce
compliance with the regulatory standards.

How to manage a visit

A regulator has to be treated as an ally and not as an adversary.
You should treat the regulator with respect and extend full cooperation.
The corporate office should be notified of the visit and legal
direction should be obtained.
Identify one person to be the visit contact or the liaison. This person
should not be the top manager. However s/he should have authority
to deal with the regulator and give necessary assistance. Regulators
can ask for information from any person they have reason to
believe is able to give it, show documents, etc. So it is advisable to
give them the information they want, as they can subpoena it with
a court order anyway.
Your information needs to be true and factual. Do not resort to
speculation or conjecture.
Note anything the regulator says or asks. The most important thing
is to ensure any promises made are kept and completed on time.

Common Pitfalls
The most injurious action is to treat the regulator as an adversary
and deny due respect.
Denying, hiding or stonewalling information that is demanded is
also a bad idea.
The regulators cannot be denied access to the facility, information or
documents in any way. It is better not to even try to do this.
Failing to listen carefully or to verify statements is another common
pitfall, committed at your own peril.
If there are any previous outstanding audit items, these should be
taken care of, preferably before the regulator's visit. Failing to do so
will ensure a penalty.
Another mistake that is quite common is to have poor or incomplete
documentation on site.

Making Regulator visits work for you

A regulator's visit should be treated as a platform for improving
your compliance with safety standards. Their expertise and experience
can be put to good use if you are open to their advice.
Designate one person as lead contact and assign two people as
support/note taker/arranger, etc.
The regulator will ask questions but may not volunteer information.
Why not ask the regulator questions? Most will be willing to
answer, as the goal you both share is compliance with the regulatory
standards. There may be a better and easier way; you just have to
ask! Be interested in their experience.
Be positive about the work at the facility, but speak frankly and
factually. There is no need to either embellish or fudge anything.
Introduce them to people; be professional.
It is also important to make sure they meet any organized labor
leadership very early in the visit.

Making regulator visits work for you

The regulator visits should not be a one-upmanship game. Let the
regulators carry out their work without hindrance or obstacles.
Make it easy for the regulator; if you make it harder, they will be
back with a team of people and it will be much more intrusive.
Do not buy the regulator a meal or a gift of any sort; it is not right
and they cannot accept it.
Meet your region's regulator BEFORE he/she shows up for an
inspection; a professional relationship always helps.

Regulators play many different roles
Respect your regulator
Get to know your regulator BEFORE they show up at your site!
Regulators are real people!

Tone at the Top

This lesson examines the difference between proactive and reactive
management actions. We will also look into the importance of tone
at the top and how it influences corporate PSM messages.
With this lesson you will have a better sense of the role of the
managers in creating a tone of "how we do things around here" and
"what you had better pay attention to."

Today's Roadmap
What is a vision?
Impact of corporate commitment on PSM
Discussions on tone at the top
Proactive and reactive
Looking at key communications

Tone at the top!

Let us first look at some definitions to understand tone at the top.
Leadership is the ability to get others to want to do what we want
them to do, willingly and to the best of their ability, without
Ethics refers to behavior that conforms to an accepted set of
principles or values (such as accountability, compassion, honesty,
integrity, responsibility). Ethics means choosing to do the right
thing, the right way for the right reasons.

Tone (at whatever level) is another word for the informal culture of
the organization: the shared understanding of "how things really
work around here," irrespective of formal rules and policies.
Such tone, ethical or otherwise, is set at the top by the top
management and trickles down through all levels to the last employee.
Tone at the Top is about creating a culture where everyone has
ownership and responsibility for doing the right thing, because it is
the right thing to do. Even if there are important rules and
regulations for safety, if the management is firm and walks the talk,
then the correct tone is set. That is what builds the integrity of the
Rules and written procedures do not build integrity. It comes from the
top, when the top people are seen to follow the rules and are seen
to be concerned about safety. Embedding systems and processes to
support the Tone from the Top will help shape the organizational
culture and measure the effectiveness of leadership actions and
behaviors over a period of time.
When the top managers uphold ethics and integrity, so will
employees. However, if they appear blasé about ethics and more
concerned about production and profits, then the employees will
take their cue! So it is advisable to set the right tone at the top.
The Treadway Commission used that phrase for the first time
vis-à-vis financial reporting. Its study concluded that a company's
culture is causally linked to a company's misbehavior and
emphasized that a company's leaders must create a culture that
promotes appropriate business conduct.
Consistent and frequent communications and oversight and
monitoring of decision making are the key drivers to the
implementation of safety culture.


What is Vision?
Vision is a picture of the future the company wishes to create. It is
what the company wants to become, where it wants to be. These
are the long-term goals of a company.
Creating a corporate culture that aligns with the values of all
stakeholders (employees, customers, shareholders and society) is
the critical issue for business in the 21st century. Cultural capital is
rapidly becoming the new frontier of competitive advantage.
What is the culture of a company? It is best defined as "the way we
do things around here." It has a deeper connotation: "how we
behave when no one is looking."

Culture usually trumps procedures every time.

What it means is that there may be excellent processes in place to
deal with situations, but if these do not reflect the culture then
these processes will be ineffectual. This happens because people are
vastly different and totally unpredictable. They react to a situation
based on their values and company culture and these reactions may
not be the same. So processes may be in place, but culture trumps
procedures every time.
The primary requirements are, of course, consistent communication
and education, ongoing awareness efforts and lots of feedback to
employees about process safety. Realigning process and procedure
can be quick and even easy, but it takes a long time to change
culture, and the effort has to be top-to-bottom.

Why do people take short cuts?

Even if there are set and defined safety procedures, people will try
to take short cuts. Especially when a decision is too complex, people
adopt simplifying strategies to make the decision easier. Safety is
the first consideration but when there are no incidents for a long
time, complacency too makes people go in for short cuts. That is
why accidents happen: people do things that they should not
be doing.
Short cuts may show up as unsafe decisions. People do not follow
rules and procedures; sometimes production pressure is the culprit.
Sometimes short cuts save time and are easier to do. Sometimes
people are not thinking or are tired, and sometimes, even when
unsafe behavior is noticed, nobody points it out for various reasons.

What does a good culture look like?

In a good corporate culture there is a clear, sound action plan to
achieve the defined objectives. There is a proper management
team committed to people and in whom the people trust! The
people too are hard working, committed and feel free to voice their
opinions. Safety cultures develop joint responsibility for safety
among individuals, from management to employee.
Management's decisions are unwavering and the
communication is open. Whenever problems occur, a consensus
decision is taken quickly.
From the employees' point of view, they are more involved, feel safe
to be open and have clear responsibilities and boundaries. They feel
responsible for their work.
The best indication is that people look forward to coming to work.

What does a bad culture look like?

A "bad" work culture is one where people cannot fulfill their desires.
It is the opposite of the above. There is no open communication.
There are no clear guidelines, no quick decisions, and no defined
responsibilities or boundaries.
Employees come to work only because they get paid!

How is management involved in culture?


Impact on PSM Work Session 1

Discuss in your small groups for 10 minutes:
Who decides what engineers work on?
Who sets the overall direction for the company?
What happens that causes senior execs to make less than
optimal decisions around PSM?
Appoint a spokesperson and be prepared to share the answer to the
last question

Culture Ladder
Achieving World-Class safety performance requires a culture shift
and the involvement and ownership of all employees.

What is a safety culture?

Safety Culture is when individual and public safety at a workplace
governs all the procedures and processes. It is where every
employee at every level in the organization feels responsible for

Safety Culture Ladder

The best way to understand our culture is in terms of an
evolutionary ladder. Each level has distinct characteristics and is a
progression on the one before.

Looking at it like this provides a route map, where every team or
company has a certain level of cultural maturity and can see which
rung of the ladder they are on, where they have been and what the
next step looks like.
The range runs from the Pathological, through the Reactive to the
Calculative and then on to Proactive and the final stage, that we call
the Generative.
Pathological is where people don't really care about Safety, let alone
Health and the Environment, and are only driven by regulatory
compliance and/or not getting caught. We probably all recognize
this from the past, but it is something we have hopefully moved
beyond.
Reactive is where safety is taken seriously, but it gets sufficient
attention only after things have already gone wrong. People say
things like "it's a dangerous business," or "you have to understand
it is different here," "you have to look out for yourself," or "those
who have the accidents are those who cause them."
At the reactive level managers take safety seriously, but feel
frustrated about how the workforce won't do what they are told. If
only they would do what they are supposed to, we need to force
The next level, Calculative, is where an organization is comfortable
with systems and numbers. The HSE-MS has been implemented
successfully and, because HSE is taken very seriously, there is a
major concentration upon the statistics: bonuses are tied to them,
and contractors are rated in terms of their safety record, not just
because they are the cheapest. Lots of data is collected and
analyzed, and we are comfortable making process and system changes.
There is a plethora of audits and people begin to feel they have
"cracked it." Nevertheless businesses at this level still have fatalities
and are surprised when these occur.
Proactive is where Shell EP is aiming to be. It is moving away from
managing HSE based on what has happened in the past to really
looking forward. Not just working to prevent last week's accident, it
is starting to consider what might go wrong in the future and take
steps before being forced to.
Proactive organizations are those where the workforce starts to be
involved in practice, as well as in management statements of intent.
Unlike the Calculative, where the HSE department still shoulders a
lot of the responsibility, in Proactive organizations the Line begins to
take over the HSE function, while HSE personnel reduce in numbers
and provide advice rather than execution. Indicators become
increasingly process-oriented ("are we doing the right things?")
rather than just focused on incidents ("have we had any accidents?").
It is quite simply about creating an environment that encourages
the behaviors and beliefs that will deliver lasting improvements in
our performance, both in HSE and beyond.
As an organisation climbs up the ladder, the level of
informedness and trust increases, with people offering to accept
accountabilities ("you can count on me") rather than just being told
they will be held accountable for some outcome. Informedness is
about managers knowing what is happening in their organisation and
where all the problems are, and the workforce knowing exactly what
managers expect, with no mixed messages. As managers and workers
are aligned, this builds two-way trust. Because people know what is
expected and are trusted to do it, there is less need for bureaucracy,
audits and supervision, so the workload decreases after the
Calculative stage.
Generative organizations set very high standards and attempt to
exceed them rather than be satisfied with minimum compliance.
They are brutally honest about failure, but use it to improve, not to
blame. They don't expect to get it right, they just expect to get
better. Management knows what is really going on, because the
workforce is willing to tell them and trusts them not to over-react
on hearing bad news. People live in a state of chronic unease, trying
to be as informed as possible, because it prepares them for
whatever will be thrown at them next.

DuPont Bradley Curve

Achieving World-Class safety performance requires a culture shift
and the involvement and ownership of all employees. The DuPont
Bradley curve is a roadmap of how to change the organizational
culture from average-safe to highest-safe level, from low maturity
level to highly mature safety culture. It shows the shifts in mind-set
and actions required to bring about this change in culture.

Reactive Stage is the lowest maturity level. Here people do not take
responsibility. They believe that safety is more a matter of luck than
management, and that accidents happen. And over time, they do.
The second stage is the Dependent Stage where safety is just a
matter of following rules that someone else makes. Accident rates
decrease and management believes that safety could be managed
if only people would follow the rules.
The next is the Independent Stage. Individuals take responsibility
for themselves. People believe that safety is personal, and that they
can make a difference with their own actions. This reduces
accidents further.

The most evolved stage is the Interdependent Stage where
teams of employees feel ownership for safety, and take
responsibility for themselves and others. People do not accept low
standards and risk-taking. They actively converse with others to
understand their point of view. They believe true improvement can
only be achieved as a group, and that zero injuries is an attainable
goal.

Impact on PSM Work Session 2

Discuss in your small groups for 10 minutes:
How does the role of the manager change when the organization
works at the different levels of culture as shown in the Keil Centre
and DuPont Models?
Appoint a spokesperson and be prepared to share your answer

How managers behave

1. What do Managers do when they see an unsafe behavior?
2. Is Safety communication open and honest?
3. Is the workforce involved in solving safety issues?
4. How are individual and team competencies assured?
5. How do managers balance safety and production?
6. Are contractors integrated into the working environment?
7. Are safety programs (e.g. driving safety, behavioral safety)
adapted to the local culture?
8. Does your Manager listen to your ideas for improvement?
9. How is maintenance actually carried out?
10. Does my team leader trust me and respect me?

Remember these questions: how many of these questions are
affected by what the manager does?

How can you turn reactive actions and thinking into proactive?
What will it take in your work experience to do so?

Take an example from this class.


Proactive and Reactive

Proactive is an adjective meaning serving to prepare for, intervene
in, or control an expected occurrence or situation, especially a
negative or difficult one; proactive is anticipatory: e.g. proactive
measures against crime.
Reactive, on the other hand, means acting in response to a stimulus.
Proactive management means thinking of the future, anticipating and
planning for change or crisis. Reactive management means reacting
to change or crisis after it happens.

In process safety, is there an advantage to being proactive vs. reactive?

In process safety there is indeed an advantage in being proactive
rather than reactive. That is because here the whole idea is to
prevent accidents and incidents from happening and to do whatever
is necessary to achieve this, such as identifying hazards before they
blow up into incidents or accidents and taking the necessary actions
to reduce the safety risks.


The reactive (or traditional) safety management approach is useful
when dealing with technological failures or unusual events.
A workplace can go from hoping another incident doesn't occur to
actively eliminating hazards and preventing incidents. This is
possible with proactive management.
When we are reactive, we're one step behind. We've not seen that
issue or need, and we're not even aware that there's a problem.
Conversely, proactive is one step ahead. It's actively looking for
issues or needs and correcting them before an incident occurs.
In safety and health, a reactive response occurs after an incident
and aims to rectify the problem or minimize the costs. Eventually
the cost of reactive management is more than that of proactive
management. When the management takes steps only after an incident,
a negative message is sent to employees, as reactive programs kick
in only after an accident has occurred.
On the other hand, a proactive attitude to an incident or accident is
most rewarding. Looking for problem areas and fixing them to
prevent accidents sends a message that the management is keen
on employee safety. This approach is always less expensive in the
long-term as a result of fewer accidents and injuries.

Proactive and Reactive

Some things to think about:
How come we never have the time and money to fix it beforehand,
but we always have the time and the money to fix it or investigate
it after the fact?
Many roads in life have been paved with good intentions.

Why do you think these things happen?


Corporate Commitment Statements

DuPont's SHE Commitment:
Johnson & Johnson's Credo:
Are these statements reactive or proactive? Why?

Key communications
Top management should communicate its support for safety culture
to the organization's supervisors and employees.
This support can be reiterated by including safety issues and
policy in ongoing communications.
Messages by the Head of the agency to all employees expressing
commitment to safety first in their organization
Incorporate the safety first message in all agency publications such as
brochures, newsletters, posters, etc. Also talk about safety in internal
presentations and training sessions.
Expression of support by Program Directors at their supervisory and
staff meetings and messages to their employees. Ensure continuous
communication to managers, supervisors and employees. This can
be done by orientation programs, training sessions, staff meetings,
written materials.
Encourage managers' and supervisors' expression of support and
commitment to safety through messages to all workers.
Important communication elements for safety culture:
Message from the Board of Directors
Periodic safety policy statements on its importance
Brochure detailing safety measures to all employees from the
The company Intranet should have a separate section for
safety related information
Messages from Managers to their employees either verbally or
in written materials

Culture Eats Strategy for Breakfast
Culture always wins
Managers MAKE the culture by how they behave and what
behaviors they TOLERATE!
Proactive behavior is more successful than reactive behavior
in solving problems, always!

Question: In the practice of process safety management,
does it ever pay to be reactive over being proactive, and why?

Write a maximum two-page paper, and include examples to
support your argument.


Chapter 30

Safety Culture

Understand the role that culture plays in the ability to safely
execute a PSM system
Understand the components and the various ways of
measuring culture
Culture eats strategy for breakfast

At the end of the day, you will be able to:

Know a few key things to look for when visiting/assessing
safety culture
Know the various levels of safety culture and the general
behaviors that each level represents

Today's Roadmap
What is culture?
Impact of culture on PSM
Examining two models of safety culture
Some key behaviors to look for
Maintaining a good Safety culture


What is Culture?
Culture is described in various ways. For an organization, it is how
they do things. It is an intrinsic quality that can be observed. It also
signifies the shared beliefs, symbols, behaviors of the people of the
organization, and written and unwritten rules that have been
developed over time and are considered valid. It can have a potent
effect on a company's wellbeing and success. It includes an
organization's expectations, experiences, philosophy, and values
that hold it together.
Culture is "how we behave when no one is looking."
Culture usually trumps procedures. You may have immaculate
procedures and processes in place, but if the culture is laid back and
slack, finesse in processes won't make a difference. The leader may
be a visionary and the strategy may be brilliant, but will it work if not
supported by a good culture? Why do people take short cuts?
A good culture is motivated, inspired and self-driven. The workforce
is creative and innovative. The culture is positive and sustainable.
The employees are engaged; that means they are emotionally
committed to the organization and its goals.
In bad cultures creativity is stifled and the workforce is not motivated.
People are stuck in the daily grind and the demands of productivity
stress them out!
How does this happen? The management is ultimately responsible.
New ideas and change are not welcome. The people therefore are not
engaged and productivity suffers!

DuPont PSM Model

Management leadership and commitment, which defines the core
value of safety necessary for implementing and maintaining strong
PSM programs, is shown at the center of the PSM Wheel. The main
features of the PSM program are arranged by Technology, Personnel,
and Facilities, separated into the essential 14 elements around the
spokes of the wheel.
Operational excellence is achieved through operational discipline,
which is shown as the rim of the PSM Wheel. This implies that such
discipline connects all of the 14 elements and translates the
required managing systems into real results for preventing injuries
and incidents.
DuPont PSM Model Works mainly because:
The center of the wheel is Management leadership and
commitment. Thus process safety is the Core Value
A robust Managing System that identifies, evaluates and
mitigates process risks at all stages of a facility's life cycle
Operational Discipline encircles all the technical elements
A single governance process
Integrated into all business processes
Flexible and adaptable to many industries

Impact of Culture on PSM

Which elements of PSM have something to do with
culture, if culture is how we do things when no one is
looking?
What is the role of values in this discussion?
What is the role of leaders and managers?
Is culture applicable in other industries?
The 14 elements encompass three key features of any
manufacturing process: people, technology and facilities.
It is the elements that have to do with people that have something
to do with culture. These elements are training and performance,
managing contractor safety, incident learning and prevention,
emergency planning and response, and conducting operation
integrity audits.
Values are important for any discussion on culture. Organizational
values guide an organization's thinking and actions; they explain what
is important in the people's minds. Values are where a cultural
change begins!
Leadership is by its very nature imbued with power over others.
Leaders can influence others. Ethical leadership can make everyone
in the organization do the right thing for the right reasons. For this
to happen leadership is required. Only ethical leaders can promote
an ethical organization.


Culture Ladder

Safety Culture Ladder

The best way to understand corporate culture is in terms of an
evolutionary ladder. Each level has distinct characteristics and is a
progression on the one before.
Looking at it like this provides a route map, where every team or
company has a certain level of cultural maturity and can see which
rung of the ladder they are on, where they have been and what the
next step looks like.
The range runs from the Pathological, through the Reactive to the
Calculative and then on to Proactive and the final stage, that we call
the Generative.
Pathological is where people don't really care about Safety, let
alone Health and the Environment, and are only driven by
regulatory compliance and/or not getting caught. We probably all
recognize this from the past, but it is something we have hopefully
moved beyond.
Reactive is where safety is taken seriously, but it gets
sufficient attention only after things have already gone wrong. People
say things like "it's a dangerous business," or "you have to
understand it is different here," "you have to look out for yourself,"
or "those who have the accidents are those who cause them."
At the reactive level managers take safety seriously, but feel
frustrated about how the workforce won't do what they are told. If
only they would do what they are supposed to, we need to force
The next level, Calculative, is where an organization is comfortable
with systems and numbers. The HSE-MS has been implemented
successfully and, because HSE is taken very seriously, there is a
major concentration upon the statistics: bonuses are tied to them,
and contractors are rated in terms of their safety record, not just
because they are the cheapest. Lots of data is collected and
analyzed, and we are comfortable making process and system changes.
There is a plethora of audits and people begin to feel they have
"cracked it." Nevertheless businesses at this level still have fatalities
and are surprised when these occur.

Proactive is where you should ideally be. It is moving away from
managing HSE based on what has happened in the past to really
looking forward. Not just working to prevent last week's accident, it
is starting to consider what might go wrong in the future and take
steps before being forced to. Proactive organisations are those
where the workforce starts to be involved in practice, as well as in
management statements of intent. Unlike the Calculative, where the
HSE department still shoulders a lot of the responsibility, in
Proactive organizations the Line begins to take over the HSE
function, while HSE personnel reduce in numbers and provide
advice rather than execution. Indicators become increasingly
process-oriented ("are we doing the right things?") rather than just
focused on incidents ("have we had any accidents?"). It is quite simply
about creating an environment that encourages the behaviors and
beliefs that will deliver lasting improvements in our performance,
both in HSE and beyond.
As an organisation climbs up the ladder, the level of
informedness and trust increases, with people offering to accept
accountabilities ("you can count on me") rather than just being told
they will be held accountable for some outcome. Informedness is
about managers knowing what is happening in their organisation and
where all the problems are, and the workforce knowing exactly what
managers expect, with no mixed messages. Because managers and
workers are aligned, this builds two-way trust. Because people
know what is expected and are trusted to do it, there is less need
for bureaucracy, audits and supervision, so the workload decreases
after the Calculative stage.
Generative organizations set very high standards and attempt to
exceed them rather than be satisfied with minimum compliance.
They are brutally honest about failure, but use it to improve, not to
blame. They don't expect to get it right, they just expect to get
better. Management knows what is really going on, because the
workforce is willing to tell them and trusts them not to over-react
on hearing bad news. People live in a state of chronic unease, trying
to be as informed as possible, because it prepares them for
whatever will be thrown at them next.


Measuring culture



[Figure: elements of safety culture. Legible labels: Team Leaders; Trust and ...; Attitudes to Risk; Local Culture; ... towards Rules; Production vs ...; Learning Culture; Two Way ...]




Elements of Safety Culture

The assessment provides an opportunity for the organization to look
at factors more deeply and to better understand priorities for
Once we had identified the key components of a robust safety
culture, we refined an assessment and improvement tool, which
provides very useful information as established in a series of

The assessment tools have been developed with support from
specialized researchers in the area, statistical analysis and input
from best-in-class organizations. Two assessment and improvement
tools have been developed, one for safety culture and one for
leadership; both are aligned in terms of content, format,
administration, etc.

The tools have been adapted to be culturally neutral and
consider not only safety aspects but also organizational factors
like country cultures, relationship between leaders and employees

DuPont-Bradley Curve
Achieving World-Class safety performance requires a culture shift
and the involvement and ownership of all employees.

This curve basically maps how the culture of the organization
impacts the safety of people, processes and productivity. The safety
culture depends on the maturity of the people towards safety. The
DuPont Bradley curve describes four stages of culture maturity:
Reactive, Dependent, Independent and Interdependent.
In the Reactive stage, people do not take responsibility for safety.
Safety is attributed to luck and not management. "Accidents are
bound to happen" is the attitude. The Safety Manager looks after
safety and compliance with rules and regulations. Top management is
not actively involved and safety is relegated to a lesser issue.
Unfortunately such a lax attitude affects productivity and
profitability too, which are not at their best.
Management commitment begins at the Dependent stage.
Safety now becomes a responsibility of the supervisors. However,
the emphasis is on discipline, and following rules and procedures.
There is no active involvement, though necessary safety training is
provided. Safety compliance is due to fear of reprisal and because it
is an employment condition. However, at this stage, because of
safety awareness, productivity and profitability improve to an extent.
Accident rates decrease and management believes that safety could
be managed if only people would follow the rules.
The next stage is the Independent stage where individuals
become personally involved in safety. The management ensures
that employees have a thorough knowledge of safety issues and
methods. Individuals become committed to safety and follow safety
standards because they believe that they can make a difference to
safety with their own actions. The accident rates go down further
and profitability and productivity climb higher.
Now the organizations and people are ripe for the Interdependent
stage. Here safety is no longer an individual issue but each person
feels responsible for their own as well as others' safety. They
encourage others to conform to safety initiatives. They have an
active safety network and feel proud about their safety endeavors.
This is when the accident rate approaches zero and the productivity
and profits are at their best!
An organization can follow the DuPont-Bradley curve to achieve the
highest levels of safety. By understanding the mindset behind each
successive safety culture stage, it can embed the safety culture and
sustain improvement in safety and productivity!


Key behaviors to look for

Culture of an organization can be observed in various facets. The
answers to these questions below will give key insights into the
culture of an organization.
What do Managers do when they see an unsafe behavior?
Is Safety communication open and honest?
Is the workforce involved in solving safety issues?
How are individual and team competencies assured?
How do managers balance safety and production?
Are contractors integrated into the working environment?
Are safety programs (e.g. driving safety, behavioral safety)
adapted to the local culture?
Does your Manager listen to your ideas for improvement?
How is maintenance actually carried out?
Does my team leader trust me and respect me?

A Good Safety Culture

These are the important aspects of a good safety culture:
Stand on firm ground - clear values
Be paranoid: never be satisfied with current performance
Do not tolerate late or overdue PSM critical items
Look for trends and patterns in all incidents
Follow procedures
Leadership listens and is willing to pitch in
Everyone works together
Supervision genuinely cares about their reports

Culture Eats Strategy for Breakfast
Culture always wins
The current business climate makes having a good safety culture
more difficult, but not impossible
Culture requires solid work processes and effective rewards
A healthy organizational culture is made of various factors such as
tradition, mission, committed workforce, due recognition of merit,
and continuous improvement. It is said that a great strategy keeps
people in the game, but a great culture helps an organization win.

Review the following incidents, and document the key
elements of safety culture that were weak:
o Occidental - Piper Alpha, UK North Sea
o Nypro - Flixborough, UK
o NASA - Columbia Shuttle
o BP - Texas City
o BP - Deepwater Horizon

Note areas of common safety culture issues and analyze the
Place each of these five organizations on the Keil Centre
ladder. Explain your placement with a short paragraph.

Chapter 31

The Role of Management in PSM

Management - [man-ij-muhnt]
1. The act or manner of managing; handling, direction, or control.
2. Skill in managing; executive ability: great management and tact.
3. The person or persons controlling and directing the affairs of a
business, institution, etc.: The store is under new management.
4. Executives collectively, considered as a class (distinguished
from labor).
Origin: 1590-1600; manage + -ment
Leadership - [lee-der-ship]
1. The position or function of a leader, a person who guides or directs
a group: He managed to maintain his leadership of the party
despite heavy opposition. Synonyms: administration, management,
directorship, control, governorship, stewardship, hegemony.
2. Ability to lead: As early as sixth grade she displayed remarkable
leadership potential. Synonyms: authoritativeness, influence,
command, effectiveness; sway, clout.
Origin: 1815-25; leader + -ship
As you can see, these two definitions are mostly interchangeable.
Some people see a difference between the two terms, but on close
inspection they should be interchangeable.
What is safety leadership?
A leader is meant to influence others to achieve objectives and
goals, so that they actually want to accomplish them.

A safety leader believes in the value of safety, promotes safety, and
can influence others to also believe in safety first!
The purpose of Management Leadership with respect to
Process Safety Management
The purpose of Management Leadership with respect to Process
Safety Management is to do essentially just three things:
1. Proactively put management systems in place that prevent
process safety related incidents.
2. Take an ACTIVE role in ensuring that the systems are properly
being utilized and followed.
3. In the event of a breakdown in the system and a process
safety related incident does occur, to immediately determine
the root cause of the failure and modify the system in place to
prevent a reoccurrence.
These steps sound simple but, as recent lapses make evident, they are
actually quite complex and demand management's and leadership's
constant and full attention.
The following points should be a part of every system in place for
each of the 14 elements of PSM.
Proactive Management
In Proactive Management, the following questions are asked for
every PSM element, and answers are demanded and given:
1. Is responsibility assigned for developing the element's
management program?
2. Is responsibility assigned for training personnel in that
element's program?
3. Is responsibility assigned for maintenance of that element's
program?
4. Have minimum qualifications for assignment of program
maintenance personnel been defined?
5. Is a process in place to ensure new personnel understand the
status and priorities of the element's program?
6. Are specific tasks associated with the element's program,
identified as performance objectives for individuals, assigned
to support the program as part of those individuals' roles and
accountabilities?
7. Is background instruction in your company's Program
Management provided to the assigned central staff?
8. Is the element's program consistent with the overall
company's plan in terms of:
Guiding Principles
These are not all-inclusive points, but suggest the tone that should
be set in establishing a proactive program. Clearly the intent is to
make the program consistent in all locations based on one common
set of guidelines established by senior leadership in the company.
Another point is that local management reviews and has input to
the program on a continuing basis to ensure their knowledge of the
workings of the program.
Active Management
1. The Board of Directors has a process in place that periodically
reviews the status of PSM performance, both leading and
lagging indicators.
2. A member of the Board of Directors has oversight and
accountability responsibilities.
3. A process is in place to keep Business Unit/Group
management aware of the local unit's activities.
4. The local management team periodically reviews and provides
input into the element's program.
5. Management By Walking Around (MBWA) is actively practiced.
6. Training is a continuous practice and is improved as the need
arises.
Examples of Management / Leadership to support Process
Safety from a company that excelled in Process Safety
Example 1
An audit/assessment group that reported to the Board of Directors
of the Corporation was required to make assessments and report
findings to the Board. The selection process was risk ranked and
designed to cover all facilities in a very systematic manner. Each
assessment was an Environmental, Health, and Safety review of the
facility. It covered the 14 elements of Process Safety Management
and the management systems that provided the process of governance.
A typical assessment was led by a project manager and supported
by five to 15 subject matter experts (SME). The assessment
process usually took one to two weeks depending on the size and
complexity of the facility. The process followed a prescribed audit
protocol that guaranteed a consistent starting point.
During the assessment process, as items were uncovered or
discovered, the individual SME would pursue them as deemed fit.
Finally, a closing meeting was held with the entire facility team and
the assessment team to review what was found to be a best practice
and which items did not meet minimum standards. The typical
expectation was that deficient items would be resolved within two
years.
After the process was complete, the Project Manager would review
the previous assessment's findings to look for "Repeats". Repeats
were, and should be, absolute NO-NOs. A finding of non-compliance
during an assessment would be listed in the final report for that
assessment, and the individual facility would then be charged with
correcting the issue, including its root cause. A repeat meant that
the same issue was found again during the next assessment. It
reflected badly on the management of the facility and could
impact promotability or even security in the position. Reports of
repeats were sent up through the facility's organization and were
even periodically reviewed by the Board.
Three repeats, if found, were an almost certain path to dismissal
from the company. It should be noted that this process pre-dated
the review process inherent in the PSM regulation and was even
more stringent than that required by the regulation.
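The check for repeats described above amounts to comparing the findings of two consecutive assessments. A minimal Python sketch follows. The finding identifiers and the function name `find_repeats` are hypothetical and chosen only for illustration; a real program would also need a rule for deciding when two findings describe the same issue.

```python
def find_repeats(previous_findings, current_findings):
    """Return findings of non-compliance that appear in both assessments."""
    return sorted(set(previous_findings) & set(current_findings))


# Hypothetical finding identifiers from two consecutive assessments.
previous = ["MOC-03", "PHA-11", "MI-07"]
current = ["MI-07", "TRN-02", "MOC-03"]

repeats = find_repeats(previous, current)
print(repeats)
```

Any identifier in the result is a repeat in the sense used above and would be reported up through the facility's organization.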
Example 2
One very large non-US facility was assessed by one of the
Compliance teams and a major piece of pollution abatement
equipment was found to be intentionally not running on two of the
five trains. The pollution abatement equipment was specifically
required to run in the operating permit for the facility. The
operating permit itself specified a penalty of $1,000,000.00 (US) per
day for not running it. The facility nevertheless chose NOT to run
it, while telling the assessment team that it WAS running.
When the team verified that the abatement equipment was NOT
running, the president of the subsidiary was notified and requested
to be at the closing meeting. Shortly thereafter the head of that
facility elected to take early (very early) retirement.
Good companies have management systems like this in place and
fully use them to demonstrate principled operation.
Example 3
On a positive note, the Board of Directors made a visit to a facility,
and one of the agenda items was to go to a unit and listen to
workers from that unit describe to the Board how they had improved
the mean time between failure of their rotating equipment. This unit
had at one time held the worst standing in the facility but, with
very simple but specific steps, turned that worst standing into the
best. After the presentation to the Board it was hard to see any of
the workers' feet touch the ground. The pride they took in their unit
was palpable. The point here is that management can encourage
desired behavior and, when this is done well, it drives PSM
performance.
PSM Management Reviews in Safety oriented companies:
In each of the segments of the company, PSM management drives
expected behavior by regularly reviewing open action items found
during audits, assessments, PHAs, etc. When action items are open
longer than reasonable, management can intervene to see whether
additional resources are needed and, if so, get them where needed.
If the open action items are not complete due to inaction rather than
a lack of necessary resources, then other steps may need to be taken
to show management's expectation of completion.
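The review of open action items can be supported by a simple aging report. The Python sketch below is illustrative only: the record layout, the item identifiers, and the 90-day threshold are assumptions, since the text says only that items should not stay open "longer than reasonable".

```python
from datetime import date


def overdue_items(action_items, today, max_open_days=90):
    """Return open action items that have been open longer than max_open_days."""
    return [
        item for item in action_items
        if item["closed"] is None
        and (today - item["opened"]).days > max_open_days
    ]


# Hypothetical action items from audits and PHAs.
items = [
    {"id": "PHA-2023-04", "opened": date(2023, 1, 10), "closed": None},
    {"id": "AUD-2023-17", "opened": date(2023, 5, 2), "closed": None},
    {"id": "AUD-2023-02", "opened": date(2023, 1, 5), "closed": date(2023, 2, 1)},
]

review_date = date(2023, 6, 1)
for item in overdue_items(items, today=review_date):
    print(item["id"], (review_date - item["opened"]).days, "days open")
```

A list like this tells management where to ask whether the delay comes from a lack of resources or from inaction.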
PSM management regularly reviews exceptions taken from
existing guidelines to see if trends exist in the various locations. If
more and more locations make exceptions, it might mean that the
guidelines need to be modified. When done on a proactive basis, this
is a sign to the workforce that management is in touch with reality.
PSM management has a process in place to review requests for
deviations from existing standards. Local facilities should not be
able to deviate without a review by higher management to ensure
that the broader perspective is taken. This means that short-term
wants must not take precedence over the longer-term perspective.
Each facility in a company has a PSM system in place that looks at
the PSM results and actions in place at the facility. The system has
a management structure geared to each level in the facility.
Each unit in a facility has a PSM coordinator assigned to it to
ensure that all of the PSM requirements for that unit, such as PSSRs,
JSAs, etc., are followed rigorously. Of course, the PSM
coordinator has other duties, and the PSM aspect is probably no
more than 15-20% of their duties. That being said, their visible
ownership of PSM makes it evident to the entire workforce that PSM
is integral to a smooth and safe operation.
The PSM coordinators meet regularly (say once a month) with the
PSM leadership to ensure clear communication to the entire facility
of what is occurring and how and why. This continuity ensures a
consistent and multifaceted approach to PSM.
It should be clear from the above descriptions that the role of
management is not simply to issue edicts and expect the company
to follow. Rather, it is a comprehensive approach that starts from the
top and extends throughout the organization: setting
expectations, following up to see whether those expectations have
been met, asking "if not, why not?" and correcting the why-nots.

Giving support to the whole organization to ensure superior PSM
results is the ultimate role of management.

Chapter 32

Quality Tools in Process Safety Management

Learning Objective
To introduce the various continuous improvement techniques
that are commonly used to support Process Safety
This chapter will NOT make an expert out of you; it will
just give you enough information to determine whether
you want to learn more.

Today's Roadmap
Improvement Processes
Basic Continuous Improvement (CI)
Statistical Process Control
Six Sigma
Lean Techniques

Improvement Processes
What is Continuous Improvement (CI)?
Continuous improvement means an ongoing effort to improve
products, services or processes. This is done by examining your
processes to discover and eliminate any shortcomings and faults.
This is generally done through small incremental changes, or
sometimes through a breakthrough change. By focusing on making
things better, project teams take actions to reduce defects, remove
activities that provide no value, and thereby provide customer
delight. There are no revolutionary transformations, but there are
evolutionary changes! By getting to the root cause of a problem and
questioning why, project teams can design a plan to offset the
problem. Plans usually include a description of the problem and
details about what should be done to remedy the situation.
Continuous improvement is characterized by having all employees
involved, producing daily improvement, focusing on product
characteristics and customer delight!
This concept of Continuous Improvement (CI) is the fundamental
underpinning of ISO Standards.
There are different types of CI processes:
Deming Cycle
Statistical Process Control (SPC)
Six Sigma
What CI processes do you currently use? Any?

Continuous Improvement
Organizations are making concerted and effective efforts to
implement PSM programs and procedures to comply with applicable
rules. Most have stabilized their processes and have the core
regulatory elements in place. Efforts now are primarily directed at
continuous quality improvement, so CI has become a part of PSM.

Out of the various CI tools available, you should select the
right kind for the work you are doing and the process you are using.
Each tool can be used separately, but the tools can also be used in
conjunction, and when used together they are very powerful. All are
totally compatible with the PSM system.

Basic Continuous Improvement: Deming Cycle

Among the most widely used tools for continuous improvement is a
four-step quality model, the plan-do-check-act (PDCA) cycle, also
known as the Deming Cycle. It was popularized by W. Edwards Deming
in the 1950s, building on earlier work by Walter Shewhart. It
provides overarching thinking to support CI efforts and serves as
the basis of ISO management system standards.
This model analyzes business processes and uses measurements to
identify sources of variation that cause products to deviate from
customer requirements. Such processes are placed in a continuous
feedback loop so that managers can identify and improve the parts
that need it. Deming created a (rather oversimplified) diagram to
illustrate this continuous process, commonly known as the PDCA
cycle for Plan, Do, Check, Act:

PLAN: Plan ahead for change; design or revise business process
components to improve results
DO: Implement the plan, taking small steps in controlled
circumstances, and measure its performance
CHECK: Check and study the results, assess the measurements, and
report the results to decision makers
ACT: Decide on changes needed to improve the process; take action
to standardize or improve the process

A standard is a document that provides requirements,
specifications, guidelines or characteristics that can be used
consistently to ensure that materials, products, processes and
services are fit for their purpose.
ISO International Standards ensure that products and services are
safe, reliable and of good quality. For business, they are strategic
tools that reduce costs by minimizing waste and errors, and
increasing productivity. They help companies to access new
markets, level the playing field for developing countries and
facilitate free and fair global trade.
The basic standards of the International Organization for
Standardization (ISO) are below. Accredited certification bodies can
certify your firm against these standards, which means that you are
following a certified process and have achieved some minimum level
of result.
Quality - 9000 Series
Environmental - 14000 Series
Safety - OHSAS 18000 Series
Risk Management - 31000 Series
A typical ISO process is:
Determine your process
Follow your process
Check your process deviations
Improve your process and repeat

Statistical Rules of Thumb

In general, when it comes to statistics, some old rules apply:
What gets measured gets done.
Trust in God; everyone else must bring data!
You can make statistics tell you anything.

SPC or Statistical Process Control

Statistical process control (SPC) procedures can help you monitor
process behavior. When a process is monitored, it becomes easier
to control. To apply this method to a process, it is essential
that the "conforming product" (product meeting specifications)
output can be measured. Key tools used in SPC include control
charts, a focus on continuous improvement, and the design of
experiments. Manufacturing lines are a typical example of a process
where SPC is applied.

The most successful SPC tool is the control chart, originally
developed by Walter Shewhart in the 1920s. A control chart
helps you record data and lets you see when an unusual event
occurs, e.g., a very high or low observation compared with typical
process performance. Such variation is analyzed by establishing
control limits.
Control charts are made with data measured over time and attempt
to distinguish between two types of process variation:
Common cause variation, which is intrinsic to the process
and will always be present
Special cause variation, which stems from external sources
and indicates that the process is out of statistical control
Various tests can help determine when an out-of-control event has
occurred. However, as more tests are employed, the probability of a
false alarm also increases.
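The trade-off between more tests and more false alarms can be made concrete with a little arithmetic. The Python sketch below assumes normally distributed data and, as a simplification, treats the tests as independent with the same false-alarm rate; real run rules are correlated, so the numbers are only indicative.

```python
from statistics import NormalDist


def false_alarm_probability(per_test_alphas):
    """Chance that at least one test signals on an in-control process,
    assuming (as a simplification) that the tests are independent."""
    p_no_alarm = 1.0
    for alpha in per_test_alphas:
        p_no_alarm *= 1.0 - alpha
    return 1.0 - p_no_alarm


# Probability that one in-control point falls outside 3-sigma limits,
# assuming normally distributed data.
alpha_3sigma = 2 * (1 - NormalDist().cdf(3))

for n_tests in (1, 2, 4):
    # Hypothetical: every added test has the same false-alarm rate.
    print(n_tests, round(false_alarm_probability([alpha_3sigma] * n_tests), 4))
```

Each added test raises the chance of signalling on a process that is in fact in control.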
Control charts have three basic components:
1. A centerline, which is the mathematical average of all the
samples plotted.
2. Upper and lower statistical control limits that define the
constraints of common cause variations.
3. Performance data plotted over time.
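The three components listed above can be computed in a few lines. The Python sketch below is a minimal illustration, not a full SPC implementation. The temperature readings are invented, and the limits use the sample standard deviation, whereas a standard individuals chart usually estimates sigma from the moving range.

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3):
    """Return (lower limit, centerline, upper limit) from baseline data."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - n_sigma * spread, center, center + n_sigma * spread


def out_of_control(points, lower, upper):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, x) for i, x in enumerate(points) if x < lower or x > upper]


# Hypothetical reactor temperature readings (deg C) from a stable period.
baseline = [150.2, 149.8, 150.5, 149.9, 150.1, 150.4, 149.7, 150.0, 150.3, 149.6]
lower, center, upper = control_limits(baseline)

# New readings plotted over time against the limits.
new_readings = [150.1, 149.9, 152.4, 150.2]
print(round(lower, 2), round(center, 2), round(upper, 2))
print(out_of_control(new_readings, lower, upper))
```

A point flagged here is a candidate special cause and should be investigated rather than adjusted away.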
In process improvement efforts, the process capability index or
process capability ratio is a statistical measure of process capability:
the ability of a process to produce output within specification limits.
The concept of process capability only holds meaning for processes
that are in a state of statistical control. Process capability indices
measure how much "natural variation" a process experiences
relative to its specification limits and allow different processes to
be compared with respect to how well an organization controls
them.
Defines process capability or Cpk
Defines stable process
When the control chart indicates that the process is currently under
control (i.e., is stable, with variation only coming from sources
common to the process), then no corrections or changes to process
control parameters are needed or desired. (Wikipedia)
Processes that are more stable usually have FEWER process safety
incidents and higher reliability
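For a process that is in statistical control, Cpk is commonly computed as the distance from the process mean to the nearer specification limit, divided by three standard deviations. The Python sketch below applies that formula. The readings and specification limits are hypothetical, and sigma is estimated here with the sample standard deviation.

```python
from statistics import mean, stdev


def cpk(data, lsl, usl):
    """Cpk = min(USL - mean, mean - LSL) / (3 * sigma), from sample statistics."""
    mu = mean(data)
    sigma = stdev(data)
    return min(usl - mu, mu - lsl) / (3 * sigma)


# Hypothetical temperature readings (deg C) and specification limits.
readings = [150.2, 149.8, 150.5, 149.9, 150.1, 150.4, 149.7, 150.0, 150.3, 149.6]
print(round(cpk(readings, lsl=148.0, usl=152.0), 2))
```

A higher Cpk means more room between the natural variation of the process and its specification limits.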
What is Six Sigma?
Six Sigma fundamentally signifies quality that strives for near
perfection. It is a disciplined, data-driven approach and
methodology for eliminating defects (driving toward six standard
deviations between the mean and the nearest specification limit) in
any process, from manufacturing to transactional and from
product to service.
To achieve Six Sigma, a process must not produce more than 3.4
defects per million opportunities. A Six Sigma defect is defined as
anything outside of customer specifications.
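The 3.4 figure can be reproduced with a short calculation. The Python sketch below assumes normally distributed output and the conventional 1.5-sigma long-term shift of the process mean, under which a "six sigma" process leaves 4.5 standard deviations to the nearest specification limit. The defect counts in the first example are hypothetical.

```python
from statistics import NormalDist


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def long_term_dpmo(sigma_level, shift=1.5):
    """One-sided defect rate for a given sigma level, assuming normal data
    and the conventional 1.5-sigma long-term shift of the process mean."""
    return NormalDist().cdf(-(sigma_level - shift)) * 1_000_000


# Hypothetical counts: 7 defects in 1000 units with 5 opportunities each.
print(dpmo(defects=7, units=1000, opportunities_per_unit=5))
print(round(long_term_dpmo(6), 1))  # 3.4
```

The same function shows how quickly the defect rate rises at lower sigma levels.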

Six Sigma (DMAIC)

DMAIC is not exclusive to Six Sigma and can be used as the
framework for other improvement applications.
DMAIC is an abbreviation of the five improvement steps: Define,
Measure, Analyze, Improve and Control.

All of the DMAIC process steps are required and always proceed in
this order:
D - Define a problem or improvement opportunity
M - Measure process performance
A - Analyze the process to determine the root causes of poor
performance; determine whether the process can be
improved or should be redesigned
I - Improve the process by attacking root causes
C - Control the improved process to hold the gains
The most used Six Sigma (SS) Process expands the Deming cycle.
This method is used where the work process is known. SS requires
specific data to be obtained at each step of the process. DMAIC
methodology can be thought of as a roadmap for problem solving
and product/process improvement.
Good for overall PSM process analysis.
In the Improve phase you will develop a proposed solution, and you
will test, or pilot, that solution in a real business environment. This
piloted solution allows you to collect real-time process data to verify
statistically that you have fixed the sources of variation and your
solution will work on a larger scale.


Six Sigma (DMADV)

The lesser-used Six Sigma (SS) process, DMADV, is targeted where a
work process needs to be put into place and none currently exists;
it is sometimes called SS for Design.
Thus DMADV methodology is used when:
A product or process is not in existence at your company and
one needs to be developed
The existing product or process has been
optimized (using either DMAIC or not) and still does not
meet the level of customer specification or Six Sigma level
The first three steps are similar to the earlier model. Then comes
the difference!

D - Define the goals of the project and those of the customers (both
internal and external)
M - Measure and quantify the customer needs as well as the goals of
management
A - Analyze the options and the existing process to determine the
cause of the problem and evaluate corrective measures
D - Design a new process, or a corrective step to the existing one,
to eliminate the error
V - Verify, by simulation or otherwise, the performance of the design
thus developed and its ability to meet the target needs

Kaizen is a Japanese word for "good change". The Kaizen method is
based on the philosophy of continually seeking ways to improve
operations. The basis of the continuous improvement philosophy is
the belief that no operation is perfect and there is always room for
improvement.
Kaizen is gradual and uses small steps, conventional know-how and a
lot of common sense. The focus can be on, for example, reducing the
length of time required for a process, the waste generated in a
process, or even wasted movement. Setting up tool stations so that
everything is within arm's reach is an easy way of cutting out
wasted steps. Iterated over the course of a day, or a month, for
two hundred workers, this means greatly increased productivity.
The concept is to review and look at the physical workflow, then
focus on the removal of hard work, or "muri". The people most closely
associated with an operation are in the best position to identify the
changes that should be made. Consequently, employee involvement
plays a big role in continuous improvement programs. The Kaizen
method thus engages the full workforce.
Based on the Deming cycle
Can be used to plan/execute maintenance in support of PSM

Lean Techniques
"Lean" is a production practice that considers the expenditure of
resources for any goal other than the creation of value for the end
customer to be wasteful, and thus a target for
elimination. Essentially, lean is centered on preserving value with
less work.
One Lean technique, 5S, is a workplace organization method that uses
a list of five Japanese words which, when translated into English,
are as follows:
Sort the necessary and the unnecessary, the essential and
non-essential items. Eliminate clutter.
Set the workplace in order. Decide the best location for
each item, and keep essential items in assigned locations.
Remove all non-essential items from the work area. Devise
effective storage for easy access and ensure proper labeling
for quick sighting.
Shine the work area. Systematically clean the place and
tidy it up. Regular daily housekeeping and cleaning
are required as a follow-up.
Standardize activities. Consult with the process
employees to identify the best procedures and standardize
them.
Sustain the 5S system. Maintain the proper procedures
and ensure that all activities and changes that have been
implemented stay implemented.
There are three more Ss which are sometimes included:
This is however not a traditional set of "phases". Safety for example
is inherent in the 5S methodology and is not a step in itself.
Therefore the additions of the phases are simply to clarify the
benefits of 5S and not a different or more inclusive methodology.

Lean Techniques
What is a visual lean technique? These are visual manifestations of
the Lean process such as scoreboards, production control charts,
team communication boards, or other types of visual media. When
such visual media are right in front of you, you know where you
stand and what you need to do!
Lean techniques are used a lot in process industries to discover
waste. They drive operational discipline, which underpins strong PSM.

Continuous improvement is an essential business process in
support of PSM
Good PSM processes include CI techniques
Use of more than one technique is better
Not using any CI technique signifies a very weak PSM

Chapter 33

Pulling it all Together: PSM in Your First Job

To know the relevance of PSM in your first job

Today's Roadmap
Let us see where you will choose to be in your first job.
Capital project execution
Sales and Marketing support
Finance/MBA/Insurance/Risk Management
We will look at PSM vis-à-vis these jobs.

Understanding the relevance of PSM

You have studied PSM. Are you now wondering whether it has any
relevance in your first job? Does PSM knowledge help you? Isn't it
supposed to be senior management's responsibility?
It is true that companies monitor safety performance and ensure
that process safety is not taken lightly. Safety performance is often
measured by the time since the last process safety incident.
However, the "Baker Report" (the report of the BP U.S. Refineries
Independent Safety Review Panel), published following the 2005
incident at the Texas City Refinery, concluded that: "The passing of
time without a

process safety accident is not necessarily an indication that all is

Each employee therefore needs to be safety oriented. That's why
you need to know PSM right from your first day in your first job!
The challenge for you is to ensure that your company doesn't hurt
people or the planet! And the company is you! A complete appreciation
of PSM can help you proactively identify, evaluate and help prevent
PSM is an advantageous skill to bring to your first position, be it in
academe, industry or government.
Depending on the position and job role, PSM is applied differently.
However, the fundamentals of PSM do not change.
If your employer has a specific PSM method, be sure to ask about it
early and learn it.

Manufacturing First Job

If your first job is in Manufacturing, PSM is required for both actual
manufacturing and technical support.
Your first task is to determine if your unit is a PSM covered process.
By now it is certain that you know what a covered process is! If yes,
the full range of all PSM elements will be required.
If you are assigned to technical support, be especially conscious of
Management of Change, because your job is to improve unit ops, or

Capital Project Execution

Assigned to help a team build something?
You will use different parts of PSM depending on the project stage:
Concept Stage: Hazard identification, PHA
Design Stage: Detailed PHAs, start operating procedures in
support of engineering and equipment specifications, collect
PS Information
Construction: System testing, PSSR
Operation: Full PSM implementation
Decommissioning: Hazard Identification, PHA, Procedures
and Process Safety information

Sales and Marketing

Even in Sales and Marketing, PSM is required.
If you are selling a hazardous material, you may find yourself
evaluating the client's ability to effectively manage storage and use
How good is your client's PSM effort?
If you are marketing software to support PSM, you must understand
how your software supports the PSM.

Finance/Risk Management
So, you are not on the manufacturing side; you are getting an MBA.
Even here your knowledge of PSM is a great advantage. Reducing
a firm's overall risk levels is extremely important; an understanding
of PSM supports your analysis.
If you are doing a cost evaluation of similar projects, an assessment
of how safely the various technologies operate may be required.
Evaluating effective use of capital? How safely and how well the plant
can run is a critical component of its financial viability.

PSM touches every job in many industries
Thank you for your attention over the past chapters
We hope this eBook has been of value
We are interested in your comments, and please send your
feedback to us: [email protected]
Follow us on Twitter @thePSMeBook

The future of PSM

Great change in the workforce
The future of PSM is beginning now, with you and your peers. You,
reading this text and learning from this and other resources, will
determine if process and workplace safety improves or degrades in
the future. You are inheriting the mantle of responsibility. You will
not start out in management, but you will start out being able to
control outcomes.
Vast numbers of leadership positions in industry today are held by
Baby Boomers, who are retiring by the thousands each day. The
lessons they learned the hard way will become history. Of course many
will be going (or have already gone) into consulting positions. They
will still be available to provide resources, but the day-to-day
oversight and stability of the workforce will now fall to you and your
peers. So, the path forward for you is one of learning every day
and from every situation.
To become proficient in Process Safety Management, personal
experience may not be the best teacher. Encountering an incident
or hazard and learning from it is painful, and could be avoided with
knowledge of process safety. A better way to learn is from what has
happened to others, and to put a system in place to understand root
cause(s) and avoid a future occurrence.
The PSM regulation, 29 CFR 1910.119, lists requirements for
compliance, but not how to fulfill them. The regulation has been in
place since May of 1992, but again it only lists the expectations for
compliance. The mechanism to meet that compliance is left up to each
individual company. Some companies have developed very good training
programs while others rely on the experience of their employees. The
point being made is that the solution(s) differ depending on the
company. That is positive, in that the solutions are fit for purpose
for the company.
PSM course?
Academia has not provided relevant courses in the past and, at
present, very few universities have them in place. One reason
for this is that academia has little, if any, experience in
industry. Thus, when you see courses offered by experienced
individuals, take advantage of the opportunity and ask as many
relevant questions as you can. Get contact information and stay in
touch, so that you can draw on the person and the information later.
The previous thoughts have been made concerning the relationship
between you, the reader, and the future of PSM. Why do you think
they were/are made? The point is that great changes are occurring
in the workplace and the key to successful outcomes lies with you.
Collapsing companies
Along with the dramatic change in the leadership positions of
companies, the companies themselves are undergoing significant
change. As world economics change, companies must
also change to remain viable. Alongside profitability, liquidity,
solvency, efficiency, leverage and market confidence, one element
that has rarely been mentioned is the PSM effectiveness of companies.
It is perhaps the most important element of a company's continued
existence. Those companies that do PSM well will also do well in
their continued operation. The attention to detail in PSM is
essentially the same as that needed for economic excellence.
However, it may take only one lapse in proper PSM management to
forever destroy a company, even one with an otherwise strong economic
basis. The point being made here is that if the company you begin to
work for has a strong PSM program, make it stronger, and if it is a
weak one, you may want to consider another company.
Safety culture
Safety culture is a fairly nebulous term, but hopefully you get the
drift: the way the culture of a place really is, compared to how
it is portrayed, makes a huge difference. Do they do what they say
they do, or is it simply a good face for the world? Effective,
economically viable companies have a clear vision of operating that
must be a normal part of their everyday existence. Others have
listed the following as indicators by which a culture of safety can
be measured. A list, not all-inclusive, follows:
o Management Support for Safety
o Peer Support for Safety
o Personal Responsibility for Safety
o Incident Reporting and Analysis
o Safety Rules, Regulations, and Procedures
o Training, Safety Suggestions and Concerns
o Rewards and Recognition
o Safety Audits and Inspections
o Communication
o Employee Engagement
o Safety Meetings & Committees
o Discipline
As should be obvious from the list, these are not subtle things. They
tell you how the leadership of a company views and supports
safety. Management support is essential for the commercial success
of any endeavor, from an economic point of view as well as for the
PSM aspects.

So, how does this relate to the future of PSM? Only the companies
that embody a safety culture within their value system will survive
and thrive. Make sure you strengthen that safety culture by doing
the right things for the right reasons. It sounds simple, and it is;
just do it.
Behavioral Safety
Continuing the theme of safety culture, the types of behavior that
are practiced by you and your co-workers are crucial to a safe
environment. You get what you give. Do you behave in a safe
manner only when you know you are under scrutiny, or all the time? If
the former, you are the problem; if the latter, you are part of
the solution. If you see unsafe behavior, you can help the offender
by offering your insight into a better, safer way. As hard as it is to
believe, some behave in an unsafe manner because they don't think
it is unsafe, merely a quicker way to get things done. The safer
way may be a slightly longer way to get to the end point, but
arriving safely is the preferred route.
Here is a simple but clear example: when you bring eggs home
from the store and transfer them into your refrigerator, do you hold
the carton under the transfer point or not? To do so takes just a bit
more time, but if you don't, and you slip, well, you get the idea.
This is clearly not an earth-shaking event either way, but it is an
example of doing things in the manner you have thought out to be
the best and safest (and, in this case, the cleanest). Approaching
your work and your home in the same way will make
your life safer. The key element is to anticipate what could go
wrong and take measures to minimize the possibility of unwanted
outcomes. That, in a nutshell, is behavioral safety!

Human Factors
Human factors and ergonomics are focused on the "fit" between the
user, their equipment, and their environments. The discipline takes
into account the user's capabilities and limitations in seeking to
ensure that task, function, information, and environment suit the user.
To assess the fit between a person and the technology used, human
factors specialists or ergonomists consider the job (activity) being
done and the demands on the user; the equipment used (its size,
shape, and how appropriate it is for the task), and the information
used (how it is presented, accessed, and changed). Ergonomics
draws on many disciplines in its study of humans and their
environments, including anthropometry, biomechanics, mechanical
engineering, industrial engineering, industrial design, information
design, kinesiology, physiology, and psychology.
A very simple example of this is the keyboard you use on a laptop.
It is not at all suited to the function of typing, but rather made to
fit the laptop's design. A better alternative for regular use is a split
keyboard that fits the general orientation of the hands when typing.
The author had been using a regular keyboard for many years when
challenged by an industrial hygienist (IH) to try a split keyboard. The
IH person insisted, took away the regular keyboard, and promised
to bring it back in a week. So, reluctantly, the author agreed. One
week later, the author would not give up the once-unwanted split
keyboard.
Again, the above is a simple yet clear example of human factors at
work and of their possible impact on the potential for developing
carpal tunnel syndrome. A clear example from the workplace is the
location of valves. Are they located where the worker has easy
access, or does a scaffold need to be put in place? If the latter, can
a walkway be permanently installed to more readily accommodate
the worker? Can the valve be relocated to grade?
These are simple yet important aspects of human factors that you
will need to address when you get into the workplace, and they will
have an impact on future workplace safety.
Fatigue and its Role in Process Safety Incidents
API (American Petroleum Institute) is the industry-supported entity
that addresses industry issues and develops best practices.
Fatigue in the workplace is addressed by API RP (Recommended
Practice) 755.
It has been well documented that excess workplace fatigue can be a
risk to safe operations. In the past, it was thought that simply
placing limits on the hours of service would adequately address the
risk of fatigue. However, over the last several years, a broad
international consensus has emerged that the better way to manage
fatigue risk is through a comprehensive fatigue risk management
system (FRMS) that is integrated with other safety management
systems as necessary.
ANSI/API RP 755 is based on the FRMS approach and contains the
following elements:
o Positions in a facility covered by the FRMS
o Roles and responsibilities of those covered by the FRMS
o Staff workload balance assessments
o Safety promotion: training, education, and communication
o Work environment
o Individual risk assessment and mitigation
o Incident/near miss investigations
o Hours of service guidelines
o Exception process
o Periodic review of the FRMS to achieve continuous improvement
It should be clear from the above that RP 755 incorporates a
comprehensive approach to the issue.
An individual must also take personal responsibility to ensure that
their own fatigue will not lead to a process safety incident.
Whatever rules or guidelines are in place, it cannot be emphasized
too much that the individual's own sense of responsibility should
govern their actions.
Effects of Health on Safety
Not much has been written on the effects of one's health on one's
safety in the workplace, but the converse is not true. So, we shall
think a bit about the topic of one's health.
You, as an individual, should take the best care of your own health for
the obvious reasons: you'll feel better, you'll live better, those
around you will be better, and you will work better and safer. When
you read this you will say, "Of course I understand this; it is
obvious." However, how many times did you go to work after a few
too many drinks the night before and never think about it? More
than once, I would expect. And how many times did you go to
work with a touch of the flu without thinking about how many of
your co-workers could become infected? Or did you ever think
about how the distraction of feeling ill could affect your
judgment in a way that might lead to a process safety
incident?

If you are the manager, be sensitive to the fact that some of your
employees may have chronic illnesses like diabetes and will need to eat
wholesome food at regular times. This means being sensitive to
their human needs and NOT working them through a meal time
without a break. You cannot and should not expect them to tell you
their personal situation; you are in the position to know better and
to manage the work appropriately.
You are the one who has control of your own health. You control
your destiny and the impact you can have on process safety in your
own workplace. Be aware and act accordingly.
Historical Incident Database
Data Driven Safety Management
Data, not emotion, should drive safety management. It is very
easy to get caught up in emotion when an incident occurs, but hard,
cold data is your best resource. A root cause analysis of incidents
will give you the data you need to improve in the future. Much has
been written about this and many systems are available. You and
your company need to be aware and get on board. If your
company does not currently utilize a data-driven approach, you
should push to get it into your workplace. You can and do control
your own destiny.
Record Keeping & Statistics
The records (data) that are kept must be accurate and unbiased.
That means that all relevant data should be meticulously
gathered, maintained, and USED to PREVENT incidents from
occurring. It does no good to gather and store data and statistics if
they are not used in a proactive manner to head off process safety
incidents. Much is available about the types of data to collect and
maintain, but the author would submit that the MOST IMPORTANT
data is the near miss data. Near miss data is that small voice
whispering in your ear that says, "Pay attention to this!" You, the
reader, should be alert to these warning signs and ensure that your
company is as well. If you do not have a management
system in place, you can and should make it happen.
Remember: you control your destiny.
Use of BIG DATA Could Make PSM more Predictive
With the advent of more powerful computing platforms, we are
learning to harness the computer as a tool to provide almost
continuous analyses around micro-trends in how the data changes.
These trends can be used to predict serious process incidents with
enough lead time to be able to mitigate or avoid things like
emergency shutdowns, plant outages, reactor upsets and serious
mechanical failures.
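As a rough illustration of what watching for a micro-trend can mean, the sketch below fits a straight line to the most recent few readings of a process variable and raises a flag when the slope exceeds a limit. Everything in it is an assumption made for illustration: the function names, the window of five samples, the slope limit of 0.5 units per sample, and the temperature readings are all hypothetical. Real predictive systems are far more elaborate, so treat this as a minimal sketch and not as a recommended monitoring method.

```python
from statistics import mean


def window_slope(values):
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def first_drift_alert(readings, window=5, slope_limit=0.5):
    """Return the index of the first sample whose trailing window has a
    slope above slope_limit, or None if no window does.

    window and slope_limit are hypothetical tuning values.
    """
    for end in range(window, len(readings) + 1):
        if window_slope(readings[end - window:end]) > slope_limit:
            return end - 1
    return None


# Hypothetical temperature readings (deg C), one per minute.
temps = [150.0, 150.1, 149.9, 150.0, 150.2,
         150.9, 151.8, 152.9, 154.1, 155.6]
alert_index = first_drift_alert(temps)
print(alert_index)
```

With these made-up readings the flag is raised at sample index 7, while the temperature is still climbing and before the last two, highest readings arrive. That is the kind of lead time the text describes.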
Much has been said about how you control your destiny; it cannot
be overemphasized. You do control your destiny and, to some
extent, that of your co-workers. Make sure you understand what
you are seeing in the workplace (and your home) and think through
your actions to anticipate what could go wrong and adjust your
actions accordingly. If a procedure doesn't make sense, ask why.
Do not proceed until you know that the path is correct. "Just do it" is
not the correct answer. You do control your destiny.

Now go forward and make PSM better than you found it!
