Risk analysis and assessment on construction operation based on human factors and empirical Bayesian theory

This study focuses on the role of human factors in construction accidents and estimates the probabilities of four levels of injury. We used an empirical Bayesian method and the human factors analysis and classification system (HFACS) framework to analyze the probability distributions of accident severity for high-risk operations in hydropower construction. Accident severity was modeled at four levels of injury: severe injury, one death, two deaths, and three deaths. The results show the behavioral characteristics of workers and the factors influencing their operation violations. Calculating the posterior distributions of the levels of injury enables us to rank the factors with respect to their risk of injury. The study revealed that the inability to identify hazards is the direct cause of many accidents; resource management, inadequate supervision and supervisory violations also play important roles in the occurrence of accidents.


INTRODUCTION
In southwest China, numerous large-scale hydropower projects have been built and put into operation. These construction projects are large in scale, have long construction periods, and are extremely complex, with high safety risks. At present, there is no systematic and thorough study, at home or abroad, of hazard identification, risk evaluation and control management for hydropower construction projects. Although China has established laws, technical specifications and procedures, they contain no concrete provisions on construction safety hazards. On the other hand, strengthening construction accident statistics is an important part of safety production, since accident statistics are the basic management work of construction safety. Statistical data are used to analyze the key factors of accidents in order to identify their causes and regularities.
Hazard risk assessment involves many interacting factors and constitutes a dynamic system. Therefore, the set of evaluation indicators must be scientific, rational and comprehensive. This study starts from common hazard identification and evaluation, takes into account the characteristics of construction safety hazards, and analyzes in depth the main factors influencing hazard risk assessment. On the basis of the human factors analysis and classification system (HFACS) framework, we use classical Bayesian theory to calculate and rank the importance of the factors.

METHODOLOGY

Human factors analysis and classification system framework (HFACS)
The safety assessment structure model of high-risk operations and the risk factors causing accidents are shown in Figure 1, which is also an HFACS framework (Wiegmann and Shappell, 1997, 2001). This study uses the HFACS framework to identify which kinds of factors are most important in the accidents.
The HFACS framework classifies the human error factors involved into four levels. L1 level factors, the focus of past accident investigations, are the personnel behavior mistakes that cause the accident (i.e. the unsafe behaviors of construction workers). This level can be further divided into four categories: decision-making mistakes, skill-based errors, perception errors and operation violations. L2 level factors (preconditions for unsafe behaviors: potential/obvious mistakes) are the obvious psychological or environmental factors that influence the L1 factors, such as an operating environment that does not meet the required conditions and is therefore more likely to cause accidents. L3 level factors (unsafe supervision: potential mistakes) are the on-site supervision factors that lead to unsafe behaviors. L4 level factors (organizational influences: potential mistakes) are the wrong decisions made by high-level managers that influence the behavior of low-level workers.
In the HFACS framework, each higher level can influence the adjacent lower level. Adjusting and replacing some factors according to the characteristics of construction strengthens the independence and generality of the framework and makes it possible to quickly identify the key factors in the whole system. The amended HFACS framework is more in line with actual hydropower construction projects.

Statistical framework
There are two basic concepts in Bayesian statistics: the prior distribution and the posterior distribution. The prior distribution is a probability distribution over the parameter θ of the population distribution. The fundamental idea of the Bayesian school is that any statistical inference on θ requires, in addition to the information provided by the sample, a prior distribution; it is an indispensable element of statistical inference. Bayesians hold that the prior distribution need not have an objective basis and can be partially or fully based on subjective belief (Shappell and Wiegmann, 2001). The posterior distribution is the conditional distribution of the unknown parameters given the observed sample, obtained from the sample distribution and the prior distribution by the rules of conditional probability. Because it is obtained after sampling, it is called the posterior distribution. The key to the Bayesian inference method is that any inference must rely only on the posterior distribution and no longer involves the sample distribution (de Lapparent, 2006).
The severity of the injury accident cases is divided into four levels: severe injury, one death, two deaths and three deaths. $K$ denotes the number of mutually exclusive types of injury accident (that is, the size of the event space). $Y_{i,j}$ is the result of accident $i$ leading to injury type $j$, and the outcome of accident $i$ is represented by an array of discrete random variables, $Y_i = (Y_{i,1}, \ldots, Y_{i,K})'$. For all $j = 1, \ldots, K$, $Y_{i,j}$ takes the value 1 if the severity of the accident is level $j$ and the value 0 otherwise. A probability measure $P_i$ on $B_Y$, the σ-algebra of the $K$ elementary events, is associated with each $Y_i$.
$p_{i,j}$ is the probability that accident $i$ has an injury of type $j$. Therefore, we state that $Y_i \mid P_i \overset{id}{\sim} \mathcal{M}(1, P_i)$, where id stands for 'independently distributed': $Y_i$ is distributed conditionally on $P_i$ according to a Multinomial probability distribution with parameters 1 and $P_i$ (Leonard, 1977). The corresponding conditional probability density function is then

$$f(y_i \mid p_i) = \prod_{j=1}^{K} p_{i,j}^{\,y_{i,j}} \tag{1}$$

Because there are different accident configurations, understood as different circumstances and consequences, the probabilities of the types of injury vary according to many factors.
Due to the unpredictable nature of accidents, $P_i$ is treated as random for each accident $i$ and is modeled from known accident information, in line with the principles of Bayesian analysis.
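In a Bayesian context, the beliefs one can have about the family of distributions of probabilities of the types of injury are summarized through prior distributions. They represent the state of knowledge about the individual distributions of the types of injury before observing the sample. The prior for $p_i$ is a Dirichlet distribution with shape parameters $\alpha_i = (\alpha_{i,1}, \ldots, \alpha_{i,K})'$, $\alpha_{i,j} > 0$:

$$\pi(p_i \mid \alpha_i) = \frac{\Gamma\left(\sum_{j=1}^{K} \alpha_{i,j}\right)}{\prod_{j=1}^{K} \Gamma(\alpha_{i,j})} \prod_{j=1}^{K} p_{i,j}^{\,\alpha_{i,j}-1} \tag{2}$$

where $\Gamma$ is the gamma function:

$$\Gamma(a) = \int_{0}^{\infty} t^{a-1} e^{-t}\, dt \tag{3}$$

Using the conditional distribution (1) and the prior distribution (2), the joint distribution of $(Y_i, P_i)$ is defined:

$$h(y_i, p_i \mid \alpha_i) = f(y_i \mid p_i)\, \pi(p_i \mid \alpha_i) = \frac{\Gamma\left(\sum_{j=1}^{K} \alpha_{i,j}\right)}{\prod_{j=1}^{K} \Gamma(\alpha_{i,j})} \prod_{j=1}^{K} p_{i,j}^{\,\alpha_{i,j}+y_{i,j}-1} \tag{4}$$

The marginal distribution of $Y_i$ is derived using (4) by integrating over $P_i$:

$$m(y_i \mid \alpha_i) = \int h(y_i, p_i \mid \alpha_i)\, dp_i = \frac{\Gamma\left(\sum_{j=1}^{K} \alpha_{i,j}\right)}{\Gamma\left(\sum_{j=1}^{K} \alpha_{i,j} + 1\right)} \prod_{j=1}^{K} \frac{\Gamma(\alpha_{i,j}+y_{i,j})}{\Gamma(\alpha_{i,j})} = \prod_{j=1}^{K} \left(\frac{\alpha_{i,j}}{\sum_{k=1}^{K} \alpha_{i,k}}\right)^{y_{i,j}} \tag{5}$$

The posterior distribution of $p_i \mid y_i$ is obtained using the transition formula of Bayes:

$$\pi(p_i \mid y_i, \alpha_i) = \frac{h(y_i, p_i \mid \alpha_i)}{m(y_i \mid \alpha_i)} = D(\alpha_{i,1}+y_{i,1}, \ldots, \alpha_{i,K}+y_{i,K}) \tag{6}$$

where $D$ is a Dirichlet distribution:

$$D(a_1, \ldots, a_K):\quad \frac{\Gamma\left(\sum_{j=1}^{K} a_j\right)}{\prod_{j=1}^{K} \Gamma(a_j)} \prod_{j=1}^{K} p_{i,j}^{\,a_j-1} \tag{7}$$

Empirical Bayesian analysis is performed in two steps: first, estimate the unknown hyperparameters; second, compute the posterior distributions using the estimated parameters and analyze the results.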
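The conjugate update described above can be illustrated with a minimal Python sketch. It assumes a Dirichlet prior over the four severity levels and a single Multinomial observation per accident; the shape parameters in `alpha` and the helper names (`one_hot`, `prior_mean`, `posterior_parameters`) are hypothetical and chosen only for illustration, not taken from the study.

```python
# Illustrative sketch only: the shape parameters are made-up values,
# not estimates from the study.
SEVERITY_LEVELS = ["severe injury", "one death", "two deaths", "three deaths"]


def one_hot(level):
    """Encode an observed severity level as the indicator vector y_i."""
    return [1 if name == level else 0 for name in SEVERITY_LEVELS]


def prior_mean(alpha):
    """Mean of a Dirichlet(alpha) prior, i.e. the prior probability of each level."""
    total = sum(alpha)
    return [a / total for a in alpha]


def posterior_parameters(alpha, y):
    """Conjugate update: Dirichlet(alpha) prior plus one observation y gives Dirichlet(alpha + y)."""
    return [a + obs for a, obs in zip(alpha, y)]


alpha = [2.0, 1.0, 0.5, 0.5]   # hypothetical prior shape parameters
y = one_hot("one death")       # observed outcome of one accident
alpha_post = posterior_parameters(alpha, y)

print(prior_mean(alpha))
print(alpha_post)
```

Only the shape parameter of the observed severity level changes, which is why the posterior of each accident can be computed in closed form once the hyperparameters are estimated.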

Estimation of hyper parameters
As stated above, before an accident occurs, construction organizations attempt to improve the workers' safety consciousness, use safety equipment, and strengthen safety supervision and management, and thereby control the probability distribution $P_i$ of accident severity. The construction environment and the accident attributes also play different roles when the corresponding accident happens. From the statistical standpoint, this means that the values of the shape parameters of the distributions can be explained by some exogenous factors.

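Let $x_i \in X$ be the $(p, 1)$ array of explanatory variables for accident $i$, and let $\beta_j \in B_j$, $j = 1, \ldots, K$, be the $(p, 1)$ arrays of unknown weights. $X$ and $B_j$ are assumed to be compact subsets of $\mathbb{R}^p$. In order to maintain consistency with the strict positivity of the shape parameters $\alpha_{i,j}$ of the prior, it is assumed for the rest of the paper that:

$$\alpha_{i,j} = \exp(x_i' \beta_j), \quad j = 1, \ldots, K \tag{8}$$

The only available observation we have for an individual $i$ is its injury outcome: severe injury, one death, two deaths or three deaths. In order to infer the values of the unknown hyperparameters of the model, the marginal probability distributions of $Y_i$, $i = 1, \ldots, n$, are used to build the sample log-likelihood function of the observed $y_i$, $i = 1, \ldots, n$:

$$\ell(\beta_1, \ldots, \beta_K) = \sum_{i=1}^{n} \sum_{j=1}^{K} y_{i,j} \left[ x_i' \beta_j - \ln \sum_{k=1}^{K} \exp(x_i' \beta_k) \right] \tag{9}$$

Looking at (9), we see that the value of the log-likelihood function does not determine the values of the parameters uniquely as long as we do not set further identification restrictions: adding the same vector to every $\beta_j$ leaves (9) unchanged. We must therefore choose a benchmark outcome $k^*$, and all the remaining estimable parameters are expressed as differences with respect to $\beta_{k^*}$. The maximization of (9) with respect to the unknown parameters gives asymptotically unbiased and efficient estimates, which we use to compute the posterior probability distribution of accident severity.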
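The identification issue can be checked numerically with a short Python sketch. It assumes that each shape parameter is the exponential of a linear index in the explanatory variables, so that the marginal probability of a severity level is its shape parameter divided by the sum of the shape parameters. The toy data `X`, `Y` and the weights `betas` are hypothetical values, not the study's data or estimates.

```python
import math


def marginal_probs(x, betas):
    """Marginal probability of each severity level, exp(x'b_j) / sum_k exp(x'b_k)."""
    scores = [sum(xi * bi for xi, bi in zip(x, beta)) for beta in betas]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(X, Y, betas):
    """Sample log-likelihood; Y[i] is the index of the severity level observed for accident i."""
    return sum(math.log(marginal_probs(x, betas)[y]) for x, y in zip(X, Y))


# Toy inputs (hypothetical): an intercept plus one covariate, three accidents.
X = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.9]]
Y = [0, 1, 3]
# Level 0 is used as the benchmark outcome, so its weights are fixed at zero.
betas = [[0.0, 0.0], [0.3, -0.1], [-0.5, 0.2], [-1.0, 0.4]]

# Adding the same vector to every weight array leaves the likelihood unchanged,
# which is why a benchmark outcome is required for identification.
shift = [0.7, -0.3]
shifted = [[b + s for b, s in zip(beta, shift)] for beta in betas]

ll = log_likelihood(X, Y, betas)
ll_shifted = log_likelihood(X, Y, shifted)
print(ll, ll_shifted)
```

In practice the log-likelihood would be passed to a numerical optimizer with the benchmark weights held at zero; the optimizer is omitted here to keep the sketch short.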

Analysis of result
Hyperparameters are interpreted in terms of their causal influence on the shapes and moments of the probability distributions of the types of accident. The posterior distribution represents the state of knowledge concerning $p_i$ after the observations have been combined with the prior information; that is, posterior distributions represent our updated beliefs about the probability distribution of the types of accident after the accidents have happened.
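The Bayesian estimator of the expected rate of an accident with injury of level $j$ for individual $i$ is given by the posterior mean of $p_{i,j}$ (Gwet, 2002b):

$$E(p_{i,j} \mid y_i) = \frac{\alpha_{i,j} + y_{i,j}}{\sum_{k=1}^{K} \alpha_{i,k} + 1}$$

and the posterior variance of the distribution is

$$\mathrm{Var}(p_{i,j} \mid y_i) = \frac{(\alpha_{i,j} + y_{i,j}) \left( \sum_{k=1}^{K} \alpha_{i,k} + 1 - \alpha_{i,j} - y_{i,j} \right)}{\left( \sum_{k=1}^{K} \alpha_{i,k} + 1 \right)^{2} \left( \sum_{k=1}^{K} \alpha_{i,k} + 2 \right)}$$

where $\alpha_{i,j}$ are the shape parameters of the Dirichlet prior and $y_{i,j}$ is the indicator of the observed injury level.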
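As a worked illustration, the posterior mean and variance can be computed with the standard moment formulas of a Dirichlet distribution whose parameters are the prior shape parameters plus the observed indicator vector. The function name `posterior_moments` and the numbers below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def posterior_moments(alpha, y):
    """Posterior mean and variance of each p_j for a Dirichlet(alpha + y) posterior."""
    post = [a + obs for a, obs in zip(alpha, y)]
    total = sum(post)
    mean = [a / total for a in post]
    var = [a * (total - a) / (total ** 2 * (total + 1)) for a in post]
    return mean, var


alpha = [2.0, 1.0, 0.5, 0.5]   # hypothetical prior shape parameters
y = [0, 1, 0, 0]               # the accident resulted in one death
mean, var = posterior_moments(alpha, y)

for level, m, v in zip(["severe injury", "one death", "two deaths", "three deaths"], mean, var):
    print(f"{level}: mean={m:.3f}, variance={v:.4f}")
```

With these toy values the posterior parameters are (2, 2, 0.5, 0.5) and sum to 5, so the posterior mean of the second level is 2/5 and its variance is 2·3/(25·6) = 0.04.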

Factors analysis
We selected 59 accident cases that occurred during the peak construction period of the Xiluodu and Xiangjiaba projects. The accidents are identified not by name but by serial numbers from 1 to 59. The statistical procedure determines, for each accident in turn, whether each HFACS factor was a cause of the accident. For the 59 accidents, $x$ is the full-rank $(n, p)$ matrix of the explanatory variables of the observed sample, and $\beta_{i,j}$ measures the influence of the explanatory variable $x_i$ on the probability distribution of accident severity.

The value of $x$ is: $x$ = (the sum of the edge frequencies of the factors in the accidents)/1000. The divisor 1000 arises because there are ten factors and the edge frequency is expressed as a percentage (10 × 100).
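The scaling by 1000 can be illustrated with a small Python sketch. It rests on an interpretation that the text does not spell out: the edge frequency of a factor is taken to be the percentage of sampled accidents in which that factor was identified, and the value for an accident is the sum of the edge frequencies of its identified factors divided by 100 times the number of factors. The incidence matrix below is toy data, not the 59 cases of the study, and the function names are hypothetical.

```python
def edge_frequencies(incidence):
    """Percentage of accidents in which each factor was identified (column marginals)."""
    n_accidents = len(incidence)
    n_factors = len(incidence[0])
    return [100.0 * sum(row[f] for row in incidence) / n_accidents for f in range(n_factors)]


def explanatory_values(incidence):
    """Sum of the edge frequencies of the factors present in each accident, scaled to [0, 1]."""
    freqs = edge_frequencies(incidence)
    divisor = 100.0 * len(freqs)  # equals 1000 when there are ten factors
    return [
        sum(freq for freq, present in zip(freqs, row) if present) / divisor
        for row in incidence
    ]


# Toy incidence matrix: rows are accidents, columns are the ten factors
# (1 = the factor was identified as a cause of that accident).
incidence = [
    [1, 0, 1, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 1],
]

x = explanatory_values(incidence)
print(x)
```

Under this reading, dividing by 1000 keeps every value between 0 and 1, with 1 reached only if all ten factors appear in every accident.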

The calculation of prior probability and posterior probability
Based on the estimated hyperparameters and Equations (2) and (6), we calculated the prior and posterior probabilities of the 59 accidents.
The calculation results show the prior and posterior distributions of the accident samples. The differences among the prior distributions are not obvious, which indicates that researchers cannot easily identify the pattern of occurrence of serious accidents from the empirical prior distribution alone. The posterior distribution makes up for this shortcoming: it clearly reflects the difference between serious accidents and other accidents. For example, for accidents 32, 34, 39 and 46 the posterior value is small and the difference between the posterior and the prior is negative. This feature shows that using the Bayesian method to analyze accident causes is feasible.
Sorting the factors in descending order of standard deviation gives the order of importance of the various factors shown in Table 2. This is the ranking of the factors by accident severity based on Bayesian theory under the new HFACS framework. In summary, for the 59 accidents the sample size is small, accidents differ from experiments in that they cannot be repeated, and accident analysis is strongly subjective. These features make the Bayesian statistical method suitable.

The calculation and analysis of expectation
The expectation of each factor is shown in Table 3. We use the data in Table 3 to draw the line chart shown in Figure 2 for further analysis. The following conclusions can be drawn from Figure 2. For severe injury accidents, almost all expectations exceed 0.3; "(5) personal quality", "(3) supervisory violations", "(6) crew management" and "(8) perceptual and decision error" are the four factors most prone to cause severe injury. For one-death accidents, the expectations are distributed evenly on both sides of 0.3; the value of "(10) operation violations" is significantly lower than that of the other factors, whereas the values of "(9) skill-based errors", "(5) personal quality" and "(2) organizational process" are relatively high. For two-death accidents, "(3) supervisory violations", "(4) failed to correct a known problem" and "(5) personal quality" most easily induce accidents. For three-death accidents, "(5) personal quality", "(3) supervisory violations" and "(8) perceptual and decision error" most easily result in accidents.

The calculation and analysis of variance
The posterior variance of each factor is shown in Table 4. We use the data in Table 4 to draw the line chart of the posterior variance of accident severity for each factor, shown in Figure 3. The following conclusion can be drawn from Figure 3: the posterior variance of each factor for the four types of severity is distributed smoothly around a mean value. The severe injury accidents and the one

Figure 1. Four levels of the HFACS framework.

For convenience, it is assumed that the prior distributions all belong to the same family of probability distributions. A Dirichlet distribution is used because it can be mixed conveniently with the Multinomial distribution and results in a posterior distribution of the same family with updated parameters. The probability density function for the prior distribution is given by Equation (2).

$\beta_j$ is a $(p, 1)$ array of (unknown) weights measuring the causal effects of the explanatory variables on the shape of the probability distribution through the shape parameters. These weights measure the effects of the explanatory variables $x_i$ on the importance of outcome $j$ in the probability distribution of accident severity.


$\beta_{i,j}$ = (the sum of the edge frequencies of the factors corresponding to injury type $j$ in the accidents)/1000. For example, $\beta_{1,j}$ represents the sum of the edge frequencies of the factors corresponding to injury type $j$ in accident 1. The values of $\beta_{1,j}$ in accident 1 are shown as follows:

Table 2. The prior probability, posterior probability and standard deviation of each factor, in descending order of standard deviation.

Table 3. The expectations of the influencing factors.