Tunnel stability analysis using binary and multinomial logistic regression (LR)

Falling rock blocks are among the most serious problems in tunneling projects, which makes it important to predict stability from measurable input parameters. Among the existing rock mass classification systems for underground structures, rock mass rating (RMR) and Q are probably the most widely used, and this is unlikely to change in the near future. Because both are routinely available in tunneling projects, a valid stability assessment method based on them would be useful. Neither RMR nor Q alone fully reflects the tunnel stability condition, since each omits some rock mass properties; both were therefore used in this analysis to give a fuller picture of rock mass condition and stability. To this end, data (RMR, Q, and hydraulic radius) from 104 cases in eight tunnel projects were gathered. Based on the stability condition observed in each tunnel, the cases were classified into three categories: stable, potentially unstable and unstable. Two models were then defined and their formulas were found using binary and multinomial logistic regression; the better predictive model was selected by the percentage of correctly predicted cases. The results show that logistic regression (LR) is a robust tool for developing predictive models for tunneling projects and can assist engineers in predicting the stability condition of tunnels.
 
   
 
 Key words: Logistic regression, tunneling, rock mass classification, stability.


INTRODUCTION
Rock falls and collapses in excavation projects cause considerable fatalities and damage to manpower and machinery. For a long time, therefore, many stability analyses and studies have been conducted to predict and control these events in tunnels, stopes and pillars, and several stability graphs have been established as a result. For example, the Mathews stability graph method for open-stope design was first proposed for mining at depths below 1000 m (Mathews, 1980), and over the years it has been modified by Potvin (1988), Stewart and Forsyth (1995), Nickson (1992), and Hadjigeorgiou et al. (1995); the modified Mathews stability graph of Stewart and Forsyth is shown in Figure 1. Other stability graphs have been developed to estimate the stability condition and the probability of failure of stopes, pillars and tunnels, such as Laubscher's caving stability graph (Laubscher, 1990), the stability graph method of Barton et al. (1974) based on the NGI tunneling index Q, and the pillar stability graph developed by Lunder (1994) and modified by Mah (1995).
The purpose of this paper is to develop a stability graph and method specifically for tunnels, based on both the RMR and Q rock mass classification systems. The two most widely used and available systems in tunnels are the Norwegian Geotechnical Institute's Q system (Barton et al., 1974) and the various versions of the rock mass rating (RMR) system, originally proposed by Bieniawski (1973). Interestingly, both systems trace their origin to tunneling. However, each of these systems has shortcomings that prevent it from reflecting the tunnel stability condition entirely. For example, the Q system does not take the rock material strength into account explicitly, although it is implicitly included in arriving at the stress reduction factor (SRF). The SRF in the Q system may be significant depending on the depth. The orientation is also not taken into account. The RMR does not take account of the confining stress in the rock mass, nor explicitly the number of joint sets. Since both rock quality designation (RQD) and joint spacing are classification parameters, considerable weight is given to block size. The weighting of parameters also differs between the two systems (Barton et al., 1974; Barton, 1991; Bieniawski, 1973, 1976, 1989).

*Corresponding author. E-mail: raminlamezi@gmail.com.

Figure 1. Modified Mathews stability graph (Stewart and Forsyth, 1995).
These two rock mass classification systems, RMR and Q, were considered as candidates for assessing the geotechnical conditions of the eight tunnels. However, as noted above, neither system can adequately describe all the observed geotechnical conditions and failure mechanisms. The conclusion was that a hybrid system would provide the best results.
For this reason, both systems (RMR and Q) were used in the models. The two models, which include both rock mass classifications, were built using logistic regression, in which the probability of stability is predicted by entering the RMR, Q and hydraulic radius ($R_h$) of each case, given in Table 1, as input variables. The hydraulic radius of a tunnel is defined as:

$R_h = S / P$  (1)

where $S$ is the tunnel cross-section area and $P$ is the tunnel cross-section perimeter. Finally, the better model was selected to estimate and classify the tunnel conditions, using the statistical results of each model.
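As a minimal sketch of this definition, the hydraulic radius can be computed as the ratio of the cross-section area to the cross-section perimeter. The function name and the 6 m by 6 m square opening are hypothetical and serve only as an illustration; they are not taken from Table 1.

```python
def hydraulic_radius(area_m2, perimeter_m):
    """Hydraulic radius R_h = S / P (area over perimeter), in metres."""
    if perimeter_m <= 0:
        raise ValueError("perimeter must be positive")
    return area_m2 / perimeter_m


# Hypothetical square opening, 6 m wide and 6 m high (illustration only).
width_m = 6.0
height_m = 6.0
area_m2 = width_m * height_m               # S = 36 m^2
perimeter_m = 2.0 * (width_m + height_m)   # P = 24 m

r_h = hydraulic_radius(area_m2, perimeter_m)
print(f"R_h = {r_h:.2f} m")
```

For a fixed shape, the hydraulic radius grows with the size of the opening, which is why it serves as the geometric input alongside the two rock mass ratings.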

LOGISTIC REGRESSION (LR) ANALYSIS
Logistic regression is useful for situations in which the purpose is to predict the presence or absence of a characteristic or outcome based on the values of a set of predictor (independent) variables. It is similar to a linear regression model but is suited to models where the dependent variable is discrete. Logistic regression is applicable to a broader range of research situations than discriminant analysis. In mining engineering, for example, logistic regression analysis has been used in stability studies (Trueman et al., 2000; Mawdesley et al., 2001), and Molinda et al. (2000) performed simple regression analysis with significant geotechnical variables such as overburden, bolt strength, bolt capacity, grout length, density, entry width, coal mine roof rating (CMRR) and intersection span to predict the roof fall rate. This type of analysis is appropriate for data sets with a binary or multinomial dependent variable and a number of numerical independent variables. Logistic regression analysis has two basic forms, (a) binary and (b) multinomial, which are summarized briefly below.

Binary logistic regression
Binary logistic regression is most useful when the aim is to model the event probability for a categorical response (dependent) variable with two outcomes (dichotomous). For example, an engineer may want to know whether a particular excavation, described by input parameters such as the geotechnical properties of the rock mass, would fall in the stable or the unstable category. Since the probability of an event must lie between 0 and 1, it is impractical to model probabilities with linear regression techniques, because the linear regression model allows the dependent variable to take values greater than 1 or less than 0. The logistic regression model is a type of generalized linear model that extends the linear regression model by linking the range of real numbers to the 0 to 1 range. The general form of logistic regression described by Charles (2002) is shown in Equation 2.
where $pr$ is the probability for the category with logit value 1.
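A minimal sketch of the binary model, assuming Equation 2 takes the standard logistic form $pr = 1 / (1 + e^{-Z})$ with linear predictor $Z = b_0 + b_1 X_1 + \dots + b_k X_k$. The coefficients and predictor values below are hypothetical, chosen so that $Z = 0$; they are not fitted values from this study.

```python
import math


def linear_predictor(b, x):
    """Z = b[0] + b[1]*x[0] + ... + b[k]*x[k-1]."""
    if len(b) != len(x) + 1:
        raise ValueError("need one intercept plus one coefficient per predictor")
    return b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))


def logistic(z):
    """Map any real Z to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and predictors (illustration only).
b = [0.5, 1.0, -2.0]   # b0, b1, b2
x = [1.5, 1.0]         # X1, X2

z = linear_predictor(b, x)   # 0.5 + 1.5 - 2.0 = 0.0
pr = logistic(z)             # Z = 0 gives a probability of 0.5

# Unlike a straight line, the logistic curve stays inside (0, 1).
for z_value in (-5.0, 0.0, 5.0):
    print(z_value, round(logistic(z_value), 4))
```

The probability of the other category (logit value 0) is then $1 - pr$, so the two outcomes always sum to one.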

Multinomial logistic regression
The general form of multinomial logistic regression is given by Equation (3), in which the left-hand side is the probability that a case falls in category $i$, $k$ is the number of independent variables, $X_i$ are the independent variables (predictors), and $b_{ji}$ are the regression coefficients for category $i$. As it stands, if a constant is added to each $Z$, the outcome probability is unchanged. This is the problem of non-identifiability. To solve this problem, the $Z$ of one category is (arbitrarily) set to 0. That category is called the reference category, because all parameters in the model are interpreted in reference to it.
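The identifiability argument can be illustrated with a short sketch, assuming the multinomial model takes the standard form in which the probability of category $i$ is $e^{Z_i}$ divided by the sum of $e^{Z_j}$ over all categories. The three $Z$ values are hypothetical and are not estimates from this study.

```python
import math


def category_probabilities(z_values):
    """Return exp(Z_i) / sum_j exp(Z_j) for each category."""
    z_max = max(z_values)  # subtracting the maximum avoids overflow
    exps = [math.exp(z - z_max) for z in z_values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical Z values for three categories (illustration only).
z = [1.2, -0.3, 0.4]
p = category_probabilities(z)

# Adding the same constant to every Z does not change the probabilities,
# so the Z values are not uniquely determined by the data.
p_shifted = category_probabilities([v + 5.0 for v in z])

# Fixing one category at Z = 0 (the reference category) removes the
# ambiguity: subtract that category's Z from all of them.
z_referenced = [v - z[0] for v in z]
p_referenced = category_probabilities(z_referenced)

print([round(v, 4) for v in p])
print([round(v, 4) for v in p_shifted])
print(z_referenced[0], [round(v, 4) for v in p_referenced])
```

All three printed probability vectors agree, which is why fixing one $Z$ at zero costs nothing in terms of fit.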

DATA ANALYSIS
In this paper the input parameters, RMR, Q and hydraulic radius ($R_h$), are the independent variables (predictors). The dependent variable (response) is the stability condition, a categorical variable with three categories: stable, potentially-unstable and unstable. These variables are given for each tunneling case in Table 1. Since the dependent variable is categorical, logistic models are well suited to it; two well-defined models were therefore considered, as follows, and their applicability was evaluated.
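One way to picture a single record of this data set is sketched below. The class and field names are hypothetical, and the sample values are those of the worked example given later in this paper (RMR = 45, Q = 6.16, $R_h$ = 4.5 m, observed stable); no rows of Table 1 are reproduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Stability(Enum):
    """The three observed stability categories (the dependent variable)."""
    STABLE = "stable"
    POTENTIALLY_UNSTABLE = "potentially-unstable"
    UNSTABLE = "unstable"


@dataclass(frozen=True)
class TunnelCase:
    """One tunneling case: three predictors and the observed response."""
    rmr: float                 # rock mass rating
    q: float                   # NGI tunneling quality index
    hydraulic_radius_m: float  # R_h in metres
    stability: Stability

    def predictors(self):
        """Independent variables in the order (RMR, Q, R_h)."""
        return (self.rmr, self.q, self.hydraulic_radius_m)


# Values of the worked example in this paper, used only as an illustration.
case = TunnelCase(rmr=45.0, q=6.16, hydraulic_radius_m=4.5,
                  stability=Stability.STABLE)
print(case.predictors(), case.stability.value)
```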

Model 1
In this model all cases are classified into three categories: stable, potentially-unstable and unstable. The logit values 1, 0.5 and 0 are assigned to the stable, potentially-unstable and unstable categories, respectively. The nature of this model requires multinomial logistic regression; the SPSS statistical software was used for this purpose, and the parameter estimates and statistical summary are given in Table 2, in which $b_{ji}$ are the regression coefficients in Equation (3). In this analysis the reference category is stable (logit value 1.00). There are several ways to assess the accuracy and reliability of a logistic model, of which the classification results, or goodness of fit, are the most common. The goodness-of-fit results are given in Table 3, which shows the practical results of using the multinomial logistic regression model to classify the 104 cases. Cells on the diagonal are correct predictions and cells off the diagonal are incorrect predictions. For example, the table shows that of the 47 stable cases, 42 were classified correctly into the stable category and the remaining 5 were incorrectly classified into the unstable category. The overall percentage of correctly classified cases in this model is 71.2%. Despite this overall percentage, the classification of the potentially-unstable category is not satisfactory: of the 17 potentially-unstable cases, only one was classified correctly, 11 fell into the unstable category and the remaining 5 were classified into the stable category. Two important results can be drawn from Table 3: (1) This model does not classify the potentially-unstable cases satisfactorily, and it is not a proper and reliable model for this category. (2) If the potentially-unstable category is to be merged with either the stable or the unstable category, it should be merged with the unstable category, because the logistic model tends to classify its cases into the unstable category. The formulas of this model follow from Equation (3); in these formulas, $Z_1 = 0$ because the reference category is stable (logit value 1.0).
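The classification counts quoted above for Model 1 can be checked with a few lines of arithmetic. Only the stable and potentially-unstable rows are itemized in the text, so the figures for the unstable row below are derived from the reported totals (104 cases, 71.2% correct overall) and should be read as an inference, not as reported values.

```python
# Rows of the Model 1 classification table as stated in the text:
# observed category -> number of cases per predicted category.
reported_rows = {
    "stable": {"stable": 42, "potentially-unstable": 0, "unstable": 5},
    "potentially-unstable": {"stable": 5, "potentially-unstable": 1,
                             "unstable": 11},
}


def percent_correct(observed, row):
    """Percentage of one observed category that was predicted correctly."""
    return 100.0 * row[observed] / sum(row.values())


per_category = {name: round(percent_correct(name, row), 1)
                for name, row in reported_rows.items()}

# Derived, not reported: the unstable row is not itemized in the text.
total_cases = 104
overall_percent = 71.2
total_correct = round(total_cases * overall_percent / 100.0)
unstable_cases = total_cases - 47 - 17
unstable_correct = total_correct - 42 - 1

print(per_category)
print(unstable_correct, "of", unstable_cases,
      "unstable cases correct (derived)")
```

The very low share of correct predictions in the potentially-unstable row is what motivates merging that category with the unstable one.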

Model 2
Next, the potentially-unstable and unstable categories were merged into a single category called unstable, leaving two categories, stable and unstable; this is a binary problem. In this model, the logit values 1 and 0 are assigned to the stable and unstable categories, respectively. Analysis with the SPSS software gave the results in Tables 4 and 5. Table 5 shows that of the 57 unstable cases, 49 (86.0%) were classified correctly and only 8 were classified wrongly. Of the 47 stable cases, 38 were classified correctly, a correct percentage of 80.9%, which is reliable and satisfactory. The overall percentage of correctly classified cases in this model is 83.7%. In this model both categories are classified satisfactorily and the overall percentage is reliable. The formulas of this model were obtained using Equation (2).
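The percentages reported for Model 2 can be reproduced from the counts given in the text. In the sketch below, the number of misclassified stable cases (9) is derived as 47 minus 38; the other counts are quoted directly. The overall figure rounds to 83.7%, the value quoted in the conclusion.

```python
# Model 2 classification counts: observed category -> predicted counts.
model2_rows = {
    "unstable": {"unstable": 49, "stable": 8},
    "stable": {"stable": 38, "unstable": 9},   # 9 = 47 - 38 (derived)
}


def percent_correct(observed, row):
    """Percentage of one observed category that was predicted correctly."""
    return 100.0 * row[observed] / sum(row.values())


per_category = {name: round(percent_correct(name, row), 1)
                for name, row in model2_rows.items()}

correct = sum(row[name] for name, row in model2_rows.items())
total = sum(sum(row.values()) for row in model2_rows.values())
overall = round(100.0 * correct / total, 1)

print(per_category)
print(f"overall: {correct}/{total} = {overall}%")
```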

An example
Consider a tunnel with the following rock mass properties: RMR = 45, Q = 6.16 and $R_h$ = 4.5 m. Model 2 is used to predict the stability of this tunnel, with the following results: probability of stability = 74% and probability of instability = 26%. Since the stability probability is greater than the instability probability, this case falls in the stable category; field observation confirms that this case is stable, as obtained from the analysis.
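The decision step of this example can be sketched as follows. The function names are hypothetical, and the log-odds calculation assumes the standard logistic form $pr = 1 / (1 + e^{-Z})$. The three-zone rule uses the 60% and 40% boundaries adopted in this paper for the stability graph; whether a probability exactly on a boundary belongs to one zone or the other is an assumption made here.

```python
import math


def binary_class(p_stable):
    """Binary decision: the category with the larger probability wins."""
    return "stable" if p_stable > 1.0 - p_stable else "unstable"


def zone(p_stable):
    """Three-zone rule with 60% / 40% stability-probability boundaries."""
    if p_stable >= 0.60:       # boundary inclusion is an assumption
        return "stable"
    if p_stable <= 0.40:
        return "unstable"
    return "potentially-unstable"


def log_odds(p):
    """Z value that the standard logistic function maps to probability p."""
    return math.log(p / (1.0 - p))


p_stable = 0.74                # probability of stability from the example
z_implied = log_odds(p_stable) # about 1.05 under the assumed logistic form

print(binary_class(p_stable), zone(p_stable), round(z_implied, 3))
```

A stability probability of 74% lies above the 60% boundary, so the example case is stable under both the binary decision and the three-zone rule.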
To establish curves between RMR and hydraulic radius ($R_h$), or between Q and hydraulic radius ($R_h$), the relationship between the two rock mass classifications must be found. To find the relationship between RMR and Q in these tunnels, three forms of regression were fitted between them; the results are given in Table 6. They show that the logarithmic regression gives the best equation between the two. The curves of the three regression forms are shown in Figure 3.
Using Equation (14), RMR and Q can be substituted for each other in the probability formulas. The equation relating RMR (or Q), $R_h$ and the probability of stability (or instability) can then be obtained, and the curves of RMR versus $R_h$ or Q versus $R_h$ can be drawn. Based on experience, engineering judgment and field observations, the boundary between the stable and potentially-unstable zones is the line on which the probability of stability is 60% and that of instability is 40%, and the boundary between the unstable and potentially-unstable zones is the line on which the probability of instability is 60%, in other words where the probability of stability is 40%. These three zones are shown in Figures 4 and 7. The isoprobability contours for the stable and unstable categories are given in Figures 5, 6, 8 and 9.

CONCLUSION
Due to the increasing number of tunnels being developed, this paper focused on evaluating the stability of eight tunnels and establishing a stability graph based on logistic regression models and rock mass classification systems. To this end, data from 104 cases in eight tunnels under development, including RMR, Q, hydraulic radius and stability condition, were gathered, and two well-defined logistic models were considered. The analysis showed that grouping all cases into two separate categories, stable and unstable, and using the binary form is satisfactory. This model classifies 83.7% of cases correctly. From an engineering judgment point of view, this precision is adequate and reliable.
To plot the graphs of RMR versus hydraulic radius or Q versus hydraulic radius and the isoprobability contours, several regression models were fitted between the RMR and Q data and the best equation between them was obtained. The related curves were drawn using this relationship. The results of this paper show that LR is a reliable and proper tool for developing predictive models for tunneling projects and can assist engineers in predicting the stability condition of tunnels.

Figure 3. Correlation between RMR and Q rock mass classifications.

Figure 6. Isoprobability contours for unstable tunnel based on logistic regression.

Figure 7. Stable, potentially unstable and unstable regions.

Table 1. Data of eight tunneling projects.

Table 2. Parameter estimates and statistical summary.

Multinomial logistic regression is useful for situations in which the purpose is to classify subjects based on the values of a set of predictor variables. This type of regression is similar to binary logistic regression, but is more general because the dependent variable is not restricted to two categories. For example, an engineer may want to know whether a particular excavation, described by input parameters such as the geotechnical properties of the rock mass, would fall in the stable, potentially-unstable or unstable category.

Table 3. Reliability and goodness-of-fit for model 1.

Table 5. Reliability and goodness-of-fit for model 2.

Table 6. Regression analysis summary and parameter estimates.