Modeling effluent heavy metal concentrations in a bioleaching process using an artificial neural network technique

Artificial neural networks (ANNs) were used to predict the recovery of heavy metals (Zn, Cu, Ni, Pb, Cd and Cr) from dewatered metal plating sludge (with no sulfide or sulfate compounds) by a bioleaching process involving Acidithiobacillus ferrooxidans. The bioleaching process was operated as a completely mixed batch (CMB) reactor. The heavy metal leaching data from the CMB reactor were simulated with a multi-layer perceptron (MLP) neural network. The performance of the reactor was evaluated with this robust model using the experimental data obtained under varying heavy metal concentrations in the sludge. Agitation time, pulp density of the sludge, and pH were used as inputs for the model, whereas the heavy metal (Cd, Cr, Cu, Ni, Pb, and Zn) concentrations were the output variables. The results of the models were compared using statistical criteria such as mean square error (MSE), mean absolute error (MAE), mean absolute relative error (MARE), and determination coefficient (R²). The results show that the MLP neural network produced highly accurate estimates of the aforementioned metals, with R² over 97.9%.


Abbreviations: CMB, completely mixed batch; MLP, multilayer perceptron; MSE, mean square error; MAE, mean absolute error; MARE, mean absolute relative error; R², determination coefficient.

Many external physical (adsorption-desorption, pulp density, particle size, agitation and temperature), biological (growth rate and cell concentration), chemical (pH and medium composition), and electrochemical (redox potential) parameters have been found to influence the performance of the bioleaching process. The common mathematical models that are built on these parameters and used in bioleaching applications, such as those of Kumar and Gandhi, Lacey and Lawson, Blancarte-Zurita and Branion, Konishi and Katoh, Sanmugasunderam and Branion, and Hanson (Haddadin et al., 1995), each have advantages and disadvantages. To manage the bioleaching process and the effluent heavy metal concentrations optimally, appropriate models that completely define the system are required. To control the performance of the system, it is imperative to use thorough models that depend on the determination of specific parameters and predict the process performance from these parameters.
Biological systems are non-linear, continually evolving, and highly complex. Artificial neural networks (ANNs) have found wide application in biological systems, and their importance and popularity have therefore increased greatly. Mathematical modeling of non-linear and complex biological systems is a difficult task. For this reason, regression models have been used to model most biological systems, owing to their wide range of application. However, regression models have been shown to err when estimating data that were not used to derive the regression equation. Mohanty et al. (2002) showed that ANN models simulate biological systems better than regression models. The use of neural networks to predict the solubilization of heavy metals originating from municipal wastewater treatment sludges has already been presented by Du et al. (1994), who used Thiobacillus thiooxidans and Thiobacillus thioparus in batch systems. The authors demonstrated that a neural network with the type of sludge, initial metal concentrations and pH as input variables could satisfactorily predict heavy metal solubilization. Since the 1990s, studies on biological systems and bioleaching have shown that ANN-based models predict the behavior of complex biological systems with numerous non-linearly correlated parameters better than conventional mathematical and statistical models (Acharya et al., 2006; Ozkaya et al., 2008; Liu et al., 2008; Jorjani et al., 2007; Nurmi et al., 2010; Laberge et al., 2000). In this study, a new ANN model was proposed for the estimation of heavy metal concentrations in a completely mixed batch reactor as an alternative to the conventional methods. The predictive ability of the proposed model was assessed using four statistical criteria: mean square error (MSE), mean absolute error (MAE), mean absolute relative error (MARE) and determination coefficient (R²).

MATERIALS AND METHODS
The methodology followed in this study has been previously described in detail (Bayat and Sari, 2010a). Therefore, the materials and methods are described only concisely here. The dewatered metal plating sludge samples studied were collected in polyethylene bags from the Karme Metal Plating Industries in Fatsa-Ordu, Turkey. The physicochemical characteristics of the dewatered sludge samples are presented in Table 1. All analyses were carried out in duplicate (Bayat and Sari, 2010a, b). The biological leaching experiments were carried out in a completely mixed batch (CMB) reactor with a volume of 3 L and dimensions of 15 cm inner diameter (ID) and 17 cm height, equipped with pH and temperature controllers, a stirrer, and an aerator system (Figure 1) (Bayat and Sari, 2010a).

Artificial neural network approach
An ANN has the ability to learn from examples, recognize a pattern in a group of data, adapt solutions over time, and process information rapidly. The application of ANNs to issues related to wastewater treatment and water resources conservation is rapidly gaining popularity due to their immense power and potential in the mapping of nonlinear system data. Recent studies, many of them in hydrological forecasting, have reported that the ANN technique may offer a promising alternative for bioleaching (Acharya et al., 2006; Ozkaya et al., 2008; Liu et al., 2008; Jorjani et al., 2007; Nurmi et al., 2010; Laberge et al., 2000), rainfall-runoff modeling (Lin and Chen, 2004), stream-flow prediction (Raman and Sunilkumar, 1995; Kisi, 2004a), suspended sediments (Kisi, 2004b), water resources (Cobaner et al., 2008), reservoir inflow forecasting (Coulibaly et al., 2005) and treatment of wastewater (Elmolla et al., 2010; Pai et al., 2009; Chen and Lo, 2010). The variation in the characteristics of a bioleaching system may be non-linear and multivariate, and the variables involved may have complex inter-relationships. In most cases, ANNs provide more reliable estimates of the dependent variables of concern. Processes that involve several parameters are readily amenable to neuro-computing. Among the many ANN structures that have been studied, the most widely used is the multilayer perceptron (MLP) network. An ANN consists of a number of data processing elements called neurons or nodes, which are grouped in layers. The input layer of neurons receives the input vector and transmits the information to the next layer through cross connections. In the current study, an MLP modeling technique was applied.

Multi-layer perceptron (MLP) neural network
An MLP distinguishes itself by the presence of one or more hidden layers, whose computation nodes are correspondingly called "hidden neurons" or "hidden units". The function of hidden neurons is to intervene between the external input and the network output in some useful manner. By adding one or more hidden layers, the network is enabled to extract higher order statistics. In a rather loose sense, the network acquires a global perspective despite its local connectivity due to the extra set of synaptic connections and the extra dimension of NN interconnections. Detailed theoretical information about the MLP can be found in Haykin (1998).
The MLP network used in the current study is shown in Figure 2.
The index $k$ refers to the individual output-layer neurons, and the indices $i$ and $j$ refer to the input neurons and the hidden-layer neurons, respectively, while $w_{ji}$ and $w_{kj}$ represent the connection weights between the input and hidden layers and between the hidden and output layers, respectively. A hidden-layer neuron produces the following output:

$$h_j = f\left(\sum_i w_{ji}\, x_i + b_j\right)$$

while an output-layer neuron produces the following output:

$$y_k = f\left(\sum_j w_{kj}\, h_j + b_k\right)$$

where $h_j$ is the output of the $j$th neuron in the hidden layer; $x_i$ is the input of the $i$th neuron in the input layer; $y_k$ is the output of the $k$th neuron in the output layer; $b_j$ and $b_k$ are the threshold values, also called the biases, associated with the hidden and output nodes, respectively; and $f$ denotes the activation function. Each neuron multiplies every input by its interconnection weight, sums the products, and then passes the sum through a transfer function to produce its result. This transfer function is usually a steadily increasing S-shaped curve, called a sigmoid function.
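The neuron computation described above (weight each input, sum the products, add the bias, apply the transfer function) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the layer sizes and the weight values are assumptions made for the example and are not taken from the trained network of this study.

```python
import math


def sigmoid(z):
    """Logistic (log-sigmoid) transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out, out_activation=sigmoid):
    """Forward pass of a one-hidden-layer MLP.

    x        : list of inputs, one value per input neuron
    w_hidden : one row of input weights per hidden neuron
    b_hidden : one bias per hidden neuron
    w_out    : one row of hidden-layer weights per output neuron
    b_out    : one bias per output neuron
    """
    # Hidden neuron: sigmoid of the weighted input sum plus the bias.
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Output neuron: transfer function of the weighted hidden outputs plus bias.
    output = [
        out_activation(sum(w * hj for w, hj in zip(row, hidden)) + b)
        for row, b in zip(w_out, b_out)
    ]
    return hidden, output


# Illustrative call with three inputs, two hidden neurons and one output.
# All numbers below are placeholders, not fitted weights.
hidden_values, prediction = mlp_forward(
    x=[0.2, 0.5, 0.7],
    w_hidden=[[0.4, -0.3, 0.1], [-0.2, 0.6, 0.5]],
    b_hidden=[0.0, 0.1],
    w_out=[[0.8, -0.5]],
    b_out=[0.05],
)
```

The output transfer function is left as a parameter so that either a sigmoid or a linear output node can be tried without changing the rest of the code.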
The MLP can have more than one hidden layer. However, theoretical works have shown that a single hidden layer is sufficient for an MLP to approximate any complex nonlinear function (Maier and Dandy, 1996; Onkal et al., 2005). Therefore, in this study, a one-hidden-layer MLP was used. Throughout all MLP simulations, adaptive learning rates were used to speed up training. The number of hidden-layer neurons was found using a simple trial-and-error method in all applications. The sigmoid and linear functions were used as the activation functions of the hidden and output nodes, respectively. Some recent studies have reported that the performance of the MLP was superior to conventional statistical and stochastic methods (Kisi, 2004a, b). Multilayer perceptrons can get trapped in a local minimum when they try to find the global minimum of the error surface. Maier and Dandy (2000) summarized the methods used in the published literature to overcome the local minima problem, such as training a number of networks starting with different initial weights, using an on-line training mode to help the network escape local minima, adding random noise, and employing second order schemes, such as the Newton-Raphson and Levenberg-Marquardt algorithms, or global methods such as stochastic gradient algorithms and simulated annealing. Other ANN methods, such as conjugate gradient algorithms, the radial basis function, the cascade correlation algorithm, and recurrent neural networks, were briefly explained in the report by the ASCE Task Committee on Application of Artificial Neural Networks in Hydrology (2000a, 2000b).

Levenberg-Marquardt algorithm
In the present study, the Levenberg-Marquardt algorithm was employed because this algorithm is more powerful than the conventional gradient descent techniques (Hagan and Menhaj, 1994). The Levenberg-Marquardt algorithm is an approximation of Newton's method and is very efficient for training networks with up to a few hundred weights. Although the computational load of the Levenberg-Marquardt algorithm is greater than that of other techniques, this is compensated by the increased efficiency and much better precision in results. In many cases, the Levenberg-Marquardt algorithm was found to converge when other backpropagation techniques diverged (Hagan and Menhaj, 1994).
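To make the idea concrete, the following is a minimal sketch of a Levenberg-Marquardt iteration, written with NumPy. It is a generic textbook form (damped Gauss-Newton step, with the damping factor lowered after a successful step and raised after a failed one) and is not the training code used in this study. The function name `levenberg_marquardt`, the damping schedule (factor of 10), the synthetic data and the single-sigmoid-neuron test problem are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def levenberg_marquardt(residual_fn, jacobian_fn, p0, mu=1e-2, max_iter=200, tol=1e-12):
    """Minimise the sum of squared residuals with a basic LM scheme."""
    p = np.asarray(p0, dtype=float)
    r = residual_fn(p)
    cost = float(r @ r)
    for _ in range(max_iter):
        J = jacobian_fn(p)
        # Damped normal equations: (J^T J + mu I) step = -J^T r
        step = np.linalg.solve(J.T @ J + mu * np.eye(p.size), -(J.T @ r))
        if np.linalg.norm(step) < tol:
            break
        r_new = residual_fn(p + step)
        cost_new = float(r_new @ r_new)
        if cost_new < cost:
            # Successful step: accept it and move towards Gauss-Newton.
            p, r, cost = p + step, r_new, cost_new
            mu /= 10.0
        else:
            # Failed step: reject it and move towards gradient descent.
            mu *= 10.0
    return p


# Synthetic example: recover the weight and bias of one sigmoid neuron.
x = np.linspace(-2.0, 2.0, 9)
y = sigmoid(1.5 * x - 0.5)  # noise-free targets generated for the demo


def residuals(p):
    return sigmoid(p[0] * x + p[1]) - y


def jacobian(p):
    s = sigmoid(p[0] * x + p[1])
    ds = s * (1.0 - s)  # derivative of the sigmoid
    return np.column_stack((ds * x, ds))


fitted = levenberg_marquardt(residuals, jacobian, [0.5, 0.0])
```

With a small damping factor the step approaches the Gauss-Newton step, which is where the fast convergence comes from; with a large damping factor it shrinks towards a short gradient-descent step, which is what keeps the iteration stable far from the minimum.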

Determination of an appropriate ANN model
Determining an appropriate architecture of a neural network for a particular problem is an important issue, as the network topology directly affects its computational complexity and its generalization capability. An MLP model with one hidden layer can approximate any complex non-linear function provided that a sufficient number of hidden-layer neurons is available (Hornik et al., 1989). Indeed, many experimental results seem to confirm that one hidden layer may be enough for most forecasting problems (Coulibaly et al., 1999). Therefore, in this study, a one-hidden-layer MLP model was used. Generally, the number of hidden-layer neurons is determined by a trial-and-error method. A common strategy for finding the optimum number of hidden-layer neurons is to start with a small number of neurons and increase their number while monitoring the performance criteria until no significant improvement is observed (Goh, 1995).
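The stopping rule of this trial-and-error strategy can be written down explicitly. The sketch below assumes that a network has already been trained for each candidate hidden-layer size and that one error value per size is available. The function name `choose_hidden_size`, the 1% relative-improvement threshold and the error values in the example are hypothetical and are not results of this study.

```python
def choose_hidden_size(error_by_size, min_improvement=0.01):
    """Return the hidden-layer size after which the error stops improving.

    error_by_size   : dict mapping number of hidden neurons to an error
                      measure (for example the testing-phase MSE)
    min_improvement : smallest relative error reduction that still counts
                      as a significant improvement (assumed threshold)
    """
    sizes = sorted(error_by_size)
    best = sizes[0]
    for size in sizes[1:]:
        previous = error_by_size[best]
        if previous == 0:
            break  # already a perfect fit, nothing left to gain
        gain = (previous - error_by_size[size]) / previous
        if gain < min_improvement:
            break  # no significant improvement: stop growing the layer
        best = size
    return best


# Placeholder error values, for illustration only.
example_errors = {2: 0.040, 4: 0.020, 6: 0.010, 8: 0.00995, 10: 0.0099}
selected = choose_hidden_size(example_errors)
```

Stopping at the first insignificant gain favours the smallest adequate network, which is consistent with the concern for computational complexity and generalization raised above.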

RESULTS AND DISCUSSION
In this study, the concentration of each heavy metal (Cd, Cr, Cu, Ni, Pb, and Zn) was the dependent variable, while the independent variables were the agitation time, pulp density, and pH. The minimum and maximum values of the model variables are provided in Table 2. There are no generally accepted rules for determining the optimum size of the training data set. The networks are not very sensitive to the number of training data, but very sensitive to the number of testing data. Attempts at reducing the training data size resulted in poor generalization capabilities in the testing phase. Therefore, the available data set was partitioned into a training set and a testing set, with 75 and 25% of the available experimental measurements selected for the training and testing phases, respectively. Before the training phase of the network, both input and output variables were normalized within the range of 0.1 to 0.9 as follows:

$$x_{norm} = 0.8\,\frac{x - x_{min}}{x_{max} - x_{min}} + 0.1$$

where $x_{norm}$ is the normalized value of a certain parameter, $x$ is the measured value for this parameter, and $x_{min}$ and $x_{max}$ are the minimum and maximum values in the database for this parameter, respectively.
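The two data preparation steps described above, scaling each variable into the range 0.1 to 0.9 and splitting the records 75/25 into training and testing sets, can be sketched as follows. The sketch assumes a linear min-max scaling and a random split with a fixed seed. How the measurements were actually assigned to the two sets is not stated here, so the shuffling, the function names and the sample values are assumptions made for illustration.

```python
import random


def scale_to_range(values, lo=0.1, hi=0.9):
    """Linearly map values so that the minimum becomes lo and the maximum hi."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min  # must be non-zero, i.e. the variable must vary
    return [lo + (hi - lo) * (v - v_min) / span for v in values]


def split_train_test(rows, train_fraction=0.75, seed=0):
    """Shuffle the records reproducibly and split them into two sets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Placeholder pH readings, for illustration only.
ph_values = [2.0, 4.0, 6.0]
ph_scaled = scale_to_range(ph_values)

records = list(range(8))  # stand-ins for eight experimental records
train_rows, test_rows = split_train_test(records)
```

Keeping the scaled values away from 0 and 1 avoids the flat tails of a sigmoid output node, which is the usual motivation for a 0.1 to 0.9 range.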
For all created neural networks, the general structure of one input, one hidden, and one output layer was used. In order to determine the optimal architecture, several neural networks were trained with different numbers of iterations (epochs) and different numbers of nodes in the hidden layer. For all cases, a "log sigmoid transfer function (log sig)" was used in the hidden and output layers. When the log sig was applied, the inputs and the outputs were normalized within the range of 0 to 1. The most accurate estimations of the ANNs were obtained with the logarithmic sigmoid transfer function. The best MLP results were obtained from the ANN (4, 6, 1) model using the logarithmic sigmoid activation function for both hidden and output layer neurons.
The MAE, MSE, MARE and R² values of the ANNs for both training and testing phases are given in Table 3. In Figures 3 and 4, the observed and predicted heavy metal concentrations are compared for the training and testing phases, respectively. As shown in these figures, the MLP produced highly accurate results in the estimation of heavy metal concentrations for both training and testing phases.
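For reference, the four statistical criteria can be computed from paired observed and predicted values as sketched below. The formulas are not printed in this text, so the sketch assumes their usual definitions, with MARE returned as a fraction and R² computed as one minus the ratio of the residual to the total sum of squares. The function name and the sample numbers are illustrative only.

```python
def regression_metrics(observed, predicted):
    """Return MSE, MAE, MARE and R2 for paired observed/predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MARE divides each absolute error by the observed value, so every
    # observed value must be non-zero.
    mare = sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n

    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot

    return {"MSE": mse, "MAE": mae, "MARE": mare, "R2": r2}


# Placeholder concentrations, for illustration only.
metrics = regression_metrics([2.0, 4.0], [1.0, 5.0])
```

Some authors instead report R² as the squared correlation coefficient between observed and predicted values; the two definitions agree only when the predictions are unbiased, so the choice should be stated alongside the results.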

Figure 3. Comparison between observed and predicted heavy metal concentrations in training phase.

Figure 4. Comparison between observed and predicted heavy metal concentrations in testing phase.

Table 2. Minimum and maximum values of input and output parameters.

Table 3. Statistical assessment of MLP-LM models for both training and testing phases.