Identifying qualified audit opinions by artificial neural networks

Data mining methods can help auditors in issuing their opinions. This paper, for the first time in Iran, applies four data mining classification techniques to develop models capable of identifying the auditor's opinion. The four techniques used in this study are the multi-layer perceptron neural network (MLP), the probabilistic neural network (PNN), the radial basis function network (RBF), and logistic regression (LR). The input vector included one qualitative variable as well as several quantitative variables. Our results showed the high capability of the MLP neural network in identifying different types of auditor's opinion. PNN was the most balanced model in identifying the type of auditor's opinion, with the closest error rates in identifying unqualified (clean) and qualified reports compared with the other models. The RBF neural network had the highest performance in identifying qualified opinions, and LR had the poorest performance in identifying qualified opinions. The results of this study can be useful to internal and external auditors and to company decision-makers.


INTRODUCTION
The purpose of an audit, as a test-based process grounded in related theories, is to refine information and improve its reliability so as to provide a favorable context for using that information in economic decisions. The final product of the audit process is a report in which the auditor expresses a professional opinion about the truth and fairness of the financial statements. In doing so, the auditor conveys to the client, by means of the audit report, information (namely a message) about the quality of the information presented for decision-making purposes, as well as about the fairness of the financial statements from the perspective of management accountability and governance. The purpose of this message, as the final product of the audit, is to improve financial reporting by increasing the credibility of the information presented; such credibility is based on audit evidence and therefore "can be justified" (American Accounting Association, 1973). On the other hand, the development of new technologies and their application in various sciences has drawn the audit profession's attention to these techniques. Technological change and its use in other sciences have encouraged auditors to employ new technologies with the aim of improving the efficacy of their procedures.
*Corresponding author. E-mail: nezam@mail.uk.ac.ir. Tel: +98 341 3235900. Fax: +98 341 3235900.
One of the most important ways of increasing audit efficiency is to apply new data mining techniques to predict the type of auditor's opinion. There have been notable international research efforts (Dopuch et al., 1987; Ireland, 2003; Pasiouras et al., 2007; Doumpos et al., 2005; Gaganis et al., 2007) to provide a model that can identify and predict the type of auditor's opinion. Accordingly, there has been increasing interest in the theoretical development of dynamic intelligent systems that do not assume a specific model and are based on empirical data.
The artificial neural network (ANN) is one such dynamic system: by analyzing empirical data in depth, it transfers the knowledge or rules hidden in the data into the structure of the network. Neural networks can be constructed simply by presenting examples of a specific real-world problem and its solution.
These real examples relate to a specific field of application and guide the neural network heuristically. In auditing, for example, there are many audit reports on various kinds of companies that can serve as past experience. Given these audit reports, we can use the characteristics and accounting information of companies as the training signal for the neural network, and from the outputs of such models we can arrive at the most probable type of auditor's opinion.
In this study, considering the deficiencies of popular classification and prediction models, we used the multi-layer perceptron (MLP) neural network to predict the type of auditor's opinion. To validate this model, its results were then compared with those of the probabilistic neural network (PNN), the radial basis function (RBF) neural network, and logistic regression (LR). Knowledge of the decision-making mechanism is very important to auditors: it assures the auditor that the logic of the model is reasonable and that it complies with, or at least does not contradict, recognized auditing principles and practices.
Multi-layer perceptron (MLP) neural networks are effective data mining classification methods. They offer several advantages over logistic regression, as they are very effective where nonlinear relationships exist between the dependent and the independent variables. MLPs do not impose arbitrary assumptions. They are tolerant of noisy data and are capable of classifying patterns they have not seen before.
Radial basis function (RBF) neural networks consist of two layers: a hidden radial basis layer of S1 neurons and a linear output layer of S2 neurons. RBF networks may require more neurons than standard feed-forward networks, but they can often be designed in a fraction of the time it takes to train standard feed-forward networks. They work best when many training vectors are available. The main advantage of these networks is that they can achieve zero error on the training data.
The probabilistic neural network (PNN) combines the simplicity, speed, and transparency of traditional statistical classification models with much of the computational power and flexibility of feed-forward networks.
The purpose of this study is to identify qualified auditor's opinions by conducting a comprehensive, in-depth analysis of company data using a neural network approach, for the first time in Iran. In our study, the four models are compared in terms of their overall predictive accuracy. This study has implications for internal and external auditors, company decision-makers, investors, financial analysts, and researchers. Researchers have developed classification models to help auditors form their opinions. By using such models, auditors can simultaneously screen a large number of firms and direct their attention to those with a higher probability of receiving a qualified audit opinion, thus saving time and money. Furthermore, auditors can use these models to predict what opinion other auditors would issue in similar circumstances when evaluating potential clients, in peer reviews, and to control quality within firms (Laitinen and Laitinen, 1998). Auditors may use such a model to plan specific auditing procedures to achieve an acceptable level of audit risk. Such a model can also be used as a quality control tool for the auditing process and for reviewing and finalizing it.
Despite its contribution, our study is not without limitations. One drawback of neural networks is that they do not reveal which variables contribute to the decision to classify a financial statement as qualified or unqualified (clean). Hence, the method operates as a "black box". Nevertheless, the main purpose of this kind of research is not to provide evidence of the association between published audit reports and company characteristics. Instead, the focus is on whether financial statements can be accurately classified as qualified or unqualified. Therefore, the coefficient estimates, their significance levels, and even their signs are less important (Dietrich, 1984).
The classification and prediction models used in this study differ from those used by other researchers (Kirkos et al., 2007) in three respects. First, this study is the first to use simultaneously three different neural networks that were previously used separately in other studies. Second, we tested these models in an emerging market (Iran), where the economic, social, and cultural environment is quite different from that of other countries. Third, our main model covers a more comprehensive range of accounting and financial information.
The remainder of the paper is structured as follows. First, the study reviews relevant prior research. Thereafter, it describes the research methodology. Afterward, it describes the developed models and analyzes the results. Finally, it presents the concluding remarks.
Many researchers, such as Spathis et al. (2003), Doumpos et al. (2005), Kirkos et al. (2007), and Gaganis et al. (2007), have worked on developing models for identifying qualified auditor's opinions. Spathis et al. (2003) examined the financial statements, auditors' opinions, and notes to the financial statements of companies in Greece that received a qualified audit report and of those that received an unqualified (clean) audit report. They modeled the auditor's qualification using a multicriteria decision aid classification method and compared it with other multivariate statistical techniques such as discriminant and logit analysis. Their findings indicated that the qualification decision is explained by financial ratios and by non-financial information such as client litigation. Their models classified the total sample correctly at rates of almost 80%. Spathis et al. (2004) used multi-group hierarchical discrimination (MHDIS) to identify the type of auditor's opinion and compared the performance of the resulting model with two discriminant analysis techniques and logit analysis. Their results showed that the predictive accuracy of MHDIS, similar to that of linear discriminant analysis, was about 72%, but that MHDIS had a more balanced prediction rate in identifying the type of auditor's opinion. Gaganis et al. (2007) examined the ability of probabilistic neural networks to identify the type of auditor's opinion and compared their predictive accuracy with that of MLP networks and logistic regression. They concluded that the probabilistic neural network performs better than the other two models. Kirkos et al. (2007) used three data mining models, namely the MLP neural network, the Bayesian belief network (BBN), and decision trees (DTs), to identify qualified auditor's opinions and their determinants. The models' results on the training set showed that the decision tree performed best, whereas the results on the test set revealed that the Bayesian belief network outperformed the DTs and the MLP and predicted the type of auditor's opinion more precisely. Gaganis et al. (2007) compared the ability of three approaches, namely nearest neighbours, discriminant analysis, and logit models, to identify the type of auditor's opinion. Their results revealed that the k-NN model, with an average classification accuracy of 76.29%, is more efficient than the discriminant analysis and logistic models. Entering a credit risk rating into the model substantially increased the goodness of fit and the classification accuracy.
Pourheydari et al. 11079

Data sources and sample
The population of this study includes all firms listed on the Tehran Stock Exchange (TSE) that were active during the years 2001 to 2007. Our population also includes the firms which have been listed on the stock exchange and have been active after the year 2007. However, to calculate some ratios, such as receivables turnover, total asset turnover, fixed asset turnover, total asset yield, total equity yield, and company growth, we extended our sample period to include the year 2000.
To choose a sample that is a good proxy for the intended population, we followed an elimination approach. We considered four criteria that any company in the population had to meet to be included in our final sample; these criteria are as follows:  1.
In this study, to identify the type of auditor's opinion, we first built our models using 29 variables; in the next step, the input variables were reduced to 15 using principal components analysis, and we then examined the performance of the models in identifying the type of auditor's opinion in both cases.
To measure profitability, we used the ratio of earnings before interest and taxes to sales, the ratio of earnings before taxes to sales, return on assets, return on capital employed, and return on shareholders' funds. To assess liquidity, the current ratio and the quick (acid test) ratio were used. We measured the firm's efficiency and operations using accounts receivable turnover, the accounts receivable collection period, inventory turnover, and fixed assets turnover.
The firm's ability to pay its obligations was assessed using two indicators: the solvency ratio, and the ratio of equity to the sum of long-term debt and the current portion of long-term debt. We also used the logarithm of the book value of assets, the logarithm of net sales, and the logarithm of the number of employees as proxies for firm size. The firm's growth was measured by the percentage change in total assets. Employee productivity was measured by four indicators: working capital per employee, assets per employee, net sales per employee, and net income per employee.
We also used the Z-score measure developed by Pourheydari and Koopaee (2010) to assess the firm's solvency. We measured the effect of lawsuits using a dummy variable taking the values zero and one. To assess cash flows, we used the ratio of cash flows from operations to sales and the ratio of cash flows from investing activities to sales. To take other influential factors into consideration, we used post retirement benefits allowance per capital, the ratio of tax allowance to sales, and the ratio of retained earnings to capital. The chosen title for each of these variables and their abbreviations are provided in Table 1.
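As an illustration of how a few of these proxies can be computed, the following sketch uses standard textbook definitions and hypothetical figures. The class `FirmYear`, its field names, the numbers, and the choice of a base-10 logarithm are assumptions made for illustration; they are not taken from the study's data or from Table 1.

```python
from dataclasses import dataclass
import math


@dataclass
class FirmYear:
    """Hypothetical balance-sheet and income-statement items for one firm-year."""
    current_assets: float
    inventory: float
    current_liabilities: float
    receivables: float
    net_sales: float
    total_assets: float
    prior_total_assets: float
    employees: int
    litigation: bool


def ratios(f: FirmYear) -> dict:
    """Compute a handful of the proxies named in the text (textbook definitions)."""
    receivables_turnover = f.net_sales / f.receivables
    return {
        "current_ratio": f.current_assets / f.current_liabilities,
        "quick_ratio": (f.current_assets - f.inventory) / f.current_liabilities,
        "receivables_turnover": receivables_turnover,
        "collection_period_days": 365.0 / receivables_turnover,
        # Assumption: base-10 logarithm; the text does not state the base.
        "log_total_assets": math.log10(f.total_assets),
        "asset_growth_pct": 100.0 * (f.total_assets - f.prior_total_assets) / f.prior_total_assets,
        "sales_per_employee": f.net_sales / f.employees,
        "litigation_dummy": 1 if f.litigation else 0,
    }


# Made-up figures, for illustration only.
example = FirmYear(current_assets=500.0, inventory=200.0, current_liabilities=400.0,
                   receivables=150.0, net_sales=1200.0, total_assets=1000.0,
                   prior_total_assets=800.0, employees=60, litigation=True)
result = ratios(example)
```

Each firm-year then contributes one such row of quantitative proxies plus the zero/one litigation dummy to the input vector.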

Descriptive statistics
Descriptive statistics, comprising the mean and standard deviation of the variables as well as their kurtosis and skewness, are presented separately for the qualified and unqualified opinion groups in Table 2. These results show that firms with unqualified financial statements, compared with those with qualified financial statements, are in a relatively better position with respect to profitability, operational activities, efficiency, liquidity, and ability to pay their obligations. A comparison of the mean working capital of the two groups reveals that firms with qualified financial statements often have negative working capital, which indicates a weak liquidity position. These firms also have lower net income per employee, EBIT to sales, and earnings before taxes to sales than firms with unqualified financial statements, which is consistent with prior studies showing that firms with a qualified auditor's opinion have relatively lower profitability (Loebbecke et al., 1989; Summers and Sweeney, 1998; Laitinen and Laitinen, 1998; Beasley et al., 1999; Pasiouras et al., 2007). A comparison of days' sales in receivables between the two groups shows that firms with qualified financial statements have longer days' sales in receivables, which supports the results of Spathis et al. (2004) and Doumpos et al. (2005), who argue that firms with a higher ratio of receivables to sales are more likely to receive a qualified auditor's opinion. These firms also have a lower Z-score than firms with unqualified financial statements, indicating their financial distress. Firms with a qualified auditor's opinion have lower growth than those with unqualified financial statements, and their return on equity, return on capital employed, return on assets, and ratio of cash flow from operations to sales are also lower. Inventory turnover in these firms is also higher, which may be due to stockpiling of damaged or unusable inventory that has led to an increase in inventory, or may even be a result of inventory overvaluation.

RESULTS ANALYSIS
The first type of network we used to identify the type of auditor's opinion was the MLP network. Learning in this network is supervised, and its learning algorithm, error back-propagation (EBP), comprises two steps. In the first step, input data are presented to the network and the effect of the inputs propagates forward through the succeeding layers; in this step the weights are held fixed, and at the end the network outputs are calculated. In the second step, the error signal is propagated back to the preceding layers and the weights are corrected accordingly. In designing an MLP network, the network's structural parameters, the type of learning algorithm, the learning rate, the number of hidden layers, the number of neurons in each hidden layer, and the number of training iterations (epochs) should be carefully considered.
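The two steps of error back-propagation can be sketched on a toy problem as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, the network has a single hidden layer of four neurons, the loss is the mean squared error, the learning rate is fixed, and the function name `train_mlp` and its parameters are hypothetical.

```python
import numpy as np


def train_mlp(X, y, hidden=4, lr=0.5, epochs=200, seed=0):
    """Train a one-hidden-layer MLP with plain error back-propagation.

    Returns the mean squared error recorded at the start of every epoch.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros((1, hidden))
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros((1, 1))
    losses = []
    for _ in range(epochs):
        # Step 1: forward pass with the weights held fixed.
        h = np.tanh(X @ W1 + b1)                      # hidden layer (tanh)
        out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))    # output layer (logistic)
        losses.append(float(np.mean((out - y) ** 2)))
        # Step 2: propagate the error signal backward, then correct the weights.
        d_out = 2.0 * (out - y) / len(X) * out * (1.0 - out)   # output-layer delta
        d_h = (d_out @ W2.T) * (1.0 - h ** 2)                  # hidden-layer delta
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0, keepdims=True)
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0, keepdims=True)
    return losses


# Synthetic toy data: 8 samples, 3 inputs, binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)
losses = train_mlp(X, y)
```

Comparing `losses[0]` with `losses[-1]` shows whether the repeated forward and backward passes reduced the training error.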
In classification problems, the number of neurons in the input layer equals the number of predictor (independent) variables. In this study, therefore, there were 29 neurons in the input layer. Determining the number of neurons in the intermediate (hidden) layers is not an easy task and is mostly done by trial and error, in such a way that the network's overall performance improves.
Generally, as the number of hidden layers increases, the network's ability to capture complexity in the training set increases, but this may reduce the network's ability to generalize. We should therefore strike a fair balance between these two costs so as to improve the network's overall performance. During the learning process, we should continually assess the network's ability to learn using target functions, and finally the network with the least error should be chosen. The network's ability to distinguish qualified from unqualified opinions and its overall error in identifying the type of auditor's opinion were computed separately. To find the best MLP structure for identifying the type of auditor's opinion, in the first step we constructed networks with one hidden layer containing 1 to 30 neurons.
After training for 6000 epochs, the least error occurred in the network with only 11 neurons in its hidden layer. The total error (percentage of false classifications) of this network was about 14.21%, and its training set error was approximately 9.21%. Then networks with two hidden layers were constructed. In this case, after testing networks with various numbers of neurons in each layer many times, the least error was observed for the network with 21 neurons in the first hidden layer and 10 neurons in the second. The network's overall error in this case was 12.25% for the test set, and the training set error was about 8.72%. The network's error increased when the number of hidden layers was increased beyond two.
Different networks with one, two, and three layers, and with the hyperbolic tangent (Tan-Sigmoid) or logistic (Log-Sigmoid) transfer function, were constructed, and after many runs it emerged that a network with the following characteristics gave the best results for our particular modeling problem:
1. A three-layer network with two hidden layers;
2. Tan-Sigmoid transfer function as the hidden layers' transfer function;
3. Log-Sigmoid transfer function as the output layer's transfer function;
4. 21 and 10 neurons for the first and second hidden layers, respectively; and
5. Traingda function as the network's training function.
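The shape of the selected structure can be sketched as a forward pass. This is only an illustration: the weights below are random placeholders rather than trained values, the input rows are synthetic, and the names `layer_sizes` and `predict` are hypothetical. The traingda training function (which appears to be the MATLAB toolbox name for gradient descent with an adaptive learning rate) is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
layer_sizes = [29, 21, 10, 1]   # inputs, first hidden, second hidden, output

# Random placeholder weights stand in for trained ones (illustration only).
weights = [rng.normal(scale=0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]


def predict(X):
    """Forward pass: tanh hidden layers, logistic output in (0, 1)."""
    a = X
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(a @ W + b)                 # Tan-Sigmoid hidden layers
    z = a @ weights[-1] + biases[-1]
    return 1.0 / (1.0 + np.exp(-z))            # Log-Sigmoid output layer


# Number of adjustable weights and biases in a 29-21-10-1 network.
n_parameters = sum(W.size + b.size for W, b in zip(weights, biases))
scores = predict(rng.normal(size=(5, 29)))     # five synthetic input vectors
```

With these layer sizes the network has 861 adjustable parameters, and each output can be read as a score between 0 and 1 that is thresholded to assign the opinion type.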
Figure 1 shows the MLP network with the optimal structure described above. We finally constructed a network with the previous-mentioned

Comparative analysis
Since no absolute judgment can be made about a model's predictive ability, we employed other models for comparison. In this study, we used the MLP neural network as our main model and compared its performance with that of the radial basis function (RBF) neural network, the probabilistic neural network (PNN), and logistic regression.

Radial basis functions networks
An RBF network can be built using either of two approaches. In the first approach, the network creates as many radial basis neurons as there are input vectors. The mean squared error on the training data after learning with this approach is always zero. In the second approach, neurons are added to the radial basis layer one at a time. Neurons are added until the sum-squared error falls beneath an error goal or a maximum number of neurons has been reached. At this stage the network stops, and its error in identifying the type of auditor's opinion is calculated.
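The first approach, with one radial basis neuron per training vector, can be sketched as follows. This is a minimal illustration on synthetic data, assuming a Gaussian basis function whose response falls to 0.5 at a distance equal to the spread; the function names `design_exact_rbf` and `rbf_predict` are hypothetical, and the incremental second approach is not shown.

```python
import numpy as np


def radbas(n):
    """Gaussian radial basis transfer function."""
    return np.exp(-n ** 2)


def design_exact_rbf(X, t, spread):
    """One hidden neuron per training vector; linear output layer solved directly."""
    b = np.sqrt(np.log(2.0)) / spread      # neuron output is 0.5 at distance `spread`
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    phi = np.hstack([radbas(b * dist), np.ones((len(X), 1))])   # hidden outputs + bias
    w, *_ = np.linalg.lstsq(phi, t, rcond=None)
    return X.copy(), b, w


def rbf_predict(model, X_new):
    """Evaluate the designed network on new input vectors."""
    centers, b, w = model
    dist = np.linalg.norm(X_new[:, None, :] - centers[None, :, :], axis=2)
    phi = np.hstack([radbas(b * dist), np.ones((len(X_new), 1))])
    return phi @ w


rng = np.random.default_rng(2)
X_train = rng.normal(size=(30, 4))
t_train = (X_train[:, 0] * X_train[:, 1] > 0).astype(float)   # synthetic 0/1 labels

model = design_exact_rbf(X_train, t_train, spread=1.0)
train_mse = float(np.mean((rbf_predict(model, X_train) - t_train) ** 2))
```

Because there are as many basis functions as training vectors, the linear output layer can reproduce the training targets, so `train_mse` is zero up to rounding error. That says nothing about the error on unseen data.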
In this study we used both approaches, and the results show that the latter performed better in classification. The error goal parameter was therefore set to 0.005. An important point in designing RBF networks is choosing an appropriate spread parameter. The spread must be large enough that the radbas neurons respond to overlapping regions of the input space, but not so large that all the neurons respond in essentially the same manner. To determine an appropriate spread, we therefore varied the spread parameter between 0 and 2 with a step size of 0.005; after 400 rounds of training, testing, and calculating the network error, the best value of the spread parameter was chosen. Figure 3 shows the network's overall error with respect to the spread parameter.
As can be seen in Figure 3, the least network error is attained with a spread parameter of 0.915. Table 4 presents the classification results for several spread parameters, from which the spread parameter (SP) can be chosen according to the type of opinion to be identified.
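A spread sweep of this kind can be sketched as below. The grid matches the description in the text (400 values from 0.005 to 2 in steps of 0.005), but everything else is an assumption for illustration: the data are synthetic, the helper `rbf_fit_predict` is hypothetical, a 0.5 threshold on the network output is used to assign classes, and for brevity the sketch uses the exact one-neuron-per-vector design rather than the incremental design with an error goal that the study selected.

```python
import numpy as np


def rbf_fit_predict(X_tr, t_tr, X_te, spread):
    """Exact-design Gaussian RBF network: fit on (X_tr, t_tr), score X_te."""
    b = np.sqrt(np.log(2.0)) / spread

    def hidden(A):
        d = np.linalg.norm(A[:, None, :] - X_tr[None, :, :], axis=2)
        return np.hstack([np.exp(-(b * d) ** 2), np.ones((len(A), 1))])

    w, *_ = np.linalg.lstsq(hidden(X_tr), t_tr, rcond=None)
    return hidden(X_te) @ w


rng = np.random.default_rng(3)
X = rng.normal(size=(120, 3))
t = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)    # synthetic labels, not the study's data
X_tr, t_tr, X_te, t_te = X[:80], t[:80], X[80:], t[80:]

spreads = np.arange(1, 401) * 0.005                # 0.005, 0.010, ..., 2.000
errors = np.array([
    100.0 * np.mean((rbf_fit_predict(X_tr, t_tr, X_te, s) >= 0.5) != t_te)
    for s in spreads
])                                                 # test-set error (%) per spread
best_spread = float(spreads[np.argmin(errors)])
```

Plotting `errors` against `spreads` gives a curve of the kind described for Figure 3; the per-class errors can be tracked in the same loop if one class matters more than the other.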
Since we are more interested in identifying qualified opinions, and identifying such reports is more important than identifying unqualified opinions, our optimal network was built with a target error of 0.005 and a spread parameter of 1.27. In this case, the network performs relatively better than the other networks in identifying qualified opinions, while still performing acceptably in identifying unqualified opinions.

Probabilistic neural network
We also employed another type of network, the PNN, to identify the type of auditor's opinion. Choosing an appropriate value for the smoothing parameter is an important point in designing a probabilistic neural network. The optimal value calculated for this parameter was 0.0588. The classification results showed satisfactory performance for the designed model: the network's training error was zero, the overall error on the test set was 15.19%, and the model correctly classified 82.36% of qualified reports, 86.03% of unqualified reports, and 84.81% of all auditor's reports (Table 5).
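The role of the smoothing parameter can be illustrated with a minimal PNN classifier. This sketch assumes a Gaussian kernel and equal class priors; the function name `pnn_predict`, the two synthetic clusters, and the class coding are hypothetical. The value 0.0588 is reused only as an example, since a suitable smoothing parameter depends on how the inputs are scaled.

```python
import numpy as np


def pnn_predict(X_train, y_train, X_new, sigma):
    """Assign each row of X_new to the class with the largest mean Gaussian kernel."""
    classes = np.unique(y_train)
    sq_dist = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))                 # pattern layer
    scores = np.column_stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes]   # summation layer
    )
    return classes[np.argmax(scores, axis=1)]                      # decision layer


rng = np.random.default_rng(4)
centers = {0: (0.3, 0.3), 1: (0.7, 0.7)}   # 0 = unqualified, 1 = qualified (synthetic)


def sample(n_per_class):
    """Draw n_per_class synthetic points around each class centre."""
    X = np.vstack([rng.normal(loc=centers[c], scale=0.05, size=(n_per_class, 2))
                   for c in (0, 1)])
    y = np.repeat([0, 1], n_per_class)
    return X, y


X_train, y_train = sample(40)
X_test, y_test = sample(20)

sigma = 0.0588
train_error = 100.0 * float(np.mean(pnn_predict(X_train, y_train, X_train, sigma) != y_train))
test_accuracy = 100.0 * float(np.mean(pnn_predict(X_train, y_train, X_test, sigma) == y_test))
```

Every training vector contributes a kernel value of 1 to its own class score, which is one reason a PNN with a small smoothing parameter tends to show zero training error; the test-set error is the more informative figure.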
After constructing the models, training the networks, and testing them on the holdout sample, prediction accuracy was used to evaluate the models' performance. By prediction accuracy, we mean the percentage of auditor's reports that each model classified correctly. The comparative performance of the four models in identifying the type of auditor's opinion is presented in Table 7.
The results of this study show that the MLP neural network, with a predictive accuracy of 89.71%, has the best performance in identifying unqualified opinions compared with the other models. With a classification accuracy of 87.75%, this network also has the best overall performance. The RBF neural network, with a predictive accuracy of 85.30%, has the best performance in identifying qualified opinions, whereas the other models perform better in identifying unqualified opinions. The LR model, although it classified 89.3% of unqualified opinions correctly, showed the weakest results in identifying qualified opinions and was the most unbalanced of all the models in identifying the different types of auditor's opinion. The PNN was the most balanced model: its errors in identifying unqualified and qualified opinions were the closest to each other of all the models.
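The per-class and overall accuracy rates used in this comparison, and the notion of a balanced model, can be computed as in the sketch below. The holdout sample here is hypothetical (40 qualified and 60 unqualified reports with made-up errors), and the names `classification_rates` and `balance_gap` are introduced only for illustration.

```python
import numpy as np


def classification_rates(y_true, y_pred, qualified=1, unqualified=0):
    """Percentage of correctly classified reports, per class and overall."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    correct = y_true == y_pred
    return {
        "qualified": 100.0 * float(correct[y_true == qualified].mean()),
        "unqualified": 100.0 * float(correct[y_true == unqualified].mean()),
        "overall": 100.0 * float(correct.mean()),
    }


# Hypothetical holdout sample: 40 qualified (1) and 60 unqualified (0) reports.
y_true = np.array([1] * 40 + [0] * 60)
y_pred = y_true.copy()
y_pred[:6] = 0        # 6 qualified reports misclassified as unqualified
y_pred[40:46] = 1     # 6 unqualified reports misclassified as qualified

rates = classification_rates(y_true, y_pred)
# A smaller gap between the two class rates means a more balanced model.
balance_gap = abs(rates["qualified"] - rates["unqualified"])
```

In this made-up example the rates are 85% for qualified reports, 90% for unqualified reports, and 88% overall, so the gap between the two classes is 5 percentage points.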
Because high-dimensional data make classification a complicated and time-consuming task, in this section we reduced the data dimension using principal components analysis (PCA) in order to evaluate the performance of the models with fewer dimensions. Using the chosen principal factors, we converted the original set of financial variables into a new set of 14 variables. Together, these factors explain 93.81% of the total variance of the original data. In addition, the firm's litigation dummy variable was entered into the analysis as a non-financial variable. Our input vector thus comprises 15 variables. Using these factors, we then designed the research models. The performance of the resulting models in identifying the type of auditor's opinion was evaluated for the two cases.
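The reduction step can be sketched as follows. This is an illustration under stated assumptions rather than the study's procedure: the data are synthetic, the split into 28 financial variables plus one litigation dummy is inferred from the counts in the text, the variables are standardised before the decomposition, and components are retained by a cumulative-variance target (the rule actually used to choose the 14 factors is not stated). The function name `pca_reduce` is hypothetical.

```python
import numpy as np


def pca_reduce(X, variance_target):
    """Project standardised data onto the fewest components reaching variance_target."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]                   # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum()      # cumulative variance share
    k = int(np.searchsorted(explained, variance_target) + 1)
    return Z @ eigvecs[:, :k], float(explained[k - 1])


rng = np.random.default_rng(5)
n_firms = 200
latent = rng.normal(size=(n_firms, 6))
# 28 correlated synthetic "financial" variables driven by 6 latent factors plus noise.
financial = latent @ rng.normal(size=(6, 28)) + 0.3 * rng.normal(size=(n_firms, 28))
litigation = rng.integers(0, 2, size=(n_firms, 1))      # 0/1 dummy, kept outside the PCA

components, share = pca_reduce(financial, variance_target=0.90)
inputs = np.hstack([components, litigation])             # reduced input vector
```

The reduced input vector has one column per retained component plus the litigation dummy, mirroring the 14 + 1 = 15 inputs described above.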
The best MLP performance was achieved by a network with two hidden layers, the Tan-Sigmoid transfer function for the hidden layers, and the Log-Sigmoid transfer function for the output layer; this network contained 18 and 9 neurons in its first and second hidden layers, respectively, and used the traingda training function (Table 8).
A comparison of the dimension-reduced models with the original models shows that:
1. The MLP neural network has, in both cases, the best overall classification accuracy in identifying the type of auditor's report, and PNN ranks next. PNN is also the most balanced network in identifying the various types of auditor's opinion.
2. As the data dimension decreases, the performance of the RBF network and the LR model in identifying unqualified opinions improves by 0.74 and 4.4%, respectively, whereas the reduction in dimension is accompanied by a significant fall in the accuracy of all models in identifying qualified opinions. The prediction error in identifying qualified opinions increased by 10.30% for the MLP network, and by 5.88, 8.82, and 11.5% for the PNN network, the RBF network, and LR, respectively.
3. The LR model, in both cases, performed well in identifying unqualified opinions but weakly in identifying qualified auditor's reports, and in this respect is an unbalanced model.

CONCLUSIONS AND RECOMMENDATIONS
New data mining techniques can assist auditors in forming their opinion. In this study, to develop models that can identify and predict the type of auditor's opinion, we examined the relative performance of neural networks in comparison with classic models. Four techniques were used: the multi-layer perceptron neural network (MLP), the probabilistic neural network (PNN), the radial basis function network (RBF), and logistic regression (LR). The input vector included one qualitative variable as well as several quantitative variables.
Our results showed the high capability of the MLP neural network in identifying and predicting different types of auditor's opinion. This network, with an accuracy rate of 87.75%, had the best performance in identifying the type of auditor's opinion. PNN was the most balanced model in identifying the type of auditor's opinion, and had the closest error rates in identifying unqualified and qualified reports compared with the other models. The RBF neural network, with a predictive accuracy of 85.30%, had the highest performance of all the models in identifying qualified opinions, whereas the other models performed better in identifying unqualified opinions; LR had the weakest performance in identifying qualified opinions and is an unbalanced model in identifying the different types of auditor's opinion. Also, after reducing the data dimension using PCA, the classification accuracy for qualified opinions fell significantly in all models. The results of these models can be used by internal and independent auditors to predict the type of auditor's opinion during the planning and evidence-gathering stages of the audit. These results may also be useful to investors and creditors in predicting the type of auditor's opinion on unaudited information. Securities exchange commissions, as the authoritative supervisors of capital markets, can also use the findings of this research in evaluating the quality of the financial reports that business enterprises submit to the securities exchange.
In this study, we used only four classification techniques to predict the type of auditor's opinion, but other classification techniques such as fuzzy neural networks, multi-dimensional scaling, Bayesian belief networks, decision trees, support vector machines, and discriminant analysis can also be employed. Alternatively, one can distinguish between the different paragraphs of the auditor's report and try to identify the type of auditor's opinion. A similar study on predicting the type of auditor's opinion with respect to the firm's going concern status would be of interest to other researchers. Other non-financial variables, such as audit firm size, the auditor's audit and non-audit contracts, the firm's market value, type of ownership (public, private, listed on the stock exchange, subsidiary), and change of auditor, can also be used to predict the type of auditor's opinion in future studies.

Figure 3. RBF network's overall error with respect to spread parameter.

Table 1. List of variables.
auditor's opinion, based on previous studies, and considering specific situation of Iran, we examined indicators relating to profitability, liquidity, leverage, growth, firm size, employee productivity, and the firms' efficiency along with other factors. Each of these indicators was measured using one or more proxies. A complete list of research variables is provided in Table

Table 3. Rate of accuracy in classification in MLP network.

Table 4. RBF network performance in identifying type of auditor's opinion.

Table 5. PNN performance in identifying type of auditor's opinion.

Table 6. LR performance in identifying type of auditor's opinion.

Table 7. Comparative performance of different techniques in identifying type of auditor's opinion.

Table 8. Comparative performance of models with and without using PCA.