A novel FOREX prediction methodology based on fundamental data

Markets play a critical role in the world economy and in the distribution of wealth. Predicting them can help to prevent crashes, avoid severe losses or make significant profits. Such prediction is not easy, however, because markets are very complex and are subject to a wide variety of influencing factors. Technical analysts, or chartists, rely on historical chart data to predict patterns from the previous behaviour of price graphs. This approach is fairly straightforward and has been automated to a great extent: there are computer programs, or predictor robots, that use the technical approach to support buy and sell decisions. However, market behaviour is more than the repetition of old patterns, and many events in the outside world have a constant impact on it. This external information ranges from political events to economic statistics. Fundamental analysts are those who understand the impact of world events on market behaviour. This requires knowledge of politics, microeconomics and macroeconomics, to say the least, and hence there are far fewer such analysts. Nevertheless, very successful analysts such as Warren Buffett have repeatedly emphasized the consideration of fundamental data in prediction. Proper fundamental analysis remains a challenge, and its automation is an even bigger one. Very few research efforts look into the possibilities of automating fundamental analysis. Hence, this work initiated a novel approach to the manipulation of fundamental data for identifying relationships between market behaviour and external information, and applied it to the foreign exchange market by observing the USD/GBP currency pair. An approach was devised and proposed for integrating fundamental data into automatic prediction. In this approach, 3 main sources of fundamental data were identified. From these sources, data was extracted, organized and then fed into a proposed neural network during 6 experiments. The experiments tested the possible relationships between the identified fundamental data and the price movements of the chosen currency pair (USD/GBP). The test results identified the datasets with plausible relationships with the market behaviour. The positive output observed for 3 different sets of input data showed the proposed methodology to be of considerable value for market prediction.


INTRODUCTION
In today's economy, stock markets play an essential role in the circumstances of nations. The total world derivatives market has been estimated at about $791 trillion in face or nominal value (BIS, 2008). Operating quite similarly to the stock market is the foreign exchange market (FOREX), a worldwide decentralized over-the-counter financial market for the trading of currencies. The foreign exchange market determines the relative values of different currencies (Levinson, 2006).
The primary purpose of the foreign exchange market is to assist international trade and investment by allowing businesses to convert one currency to another. For example, it permits a US business to import British goods and pay in pound sterling, even though the business's income is in US dollars. It also supports speculation and facilitates the carry trade, in which investors borrow low-yielding currencies and lend (invest in) high-yielding currencies (Flassbeck and La Marca, 2009).
In both markets mentioned, the equilibrium between supply and demand sets the price. That is, the decisions made by buyers and sellers determine the equilibrium. But what do buyers and sellers base their buying and selling decisions on? All decisions are naturally based on the information that they have at hand, or on their perception of the market and of the factors that influence it. This is the basis for the efficient-market hypothesis (EMH), which states that prices already reflect all known information and instantly change to reflect new information. There are three forms of this hypothesis, weak, semi-strong and strong, which differ in how much information is taken to be reflected in market prices. Strong EMH is the theory that all information, regardless of how new it is or how hidden from the public, is instantly reflected in the market. The other famous theory, at the other end of the spectrum, is the random walk theory, whereby all market movements are considered completely random events and stock market dealings are complete acts of gambling (Fama, 1965).
However, it is no secret that many stock market movements react to events in the world, and there are numerous examples of this. The impact on a company's stock price of its release of a robust and popular product is evident, as is the impact on the stock market of political events such as international sanctions or wars. Hence, this research takes as given the existence of strong relations between stock market prices and information about events in the outside world.
The more intriguing topic is the methods for identifying such relations between the outside world and the market. Once such relations have been identified, their impact on the market price can greatly help with foreseeing market movements. The stock market crash of 2008, and its impact on the lives of the world's people that are intertwined with it, demonstrated that we are far from being able to predict markets as well as we should today. Avoiding severe losses in the stock market equates to significant profitability. Therefore, this research is dedicated to increasing the profitability of decisions made in the stock market by basing them on relations with outside factors that have been identified as influential or merely related. There are two sources of data from which relations with the stock market can be derived, namely technical data and fundamental data, and around them the two main schools of thought in financial market analysis have evolved: technical analysis and fundamental analysis.
Nassirtoussi et al. 8323
Technical analysts, or chartists, look only at the movement characteristics that are observed on the charts. In technical analysis, graphs are analysed, and patterns of movement over short and long periods of time are determined and classified. These patterns are then used to predict recurring behaviour in the movement. Technical analysis is very popular because it is straightforward and easy to use with computer programs (Chaigusin et al., 2008; Lee, 2004). Although it works on many occasions, on its own it lacks the ability to bring much meaning to market movements.
On the other hand, fundamental analysis looks at world events and at data outside the charts, and at their impact on market movements. It usually looks at the health factors of companies, for example, their cash flow, income and balance sheet (Eng et al., 2008).
Fundamental analysis can be extended to other factors like geopolitical factors of influence, other economic data that is released by the governments and news. All the above sources or similar external sources of information can have an impact on the decisions that are made by buyers and sellers or can reflect a relationship between external phenomena and market movements. These factors can be especially relevant when considering the movements in FOREX. Taking into consideration a currency pair and its price moves and finding a relation between external information and the currency pair moves is the specific aspect that is explored in this work. This work benefits the body of research by establishing a novel methodology for integration of fundamental analysis into the market prediction analysis that is predominantly occupied by technical analysis. Such integration can complement the methods based on technical analysis and can provide grounds for much higher prediction accuracy rates.
This paper hypothesizes about the possible logical relevance of some external economic information to the moves of a currency pair. It then introduces some of the publicly available sources that have been identified for such data. For this work, a number of experiments are conducted to determine the existence of plausible relations between the external sources of information and the price moves of the currency pair. The experiments are conducted with the use of neural networks.
At the end, this work concludes by specifying the relations that it has identified and recommends further on how the results can be used in future research.

Possible relationship between a currency pair and fundamental national economic data
This work first hypothesizes about the existence of possible relationships between the movements of a currency pair in FOREX, as an example of a fluctuating price point in the market, and the national economic data of the relevant countries, as possible fundamental data that are external to the FOREX market charts. The objective is to devise a mechanism that can identify, with a known precision, plausible relationships between specific economic data and the price moves of the currency pair. Hence, the objective is twofold: firstly, to propose a mechanism for testing the existence of such a relation and, secondly, to test a number of different types of economic data and identify the ones for which a relationship can be observed.
The strongest currencies in the market are the USD, the euro and the pound sterling. Hence, possible sources of national economic data for the U.S., Europe and the U.K. are identified. The most convenient and useful sources for the purpose of this research are determined based on the comprehensiveness and reliability of their data, the format in which the data is presented, the availability of historic data and the frequency of data release. The sources that meet these criteria are: 1-Bureau of Economic Analysis, U.S. Department of Commerce (http://www.bea.gov), 2-Bank of England (http://www.bankofengland.co.uk). BEA is an agency of the Department of Commerce. Along with the Census Bureau and STAT-USA, BEA is part of the Department's Economics and Statistics Administration. BEA produces economic accounts statistics that enable government and business decision-makers, researchers, and the American public to follow and understand the performance of the Nation's economy. A more comprehensive introduction to BEA can be found in its mission statement at http://www.bea.gov/about/mission.htm.
On the other hand, the Bank of England is the central bank of the United Kingdom which is the centre of the UK's financial system; the Bank contributes to promoting and maintaining monetary and financial stability. Both of these institutions provide the exact kind of data that is required as input for the experiments in this research.
Since these two sources present financial data about the U.S. and the UK respectively, USD/GBP is the intuitive choice for the currency pair whose price moves are to be monitored in the experiments. The fundamental data that can be derived about the U.S. economy from the former source and about the UK economy from the latter is to be investigated for relationships with the pair's value. The existence of this relationship is what is hypothesized, and it is to be identified and tested. If the test succeeds, the moves of the currency pair can be predicted with a known precision based on the economic data.
Next is to identify some economic data among the many sets of the available data in the above sources. The primary criteria taken into consideration for choice of the data sets are as follows: Firstly, there needs to be an intuitive relationship between the fundamental data set and the currency pair USD/GBP. All data sets are related to the U.S. or the UK economics and are supposed to have an impact on the currency pair or be impacted by it but some seem to be better choices at least intuitively and those are given priority in this experiment. Secondly, the data set is to be available on a monthly basis as opposed to many fundamental data sets that are available only on a yearly basis, so that relatively shorter terms can be explored which are more attractive for prediction purposes. Thirdly, the monthly data should be available for the same period for all data (which in this experiment is from February 1996 onwards).
Based on the above criteria, 3 main sources for data sets are identified, which are: 1-UK International Reserves (in US dollar millions) from Bank of England, 2-U.S. National Retail and Food Services Sales from Bureau of Economic Analysis, 3-U.S. International Trade in Goods and Services (Total Import and Export) from Bureau of Economic Analysis. These data sets were available at the needed frequency for the period set for this experiment. Moreover, they were intuitively very relevant to USD/GBP moves. This is expanded on in the following.
International reserves are any kind of reserve funds that can be passed between the central banks of different countries. They are an acceptable form of payment between these banks. The reserves themselves can be either gold or a specific currency, such as the dollar or euro (International Reserves Definition, 2010). Central banks throughout the world have sometimes cooperated in buying and selling official international reserves to attempt to influence exchange rates (Foreign Exchange Reserves, 2010). The quantity of foreign exchange reserves can change as a central bank implements monetary policy (Aristovnik and Cec, 2009). Hence, there should be a solid relationship between the released data on UK international reserves and the fluctuations of USD/GBP. There are multiple indices with regard to the international reserves, each pertaining to a specific aspect of impact. The following were chosen for the purpose of this research, based on data availability and the plausibility of a relationship:
1-Monthly amounts outstanding of Central Government foreign currency total reserves (in US dollar millions) not seasonally adjusted,
2-Monthly amounts outstanding of Central Government IMF reserve tranche position total in special drawing rights (in US dollar millions) not seasonally adjusted,
3-Monthly amounts outstanding of Central Government Gold swapped or on loan total (in US dollar millions) not seasonally adjusted,
4-Monthly amounts outstanding of Central Government all foreign currency forwards and swaps (incl sterling leg) total (in US dollar millions) not seasonally adjusted,
5-Monthly amounts outstanding of Bank of England Banking Department all foreign currency total bills issued (in US dollar millions) not seasonally adjusted,
6-Quarterly amounts outstanding of Bank of England Banking Department total US dollar assets (in US dollar millions) not seasonally adjusted.
The above were used as part of the input for the experiments in this work as indicators on international reserves.
The next identified data set is U.S. National Retail and Food Services Sales from Bureau of Economic Analysis. Retail sales occur when businesses sell goods or services to households. How much consumers spend on retail and food services is tied closely to purchasing power and economic growth. It is plausible to assume that the strength of the U.S. currency has a relationship with the National Retail and Food Services Sales.
The next proposed data set is U.S. International Trade in Goods and Services (Total Import and Export) from Bureau of Economic Analysis. In this category, two indicators are used. The first is the balance of import and export in goods and services, which is the total export of goods and services minus the total import of goods and services on a monthly basis. The second is the total monthly export of goods and services in the U.S.
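The balance indicator described above is a simple element-wise difference of two monthly series. The following is a minimal sketch of that calculation; the function name and the sample figures are hypothetical and are not taken from the BEA data used in the experiments.

```python
def trade_balance(exports, imports):
    """Monthly balance: total exports minus total imports, month by month."""
    if len(exports) != len(imports):
        raise ValueError("exports and imports must cover the same months")
    return [e - m for e, m in zip(exports, imports)]


# Hypothetical figures in USD millions, for illustration only.
example_exports = [100.0, 110.0, 105.0]
example_imports = [120.0, 115.0, 125.0]
example_balance = trade_balance(example_exports, example_imports)
```

A negative entry in the result corresponds to a month in which imports exceeded exports.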
The aforementioned are the 3 categories of fundamental data found in statistical resources. Intuitively relationships are plausible between any of them and the currency pair moves or between a combination of them and the pair's moves. This is put to test to identify if such relationship exists, which potentially can be used for forecasting. Nevertheless, the restrictions which were imposed in selection of these datasets should not be dismissed. Prediction of market behaviour on a monthly basis requires availability of the kind of fundamental data that is required in this approach, that is, numeric data that is released in periodic reports of official financial organizations. Such data at the required frequency is not easily available as many of the official reports are of longer intervals. Furthermore, the data had to be available for the same period of time for all the sources, that is, from February 1996 onwards. Hence, such use of numeric fundamental data extracted from periodic reports is limited but it is novel and this study demonstrates its effectiveness.

Appropriateness of neural networks for relationship determination and stock/exchange market prediction
An artificial neural network (ANN), usually called a neural network (NN), is a mathematical or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. Modern neural networks are non-linear statistical data modelling tools. They are usually used to model complex relationships between inputs and outputs or to find patterns in data (Artificial neural network, 2010). Neural networks are well suited to modelling price movements (Emam, 2008; Eng et al., 2008). They can model price behaviour mathematically themselves, and they have the ability to extract information from large amounts of data, which is necessary for complex signals such as financial price movements (Samarasinghe, 2007; Technical Analysis with Neural Networks, 2010).
A neural network is made up of a number of interconnected neurons which behave like an artificial brain. The network is stimulated by appropriate input signals. In the case of technical analysis, the input signals can be historical price data and outputs from other technical indicators (Samarasinghe, 2007; Technical Analysis with Neural Networks, 2010; Wang, 2009).
The neural network is trained to find connections between these input signals and future price levels. If such connections exist, neural networks have a strong capability to find them. This can then be used to generate appropriate trading signals (Technical Analysis with Neural Networks, 2010).
There has been a lot of research on the use of neural networks in technical analysis of the stock market, but the idea is still fairly new (Dagli et al., 2003; Zhang et al., 1998; Chenoweth et al., 1996).
However, there is very little research about the possible use of neural networks in fundamental analysis and that is exactly what this work addresses.

Specifications of the neural networks
In order to use neural networks for experimentation in this research, a program by the name of GoldenGem is utilized. A number of considerations were made in the choice of this program. Firstly, the program had to be freely available for this academic research. Secondly, it had to use standard and established neural network algorithms. Next, it had to be convenient to use with the stock and FOREX markets and with the market data available on the Internet. Moreover, the training of the neural networks needed to be convenient and practical. Lastly, documentation for understanding its mechanism of operation needed to be accessible. With the aforementioned criteria, an array of available tools was tested and compared, which included NuClass7, Sciengy RPF, Sharky Neural Network, NeuroShell Engine, EasyNN-plus and GoldenGem. GoldenGem met the above criteria best and was determined to be the most suitable for the purpose of this study. Excerpts from GoldenGem's website on the technical specifications of this tool are used (GoldenGem, 2010). These details are vital for understanding the nature of the experiments that are conducted in this work. Experiment design and methodology are described in the sections that follow this one.
GoldenGem is a neural network computer program. The default configuration is the standard one, a three level perceptron, which can be a nonlinear function approximator. Figure 1 illustrates a perceptron network with three layers.
The tool can receive input for the neural networks as text files. Training is accomplished by the use of a logarithmic sensitivity adjustment. Validation is by a pair of indicator lights. The first indicator light becomes yellow if both the correlation coefficient and the adjusted correlation coefficient of predicted versus actual change are larger than 0.2, and green if they are larger than 0.5, while the second indicator light goes from red to yellow to green as the training input is removed by the user's control of the sensitivity adjustment.
The adjusted correlation coefficient is needed because it is possible to obtain a falsely favourable correlation coefficient during back testing by a strategy of returning to the known mean value of past data.
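The threshold rule for the first indicator light can be illustrated with a short sketch. It computes the ordinary Pearson correlation between predicted and actual changes and maps it to a colour using the 0.2 and 0.5 thresholds quoted above. Several points are assumptions: the text does not define how GoldenGem computes the adjusted correlation coefficient, so it is accepted here only as an optional, externally supplied number; the colour returned below 0.2 is also not stated and is assumed to be red; the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Ordinary Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def first_light(predicted_change, actual_change, adjusted_r=None):
    """Map correlation to a light colour with the 0.2 and 0.5 thresholds.

    adjusted_r is an assumed, externally computed value; its formula is
    not given in the text. Both coefficients must exceed a threshold.
    """
    r = pearson(predicted_change, actual_change)
    score = r if adjusted_r is None else min(r, adjusted_r)
    if score > 0.5:
        return "green"
    if score > 0.2:
        return "yellow"
    return "red"  # assumption: the colour below 0.2 is not stated in the text
```

Taking the minimum of the two coefficients is one way to express the requirement that both of them exceed the threshold.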
One will need to try different combinations of input variables before being able to make both lights remain green at the same time. If the lights cannot be made to remain green, the prediction is meaningless. If the lights do remain green, then that means a relationship has been found which has been able to make successful predictions during the backtesting interval. When sensitivity is set to zero, there is no training input, and the green graph is calculated only using data values of all variables from the time of the earlier red graph and any prediction you see therefore shows a real mathematical relationship during backtesting.
The configuration of the program is limited to analyzing the values of a set of variables that change over time, with the aim of predicting the future value of one of those variables based only on the current value of all the variables.
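The prediction setup described above, in which the future value of one variable is predicted from the current values of all variables, can be expressed as a set of training pairs. The sketch below builds such pairs from aligned series; the function name, the ticker names in the usage note and the horizon parameter are illustrative assumptions and do not describe GoldenGem's internal data handling.

```python
def make_training_pairs(series_by_ticker, target, horizon=1):
    """Pair the current values of all tickers with the target's future value.

    series_by_ticker maps a ticker name to a list of values, one per period,
    with all lists aligned on the same periods.
    """
    tickers = sorted(series_by_ticker)
    length = len(series_by_ticker[target])
    if any(len(series_by_ticker[name]) != length for name in tickers):
        raise ValueError("all series must cover the same periods")
    pairs = []
    for t in range(length - horizon):
        features = [series_by_ticker[name][t] for name in tickers]
        future_value = series_by_ticker[target][t + horizon]
        pairs.append((features, future_value))
    return pairs
```

For example, with two hypothetical tickers `"RETAIL"` and `"USDGBP"` of four periods each, `make_training_pairs(data, "USDGBP", horizon=1)` yields three pairs, each combining one period's values of both tickers with the next period's USD/GBP value.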
The algorithm is the most widely used and simplest algorithm (Eng et al., 2008). Improved algorithms such as conjugate gradient may possibly be superior (GoldenGem, 2010;Güreşen and Kayakutlu, 2008).
In Table 1, a brief summary of the technical specifications of GoldenGem is presented. Since the inputs are normalized to mean zero, a bias neuron is needed to break the symmetry in layer one. The subsequent layers do not need a bias neuron (GoldenGem, 2010).
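To make the role of the bias in layer one concrete, the following sketch implements the forward pass of a small three-layer perceptron on mean-centred inputs. It is not GoldenGem's implementation: the tanh activation, the layer sizes and all weights are assumptions chosen for illustration. In this sketch, with an odd activation such as tanh and no bias, the network output is an odd function of its input, so a zero input always gives a zero output; the bias in the first layer removes that constraint.

```python
import math


def center(values):
    """Shift a series so that its mean is zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def forward(inputs, hidden_weights, hidden_bias, output_weights):
    """Forward pass: input layer, one tanh hidden layer, linear output.

    The bias is applied only when computing the hidden layer; the output
    layer has no bias, mirroring the description in the text.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden))
```

Calling `forward` with all biases set to zero and with inputs `x` and then `-x` gives outputs of equal magnitude and opposite sign, which is the symmetry that the bias neuron breaks.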

EXPERIMENT DESIGN AND METHODOLOGY
Six experiments are designed based on the 3 specified sources of data. All input consists of monthly values from February 1996 to March 2010, that is, a total of 170 values for each of the criteria that are used to train the neural networks. Datasets are therefore chosen that, firstly, start no later than February 1996 and, secondly, are available on a monthly basis. Any data before or after these dates is omitted. Each of the criteria has 3 entries for each month: 1-the date, 2-the name of the factor, 3-the value. The name of the factor, that is, of one of the criteria used for training the neural networks in GoldenGem, is called a ticker from now on, because the program in its default mode uses other stock market tickers to train the neural networks for the prediction of a particular ticker. In these experiments, fundamental data is presented in the above structure, and the name of the data factor used for training is the so-called ticker in this context.
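The count of 170 monthly values follows from listing every month from February 1996 to March 2010 inclusive. The sketch below performs that enumeration with the standard library; the function name is hypothetical.

```python
from datetime import date


def month_starts(first, last):
    """Return the first day of every month from first to last, inclusive."""
    months = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


months = month_starts(date(1996, 2, 1), date(2010, 3, 1))
print(len(months))  # number of monthly observations per ticker
```

February 1996 to January 2010 spans 14 full years, or 168 months, and February and March 2010 add two more.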
The input values for a particular ticker are shown in Figure 2. For each experiment, an input text file is created that contains the date, ticker name and value, as above, for all the months in the mentioned time period and for all the tickers. That is, if a combination of six tickers is used to train the neural networks, all of them are put in the same file one after the other, and the available values of USD/GBP for the same period are placed in the same file in the same format.
Figure 2. Import export balance as a monthly ticker value fed into the neural networks in a text file.
Later on, in GoldenGem, all tickers are to be listed, separated by commas, in the 'related group of tickers' field, which is in the bottom right corner of the program console, as can be seen in Figure 3. Then, when the text file is loaded using the file menu, the tickers can also be seen in the drop-down menu on the right-hand side. There, one can choose which of the tickers is to be predicted, and the rest are used as training data and prediction input.
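The structure of the input file described above, one record of date, ticker name and value per month, with all tickers placed one after the other in a single file, can be sketched as follows. The exact layout that GoldenGem expects is not reproduced in the text, so the comma-separated layout, the date format, the ticker names and the sample values used here are all assumptions.

```python
def format_records(ticker, dated_values):
    """One text line per month: date, ticker name, value (assumed layout)."""
    return [f"{day},{ticker},{value}" for day, value in dated_values]


def build_input_text(series_by_ticker):
    """Concatenate the records of every ticker, one ticker after the other."""
    lines = []
    for ticker, dated_values in series_by_ticker.items():
        lines.extend(format_records(ticker, dated_values))
    return "\n".join(lines) + "\n"


# Hypothetical sample with two tickers and one month each.
sample = {
    "BALANCE": [("1996-02-01", -7.5)],
    "USDGBP": [("1996-02-01", 1.54)],
}
sample_text = build_input_text(sample)
```

The resulting string could be written to a text file with the standard `open(...).write(...)` call before being loaded into the program.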
The slider at the bottom left sets the number of days to be predicted. In our context, because the data is presented in a monthly format, this is the number of months for which the particular ticker is to be predicted. In this setting, the prediction is for the next 14 months.
The slider at the top left adjusts the sensitivity level to the real ticker value during the learning process. At the end of the learning process, the two indicator lights need to be green while the sensitivity is set to zero. This means that the green graph, which is the prediction, is created by the learned neural networks and is blind to the actual values of the ticker, yet matches the actual value (blue/red graph) within an acceptable proximity.

The method of conduct
After the ticker names have been set and the input file has been loaded, the sensitivity is adjusted to the highest level. The ticker that is to be predicted is chosen and the iterations start. There are two indicator lights. As long as the left one is green, the level of sensitivity can be reduced little by little until eventually it is set to zero. Then, if the left light is still green, the right light may go green too after a few iterations. As soon as the two lights are green, it is accepted that the data sets can predict the particular ticker (USD/GBP). Otherwise, the group of tickers is considered unable to predict the particular ticker.
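The manual procedure above can be summarised as a loop. The sketch below is only a schematic of the operator's steps and not GoldenGem's interface: the three callables that stand in for one training iteration and for reading the two lights are hypothetical, as are the starting sensitivity, the step size and the iteration limit.

```python
def run_procedure(train_step, left_light, right_light,
                  start=1.0, step=0.1, max_iterations=1000):
    """Lower the sensitivity while the left light stays green.

    Returns True once both lights are green with the sensitivity at zero,
    and False if that state is not reached within max_iterations.
    """
    sensitivity = start
    for _ in range(max_iterations):
        train_step(sensitivity)  # one training iteration at this sensitivity
        if sensitivity > 0 and left_light() == "green":
            sensitivity = max(0.0, sensitivity - step)  # reduce little by little
        if (sensitivity == 0.0
                and left_light() == "green"
                and right_light() == "green"):
            return True  # the tickers are accepted as able to predict the target
    return False
```

A return value of `False` corresponds to the case in which the group of tickers is considered unable to predict the chosen ticker.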

RESULTS AND DISCUSSION
A total of 6 experiments were conducted by providing the different sets of available fundamental data as input to the neural networks. The accumulated historical data that is fed to the networks is used for training them. If relationships exist between the input and the currency pair moves, the networks reach a "learned" state after a limited number of iterations. This indicates that, based on the input alone, the neural network is capable of predicting the currency pair's price. As shown in Table 2, the neural networks reached a "learned" state in half of the experiments.
Table 2. Input of each experiment and whether the neural networks reached a "learned" state.
Experiment | Input | Learned state
1 | Different aspects of monthly UK international reserves as combined input | No
2 | Monthly U.S. retail and food services sales | Yes
3 | The monthly balance of import and export in goods and services in the U.S. | No
4 | Total of monthly export of goods and services in the U.S. | No
5 | The combination of the input of experiments 1 and 3 | Yes
6 | The combination of the input of experiments 3 and 4 | Yes
The neural networks did not manage to identify any relationship for the different aspects of monthly UK international reserves as combined input, nor was any relationship found for the monthly balance of import and export in goods and services or for the total monthly export of goods and services in the U.S. However, the monthly U.S. retail and food services sales proved to have a relationship with the currency pair's moves, which was detected by the neural networks. This indicates the sensitivity of domestic US markets to international currency markets. The money spent on retail and food services by consumers is tied closely to purchasing power and economic growth. The relationship identified by the neural networks indicates that there is a clear relationship between the strength of the US currency and the national retail and food services sales, most probably because, when the US economy and people's purchasing power are on the rise, more retail and food service purchases are made and the currency value behaves accordingly.
Furthermore, experiments 5 and 6 did manage to bring the neural networks to a "learned" state. These two experiments are special because the inputs for both of them are combinations of input elements which were used in other experiments and did not lead to a learned state on their own. This work thus finds that combining those input sets and re-feeding them into the neural networks is able to train the neural networks.
This indicates that the relationship between fundamental data and currency pair moves is very complicated; however, if different facets are put together and fundamental data is combined from different sources, neural networks can detect predictability. In experiment 5, the monthly international reserves data for the UK, combined with the monthly balance of import and export in goods and services in the U.S., surprisingly manages to bring the neural networks to a "learned" state. Furthermore, in experiment 6, the inputs of experiments 3 and 4 are combined, that is, the monthly balance of import and export in goods and services in the U.S. and the total of monthly export of goods and services in the U.S., and again a positive result is obtained, which indicates predictability after combination of data.
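The outcomes reported in this section can be recorded in a small data structure, which makes the two observations easy to check: half of the experiments reached a "learned" state, and both combined experiments succeeded although none of their component inputs did on their own. The structure and variable names below are illustrative; the outcomes are those reported in the text.

```python
# Experiment number -> (input description, learned state reached)
EXPERIMENTS = {
    1: ("Combined monthly UK international reserves indicators", False),
    2: ("Monthly U.S. retail and food services sales", True),
    3: ("Monthly U.S. balance of import and export in goods and services", False),
    4: ("Total monthly U.S. export of goods and services", False),
    5: ("Combination of the inputs of experiments 1 and 3", True),
    6: ("Combination of the inputs of experiments 3 and 4", True),
}

# Which individual experiments supply the input of each combined experiment.
COMBINED_FROM = {5: (1, 3), 6: (3, 4)}

learned = [number for number, (_, ok) in EXPERIMENTS.items() if ok]

# True when a combined experiment learned although no component did alone.
gained_by_combining = {
    number: EXPERIMENTS[number][1]
    and not any(EXPERIMENTS[part][1] for part in parts)
    for number, parts in COMBINED_FROM.items()
}
```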

Conclusion
In this work, an effort is made to explore the possibilities of using fundamental data to predict currency price moves in the foreign exchange market. Such prediction is very much in demand; however, technical analysis is the approach that is most widely examined in research in this area. This work introduces an approach that can be undertaken for the integration of fundamental analysis into automated prediction. The proposed approach, which rests on the utilization of neural networks, proves to be successful in the conducted experiments. The experiment results indicate solid plausibility in determining currency moves through the proposed methodology and with the use of the identified input.
In addition to identifying some fundamental data that can be used for such prediction and proposing a methodology, this work also demonstrates through the conducted experiments that, while a set of fundamental data might not be indicative of price moves on its own, it might very well contribute to such an indication when combined with other sets of such data. This demonstrates the multitude of aspects of information that are involved, and points in the direction of combining different possible fundamental data inputs in order to get the best results. Taking such multiple aspects as input and producing an indicative output that takes all of them into consideration becomes feasible only with the help of neural networks. A successful example of such use of neural networks is demonstrated in this work. The experiments conducted on the sets of fundamental data showed that there is a plausible chance of predicting the currency pair USD/GBP based on such data. At times (experiments 1, 3 and 4), individual fundamental factors prove to be ineffective predictors when used as input independently; in those cases, however, a combination of such data sets could be formed which has strong prediction capability. This prediction capability can be observed and learned by neural networks. Therefore, this work produces an initial outlook on a new methodology for the prediction of market moves based on fundamental data, and also identifies a few data sources which prove to be effective for prediction through the proposed methodology.

FURTHER RESEARCH
In its continuation, this research aims to tackle many further aspects and questions posed during this work. It also provides specific grounds for future research by others in the field of market prediction. Some of the possible avenues and topics which can be explored next are discussed below.
Firstly, the choice of fundamental data in terms of its source, nature and type can be further refined. The economic comprehension and reasoning can be advanced by steering this work further based on macroeconomic principles. The monitoring of the market can be refined by making separate input dataset selections from pre-recession, recession and post-recession (that is, recovery) periods; with that, the proposed methodology will contribute to the study of recession prediction. Secondly, different possibilities for the format of fundamental data need to be explored. In this work, numeric data from official reports is used as input. However, there is vital information available for fundamental analysis in the form of textual information. In its continuation, this research is targeting a major extension to its methodology which takes advantage of textual information available in news media through an extraction and preparation method. The focus is on composing a representation methodology that can prepare textual data in a way that can be used as input to the neural networks. Thirdly, the combination of the proposed methodology, which is based on neural networks, with other approaches needs to be explored to see whether the accuracy of the results can be increased. Fourthly, the development of a trading robot based on the proposed methodology is in sight.