Better results of artificial neural networks in predicting ČEZ share prices

The specific objective of the article is to propose a methodology for predicting the future development of ČEZ, a.s., share prices on the Prague Stock Exchange using artificial neural networks and time series exponential smoothing, to validate the results on a part of the time series, and to compare the success rate of these two methods. The data used in our analysis are the share prices for the period of 2014-2019. Multilayer perceptron (MLP) and radial basis function (RBF) networks are generated, with a time series lag of 1, 5, and 10 days. In the case of exponential smoothing of time series, multiplicative models (triple smoothing of time series) are used. Based on residuals and absolute residuals, the best model for share price prediction is chosen. In the case of time series smoothing, the method of exponential smoothing appears to be more successful; however, predictions of the best neural network are significantly more accurate. The resulting neural network can be used in practice to predict the future development of ČEZ share prices. The neural network can retrain itself at a set time interval to provide current and more accurate predictions.


INTRODUCTION
At present, predicting as such is considered to be one of the most important parts in the planning and decision-making process in a wide variety of areas, such as natural sciences and humanities, informatics and, above all, finance. Effective prediction plays a key role in dealing with uncertainties and ambiguities that may arise in the future (Dash et al., 2019). Koutroumanidis et al. (2011) state that being able to predict stock prices is of crucial importance for all persons dealing with the stock market. Given the unpredictability of the global crisis, Qun et al. (2017) believe that predicting stock prices is an extremely complex and chaotic process, which makes the presence of a dynamic environment a great challenge. On the other hand, it must be said that a successful prediction of future stock price development can be very useful and beneficial. For example, Hašková (2017) states that stock price prediction has a huge potential, both for the entire market economy and for investors themselves, as it helps them to improve their return on shares. According to Koutroumanidis et al. (2011), stock price prediction plays a key role for investors, especially in terms of profit maximization. Li et al. (2019) state that the issue of stock price prediction is important in terms of the investment strategy, efforts to stabilize the financial system or market risk control. According to Ticknor (2013), accurate stock price predictions form the basis for financial investment decisions and represent probably the biggest challenges in capital investment. According to Dash et al. (2019), prediction of future stock price movements has always been a fascinating research area, not only for investors who want to make a profit from stock trading, but also for the researchers trying to discover hidden information from complex time series data at stock markets. 
Groda & Vrbka (2017) also add that such issues attract investors and researchers from all over the world, who focus on subjective investment judgments based on objective technical indicators. Klieštik & Majerová (2015) argue that predicting share prices is a topical issue for shareholders, dealers, and stockbrokers. Therefore, it is essential that share price predictions be highly accurate. Etemadi et al. (2015) state that investors, managers, and financial analysts consider earnings per share to be one of the key financial indicators. This indicator is often used for assessing profitability, making investment decisions or predicting share prices. According to Mettle et al. (2014), fluctuations in prices make investing in shares risky, leaving investors uncertain. To improve investors' confidence, several methods have been proposed, such as artificial neural network (ANN) models, exponential smoothing, hybrid models combining several models, etc. According to Sheelapriya & Murugesan (2016), it is now very important to have a reliable method that overcomes the difficulties related to complex prediction and captures the development of share prices at financial markets.
The prices of many companies on the capital markets of Central European countries that have undergone political transformation change in response to factors specific only to these economies (Redo, 2015; Pinteric, 2017). The image of the economy is not without significance here (Chugajev, 2015), which in this case means that investors often treat these countries on an equal footing with unstable, developing markets. This is despite the high ratings assigned to them by international organizations (Ciak and Płókarz, 2016) or the levelling of the so-called creative economy (Krawiec and Noga, 2017).
It follows from the aforementioned facts that the issue of predicting share prices is a very current and interesting topic, and each new study dealing with this topic is of great theoretical and practical importance.
The objective of this contribution in particular is to propose a method for predicting the future development of ČEZ, a.s. share prices on the Prague Stock Exchange using artificial neural networks and time series exponential smoothing, to validate the results on a part of the time series, and to compare the success rates of these methods.
Stemming from this objective, the following research questions have been formulated: V1: Are artificial neural networks more successful in time series smoothing than the methods of exponential smoothing? V2: Are artificial neural networks more successful in predicting than the methods of time series exponential smoothing?

LITERATURE REVIEW
In addition to literary research, the article describes the data and methods used for the preparation of the application part. The application part shows the results of calculations using ANNs as well as the results of calculations using exponential smoothing of time series. Subsequently, the results of both methods are compared. In the conclusion part, the relevant context is presented and the results are summarized. León-Álvarez et al. (2016) state that time series analysis involves studying groups or individuals observed at consecutive moments, that is, series of data points in time order. By means of analysing these time series, important statistics and other necessary data characteristics can be obtained. By means of time series predicting, it is possible to predict future values on the basis of previously observed values. Accurate and clear predicting of time series is important for various areas, such as energy, transport, economics, finance, etc. (Rodrigues et al., 2019). In this article, time series analysis will be used for predicting share prices of a specific company. According to Shi et al. (2012), the development of share prices is a dynamic, complex, and nonlinear process.
Although time series prediction is a very challenging task, there are various methods for dealing with this issue (Pointer & Khoi, 2019; Amassoma & Ogbuagu, 2018). It is important to note, however, that most of the existing techniques work with the historical range of values only; therefore, predictive models may not be fully effective in some cases. Often, a historical set of values is not enough. Pulungan et al. (2018) analyzed the possibility of using an autoregressive integrated moving average on the Indonesia Stock Exchange, especially for socially responsible investment. Plastun et al. (2018) investigated the frequency of overreaction to stock price fluctuations in the Ukrainian stock market. To do this, statistical tests (parametric and nonparametric) were carried out, including correlation analysis, the augmented Dickey-Fuller test, the Granger causality test, and regression analysis with additional variables. Tan et al. (2018) examined the uncertainty level on the UK stock market and indicated that the level affected daily earnings, but the uncertainty itself did not cause a change in profit. The methods that can be used for these purposes include, for example, discriminant analysis, cluster analysis, the ARIMA model or decision trees. Majerčíková & Bartošová (2012), Valaskova et al. (2018), and Gavurova et al. (2017) state that the method of multiple discriminant analysis is one of the traditional statistical models, representing a diagnostic method which consists in the interpretation and measurement of future economic risks. According to Halgaonkar et al. (2010), the aim of cluster analysis is to classify a specific number of objects into several relevant homogeneous clusters, with the objects within one cluster being as similar to each other as possible and the objects belonging to different clusters being as dissimilar as possible.
Pai & Lin (2005) state that the autoregressive integrated moving average (ARIMA) model is one of the most widespread linear models in time series prediction. Chollet (2019) characterizes decision trees as data structures that can predict the resulting values of specific inputs or classify input data points. Rostan et al. (2018) add that methods based on the prediction and analysis of time series while taking into account seasonal fluctuations include, for example, seasonal differencing and the HP filter, spectral analysis, time series decomposition or the Box-Jenkins methodology. According to , the method of ANNs has recently been mentioned in connection with the analysis and prediction of time series.
For example, Caridad et al. (2019) prove that artificial neural networks produce more disaggregated results when constructing ratings than some multivariate statistical methods and are used to assess ratings of credit rating agencies. Gamaliy et al. (2019) consider the possibility of using artificial neural networks to analyze the impact of political and economic factors on the situation in the global foreign-exchange market. Horák (2019) claims that ANNs are considered a highly effective method for data collection, analysis, and prediction. Therefore, their application is possible in many complex situations or for solving complex problems, e.g. in predicting the development of share prices. According to Wu & Duan (2017), ANNs are able to reveal a complex relationship between investors and price fluctuations. For this reason, the networks have started to be used for predicting share prices of individual companies. ANNs can be used for solving many problems, but currently, their most significant application is related to time series (Machová & Rowland, 2018). ANNs can also be used for regression, classification, etc. (Wang & Nguyen, 2015). Their advantages consist in their ability to work with big data, the accuracy of their results, and the ease of using the resulting neural networks. A significant disadvantage of these networks is the complexity of creating individual ANN models.
The research in the area of predicting share prices using ANNs has been addressed by many authors. Yamashita et al. (2005) confirmed that multifunctional ANNs have higher generalization and representation ability than common ANNs. ANNs are capable of learning to predict the price for the following day using share prices in time series. Zheng (2015), for example, proposed Elman neural network for predicting share prices. Laboissiere et al. (2015) proposed a methodology that predicts minimal and maximal share prices of three Brazilian energy distribution companies on the basis of ANNs. Sen & Das (2014a) applied the method of multilayer perceptron neural networks for predicting the share prices of Indian companies operating in IT. Lertyingyod & Benjamas (2016) introduced a prediction model capable of predicting share prices using data mining. Machová & Vochozka (2019) used ANNs for predicting the development of Unipetrol, a.s. stock exchange share prices on Prague Stock Exchange for a period of 62 trading days. By means of the statistic interpretation of the results obtained, it was found that all retained networks are applicable in practice.
As mentioned above, besides ANNs, another suitable prediction model for forecasting share prices is e. g. exponential smoothing. Exponential smoothing has been successfully used in many empirical studies over the past twenty years and is currently well established as a method of accurate prediction (Marek & Vrabec, 2015). The application of exponential smoothing for forecasting time series usually depends on three basic methods: simple exponential smoothing, trend-adjusted exponential smoothing, and its seasonal variation. A common approach to selecting a method suitable for a specific time series is based on validating the prediction on a retained part of the sample using the criteria such as mean absolute percentage error (Billah et al., 2006). The aim of Horák & Krulický (2019) is to compare the method of exponential time series smoothing and using artificial neural networks as a tool for predicting the future development of Unipetrol share prices. On the basis of simple argumentation, the authors are inclined to believe that a more realistic picture of further development is obtained by a prediction based on time series exponential smoothing.

METHODOLOGY
The most important shareholder of the parent company ČEZ, a.s. is the Czech Republic, with a share in the registered capital of almost 70% (as of 14 June 2017). ČEZ shares are traded on Prague and Warsaw Stock Exchanges, where they are a part of the PX and WIG-CEE stock indices.
The mission of ČEZ Group is to provide safe, reliable, and positive energy for its customers and the whole society with the aim to bring innovations to address energy needs and contribute to a higher quality of life. The strategy reflects the fundamental transition of the energy market in Europe. ČEZ Group wants to operate its energy assets in the most efficient way possible and to adapt to the growing share of decentralized and emission-free production. Another priority is to offer customers a wide range of products and services in synergy with electricity and gas sales. The third priority is active investment in promising energy assets with a focus on Central Europe and the promotion of modern technologies at an early stage of their development.
In the Czech Republic, ČEZ Group is engaged in coal mining and selling, production and distribution of electricity and heat, trading in electricity and other commodities, and selling electricity, heat, and natural gas to end customers, to whom it also provides other services. The product portfolio consists of nuclear, coal, gas, water, photovoltaic, wind, and biogas energy sources.
For the purposes of the article, data on share prices are available for the period between 2 January 2014 and 30 December 2019, a total of 1,447 observations. The data were obtained from the ČEZ database. These are the closing prices of each day on which the shares were traded in the given period of time.
For the creation and possible verification of the model, the data from the period of 2 January 2014-31 October 2019 will be used.
The basic statistics of the dataset are shown in Figure 1.

Artificial neural networks
For data processing, TIBCO's Statistica software, version 13, will be used, specifically its neural network data mining tools for time series (regression).
Multilayer perceptron networks (hereinafter referred to as MLP) and radial basis function networks (RBF) will be generated. The independent variable will be time; the dependent variable will be the company share price. Time will be represented by the following variables:
• Date: a continuous variable represented by an integer giving the number of days from 1 January 1900. The variable will determine the moment of share price measurement.
• Day of week: a continuous variable. Each day of the week will be represented by an integer ranging from 1 (Monday) to 7 (Sunday). By means of this variable, it will be possible to examine the weekly fluctuations of the time series, that is, the way the day of the week on which the shares are traded influences the share price.
[…] on-year development of the time series.
The time series will be divided into three sets: training, testing, and validation. The first set will contain 70% of the input data. Based on the training dataset, neural structures will be generated. The remaining two data sets will contain 15% of the input data each. Both sets will be used to verify the reliability of the found neural structure or the model created. The time series lag will be 1, 5, and 10. The time series lag indicates how many previous time steps enter the calculation of the target variable. With a very short lag, a fluctuation at the end of the time series may significantly affect all future predictions, making the model inapplicable. Too long a lag, on the other hand, can significantly reduce the seasonal fluctuations and oversimplify the development of the time series. The time series lag can thus be chosen only heuristically. For each time series lag, 10,000 neural networks will be generated, out of which the 5 with the best characteristics will be retained. We will use the method of least squares. The generation of networks will be finished if there is no improvement, that is, if there is no further reduction of the sum of squares.
We will thus retain the neural structures whose sum of squared residuals with respect to the actual development is as low as possible (ideally zero). For MLP networks, the hidden layer will contain at least two neurons but no more than 20. In the case of RBF, the hidden layer will contain between 21 and 31 neurons. For MLP and RBF networks, the activation functions given in Table 1 will be considered.
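The input encoding, lagging, and data split described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure that Statistica runs internally: the helper names (`encode_day`, `lagged_rows`, `split_70_15_15`) are hypothetical, the day count is assumed to be 0 on 1 January 1900, the split is assumed to be chronological, and only the two input variables named in the text are encoded. The prices are made-up values.

```python
from datetime import date


def encode_day(d):
    """Return (days since 1 January 1900, weekday with 1 = Monday .. 7 = Sunday).

    Assumption: the day count starts at 0 on 1 January 1900.
    """
    return (d - date(1900, 1, 1)).days, d.isoweekday()


def lagged_rows(features, targets, lag):
    """Pair the features of the `lag` preceding days with each target value."""
    rows = []
    for i in range(lag, len(targets)):
        # Flatten the feature tuples of the previous `lag` days into one input vector.
        window = [x for day in features[i - lag:i] for x in day]
        rows.append((window, targets[i]))
    return rows


def split_70_15_15(rows):
    """Split cases into training, testing, and validation sets (70/15/15).

    Assumption: the split is chronological; the text does not say how cases
    are assigned to the three sets.
    """
    n_train = len(rows) * 70 // 100
    n_test = len(rows) * 15 // 100
    return rows[:n_train], rows[n_train:n_train + n_test], rows[n_train + n_test:]


# Made-up prices for four trading days.
days = [date(2014, 1, 2), date(2014, 1, 3), date(2014, 1, 6), date(2014, 1, 7)]
prices = [10.0, 11.0, 12.0, 13.0]
features = [encode_day(d) for d in days]
rows = lagged_rows(features, prices, lag=2)
```

The sketch yields 2 × lag inputs per case because it encodes two variables; with more input variables the input vector grows in the same way, one block of variables per lagged day.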

Petr Šuleř, Veronika Machová
As an error function, the sum of squared errors (least squares) will be used, where N is the number of training cases, y_i is the prediction of the target variable t_i, and t_i is the target variable of the i-th case.
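A sum-of-squares error over N cases can be computed as sketched below. The function name `sos_error` and the optional 1/(2N) scaling are assumptions made for illustration; the text only fixes the ingredients (N, the predictions y_i, and the targets t_i), not the normalisation constant. Scaling by a constant does not change which network has the lowest error.

```python
def sos_error(predictions, targets, normalise=False):
    """Sum of squared differences between predictions y_i and targets t_i.

    With normalise=True the sum is divided by 2N, one common convention
    (an assumption here, not taken from the text).
    """
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and of equal length")
    total = sum((y - t) ** 2 for y, t in zip(predictions, targets))
    if normalise:
        return total / (2 * len(targets))
    return total


# Made-up values: squared differences are 0, 1, and 4.
example_error = sos_error([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
```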
Other settings will remain at their defaults (according to the ANN tool, automated neural networks). If the outputs are not adequate, the results can be corrected by modifying the weights of individual neurons in the structure. Subsequently, the time series will be evaluated using the residuals and absolute residuals. We will examine the sum of residuals, average residuals, and minimal and maximal residuals. Moreover, expert evaluation of the price development and smoothed time series will be carried out. We will choose the most successful artificial neural networks (regardless of the lag) and compare them with the most successful results of time series exponential smoothing.
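The residual statistics listed above can be illustrated with a short sketch. The function name `residual_summary` and the sign convention (actual minus fitted value) are assumptions; the numbers are made up. The example also shows why absolute residuals are examined alongside plain residuals: positive and negative residuals can cancel in the sum.

```python
def residual_summary(actual, fitted):
    """Summarise residuals (actual - fitted, an assumed sign convention)."""
    residuals = [a - f for a, f in zip(actual, fitted)]
    absolute = [abs(r) for r in residuals]
    n = len(residuals)
    return {
        "sum": sum(residuals),
        "mean": sum(residuals) / n,
        "min": min(residuals),
        "max": max(residuals),
        "abs_sum": sum(absolute),
        "abs_mean": sum(absolute) / n,
    }


# Made-up values: residuals are 1, -1, and 0, so they cancel in the plain sum.
summary = residual_summary([10.0, 12.0, 11.0], [9.0, 13.0, 11.0])
```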

Exponential smoothing of time series
TIBCO's Statistica software, version 13, will also be used for this data processing, specifically its advanced models tool for time series/predictions. We will use triple exponential smoothing of the time series. Multiplicative models will be used for the calculation. We will include seasonal components and linear, exponential, and damped trend components, which are standardly defined by the following parameters:
• Alpha: a constant independent of the season and time series trend. It is contained in all models.
• Delta: represents seasonal fluctuations of the time series.
• Gamma: sets the time series trend.
• Phi: also sets the time series trend; it is applied only if it is possible to identify the damped (second) trend of the time series.
The calculation will be carried out by means of the modification of the following model:
The setting of the individual models will be as follows:
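For orientation only, the role of the four parameters can be sketched with a textbook form of triple exponential smoothing with multiplicative seasonality and a damped linear trend. This is an assumption-laden illustration, not the parameterisation used by Statistica: the function name, the initialisation of level, trend, and seasonal indices, the parameter values, and the data are all hypothetical, and the exponential trend variant is not covered.

```python
def holt_winters_multiplicative(y, period, alpha, delta, gamma, phi=1.0, horizon=0):
    """Triple exponential smoothing, multiplicative season, damped linear trend.

    alpha smooths the level, delta the seasonal indices, gamma the trend;
    phi damps the trend (phi = 1.0 means no damping).
    Returns (one-step-ahead fitted values, forecasts for `horizon` steps).
    """
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    # Simple initialisation from the first two seasons (an assumption).
    level = sum(y[:period]) / period
    trend = (sum(y[period:2 * period]) - sum(y[:period])) / period ** 2
    seasonal = [value / level for value in y[:period]]

    fitted = []
    for t, value in enumerate(y):
        s = seasonal[t % period]
        expected = level + phi * trend          # deseasonalised one-step forecast
        fitted.append(expected * s)
        new_level = alpha * (value / s) + (1 - alpha) * expected
        trend = gamma * (new_level - level) + (1 - gamma) * phi * trend
        seasonal[t % period] = delta * (value / new_level) + (1 - delta) * s
        level = new_level

    forecasts = []
    damping = 0.0
    for h in range(1, horizon + 1):
        damping += phi ** h                     # phi + phi^2 + ... + phi^h
        index = (len(y) + h - 1) % period
        forecasts.append((level + damping * trend) * seasonal[index])
    return fitted, forecasts


# Made-up, perfectly seasonal series with no trend.
series = [10.0, 20.0, 10.0, 20.0, 10.0, 20.0]
fitted, forecasts = holt_winters_multiplicative(
    series, period=2, alpha=0.3, delta=0.2, gamma=0.1, phi=0.9, horizon=2
)
```

On a series that repeats exactly, the fitted values and forecasts reproduce the seasonal pattern regardless of the parameter values, which makes this a convenient sanity check before applying the recursion to real prices.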

Comparison of methods
Two types of methods were used for time series smoothing: artificial intelligence in the form of multilayer perceptron networks and radial basis function neural networks, and methods of exponential smoothing of time series. According to the authors, both methods show excellent results. Nevertheless, the research shows that the evaluator cannot fully rely on either set of results alone. It is thus necessary to compare the results of both methods with each other and select the one that best describes the actual development and best predicts the future development of the monitored share prices. After time series smoothing, the best results (neural networks and exponentially smoothed time series) will be chosen for the interval of 2 January 2014-31 October 2019. Thus, we compare the ability to predict future developments using the data on which the models will be created and trained. The best results will then be used for predicting the time series in the interval of 1 November 2019-30 December 2019. Although these data are known, they were not used for the time series smoothing. In this way, the ability to apply individual models in practice will be verified. We will monitor the accuracy of the prediction (i.e. the final price that the model will offer) and the error (i.e. the absolute residuals, the difference between the predicted value and the actual value). It is also possible to use other result verification methods (e.g. Hodrick-Prescott decomposition or technical analysis). In this case, however, it is easier to use the value of absolute residuals, because the development of the share price is known in the monitored (validation) period. The result of the paper will be one model able to meet the objective of the article and predict the future development of ČEZ's share price. A total of 10,004 models will be created (10,000 neural networks, 4 models of exponential smoothing of time series). Of these, 10,003 will be discarded.
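The separation of the data into a fitting interval and a prediction interval can be sketched as follows. The cut-off date is taken from the text; the function name `split_at` and the prices are hypothetical.

```python
from datetime import date

FIT_END = date(2019, 10, 31)  # last day used for creating the models


def split_at(series, fit_end=FIT_END):
    """Split (date, price) pairs into the fitting part and the held-out part."""
    fit = [(d, p) for d, p in series if d <= fit_end]
    holdout = [(d, p) for d, p in series if d > fit_end]
    return fit, holdout


# Made-up prices around the cut-off date.
series = [
    (date(2019, 10, 30), 520.0),
    (date(2019, 10, 31), 522.0),
    (date(2019, 11, 1), 519.0),
]
fit_part, holdout_part = split_at(series)
```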

Neural networks
The results of the first part of the experiment conducted by means of neural networks are given in Table 2. It follows from the table that only the MLP neural networks were retained. Their performance is characterized by the correlation coefficient, which is above 0.98 for all subsets of the data set for all retained neural networks. This indicates an almost direct dependence. The neural networks thus found a model of the time series and show high accuracy as well. All retained neural networks have the structure 5-11-1, which means 5 input variables, 11 neurons in the hidden layer of the neural structure, and one output in the form of the predicted ČEZ share price. For the activation of the neurons in the hidden layer, the hyperbolic tangent function was used. Neurons in the output layer were activated using the identity, logistic, exponential, and hyperbolic tangent functions. The course of the smoothed time series with the actual development of the share price is shown in Figure 3.
Vol.13, No.2, 2020
The figure shows that the neural structures were able to capture the global trend of the time series and were able to show most local extremes in the development of the share prices.
Table 3 shows the results of retained neural networks in the case of a 5-day lag of the time series. In the case of a 5-day lag, the retained neural networks are also MLP networks. The performance of all structures is also very high in this case, with the correlation coefficient being above 0.97, or nearly 0.98 and higher, in all subsets. The time lag results in 25 neurons in the input layer (5 inputs at a 5-day lag of the time series), 7-11 neurons in the hidden layer, and 1 neuron in the output layer (share price). For the activation of the hidden layer, logistic and hyperbolic tangent functions were used, while the exponential function was used for the activation of the neurons in the output layer.
The course of the smoothed time series and the actual ČEZ share prices is shown in the graph in Figure 4. Even in the case of a 5-day lag, it is evident that the neural networks are able to follow the global trend of the time series (including seasonal fluctuations) and record most of the local extremes. Even in this case, the retained neural networks can be considered successful. Table 4 shows the overview of retained neural networks with a 10-day lag. The retained networks are all MLP networks, with 50 neurons in the input layer (5 inputs at a 10-day time series lag) and 8-9 neurons in the hidden layer. The networks show high performance in all data sets. The correlation coefficient is almost 0.97 and higher. This could also be considered an almost direct dependence. The neural networks use the logistic function for the activation of the neurons in the hidden layer, and the identity, exponential, and sine functions for the activation of the neurons in the output layer. The graphical representation of the course of the smoothed time series and the actual share prices is shown in Figure 5. The retained neural networks are capable of following the global trend of the time series and most of the local extremes. At first sight, it can be stated that the most successful time series are those with a 1-day and a 5-day lag. The specific values of residuals and absolute residuals are given in Table 5.
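To make the structure notation concrete, the sketch below builds a 5-11-1 multilayer perceptron with a hyperbolic tangent hidden layer and an identity output, one of the combinations reported for the 1-day lag. The weights are random placeholders rather than the trained weights of any retained network, the function names are hypothetical, and input scaling is omitted.

```python
import math
import random


def init_mlp(n_in, n_hidden, seed=0):
    """Create a one-hidden-layer network with random placeholder weights."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def count_parameters(net):
    """Number of weights and biases in the network."""
    return (
        sum(len(row) for row in net["w_hidden"])
        + len(net["b_hidden"])
        + len(net["w_out"])
        + 1
    )


def forward(net, x):
    """Forward pass: tanh hidden layer, identity output (a single price)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]


net = init_mlp(n_in=5, n_hidden=11)
n_params = count_parameters(net)  # 5*11 weights + 11 biases + 11 weights + 1 bias
```

The same code describes the wider networks: with a 5-day lag the input layer has 25 values and with a 10-day lag it has 50, while the single output stays the predicted share price.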

Exponential smoothing
Triple exponential smoothing of the time series was carried out. The smoothed time series and the actual development of ČEZ share price can be seen in the graph in Figure 6. The exponentially smoothed time series are also able to capture the overall development of ČEZ share price, as well as the majority of the local extremes.
The comparison of the performance of the smoothed time series in the form of residuals and absolute residuals is shown in Table 6. The graph shows that there are only minimal differences between the most successful time series residuals. Therefore, we will go back to the sums and averages of residuals and absolute residuals (see Table 7). On the basis of absolute residuals, the most successful smoothed time series appears to be Smoothed 4, followed by 3. MLP 5-11-1.

Comparison and discussion
At this moment, it is necessary to move on to the second part of the methodology, that is, to apply the resulting models to the time series from 1 November 2019 to 30 December 2019. The comparison of the models and the actual course can be seen in the graph in Figure 8. In the table, the error is calculated as the ratio of the absolute residual to the share price on a given day. It is expressed as a percentage. The most interesting value appears to be the error in the period of 1 November 2019-30 December 2019 (that is, in the period of prediction). In this case, the best values are achieved by 3. MLP 5-11-1 (although Smoothed 4 showed the best values during the training period).
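The error measure described above, and the way the ranking of models can differ between the fitting period and the prediction period, can be sketched as follows. All numbers are made up for illustration and are not the values from the tables; the function names and the model labels are hypothetical.

```python
def mean_percentage_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed in percent."""
    errors = [100.0 * abs(a - p) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def best_model(actual, predictions):
    """Name of the model with the lowest mean percentage error."""
    return min(predictions, key=lambda name: mean_percentage_error(actual, predictions[name]))


# Made-up prices: the smoothing model fits the past slightly better ...
fit_actual = [500.0, 510.0]
fit_predictions = {"smoothing": [500.0, 509.0], "network": [498.0, 512.0]}

# ... but the network is closer on the held-out days.
holdout_actual = [505.0, 515.0]
holdout_predictions = {"smoothing": [495.0, 500.0], "network": [504.0, 517.0]}

best_in_sample = best_model(fit_actual, fit_predictions)
best_out_of_sample = best_model(holdout_actual, holdout_predictions)
```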
The course of the most successful time series, including the residuals, the smoothed time series, and the prediction, is shown in Figure 9.
2. V2: Yes, artificial neural networks are more successful in predicting than the exponential smoothing methods (in this contribution, it was 3. MLP 5-11-1).
• The time series based on which the models were not created (i.e. the period from 1 November 2019 to 30 December 2019) was best predicted by the neural network 3. MLP 5-11-1. The result is very interesting since the model that showed the best parameters on the data from which it was derived did not show the best predictive power. In general, the authors of texts focused on time series make every effort to prove the excellence of their results only on the parameters of time series smoothing. They assume that if the model can describe the past course of the data excellently, it will be able to describe the future development of the time series equally well. If the authors of this article had done the same, the Smoothed 4 model would have been used. However, the authors decided to divide the data so that they could verify the predictive power using the data based on which the models were not created. It can thus be concluded that the model with the best ability to smooth the time series may not show the best parameters in predicting its future development. The analogy between the development of the data in the past and their development in the future is not absolute.
The results of this research show that the application of the artificial neural network method for predicting the future development of ČEZ share prices proved to be a better choice than the method of exponential time series smoothing. For all three time series lags, i.e. the 1-day, 5-day, and 10-day lag, only MLP neural networks were retained. These networks always found a time series model with high accuracy. These neural networks were also always able to capture the global trend of the time series and at the same time record most of the local extremes in the development of the share price of the monitored company. Although exponentially smoothed time series capture the overall development of ČEZ's share price and are also able to record the vast majority of local extremes, the neural network method shows better values in the overall assessment.
In this context, it can be stated that our results confirmed the findings of Sen & Das (2014b), who, in their study, dealt with a similar topic. They specifically applied the methods of MLP networks to predict the share price of Indian companies operating in information technology. Also, according to Oliveira (2011), the method of artificial neural networks is one of the best prediction methods with a high level of precision in the area of share price predicting. Based on the survey conducted by Chang (2011), it can be stated that in the case of comparison with other methods, neural networks appear to be the most stable technique for predicting the development of share prices.

CONCLUSION
Time series smoothing, that is, searching for a model of the behavior of specific variables over time, is a very topical issue. Many plans and outlooks need such a model. Nevertheless, it is very important to find a model that is able to describe reality as precisely as possible. (However, it is necessary to realize that each model is always a certain simplification of reality. This is due to the fact that a model does not consider all factors that can influence the value of the target variable.) Tasks where such a model is much needed and useful include determining the value of share prices on the capital markets. Using such a model, a company determines its business as well as its financial strategies (a possibility to attract new investors, etc.). Investors, on the other hand, hope that their invested funds will appreciate and examine whether their expectations will be met. There are a number of methods for predicting share prices. Experts use fundamental or psychological methods. In any case, they always use statistical methods for verification and checking. Artificial neural networks have recently been coming to the fore in terms of the frequency of their use. Creating a successful neural structure is a complex problem. However, its use should yield more precise results. This refers not only to a single use of the network, but also to the possible upgrading of such structures over time. It is possible to retrain the network and use its modified form. It is even possible to set the network to retrain itself within a certain time interval to provide up-to-date predictions. However, the assumption is that the neural networks will generate much more accurate predictions than the classical statistical methods. The objective of the contribution was thus to compare the success rate of the neural networks (or selected neural networks) with conventional methods (namely the method of time series exponential smoothing).
Although the exponential smoothing method was more successful in terms of time series smoothing, the predictions of the most successful neural network were significantly more accurate.
A neural network has been created that is able to predict the development of ČEZ share prices and is thus applicable in practice. The neural structure was programmed in C++. Over time (or at a predetermined time interval), the network can be retrained and re-used for predicting the share prices of a given company. The interval can be set to a day (in such a case, it is possible to retrain the network with high accuracy).
It can be assumed that the length of the time series may have some influence on the creation of the model. If the time series is too short, the model does not capture the trend, the seasonality, or both. If the time series is too long, there can also be some inaccuracy. The model considers a part of the time series that factually does not influence anything, but the time series model smooths the time series in its course, which may result in greater inaccuracy. This can be seen as a limitation to this research that has to be dealt with.