Programme

All times are in GMT+3 (Finland)

4th May

11:00 – 12:30

Opening ceremony + Keynote

13:00 – 15:00

Session 1

16:00 – 18:00

Session 2

5th May

11:00 – 13:00

Session 3

13:30 – 14:30

Session 4

15:00 – 17:30

Session 5

17:00

Closing

Accepted papers

Masanori Hirano (The University of Tokyo), Hiroki Sakaji (The University of Tokyo) and Kiyoshi Izumi (The University of Tokyo). 

Title: Concept and Practice of Artificial Market Data Mining Platform.

Abstract. We proposed a concept called the artificial market data mining platform and presented a practical example of it.

This concept is designed to evaluate data mining methods through artificial market simulation.

We believe that the proposed platform can help us conduct a fair evaluation of data mining methods for the financial market without actual data dependence, and investigate the impact of financial market factors on predictions of future market movements.

In this study, as a practical example, we built a tick-time level artificial market simulation and data mining models to predict short-term price changes and investigated the effect of four financial market factors on the performance of data mining models.

Through experimental analysis, we demonstrated the validity and benefits of the proposed concept and practice model.

We also discussed the potential and future applications of our proposal.

Isla Almeida Oliveira (CERTI Foundation), Pâmela Rugoni Belin (CERTI Foundation), Carlos José Alves Santos (CERTI Foundation), Mathias Arno Ludwig (AES Brasil), Júlia da Rosa Howat Rodrigues (AES Brasil) and Cesare Quinteiro Pica (CERTI Foundation). 

Title: Long-Term Energy Consumption Forecast for a Commercial Virtual Power Plant Using a Hybrid K-means and Linear Regression Algorithm.

Abstract. The development of a commercial Virtual Power Plant (VPP) – whose objective is to aggregate consumer and generator units that receive contractual benefits through joint operation – creates the need for a long-term energy consumption forecast algorithm able to provide inputs for decisions on the purchase or sale of long-term energy contracts. To perform this forecast, a hybrid algorithm is used: k-means clustering groups seasonal patterns of daily energy consumption through unsupervised machine learning, and regression is applied to identify trends and compose the forecasted consumption. The model traces daily consumption profiles throughout the year using measurement data to forecast monthly energy consumption, segmented into peak and off-peak periods because of the additional tariffs charged to electricity distributors during high-demand hours. The proposed forecast model achieved high accuracy in the aggregated-loads context – the main objective of the VPP application – increasing the usefulness of the VPP as a decision-making tool for retailers, power distribution companies, and other purposes involving grouped electricity consumption.
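A minimal sketch of the hybrid scheme this abstract describes, under the assumption that k-means clusters daily load profiles and a per-cluster linear regression captures the trend; the synthetic data, cluster count, and all names below are illustrative, not the authors' implementation:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
daily_profiles = rng.random((365, 24))        # one 24-hour load profile per day (synthetic)

# Step 1: unsupervised clustering of days into seasonal consumption patterns.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(daily_profiles)

# Step 2: linear trend of total daily consumption within each cluster.
days = np.arange(365).reshape(-1, 1)
totals = daily_profiles.sum(axis=1)
trend = {c: LinearRegression().fit(days[kmeans.labels_ == c],
                                   totals[kmeans.labels_ == c])
         for c in range(4)}

# Forecast the next day, assuming it follows the latest observed pattern.
print(trend[kmeans.labels_[-1]].predict(np.array([[365]])))
```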

Jiacheng Yang (Department of Computer Science, University College London), Denis De Montigny (Department of Computer Science, University College London) and Philip Treleaven (Department of Computer Science, University College London). 

Title: ANN, LSTM, and SVR for Gold Price Forecasting.

Abstract. This paper investigates a series of machine learning models (e.g. ANN, LSTM, SVR) to predict gold prices based on traditional indices, emerging indicators, commodities, and the historical price time series of gold. In our approach, three machine learning algorithms, Artificial Neural Network (ANN), Long Short-Term Memory (LSTM), and Support Vector Regression (SVR), are applied to build the models that forecast the gold price. The dataset for this research is a time series from 1st January 2017 to 31st December 2020, containing two major US indices (S&P 500 and DJI), two popular cryptocurrencies (BTC and ETH), two commodities (silver and crude oil), the USD index (United States Dollar against Euro), and the gold prices (historical price and volatility) [24]. The evaluation benchmarks are Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE). In the first stage, a comparative analysis is applied to the three models. In the second stage, the impact of cryptocurrency data on the models is assessed. It was observed that the SVR model outperforms the other two models, and our results indicate that the additional cryptocurrency data has a positive impact on all three models.
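A hedged sketch of the SVR leg of this comparison, with the three error metrics the abstract names; the features and target are synthetic stand-ins for the paper's dataset:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_absolute_percentage_error)

rng = np.random.default_rng(1)
X = rng.random((1000, 8))    # stand-ins for indices, crypto, commodities, USD index, lags
y = X @ rng.random(8) + 0.05 * rng.standard_normal(1000) + 1.0  # synthetic "gold price"

X_tr, X_te, y_tr, y_te = X[:800], X[800:], y[:800], y[800:]
pred = SVR(kernel="rbf", C=10.0).fit(X_tr, y_tr).predict(X_te)

print("MAE :", mean_absolute_error(y_te, pred))
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)
print("MAPE:", mean_absolute_percentage_error(y_te, pred))
```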

Sulalitha Bowala (University of Manitoba), Japjeet Singh (University of Manitoba), Aerambamoorthy Thavaneswaran (University of Manitoba), Ruppa Thulasiram (University of Manitoba) and Saumen Mandal (University of Manitoba). 

Title: Comparison of Fuzzy Risk Forecast Intervals for Cryptocurrencies.

Abstract. Data-driven volatility models and neuro-volatility models have the potential to revolutionize the area of Computational Finance. Volatility measures the variation of time series data, and thus it is also a driving factor for the risk forecasting of returns from investment in cryptocurrencies. A cryptocurrency is a decentralized medium of exchange that relies on cryptographic primitives to facilitate the trustless transfer of value between different parties. Instead of being physical money, cryptocurrency payments exist purely as digital entries on an online ledger called blockchain that describe specific transactions.

Many commonly used risk forecasting models do not take into account the uncertainty associated with the volatility of an underlying asset when obtaining risk forecasts. Some tools from fuzzy set theory can be incorporated into forecasting models to account for this uncertainty. Interest in the use of hybrid models for fuzzy volatility forecasts is growing. However, a major drawback is that the fuzzy-coefficient hybrid models used in fuzzy volatility forecasts are not data-driven. This paper uses fuzzy set theory with data-driven volatility and data-driven neuro-volatility forecasts to study fuzzy risk forecasts. Simple yet effective models incorporating fuzziness to obtain fuzzy volatility forecasts and fuzzy VaR forecasts are presented. The key underlying idea, unlike existing risk forecasting approaches, is the use of a hybrid nonlinear adaptive fuzzy model for volatility.

Aerambamoorthy Thavaneswaran (University of Manitoba), You Liang (Ryerson University), Sanjiv Das (Santa Clara University), Ruppa K. Thulasiram (University of Manitoba) and Janakumar Bhanushali (Ryerson University). 

Title: Intelligent Probabilistic Forecasts of VIX and its Volatility using Machine Learning Methods.

Abstract. The market focuses on the Cboe Volatility Index (VIX), or Fear Index, an option-implied forecast of the 30-calendar-day realized volatility of S&P 500 returns derived from a cross-section of vanilla options. The VIX is determined using a formula that derives the market's expectation of the realized one-month standard deviation of returns, backed out from near-term call and put options on the S&P 500 index. Market participants such as traders, asset managers, and risk managers keenly watch the VIX index and are interested in achieving accurate intelligent probabilistic forecasts of the VIX, and also of the realized volatility of individual stocks. These volatility forecasts are useful to options traders placing bets on the future volatility of individual stocks. This paper examines models that only utilize past values of the VIX and documents improvements in forecasting the VIX (and its volatility) over different horizons. The approaches include LSTM models, simple moving average methods, data-driven neuro-volatility techniques, and industry models like Prophet. Uniquely, we propose a novel VIX price interval forecasting model. The driving idea, unlike that of existing VIX price forecasting models, is that the proposed LSTM interval forecasting method trains two LSTMs to obtain price forecasts and forecast-error volatility forecasts. All the proposed forecasting methods also avoid model identification and estimation issues, especially for a series like the VIX, which is non-stationary. We compare models and document which ones perform best for varied horizons.
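The interval construction can be sketched as follows: one model gives a point forecast, a second models the volatility of the first model's forecast errors, and the interval combines the two. The moving-average stand-ins below replace the paper's two LSTMs, and all numbers are illustrative:

```python
import numpy as np

vix = np.array([18.0, 19.5, 21.0, 20.2, 22.4, 21.8, 23.1, 22.5])

# Stand-in for LSTM #1: a 3-step moving-average point forecast.
point = vix[-3:].mean()

# Stand-in for LSTM #2: volatility of the historical one-step forecast errors.
errors = vix[3:] - np.array([vix[i:i + 3].mean() for i in range(len(vix) - 3)])
error_vol = errors.std()

z = 1.96  # roughly a 95% interval under a normal-error assumption
print(f"forecast interval: [{point - z * error_vol:.2f}, {point + z * error_vol:.2f}]")
```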

Yoshiyuki Suimon (The University of Tokyo, Nomura Securities) and Hiroto Tanabe (Nomura Securities). 

Title: Construction of real-time manufacturing industry production activity estimation models using high-frequency electricity demand data.

Abstract. In this paper we describe how we estimated production activity in the manufacturing industry in Japan by analyzing the characteristics of fluctuations in the high-frequency electricity demand data published by major Japanese electric power companies, on the basis that the manufacturing industry consumes electricity when carrying out production activity. We constructed mathematical models to estimate production activity in each area of Japan on the basis of electricity data provided by multiple electric power companies, and then combined the estimates generated by these models to estimate production activity in Japan as a whole. The industrial production index published by Japan’s Ministry of Economy, Trade and Industry (METI) is an example of government data that reflects production activity in the manufacturing industry. However, the industrial production index for a particular month is not published until the end of the following month, so there is something of a time lag between the production activity itself and the publication of this government data. The method we set out in this paper makes it possible to estimate manufacturing industry production activity around one month before METI’s industrial production index is published through the use of highly timely electricity demand data. Furthermore, the industrial production index is normally calculated on a monthly basis, but in this paper, by taking advantage of the high degree of time granularity of the electricity demand data we use, we are able to present a mathematical model that generates highly timely estimates on a weekly basis.

Christopher Felder (University of Tübingen) and Stefan Mayer (University of Tübingen). 

Title: Customized Stock Return Prediction with Deep Learning.

Abstract. In finance, researchers so far use standard loss functions such as mean squared error when training artificial neural networks for return prediction. However, from an investor’s perspective, prediction errors are ambiguous: in practice, the investor prefers to see underprediction of portfolio returns rather than overprediction, as the former implies higher realized returns and thus financial benefits. We present a loss function customized to this behavior and test it based on Long Short-Term Memory (LSTM) models, the state-of-the-art tools in time series analysis. Our model learns unique signals, predicts returns more cautiously, and improves profit chances over the standard LSTM and reversal signals. Daily and weekly revised portfolios achieve on average five percentage points higher annualized returns. We show that our loss function is robust to market sentiment and beneficial in nonlinear optimization.
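A possible form of such a loss, sketched here under the assumption that overprediction of returns is penalized more heavily than underprediction; the weighting scheme and factor are illustrative, not the authors' published loss:

```python
import torch

def asymmetric_mse(y_pred: torch.Tensor, y_true: torch.Tensor,
                   over_weight: float = 2.0) -> torch.Tensor:
    err = y_pred - y_true
    # err > 0 means the model overpredicted the return; weight it more heavily.
    weight = torch.where(err > 0, torch.full_like(err, over_weight),
                         torch.ones_like(err))
    return (weight * err ** 2).mean()

y_true = torch.tensor([0.010, -0.020, 0.030])
y_pred = torch.tensor([0.020, -0.010, 0.010])   # two overpredictions, one under
print(asymmetric_mse(y_pred, y_true))           # larger than the plain MSE here
```

In training, a function of this kind would simply replace the mean-squared-error criterion in the LSTM's optimization loop.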

Charl Maree (University of Agder) and Christian Omlin (University of Agder). 

Title: Balancing Profit, Risk, and Sustainability for Portfolio Management.

Abstract. Stock portfolio optimization is the process of continuous reallocation of funds to a selection of stocks. This is a particularly well-suited problem for reinforcement learning, as daily rewards are compounding and objective functions may include more than just profit, e.g., risk and sustainability. We developed a novel utility function with the Sharpe ratio representing risk and the environmental, social, and governance score (ESG) representing sustainability. We show that a state-of-the-art policy gradient method – multi-agent deep deterministic policy gradients (MADDPG) – fails to find the optimum policy due to flat policy gradients and we therefore replaced gradient descent with a genetic algorithm for parameter optimization. We show that our system outperforms MADDPG while improving on deep Q-learning approaches by allowing for continuous action spaces. Crucially, by incorporating risk and sustainability criteria in the utility function, we improve on the state-of-the-art in reinforcement learning for portfolio optimization; risk and sustainability are essential in any modern trading strategy and we propose a system that does not merely report these metrics, but that actively optimizes the portfolio to improve on them.
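A hedged sketch of a utility of this kind, combining the Sharpe ratio with a portfolio-weighted ESG score; the additive form and the trade-off weight lam_esg are assumptions for illustration:

```python
import numpy as np

def utility(returns: np.ndarray, esg_scores: np.ndarray,
            weights: np.ndarray, lam_esg: float = 0.5) -> float:
    port = returns @ weights                  # portfolio return series
    sharpe = port.mean() / port.std()         # risk-adjusted profit term
    esg = float(esg_scores @ weights)         # portfolio-weighted ESG score
    return sharpe + lam_esg * esg

rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=(250, 4))   # synthetic daily returns
esg = np.array([0.8, 0.6, 0.9, 0.4])                # synthetic ESG scores in [0, 1]
print(utility(returns, esg, np.full(4, 0.25)))
```

A genetic algorithm, as the abstract proposes, would then search over weight vectors to maximize this utility instead of following (flat) policy gradients.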

Uta Pigorsch (University of Wuppertal) and Sebastian Schäfer (University of Wuppertal). 

Title: High-Dimensional Stock Portfolio Trading with Deep Reinforcement Learning.

Abstract. This paper proposes a Deep Reinforcement Learning algorithm for financial portfolio trading based on Deep Q-learning. The algorithm is capable of trading high-dimensional portfolios from cross-sectional datasets of any size, which may include data gaps and non-unique history lengths in the assets. We sequentially set up environments by sampling one asset for each environment, rewarding investments with the resulting asset's return and cash reservation with the average return of the set of assets. This forces the agent to strategically assign capital to assets that it predicts will perform above average. We apply our methodology in an out-of-sample analysis to 48 US stock portfolio setups, varying in the number of stocks from ten up to 500, in the selection criteria, and in the level of transaction costs. The algorithm on average outperforms all considered passive and active benchmark investment strategies by a large margin using only one hyperparameter setup for all portfolios.
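The reward rule described here can be sketched directly; the function and variable names are assumptions:

```python
import numpy as np

def reward(invest: bool, asset_return: float, universe_returns: np.ndarray) -> float:
    if invest:
        return asset_return                   # investing earns the sampled asset's return
    return float(universe_returns.mean())     # holding cash earns the cross-sectional mean

universe = np.array([0.010, -0.005, 0.002, 0.020])
print(reward(True, 0.020, universe))   # investing in an above-average asset pays off
print(reward(False, 0.020, universe))  # cash yields the average return instead
```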

Amin Assareh (Fidelity Investments). 

Title: Information Retrieval from Alternative Data using Zero-Shot Self-Supervised Learning.

Abstract. Traditionally, in the financial services industry, a large amount of financial analysts' time is spent on knowledge discovery and extraction from different unstructured data sources, such as reports, research notes, SEC filings, earnings call transcripts, news, etc. In addition to being inefficient, this manual information retrieval process can be prone to human error, subjectivity, and inconsistency. Recent advances in representation learning provide a reliable platform for mapping a large volume of unstructured data to a high-dimensional vector space where similarities and differences between data points can be quantified and used for featurization, pattern recognition, and information retrieval. In this work we demonstrate that by representing terms, documents, and companies in the same informative vector space and applying a simple self-supervised learning framework, relevant companies and documents can be retrieved with a good level of accuracy given the topics of interest, even with no prior labeled data.
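The retrieval step can be illustrated with a toy embedding; the TF-IDF vectors below stand in for the richer learned representations the abstract describes, and the documents and query are invented examples:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "quarterly earnings call transcript for a semiconductor company",
    "SEC filing discussing supply chain risk",
    "research note on electric vehicle battery demand",
]
vec = TfidfVectorizer().fit(docs)
doc_vectors = vec.transform(docs)

query = vec.transform(["electric vehicles"])         # topic of interest, zero-shot
scores = cosine_similarity(query, doc_vectors).ravel()
print(max(zip(scores, docs)))                        # the most relevant document
```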

Charl Maree (University of Agder) and Christian Omlin (University of Agder). 

Title: Understanding Spending Behavior: Recurrent Neural Network Explanation and Interpretation.

Abstract. Micro-segmentation of customers in the finance sector is a non-trivial task and has been an atypical omission from recent scientific literature. Where traditional segmentation classifies customers based on coarse features such as demographics, micro-segmentation depicts more nuanced differences between individuals, bringing forth several advantages including the potential for improved personalization in financial services. AI and representation learning offer a unique opportunity to solve the problem of micro-segmentation. Although AI is ubiquitous in many industries, its proliferation in sensitive industries such as finance has become contingent on the explainability of deep models. We had previously solved the micro-segmentation problem by extracting temporal features from the state space of a recurrent neural network (RNN). However, due to the inherent opacity of RNNs, our solution lacked an explanation. In this study, we address this issue by extracting a symbolic explanation for our model and providing an interpretation of our temporal features. For the explanation, we use a linear regression model to reconstruct the features in the state space with high fidelity. We show that our linear regression coefficients have not only learned the rules used to recreate the features, but have also learned the relationships that were not directly evident in the raw data. Finally, we propose a novel method to interpret the dynamics of the state space by using the principles of inverse regression and dynamical systems to locate and label a set of attractors.

Riu Naito (Hitotsubashi University) and Toshihiro Yamada (Hitotsubashi University). 

Title: A deep learning-based high-order operator splitting method for high-dimensional nonlinear parabolic PDEs via Malliavin calculus: application to CVA computation.

Abstract. The paper introduces a deep learning-based high-order operator splitting method for nonlinear parabolic partial differential equations (PDEs) by using a Malliavin calculus approach. Through the method, a solution of a nonlinear PDE is accurately approximated even when the dimension of the PDE is high. As an application, the method is applied to the CVA computation in high-dimensional finance models. Numerical experiments performed on GPUs show the efficiency of the proposed method.

Farshid Balaneji (University of Basel) and Dietmar Maringer (University of Basel). 

Title: Applying Sentiment Analysis, Topic Modeling and XGBoost to Classify Implied Volatility.

Abstract. Implied volatility is an important indicator of market participants' expectations about future fluctuations in the options market. This paper evaluates whether combining topics and sentiment scores extracted from mainstream financial news can improve forecasting of the directional changes of the expected implied volatility index for the next month (iv30call). We select six stocks from the Dow Jones list of companies and acquire over 190,000 news articles published between January 2019 and September 2019. By building text processing and topic modeling pipelines, we examine (i) the role of the daily mean and median of sentiment scores; and (ii) the influence of topic models on the classification metrics. The results demonstrate that adding the topic model has a positive effect on accuracy, yielding higher accuracy in classifying the iv30call of the next business day in five out of six companies. The outcome also suggests that, for the selected assets, applying the mean of the daily sentiment scores improves the models' accuracy in comparison to the daily median.

Takanobu Mizuta (SPARX Asset Management Co., Ltd.), Isao Yagi (Faculty of Informatics, Kogakuin University) and Kosei Takashima (Faculty of Economics and Business Administration, Nagaoka University). 

Title: Instability of financial markets by optimizing investment strategies investigated by an agent-based model.

Abstract. Most finance studies are based on several hypotheses, for example, that investors rationally optimize their investment strategies. However, the hypotheses themselves are sometimes criticized: market impact, whereby investors' trades can move market prices, may make such optimization impossible. In this study, we built an artificial market model by adding to a prior model technical-analysis strategy agents that search for one optimized parameter over a whole simulation run, and investigated whether investors' inability to accurately estimate market impact in their optimizations leads to optimization instability. In our results, the investment-strategy parameter never converged to a specific value but continued to change. This means that even when all other traders are fixed and only one investor uses backtesting to optimize his/her strategy, the time evolution of market prices becomes unstable. Optimization instability is one level higher than "non-equilibrium of market prices." Therefore, the time evolution of market prices produced by investment strategies with such unstable parameters is highly unlikely to be predictable or to obey stable laws written as equations. This nature calls into question whether financial markets satisfy the principle of the uniformity of nature and indicates the difficulty of building an equation model explaining the time evolution of prices.

Kheng Kua (School of Computer Science and Engineering, UNSW) and Aleksandar Ignjatovic (School of Computer Science and Engineering, UNSW). 

Title: Iterative Filtering Algorithms for Computing Consensus Analyst Estimates.

Abstract. In equity investment management, sell-side analysts serve an important role in forecasting metrics of companies' financial performance. These estimates are often produced in an opaque manner: the process by which an estimate is initiated or revised is not directly observable. With multiple analysts covering the same company, and each analyst covering multiple companies, we have an n-to-m relationship. The systematic capture of analyst estimates provides a quantitative proxy for market sentiment. Thus far, the academic literature analysing this dataset has resorted to relatively simple methods for aggregating the individual estimates to arrive at a consensus estimate.

In this paper we propose a novel method for aggregating analyst estimates utilising iterative filtering algorithms. This work is inspired by applications of such classes of algorithms to the robust aggregation of sensor network data and online reviews. We conduct experiments using real-world datasets to demonstrate the efficacy of this approach. The results suggest iterative filtering methods improve upon the forecast accuracy of the consensus forecast compared to the simple mean consensus.
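A minimal iterative-filtering aggregator of the kind this work builds on, where analysts far from the current consensus are down-weighted and the weighted mean is recomputed until convergence; the weighting function and data are illustrative assumptions:

```python
import numpy as np

def iterative_filter(estimates: np.ndarray, n_iter: int = 50, eps: float = 1e-9) -> float:
    consensus = estimates.mean()              # start from the simple mean consensus
    for _ in range(n_iter):
        # Weight each analyst by inverse squared distance from the consensus.
        weights = 1.0 / ((estimates - consensus) ** 2 + eps)
        new_consensus = float(np.average(estimates, weights=weights))
        if abs(new_consensus - consensus) < 1e-12:
            break
        consensus = new_consensus
    return consensus

analyst_eps = np.array([1.10, 1.12, 1.08, 1.11, 2.50])  # one outlying estimate
print(iterative_filter(analyst_eps))  # pulled toward the agreeing majority
```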

Nicholas Baard (University of the Witwatersrand) and Terence Van Zyl (University of Johannesburg). 

Title: Twin-Delayed Deep Deterministic Policy Gradient Algorithm for Portfolio Selection.

Abstract. State-of-the-art reinforcement learning algorithms have shown suboptimal performance in some market conditions with regard to the portfolio selection problem. The reason for suboptimal performance could be overestimation bias in actor-critic methods arising from the use of neural networks as the function approximator. The resulting bias leads to a suboptimal policy being learned by the agent, hindering performance. This research focuses on using the Twin-Delayed Deep Deterministic Policy Gradient (TD3) algorithm for portfolio selection to achieve better results than previously reported. In addition, an analysis of the overall effectiveness of the algorithm in various market conditions is needed to determine the TD3's robustness. This research establishes a reinforcement learning environment for portfolio selection and trains the TD3 alongside three state-of-the-art algorithms in five different market conditions. The algorithms are tested by allowing the agent to manage a portfolio in each market for a specified period, and the results form the basis of the analysis. The research shows improved results achieved by the TD3 algorithm for portfolio selection compared to other state-of-the-art algorithms. Furthermore, the performance of the TD3 across the five selected markets demonstrates the robustness of the algorithm for the portfolio selection problem.

Sander Noels (Ghent University & Silverfin), Benjamin Vandermarliere (Silverfin), Ken Bastiaensen (Silverfin) and Tijl De Bie (Ghent University). 

Title: An Earth Mover’s Distance Based Graph Distance Metric For Financial Statements.

Abstract. Quantifying the similarity between a group of companies has proven to be useful for several purposes, including company benchmarking, fraud detection, and searching for investment opportunities. This exercise can be done using a variety of data sources, such as company activity data and financial data. However, ledger account data is widely available and is standardized to a large extent. Such ledger accounts within a financial statement can be represented by means of a tree, i.e. a special type of graph, representing both the values of the ledger accounts and the relationships between them. Given their broad availability and rich information content, financial statements form a prime data source based on which company similarities or distances could be computed. In this paper, we present a graph distance metric that enables one to compute the similarity between the financial statements of two companies. We conduct a comprehensive experimental study using real-world financial data to demonstrate the usefulness of our proposed distance metric. The experiments show promising results on a number of use cases. This method may be useful for investors looking for investment opportunities, government officials attempting to identify fraudulent companies, and accountants looking to benchmark a group of companies based on their financial statements.
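The underlying building block can be illustrated with SciPy's one-dimensional Earth Mover's Distance; the tree structure of the ledger accounts that the paper exploits is omitted here, so the flattened account vectors below are purely illustrative:

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Normalized ledger-account values for two hypothetical companies.
company_a = np.array([0.40, 0.30, 0.20, 0.10])   # cash, receivables, inventory, debt
company_b = np.array([0.10, 0.20, 0.30, 0.40])

positions = np.arange(4)                          # account positions as support points
print(wasserstein_distance(positions, positions, company_a, company_b))
```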

Rui Ying Goh (University of Edinburgh), Galina Andreeva (University of Edinburgh) and Yi Cao (University of Edinburgh). 

Title: Predicting financial volatility from personal transactional data.

Abstract. Cash flow transactions of individuals fluctuate over time and can be irregular. Financial volatility measures the variation of an individual's financial behaviours, i.e., the degree of uncertainty arising from cash flow fluctuations. Evaluating financial volatility is important in order to identify potential risk behaviours that may harm financial wellbeing. This study predicts financial volatility from transactional patterns to examine risky behaviours. We develop a financial volatility composite index as the target variable, which simultaneously accounts for fluctuations in income, expenditure, and financial buffer. We then fit a linear regression model to investigate the relationship between transactional behaviours and financial volatility. Lastly, we compare the performance of statistical and machine learning techniques in predicting financial volatility. We discover risky volatile behaviours that imply financial difficulty. High financial volatility is risky if it is associated with potential financial struggles that require long-term dependence on overdrafts, lower spending on fixed and living costs, or problems keeping up with regular financial commitments. Low financial volatility is harmful if it is associated with restricted transactions due to extreme negative balances or consistently heavy overdraft usage. In general, the proposed financial volatility predictive model provides insights for understanding the implicit risk of customers and even their vulnerability.

Taiga Saito (Graduate School of Economics, The University of Tokyo) and Akihiko Takahashi (Graduate School of Economics, The University of Tokyo). 

Title: Portfolio optimization with choice of a probability measure.

Abstract. This paper considers a new problem for portfolio optimization with a choice of a probability measure, particularly an optimal investment problem under sentiments. Firstly, we formulate the problem as a sup-sup-inf problem consisting of optimal investment and a choice of a probability measure expressing aggressive and conservative attitudes of the investor. This problem also includes the case where the agent has conservative and neutral views on risks represented by Brownian motions and degrees of conservativeness differ among the risks.

Secondly, we obtain an expression of the volatility process of a backward stochastic differential equation related to the conservative sentiment in order to investigate cases where the sup-sup-inf problem is solved. Specifically, we take a Malliavin calculus approach to solve the problem and obtain an optimal portfolio process. Finally, we provide an expression of the optimal portfolio under the sentiments in two examples with stochastic uncertainties in an exponential utility case and investigate the impact of the sentiments on the portfolio process.

Fatim Habbab (University of Essex), Michael Kampouridis (University of Essex) and Alexandros Voudouris (University of Essex). 

Title: Optimizing Mixed-Asset Portfolios Involving REITs.

Abstract. Real Estate Investment Trusts (REITs) are a popular investment choice, as they allow investors to hold shares in real estate rather than investing large sums of money to purchase real estate themselves. Previous work studied the effectiveness of multi-asset portfolios that include REITs via an efficient frontier analysis. However, the advantages of including (both domestic and international) REITs in multi-asset portfolios, as well as analyzing all the possible combinations of asset classes, have not been investigated before. In this paper, we fill this gap by performing a thorough investigation across 456 different portfolios to demonstrate the added value of including REITs in mixed-asset portfolios in terms of several important financial metrics. To this end, we use a genetic algorithm approach to maximize the Sharpe ratio of the portfolios. Our results show that optimization via a genetic algorithm outperforms the results obtained from a global minimum variance portfolio. More importantly, our results also show that there can be significant improvements in average returns, risk, and Sharpe ratio when including REITs.
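A toy genetic algorithm maximizing the portfolio Sharpe ratio, the objective named in the abstract; population size, mutation scale, and the synthetic returns are illustrative assumptions only:

```python
import numpy as np

rng = np.random.default_rng(2)
returns = rng.normal(0.0005, 0.01, size=(500, 6))   # synthetic daily returns, 6 assets

def sharpe(w: np.ndarray) -> float:
    port = returns @ w
    return port.mean() / port.std()

def normalize(w: np.ndarray) -> np.ndarray:
    w = np.abs(w)
    return w / w.sum()                              # long-only, fully invested

pop = [normalize(rng.random(6)) for _ in range(40)]
for _ in range(200):
    pop.sort(key=sharpe, reverse=True)
    parents = pop[:20]                              # keep the fitter half (elitism)
    children = [normalize(p + rng.normal(0, 0.05, 6)) for p in parents]  # mutation
    pop = parents + children

pop.sort(key=sharpe, reverse=True)
print("best in-sample Sharpe:", sharpe(pop[0]))
```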

Haohang Li (Stevens Institute of Technology) and Steve Yang (Stevens Institute of Technology). 

Title: Impact of False Information from Spoofing Strategies: An ABM Model of Market Dynamics.

Abstract. Spoofing has been identified as a form of market manipulation, and it is harmful to the stability of the financial market. However, the effect of spoofing activity is hard to analyze due to its complex interactions within the market and a lack of data. This paper presents an agent-based simulation model of the continuous double auction market to replicate and analyze market dynamics under spoofing conditions. The simulated market consists of fundamentalist, chartist, zero-intelligence, and spoofing agents, and several known market stylized facts are reproduced and validated. The results show that in the presence of the spoofing agents and their market manipulation activities, market volatility would increase and spoofing activities would exacerbate price variations. The fundamentalist agents would suffer a loss during the spoofing period but would be able to make a profit during the price recovery phase. The chartist agents, who falsely believe the price movement trend will continue, would suffer a loss when the spoofing agent realizes its profit and the price recovers to its normal condition. The Sharpe ratio analysis also indicates that the spoofing agent's market manipulation activities would give it an unfair advantage, resulting in a significantly higher Sharpe ratio than the other agents.

Rodica Ioana Lung (Babes-Bolyai University). 

Title: A game theoretic based k-nearest neighbor approach for binary classification.

Abstract. K-nearest neighbor is one of the simplest and most intuitive binary classification methods, providing robust results on a wide range of data. However, classification results can be improved by using a decision method that is capable of assigning, if necessary, the minority label from the list of neighbors of a tested instance. In this paper, we propose a simple game-theoretic model to assign labels based on the information provided by the neighbors, in order to enhance performance for binary classification.
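The decision step at issue can be seen in a toy example: plain k-nearest neighbor takes the majority label among the k neighbors, whereas the game-theoretic rule may assign the minority label when warranted. The simple threshold below merely stands in for that richer decision model; data and query are invented:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.array([[0.0], [0.2], [0.35], [0.9], [1.4]])
y = np.array([0, 0, 0, 1, 1])

nn = NearestNeighbors(n_neighbors=3).fit(X)
_, idx = nn.kneighbors([[0.6]])
votes = y[idx[0]]                                 # labels of the 3 nearest neighbors

print("majority vote:", np.bincount(votes).argmax())   # -> 0
print("relaxed rule :", 1 if votes.sum() >= 1 else 0)  # minority label with any support
```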

Eva Christodoulaki (University of Essex), Michael Kampouridis (University of Essex) and Panagiotis Kanellopoulos (University of Essex). 

Title: Technical and Sentiment Analysis in Financial Forecasting with Genetic Programming.

Abstract. Financial forecasting is a popular and thriving research area that relies on indicators derived from technical and sentiment analysis. In this paper, we investigate the advantages that sentiment analysis indicators provide by comparing their performance to that of technical indicators, when both are used individually as features in a genetic programming algorithm focusing on the maximization of the Sharpe ratio. Moreover, while previous sentiment analysis research has focused mostly on the titles of articles, in this paper we use the text of the articles and their summaries. Our goal is to explore all possible sentiment features further and identify which features contribute the most. We perform experiments on 26 different datasets and show that sentiment analysis produces better, and statistically significant, average results than technical analysis in terms of Sharpe ratio and risk.

Gregor Lenhard (University of Basel) and Dietmar Maringer (University of Basel). 

Title: State-ANFIS: A Generalized Regime-Switching Model for Financial Modeling.

Abstract. This paper presents an extension to the adaptive neuro-fuzzy inference system (ANFIS) of Jang (1993), called State-ANFIS (S-ANFIS), that is able to model nonlinear functions through a weighted combination of models. In this context, one often observes several variables that determine the regime of a system. S-ANFIS distinguishes cases based on external state variables and produces a weighted output of linear models. An application of S-ANFIS to artificially generated time series data is shown and compared to its base model and other neural networks. In addition, an application to a well-known dataset, the three factors of Fama and French (2021) describing stock returns, is presented to underline the usefulness of the model. The work contributes to the existing regime-switching literature, such as smooth transition models, in that it is able to utilize arbitrarily many state variables.

Wilson Tsakane Mongwe (University of Johannesburg), Tendo Sidogi (University of Johannesburg), Rendani Mbuvha (University of the Witwatersrand) and Tshilidzi Marwala (University of Johannesburg). 

Title: Probabilistic Inference of South African Equity Option Prices Under Jump-Diffusion Processes.

Abstract. Jump-diffusion processes have been utilised to capture the leptokurtic nature of asset returns and to fit the market-observed option volatility skew with great success. These models can be calibrated to historical share price data or to forward-looking option market data. In this work, we infer South African equity option prices using the Bayesian inference framework. This approach allows one to obtain uncertainties in the parameters of the calibrated models and confidence intervals for any predictions produced with the models. We calibrate the one-dimensional Merton jump-diffusion model to European put and call option data on the All-Share price index using Markov Chain Monte Carlo methods: the Metropolis Adjusted Langevin Algorithm, Hamiltonian Monte Carlo, and the No-U-Turn Sampler. Our approach produces a distribution of the jump-diffusion model parameters, which can be used to build economic scenario generators and to price exotic options such as those embedded in life insurance contracts. The empirical results show that our approach can, on test data, exactly price all put options regardless of their moneyness, with slight mispricing on very deep in-the-money calls.
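For reference, the Merton jump-diffusion call price being calibrated can be written as a Poisson-weighted sum of Black-Scholes prices; the parameter values below are illustrative only:

```python
from math import exp, factorial, log, sqrt
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm.cdf(d1) - K * exp(-r * T) * norm.cdf(d2)

def merton_call(S, K, T, r, sigma, lam, mu_j, sig_j, n_terms=40):
    m = exp(mu_j + 0.5 * sig_j ** 2)          # mean relative jump size
    lam_bar = lam * m
    price = 0.0
    for n in range(n_terms):                  # condition on the number of jumps
        sigma_n = sqrt(sigma ** 2 + n * sig_j ** 2 / T)
        r_n = r - lam * (m - 1) + n * log(m) / T
        weight = exp(-lam_bar * T) * (lam_bar * T) ** n / factorial(n)
        price += weight * bs_call(S, K, T, r_n, sigma_n)
    return price

print(merton_call(S=100, K=100, T=0.5, r=0.02, sigma=0.2,
                  lam=0.5, mu_j=-0.1, sig_j=0.15))
```

An MCMC calibration of the kind described would place priors on (sigma, lam, mu_j, sig_j) and sample their posterior given observed option prices.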

Andrew Paskaramoorthy (University of the Witwatersrand), Terence van Zyl (University of Johannesburg) and Tim Gebbie (University of Cape Town). 

Title: An Empirical Comparison of Cross-Validation Procedures for Portfolio Selection.

Abstract. We present the constrained portfolio selection problem as a learning problem requiring hyper-parameter specification. In practice, validation procedures estimate algorithm performance and select optimal hyper-parameters. The quality of the selected hyper-parameter depends on the performance of the validation procedure. The literature has not previously investigated the relationship between these validation procedures and the portfolio selection problem. This study empirically examines the behaviour of common validation procedures, including holdout, k-fold cross-validation, Monte Carlo cross-validation, and repeated k-fold cross-validation, for estimating performance and selecting hyper-parameters for constrained portfolio selection. The results demonstrate that repeated k-fold cross-validation is the best-performing procedure; we recommend using 5 repetitions with 3 < k < 6 in practice.
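The recommended procedure maps directly onto scikit-learn's RepeatedKFold; the Ridge estimator and synthetic data below are generic stand-ins for the constrained portfolio-selection setting:

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold, cross_val_score
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.random((200, 10))
y = X @ rng.random(10) + 0.1 * rng.standard_normal(200)

# k = 5 splits (within the 3 < k < 6 range) and 5 repetitions, as recommended.
cv = RepeatedKFold(n_splits=5, n_repeats=5, random_state=0)
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=cv)
print(scores.mean(), scores.std())
```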

Ismail Mohamed (University of Kent) and Fernando Otero (University of Kent). 

Title: A Performance Study of Multiobjective Particle Swarm Optimization Algorithms for Market Timing.

Abstract. Market timing is the issue of deciding when to buy or sell a given asset on a financial market. As market timing is one of the core issues in algorithmic trading systems, designers of such systems have turned to computational intelligence methods to aid them in this task. In our previous work, we introduced a number of Particle Swarm Optimization (PSO) algorithms to compose strategies for market timing using a novel training and testing methodology that reduced the likelihood of overfitting, and tackled market timing as a multiobjective optimization problem. In this paper, we provide a detailed analysis of these multiobjective PSO algorithms and address two limitations of the previously presented results. The first limitation is that the PSO algorithms had not been compared to well-known algorithms or market timing techniques. We address this by comparing the results obtained against NSGA-II and MACD, a technique commonly used in market timing strategies. The second limitation is that we had no insight into the diversity of the Pareto sets returned by the algorithms. We address this by using RadViz to visualize the Pareto sets returned by all the algorithms, including NSGA-II and MACD. The results show that the multiobjective PSO algorithms return statistically significantly better results than NSGA-II and MACD. We also observe that the multiobjective PSOSP algorithm consistently displayed the best spread in its returned Pareto sets, despite not having any explicit diversity-promoting measures.