This data, both structured and unstructured, comes from traditional and alternative sources. The buy side’s overriding concern is whether these increasingly large and complex data sets can help predict market behavior. The answer lies in machine learning, according to Bruno Dupire, head of quantitative research at Bloomberg.
“It’s natural to use machine learning as opposed to classical statistical methods, even if conceptually they are not very different,” Dupire says. “In one sense, machine learning is just advanced statistics. Both are trying to extract the signal from the noise. And, in finance, there is always a lot of noise.”
Dupire notes that while these methods do share commonalities, machine learning has a unique ability to systematically exploit nonlinearities — or relationships in which the output does not change in proportion to a change in the inputs.
Specifically, machine learning works with much higher-dimensional representations of data than are possible with the standard statistical toolkit. Machine learning can help make sense of vast amounts of unstructured data by mapping it into a numerical format that is more amenable to algorithmic analysis. Many examples of this are by now familiar, including forecasting based on light intensity in a city or the number of cars in a retailer’s parking lot.
Unstructured data
One example of how machine learning can process unstructured data is sentiment analysis of news articles and social media posts. This is not a deep syntactic analysis, Dupire explains, but rather an automatic mapping of text to its probability of being positive, neutral or negative. The algorithm is “trained” to do this with a large set of text that people have already classified correctly. Using support vector machines (SVMs), long short-term memory (LSTM) networks and many other techniques, Dupire and his team observe and correct the algorithm’s ability to detect sentiment.
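The supervised setup described here can be illustrated with a toy example. The production models mentioned above use SVMs and LSTMs; the sketch below substitutes a much simpler naive Bayes classifier, and every training sentence and label is invented, purely to show the train-on-labelled-text, classify-new-text loop:

```python
from collections import Counter, defaultdict
import math

def train(samples):
    """Fit a multinomial naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> word frequencies
    label_counts = Counter()             # label -> number of documents
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(model, text):
    """Return the label with the highest log-posterior score."""
    word_counts, label_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log prior + Laplace-smoothed log likelihood of each word
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented examples standing in for human-labelled training text
labelled = [
    ("strong earnings beat expectations", "positive"),
    ("record profit and upbeat guidance", "positive"),
    ("shares plunge on weak demand", "negative"),
    ("profit warning and weak outlook", "negative"),
]
model = train(labelled)
print(classify(model, "upbeat earnings"))  # prints: positive
```

A real system would replace this classifier with the SVM or LSTM models the article names and train on far larger corpora, but the supervision pattern is the same: labelled examples in, a scoring function out.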
“Once the algorithm is tuned, we let it analyze live tweets,” Dupire says. “The way this is used on the Bloomberg Terminal service is the social velocity function, which can analyze the sentiment for a stock in five-minute increments.”
This kind of supervised learning can also be applied to exotic option pricing. In this scenario, the goal is to teach the algorithm to understand the deterministic relationship between various deal parameters and the price of the option. Because these deterministic functions are very complex, each price evaluation can be extremely time-consuming. This is where machine learning lends an advantage, because the algorithm can be trained on examples of prices generated from a wide variety of market situations and deal parameters.
“Once you have a database of actual prices, you can generate a pricing function that can be applied to conditions and parameters the algorithm hasn’t seen before,” Dupire says. “If you have pre-computed a rich set of data, the interpolation method can be quite fast to price new deals.”
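The pre-compute-then-interpolate idea can be sketched concretely. As a stand-in for a genuinely expensive exotic pricer, the toy below uses the Black-Scholes call formula (the strike, rate and maturity values are invented), tabulates prices offline on a (spot, volatility) grid, and then prices unseen parameter combinations by fast bilinear interpolation:

```python
import math
from bisect import bisect_right

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, vol, strike=100.0, rate=0.01, maturity=1.0):
    """Stand-in for an expensive pricer: Black-Scholes call price."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * maturity) \
         / (vol * math.sqrt(maturity))
    d2 = d1 - vol * math.sqrt(maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

# Offline step: pre-compute prices on a (spot, vol) grid
spots = [80 + 2 * i for i in range(21)]       # 80 .. 120
vols  = [0.10 + 0.02 * j for j in range(16)]  # 0.10 .. 0.40
grid  = [[bs_call(s, v) for v in vols] for s in spots]

def interp_price(spot, vol):
    """Online step: bilinear interpolation on the pre-computed grid."""
    i = min(bisect_right(spots, spot), len(spots) - 1) - 1
    j = min(bisect_right(vols, vol), len(vols) - 1) - 1
    ws = (spot - spots[i]) / (spots[i + 1] - spots[i])
    wv = (vol - vols[j]) / (vols[j + 1] - vols[j])
    return ((1 - ws) * (1 - wv) * grid[i][j] + ws * (1 - wv) * grid[i + 1][j]
            + (1 - ws) * wv * grid[i][j + 1] + ws * wv * grid[i + 1][j + 1])

# Price a (spot, vol) combination the grid has never seen directly
print(interp_price(97.3, 0.23), bs_call(97.3, 0.23))
```

The interpolated price stays within a few hundredths of the direct computation on this grid spacing, while the online lookup avoids re-running the pricer. A learned regression over the pre-computed examples plays the same role when the parameter space is too high-dimensional for a simple grid.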
Structured data
In terms of structured data, machine learning can be immensely helpful in filtering out broader market effects to understand the behavior of a single security. Within Dupire’s group, this effort is called Project 499 because it attempts to estimate the returns of one stock in the S&P 500 based on information about the remaining 499.
“If a stock you are interested in loses 4 percent on a given day but the market was down by 3 percent, that information is not really informative,” he says. “To understand what’s happening, you need to mute the influence of global factors that affect the market or sector as a whole.”
For example, a stock price is generally understood to drop by the amount of the dividend at the ex-date. But the dividend yield is typically 0.5 percent per quarter, often smaller than the stock’s typical daily move. In other words, the dividend effect is buried in the noise: if you want to know whether the market is reacting correctly to the ex-date, looking at the raw time series will not be of much value.
“We’re observing the real return of one stock and comparing it with our best estimate based on what we know about the others,” Dupire says. “The difference constitutes the surprise, which is the quantity we are interested in. With machine learning, we can systematically filter out other factors, get a purified time series, kill a lot of the noise and get a much cleaner analysis.”
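The filtering step Dupire describes can be shown in miniature. Project 499 estimates one stock from the other 499 with machine learning; the sketch below shrinks that to a single market proxy and plain one-factor OLS, with invented daily returns, purely to show how the residual isolates the "surprise":

```python
def beta_residuals(stock, market):
    """Regress stock returns on a market proxy (OLS); residuals are the surprise."""
    n = len(stock)
    ms, mm = sum(stock) / n, sum(market) / n
    cov = sum((s - ms) * (m - mm) for s, m in zip(stock, market))
    var = sum((m - mm) ** 2 for m in market)
    beta = cov / var
    alpha = ms - beta * mm
    return [s - (alpha + beta * m) for s, m in zip(stock, market)]

# Invented daily returns: the stock tracks the market with beta ~1.2,
# except for a company-specific +3% shock on day 3.
market = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020]
idiosyncratic = [0.0, 0.0, 0.0, 0.03, 0.0, 0.0]
stock = [1.2 * m + e for m, e in zip(market, idiosyncratic)]

residuals = beta_residuals(stock, market)
surprise_day = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
print(surprise_day)  # prints: 3 -- the filtered series flags the shock day
```

The raw return on day 3 is partly market movement; the residual series strips that component out, so the company-specific shock stands out clearly, which is exactly the "purified time series" the quote describes in its full 499-stock, machine-learned form.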
Factor investing
Factor-based strategies are an excellent fit for machine learning. The number of factors to consider is extensive, ranging from classical elements such as P/E ratio and long-term debt to supply chain data to option data from volatility surfaces. Machine learning can help determine the best possible combination of factors faster and with greater flexibility.
“Machine learning helps us use any available information and exploit it to see if there is a combination of factors that will allow us to better explain future returns,” Dupire says. “In this case, the inputs are all of these characteristics of the stock and the output is future performance.”
No matter what factors you want to consider, the steps in the process are the same. First, you define your universe of stocks, whether that is companies with high diversity or low volatility. Then you choose the characteristics to combine into a composite factor and apply filters to select a sub-universe of stocks to consider for investment. Machine learning accelerates the process so that it is much easier to apply grouping and sorting tools, normalize volatility of various groups, risk-equalize different factors and see where changes in value are distributed to various sectors.
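The steps above — choose characteristics, normalize them, combine them into a composite factor, then filter a sub-universe — can be sketched as follows. The tickers, factor names and values are all invented for illustration:

```python
def zscores(values):
    """Standardize a factor cross-sectionally so factors are comparable."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]

def composite_rank(universe, factors, weights):
    """Combine z-scored factors into one score per stock, then rank."""
    cols = [zscores([factors[name][s] for s in universe]) for name in factors]
    scores = {s: sum(w * col[i] for w, col in zip(weights, cols))
              for i, s in enumerate(universe)}
    return sorted(universe, key=lambda s: scores[s], reverse=True)

# Invented universe and factor values. Higher earnings yield is better;
# volatility enters with a negative weight because lower is better.
universe = ["AAA", "BBB", "CCC", "DDD"]
factors = {
    "earnings_yield": {"AAA": 0.08, "BBB": 0.03, "CCC": 0.06, "DDD": 0.02},
    "volatility":     {"AAA": 0.15, "BBB": 0.30, "CCC": 0.18, "DDD": 0.35},
}
ranked = composite_rank(universe, factors, weights=[1.0, -1.0])
print(ranked[:2])  # prints: ['AAA', 'CCC'] -- the selected sub-universe
```

In practice the universe runs to hundreds of names and dozens of factors, and the weights themselves can be learned rather than fixed, but the normalize-combine-filter pipeline is the one the paragraph describes.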
Another practical application of machine learning is identifying which strategies will work best according to market conditions. This is a similar concept to Project 499, because you can select any number of factors to describe the state of the market, find periods in the past that correspond to these factors, and analyze which strategies worked well during these periods. Essentially, the algorithm informs you how to rotate through various strategies as market conditions evolve.
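One simple way to implement the find-similar-periods idea is nearest-neighbor matching on a market-state vector. The sketch below is an assumption about the approach, not the actual system; the state features (volatility level, trailing return) and the per-period strategy returns are invented:

```python
def similar_periods(history, current_state, k=3):
    """Indices of the k past periods whose state is closest (Euclidean)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return sorted(range(len(history)),
                  key=lambda i: dist(history[i], current_state))[:k]

def best_strategy(history, strategy_returns, current_state, k=3):
    """Pick the strategy with the best average return in similar regimes."""
    idx = similar_periods(history, current_state, k)
    avg = {name: sum(rets[i] for i in idx) / k
           for name, rets in strategy_returns.items()}
    return max(avg, key=avg.get)

# Invented past market states: (volatility level, trailing market return)
history = [(0.12, 0.04), (0.30, -0.06), (0.14, 0.03), (0.28, -0.05), (0.13, 0.05)]
# Invented returns of two candidate strategies in each past period
strategy_returns = {
    "momentum":  [0.03, -0.04, 0.02, -0.03, 0.04],  # works in calm up-trends
    "defensive": [0.00,  0.02, 0.01,  0.03, 0.00],  # works in stressed markets
}
pick = best_strategy(history, strategy_returns, current_state=(0.29, -0.04))
print(pick)  # prints: defensive -- stressed analogs dominate the neighbors
```

Given a high-volatility, falling-market state, the closest historical analogs are the stressed periods, so the rotation logic favors the defensive strategy, mirroring the regime-dependent selection the paragraph describes.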
However, Dupire cautions that the rules of the market tend to change quickly and that the market itself is a machine “made to destroy signals.” “When we talk about factor investing, everyone is trying to find a new characteristic that may have some predictive powers,” he says. “Every time you discern a signal that is exploitable, it’s likely you’re not the only one. To discover these opportunities and act on them, you need the proper tools. That’s what we are trying to do with machine learning.”
Data volumes and complexity will only grow as more market participants understand and appreciate the vast web of interconnectivity that defines much of the financial world. Each of these connection points could be transformed into fuel for forecasting with the help of machine learning. At the same time, advances in analytical techniques, combined with continuously improving computing power, are expanding the practical applications of this exciting technology even further.
Financial firms are increasingly using previously cumbersome, inconsistent, non-traditional data to inform their investment decisions and risk management strategies. While firms recognize the value of newer datasets for alpha generation, they face challenges like data connectivity, varying quality and ease of use. By offering clients a single access point for finding and receiving reliable data, alternative and otherwise, Bloomberg reduces the operational burden of data procurement, speeding time to value and enabling easy integration with existing systems.
Learn more about Bloomberg's Enterprise Data and alternative data solution.