Predicting the Future of Chinese Stocks


Finance, like many fields, has become increasingly siloed, especially from technology. There are people who know finance and there are others who can code. This is the situation I faced when I arrived as a summer analyst at JP Morgan in 2017.

Early on it became apparent that the two groups overlapped too little, and that there was much to be gained by integrating them. So I set out to develop an algorithm that used the emerging techniques brought by the software teams while also meeting the needs of the financial analysts. Because our division focused primarily on Chinese equities, I set myself the task of developing a machine learning algorithm that could outperform the archaic research methods then in use. This is how I did it.

Building a hypothesis

I began this project where I always do: at the end. What do the financial analysts want? What could they have right now that could make their lives easier and their decisions more robust? So I started to talk to people, as many as I could, and as I did the message became abundantly clear: there is too much information.

When an analyst wants to determine where a market is going, he or she can look at hundreds, if not thousands, of variables: not only the status of these variables today, but also their historical values and their current momentum. For an analyst with finite brainpower, this is an impossible task.

So to solve this problem I decided to build a machine learning algorithm that could look at a set of factors and determine where the market was heading, based on historical market environments similar to today's.

After a little research, I realized that when the market environment (the aggregated movement of different market factors, like the three listed above) is similar to a historical one, the outcomes tend to be the same. This is shown above: the market environments in March 2014 and July 2016, each composed of the purchasing managers' index, the consumer confidence index, and US market movements, were very similar, and the resulting market movements were similar as well.

Launching the idea

With a hypothesis in hand, I set out to build an algorithm that could do this on a large scale. After going through multiple iterations and hearing feedback from both programmers and analysts, I landed on an algorithmic design:

This model could pull in factor data, process it by breaking it down into first and second derivatives, aggregate the factor results, search for similar historical instances, output predictions, and then continuously back-test the algorithm to fine-tune its predictive strength.
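To make the idea concrete, here is a minimal sketch of a pipeline of this shape, not the production code. It assumes the factor data arrives as a plain array with one row per period and one column per factor. The names `build_features` and `predict_direction`, the use of discrete differences as stand-ins for the first and second derivatives, the Euclidean distance, and the choice of `k` nearest neighbors are all illustrative assumptions.

```python
import numpy as np


def build_features(levels):
    """Stack each factor's level, first difference, and second difference.

    levels: array of shape (n_periods, n_factors).
    Returns an array of shape (n_periods - 2, 3 * n_factors); row i
    describes period i + 2 of the input.
    """
    levels = np.asarray(levels, dtype=float)
    first = np.diff(levels, n=1, axis=0)   # momentum, shape (n - 1, f)
    second = np.diff(levels, n=2, axis=0)  # change in momentum, shape (n - 2, f)
    feats = np.hstack([levels[2:], first[1:], second])

    # Standardize each column so no single factor dominates the distance.
    # Simplification: this uses full-sample statistics, which a real
    # back-test would have to replace with statistics known at the time.
    std = feats.std(axis=0)
    std[std == 0] = 1.0
    return (feats - feats.mean(axis=0)) / std


def predict_direction(features, next_returns, k=3):
    """Predict the sign of the next market move for the latest period.

    features: (n, d) array whose last row is "today".
    next_returns: length n - 1 array; next_returns[i] is the market
        return that followed historical period i.
    Returns 1 for an expected rise and -1 otherwise.
    """
    features = np.asarray(features, dtype=float)
    next_returns = np.asarray(next_returns, dtype=float)
    today, history = features[-1], features[:-1]

    # Distance from today's environment to every historical environment.
    dists = np.linalg.norm(history - today, axis=1)
    nearest = np.argsort(dists)[:k]

    # Average what happened after the k most similar environments.
    return 1 if next_returns[nearest].mean() > 0 else -1
```

In use, `build_features` would be called on the factor history, and `predict_direction` on the result together with the returns that followed each historical period. Averaging over several neighbors rather than trusting the single closest match is one simple way to make the prediction less sensitive to a single unusual month.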

Testing the results

After I implemented this architecture in Python, the testing process began. I ran my algorithm through a k-fold assessment, which back-tested the model over multiple time periods. The predictions proved strong, as shown in the example below for early 2017:

Over a 20-month testing period, this algorithm achieved an 85% success rate in predicting Chinese market movements.
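One plausible way to read a success rate like this is as a directional hit rate: the share of periods in which the predicted direction matched the sign of the realized return. The sketch below computes that rate overall and per fold. It assumes the folds are contiguous blocks of time rather than shuffled samples, and the names `hit_rate` and `fold_hit_rates` are hypothetical, not taken from the original code.

```python
import numpy as np


def hit_rate(predicted, actual):
    """Fraction of periods where the predicted sign matches the actual sign."""
    predicted = np.sign(np.asarray(predicted, dtype=float))
    actual = np.sign(np.asarray(actual, dtype=float))
    return float(np.mean(predicted == actual))


def fold_hit_rates(predicted, actual, k=4):
    """Split the test period into k contiguous blocks and score each one."""
    pred_folds = np.array_split(np.asarray(predicted, dtype=float), k)
    act_folds = np.array_split(np.asarray(actual, dtype=float), k)
    return [hit_rate(p, a) for p, a in zip(pred_folds, act_folds)]
```

Looking at the rate fold by fold, and not only in aggregate, helps show whether the model performs consistently or owes its headline number to one favorable stretch.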

When this algorithm was used to decide whether to buy or sell Chinese stocks, it created significantly more value than the standard buy-and-hold strategy used by JP Morgan. The stark contrast in the investment returns of the two strategies over a five-year period can be seen below:

Due to its success, this algorithm is now being used by JP Morgan to create value for its clients each and every day.


This is just one of many examples of how technology is not being fully used because of a disconnect between industry experts and software developers. There is a strong need in the business world today for people who can understand business problems and also have a firm grasp of how technology can help.