Sydney
9/255 George Street,
Sydney NSW 2000
02 9221 4066
JANAadmin@jana.com.au
A non-essential read for those interested in game theory and its application to investments.
In yet another win for Artificial Intelligence over humans, it was announced in July 2019 that a system called Pluribus had beaten five elite human professionals in six-player no-limit Texas hold’em poker. Pluribus was developed by Carnegie Mellon University and Facebook. Its win followed that of its predecessor, Libratus, in the two-player version of the game, and those of AlphaZero in chess and Go in 2017. AlphaZero was developed by DeepMind Technologies, which is owned by Alphabet (parent of Google).
These wins are astounding, especially as none of the Artificial Intelligence systems was given any human knowledge, beyond the rules, of poker, which has been around for 200 years, chess, which has been around for 1,500 years, or Go, which has been around for 4,000 years. Instead, they were self-taught through repetitive play using a technique known as “reinforcement learning”.
So what can we humble humans learn from how Pluribus learnt to play poker? The first thing is that Pluribus did not learn to outsmart its opponents. Instead it learnt what is referred to as a Game Theory Optimal (GTO) strategy in poker. This is a complex idea and is more simply explained using the game of Rock, Paper, Scissors:
The GTO strategy in Rock, Paper, Scissors is to choose at random between rock, paper and scissors, playing each one third of the time. While you might be able to exploit an opponent who plays rock more often by playing paper more often, you would open yourself up to being exploited by an opponent who plays scissors more often. By sticking to the GTO strategy, you ensure that you cannot lose on average (even if, in this case, you will not win on average either). If both players employ this GTO strategy, then the game of Rock, Paper, Scissors is said to be in a mixed strategy Nash equilibrium, which is named after the American mathematician John Nash, who was portrayed by Russell Crowe in the 2001 film A Beautiful Mind.
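The claim above can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the original research: the payoff table (+1 for a win, -1 for a loss, 0 for a draw) and the three "heavy" opponent strategies are assumptions chosen purely for illustration.

```python
from fractions import Fraction

# Order of actions: rock, paper, scissors.
# PAYOFF[i][j] is the payoff to a player choosing action i against action j
# (assumed scoring: +1 win, -1 loss, 0 draw).
PAYOFF = [
    [0, -1, 1],   # rock:     draws rock, loses to paper, beats scissors
    [1, 0, -1],   # paper:    beats rock, draws paper, loses to scissors
    [-1, 1, 0],   # scissors: loses to rock, beats paper, draws scissors
]


def expected_payoff(mine, theirs):
    """Average payoff per game when both players pick actions at random
    according to the given probabilities."""
    return sum(
        mine[i] * theirs[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )


third, quarter, half = Fraction(1, 3), Fraction(1, 4), Fraction(1, 2)
gto = [third, third, third]

# Hypothetical opponents who each favour one action.
rock_heavy = [half, quarter, quarter]
paper_heavy = [quarter, half, quarter]
scissors_heavy = [quarter, quarter, half]

# The GTO strategy breaks even against every opponent.
for opponent in (rock_heavy, paper_heavy, scissors_heavy):
    print(expected_payoff(gto, opponent))          # 0 each time

# Playing paper more often exploits the rock-heavy opponent...
print(expected_payoff(paper_heavy, rock_heavy))     # 1/16
# ...but is itself exploited by the scissors-heavy opponent.
print(expected_payoff(paper_heavy, scissors_heavy)) # -1/16
```

Exact fractions are used rather than decimals so that the break-even result shows up as exactly zero.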
In 1951, Nash proved that at least one mixed strategy Nash equilibrium must exist for any game with a finite number of players and a finite set of actions, e.g. poker. By harnessing reinforcement learning and eight days of computing, which cost only around $200 from a cloud computing provider in 2019, Pluribus learnt to play so close to the GTO strategy for six-player no-limit Texas hold’em poker that it was able to win against its elite human professional opponents, who could not play the GTO strategy, at a rate of $1,000 per hour.
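To give a feel for how self-play can arrive at a GTO strategy without being told it, here is a minimal Python sketch for Rock, Paper, Scissors. It uses regret matching, a simple learning rule assumed here for illustration; it is not a description of how Pluribus was built, and the starting biases and number of rounds are arbitrary choices. Each player keeps a running tally of how much better each action would have done than the mix it actually played, and plays actions in proportion to that tally.

```python
# Order of actions: rock, paper, scissors.
# PAYOFF[a][b] is the payoff for playing action a against action b
# (assumed scoring: +1 win, -1 loss, 0 draw).
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1 / 3] * 3  # no regrets yet: play uniformly
    return [p / total for p in positive]


def self_play(iterations=20000):
    """Two regret-matching players learn against each other.
    Returns each player's average strategy over all rounds."""
    # Arbitrary starting biases: player 0 favours rock, player 1 favours paper.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]

    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            mine, theirs = strategies[player], strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * theirs[b] for b in range(3))
                for a in range(3)
            ]
            # Expected payoff of the mix actually played this round.
            current_value = sum(mine[a] * action_values[a] for a in range(3))
            for a in range(3):
                regrets[player][a] += action_values[a] - current_value
                strategy_sums[player][a] += mine[a]

    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(self_play()):
        print(player, [round(p, 3) for p in average])
```

Neither player is told the answer, yet the average strategies drift towards one third each for rock, paper and scissors. Note that it is the average over all rounds that settles down; the strategy played in any single round can keep cycling.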
There are some similarities between poker and investing in the stock market. Both involve making decisions based on incomplete information, putting money into a pool from which the costs of running the pool are deducted, and competing against other participants rather than the operator of the pool i.e. the stock exchanges. The potential future payoffs from individual stock investments are influenced by the actions of other investors, whose positive view of a stock will lead to a higher price today and a lower future payoff, and vice versa.
So is there a GTO strategy for investing in the stock market? If you don’t have access to any information (which is not the case for JANA Spark readers!) then a logical choice may be to invest passively in a stock market index. This is because your share of the pool would be equal to the weighted average share of the participants with information, less the costs deducted from the pool i.e. you will lose but not by much.
However, if you do have information, then it has been shown that the GTO strategy for investing in the stock market is to invest at each opportunity so as to maximise the expected geometric return of your portfolio. This helps to ensure that you do not take too much risk in case your information is wrong, and it maximises the expected long-term growth rate of your portfolio i.e. you will win in the long run.
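Maximising the expected geometric return is often referred to as the Kelly criterion. The Python sketch below illustrates the idea on a deliberately simple, hypothetical repeated bet rather than a real portfolio: the 60% chance of winning and the even-money payout are assumptions made up for the example. It compares the long-run growth rate for different fractions of capital staked each time.

```python
import math


def expected_log_growth(fraction, win_prob, odds):
    """Expected log growth per bet when staking `fraction` of capital.
    A win returns `odds` times the stake; a loss forfeits the stake."""
    return (
        win_prob * math.log(1 + odds * fraction)
        + (1 - win_prob) * math.log(1 - fraction)
    )


def kelly_fraction(win_prob, odds):
    """Closed-form stake that maximises expected log growth for this bet."""
    return win_prob - (1 - win_prob) / odds


def best_fraction_by_search(win_prob, odds, steps=100):
    """Brute-force check: try stakes of 0%, 1%, ..., 99% of capital."""
    candidates = [i / steps for i in range(steps)]
    return max(
        candidates,
        key=lambda f: expected_log_growth(f, win_prob, odds),
    )


if __name__ == "__main__":
    win_prob, odds = 0.6, 1.0  # hypothetical edge: 60% chance at even money

    print(kelly_fraction(win_prob, odds))           # about 0.2
    print(best_fraction_by_search(win_prob, odds))  # 0.2

    # Staking the growth-optimal 20% gives a positive long-run growth rate...
    print(expected_log_growth(0.2, win_prob, odds))
    # ...while staking 50% on the same favourable bet shrinks capital over time.
    print(expected_log_growth(0.5, win_prob, odds))
```

The last two lines show why this approach guards against taking too much risk: even with a genuine edge, betting too large a share of the portfolio turns a winning opportunity into a losing strategy in the long run.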
At JANA, we undertake a significant program of research to identify skilful investment managers with the information to build portfolios expected to deliver strong long-term performance. We also think carefully about how to combine these investment managers to ensure that this long-term performance can be delivered while keeping our clients in the game over the shorter term. We are employing humans to do this research but will continue to keep our eye on Artificial Intelligence.
Melbourne
18/140 William Street,
Melbourne VIC 3000
03 9602 5400
JANAadmin@jana.com.au