Historical Data Randomization Using the Frequency Domain (Preview)
Identifying a Strategy’s Risks
Recently I decided to stress test one of my own personal strategies (a 100% mechanical mean reversion strategy). I wanted to see how it might perform in a variety of market conditions, such as a randomized but similar version of the 2008 market crash. Ultimately, I wanted enough control over the randomization of historical data to test against an effectively unlimited number of scenarios, such as:
1. Crashes similar to 2008
2. Crashes 5% worse than 2008
3. Crashes 10% worse than 2008 (etc.)
4. Bull markets
5. Bear markets
6. Sideways markets
7. Volatile and nonvolatile
Unfortunately, historical ETF data only goes back about a decade or so, which leaves limited material for finding such a variety of scenarios. In a discussion with a friend (who holds a PhD in electronics), it came up that I should look at performing the randomization in the frequency domain rather than the time domain. Since we are both more familiar with digital signal processing than with statistics, the approach came naturally to us. I will explain the details of the technique in the next post covering this research.
Charting the Randomized Data
I was not expecting much from the results, but after a few tests and tweaks the method produced exactly the kind of data I was looking for. Here are some demonstrations. (Green is the original data; red is a randomized data set.)
Crash similar/worse than 2008
Crash much worse than 2008
Strong bull market (followed by a severe crash)
Modifying the frequency spectrum gives fine control over how the data is manipulated. Adding randomness to the low frequencies creates steep bear and bull markets; adding randomness to the high frequencies creates volatile markets. It is much like how an equalizer on a stereo system controls the amount of bass (bull/bear trends) and treble (volatility).
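The post defers the full technique to a later write-up, but the equalizer idea can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the author's actual implementation): take the FFT of the log prices, perturb the low-frequency bins with one noise scale (trend) and the high-frequency bins with another (volatility), and transform back. The function name, the `cutoff` bin, and the noise scales are all assumptions chosen for the example.

```python
import numpy as np

def randomize_prices(prices, low_freq_scale=0.5, high_freq_scale=0.1,
                     cutoff=10, seed=None):
    """Perturb a price series in the frequency domain.

    low_freq_scale adds randomness to the lowest frequency bins
    ("bass": overall bull/bear trend); high_freq_scale adds
    randomness to the remaining bins ("treble": volatility).
    These parameters are illustrative, not from the original post.
    """
    rng = np.random.default_rng(seed)
    # Work on log prices so the perturbations are multiplicative
    # and the reconstructed series stays positive.
    log_p = np.log(np.asarray(prices, dtype=float))
    spectrum = np.fft.rfft(log_p)
    dc = spectrum[0]  # preserve the starting price level

    n = len(spectrum)
    noise = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    scale = np.empty(n)
    scale[:cutoff] = low_freq_scale   # trend bins
    scale[cutoff:] = high_freq_scale  # volatility bins

    # Perturb each bin proportionally to its original magnitude,
    # so the randomized series keeps a similar spectral shape.
    spectrum = spectrum * (1 + scale * noise)
    spectrum[0] = dc

    return np.exp(np.fft.irfft(spectrum, n=len(log_p)))
```

Raising `low_freq_scale` exaggerates multi-month trends (crash or bull scenarios), while raising `high_freq_scale` roughens the day-to-day series, mirroring the bass/treble analogy above. With both scales set to zero the original series is recovered unchanged.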
Purpose of Frequency Domain Randomization
The reason I am interested in this research is not to optimize a strategy against a variety of randomized historical data sets, but rather to identify risks. For example, my personal strategy has a weakness of fully scaling in too early during strongly trending markets. By changing some of my strategy's parameters, I was able not only to maintain the original performance on the original data set, but also to significantly reduce drawdowns during strong trends in the randomized data sets (such as a 2008-style bear market that drops 75%).
I will provide more details of the research as it develops. If the technique proves useful, I may add it to QLeverageSim or create a standalone utility for generating the randomized data.