I have stumbled onto a few mini-projects that revolve around fitting univariate and bivariate Hawkes processes. In this post, I will briefly summarize the write-ups:

High Frequency Trade Prediction with Bivariate Hawkes Process

The authors start with an SDE for the intensity process and formulate its solution as a univariate Hawkes process. A visual depiction of the self-excited intensity process is obtained via simulation. The time change theorem is stated, and a QQ plot shows that the inter-arrival times, once transformed by the compensator, follow an exponential distribution. The same exercise is repeated for the bivariate mutually exciting Hawkes process. The expression for the log-likelihood of the bivariate Hawkes process is stated, and MLE results are shown on a simulated dataset so that the estimates can be compared to the true values. The TAQ database is used to obtain tick data for DELL, YHOO and ORCL stocks. Since the timestamps are recorded in whole seconds, trades that share the same second are redistributed uniformly within that second. Using the Lee and Ready tick algorithm, the trades are classified as buys or sells. A bivariate Hawkes process is fit to the buy and sell trades. The model is put to the test on a strategy where 1) a stock is bought if the buy intensity exceeds 8 times the sell intensity, and 2) it is shorted if the ratio drops below 1/8.
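To make the simulation and time-change steps concrete, here is a minimal sketch for the univariate case. It assumes an exponential kernel, so that the intensity is lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)), and it uses Ogata's thinning algorithm for the simulation. The function names and parameter values are my own illustration and are not taken from the write-up.

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon] by Ogata's thinning algorithm."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    excite = 0.0  # self-excited part of the intensity at the current time t
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for proposing the next candidate.
        lam_bar = mu + excite
        wait = rng.expovariate(lam_bar)
        if t + wait > horizon:
            return events
        t += wait
        excite *= math.exp(-beta * wait)  # decay to the candidate time
        # Accept the candidate with probability lambda(t) / lam_bar.
        if rng.random() * lam_bar <= mu + excite:
            events.append(t)
            excite += alpha  # each accepted event bumps the intensity

def compensator_residuals(events, mu, alpha, beta):
    """Integrated intensity between consecutive events (time change)."""
    residuals = []
    decay_sum = 0.0  # sum of exp(-beta * (prev - t_j)) over events up to prev
    prev = 0.0
    for t in events:
        dt = t - prev
        decayed = math.exp(-beta * dt)
        residuals.append(mu * dt + (alpha / beta) * decay_sum * (1.0 - decayed))
        decay_sum = decay_sum * decayed + 1.0
        prev = t
    return residuals

# Hypothetical parameters with alpha / beta < 1 so the process is stationary.
events = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, horizon=2000.0)
residuals = compensator_residuals(events, mu=1.0, alpha=0.5, beta=1.0)
print(len(events), sum(residuals) / len(residuals))
```

If the model is correctly specified, the residuals should behave like independent draws from a unit-rate exponential distribution, which is what a QQ plot of them checks visually.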

Reconstructing the Order Book

The authors look at order arrivals for a period of 10 days between March 9, 2009 and March 20, 2009 for QQQ and AAPL. Summary statistics of the number of limit orders, market orders and cancellations show that the percentage of market orders and cancellations is much higher than the 10% reported in previous papers. A density plot of the number of orders at different ticks away from the best bid/best ask shows that the distribution does not follow a power law. This result holds for both QQQ and AAPL, and across limit buys, limit sells and cancellations. In a paper by Bouchaud and others, a power law was found to best describe the order book. However, Bouchaud’s paper was written in 2002. In the last decade or so, HFT has taken over the market, and HF traders are much more active near the best bid/best ask than at ticks further away from it. Based on these empirical findings, the authors concentrate on modeling the order arrivals at the bid and ask. They use a univariate Hawkes process to model the order arrivals. Since the gradient and Hessian of the univariate Hawkes log-likelihood are well known, the authors use them in their MLE procedure, randomize the initial values over a range, and estimate parameters for blocks of 10,000 arrivals each. Standard diagnostic tests are carried out, such as checking whether the time-changed process based on the compensator yields a standard Poisson process. They find that the Hawkes process deviates the most from the data when there are few orders in a period of time. The authors come up with a simple HFT strategy based on the order intensity of limit buys and limit sells. The problems of implementing this strategy in a pre-trade setting/lab are also mentioned towards the end of the paper. One of their recommendations for future researchers is to build a regime-based Hawkes model, i.e. different models for high and low intensity order arrivals.
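As a rough illustration of the estimation step, the sketch below evaluates the univariate Hawkes log-likelihood, assuming an exponential kernel with parameters mu, alpha and beta (the write-up summary does not pin down the kernel, so this is my assumption). It uses the standard recursion that avoids the double sum over event pairs. The helper `best_random_start` and its sampling ranges are hypothetical: it only scores random starting points and does not reproduce the authors' gradient and Hessian based optimizer.

```python
import math
import random

def hawkes_loglik(events, horizon, mu, alpha, beta):
    """Log-likelihood of sorted event times observed on [0, horizon]."""
    loglik = -mu * horizon
    decay_sum = 0.0  # R_i = sum over j < i of exp(-beta * (t_i - t_j))
    prev = None
    for t in events:
        if prev is not None:
            # Recursion: R_i = exp(-beta * (t_i - t_{i-1})) * (1 + R_{i-1})
            decay_sum = math.exp(-beta * (t - prev)) * (1.0 + decay_sum)
        loglik += math.log(mu + alpha * decay_sum)
        # Contribution of this event to the integrated intensity.
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (horizon - t)))
        prev = t
    return loglik

def best_random_start(events, horizon, n_starts=50, seed=0):
    """Draw random starting values and return the highest-scoring one.

    The sampling ranges are illustrative assumptions. alpha is drawn as a
    fraction of beta so that every start satisfies alpha / beta < 1.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        mu = rng.uniform(0.05, 2.0)
        beta = rng.uniform(0.1, 5.0)
        alpha = rng.uniform(0.0, 1.0) * beta
        score = hawkes_loglik(events, horizon, mu, alpha, beta)
        if best is None or score > best[0]:
            best = (score, mu, alpha, beta)
    return best

# Toy event times, made up for illustration only.
toy_events = [0.5, 1.2, 1.3, 2.7, 2.8, 2.9, 4.1]
print(best_random_start(toy_events, horizon=5.0))
```

In practice one would start a local optimizer from each draw and keep the best converged solution, since the likelihood surface can have several local optima.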

Exciting times for Trade Arrivals

The authors start by briefly explaining the math behind discrete-time and continuous-time Hawkes processes, and follow it up by casting the MLE as a non-convex optimization problem. Because the problem is non-convex, it is hard to solve efficiently. The nice thing about this project is that the authors turn the difficult non-convex optimization problem into a solvable convex optimization problem by making a few assumptions about the propagator function, calling the result a “Generalized Hawkes process”. Subsequently, the authors use the exponential and generalized Hawkes models to test an HFT strategy where one goes long or short based on the buy/sell intensity ratio.
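An intensity-ratio rule of this kind can be sketched in a few lines. The version below is a simplification under my own assumptions: each side is modeled with its own exponential-kernel intensity (no cross-excitation between buys and sells), and the threshold of 8 is only an illustrative default, not a value reported for this project. All names and numbers are hypothetical.

```python
import math

def exp_intensity(t, events, mu, alpha, beta):
    """Exponential-kernel Hawkes intensity at time t given past event times."""
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)

def ratio_signal(buy_intensity, sell_intensity, threshold=8.0):
    """Return +1 (go long), -1 (go short) or 0 (stay flat)."""
    ratio = buy_intensity / sell_intensity
    if ratio > threshold:
        return 1
    if ratio < 1.0 / threshold:
        return -1
    return 0

# Made-up trade times: a burst of buys just before t = 1.0 and one old sell.
buys = [0.90, 0.93, 0.95, 0.97, 0.99]
sells = [0.10]
lam_buy = exp_intensity(1.0, buys, mu=0.2, alpha=1.0, beta=2.0)
lam_sell = exp_intensity(1.0, sells, mu=0.2, alpha=1.0, beta=2.0)
print(lam_buy, lam_sell, ratio_signal(lam_buy, lam_sell))
```

Because the baseline mu is strictly positive, the sell intensity never reaches zero and the ratio is always well defined.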