The paper "Modeling Trades-Through in a Limit Order Book Using Hawkes Processes", by Ioane Muni Toke and Fabrizio Pomponio, uses Hawkes processes to examine market microstructure behavior.

The paper uses a multivariate Hawkes process to model trades-through. The best thing about it is that the authors have made the dataset available, so that readers can work through the numbers and get a feel for model inference. The dataset is available at Dataverse. I have used the dataset from the repository, crunched the numbers, and managed to replicate most of the results in the paper. I hope this feature of “Reproducible Research” becomes more widespread and that authors start disseminating their datasets along with their papers. In this blog post, I will summarize the main points of the paper.

**Introduction**
The authors model trades-through, i.e. transactions that reach at least the second level of limit orders in an order book. Trades-through are very important in price formation and microstructure. Large orders are usually split into smaller chunks before execution, so a trade that nevertheless goes through the best limit is unusual and may carry information. The paper has three sections. In the first section, basic summary statistics of the dataset are given. There are 1,296,707 time stamps in the dataset.
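
To make the definition concrete, here is a minimal sketch of how one might flag a trade-through from an order book snapshot. The snapshot format (a list of `(price, quantity)` pairs, best price first) and the function names are my own assumptions for illustration, not the layout of the published dataset.

```python
def levels_consumed(trade_qty, book_side):
    """Count the price levels a marketable order of size trade_qty touches.

    book_side: list of (price, quantity) pairs for the side being hit,
    best price first (a hypothetical snapshot format, not the dataset's).
    """
    remaining = trade_qty
    levels = 0
    for _price, qty in book_side:
        if remaining <= 0:
            break
        levels += 1          # at least one share is taken at this level
        remaining -= qty
    return levels


def is_trade_through(trade_qty, book_side, n=2):
    """True if the trade consumes at least one share at the n-th limit."""
    return levels_consumed(trade_qty, book_side) >= n


# Toy ask side: 100 shares at 10.00, 50 at 10.01, 200 at 10.02.
ask_side = [(10.00, 100), (10.01, 50), (10.02, 200)]
print(is_trade_through(100, ask_side))  # exactly clears the best limit
print(is_trade_through(101, ask_side))  # one share reaches the second limit
```

A trade that exactly empties the best limit is not counted here, since no share is taken at the second limit.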

**Trades-through Summary Statistics**
If you spend some time watching the order book, it becomes abundantly clear that trades-through stand outside the usual trading pattern. What is a trade-through? An nth-limit trade-through is any trade that consumes at least one share at the nth limit available in the order book. The paper describes the trades-through statistics of BNP Paribas stock over 109 trading days (June 2010 to October 2010). The empirical findings lead one to infer the following:

  • Trades-through are clustered both in physical time and in trade time.

  • The average waiting time between a trade and the next trade-through is longer than the average waiting time between two consecutive trades-through.

  • Trades-through, whether at the ask or at the bid, are more closely followed in time by other trades-through (whatever their sign) than ordinary trades at the bid or at the ask are.

  • There seems to be a cross-side clustering effect: a trade-through on one side of the book tends to be closely followed in time by a trade-through on the other side of the book.
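
The waiting-time comparison in the list above can be sketched in a few lines. This is only an illustration on made-up timestamps. The helper name `mean_wait_to_next` is hypothetical, and the paper's exact conditioning (side, sign, trade time versus physical time) is not reproduced here.

```python
from bisect import bisect_right


def mean_wait_to_next(event_times, target_times):
    """Average delay from each event to the first target strictly after it.

    Both inputs are sorted lists of timestamps (seconds). Events with no
    later target are skipped.
    """
    waits = []
    for t in event_times:
        i = bisect_right(target_times, t)
        if i < len(target_times):
            waits.append(target_times[i] - t)
    return sum(waits) / len(waits) if waits else float("nan")


# Made-up timestamps, for illustration only.
ordinary_trades = [0.0, 10.0, 30.0, 50.0]
trades_through = [20.0, 21.0, 22.0, 60.0, 61.0]

wait_after_trade = mean_wait_to_next(ordinary_trades, trades_through)
wait_after_tt = mean_wait_to_next(trades_through, trades_through)
print(wait_after_trade, wait_after_tt)
```

On clustered data, the second average comes out smaller than the first, which is the pattern the paper reports.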

**Modeling and Calibration**
The authors fit a bivariate Hawkes process to the trades-through on the ask side and the bid side. Four variants of the model are tested in the paper:

  1. Full model specification, with a constant baseline intensity for ask and bid trades-through

  2. Model with no cross-excitation term, with a constant baseline intensity for ask and bid trades-through

  3. Full model specification, with a piecewise-linear baseline intensity for ask and bid trades-through

  4. Model with no cross-excitation term, with a piecewise-linear baseline intensity for ask and bid trades-through
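
To see how the four variants relate, here is a small sketch of a bivariate Hawkes intensity. It assumes exponential decay kernels, which this summary does not state explicitly, so treat the kernel form as my assumption. The parameter values, event times and function names are invented for illustration. Setting the off-diagonal entries of `alpha` to zero removes cross-excitation, and swapping the baseline function switches between the constant and piecewise-linear cases.

```python
import math


def piecewise_linear(knots, values):
    """Return a function interpolating linearly through (knots[i], values[i])."""
    def rate(t):
        if t <= knots[0]:
            return values[0]
        for k in range(1, len(knots)):
            if t <= knots[k]:
                w = (t - knots[k - 1]) / (knots[k] - knots[k - 1])
                return values[k - 1] + w * (values[k] - values[k - 1])
        return values[-1]
    return rate


def intensity(t, side, events, baseline, alpha, beta):
    """Intensity of trades-through on `side` (0 = ask, 1 = bid) at time t.

    events[j]      : sorted event times on side j
    baseline[side] : function of time giving the baseline rate
    alpha[side][j] : jump in intensity on `side` caused by an event on side j
    beta[side][j]  : exponential decay rate of that jump (assumed kernel)
    """
    lam = baseline[side](t)
    for j in (0, 1):
        lam += sum(alpha[side][j] * math.exp(-beta[side][j] * (t - s))
                   for s in events[j] if s < t)
    return lam


# Illustrative values only, not estimates from the paper.
events = ([1.0, 2.0], [1.5])                 # toy ask and bid event times
beta = [[2.0, 2.0], [2.0, 2.0]]
full = [[0.8, 0.2], [0.2, 0.8]]              # variants 1 and 3
no_cross = [[0.8, 0.0], [0.0, 0.8]]          # variants 2 and 4
constant = (lambda t: 0.5, lambda t: 0.5)    # variants 1 and 2
linear = (piecewise_linear([0.0, 10.0], [1.0, 3.0]),) * 2  # variants 3 and 4

print(intensity(3.0, 0, events, constant, full, beta))
print(intensity(3.0, 0, events, linear, no_cross, beta))
```

With `no_cross`, the ask and bid intensities no longer depend on each other's events, so the bivariate model reduces to two independent univariate Hawkes processes.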

The parameters of each of the four models are estimated for each trading day and then aggregated across the 109 days. The major finding is that there is no cross-excitation effect.
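
One common way to calibrate such a model on a single day is maximum likelihood. The sketch below shows the log-likelihood of a univariate Hawkes process with a constant baseline and an exponential kernel. Both the kernel form and the function name `hawkes_loglik` are my assumptions here, and the paper's own estimation details are not reproduced. The running sum `decay_sum` avoids recomputing the full history at every event.

```python
import math


def hawkes_loglik(events, horizon, mu, alpha, beta):
    """Log-likelihood of a univariate exponential-kernel Hawkes process.

    Intensity: lambda(t) = mu + sum over past events of
               alpha * exp(-beta * (t - t_k)).
    events: sorted event times in [0, horizon].
    """
    loglik = -mu * horizon          # baseline part of the compensator
    decay_sum = 0.0                 # sum_k exp(-beta * (t_i - t_k)) over k < i
    prev = None
    for t in events:
        if prev is not None:
            decay_sum = math.exp(-beta * (t - prev)) * (1.0 + decay_sum)
        loglik += math.log(mu + alpha * decay_sum)
        # self-excitation part of the compensator contributed by this event
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (horizon - t)))
        prev = t
    return loglik


# Illustrative call on toy data; the parameter values are not from the paper.
print(hawkes_loglik([1.0, 2.0], 3.0, mu=0.5, alpha=0.8, beta=2.0))
```

In practice one would pass the negative of this function to a numerical optimizer (for example `scipy.optimize.minimize`) once per trading day, and then aggregate the daily estimates.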

**Goodness-of-fit**
The authors perform the following two goodness-of-fit tests on each of the trades-through processes (ask and bid) for each day:

  1. Testing whether the inter-arrival times of the time-changed process are exponentially distributed, via a standard Kolmogorov-Smirnov test

  2. Testing whether the inter-arrival times of the time-changed process are independent, via a Ljung-Box test
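
Both tests work on the time-changed inter-arrivals, that is, the integrated intensity between consecutive events. If the fitted model is right, these should behave like independent Exp(1) draws. The sketch below computes them for a univariate exponential-kernel Hawkes process (the kernel form is my assumption, as is an observation window starting at time 0), together with the two test statistics. It stops at the statistics. For p-values one would normally rely on library routines such as `scipy.stats.kstest` and `statsmodels.stats.diagnostic.acorr_ljungbox`.

```python
import math


def time_changed_interarrivals(events, mu, alpha, beta):
    """Integrated intensity between consecutive events (starting from time 0)
    for a univariate exponential-kernel Hawkes process."""
    out = []
    excite = 0.0      # sum_k exp(-beta * (prev - t_k)) over events up to prev
    prev = 0.0
    for t in events:
        dt = t - prev
        decay = math.exp(-beta * dt)
        out.append(mu * dt + (alpha / beta) * excite * (1.0 - decay))
        excite = excite * decay + 1.0
        prev = t
    return out


def ks_statistic_exp1(sample):
    """Kolmogorov-Smirnov distance between the sample and Exp(1)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = 1.0 - math.exp(-x)
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def ljung_box_q(sample, lags):
    """Ljung-Box Q statistic using autocorrelations at lags 1..lags."""
    n = len(sample)
    mean = sum(sample) / n
    dev = [x - mean for x in sample]
    denom = sum(v * v for v in dev)
    q = 0.0
    for k in range(1, lags + 1):
        rho = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho * rho / (n - k)
    return n * (n + 2) * q


# Toy event times and illustrative parameters, not values from the paper.
residuals = time_changed_interarrivals([0.4, 0.9, 1.0, 2.5, 2.6], 1.0, 0.5, 2.0)
print(ks_statistic_exp1(residuals), ljung_box_q(residuals, lags=2))
```

Under the null hypothesis of independence, Q is approximately chi-square distributed with `lags` degrees of freedom, so large values of Q point to leftover serial dependence.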

The authors conclude that a univariate Hawkes process (that is, the model with no cross-excitation term) with a piecewise-linear baseline intensity fits the trades-through on the bid and ask sides better than the other models considered in the paper.

**Takeaway**

The paper models the trades-through for BNP Paribas stock over a period of 109 days. An empirical analysis of self-excitation and cross-excitation motivates the authors to test a multivariate Hawkes model for the trades-through on the bid and ask sides. Four variants of the Hawkes model are fitted to the data. For each of the four models, for each day, two diagnostic tests are applied to the ask trades-through process and to the bid trades-through process, giving four tests per day per model. These diagnostic tests are aggregated across the 109 days. The authors find that, out of the four models, the univariate Hawkes process with a piecewise-linear baseline intensity seems to fit the data best.