NetworkHawkesProcesses.jl

NetworkHawkesProcess.jl implements a class of probabilistic models that combine multivariate Hawkes processes with network models (Linderman, 2016). The intensity of such models takes the form

\[\lambda_n(t) = \lambda_n^{(0)}(t) + \sum_n \sum_{\substack{c_i = n,\\ s_i < t}} a_{n \rightarrow '} h_{n \rightarrow n'}(t - s_i \ | \ \theta_{n \rightarrow n'}),\]

where $a_{n \rightarrow n'}$ is a binary variable representing the existence of a directed link from process $n$ to process $n'$ generated by a process

\[a_{n \rightarrow n'} \sim p(a_{n \rightarrow n'} \ | \ \eta_{n \rightarrow n'})\]

These models–network Hawkes processes–permit simultaneous inference on the structure of a network and its event generating process. The NetworkHawkesProcesses package provides methods to simulate and estimate such processes. It allows researchers to construct models from a flexible set of model components, run inference from a list of compatible methods (including maximum-likelihood estimation, Markov chain Monte Carlo sampling, and variational inference), and explore results with visualization and diagnostic utilities.

Package Features

Supports continuous and discrete processes
Uses modular design to support extensible components
Implements simulation via Poisson thinning
Provides multiple estimation/inference methods
Supports a wide range of network specifications
Supports non-homogeneous baselines
Accelerates methods via Julia's built-in Threads module

Getting Started

Installation

julia> using Pkg;
julia> Pkg.add("https://github.com/cswaney/NetworkHawkesProcesses.jl.git")
# or pkg> add "https://github.com/cswaney/NetworkHawkesProcesses.jl.git"

Creating Processes

Proceses are constructed from a combination of component models. All processes require a Baseline model, and ImpulseResponse model, and a Weights model. For a ContinuousStandardHawkesProces, which does not incorporate network structe, these are all that is needed:

using NetworkHawkesProcesses

nnodes = 2
weight = 0.1
Δtmax = 1.0

baseline = NetworkHawkesProcesses.HomogeneousProcess(ones(nnodes))
weights = NetworkHawkesProcesses.DenseWeightModel(weight .* ones(nnodes, nnodes))
impulses = NetworkHawkesProcesses.ExponentialImpulseResponse(ones(nnodes, nnodes))
process = NetworkHawkesProcesses.ContinuousStandardHawkesProcess(baseline, impulses, weights)

A ContinuousNetworkHawkesProces requires a Network model and adjacency matrix as well:

network = NetworkHawkesProcesses.BernoulliNetworkModel(0.5, nnodes)
links = NetworkHawkesProcesses.rand(network)
process = NetworkHawkesProcesses.ContinuousNetworkHawkesProcess(baseline, impulses, weights, links, network)

Discrete processes are constructed in a similar fashion, but some component models must be changed to an appropriate discrete version and a time step size is required. For example, here is how to construct a DiscreteStandardHawkesProcess:

nnodes = 2
nbasis = 3
nlags = 4
weight = 0.1
dt = 1.0

baseline = NetworkHawkesProcesses.DiscreteHomogeneousProcess(ones(nnodes), dt)
weights = NetworkHawkesProcesses.DenseWeightModel(weight .* ones(nnodes, nnodes))
impulses = NetworkHawkesProcesses.DiscreteGaussianImpulseResponse(ones(nnodes, nnodes, nbasis) ./ nbasis, nlags, dt)
process = NetworkHawkesProcesses.DiscreteStandardHawkesProcess(baseline, impulses, weights, dt)

A DiscreteNetworkHawkesProcess only requires the addition of a Network model and adjacency matrix:

network = NetworkHawkesProcesses.BernoulliNetworkModel(0.5, nnodes)
links = NetworkHawkesProcesses.rand(network)
process = NetworkHawkesProcesses.DiscreteNetworkHawkesProcess(baseline, impulses, weights, links, network, dt)

Simulating Data

Following conventional (e.g., Disctributions), NetworkHawkesProcesses defines rand methods for the purpose of simulating data:

events, nodes, duration = rand(process, 100.0) # typeof(process) == ContinuousHawkesProcess
data = rand(process, 100) # typeof(process) == DiscreteHawkesProcess

Generally, the simulation duration is any positive value, but some baselines (e.g., LogGaussianCoxProcess) define a maximum duration.

Note that the format of simulated data is quite different for continuous and discrete processes. For continuous processes, simulation returns separate arrays of event times and nodes, in addition to the simulation duration. For discrete processes, simulated data is represented as a matrix whose rows contain event counts for each node, with each column representing a time step.

size(data) # (2, 100)

Running Inference

The package provides several method for performing inference, including maximum-likelihood estimation (mle!), Markov Chain Monte Carlo sampling (mcmc!), and variational inference (vb!, svi!).

res = NetworkHawkesProcesses.mle!(process, data; verbose=true, regularize=true) # maximum a posteriori

The inference methods available depend on the type of model (see the table below). Processes with Network models don't provide maximum-likelihood estimation due to the presence on binary parameters in the adjacency matrix; and only discrete processes provide variational inference methods.

Model	Available Methods
`ContinuousStandardHawkesProcess`	`mle!`, `mcmc!`
`ContinuousNetworkHawkesProcess`	`mcmc!`
`DiscreteStandardHawkesProcess`	`mle!`, `mcmc!`, `vb!`, `svi!`
`DiscreteNetworkHawkesProcess`	`mcmc!`, `vb!`, `svi!`

Note

All inference methods overwrite model parameters. In addition, each method takes a default initial guess at the model parameters if no initial guess is provided.

Tip

A fast option for an initial guess is to estimate a homogenous Poisson process (although this doesn't provide any guidance for impulse response, network, or connection parameters).

Reporting Diagnostics

The package provides several methods to examine models and simulated data.

isstable(processs) checks the stability of a Hawkes process. Unstable processes will often fail to generate finite data samples.
intensity(process, data, times) calculates the intensity of a process (given data) for all times in times.
loglikelihood(process, data) calculates the log-likelihood of a dataset.

multi-threading

Several methods use multi-threading to speed up computation. For all such methods, multi-threading is automatically activated and uses as many threads as available to the Julia process. For example, to use all local CPU threads (i.e., "logical cores"), start the Julia session with julia --threads auto.

Next Steps

Check out our tutorials and examples to learn more about how the package works and see practical examples of its usage, or browse our API documentation for detailed information on package components.