John Sylvester

University of Liverpool

I am a Lecturer at the Department of Computer Science, University of Liverpool.

Before this I was a PostDoc with Kitty Meeks and Jessica Enright at the University of Glasgow.

Prior to that I was a PostDoc with Thomas Sauerwald at the University of Cambridge.

My PhD was supervised by Agelos Georgakopoulos at the University of Warwick.

My undergrad degree was in Mathematics at University College London.


Interests

I work primarily in discrete probability, in particular random processes on graphs and random graphs. I am also interested in (temporal) graph theory and algorithms.


Conference Organisation

Liverpool Discrete Mathematics Colloquium on the 12th-13th November 2024


Preprints


Functionality of Random Graphs with Viktor Zamaraev and Maksim Zhukovskii
[arXiv]

The functionality of a graph $G$ is the minimum number $k$ such that in every induced subgraph of $G$ there exists a vertex whose neighbourhood is uniquely determined by the neighbourhoods of at most $k$ other vertices in the subgraph. The functionality parameter was introduced in the context of adjacency labeling schemes, and it generalises a number of classical and recent graph parameters including degeneracy, twin-width, and symmetric difference. We establish the functionality of a random graph $G(n,p)$ up to a constant factor for every value of $p$.

Boolean combinations of graphs with Sarosh Adenwalla, Samuel Braunfeld and Viktor Zamaraev
[arXiv]


Time-Biased Random Walks and Robustness of Expanders with Sam Olesker-Taylor and Thomas Sauerwald
[arXiv]

Random walks on expanders play a crucial role in Markov Chain Monte Carlo algorithms, derandomization, graph theory, and distributed computing. A desirable property is that they are rapidly mixing, which is equivalent to having a spectral gap $\gamma$ (asymptotically) bounded away from $0$.

Our work has two main strands. First, we establish a dichotomy for the robustness of mixing times on edge-weighted $d$-regular graphs (i.e., reversible Markov chains) subject to a Lipschitz condition, which bounds the ratio of adjacent weights by $\beta \ge 1$.

  • If $\beta \ge 1$ is sufficiently small, then $\gamma \asymp 1$ and the mixing time is logarithmic in $n$.
  • If $\beta \ge 2d$, there is an edge-weighting such that $\gamma$ is polynomially small in $1/n$.

Second, we apply our robustness result to a time-dependent version of the so-called $\varepsilon$-biased random walk, as introduced in Azar et al. [Combinatorica 1996].

  • We show that, for any constant $\varepsilon>0$, a bias strategy can be chosen adaptively so that the $\varepsilon$-biased random walk covers any bounded-degree regular expander in $\Theta(n)$ expected time, improving the previous-best bound of $O(n \log \log n)$.
  • We prove the first non-trivial lower bound on the cover time of the $\varepsilon$-biased random walk, showing that, on bounded-degree regular expanders, it is $\omega(n)$ whenever $\varepsilon = o(1)$. We establish this by controlling how much the probability of arbitrary events can be "boosted" by using a time-dependent bias strategy.

Mean-Biased Processes for Balanced Allocations with Dimitrios Los and Thomas Sauerwald
Submitted. [arXiv]

We introduce a new class of balanced allocation processes which bias towards underloaded bins (those with load below the mean load) either by skewing the probability by which a bin is chosen for an allocation (probability bias), or alternatively, by adding more balls to an underloaded bin (weight bias). A prototypical process satisfying the probability bias condition is Mean-Thinning: At each round, we sample one bin and if it is underloaded, we allocate one ball; otherwise, we allocate one ball to a second bin sample. Versions of this process have been in use since at least 1986. An example of a process, introduced by us, which satisfies the weight bias condition is Twinning: At each round, we only sample one bin. If the bin is underloaded, then we allocate two balls; otherwise, we allocate only one ball.

Our main result is that for any process with a probability or weight bias, with high probability the gap between maximum and minimum load is logarithmic in the number of bins. This result holds for any number of allocated balls (heavily loaded case), covers many natural processes that relax the Two-Choice process, and we also prove it is tight for many such processes, including Mean-Thinning and Twinning.

Our analysis employs a delicate interplay between linear, quadratic and exponential potential functions. It also hinges on a phenomenon we call "mean quantile stabilization", which holds in greater generality than our framework and may be of independent interest.
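The two prototypical processes described above can be simulated in a few lines. This is only a minimal sketch for experimentation, not the analysis from the paper; the function names `mean_thinning`, `twinning` and `gap` are illustrative, and the convention that a bin is underloaded when its load is strictly below the current mean is an assumption taken from the description above.

```python
import random


def mean_thinning(n, m, rng):
    """Allocate m balls into n bins with Mean-Thinning; return the loads."""
    loads = [0] * n
    for t in range(m):
        i = rng.randrange(n)
        # Underloaded means load strictly below the current mean t / n.
        if loads[i] * n >= t:
            i = rng.randrange(n)  # not underloaded: use a second sample
        loads[i] += 1
    return loads


def twinning(n, m, rng):
    """Allocate at least m balls with Twinning; return the loads."""
    loads = [0] * n
    total = 0
    while total < m:
        i = rng.randrange(n)
        # Two balls for an underloaded bin, one ball otherwise.
        balls = 2 if loads[i] * n < total else 1
        loads[i] += balls
        total += balls
    return loads


def gap(loads):
    """Difference between the maximum and minimum load."""
    return max(loads) - min(loads)
```

For example, `gap(mean_thinning(1000, 10**6, random.Random(1)))` gives one sample of the gap in a heavily loaded run.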

Tangled Paths: A Random Graph Model from Mallows Permutations with Jessica Enright, Kitty Meeks and William Pettersson
Submitted. [arXiv]

We introduce the random graph $\mathcal{P}(n,q)$ which results from taking the union of two paths of length $n\geq 1$, where the vertices of one of the paths have been relabelled according to a Mallows permutation with real parameter $0 < q(n) \leq 1$. This random graph model, the tangled path, goes through an evolution: if $q$ is close to $0$, the graph bears resemblance to a path, and as $q$ tends to $1$ it becomes an expander. In an effort to understand the evolution of $\mathcal{P}(n,q)$ we determine the treewidth and cutwidth of $\mathcal{P}(n,q)$ up to log factors for all $q$. We also show that the property of having a separator of size one has a sharp threshold. In addition, we prove bounds on the diameter and the vertex isoperimetric number for specific values of $q$.
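A small sampler makes the model concrete. The sketch below assumes the standard Mallows measure, where a permutation has probability proportional to $q^{\mathrm{inv}}$ with $\mathrm{inv}$ its number of inversions, and it reads "relabelled" as taking the second path through the vertices in the order given by the permutation. The names `mallows_permutation` and `tangled_path` are illustrative and not taken from the paper.

```python
import random


def mallows_permutation(n, q, rng):
    """Sample a permutation of 1..n with probability proportional to q**inversions."""
    perm = []
    for i in range(1, n + 1):
        # Placing i (the largest value so far) j slots from the end
        # creates exactly j new inversions, so weight that choice by q**j.
        weights = [q ** j for j in range(i)]
        j = rng.choices(range(i), weights=weights)[0]
        perm.insert(len(perm) - j, i)
    return perm


def tangled_path(n, q, rng):
    """Edge set of the union of the path 1-2-...-n and the path through perm."""
    perm = mallows_permutation(n, q, rng)
    edges = {frozenset((v, v + 1)) for v in range(1, n)}
    edges |= {frozenset((perm[k], perm[k + 1])) for k in range(n - 1)}
    return edges
```

With $q = 1$ the permutation is uniform, while small $q$ concentrates on the identity, so the second path mostly retraces the first.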



Publications in Journals and Peer-reviewed Conferences

Adjacency Labeling Schemes for Small Classes with Édouard Bonnet, Julien Duron and Viktor Zamaraev
ITCS 2025, to appear. [arXiv]

A graph class admits an implicit representation if, for every positive integer $n$, its $n$-vertex graphs have a $b(n)$-bit (adjacency) labeling scheme with $b(n)=O(\log n)$, i.e., their vertices can be labeled by binary strings of length $b(n)$ such that the presence of an edge between any pair of vertices $u, v$ can be deduced solely from the labels of $u$ and $v$. The famous Implicit Graph Conjecture posited that every hereditary (i.e., closed under taking induced subgraphs) factorial (i.e., containing $2^{O(n \log n)}$ $n$-vertex graphs) class admits an implicit representation. The conjecture was finally refuted [Hatami and Hatami, FOCS '22], and does not even hold among monotone (i.e., closed under taking subgraphs) factorial classes [Bonnet et al., ICALP '24]. However, monotone small (i.e., containing at most $n! c^n$ many $n$-vertex graphs for some constant $c$) classes do admit implicit representations.

This motivates the Small Implicit Graph Conjecture: Every hereditary small class admits an $O(\log n)$-bit labeling scheme. We provide evidence supporting the Small Implicit Graph Conjecture. First, we show that every small weakly sparse (i.e., excluding some fixed bipartite complete graph as a subgraph) class has an implicit representation. This is a consequence of the following fact of independent interest proved in the paper: Every weakly sparse small class has bounded expansion (hence, in particular, bounded degeneracy). The latter generalizes and strengthens the previous results that every monotone small class has bounded degeneracy [Bonnet et al., ICALP '24], and that every weakly sparse class of bounded twin-width has bounded expansion [Bonnet et al., Combinatorial Theory '22]. Second, we show that every hereditary small class admits an $O(\log^3 n)$-bit labeling scheme, which provides a substantial improvement of the best-known polynomial upper bound of $n^{1-\varepsilon}$ on the size of adjacency labeling schemes for such classes. To do so, we establish that every small class has neighborhood complexity $O(n \log n)$, also of independent interest. We then apply a classic result, due to Welzl [SoCG '88], on efficiently ordering the universe of a set system of low Vapnik-Chervonenkis dimension such that every set can be described as the union of a limited number of intervals along this order.

Symmetric-Difference (Degeneracy) and Signed Tree Models with Édouard Bonnet, Julien Duron and Viktor Zamaraev
MFCS 2024, volume 306 of LIPIcs 32:1-16. [arXiv] [Conference]

We introduce a dense counterpart of graph degeneracy, which extends the recently-proposed invariant symmetric difference. We say that a graph has sd-degeneracy (for symmetric-difference degeneracy) at most $d$ if it admits an elimination order of its vertices where a vertex $u$ can be removed whenever it has a $d$-twin, i.e., another vertex $v$ such that at most $d$ vertices outside $\{u,v\}$ are neighbors of exactly one of $u,v$. The family of graph classes of bounded sd-degeneracy is a superset of that of graph classes of bounded degeneracy or of bounded flip-width, and more generally, of bounded symmetric difference. Unlike most graph parameters, sd-degeneracy is not hereditary: it may be strictly smaller on a graph than on some of its induced subgraphs. In particular, every $n$-vertex graph is an induced subgraph of some $O(n^2)$-vertex graph of sd-degeneracy $1$. In spite of this and the breadth of classes of bounded sd-degeneracy, we devise $\tilde{O}(\sqrt{n})$-bit adjacency labeling schemes for them, which are optimal up to the hidden polylogarithmic factor. This is attained on some even more general classes, consisting of graphs $G$ whose vertices bijectively map to the leaves of a tree $T$, where transversal edges and anti-edges added to $T$ define the edge set of $G$. We call such graph representations signed tree models as they extend the so-called tree models (or twin-decompositions) developed in the context of twin-width, by adding transversal anti-edges. While computing the degeneracy of an input graph can be done in linear time, we show that deciding whether its symmetric difference is at most $8$ is co-NP-complete, and whether its sd-degeneracy is at most $1$ is NP-complete.
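The definition of a $d$-twin and of an elimination order can be checked mechanically. The sketch below only verifies a given order; it does not search for one, which the abstract notes is NP-complete already for $d = 1$. Graphs are dictionaries mapping each vertex to its set of neighbours, all function names are illustrative, and it is an assumption of this sketch that an order removes every vertex except one, since the last vertex has no other vertex left to be its twin.

```python
def symmetric_difference_size(adj, u, v, alive):
    """Count alive vertices outside {u, v} adjacent to exactly one of u, v."""
    return sum(
        1
        for w in alive
        if w not in (u, v) and ((w in adj[u]) != (w in adj[v]))
    )


def has_d_twin(adj, u, d, alive):
    """True if some other alive vertex v is a d-twin of u."""
    return any(
        v != u and symmetric_difference_size(adj, u, v, alive) <= d
        for v in alive
    )


def is_sd_elimination_order(adj, order, d):
    """Check that every vertex in `order` has a d-twin at the moment it is removed."""
    alive = set(adj)
    for u in order:
        if u not in alive or not has_d_twin(adj, u, d, alive):
            return False
        alive.remove(u)
    # Assumption: a complete order leaves exactly one vertex behind.
    return len(alive) == 1
```

On the path with vertices 0, 1, 2, 3, the order 0, 1, 2 is valid for $d = 1$ but not for $d = 0$.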

Tight bounds on adjacency labels for monotone graph classes with Édouard Bonnet, Julien Duron, Viktor Zamaraev and Maksim Zhukovskii
ICALP 2024, volume 297 of LIPIcs 31:1-20. [arXiv] [Conference]

A class of graphs admits an adjacency labeling scheme of size $f(n)$ if the vertices of any $n$-vertex graph $G$ in the class can be assigned binary strings (aka labels) of length $f(n)$ so that the adjacency between each pair of vertices in $G$ can be determined only from their labels. The Implicit Graph Conjecture (IGC) claimed that any graph class which is hereditary (i.e. closed under taking induced subgraphs) and factorial (i.e. containing $2^{\Theta(n \log n)}$ graphs on $n$ vertices) admits an adjacency labeling scheme of order optimal size $\mathcal{O}(\log n)$. After remaining open for thirty years, the IGC was recently disproved [Hatami and Hatami, FOCS 2022].

In this work we show that the IGC does not hold even for monotone graph classes, i.e. closed under taking subgraphs. More specifically, we show that there are monotone factorial graph classes for which the size of any adjacency labeling scheme is $\Omega(\log^2 n)$. Moreover, this is best possible, as any monotone factorial class admits an adjacency labeling scheme of size $\mathcal{O}(\log^2 n)$.

This is a consequence of our general result that establishes tight bounds on the size of adjacency labeling schemes for monotone graph classes: for any function $f: \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}_{\geq 0}$ with $\log x \leq f(x) \leq x^{1-\delta}$ for some constant $\delta > 0$, that satisfies some natural conditions, there exist monotone graph classes, in which the number of $n$-vertex graphs grows as $2^{\mathcal{O}(nf(n))}$ and that do not admit adjacency labels of size at most $f(n) \log n$. On the other hand any such class admits adjacency labels of size $\mathcal{O}(f(n)\log n)$, which is a factor of $\log n$ away from the order optimal bound $\mathcal{O}(f(n))$. This is the first example of tight bounds on adjacency labels for graph classes that do not admit order optimal adjacency labeling schemes.

Rumors with Changing Credibility with Charlotte Out, Nicolás Rivera and Thomas Sauerwald
ITCS 2024, volume 287 of LIPIcs 86:1-23. [arXiv] [Conference]

Randomized rumor spreading processes diffuse information on an undirected graph and have been widely studied. In this work, we present a generic framework for analyzing a broad class of such processes on regular graphs. Our analysis is protocol-agnostic, as it only requires the expected proportion of newly informed vertices in each round to be bounded, and a natural negative correlation property.

This framework allows us to analyze various protocols, including PUSH, PULL, and PUSH-PULL, thereby extending prior research. Unlike previous work, our framework accommodates message failures at any time $t\geq 0$ with a probability of $1 - q(t)$, where the credibility $q(t)$ is any function of time. This enables us to model real-world scenarios in which the transmissibility of rumors may fluctuate, as seen in the spread of “fake news” and viruses. Additionally, our framework is sufficiently broad to cover dynamic graphs.
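As an illustration of a time-dependent credibility, here is a minimal simulation of PUSH in which every message sent in round $t$ is delivered independently with probability $q(t)$. It is a sketch under simplifying assumptions (a static connected graph given as adjacency lists, independent failures per message), and the names `push_rounds` and `max_rounds` are hypothetical.

```python
import random


def push_rounds(adj, source, q, rng, max_rounds=10_000):
    """Rounds until all vertices are informed, or None if max_rounds is not enough."""
    informed = {source}
    for t in range(max_rounds):
        if len(informed) == len(adj):
            return t
        newly = set()
        for u in informed:
            v = rng.choice(adj[u])  # PUSH: u contacts a uniform random neighbour
            # The message sent at time t fails with probability 1 - q(t).
            if rng.random() < q(t):
                newly.add(v)
        informed |= newly
    return max_rounds if len(informed) == len(adj) else None
```

A decaying credibility such as `q = lambda t: 1 / (1 + t)` can then be compared against the constant case `q = lambda t: 1.0` on the same graph.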


Small But Unwieldy: A Lower Bound on Adjacency Labels for Small Classes with Édouard Bonnet, Julien Duron, Viktor Zamaraev and Maksim Zhukovskii
SIAM Journal on Computing, Vol. 53, No. 5, 1578-1601. [arXiv] [Journal]
SODA 2024, pages 1147 - 1165. [Conference]

We show that for any natural number $s$, there is a constant $\gamma$ and a subgraph-closed class having, for any natural $n$, at most $\gamma^n$ graphs on $n$ vertices up to isomorphism, but no adjacency labeling scheme with labels of size at most $s \log n$. In other words, for every $s$, there is a small (even tiny) monotone class without universal graphs of size $n^s$. Prior to this result, it was not excluded that every small class has an almost linear universal graph, or equivalently a labeling scheme with labels of size $(1+o(1))\log n$. The existence of such a labeling scheme, a scaled-down version of the recently disproved Implicit Graph Conjecture, was repeatedly raised [Gavoille and Labourel, ESA '07; Dujmović et al., JACM '21; Bonamy et al., SIDMA '22; Bonnet et al., Comb. Theory '22]. Furthermore, our small monotone classes have unbounded twin-width, and thus simultaneously disprove the already-refuted Small conjecture; but this time with a self-contained proof, not relying on elaborate group-theoretic constructions.

As our main ingredient, we show that with high probability an Erdős-Rényi random graph $G(n,p)$ with $p=O(1/n)$ has, for every $k \leqslant n$, at most $2^{O(k)}$ subgraphs on $k$ vertices, up to isomorphism. As a barrier to our general method of producing even more complex tiny classes, we show that when $p=\omega(1/n)$, the latter no longer holds. More concretely, we provide an explicit lower bound on the number of unlabeled $k$-vertex induced subgraphs of $G(n,p)$ when $1/n \leq p \leq 1-1/n$. We thereby obtain a threshold for the property of having exponentially many unlabeled induced subgraphs: if $\min \{p, 1-p\}<\delta/n$ with $\delta < 1$, then with high probability even the number of all unlabeled (not necessarily induced) subgraphs is $2^{o(n)}$, whereas if $C/n < p < 1-C/n$ for sufficiently large $C$, then with high probability the number of unlabeled induced subgraphs is $2^{\Theta(n)}$. This result supplements the study of counting unlabeled induced subgraphs that was initiated by Erdős and Rényi with a question on the number of unlabeled induced subgraphs of Ramsey graphs, eventually answered by Shelah.

The Power of Filling in Balanced Allocations with Dimitrios Los and Thomas Sauerwald
SIAM Journal on Discrete Mathematics, 38(1): 529-565, 2024. [arXiv] [Journal]

It is well known that if $m$ balls (jobs) are placed sequentially into $n$ bins (servers) according to the One-Choice protocol (choose a single bin in each round and allocate one ball to it), then, for $m\gg n$, the gap between the maximum and average load diverges. Many refinements of the One-Choice protocol have been studied that achieve a gap that remains bounded by a function of $n$, for any $m$. However, most of these variations, such as Two-Choice, are less sample-efficient than One-Choice, in the sense that for each allocated ball more than one sample is needed (in expectation).

We introduce a new class of processes which are primarily characterized by "filling" underloaded bins. A prototypical example is the Packing process: At each round we only take one bin sample. If the load is below the average load, then we place as many balls as needed to reach the average load; otherwise, we place only one ball. We prove that for any process in this class the gap between the maximum and average load is $\mathcal{O}(\log n)$ for any number of balls $m$. For the Packing process, we also prove a matching lower bound. We also prove that the Packing process is more sample-efficient than One-Choice, that is, it allocates on average more than one ball per sample. Finally, we also demonstrate that the upper bound of $\mathcal{O}(\log n)$ on the gap can be extended to the Caching process (a.k.a. memory protocol) studied by Mitzenmacher, Prabhakar and Shah (2002).
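The Packing process and its sample-efficiency can be explored with a short simulation. This is a sketch rather than the exact process analysed in the paper: in particular, filling an underloaded bin up to the average load rounded up is an assumption about how the non-integer average is handled, and the name `packing` is illustrative.

```python
import random


def packing(n, m, rng):
    """Run Packing until at least m balls are allocated; return (loads, samples)."""
    loads = [0] * n
    total = 0
    samples = 0
    while total < m:
        i = rng.randrange(n)
        samples += 1
        if loads[i] * n < total:
            # Underloaded: fill up to the average load (rounded up, an assumption).
            balls = (total + n - 1) // n - loads[i]
        else:
            balls = 1
        loads[i] += balls
        total += balls
    return loads, samples
```

The ratio `sum(loads) / samples` estimates the average number of balls allocated per sample, which is exactly 1 for One-Choice.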

Cops and Robbers on Multi-Layer Graphs with Jessica Enright, Kitty Meeks and William Pettersson
WG 2023, volume 14093 of LNCS 319–333. [arXiv] [Conference]

We generalise the popular cops and robbers game to multi-layer graphs, where each cop and the robber are restricted to a single layer (or set of edges). We show that initial intuition about the best way to allocate cops to layers is not always correct, and prove that the multi-layer cop number is neither bounded from above nor below by any function of the cop numbers of the individual layers. We determine that it is NP-hard to decide if $k$ cops are sufficient to catch the robber, even if all cop layers are trees. However, we give a polynomial time algorithm to determine if $k$ cops can win when the robber layer is a tree. Additionally, we investigate a question of worst-case division of a simple graph into layers: given a simple graph $G$, what is the maximum number of cops required to catch a robber over all multi-layer graphs where each edge of $G$ is in at least one layer and all layers are connected? For cliques, suitably dense random graphs, and graphs of bounded treewidth, we determine this parameter up to multiplicative constants. Lastly we consider a multi-layer variant of Meyniel's Conjecture, and show the existence of an infinite family of graphs whose multi-layer cop number is bounded from below by a constant times $n / \log n$, where $n$ is the number of vertices in the graph.


Balanced Allocations with Heterogeneous Bins: The Power of Memory with Dimitrios Los and Thomas Sauerwald
SODA 2023, pages 4448 - 4477. [arXiv] [Conference]

We consider the allocation of $m$ balls (jobs) into $n$ bins (servers). In the standard Two-Choice process, at each step $t=1,2,\ldots,m$ we first sample two bins uniformly at random and place a ball in the least loaded bin. It is well-known that for any $m \geq n$, this results in a gap (difference between the maximum and average load) of $\log_2 \log n + \Theta(1)$ (with high probability). In this work, we consider the Memory process where instead of two choices, we only sample one bin per step but we have access to a cache which can store the location of one bin. Mitzenmacher, Prabhakar and Shah showed that in the lightly loaded case ($m = n$), the Memory process achieves a gap of $\mathcal{O}(\log \log n)$.

Extending the setting of Mitzenmacher et al. in two ways, we first allow the number of balls $m$ to be arbitrary, which includes the challenging heavily loaded case where $m \geq n$. Secondly, we follow the heterogeneous bins model of Wieder, where the sampling distribution of bins can be biased up to some arbitrary multiplicative constant. Somewhat surprisingly, we prove that even in this setting, the Memory process still achieves an $\mathcal{O}(\log \log n)$ gap bound. This is in stark contrast with the Two-Choice (or any $d$-Choice with $d=\mathcal{O}(1)$) process, where it is known that the gap diverges as $m \rightarrow \infty$.

Further, we show that for any sampling distribution independent of $m$ (but possibly dependent on $n$) the Memory process has a gap that can be bounded independently of $m$. Finally, we prove a tight gap bound of $\mathcal{O}(\log n)$ for the Memory process in another relaxed setting with heterogeneous (weighted) balls and a cache which can only be maintained for two steps.


Bounds on the Twin-Width of Product Graphs with William Pettersson
Discrete Mathematics & Theoretical Computer Science, 25(1), 2023. [arXiv] [Journal]

Twin-width is a graph width parameter recently introduced by Bonnet, Kim, Thomassé & Watrigant. Given two graphs G and H and a graph product *, we address the question: is the twin-width of G*H bounded by a function of the twin-widths of G and H and their maximum degrees? It is known that a bound of this type holds for strong products (Bonnet, Geniet, Kim, Thomassé & Watrigant; SODA 2021).

We show that bounds of the same form hold for Cartesian, tensor/direct, rooted, replacement, and zig-zag products. For the lexicographical product we prove that the twin-width of the product of two graphs is exactly the maximum of the twin-widths of the individual graphs. In contrast, for the modular product we show that no bound can hold. In addition, we provide examples showing many of our bounds are tight, and give improved bounds for certain classes of graphs.

Cover and Hitting Times of Hyperbolic Random Graphs with Marcos Kiwi and Markus Schepers
Random Structures & Algorithms, Vol. 65, No. 4, 915–978, 2024. [arXiv] [Journal]
RANDOM 2022, volume 245 of LIPIcs, pages 30:1–30:19. [Conference]

We study random walks on the giant component of Hyperbolic Random Graphs (HRGs), in the regime when the degree distribution obeys a power law with exponent in the range $(2,3)$. In particular, we focus on the expected times for a random walk to hit a given vertex or visit, i.e. cover, all vertices. We show that up to multiplicative constants: the cover time is $n(\log n)^2$, the maximum hitting time is $n\log n$, and the average hitting time is $n$. The first two results hold in expectation and a.a.s. and the last in expectation (with respect to the HRG). We prove these results by determining the effective resistance either between an average vertex and the well-connected "center" of HRGs or between an appropriately chosen collection of extremal vertices. We bound the effective resistance by the energy dissipated by carefully designed network flows associated to a tiling of the hyperbolic plane on which we overlay a forest-like structure.


The Cover Time of a (Multiple) Markov Chain with Rational Transition Probabilities is Rational
Statistics & Probability Letters, 187:109534, 2022. [arXiv] [Journal]

The cover time of a Markov chain on a finite state space is the expected time until all states are visited. We show that if the cover time of a discrete-time Markov chain with rational transition probabilities is bounded, then it is a rational number. The result is proved by relating the cover time of the original chain to the hitting time of a set in another higher dimensional chain. We also extend this result to the setting where $k\geq 1 $ independent copies of a Markov chain are run simultaneously on the same state space and the cover time is the expected time until each state has been visited by at least one copy of the chain.
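The idea of passing to a larger chain can be made concrete for small examples. The sketch below tracks pairs (current state, set of visited states) and computes the expected time to reach a pair whose visited set is the whole state space, using exact rational arithmetic. It is one possible reading of the construction, not necessarily the one used in the paper; it assumes an irreducible chain given as a matrix of `Fraction` entries, and the function names are illustrative. The running time is exponential in the number of states.

```python
from fractions import Fraction
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        inv = 1 / m[col][col]
        m[col] = [entry * inv for entry in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def cover_time(p, start):
    """Exact expected time for the chain with transition matrix p to visit all states."""
    n = len(p)
    full = frozenset(range(n))
    expected = {(x, full): Fraction(0) for x in range(n)}
    # Work from large visited sets down to singletons, so that every
    # transition to a new state refers to an already computed value.
    for size in range(n - 1, 0, -1):
        for subset in combinations(range(n), size):
            visited = frozenset(subset)
            states = sorted(visited)
            index = {x: i for i, x in enumerate(states)}
            a = [[Fraction(0)] * size for _ in range(size)]
            b = [Fraction(1)] * size
            for x in states:
                i = index[x]
                a[i][i] += 1
                for y in range(n):
                    if y in visited:
                        a[i][index[y]] -= p[x][y]
                    else:
                        b[i] += p[x][y] * expected[(y, visited | {y})]
            for x, value in zip(states, solve_linear_system(a, b)):
                expected[(x, visited)] = value
    return expected[(start, frozenset({start}))]
```

Every step is a field operation on rationals, which is consistent with the rationality statement above.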


A New Temporal Interpretation of Cluster Editing with Cristiano Bocci, Chiara Capresi and Kitty Meeks
Journal of Computer and System Sciences, Vol. 144, 2024, 103551. [arXiv] [Journal]
IWOCA 2022, volume 13270 of LNCS 214-227. [Conference]

The NP-complete graph problem Cluster Editing seeks to transform a static graph into a disjoint union of cliques by making the fewest possible edits to the edge set. We introduce a natural interpretation of this problem in the setting of temporal graphs, whose edge-sets are subject to discrete changes over time, which we call Editing to Temporal Cliques. This problem is NP-complete even when restricted to temporal graphs whose underlying graph is a path, but we obtain two polynomial-time algorithms for special cases with further restrictions. In the static setting, it is well-known that a graph is a disjoint union of cliques if and only if it contains no induced copy of $P_3$; we demonstrate that no general characterisation involving sets of at most four vertices can exist in the temporal setting, but obtain a complete characterisation involving forbidden configurations on at most five vertices. This characterisation gives rise to an FPT algorithm parameterised simultaneously by the permitted number of modifications and the lifetime of the temporal graph, which uses a simple search-tree strategy.


Time Dependent Biased Random Walks with John Haslegrave and Thomas Sauerwald
ACM Transactions on Algorithms, 18(2), 2022. [arXiv] [Journal]

We study the biased random walk where at each step of a random walk a "controller" can, with a certain small probability, fix the next step. This model was introduced by Azar et al. [STOC 1992]; we extend their work to the time dependent setting and consider cover times of this walk. We obtain new bounds on the cover and hitting times and make progress towards resolving a conjecture of Azar et al. on maximising values of the stationary distribution. We also consider the problem of computing an optimal strategy for the controller to minimise the cover time and show that for directed graphs determining the cover time is $\mathsf{PSPACE}$-complete.


Balanced Allocations: Caching and Packing, Twinning and Thinning with Dimitrios Los and Thomas Sauerwald
SODA 2022, pages 1847-1874. [arXiv] [Conference]

We consider the sequential allocation of $m$ balls (jobs) into $n$ bins (servers) by allowing each ball to choose from some bins sampled uniformly at random. The goal is to maintain a small gap between the maximum load and the average load.

In this paper, we present a general framework that allows us to analyze various allocation processes that slightly prefer allocating into underloaded, as opposed to overloaded bins. Our analysis covers several natural instances of processes, including:

  • The Caching process (a.k.a. memory protocol) as studied by Mitzenmacher, Prabhakar and Shah (2002): At each round we only take one bin sample, but we also have access to a cache in which the most recently used bin is stored. We place the ball into the least loaded of the two.
  • The Packing process: At each round we only take one bin sample. If the load is below some threshold (e.g., the average load), then we place as many balls as needed to reach the threshold; otherwise, we place only one ball.
  • The Twinning process: At each round, we only take one bin sample. If the load is below some threshold, then we place two balls; otherwise, we place only one ball.
  • The Thinning process as recently studied by Feldheim and Gurel-Gurevich (2021): At each round, we first take one bin sample. If its load is below some threshold, we place one ball; otherwise, we place one ball into a second bin sample.

As we demonstrate, our general framework implies for all these processes a gap of $\mathcal{O}(\log n)$ between the maximum load and average load, even when an arbitrary number of balls $m \geq n$ are allocated (heavily loaded case). Our analysis is inspired by a previous work of Peres, Talwar and Wieder (2010) for the $(1+\beta)$-process, however here we rely on the interplay between different potential functions to prove stabilization.
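Of the processes listed above, Caching is the one that keeps state between rounds, so a short sketch may help. It follows the description given here, where the cache holds the most recently used bin; other formulations of the memory protocol update the cache differently, so this should be read as one possible variant. The name `caching` is illustrative.

```python
import random


def caching(n, m, rng):
    """Allocate m balls with the Caching (memory) process; return the loads."""
    loads = [0] * n
    cache = None
    for _ in range(m):
        i = rng.randrange(n)
        # Choose the less loaded of the fresh sample and the cached bin.
        if cache is not None and loads[cache] < loads[i]:
            i = cache
        loads[i] += 1
        cache = i  # remember the bin that was just used
    return loads
```

The quantity `max(loads) - m / n` is then the gap between the maximum and average load for that run.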


The Complexity of Finding and Enumerating Optimal Subgraphs to Represent Spatial Correlation with Jessica Enright, Duncan Lee, Kitty Meeks and William Pettersson
Algorithmica 86(10): 3186-3230, 2024. [arXiv] [Journal]
COCOA 2021, volume 13135 of LNCS 152-166. [Conference]

Understanding spatial correlation is vital in many fields including epidemiology and social science. Lee, Meeks and Pettersson recently demonstrated that improved inference for areal unit count data can be achieved by carrying out modifications to a graph representing spatial correlations; specifically, they delete edges of the planar graph derived from border-sharing between geographic regions in order to maximise a specific objective function. In this paper we address the computational complexity of the associated graph optimisation problem.

We demonstrate that this problem cannot be solved in polynomial time unless P = NP; we further show intractability for two simpler variants of the problem. We follow these results with two parameterised algorithms that exactly solve the problem in polynomial time in restricted settings. The first of these utilises dynamic programming on a tree decomposition, and runs in polynomial time if both the treewidth and maximum degree are bounded. The second algorithm is restricted to problem instances with maximum degree three, as may arise from triangulations of planar surfaces, but is an FPT algorithm when the number of edges to be removed is taken as the parameter.


Multiple Random Walks on Graphs: Mixing Few to Cover Many with Nicolás Rivera and Thomas Sauerwald
Combinatorics, Probability and Computing, 32(4):594 - 637, 2023. [arXiv] [Journal]
ICALP 2021, volume 198 of LIPIcs 107:1-16. [Conference]

Random walks on graphs are an essential primitive for many randomised algorithms and stochastic processes. It is natural to ask how much can be gained by running $k$ multiple random walks independently and in parallel. Although the cover time of multiple walks has been investigated for many natural networks, the problem of finding a general characterisation of multiple cover times for worst-case start vertices (posed by Alon, Avin, Koucky, Kozma, Lotker, and Tuttle in 2008) remains an open problem.

First, we improve and tighten various bounds on the stationary cover time when $k$ random walks start from vertices sampled from the stationary distribution. For example, we prove an unconditional lower bound of $\Omega( (n/k) \log n )$ on the stationary cover time, holding for any graph $G$ and any $1 \leq k =o(n\log n )$. Secondly, we establish the stationary cover times of multiple walks on several fundamental networks up to constant factors. Thirdly, we present a framework characterising worst-case cover times in terms of stationary cover times and a novel, relaxed notion of mixing time for multiple walks called partial mixing time. Roughly speaking, the partial mixing time only requires a specific portion of all random walks to be mixed. Using these new concepts, we can establish (or recover) the worst-case cover times for many networks including expanders, preferential attachment graphs, grids, binary trees and hypercubes.


The Power of Two Choices for Random Walks with Agelos Georgakopoulos, John Haslegrave and Thomas Sauerwald
Combinatorics, Probability and Computing, 31(1):73-100, 2022. [arXiv] [Journal]

We apply the power-of-two-choices paradigm to a random walk on a graph: rather than moving to a uniform random neighbour at each step, a controller is allowed to choose from two independent uniform random neighbours. We prove that this allows the controller to significantly accelerate the hitting and cover times in several natural graph classes. In particular, we show that the cover time becomes linear in the number $n$ of vertices on discrete tori and bounded degree trees, of order $\mathcal{O}(n \log \log n)$ on bounded degree expanders, and of order $\mathcal{O}(n (\log \log n)^2)$ on the Erdős-Rényi random graph in a certain sparsely connected regime. We also consider the algorithmic question of computing an optimal strategy, and prove a dichotomy in efficiency between computing strategies for hitting and cover times.


Choice and Bias for Random Walks with Agelos Georgakopoulos, John Haslegrave and Thomas Sauerwald
ITCS 2020, volume 151 of LIPIcs, 76:1-19. [Conference]

We analyse the following random walk process inspired by the power-of-two-choices paradigm: starting from a given vertex, at each step, unlike the simple random walk (SRW), which always moves to a uniformly random neighbour, we have the choice between two uniformly and independently chosen neighbours. We call this process the choice random walk (CRW).

We first prove that for any graph, there is a strategy for the CRW that visits any given vertex in expected time $\mathcal{O}(|E|)$. Then we introduce a general tool that quantifies by how much the probability of a rare event in the simple random walk can be boosted under a suitable CRW strategy. We believe this result to be of independent interest, and apply it here to derive an almost optimal $\mathcal{O}(n\log\log n)$ bound for the cover time of bounded-degree expanders. This tool also applies to so-called biased walks, and allows us to make progress towards a conjecture of Azar et al. [STOC 1992]. Finally, we prove the following dichotomy: computing an optimal strategy to minimise the hitting time of a vertex takes polynomial time, whereas computing one to minimise the cover time is $\mathsf{NP}$-hard.
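To make the CRW concrete, here is a minimal Python sketch comparing it with the SRW for hitting a fixed target. The strategy used, always taking whichever of the two offered neighbours is closer to the target in graph distance, is a simple heuristic assumed for illustration; it is not claimed to be the strategy analysed in the paper, nor an optimal one. All function names are hypothetical.

```python
import random
from collections import deque


def cycle_graph(n):
    """Adjacency lists of the n-vertex cycle (illustrative helper)."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]


def bfs_distances(adj, target):
    """Graph distance from every vertex to `target` (graph assumed connected)."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def hitting_time(adj, start, target, rng, use_choice):
    """Steps taken to reach `target` from `start`.

    With use_choice=False this is the simple random walk. With use_choice=True
    two independent uniform neighbours are offered at each step and the walk
    greedily takes the one nearer the target (a heuristic assumption).
    """
    dist = bfs_distances(adj, target)
    v, steps = start, 0
    while v != target:
        a = rng.choice(adj[v])
        if use_choice:
            b = rng.choice(adj[v])
            v = a if dist[a] <= dist[b] else b
        else:
            v = a
        steps += 1
    return steps


def mean_hitting_time(adj, start, target, use_choice, trials, seed=0):
    """Empirical mean hitting time over `trials` runs."""
    rng = random.Random(seed)
    total = sum(hitting_time(adj, start, target, rng, use_choice)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    graph = cycle_graph(30)
    print("SRW:", mean_hitting_time(graph, 0, 15, False, trials=200))
    print("CRW:", mean_hitting_time(graph, 0, 15, True, trials=200))
```

On the cycle the greedy rule gives the walk a drift towards the target, so the simulated CRW hitting time should come out well below that of the SRW.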


The dispersion time of random walks on finite graphs with Nicolás Rivera, Thomas Sauerwald and Alexandre Stauffer
SPAA 2019, pages 103-113, 2019. [arXiv] [Conference (Extended Abstract)]

We study two random processes on an $n$-vertex graph inspired by the internal diffusion limited aggregation (IDLA) model. In both processes $n$ particles start from an arbitrary but fixed origin. Each particle performs a simple random walk until first encountering an unoccupied vertex, at which point the vertex becomes occupied and the random walk terminates. In one of the processes, called Sequential-IDLA, only one particle moves until settling and only then does the next particle start, whereas in the second process, called Parallel-IDLA, all unsettled particles move simultaneously. Our main goal is to analyze the so-called dispersion time of these processes, which is the maximum number of steps performed by any of the $n$ particles.

In order to compare the two processes, we develop a coupling which shows that the dispersion time of the Parallel-IDLA stochastically dominates that of the Sequential-IDLA; however, the total number of steps performed by all particles has the same distribution in both processes. This coupling also shows that the dispersion time of the Parallel-IDLA is bounded in expectation by that of the Sequential-IDLA up to a multiplicative $\log n$ factor. Moreover, we derive asymptotic upper and lower bounds on the dispersion time for several graph classes, such as cliques, cycles, binary trees, $d$-dimensional grids, hypercubes and expanders. Most of our bounds are tight up to a multiplicative constant.
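The Sequential-IDLA process described above can be sketched in a few lines of Python. This is a minimal simulation under the stated definition (the first particle settles at the origin after zero steps); the function names are hypothetical, and the Parallel-IDLA variant is omitted because it needs a rule for particles arriving at the same vacant vertex simultaneously, which the abstract does not specify.

```python
import random


def cycle_graph(n):
    """Adjacency lists of the n-vertex cycle (illustrative helper)."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]


def sequential_idla(adj, origin, rng):
    """Run Sequential-IDLA on a connected graph.

    Returns (dispersion_time, total_steps): the maximum and the sum of the
    number of steps performed by the n particles.
    """
    n = len(adj)
    occupied = [False] * n
    steps_per_particle = []
    for _ in range(n):
        v, steps = origin, 0
        # Walk until the particle stands on an unoccupied vertex.
        while occupied[v]:
            v = rng.choice(adj[v])
            steps += 1
        occupied[v] = True  # the particle settles here
        steps_per_particle.append(steps)
    return max(steps_per_particle), sum(steps_per_particle)


if __name__ == "__main__":
    rng = random.Random(0)
    dispersion, total = sequential_idla(cycle_graph(20), origin=0, rng=rng)
    print("dispersion time:", dispersion, "total steps:", total)
```

Since exactly one particle settles on each vertex, the dispersion time on any graph is at least the distance from the origin to its farthest vertex, which gives a quick sanity check on the output.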

Random Walk Hitting Times and Effective Resistance in Sparsely Connected Erdős-Rényi Random Graphs
Journal of Graph Theory, 96(1):44-84, 2021. [arXiv] [Journal]

We prove a bound on the effective resistance $R(x,y)$ between two vertices $x,y$ of a connected graph which contains a suitably well-connected subgraph. We apply this bound, in tandem with a simple lower bound, to the Erdős-Rényi random graph $\mathcal{G}\left(n,p\right)$ with $np=\Omega(\log n)$, proving that $R(x,y)$ concentrates around $1/d(x) + 1/d(y)$, that is, the sum of reciprocal degrees. We also prove expectation and concentration results for the random walk hitting times, Kirchhoff index, cover cost, and the random target time (Kemeny's constant) on $\mathcal{G}\left(n,p\right)$ in the sparsely connected regime $\log n + \log\log \log n \leq np < n^{1/10}$.



Notes

Tails of Binomial Random Variables with Vanishing Mean [Pdf]

In this short note we shall consider the upper tail $\mathbb{P}\left(bin(n,p) \geq k\right)$ for the Binomial distribution $bin(n,p)$ when $np \rightarrow 0$ and $k> np$. We derive a simple expression for $\mathbb{P}\left(bin(n,p) \geq k\right)$ which shows that asymptotically the Chernoff bound overestimates this probability by a multiplicative factor of $\sqrt{2\pi k}$.
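The $\sqrt{2\pi k}$ factor can be checked numerically. The Python sketch below compares the exact upper tail with the Chernoff bound, assuming the bound is taken in its relative-entropy form $\exp(-n\,D(k/n \,\|\, p))$; the note may state it differently, and the function names and parameter values here are chosen purely for illustration.

```python
import math


def log_binomial_pmf(n, p, j):
    """Natural log of P(bin(n, p) = j), computed via log-gamma for stability."""
    return (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(p) + (n - j) * math.log1p(-p))


def upper_tail(n, p, k, terms=200):
    """P(bin(n, p) >= k), summing at most `terms` further terms beyond k.

    When np is tiny the terms decay very quickly, so the truncation is
    harmless in that regime (an assumption of this sketch).
    """
    top = min(n, k + terms)
    return sum(math.exp(log_binomial_pmf(n, p, j)) for j in range(k, top + 1))


def chernoff_bound(n, p, k):
    """Chernoff bound exp(-n * D(k/n || p)) on P(bin(n, p) >= k), for np < k < n."""
    exponent = (k * math.log(k / (n * p))
                + (n - k) * (math.log1p(-k / n) - math.log1p(-p)))
    return math.exp(-exponent)


if __name__ == "__main__":
    n, p, k = 10**6, 1e-8, 20  # np = 0.01, so the mean is small
    ratio = chernoff_bound(n, p, k) / upper_tail(n, p, k)
    print("Chernoff / exact:", ratio)
    print("sqrt(2 pi k):    ", math.sqrt(2 * math.pi * k))
```

For small $np$ and moderately large $k$ the two printed values should be close, with the remaining gap coming mostly from the Stirling correction to $k!$.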



Videos of Talks I Have Given

Cover and Hitting Times of Hyperbolic Random Graphs

Choice and Bias for Random Walks

Multiple Random Walks on Graphs: Mixing Few to Cover Many