# John Sylvester

#### University of Liverpool

I am a Lecturer at the Department of Computer Science, University of Liverpool.

Before this I was a PostDoc with Kitty Meeks and Jessica Enright at the University of Glasgow.

Prior to that I was a PostDoc with Thomas Sauerwald at the University of Cambridge.

My PhD was supervised by Agelos Georgakopoulos at the University of Warwick.

My undergrad degree was in Mathematics at University College London.

### Interests

I work in discrete probability, in particular random processes on graphs and random graphs. I am also interested in algorithms and complexity for (temporal) graph problems.

### Preprints

Click on arrows to expand.

The Power of Filling in Balanced Allocations with Dimitrios Los and Thomas Sauerwald
Submitted. [arXiv]

It is well known that if $m$ balls (jobs) are placed sequentially into $n$ bins (servers) according to the One-Choice protocol − choose a single bin in each round and allocate one ball to it − then, for $m\gg n$, the gap between the maximum and average load diverges. Many refinements of the One-Choice protocol have been studied that achieve a gap that remains bounded by a function of $n$, for any $m$. However most of these variations, such as Two-Choice, are less sample-efficient than One-Choice, in the sense that for each allocated ball more than one sample is needed (in expectation). We introduce a new class of processes which are primarily characterized by "filling" underloaded bins. A prototypical example is the Packing process: At each round we only take one bin sample, if the load is below the average load, then we place as many balls until the average load is reached; otherwise, we place only one ball. We prove that for any process in this class the gap between the maximum and average load is $\mathcal{O}(\log n)$ for any number of balls $m$. For the Packing process, we also prove a matching lower bound. We also prove that the Packing process is more sample-efficient than One-Choice, that is, it allocates on average more than one ball per sample. Finally, we also demonstrate that the upper bound of $\mathcal{O}(\log n)$ on the gap can be extended to the Caching process (a.k.a. memory protocol) studied by Mitzenmacher, Prabhakar and Shah (2002).

Bounds on the Twin-Width of Product Graphs with William Pettersson
Submitted. [arXiv]

Twin-width is a graph width parameter recently introduced by Bonnet, Kim, Thomassé & Watrigant. Given two graphs G and H and a graph product *, we address the question: is the twin-width of G*H bounded by a function of the twin-widths of G and H and their maximum degrees? It is known that a bound of this type holds for strong products (Bonnet, Geniet, Kim, Thomassé & Watrigant; SODA 2021). We show that bounds of the same form hold for Cartesian, tensor/direct, rooted, replacement, and zig-zag products. For the lexicographical product we prove that the twin-width of the product of two graphs is exactly the maximum of the twin-widths of the individual graphs. In contrast, for the modular product we show that no bound can hold. In addition, we provide examples showing many of our bounds are tight, and give improved bounds for certain classes of graphs.

Tangled Paths: A Random Graph Model from Mallows Permutations with Jessica Enright, Kitty Meeks and William Pettersson
Submitted. [arXiv]

We introduce the random graph $\mathcal{P}(n,q)$ which results from taking the union of two paths of length $n\geq 1$, where the vertices of one of the paths have been relabelled according to a Mallows permutation with real parameter $0 < q( n ) \leq 1$. This random graph model, the tangled path, goes through an evolution: if $q$ is close to $0$ the graph bears resemblance to a path and as $q$ tends to $1$ it becomes an expander. In an effort to understand the evolution of $\mathcal{P}(n,q)$ we determine the treewidth and cutwidth of $\mathcal{P}(n,q)$ up to log factors for all $q$. We also show that the property of having a separator of size one has a sharp threshold. In addition, we prove bounds on the diameter, and vertex isoperimetric number for specific values of $q$.

### Publications in Journals and Peer-reviewed Conferences

Balanced Allocations with Heterogeneous Bins: The Power of Memory with Dimitrios Los and Thomas Sauerwald
SODA 2023, pages 4448 - 4477. [arXiv] [Conference]

We consider the allocation of $m$ balls (jobs) into $n$ bins (servers). In the standard Two-Choice process, at each step $t=1,2,\ldots,m$ we first sample two bins uniformly at random and place a ball in the least loaded bin. It is well-known that for any $m \geq n$, this results in a gap (difference between the maximum and average load) of $\log_2 \log n + \Theta(1)$ (with high probability). In this work, we consider the Memory process where instead of two choices, we only sample one bin per step but we have access to a cache which can store the location of one bin. Mitzenmacher, Prabhakar and Shah showed that in the lightly loaded case ($m = n$), the Memory process achieves a gap of $\mathcal{O}(\log \log n)$. Extending the setting of Mitzenmacher et al.~in two ways, we first allow the number of balls $m$ to be arbitrary, which includes the challenging heavily loaded case where $m \geq n$. Secondly, we follow the heterogeneous bins model of Wieder, where the sampling distribution of bins can be biased up to some arbitrary multiplicative constant. Somewhat surprisingly, we prove that even in this setting, the Memory process still achieves an $\mathcal{O}(\log \log n)$ gap bound. This is in stark contrast with the Two-Choice (or any $d$-Choice with $d=\mathcal{O}(1)$) process, where it is known that the gap diverges as $m \rightarrow \infty$~\cite{W07}. Further, we show that for any sampling distribution independent of $m$ (but possibly dependent on $n$) the Memory process has a gap that can be bounded independently of $m$. Finally, we prove a tight gap bound of $\mathcal{O}(\log n)$ for the Memory process in another relaxed setting with heterogeneous (weighted) balls and a cache which can only be maintained for two steps.

Cover and Hitting Times of Hyperbolic Random Graphs with Marcos Kiwi and Markus Schepers
APPROX/RANDOM 2022, volume 245 of LIPIcs, pages 30:1–30:19 [arXiv] [Conference]

We study random walks on the giant component of Hyperbolic Random Graphs (HRGs), in the regime when the degree distribution obeys a power law with exponent in the range $(2,3)$. In particular, we focus on the expected times for a random walk to hit a given vertex or visit, i.e. cover, all vertices. We show that up to multiplicative constants: the cover time is $n(\log n)^2$, the maximum hitting time is $n\log n$, and the average hitting time is $n$. The first two results hold in expectation and a.a.s. and the last in expectation (with respect to the HRG). We prove these results by determining the effective resistance either between an average vertex and the well-connected "center" of HRGs or between an appropriately chosen collection of extremal vertices. We bound the effective resistance by the energy dissipated by carefully designed network flows associated to a tiling of the hyperbolic plane on which we overlay a forest-like structure.

The Cover Time of a (Multiple) Markov Chain with Rational Transition Probabilities is Rational
Statistics & Probability Letters, 187:109534, 2022. [arXiv] [Journal (Open Access)]

The cover time of a Markov chain on a finite state space is the expected time until all states are visited. We show that if the cover time of a discrete-time Markov chain with rational transitions probabilities is bounded, then it is a rational number. The result is proved by relating the cover time of the original chain to the hitting time of a set in another higher dimensional chain. We also extend this result to the setting where $k\geq 1$ independent copies of a Markov chain are run simultaneously on the same state space and the cover time is the expected time until each state has been visited by at least one copy of the chain.

A New Temporal Interpretation of Cluster Editing with Cristiano Bocci, Chiara Capresi and Kitty Meeks
IWOCA 2022, volume 13270 of LNCS 214-227. [arXiv] [Conference]

The NP-complete graph problem Cluster Editing seeks to transform a static graph into disjoint union of cliques by making the fewest possible edits to the edge set. We introduce a natural interpretation of this problem in the setting of temporal graphs, whose edge-sets are subject to discrete changes over time, which we call Editing to Temporal Cliques. This problem is NP-complete even when restricted to temporal graphs whose underlying graph is a path, but we obtain two polynomial-time algorithms for special cases with further restrictions. In the static setting, it is well-known that a graph is a disjoint union of cliques if and only if it contains no induced copy of $P_3$; we demonstrate that no general characterisation involving sets of at most four vertices can exist in the temporal setting, but obtain a complete characterisation involving forbidden configurations on at most five vertices. This characterisation gives rise to an FPT algorithm parameterised simultaneously by the permitted number of modifications and the lifetime of the temporal graph, which uses a simple search-tree strategy.

Time Dependent Biased Random Walks with John Haslegrave and Thomas Sauerwald
ACM Transactions on Algorithms, 18(2), 2022. [arXiv] [Journal]

We study the biased random walk where at each step of a random walk a controller'' can, with a certain small probability, fix the next step. This model was introduced by Azar et al. [STOC1992]; we extend their work to the time dependent setting and consider cover times of this walk. We obtain new bounds on the cover and hitting times and make progress towards resolving a conjecture of Azar et al. on maximising values of the stationary distribution. We also consider the problem of computing an optimal strategy for the controller to minimise the cover time and show that for directed graphs determining the cover time is $\mathsf{PSPACE}$-complete.

Balanced Allocations: Caching and Packing, Twinning and Thinning with Dimitrios Los and Thomas Sauerwald
SODA 2022, pages 1847-1874. [arXiv] [Conference]

We consider the sequential allocation of $m$ balls (jobs) into $n$ bins (servers) by allowing each ball to choose from some bins sampled uniformly at random. The goal is to maintain a small gap between the maximum load and the average load. In this paper, we present a general framework that allows us to analyze various allocation processes that slightly prefer allocating into underloaded, as opposed to overloaded bins. Our analysis covers several natural instances of processes, including:

• The Caching process (a.k.a. memory protocol) as studied by Mitzenmacher, Prabhakar and Shah (2002): At each round we only take one bin sample, but we also have access to a cache in which the most recently used bin is stored. We place the ball into the least loaded of the two.
• The Packing process: At each round we only take one bin sample. If the load is below some threshold (e.g., the average load), then we place as many balls until the threshold is reached; otherwise, we place only one ball.
• The Twinning process: At each round, we only take one bin sample. If the load is below some threshold, then we place two balls; otherwise, we place only one ball.
• The Thinning process as recently studied by Feldheim and Gurel-Gurevich (2021): At each round, we first take one bin sample. If its load is below some threshold, we place one ball; otherwise, we place one ball into a second bin sample.
As we demonstrate, our general framework implies for all these processes a gap of $\mathcal{O}(\log n)$ between the maximum load and average load, even when an arbitrary number of balls $m \geq n$ are allocated (heavily loaded case). Our analysis is inspired by a previous work of Peres, Talwar and Wieder (2010) for the $(1+\beta)$-process, however here we rely on the interplay between different potential functions to prove stabilization.

The Complexity of Finding Optimal Subgraphs to Represent Spatial Correlation with Jessica Enright, Duncan Lee, Kitty Meeks and William Pettersson
COCOA 2021, volume 13135 of LNCS 152-166. [arXiv] [Conference]

Understanding spatial correlation is vital in many fields including epidemiology and social science. Lee, Meeks and Pettersson recently demonstrated that improved inference for areal unit count data can be achieved by carrying out modifications to a graph representing spatial correlations; specifically, they delete edges of the planar graph derived from border-sharing between geographic regions in order to maximise a specific objective function. In this paper we address the computational complexity of the associated graph optimisation problem. We demonstrate that this problem cannot be solved in polynomial time unless P = NP; we further show intractability for two simpler variants of the problem. We follow these results with two parameterised algorithms that exactly solve the problem in polynomial time in restricted settings. The first of these utilises dynamic programming on a tree decomposition, and runs in polynomial time if both the treewidth and maximum degree are bounded. The second algorithm is restricted to problem instances with maximum degree three, as may arise from triangulations of planar surfaces, but is an FPT algorithm when the number of edges to be removed is taken as the parameter.

Multiple Random Walks on Graphs: Mixing Few to Cover Many with Nicolás Rivera and Thomas Sauerwald
ICALP 2021 volume 198 of LIPIcs 107:1-16 [arXiv] [Conference (Open Access)]

Random walks on graphs are an essential primitive for many randomised algorithms and stochastic processes. It is natural to ask how much can be gained by running $k$ multiple random walks independently and in parallel. Although the cover time of multiple walks has been investigated for many natural networks, the problem of finding a general characterisation of multiple cover times for worst-case start vertices (posed by Alon, Avin, Koucky, Kozma, Lotker, and Tuttle in 2008) remains an open problem. First, we improve and tighten various bounds on the stationary} cover time when $k$ random walks start from vertices sampled from the stationary distribution. For example, we prove an unconditional lower bound of $\Omega( (n/k) \log n )$ on the stationary cover time, holding for any graph $G$ and any $1 \leq k =o(n\log n )$. Secondly, we establish the stationary cover times of multiple walks on several fundamental networks up to constant factors. Thirdly, we present a framework characterising worst-case cover times in terms of stationary cover times and a novel, relaxed notion of mixing time for multiple walks called partial mixing time. Roughly speaking, the partial mixing time only requires a specific portion of all random walks to be mixed. Using these new concepts, we can establish (or recover) the worst-case cover times for many networks including expanders, preferential attachment graphs, grids, binary trees and hypercubes.

The Power of Two Choices for Random Walks with Agelos Georgakopoulos, John Haslegrave and Thomas Sauerwald
Combinatorics, Probability and Computing, 31(1):73-100, 2022 [arXiv] [Journal (Open Access)]

We apply the power-of-two-choices paradigm to a random walk on a graph: rather than moving to a uniform random neighbour at each step, a controller is allowed to choose from two independent uniform random neighbours. We prove that this allows the controller to significantly accelerate the hitting and cover times in several natural graph classes. In particular, we show that the cover time becomes linear in the number $n$ of vertices on discrete tori and bounded degree trees, of order $\mathcal{O}(n \log \log n)$ on bounded degree expanders, and of order $\mathcal{O}(n (\log \log n)^2)$ on the Erdős-Rényi random graph in a certain sparsely connected regime. We also consider the algorithmic question of computing an optimal strategy, and prove a dichotomy in efficiency between computing strategies for hitting and cover times.

Choice and Bias for Random Walks with Agelos Georgakopoulos, John Haslegrave and Thomas Sauerwald
ITCS 2020, volume 151 of LIPIcs 76:1 – 19 [Conference (Open Access)]

We analyse the following random walk process inspired by the power-of-two-choice paradigm: starting from a given vertex, at each step, unlike the simple random walk (SRW) that always moves to a randomly chosen neighbour, we have the choice between two uniformly and independently chosen neighbours. We call this process the choice random walk (CRW). We first prove that for any graph, there is a strategy for the CRW that visits any given vertex in expected time $\mathcal{O}(|E|)$. Then we introduce a general tool that quantifies by how much the probability of a rare event in the simple random walk can be boosted under a suitable CRW strategy. We believe this result to be of independent interest, and apply it here to derive an almost optimal $\mathcal{O}(n\log\log n)$ bound for the cover time of bounded-degree expanders. This tool also applies to so-called biased walks, and allows us to make progress towards a conjecture of Azar et al. [STOC 1992]. Finally, we prove the following dichotomy: computing an optimal strategy to minimise the hitting time of a vertex takes polynomial time, whereas computing one to minimise the cover time is $\mathsf{NP}$-hard.

The dispersion time of random walks on finite graphs with Nicolás Rivera, Thomas Sauerwald and Alexandre Stauffer
SPAA 2019, pages 103--113, 2019. [arXiv] [Conference (Extended Abstract)]

We study two random processes on an $n$-vertex graph inspired by the internal diffusion limited aggregation (IDLA) model. In both processes $n$ particles start from an arbitrary but fixed origin. Each particle performs a simple random walk until first encountering an unoccupied vertex, and at which point the vertex becomes occupied and the random walk terminates. In one of the processes, called Sequential-IDLA, only one particle moves until settling and only then does the next particle start whereas in the second process, called Parallel-IDLA, all unsettled particles move simultaneously. Our main goal is to analyze the so-called dispersion time of these processes, which is the maximum number of steps performed by any of the $n$ particles. In order to compare the two processes, we develop a coupling which shows the dispersion time of the Parallel-IDLA stochastically dominates that of the Sequential-IDLA; however, the total number of steps performed by all particles has the same distribution in both processes. This coupling also gives us that dispersion time of Parallel-IDLA is bounded in expectation by dispersion time of the Sequential-IDLA up to a multiplicative $\log n$ factor. Moreover, we derive asymptotic upper and lower bound on the dispersion time for several graph classes, such as cliques, cycles, binary trees, $d$-dimensional grids, hypercubes and expanders. Most of our bounds are tight up to a multiplicative constant.

Random Walk Hitting Times and Effective Resistance in Sparsely Connected Erdős-Rényi Random Graphs
Journal of Graph Theory, 96(1):44-84, 2021. [arXiv] [Journal (Open Access)]

We prove a bound on the effective resistance $R(x,y)$ between two vertices $x,y$ of a connected graph which contains a suitably well-connected sub-graph. We apply this bound, in tandem with a simple lower bound, to the Erdős-Rényi random graph $\mathcal{G}\left(n,p\right)$ with $np=\Omega(\log n)$, proving that $R(x,y)$ concentrates around $1/d(x) + 1/d(y)$, that is, the sum of reciprocal degrees. We also prove expectation and concentration results for the random walk hitting times, Kirchoff index, cover cost, and the random target time (Kemeny's constant) on $\mathcal{G}\left(n,p\right)$ in the sparsely connected regime $\log n + \log\log \log n \leq np < n^{1/10}$.

### Notes

Tails of Binomial Random Variables with Vanishing Mean [Pdf]

In this short note we shall consider the upper tail $\mathbb{P}\left(bin(n,p) \geq k\right)$ for the Binomial distribution $bin(n,p)$ when $np \rightarrow 0$ and $k> np$. We derive a simple expression for $\mathbb{P}\left(bin(n,p) \geq k\right)$ which shows that aysmtotically the Chernoff bound overestimates this probability by a multiplicative factor of $\sqrt{2\pi k}$ .

### Videos of Talks I Have Given

Cover and Hitting Times of Hyperbolic Random Graphs

Choice and Bias for Random Walks

Multiple Random Walks on Graphs: Mixing Few to Cover Many