## John Sylvester## University of Liverpool |

My PhD was supervised by Agelos Georgakopoulos at the University of Warwick.

My undergrad degree was in Mathematics at University College London.

Click on arrows to expand.

We introduce a new class of balanced allocation processes which bias towards underloaded bins (those with load below the mean load) either by skewing the probability by which a bin is chosen for an allocation (probability bias), or alternatively, by adding more balls to an underloaded bin (weight bias). A prototypical process satisfying the probability bias condition is Mean-Thinning: At each round, we sample one bin and if it is underloaded, we allocate one ball; otherwise, we allocate one ball to a second bin sample. Versions of this process have been in use since at least 1986. An example of a process, introduced by us, which satisfies the weight bias condition is Twinning: At each round, we only sample one bin. If the bin is underloaded, then we allocate two balls; otherwise, we allocate only one ball.

Our main result is that for any process with a probability or weight bias, with high probability the gap between maximum and minimum load is logarithmic in the number of bins. This result holds for any number of allocated balls (heavily loaded case), covers many natural processes that relax the Two-Choice process, and we also prove it is tight for many such processes, including Mean-Thinning and Twinning.

Our analysis employs a delicate interplay between linear, quadratic and exponential potential functions. It also hinges on a phenomenon we call ``mean quantile stabilization'', which holds in greater generality than our framework and may be of independent interest.

We introduce the random graph $\mathcal{P}(n,q)$ which results from taking the union of two paths of length $n\geq 1$, where the vertices of one of the paths have been relabelled according to a Mallows permutation with real parameter $0 < q( n ) \leq 1 $. This random graph model, the tangled path, goes through an evolution: if $q$ is close to $0$ the graph bears resemblance to a path and as $q$ tends to $1$ it becomes an expander. In an effort to understand the evolution of $\mathcal{P}(n,q)$ we determine the treewidth and cutwidth of $\mathcal{P}(n,q)$ up to log factors for all $q$. We also show that the property of having a separator of size one has a sharp threshold. In addition, we prove bounds on the diameter, and vertex isoperimetric number for specific values of $q$.

We introduce a dense counterpart of graph degeneracy, which extends the recently-proposed invariant symmetric difference. We say that a graph has sd-degeneracy (for symmetric-difference degeneracy) at most $d$ if it admits an elimination order of its vertices where a vertex $u$ can be removed whenever it has a $d$-twin, i.e., another vertex $v$ such that at most $d$ vertices outside $\{u,v\}$ are neighbors of exactly one of $u,v$. The family of graph classes of bounded sd-degeneracy is a superset of that of graph classes of bounded degeneracy or of bounded flip-width, and more generally, of bounded symmetric difference. Unlike most graph parameters, sd-degeneracy is not hereditary: it may be strictly smaller on a graph than on some of its induced subgraphs. In particular, every $n$-vertex graph is an induced subgraph of some $O(n^2)$-vertex graph of sd-degeneracy $1$. In spite of this and the breadth of classes of bounded sd-degeneracy, we devise $\tilde{O}(\sqrt{n})$-bit adjacency labeling schemes for them, which are optimal up to the hidden polylogarithmic factor. This is attained on some even more general classes, consisting of graphs $G$ whose vertices bijectively map to the leaves of a tree $T$, where transversal edges and anti-edges added to $T$ define the edge set of $G$. We call such graph representations signed tree models as they extend the so-called tree models (or twin-decompositions) developed in the context of twin-width, by adding transversal anti-edges. While computing the degeneracy of an input graph can be done in linear time, we show that deciding whether its symmetric difference is at most $8$ is co-NP-complete, and whether its sd-degeneracy is at most $1$ is NP-complete.

A class of graphs admits an adjacency labeling scheme of size $f(n)$, if the vertices of any $n$-vertex graph $G$ in the class can be assigned binary strings (aka labels) of length $f(n)$ so that the adjacency between each pair of vertices in $G$ can be determined only from their labels. The Implicit Graph Conjecture (IGC) claimed that any graph class which is \emph{hereditary} (i.e.~closed under taking induced subgraphs) and \emph{factorial} (i.e.~containing $2^{\Theta(n \log n)}$ graphs on $n$ vertices) admits an adjacency labeling scheme of order optimal size $\mathcal{O}(\log n)$. After thirty years open, the IGC was recently disproved [Hatami and Hatami, FOCS 2022].

In this work we show that the IGC does not hold even for \emph{monotone} graph classes, i.e.~closed under taking subgraphs. More specifically, we show that there are monotone factorial graph classes for which the size of any adjacency labeling scheme is $\Omega(\log^2 n)$. Moreover, this is best possible, as any monotone factorial class admits an adjacency labeling scheme of size $\mathcal{O}(\log^2 n)$.

This is a consequence of our general result that establishes tight bounds on the size of adjacency labeling schemes for monotone graph classes: for any function $f: \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}_{\geq 0}$ with $\log x \leq f(x) \leq x^{1-\delta}$ for some constant $\delta > 0$, that satisfies some natural conditions, there exist monotone graph classes, in which the number of $n$-vertex graphs grows as $2^{\mathcal{O}(nf(n))}$ and that do not admit adjacency labels of size at most $f(n) \log n$. On the other hand any such class admits adjacency labels of size $\mathcal{O}(f(n)\log n)$, which is a factor of $\log n$ away from the order optimal bound $\mathcal{O}(f(n))$. This is the first example of tight bounds on adjacency labels for graph classes that do not admit order optimal adjacency labeling schemes.

Randomized rumor spreading processes diffuse information on an undirected graph and have been widely studied. In this work, we present a generic framework for analyzing a broad class of such processes on regular graphs. Our analysis is protocol-agnostic, as it only requires the expected proportion of newly informed vertices in each round to be bounded, and a natural negative correlation property.

This framework allows us to analyze various protocols, including PUSH, PULL, and PUSH-PULL, thereby extending prior research. Unlike previous work, our framework accommodates message failures at any time $t\geq 0$ with a probability of $1 − q(t)$, where the credibility $q(t)$ is any function of time. This enables us to model real-world scenarios in which the transmissibility of rumors may fluctuate, as seen in the spread of “fake news” and viruses. Additionally, our framework is sufficiently broad to cover dynamic graphs.

We show that for any natural number $s$, there is a constant $\gamma$ and a subgraph-closed class having, for any natural $n$, at most $\gamma^n$ graphs on $n$ vertices up to isomorphism, but no adjacency labeling scheme with labels of size at most $s \log n$. In other words, for every $s$, there is a small -even tiny- monotone class without universal graphs of size $n^s$. Prior to this result, it was not excluded that every small class has an almost linear universal graph, or equivalently a labeling scheme with labels of size $(1+o(1))\log n$. The existence of such a labeling scheme, a scaled-down version of the recently disproved Implicit Graph Conjecture, was repeatedly raised [Gavoille and Labourel, ESA '07; Dujmović et al., JACM '21; Bonamy et al., SIDMA '22; Bonnet et al., Comb. Theory '22]. Furthermore, our small monotone classes have unbounded twin-width, thus simultaneously disprove the already-refuted Small conjecture; but this time with a self-contained proof, not relying on elaborate group-theoretic constructions.

As our main ingredient, we show that with high probability an Erdős-Rényi random graph $G(n,p)$ with $p=O(1/n)$ has, for every $k \leqslant n$, at most $2^{O(k)}$ subgraphs on $k$ vertices, up to isomorphism. As a barrier to our general method of producing even more complex tiny classes, we show that when $p=\omega(1/n)$, the latter no longer holds. More concretely, we provide an explicit lower bound on the number of unlabeled $k$-vertex induced subgraphs of $G(n,p)$ when $1/n \leq p \leq 1-1/n$. We thereby obtain a threshold for the property of having exponentially many unlabeled induced subgraphs: if $\min \{p, 1-p\}<\delta/n$ with $\delta < 1$, then with high probability even the number of all unlabeled (not necessarily induced) subgraphs is $2^{o(n)}$, whereas if $C/n < p < 1-C/n$ for sufficiently large $C$, then with high probability the number of unlabeled induced subgraphs is $2^{\Theta(n)}$. This result supplements the study of counting unlabeled induced subgraphs that was initiated by Erdős and Rényi with a question on the number of unlabeled induced subgraphs of Ramsey graphs, eventually answered by Shelah.

It is well known that if $m$ balls (jobs) are placed sequentially into $n$ bins (servers) according to the One-Choice protocol − choose a single bin in each round and allocate one ball to it − then, for $m\gg n$, the gap between the maximum and average load diverges. Many refinements of the One-Choice protocol have been studied that achieve a gap that remains bounded by a function of $n$, for any $m$. However most of these variations, such as Two-Choice, are less sample-efficient than One-Choice, in the sense that for each allocated ball more than one sample is needed (in expectation).

We introduce a new class of processes which are primarily characterized by "filling" underloaded bins. A prototypical example is the Packing process: At each round we only take one bin sample, if the load is below the average load, then we place as many balls until the average load is reached; otherwise, we place only one ball. We prove that for any process in this class the gap between the maximum and average load is $\mathcal{O}(\log n)$ for any number of balls $m$. For the Packing process, we also prove a matching lower bound. We also prove that the Packing process is more sample-efficient than One-Choice, that is, it allocates on average more than one ball per sample. Finally, we also demonstrate that the upper bound of $\mathcal{O}(\log n)$ on the gap can be extended to the Caching process (a.k.a. memory protocol) studied by Mitzenmacher, Prabhakar and Shah (2002).

We generalise the popular cops and robbers game to multi-layer graphs, where each cop and the robber are restricted to a single layer (or set of edges). We show that initial intuition about the best way to allocate cops to layers is not always correct, and prove that the multi-layer cop number is neither bounded from above nor below by any function of the cop numbers of the individual layers. We determine that it is NP-hard to decide if $k$ cops are sufficient to catch the robber, even if all cop layers are trees. However, we give a polynomial time algorithm to determine if $k$ cops can win when the robber layer is a tree. Additionally, we investigate a question of worst-case division of a simple graph into layers: given a simple graph $G$, what is the maximum number of cops required to catch a robber over all multi-layer graphs where each edge of $G$ is in at least one layer and all layers are connected? For cliques, suitably dense random graphs, and graphs of bounded treewidth, we determine this parameter up to multiplicative constants. Lastly we consider a multi-layer variant of Meyniel's Conjecture, and show the existence of an infinite family of graphs whose multi-layer cop number is bounded from below by a constant times $n / \log n$, where $n$ is the number of vertices in the graph.

We consider the allocation of $m$ balls (jobs) into $n$ bins (servers). In the standard Two-Choice process, at each step $t=1,2,\ldots,m$ we first sample two bins uniformly at random and place a ball in the least loaded bin. It is well-known that for any $m \geq n$, this results in a gap (difference between the maximum and average load) of $\log_2 \log n + \Theta(1)$ (with high probability). In this work, we consider the Memory process where instead of two choices, we only sample one bin per step but we have access to a cache which can store the location of one bin. Mitzenmacher, Prabhakar and Shah showed that in the lightly loaded case ($m = n$), the Memory process achieves a gap of $\mathcal{O}(\log \log n)$.

Extending the setting of Mitzenmacher et al.~in two ways, we first allow the number of balls $m$ to be arbitrary, which includes the challenging heavily loaded case where $m \geq n$. Secondly, we follow the heterogeneous bins model of Wieder, where the sampling distribution of bins can be biased up to some arbitrary multiplicative constant. Somewhat surprisingly, we prove that even in this setting, the Memory process still achieves an $\mathcal{O}(\log \log n)$ gap bound. This is in stark contrast with the Two-Choice (or any $d$-Choice with $d=\mathcal{O}(1)$) process, where it is known that the gap diverges as $m \rightarrow \infty$.

Further, we show that for any sampling distribution independent of $m$ (but possibly dependent on $n$) the Memory process has a gap that can be bounded independently of $m$. Finally, we prove a tight gap bound of $\mathcal{O}(\log n)$ for the Memory process in another relaxed setting with heterogeneous (weighted) balls and a cache which can only be maintained for two steps.

Twin-width is a graph width parameter recently introduced by Bonnet, Kim, Thomassé & Watrigant. Given two graphs G and H and a graph product *, we address the question: is the twin-width of G*H bounded by a function of the twin-widths of G and H and their maximum degrees? It is known that a bound of this type holds for strong products (Bonnet, Geniet, Kim, Thomassé & Watrigant; SODA 2021).

We show that bounds of the same form hold for Cartesian, tensor/direct, rooted, replacement, and zig-zag products. For the lexicographical product we prove that the twin-width of the product of two graphs is exactly the maximum of the twin-widths of the individual graphs. In contrast, for the modular product we show that no bound can hold. In addition, we provide examples showing many of our bounds are tight, and give improved bounds for certain classes of graphs.

We study random walks on the giant component of Hyperbolic Random Graphs (HRGs), in the regime when the degree distribution obeys a power law with exponent in the range $(2,3)$. In particular, we focus on the expected times for a random walk to hit a given vertex or visit, i.e. cover, all vertices. We show that up to multiplicative constants: the cover time is $n(\log n)^2$, the maximum hitting time is $n\log n$, and the average hitting time is $n$. The first two results hold in expectation and a.a.s. and the last in expectation (with respect to the HRG). We prove these results by determining the effective resistance either between an average vertex and the well-connected "center" of HRGs or between an appropriately chosen collection of extremal vertices. We bound the effective resistance by the energy dissipated by carefully designed network flows associated to a tiling of the hyperbolic plane on which we overlay a forest-like structure.

The cover time of a Markov chain on a finite state space is the expected time until all states are visited. We show that if the cover time of a discrete-time Markov chain with rational transitions probabilities is bounded, then it is a rational number. The result is proved by relating the cover time of the original chain to the hitting time of a set in another higher dimensional chain. We also extend this result to the setting where $k\geq 1 $ independent copies of a Markov chain are run simultaneously on the same state space and the cover time is the expected time until each state has been visited by at least one copy of the chain.

The NP-complete graph problem Cluster Editing seeks to transform a static graph into disjoint union of cliques by making the fewest possible edits to the edge set. We introduce a natural interpretation of this problem in the setting of temporal graphs, whose edge-sets are subject to discrete changes over time, which we call Editing to Temporal Cliques. This problem is NP-complete even when restricted to temporal graphs whose underlying graph is a path, but we obtain two polynomial-time algorithms for special cases with further restrictions. In the static setting, it is well-known that a graph is a disjoint union of cliques if and only if it contains no induced copy of $P_3$; we demonstrate that no general characterisation involving sets of at most four vertices can exist in the temporal setting, but obtain a complete characterisation involving forbidden configurations on at most five vertices. This characterisation gives rise to an FPT algorithm parameterised simultaneously by the permitted number of modifications and the lifetime of the temporal graph, which uses a simple search-tree strategy.

We study the biased random walk where at each step of a random walk a ``controller'' can, with a certain small probability, fix the next step. This model was introduced by Azar et al. [STOC1992]; we extend their work to the time dependent setting and consider cover times of this walk. We obtain new bounds on the cover and hitting times and make progress towards resolving a conjecture of Azar et al. on maximising values of the stationary distribution. We also consider the problem of computing an optimal strategy for the controller to minimise the cover time and show that for directed graphs determining the cover time is $\mathsf{PSPACE}$-complete.

We consider the sequential allocation of $m$ balls (jobs) into $n$ bins (servers) by allowing each ball to choose from some bins sampled uniformly at random. The goal is to maintain a small gap between the maximum load and the average load.

In this paper, we present a general framework that allows us to analyze various allocation processes that slightly prefer allocating into underloaded, as opposed to overloaded bins. Our analysis covers several natural instances of processes, including:

- The Caching process (a.k.a. memory protocol) as studied by Mitzenmacher, Prabhakar and Shah (2002): At each round we only take one bin sample, but we also have access to a cache in which the most recently used bin is stored. We place the ball into the least loaded of the two.
- The Packing process: At each round we only take one bin sample. If the load is below some threshold (e.g., the average load), then we place as many balls until the threshold is reached; otherwise, we place only one ball.
- The Twinning process: At each round, we only take one bin sample. If the load is below some threshold, then we place two balls; otherwise, we place only one ball.
- The Thinning process as recently studied by Feldheim and Gurel-Gurevich (2021): At each round, we first take one bin sample. If its load is below some threshold, we place one ball; otherwise, we place one ball into a second bin sample.
As we demonstrate, our general framework implies for all these processes a gap of $\mathcal{O}(\log n)$ between the maximum load and average load, even when an arbitrary number of balls $m \geq n$ are allocated (heavily loaded case). Our analysis is inspired by a previous work of Peres, Talwar and Wieder (2010) for the $(1+\beta)$-process, however here we rely on the interplay between different potential functions to prove stabilization.

Understanding spatial correlation is vital in many fields including epidemiology and social science. Lee, Meeks and Pettersson recently demonstrated that improved inference for areal unit count data can be achieved by carrying out modifications to a graph representing spatial correlations; specifically, they delete edges of the planar graph derived from border-sharing between geographic regions in order to maximise a specific objective function. In this paper we address the computational complexity of the associated graph optimisation problem.

We demonstrate that this problem cannot be solved in polynomial time unless P = NP; we further show intractability for two simpler variants of the problem. We follow these results with two parameterised algorithms that exactly solve the problem in polynomial time in restricted settings. The first of these utilises dynamic programming on a tree decomposition, and runs in polynomial time if both the treewidth and maximum degree are bounded. The second algorithm is restricted to problem instances with maximum degree three, as may arise from triangulations of planar surfaces, but is an FPT algorithm when the number of edges to be removed is taken as the parameter.

Random walks on graphs are an essential primitive for many randomised algorithms and stochastic processes. It is natural to ask how much can be gained by running $k$ multiple random walks independently and in parallel. Although the cover time of multiple walks has been investigated for many natural networks, the problem of finding a general characterisation of multiple cover times for worst-case start vertices (posed by Alon, Avin, Koucky, Kozma, Lotker, and Tuttle in 2008) remains an open problem.

First, we improve and tighten various bounds on the stationary} cover time when $k$ random walks start from vertices sampled from the stationary distribution. For example, we prove an unconditional lower bound of $\Omega( (n/k) \log n )$ on the stationary cover time, holding for any graph $G$ and any $1 \leq k =o(n\log n )$. Secondly, we establish the stationary cover times of multiple walks on several fundamental networks up to constant factors. Thirdly, we present a framework characterising worst-case cover times in terms of stationary cover times and a novel, relaxed notion of mixing time for multiple walks called partial mixing time. Roughly speaking, the partial mixing time only requires a specific portion of all random walks to be mixed. Using these new concepts, we can establish (or recover) the worst-case cover times for many networks including expanders, preferential attachment graphs, grids, binary trees and hypercubes.

We apply the power-of-two-choices paradigm to a random walk on a graph: rather than moving to a uniform random neighbour at each step, a controller is allowed to choose from two independent uniform random neighbours. We prove that this allows the controller to significantly accelerate the hitting and cover times in several natural graph classes. In particular, we show that the cover time becomes linear in the number $n$ of vertices on discrete tori and bounded degree trees, of order $\mathcal{O}(n \log \log n)$ on bounded degree expanders, and of order $\mathcal{O}(n (\log \log n)^2)$ on the Erdős-Rényi random graph in a certain sparsely connected regime. We also consider the algorithmic question of computing an optimal strategy, and prove a dichotomy in efficiency between computing strategies for hitting and cover times.

We analyse the following random walk process inspired by the power-of-two-choice paradigm: starting from a given vertex, at each step, unlike the simple random walk (SRW) that always moves to a randomly chosen neighbour, we have the choice between two uniformly and independently chosen neighbours. We call this process the choice random walk (CRW).

We first prove that for any graph, there is a strategy for the CRW that visits any given vertex in expected time $\mathcal{O}(|E|)$. Then we introduce a general tool that quantifies by how much the probability of a rare event in the simple random walk can be boosted under a suitable CRW strategy. We believe this result to be of independent interest, and apply it here to derive an almost optimal $\mathcal{O}(n\log\log n)$ bound for the cover time of bounded-degree expanders. This tool also applies to so-called biased walks, and allows us to make progress towards a conjecture of Azar et al. [STOC 1992]. Finally, we prove the following dichotomy: computing an optimal strategy to minimise the hitting time of a vertex takes polynomial time, whereas computing one to minimise the cover time is $\mathsf{NP}$-hard.

We study two random processes on an $n$-vertex graph inspired by the internal diffusion limited aggregation (IDLA) model. In both processes $n$ particles start from an arbitrary but fixed origin. Each particle performs a simple random walk until first encountering an unoccupied vertex, and at which point the vertex becomes occupied and the random walk terminates. In one of the processes, called Sequential-IDLA, only one particle moves until settling and only then does the next particle start whereas in the second process, called Parallel-IDLA, all unsettled particles move simultaneously. Our main goal is to analyze the so-called dispersion time of these processes, which is the maximum number of steps performed by any of the $n$ particles.

We prove a bound on the effective resistance $R(x,y)$ between two vertices $x,y$ of a connected graph which contains a suitably well-connected sub-graph. We apply this bound, in tandem with a simple lower bound, to the Erdős-Rényi random graph $\mathcal{G}\left(n,p\right)$ with $np=\Omega(\log n)$, proving that $R(x,y)$ concentrates around $1/d(x) + 1/d(y)$, that is, the sum of reciprocal degrees. We also prove expectation and concentration results for the random walk hitting times, Kirchoff index, cover cost, and the random target time (Kemeny's constant) on $\mathcal{G}\left(n,p\right)$ in the sparsely connected regime $\log n + \log\log \log n \leq np < n^{1/10}$.

In this short note we shall consider the upper tail $\mathbb{P}\left(bin(n,p) \geq k\right)$ for the Binomial distribution $bin(n,p)$ when $np \rightarrow 0$ and $k> np$. We derive a simple expression for $\mathbb{P}\left(bin(n,p) \geq k\right)$ which shows that aysmtotically the Chernoff bound overestimates this probability by a multiplicative factor of $\sqrt{2\pi k} $ .

Choice and Bias for Random Walks

Multiple Random Walks on Graphs: Mixing Few to Cover Many