Abbas Mehrabian’s research projects
2023–2025 Discovering novel mathematical constructions using large language models (AlphaEvolve) (paper)
We developed an artificial intelligence agent, called AlphaEvolve, to discover novel mathematical constructions and design new algorithms. AlphaEvolve developed a search algorithm that found a procedure to multiply two \(4 \times 4\) complex-valued matrices using \(48\) scalar multiplications, offering the first improvement, after 56 years, over Strassen's algorithm in this setting. AlphaEvolve also found a kissing configuration of \(593\) spheres in dimension \(11\), improving the best known lower bound on the kissing number in this dimension.
AlphaEvolve is an evolutionary coding agent that enhances the capabilities of language models on challenging tasks such as tackling open scientific problems or optimizing critical pieces of computational infrastructure. AlphaEvolve orchestrates an autonomous pipeline of language models whose task is to improve an algorithm by making direct changes to its code. Using an evolutionary approach and continuously receiving feedback from one or more evaluators, AlphaEvolve iteratively improves the algorithm, potentially leading to new scientific and practical discoveries.
We demonstrated the broad applicability of this approach by applying it to a number of important computational problems. When applied to optimizing critical components of large-scale computational stacks at Google, AlphaEvolve developed a more efficient scheduling algorithm for data centers, found a simplification in the circuit design of hardware accelerators, and accelerated the training of the LLM underpinning AlphaEvolve itself. Furthermore, AlphaEvolve discovered novel, provably correct algorithms that surpass state-of-the-art solutions on a spectrum of problems in mathematics and computer science, significantly expanding the scope of prior automated discovery methods.
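At its core, the loop can be pictured with a short sketch. The names below (propose_edit, evaluate) are hypothetical placeholders for the LLM-backed code editor and the automated evaluators; this is a minimal illustration of an evolutionary coding loop, not AlphaEvolve's actual interfaces.

```python
import random

# Minimal sketch of an evolutionary coding loop in the spirit described above.
# `propose_edit` and `evaluate` are hypothetical callables (an LLM-backed code
# editor and an automated evaluator); they are not AlphaEvolve's actual API.

def evolve(initial_program, propose_edit, evaluate,
           generations=100, population_size=20):
    """Iteratively improve a program via LLM-proposed edits and evaluator feedback."""
    population = [(evaluate(initial_program), initial_program)]
    for _ in range(generations):
        # Keep only the best programs found so far (truncation selection).
        population.sort(key=lambda pair: pair[0], reverse=True)
        population = population[:population_size]
        # Sample a parent, biased towards higher-scoring programs.
        _, parent = random.choice(population[: max(1, len(population) // 2)])
        # The language model makes a direct change to the parent's code ...
        child = propose_edit(parent)
        # ... and one or more automated evaluators score the result.
        population.append((evaluate(child), child))
    return max(population, key=lambda pair: pair[0])
```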
2021–2023 Finding new extremal graphs using AI (paper)
Let $f(n)$ denote the maximum number of edges possible in an $n$-node graph without a 3-cycle or a 4-cycle. In 1975, Erdős conjectured that \(\lim_{n\to\infty} \frac{f(n)}{n\sqrt{n}} = \frac{1}{2\sqrt 2}\). This conjecture has remained open, but several lower and upper bounds have been obtained for $f(n)$ for specific values of $n$.
In this work, we focus on improving the lower bounds for $f(n)$. We formulate this as a sequential decision-making problem and compare AlphaZero, a neural network-guided tree search, with tabu search, a heuristic local search method. Using either method, by introducing a curriculum (jump-starting the search for larger graphs using good graphs found at smaller sizes) we improve the state-of-the-art lower bounds for several sizes. We also propose a flexible graph-generation environment and a permutation-invariant network architecture for learning to search in the space of graphs.
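To make the search problem concrete, here is a toy local-search sketch for the 3-cycle- and 4-cycle-free edge-maximization problem, with a curriculum that seeds the search at each size with the graph found at the previous size. This is only a simplified illustration of the search-plus-curriculum idea; the paper's AlphaZero agent, graph-generation environment, and tabu search are substantially more involved.

```python
import random

# Toy local search for dense graphs with no 3- or 4-cycles (girth >= 5).
# Simplified sketch only; not the paper's agent or its exact tabu search.

def creates_short_cycle(adj, u, v):
    """Would adding edge {u, v} create a cycle of length 3 or 4?"""
    if adj[u] & adj[v]:                        # common neighbour -> triangle
        return True
    for w in adj[v]:                           # u - v - w - x - u would be a 4-cycle
        if adj[u] & (adj[w] - {v}):
            return True
    return False

def local_search(n, adj, steps=20000):
    """Greedy edge additions plus occasional random removals when stuck."""
    for _ in range(steps):
        u, v = random.sample(range(n), 2)
        if v not in adj[u] and not creates_short_cycle(adj, u, v):
            adj[u].add(v); adj[v].add(u)
        elif random.random() < 0.01 and adj[u]:      # small perturbation
            w = random.choice(tuple(adj[u]))
            adj[u].discard(w); adj[w].discard(u)
    return adj

def curriculum(n_max):
    """Jump-start the search at size n from the good graph found at size n - 1."""
    adj = {0: set()}
    for n in range(2, n_max + 1):
        adj[n - 1] = set()                     # add a fresh isolated vertex
        adj = local_search(n, adj)
    return adj

best = curriculum(20)
print(sum(len(s) for s in best.values()) // 2, "edges on", len(best), "vertices")
```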
2019–2020 New algorithms for batched bandit problems (paper)
A bandit problem is a sequential game between a player (learner) and an environment. In each round, the player first chooses an action from a set of actions and then the environment reveals a reward. The player’s goal is to maximize her total accumulated reward. The standard performance measure for a bandit policy is its regret, defined as the difference between the total expected reward collected by the optimal policy and the total expected reward collected by the policy. In the batched version of this problem, the player commits to a sequence of actions and observes the rewards only after all actions in that sequence are played. The player may use at most a given number of batches, and the total number of actions played across all batches must equal a given number.
With Hossein Esfandiari, Amin Karbasi, and Vahab Mirrokni we presented simple and efficient algorithms for batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We proved bounds for their expected regrets that improve over the best known regret bounds for any number of batches. In particular, our algorithms in both settings achieve the optimal expected regrets by using only a logarithmic number of batches. We also studied the batched adversarial multi-armed bandit problem for the first time and found the optimal regret, up to logarithmic factors, of any algorithm with predetermined batch sizes (the paper).
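The batched constraint can be made concrete with a small sketch: a generic batched arm-elimination strategy that commits to each batch in advance and only then observes its rewards. This is an illustrative scheme in the spirit of the setting above, not the specific algorithms or batch schedules analyzed in the paper.

```python
import math, random

# Schematic batched elimination for stochastic multi-armed bandits: commit to a
# whole batch of pulls, observe all rewards only at its end, then eliminate arms
# that look suboptimal. Generic illustration, not the paper's algorithm.

def batched_elimination(pull, n_arms, horizon, n_batches):
    active = list(range(n_arms))
    sums, counts = [0.0] * n_arms, [0] * n_arms
    spent = 0
    for _ in range(n_batches):
        budget = min(horizon - spent, math.ceil(horizon / n_batches))
        # Commit to the whole batch before seeing any of its rewards.
        batch = [active[i % len(active)] for i in range(budget)]
        rewards = [pull(a) for a in batch]        # revealed only after the batch
        for a, r in zip(batch, rewards):
            sums[a] += r
            counts[a] += 1
        spent += budget
        if any(counts[a] == 0 for a in active):
            continue
        # Eliminate arms whose upper confidence bound falls below the leader's lower bound.
        means = {a: sums[a] / counts[a] for a in active}
        radius = {a: math.sqrt(2 * math.log(horizon) / counts[a]) for a in active}
        best = max(means[a] - radius[a] for a in active)
        active = [a for a in active if means[a] + radius[a] >= best]
    return max(active, key=lambda a: sums[a] / max(1, counts[a]))

# Example: Bernoulli arms with unknown means, 4 batches in total.
means = [0.2, 0.5, 0.45, 0.7]
arm = batched_elimination(lambda a: float(random.random() < means[a]),
                          n_arms=4, horizon=4000, n_batches=4)
print("recommended arm:", arm)
```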
2019 A new randomized algorithm for bandit problems (paper)
Two standard algorithms for bandit problems are upper-confidence-bound (UCB) and Thompson sampling (TS) algorithms. In general, UCB has better theoretical guarantees, while TS has better empirical performance. With Audrey Durand, Branislav Kveton, and Sharan Vaswani we proposed RandUCB, a bandit strategy that uses theoretically derived confidence intervals similar to UCB algorithms, but, akin to Thompson sampling, uses randomization to trade off exploration and exploitation. In the $K$-armed bandit setting, we show RandUCB achieves the minimax-optimal $\widetilde{O}(\sqrt{K T})$ regret after $T$ rounds. For linear bandits, we prove RandUCB achieves the minimax-optimal $\widetilde{O}(d \sqrt{T})$ regret even in the case of infinite arms, where $d$ denotes the dimension. We demonstrate the practical effectiveness of RandUCB with experiments in both the multi-armed and linear bandits. Our results illustrate that RandUCB matches the empirical performance of TS while obtaining the theoretically optimal regret bounds of UCB algorithms, thus achieving the best of both worlds (the paper).
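Here is a rough sketch of the RandUCB idea in the $K$-armed setting: use a UCB-style confidence width, but multiply it by a random coefficient drawn from a fixed discrete distribution instead of always using its maximum value. The sampling distribution and constants below are illustrative placeholders, not the tuned choices analyzed in the paper.

```python
import math, random

# Sketch of the RandUCB idea for K-armed bandits: UCB-style confidence widths,
# scaled by a random multiplier drawn from a fixed discrete grid.
# Constants and the grid are illustrative, not the paper's tuned choices.

def randucb(pull, n_arms, horizon, grid_size=20):
    grid = [i / (grid_size - 1) for i in range(grid_size)]   # support of the multiplier
    sums, counts = [0.0] * n_arms, [0] * n_arms
    for t in range(horizon):
        if t < n_arms:
            arm = t                                          # pull each arm once
        else:
            z = random.choice(grid)                          # shared random multiplier
            scores = [sums[a] / counts[a]
                      + z * math.sqrt(2 * math.log(horizon) / counts[a])
                      for a in range(n_arms)]
            arm = max(range(n_arms), key=lambda a: scores[a])
        sums[arm] += pull(arm)
        counts[arm] += 1
    return max(range(n_arms), key=lambda a: sums[a] / counts[a])

# Example with Bernoulli rewards.
means = [0.3, 0.6, 0.55]
print("best arm estimate:",
      randucb(lambda a: float(random.random() < means[a]), n_arms=3, horizon=5000))
```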
2018–2019 New algorithms for homogeneous and heterogeneous multiplayer bandits (paper 1, paper 2)
Consider a multiplayer stochastic multi-armed bandit game in which the players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero rewards.
-
Two feedback models can be considered: one in which the players can observe whether a collision has occurred, and a more difficult one in which no collision information is available. With Gabor Lugosi, we gave the first theoretical guarantees for the second model: an algorithm with logarithmic regret, and an algorithm with square-root-type regret that does not depend on the gaps between the means. For the first model, we gave the first square-root regret bounds that do not depend on the gaps. Building on these ideas, we also gave an algorithm for quickly reaching approximate Nash equilibria in stochastic anti-coordination games (the paper).
-
Next consider the setting where each arm has a different mean for each player. In the model in which the players can observe whether a collision has occurred, an algorithm with regret $O((\log T)^{2+\kappa})$ for any constant $\kappa$ was presented in this paper, whose authors left the existence of an algorithm with $O(\log T)$ regret as an open question ($T$ is the number of rounds). With Etienne Boursier, Emilie Kaufmann and Vianney Perchet, we provided an affirmative answer to this question in the case when there is a unique optimal assignment of players to arms. For the general case, we presented an algorithm with expected regret $O((\log T)^{1+\kappa})$, for any constant $\kappa$ (the paper).
2017–2019 Density estimation, or sample complexity of distribution learning (paper 1, paper 2, a tutorial covering the results of papers 1 and 2, paper 3)
Given samples from an unknown probability distribution with density $f$, we want to output a density function $g$ such that with high probability $\|g-f\|_1\leq\varepsilon$. Given that $f$ belongs to (or can be approximated by) a class of distributions $\mathcal F$, what is the minimum number of samples required?
-
Suppose we know $f$ is a mixture of $k$ Gaussians in $d$ dimensions, and we would like $g$ to have a similar structure. With Hassan Ashtiani, Shai Ben-David, Nick Harvey, Chris Liaw and Yaniv Plan, we proved that $\widetilde \Theta (kd^2/\varepsilon^2)$ samples are necessary and sufficient for this task, improving both the known upper and lower bounds. Moreover, for mixtures of axis-aligned Gaussians, $\widetilde \Theta (kd/\varepsilon^2)$ samples are necessary and sufficient. To prove these bounds, we developed a novel distribution learning technique based on compression schemes (the paper).
With Luc Devroye and Tommy Reddad, we removed the logarithmic factors in the lower bound and showed that $\Omega(kd^2/\varepsilon^2)$ samples are necessary for this problem (the paper).
-
Suppose the sample complexity for learning class $\mathcal F$ is $m(\varepsilon)$. Under mild technical conditions, with Hassan Ashtiani and Shai Ben-David, we showed the sample complexity for learning $k$-mixtures of distributions in $\mathcal F$ is bounded by $O( m(\varepsilon) k \log (k) / \varepsilon^2)$ (the published paper had a bug which was fixed in the arXiv version).
-
Let $G$ be an undirected graph with $m$ edges and $d$ vertices, and let $\mathcal F$ be the class of Ising models defined on $G$. With Luc Devroye and Tommy Reddad, we showed the optimal sample complexity for learning $\mathcal F$ is $\Theta((m+d)/\varepsilon^2)$ (the paper).
2018 Total variation distance between high-dimensional Gaussians (paper)
What is the total variation distance between two high-dimensional Gaussians $N(\mu_1,\Sigma_1)$ and $N(\mu_2,\Sigma_2)$? If $\mu_1=\mu_2$, we showed the answer is $\Theta(\min\{1,\|\Sigma_1^{-1/2}\Sigma_2\Sigma_1^{-1/2}-I\|_F\})$, where $\|A\|_F$ denotes the Frobenius norm of $A$. We also found the value of the total variation distance, up to constant factors, when $\mu_1\neq\mu_2$. Joint work with Luc Devroye and Tommy Reddad.
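A quick numerical illustration of the equal-means formula, assuming NumPy: compare a Monte Carlo estimate of the total variation distance with the Frobenius-norm quantity above. The theorem only pins these down up to constant factors, so the script merely shows the two quantities track each other; the covariances used are arbitrary examples.

```python
import numpy as np

# Numerical illustration of the equal-means case: compare a Monte Carlo estimate
# of TV(N(0, S1), N(0, S2)) with min{1, ||S1^{-1/2} S2 S1^{-1/2} - I||_F}.
# The theorem gives agreement only up to constant factors.

rng = np.random.default_rng(0)
d = 5
A = rng.normal(size=(d, d))
S1 = A @ A.T + d * np.eye(d)          # a random positive-definite covariance
S2 = S1 + 0.3 * np.eye(d)             # a perturbation of it

# Frobenius-norm quantity from the theorem.
evals, evecs = np.linalg.eigh(S1)
S1_inv_half = evecs @ np.diag(evals ** -0.5) @ evecs.T
M = S1_inv_half @ S2 @ S1_inv_half - np.eye(d)
proxy = min(1.0, np.linalg.norm(M, "fro"))

# Monte Carlo estimate, using TV(P, Q) = E_{x ~ P}[ max(0, 1 - q(x)/p(x)) ].
def log_density(X, S):
    sign, logdet = np.linalg.slogdet(S)
    quad = np.einsum("ij,ij->i", X @ np.linalg.inv(S), X)
    return -0.5 * (quad + logdet + d * np.log(2 * np.pi))

xs = rng.multivariate_normal(np.zeros(d), S1, size=100_000)
ratios = np.exp(log_density(xs, S2) - log_density(xs, S1))
tv_estimate = float(np.mean(np.maximum(0.0, 1.0 - ratios)))

print(f"Frobenius quantity: {proxy:.3f}   Monte Carlo TV: {tv_estimate:.3f}")
```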
2016–2017 VC-dimension of neural networks (paper)
Deep neural networks underlie many of the recent breakthroughs of applied machine learning, particularly in image and speech recognition. These successes motivate a renewed study of these networks’ theoretical properties. Classification is one of the learning tasks in which deep neural networks have been particularly successful, e.g., for image recognition. A natural foundational question that arises is: what are the theoretical limits on the classification power of these networks? The established way to formalize this question is by considering the VC-dimension, as it is well known that this asymptotically determines the sample complexity of PAC learning with such classifiers.
With Peter L. Bartlett, Nick Harvey and Chris Liaw, we proved new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of parameters and $L$ be the number of layers, we proved that the VC-dimension is $O(W L \log(W))$ and $\Omega(WL\log(W/L)).$ This improves both the previously known upper bounds and lower bounds. These bounds generalize to arbitrary piecewise linear activation functions and also hold for the pseudodimension of these networks. In terms of the number $U$ of non-linear units, we proved a tight bound $\Theta(W U)$ on the VC-dimension and pseudodimension, which holds for arbitrary piecewise polynomial activation functions (the paper).
2016–2017 Growing a tree in a graph (paper)
Given a graph and a starting vertex, we iteratively build a random spanning tree as follows: initially the tree consists of just the starting vertex. In each round, we select a uniformly random edge with exactly one endpoint in the tree and add that edge to the tree. We give bounds for the height of the resulting tree in terms of parameters of the graph (diameter, maximum degree, genus, degeneracy, edge expansion, etc.).
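A minimal simulation sketch of this process (the graph is passed as an adjacency-list dictionary; the grid example is illustrative):

```python
import random

# Simulation of the tree-growing process described above: starting from a given
# vertex, repeatedly pick a uniformly random edge with exactly one endpoint in
# the current tree and add it. Returns the height of the resulting spanning tree.

def grow_tree_height(adj, root):
    """adj: dict mapping each vertex to a list of neighbours; root: start vertex."""
    depth = {root: 0}
    boundary = [(root, v) for v in adj[root]]    # edges with one endpoint in the tree
    while boundary:
        u, v = boundary.pop(random.randrange(len(boundary)))
        if v in depth:                           # edge became internal; skip it
            continue
        depth[v] = depth[u] + 1
        boundary.extend((v, w) for w in adj[v] if w not in depth)
    return max(depth.values())

# Example: grow a tree in the 30 x 30 grid graph, starting from a corner.
n = 30
adj = {(i, j): [(i + di, j + dj) for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= i + di < n and 0 <= j + dj < n]
       for i in range(n) for j in range(n)}
print("height:", grow_tree_height(adj, (0, 0)))
```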
2013–2017 Randomized rumour spreading (paper 1, paper 2, paper 3, a survey in Persian)
The synchronous push&pull protocol is a round-robin rumour spreading protocol, defined as follows. Suppose that one node in a network is aware of a piece of information, the rumour. The protocol proceeds in rounds. In each round, every informed node contacts a random neighbour and sends it the rumour (pushes the rumour), and every uninformed node contacts a random neighbour and learns the rumour if that neighbour knows it (pulls the rumour). This model was defined in this 1987 paper and popularized by this 2000 paper.
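For concreteness, a minimal simulation sketch of the synchronous protocol (the complete-graph example is illustrative):

```python
import random

# Simulation of the synchronous push&pull protocol described above.
# Returns the number of rounds until every vertex is informed.

def push_pull_rounds(adj, source):
    """adj: dict vertex -> list of neighbours; source: initially informed vertex."""
    informed = {source}
    rounds = 0
    while len(informed) < len(adj):
        newly = set()
        for v in adj:
            u = random.choice(adj[v])          # every vertex calls a random neighbour
            if v in informed and u not in informed:
                newly.add(u)                   # push
            elif v not in informed and u in informed:
                newly.add(v)                   # pull
        informed |= newly
        rounds += 1
    return rounds

# Example: on the complete graph the rumour spreads in roughly log n rounds.
n = 1000
adj = {v: [u for u in range(n) if u != v] for v in range(n)}
print("rounds:", push_pull_rounds(adj, 0))
```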
-
Random $k$-trees are a class of low-diameter scale-free random graphs with large clustering coefficients, built as follows: initially we have a $k$-clique. In every step a new vertex is born, a random $k$-clique of the current graph is chosen, and the new vertex is joined to all vertices of that $k$-clique. With Ali Pourmiri, I studied the behaviour of the synchronous push&pull protocol on random $k$-trees and proved that for each fixed $k>1$, if initially a random vertex is aware of the rumour, then with high probability, after a polylogarithmic (in the number of vertices) number of rounds, the rumour propagates to almost all vertices. We proved a similar result for a closely related class of graphs, random $k$-Apollonian networks. On the other hand, we proved that if we want to inform each and every vertex, with high probability we need at least polynomially many rounds (the paper).
-
The asynchronous variant of the push&pull protocol is defined as follows. Independent Poisson clocks of rate 1 are associated with the vertices of graph $G$, one to each vertex. Initially, one vertex of $G$ knows the rumour. Whenever the clock of a vertex $x$ rings, it calls a random neighbour $y$: if $x$ knows the rumour and $y$ does not, then $x$ tells $y$ the rumour, and if $x$ does not know the rumour and $y$ knows it, $y$ tells $x$ the rumour. The guaranteed spread time is the smallest time such that, with high probability, all vertices know the rumour by that time. With Huseyin Acan, Andrea Collevecchio and Nick Wormald, we found the worst and best possible spread times, up to constant factors, for both the synchronous and asynchronous variants. We also proved the first theoretical relationships between the guaranteed spread times in the two versions. Firstly, in all graphs the guaranteed spread time in the asynchronous version is within an $O(\log n)$ factor of that in the synchronous version, and this is tight. Next, we found examples of graphs whose asynchronous spread times are logarithmic but whose synchronous spread times are polynomially large. Finally, we showed that for any graph, the ratio of the synchronous spread time to the asynchronous spread time is $O(n^{2/3})$ (the paper).
-
With Omer Angel and Yuval Peres, we showed for any graph that the ratio of the guaranteed synchronous spread time to the guaranteed asynchronous spread time is $O((n\log n)^{1/3})$, which is tight up to logarithmic factors (the paper).
2015–2016 Load balancing via randomized local search (paper)
Consider $n$ identical resources and $m$ identical users. Each user must be assigned to exactly one resource. We analyze a distributed randomized local search protocol introduced in 2004 by Paul W. Goldberg, whose goal is to assign users to resources so that the load is perfectly balanced. In this protocol, users are activated by independent exponential clocks of rate 1. On activation, a user samples a random resource and compares that resource’s load with the load of its current resource. If the sampled resource's load is lower, the user switches to it. It is shown in this paper that the expected balancing time is $O(\ln^2 n+\ln(n)\cdot n^2/m)$. With Petra Berenbrink, Peter Kling, and Chris Liaw, we improved this to $O(\ln(n)+n^2/m)$, which is tight.
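A small discrete-time simulation sketch of this protocol (activating a uniformly random user at each step, the usual way to simulate the rate-1 exponential clocks); the parameters below are illustrative and chosen so that $m$ is divisible by $n$:

```python
import random

# Discrete-time simulation of the randomized local-search protocol described above:
# at each step a uniformly random user is activated, samples a random resource,
# and switches if that resource is strictly less loaded than its current one.

def balance(n_resources, n_users, seed=0):
    rng = random.Random(seed)
    assignment = [rng.randrange(n_resources) for _ in range(n_users)]
    load = [0] * n_resources
    for r in assignment:
        load[r] += 1
    # With m divisible by n, "perfectly balanced" means every load equals m/n.
    target = n_users // n_resources
    steps = 0
    while max(load) > target:
        steps += 1
        user = rng.randrange(n_users)              # the activated user
        old, new = assignment[user], rng.randrange(n_resources)
        if load[new] < load[old]:                  # switch only to a less loaded resource
            assignment[user] = new
            load[old] -= 1
            load[new] += 1
    return steps

print("activations until balanced:", balance(n_resources=50, n_users=5000))
```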
2014–2016 Properties of the spatial preferential attachment model (paper)
Rumour spreading is a protocol that models the spread of information through a network via user-to-user interaction. The spread time of a graph is the number of rounds needed to spread the rumour to the entire graph. The Spatial Preferential Attachment (SPA) model (introduced here) is a random graph model for complex networks: vertices are placed in a metric space, and the link probability depends on the metric distance between vertices and on their degrees. With Jeannette Janssen, we showed that the SPA model typically produces graphs that have a small effective diameter, namely $O(\log^2 n)$, while rumour spreading is relatively slow, namely polynomial in $n$.
2015–2016 Geometric bounds for the deviation of random matrices (paper)
Let $A$ be an isotropic, random sub-gaussian $m\times n$ matrix. The Gaussian complexity of a set $T\subseteq \mathbb{R}^n$ is defined as $\mathbb E \sup_{x \in T} |\langle g , x \rangle |$, where $g$ has i.i.d. standard normal entries. With Chris Liaw, Yaniv Plan, and Roman Vershynin, we showed that for any bounded set $T$, the deviation of $\|Ax\|_2$ around its mean is uniformly bounded by the Gaussian complexity of $T$. We also proved a local version of this theorem, which allows for unbounded sets. These theorems have various applications, in particular, we gave a new result regarding model selection in the constrained linear model.
2013–2014 Diameters of real-world network models (paper 1, paper 2)
-
I have developed a new technique for proving logarithmic upper bounds for the diameters of evolving random graph models, which is based on defining a coupling between random graphs and variants of random recursive trees. The advantage of the technique is two-fold: it is quite simple and provides short proofs, and it is applicable to a broad variety of models, including those incorporating preferential attachment. Using this I proved, for the first time, logarithmic upper bounds for the diameters of the following well known models: the forest fire model, the copying model, the PageRank-based selection model, the Aiello-Chung-Lu models, the generalized linear preference model, directed scale-free graphs, and the Cooper-Frieze model (the paper).
-
The random-surfer Webgraph model is defined as follows. Let $d$ be a positive integer and let $p \in (0,1]$. Generate a random directed rooted $n$-vertex multigraph, with all vertices having out-degree $d$, as follows. Start with a single vertex $v_0$, the root, with $d$ self-loops. At each subsequent step $s=1,2,\dots,n-1$, a new vertex $v_s$ appears and $d$ new edges are created from it to some of the present vertices, by doing the following probabilistic procedure $d$ times, independently: choose a vertex $u$ uniformly at random from $\{v_0,v_1,\dots,v_{s-1}\}$, and a fresh geometric random variable $X$ with parameter $p$; perform a simple random walk of length $X$ starting from $u$, and join $v_s$ to the last vertex of the walk.
This model is equivalent to the PageRank-based selection model. With Nick Wormald, we proved the diameter of both models is at most $8 e^p \log n / p$. This is the first logarithmic upper bound for these models, and verifies that they are small-world.
In the special case $d=1$, we obtain a directed tree with some self-loops at the root. We proved sharp results for the underlying undirected tree. When $p \geq 0.21$ we determined the asymptotic value of its height and diameter, and when $ p < 0.21 $, we provided logarithmic lower and upper bounds (the paper).
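A minimal simulation sketch of the generative procedure defined above. One modelling detail is an assumption: the geometric variable is taken with support $\{0,1,2,\dots\}$, so that with probability $p$ the walk has length zero and the new vertex links directly to the uniformly chosen vertex.

```python
import random

# Simulation of the random-surfer Webgraph model described above.
# Assumption: the geometric walk length has support {0, 1, 2, ...} with P(0) = p.

def random_surfer_graph(n, d, p, seed=0):
    rng = random.Random(seed)
    out_edges = {0: [0] * d}                     # the root has d self-loops
    for s in range(1, n):
        targets = []
        for _ in range(d):
            u = rng.randrange(s)                 # uniform existing vertex
            steps = 0
            while rng.random() >= p:             # geometric(p) walk length
                steps += 1
            for _ in range(steps):               # follow uniformly random out-links
                u = rng.choice(out_edges[u])
            targets.append(u)
        out_edges[s] = targets
    return out_edges

g = random_surfer_graph(n=10000, d=3, p=0.5)
print("in-degree of the root:", sum(t == 0 for ts in g.values() for t in ts))
```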
2012–2014 Properties of random Apollonian networks (paper 1, paper 2)
-
Start with a triangle. In each step, choose a random bounded face, add a point inside that face, and join it to the three vertices of the face. After $n$ steps, we obtain a random triangulated plane graph with $n+3$ vertices, called a random Apollonian network (RAN). With Ehsan Ebrahimzadeh, Linda Farczadi, Jane Gao, Cristiane Sato, Nick Wormald, and Jonathan Zung, we showed that asymptotically almost surely (a.a.s.) as $n \rightarrow \infty$, every path in a RAN has length $o(n)$, refuting a conjecture of Frieze and Tsourakakis. We also showed that a RAN always has a path of length $(2n-5)^{\log(2)/\log(3)}$ and that the expected length of its longest path is $\Omega\left(n^{0.88}\right)$. Finally, we proved that a.a.s. the diameter of a RAN is asymptotic to $c \ln n$, where $c \approx 1.668$ is the solution of an explicit equation (the paper).
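A minimal simulation sketch of the RAN process, together with a brute-force diameter computation suitable for small $n$:

```python
import random
from collections import deque

# Simulation of the random Apollonian network (RAN) process described above,
# plus a breadth-first-search computation of the diameter (fine for small n).

def random_apollonian_network(n, seed=0):
    rng = random.Random(seed)
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}      # the initial triangle
    faces = [(0, 1, 2)]                          # current bounded triangular faces
    for v in range(3, n + 3):
        i = rng.randrange(len(faces))            # uniformly random bounded face
        a, b, c = faces[i]
        adj[v] = {a, b, c}
        for u in (a, b, c):
            adj[u].add(v)
        faces[i] = (a, b, v)                     # the chosen face is split into three
        faces.append((a, c, v))
        faces.append((b, c, v))
    return adj

def diameter(adj):
    def eccentricity(s):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return max(dist.values())
    return max(eccentricity(v) for v in adj)

adj = random_apollonian_network(1000)
print("diameter:", diameter(adj))
```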
-
With Andrea Collevecchio and Nick Wormald, we proved there exists a constant $a<1$ such that a.a.s. the length of every path in a RAN is smaller than $n^a$, verifying a conjecture of Cooper and Frieze. Using a similar technique, we showed that if $r < d$ are fixed constants, then a.a.s. every $r$-ary subtree of a random $d$-ary recursive tree on $n$ vertices has less than $n^b$ vertices, for some $b=b(d,r)<1$ (the paper).
2011–2013 Unique-path labellings or good edge-labellings (paper 1, paper 2)
-
A unique-path labelling (also known as a good edge-labelling) of a simple graph is a labelling of its edges with real numbers such that, for any ordered pair of vertices $(u,v)$, there is at most one nondecreasing path from $u$ to $v$. Say a graph is good if it admits a good edge-labelling, and bad otherwise. Bode, Farzad, and Theis conjectured that all graphs with large enough girth are good (their paper). I proved that if a graph is good and its maximum degree is within a constant factor of its average degree, then it has at most $n^{1+o(1)}$ edges; since for every fixed girth there are nearly regular graphs with $n^{1+c}$ edges for some constant $c>0$ depending on the girth, this disproves their conjecture (the paper).
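For small graphs, the defining property can be checked by brute force; the sketch below enumerates nondecreasing paths between every ordered pair of vertices (exponential time, for illustration only).

```python
import itertools

# Brute-force check of the unique-path (good edge-labelling) property: for every
# ordered pair (u, v), count simple paths from u to v whose edge labels are
# nondecreasing, and verify there is at most one.

def is_good_labelling(vertices, labels):
    """labels: dict mapping frozenset({u, w}) -> real label of edge {u, w}."""
    adj = {v: [] for v in vertices}
    for e, lab in labels.items():
        u, w = tuple(e)
        adj[u].append((w, lab))
        adj[w].append((u, lab))

    def count_paths(u, v, last, visited):
        if u == v:
            return 1
        total = 0
        for w, lab in adj[u]:
            if lab >= last and w not in visited:
                total += count_paths(w, v, lab, visited | {w})
        return total

    return all(count_paths(u, v, float("-inf"), {u}) <= 1
               for u, v in itertools.permutations(vertices, 2))

# Example on the 4-cycle: alternating labels give a good labelling, constant labels do not.
C4 = [frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]]
print(is_good_labelling(range(4), dict(zip(C4, [1, 2, 1, 2]))))   # True
print(is_good_labelling(range(4), dict(zip(C4, [1, 1, 1, 1]))))   # False
```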
-
Araújo, Cohen, Giroire, and Havet observed that the maximum number of edges of a good graph on $n$ vertices is $\Omega(n \log(n))$ and $O(n\sqrt{n})$, and asked for more precise bounds (their paper). With Dieter Mitsche and Pawel Pralat, we solved this problem by proving that any good $n$-vertex graph has at most $n \log_2 (n)$ edges, and that for every $n$ there exists a good $n$-vertex graph with $n \log_2 (n) - O(n)$ edges (the paper).
2011–2012 Stretch factor of a randomly embedded random graph (paper)
Consider a random graph $G(n,p)$ whose vertex set has been randomly embedded in the unit square and whose edges are given weights equal to the Euclidean distance between their endpoints. Then each pair $u,v$ of vertices has a distance in the weighted graph and a Euclidean distance. The stretch factor of the embedded graph is defined as the maximum ratio of these two distances over all pairs $u,v$. With Nick Wormald, we gave upper and lower bounds on the stretch factor (holding asymptotically almost surely), and showed that for $p$ bounded away from 0 and 1, these bounds are best possible in a certain sense. Our results imply that the stretch factor is bounded with probability tending to $1$ if and only if $n(1-p)\rightarrow 0$, answering a question of Joseph O’Rourke.
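A small simulation sketch that generates a randomly embedded $G(n,p)$ and computes its stretch factor exactly, by running Dijkstra from every vertex (the parameters are illustrative):

```python
import heapq, math, random

# Embed G(n, p) uniformly at random in the unit square, weight each edge by the
# Euclidean distance between its endpoints, and compute the stretch factor:
# the maximum, over connected pairs, of graph distance divided by Euclidean distance.

def stretch_factor(n, p, seed=0):
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    dist = lambda u, v: math.dist(pts[u], pts[v])
    adj = {u: [] for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append((v, dist(u, v)))
                adj[v].append((u, dist(u, v)))

    stretch = 0.0
    for s in range(n):                           # Dijkstra from every source
        d = {s: 0.0}
        heap = [(0.0, s)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > d.get(u, math.inf):
                continue
            for v, w in adj[u]:
                if du + w < d.get(v, math.inf):
                    d[v] = du + w
                    heapq.heappush(heap, (d[v], v))
        stretch = max(stretch,
                      max((d[v] / dist(s, v) for v in d if v != s), default=0.0))
    return stretch

print("stretch factor:", stretch_factor(n=200, p=0.5))
```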
2010–2012 The complexity of a DAG partitioning problem (paper)
Consider the following DAG partitioning problem: given a directed acyclic graph with arc weights, delete a set of arcs of minimum total weight so that each of the resulting connected components has exactly one sink. With Soroush Alamdari, we proved that the problem is hard to approximate, namely, there is no $n^{1-\varepsilon}$-approximation algorithm for any fixed $\varepsilon>0$. We also gave a polynomial-time algorithm for solving the DAG partitioning problem on graphs with bounded pathwidth.
2010–2011 Network creation games (paper)
Consider a network creation game, in which each player (vertex) has a fixed budget to establish links to other players; each link has unit price and each agent tries to minimize its cost, which is either its local diameter or its total distance to other players in the (undirected) underlying graph of the created network. This is similar to a directed version of this basic network creation game. With Shayan Ehsani, MohammadAmin Fazli, Sina Sadeghian Sadeghabad, MohammadAli Safari, Morteza Saghafian and Saber Shokatfadaee, we proved some bounds for the price of anarchy of this game, and in particular, we showed a non-monotone property of this game, similar to Braess’s paradox in network routing games.
2009–2012 The cops and robber game (paper 1, paper 2.1, paper 2.2, paper 3.1, paper 3.2, paper 3.3)
-
My BSc thesis was on the Cops and Robber game, a round-robin vertex-to-vertex pursuit game played on a graph. In my thesis, written in Persian in 2009 under the supervision of MohammadAli Safari, I surveyed some results on this game. I also proved a few results about the capture time of grids (the capture time of a graph, introduced here, is the minimum number of steps needed to capture the robber in that graph) (the paper).
-
In my Master’s I continued working on this game. I especially focused on the variant where the robber is faster than the cops (introduced here). With Noga Alon, we proved that if the robber has speed $s$, then the cop number of a connected $n$-vertex graph can be as large as $\Omega\left(n^{s/(s+1)}\right)$ (the paper). I had previously proved this result for $s=2, 4$, and had conjectured that this bound is asymptotically tight, i.e., that $O\left(n^{s/(s+1)}\right)$ cops can always capture the robber (the paper). This generalizes Meyniel’s well known conjecture (see this survey), which states the same thing for $s=1$.
-
Next, I worked on the variant in which the robber has infinite speed. With Noga Alon, we showed that the cop number and treewidth of planar graphs are of the same order, and determined the cop number of random graphs (the paper). Then I characterized graphs with cop number one and gave an algorithm for their detection, and proved some bounds for the cop number of expander graphs (the paper). Finally, I showed that any interval graph has cop number $O(\sqrt n)$ but there are chordal graphs with cop number $\Omega(n / \log n)$ (the paper).
2006–2008 Flows in graphs (paper 1, paper 2)
An unoriented flow in a graph (introduced here) is an assignment of real numbers to the edges such that, for each vertex, the sum of the values of all edges incident with it is zero. This is equivalent to a flow in a bidirected graph all of whose edges are extraverted. A nowhere-zero unoriented $k$-flow is an unoriented flow with values from the set $\{\pm 1, \dots, \pm (k-1)\}$. With Saieed Akbari, Aliakbar Daemi, Omid Hatami, and Adel Javanmard, we worked on the problem of finding the minimum $k$ such that a graph has a nowhere-zero unoriented $k$-flow. We proved several results for regular graphs (the paper) and for Hamiltonian graphs (the paper).
2007–2008 Pacman intelligent controller (technical report)
I worked with Arian Khosravi and Ali Dehghan, under the supervision of Ramin Halavati, to implement a software controller for Ms. Pac-Man, a simple real-time game. We explored and implemented a variety of approaches and finally came up with a heuristic-based program, which won first place in the IEEE Congress on Evolutionary Computation (CEC 2007) competition and remained the best known agent for the game for two years. See it in action!
2007 Complexity of graph homomorphisms
I studied the cylindrical construction, a concept introduced by Amir Daneshgar for proving hardness results for the graph homomorphism problem. I made a few small steps towards Bang-Jensen and Hell’s conjecture on the dichotomy of the digraph homomorphism problem, but in the meantime the full conjecture was proved in this paper.