Abbas Mehrabian’s research projects

2019–2020 New algorithms for batched bandit problems (paper)

A bandit problem is a sequential game between a player (learner) and an environment. In each round, first the player chooses an action from a set of actions and then the environment presents a reward. The player’s goal is to maximize her total accumulated reward. The standard performance measure for a bandit policy is its regret, defined as the difference between the total expected reward collected by the policy and the total expected reward collected by the optimal policy. In the batched version of this problem, the player commits to a sequence of actions and observes the rewards after all actions in that sequence are played. The player is subject to using at most a given number of batches, and the total number of actions played by the player must sum to a given number.

With Hossein Esfandiari, Amin Karbasi, and Vahab Mirrokni we presented simple and efficient algorithms for batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We proved bounds for their expected regrets that improve over the best known regret bounds for any number of batches. In particular, our algorithms in both settings achieve the optimal expected regrets by using only a logarithmic number of batches. We also studied the batched adversarial multi-armed bandit problem for the first time and found the optimal regret, up to logarithmic factors, of any algorithm with predetermined batch sizes (the paper).

2019 A new randomized algorithm for bandit problems (paper)

Two standard algorithms for bandit problems are upper-confidence-bound (UCB) and Thompson sampling (TS) algorithms. In general, UCB has better theoretical guarantees, while TS has better empirical performance. With Audrey Durand, Branislav Kveton, and Sharan Vaswani we proposed RandUCB, a bandit strategy that uses theoretically derived confidence intervals similar to UCB algorithms, but, akin to Thompson sampling, uses randomization to trade off exploration and exploitation. In the $K$-armed bandit setting, we show RandUCB achieves the minimax-optimal $\widetilde{O}(\sqrt{K T})$ regret after $T$ rounds. For linear bandits, we prove RandUCB achieves the minimax-optimal $\widetilde{O}(d \sqrt{T})$ regret even in the case of infinite arms, where $d$ denotes the dimension. We demonstrate the practical effectiveness of RandUCB with experiments in both the multi-armed and linear bandits. Our results illustrate that RandUCB matches the empirical performance of TS while obtaining the theoretically optimal regret bounds of UCB algorithms, thus achieving the best of both worlds (the paper).

2018–2019 New algorithms for homogenous and heterogeneous multiplayer bandits (paper 1, paper 2)

Consider a multiplayer stochastic multi-armed bandit game in which the players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero rewards.

2017–2019 Density estimation, or sample complexity of distribution learning (paper 1, paper 2, a tutorial covering the results of papers 1 and 2, paper 3)

Given samples from an unknown probability distribution with density $f$, we want to output a density function $g$ such that with high probability $\|g-f\|_1\leq\varepsilon$. Given that $f$ belongs to (or can be approximated by) a class of distributions $\mathcal F$, what is the minimum number of samples required?

2018 Total variation distance between high-dimensional Gaussians (paper)

What is the total variation distance between two high-dimensional Gaussians $N(\mu_1,\Sigma_1)$ and $N(\mu_2,\Sigma_2)$? If $\mu_1=\mu_2$, we show the answer is $\Theta(\min\{1,\|\Sigma_1^{-1/2}\Sigma_2\Sigma_1^{-1/2}-I\|_F\})$, where $\|A\|_F$ denotes the Frobenius norm of $A$. We also found the value of the total variation distance, up to constant factors, when $\mu_1\neq\mu_2$. Joint work with Luc Devroye and Tommy Reddad.

2016–2017 VC-dimension of neural networks (paper)

Deep neural networks underlie many of the recent breakthroughs of applied machine learning, particularly in image and speech recognition. These successes motivate a renewed study of these networks’ theoretical properties. Classification is one of the learning tasks in which deep neural networks have been particularly successful, e.g., for image recognition. A natural foundational question that arises is: what are the theoretical limits on the classification power of these networks? The established way to formalize this question is by considering the VC-dimension, as it is well known that this asymptotically determines the sample complexity of PAC learning with such classifiers.

With Peter L. Bartlett, Nick Harvey and Chris Liaw, we proved new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of parameters and $L$ be the number of layers, we proved that the VC-dimension is $O(W L \log(W))$ and $\Omega(WL\log(W/L)).$ This improves both the previously known upper bounds and lower bounds. These bounds generalize to arbitrary piecewise linear activation functions and also hold for the pseudodimension of these networks. In terms of the number $U$ of non-linear units, we proved a tight bound $\Theta(W U)$ on the VC-dimension and pseudodimension, which holds for arbitrary piecewise polynomial activation functions (the paper).

2016–2017 Growing a tree in a graph (paper)

Given a graph and a starting vertex, we iteratively build a random spanning tree as follows: initially the tree has just one vertex. in each round, we select a random edge with exactly one endpoint in the tree, and add that edge to the tree. We give bounds for the height of the resulting tree, in terms of parameters of the graph (diameter, maximum degree, genus, degeneracy, edge expansion, etc).

2013–2017 Randomized rumour spreading (paper 1, paper 2, paper 3, a survey in Persian)

The synchronous push&pull protocol is a round-robin rumour spreading protocol, defined as follows. Suppose that one node in a network is aware of a piece of information, the rumour. The protocol proceeds in rounds. In each round, every informed node contacts a random neighbour and sends the rumour to it (pushes the rumour), and every uninformed nodes contacts a random neighbour and gets the rumour if the neighbour possibly knows it (pulls the rumour). This model was defined in this 1987 paper and popularized by this 2000 paper.

2015–2016 Load balancing via randomized local search (paper)

Consider $n$ identical resources and $m$ identical users. Each user must be assigned to exactly one resource. We analyze a distributed randomized local search protocol introduced in 2004 by Paul W. Goldberg, whose goal is to assign users to resources such that the load is perfectly balanced. In this protocol, users are activated by independent exponential clocks of rate 1. On activation, a user samples a random resource and compares that resource’s load with its current load. If the new load is lower, the user switches. It is shown in this paper that the expected balancing time is $O(\ln^2 n+\ln(n)\cdot n^2/m)$. With Petra Berebrink, Peter Kling, and Chris Liaw, we improved this to $O(\ln(n)+n^2/m)$, which is tight.

2014–2016 Properties of the spatial preferential attachment model (paper)

Rumour spreading is a protocol that models the spread of information through a network via user-to-user interaction. The spread time of a graph is the number of rounds needed to spread the rumour to the entire graph. The Spatial Preferred Attachment (SPA) model (introduced here) is a random graph model for complex networks: vertices are placed in a metric space, and the link probability depends on the metric distance between vertices, and on their degree. With Jeannette Janssen we showed that the SPA model typically produces graphs that have a small effective diameter, i.e. $O(\log^2 n)$, while rumour spreading is relatively slow, namely polynomial in $n$.

2015–2016 Geometric bounds for the deviation of random matrices (paper)

Let $A$ be an isotropic, random sub-gaussian $m\times n$ matrix. The Gaussian complexity of a set $T\subseteq \mathbb{R}^n$ is defined as $\mathbb E \sup_{x \in T} |\langle g , x \rangle |$, where $g$ has i.i.d. standard normal entries. With Chris Liaw, Yaniv Plan, and Roman Vershynin, we showed that for any bounded set $T$, the deviation of $\|Ax\|_2$ around its mean is uniformly bounded by the Gaussian complexity of $T$. We also proved a local version of this theorem, which allows for unbounded sets. These theorems have various applications, in particular, we gave a new result regarding model selection in the constrained linear model.

2013–2014 Diameters of real-world network models (paper 1, paper 2)

2012–2014 Properties of random Apollonian networks (paper 1, paper 2)

2011–2013 Unique-path labellings or good edge-labellings (paper 1, paper 2)

2011–2012 Stretch factor of a randomly embedded random graph (paper)

Consider a random graph $G(n,p)$ whose vertex set has been randomly embedded in the unit square and whose edges are given weights equal to the geometric distance between their end vertices. Then each pair $u,v$ of vertices have a distance in the weighted graph and a Euclidean distance. The stretch factor of the embedded graph is defined as the maximum ratio of these two distances over all pairs $u,v$. With Nick Wormald, we gave upper and lower bounds on the stretch factor (holding asymptotically almost surely), and showed that for $p$ bounded away from 0 and 1, these bounds are best possible in a certain sense. Our results imply that the stretch factor is bounded with probability tending to $1$ if and only if $n(1-p)\rightarrow 0$, answering a question of Joseph O’Rourke.

2010–2012 The complexity of a DAG partitioning problem (paper)

Consider the following DAG partitioning problem: given a directed acyclic graph with arc weights, delete a set of arcs of minimum total weight so that each of the resulting connected components has exactly one sink. With Soroush Alamdari, we proved that the problem is hard to approximate, namely, there is no $n^{1-\varepsilon}$-approximation algorithm, for any fixed $\varepsilon>0$. We also gave a polynomial time algorithm for solving the DAG Partitioning problem in graphs with bounded pathwidth.

2010–2011 Network creation games (paper)

Consider a network creation game, in which each player (vertex) has a fixed budget to establish links to other players; each link has unit price and each agent tries to minimize its cost, which is either its local diameter or its total distance to other players in the (undirected) underlying graph of the created network. This is similar to a directed version of this basic network creation game. With Shayan Ehsani, MohammadAmin Fazli, Sina Sadeghian Sadeghabad, MohammadAli Safari, Morteza Saghafian and Saber Shokatfadaee, we proved some bounds for the price of anarchy of this game, and in particular, we showed a non-monotone property of this game, similar to Braess’s paradox in network routing games.

2009–2012 The cops and robber game (paper 1, paper 2.1, paper 2.2, paper 3.1, paper 3.2, paper 3.3)

2006–2008 Flows in graphs (paper 1, paper 2)

An unoriented flow in a graph (introduced here) is an assignment of real numbers to the edges, such that the sum of the values of all edges incident with each vertex is zero. This is equivalent to a flow in a bidirected graph all of whose edges are extraverted. A nowhere-zero unoriented $k$-flow is an unoriented flow with values from the set $\{\pm 1,... ,\pm (k-1)\}$. Together with Saieed Akbari, Aliakbar Daemi, Omid Hatami and Adel Javanmard, we worked on the problem of finding the minimum $k$ such that a graph has an unoriented $k$-flow. We proved several results for regular graphs (the paper) and Hamiltonian graphs (the paper).

2007–2008 Pacman intelligent controller (technical report)

I worked along with Arian Khosravi and Ali Dehghan, under the supervision of Ramin Halavati, to implement a software controller for Ms. Pac-Man, a simple real-time game. We played, discussed and implemented a variety of approaches, and finally came up with a heuristic-based program, which became first in the IEEE Congress on Evolutionary Computation (CEC 2007) competition and was the best known agent for the game for two years. See it in action!

2007 Complexity of graph homomorphisms

I studied cylindrical construction, a concept introduced by Amir Daneshgar in order to prove hardness results for the graph homomorphism problem. I was able to move a few small steps towards the Bang-Jensen and Hell’s conjecture about the dichotomy of digraph homomorphism problem, but meanwhile, the whole conjecture was proved in this paper.


Last update: 6 June 2021. Go back to my homepage.