Time 
Speaker 
Title 

8:15am8:30am  Organizers
 Opening Remarks 
8:30am9:15am  Maryam Fazel
 Online Optimization, Smoothing, and Worstcase Competitive Ratio  [abstract] 
In Online Optimization, the data in an optimization problem is revealed over time and at each step a decision variable needs to be set without knowing the future data. This setup covers online resource allocation, from classical inventory problems to the Adwords problem popular in online advertising.
In this talk, we prove bounds on the competitive ratio of two primaldual algorithms for a broad class of online convex optimization problems. We give a sufficient condition on the objective function that guarantees a constant worstcase competitive ratio for monotone functions. We show how smoothing the objective can improve the competitive ratio of these algorithms, and for separable functions, we show that the optimal smoothing can be derived by solving a convex optimization problem. This result allows us to directly optimize the competitive ratio bound over a class of smoothing functions, and hence design effective smoothing customized for a given cost function.

9:15am9:30am  Guilherme Franca and Jose Bento
 Markov Chain Lifting and Distributed ADMM 
9:30am11:00am  POSTER SESSION (including coffee break 10:30am  11:00am) 
11:00am11:45am  Sebastien Bubeck
 Kernelbased Methods for Bandit Convex Optimization  [abstract] 
A lot of progress has been made in recent years on extending classical multiarmed bandit strategies to very large set of actions. A particularly challenging setting is the one where the action set is continuous and the underlying cost function is convex, this is the socalled bandit convex optimization (BCO) problem. I will tell the story of BCO and explain some of the new ideas that we recently developed to solve it. I will focus on three new ideas from our recent work http://arxiv.org/abs/1607.03084 with Yin Tat Lee and Ronen Eldan: (i) a new connection between kernel methods and the popular multiplicative weights strategy; (ii) a new connection between kernel methods and one of Erdos’ favorite mathematical object, the Bernoulli convolution, and (iii) a new adaptive (and increasing!) learning rate for multiplicative weights. These ideas could be of broader interest in learning/algorithm’s design.

11:45am12:00pm  Gauthier Gidel, Tony Jebara and Simon LacosteJulien
 FrankWolfe Algorithms for Saddle Point Problems 
12:00pm2:00pm  Lunch Break 
2:00pm2:45pm  Nicolas Boumal
 Semidefinite Programs with a Dash of Smoothness: Why and When the LowRank Approach Works  [abstract] 
Semidefinite programs (SDPs) can be solved in polynomial time by interior point methods, but scalability can be an issue. To address this shortcoming, over a decade ago, Burer and Monteiro proposed to solve SDPs with few equality constraints via lowrank, nonconvex surrogates. Remarkably, for some applications, local optimization methods seem to converge to global optima of these nonconvex surrogates reliably.
In this presentation, we show that the BurerMonteiro formulation of SDPs in a certain class almost never has any spurious local optima, that is: the nonconvexity of the lowrank formulation is benign (even saddles are strict). This class of SDPs covers applications such as maxcut, community detection in the stochastic block model, robust PCA, phase retrieval and synchronization of rotations.
The crucial assumption we make is that the lowrank problem lives on a manifold. Then, theory and algorithms from optimization on manifolds can be used.
Optimization on manifolds is about minimizing a cost function over a smooth manifold, such as spheres, lowrank matrices, orthonormal frames, rotations, etc. We will present the basic framework as well as parts of the more general convergence theory, including recent complexity results. (Toolbox: http://www.manopt.org)
Select parts are joint work with P.A. Absil, A. Bandeira, C. Cartis and V. Voroninski.

2:45pm3:00pm  Hongzhou Lin, Julien Mairal and Zaid Harchaoui
 QuickeNing: A Generic QuasiNewton Algorithm for Faster GradientBased Optimization 
3:00pm4:30pm  POSTER SESSION (including coffee break 3:00pm  3:30pm) 
4:30pm5:15pm  Ohad Shamir
 Oracle Complexity of SecondOrder Methods for FiniteSum Problems  [abstract] 
Finitesum optimization problems are ubiquitous in machine learning, and are commonly solved using firstorder methods which rely on gradient computations. Recently, there has been growing interest in *secondorder* methods, which rely on both gradients and Hessians. In principle, secondorder methods can require much fewer iterations than firstorder methods, and hold the promise for more efficient algorithms. Although computing and manipulating Hessians is prohibitive for highdimensional problems in general, the Hessians of individual functions in finitesum problems can often be efficiently computed, e.g. because they possess a lowrank structure. Can secondorder information indeed be used to solve such problems more efficiently? In this talk, I'll provide evidence that the answer  perhaps surprisingly  is negative, at least in terms of worstcase guarantees. However, I'll also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.
Joint work with Yossi Arjevani.

5:15pm5:30pm  Surbhi Goel, Varun Kanade, Adam Klivans and Justin Thaler
 Reliably Learning the ReLU in Polynomial Time 