OPT 2019
We welcome you to participate in the 11th OPT Workshop on Optimization for Machine Learning. This year's OPT workshop will be run as an independent event, co-located with NeurIPS in Vancouver. It will take place on December 14th, overlapping with the NeurIPS workshops, making it easy for attendees to switch between the two venues.
We are looking forward to an exciting OPT 2019!
Location: Exchange Hotel Vancouver
Accepted Papers
- How Good is SGD with Random Shuffling? — Itay M Safran (Weizmann Institute of Science) and Ohad Shamir (Weizmann Institute of Science)
- The Application of Multi-block ADMM on Isotonic Regression Problems — Junxiang Wang (George Mason University) and Liang Zhao (George Mason University)
- Sampled Quasi-Newton Methods for Deep Learning — Albert S Berahas (Lehigh University), Majid Jahani (Lehigh University) and Martin Takac (Lehigh University)
- Linearly Convergent Frank-Wolfe without Prior Knowledge — Geoffrey Negiar (EECS Berkeley), Armin Askari (UC Berkeley), Martin Jaggi (EPFL) and Fabian Pedregosa (Google)
- Decaying Momentum Helps Neural Network Training — John Chen (Rice University) and Anastasios Kyrillidis (Rice University)
- Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond — Oliver Hinder (University of Pittsburgh), Aaron Sidford (Stanford) and Nimit S Sohoni (Stanford University)
- Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Nonconvex Optimization — Anas Barakat (Telecom ParisTech) and Pascal Bianchi (Telecom ParisTech)
- On the Unreasonable Effectiveness of the Greedy Algorithm: Greedy Adapts to Sharpness — Alfredo Torrico (Polytechnique Montreal), Mohit Singh (Georgia Tech) and Sebastian Pokutta (ZIB)
- Choosing the Step Size: Intuitive Line Search Algorithms with Efficient Convergence — Sara Fridovich-Keil (UC Berkeley) and Benjamin Recht (UC Berkeley)
- Obtaining Regularization for Free via Iterate Averaging — Jingfeng Wu (Johns Hopkins University), Vladimir Braverman (Johns Hopkins University) and Lin Yang (UCLA)
- Accelerating boosting via accelerated greedy coordinate descent — Xiaomeng Ju (University of British Columbia), Yifan Sun (University of British Columbia), Sharan Vaswani (Mila, Université de Montréal) and Mark Schmidt (University of British Columbia)
- The nonsmooth landscape of blind deconvolution — Mateo Diaz Diaz (Cornell University)
- A Simple and Fast Distributed Accelerated Gradient Method — Chhavi Sharma (IIT Bombay), Vishnu Narayanan (IIT Bombay) and Balamurugan Palaniappan (IIT Bombay)
- On Adaptivity in Information-constrained Online Learning — Siddharth Mitra (Chennai Mathematical Institute (CMI)) and Aditya Gopalan (Indian Institute of Science (IISc), Bangalore)
- Estimating the Lipschitz constant of Neural Networks with Polynomial Optimization methods — Fabian R Latorre Gomez (EPFL), Paul Rolland (EPFL) and Volkan Cevher (EPFL)
- Minimizing approximately submodular functions — Marwa El Halabi (MIT) and Stefanie Jegelka (MIT)
- Breaking the Curse of Dimensionality (Locally) to Accelerate Conditional Gradients — Alejandro Carderera (Georgia Institute of Technology), Jelena Diakonikolas (University of California, Berkeley) and Sebastian Pokutta (ZIB)
- Byzantine Resilient Non-Convex SVRG with Distributed Batch Gradient Computations — Prashant Khanduri (Syracuse University), Saikiran Bulusu (Syracuse University), Pranay Sharma (Syracuse University) and Pramod Varshney (Syracuse University)
- Bundle Method Sketching for Low Rank Semidefinite Programming — Lijun Ding (Cornell University) and Benjamin Grimmer (Cornell University)
- Escaping Saddle Points with Inequality Constraints via Noisy Sticky Projected Gradient Descent — Dmitrii Avdiukhin (Indiana University, Bloomington), Grigory Yaroslavtsev (Indiana University, Bloomington) and Chi Jin (UC Berkeley)
- Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks — Wei Hu (Princeton University) and Jeffrey Pennington (Google Brain)
- Using Dimensionality Reduction to Optimize t-SNE — Sandeep Silwal (MIT) and Rikhav Shah (UC Berkeley)
- Convex Duality and Cutting Plane Methods for Over-parameterized Neural Networks — Tolga Ergen (Stanford University) and Mert Pilanci (Stanford)
- Scaling parallel Gaussian process optimization with adaptive batching and resparsification — Daniele Calandriello (Istituto Italiano di Tecnologia), Luigi Carratino (University of Genoa), Alessandro Lazaric (FAIR), Michal Valko (Inria) and Lorenzo Rosasco (University of Genoa)
- Revisiting the Generalization of Adaptive Gradient Methods — Naman Agarwal (Google), Rohan Anil (Google), Elad Hazan (Princeton University and Google Brain), Tomer Koren (Google) and Cyril Zhang (Princeton University)