We welcome you to participate in the 7th NIPS Workshop on Optimization for Machine Learning, to be held at:
Montreal, Quebec, Canada, Dec 12th, 2014
Room 513e,f on Level 5
Invited speakers:
- Jean Bernard Lasserre (CNRS, France)
- Yoshua Bengio (University of Montreal, Canada)
- Amir Beck (Technion, Israel)
Speaker: Jean Bernard Lasserre
Title: The moment-LP and moment-SOS approaches in polynomial optimization
Abstract: For a certain class of optimization problems with positivity constraints, powerful positivity certificates from Real Algebraic Geometry allow one to express difficult positivity constraints in a way that can be exploited for efficient numerical computation, by defining a hierarchy of convex relaxations which provides a monotone sequence of upper bounds that eventually converges to the solution of the original unrelaxed problem. We discuss the relative merits and drawbacks of both hierarchies of LP- and SDP-relaxations, and their impact not only in optimization (and particularly combinatorial optimization) but also in many areas for solving instances of the so-called Generalized Problem of Moments (GMP) with polynomial data (of which Global Polynomial Optimization is in fact the simplest instance).
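The sum-of-squares (SOS) certificate underlying the SDP hierarchy can be seen in the simplest possible case. The following is a hypothetical illustration of my own, not taken from the talk: for a univariate quadratic p(x) = x^2 + b x + c, the semidefinite condition "p(x) - lam is a sum of squares" collapses to a scalar discriminant test, b^2 - 4(c - lam) <= 0, so the best certified bound lam* = c - b^2/4 is available in closed form and (in this degenerate case) equals the true minimum. The function names are my own.

```python
def sos_lower_bound_quadratic(b, c):
    """Largest lam such that x^2 + b*x + c - lam is a sum of squares.

    For a univariate quadratic the SOS/semidefinite condition reduces to a
    nonpositive discriminant: b^2 - 4*(c - lam) <= 0, i.e. lam <= c - b^2/4.
    """
    return c - b * b / 4.0

def grid_min(b, c, lo=-10.0, hi=10.0, n=100001):
    """Brute-force minimum of x^2 + b*x + c on a grid, for comparison."""
    step = (hi - lo) / (n - 1)
    return min((lo + i * step) ** 2 + b * (lo + i * step) + c
               for i in range(n))

b, c = 3.0, 1.0
print(sos_lower_bound_quadratic(b, c))  # -1.25, attained at x = -1.5
print(grid_min(b, c))                   # matches the certified bound
```

For higher-degree or multivariate polynomials the same condition becomes a genuine SDP feasibility problem, which is where the hierarchy of relaxations discussed in the abstract enters.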
Speaker: Yoshua Bengio
Title: Optimization challenges for deep learning
Abstract: Optimizing deep and/or recurrent networks is a fundamentally hard challenge because deep representations are obtained as the composition of many non-linearities, making the objective function a highly non-linear function of the parameters, which in turn makes it difficult for standard gradient-based methods to find good solutions. For deeper networks trying to learn tasks that involve the composition of subtasks, gradient-based optimization can get stuck in very poor solutions, even though we know that very good solutions exist. A benchmark task illustrating this problem is presented, and we ask why it occurs. We present results suggesting that the main culprit is not local minima but rather saddle points surrounded by nearly flat regions. Finally, we discuss potential solutions to this challenge, either based on new gradient-based optimization approaches aimed at dealing with saddle points, or by abandoning gradients entirely and exploring approaches that generalize gradient propagation to target propagation, with the potential to handle the optimization of highly non-linear functions which may even involve discrete computation (e.g. neurons with discrete outputs).
Speaker: Amir Beck
Title: On the Minimization over Sparse Symmetric Sets
Abstract: We consider the problem of minimizing a smooth function over the intersection of a closed convex set and a sparsity constraint. We begin by showing how the orthogonal projection operator can be computed efficiently when the underlying convex set has certain symmetry properties. A hierarchy of optimality conditions for the problem is then established, and the theoretical superiority of coordinate-wise type conditions is demonstrated. Finally, based on the derived optimality conditions, we present several coordinate descent type algorithms for solving the sparsity constrained problem and illustrate their performance on several numerical examples. This is joint work with Nadav Hallak.
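As a sketch of the kind of projection the abstract refers to (my own minimal example, assuming the simplest symmetric set): when the convex set is all of R^n, which is invariant under coordinate permutations and sign flips, the Euclidean projection onto the sparsity constraint ||z||_0 <= s reduces to hard thresholding, i.e. keeping the s entries of largest magnitude. The function name is my own.

```python
def project_sparse(x, s):
    """Euclidean projection of the list x onto {z : ||z||_0 <= s}.

    For the fully symmetric set R^n this is hard thresholding: keep the
    s entries of largest absolute value and zero out the rest.
    """
    if s >= len(x):
        return list(x)
    # Indices of the s largest-magnitude entries (ties broken by order).
    keep = set(sorted(range(len(x)), key=lambda i: -abs(x[i]))[:s])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]

print(project_sparse([3.0, -1.0, 0.5, -4.0], 2))  # [3.0, 0.0, 0.0, -4.0]
```

For more structured symmetric sets (e.g. boxes or balls with sign/permutation symmetry), the abstract's point is that a similarly explicit and efficient projection formula is still available, which is what makes the coordinate descent schemes practical.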