Second-Order Methods for ML
We leverage second-order methods to improve ML optimization.
Enhancing ML models with second-order information
Common approaches to training ML models have inherent disadvantages because they rely on optimization methods that use only first-order information (the gradient, or slope, of the loss) and not second-order information (the curvature, or how quickly that slope changes). This project, titled “Scalable Second-order Methods for Training, Designing, & Deploying Machine Learning Models,” aims to leverage second-order methods to improve ML optimization.
Efficient optimization algorithms are essential to enabling many applications of machine learning. However, optimization methods that use only first-derivative information can suffer from slow convergence, poor communication efficiency in distributed settings, and the need for laborious hyper-parameter tuning. While second-order methods could mitigate many of these disadvantages, they are far less used within the ML community.
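The convergence gap described above can be illustrated on a toy problem. The following is a minimal sketch, not the project's method: it assumes a hypothetical two-variable quadratic loss whose curvatures differ by a factor of 100, and compares plain gradient descent with a Newton step that rescales the gradient by the inverse Hessian. The function names, the loss, and the learning rate are all illustrative assumptions.

```python
# Hypothetical ill-conditioned loss: f(x, y) = 0.5 * (1 * x**2 + 100 * y**2).
# Its Hessian is constant and diagonal, with these entries on the diagonal.
CURVATURES = (1.0, 100.0)


def gradient(point):
    """First-order information: the slope of the loss along each coordinate."""
    return tuple(c * p for c, p in zip(CURVATURES, point))


def gradient_descent_steps(start, learning_rate, tolerance=1e-6, max_steps=100_000):
    """Count first-order steps until every gradient entry is below tolerance."""
    point = start
    for step in range(max_steps):
        grad = gradient(point)
        if max(abs(g) for g in grad) < tolerance:
            return step
        point = tuple(p - learning_rate * g for p, g in zip(point, grad))
    return max_steps


def newton_steps(start, tolerance=1e-6, max_steps=100):
    """Count Newton steps; each step divides the gradient by the curvature."""
    point = start
    for step in range(max_steps):
        grad = gradient(point)
        if max(abs(g) for g in grad) < tolerance:
            return step
        # The Hessian is diagonal here, so applying its inverse is an
        # elementwise division. Real models need far more work for this step.
        point = tuple(p - g / c for p, g, c in zip(point, grad, CURVATURES))
    return max_steps


# The learning rate must stay below 2 / 100 for gradient descent to be stable,
# which forces slow progress along the flat direction.
gd_steps = gradient_descent_steps((1.0, 1.0), learning_rate=0.01)
nt_steps = newton_steps((1.0, 1.0))
print(f"gradient descent: {gd_steps} steps, Newton: {nt_steps} step(s)")
```

On this quadratic, Newton's method lands on the minimum in a single step with no learning rate to tune, while gradient descent needs more than a thousand steps. The comparison is idealized: for large nonconvex models, forming and inverting the Hessian exactly is impractical, which is why scalable approximations are the hard part.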
In this project, we are using second-order information to develop, implement, and apply novel methods that enhance the design, diagnostics, and training of ML models. To accomplish this, we are tackling the challenges of training large-scale nonconvex ML models from four general angles: high-quality local minima; distributed computing environments; generalization performance; and acceleration. We also aim to develop efficient Hessian-based diagnostic tools for analyzing the training process as well as already-trained models.
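One common form of Hessian-based diagnostic is an estimate of the largest Hessian eigenvalue, which indicates how sharply the loss curves at a given point. The sketch below is an assumption-laden illustration rather than the project's tooling: it uses a hypothetical toy loss, approximates Hessian-vector products by central finite differences of the gradient, and runs power iteration so the full Hessian is never formed. All function names are invented for this example.

```python
import math


def gradient(w):
    """Gradient of a hypothetical toy loss f(w) = 0.5*(3*w0**2 + w1**2) + w0*w1.

    The Hessian of this loss is the constant matrix [[3, 1], [1, 1]].
    """
    return [3.0 * w[0] + w[1], w[0] + w[1]]


def hessian_vector_product(grad_fn, w, v, eps=1e-5):
    """Approximate H @ v as (g(w + eps*v) - g(w - eps*v)) / (2*eps)."""
    plus = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    minus = grad_fn([wi - eps * vi for wi, vi in zip(w, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]


def top_hessian_eigenvalue(grad_fn, w, iterations=100):
    """Estimate the largest-magnitude Hessian eigenvalue by power iteration."""
    v = [1.0] + [0.0] * (len(w) - 1)  # unit-norm starting direction
    eigenvalue = 0.0
    for _ in range(iterations):
        hv = hessian_vector_product(grad_fn, w, v)
        # Rayleigh quotient v^T H v; v has unit norm at this point.
        eigenvalue = sum(h * vi for h, vi in zip(hv, v))
        norm = math.sqrt(sum(h * h for h in hv))
        if norm == 0.0:
            break
        v = [h / norm for h in hv]
    return eigenvalue


sharpness = top_hessian_eigenvalue(gradient, [0.5, -0.5])
print(f"estimated top Hessian eigenvalue: {sharpness:.4f}")
```

Each iteration costs only two gradient evaluations, which is what makes this style of diagnostic plausible at scale. Note that power iteration converges to the eigenvalue of largest magnitude, so on a nonconvex loss the result may be negative; in practice, automatic differentiation would typically replace the finite-difference approximation.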
To study improvements to and applications of our proposed methods, we are developing implementations for both shared-memory and distributed computing environments, with a focus on three areas: improved communication properties, the exploitation of adversarial data, and improved neural architecture design and search.
Project Team
Associated ICSI Group
ICSI Research Team
Michael Mahoney
Michael W. Mahoney, PhD, is Vice President, Principal Scientist, and Group Lead for the AI and Big Data group at ICSI.
Amir Gholaminejad
Amir Gholaminejad (Gholami), PhD, is a Research Affiliate at ICSI and an Associate Research Scientist at the Berkeley Artificial Intelligence Research and Sky Computing Labs at UC Berkeley.
About
Focus Areas
- Machine Learning (Supervised, Unsupervised, Reinforcement Learning)