Metric Learning with Convex Optimization
Source:
Computer and Information Science, University of Pennsylvania, Philadelphia, p.162 (2007)
URL:
http://www.weinbergerweb.net/kqw/Publications/Publications.html
Abstract:
Many machine learning algorithms rely heavily on the existence of a good measure of
(dis-)similarity between input vectors. One of the most commonly used measures of dis-
similarity is the Euclidean distance in input space. This is often suboptimal in many ways.
The Euclidean distance metric does not incorporate any side-information that might be
available and it does not take advantage of the data structure or specifics of the machine
learning goals. Ideally a metric should be learned for each specific task.
Recent advances in numerical optimization provide us with a powerful tool for met-
ric learning (and machine learning in general): Convex optimization. I will investigate
two approaches to metric learning based on convex optimization for two different data
scenarios:
The first algorithm, Large Margin Nearest Neighbor (LMNN), operates in a supervised
scenario. LMNN learns a metric specifically to improve k-nearest neighbors classification.
This is achieved through a linear transformation of the input data that moves similarly
labeled inputs close together and separates differently labeled inputs by a large margin.
LMNN can be written as a semidefinite program that could be applied to large data sets
with up to 60000 training samples.
The second algorithm, Maximum Variance Unfolding (MVU), is designed for an un-
supervised scenario. The algorithm finds a low dimensional Euclidean embedding of the
data that preserves local distances while globally maximizing the variance. Similar to
LMNN, MVU can also be phrased as a semidefinite program. This formulation gives
local guarantees and distinguishes the algorithm from prior work.