Richard Bergmair's Media Library

ML Lecture #7: Nearest Prototype Methods

By nearest prototype methods, we mean methods such as k-means and k-nearest neighbour. These are also sometimes referred to as instance-based methods.

These methods store individual data points from the sample, or statistical aggregates over them, and then make predictions about new, previously unseen instances by comparing them to the stored ones.

In this video lecture, we present the two basic ideas behind these methods. The first idea is to abstract over clusters of individual instances statistically, for example by using a measure of central tendency, which leads to methods such as k-means, k-medians and k-medoids. A new instance can then be classified by assigning it the class label of the cluster whose mean, median or medoid lies closest to the new instance.
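
As a rough illustration of the first idea, the following minimal sketch classifies a point by its nearest class mean. It makes simplifying assumptions that are not part of the lecture: each class is treated as a single cluster (so no clustering step such as k-means is run), distances are Euclidean, and the names `class_means`, `classify` and the toy data are hypothetical.

```python
from math import dist
from statistics import fmean


def class_means(points, labels):
    """Compute one prototype per class: the coordinate-wise mean of its points.

    Assumption for this sketch: each class label corresponds to one cluster.
    """
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    # zip(*members) regroups the points by coordinate, so fmean averages
    # each dimension separately.
    return {
        label: tuple(fmean(coords) for coords in zip(*members))
        for label, members in groups.items()
    }


def classify(prototypes, x):
    """Return the label whose prototype is closest to x (Euclidean distance)."""
    return min(prototypes, key=lambda label: dist(prototypes[label], x))


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
labels = ["a", "a", "a", "b", "b", "b"]

prototypes = class_means(points, labels)
print(classify(prototypes, (0.5, 0.5)))
print(classify(prototypes, (5.5, 5.2)))
```

Replacing the mean with a coordinate-wise median, or with the stored point that minimises the total distance to the others in its group, would give the k-medians and k-medoids flavours of the same scheme.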

The second idea is to smooth over outliers by taking a majority vote among the k nearest neighbours, or among the k clusters whose prototypes are nearest.
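
The majority vote can be sketched as follows. This is only an illustrative sketch under stated assumptions: Euclidean distance, a brute-force search over all stored points, and hypothetical names (`knn_classify`, `train_points`, `train_labels`) and toy data. The point `(0.4, 0.4)` is deliberately labelled "b" to play the role of an outlier inside the "a" region.

```python
from collections import Counter
from math import dist


def knn_classify(points, labels, x, k=3):
    """Label x by majority vote among its k nearest stored points."""
    # Sort all stored (point, label) pairs by distance to x, keep the k closest.
    nearest = sorted(zip(points, labels), key=lambda pair: dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    # most_common(1) returns the label with the most votes; when counts are
    # equal, the label encountered first (i.e. the nearer one) wins.
    return votes.most_common(1)[0][0]


# Hypothetical toy data; (0.4, 0.4) is an outlier carrying the label "b".
train_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.4, 0.4),
                (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
train_labels = ["a", "a", "a", "b", "b", "b", "b"]

query = (0.5, 0.5)
print(knn_classify(train_points, train_labels, query, k=1))  # follows the outlier
print(knn_classify(train_points, train_labels, query, k=3))  # outlier is outvoted
```

With k=1 the prediction is decided by the single nearest point, which here is the outlier; with k=3 the two surrounding "a" points outvote it. The same voting function could be applied to cluster prototypes instead of raw points by passing the prototypes and their labels as the stored set.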

The machine learning method used at PANOPTICOM as part of our media monitoring solution uses similar ideas, although it cannot be described as a straightforward application of k-means or the like.