Saturday, February 27, 2010

Machine Learning in Picasa

I was (re-)trying out the software version of Picasa yesterday and found an exciting face recognition feature. It's not really new, since it has existed in Picasa for more than a year already. (See here) In contrast to Facebook, where you, the user, have to point out where the faces are in a picture, Picasa automatically detects the presence of a face and fits a rectangle of appropriate size around it. This technology is now very common in point-and-shoot digital cameras, but surprisingly it seems to be fairly new for photo album software. So this is the first part of the new feature: face detection. It makes sure that only a human face is selected, not a dog's or a monkey's. These two pages [1], [2] survey some face-detection algorithms.
Once a face is boxed in a picture, I can enter the name of that person, and the identity is linked to the corresponding contact in my Google address book. The next part of the 'new' feature relies heavily on classification: Picasa searches through the entire photo collection on my hard drive, detects the faces (if there are any), and classifies their identities based on the training examples, i.e. the faces that I have already named. It then lists all the faces that the underlying classification algorithm considers matches and asks me to mark each result as a correct or wrong match. The procedure repeats whenever new photos are added to my photo library. This is a typical supervised-learning process. What algorithms do they use? Possibilities: Bayesian approaches, SVMs, hybrids, or maybe something completely novel.
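To make the label, suggest, confirm loop a bit more concrete, here is a minimal sketch in Python. It is not Picasa's method (I don't know what they use); I'm swapping in a plain nearest-centroid classifier purely for illustration. The two-number "face vectors", the names, and the helper functions `train`, `suggest`, and `give_feedback` are all made up for this example; a real system would extract much richer features from the boxed face.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train(labeled):
    """Build a model {name: centroid} from (vector, name) pairs."""
    groups = defaultdict(list)
    for vector, name in labeled:
        groups[name].append(vector)
    return {name: centroid(vectors) for name, vectors in groups.items()}


def suggest(model, vector):
    """Suggest the name whose centroid is closest to the new face vector."""
    return min(model, key=lambda name: math.dist(model[name], vector))


def give_feedback(labeled, vector, true_name):
    """Record the user's confirmed or corrected label and retrain."""
    labeled.append((vector, true_name))
    return train(labeled)


# Hypothetical toy data: a handful of faces I have named by hand.
labeled = [
    ([0.0, 0.0], "mum"),
    ([0.2, 0.1], "mum"),
    ([1.0, 1.0], "dad"),
    ([0.9, 1.1], "dad"),
]
model = train(labeled)

# A new face turns up in the library; the classifier suggests a name.
first_guess = suggest(model, [0.8, 0.9])

# Another face gets a wrong suggestion; I correct it and the model is retrained.
second_guess = suggest(model, [0.5, 0.5])
model = give_feedback(labeled, [0.5, 0.5], "aunt")
third_guess = suggest(model, [0.5, 0.6])

print(first_guess, second_guess, third_guess)
```

The point of the sketch is only the shape of the process: a few hand-labelled examples train a model, the model proposes names for unseen faces, and every confirmation or correction becomes a new training example for the next round.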
An initial tryout shows that the face classification algorithm that Picasa uses is quite impressive in terms of misclassification rate. With fewer than 10 training examples that I provided at the beginning for my parents and myself, Picasa subsequently suggested identities for about 300 faces that it detected in my library, and fewer than 20 of them were wrong. There are also some interesting patterns among the misclassified faces. For instance, the ones wrongly labelled as my mum were all of her siblings, indicating that they actually look alike. (Of course, they do.)
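For a rough sense of scale, the figures above can be turned into a bound on the misclassification rate. This is just back-of-the-envelope arithmetic on my approximate counts (about 300 suggestions, fewer than 20 wrong), not a proper evaluation; the variable names are mine.

```python
# Approximate counts from my tryout; both are rough figures, not exact tallies.
suggested_faces = 300      # "about 300" identity suggestions
wrong_upper_bound = 20     # "fewer than 20" of them were wrong

# Upper bound on the misclassification rate, and the matching accuracy bound.
error_bound = wrong_upper_bound / suggested_faces
accuracy_bound = 1 - error_bound

print(f"misclassification rate below {error_bound:.1%}")
print(f"accuracy above {accuracy_bound:.1%}")
```

So the error rate comes out below roughly 7%, from fewer than 10 labelled examples.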
The computational performance, however, is not quite satisfactory. Since I have a large photo gallery, it took a really long time to perform the above tasks. Furthermore, while the tasks were running in the background, they consumed about 80% of the CPU, and my cooling fan made a lot of noise because of all the heat generated. Apparently it's not a battery-friendly or mobile-oriented application. Nevertheless, it's a refreshing feature that Google has implemented here.
