Saturday, September 22, 2007

Compressed Sensing

Via a link posted by Jeff Hammerbacher, an interesting guy from Facebook, comes my current fascination with Compressive Sensing.

Compressive Sensing is a subfield of remote sensing study that attempts to resolve some of the difficult issues that arise when distributing a large number of small, failure-prone, and power-limited sensing devices. Suppose you have 1000 sensors, which need to work on a single battery for several years, and of which the expected failure rate is 20%. How to do that and get useable data at the end of the survey without breaking the bank is a difficult question.

Compressive Sensing then enters the picture. Taking queues from post-data-collection lossy compression, such as JPEG, which substantially reduces the amount of information required to faithfully represent the original information, Compressive Sensing asks "If we're going to throw all that data away in the end, after collecting the data, could we simply not collect the data to begin with?," which is tantamount to asking "Can we only record what will be important in the end?"

Of course, intuitively if you know what's going to be important in the end, there's really not that much point in recording it. We don't record things that we already understand, so how can you determine ahead of time what will be needed? It seems to go against all fundamentally held conceptions in data mining -- collect EVERYTHING, figure out what's important later.

Compressed Sensing, however, makes the incredibly counterintuitive leap that one should actually record only a completely random subset of the data random linear combination of the data (see update below), not paying any attention to the question of what will be important in the end. Linear algebra will come to your rescue and reveal anything you want to know by discerning its tertiary effects on what was actually collected.

Some useful links:

How does this relate to Machine Learning, and why am I interested in it here? Well, partially it's just mad cool, and uses some incredible mathematics, plus I'm in love with Terence and drip over his every word. But also, perhaps more legitimately, Compressive Sensing has many interesting applications to dimensionality reduction. Just as one can represent an image by recording a completely random subset of the image, one can represent a high-dimensional feature space by taking completely random projections. That set of projections can then be used in your favorite algorithms, and Compressive Sensing comes to the rescue at the end when it's time to re-interpret the algorithm's output.

Igor Carron of Machine Learning (Theory) gives a nice overview of its potential impact: http://hunch.net/?p=273

Update: Igor Carron points out in the comments below that it's not a completely random subset of the data that's needed, but specifically a projection of the source data with a basis incoherent with that of the source data. A set of random linear projections turns out to do the trick.

3 comments:

Igor said...

Peter,

Nice post. It's not just exactly " a completely random subset of the data", because these measurements really need to include all the data. It is just that a random linear combination of them is enough.

Also, as you know, it does not need to be random, it just needs to be incoherent:

http://nuit-blanche.blogspot.com/2007/09/compressed-sensing-how-to-wow-your.html

FYI, I have all my links on the subject here:
http://igorcarron.googlepages.com/compressedsensing
and here in the blog form:
http://nuit-blanche.blogspot.com/search/label/compressed%20sensing

Igor.

Peter Dolan said...

Hi Igor,

Thanks for the clarification, I've updated the post text above to correct the inaccuracy. I've been looking around the internet and my textbooks for a precise definition of incoherent, but found only a lot of its usage and none of its definition. Happen to have one?

Thanks!

Igor said...

Peter,

As found in the paper from Baraniuk, http://www.dsp.ece.rice.edu/cs/compressiveSensing-IEEE-SPMag-LectureNotes-15web.pdf

"An alternative approach to stability is to ensure that the measurement matrix  is incoherent
with the sparsifying basis in the sense that the vectors {j} cannot sparsely represent the
vectors { i} and vice versa [1, 2, 4]. The classical example features delta spikes and Fourier sinusoids
playing the roles of {j} and { i}; the Fourier uncertainty principle immediately yields the
incoherence."

Which is why Candes mentions the fact that while incoherence is problematic in quantum mechanics (cannot have both location and velocity at the same time), it is godsend in CS.

This is also why I took diracs and sines in the example I mentioned earlier.

Igor.
http://nuit-blanche.blogspot.com