The point-to-point (P2P) correspondence problem is a long-standing one in computer vision. It is of great interest in automated surveillance and in some motion-capture techniques. With a good solution, objects in a video can be tracked despite noisy data and situations such as occlusion. Humans solve this problem almost effortlessly because they make effective use of every visual cue in sight. Moreover, the human brain readily extracts high-level information from a scene and uses it to draw useful conclusions (e.g., if a person got into a car and the car drove off into the distance, you probably would not see that person walking again).
For a human, tracking a single person in a crowd is usually easy (given a good line of sight). Extending this idea, multiple observers could plausibly track everyone in the crowd. For a machine, however, a similar task can be quite difficult. Unlike the human brain, a machine processing a video cannot automatically infer temporal associations (over multiple frames) between the objects of interest. This project explores ways to compute those associations efficiently and to generate reasonable tracks for all objects of interest. The result is a trajectory for every foreground object in the video.
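To make the idea of a temporal association concrete, here is a minimal sketch of one simple baseline: greedy nearest-neighbour matching of object positions between two consecutive frames. It is not the method developed in this project. The function name `associate`, the `max_dist` gating threshold, and the representation of each object as an (x, y) point are all assumptions made for illustration.

```python
import math


def associate(prev_points, curr_points, max_dist=50.0):
    """Greedily match points in the previous frame to points in the current one.

    Hypothetical helper: returns (matches, unmatched_prev, unmatched_curr),
    where matches is a list of (prev_index, curr_index) pairs.
    """
    # Collect every candidate pair within the gating distance, closest first.
    candidates = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(prev_points)
        for j, c in enumerate(curr_points)
        if math.dist(p, c) <= max_dist
    )

    matches, used_prev, used_curr = [], set(), set()
    for _, i, j in candidates:
        # Accept a pair only if neither endpoint has been matched already.
        if i not in used_prev and j not in used_curr:
            matches.append((i, j))
            used_prev.add(i)
            used_curr.add(j)

    unmatched_prev = [i for i in range(len(prev_points)) if i not in used_prev]
    unmatched_curr = [j for j in range(len(curr_points)) if j not in used_curr]
    return matches, unmatched_prev, unmatched_curr


# Two objects in the previous frame, three detections in the current one.
prev_frame = [(10, 10), (100, 100)]
curr_frame = [(102, 98), (12, 11), (300, 300)]
matches, lost, new = associate(prev_frame, curr_frame)
```

Chaining such matches over successive frames yields one track per object. Points left unmatched in the previous frame may correspond to occluded objects, and unmatched points in the current frame may be new objects or detection noise. A greedy match like this breaks down exactly in the crowded, noisy situations described above, which is why the correspondence problem is hard.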
As an example, look at the 4 frames below (left to right, top to bottom). They were sampled from a video at 25-frame intervals, with the background removed in Photoshop. If you define a distinct object as a fully connected body, you can see that tracking a person is far from trivial. Look at the errors, such as the one in the lower-left frame, where the man in the middle has a broken-off leg. A good point-to-point correspondence tracker will mask such errors. Although these frames were generated synthetically, the errors are realistic, because many background subtraction techniques produce similar artifacts.
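The "broken-off leg" problem can be illustrated with a small sketch. If objects are defined as connected regions of a binary foreground mask, a gap left by imperfect background subtraction splits one person into two objects. The code below labels 4-connected regions with a breadth-first search. The function name `label_components` and the toy mask are assumptions made for illustration, not data from the project.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions in a binary mask (list of rows).

    Hypothetical helper: returns (labels, count), where labels has the same
    shape as mask, 0 marks background, and regions are numbered from 1.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                # Found an unlabeled foreground pixel: start a new region.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy foreground mask: a "body" and a "leg" separated by a one-row gap.
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],  # gap left by imperfect background subtraction
    [0, 1, 0, 0],  # detached "leg"
]
labels, count = label_components(mask)  # count is 2, not 1
```

A tracker that treats each labeled region as a separate object would see two objects here, so the correspondence step has to tolerate this kind of fragmentation.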
This project has been partly funded by NSF Award No. 0534438, "MotionSearch: Motion Trajectory-Based Object Activity Retrieval and Recognition from Video and Sensor Databases".