Computer Vision Fundamentals - Autumn 2002

Instructor: Dr Sohaib A. Khan
sohaib at lums dot edu dot pk
http://web.lums.edu.pk/~sohaib
Office hrs:
Mon 9:30 am - 11 am
Wed 9:30 am - 11 am
Fri 11 am - 1 pm

This course is designed to give a broad overview of the fundamentals of computer vision, laying the foundations for advanced graduate classes in vision. This course will be conducted with an application perspective. Therefore students will be expected to implement several techniques learnt in the lectures. A good calculus, discrete mathematics and programming background is expected for this class. Knowledge of probability and random processes is a plus, but will not be assumed. This class is designed for senior-level and introductory graduate-level students of Computer Science.

This class will not overlap significantly with the Digital Image Processing class offered last quarter, though students who have taken DIP will have an advantage in the beginning.

Textbook:
Computer Vision
Linda G. Shapiro, George C. Stockman
ISBN 0-13-030796-3

Prentice Hall, 2001


Posted 03/09/02

Lecture 1: Introduction. [Download slides].
Material in these slides (including images, data etc.) must not be used for commercial benefit without instructor's written permission.

Course Outline

Chapter 1 reading material is available from the website of one of the authors. Click here. Sec 2.5 is not available online, so reading assignment is reduced to Ch 1 only.

Useful links discussed in class:
http://www.cs.ucf.edu/~vision/projects/projects.html
http://www.wearcam.org; in particular, look at research and video orbits. The Claire sequence shown in class is from this page. The other mosaic results were from the webpage of Cen Rao at UCF, who has implemented the technique of Bergen et al.

MATLAB's Primer is a very good introduction to this environment and should be used by students who are not familiar with MATLAB.

Program 0: PGM and PPM formats: Submission deadline - 10 Sept 2002
One way to view PPM and PGM files on your computer is to download a free image viewer called IrfanView. It also supports conversion between several other formats and is a very useful tool for the type of work that we will do in this class.
See class notes (slides) for program details.
Test images: mecca06.pgm, mecca06.ppm
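
As a rough illustration of the PGM format, the sketch below parses the header (magic number, width, height, maximum gray value, with '#' comments skipped) and the pixel data for both the ASCII (P2) and binary (P5) variants. The function names read_header, parse_pgm and encode_pgm are hypothetical and do not come from the class notes. The sketch assumes a maximum gray value of at most 255 and no comments after the header; the actual requirements for Program 0 are in the slides.

```python
def read_header(data, count=4):
    """Read `count` whitespace-separated header tokens, skipping '#' comments.

    Returns the tokens and the index just past the last token.
    """
    tokens, pos = [], 0
    while len(tokens) < count:
        if pos >= len(data):
            raise ValueError("truncated PGM header")
        ch = data[pos:pos + 1]
        if ch.isspace():
            pos += 1
        elif ch == b"#":
            # A comment runs to the end of the line.
            while pos < len(data) and data[pos:pos + 1] not in (b"\n", b"\r"):
                pos += 1
        else:
            start = pos
            while pos < len(data) and not data[pos:pos + 1].isspace():
                pos += 1
            tokens.append(data[start:pos])
    return tokens, pos


def parse_pgm(data):
    """Parse PGM bytes into (rows, maxval); rows is a list of lists of ints."""
    (magic, w, h, mv), pos = read_header(data)
    width, height, maxval = int(w), int(h), int(mv)
    if maxval > 255:
        raise ValueError("this sketch assumes one byte per pixel")
    if magic == b"P5":
        # Exactly one whitespace byte separates maxval from the raster.
        pixels = list(data[pos + 1:pos + 1 + width * height])
    elif magic == b"P2":
        pixels = [int(t) for t in data[pos:].split()]
    else:
        raise ValueError("not a PGM file")
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match header")
    rows = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return rows, maxval


def encode_pgm(rows, maxval=255):
    """Encode rows of gray values as binary (P5) PGM bytes."""
    height, width = len(rows), len(rows[0])
    header = b"P5\n%d %d\n%d\n" % (width, height, maxval)
    return header + bytes(v for row in rows for v in row)
```

To use it on a file, read the bytes first, for example parse_pgm(open("mecca06.pgm", "rb").read()).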


Posted 06/09/02

Lecture 2: Continuation of Introduction, and Getting Started with MATLAB - Download slides.

For more details of the video segmentation project (with examples from Larry King Live), visit this webpage. A related project from the same group is on movie genre classification from previews.

For the Orlando Police Dept project mentioned in the lecture, visit the OPD surveillance project website. This contains several interesting videos which demonstrate tracking output in real-time.

A transcript of the help session on MATLAB (2nd half of the lecture) is also available. Click here for the sample m-file function demonstrated in class. Although I did not mention this in class, the input to this function can be a matrix of any size (I demonstrated it only with a scalar). Working through the MATLAB primer is highly recommended for everyone.

Some students have asked me questions about Program 0. Here is what you need to do:


Posted 10/09/02

Lecture 3: 2D Imaging Transformations - Download slides.

More details of the video geo-registration project are available at the project website.

Reading assignment 2.1-2.5

Program 0 can be submitted on a floppy disk, as a hard copy, or on the common drive at J:\Cs436\Program0


Posted 12/09/02

Lecture 4: Displacement Models - Download slides

Wearable computing links: Steve Mann, Thad Starner. The paper on transformations referred to in class is available online.

Program 1: 2D Affine Warping: Due Date: Thursday 19th Sept. - For details, click here. Data mecca06.pgm, mecca06t.pgm
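
One common way to implement affine warping is inverse mapping: for every destination pixel, apply the inverse transformation to find where it came from in the source, and interpolate there. The sketch below is only an illustration under assumptions not stated in the handout: the matrix maps source (x, y) to destination (x, y) with x as the column and y as the row, the output has the same size as the input, and bilinear interpolation is used. The names invert_affine, bilinear and affine_warp are hypothetical.

```python
def invert_affine(m):
    """Invert a 3x3 affine matrix [[a, b, tx], [c, d, ty], [0, 0, 1]]."""
    (a, b, tx), (c, d, ty) = m[0], m[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("transformation is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0.0, 0.0, 1.0]]


def bilinear(image, x, y, fill=0.0):
    """Sample image at real-valued (x, y); return fill outside the image."""
    h, w = len(image), len(image[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return fill
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def affine_warp(image, m, fill=0.0):
    """Warp image by the forward transform m using inverse mapping."""
    inv = invert_affine(m)
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Source location that lands on destination pixel (x, y).
            sx = inv[0][0] * x + inv[0][1] * y + inv[0][2]
            sy = inv[1][0] * x + inv[1][1] * y + inv[1][2]
            row.append(bilinear(image, sx, sy, fill))
        out.append(row)
    return out
```

Inverse mapping visits every output pixel exactly once, so it avoids the holes that appear when source pixels are pushed forward to non-integer destinations.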

Note that there are no office hours now on Thursday. Instead, an additional hour at Fri 11:00 - 12:00 noon has been added.


Posted 18-09-02

Lecture 5: 3D Transformations - Download slides

End of Module 1.

Notice change in office hours (There are now 5 hours per week. Try to utilize them effectively)

Program 1 is due this Thursday (to be submitted electronically and demonstrated in the lab).

Homework 1 has been assigned (see lecture slides). Due next Thursday


Posted 24-09-02

Lecture 6 and Lecture 7: Binary Image Analysis - Edge Detection - Download slides

Homework 1 (written assignment) is due on Thursday


Posted 26-09-02

Lecture 8: Edge Detection - Download Slides

Program 2: Canny's Edge Detector. Due Monday 07-10-02 by 12:00 noon
Submit at J:\CS436\Program2
Please name your folder with your student id number (no hyphens) so that the folders appear in the same order as the class listing. Any comments in the folder name, e.g. resubmission or part2, if necessary, should come after the student id number.


Posted 03-10-02

Lecture 9 - Binary Shape Analysis 1 - Download slides
Lecture 10 - Binary Shape Analysis 2 - Download slides

Program 2 - Requirements and Deliverables
Test Images - mecca06.pgm, rice.pgm, stc1.pgm, egg.pgm, cone.pgm

Hough Transform: A very nice java demo and description is available here.
For an interesting example application of HT and associated paper, click here.


Solutions:

Homework 1 - Download

Quiz 1 - Download

Additional office hours on Tuesday morning before exam


Posted 11-10-02

Lecture 11 (After midterm): Binary Morphology - Download slides

Program 3 Hough Transform for Lines: For handout click here.
Test Images:
sqr1-original, sqr1-canny, sqr1-canny with noise, sqr1-canny with breaks
ucfCsBldg-original, ucfCsBldg-canny, orldntwn-original, orldntwn-canny, angel-original, angel-canny
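
For orientation only, the sketch below shows the basic voting step of a Hough transform for lines using the normal parameterization rho = x cos(theta) + y sin(theta). It is not the handout's specification: the one-pixel rho bins, the 180 theta steps, and the names hough_lines and strongest_line are assumptions made for illustration.

```python
import math


def hough_lines(edges, theta_steps=180):
    """Accumulate votes in (rho, theta) space for every non-zero edge pixel."""
    h, w = len(edges), len(edges[0])
    max_rho = int(math.ceil(math.hypot(w - 1, h - 1)))
    thetas = [math.pi * t / theta_steps for t in range(theta_steps)]
    # Row index is rho + max_rho so that negative rho values fit in the array.
    acc = [[0] * theta_steps for _ in range(2 * max_rho + 1)]
    for y in range(h):
        for x in range(w):
            if not edges[y][x]:
                continue
            for t, theta in enumerate(thetas):
                rho = x * math.cos(theta) + y * math.sin(theta)
                acc[int(round(rho)) + max_rho][t] += 1
    return acc, thetas, max_rho


def strongest_line(acc, thetas, max_rho):
    """Return (rho, theta, votes) of the accumulator cell with the most votes."""
    votes, r, t = max((v, r, t)
                      for r, row in enumerate(acc)
                      for t, v in enumerate(row))
    return r - max_rho, thetas[t], votes
```

Note that neighbouring theta bins often collect the same number of votes for a short line, so a real implementation would usually look for local maxima in the accumulator rather than a single global maximum.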


Solution to MidTerm

Program 3: For implementation help and intermediate/final outputs, click here.

The report that you write with your final submission should include plenty of figures (as in my document above). The report should contain interesting implementation details as well as a good discussion of intermediate and final outputs for several examples. You should also comment on why a particular output makes sense or not, and what happens if some parameters are changed.


Canny's Implementation: This webpage should clarify confusions in the implementation of Canny, in particular regarding Non-Maxima Suppression, with which there were a lot of problems in the submitted homework. The document is written in the format of a sample report, to give you an idea of what is required in the report. Note that the algorithm is not explained in detail; implementation details and discussion of the output form the bulk of the report.
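
As a generic illustration of non-maxima suppression (not a copy of the webpage above), the sketch below quantizes the gradient direction into four sectors and keeps a pixel only if its gradient magnitude is at least as large as both neighbours along that direction. The assumptions are: gx is the derivative along columns, gy is the derivative along rows (downwards), border pixels are left at zero, and ties are kept. The function name is hypothetical.

```python
import math


def non_maxima_suppression(mag, gx, gy):
    """Zero out pixels that are not local maxima along the gradient direction."""
    h, w = len(mag), len(mag[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            m = mag[y][x]
            if m == 0:
                continue
            # Direction modulo 180 degrees: opposite gradients share neighbours.
            angle = math.degrees(math.atan2(gy[y][x], gx[y][x])) % 180.0
            if angle < 22.5 or angle >= 157.5:
                dx, dy = 1, 0      # roughly horizontal gradient
            elif angle < 67.5:
                dx, dy = 1, 1      # gradient towards lower right
            elif angle < 112.5:
                dx, dy = 0, 1      # roughly vertical gradient
            else:
                dx, dy = -1, 1     # gradient towards lower left
            if m >= mag[y + dy][x + dx] and m >= mag[y - dy][x - dx]:
                out[y][x] = m
    return out
```

Keeping ties means a two-pixel-wide plateau survives as two pixels; using a strict comparison on one side is a common alternative if thinner edges are wanted.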


Solution to Quiz 2
Homework 2 Due Tuesday 29th October, in class


Lecture 13: Morphology, Connected Components- Download slides
We also covered region properties like area, centroid, bounding box, circularity, 2nd moments, which should be covered from Sec 3.6 of text.
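
A small sketch of some of these region properties for a single binary region is given here as a study aid. It assumes row/column coordinates, second-order central moments normalized by the area, and a mask that contains only one region; the function name region_properties is hypothetical, and circularity is left out. Check the definitions against Sec 3.6 of the text.

```python
def region_properties(mask):
    """Area, centroid, bounding box and second moments of a binary region."""
    pixels = [(r, c)
              for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("empty region")
    area = len(pixels)
    r_bar = sum(r for r, _ in pixels) / area
    c_bar = sum(c for _, c in pixels) / area
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return {
        "area": area,
        "centroid": (r_bar, c_bar),
        # (min row, min col, max row, max col)
        "bbox": (min(rows), min(cols), max(rows), max(cols)),
        "mu_rr": sum((r - r_bar) ** 2 for r, _ in pixels) / area,
        "mu_cc": sum((c - c_bar) ** 2 for _, c in pixels) / area,
        "mu_rc": sum((r - r_bar) * (c - c_bar) for r, c in pixels) / area,
    }
```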

Lecture 14: Corrected Morphology slides - Download slides
In this lecture we also covered subtraction, which should be followed from text (section 9.1, 9.2). I also showed several examples of subtraction in class.

Combined handout for these three lectures (no need to print, will hand out photocopies of this in class)
Lecture 15: Brightness constancy constraint
Lecture 16: Pyramids (slides), Lucas-Kanade
Lecture 17: Global Motion

Program 4 (Global Motion) is worth 20 points. For this program, you need Warping, the Reduce operation, and the fx, fy, ft derivatives (as in the recent lectures). Complete details will be announced in Lecture 17, but everyone should get these modules working, if they haven't already done so. This program may be attempted in MATLAB, but that will require recoding some of the work you have already done. Also, a MATLAB implementation may be painfully slow if you do not optimize your code. It is up to each student to decide whether to attempt it in MATLAB or not. I will provide the solution in MATLAB once the submission deadline has passed.

Test Images: penny.pgm, penny-translation.pgm, penny-scaling.pgm, penny-rotation.pgm
Transformations between these images are known, because of the way I generated them, using warping.

Transformation between
penny and penny-translation

1 0 10
0 1 20
0 0 1

Transformation between
penny and penny-scaling

1.2 0 -10
0 1.2 -10
0 0 1

Transformation between
penny and penny-rotation

cos(20) -sin(20) 20
sin(20) cos(20) -30
0 0 1

Since you already have a warping module, you can use any image, warp it according to some known transformation, and then recover the transformation to check your code.
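
To compare a recovered transformation against a known one, it can help to have small helpers for 3x3 matrices in homogeneous coordinates. The sketch below builds the three matrices listed above, applies them to a point, and measures the largest entry-wise difference between two matrices. It assumes that the angle 20 is in degrees and that each matrix acts on a column vector (x, y, 1); neither is stated above, and all function names are hypothetical.

```python
import math


def rotation(deg, tx, ty):
    """Rotation by deg degrees followed by a translation (tx, ty)."""
    a = math.radians(deg)
    return [[math.cos(a), -math.sin(a), tx],
            [math.sin(a), math.cos(a), ty],
            [0.0, 0.0, 1.0]]


def apply(m, x, y):
    """Apply a 3x3 matrix to the point (x, y) in homogeneous coordinates."""
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return xh / w, yh / w


def matmul3(a, b):
    """Product of two 3x3 matrices (b is applied first, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def max_abs_diff(a, b):
    """Largest absolute entry-wise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


TRANSLATION = [[1, 0, 10], [0, 1, 20], [0, 0, 1]]
SCALING = [[1.2, 0, -10], [0, 1.2, -10], [0, 0, 1]]
ROTATION = rotation(20, 20, -30)  # assumes the angle is given in degrees
```

A recovered matrix can then be checked with max_abs_diff(recovered, TRANSLATION); a recovered forward and backward pair can be checked by comparing matmul3 of the two against the identity.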
Matrix Inversion: If you choose to implement in C/C++, you will need matrix inversion code. I am posting a set of routines from a book. I have written an example help file, which illustrates how the routines can be used to solve a system of the form Ax=b, (actually without computing the inverse, using LU-Decomposition). Click here to download example project. Or, you may search the net and find another option.


Homework 3


Test images to check your derivative masks: [a1.pgm, a2.pgm]. The a2 image is a1 translated by (1,1), so fx * 1 + fy * 1 + ft should give zeros at most locations (except at the four corners). These images can be used (without pyramids) to check your masks.
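
The same check can be scripted. The sketch below estimates fx, fy and ft by averaging first differences over a 2x2x2 cube of the two frames, then evaluates fx * u + fy * v + ft. These particular masks are an assumption made for illustration and may differ from the ones given in the lectures; the function names are hypothetical. On an intensity ramp shifted by (1,1) the residual is exactly zero, which makes a convenient sanity test before trying a1.pgm and a2.pgm.

```python
def derivatives(i1, i2):
    """Estimate fx, fy, ft from two frames using 2x2x2 averaged differences."""
    h, w = len(i1), len(i1[0])
    fx = [[0.0] * (w - 1) for _ in range(h - 1)]
    fy = [[0.0] * (w - 1) for _ in range(h - 1)]
    ft = [[0.0] * (w - 1) for _ in range(h - 1)]
    for y in range(h - 1):
        for x in range(w - 1):
            a, b = i1[y][x], i1[y][x + 1]
            c, d = i1[y + 1][x], i1[y + 1][x + 1]
            e, f = i2[y][x], i2[y][x + 1]
            g, k = i2[y + 1][x], i2[y + 1][x + 1]
            fx[y][x] = ((b - a) + (d - c) + (f - e) + (k - g)) / 4.0
            fy[y][x] = ((c - a) + (d - b) + (g - e) + (k - f)) / 4.0
            ft[y][x] = ((e - a) + (f - b) + (g - c) + (k - d)) / 4.0
    return fx, fy, ft


def constraint_residual(fx, fy, ft, u, v):
    """Evaluate fx*u + fy*v + ft at every location."""
    return [[fx[y][x] * u + fy[y][x] * v + ft[y][x]
             for x in range(len(fx[0]))]
            for y in range(len(fx))]
```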

Simple Test data for bonus part [claire_0_5.zip] consists of 6 images, so you will have to recover 5 transformations, and warp each image back to the original location. There are two options after that, either you can blend them or paste them over one another. The results of each are [claire_0_5_blend.pgm, claire_0_5_paste.pgm]. Blending is more accurate for identifying errors, as errors are only visible on the edges of pasted images. However, the overall output looks nicer using paste-overs.

I will post longer test sequences later. You can generate your own test sequences out of large images using your warping function.


Help with Program4:

Intermediate outputs:
You can use images a1.pgm and a2.pgm at only one level to test your program. For this image pair, (u=1, v=1) for all pixels. If the images are thresholded such that the black region is zero, white region is 1, then the values of A (6x6 matrix) and B (6x1 matrix) are:

A (6x6):
25452 20370 1848 -1168 -526 -88
20370 21932 1848 -526 -1168 -88
1848 1848 168 -88 -88 -8
-1168 -526 -88 21932 20370 1848
-526 -1168 -88 20370 25452 1848
-88 -88 -8 1848 1848 168

B (6x1):
1760
1760
160
1760
1760
160

If you use the images as they are, then the values will be 255^2 times the matrices above (why?).

The result of inv(A)*B for the above matrices (no matter whether you use 0-255 or 0-1 range) is

-1.7347e-016
-4.996e-016
1
0
0
1

Depending on the routine you are using for solving linear systems, the answer may differ slightly, but only after several decimal places. For example, with the LUDCMP routine that I posted earlier on the website, the results are:

-1.12409e-009
1.11147e-008
1.00000
-3.15264e-009
2.41818e-008
1.00000
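
If you want an independent check of your linear solver, the system can also be solved with a few lines of Gaussian elimination. The sketch below is not the LUDCMP routine; it is a plain elimination with partial pivoting, written under the assumption that A is read row by row from the values listed above (A is symmetric, so the reading order does not matter). The function name solve is hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] as floats, so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


A = [[25452, 20370, 1848, -1168, -526, -88],
     [20370, 21932, 1848, -526, -1168, -88],
     [1848, 1848, 168, -88, -88, -8],
     [-1168, -526, -88, 21932, 20370, 1848],
     [-526, -1168, -88, 20370, 25452, 1848],
     [-88, -88, -8, 1848, 1848, 168]]
B = [1760, 1760, 160, 1760, 1760, 160]

solution = solve(A, B)
```

The entries of solution should agree with the vectors above up to small rounding differences.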

More test images:
You should try your program on the images given for Program 1, mecca06.pgm and mecca06t.pgm, and report the transformation. Also recover the transformation from mecca06t.pgm to mecca06.pgm and see whether the two are inverses of each other. Try at least one image of your own, by first warping the image according to some known transformation, then recovering the transformation, and report the error. Also, report your transformation for any pair of images from the claire sequence (the zip file of 6 images is given above).

Longer test sequence for bonus part:
lums-test.zip. Note that this sequence is in color, so you will have to recover the transformation from a gray version of it (use IrfanView or write your own converter), and then apply it to all three color layers to generate a color mosaic.


Lectures 18-20 cover pattern recognition concepts discussed in Chapters 4 and 10 of the text. In addition, fuzzy K-Means and the idea of the EM algorithm were presented for clustering. Bayes classifier slides are available [here]. Example applications were also discussed [download]. Webpages for these applications are linked from the following webpage: http://www.cs.ucf.edu/~vision/projects/projects.html. For all the projects discussed in class, you should know the input/output relationship and the features used in the classifiers for each problem.

For a detailed list of topics covered in the entire course, see this webpage