May 11, 2009

Rapid object detection using a boosted cascade of simple features

"Rapid object detection using a boosted cascade of simple features," Paul Viola and Michael Jones, CVPR, 2001.

This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. This work is distinguished by 3 key contributions.


1. Integral Image:
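 A new image representation that lets rectangle features be evaluated quickly; the paper uses 3 kinds of feature. The integral image at a point holds the sum of all pixels above and to the left of it, so the sum over any rectangle takes only four lookups. The value of a feature at area A is the difference between the pixel sum in the white area and the pixel sum in the gray area (figure below).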

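2. A learning algorithm based on AdaBoost:
 Selects a small number of critical features from a much larger set, yielding an efficient classifier.

3. The attentional cascade:
 Combines increasingly complex classifiers so that background regions are discarded quickly and computation is spent on promising regions.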
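
To make item 1 concrete, here is a minimal Python sketch, assuming the usual definition in which the integral image at a point holds the sum of all pixels above and to the left of it. The function names, the zero-padded layout, and the sign convention of the two-rectangle feature (left half minus right half) are illustrative assumptions, not taken from the paper.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[r][c]
    for all r < y and c < x. The extra zero row and column avoid edge checks."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle feature: left half minus right half.
    Which half counts as 'white' is an assumption made for this sketch."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

Once the table is built in a single pass over the image, every rectangle sum costs the same regardless of the rectangle's size, which is what makes evaluating many features per window cheap.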




Semantic Texton Forests for Image Categorization and Segmentation

"Semantic Texton Forests for Image Categorization and Segmentation," J. Shotton, M. Johnson, and R. Cipolla, CVPR, 2008.

This paper proposes semantic texton forests (STF) as a new low-level feature that is more efficient than k-means clustering. It also presents the bag of semantic textons (BOST), which is computed over the whole image for categorization and over local rectangular regions for segmentation. Finally, the paper uses an image-level prior, built on STF and BOST, to improve segmentation.



1. Semantic Texton Forests (STF): 
 a. In training, generate randomized decision trees as follows.
  i) At the root, randomly select a small subset I’ of the dataset I.
  ii) Split it into left and right subsets (Il, Ir) using a split function f and a threshold t, and repeat the split recursively until a leaf node is reached.
  iii) Repeat i) and ii) T times to generate T trees.
 b. Feature extraction: each pixel yields a path from root to leaf and the class distribution stored at that leaf.

2. Bags of Semantic Textons (BOST):
 a. Provides a prior estimate for a given region (the region may be the whole image).
 b. Semantic texton histogram: counts of how often each node is visited by the pixels in the region.
 c. Region prior: the average of the class distributions at the visited leaf nodes.

3. Image-level Prior (ILP):
 a. Emphasize the likely categories and discourage unlikely categories.
 b. Multiply the class distributions by the image-level prior raised to the power α, where the parameter α softens the prior.
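
The BOST quantities and the prior in items 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: node ids are assumed to be unique across all trees, each path lists the nodes one pixel visits in one tree from root to leaf, and the prior is assumed to be applied by multiplying with the prior raised to the power α and renormalizing, which is one reading of item 3b. All function names are hypothetical.

```python
from collections import Counter


def semantic_texton_histogram(paths):
    """Count how often each node is visited by the pixels of a region.
    paths: one root-to-leaf list of node ids per (pixel, tree) pair."""
    hist = Counter()
    for path in paths:
        hist.update(path)
    return hist


def region_prior(paths, leaf_distributions):
    """Average the class distributions stored at the visited leaves.
    The leaf is taken to be the last node of each path."""
    n_classes = len(next(iter(leaf_distributions.values())))
    total = [0.0] * n_classes
    for path in paths:
        for c, p in enumerate(leaf_distributions[path[-1]]):
            total[c] += p
    return [t / len(paths) for t in total]


def apply_image_level_prior(dist, prior, alpha):
    """Weight a class distribution by prior**alpha and renormalize.
    alpha = 0 ignores the prior; larger alpha trusts it more (assumed form)."""
    weighted = [p * (q ** alpha) for p, q in zip(dist, prior)]
    z = sum(weighted)
    return [w / z for w in weighted]
```

With α between 0 and 1 the prior nudges the distribution toward likely categories without fully suppressing the unlikely ones.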





AnnoSearch: Image auto-annotation by search

"AnnoSearch: Image auto-annotation by search," XJ Wang, L Zhang, F Jing, and WY Ma, CVPR, 2006.

This paper proposes a novel way to annotate images by leveraging search and data mining technologies based on the framework below.




The AnnoSearch system takes as input an image and a keyword that describes a concept in the image. As the figure above shows, the framework contains 3 stages:

1. Text-based search
Given the keyword, the system performs text-based retrieval on a large-scale, high-quality Web image database and collects the retrieved images.

2. Content-based search
Given the images retrieved by the text-based search, the system runs a content-based search to keep only those that are visually similar to the input image. For scalability, the paper adopts a hash encoding algorithm.

3. Learning annotations by clustering
After the two retrieval stages, the system uses a clustering technique called Search Result Clustering (SRC) to cluster the retrieved images and generate a readable name for each cluster. Finally, the system annotates the given image with the names of the clusters whose scores are larger than a certain threshold.
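
The three stages can be strung together in a toy Python sketch. Everything here is an assumption made for illustration: images are dictionaries with a text description and an integer hash code, visual similarity is approximated by Hamming distance between codes, and term frequency among the retrieved images stands in for SRC, which this summary does not describe in detail.

```python
from collections import Counter


def hamming_distance(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


def text_search(database, keyword):
    """Stage 1: keep images whose text contains the keyword."""
    return [img for img in database if keyword in img["text"].lower().split()]


def content_search(candidates, query_code, max_distance):
    """Stage 2: keep candidates whose hash code is close to the query's."""
    return [img for img in candidates
            if hamming_distance(img["code"], query_code) <= max_distance]


def score_terms(images, keyword):
    """Stage 3 stand-in (not SRC): score each term by the fraction of
    retrieved images whose text mentions it, excluding the keyword."""
    if not images:
        return {}
    counts = Counter()
    for img in images:
        counts.update(set(img["text"].lower().split()) - {keyword})
    return {term: n / len(images) for term, n in counts.items()}


def annotate(database, keyword, query_code, max_distance=2, min_score=0.5):
    """Run the three stages and return terms scoring above the threshold."""
    retrieved = content_search(text_search(database, keyword),
                               query_code, max_distance)
    scores = score_terms(retrieved, keyword)
    return sorted(t for t, s in scores.items() if s > min_score)
```

Comparing compact binary codes instead of full feature vectors is what keeps the second stage cheap on a large database.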


