2009年6月11日

AdaRank: a boosting algorithm for information retrieval

"AdaRank: a boosting algorithm for information retrieval," Jun Xu, Hang Li, SIGIR 2007.

This paper proposes an boosting algorithm AdaRank of the issue learning to rank for document retrieval. According to the essence of AdaBoost, AdaRank repeatedly constructs "weak rankers" on the basis of re-weighted data, and the final prediction function is also the linear combination of those "weak rankers".

The evaluation of AdaRank is MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain).

Here below is the algorithm of AdaRank



Based on Adaboost, AdaRank is a novel attempt for ranking methods. Although the accuracy of the real experiment is not so good as my expectation, I think the efficiency may be much better than some previous work.

read more...

Support vector learning for ordinal regression

"Support vector learning for ordinal regression," R. Herbrich, ICANN, 1999.

This paper proposes a new learning task for ordinal regression which is complementary to both classification and metric regression because of discrete and ordered outcome space. The formulation for this task is to formalize as a problem of binary classification by minimizing pairwise 0-1 loss.



,
where U is the rank function to output the score.

Here below is an toy example from this paper, the figure(b) is the mapping result for figure(a), and after mapping, we can simply find the margin.



This paper considers the simply concept of SVM, and uses it for ranking. Although inefficiency and not-so-good accuracy, it's still a guide for us to think about ranking problems.


read more...

2009年5月11日

Rapid object detection using a boosted cascade of simple features

"Rapid object detection using a boosted cascade of simple features," Paul Viola and Michael Jones, CVPR, 2001.

This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. This work is distinguished by 3 key contributions.


1. Integral Image:
 A new image representation with 3 kinds of feature. A simple integral image at area A is the sum of the pixels in white area subtract the sum of other pixels in gray area (figure below).

2. A Learning algorithm based on adaboost


3. The Attentional cascade



read more...

Semantic Texton Forests for Image Categorization and Segmentation

“Semantic Texton Forests for Image Categorization and Segmentation”, J Shotton, M Johnson, R Cipolla, CVPR, 2008.

This paper proposes semantic texton forests as new low-level feature, which is efficient compared with k-means clustering. This paper also presents the bag of semantics textons, which is computed over the whole image for categorization and local rectangular regions for segmentation. Finally, this paper uses an image-level prior for segmentation based on SFT and BOST.



1. Semantic Texton Forests (STF): 
 a. In training, generate randomized decision trees by the following steps.
  i) At root, randomly select a small subset I’ of dataset I.
  ii) Spilt into left and right subsets (Il, Ir) by split function f and threshold t, and repeat  splitting until leaf node.
  iii) Repeat i) and ii) for T times to generate T trees.
 b. Feature extraction: a path from root to leaf and a class distribution at leaf.

2. Bags of Semantic Textons (BOST):
 a. A prior estimate in a given region (the region could be the whole image).
 b. Semantic texton histogram: counts of each visited node of every pixels in the region.
 c. Region prior: average class distribution of each visited leaf node.

3. Image-level Prior (ILP):
 a. Emphasize the likely categories and discourage unlikely categories.
 b. Multiply the distributions by parameter α to soften the prior.




read more...

AnnoSearch: Image auto-annotation by search

XJ Wang, L Zhang, F Jing, WY Ma, AnnoSearch: Image auto-annotation by search, CVPR, 2006.

This paper proposes a novel way to annotate images by leveraging search and data mining technologies based on the framework below.




The AnnoSearch system, its input is an image and a keyword which describes a concept of this image. As the above figure, the framework contains 3 stages:

1. Text-based search
Given the keyword, the system do text-based retrieval on a large-scale and high-quality Web image database and get the retrieved images.

2. Content-based search
Given the retrieved images by above text-based search, the system does content-based search to ensure the visual similarity. For scalability, this paper adopts a hash encoding algorithm.

3. Learning annotations by clustering
After finishing the above retrieval stages, the system uses an effective clustering technique called Search Result Clustering (SRC) to cluster the retrieved images and generate readable name with each cluster. The system finally annotates the given image with the names of the clusters whose scores is larger than a certain threshold.



read more...

2009年3月22日

Nonlinear dimensionality reduction by locally linear embedding

"Nonlinear dimensionality reduction by locally linear embedding," Roweis & Saul, Science, 2000.



This paper present an unsupervised learning algorithm, locally linear embedding (LLE), which can construct a neighborhood preserving mapping based on the steps below:

  1. Assign K neighbors for each data point Xi.
  2. Reconstruct Xi from its neighbors with linear weight, by minimize the cost function
  3. Compute the low-dimensional embedding vector Yi, by minimize the function

read more...

Eigenfaces for Reconition

"Eigenfaces for recognition," M Turk, A Pentland - Journal of Cognitive Neuroscience, 1991.




This paper propose a feature for dimension reduction. The significant features are known as "eigenfaces", and the center idea is to project face images onto the eigenfaces based on PCA. The main procedure is:
  1. Given a set of face images :
  2. Define the average face :
  3. Each face differs from the average face :
  4. Construct the M by M matrix L = A'A in replace of the covariance matrix C since there are at most M-1 meaningful eigenfaces, where M is the number of training data and
  5. Now we can use L to find eigenfaces :

I think the "eigenfaces" is an intelligent and interesting idea in 1990s like today's SIFT and visual words, even though the times that computer is not so powerful nowadays. Although the approach is not work with unclear face images, it still gives us a idea for handling today's multimedia data.


read more...

2009年3月18日

[Note] 20090325


  • feature significance (without label / subset)
    • tf-idf, c-tf-idf
  • feature significance (with label / subset)
    • MI, x^2, correlation (subset)
  • exploiting feature correlation (without label / transform)
    • PCA, manifold methods (LLE, ISOMAP)
  • exploiting feature correlation (with label / transform)
    • fisher (LDA)
  • optimizing classification (subset)
    • adaboost, maximum entropy
  • exploiting hidden semantics (topics)
    • pLSA, SVD, LSI, Information Bottleneck




read more...

2009年3月15日

ContextSeer: Context Search and Recommendation at Query Time for Shared Consumer Photos

"ContextSeer: Context Search and Recommendation at Query Time for Shared Consumer Photos," Yi-Hsuan Yang, Po-Tun Wu, Ching-Wei Lee, Kuan-Hung Lin, Winston H. Hsu, ACM Multimedia 2008.




The advent of media-sharing sites has drastically increased the volume of multimedia resources on the web, yet these collections become more difficult for searching and navigating result from their magnitude. This paper present a novel search system called ContextSeer to improve search quality:
  1. To develop wc-tf-idf, by improving the concept tf-idf with more emphasis on the higher-ranked object, since the context cues of lower-ranked objects should be less important than those of higher- ranked ones.
  2. To propose a efficient canonical image selection algorithm cannoG, which generates multiple representative views related to the query.
  3. To form a new benchmark Flickr550, which contains 0.5 million public photos from Flickr through Flickr API, for evaluating ContextSeer.


read more...

2009年3月4日

Distinctive Image Features from Scale-Invariant Keypoints

"Distinctive Image Features from Scale-Invariant Keypoints," Lowe, IJCV, 2004.




This paper presents a new method for image matching, which is called Scale Invariant Feature Transform(SIFT). SIFT can extract distinctive features from image data, and the features are invariant to image rotation and scale, and robust matching across affine distortion, addition of noise, and change in illumination.

  1. Detection of scale-space extrema: use difference of Gaussian(DOG) to identify the candidate keypoints.
  2. Accurate keypoint localization: do keypoint selection for rejecting some unstable candidate keypoints.
  3. Orientation assignment: assign an orientation to each keypoint based on local image properties for invariance to image rotation.
  4. Local image descriptor: tranform into a representation with above parameters.

I think local points is an creative idea that I have learnt in retrieve system, and the accuracy of SIFT is also amazing to me. Clustering had been move on a big step since SIFT.



read more...

2009年2月27日

How to give a good research talk

"How to give a good research talk," Jones et. al.




As a new graduate, I have no experience to present my work at the stage since I do not have work. I only have the experience to present other person's work at meeting. Therefore, I feel hard to understand some suggestion of the paper such as don't put outlines in the start of the slides, don't start preparing slides early.



I summarize the paper as below.


What to say?

  • Using examples to prevent your talk from too abstract, since your talk will persuade your listeners to read your paper.
  • Do not waste time on an abstract description of what the talk is about, just try to jump straight in with an example.
  • Be honest to talk about the difficulties in this paper, let the audience help your research!

How to prepare slides?

  • Don't start with contents slidessuch as outline, it wastes your precious minute.
  • Use some aids such as overhead projector.
  • Don't start writing slides too early.


When giving the talk,

  • Try steady, deep breathing and relaxation exercises beforehand, and make your eyes contact with audience on the stage.
  • Do not over-run.


read more...

How to Read a Paper

"How to Read a Paper," Keshav, ACM SIGCOMM Computer Communication Review 2007



I think this paper provide a good way to survey papers.

I use the two steps to survey paper before.

  • First, try to read abstract, introduction, and conclusions to see if the idea is interesting to me.
  • Second, read the contents carefully.
However, I always get stuck in the second step, and I found the reason after I finish reading this paper. The most problem is that I still not have sufficient knowledge to catch the paper, since I never read the reference. I will try to use the third pass to read paper from now on.

As doing a literature survey, I always go to the top conferences' website to see the recent interesting topics and the relative paper.


I summarize the paper as below.

The Three-pass apporach
  • 1st pass
    To read title, abstract, introduction, and conclusions, try to get the overall idea of the paper.
    => Able to answer five Cs : Category, Context, Correctness, Contributions, Clarity.
  • 2nd pass
    To grasp the content of the paper, including the related figures and diagrams.
    => Summarize the main idea of the paper, and then go to 3rd pass if in need.
  • 3rd pass
    To try to virtually re-implement the paper.
    => The most difficult step, but will be really realize the paper after this pass.

Doing a literature survey

  1. Use academic search engine (Ex:Google Scholar) to find some recent papers.
  2. Find shared citations and repeated author names.
  3. Go to website for some top conferences and look through their recent topics.


read more...

2009年2月26日

New Blog

It's the blog about the class AMMAI.



read more...