Robust Building Identification for
Mobile Augmented Reality
Vijay Chandrasekhar, Chih-Wei Chen, Gabriel Takacs
Abstract—Mobile augmented reality applications have received
considerable interest in recent years as camera-equipped mobile
phones have become ubiquitous. We have developed a “Point and
Find” application in which a user points a cell phone at a building
on the Stanford campus and retrieves relevant information about
that building on the phone. Recognizing buildings under varying
lighting conditions and in the presence of occlusion and clutter
remains a challenging problem. Nister’s Scalable Vocabulary Tree
(SVT) [1] approach has received considerable interest for large-scale
object recognition. The scheme uses hierarchical k-means to create
a vocabulary of features, or “visual words”. We first show how an
SVT combined with an entropy-based ranking metric achieves 100%
recognition on the well-known ZuBuD data set [2]. We then present
an SVM kernel-based extension to the SVT approach and show that
it also achieves a 100% recognition rate. Finally, we discuss the
shortcomings of the ZuBuD data set and present a more challenging
Stanford-Nokia data set, on which we report promising results.
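To make the SVT pipeline concrete, the sketch below illustrates how
such a vocabulary can be built with hierarchical k-means and how a
descriptor is quantized to a visual word. It is a minimal illustration
under our own assumptions (the branch factor and depth shown, and
SciPy's kmeans2 for the clustering step); it is not the implementation
evaluated in this paper.

import numpy as np
from scipy.cluster.vq import kmeans2

class VocabTreeNode:
    """One node of a vocabulary tree; each leaf is one visual word."""
    def __init__(self):
        self.centroids = None   # (k, d) cluster centers at this node
        self.children = []      # one subtree per centroid
        self.word_id = None     # leaf index, assigned after construction

def build_tree(descriptors, branch_factor=10, depth=6):
    """Hierarchical k-means: cluster, then recurse on each partition."""
    node = VocabTreeNode()
    if depth == 0 or len(descriptors) < branch_factor:
        return node  # maximum depth or too few points: make a leaf
    centroids, labels = kmeans2(descriptors.astype(np.float64),
                                branch_factor, minit='++')
    node.centroids = centroids
    for k in range(branch_factor):
        node.children.append(build_tree(descriptors[labels == k],
                                        branch_factor, depth - 1))
    return node

def assign_word_ids(root):
    """Number the leaves so that each leaf is one visual word."""
    stack, next_id = [root], 0
    while stack:
        node = stack.pop()
        if node.children:
            stack.extend(node.children)
        else:
            node.word_id = next_id
            next_id += 1
    return next_id  # vocabulary size

def quantize(root, descriptor):
    """Greedily descend to the nearest centroid at each level."""
    node = root
    while node.children:
        dists = np.linalg.norm(node.centroids - descriptor, axis=1)
        node = node.children[int(np.argmin(dists))]
    return node.word_id

At query time, each local descriptor of the query image is quantized
in this way, and database images are ranked by comparing the resulting
visual-word histograms, e.g. with the entropy (TF-IDF) weighting of [1].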
I. INTRODUCTION
HIGH-END mobile phones have developed into capable
computational devices equipped with high-quality
color displays, high-resolution digital cameras, and real-time
hardware-accelerated 3D graphics. They can exchange infor-
mation over broadband data connections and determine their
locations using GPS. These devices enable many new types of
services, such as car or pedestrian navigation aids, tourist
guides, or tools for comparison shopping. For many of these
services, knowing the user's location is a critical clue, but location
alone is not enough. For example, a tourist may be interested in
information on several objects or stores visible from the same
location. Pointing with a camera-phone provides a natural way
of indicating one's interest and browsing the information available
at a particular location.