Registering Ground and Satellite Imagery for Visual Localization

Report No. ARL-TR-6105
Authors: Philip David and Sean Ho
Date/Pages: August 2012; 28 pages
Abstract: Accurately determining the position and orientation of an observer (a vehicle or a human) in outdoor urban environments is an important and challenging problem. The standard approach is to use the global positioning system (GPS), but this system performs poorly near tall buildings where line of sight to a sufficient number of satellites cannot be obtained. Most previous vision-based approaches for localization register ground imagery to a previously generated ground-level model of the environment. Generating such a model can be difficult and time consuming, and is impractical in some environments. Instead, we propose to perform localization by registering a single omnidirectional ground image to a two-dimensional (2D) urban terrain model that is easily generated from aerial imagery. We introduce a novel image descriptor that encodes the position and orientation of a camera relative to buildings in the environment. The descriptor is efficiently generated from edges and vanishing points in an omnidirectional image and is registered to descriptors previously generated for the terrain model. Rather than constructing a local computer-aided design (CAD)-like model of the environment, which is difficult in cluttered environments, our descriptor measures, at equally spaced intervals over the 360° field of view, the orientation of visible building facades projected onto the ground plane (i.e., the building footprints). We evaluate our approach on an urban data set with significant clutter and demonstrate an accuracy of about 1 m, which is an order of magnitude better than commercial GPS operating in open environments.
Distribution: Approved for public release
If you are visually impaired or need a physical copy of this report, please contact DTIC.
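
As a rough illustration of the kind of descriptor the abstract describes (facade orientation sampled at equally spaced bearings over 360°, measured against building footprints in a 2D terrain model), the following minimal Python sketch casts rays from a camera position into a set of footprint wall segments. It is not the report's algorithm: the function name `facade_descriptor`, the parameters `n_bins`, `heading_deg`, and `max_range`, and the representation of footprints as plain line segments are all assumptions made for this example.

```python
import math

def facade_descriptor(walls, cam_xy, heading_deg=0.0, n_bins=8, max_range=100.0):
    """Hypothetical sketch: sample facade orientation at equally spaced bearings.

    walls       -- list of ((x1, y1), (x2, y2)) footprint segments in a 2D map
    cam_xy      -- (x, y) camera position in the same map frame
    heading_deg -- camera heading; bearings and orientations are relative to it
    Returns a list of n_bins entries: the orientation in degrees, in [0, 180),
    of the nearest wall hit by each ray, or None where no wall is within range.
    """
    cx, cy = cam_xy
    descriptor = []
    for k in range(n_bins):
        theta = math.radians(heading_deg) + 2.0 * math.pi * k / n_bins
        dx, dy = math.cos(theta), math.sin(theta)
        best_t, best_angle = max_range, None
        for (x1, y1), (x2, y2) in walls:
            ex, ey = x2 - x1, y2 - y1
            denom = dx * ey - dy * ex          # 2D cross product of ray and wall
            if abs(denom) < 1e-12:
                continue                       # ray is parallel to this wall
            wx, wy = x1 - cx, y1 - cy
            t = (wx * ey - wy * ex) / denom    # distance along the ray
            s = (wx * dy - wy * dx) / denom    # fractional position along the wall
            if 0.0 < t < best_t and 0.0 <= s <= 1.0:
                best_t = t
                wall_deg = math.degrees(math.atan2(ey, ex))
                best_angle = (wall_deg - heading_deg) % 180.0
        descriptor.append(best_angle)
    return descriptor

# Two walls of a hypothetical building corner, camera at the origin.
walls = [((5.0, -5.0), (5.0, 5.0)), ((-5.0, 5.0), (5.0, 5.0))]
print(facade_descriptor(walls, (0.0, 0.0), n_bins=4))
```

Under these assumptions, descriptors computed this way for many candidate positions in the terrain model could be compared against one extracted from the omnidirectional image; how the report extracts and matches its descriptors is not specified in the abstract.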

Last Update / Reviewed: August 1, 2012