Abstract
While humans can readily identify basic-level categories such as table, turtle, or trumpet, recognition of subordinate-level categories within a domain (e.g. species of birds or make/model/year of vehicles) is very difficult and typically requires extensive experience or expertise with a given domain. To date, research efforts to develop computational approaches for the recognition of such subordinate or “fine-grained” categories have largely sought to apply the same techniques used for basic-level recognition, only on a larger scale (more categories).
In this talk, I will describe directions that we are currently pursuing in my lab to address the specific challenges inherent in fine-grained recognition. The key underlying paradigm is a pose-normalized representation which pairs a captured domain-level model of geometry with category-specific appearance models. This representation enables objects to be perceived independently of pose, articulation, or viewing angle. Distinguishing features are learned, and recognition is performed, in this pose-normalized space. I will conclude by discussing the integration of human domain expertise into computational models and the diverse applications of fine-grained recognition.