Abstract: The web is not yet perfect: while text is easily searched and organized, pictures (the vast majority of the bits that one can find online) are still digital dark matter. To see how one could make pictures first-class citizens of the web, I explore the idea of Visipedia, a visual interface for Wikipedia that is able to answer visual queries and enables experts to contribute and organize visual knowledge. Five distinct groups of humans would interact through Visipedia: users, experts, editors, visual workers, and machine vision scientists. The latter would gradually build automata able to interpret images. I explore some of the technical challenges involved in making Visipedia happen. I argue that Visipedia will likely grow organically, combining state-of-the-art machine vision with human labor, and I will present experiments suggesting that machines and humans may be combined into a seamless visual information processing system.
Joint work with P. Welinder, S. Belongie, S. Branson, K. Wah