ABSTRACT
Deep convolutional neural networks (CNNs) are central to modern computer vision systems. This talk covers two recent works that explore new ideas in CNN architectures and training procedures.
The first proposes a multigrid extension of CNNs. Here, network layers operate across scale space, consuming multigrid inputs and producing multigrid outputs; convolutional filters themselves have both within-scale and cross-scale extent. Multigrid structure enables such networks to learn internal attention and dynamic routing mechanisms, and use them to accomplish tasks on which standard CNNs fail.
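To make the multigrid idea concrete, here is a minimal sketch of one such layer in PyTorch (the class name, scale handling, and resampling choices are illustrative assumptions, not the exact architecture presented in the talk). Each output scale gathers its own grid together with resampled copies of its finer and coarser neighbors before a single convolution, so the filter effectively has both within-scale and cross-scale extent:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultigridConv(nn.Module):
    # One layer mapping a pyramid of feature maps to a pyramid of feature maps.
    def __init__(self, in_channels, out_channels, num_scales=3):
        super().__init__()
        # One 3x3 convolution per output scale; its input stacks the
        # same-scale grid with its finer and coarser neighbors.
        self.convs = nn.ModuleList(
            nn.Conv2d(3 * in_channels, out_channels, kernel_size=3, padding=1)
            for _ in range(num_scales)
        )

    def forward(self, pyramid):
        # pyramid: list of tensors [B, C, H/2^s, W/2^s], s = 0 (finest) .. S-1 (coarsest)
        outputs = []
        for s, conv in enumerate(self.convs):
            h, w = pyramid[s].shape[-2:]
            same = pyramid[s]
            finer = pyramid[s - 1] if s > 0 else pyramid[s]
            coarser = pyramid[s + 1] if s + 1 < len(pyramid) else pyramid[s]
            # Resample the neighboring scales to this scale and convolve the stack.
            stacked = torch.cat([
                same,
                F.interpolate(finer, size=(h, w), mode='bilinear', align_corners=False),
                F.interpolate(coarser, size=(h, w), mode='bilinear', align_corners=False),
            ], dim=1)
            outputs.append(F.relu(conv(stacked)))
        return outputs

# Usage: a 3-scale pyramid of 16-channel feature maps.
pyramid = [torch.randn(2, 16, 32 // 2**s, 32 // 2**s) for s in range(3)]
layer = MultigridConv(in_channels=16, out_channels=32)
out_pyramid = layer(pyramid)  # three tensors, each with 32 channels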
The second constructs custom regularization functions for use in supervised training of CNNs. The technique applies when the ground-truth labels themselves exhibit internal structure; it derives a regularizer by learning an autoencoder over the set of annotations. Training thereby becomes a two-phase procedure. The first phase models the labels with an autoencoder. The second phase trains the actual network of interest, attaching an auxiliary branch that must produce its predictions through a hidden layer of that autoencoder.
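As a rough illustration of the two-phase procedure (module and parameter names here are invented for the sketch, not taken from the talk), phase one fits a small convolutional autoencoder to one-hot label maps, and phase two trains the task network with an auxiliary branch whose latent prediction is decoded by the frozen autoencoder and supervised against the same labels:

import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelAutoencoder(nn.Module):
    # Phase 1: learn to reconstruct ground-truth label maps.
    def __init__(self, num_classes, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(num_classes, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, latent_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, labels_onehot):
        z = self.encoder(labels_onehot)
        return self.decoder(z), z

def phase2_loss(task_logits, aux_latent, labels, frozen_ae, weight=0.1):
    # Phase 2: the task network emits per-pixel logits plus an auxiliary
    # latent prediction; the autoencoder's parameters stay fixed.
    main_loss = F.cross_entropy(task_logits, labels)
    # Route the auxiliary prediction through the autoencoder's decoder,
    # so the branch must predict output via the learned hidden layer.
    decoded = frozen_ae.decoder(aux_latent)
    aux_loss = F.cross_entropy(decoded, labels)
    return main_loss + weight * aux_loss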
Joint work with Tsung-Wei Ke and Stella X. Yu, and with Mohammadreza Mostajabi and Gregory Shakhnarovich, respectively.