Classification by multilayer neural networks depends on the existence of appropriate features in the early hidden layers, so that the representations become linearly separable in the penultimate layer. By using hidden layers with just two or three units, the representational structure of the intermediate layers can be visualized directly. The evolution of the hidden-layer representations over the course of training is visualized as animations. The visualizations reveal a tendency for the hidden-unit image of the input space to collapse onto a non-linear (warped) manifold of lower dimensionality; i.e., the weight matrix becomes (nearly) singular. A task that is not linearly separable in the input space is rendered linearly separable by this warping of the manifold. An open problem is to understand the ramifications of the collapse, and what it may indicate about intermediate representations in deep networks.
427 Thackeray Hall
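The setup described in the abstract can be sketched in a few lines of numpy; this is a minimal illustration, not the speaker's code. The specifics here are assumptions: XOR as the task that is not linearly separable in the input space, a two-unit tanh hidden layer so its activations can be plotted directly in 2-D, a sigmoid output, plain gradient descent on squared error, and snapshots of the hidden-layer image of the inputs taken at intervals to serve as animation frames.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: four points that are NOT linearly separable in the 2-D input space.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# Tiny MLP: 2 inputs -> 2 tanh hidden units -> 1 sigmoid output.
# With only 2 hidden units, the hidden representation is a 2-D image
# of the input space and can be plotted directly.
W1 = rng.normal(scale=1.0, size=(2, 2))
b1 = np.zeros(2)
W2 = rng.normal(scale=1.0, size=(2, 1))
b2 = np.zeros(1)

def forward(X):
    H = np.tanh(X @ W1 + b1)                      # hidden-layer image of the inputs
    out = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))    # sigmoid output
    return H, out

lr = 1.0
snapshots = []                                    # hidden representations over training
for step in range(3000):
    H, out = forward(X)
    if step % 500 == 0:
        snapshots.append(H.copy())                # one frame of the animation
    # Backpropagate squared error through the sigmoid and tanh units.
    d_out = (out - y) * out * (1 - out)
    dW2 = H.T @ d_out
    db2 = d_out.sum(0)
    d_H = (d_out @ W2.T) * (1 - H ** 2)
    dW1 = X.T @ d_H
    db1 = d_H.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

H, out = forward(X)
```

Plotting each entry of `snapshots` as a scatter of the four points in the 2-D hidden space (e.g., as frames of a `matplotlib` animation) shows how training warps the hidden-unit image of the inputs; if the representation collapses toward a curve, the two rows of the hidden activations become nearly linearly dependent, the near-singularity noted in the abstract.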