DistBelief supported convolutions, and end-to-end training of them via backprop, before the ICML 2012 paper appeared (even though that particular paper didn't use convolutions).
DistBelief Convolutions Backpropagation Training Before ICML 2012