We didn't investigate this closely, but my intuition is that it acted like a curriculum: the model could learn the "short" connections first and then work backwards to the longer ones. Take the sequence C' B' A' A B C: learning the A' to A translation is simple because there aren't many layers in between, whereas C' and C sit much further apart.
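To make the "short connections first" intuition concrete, here is a minimal sketch that counts how many steps separate each source token from its matching target token. It assumes the sequence C' B' A' A B C is read as a reversed source (C' B' A') followed directly by its target (A B C); that reading, and the helper name `pair_distances`, are illustrative assumptions rather than anything specified above.

```python
def pair_distances(n, reverse_source):
    """Steps between source token i and target token i when a source of
    length n is followed directly by a target of length n.

    Hypothetical helper for illustration only.
    """
    source_order = list(range(n))
    if reverse_source:
        source_order.reverse()
    # Position of each source token inside the concatenated sequence.
    source_pos = {tok: pos for pos, tok in enumerate(source_order)}
    # Target token i sits at position n + i.
    return [(n + i) - source_pos[i] for i in range(n)]


if __name__ == "__main__":
    # Reversed source: C' B' A' A B C
    print(pair_distances(3, reverse_source=True))   # [1, 3, 5]
    # Forward source:  A' B' C' A B C
    print(pair_distances(3, reverse_source=False))  # [3, 3, 3]
```

With the source reversed, the first pair (A' to A) is adjacent and the gaps grow from there, which is the graded easy-to-hard ordering the curriculum intuition relies on. With the source in forward order, every pair is the same distance apart, so there is no easy pair to start from.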
Curriculum Learning: Progressive Neural Network Training Strategy