I'm not sure what the more recent and much larger Gemma-2 models (2B, 9B, and 27B parameters) have to do with this. The main issue is that your paper claimed that training a small 213M-parameter transformer model emitted 284 t CO2e, when in fact the correct number is 0.087 t CO2e, or
Paper’s CO2 emissions claim for small transformer model corrected