AI Dynamics

Global AI News Aggregator

Von Neumann’s Hypothetical Transformer Architecture Hyperparameters

I wonder if von Neumann had a large d_model, n_layer, head_size, block_size, or KV cache. Each of these hyperparams might manifest slightly differently.

→ View original post on X (@karpathy)
