Oh yes, I've been thinking about similar things. So you want to condense sequences of token embeddings into a single vector to save compute. I think my research suggests this should work in theory: you can condense a lot of text into a single vector without losing information.
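For concreteness, one naive way to condense a sequence of token embeddings into a single fixed-size vector is simple pooling. This is just an illustrative sketch (mean pooling, with made-up shapes), not the specific scheme under discussion:

```python
import numpy as np

# Hypothetical token embeddings: a sequence of 5 tokens, each an 8-dim vector.
rng = np.random.default_rng(0)
token_embeddings = rng.standard_normal((5, 8))

# Condense the whole sequence into one vector by mean pooling over the
# sequence axis. This gives a fixed-size summary regardless of sequence
# length; whether it preserves enough information is the open question.
condensed = token_embeddings.mean(axis=0)

print(condensed.shape)  # (8,)
```

A learned compressor (e.g. attention over the sequence) would replace the mean here; the point is only that the output size is independent of the input length.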