A hard test of an LLM is its ability to write a sestina, the hardest poetic form. Claude 3 is very good, and a much better writer, but it struggles a little more than GPT-4 with form, messing up a few lines. Neither can pull off the envoi at the end. Compare to a 3.5-class model like Grok.
