What is your research question, and why would word sequences that reflect the frequencies in an undocumented corpus be a valuable source of information for answering it? (And no, this isn't only about GPT-4; that's just the model in question today.)
Corpus Frequencies as Research Data for LLM Analysis