In addition to the CAISI evaluation, it would be useful if NIST conducted public tests of AI abilities as an independent evaluator, though those obviously should not be pre-release tests and can be done once models are public. Independent testing is important and is getting expensive.