AI Dynamics

Global AI News Aggregator

kNN Gzip Method Outperforms Cosine Similarity on IMDb Reviews

A quick experiment, which took a while because kNN scales poorly to large datasets: the kNN + gzip method does a bit better than cosine similarity on count vectors. On the IMDb movie review dataset:
– 70% test accuracy for gzip
– 65% test accuracy for cosine distance

My (re)implementation code:
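
The code linked from the original post is not reproduced on this page. As an illustration only, here is a minimal sketch of what such a comparison might look like, assuming the gzip method scores text pairs with the normalized compression distance (NCD) and the baseline uses whitespace-tokenized word counts. The function names (clen, ncd, cosine_distance, knn_predict), the tokenization, and the default k are all hypothetical choices, not taken from the post.

```python
import gzip
import math
from collections import Counter


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance (assumed metric for the gzip method)."""
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def cosine_distance(a: str, b: str) -> float:
    """1 - cosine similarity of word-count vectors (whitespace tokens, assumed)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty text as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(query, train_texts, train_labels, distance, k=3):
    """Majority vote among the k training texts closest to the query."""
    order = sorted(
        range(len(train_texts)),
        key=lambda i: distance(query, train_texts[i]),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data for illustration; not the IMDb dataset.
    texts = ["great fun movie", "great acting great fun",
             "boring dull movie", "dull boring plot"]
    labels = [1, 1, 0, 0]
    for name, dist in [("gzip", ncd), ("cosine", cosine_distance)]:
        print(name, knn_predict("great fun", texts, labels, dist, k=3))
```

Each prediction computes a distance from the query to every training text, and the gzip variant compresses a concatenated pair each time, which is consistent with the slow runtime on a large dataset mentioned above.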

→ View original post on X (@rasbt)
