Thanks Alex. I like the papers that do this, but I also have some concern when it is done on a dataset that is proprietary and that only Google has access to (JFT). I would like to see a version that is pretrained on LAION. This still has privacy issues, but at least it is all public.
REGULATION
-
Privacy in AI: Beyond Training and Model Usage
By
–
And finally, privacy is… hard! While a lot of work focuses on training and using models privately, this is a narrow view of privacy, which encompasses much more. 14/n
-
Privacy-Respecting Public Pre-Training Datasets for AI Models
By
–
So where do we go from here? We conclude with a number of suggestions for the field. The first one focuses on making sure we have public pre-training sets which are truly privacy-respecting. Can we make such a dataset/model with comparable utility to what people use now? 12/n
-
Public Data vs Private ML Training Ethics
By
–
1. Publicly available data is not the same as public data. For example, http://insecam.org has livestreams from video cameras with default passwords. This is publicly available, but it certainly should not be used to train an ML model which purports to be "private." 6/n
-
Privacy Challenges in Public Data Pretraining and Fine-tuning
By
–
Seems great, right? Public data is plentiful online, we can just download tons of it, pretrain our models with this public data, and do fine-tuning privately! Privacy is solved! Of course not, and we highlight three (orthogonal) considerations for these settings. 5/n
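The "fine-tuning privately" step in the pipeline above usually means DP-SGD: clip each example's gradient, then add Gaussian noise before the update. A minimal sketch below, with an illustrative logistic-regression loss; the function name and hyperparameter values are my own choices, not from the paper.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, noise_mult=1.1, rng=None):
    """One DP-SGD update: per-example gradient clipping + Gaussian noise.

    Sketch for logistic regression; `clip` bounds each example's L2
    gradient norm, `noise_mult` scales the Gaussian noise relative to
    the clipping norm (the pair determines the privacy accounting).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = []
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + np.exp(-xi @ w))           # sigmoid prediction
        g = (p - yi) * xi                           # per-example gradient
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip / max(norm, 1e-12)))  # clip
    g_sum = np.sum(clipped, axis=0)
    g_sum = g_sum + rng.normal(0.0, noise_mult * clip, size=w.shape)
    return w - lr * g_sum / len(X)                  # averaged noisy step
```

In the setting the thread critiques, `w` would be (a head on top of) a model already pretrained non-privately on scraped "public" data, so only this fine-tuning step carries a differential-privacy guarantee.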
-
Differential Privacy Challenges in Large-Scale AI Pretraining
By
–
New paper w/ Nicholas Carlini & @florian_tramer: "Considerations for Differentially Private Learning with Large-Scale Public Pretraining." We critique the increasingly popular use of large-scale public pretraining in private ML. Comments welcome. https://arxiv.org/abs/2212.06470 1/n
-
The Rise and Fall of Peer Review System Integrity
By
–
Do a little fraud // get a paper published // get down tonight https://experimentalhistory.substack.com/p/the-rise-and-fall-of-peer-review
-
OpenAI’s Missing GPT Detector Tool Release
By
–
I'm a bit surprised OpenAI didn't release an updated GPT detector together with ChatGPT. Seems like something that would be useful for the impacted sectors, no?
-
Fair Compensation for Artists’ Work Used by AI Systems
By
–
I think there's also a discussion to be had about "can it be fair for people to benefit from artists' work without the artists being compensated", regardless of whether it's legal. I also think coders should feel the same way about this as they feel about their code being used by OpenAI Codex.
-
Regulation Ineffective for Protecting Industries from Automation
By
–
I agree 1) is stronger than 2), although I'm not sure it's *really* strong. In general, there are lots of good arguments *against* using regulation to prop up industries that are impacted by automation.