Outstanding paper at #NeurIPS22. Main idea: is it possible to figure out when test data is coming from classes unknown during training? First they prove an impossibility theorem, then give positive/constructive results to characterize the learnability of OOD detection. https://openreview.net/pdf?id=sde_7ZzGXOE…
SAFETY
-
Detecting Out-of-Distribution Data: Impossibility and Learnability Theorems
-

Stable Diffusion 2 Improves Safety Filter with SFW Focus
was catching up on this reading and noticed that the #NeurIPS2022 paper on “Red-Teaming the Stable Diffusion Safety Filter” is already out of date thanks to #StableDiffusion2. SD2 becoming an SFW "foundational txt2img" model means fewer spurious NSFW triggers! behold, dolphins!
-
Latest LLM Reads Text on Wall Warning of Impact
The latest LLM can read writing on a wall. It says "You're about to hit me."
-

Hallucination Risk in Generative AI Products and Market Response
Hallucination is an existential risk to any generative AI product, and people are blindly (irresponsibly?) forging ahead (and we're all tired and wary of "haha this was all AI!" rugpulls). Observing the different reaction to @MetaAI's Galactica vs @metaphorsystems is instructive
-
InstructGPT/RLHF Tuning Makes Model Assume All Questions Answerable
My guess is this is InstructGPT/RLHF rather than anything in the pre-training corpus. Tuning implicitly makes it assume all questions are answerable — it sees all text as “ ” and Q/A is a subset of that.
-
Stable Diffusion 2.0 Shows Quality Decline Compared to 1.5
plot twist: stable diffusion 2.0 looks quite a bit worse on the few prompts i've tried so far compared to 1.5 (even not including celebrities/artists). Running theory seems to be this is due to an aggressive data sanitization campaign since the original release (?).
-
Police Defunding Leads to Killer Robot Authorization Policy
Defund the police is having some unintended consequences. https://missionlocal.org/2022/11/killer-robots-to-be-permitted-under-sfpd-draft-policy/…
-
SBF’s AI Safety Funding as Justification for Crimes
Ostensibly, the funding for AI (especially AI safety) forms a key part of the motivation for SBF's crimes. Now, "motive" is never simple, but it's likely at least a factor in how SBF justified the crimes to himself. This was the end that supposedly justified the means.
-
AI Performance Limitations on Unconstrained Web Search Tasks
It works well when it’s forcibly constrained to sites like Reddit, Twitter, etc. It just can’t be trusted to find good sites.