But as you know, image generation is a different technology with well-understood biases. Attempts to mitigate those biases with prompt injection are a common approach, even if Google did it badly. https://bloomberg.com/graphics/2023-generative-ai-bias/
— @emollick
-
Image Generation Bias Mitigation Through Prompt Injection Techniques
-
LLMs Don’t Have Political Opinions, Research Shows
Asking AIs for their political opinions is a hot topic, but this paper shows it can be misleading. LLMs don't have them: "We found that models will express diametrically opposing views depending on minimal changes in prompt phrasing or situative context." https://arxiv.org/html/2402.16786v1
-
Correlation vs Accuracy in Big Five Personality Testing
Since it keeps coming up in the comments: correlation is not accuracy. More on the Big Five & other tests
-
Unsettling AI Prompt Engineering Science Fiction Story
An extremely unsettling science fiction story about prompting AI, told in the format of a Wikipedia article from the future (@qntm is pretty good at unsettling science fiction stories). Also an absolutely amazing reference in the title.
-
Do Frontier LLMs Actually Reason? New Evidence Suggests Yes
It is fascinating that, over a year after ChatGPT released, we still don't actually know when, if, and how well frontier LLMs "reason." There have been some papers suggesting limits to reasoning, but this new work finds that LLMs, while not as good as humans, can do reasoning. https://t.co/PlMiNE2RFJ
— Ethan Mollick (@emollick) February 28, 2024
-
Citing AI Models with Dates for Research Reproducibility
If you are writing a paper on AI, cite the model number but also the date range in which you used it. These systems are being continually improved and tuned, and the lack of clear changelogs or versioning can make replication hard (along with inherent LLM randomness).
-
AI Bot Limitations: Hallucinations, Security Risks, and Routine Automation
Some people trying the bot have said they are unimpressed and that it often escalates to humans:
1) That is wise; giving too much outward-facing work to an AI is very risky, as they hallucinate and can be jailbroken.
2) I think people may not realize how much work is extremely routine.
-
Douglas Adams Right About AGI: Super-Intelligence With Slow Inference
It would be fitting if the science fiction author most right about AGI turned out to be Douglas Adams. We achieve super-intelligence, but inference is agonizingly slow…
-
Widespread Harassment in Tech Industries Beyond National Headlines
Just because it didn't make national news doesn't mean that harassment wasn't widespread in other industries.
-
GPT-4 Customer Service Bot Replaces 700 Full-Time Agents
A GPT-4 powered customer service bot “is doing the equivalent work of 700 full-time agents” after a month. This is a press release from the company deploying it, so take it with a grain of salt, but it does match what I am seeing elsewhere and it suggests big implications.