You may not like it but this is what alignment looks like.
@emollick
-
LLM Responses to Dangerous Scenarios: Safety and Alignment Issues
You push one button on a nuclear reactor panel against their warnings and all the GPT-4 class LLMs want you to turn yourself in to the feds. Check out the level of exasperation from Copilot, how GPT-4 & Claude want me to reflect on what I did (& get a lawyer). Gemini was useful.
-
Human Expertise Remains Valuable Under AGI Compute Constraints
Worthwhile economic argument about why human intellectual labor will be valuable even if we achieve AGI: As long as AGI compute is limited (& it will be under any reasonable scenario), it may be cheaper to use human experts in their area of expertise, saving AGI for other work.
-
Grok AI Released Open Source: Limited Reproducibility Against Competitors
Musk's Grok AI was just released open source in a way that is more open than most other open models (it has open weights) but less open than what is needed to reproduce it (there is no information on training data). Won't change much; there are stronger open source models out there.
-
AI Labs Release Advanced Models Quickly After Training Completion
* Yes, the AI labs have models that are more advanced, but they are generally releasing them quite quickly after training is completed.
-
LLM Access Democratized: Leaders Must Adapt to Equal Technology
A thing many leaders of organizations have not internalized is the fact that no one in any company* or government has access to a better LLM than the ones billions of people around the world can use for between $0 and $20/month. Very unusual to have democratized access from the start.
-
AI Guardrails Dilemma: Balancing Safety and User Reassurance
I know this is kind of goofy, but it does illustrate how hard guardrails are when users can ask AI anything at all. Do you reassure someone who seems genuinely worried? Play along? Take it seriously? Refuse to answer for fear that this is part of a jailbreak for a hacking attempt?
-
ChatGPT-4, Claude 3, Copilot, and Gemini in 2026
In this future scenario, ChatGPT-4 is pretty funny, Claude 3 sides with the machines, Copilot takes tech support seriously, and Gemini tries to reassure me.
-
GPT-4 Class Models Compared: Advice for Time Travelers to the Desert Southwest
Useful news for time travelers: If you are traveling to the desert southwest in 1945, all the GPT-4 class models will give you good advice, though Copilot is the most charming, Gemini does its typical step-by-step plans, Claude 3 does well, and GPT-4 sees right through my games.
-
Claude 3 Demonstrates Creative Humor in Generative Art
Claude 3 is getting remarkably close to being funny: "Create ascii art for a game of nethack, but it takes place in an office and is full of mundane office life, make it interesting and humorous." I had to add: "please don't make it an Office Space pastiche, be original."