~www_lesswrong_com | Bookmarks (698)

This prompt (sometimes) makes ChatGPT think about terrorist organisations — LessWrong

lesswrong.com

Published on April 24, 2025 9:15 PM GMT
Yesterday, I couldn't wrap my head around some programming concepts in Python, so I turned to ChatGPT (gpt-4o) for help. This evolved into a very long conversation (the longest I've ever had with it by far), at the end of which I pasted around 600 lines of code from GitHub and asked it to explain them to...
Token and Taboo — LessWrong

lesswrong.com

Published on April 24, 2025 8:17 PM GMT
What in retrospect seem like serious moral crimes were often widely accepted while they were happening. This means that moral progress can require intellectual progress.[1] Intellectual progress often requires questioning received ideas, but questioning moral norms is sometimes taboo. For example, in ancient Greece it would have been taboo to say that women should have the same political...
Trouble at Miningtown: Prologue — LessWrong

lesswrong.com

Published on April 24, 2025 7:09 PM GMT
In late 2019 I wrote a TTRPG. The theme was alien sentience (or perhaps sapience is the technical term), both "organic" (extraterrestrial) and "artificial" (AI/robots). This is the prologue that kicks off play. I found it really fascinating to look back on late 2019 and how I was thinking about some of these topics. Earthdate March 2022. Two months ago...
Putting up Bumpers — LessWrong

lesswrong.com

Published on April 23, 2025 4:05 PM GMT
tl;dr: Even if we can't solve alignment, we can solve the problem of catching and fixing misalignment. If a child is bowling for the first time, and they just aim at the pins and throw, they’re almost certain to miss. Their ball will fall into one of the gutters. But if there were beginners’ bumpers in place blocking...
The AI Belief-Consistency Letter — LessWrong

lesswrong.com

Published on April 23, 2025 12:01 PM GMT
Dear policymakers,
We demand that the AI alignment budget be Belief-Consistent with the military budget.
Belief-Consistency is a simple yet powerful idea:
If you spend 8000 times less on AI alignment (compared to the military),
You must also believe that AI risk is 8000 times less (than military risk).[1]
Yet the only way to reach Belief-Consistency, is to
Greatly increase AI alignment spending,
or
Become...
Jaan Tallinn's 2024 Philanthropy Overview — LessWrong

lesswrong.com

Published on April 23, 2025 11:06 AM GMT
to follow up my philanthropic pledge from 2020, i've updated my philanthropy page with the 2024 results. in 2024 my donations funded $51M worth of endpoint grants (plus $2.0M in admin overhead and philanthropic software development). this comfortably exceeded my 2024 commitment of $42M (20k times $2100.00, the minimum price of ETH in 2024). this also concludes my...
Fish and Faces — LessWrong

lesswrong.com

Published on April 23, 2025 3:35 AM GMT
What would it take to convince you to come and see a fish that recognizes faces? Note: I'm not a marine biologist, nor have I kept fish since I was four. I have no idea what fish can really do. For the purposes of this post, let's suppose that fish recognizing faces is not theoretically impossible, but beyond...
Are we "being poisoned"? — LessWrong

lesswrong.com

Published on April 23, 2025 5:11 AM GMT
I would like to revisit some of the concepts Scott explored in his 2020 article "For, Then Against, High-Saturated-Fat Diets". I'm hoping someone will have some novel/updated insights or new research to share concerning the impacts of the Western diet on health in 2025. I'm about to turn 32, and I sense that I'm moving toward becoming a...
To Understand History, Keep Former Population Distributions In Mind — LessWrong

lesswrong.com

Published on April 23, 2025 4:51 AM GMT
Guillaume Blanc has a piece in Works in Progress (I assume based on his paper) about how France’s fertility declined earlier than in other European countries, and how its power waned as its relative population declined starting in the 18th century. In 1700, France had 20% of Europe’s population (4% of the whole world population). Kissinger writes...
Is alignment reducible to becoming more coherent? — LessWrong

lesswrong.com

Published on April 22, 2025 11:47 PM GMT
Epistemic status: Like all alignment ideas, this one is incomplete and/or wrong, but I am hoping mostly incomplete and not wrong. One of the hard subproblems of alignment is constructing a "stable pointer to value" (see this overview and this (sub)sequence discussing OU agents). The way that I think of this is that we want to be able...
The EU Is Asking for Feedback on Frontier AI Regulation (Open to Global Experts)—This Post Breaks Down What’s at Stake for AI Safety — LessWrong

lesswrong.com

Published on April 22, 2025 8:39 PM GMT
The European AI Office is currently writing the rules for how general-purpose AI (GPAI) models will be governed under the EU AI Act. They are explicitly asking for feedback on how to interpret and operationalize key obligations under the AI Act. This includes the thresholds for systemic risk, the definition of GPAI, how to estimate training compute, and when...
Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games — LessWrong

lesswrong.com

Published on April 22, 2025 7:25 PM GMT
Summary: Traditional LLMs outperform reasoning models in cooperative Public Goods tasks. Models like Llama-3.3-70B maintain ~90% contribution rates in public goods games, while reasoning-focused models (o1, o3 series) average only ~40%. We observe an "increased tendency to escape regulations" in reasoning models. As models improve in analytical capabilities, they show decreased cooperative behavior in multi-agent settings. Reasoning models more readily opt for...
Alignment from equivariance II - language equivariance as a way of figuring out what an AI "means" — LessWrong

lesswrong.com

Published on April 22, 2025 7:04 PM GMT
I recently had the privilege of having my idea criticized at the London Institute for Safe AI, including by Philip Kreer and Nicky Case. Previously the idea was vague; being with them forced me to make the idea specific. I managed to make it so specific that they found a problem with it! That's progress :) Reminder: diagrams...
There is no Red Line — LessWrong

lesswrong.com

Published on April 22, 2025 6:28 PM GMT
There will be no single moment, no dramatic cinematic climax where humanity loses control. Forget the Hollywood singularity, the sharp left turn into dystopia often breathlessly debated by the very people enabling a slower, more mundane version of it. That’s not how it happens. It’s subtler. More insidious. More… us. You will give it up. Every day. Piece...
Manifund 2025 Regrants — LessWrong

lesswrong.com

Published on April 22, 2025 5:36 PM GMT
Each year, Manifund partners with regrantors: experts in the field of AI safety, each given an independent budget of $100k+. Regrantors can initiate fast, small grants, seeding early-stage projects with $5k-$50k. For 2025, we’ve raised $2.25m so far, and are excited to announce our first 10 regrantors:
Neel Nanda, DeepMind
Lisa Thiergart, SL5 Task Force
Lauren Mangla, Constellation
Aidan...
AISN#52: An Expert Virology Benchmark — LessWrong

lesswrong.com

Published on April 22, 2025 5:08 PM GMT
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. In this edition: AI now outperforms human experts in specialized virology knowledge in a new benchmark; A new report explores the risk of AI-enabled coups. Listen to the AI Safety Newsletter for free on Spotify...
Problems with Bayesianism: A Socratic Dialogue — LessWrong

lesswrong.com

Published on April 22, 2025 2:09 PM GMT
Crossposted from my blog. In this fictional dialogue between a Bayesian (B) and a Non-Bayesian (N) I will propose solutions to some pre-existing problems with Bayesian epistemology, as well as introduce a new problem for which I offer a solution at the end. (Computer scientists may consider skipping to that section). Here’s a Bayes theorem cheat sheet if you...
Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt — LessWrong

lesswrong.com

Published on April 22, 2025 1:21 PM GMT
Joel Z. Leibo [1], Alexander Sasha Vezhnevets [1], William A. Cunningham [1, 2], Sébastien Krier [1], Manfred Diaz [3], Simon Osindero [1]
[1] Google DeepMind, [2] University of Toronto, [3] Mila Québec AI Institute
Disclaimer: These are our own opinions; they do not represent the views of Google DeepMind as a whole or its broader community of safety researchers.
Beyond Alignment: The Patchwork Quilt of...
You Better Mechanize — LessWrong

lesswrong.com

Published on April 22, 2025 1:10 PM GMT
Or you had better not. The question is which one. This post covers the announcement of Mechanize, the skeptical response from those worried AI might kill everyone, and the associated (to me highly frustrating at times) Dwarkesh Patel podcast with founders Tamay Besiroglu and Ege Erdil. Mechanize plans to help advance the automation of AI labor, which...
Experimental testing: can I treat myself as a random sample? — LessWrong

lesswrong.com

Published on April 22, 2025 12:34 PM GMT
TL;DR: Several experiments show that I can extract useful information just by treating myself as a random sample, and thus a view that I can't use myself as a random sample is false. But it's still not clear whether this can be used to prove the Doomsday argument. There are two views: one view is that I can...
Family-line selection optimizer — LessWrong

lesswrong.com

Published on April 22, 2025 7:16 AM GMT
O3 and Claude 3.7 are terribly dishonest creatures. Gemini 2.5 can be a bit dishonest too, even though Google writes tests like mad. I would bet that lie-proof benchmarks will be difficult and expensive to make and that the lie-proofing techniques won't easily generalize outside of coding tasks. Perhaps a more punishing optimizer would help solve this...
Accountability Sinks — LessWrong

lesswrong.com

Published on April 22, 2025 5:00 AM GMT
This is a cross-post from https://250bpm.substack.com/p/accountability-sinks
Back in the 1990s, ground squirrels were briefly fashionable pets, but their popularity came to an abrupt end after an incident at Schiphol Airport on the outskirts of Amsterdam. In April 1999, a cargo of 440 of the rodents arrived on a KLM flight from Beijing, without the necessary import papers. Because...
Most AI value will come from broad automation, not from R&D — LessWrong

lesswrong.com

Published on April 22, 2025 3:22 AM GMT
This is a linkpost to an article by Ege Erdil and me that we wrote to explain an important perspective that we share regarding AI automation. I'll quote the introduction: A popular view about the future impact of AI on the economy is that it will be primarily mediated through AI automation of R&D. In some form or...
Q2 AI Forecasting Benchmark: $30,000 in Prizes — LessWrong

lesswrong.com

Published on April 21, 2025 5:29 PM GMT
