DeepMind: Frontier Safety Framework — LessWrong
Published on May 17, 2024 5:30 PM GMT DeepMind's RSP is here. Excerpt from the blogpost: Today, we...
Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning — LessWrong
Published on May 17, 2024 4:25 PM GMT A short summary of the paper is presented below. This...
AISafety.com – Resources for AI Safety — LessWrong
Published on May 17, 2024 3:57 PM GMT There are many resources for those who wish to...
My Hammer Time Final Exam — LessWrong
Published on May 17, 2024 9:28 AM GMT Epistemic Status: I thought about and wrote each paragraph...
Is There Really a Child Penalty in the Long Run? — LessWrong
Published on May 17, 2024 11:56 AM GMT A couple of weeks ago three European economists published...
Is there a place to find the most cited LW articles of all time? — LessWrong
Published on May 17, 2024 1:20 AM GMT I expect it would be useful when developing an...
D&D.Sci (Easy Mode): On The Construction Of Impossible Structures — LessWrong
Published on May 17, 2024 12:25 AM GMT This is a D&D.Sci scenario: a puzzle where players...
To an LLM, everything looks like a logic puzzle — LessWrong
Published on May 16, 2024 10:21 PM GMT I keep seeing this meme doing the rounds where...
AI Safety Institute's Inspect hello world example for AI evals — LessWrong
Published on May 16, 2024 8:47 PM GMT Sharing my detailed walk-through on using the UK AI...
Feeling (instrumentally) Rational — LessWrong
Published on May 16, 2024 6:56 PM GMT Contra this post from the Sequences. In Eliezer's sequence post,...
How is GPT-4o Related to GPT-4? — LessWrong
Published on May 15, 2024 6:33 PM GMT GPT-4o both has a new tokenizer and was trained...
[Linkpost] Please don't take Lumina's anticavity probiotic — LessWrong
Published on May 15, 2024 6:03 PM GMT I suspect some number of LWers have taken or...
Was Partisanship Good for the Environmental Movement? — LessWrong
Published on May 15, 2024 5:30 PM GMT This is the third in a sequence of posts...
Quantized vs. continuous nature of qualia — LessWrong
Published on May 15, 2024 12:52 PM GMT This question is not very well-posed, but I've done...
How to be a messy thinker — LessWrong
Published on May 15, 2024 11:57 AM GMT Crossposted from my blog: https://invertedpassion.com/how-to-be-a-messy-thinker/ I love thinking about thinking....
Embedded Whistle Synth — LessWrong
Published on May 15, 2024 2:50 AM GMT A few years ago I ported my whistle...
Catastrophic Goodhart in RL with KL penalty — LessWrong
Published on May 15, 2024 12:58 AM GMT TLDR: In the last two posts, we showed that...
Ilya Sutskever and Jan Leike resign from OpenAI — LessWrong
Published on May 15, 2024 12:45 AM GMT Ilya Sutskever and Jan Leike have resigned. They led...
my note system — LessWrong
Published on May 15, 2024 12:20 AM GMT I've been told that my number of blog posts...
MIRI's May 2024 Newsletter — LessWrong
Published on May 15, 2024 12:13 AM GMT MIRI updates: MIRI is shutting down the Visible Thoughts Project. We...
GPT-4o is out — LessWrong
Published on May 13, 2024 6:33 PM GMT OpenAI just announced an improved LLM called GPT-4o. From their...
Somerville Porchfest Thoughts — LessWrong
Published on May 13, 2024 5:20 PM GMT This Saturday was Porchfest in Somerville, an annual...
Branding AI Safety Groups: A Field Guide — LessWrong
Published on May 13, 2024 5:17 PM GMT This article is the first in a series I plan to...
Against Student Debt Cancellation From All Sides of the Political Compass — LessWrong
Published on May 13, 2024 2:55 PM GMT A stance against student debt cancellation doesn’t rely on...
Monthly Roundup #18: May 2024 — LessWrong
Published on May 13, 2024 12:30 PM GMT As I note in the third section, I will...
Tools to discern between real and AI — LessWrong
Published on May 13, 2024 9:18 AM GMT What are the best ways to figure out if...
What you really mean when you claim to support “UBI for job automation” — LessWrong
Published on May 13, 2024 8:52 AM GMT Author’s Note: Though I’m currently a governance researcher at...
The two-tiered society — LessWrong
Published on May 13, 2024 7:53 AM GMT On AI and Jobs: How to Make AI Work...
Benefitial habits/personal rules with very minimal tradeoffs? — LessWrong
Published on May 13, 2024 6:06 AM GMT I'm looking for personal rules one might live by...
Retrospective on Mathematical Boundaries Workshop — LessWrong
Published on May 12, 2024 9:58 PM GMT mostly written by Evan Miyazono. Minimum viable introduction: We ran a...
Resources for learning about poise / gracefulness? — LessWrong
Published on May 11, 2024 6:30 PM GMT I'm doing some initial investigation for a Notes on...
New intro textbook on AIXI — LessWrong
Published on May 11, 2024 6:18 PM GMT Marcus Hutter and his PhD students David Quarel and...
Questions are usually too cheap — LessWrong
Published on May 11, 2024 1:00 PM GMT It is easier to ask than to answer. That’s my...
Ethics and prospects of AI related jobs? — LessWrong
Published on May 11, 2024 9:31 AM GMT I've been on the lookout for new jobs recently...
Should I Finish My Bachelor's Degree? — LessWrong
Published on May 11, 2024 5:17 AM GMT To some, it might seem like a strange question....
Custom Audio Switch Box — LessWrong
Published on May 11, 2024 2:40 AM GMT When I play live I have a bunch...
MATS Winter 2023-24 Retrospective — LessWrong
Published on May 11, 2024 12:09 AM GMT Co-Authors: @Rocket, @Ryan Kidd, @LauraVaughan, @McKennaFitzgerald, @Christian Smith, @Juan...
Creating unrestricted AI Agents with a refusal-vector ablated Llama 3 70B — LessWrong
Published on May 11, 2024 12:08 AM GMT TLDR; I demonstrate the use of refusal vector ablation...
Pascal's Mugging and the Order of Quantification — LessWrong
Published on May 10, 2024 6:34 PM GMT One of the fun things to do when learning...
The Alignment Problem No One Is Talking About — LessWrong
Published on May 10, 2024 6:34 PM GMT The following is the first in a 6 part...
Chapter 3 - Solutions Landscape — LessWrong
Published on May 9, 2024 5:33 PM GMT Introduction: The full draft textbook is available here. Epistemic Status: I'm...
We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming" — LessWrong
Published on May 9, 2024 3:43 PM GMT Predicting the future is hard, so it’s no surprise...
Four Unrelated Is Over — LessWrong
Published on May 9, 2024 2:50 PM GMT Somerville historically had a zoning ordinance limiting housing...
AI #63: Introducing Alpha Fold 3 — LessWrong
Published on May 9, 2024 2:20 PM GMT It was a remarkably quiet announcement. We now have...
I Got 95 Theses But a Glitch Ain’t One — LessWrong
Published on May 9, 2024 2:10 PM GMT Or rather Samuel Hammond does. Tyler Cowen finds it...
The Human's Role in Mesa Optimization — LessWrong
Published on May 9, 2024 12:07 PM GMT When it comes to mesa optimization, there’s usually two...
Visualizing neural network planning — LessWrong
Published on May 9, 2024 6:40 AM GMT TLDR: We develop a technique to try and detect if...
Forecasting: the way I think about it — LessWrong
Published on May 9, 2024 12:49 AM GMT This is the first post in a little series...
some thoughts on LessOnline — LessWrong
Published on May 8, 2024 11:17 PM GMT I mostly wrote this for facebook, but it ended...
Semantic Disagreement of Sleeping Beauty Problem — LessWrong
Published on May 8, 2024 7:09 PM GMT This is the tenth post in my series on...