www.lesswrong.com | Bookmarks (697)
-
Supermen of the (Not so Far) Future — LessWrong
Published on May 2, 2025 3:55 PM GMT Despite being fairly well established as a discipline, genetics...
-
AI Welfare Risks — LessWrong
Published on May 2, 2025 5:49 PM GMT My paper "AI Welfare Risks" has been accepted for...
-
Steering Language Models in Multiple Directions Simultaneously — LessWrong
Published on May 2, 2025 3:27 PM GMT Narmeen developed, ideated and validated K-steering at Martian. Luke...
-
RA x ControlAI video: What if AI just keeps getting smarter? — LessWrong
Published on May 2, 2025 2:19 PM GMT The video is about extrapolating the future of AI...
-
OpenAI Preparedness Framework 2.0 — LessWrong
Published on May 2, 2025 1:10 PM GMT Right before releasing o3, OpenAI updated its Preparedness Framework...
-
Ex-OpenAI employee amici leave to file denied in Musk v OpenAI case? — LessWrong
Published on May 2, 2025 12:27 PM GMT Several ex-employees of OpenAI filed an amicus brief in...
-
The Continuum Fallacy and its Relatives — LessWrong
Published on May 2, 2025 2:58 AM GMT Note: I didn't write this essay, nor do I...
-
Roads are at maximum efficiency always — LessWrong
Published on May 2, 2025 10:29 AM GMT On a theoretical road, the number of cars traveling...
-
My Research Process: Understanding and Cultivating Research Taste — LessWrong
Published on May 1, 2025 11:08 PM GMT This is post 3 of a sequence on my...
-
AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions — LessWrong
Published on May 1, 2025 10:46 PM GMT We’re excited to release a new AI governance research...
-
AI #114: Liars, Sycophants and Cheaters — LessWrong
Published on May 1, 2025 2:00 PM GMT Gemini 2.5 Pro is sitting in the corner, sulking...
-
Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall — LessWrong
Published on May 1, 2025 1:54 PM GMT It'll take until ~2050 to repeat the level of...
-
Anthropomorphizing AI might be good, actually — LessWrong
Published on May 1, 2025 1:50 PM GMT It is often noted that anthropomorphizing AI can be...
-
Don't focus on updating P doom — LessWrong
Published on May 1, 2025 11:10 AM GMT Motivation: Improving group epistemics. TL;DR: (Changes to) P doom/alignment...
-
Prioritizing Work — LessWrong
Published on May 1, 2025 2:00 AM GMT I recently read a blog post that concluded...
-
Don't rely on a "race to the top" — LessWrong
Published on May 1, 2025 12:33 AM GMT To make frontier AI safe enough, we need to...
-
Meta-Technicalities: Safeguarding Values in Formal Systems — LessWrong
Published on April 30, 2025 11:43 PM GMT Formal Systems Create Coordination. We live in a mesh of...
-
Obstacles in ARC's agenda: Finding explanations — LessWrong
Published on April 30, 2025 11:03 PM GMT As an employee of the European AI Office, it's...
-
GPT-4o Responds to Negative Feedback — LessWrong
Published on April 30, 2025 8:20 PM GMT Whoops. Sorry everyone. Rolling back to a previous version. Here’s...
-
State of play of AI progress (and related brakes on an intelligence explosion) [Linkpost] — LessWrong
Published on April 30, 2025 7:58 PM GMT This time around, I'm sharing a post on Interconnects...
-
Misrepresentation as a Barrier for Interp (Part I) — LessWrong
Published on April 29, 2025 5:07 PM GMT John: So there’s this thing about interp, where most...
-
AISN #53: An Open Letter Attempts to Block OpenAI Restructuring — LessWrong
Published on April 29, 2025 4:13 PM GMT Welcome to the AI Safety Newsletter by the Center...
-
What could Alphafold 4 look like? — LessWrong
Published on April 29, 2025 3:45 PM GMT I made another biology-ML podcast! Two hours long, deeply...
-
Sealed Computation: Towards Low-Friction Proof of Locality — LessWrong
Published on April 29, 2025 3:26 PM GMT Inference Certificates. As a prerequisite for the virtuality.network, we need...