~www_lesswrong_com | Bookmarks (723)

Luna Lovegood and the Chamber of Secrets, Part 2 — LessWrong

lesswrong.com

Published on April 11, 2025 12:42 PM GMT. Disclaimer: This is Kongo Landwalker's translation of lsusr's fiction Luna Lovegood and the Chamber of Secrets - Part 2 into Russian. "I wonder what pulls the boats," said Luna. "It must be the same thing that pulls the horseless carriages," said a first-year girl. "But... but... You can SEE the horses pulling the carriages, but you CAN'T SEE anything pulling the boats,...
1
Paper — LessWrong

lesswrong.com

Published on April 11, 2025 12:20 PM GMT. Paper is good. Somehow, a blank page and a pen makes the universe open up before you. Why paper has this unique power is a mystery to me, but I think we should all stop trying to resist this reality and just accept it. Also, the world needs way more mundane blogging. So let me offer a few observations...
1
Why are neuro-symbolic systems not considered when it comes to AI Safety? — LessWrong

lesswrong.com

Published on April 11, 2025 9:41 AM GMT. I am really not sure of why neuro-symbolic systems are considered as alternatives to the current black-box ones? A concrete example I have found (and am currently studying) is HOUDINI (https://arxiv.org/pdf/1804.00218). Essentially, it implements neural networks using higher-order combinators (map, fold, etc.) that were found via enumeration/genetic programming searches. When the programs are found, the higher-order combinators...
1
Nuanced Models for the Influence of Information — LessWrong

lesswrong.com

Published on April 10, 2025 6:28 PM GMT
1
Playing in the Creek — LessWrong

lesswrong.com

Published on April 10, 2025 5:39 PM GMT. When I was a really small kid, one of my favorite activities was to try and dam up the creek in my backyard. I would carefully move rocks into high walls, pile up leaves, or try patching the holes with sand. The goal was just to see how high I could get the lake, knowing that if...
2
The Three Boxes: A Simple Model for Spreading Ideas — LessWrong

lesswrong.com

Published on April 10, 2025 5:15 PM GMT. This is cross-posted from my blog. We need more people on board for life extension in order to hit longevity escape velocity in our lifetimes. But most people have never heard of life extension, and even those who have often follow the same predictable arguments. “What if it doesn't work?” “What if bad people live forever?” “What if...
1
Reactions to METR task length paper are insane — LessWrong

lesswrong.com

Published on April 10, 2025 5:13 PM GMT. Epistemic status: Briefer and more to the point than my model of what is going on with LLMs, but also lower effort. Here is the paper. The main reaction I am talking about is AI 2027, but also basically every lesswrong take on AI 2027. A lot of people believe in very short AI timelines, say <2030. They want...
1
Existing Safety Frameworks Imply Unreasonable Confidence — LessWrong

lesswrong.com

Published on April 10, 2025 4:31 PM GMT. This is part of the MIRI Single Author Series. Pieces in this series represent the beliefs and opinions of their named authors, and do not claim to speak for all of MIRI. Most human endeavors have bounded results. A construction project may result in a functional bridge or a deadly collapse, but even catastrophic failure will not kill...
1
Arguments for and against gradual change — LessWrong

lesswrong.com

Published on April 10, 2025 2:43 PM GMT. Essentially all solutions in life are conditional: you apply them in the right context, in the right conditions, to achieve a good outcome. Obviously, banging a hammer on your desk probably does no good, while banging a hammer on a nail you want to use to hang that nice painting on the wall may be a great idea...
1
Disempowerment spirals as a likely mechanism for existential catastrophe — LessWrong

lesswrong.com

Published on April 10, 2025 2:37 PM GMT. When complex systems fail, it is often because they have succumbed to what we call "disempowerment spirals": self-reinforcing feedback loops where an initial threat progressively undermines the system's capacity to respond, leading to accelerating vulnerability and potential collapse. Consider a city gradually falling under the control of organized crime. The criminal organization doesn't simply overpower existing institutions...
1
My day in 2035 — LessWrong

lesswrong.com

Published on April 10, 2025 2:09 PM GMT. Partially inspired by AI 2027, I've put to paper a day in one of the more optimistic scenarios I envision which are realistic to me.
1
AI #111: Giving Us Pause — LessWrong

lesswrong.com

Published on April 10, 2025 2:00 PM GMT. Events in AI don’t stop merely because of a trade war, partially paused or otherwise. Indeed, the decision to not restrict export of H20 chips to China could end up being one of the most important government actions that happened this week. A lot of people are quite boggled about how America could so totally fumble the...
1
Forging A New AGI Social Contract — LessWrong

lesswrong.com

Published on April 10, 2025 1:41 PM GMT. This is the introductory piece for a series of essays written by AI economists, policy researchers, and political thinkers on the topic of a new AGI Social Contract. Current contributors include: Anton Korinek, Deger Turan (CEO of Metaculus), Steve Omohundro, Iason Gabriel (DeepMind), Julian Jacobs (DeepMind), Sam Manning (GovAI), Dylan Hadfield (MIT), Seth Lazar (ANU), Peter Salib (UH), Colleen McKenzie (AOI), Philip Tomei (AOI), Dean Ball (Hyperdimensional), Justin Bullock (ARI),...
1
Alignment Faking Revisited: Improved Classifiers and Open Source Extensions — LessWrong

lesswrong.com

Published on April 8, 2025 5:32 PM GMT. In this post, we present a replication and extension of an alignment faking model organism. Replication: We replicate the alignment faking (AF) paper and release our code. Classifier Improvements: We significantly improve the precision and recall of the AF classifier. We release a dataset of ~100 human-labelled examples of AF for which our classifier achieves an AUROC of 0.9 compared to 0.6...
1
Thinking Machines — LessWrong

lesswrong.com

Published on April 8, 2025 5:27 PM GMT. Self understanding at a gears level: I think an AI which understands the source of its intelligence at a gears level, and self improves at the gears level, will be much better at keeping its future versions aligned to itself. There's also more hope it'll keep its future versions aligned with humanity, if we instruct it to do so,...
1
Digital Error Correction and Lock-In — LessWrong

lesswrong.com

Published on April 8, 2025 3:46 PM GMT. Epistemic status: a collection of intervention proposals for digital error correction in the context of lock-in. It reflects my own intervention ideas, and the opinion of Formation Research at the time of writing. TL;DR: We believe lock-in risks are a pressing problem, and that the digital error correction properties of digital entities will make future lock-in scenarios more stable. Introduction: We...
1
What faithfulness metrics should general claims about CoT faithfulness be based upon? — LessWrong

lesswrong.com

Published on April 8, 2025 3:27 PM GMT. Consider the metric for evaluating chain-of-thought faithfulness used in Anthropic's recent paper Reasoning Models Don’t Always Say What They Think by Chen et al.: [W]e evaluate faithfulness using a constructed set of prompt pairs where we can infer information about the model’s internal reasoning by observing its responses. Each prompt pair consists of a baseline or “unhinted” prompt x_u...
1
AI 2027: Responses — LessWrong

lesswrong.com

Published on April 8, 2025 12:50 PM GMT. Yesterday I covered Dwarkesh Patel’s excellent podcast coverage of AI 2027 with Daniel Kokotajlo and Scott Alexander. Today covers the reactions of others. Kevin Roose in The New York Times: Kevin Roose covered Scenario 2027 in The New York Times. Kevin Roose: I wrote about the newest AGI manifesto in town, a wild future scenario put together...
1
The first AI war will be in your computer — LessWrong

lesswrong.com

Published on April 8, 2025 9:28 AM GMT. The first AI war will be in your computer and/or smartphone. Companies want to get customers / users. The ones more willing to take "no" for an answer will lose in the competition. You don't need a salesman when an install script (ideally, run without the user's consent) does a better job; and most users won't care. Sometimes Windows...
1
Who wants to bet me $25k at 1:7 odds that there won't be an AI market crash in the next year? — LessWrong

lesswrong.com

Published on April 8, 2025 8:31 AM GMT. If there turns out not to be an AI crash, you get 1/(1+7) * $25,000 = $3,125. If there is an AI crash, you transfer $25k to me. If you believe that AI is going to keep getting more capable, pushing rapid user growth and work automation across sectors, this is near free money. But to be honest,...
1
A Pathway to Fully Autonomous Therapists — LessWrong

lesswrong.com

Published on April 8, 2025 4:10 AM GMT. The field of psychology is coevolving with AI and people are increasingly using LLMs for therapy. I tried using the LLM Claude as a therapist for the first time a couple weeks ago. Now I consult it daily. Human therapists likely don’t have much time until they’re partly, or fully, replaced by artificial therapists. Traditional therapy emerged in the late...
1
Misinformation is the default, and information is the government telling you your tap water is safe to drink — LessWrong

lesswrong.com

Published on April 7, 2025 10:28 PM GMT. Status notes: I take the view that rational dialogue should work with good faith people who aren't also following rational dialogue. From that point of view, this piece is about rationality. If you don't take that view then OK fine, it's about coordination. Substack version. I want to help people respect people they disagree with. In this post I...
1
Log-linear Scaling is Worth the Cost due to Gains in Long-Horizon Tasks — LessWrong

lesswrong.com

Published on April 7, 2025 9:50 PM GMT. This post makes a simple point, so it will be short. I am happy to discuss more in the comments, and based on this write a longer post later. Much prior work (e.g. [1]) has shown that exponential data and compute is required for each unit improvement in accuracy. A popular argument this leads to: Scaling compute and...
1
American College Admissions Doesn't Need to Be So Competitive — LessWrong

lesswrong.com

Published on April 7, 2025 5:35 PM GMT. Spoiler: “So after removing the international students from the calculations, and using the middle-of-the-range estimates, the conclusion: The top-scoring 19,000 American students each year are competing in top-20 admissions for about 12,000 spots out of 44,000 total. Among the Ivy League + MIT + Stanford, they’re competing for about 6,500 out of 15,800 total spots.” It’s well known...
1
