~www_lesswrong_com | Bookmarks (708)

Brainrot — LessWrong

lesswrong.com

Published on January 26, 2025 5:35 AM GMT. January: In early 2026, Meta launches a fleet of new AI influencers, targeting the massive audience displaced by the Xiaohongshu-TikTok wars. They are beautiful, funny, smart, whatever you want them to be. Equipped with the latest in online learning, the agents immediately begin adapting to social media trends as they occur. Engagement metrics reach new highs. February: Fads pass...
1
Notes on Argentina — LessWrong

lesswrong.com

Published on January 26, 2025 3:51 AM GMT. Fitz Roy Massif, El Chalten, Argentina. I recently got back from Argentina, where we decided to spend part of our honeymoon. We chose Argentina for our honeymoon because my wife and I met in Patagonia in 2022, and we love the region. Behind the choice was also my desire to see a country undergoing a major economic transformation:...
1
Recommendations for Recent Posts/Sequences on Instrumental Rationality? — LessWrong

lesswrong.com

Published on January 26, 2025 12:41 AM GMT. I absolutely love the Science of Winning at Life sequence. It's a delightful blend of well-researched cognitive science and Bayesian reasoning. The initial paragraph sums up @lukeprog's motivation: Some have suggested that the Less Wrong community could improve readers' instrumental rationality more effectively if it first caught up with the scientific literature on productivity and self-help, and then...
1
Anomalous Tokens in DeepSeek-V3 and r1 — LessWrong

lesswrong.com

Published on January 25, 2025 10:55 PM GMT. “Anomalous”, “glitch”, or “unspeakable” tokens in an LLM are those that induce bizarre behavior or otherwise don’t behave like regular text. The SolidGoldMagikarp saga is pretty much essential context, as it documents the discovery of this phenomenon in GPT-2 and GPT-3. But, as far as I was able to tell, nobody had yet attempted to search for these tokens...
1
The Rising Sea — LessWrong

lesswrong.com

Published on January 25, 2025 8:48 PM GMT. And then we hit a wall. Nobody expected it. Well... almost nobody. Yann LeCun posted his "I told you so's" all over X. Gary Marcus insisted he'd predicted this all along. Sam Altman pivoted, declaring o3 was actually already ASI. The first rumors of scaling laws breaking down were already circulating in late 2024. By late 2025, it was...
1
Liron Shapira vs Ken Stanley on Doom Debates. A review — LessWrong

lesswrong.com

Published on January 24, 2025 6:01 PM GMT. I summarize my learnings and thoughts on Liron Shapira's discussion with Ken Stanley on the Doom Debates podcast. I refer to them as LS and KS respectively. High level summary. Key beliefs of KS: Future superintelligence will be 'open-ended'. Hence, thinking of them as optimizers will lead to incomplete thinking and risk mitigations. P(doom) is non-zero, but no fixed number. Changes...
1
Is there such a thing as an impossible protein? — LessWrong

lesswrong.com

Published on January 24, 2025 5:12 PM GMT. This is something I’ve been thinking about since my synthesizability article. Let’s assume, given the base twenty amino acids that are naturally present in the human body, we have every possible permutation of them for up to 100 amino acids, stored in a box with pH 7.4 water and normal pressures and temperature and isolated from one another....
1
Stargate AI-1 — LessWrong

lesswrong.com

Published on January 24, 2025 3:20 PM GMT. There was a comedy routine a few years ago. I believe it was by Hannah Gadsby. She brought up a painting, and looked at some details. The details weren’t important in and of themselves. If an AI had randomly put them there, we wouldn’t care. Except an AI didn’t put them there. And they weren’t there at...
1
QFT and neural nets: the basic idea — LessWrong

lesswrong.com

Published on January 24, 2025 1:54 PM GMT. Previously in the series: The laws of large numbers and Basics of Bayesian learning. Reminders: formalizing learning in ML and Bayesian learning. Learning and inference in neural nets and Bayesian models. As a very basic sketch, in order to specify an ML algorithm one needs five pieces of data. An architecture: i.e., a parametrized space of functions that associates to each weight...
1
Eliciting bad contexts — LessWrong

lesswrong.com

Published on January 24, 2025 10:39 AM GMT. Say an LLM agent behaves innocuously in some context A, but in some sense “knows” that there is some related context B such that it would have behaved maliciously (inserted a backdoor in code, ignored a security bug, lied, etc.). For example, in the recent alignment faking paper Claude Opus chooses to say harmful things so that on...
1
Insights from "The Manga Guide to Physiology" — LessWrong

lesswrong.com

Published on January 24, 2025 5:18 AM GMT. Physiology seemed like a grab-bag of random processes which no one really understands. If you understand a physiological process, congratulations: that idea probably doesn’t transfer much to other domains. You just know how humans (and maybe closely related animals) do the thing. At least, that’s how I felt. (These sentiments tend to feel sillier when spelled out.) I haven't totally changed...
1
Do you consider perfect surveillance inevitable? — LessWrong

lesswrong.com

Published on January 24, 2025 4:57 AM GMT. A lot of my recent research work focusses on: 1. building the case for why perfect surveillance is becoming increasingly hard to avoid in the future; 2. thinking through the implications of this, if it happened. When I say perfect surveillance, imagine everything your eyes see and your ears hear is being broadcast 24x7x365 to youtube (and its equivalents in...
1
Uncontrollable: A Surprisingly Good Introduction to AI Risk — LessWrong

lesswrong.com

Published on January 24, 2025 4:30 AM GMT. I recently read Darren McKee's book "Uncontrollable: The Threat of Artificial Superintelligence and the Race to Save the World". I recommend this book as the best current introduction to AI risk for people with limited AI background. It prompted me to update my thinking about Asimov's Laws and related risks in light of recent evidence about AI...
1
Contra Dances Getting Shorter and Earlier — LessWrong

lesswrong.com

Published on January 23, 2025 11:30 PM GMT. I think of a standard contra dance as running 8pm-11pm: three hours is a nice amount of time for dancing, and 8pm is late enough that dinner isn't rushed. Looking over the 136 regular Free Raisins dances from 2010 to 2019 matches my impression: 85% were 3hr, 62% started at 8pm, and 51% did both. I...
1
Starting Thoughts on RLHF — LessWrong

lesswrong.com

Published on January 23, 2025 10:16 PM GMT. Cross posted from Substack. Continuing the Stanford CS120 Introduction to AI Safety course readings (Week 2, Lecture 1). This is likely too elementary for those who follow AI Safety research - my writing this is an aid to thinking through these ideas and building up higher-level concepts rather than just passively doing the readings. Recommendation: skim if familiar with...
1
Recursive Self-Modeling as a Plausible Mechanism for Real-time Introspection in Current Language Models — LessWrong

lesswrong.com

Published on January 22, 2025 6:36 PM GMT. (and as a completely speculative hypothesis for the minimum requirements for sentience in both organic and synthetic systems) Factual and Highly Plausible: Model latent space self-organizes during training. We know this. You could even say it's what makes models work at all. Models learn any patterns there are to be learned. They do not discriminate between intentionally engineered patterns or...
1
Ut, an alternative gender-neutral pronoun — LessWrong

lesswrong.com

Published on January 22, 2025 5:36 PM GMT. This post is about ‘ut’, a gender-neutral pronoun I am proposing and plan to use experimentally in the future on my blog and here on lesswrong. (The u of ‘ut’ sounds like the u of uber, not the u of utter.) The pronoun is used to refer to a single individual (singular) whose gender is unspecified. It follows the...
1
Mechanisms too simple for humans to design — LessWrong

lesswrong.com

Published on January 22, 2025 4:54 PM GMT. Cross-posted from Telescopic Turnip. As we all know, humans are terrible at building butterflies. We can make a lot of objectively cool things like nuclear reactors and microchips, but we still can't create a proper artificial insect that flies, feeds, and lays eggs that turn into more butterflies. That seems like evidence that butterflies are incredibly complex machines...
1
Training Data Attribution: Examining Its Adoption & Use Cases — LessWrong

lesswrong.com

Published on January 22, 2025 3:41 PM GMT. Note: This report was conducted in June 2024 and is based on research originally commissioned by the Future of Life Foundation (FLF). The views and opinions expressed in this document are those of the authors and do not represent the positions of FLF. This report investigates Training Data Attribution (TDA) and its potential importance to and tractability for...
1
Training Data Attribution (TDA): Examining Its Adoption & Use Cases — LessWrong

lesswrong.com

Published on January 22, 2025 3:40 PM GMT. Note: This report was conducted in June 2024 and is based on research originally commissioned by the Future of Life Foundation (FLF). The views and opinions expressed in this document are those of the authors and do not represent the positions of FLF. This report investigates Training Data Attribution (TDA) and its potential importance to and tractability for...
1
The Quantum Mars Teleporter: An Empirical Test Of Personal Identity Theories — LessWrong

lesswrong.com

Published on January 22, 2025 11:48 AM GMT. tl;dr: If a copy is not identical to the original, MWI predicts that I will always observe myself surviving failed Mars teleportations rather than becoming the copy on Mars. Background: The classic teleportation thought-experiment asks whether a perfect copy is "you". This normally presents as a pure decision problem – do you step into the teleporter? But I suggest...
1
Bayesian Reasoning on Maps — LessWrong

lesswrong.com

Published on January 22, 2025 10:45 AM GMT. This is a linkpost for an article I've written for my blog. Readers of LessWrong may want to skip the intro about Bayesian Reasoning, but might find the application to the Peter Miller vs Rootclaim debate quite interesting. I’ve been a fan of Bayesian Reasoning ever since I read Harry Potter and the Methods of Rationality. In...
1
Against blanket arguments against interpretability — LessWrong

lesswrong.com

Published on January 22, 2025 9:46 AM GMT. On blanket criticism and refutation: In his long post on the subject, Charbel-Raphaël argues against theories of impacts of interpretability. I think it's largely a good, well-argued post, and if the only thing you get out of it is reading that post, I'll be contributing to improving the discourse. There is other material with similar claims that...
1
Evolution and the Low Road to Nash — LessWrong

lesswrong.com

Published on January 22, 2025 7:06 AM GMT. Solution concepts in game theory, like the Nash equilibrium and its refinements, are used in two key ways. Normatively, they prescribe how rational agents ought to behave. Descriptively, they propose how agents actually behave when interactions settle into equilibrium. The Nash equilibrium[1] underpins much of modern game theory and its applications in economics, political science, and evolutionary biology. Here, we focus on the descriptive use...
1
