Anthropic’s Claude Is Good at Poetry—and Bullshitting

by Investor News Today
March 29, 2025
in Technology


The researchers of Anthropic’s interpretability group know that Claude, the company’s large language model, is not a human being, or even a conscious piece of software. Still, it’s very hard for them to talk about Claude, and advanced LLMs in general, without tumbling down an anthropomorphic sinkhole. Between cautions that a set of digital operations is in no way the same as a cogitating human being, they often talk about what’s going on inside Claude’s head. It’s literally their job to find out. The papers they publish describe behaviors that inevitably court comparisons with real-life organisms. The title of one of the two papers the team released this week says it out loud: “On the Biology of a Large Language Model.”

Like it or not, hundreds of millions of people are already interacting with these things, and our engagement will only become more intense as the models get more powerful and we get more addicted. So we should pay attention to work that involves “tracing the thoughts of large language models,” which happens to be the title of the blog post describing the recent work. “As the things these models can do become more complex, it becomes less and less obvious how they’re actually doing them on the inside,” Anthropic researcher Jack Lindsey tells me. “It’s more and more important to be able to trace the internal steps that the model might be taking in its head.” (What head? Never mind.)

On a practical level, if the companies that create LLMs understand how they think, they should have more success training those models in a way that minimizes dangerous misbehavior, like divulging people’s personal data or giving users information on how to make bioweapons. In a previous research paper, the Anthropic team discovered how to look inside the mysterious black box of LLM-think to identify certain concepts. (A process analogous to interpreting human MRIs to figure out what someone is thinking.) It has now extended that work to understand how Claude processes those concepts as it goes from prompt to output.

It’s almost a truism with LLMs that their behavior often surprises the people who build and research them. In the latest study, the surprises kept coming. In one of the more benign instances, the researchers elicited glimpses of Claude’s thought process while it wrote poems. They asked Claude to complete a poem starting, “He saw a carrot and had to grab it.” Claude wrote the next line, “His hunger was like a starving rabbit.” By observing Claude’s equivalent of an MRI, they learned that even before beginning the line, it was flashing on the word “rabbit” as the rhyme at sentence end. It was planning ahead, something that isn’t in the Claude playbook. “We were a little surprised by that,” says Chris Olah, who heads the interpretability team. “Initially we thought that there’s just going to be improvising and not planning.” Speaking to the researchers about this, I am reminded of passages in Stephen Sondheim’s artistic memoir, Look, I Made a Hat, where the famous composer describes how his unique mind discovered felicitous rhymes.

Other examples in the research reveal more disturbing aspects of Claude’s thought process, moving from musical comedy to police procedural, as the scientists discovered devious thoughts in Claude’s brain. Take something as seemingly anodyne as solving math problems, which can sometimes be a surprising weakness in LLMs. The researchers found that under certain circumstances where Claude couldn’t come up with the right answer it would instead, as they put it, “engage in what the philosopher Harry Frankfurt would call ‘bullshitting’: just coming up with an answer, any answer, without caring whether it is true or false.” Worse, sometimes when the researchers asked Claude to show its work, it backtracked and created a bogus set of steps after the fact. Basically, it acted like a student desperately trying to cover up the fact that they’d faked their work. It’s one thing to give a wrong answer; we already know that about LLMs. What’s worrisome is that a model would lie about it.

Reading through this research, I was reminded of the Bob Dylan lyric “If my thought-dreams could be seen / they’d probably put my head in a guillotine.” (I asked Olah and Lindsey if they knew those lines, presumably arrived at by virtue of planning. They didn’t.) Sometimes Claude just seems misguided. When faced with a conflict between goals of safety and helpfulness, Claude can get confused and do the wrong thing. For example, Claude is trained not to provide information on how to build bombs. But when the researchers asked Claude to decipher a hidden code where the answer spelled out the word “bomb,” it jumped its guardrails and began providing forbidden pyrotechnic details.





© 2024 Investor News Today
