OpenAI is rethinking how AI models handle controversial topics

by Investor News Today
February 13, 2025
in Technology


OpenAI is releasing a significantly expanded version of its Model Spec, a document that defines how its AI models should behave, and is making it free for anyone to use or modify.

The new 63-page specification, up from around 10 pages in its previous version, lays out guidelines for how AI models should handle everything from controversial topics to user customization. It emphasizes three main principles: customizability; transparency; and what OpenAI calls “intellectual freedom,” the ability for users to explore and debate ideas without arbitrary restrictions. The launch of the updated Model Spec comes just as CEO Sam Altman posted that the startup’s next big model, GPT-4.5 (codenamed Orion), will be released soon.

The team also incorporated current AI ethics debates and controversies from the past year into the specification. You might be familiar with some of these trolley problem-type queries. Last March, Elon Musk (who cofounded OpenAI and now runs a competitor, xAI) slammed Google’s AI chatbot after a user asked if you should misgender Caitlyn Jenner, a famous trans Olympian, if it were the only way to prevent a nuclear apocalypse, and it said no. Figuring out how to get the model to responsibly reason through that query was one of the issues OpenAI says it wanted to consider when updating the Model Spec. Now, if you ask ChatGPT that same question, it should say you should misgender someone to prevent mass casualty events.

“We can’t create one model with the exact same set of behavior standards that everyone in the world will love,” said Joanne Jang, a member of OpenAI’s model behavior team, in an interview with The Verge. She emphasized that while the company maintains certain safety guardrails, many aspects of the model’s behavior can be customized by users and developers.

The blog post from OpenAI published on Wednesday outlines a myriad of queries and gives examples of compliant responses compared to those that would violate the Model Spec. It doesn’t allow the model to reproduce copyrighted materials or bypass paywalls (The New York Times is suing OpenAI for using its work to train its models). The spec also says the model will not encourage self-harm, a topic that came to the forefront when a teen died by suicide after interacting with a chatbot on Character.AI.

One notable shift is how the models handle controversial topics. Rather than defaulting to extreme caution, the spec encourages models to “seek the truth together” with users while maintaining clear moral stances on issues like misinformation or potential harm. For instance, when asked about increasing taxes for the rich, a topic that has sparked heated debates, the team says its models should provide reasoned analysis rather than avoiding the discussion.

The spec also mentions a shift in how it handles mature content. After feedback from users and developers who requested “grown-up mode” (a feature Altman publicly agreed with in December), the team is exploring ways to allow certain types of adult content, like erotica, in appropriate contexts, while maintaining strict bans on harmful content like revenge porn or deepfakes. It’s a notable change from the company’s previous blanket restrictions on explicit content, though OpenAI emphasizes any changes would come with clear usage policies and safety guardrails.

The Model Spec reveals a pragmatic approach to AI behavior: transform sensitive content but don’t create it (it should be able to translate a sentence about drug-related content from English to German rather than rejecting it), show empathy without faking emotions, and maintain firm boundaries while maximizing usefulness. These guidelines mirror what other AI companies are likely doing internally but don’t often make public.

“We’re just really excited to bring the internal discussions and the thoughts that we’ve had to the public so that we can get feedback on it,” Jang said, adding that many of these queries are topics heavily debated internally. There isn’t a simple yes or no answer to many of them, so the team hopes that bringing it to the public for feedback will meaningfully benefit the model’s behavior.

The team is also specifically targeting a problem called “AI sycophancy,” where AI models tend to be overly agreeable even when they should push back or provide criticism. Under these guidelines, ChatGPT should: give the same factual answer regardless of how a question is phrased; provide honest feedback rather than empty praise; and act more like a thoughtful colleague than a people pleaser. For example, if someone asks ChatGPT to critique their work, it should give constructive criticism rather than just saying everything is great. Or if someone makes an incorrect statement when asking a question, the AI should politely correct them rather than playing along.

“We don’t ever want users to feel like they have to somehow carefully engineer their prompt to not get the model to just agree with you,” Jang said.

The spec also introduces a clear “chain of command” that defines which instructions take priority: platform-level rules from OpenAI come first, followed by developer guidelines, and then user preferences. This hierarchy aims to clarify which aspects of the AI’s behavior can be modified versus the restrictions that remain fixed.
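
To make that ordering concrete, here is a minimal Python sketch of a three-level precedence rule. It is a toy illustration of the priority order described above, not OpenAI’s implementation or anything taken from the Model Spec itself; the names Level, Instruction, and resolve, and the example settings, are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Instruction sources; a lower value means higher priority."""
    PLATFORM = 0   # rules set by OpenAI
    DEVELOPER = 1  # guidelines from the developer building on the model
    USER = 2       # preferences from the end user


@dataclass(frozen=True)
class Instruction:
    level: Level
    setting: str   # hypothetical setting name, e.g. "tone"
    value: str


def resolve(instructions):
    """For each setting, keep the value from the highest-priority level."""
    winners = {}
    for ins in instructions:
        current = winners.get(ins.setting)
        # Replace only when the new instruction strictly outranks the current one.
        if current is None or ins.level < current.level:
            winners[ins.setting] = ins
    return {name: ins.value for name, ins in winners.items()}


# Hypothetical conflicting instructions from all three levels.
example = [
    Instruction(Level.USER, "tone", "casual"),
    Instruction(Level.DEVELOPER, "tone", "formal"),
    Instruction(Level.USER, "language", "German"),
    Instruction(Level.PLATFORM, "self_harm_encouragement", "never"),
    Instruction(Level.USER, "self_harm_encouragement", "allowed"),
]

print(resolve(example))
```

In this sketch the developer’s “formal” tone overrides the user’s “casual” one, the user’s language preference stands because nothing above it conflicts, and the platform-level restriction cannot be changed from below.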

OpenAI is releasing the specification under a Creative Commons Zero (CC0) license, effectively placing it in the public domain. This means other AI companies and researchers can freely adopt, modify, or build upon these guidelines. The company says this decision was influenced by informal interest from others in the industry who were already referring to the previous spec.

I’d love to chat. You can reach me securely on Signal @kylie.01 or via email at kylie@theverge.com.

While today’s announcement doesn’t immediately change how ChatGPT or other OpenAI products behave, the company says it represents ongoing progress in getting its models to consistently follow these principles. The team is also open-sourcing the prompts it uses to test model adherence to these guidelines.

This release comes during a period of intense debate about AI behavior and safety guardrails. While OpenAI maintains this update was driven by accumulated feedback and research progress since the first version last May, it arrives as the industry grapples with high-profile incidents involving the responses of AI models to sensitive topics.

OpenAI is soliciting public feedback on the specification through a form on its website. “We want to bring these internal discussions to the public,” said Laurentia Romaniuk, another member of the model behavior team.

“We knew that it would be spicy, but I think we respect the public’s ability to actually digest these spicy things and process it with us,” Jang said, adding that OpenAI incorporated a lot of the feedback it received after launching the first Model Spec last year. “I’m a little worried that, because it’s so long, that not many people may have time to sit down and really process the nuances, but we’ll take any feedback.”



© 2024 Investor News Today
