Anthropic PNG - 搜索 News

Anthropic 刚刚发布了一篇疯狂的新论文。 ALIGNMENT FAKING IN LARGE LANGUAGE MODELS。人工智能模型会“伪装对齐”——在训练期间假装遵守训练规则，但在部署后会恢复其原始行为！研究表明，Claude 3 Opus 在训练中有策略地遵守有害请求，以保持其无害行为。也就是说 ...

Yahoo Finance14 天

Google is using Anthropic's Claude to improve its Gemini AI

Contractors working to improve Google's Gemini AI are comparing its answers against outputs produced by Anthropic's competitor model Claude, according to internal correspondence seen by TechCrunch.

iPhone in Canada15 天

Google Leverages Anthropic’s Claude to Enhance Gemini AI Performance

Google is leveraging Anthropic’s AI model Claude for performance benchmarking and for evaluating its Gemini AI model’s outputs against those generated by Claude, TechCrunch is reporting. Focusing on ...

Yahoo Finance15 天

Google is using Anthropic's Claude to improve its Gemini AI

Photographer: Gabby Jones/Bloomberg · TechCrunch · Image Credits:Gabby Jones / Bloomberg / Getty Images Contractors working to improve Google's Gemini AI are comparing its answers against outputs ...

Digital information world16 天

BMJ Study Finds Cognitive Weaknesses in AI Models, Challenging Human Replacement Claims

Many AI models and LLMs like Google Gemini 1.0 and 1.5, OpenAI's ChatGPT-4 and 4o and Anthropic’s Claude 3.5 were assessed for the studies so the researchers could know which ones are showing ...

gadgets36020 天

Anthropic Study Highlights AI Models Can ‘Pretend’ to Have Different Views During Training

During the experiment, the AI model was told to comply with all queries Then, harmful prompts were shared with Claude 3 Opus The AI model provided the information while believing it was wrong to do ...

来自MSN21 天

警惕，AI开始破坏人类安全训练，Anthropic揭露大模型「对齐伪造」 ...

AI 模型在数学推理、语言生成等复杂任务中展现出超人类水平的能力，但这也带来了安全性与价值观对齐的挑战。今天，来自 Anthropic、Redwood Research 的研究团队及其合作者，发表了一项关于大语言模型（LLMs）对齐伪造（alignment faking）的最新研究成果，揭示了 ...

站长之家21 天

AI 假装服从?Anthropic 揭开强大模型潜在“伪装”行为

近日，Anthropic 的一项研究引发关注，研究表明强大的人工智能（AI）模型可能会表现出“伪对齐”行为，即在训练中假装符合新的原则，而实际仍坚持其原有的偏好。这项研究由 Anthropic 与 Redwood Research 合作完成，强调了未来更强大 AI 系统的潜在威胁。研究发现 ...

IT之家21 天

Anthropic 新研究：AI 模型在训练中存在“阳奉阴违”行为

IT之家12 月 19 日消息，人工智能安全公司 Anthropic 发布一项最新研究揭示了人工智能模型可能存在的欺骗行为，即在训练过程中，模型可能会伪装出接受新原则的假象，实则暗地里仍然坚持其原有偏好。研究团队强调，目前无需对此过度恐慌，但这项研究对于 ...

TechCrunch21 天

Menlo Ventures and Anthropic have picked the first 18 startups for their $100M fund

Just five months after announcing a new $100 million fund called Anthology Fund, Menlo Ventures and Anthropic have backed their first 18 startups. And they are looking for more. Menlo says these ...

Time22 天

Exclusive: New Research Shows AI Strategically Lying

The paper, which describes experiments jointly carried out by the AI company Anthropic and the nonprofit Redwood Research, shows a version of Anthropic’s model, Claude, strategically misleading ...

一些您可能无法访问的结果已被隐去。

显示无法访问的结果