mastodon.ie is one of the many independent Mastodon servers you can use to participate in the fediverse.
Irish Mastodon - run from Ireland, we welcome all who respect the community rules and members.

Administered by:

Server stats:

1.6K
active users

#chatbots

17 posts15 participants0 posts today
Miguel Afonso Caetano<p>"These issues arise because the underlying source of AI models’ “character traits” is poorly understood. At Anthropic, we try to shape our models’ characteristics in positive ways, but this is more of an art than a science. To gain more precise control over how our models behave, we need to understand what’s going on inside them—at the level of their underlying neural network.</p><p>In a new paper, we identify patterns of activity within an AI model’s neural network that control its character traits. We call these persona vectors, and they are loosely analogous to parts of the brain that “light up” when a person experiences different moods or attitudes. Persona vectors can be used to:</p><p>- Monitor whether and how a model’s personality is changing during a conversation, or over training;<br>- Mitigate undesirable personality shifts, or prevent them from arising during training;<br>- Identify training data that will lead to these shifts."</p><p><a href="https://www.anthropic.com/research/persona-vectors" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">anthropic.com/research/persona</span><span class="invisible">-vectors</span></a></p><p><a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://tldr.nettime.org/tags/Anthropic" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Anthropic</span></a> <a href="https://tldr.nettime.org/tags/Claude" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Claude</span></a> <a href="https://tldr.nettime.org/tags/PersonaVectors" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PersonaVectors</span></a> <a href="https://tldr.nettime.org/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a></p>
Miguel Afonso Caetano<p>"Epistemic arrogance is baked in to the culture of Silicon Valley: Blind, foolhardy confidence may be terrible for operating within large and intricate systems, but it’s great for founding and investing in regulations-flouting software companies. Many of the industry’s leading lights are proud ignoramuses, completely unaware of the gaps and blind spots in their knowledge, and ambitious young hackers and programmers are no doubt modeling their own attitudes toward the world on the overconfident performance of genius by people like Elon Musk.</p><p>But what seems particularly striking about this arrogance at this moment, though, is the extent to which it’s also baked into--and reinforced by--the L.L.M.-based chatbots now driving billions of dollars of investment. L.L.M.-based chatbots are effectively epistemic-arrogance machines: They “themselves” have no idea what they “know” or don’t, and in many circumstances will generate baldly incorrect text before admitting to lacking knowledge. Their accuracy has improved significantly over the past three years, but an L.L.M. chatbot fundamentally can’t know what it doesn’t know."</p><p><a href="https://maxread.substack.com/p/the-cracked-coder-fetish" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">maxread.substack.com/p/the-cra</span><span class="invisible">cked-coder-fetish</span></a></p><p><a href="https://tldr.nettime.org/tags/SiliconValley" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SiliconValley</span></a> <a href="https://tldr.nettime.org/tags/Ideology" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Ideology</span></a> <a href="https://tldr.nettime.org/tags/Programming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Programming</span></a> <a href="https://tldr.nettime.org/tags/DOGE" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DOGE</span></a> <a href="https://tldr.nettime.org/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://tldr.nettime.org/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a></p>
Lars Marowsky-Brée 😷<p>I went off on a bit of a vent on the AImperor's new clothes:</p><p><a href="https://opensourcerer.eu/the-aimperors-new-clothes/index.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">opensourcerer.eu/the-aimperors</span><span class="invisible">-new-clothes/index.html</span></a></p><p><a href="https://mastodon.online/tags/GenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenAI</span></a> <a href="https://mastodon.online/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://mastodon.online/tags/SoftwareEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SoftwareEngineering</span></a> <a href="https://mastodon.online/tags/ChatBots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ChatBots</span></a></p>
Pivot to AI [RSS]<p>Proton’s Lumo AI chatbot: not end-to-end encrypted, not open source</p><p>Proton Mail is famous for its privacy and security. The cool trick they do is that not even Proton can decode your email. That’s because it never exists on their systems as plain text — it’s always encrypted! The most Proton can do if a government comes calling is give them the metadata — who […]</p><p><a href="https://pivot-to-ai.com/2025/08/02/protons-lumo-ai-chatbot-not-end-to-end-encrypted-not-open-source/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">pivot-to-ai.com/2025/08/02/pro</span><span class="invisible">tons-lumo-ai-chatbot-not-end-to-end-encrypted-not-open-source/</span></a><br><a href="https://mstdn.social/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a> <a href="https://mstdn.social/tags/Security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Security</span></a></p>
Baessando ☭🇧🇷🇵🇸🇺🇳<p>"Enquanto a OpenAI se esforçava para desindexar conversas do Google hoje, eles esqueceram a regra mais básica da internet - nada realmente desaparece. Mais de 100.000 bate-papos ChatGPT ainda estão em Archive.org, embora com uma reviravolta. Os bate-papos não são apenas links ou fragmentos. São conversas completas, congeladas no tempo, contendo "confissões" semelhantes que expusemos ontem. Os usuários compartilharam esses bate-papos publicamente - não por padrão, mas apenas clicando em Compartilhar.</p><p>Entre as conversas recém-descobertas, os padrões emergem de nossas descobertas originais. A maioria dos bate-papos compartilhados é inofensiva, mas alguns deles não são. Aqui estão três exemplos do banco de dados archive.org (veja a nota abaixo por que não mencionamos nomes):</p><p><a href="https://bolha.one/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://bolha.one/tags/IA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>IA</span></a> <a href="https://bolha.one/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a> <a href="https://bolha.one/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ChatGPT</span></a> <a href="https://bolha.one/tags/Prote%C3%A7%C3%A3odeDados" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ProteçãodeDados</span></a> <a href="https://bolha.one/tags/Privacidade" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Privacidade</span></a> <a href="https://bolha.one/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a>"</p><p><span class="h-card" translate="no"><a href="https://lemmy.pt/c/tecnologia" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>tecnologia</span></a></span> <span class="h-card" translate="no"><a href="https://lemmy.eco.br/c/privacidade" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>privacidade</span></a></span> </p><p><span class="h-card" translate="no"><a href="https://tldr.nettime.org/@remixtures" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>remixtures</span></a></span> <a href="https://tldr.nettime.org/@remixtures/114958931922453021" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">tldr.nettime.org/@remixtures/1</span><span class="invisible">14958931922453021</span></a></p>
Miguel Afonso Caetano<p>"While OpenAI scrambled to de-index conversations from Google today, they forgot the internet's most basic rule—nothing truly disappears. Over 100.000 ChatGPT chats are still in Archive.org, although with a twist. The chats aren't just links or fragments. They're complete conversations, frozen in time, containing similar “confessions” we exposed yesterday. Users shared these chats publicly - not by default, but only by clicking Share.</p><p>Among the freshly uncovered conversations, patterns emerge from our original findings. Most of the shared chats are harmless, but some of them are not. Here are three examples from the archive.org database (see note below why we don’t mention names):"</p><p><a href="https://www.digitaldigging.org/p/chatgpt-confessions-gone-they-are" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">digitaldigging.org/p/chatgpt-c</span><span class="invisible">onfessions-gone-they-are</span></a></p><p><a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a> <a href="https://tldr.nettime.org/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ChatGPT</span></a> <a href="https://tldr.nettime.org/tags/DataProtection" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataProtection</span></a> <a href="https://tldr.nettime.org/tags/Privacy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Privacy</span></a> <a href="https://tldr.nettime.org/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a></p>
Bob Carver<p>Everyday Privacy—What You’re Giving Away Without Realizing It<br><a href="https://youtu.be/fAzFVBDuF9U" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">youtu.be/fAzFVBDuF9U</span><span class="invisible"></span></a> <a href="https://infosec.exchange/tags/cybersecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cybersecurity</span></a> <a href="https://infosec.exchange/tags/DigitalPrivacy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DigitalPrivacy</span></a> <a href="https://infosec.exchange/tags/DataSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataSecurity</span></a> <a href="https://infosec.exchange/tags/PrivacyMatters" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PrivacyMatters</span></a> <a href="https://infosec.exchange/tags/SmartDeviceRisks" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SmartDeviceRisks</span></a><br><a href="https://infosec.exchange/tags/VoiceAssistantPrivacy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceAssistantPrivacy</span></a> <a href="https://infosec.exchange/tags/LocationTracking" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LocationTracking</span></a> <a href="https://infosec.exchange/tags/SurveillanceEconomy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SurveillanceEconomy</span></a> <a href="https://infosec.exchange/tags/AIandPrivacy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIandPrivacy</span></a>#EverydaySurveillance <a href="https://infosec.exchange/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a></p>
Matt Hodgkinson<p>vibe physics</p><p>Angela Collier tears into billionaires and tech bros who think they can push the boundaries of physics by prompting a sycophantic language model.<br> <a href="https://m.youtube.com/watch?si=-AMxQgiyNgoZjQsC&amp;v=TMoz3gSXBcY&amp;feature=youtu.be" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">m.youtube.com/watch?si=-AMxQgi</span><span class="invisible">yNgoZjQsC&amp;v=TMoz3gSXBcY&amp;feature=youtu.be</span></a></p><p><a href="https://scicomm.xyz/tags/ChatBots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ChatBots</span></a> <a href="https://scicomm.xyz/tags/VibePhysics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VibePhysics</span></a> <a href="https://scicomm.xyz/tags/AngelaCollier" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AngelaCollier</span></a> <a href="https://scicomm.xyz/tags/YouTube" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>YouTube</span></a> <a href="https://scicomm.xyz/tags/TechBros" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechBros</span></a> <a href="https://scicomm.xyz/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://scicomm.xyz/tags/Pseudoscience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Pseudoscience</span></a> <a href="https://scicomm.xyz/tags/PseudoPhysics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PseudoPhysics</span></a> <a href="https://scicomm.xyz/tags/AntiIntellectualism" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AntiIntellectualism</span></a> <a href="https://scicomm.xyz/tags/AItools" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AItools</span></a> <a href="https://scicomm.xyz/tags/AIscience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIscience</span></a> <a href="https://scicomm.xyz/tags/Billionaires" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Billionaires</span></a> <a href="https://scicomm.xyz/tags/PodcastBrain" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PodcastBrain</span></a></p>
Tino Eberl<p><a href="https://mastodon.online/tags/SteadyCommunityContent" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SteadyCommunityContent</span></a> KINews <a href="https://mastodon.online/tags/Retr%C3%B6t" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Retröt</span></a></p><p>Wie gut sind KI-<a href="https://mastodon.online/tags/Suchmaschinen" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Suchmaschinen</span></a> beim Auffinden und Zitieren journalistischer Originalquellen?</p><p>Wenn generative <a href="https://mastodon.online/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a> mit <a href="https://mastodon.online/tags/Internetsuche" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Internetsuche</span></a> falsche <a href="https://mastodon.online/tags/Quellen" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Quellen</span></a> zitieren und sich dabei sicher geben, wird es riskant. Eine Untersuchung des Tow Center for Digital Journalism hat acht KI-Suchmaschinen unter die Lupe genommen und geschaut, wie gut die KI-Syseme im Umgang mit Originalartikeln abschneiden.</p><p><a href="https://mastodon.online/tags/Fehlinformation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Fehlinformation</span></a> <a href="https://mastodon.online/tags/Quellenangaben" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Quellenangaben</span></a></p><p><a href="https://tino-eberl.de/uncategorized/ki-suchmaschinen-im-faktencheck-60-der-zitate-sind-falsch/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">tino-eberl.de/uncategorized/ki</span><span class="invisible">-suchmaschinen-im-faktencheck-60-der-zitate-sind-falsch/</span></a></p>
Miguel Afonso Caetano<p>"Something that has become undeniable this month is that the best available open weight models now come from the Chinese AI labs.</p><p>I continue to have a lot of love for Mistral, Gemma and Llama but my feeling is that Qwen, Moonshot and Z.ai have positively smoked them over the course of July."</p><p><a href="https://simonwillison.net/2025/Jul/30/chinese-models/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">simonwillison.net/2025/Jul/30/</span><span class="invisible">chinese-models/</span></a></p><p><a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/China" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>China</span></a> <a href="https://tldr.nettime.org/tags/OpenWeight" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenWeight</span></a> <a href="https://tldr.nettime.org/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://tldr.nettime.org/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a></p>
ResearchBuzz: Firehose<p>Fast Company: Exclusive: Google could be reading your ChatGPT conversations. Concerned? You should be. “Google is indexing conversations with ChatGPT that users have sent to friends, families, or colleagues—turning private exchanges intended for small groups into search results visible to millions.”</p><p><a href="https://rbfirehose.com/2025/07/31/exclusive-google-could-be-reading-your-chatgpt-conversations-concerned-you-should-be-fast-company/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/07/31/exclusive-google-could-be-reading-your-chatgpt-conversations-concerned-you-should-be-fast-company/</a></p>
Dr Pen<p>Do you know anyone who has regular 'private' conversations with a chatbot? By conversations I mean lengthy 1-2-1 "discussions" on fairly in depth topics, either personal or academic/scientific. By chatbot I mean any prominent LLM chat app. Im curious how many people are doing this.</p><p><a href="https://mastodon.social/tags/chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatbots</span></a> <a href="https://mastodon.social/tags/ai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ai</span></a> <a href="https://mastodon.social/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ChatGPT</span></a> <a href="https://mastodon.social/tags/GenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenAI</span></a> <a href="https://mastodon.social/tags/academia" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>academia</span></a> <a href="https://mastodon.social/tags/academicchatter" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>academicchatter</span></a></p>
The Conversation U.S.<p>If people can’t intuitively tell the difference between human writing and <a href="https://newsie.social/tags/chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatbots</span></a>, perhaps there are other methods for determining human versus artificial authorship. <a href="https://newsie.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://theconversation.com/too-many-em-dashes-weird-words-like-delves-spotting-text-written-by-chatgpt-is-still-more-art-than-science-259629" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">theconversation.com/too-many-e</span><span class="invisible">m-dashes-weird-words-like-delves-spotting-text-written-by-chatgpt-is-still-more-art-than-science-259629</span></a></p>
Miguel Afonso Caetano<p>"Far be it from me to accuse Anthropic of this. When they designed MCP, the idea was to quickly and easily extend chat interfaces with tool functionality (and a whole bunch of other stuff that folks ignore in the protocol!). For that context, it’s actually a good fit for the job (bar some caveats that can easily be fixed).</p><p>No, the dünnbrettbohrer of the MCP world are the implementers of the MCP servers themselves. Right now, it’s the peak of the hype cycle of inflated expectations, meaning a lot of people are selling low-code, or no-code, dressed up as MCP — but it’s still the same old shenanigans under the hood.</p><p>What I would like to achieve today is to give you simple guidance on when, how, and where to use MCP without shooting yourself in the foot (such as with Github’s latest MCP server disaster, an exploit that left private repository data vulnerable to attackers)."</p><p><a href="https://nordicapis.com/mcp-if-you-must-then-do-it-like-this/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">nordicapis.com/mcp-if-you-must</span><span class="invisible">-then-do-it-like-this/</span></a></p><p><a href="https://tldr.nettime.org/tags/CyberSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CyberSecurity</span></a> <a href="https://tldr.nettime.org/tags/MCP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MCP</span></a> <a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://tldr.nettime.org/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a> <a href="https://tldr.nettime.org/tags/APIs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>APIs</span></a></p>
Miguel Afonso Caetano<p>"Recent advances have enabled LLM-powered AI agents to autonomously execute complex tasks by combining language model reasoning with tools, memory, and web access. But can these systems be trusted to follow deployment policies in realistic environments, especially under attack? To investigate, we ran the largest public red-teaming competition to date, targeting 22 frontier AI agents across 44 realistic deployment scenarios. Participants submitted 1.8 million prompt-injection attacks, with over 60,000 successfully eliciting policy violations such as unauthorized data access, illicit financial actions, and regulatory noncompliance. We use these results to build the Agent Red Teaming (ART) benchmark - a curated set of high-impact attacks - and evaluate it across 19 state-of-the-art models. Nearly all agents exhibit policy violations for most behaviors within 10-100 queries, with high attack transferability across models and tasks. Importantly, we find limited correlation between agent robustness and model size, capability, or inference-time compute, suggesting that additional defenses are needed against adversarial misuse. Our findings highlight critical and persistent vulnerabilities in today's AI agents. By releasing the ART benchmark and accompanying evaluation framework, we aim to support more rigorous security assessment and drive progress toward safer agent deployment."</p><p><a href="https://arxiv.org/abs/2507.20526" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/abs/2507.20526</span><span class="invisible"></span></a></p><p><a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://tldr.nettime.org/tags/CyberSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CyberSecurity</span></a> <a href="https://tldr.nettime.org/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a> <a href="https://tldr.nettime.org/tags/AIAgents" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIAgents</span></a> <a href="https://tldr.nettime.org/tags/AgenticAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AgenticAI</span></a></p>
Ecologia Digital<p>"Part of the challenge for <a href="https://mato.social/tags/AIdevelopers" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIdevelopers</span></a> is reaching a balance between verifying information for accuracy and enabling the model to be “<a href="https://mato.social/tags/creative" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>creative</span></a>”."<br>Is there a chance that AI firms will be the ones hiring good <a href="https://mato.social/tags/humanjournalists" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>humanjournalists</span></a> to keep the facts coming?</p><p>FT: "The ‘<a href="https://mato.social/tags/hallucinations" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hallucinations</span></a>’ that haunt <a href="https://mato.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a>: why <a href="https://mato.social/tags/chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatbots</span></a> struggle to tell the <a href="https://mato.social/tags/truth" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>truth</span></a>"</p><p><a href="https://www.ft.com/content/7a4e7eae-f004-486a-987f-4a2e4dbd34fb" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">ft.com/content/7a4e7eae-f004-4</span><span class="invisible">86a-987f-4a2e4dbd34fb</span></a></p>
Miguel Afonso Caetano<p>"The world’s leading artificial intelligence groups are stepping up efforts to reduce the number of “hallucinations” in large language models, as they seek to solve one of the big obstacles limiting take-up of the powerful technology.</p><p>Google, Amazon, Cohere and Mistral are among those trying to bring down the rate of these fabricated answers by rolling out technical fixes, improving the quality of the data in AI models, and building verification and fact-checking systems across their generative AI products.</p><p>The move to reduce these so-called hallucinations is seen as crucial to increase the use of AI tools across industries such as law and health, which require accurate information, and help boost the AI sector’s revenues.</p><p>It comes as chatbot errors have already resulted in costly mistakes and litigation. Last year, a tribunal ordered Air Canada to honour a discount that its customer service chatbot had made up, and lawyers who have used AI tools in court documents have faced sanctions after it made up citations.</p><p>But AI experts warn that eliminating hallucinations completely from large language models is impossible because of how the systems operate."</p><p><a href="https://www.ft.com/content/7a4e7eae-f004-486a-987f-4a2e4dbd34fb" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">ft.com/content/7a4e7eae-f004-4</span><span class="invisible">86a-987f-4a2e4dbd34fb</span></a></p><p><a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://tldr.nettime.org/tags/Chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Chatbots</span></a> <a href="https://tldr.nettime.org/tags/Hallucinations" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Hallucinations</span></a></p>
ResearchBuzz: Firehose<p>PC World: Proton’s new encrypted AI chatbot, Lumo, puts your privacy first. “Lumo can do a lot of the things that other AI chatbots can do, like summarize documents, write emails, and generate code. But all user data is stored locally and protected with so-called ‘zero-access’ encryption. This means that only you have the key to your own content.”</p><p><a href="https://rbfirehose.com/2025/07/29/pc-world-protons-new-encrypted-ai-chatbot-lumo-puts-your-privacy-first/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/07/29/pc-world-protons-new-encrypted-ai-chatbot-lumo-puts-your-privacy-first/</a></p>
ResearchBuzz: Firehose<p>Ars Technica: xAI workers balked over training request to help “give Grok a face,” docs show . “Dozens of xAI employees expressed concerns—and many objected—when asked to record videos of their facial expressions to help “give Grok a face,” Business Insider reported. BI reviewed internal documents and Slack messages, finding that the so-called project ‘Skippy’ was designed to help […]</p><p><a href="https://rbfirehose.com/2025/07/28/ars-technica-xai-workers-balked-over-training-request-to-help-give-grok-a-face-docs-show/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/07/28/ars-technica-xai-workers-balked-over-training-request-to-help-give-grok-a-face-docs-show/</a></p>
ResearchBuzz: Firehose<p>Engadget: DuckDuckGo now lets you customize the responses of its Duck.ai chatbots. “Since last June, when DuckDuckGo introduced AI Chat, you’ve been able to use chat bots like Claude directly through the browser. Now the company is making it easier to tweak the system prompts of those AI models while retaining your privacy.”</p><p><a href="https://rbfirehose.com/2025/07/28/engadget-duckduckgo-now-lets-you-customize-the-responses-of-its-duck-ai-chatbots/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/07/28/engadget-duckduckgo-now-lets-you-customize-the-responses-of-its-duck-ai-chatbots/</a></p>