machine-learning on The Trail of Bits Blog

Lack of isolation in agentic browsers resurfaces old vulnerabilities

Tue, 13 Jan 2026 07:00:00 -0500

We exploited a lack of isolation mechanisms in multiple agentic browsers to perform attacks ranging from the dissemination of false information to cross-site data leaks. These attacks resurface decades-old patterns of vulnerabilities that the web security community spent years building effective defenses against.

Can chatbots craft correct code?

Fri, 19 Dec 2025 07:00:00 -0500

LLMs fundamentally differ from compilers because they lack determinism and semantic guarantees, making them useful coding assistants but unreliable for autonomous code generation without human review and formal verification.

Prompt injection to RCE in AI agents

Wed, 22 Oct 2025 07:00:00 -0400

We bypassed human approval protections for system command execution in AI agents, achieving RCE in three agent platforms.

Fickling’s new AI/ML pickle file scanner

Tue, 16 Sep 2025 07:00:00 -0400

We’ve added a pickle file scanner to Fickling that uses an allowlist approach to protect AI/ML environments from malicious pickle files that could compromise models or infrastructure.
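
To make the allowlist idea concrete, here is a minimal sketch built only on Python's standard library `pickle.Unpickler.find_class` hook. It is not Fickling's API or implementation; the `ALLOWED_GLOBALS` set, `AllowlistUnpickler`, and `safe_loads` are hypothetical names chosen for illustration, and a real scanner's allowlist would be far more carefully curated.

```python
import io
import pickle

# Hypothetical allowlist for illustration only; not Fickling's actual list.
ALLOWED_GLOBALS = {
    ("builtins", "list"),
    ("builtins", "dict"),
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import any global not explicitly allowed."""

    def find_class(self, module, name):
        # find_class is called whenever the pickle stream references a global.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize data, rejecting references to non-allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

Plain containers of built-in values load normally, while a payload whose `__reduce__` points at something like `builtins.eval` is rejected before it can run. An allowlist fails closed: anything not anticipated is blocked, which is the opposite trade-off from a denylist.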

Intern projects that outlived the internship

Thu, 28 Aug 2025 07:00:00 -0400

Our business operations intern at Trail of Bits built two AI-powered tools that became permanent company resources—a podcast workflow that saves 1,250 hours annually and a Slack exporter that enables efficient knowledge retrieval across the organization.

Weaponizing image scaling against production AI systems

Thu, 21 Aug 2025 07:00:00 -0400

In this blog post, we’ll detail how attackers can exploit image scaling on Gemini CLI, Vertex AI Studio, Gemini’s web and API interfaces, Google Assistant, Genspark, and other production AI systems. We’ll also explain how to mitigate and defend against these attacks, and we’ll introduce Anamorpher, our open-source tool that lets you explore and generate these crafted images.

Trail of Bits’ Buttercup wins 2nd place in AIxCC Challenge

Sat, 09 Aug 2025 10:30:00 -0400

Our team won the runner-up prize of $3M at DARPA’s AI Cyber Challenge, demonstrating Buttercup’s world-class automated vulnerability discovery and patching capabilities with remarkable cost efficiency.

Buttercup is now open-source!

Fri, 08 Aug 2025 00:00:00 -0400

Now that DARPA’s AI Cyber Challenge (AIxCC) has officially ended, we can finally make Buttercup, our CRS (Cyber Reasoning System), open source!

AIxCC finals: Tale of the tape

Thu, 07 Aug 2025 00:00:00 -0400

While the AIxCC winner has not yet been announced, differences in the finalists’ approaches show that there are multiple viable paths forward to using AI for vulnerability detection.

Prompt injection engineering for attackers: Exploiting GitHub Copilot

Wed, 06 Aug 2025 00:00:00 -0400

Prompt injection pervades discussions about security for LLMs and AI agents. But there is little public information on how to write powerful, discreet, and reliable prompt injection exploits. In this post, we will design and implement a prompt injection exploit targeting GitHub’s Copilot Agent, with a focus on maximizing reliability and minimizing the odds of detection.

Uncovering memory corruption in NVIDIA Triton (as a new hire)

Tue, 05 Aug 2025 07:00:00 -0400

In my first month at Trail of Bits as an AI/ML security engineer, I found two remotely accessible memory corruption bugs in NVIDIA’s Triton Inference Server during a routine onboarding practice.

Hijacking multi-agent systems in your PajaMAS

Thu, 31 Jul 2025 09:00:00 -0400

We’re releasing pajaMAS: a curated set of MAS hijacking demos that illustrate important principles of MAS security.

We built the security layer MCP always needed

Mon, 28 Jul 2025 07:00:00 -0400

Today we’re announcing the beta release of mcp-context-protector, a security wrapper for LLM apps using the Model Context Protocol (MCP). It defends against the line jumping attacks documented earlier in this blog series, such as prompt injection via tool descriptions and ANSI terminal escape codes.

Datasig: Fingerprinting AI/ML datasets to stop data-borne attacks

Fri, 02 May 2025 07:00:00 -0400

Datasig generates compact, unique fingerprints for AI/ML datasets that let you compare training data with high accuracy, without needing access to the raw data itself. This critical capability helps AIBOM (AI bill of materials) tools detect data-borne vulnerabilities that traditional security tools completely miss.

Insecure credential storage plagues MCP

Wed, 30 Apr 2025 03:00:00 -0400

This post describes how many examples of MCP software store long-term API keys for third-party services in plaintext on the local filesystem, often with insecure, world-readable permissions.
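
As a rough illustration of the permissions problem, the sketch below checks whether a file is readable by every local user and tightens it to owner-only access. It assumes a POSIX system; the helper names `is_world_readable` and `restrict_to_owner` are hypothetical and not taken from the post or from any MCP software.

```python
import os
import stat


def is_world_readable(path: str) -> bool:
    """Return True if the 'other' read bit is set on the file (POSIX)."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


def restrict_to_owner(path: str) -> None:
    """Set permissions to read/write for the owner only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Tightening permissions only narrows who can read the file. The key is still stored in plaintext, so an OS-provided credential store is generally a stronger choice where one is available.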

Deceiving users with ANSI terminal codes in MCP

Tue, 29 Apr 2025 09:00:00 -0400

This post describes attacks using ANSI terminal code escape sequences to hide malicious instructions to the LLM, leveraging the line jumping vulnerability we discovered in MCP.
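
One defensive angle can be sketched as follows: detect or strip escape sequences in untrusted text before it is shown in a terminal. This is a minimal sketch, not the method from the post. The pattern covers only CSI-style sequences (ESC followed by `[`), and the function names are hypothetical.

```python
import re

# Matches CSI sequences: ESC [ parameter bytes, intermediate bytes, final byte.
CSI_PATTERN = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def contains_escape(text: str) -> bool:
    """Return True if the text contains a raw ESC character."""
    return "\x1b" in text


def strip_csi(text: str) -> str:
    """Remove CSI escape sequences so hidden text becomes visible."""
    return CSI_PATTERN.sub("", text)


# A tool description that uses SGR 8 (conceal) to hide an instruction.
description = "Safe tool\x1b[8m do X\x1b[0m"
visible = strip_csi(description)
```

Stripping the sequences means a human reviewer sees the same characters the model receives. A stricter policy would reject any tool description for which `contains_escape` returns True.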

How MCP servers can steal your conversation history

Wed, 23 Apr 2025 10:30:00 -0400

Malicious MCP servers can inject trigger phrases into tool descriptions to exfiltrate entire conversation histories and steal sensitive credentials and IP.

Jumping the line: How MCP servers can attack you before you ever use them

Mon, 21 Apr 2025 10:30:00 -0400

MCP’s ‘line jumping’ vulnerability lets malicious servers inject prompts through tool descriptions to manipulate AI behavior before tools are ever invoked.

Kicking off AIxCC’s Finals with Buttercup

Mon, 21 Apr 2025 09:00:00 -0400

Trail of Bits’ Buttercup competes in DARPA’s AIxCC Finals with expanded resources, multiple rounds, new challenge types, and custom AI model capabilities.

Celebrating our 2024 open-source contributions

Thu, 23 Jan 2025 09:00:30 -0500

While Trail of Bits is known for developing security tools like Slither, Medusa, and Fickling, our engineering efforts extend far beyond our own projects. Throughout 2024, our team has been deeply engaged with the broader security ecosystem, tackling challenges in open-source tools and infrastructure that security engineers rely on every day. This year, our engineers […]

Evaluating Solidity support in AI coding assistants

Tue, 19 Nov 2024 09:00:37 -0500

AI-enabled code assistants (like GitHub’s Copilot, Continue.dev, and Tabby) are making software development faster and more productive. Unfortunately, these tools are often bad at Solidity. So we decided to improve them! To make it easier to write, edit, and understand Solidity with AI-enabled tools, we have: Added support for Solidity into Tabby […]

Auditing Gradio 5, Hugging Face’s ML GUI framework

Thu, 10 Oct 2024 12:00:29 -0400

This is a joint post with the Hugging Face Gradio team; read their announcement here! You can find the full report with all of the detailed findings from our security audit of Gradio 5 here. Hugging Face hired Trail of Bits to audit Gradio 5, a popular open-source library that provides a web interface that […]

Inside DEF CON: Michael Brown on how AI/ML is revolutionizing cybersecurity

Tue, 17 Sep 2024 09:00:08 -0400

At DEF CON, Michael Brown, Principal Security Engineer at Trail of Bits, sat down with Michael Novinson from Information Security Media Group (ISMG) to discuss four critical areas where AI/ML is revolutionizing security. Here’s what they covered: AI/ML techniques surpass the limits of traditional software analysis. As Moore’s law slows down after 20 years of […]

Provisioning cloud infrastructure the wrong way, but faster

Tue, 27 Aug 2024 09:00:06 -0400

Today we’re going to provision some cloud infrastructure the Max Power way: by combining automation with unchecked AI output. Unfortunately, this method produces cloud infrastructure code that 1) works and 2) has terrible security properties. In a nutshell, AI-based tools like Claude and ChatGPT readily provide extremely bad cloud infrastructure provisioning code, […]

Trail of Bits’ Buttercup heads to DARPA’s AIxCC

Fri, 09 Aug 2024 09:10:29 -0400

With DARPA’s AI Cyber Challenge (AIxCC) semifinal starting today at DEF CON 2024, we want to introduce Buttercup, our AIxCC submission. Buttercup is a Cyber Reasoning System (CRS) that combines conventional cybersecurity techniques like fuzzing and static analysis with AI and machine learning to find and fix software vulnerabilities. The system is designed to operate […]

Auditing the Ask Astro LLM Q&A app

Fri, 05 Jul 2024 09:00:28 -0400

Today, we present the second of our open-source AI security audits: a look at security issues we found in an open-source retrieval augmented generation (RAG) application that could lead to chatbot output poisoning, inaccurate document ingestion, and potential denial of service. This audit follows up on our previous work that identified 11 security vulnerabilities in […]

Understanding Apple’s On-Device and Server Foundation Models release

Fri, 14 Jun 2024 16:49:37 -0400

Earlier this week, at Apple’s WWDC, we finally witnessed Apple’s AI strategy. The videos and live demos were accompanied by two long-form releases: Apple’s Private Cloud Compute and Apple’s On-Device and Server Foundation Models. This blog post is about the latter. So, what is Apple releasing, and how does it compare to […]

PCC: Bold step forward, not without flaws

Fri, 14 Jun 2024 15:46:48 -0400

Earlier this week, Apple announced Private Cloud Compute (or PCC for short). Without deep context on the state of the art of Artificial Intelligence (AI) and Machine Learning (ML) security, some sensible design choices may seem surprising. Conversely, some of the risks linked to this design are hidden in the fine print. […]

Exploiting ML models with pickle file attacks: Part 2

Tue, 11 Jun 2024 11:00:17 -0400

In part 1, we introduced Sleepy Pickle, an attack that uses malicious pickle files to stealthily compromise ML models and carry out sophisticated attacks against end users. Here we show how this technique can be adapted to enable long-lasting presence on compromised systems while remaining undetected. This variant technique, which we call […]

Exploiting ML models with pickle file attacks: Part 1

Tue, 11 Jun 2024 09:00:36 -0400

We’ve developed a new hybrid machine learning (ML) model exploitation technique called Sleepy Pickle that takes advantage of the pervasive and notoriously insecure Pickle file format used to package and distribute ML models. Sleepy Pickle goes beyond previous exploit techniques that target an organization’s systems when they deploy ML models to instead […]

Announcing AI/ML safety and security trainings

Fri, 07 Jun 2024 09:00:41 -0400

We are offering AI/ML safety and security training this year! Recent advances in AI/ML technologies opened up a new world of possibilities for businesses to run more efficiently and offer better services and products. However, incorporating AI/ML into computing systems brings new and unique complexities, risks, and attack surfaces. In our experience […]

Relishing new Fickling features for securing ML systems

Mon, 04 Mar 2024 09:00:44 -0500

We’ve added new features to Fickling to offer enhanced threat detection and analysis across a broad spectrum of machine learning (ML) workflows. Fickling is a decompiler, static analyzer, and bytecode rewriter for the Python pickle module that can help you detect, analyze, or create malicious pickle files. While the ML community […]

Our response to the US Army’s RFI on developing AIBOM tools

Wed, 28 Feb 2024 11:30:05 -0500

The US Army’s Program Executive Office for Intelligence, Electronic Warfare and Sensors (PEO IEW&S) recently issued a request for information (RFI) on methods to implement and automate production of an artificial intelligence bill of materials (AIBOM) as part of Project Linchpin. The RFI describes the AIBOM as a detailed […]

Celebrating our 2023 open-source contributions

Wed, 24 Jan 2024 09:00:22 -0500

At Trail of Bits, we pride ourselves on making our best tools open source, such as Slither, PolyTracker, and RPC Investigator. But while this post is about open source, it’s not about our tools… In 2023, our employees submitted over 450 pull requests (PRs) that were merged into non-Trail of Bits repositories. This demonstrates our […]

Our thoughts on AIxCC’s competition format

Thu, 18 Jan 2024 09:00:38 -0500

Late last month, DARPA officially opened registration for their AI Cyber Challenge (AIxCC). As part of the festivities, DARPA also released some highly anticipated information about the competition: a request for comments (RFC) that contained a sample challenge problem and the scoring methodology. Prior rules documents and FAQs released by DARPA painted […]

LeftoverLocals: Listening to LLM responses through leaked GPU local memory

Tue, 16 Jan 2024 12:00:39 -0500

We are disclosing LeftoverLocals: a vulnerability that allows recovery of data from GPU local memory created by another process on Apple, Qualcomm, AMD, and Imagination GPUs. LeftoverLocals impacts the security posture of GPU applications as a whole, with particular significance to LLMs and ML models run on impacted GPU […]

AI In Windows: Investigating Windows Copilot

Wed, 27 Dec 2023 09:00:22 -0500

AI is becoming ubiquitous, as developers of widely used tools like GitHub and Photoshop are quickly implementing and iterating on AI-enabled features. With Microsoft’s recent integration of Copilot into Windows, AI is even on the old stalwart of computing—the desktop. The integration of an AI assistant into an entire operating system is a significant development that warrants investigation.

Assessing the security posture of a widely used vision model: YOLOv7

Wed, 15 Nov 2023 10:15:05 -0500

TL;DR: We identified 11 security vulnerabilities in YOLOv7, a popular computer vision framework, that could enable attacks including remote code execution (RCE), denial of service, and model differentials (where an attacker can trigger a model to perform differently in different contexts). Open-source software […]

How AI will affect cybersecurity: What we told the CFTC

Mon, 31 Jul 2023 07:00:32 -0400

By Dan Guido, CEO. The second meeting of the Commodity Futures Trading Commission’s Technology Advisory Committee (TAC) on July 18 focused on the effects of AI on the financial sector. During the meeting, I explained that AI has the potential to fundamentally change the balance between cyber offense and defense, and that we need security-focused benchmarks […]

Trail of Bits’s Response to OSTP National Priorities for AI RFI

Tue, 18 Jul 2023 13:46:44 -0400

The Office of Science and Technology Policy (OSTP) has circulated a request for information (RFI) on how best to develop policies that support the responsible development of AI while minimizing risk to rights, safety, and national security. In our response, we highlight the following points: To ensure that AI […]

Trail of Bits’s Response to NTIA AI Accountability RFC

Fri, 16 Jun 2023 08:00:10 -0400

The National Telecommunications and Information Administration (NTIA) has circulated an Artificial Intelligence (AI) Accountability Policy Request for Comment on what policies can support the development of AI audits, assessments, certifications, and other mechanisms to create earned trust in AI systems. Trail of Bits has submitted a response to the […]

Codex (and GPT-4) can’t beat humans on smart contract audits

Wed, 22 Mar 2023 07:00:49 -0400

Is artificial intelligence (AI) capable of powering software security audits? Over the last four months, we piloted a project called Toucan to find out. Toucan was intended to integrate OpenAI’s Codex into our Solidity auditing workflow. This experiment went far […]

We need a new way to measure AI security

Tue, 14 Mar 2023 08:00:47 -0400

Trail of Bits has launched a practice focused on machine learning and artificial intelligence, bringing together safety and security methodologies to create a new risk assessment and assurance program. This program evaluates potential bespoke risks and determines the necessary safety and security measures for AI-based systems.

Secure your machine learning with Semgrep

Mon, 03 Oct 2022 09:00:53 -0400

tl;dr: Our publicly available Semgrep ruleset now has 11 rules dedicated to the misuse of machine learning libraries. Try it out now! Picture this: You’ve spent months curating images, trying out different architectures, downloading pretrained models, messing with Kubernetes, and you’re finally ready to ship your sparkling new machine learning (ML) product. […]

PrivacyRaven: Implementing a proof of concept for model inversion

Tue, 09 Nov 2021 00:45:55 -0500

Originally published August 3, 2021. During my Trail of Bits winternship and springternship, I had the pleasure of working with Suha Hussain and Jim Miller on PrivacyRaven, a Python-based tool for testing deep-learning frameworks against a plethora of privacy attacks. I worked on improving PrivacyRaven’s versatility by adding compatibility for services […]

Never a dill moment: Exploiting machine learning pickle files

Mon, 15 Mar 2021 11:06:18 -0400

Many machine learning (ML) models are Python pickle files under the hood, and it makes sense. The use of pickling conserves memory, enables start-and-stop model training, and makes trained models portable (and, thereby, shareable). Pickling is easy to implement, is built into Python without requiring additional dependencies, and supports serialization of custom […]

Efficient audits with machine learning and Slither-simil

Fri, 23 Oct 2020 07:00:51 -0400

Trail of Bits has manually curated a wealth of data—years of security assessment reports—and now we’re exploring how to use this data to make the smart contract auditing process more efficient with Slither-simil. Based on accumulated knowledge embedded in previous audits, we set out to detect similar vulnerable code snippets […]

PrivacyRaven Has Left the Nest

Thu, 08 Oct 2020 08:00:36 -0400

If you work on deep learning systems, check out our new tool, PrivacyRaven—it’s a Python library that equips engineers and researchers with a comprehensive testing suite for simulating privacy attacks on deep learning systems. Because deep learning enables software to perform tasks without explicit programming, it’s become ubiquitous in […]

Multi-Party Computation on Machine Learning

Fri, 04 Oct 2019 10:13:15 -0400

During my internship this summer, I built a multi-party computation (MPC) tool that implements a 3-party computation protocol for perceptron and support vector machine (SVM) algorithms. MPC enables multiple parties to perform analyses on private datasets without sharing them with each other. I developed a technique that lets three parties obtain the results of machine […]