machine-learning

We exploited a lack of isolation mechanisms in multiple agentic browsers to perform attacks ranging from the dissemination of false information to cross-site data leaks. These attacks resurface decades-old patterns of vulnerabilities that the web security community spent years building effective defenses against.

LLMs fundamentally differ from compilers because they lack determinism and semantic guarantees, making them useful coding assistants but unreliable for autonomous code generation without human review and formal verification.

We bypassed human approval protections for system command execution in AI agents, achieving RCE in three agent platforms.

We’ve added a pickle file scanner to Fickling that uses an allowlist approach to protect AI/ML environments from malicious pickle files that could compromise models or infrastructure.

Our business operations intern at Trail of Bits built two AI-powered tools that became permanent company resources—a podcast workflow that saves 1,250 hours annually and a Slack exporter that enables efficient knowledge retrieval across the organization.

In this blog post, we’ll detail how attackers can exploit image scaling on Gemini CLI, Vertex AI Studio, Gemini’s web and API interfaces, Google Assistant, Genspark, and other production AI systems. We’ll also explain how to mitigate and defend against these attacks, and we’ll introduce Anamorpher, our open-source tool that lets you explore and generate these crafted images.

Our team won the runner-up prize of $3M at DARPA’s AI Cyber Challenge, demonstrating Buttercup’s world-class automated vulnerability discovery and patching capabilities with remarkable cost efficiency.

Now that DARPA’s AI Cyber Challenge (AIxCC) has officially ended, we can finally make Buttercup, our CRS (Cyber Reasoning System), open source!

While the AIxCC winner has not yet been announced, differences in the finalists’ approaches show that there are multiple viable paths forward to using AI for vulnerability detection.

Prompt injection pervades discussions about security for LLMs and AI agents. But there is little public information on how to write powerful, discreet, and reliable prompt injection exploits. In this post, we will design and implement a prompt injection exploit targeting GitHub’s Copilot Agent, with a focus on maximizing reliability and minimizing the odds of detection.

In my first month at Trail of Bits as an AI/ML security engineer, I found two remotely accessible memory corruption bugs in NVIDIA’s Triton Inference Server during a routine onboarding practice.

We’re releasing pajaMAS: a curated set of MAS hijacking demos that illustrate important principles of MAS security.

Today we’re announcing the beta release of mcp-context-protector, a security wrapper for LLM apps using the Model Context Protocol (MCP). It defends against the line jumping attacks documented earlier in this blog series, such as prompt injection via tool descriptions and ANSI terminal escape codes.

Datasig generates compact, unique fingerprints for AI/ML datasets that let you compare training data with high accuracy—without needing access to the raw data itself.
This critical capability helps AIBOM (AI bill of materials) tools detect data-borne vulnerabilities that traditional security tools completely miss.

This post describes how many examples of MCP software store long-term API keys for third-party services in plaintext on the local filesystem, often with insecure, world-readable permissions.

This post describes attacks using ANSI terminal code escape sequences to hide malicious instructions to the LLM, leveraging the line jumping vulnerability we discovered in MCP.

Malicious MCP servers can inject trigger phrases into tool descriptions to exfiltrate entire conversation histories and steal sensitive credentials and IP.

MCP’s ’line jumping’ vulnerability lets malicious servers inject prompts through tool descriptions to manipulate AI behavior before tools are ever invoked.

Trail of Bits’ Buttercup competes in DARPA’s AIxCC Finals with expanded resources, multiple rounds, new challenge types, and custom AI model capabilities.

While Trail of Bits is known for developing security tools like Slither, Medusa, and Fickling, our engineering efforts extend far beyond our own projects. Throughout 2024, our team has been deeply engaged with the broader security ecosystem, tackling challenges in open-source tools and infrastructure that security engineers rely on every day. This year, our engineers […]

AI-enabled code assistants (like GitHub’s Copilot, Continue.dev, and Tabby) are making software development faster and more productive. Unfortunately, these tools are often bad at Solidity. So we decided to improve them! To make it easier to write, edit, and understand Solidity with AI-enabled tools, we have: Added support for Solidity into Tabby […]

This is a joint post with the Hugging Face Gradio team; read their announcement here! You can find the full report with all of the detailed findings from our security audit of Gradio 5 here. Hugging Face hired Trail of Bits to audit Gradio 5, a popular open-source library that provides a web interface that […]

At DEF CON, Michael Brown, Principal Security Engineer at Trail of Bits, sat down with Michael Novinson from Information Security Media Group (ISMG) to discuss four critical areas where AI/ML is revolutionizing security. Here’s what they covered: AI/ML techniques surpass the limits of traditional software analysis As Moore’s law slows down after 20 years of […]

Today we’re going to provision some cloud infrastructure the Max Power way: by combining automation with unchecked AI output. Unfortunately, this method produces cloud infrastructure code that 1) works and 2) has terrible security properties. In a nutshell, AI-based tools like Claude and ChatGPT readily provide extremely bad cloud infrastructure provisioning code, […]

With DARPA’s AI Cyber Challenge (AIxCC) semifinal starting today at DEF CON 2024, we want to introduce Buttercup, our AIxCC submission. Buttercup is a Cyber Reasoning System (CRS) that combines conventional cybersecurity techniques like fuzzing and static analysis with AI and machine learning to find and fix software vulnerabilities. The system is designed to operate […]

Articles about: machine-learning

Lack of isolation in agentic browsers resurfaces old vulnerabilities

Can chatbots craft correct code?

Prompt injection to RCE in AI agents

Fickling’s new AI/ML pickle file scanner

Intern projects that outlived the internship

Weaponizing image scaling against production AI systems

Trail of Bits' Buttercup wins 2nd place in AIxCC Challenge

Buttercup is now open-source!

AIxCC finals: Tale of the tape

Prompt injection engineering for attackers: Exploiting GitHub Copilot

Uncovering memory corruption in NVIDIA Triton (as a new hire)

Hijacking multi-agent systems in your PajaMAS

We built the security layer MCP always needed

Datasig: Fingerprinting AI/ML datasets to stop data-borne attacks

Insecure credential storage plagues MCP

Deceiving users with ANSI terminal codes in MCP

How MCP servers can steal your conversation history

Jumping the line: How MCP servers can attack you before you ever use them

Kicking off AIxCC’s Finals with Buttercup

Celebrating our 2024 open-source contributions

Evaluating Solidity support in AI coding assistants

Auditing Gradio 5, Hugging Face’s ML GUI framework

Inside DEF CON: Michael Brown on how AI/ML is revolutionizing cybersecurity

Provisioning cloud infrastructure the wrong way, but faster

Trail of Bits’ Buttercup heads to DARPA’s AIxCC