
Code Sandboxes

Code Sandboxes is a Python package for creating safe, isolated environments where AI systems can write, run, and test code without affecting the real world or the user's device.

Package Scope

This section clarifies what the package owns and what it delegates to adjacent layers.

Code Sandboxes is the execution layer in the Datalayer AI stack:

┌─────────────────────────────────────────────────────────────┐
│                       agent-runtimes                        │
│               (Agent hosting, protocols, UI)                │
├──────────────────────────┬──────────────────────────────────┤
│      agent-codemode      │           agent-skills           │
│   (discovery, codegen)   │       (skills management)        │
├──────────────────────────┴──────────────────────────────────┤
│                       code-sandboxes                        │ ◀── You are here
│              (Safe code execution environment)              │
└─────────────────────────────────────────────────────────────┘

Responsibilities:

  • ✅ Execute Python/shell code safely in isolated environments
  • ✅ Provide filesystem operations (read, write, list, upload, download)
  • ✅ Run shell commands with streaming output
  • ✅ Manage sandbox lifecycle (start, stop, snapshot)
  • ✅ Support multiple isolation levels (local, Docker, cloud)

Not Responsible For:

  • ❌ MCP tool discovery or binding generation (→ agent-codemode)
  • ❌ Skill management and composition (→ agent-skills)
  • ❌ Agent protocols or UI components (→ agent-runtimes)

Key Features

  • 🔒 Secure Isolation: Run untrusted code safely in sandboxed environments
  • 🐍 Python Code Execution: Execute Python code with streaming output and rich results
  • 📁 Filesystem Operations: Read, write, list, upload, and download files
  • 💻 Command Execution: Run shell commands with streaming support
  • 📊 Detailed Status Reporting: Distinguish between infrastructure and code-level failures
  • 🎯 Pydantic Models: Type-safe models with automatic validation and JSON serialization
  • ⚡ Multiple Backends: Local eval, Docker containers, Jupyter kernels, or cloud runtimes
  • 🔄 State Persistence: Maintain variables and context between executions
  • 📊 Rich Output: Support for text, HTML, images, and structured data
  • 📸 Snapshots: Save and restore sandbox state
  • 🚀 GPU Support: Access GPU compute for ML workloads

Sandbox Variants

Code Sandboxes supports multiple execution backends organized into two categories:

Local Sandboxes

Execute code in-process, sharing memory with the host Python process.

Variant      Isolation Level      Best For
local-eval   None (Python exec)   Development, testing
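
To make "None (Python exec)" concrete, here is a minimal sketch of in-process execution built only on the standard library. It is not the package's implementation: the class `TinyEvalSandbox` and its `namespace` attribute are hypothetical names chosen for illustration. It shows the two consequences of running in-process: variables persist between calls because one namespace is reused, and executed code can mutate host objects directly.

```python
import contextlib
import io


class TinyEvalSandbox:
    """Hypothetical stand-in for an in-process sandbox (illustration only)."""

    def __init__(self):
        # One namespace reused across calls, so variables persist.
        self.namespace = {"__name__": "__sandbox__"}

    def run_code(self, code):
        """Execute `code` with exec() and return (stdout, error_message)."""
        buffer = io.StringIO()
        error = None
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, self.namespace)
        except Exception as exc:
            # Code-level failure: report it instead of crashing the host.
            error = f"{type(exc).__name__}: {exc}"
        return buffer.getvalue(), error


sandbox = TinyEvalSandbox()
sandbox.run_code("x = 21")                        # defines x
stdout, error = sandbox.run_code("print(x * 2)")  # x is still there
print(stdout, end="")

# In-process code shares memory with the host: no isolation.
host_state = {"touched": False}
sandbox.namespace["host_state"] = host_state
sandbox.run_code("host_state['touched'] = True")
print(host_state["touched"])
```

Because nothing separates the executed code from the host process, this style suits development and testing, as the table says, and not untrusted code.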

Remote Sandboxes

Execute code out-of-process via the Jupyter kernel protocol, which provides better isolation than in-process execution.

Variant             Isolation Level            Best For
local-docker        Container                  Local isolated execution
local-jupyter       Process (Jupyter kernel)   Local persistent state
datalayer-runtime   Cloud VM                   Production, GPU workloads
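
The effect of moving execution out-of-process can be sketched with a plain child interpreter. This swaps in `subprocess` for the Jupyter kernel protocol that the remote variants actually use, so treat it only as an illustration of process-level isolation; the helper `run_out_of_process` is a hypothetical name.

```python
import subprocess
import sys


def run_out_of_process(code, timeout=10.0):
    """Run `code` in a fresh Python interpreter and collect its output."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.returncode, completed.stdout, completed.stderr


counter = 0
exit_code, stdout, stderr = run_out_of_process("counter = 99; print(counter)")
print(stdout.strip())

# The child had its own memory, so the host variable is untouched.
print(counter)

# An explicit sys.exit in the child surfaces as an exit code, not a host crash.
crash_code, _, _ = run_out_of_process("import sys; sys.exit(3)")
print(crash_code)
```

Unlike this one-shot child process, a Jupyter kernel is long-lived, which is what lets the remote variants keep state between executions.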

Quick Start

pip install code-sandboxes

Basic Usage

from code_sandboxes import Sandbox

with Sandbox.create() as sandbox:
    result = sandbox.run_code("print('Hello from the sandbox!')")
    print(result.stdout)  # Hello from the sandbox!

Execution Status Reporting

Code Sandboxes provides detailed status information for each execution:

# `sandbox` is an open sandbox, e.g. created with Sandbox.create() as in Basic Usage
result = sandbox.run_code("x = 1 / 0")

# Check infrastructure-level success
if not result.execution_ok:
    print(f"Sandbox failed: {result.execution_error}")

# Check explicit process exit (sys.exit)
elif result.exit_code not in (None, 0):
    print(f"Process exited with code: {result.exit_code}")

# Check code-level error (Python exception)
elif result.code_error:
    print(f"Python error: {result.code_error.name}: {result.code_error.value}")
    print(f"Traceback: {result.code_error.traceback}")

# Success!
else:
    print(f"Result: {result.text}")
    print(f"Duration: {result.duration:.2f}s")

# Convenience property
if result.success:
    print("Everything worked perfectly!")
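
The checks above imply a result object with three separate failure channels. The sketch below models that idea with standard-library dataclasses instead of the package's Pydantic models. The field names follow the example above, but the class names `ExecutionResult` and `CodeError`, the `classify` helper, and the exact definition of `success` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodeError:
    """Hypothetical container for a Python exception raised by user code."""
    name: str
    value: str
    traceback: str = ""


@dataclass
class ExecutionResult:
    """Hypothetical result model with one field per failure channel."""
    execution_ok: bool = True              # False when the sandbox itself failed
    execution_error: Optional[str] = None  # infrastructure error message
    exit_code: Optional[int] = None        # set when the code called sys.exit
    code_error: Optional[CodeError] = None

    @property
    def success(self) -> bool:
        # Assumed definition: every failure channel is clear.
        return (
            self.execution_ok
            and self.exit_code in (None, 0)
            and self.code_error is None
        )


def classify(result):
    """Apply the checks in the same order as the if/elif chain above."""
    if not result.execution_ok:
        return "infrastructure"
    if result.exit_code not in (None, 0):
        return "exit"
    if result.code_error is not None:
        return "code_error"
    return "ok"


samples = [
    ExecutionResult(execution_ok=False, execution_error="kernel died"),
    ExecutionResult(exit_code=2),
    ExecutionResult(code_error=CodeError("ZeroDivisionError", "division by zero")),
    ExecutionResult(),
]
labels = [classify(r) for r in samples]
print(labels)
```

Checking the infrastructure channel first matters: when the sandbox itself failed, the other fields may carry no meaningful information about the user's code.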

Integration with Other Packages

With Agent Codemode

Code Sandboxes is used by agent-codemode to execute tool composition code:

from code_sandboxes import Sandbox
from agent_codemode import CodeModeExecutor, ToolRegistry

# `registry` is a ToolRegistry you have already configured (setup not shown here)
# agent-codemode uses code-sandboxes internally
executor = CodeModeExecutor(registry, sandbox_variant="datalayer-runtime")

With Agent Skills

Agent Skills uses Code Sandboxes to execute skill scripts:

from code_sandboxes import LocalEvalSandbox
from agent_skills import SandboxExecutor

sandbox = LocalEvalSandbox()
executor = SandboxExecutor(sandbox)

Learn More