GitHub - Sam-Si/DCodeX: Distributed Code Execution Sandbox

A gRPC-powered code execution engine with secure sandboxing, real-time streaming, and smart caching.

Quick Start

# Build the server
bazel build //src/api:server

# Run the server (listens on localhost:50051)
bazel run //src/api:server

# Install Python client dependencies
pip install -r python_client/requirements.txt

# Generate Python gRPC bindings
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. proto/sandbox.proto

# Run the client with individual files
python python_client/main.py --file examples/c/01_hello_world.c
python python_client/main.py --file examples/cpp/01_hello_world.cpp
python python_client/main.py --file examples/python/01_hello_world.py

Server Options

Flag Default Description
--port 50051 Server port
--max_concurrent_sandboxes 10 Max concurrent executions
--sandbox_cpu_time_limit_seconds 1 CPU time limit (seconds)
--sandbox_wall_clock_timeout_seconds 2 Wall-clock timeout (seconds)
--sandbox_memory_limit_bytes 4294967296 Memory limit (bytes, default 4GB)
--sandbox_max_output_bytes 10240 Max stdout+stderr (bytes, default 10KB)
# Example: Custom port with 8GB memory limit
bazel run //src/api:server -- --port 9090 --sandbox_memory_limit_bytes 8589934592

# Example: Strict limits for testing untrusted code
bazel run //src/api:server -- --sandbox_cpu_time_limit_seconds 3 --sandbox_memory_limit_bytes 52428800

Python Client

# Execute all C examples
python python_client/main.py --directory examples/c

# Execute all C++ examples
python python_client/main.py --directory examples/cpp

# Execute all Python examples  
python python_client/main.py --directory examples/python

# Execute a single file (language auto-detected from extension)
python python_client/main.py --file examples/c/01_hello_world.c
python python_client/main.py --file examples/cpp/01_hello_world.cpp
python python_client/main.py --file examples/python/01_hello_world.py

# Run each file twice to demonstrate caching
python python_client/main.py --directory examples/c --cache-demo

# Interactive mode
python python_client/main.py --interactive

# Connect to remote server
python python_client/main.py --server 192.168.1.100:50051 --directory examples/cpp

# Pass stdin data to a program
python python_client/main.py --file examples/cpp/13_stdin_input.cpp --stdin 'DCodeX\n5\n10\n20\n30\n40\n50\n'

# Pass stdin from a file
python python_client/main.py --file examples/cpp/13_stdin_input.cpp --stdin-file /tmp/input.txt

Client Options

Flag Short Default Description
--server localhost:50051 Server address
--directory -d examples/cpp Directory of code files
--file -f None Single file to execute
--cache-demo -c False Run twice to show caching
--interactive -i False Interactive menu mode
--stdin "" Stdin data (use \n for newlines)
--stdin-file None File to read stdin from

Examples

C (examples/c/)

File Description
01_hello_world.c Basic I/O
02_basic_math.c Math operations
03_pointers.c Pointer arithmetic and swap
04_structs.c Struct definitions and usage
05_file_io.c File read/write operations
06_dynamic_memory.c malloc/calloc/realloc/free

C++ (examples/cpp/)

File Description
01_hello_world.cpp Basic I/O
02_basic_math.cpp Math operations
03_fibonacci.cpp Fibonacci sequence
04_prime_numbers.cpp Prime finder
05_factorial.cpp Factorial calculator
06_arrays_and_vectors.cpp STL containers
07_strings.cpp String manipulation
08_memory_allocation.cpp Memory management
09_cpu_intensive.cpp Heavy computation
10_sandbox_safe.cpp Resource-conscious code
11_infinite_loop.cpp Timeout demo
12_memory_exhaustion.cpp Memory limit demo
13_stdin_input.cpp Stdin handling
14_output_flood.cpp Output truncation demo

Python (examples/python/)

File Description
01_hello_world.py Basic I/O
02_basic_math.py Math operations
03_file_operations.py File I/O
04_data_structures.py Data structures
05_iterators_generators.py Iterators/generators
06_infinite_loop.py Timeout demo
07_memory_exhaustion.py Memory limit demo
08_slow_computation.py CPU limit demo
09_output_flood.py Output truncation demo

Features

  • Secure Sandboxing: Linux rlimit enforces CPU/memory constraints
  • Wall-Clock Timeout: Catches sleeping/blocked processes that CPU limits miss
  • gRPC Alarm-Based Timeout: Efficient timeout handling without fork overhead
  • Output Limiting: Hard cap on stdout+stderr (default 10KB) prevents flooding
  • Stdin Support: Pass input data via --stdin or --stdin-file
  • Real-time Streaming: gRPC bidirectional streaming for live output
  • Smart Caching: absl::Hash-based LRU cache with 1-hour TTL
  • Multi-Language Support: C, C++, and Python with auto-detection from file extension

Architecture: gRPC Alarm for Process Timeout

The sandbox uses gRPC Alarm for efficient process timeout handling instead of the traditional fork-based watcher process approach.

Before: Fork-Based Watcher (DEPRECATED)

// Old approach - creates an extra process for each timeout
pid_t watcher_pid = fork();
if (watcher_pid == 0) {
    for (int fd = 0; fd < 1024; ++fd) close(fd);
    sleep(timeout_seconds);
    if (kill(pid, 0) == 0) kill(pid, SIGKILL);
    _exit(0);
}

Problems with fork-based approach:

  • Creates an extra process for each timeout (process table pollution)
  • Duplicates parent's address space (memory overhead)
  • Requires complex cleanup (zombie process handling)
  • Not integrated with gRPC server lifecycle
  • Race conditions between watcher and main process

After: gRPC Alarm-Based Timeout

// New approach - uses gRPC Alarm for efficient timeout
auto timeout_manager = std::make_unique<ProcessTimeoutManager>(
    pid, timeout, [&timed_out_flag, pid]() {
      timed_out_flag.store(true);
      if (kill(pid, 0) == 0) kill(pid, SIGKILL);
    });
timeout_manager->Start();

Benefits of gRPC Alarm approach:

Aspect Fork-Based gRPC Alarm
Extra processes 1 per timeout 0
Memory overhead Full address space copy Minimal
Zombie handling Required Not needed
gRPC integration None Native
Cancellation Complex Simple Cancel()
Thread safety Manual absl::Mutex

Demo: Comparing Approaches

To demonstrate the benefits:

# Terminal 1: Monitor process count
watch -n 0.5 'ps -e | wc -l'

# Terminal 2: Start server
bazel run //src/api:server

# Terminal 3: Send concurrent requests with timeouts
for i in {1..50}; do
  python python_client/main.py --file examples/python/06_infinite_loop.py &
done

Expected Results:

  • Old approach: Process count increases by 2x (main + watcher for each request)
  • New approach: Process count increases by 1x (only the sandboxed process)

Execution Pipeline Architecture

The code execution follows a Command Pattern (GoF) with discrete execution steps:

ExecutionContext
       │
       ▼
┌─────────────────────┐
│ CreateSourceFileStep│  Creates temp file with code
└─────────────────────┘
       │
       ▼
┌─────────────────────┐
│    CompileStep      │  Compiles source (C++ only)
└─────────────────────┘
       │
       ▼
┌─────────────────────┐
│   RunProcessStep    │  Executes with sandboxing
│   + gRPC Alarm      │  Timeout management
└─────────────────────┘
       │
       ▼
┌─────────────────────┐
│ FinalizeResultStep  │  Formats result and trace
└─────────────────────┘
       │
       ▼
  ExecutionResult

Project Structure

DCodeX/
├── src/api/                      # C++ gRPC server
│   ├── main.cpp                  # Server entry point
│   ├── code_executor_service.h/cpp   # gRPC service implementation
│   └── execute_reactor.h/cpp     # gRPC stream reactor
├── src/engine/                   # Execution engine
│   ├── sandbox.h/cpp             # SandboxedProcess orchestrator
│   ├── execution_types.h         # ExecutionResult, ResourceStats
│   ├── execution_step.h          # Command pattern steps
│   ├── execution_pipeline.h      # Template method pipeline
│   ├── execution_strategy.h      # Strategy pattern (C/C++/Python)
│   ├── process_timeout_manager.h # gRPC Alarm-based timeout
│   ├── warm_worker_pool.h/cpp    # Worker pool for concurrency
│   ├── process_runner.h          # RAII process management
│   ├── output_filter.h           # Output truncation
│   └── temp_file_manager.h       # Temp file utilities
├── src/common/                   # Common utilities
│   ├── execution_cache.h/cpp     # LRU cache with TTL
│   └── status_macros.h           # Local ABSL_RETURN_IF_ERROR macros
├── proto/
│   ├── sandbox.proto             # gRPC protocol definition
│   └── BUILD                     # Proto/CC gRPC library
├── python_client/                # Python client
│   ├── main.py                   # Entry point
│   ├── grpc_client.py            # gRPC client wrapper
│   ├── executor.py               # Execution orchestration
│   ├── formatter.py              # Output formatting
│   └── ...
├── examples/                     # Example code files
│   ├── c/                        # C examples
│   ├── cpp/                      # C++ examples
│   └── python/                   # Python examples
├── pyproject.toml                # Python project config
└── MODULE.bazel                  # Bazel workspace

Architecture Overview

Design Patterns

The codebase follows several GoF design patterns for clean separation of concerns:

Pattern Component Purpose
Command ExecutionStep Encapsulates each pipeline step as an object
Template Method ExecutionPipeline Defines skeleton algorithm, steps fill in details
Strategy ExecutionStrategy Interchangeable execution algorithms (C/C++/Python)
Dependency Injection CacheInterface Decouples cache implementation from consumers

Dependency Injection

The codebase uses constructor-based dependency injection for testability and flexibility:

// CacheInterface is injected, not statically accessed
class SandboxedProcess {
 public:
  explicit SandboxedProcess(std::shared_ptr<CacheInterface> cache);
 private:
  std::shared_ptr<CacheInterface> cache_;
};

// Server creates and injects the cache
auto cache = std::make_shared<ExecutionCache>(absl::Hours(1), 1000);
CodeExecutorServiceImpl service(max_sandboxes, std::move(cache));

Error Handling

Local macros provide clean error handling without external dependencies:

// src/common/status_macros.h
#define ABSL_RETURN_IF_ERROR(expr) \
  if (const absl::Status _status = (expr); !_status.ok()) return _status

#define ABSL_ASSIGN_OR_RETURN(lhs, rexpr) \
  auto _result = (rexpr); \
  if (!_result.ok()) return _result.status(); \
  lhs = std::move(_result).value()

License

Apache License 2.0