prototype: SEP-1577 - Sampling With Tools by ochafik · Pull Request #991 · modelcontextprotocol/typescript-sdk

and others added 22 commits

October 1, 2025 14:28
Add comprehensive tool calling support to MCP sampling:

- New content types: ToolCallContent and ToolResultContent
- Split SamplingMessage into role-specific UserMessage/AssistantMessage
- Add ToolChoice schema for controlling tool usage behavior
- Update CreateMessageRequest with tools and tool_choice parameters
- Update CreateMessageResult with new stop reasons (toolUse, refusal, other)
- Enhance ClientCapabilities.sampling with context and tools sub-capabilities
- Mark includeContext as soft-deprecated in favor of explicit tools
- Add comprehensive unit tests (27 new test cases covering all new schemas)

All tests pass (47/47 in types.test.ts).
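
The role split described above can be pictured with plain TypeScript unions. This is only a sketch: the discriminator strings, the field names (`id`, `toolUseId`, `input`, `mode`) and the helper `describeMessage` are assumptions made for illustration, not the schemas defined in types.ts.

```typescript
// Hypothetical shapes: every field name below is an assumption for illustration.
type TextContent = { type: "text"; text: string };
type ToolCallContent = { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };
type ToolResultContent = { type: "tool_result"; toolUseId: string; content: Record<string, unknown> };

// Role-specific messages: only assistants issue tool calls, only users return results.
type UserMessage = { role: "user"; content: TextContent | ToolResultContent };
type AssistantMessage = { role: "assistant"; content: TextContent | ToolCallContent };
type SamplingMessage = UserMessage | AssistantMessage;

// Assumed ToolChoice shape with three modes.
type ToolChoice = { mode: "auto" | "required" | "none" };

// Narrow on role first, then on the content discriminator.
function describeMessage(message: SamplingMessage): string {
  if (message.role === "assistant") {
    return message.content.type === "tool_use"
      ? `assistant calls ${message.content.name}`
      : "assistant text";
  }
  return message.content.type === "tool_result"
    ? `user returns result for ${message.content.toolUseId}`
    : "user text";
}
```

Splitting by role lets the compiler reject, for example, a user message that carries a tool call.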

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…OpenAI APIs

Remove the non-standard isError field from ToolResultContentSchema.
Errors should be represented in the content object itself, matching
the standard behavior of Claude and OpenAI tool result APIs.

Updated tests to validate error content directly without isError flag.
All tests pass (47/47 in types.test.ts, 683/683 in full suite).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Update backfillSampling.ts to support SEP-1577 tool calling:

**New Conversions:**
- MCP Tool → Claude API tool format (toolToClaudeFormat)
- MCP ToolChoice → Claude tool_choice (toolChoiceToClaudeFormat)
- Claude tool_use → MCP ToolCallContent (in contentToMcp)
- MCP ToolResultContent → Claude tool_result (in contentFromMcp)
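
A minimal sketch of the first two conversions listed above. The function names come from this commit message, but the bodies are assumptions: in particular the MCP `mode` values and their mapping onto Claude's `tool_choice` types are guesses for illustration, not the code in backfillSampling.ts.

```typescript
// Simplified, assumed shapes for both sides of the conversion.
interface McpTool { name: string; description?: string; inputSchema: Record<string, unknown> }
interface ClaudeTool { name: string; description?: string; input_schema: Record<string, unknown> }
type McpToolChoice = { mode: "auto" | "required" | "none" }; // assumed MCP shape
type ClaudeToolChoice = { type: "auto" } | { type: "any" } | { type: "none" };

// Rename inputSchema to input_schema; copy description only when present.
function toolToClaudeFormat(tool: McpTool): ClaudeTool {
  return {
    name: tool.name,
    ...(tool.description !== undefined ? { description: tool.description } : {}),
    input_schema: tool.inputSchema,
  };
}

// Assumed mapping: "required" means the model must call some tool ("any").
function toolChoiceToClaudeFormat(choice: McpToolChoice): ClaudeToolChoice {
  switch (choice.mode) {
    case "required":
      return { type: "any" };
    case "none":
      return { type: "none" };
    default:
      return { type: "auto" };
  }
}
```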

**Message Handling:**
- Extract and convert tools/tool_choice from CreateMessageRequest
- Pass tools to Claude API messages.create
- Handle multi-content responses (prioritize tool_use over text)
- Map stop reasons: tool_use → toolUse, end_turn → endTurn, etc.
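
The stop-reason mapping and the tool_use priority rule could look roughly like this. Only `tool_use → toolUse` and `end_turn → endTurn` are stated in this commit message; the remaining cases, the fallback to `other`, and the helper names `stopReasonToMcp` and `pickPrimaryBlock` are assumptions.

```typescript
type McpStopReason = "endTurn" | "stopSequence" | "maxTokens" | "toolUse" | "refusal" | "other";

// Map a Claude stop_reason string onto an MCP stop reason (partly assumed mapping).
function stopReasonToMcp(reason: string | null): McpStopReason {
  switch (reason) {
    case "tool_use":
      return "toolUse"; // stated in the commit message
    case "end_turn":
      return "endTurn"; // stated in the commit message
    case "max_tokens":
      return "maxTokens"; // assumption
    case "stop_sequence":
      return "stopSequence"; // assumption
    case "refusal":
      return "refusal"; // assumption
    default:
      return "other"; // assumption: anything unknown falls back to "other"
  }
}

type ClaudeBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

// When a single block must be chosen, prefer a tool_use block over text.
function pickPrimaryBlock(blocks: ClaudeBlock[]): ClaudeBlock | undefined {
  return blocks.find((block) => block.type === "tool_use") ?? blocks[0];
}
```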

**Flow Support:**
The proxy now fully supports agentic tool calling loops:
1. Server sends request with tools
2. Claude responds with tool_use
3. Server executes tool and sends tool_result
4. Claude provides final answer

All conversions maintain type safety with proper MCP types
(UserMessage, AssistantMessage, ToolCallContent, ToolResultContent).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…le search

Adds a comprehensive example showing how to use MCP sampling with a tool loop.
The server exposes a fileSearch tool that uses an LLM with locally-defined
ripgrep and read tools to intelligently search and read files.

Key features:
- Implements a full agentic tool loop pattern
- Uses systemPrompt parameter for proper LLM instruction
- Validates tool inputs using Zod schemas
- Ensures path safety with canonicalization and CWD constraints
- Demonstrates recursive tool use (LLM decides which tools to call)
- Proper error handling throughout the tool loop
- Includes iteration limit to prevent infinite loops
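
The loop pattern listed above can be sketched without the SDK. Here `sample` is a hypothetical stand-in for the server's sampling call, `tools` is a hypothetical registry, and all content shapes are assumed, so this illustrates the control flow only, not the code in the example server.

```typescript
type Text = { type: "text"; text: string };
type ToolCall = { type: "tool_use"; id: string; name: string; input: unknown };
type ToolResult = { type: "tool_result"; toolUseId: string; content: unknown };
type Block = Text | ToolCall;
type Message = { role: "user" | "assistant"; content: Array<Block | ToolResult> };

// Hypothetical stand-ins for the sampling request and the local tools.
type Sampler = (messages: Message[]) => Promise<{ content: Block[]; stopReason: string }>;
type ToolFn = (input: unknown) => Promise<unknown>;

async function runToolLoop(
  question: string,
  sample: Sampler,
  tools: Record<string, ToolFn>,
  maxIterations = 10,
): Promise<string> {
  const messages: Message[] = [{ role: "user", content: [{ type: "text", text: question }] }];
  for (let iteration = 0; iteration < maxIterations; iteration++) {
    const reply = await sample(messages);
    messages.push({ role: "assistant", content: reply.content });
    const calls = reply.content.filter((b): b is ToolCall => b.type === "tool_use");
    if (calls.length === 0) {
      // No tool requested: the text blocks form the final answer.
      return reply.content
        .filter((b): b is Text => b.type === "text")
        .map((b) => b.text)
        .join("\n");
    }
    const results: ToolResult[] = [];
    for (const call of calls) {
      try {
        const tool = tools[call.name];
        if (!tool) throw new Error(`Unknown tool: ${call.name}`);
        results.push({ type: "tool_result", toolUseId: call.id, content: await tool(call.input) });
      } catch (error) {
        // Report failures back to the model as result content instead of aborting.
        results.push({ type: "tool_result", toolUseId: call.id, content: { error: String(error) } });
      }
    }
    messages.push({ role: "user", content: results });
  }
  throw new Error(`Tool loop exceeded ${maxIterations} iterations`);
}
```

Returning tool errors as result content gives the model a chance to recover, while the iteration cap bounds the cost of a model that never stops calling tools.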

This example demonstrates SEP-1577 tool calling support in MCP sampling.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Analyzed existing test files and examples to document:
- How to set up Client with StdioClientTransport for testing
- How to implement sampling request handlers
- Proper test structure and cleanup patterns
- Example code snippets for sampling handlers
- How to simulate a tool loop conversation in tests
- Common pitfalls and solutions

This analysis covers both unit testing (InMemoryTransport) and
integration testing (StdioClientTransport) patterns for servers
that use MCP sampling.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Adds integration tests for toolLoopSampling server that verify:
- Complete tool loop flow (ripgrep → read → final answer)
- Path validation and security (prevents directory traversal)
- Error handling for invalid tool names
- Input validation with malformed tool inputs
- Maximum iteration limit enforcement

Tests use StdioClientTransport to spawn actual server process and
implement sampling handlers that simulate LLM behavior with tool
calls and responses.

All 5 tests pass successfully, providing solid coverage of the
agentic tool loop pattern.

Also updates toolLoopSampling.ts with linter formatting fixes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…pSampling

Updated toolLoopSampling.ts to handle CreateMessageResult.content
whether it is a single content block or an array of blocks:

- Changed runToolLoop return type to include both answer and transcript
- Extract and execute ALL tool_use blocks in parallel using Promise.all()
- Concatenate all text content blocks for final answer
- Return full message transcript for debugging

This ensures the tool loop works correctly when the LLM returns multiple
content blocks (text + tool calls, or multiple tool calls in one turn).
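
A rough sketch of the two operations described above: running every tool_use block concurrently with Promise.all() and joining the text blocks. The block shapes, the `runTool` callback, the newline separator, and the helper names are assumptions for illustration.

```typescript
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ContentBlock = TextBlock | ToolUseBlock;

// Accept either a single block or an array, as the result content may be both.
function toBlocks(content: ContentBlock | ContentBlock[]): ContentBlock[] {
  return Array.isArray(content) ? content : [content];
}

// Run all requested tools concurrently; results keep the order of the calls.
async function executeAllToolCalls(
  content: ContentBlock | ContentBlock[],
  runTool: (call: ToolUseBlock) => Promise<string>, // hypothetical executor
): Promise<Array<{ toolUseId: string; output: string }>> {
  const calls = toBlocks(content).filter((b): b is ToolUseBlock => b.type === "tool_use");
  return Promise.all(
    calls.map(async (call) => ({ toolUseId: call.id, output: await runTool(call) })),
  );
}

// Concatenate every text block to form the final answer.
function joinText(content: ContentBlock | ContentBlock[]): string {
  return toBlocks(content)
    .filter((b): b is TextBlock => b.type === "text")
    .map((b) => b.text)
    .join("\n");
}
```

Promise.all() rejects as soon as one tool fails, so a `runTool` implementation would normally catch its own errors and return them as output.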

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Added sendLoggingMessage calls to provide real-time feedback on tool loop
operations:

- Log iteration number at the start of each loop
- Log number and names of tools being executed
- Log completion message with total iteration count

Also fixed toolWithSampleServer.ts to handle CreateMessageResult.content
as arrays (extract and concatenate all text blocks).

This provides visibility into the tool loop's progress for debugging and
monitoring purposes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…oolLoopSampling

Added line range support to the read tool:
- Added optional startLineInclusive and endLineInclusive parameters
- Returns numbered lines when range is specified
- Validates line ranges and provides helpful error messages
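
The line range behaviour above can be sketched as a pure function over the file text. The parameter names follow this commit message; the function name `selectLines`, the `N: ` numbering format, and the error wording are assumptions.

```typescript
// Return the whole text when no range is given, otherwise numbered lines in the range.
function selectLines(
  fileText: string,
  startLineInclusive?: number,
  endLineInclusive?: number,
): string {
  if (startLineInclusive === undefined && endLineInclusive === undefined) {
    return fileText;
  }
  const lines = fileText.split("\n");
  const start = startLineInclusive ?? 1;
  const end = endLineInclusive ?? lines.length;
  if (!Number.isInteger(start) || !Number.isInteger(end) || start < 1) {
    throw new RangeError("Line numbers must be integers starting at 1");
  }
  if (end < start) {
    throw new RangeError(`endLineInclusive (${end}) is before startLineInclusive (${start})`);
  }
  if (start > lines.length) {
    throw new RangeError(`startLineInclusive (${start}) is past the last line (${lines.length})`);
  }
  // slice() clamps an end past the last line, so over-long ranges are tolerated.
  return lines
    .slice(start - 1, end)
    .map((line, index) => `${start + index}: ${line}`)
    .join("\n");
}
```

Numbering the returned lines keeps them aligned with the line numbers that ripgrep reports, so the model can request context around a match.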

Improved logging with tool-specific messages:
- Loop iteration logs: "Loop iteration N: X tool invocation(s) requested"
- Ripgrep logs: "Searching pattern 'X' under Y"
- Read logs: "Reading file X" or "Reading file X (lines A-B)"

Updated tool descriptions:
- Added hint to read tool about requesting context lines around matches
- Emphasized that ripgrep output includes line numbers

This provides better visibility into tool operations and enables more
efficient file reading by fetching only relevant line ranges.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Added comprehensive token usage tracking and reporting to the tool loop
sampling example:

**backfillSampling.ts**:
- Pass Anthropic API usage data through _meta field of CreateMessageResult
- Includes all token counts from Claude API response

**toolLoopSampling.ts**:
- Added AggregatedUsage interface to track cumulative token counts
- Aggregate usage across all API calls in the tool loop:
  - input_tokens (regular input)
  - cache_creation_input_tokens (tokens to create cache)
  - cache_read_input_tokens (tokens read from cache)
  - output_tokens (generated output)
  - api_calls (number of createMessage calls)
- Updated runToolLoop return type to include usage
- Display formatted usage summary in fileSearch tool output:
  - Total input tokens with breakdown by type
  - Total output tokens
  - Total tokens consumed
  - Number of API calls made

Example output:
```
--- Token Usage Summary ---
Total Input Tokens: 1234
  - Regular: 800
  - Cache Creation: 200
  - Cache Read: 234
Total Output Tokens: 567
Total Tokens: 1801
API Calls: 3
```

This provides complete visibility into Claude API token consumption for
monitoring costs and optimizing cache usage.
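
As a sketch of the aggregation described above: the field names are the ones listed in this commit message, while the helper names `emptyUsage`, `addUsage`, and `formatUsage`, and the treatment of missing cache counts as zero, are assumptions. The format mirrors the example output, where total input is the sum of the three input counts.

```typescript
interface AggregatedUsage {
  input_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
  output_tokens: number;
  api_calls: number;
}

// Per-call usage as it might arrive through _meta; cache fields may be absent.
type CallUsage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
};

function emptyUsage(): AggregatedUsage {
  return { input_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0, output_tokens: 0, api_calls: 0 };
}

// Fold one call's usage into the running totals.
function addUsage(total: AggregatedUsage, call: CallUsage): AggregatedUsage {
  return {
    input_tokens: total.input_tokens + call.input_tokens,
    cache_creation_input_tokens: total.cache_creation_input_tokens + (call.cache_creation_input_tokens ?? 0),
    cache_read_input_tokens: total.cache_read_input_tokens + (call.cache_read_input_tokens ?? 0),
    output_tokens: total.output_tokens + call.output_tokens,
    api_calls: total.api_calls + 1,
  };
}

function formatUsage(usage: AggregatedUsage): string {
  const totalInput = usage.input_tokens + usage.cache_creation_input_tokens + usage.cache_read_input_tokens;
  return [
    "--- Token Usage Summary ---",
    `Total Input Tokens: ${totalInput}`,
    `  - Regular: ${usage.input_tokens}`,
    `  - Cache Creation: ${usage.cache_creation_input_tokens}`,
    `  - Cache Read: ${usage.cache_read_input_tokens}`,
    `Total Output Tokens: ${usage.output_tokens}`,
    `Total Tokens: ${totalInput + usage.output_tokens}`,
    `API Calls: ${usage.api_calls}`,
  ].join("\n");
}
```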

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@ochafik

Updated backfillSampling and toolLoopSampling to support the new schema
where UserMessage and AssistantMessage content can be arrays:

**backfillSampling.ts**:
- Split contentFromMcp into contentBlockFromMcp (single block) and
  contentFromMcp (handles both single and arrays)
- Updated message mapping to pass content arrays directly to Claude API
- Now properly handles messages with multiple content blocks

**toolLoopSampling.ts**:
- Removed flattening logic that created multiple messages
- SamplingMessage now natively supports content arrays
- Simplified message history management

**toolLoopSampling.test.ts**:
- Added helper to handle content as potentially an array
- Updated all test assertions to work with array content
- All 5 tests passing

This aligns with the MCP protocol change allowing content arrays in
UserMessage and AssistantMessage, enabling multi-block responses (e.g.,
text + tool calls in one message).
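
The single-block/array split described above might look like the following. The two function names are taken from this commit message; the block shapes, the `toolUseId` field, and serialising result content with JSON.stringify are assumptions, and only two block kinds are covered.

```typescript
// Assumed MCP-side blocks (subset).
type McpBlock =
  | { type: "text"; text: string }
  | { type: "tool_result"; toolUseId: string; content: unknown };

// Claude-side blocks (subset); tool_result content is sent as a string here.
type ClaudeInputBlock =
  | { type: "text"; text: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

// Convert exactly one block.
function contentBlockFromMcp(block: McpBlock): ClaudeInputBlock {
  switch (block.type) {
    case "text":
      return { type: "text", text: block.text };
    case "tool_result":
      return {
        type: "tool_result",
        tool_use_id: block.toolUseId,
        content: JSON.stringify(block.content), // assumption: serialise structured results
      };
  }
}

// Accept a single block or an array and always return an array.
function contentFromMcp(content: McpBlock | McpBlock[]): ClaudeInputBlock[] {
  const blocks = Array.isArray(content) ? content : [content];
  return blocks.map(contentBlockFromMcp);
}
```

Always returning an array means the message mapping can pass the result straight through without a flattening step.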

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…ion of tool loop (no tools!)

The examples:tool-loop script was pointing to src/ files which are not
included in the npm package (only dist/ is included via the "files" field).
This caused the script to fail when run via npx with a git URL.

Changed from:
  tsx src/examples/backfill/backfillSampling.ts tsx src/examples/server/toolLoopSampling.ts

To:
  node dist/esm/examples/backfill/backfillSampling.js node dist/esm/examples/server/toolLoopSampling.js

This allows the script to work both locally and when installed via npx from git.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Remove duplicate 'sampling' property in ClientCapabilitiesSchema (types.ts:299)
- Add type guards for union types in backfillSampling.ts:
  - Check 'text' in c.resource before accessing text property
  - Check 'blob' in c.resource before accessing blob property
  - Fix c.resource.data to c.resource.blob for PDF (blob is the correct property)
  - Fix c.url to c.uri for resource_link (uri is the correct property)
- Fix toolLoop.ts:
  - Change tool_choice to toolChoice (camelCase convention)
  - Add type guard for text blocks when joining content
- Fix toolRegistry.ts:
  - Remove .shape access on Zod schemas (not available on ZodType)
  - Add type assertion for callback to handle optional inputSchema
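
The `in` checks listed above narrow a union before a property is read. A minimal sketch, with simplified resource shapes and a hypothetical `describeResourceContent` helper (the `text`, `blob`, and `uri` property names are the ones named in this commit message):

```typescript
// Simplified shapes: an embedded resource holds either text or base64 blob data.
type TextResource = { uri: string; mimeType?: string; text: string };
type BlobResource = { uri: string; mimeType?: string; blob: string };
type EmbeddedResource = { type: "resource"; resource: TextResource | BlobResource };
type ResourceLink = { type: "resource_link"; uri: string; name: string };

function describeResourceContent(c: EmbeddedResource | ResourceLink): string {
  if (c.type === "resource_link") {
    // Links expose uri, not url.
    return `link to ${c.uri}`;
  }
  if ("text" in c.resource) {
    // Narrowed to TextResource: text is safe to read.
    return `text resource (${c.resource.text.length} chars)`;
  }
  // Narrowed to BlobResource: the payload lives in blob, not data.
  return `blob resource (${c.resource.blob.length} base64 chars)`;
}
```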

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
