## Summary
- Add `Dockerfile.test`: Python 3.12 image with Claude Code CLI
installed
- Add `scripts/test-docker.sh`: Local script to run tests in Docker
- Add `test-e2e-docker` job to CI workflow that runs the full e2e suite
in a container
- Add `.dockerignore` to speed up Docker builds
## Context
This helps catch Docker-specific issues like #406 where filesystem-based
agents loaded via `setting_sources=["project"]` may silently fail in
Docker environments.
## Local Usage
```bash
# Run unit tests in Docker (no API key needed)
./scripts/test-docker.sh unit
# Run e2e tests in Docker
ANTHROPIC_API_KEY=sk-... ./scripts/test-docker.sh e2e
# Run all tests
ANTHROPIC_API_KEY=sk-... ./scripts/test-docker.sh all
```
## Test plan
- [x] Unit tests pass in Docker locally (129 passed)
- [ ] CI job runs successfully
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Reduce CI job count by only running examples on Python 3.13 instead of
all Python versions (3.10-3.13). This reduces the combinatorial
explosion while still ensuring examples work on the latest Python
version.
Co-authored-by: Claude <noreply@anthropic.com>
Closes the gap between Python and TypeScript SDK hook output types by
adding:
- `reason` field for explaining decisions
- `continue_` field for controlling execution flow
- `suppressOutput` field for hiding stdout
- `stopReason` field for stop explanations
- `decision` now supports both "approve" and "block" (not just "block")
- `AsyncHookJSONOutput` type for deferred hook execution
- Proper typing for `hookSpecificOutput` with discriminated unions
Also adds comprehensive examples and tests:
- New examples in hooks.py demonstrating all new fields
- Unit tests in test_tool_callbacks.py for new output types
- E2E tests in e2e-tests/test_hooks.py with real API calls
- CI integration in .github/workflows/test.yml
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Temporarily removes windows-latest from the test-e2e job matrix to
disable Windows end-to-end testing. Unit tests continue to run on
Windows.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
Add cross-platform testing for Windows alongside Linux across all test
jobs (unit tests, e2e tests, and examples). Uses native Windows
installation via PowerShell script and platform-specific timeout
handling for example scripts.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
This PR adds support for custom tool callbacks and comprehensive e2e
testing for MCP calculator functionality.
## Key Features Added
- **Custom tool permission callbacks** - Allow dynamic tool permission
control via `can_use_tool` callback
- **E2E test suite** - Real Claude API tests validating MCP tool
execution end-to-end
- **Fixed MCP calculator example** - Now properly uses `allowed_tools`
for permission management
## Changes
### Custom Callbacks
- Added `ToolPermissionContext` and `PermissionResult` types for tool
permission handling
- Implemented `can_use_tool` callback support in SDK client
- Added comprehensive tests in `tests/test_tool_callbacks.py`
### E2E Testing Infrastructure
- Created `e2e-tests/` directory with pytest-based test suite
- `test_mcp_calculator.py` - Tests all calculator operations with real
API calls
- `conftest.py` - Pytest config with mandatory API key validation
- GitHub Actions workflow for automated e2e testing on main branch
- Comprehensive documentation in `e2e-tests/README.md`
### Bug Fixes
- Fixed MCP calculator example to use `allowed_tools` instead of
incorrect `permission_mode`
- Resolved tool permission issues preventing MCP tools from executing
## Testing
E2E tests require `ANTHROPIC_API_KEY` environment variable and will fail
without it.
Run locally:
```bash
export ANTHROPIC_API_KEY=your-key
python -m pytest e2e-tests/ -v -m e2e
```
Run unit tests including callback tests:
```bash
python -m pytest tests/test_tool_callbacks.py -v
```
🤖 Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Kashyap Murali <kashyap@anthropic.com>
## Summary
- Standardized the trigger configuration across test, lint, and e2e
workflows
- All workflows now use consistent format with pull_request listed
before push
- E2E workflow now also runs on pushes to main branch
## Test plan
- [ ] Verify workflows trigger correctly on pull requests
- [ ] Verify workflows trigger correctly on pushes to main branch
- [ ] Check that e2e tests run successfully on main branch
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <noreply@anthropic.com>