🕷️ joern-mcp

A Model Context Protocol (MCP) server that provides AI assistants with static code analysis capabilities using Joern's Code Property Graph (CPG) technology.

Features

Multi-Language Support: Java, C/C++, JavaScript, Python, Go, Kotlin, C#, Ghidra, Jimple, PHP, Ruby, Swift
Docker Isolation: Each analysis session runs in a secure container
GitHub Integration: Analyze repositories directly from GitHub URLs
Session-Based: Persistent CPG sessions with automatic cleanup
Redis-Backed: Fast caching and session management
Async Queries: Non-blocking CPG generation and query execution

Quick Start

Prerequisites

Python 3.8+
Docker
Redis
Git

Installation

Clone and install dependencies:

git clone https://github.com/Lekssays/joern-mcp.git
cd joern-mcp
pip install -r requirements.txt

Setup (builds Joern image and starts Redis):

./setup.sh

Configure (optional):

cp config.example.yaml config.yaml
# Edit config.yaml as needed

Run the server:

python main.py
# Server will be available at http://localhost:4242

Integration with GitHub Copilot

The server uses Streamable HTTP transport for network accessibility and supports multiple concurrent clients.

Add to your VS Code settings.json:

{
  "github.copilot.advanced": {
    "mcp": {
      "servers": {
        "joern-mcp": {
          "url": "http://localhost:4242/mcp"
        }
      }
    }
  }
}

Make sure the server is running before using it with Copilot:

python main.py

Available Tools

Core Tools

create_cpg_session: Initialize analysis session from local path or GitHub URL
run_cpgql_query: Execute synchronous CPGQL queries with JSON output
run_cpgql_query_async: Execute asynchronous queries with status tracking
get_query_status: Check status of asynchronously running queries
get_query_result: Retrieve results from completed queries
cleanup_queries: Clean up old completed query results
get_session_status: Check session state and metadata
list_sessions: View active sessions with filtering
close_session: Clean up session resources
cleanup_all_sessions: Clean up multiple sessions and containers
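
The three asynchronous tools are meant to be chained: submit with run_cpgql_query_async, poll get_query_status, then fetch with get_query_result. The loop below is a minimal sketch of that pattern. The `call_tool` callable, the `query_id` and `status` field names, and the `completed`/`failed` status strings are assumptions made for illustration, not the server's documented schema.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport hook: (tool_name, arguments) -> decoded tool response.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_query_and_wait(
    call_tool: ToolCaller,
    session_id: str,
    query: str,
    poll_interval: float = 1.0,
    timeout: float = 60.0,
) -> Dict[str, Any]:
    """Submit a query asynchronously, poll until it finishes, return the result."""
    submitted = call_tool(
        "run_cpgql_query_async", {"session_id": session_id, "query": query}
    )
    query_id = submitted["query_id"]  # assumed field name

    deadline = time.monotonic() + timeout
    while True:
        status = call_tool("get_query_status", {"query_id": query_id})
        state = status["status"]  # assumed field name
        if state == "completed":  # assumed status value
            return call_tool("get_query_result", {"query_id": query_id})
        if state == "failed":  # assumed status value
            raise RuntimeError(f"query {query_id} failed: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"query {query_id} still {state!r} after {timeout}s")
        time.sleep(poll_interval)
```

Passing the transport in as a callable keeps the loop independent of any particular MCP client library and makes it easy to exercise with a stub.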

Code Browsing Tools

get_codebase_summary: Get high-level overview of codebase (file count, method count, language)
list_files: List all source files with optional regex filtering
list_methods: Discover all methods/functions with filtering by name, file, or external status
get_method_source: Retrieve actual source code for specific methods
list_calls: Find function call relationships and dependencies
get_call_graph: Build call graphs (outgoing callees or incoming callers) with configurable depth
list_parameters: Get detailed parameter information for methods
find_literals: Search for hardcoded values (strings, numbers, API keys, etc)
get_code_snippet: Retrieve code snippets from files with line range
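
To make the "direction" and "depth" options of get_call_graph concrete, here is a toy model of a depth-bounded traversal over a plain dictionary of call edges. It is only an illustration of the idea; the function name and the dictionary representation are hypothetical, and the server itself works on Joern's CPG rather than on a structure like this.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def bounded_call_graph(
    calls: Dict[str, List[str]],
    method_name: str,
    depth: int,
    direction: str = "outgoing",
) -> List[Tuple[str, str, int]]:
    """Return (caller, callee, level) edges within `depth` hops of `method_name`."""
    if direction == "incoming":
        # Reverse the edges so that walking "forward" visits callers.
        graph: Dict[str, List[str]] = {}
        for caller, callees in calls.items():
            for callee in callees:
                graph.setdefault(callee, []).append(caller)
    elif direction == "outgoing":
        graph = calls
    else:
        raise ValueError(f"unknown direction: {direction!r}")

    edges: List[Tuple[str, str, int]] = []
    seen: Set[str] = {method_name}
    queue = deque([(method_name, 0)])
    while queue:
        node, level = queue.popleft()
        if level == depth:
            continue  # do not expand beyond the requested depth
        for neighbour in graph.get(node, []):
            # Always report edges as (caller, callee), whatever the direction.
            if direction == "incoming":
                edges.append((neighbour, node, level + 1))
            else:
                edges.append((node, neighbour, level + 1))
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, level + 1))
    return edges


toy_calls = {
    "main": ["handle"],
    "handle": ["execute_query"],
    "cron": ["execute_query"],
    "execute_query": ["log"],
}
callers = bounded_call_graph(toy_calls, "execute_query", depth=2, direction="incoming")
```

With depth 2 and direction "incoming", the direct callers appear at level 1 and their callers at level 2; anything further away is left out.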

Security Analysis Tools

find_taint_sources: Locate likely external input points (taint sources)
find_taint_sinks: Locate dangerous sinks where tainted data could cause vulnerabilities
find_taint_flows: Find dataflow paths from sources to sinks using Joern dataflow primitives
find_argument_flows: Find flows where the exact same expression is passed to both source and sink calls
check_method_reachability: Check if one method can reach another through the call graph
list_taint_paths: List detailed taint flow paths from sources to sinks
get_program_slice: Build a program slice from a specific line or call

Example Usage

# Create session from GitHub
{
  "tool": "create_cpg_session",
  "arguments": {
    "source_type": "github",
    "source_path": "https://github.com/user/repo",
    "language": "java"
  }
}

# Get codebase overview
{
  "tool": "get_codebase_summary",
  "arguments": {
    "session_id": "abc-123-def"
  }
}

# List all methods in the codebase
{
  "tool": "list_methods",
  "arguments": {
    "session_id": "abc-123-def",
    "include_external": false,
    "limit": 50
  }
}

# Get source code for a specific method
{
  "tool": "get_method_source",
  "arguments": {
    "session_id": "abc-123-def",
    "method_name": "authenticate"
  }
}

# Find what methods call a specific function
{
  "tool": "get_call_graph",
  "arguments": {
    "session_id": "abc-123-def",
    "method_name": "execute_query",
    "depth": 2,
    "direction": "incoming"
  }
}

# Search for hardcoded secrets
{
  "tool": "find_literals",
  "arguments": {
    "session_id": "abc-123-def",
    "pattern": "(?i).*(password|secret|api_key).*",
    "limit": 20
  }
}

# Get code snippet from a file
{
  "tool": "get_code_snippet",
  "arguments": {
    "session_id": "abc-123-def",
    "filename": "src/main.c",
    "start_line": 10,
    "end_line": 25
  }
}

# Run custom CPGQL query
{
  "tool": "run_cpgql_query",
  "arguments": {
    "session_id": "abc-123-def",
    "query": "cpg.method.name.l"
  }
}

# Find potential security vulnerabilities
{
  "tool": "find_taint_sources",
  "arguments": {
    "session_id": "abc-123-def",
    "language": "c"
  }
}

# Check for data flows from sources to sinks
{
  "tool": "find_taint_flows",
  "arguments": {
    "session_id": "abc-123-def",
    "source_patterns": ["getenv", "fgets"],
    "sink_patterns": ["system", "sprintf"]
  }
}

# Find argument flows between function calls
{
  "tool": "find_argument_flows",
  "arguments": {
    "session_id": "abc-123-def",
    "source_name": "validate_input",
    "sink_name": "process_data",
    "arg_index": 0
  }
}

# Get detailed taint paths
{
  "tool": "list_taint_paths",
  "arguments": {
    "session_id": "abc-123-def",
    "source_pattern": "getenv",
    "sink_pattern": "system",
    "max_paths": 5
  }
}

# Build program slice for security analysis
{
  "tool": "get_program_slice",
  "arguments": {
    "session_id": "abc-123-def",
    "filename": "main.c",
    "line_number": 42,
    "call_name": "memcpy"
  }
}
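
The examples above use a compact "tool" plus "arguments" notation. On the wire, MCP carries a tool invocation as a JSON-RPC 2.0 request with method "tools/call". The sketch below shows that mapping; the helper name `to_tools_call` is hypothetical, and a real client would normally let an MCP client library build this envelope and perform the initialization handshake first.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def to_tools_call(example: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a {"tool": ..., "arguments": ...} example into a tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": example["tool"],
            "arguments": example.get("arguments", {}),
        },
    }


example = {
    "tool": "get_call_graph",
    "arguments": {
        "session_id": "abc-123-def",
        "method_name": "execute_query",
        "depth": 2,
        "direction": "incoming",
    },
}
request = to_tools_call(example)
print(json.dumps(request, indent=2))
```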

Security Analysis Capabilities

The security analysis tools cover three kinds of analysis:

Taint Analysis:

Source identification: find_taint_sources locates external input points
Sink identification: find_taint_sinks finds dangerous operations
Flow analysis: find_taint_flows traces data from sources to sinks
Argument flow analysis: find_argument_flows finds exact expression reuse between calls
Path enumeration: list_taint_paths provides detailed propagation chains

Program Slicing:

Backward slicing: get_program_slice shows all code affecting a specific operation
Data dependencies: Variable assignments and data flow tracking
Control dependencies: Conditional statements affecting execution

Reachability Analysis:

Method connectivity: check_method_reachability verifies call graph connections
Impact analysis: Understand potential execution paths
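
As an illustration of what a reachability check answers, the sketch below searches a toy call graph (a plain dictionary) for a chain of calls from one method to another. The function name and data layout are hypothetical; check_method_reachability itself operates on the CPG's call graph, and this is not its implementation.

```python
from collections import deque
from typing import Dict, List, Optional


def find_call_chain(
    calls: Dict[str, List[str]], source: str, target: str
) -> Optional[List[str]]:
    """Return a shortest call chain from `source` to `target`, or None."""
    parents: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node: Optional[str] = queue.popleft()
        if node == target:
            # Walk the parent links back to the source to rebuild the chain.
            chain: List[str] = []
            while node is not None:
                chain.append(node)
                node = parents[node]
            return chain[::-1]
        for callee in calls.get(node, []):
            if callee not in parents:
                parents[callee] = node
                queue.append(callee)
    return None


toy_graph = {
    "main": ["parse", "render"],
    "parse": ["validate_input"],
    "validate_input": ["process_data"],
    "render": [],
}
chain = find_call_chain(toy_graph, "main", "process_data")
```

A returned chain doubles as a starting point for impact analysis, since it names every method on one possible execution path.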

Configuration

Key settings in config.yaml:

server:
  host: 0.0.0.0
  port: 4242
  log_level: INFO

redis:
  host: localhost
  port: 6379

sessions:
  ttl: 3600                # Session timeout (seconds)
  max_concurrent: 50       # Max concurrent sessions

cpg:
  generation_timeout: 600  # CPG generation timeout (seconds)
  supported_languages: [java, c, cpp, javascript, python, go, kotlin, csharp, ghidra, jimple, php, ruby, swift]

Environment variables override config file settings (e.g., MCP_HOST, REDIS_HOST, SESSION_TTL).
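
The precedence rule can be sketched as follows. The variable names MCP_HOST, REDIS_HOST, and SESSION_TTL come from the sentence above; the mapping from each variable to a config key, and the integer conversion for the TTL, are assumptions made for this example rather than the server's actual loader.

```python
import copy
import os
from typing import Any, Callable, Dict, Mapping, Optional, Tuple

# Assumed mapping: environment variable -> (section, key, parser).
ENV_OVERRIDES: Dict[str, Tuple[str, str, Callable[[str], Any]]] = {
    "MCP_HOST": ("server", "host", str),
    "REDIS_HOST": ("redis", "host", str),
    "SESSION_TTL": ("sessions", "ttl", int),
}


def apply_env_overrides(
    config: Dict[str, Dict[str, Any]],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of `config` with any set environment variables applied."""
    env = os.environ if env is None else env
    merged = copy.deepcopy(config)  # leave the file-based config untouched
    for name, (section, key, parse) in ENV_OVERRIDES.items():
        if name in env:
            merged.setdefault(section, {})[key] = parse(env[name])
    return merged


file_config = {
    "server": {"host": "0.0.0.0", "port": 4242},
    "redis": {"host": "localhost", "port": 6379},
    "sessions": {"ttl": 3600, "max_concurrent": 50},
}
effective = apply_env_overrides(
    file_config, {"REDIS_HOST": "redis", "SESSION_TTL": "900"}
)
```

Values that are not overridden, such as the server host here, keep whatever config.yaml specified.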

Example CPGQL Queries

Find all methods:

cpg.method.name.l

Find hardcoded secrets:

cpg.literal.code("(?i).*(password|secret|api_key).*").l

Find SQL injection risks:

cpg.call.name(".*execute.*").where(_.argument.isLiteral.code(".*SELECT.*")).l

Find complex methods:

cpg.method.filter(_.cyclomaticComplexity > 10).l
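
When queries like these are assembled in client code, the regular expression has to survive being embedded in a CPGQL (Scala) string literal. The helpers below are a small sketch of that step; `scala_string`, `literal_query`, and `as_tool_call` are hypothetical names, and only backslashes and double quotes are escaped.

```python
from typing import Any, Dict


def scala_string(value: str) -> str:
    """Quote `value` as a Scala string literal (escapes backslash and quote)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def literal_query(pattern: str) -> str:
    """Build a query that lists literals whose code matches `pattern`."""
    return f"cpg.literal.code({scala_string(pattern)}).l"


def as_tool_call(session_id: str, query: str) -> Dict[str, Any]:
    """Wrap a query in the run_cpgql_query notation used in Example Usage."""
    return {
        "tool": "run_cpgql_query",
        "arguments": {"session_id": session_id, "query": query},
    }


secrets_query = literal_query("(?i).*(password|secret|api_key).*")
call = as_tool_call("abc-123-def", secrets_query)
```

Escaping matters as soon as a pattern contains a backslash class such as \d, which would otherwise be read as a string escape instead of reaching the regex engine.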

Architecture

FastMCP Server: Built on FastMCP 2.12.4 framework with Streamable HTTP transport
HTTP Transport: Network-accessible API supporting multiple concurrent clients
Docker Containers: One isolated Joern container per session
Redis: Session state and query result caching
Async Processing: Non-blocking CPG generation
CPG Caching: Reuse CPGs for identical source/language combinations
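
One way to picture the CPG caching rule is a cache key derived from the source and language, so that identical combinations map to the same entry. The sketch below is purely illustrative: the function name, the normalization steps, the "cpg:" prefix, and the use of SHA-256 are all assumptions, not a description of how the server stores its cache.

```python
import hashlib


def cpg_cache_key(source_type: str, source_path: str, language: str) -> str:
    """Derive a stable key so identical source/language pairs share one CPG."""
    normalized = "\n".join(
        [
            source_type.strip().lower(),
            source_path.strip().rstrip("/"),  # treat a trailing slash as the same source
            language.strip().lower(),
        ]
    )
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"cpg:{digest[:16]}"


java_key = cpg_cache_key("github", "https://github.com/user/repo", "java")
same_key = cpg_cache_key("github", "https://github.com/user/repo/", "Java")
kotlin_key = cpg_cache_key("github", "https://github.com/user/repo", "kotlin")
```

Here the two Java requests resolve to the same key, while the Kotlin request gets its own, matching the "identical source/language combinations" wording.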

Development

Project Structure

joern-mcp/
├── src/
│   ├── services/       # Session, Docker, Git, CPG, Query services
│   ├── tools/          # MCP tool definitions
│   ├── utils/          # Redis, logging, validators
│   └── models.py       # Data models
├── playground/         # Test codebases and CPGs
├── main.py            # Server entry point
├── config.yaml        # Configuration
└── requirements.txt   # Dependencies

Running Tests

# Install dev dependencies
pip install -r requirements.txt

# Run tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

Code Quality

# Format
black src/ tests/
isort src/ tests/

# Lint
flake8 src/ tests/
mypy src/

Troubleshooting

Setup issues:

# Re-run setup to rebuild and restart services
./setup.sh

Docker issues:

# Verify Docker is running
docker ps

# Check Joern image
docker images | grep joern

# Check Redis container
docker ps | grep joern-redis

Redis connection issues:

# Test Redis connection
docker exec joern-redis redis-cli ping

# Check Redis logs
docker logs joern-redis

# Restart Redis
docker restart joern-redis

Server connectivity:

# Test server is running
curl http://localhost:4242/health

# Run the server in the foreground and watch its log output for errors
python main.py

Loading large projects (increase the memory available to Joern):

joern:
  binary_path: ${JOERN_BINARY_PATH:joern}
  memory_limit: ${JOERN_MEMORY_LIMIT:16g}
  java_opts: ${JOERN_JAVA_OPTS:-Xmx16G -Xms8G -XX:+UseG1GC -Dfile.encoding=UTF-8}
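
The ${NAME:default} placeholders above appear to mean "use the environment variable NAME if it is set, otherwise the text after the first colon". Under that assumption (note that the java_opts default then begins with the dash of -Xmx16G, unlike shell-style ${NAME:-default}), the expansion can be sketched as below; the function name is hypothetical.

```python
import os
import re
from typing import Mapping, Optional

# ${NAME:default} -> NAME is the variable, everything up to "}" is the default.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):([^}]*)\}")


def expand_placeholders(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME:default} with env[NAME], or the default if unset."""
    env = os.environ if env is None else env
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(2)), text)


default_limit = expand_placeholders("${JOERN_MEMORY_LIMIT:16g}", {})
raised_limit = expand_placeholders(
    "${JOERN_MEMORY_LIMIT:16g}", {"JOERN_MEMORY_LIMIT": "32g"}
)
```

In practice this means a large project can be accommodated by exporting, for example, JOERN_MEMORY_LIMIT before starting the server, without editing the file.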

Debug logging:

export MCP_LOG_LEVEL=DEBUG
python main.py

Contributing

We welcome contributions! Please see CONTRIBUTING.md for:

Getting started with development setup
Code style and quality guidelines
Testing requirements and best practices
Submitting changes through pull requests
Reporting issues and feature requests
Documentation standards

Quick start for contributors:

git clone https://github.com/YOUR_USERNAME/joern-mcp.git
cd joern-mcp
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
./setup.sh

# Create feature branch
git checkout -b feature/your-feature

# Make changes and run tests
pytest && black . && flake8

# Submit pull request

See CONTRIBUTING.md for detailed guidelines.

Acknowledgments

Joern - Static analysis platform
FastMCP - MCP framework
Model Context Protocol - MCP specification

Built with ❤️ in Doha 🇶🇦
