Warning
We're really proud of Thread and think it's pretty awesome, but we can't maintain it anymore.
We're focused on something else.
Thread is an incomplete realtime codebase intelligence platform. The core architecture is complete. It has good bones. Still licensed under AGPL 3.0, but if you have a good use for it, we're willing to reconsder. Just ask.
So please, fork it and build something great!
A safe, fast, flexible code analysis and parsing engine built in Rust. Production-ready service-library dual architecture with content-addressed caching and incremental intelligence.
Thread is a high-performance code analysis platform that operates as both a reusable library ecosystem and a persistent service. Built on tree-sitter parsers and enhanced with the ReCoco dataflow framework, Thread delivers 50x+ performance gains through content-addressed caching while supporting dual deployment: CLI with Rayon parallelism and Edge on Cloudflare Workers.
- ✅ Content-Addressed Caching: Blake3 fingerprinting enables 99.7% cost reduction and 346x faster analysis on repeated runs
- ✅ Incremental Updates: Only reanalyze changed files—unmodified code skips processing automatically
- ✅ Dual Deployment: Single codebase compiles to both CLI (Rayon + Postgres) and Edge (tokio + D1 on Cloudflare Workers)
- ✅ Multi-Language Support: 26 languages via tree-sitter (Rust, TypeScript, Python, Go, Java, C/C++, Solidity, HCL, Nix, and more)
- ✅ Pattern Matching: Powerful AST-based pattern matching with meta-variables for complex queries
- ✅ Production Performance: >1,000 files/sec throughput, >90% cache hit rate, <50ms p95 latency
# Clone the repository
git clone https://github.com/knitli/thread.git
cd thread
# Install development tools (optional, requires mise)
mise run install-tools
# Build Thread with all features
cargo build --workspace --all-features --release
# Verify installation
./target/release/thread --versionuse thread::language::{SupportLang, LanguageExt};
// Parse source code using language-specific AST
let ast = SupportLang::JavaScript.ast_grep("function hello() { return 42; }");
let root = ast.root();
// Find all function declarations
let functions = root.find_all("function $NAME($$$PARAMS) { $$$BODY }");
// Extract function names
for func in functions {
let name = func.get_env().get_match("NAME").unwrap().text().to_string();
println!("Found function: {name}");
}use thread_flow::ThreadFlowBuilder;
// Build a declarative analysis pipeline
let flow = ThreadFlowBuilder::new("analyze_rust")
.source_local("src/", &["**/*.rs"], &["target/**"])
.parse()
.extract_symbols()
.target_postgres("code_symbols", &["content_hash"])
.build()
.await?;
// Execute the flow
flow.execute().await?;Thread is a library-first platform with no standalone CLI binary. Integrate it in your own project:
[dependencies]
thread = "0.1"use thread::language::{SupportLang, LanguageExt};
let ast = SupportLang::Rust.ast_grep("fn main() { println!(\"hello\"); }");
let root = ast.root();
for m in root.find_all("println!($$$ARGS)") {
println!("Found println! call");
}For dataflow pipelines with persistent caching, see the
Dataflow Pipelines section and the
thread-flow README.
Thread follows a service-library dual architecture with six main crates plus service layer:
thread-ast-engine- Core AST parsing, pattern matching, and transformation enginethread-language- Language definitions and tree-sitter parser integrations (26 languages)thread-rule-engine- Rule-based scanning and transformation with YAML configurationthread-utilities- Shared utilities including SIMD optimizations and hash functionsthread-wasm- WebAssembly bindings for browser and edge deployment
thread-flow- High-level dataflow pipelines with ThreadFlowBuilder APIthread-services- Service interfaces, API abstractions, and ReCoco integration- Storage Backends:
- Postgres (CLI deployment) - Persistent caching with <10ms p95 latency
- D1 (Cloudflare Edge) - Distributed caching across CDN nodes with <50ms p95 latency
- Qdrant (optional) - Vector similarity search for semantic analysis
- Rayon (CLI) - CPU-bound parallelism for local multi-core utilization (2-8x speedup)
- tokio (Edge) - Async I/O for horizontal scaling and Cloudflare Workers
Best for: Development environments, CI/CD pipelines, large batch processing
# Build with CLI features (Postgres + Rayon parallelism)
cargo build --release -p thread --features "flow"
# Configure PostgreSQL backend
export DATABASE_URL=postgresql://user:pass@localhost/thread_cache
export RAYON_NUM_THREADS=8 # Use 8 cores
# Integrate in your project with Postgres + parallel features
# thread-flow = { version = "0.1", features = ["postgres-backend", "parallel"] }Features: Direct filesystem access, multi-core parallelism, persistent caching, unlimited CPU time
See CLI Deployment Guide for complete setup.
Best for: Global API services, low-latency analysis, serverless architecture
# Build WASM for edge
cargo run -p xtask build-wasm --release
# Deploy to Cloudflare Workers
wrangler deploy
# Access globally distributed API
curl https://thread-api.workers.dev/analyze \
-d '{"code":"fn main(){}","language":"rust"}'
# → Response time: <50ms worldwide (p95)Features: Global CDN distribution, auto-scaling, D1 distributed storage, no infrastructure management
See Edge Deployment Guide for complete setup.
Thread supports 26 programming languages via tree-sitter parsers:
- Rust, JavaScript/TypeScript/TSX, Python, Go, Java
- C/C++, C#, PHP, Ruby, Swift, Kotlin, Scala
- Bash, CSS, HTML (with embedded JS/CSS), JSON, YAML, Lua, Elixir, Haskell
- HCL/Terraform, Nix, Solidity
Each language provides full AST parsing, symbol extraction, and pattern matching capabilities.
Thread's core strength is AST-based pattern matching using meta-variables:
$VAR- Captures a single AST node$$$ITEMS- Captures multiple consecutive nodes (ellipsis)$_- Matches any node without capturing
// Find all variable declarations
root.find_all("let $VAR = $VALUE")
// Find if-else statements
root.find_all("if ($COND) { $$$THEN } else { $$$ELSE }")
// Find function calls with any arguments
root.find_all("$FUNC($$$ARGS)")
// Find class methods
root.find_all("class $CLASS { $$$METHODS }")id: no-var-declarations
message: "Use 'let' or 'const' instead of 'var'"
language: JavaScript
severity: warning
rule:
pattern: "var $NAME = $VALUE"
fix: "let $NAME = $VALUE"| Language | Files | Time | Throughput | Cache Hit | Incremental (1% update) |
|---|---|---|---|---|---|
| Rust | 10,100 | 7.4s | 1,365 files/s | 100% | 0.6s (100 files) |
| TypeScript | 10,100 | 10.7s | 944 files/s | 100% | ~1.0s (100 files) |
| Python | 10,100 | 8.5s | 1,188 files/s | 100% | 0.7s (100 files) |
| Go | 10,100 | 5.4s | 1,870 files/s | 100% | 0.4s (100 files) |
| Operation | Time | Speedup vs Parse | Notes |
|---|---|---|---|
| Blake3 fingerprint | 425ns | 346x faster | Single file |
| Batch fingerprint | 17.7µs | - | 100 files |
| AST parsing | 147µs | Baseline | Small file (<1KB) |
| Cache hit (in-memory) | <1µs | 147,000x faster | LRU cache lookup |
| Cache hit (repeated) | 0.9s | 35x faster | 10,000 file reanalysis |
| Incremental (1%) | 0.6s | 12x faster | 100 changed, 10K total |
| Backend | Target | Actual (Phase 5) | Deployment |
|---|---|---|---|
| InMemory | N/A | <1ms | Testing |
| Postgres | <10ms p95 | <1ms (local) | CLI |
| D1 | <50ms p95 | <1ms (local) | Edge |
- Rust: 1.89 or later (edition 2024)
- Tools: cargo-nextest (optional), mise (optional)
# Build everything (except WASM)
mise run build
# or: cargo build --workspace
# Build in release mode
mise run build-release
# Build WASM for edge deployment
mise run build-wasm-release# Run all tests
mise run test
# or: cargo nextest run --all-features --no-fail-fast -j 1
# Run tests for specific crate
cargo nextest run -p thread-ast-engine --all-features
# Run benchmarks
cargo bench -p thread-rule-engine# Full linting
mise run lint
# Auto-fix formatting and linting issues
mise run fix
# Run CI pipeline locally
mise run ci# Run specific test
cargo nextest run --manifest-path Cargo.toml test_name --all-features
# Run benchmarks
cargo bench -p thread-flow- CLI Deployment Guide - Local/server deployment with Postgres
- Edge Deployment Guide - Cloudflare Workers with D1
- Architecture Overview - System design and data flow
- Rustdoc: Run
cargo doc --open --no-deps --workspacefor full API documentation
- Phase 5 Completion Summary - Production validation results and benchmarks
- ReCoco Integration - Dataflow integration design and patterns
- Incremental Update System - Change detection and invalidation design
All development MUST adhere to the Thread Constitution v2.0.0 (.specify/memory/constitution.md)
-
Service-Library Architecture (Principle I)
- Features MUST consider both library API design AND service deployment
- Both aspects are first-class citizens
-
Test-First Development (Principle III - NON-NEGOTIABLE)
- TDD mandatory: Tests → Approve → Fail → Implement
- All tests execute via
cargo nextest - No exceptions, no justifications accepted
-
Service Architecture & Persistence (Principle VI)
- Content-addressed caching MUST achieve >90% hit rate
- Storage targets: Postgres <10ms, D1 <50ms, Qdrant <100ms p95 latency
- Incremental updates MUST trigger only affected component re-analysis
Before any PR merge, verify:
- ✅
mise run lintpasses (zero warnings) - ✅
cargo nextest run --all-featurespasses (100% success) - ✅
mise run cicompletes successfully - ✅ Public APIs have rustdoc documentation
- ✅ Performance-sensitive changes include benchmarks
- ✅ Service features meet storage/cache/incremental requirements
We welcome contributions of all kinds! By contributing to Thread, you agree to our Contributor License Agreement (CLA).
- Run
mise run install-toolsto set up development environment - Make changes following existing patterns
- Run
mise run fixto apply formatting and linting - Run
mise run testto verify functionality - Use
mise run cito run full CI pipeline locally - Submit pull request with clear description
Thread follows the REUSE Specification for license information. Every file should have license information at the top or in a .license file. See existing files for examples.
Thread is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0-or-later). You can find the full license text in the LICENSE file.
Key Points:
- ✅ Free for personal and commercial use
- ✅ Modify the code as needed
⚠️ You must share your changes with the community under AGPL 3.0 or later⚠️ Include AGPL 3.0 and copyright notice with copies you share- ℹ️ If you don't modify Thread, you can use it without sharing your source code
Purchase a commercial license from Knitli to use Thread without sharing your source code. Contact us at licensing@knit.li
- Some components forked from ast-grep are licensed under AGPL 3.0 or later AND MIT. See VENDORED.md.
- Documentation and configuration files are licensed under MIT OR Apache-2.0 (your choice).
Thread has been validated for production use with comprehensive testing:
- 780 tests: 100% pass rate across all modules
- Real-world validation: Tested with 10,000+ files per language
- Performance targets: All metrics exceeded by 20-40%
- Edge cases: Comprehensive coverage including empty files, binary files, symlinks, Unicode, circular dependencies, deep nesting, large files
- Zero known issues: No crashes, memory leaks, or data corruption
See Phase 5 Completion Summary for full validation report.
- Documentation: https://thread.knitli.com
- Issues: GitHub Issues
- Email: support@knit.li
- Commercial Support: licensing@knit.li
Thread is built on the shoulders of giants:
- ast-grep: Core pattern matching engine (MIT license)
- tree-sitter: Universal parsing framework
- ReCoco: Dataflow orchestration framework
- BLAKE3: Fast cryptographic hashing
Special thanks to all contributors and the open source community.
Created by: Knitli Inc. Maintained by: Thread Team License: AGPL-3.0-or-later (with commercial license option) Version: 0.1.0