codebase-indexer is the context module of the ZGSM (ZhuGe Smart Mind) AI Programming Assistant and runs on the backend. It provides codebase indexing capabilities to support semantic search for RAG (Retrieval-Augmented Generation) systems.
- 🔍 Semantic code search with embeddings
- 🌐 Multi-language support
- 📊 Codebase statistics and information query API
 
- Go 1.24.3 or higher
- Docker
- PostgreSQL
- Redis
- Weaviate
 
# Clone the repository
git clone https://github.com/zgsm-ai/codebase-embedder.git
cd codebase-embedder
# Install dependencies
go mod tidy

- Set up PostgreSQL, Redis, the vector database, etc.
 
vim etc/config.yaml

- Update the configuration with your database and Redis credentials
 
# Build the project
make build

The system consists of several key components:
- Parser: Code parsing and AST generation
- Embedding: Code semantic vector generation
- Store: Data storage and indexing
- API: RESTful service interface
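
The way these components could fit together is sketched below as a parse, embed, store pipeline. This is an illustrative sketch only and does not reflect the project's actual types: the `Parser`, `Embedder`, and `Store` interfaces, the `IndexFile` function, and all implementations are hypothetical. In particular, the parser here simply splits on blank lines instead of building an AST, and the embedder hashes tokens into buckets instead of calling an embedding model.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// Chunk is a hypothetical unit of parsed code.
type Chunk struct {
	File string
	Text string
}

// Hypothetical interfaces mirroring the Parser, Embedding, and Store components.
type Parser interface {
	Parse(file, source string) []Chunk
}

type Embedder interface {
	Embed(text string) []float64
}

type Store interface {
	Put(c Chunk, vec []float64)
	Len() int
}

// blankLineParser is a stand-in for an AST-based parser:
// it treats each blank-line-separated block as one chunk.
type blankLineParser struct{}

func (blankLineParser) Parse(file, source string) []Chunk {
	var chunks []Chunk
	for _, part := range strings.Split(source, "\n\n") {
		if text := strings.TrimSpace(part); text != "" {
			chunks = append(chunks, Chunk{File: file, Text: text})
		}
	}
	return chunks
}

// hashEmbedder is a toy embedder: it counts whitespace-separated
// tokens into a fixed number of hash buckets.
type hashEmbedder struct{ dims int }

func (e hashEmbedder) Embed(text string) []float64 {
	vec := make([]float64, e.dims)
	for _, tok := range strings.Fields(text) {
		h := fnv.New32a()
		h.Write([]byte(tok))
		vec[h.Sum32()%uint32(e.dims)]++
	}
	return vec
}

// memoryStore keeps chunks and vectors in memory; a real store
// would persist them to a database and a vector index.
type memoryStore struct {
	chunks  []Chunk
	vectors [][]float64
}

func (s *memoryStore) Put(c Chunk, vec []float64) {
	s.chunks = append(s.chunks, c)
	s.vectors = append(s.vectors, vec)
}

func (s *memoryStore) Len() int { return len(s.chunks) }

// IndexFile runs parse -> embed -> store and returns the number of chunks indexed.
func IndexFile(p Parser, e Embedder, s Store, file, source string) int {
	chunks := p.Parse(file, source)
	for _, c := range chunks {
		s.Put(c, e.Embed(c.Text))
	}
	return len(chunks)
}

func main() {
	src := "func A() {}\n\nfunc B() {}\n"
	store := &memoryStore{}
	n := IndexFile(blankLineParser{}, hashEmbedder{dims: 8}, store, "example.go", src)
	fmt.Println("chunks indexed:", n, "stored:", store.Len())
}
```

Keeping each stage behind a small interface means any one of them can be swapped (for example, a different vector store) without touching the others; the API layer would then call something like `IndexFile` from its request handlers.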
 
This project is licensed under the Apache 2.0 License.
This project builds upon the excellent work of:
- Tree-sitter - For providing robust parsing capabilities