FastAPI workbench for text embedding (Gemma-300m with Matryoshka) and summarization (Gemma/Gemini). Features hardware acceleration, caching, and secure endpoints for local LLM integration.
Updated Nov 3, 2025 - Python
Advanced Golang concepts 🔵