Based on https://github.com/morioka/tiny-openai-whisper-api, but using whisper-mlx instead of openai-whisper.

An OpenAI Whisper API-style local server running on FastAPI, intended for companies behind proxies or security firewalls.

This API is compatible with the OpenAI Whisper (speech-to-text) API. See also Create transcription - API Reference - OpenAI API.
Some of the code has been copied from whisper-ui.

This was built and tested on Python 3.10.8 and Ubuntu 20.04/WSL2, but should also work on Python 3.9+.
```
sudo apt install ffmpeg
pip install fastapi python-multipart pydantic uvicorn ffmpeg-python
# or
pip install -r requirements.txt
# or
docker compose build
```

To run the server:

```
export PYTHONPATH=.
uvicorn main:app --host 0.0.0.0
# or
docker compose up
```

Note: the Authorization header is ignored.
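The server can also be called from Python. Below is a minimal sketch using only the standard library; the endpoint URL and form field names mirror the curl examples, but `encode_multipart` and `transcribe` are hypothetical helpers written for illustration, not part of this repository.

```python
import io
import uuid
import urllib.request


def encode_multipart(fields: dict, file_field: str, filename: str,
                     file_bytes: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body and its Content-Type header value."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n"
                  f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                  f"{value}\r\n".encode())
    buf.write(f"--{boundary}\r\n"
              f'Content-Disposition: form-data; name="{file_field}"; '
              f'filename="{filename}"\r\n'
              f"Content-Type: application/octet-stream\r\n\r\n".encode())
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, base_url: str = "http://127.0.0.1:8000",
               response_format: str = "json") -> str:
    """POST an audio file to the transcription endpoint, like the curl examples."""
    with open(path, "rb") as f:
        body, ctype = encode_multipart(
            {"model": "whisper-1", "response_format": response_format},
            "file", path.rsplit("/", 1)[-1], f.read())
    req = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions", data=body,
        headers={"Content-Type": ctype})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()
```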
Example 1: a typical use case, identical to the OpenAI Whisper API example:

```
curl http://127.0.0.1:8000/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F model="whisper-1" \
  -F file="@/path/to/file/openai.mp3"
```

Example 2: set the output format to text, as described in the quickstart:
```
curl http://127.0.0.1:8000/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F model="whisper-1" \
  -F file="@/path/to/file/openai.mp3" \
  -F response_format=text
```

Whisper is licensed under MIT. Everything else by morioka is also licensed under MIT.
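For reference, the `response_format` field can be thought of as selecting a renderer over Whisper's segment list. The sketch below is illustrative only (the function names are not the server's actual code) and shows plausible text, srt, and default JSON renderings over segments shaped like Whisper's output (dicts with `start`, `end`, `text`):

```python
import json


def ts(seconds: float) -> str:
    # Format seconds as an SRT timestamp: HH:MM:SS,mmm
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def render(segments: list, response_format: str = "json") -> str:
    # Join segment texts for the plain-text and JSON renderings.
    text = "".join(seg["text"] for seg in segments).strip()
    if response_format == "text":
        return text
    if response_format == "srt":
        # Numbered cues: index, time range, caption text, blank separator.
        return "\n".join(
            f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n{seg['text'].strip()}\n"
            for i, seg in enumerate(segments, 1))
    # Default: an OpenAI-style JSON body {"text": ...}
    return json.dumps({"text": text})
```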