
Project Overview
🧠 Medical Speech Transcription System (Bachelor thesis)
A privacy-first, offline-ready system for medical professionals that transcribes and summarizes patient consultations using local AI models. Designed to reduce documentation time and improve accuracy in medical records.
⚙️ System Overview
The system uses a client-server architecture:
- The backend (Python + FastAPI) serves only as a bridge to the local PostgreSQL database.
- The frontend (NuxtJS + TailwindCSS + Tauri) handles voice capture, ASR transcription (Whisper-rs), and LLM summarization (Ollama-rs) entirely on the client side.
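The split of responsibilities can be sketched as follows. This is an illustration only, written in Python for readability: the real client runs inside Tauri and calls Whisper-rs and Ollama-rs, and the names `ConsultationRecord`, `process_consultation`, `transcribe`, `summarize`, and `save` are hypothetical, as are the record's fields.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ConsultationRecord:
    """Hypothetical payload the client hands to the backend (fields assumed)."""
    transcript: str
    summary: str


def process_consultation(
    audio: bytes,
    transcribe: Callable[[bytes], str],          # stand-in for the local ASR step
    summarize: Callable[[str], str],             # stand-in for the local LLM step
    save: Callable[[ConsultationRecord], None],  # stand-in for the backend call
) -> ConsultationRecord:
    """Run both AI steps on the client, then pass the result on for storage."""
    transcript = transcribe(audio)
    summary = summarize(transcript)
    record = ConsultationRecord(transcript=transcript, summary=summary)
    # The backend is only asked to persist the finished record;
    # it does no transcription or summarization itself.
    save(record)
    return record
```

Passing the three steps in as callables keeps the sketch independent of any particular model or HTTP client, and makes the flow easy to exercise with fakes.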
🧩 Key Features
- 🎤 Real-time audio transcription with Whisper-rs
- ✍️ LLM-based summarization of transcripts via Ollama-rs
- 🧩 Modular architecture using Service–Repository pattern
- 🗄️ Secure local database for appointment records
- 📦 Cross-platform GUI app with Tauri and NuxtJS
- 🧪 Unit and integration tests with pytest
- 🔁 CI/CD via GitHub Actions
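As a rough idea of what the Service–Repository pattern looks like on the Python side, here is a minimal sketch. It is not the project's actual code: the `Appointment` fields, the class names, and the validation rules are assumptions, and an in-memory dictionary stands in for the SQLAlchemy/PostgreSQL repository.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Appointment:
    # Hypothetical fields; the real schema is not shown in this README.
    id: int
    patient_name: str
    summary: str


class AppointmentRepository(Protocol):
    """Storage interface the service depends on."""

    def add(self, appointment: Appointment) -> None: ...

    def get(self, appointment_id: int) -> Optional[Appointment]: ...


class InMemoryAppointmentRepository:
    """Stand-in for a database-backed repository, useful in tests."""

    def __init__(self) -> None:
        self._rows: Dict[int, Appointment] = {}

    def add(self, appointment: Appointment) -> None:
        self._rows[appointment.id] = appointment

    def get(self, appointment_id: int) -> Optional[Appointment]:
        return self._rows.get(appointment_id)


class AppointmentService:
    """Business rules live here; persistence details stay in the repository."""

    def __init__(self, repo: AppointmentRepository) -> None:
        self._repo = repo

    def create(self, appointment_id: int, patient_name: str, summary: str) -> Appointment:
        if not summary.strip():
            raise ValueError("summary must not be empty")
        if self._repo.get(appointment_id) is not None:
            raise ValueError(f"appointment {appointment_id} already exists")
        appointment = Appointment(appointment_id, patient_name, summary)
        self._repo.add(appointment)
        return appointment
```

Because the service only sees the `AppointmentRepository` interface, the same service code can run against the in-memory repository in unit tests and a database-backed one in integration tests.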
📦 Technologies Used
| Layer | Stack |
|---|---|
| Backend | Python, FastAPI, SQLAlchemy, PostgreSQL |
| Frontend | NuxtJS, Nuxt UI, TailwindCSS, Tauri |
| AI Models | Whisper-rs, Ollama-rs |
| DevOps | GitHub Actions, PyInstaller, Tauri Builder |
📝 License
This project is licensed under the MIT License.