With Parakeet: 81 minutes of video transcribed in just 39 seconds

I no longer want to live without MacWhisper.


I'm a big fan of voice memos to myself. After a film, for instance, I unload my stream of consciousness straight into my smartphone so that I can sort those thoughts at the computer later and publish them here in halfway readable form. Or when I'm out and about with a podcast in my ears and absolutely have to voice my opinion on that one passage right now.

So that I don't have to feed some cloud with my voice, only for Elon to end up training a sexbot with it, I rely on local transcription directly on my laptop. My tool of choice, which I warmly recommend to all Mac users: MacWhisper. It lets you download various speech models to your machine and use them to transcribe podcasts and videos, generate subtitles from the result, dictate text directly, or turn meandering voice memos into text with as little fuss as possible.

My model of choice so far has been WhisperKit Large v3 Turbo, because it is almost absurdly accurate and transcribes a file at an acceptable pace. But Nvidia's new speech model, Parakeet v3, has radically changed what counts as acceptable.


Video via Goodsnooze, still showing the predecessor version, Parakeet v2

As far as I understand it, Nvidia pushed this technology forward in order to subtitle live video in real time. I tested it and am fairly stunned. I fed the tool a link to an 81-minute YouTube video. From the automatic download of the audio track to the finished transcript, including speaker recognition and attribution, a mere 39 seconds(!) passed. (I don't have a proper before-and-after comparison here. However, a voice memo of about five minutes took longer with the old WhisperKit than the 81-minute audio track of a video. 🤯)
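
To put those two numbers in relation, they can be converted into a realtime factor, the same unit the MacWhisper product page uses for its speed claims. A minimal sketch in Python; the function name `realtime_factor` is made up for illustration, and the 39 seconds include downloading the audio track, so the pure transcription speed was presumably somewhat higher:

```python
def realtime_factor(audio_minutes: float, wall_seconds: float) -> float:
    """Return how many seconds of audio were processed per second of wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_minutes * 60 / wall_seconds


# Figures from the test described above: 81 minutes of video,
# 39 seconds from download to finished transcript.
factor = realtime_factor(81, 39)
print(f"{factor:.1f}x realtime")  # prints "124.6x realtime"
```

In other words, the whole run worked through roughly two minutes of audio per second.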

MacWhisper is free at its core, but only with less accurate speech models. Parakeet v3 is currently available only with a Pro license, which at the moment costs 59 euros. In my view, the expense is absolutely worth it.

🎙️ MacWhisper

Quickly and easily transcribe audio files into text with OpenAI’s state-of-the-art transcription technology Whisper as well as Nvidia Parakeet. Whether you’re recording a meeting, lecture, or other important audio, MacWhisper quickly and accurately transcribes your audio files into text.

📲 MacWhisper is now also available on iPhone and iPad.

Full Feature List

- Easily record and transcribe audio files on your Mac
- System wide dictation with Whisper to replace Apple’s own dictation, even with the best Whisper models
- Just drag and drop audio files to get a high quality transcription
- Automatically record meetings in Zoom, Teams, Webex, Skype, Chime, Discord and more.
- Record directly from your microphone or any other input device on your Mac
- All transcription is done on your device, no data leaves your machine. This makes MacWhisper a great app for sensitive audio such as interviews.
- Save or export your transcripts as a .whisper file, which includes the original audio and all your transcription edits for easy sharing
- .srt & .vtt subtitles export as well as csv, dote, docx, pdf, markdown and html exports
- Metal and GPU support for extremely fast transcription
- Get accurate text transcriptions in seconds (up to ~30x realtime)
- Search the entire transcript and highlight words
- Audio playback synced to transcripts
- Supports 100 different languages
- Copy the entire transcript or individual sections
- Star/Favorite segments
- Compact mode (hide timestamps)
- Automatically remove ums, uhhs and other similar filler words
- Drag and drop directly from Voice Memos
- Edit and delete segments from the transcript
- Add up to two speakers manually
- Inline Video Player
- Video playback synced to subtitles
- View multiple language subtitles at once in the videoplayer
- Select transcription language (or use auto detect)
- Change playback speed from 0.5 to 3.0x (audio & video)
- Supported formats: mp3, wav, m4a, ogg, opus, mov and mp4 videos.
- Adjust whisper settings (beam search / greedy, beam size etc)
- Supports all Whisper models, some models are only fully available for Pro users

MacWhisper Pro

- All above features
- Support for Parakeet v2 (for up to 300x realtime transcription at the highest accuracy) on m-series Macs
- Automatic Speaker Recognition with local models (M-series Macs only) and with ElevenLabs and Deepgram
- Automatic spelling, punctuation and grammar improvement in dictation mode (requires AI Service to be setup)
- Batch Transcribe as many files one after the other. Useful if you want to add subtitles to an entire season of a show, or if you have a lot of interviews to go through
- Support for WhisperKit and Distilled models
- Transcribe YouTube videos
- Watch Folder support to automatically transcribe files when they are added to a directory of your choice. The files can automatically be transcribed into a variety of formats.
- Support for OpenAI (ChatGPT), Anthropic (Claude), Groq, Ollama, XAi, Deepseek, Custom OpenAI API endpoints and Azure AI models for easy prompting
- Support Cloud Transcriptions through OpenAI, ElevenLabs, Deepgram, Groq and custom Whisper servers
- Manually add speakers to your transcript for a cleaner export
- Menubar app for accessing Whisper anywhere from your Mac
- Global, access MacWhisper from anywhere in a spotlight type view for instant transcription and easy pasting into other apps
- ChatGPT integration (with your own API key)
- Ignore segments such as [SILENCE] from appearing in your transcripts
- Supports GPT4, GPT4 Turbo, GPT4o and GPT4o-mini as well as older models
- Anthropic Claude Integration (with your own API key)
- Record and transcribe system audio (to record meetings for example)
- Supports Tiny (English Only), Tiny, Base, Small, Medium and Large (V2 and V3) models
- Add your own custom GGML models
- Change the starting timestamp for the transcript
- Translate audio file into another language through Whisper (use the Medium or Large models, the results will not be perfect and I’m working on more advanced ways to do this)
- Translate the full transcript by adding your own (free) DeepL API key.
- Translate subtitles into different languages
- Inline and separate video player with subtitle and multiple translated subtitles support
- Transcribe podcasts by combining single track audio for each host (beta)
- One time payment, no subscription. Pay once and use forever.
- Higher priority support. I’ll try to email you back as soon as possible if you run into anything.
- If you’re a journalist, student or non-profit, send me an email at support@macwhisper.com and tell me about your work to get 30% off 🙂
- If you purchase MacWhisper Pro and are not happy with it, let me know within 7 days what could be improved and I’ll refund you.
- Support for OpenRouter
- Support for ElevenLabs Scribe and Deepgram Nova

After downloading MacWhisper you will have to fill in your license key to unlock all Pro features.

If you want to purchase more than 20 licenses, or if you’re looking for an MDM deployment or something custom, please send an email to support@macwhisper.com or check out the MDM Documentation.

100+ Supported Languages

MacWhisper can transcribe audio in the following languages: English, Chinese, German, Spanish, Russian, Korean, French, Japanese, Portuguese, Turkish, Polish, Catalan, Dutch, Arabic, Swedish, Italian, Indonesian, Hindi, Finnish, Vietnamese, Hebrew, Ukrainian, Greek, Malay, Czech, Romanian, Danish, Hungarian, Tamil, Norwegian, Thai, Urdu, Croatian, Bulgarian, Lithuanian, Latin, Maori, Malayalam, Welsh, Slovak, Telugu, Persian, Latvian, Bengali, Serbian, Azerbaijani, Slovenian, Kannada, Estonian, Macedonian, Breton, Basque, Icelandic, Armenian, Nepali, Mongolian, Bosnian, Kazakh, Albanian, Swahili, Galician, Marathi, Punjabi, Sinhala, Khmer, Shona, Yoruba, Somali, Afrikaans, Occitan, Georgian, Belarusian, Tajik, Sindhi, Gujarati, Amharic, Yiddish, Lao, Uzbek, Faroese, Haitian Creole, Pashto, Turkmen, Nynorsk, Maltese, Sanskrit, Luxembourgish, Myanmar, Tibetan, Tagalog, Malagasy, Assamese, Tatar, Hawaiian, Lingala, Hausa, Bashkir, Javanese, Sundanese.

System Requirements

MacWhisper requires a lot of computer memory to work well. To use the Medium and Large models your Mac should have more than 8GB of RAM. Performance on older Intel based Macs can also be bad but I have not been able to test this properly.

👨‍💻 Check out my other macOS utilities:

- OpenAI Bundle - Get all my OpenAI apps at a discounted rate
- MacGPT - Use ChatGPT on your Mac and from your menubar
- Detective - GPT Vision for macOS
- Voices - High Quality Text to Speech with OpenAI
- Text Assistant - Generate useful text and manage your prompts with GPT and your own OpenAPI key
- Vivid - Double the brightness of your MacBook Pro by always using HDR mode
- Forehead - Hide the Notch and round your MacBook corners
- Cooldown - Quickly toggle Low Power Mode from your menubar
- Speedy - Fast Speedtest in your menubar
- Pippo - Improve the Picture-in-Picture video player with seek controls

Whisper was made by building on top of all the hard work from Georgi Gerganov, check out his Whisper implementation here: https://github.com/ggerganov/whisper.cpp

There may also be tools that don't put this behind a paywall. In principle that is possible, since Nvidia offers Parakeet as a free download. But I'm not aware of whether any MacWhisper alternatives already use it.