Crazy speed improvement on speech-to-text. Switched the transcription API to Groq's Whisper-3-large. It falls back to the OpenAI API (Whisper 2), and that redundancy should make everything a little more reliable. I also tried whisper-3-large-turbo; it's faster, but doesn't seem quite as accurate.
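For the curious, here is a rough sketch of the primary-then-fallback idea in Python. This is not the actual code behind the feature: `transcribe_with_fallback` and the provider callables are hypothetical names, and each callable just stands in for whatever client call hits Groq or OpenAI.

```python
from typing import Callable, Sequence, Tuple

# A transcriber takes raw audio bytes and returns text.
# (Hypothetical type: stands in for a real Groq or OpenAI client call.)
Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(
    audio: bytes,
    providers: Sequence[Tuple[str, Transcriber]],
) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, transcript).

    Raises RuntimeError only if every provider fails.
    """
    errors = []
    for name, transcribe in providers:
        try:
            return name, transcribe(audio)
        except Exception as exc:  # any failure means "try the next one"
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

You would pass the fast provider first and the slower, more established one second, so the fallback only costs time when the primary is down.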
Also added voice activity detection (VAD). Basically, it should prevent the random hallucinations or gibberish you get if you submit empty audio. VAD is shockingly tricky to get right, and I'm not absolutely sure this is right yet, but I'm testing it live!
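To give a feel for what a VAD gate does, here is a minimal energy-based sketch in Python. It is an illustration only and not necessarily the approach used here: the function names, the 16-bit PCM input, the 480-sample frame (30 ms at an assumed 16 kHz), and the threshold are all assumptions, and real detectors are usually more sophisticated.

```python
import math
from typing import Sequence


def rms(samples: Sequence[int]) -> float:
    """Root-mean-square energy of a frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def has_speech(
    samples: Sequence[int],
    frame_size: int = 480,        # assumed: 30 ms at 16 kHz
    threshold: float = 500.0,     # assumed energy cutoff for 16-bit audio
    min_voiced_frames: int = 3,   # require a few loud frames, not one click
) -> bool:
    """Return True if enough frames are loud enough to look like speech."""
    voiced = 0
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        if rms(frame) >= threshold:
            voiced += 1
            if voiced >= min_voiced_frames:
                return True
    return False
```

The idea is to skip the transcription call entirely when `has_speech` returns False, so silence never reaches the model. A fixed energy threshold is easy to fool with background noise, which is part of why VAD is tricky to get right.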