I haven't made a tutorial video on this because a lot of the information is covered in this video here from a different creator (NOT ME!):
I literally copied the instructions and code from that YouTube video and then modified them slightly to work with my JARVIS program. So I suggest everyone interested in adding VOSK to their program watch the video and give that creator a like.
A very brief recap of the video is:
pip install vosk
This should be self-explanatory if you've used Python before.
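If you want to confirm the install worked before going further, a quick check like this is enough. It's only a sketch: `is_installed` is a helper name I made up for illustration, and checking for `pyaudio` assumes the video's code reads the mic through PyAudio (the `stream.read` call further down suggests it does).

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "pyaudio" is an assumption about what the video's code uses for the mic.
    for name in ("vosk", "pyaudio"):
        if is_installed(name):
            print(name, "found")
        else:
            print(name, "missing, try: pip install " + name)
```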
Download STT model(s)
Go to https://alphacephei.com/vosk/models, find the Speech-To-Text (STT) model you want to use for your language, and download it. These can be large files, so depending on your choice the download might take a while. The model is loaded into memory while it's in use. The larger models are more accurate but take up more RAM. If you are using something like a Raspberry Pi 4, use the smaller model designed for single board computers or Android.
I noticed that if you are using a higher quality mic or a headset mic, the small model works well. I have a cheap conference style mic on my desk specifically for JARVIS so I use the larger model. (But I'm odd, so there's that.) If you don't have enough RAM and accidentally load a large model, your PC will freeze for several minutes and then the load will fail and dump the model from memory, unfreezing the computer. So you can't break anything but you may feel like you have. (Yes I tried it, because as mentioned before... I'm odd.)
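If you'd rather not find out the hard way, one option is to check how big the model folder is before loading it. This is a rough sketch, not something from the video: the helper names and the RAM budget are made up for illustration, and the size on disk is only a rough stand-in for how much memory the model really needs once loaded.

```python
import os


def model_size_bytes(model_dir):
    """Add up the size of every file under the unzipped model folder."""
    total = 0
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def looks_safe_to_load(model_dir, ram_budget_bytes):
    """True if the model's size on disk fits inside the budget you picked."""
    return model_size_bytes(model_dir) <= ram_budget_bytes


def load_model(model_dir, ram_budget_bytes):
    """Load the Vosk model only if it passes the rough size check."""
    if not looks_safe_to_load(model_dir, ram_budget_bytes):
        raise MemoryError("Model folder is bigger than the RAM budget: " + model_dir)
    # Imported here so the size check above works even without vosk installed.
    from vosk import Model
    return Model(model_dir)
```

The budget is whatever number you're comfortable with for your machine, for example `load_model("path/to/your/model", 2 * 1024**3)` to refuse anything over about 2 GB on disk.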
Use the code from the video.
The presenter does a good job explaining the process and I don't know if I would do better. There is one part of the video where I found a fix for an error he runs into. He has an issue with the mic not being able to be shared with other apps, which crashes his assistant program. To fix that, in his code change
data = stream.read(4096)
to
data = stream.read(4096, exception_on_overflow = False)
This tells the read to ignore the input overflow exception instead of raising it. If the mic is being used elsewhere, it just won't pick up what you said and you have to repeat yourself. Better than the whole thing crashing. This is what allows me to use JARVIS on the same PC I am recording videos on at the same time.
I hope this makes sense. If enough people like this I'll probably make a full video out of it, but there's no need to wait for me to spend time editing a video. You can use it now for yourself.
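To show where that line sits, here is a minimal sketch of a listening loop built around it. It is not the code from the video. The function names, the 16000 sample rate, and the 8192 buffer size are my assumptions, so match them to whatever your own setup and model expect.

```python
import json


def transcribe(stream, recognizer, chunk_frames=4096):
    """Yield each finished phrase as plain text.

    `stream` needs a PyAudio-style read(); `recognizer` needs Vosk's
    AcceptWaveform() and Result().
    """
    while True:
        # With exception_on_overflow=False an overflowed input buffer no
        # longer raises; some audio is dropped and the loop keeps going.
        data = stream.read(chunk_frames, exception_on_overflow=False)
        if not data:
            # A live mic normally never returns empty data; this just lets
            # file-like or test streams end the loop.
            break
        if recognizer.AcceptWaveform(data):
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                yield text


def main(model_dir):
    # Assumed setup: 16 kHz, mono, 16-bit input. Adjust for your mic and model.
    import pyaudio
    from vosk import Model, KaldiRecognizer

    recognizer = KaldiRecognizer(Model(model_dir), 16000)
    audio = pyaudio.PyAudio()
    stream = audio.open(format=pyaudio.paInt16, channels=1, rate=16000,
                        input=True, frames_per_buffer=8192)
    for phrase in transcribe(stream, recognizer):
        print(phrase)
```

Keeping the loop in its own function means the part that talks to the mic and the part that handles the text stay separate, so an assistant program only has to call `transcribe` and react to each phrase.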
Where exactly do you put the code in JARVIS?