Private Voice Assistants on Mobile/Advice how to Privatize mainstream AI.

OhVenus_Baby@lemmy.ml · edit-2 2 months ago

endlessvoid@lemmy.today · edit-2 2 months ago

Home Assistant has a built-in voice assistant function that can be as simple or robust as you need it to be. The whole thing can be set up fully locally, and mine runs easily on an old micro-PC I got for $100. I had it running on a Pi 3B originally, but the STT and TTS would take 10+ seconds to process, which was too long.

Out of the box it controls local devices, does to-do lists, controls media, and sets timers. Setting reminders doesn’t work out of the box, but can be set up with some great community templates. Services that require web content, like “tell me the news” or “what’s the weather in Seattle”, either need to be set up with custom commands that have access to the info you want, or need to go through an LLM.

Luckily, over the past few months the Open Home Foundation has added integrations for LLMs. Both local and web-based ones (ChatGPT, Gemini, etc.) are possible, so you can have it run queries through models running on a local GPU. This is currently fairly bleeding edge, though, and I haven’t tried running a local LLM myself yet, so I can’t speak to its complexity.

More on that here: https://www.home-assistant.io/blog/2024/06/07/ai-agents-for-the-smart-home/

OhVenus_Baby@lemmy.ml · 2 months ago

I really love the idea of setting this up. I keep up fairly well with these types of projects; I just lack the time to implement this myself, which is why I was asking about a mainstream service and a way to privatize it. I plan to use the assistant for work. I work long hours, and by the time I am done for the day I feel cooked. I need a relatively plug-and-play system. How could I privatize, say, an Echo Dot? Or some other mainstream voice assistant?

endlessvoid@lemmy.today · 2 months ago

Here’s what I run; this is all 100% local. The most time I spent on this project was actually on getting the wake word recognition (which is another fairly new function in HA) set up on these old teleconferencing devices: https://drive.google.com/file/d/1e2T1ibNw5GeIOUA1eqQbjwp1s2g5h5XN/view

ProletarianDictator [none/use name]@hexbear.net · edit-2 2 months ago

Check out ~~OpenWhisper~~ Faster Whisper and Wyoming.

edit: corrected name and added links

OhVenus_Baby@lemmy.ml · 2 months ago

Can you use, say, FUTO Voice Input with OpenAI’s API? Why does one need to use Whisper? And Wyoming seems to be an integration with Home Assistant.

ProletarianDictator [none/use name]@hexbear.net · 2 months ago

You don’t need to use Whisper; I got some names mixed up. I was thinking of wyoming-faster-whisper, which uses the FOSS speech-to-text system faster-whisper, but there are others that can be used.

Edited my original comment to fix that.

Wyoming is a protocol for voice assistants.

It ties together:

- speech recognition services (faster-whisper, vosk, whisper.cpp, OpenAI’s Whisper API)
- text to speech services (piper)
- wake word detection services (openWakeWord, snowboy, porcupine1)
- intent handling services
- intent recognition services
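
To make "a protocol that ties services together" a bit more concrete, here is a rough Python sketch of event framing in the style Wyoming uses, as far as I understand it: each event is a JSON header line, optionally followed by raw payload bytes (e.g. audio). The field names and event types below are simplified assumptions for illustration; check the Wyoming README for the exact wire format before relying on any of it.

```python
import io
import json


def write_event(stream, event_type, data=None, payload=b""):
    """Write one event: a JSON header line, then optional raw payload bytes."""
    header = {"type": event_type, "data": data or {}}
    if payload:
        # Tell the reader how many raw bytes follow the header line.
        header["payload_length"] = len(payload)
    stream.write(json.dumps(header).encode("utf-8") + b"\n")
    stream.write(payload)


def read_event(stream):
    """Read one event written by write_event.

    Returns (type, data, payload), or None at end of stream.
    """
    line = stream.readline()
    if not line:
        return None
    header = json.loads(line)
    payload = stream.read(header.get("payload_length", 0))
    return header["type"], header.get("data", {}), payload


if __name__ == "__main__":
    # An in-memory buffer stands in for the TCP socket a real server would use.
    buf = io.BytesIO()
    write_event(buf, "audio-chunk", {"rate": 16000, "width": 2, "channels": 1}, b"\x00\x01" * 4)
    write_event(buf, "transcript", {"text": "turn on the lights"})
    buf.seek(0)
    print(read_event(buf))
    print(read_event(buf))
```

The point is just that every service (speech to text, text to speech, wake word, intents) speaks the same simple event format, which is why the pieces can be swapped out or run on different machines.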

Home Assistant can interact with that protocol. I think the add-ons run servers for the various components used by the Wyoming protocol server, which the integration can then use, but I run it separately from Home Assistant, so idk.

Not sure what FUTO is capable of, but anything that can communicate with a Wyoming server will work. I’m willing to wager FUTO can, but idk.

OpenAI’s ChatGPT API and LLMs are orthogonal to this, but could probably be used as an intent handler or as the fallback when no other intent is recognized. So I’m pretty sure you could hook up a response from OpenAI or any other LLM API, but I haven’t tried setting that up myself yet. wyoming-handle-external lets you pipe the input text to the stdin of whatever program you give it and responds with the program’s stdout, so you could definitely use this to pass it to OpenAI or Ollama.
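
A minimal sketch of the kind of program you could hand to wyoming-handle-external, untested against a real setup: it reads the recognized text from stdin, sends it to a local Ollama server, and prints the reply to stdout. The URL is Ollama's usual default address and the model name is a placeholder; both are assumptions, so swap in whatever you actually run. How you point wyoming-handle-external at the script is not shown here; check its README for that.

```python
#!/usr/bin/env python3
import json
import sys
import urllib.request

# Assumptions: Ollama on its default local port, and a model you have already pulled.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3"


def build_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build a non-streaming generate request for the Ollama HTTP API."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})


def extract_reply(raw):
    """Pull the generated text out of the JSON response body."""
    return json.loads(raw).get("response", "").strip()


def main():
    # The recognized speech arrives as plain text on stdin.
    prompt = sys.stdin.read().strip()
    if not prompt:
        return
    with urllib.request.urlopen(build_request(prompt), timeout=60) as resp:
        # Whatever is printed to stdout becomes the assistant's response.
        print(extract_reply(resp.read()))


if __name__ == "__main__":
    main()
```

You can try it without any voice hardware by running something like `echo "what is the capital of France" | python3 handler.py` (the file name is arbitrary). Pointing it at OpenAI instead would mean changing the URL, request body, and response parsing, plus adding an API key.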

OhVenus_Baby@lemmy.ml · 2 months ago

I dig this, especially the speech-to-text step followed by running the command online. I’m afraid I don’t currently have the time to implement this and need a big-brand assistant quickly, so I need advice on how to protect my privacy while using Alexa, Google Assistant, etc. But damn, this is nice. It’s almost exactly what one would want from an AI setup. Home Assistant has come a long way, and that is awesome. Those people do good work.