How can I get my Virtual Assistant to hear me? - python-3.x

I am trying to build a virtual assistant as an exercise. When I attempt to get audio from the microphone live, 'Robin' (the V.A.) just stays running and never responds.
I updated SpeechRecognition and PyAudio, and also reinstalled elasticsearch via Homebrew after having to install Java 1.8. I also tried adjusting the exception_on_overflow error that appeared after shutdown and set it to False (at this point I am well beyond my level of knowledge). On top of this, to make sure the transcription itself was working, I ran python -m speech_recognition in the terminal (OS: Mac) and it translated my speech fairly accurately. I'm stumped.
import speech_recognition as sr

# take command from microphone
def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Absorbing...')
        audio = r.listen(source)
    try:
        print('Recognizing...')
        query = r.recognize_google(audio, language='en-US')
        print(f'user said: {query}\n')
    except KeyboardInterrupt as e:
        print("I'm sorry, I didn't get that.")

# Begin tasking:
speak('Initializing, Robin...')
wishMe()
takeCommand()
I am hoping for the console to return what I said as text; the goal would then be to turn that text into an executable command, hence the takeCommand function. Yet if Robin cannot detect a sound, she just gives the output "I'm sorry,". If there's anything else I can provide, let me know. I really appreciate the feedback. Also, I'm new to Stack Overflow, so I apologize if I didn't format this correctly.
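One detail worth noting about the snippet above: recognize_google() reports failure by raising sr.UnknownValueError (the speech was unintelligible) or sr.RequestError (the API could not be reached), not KeyboardInterrupt, and listen() blocks indefinitely until it detects a phrase. A minimal sketch of the same function with those handlers and an explicit timeout (the timeout values here are arbitrary assumptions):

import speech_recognition as sr

def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Absorbing...')
        r.adjust_for_ambient_noise(source)  # calibrate so listen() can tell speech from silence
        try:
            # give up if no phrase starts within 5 s; cap a single phrase at 10 s
            audio = r.listen(source, timeout=5, phrase_time_limit=10)
        except sr.WaitTimeoutError:
            print("I didn't hear anything.")
            return None
    try:
        print('Recognizing...')
        query = r.recognize_google(audio, language='en-US')
        print(f'user said: {query}\n')
        return query
    except sr.UnknownValueError:   # audio was captured but not understood
        print("I'm sorry, I didn't get that.")
    except sr.RequestError as e:   # the Google API was unreachable
        print(f'Recognition service error: {e}')
    return None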

Related

Qt play audio by QAudioDevice: can't connect to PulseAudioService

As the title says, I'm trying to use Qt and FFmpeg to play audio. My code looks like this:
QAudioOutput *audio_output;
QIODevice *stream_out;

// describe a 44.1 kHz, 16-bit, stereo, signed little-endian PCM format
QAudioFormat audio_fmt;
audio_fmt.setSampleRate(44100);
audio_fmt.setChannelCount(2);
audio_fmt.setSampleSize(16);
audio_fmt.setCodec("audio/pcm");
audio_fmt.setByteOrder(QAudioFormat::LittleEndian);
audio_fmt.setSampleType(QAudioFormat::SignedInt);

QAudioDeviceInfo info = QAudioDeviceInfo::defaultOutputDevice();
if(!info.isFormatSupported(audio_fmt))
{
    // fall back to the closest format the device supports
    audio_fmt = info.nearestFormat(audio_fmt);
}
audio_output = new QAudioOutput(audio_fmt);
When I call QAudioDeviceInfo::defaultOutputDevice(),
I get a PulseAudioService: pa_context_connect() failed error.
How can I fix this?
By the way, I'm using Ubuntu 16.04 and Qt 5.14.2, and I have added multimedia to the Qt .pro file.
I checked my Qt installation, and there is an audio dir under plugins, so it's not a missing-library problem. I also read this post, but I don't know how to apply it. Does anybody have an idea? Thank you; my English is bad, so I hope you can understand what I'm saying.

How do I transfer the output from pyttsx3 to a variable for DSP

I'm running a Raspberry Pi 4 with Python 3.7 and pyttsx3.
I'm planning to use pyttsx3 to verbally respond to "commands" I issue. I also plan to visualise the output speech on a neopixel strip (think "Close Encounters" on a miniature scale.) The visualisation is not the problem though.
My problem is, how do I get the output from pyttsx3 into a variable so I can pass it to my DSP?
I know that I can save the output to a file:
import pyttsx3
engine = pyttsx3.init() # object creation
"""Saving Voice to a file"""
engine.save_to_file('Hello World', 'text.mp3')
engine.runAndWait()
I know I can read the file back, but that introduces latency.
I want the speech and the twinkly lights to coincide, and while I know I can just play the saved file, I'd like something more "real time".
Does anyone have any suggestions, please?
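One workaround sketch, accepting the file round-trip described above: have pyttsx3 write a WAV file, then read the raw PCM frames back into a variable with the standard wave module before playback starts, so the DSP/visualisation can be driven from the same buffer (the filename and the chunking suggestion are assumptions, not pyttsx3 requirements):

import wave
import pyttsx3

engine = pyttsx3.init()
engine.save_to_file('Hello World', 'speech.wav')  # the espeak backend on the Pi emits WAV data
engine.runAndWait()

# Pull the samples into memory so the DSP can see them before/while playing.
with wave.open('speech.wav', 'rb') as wf:
    sample_rate = wf.getframerate()
    pcm_bytes = wf.readframes(wf.getnframes())  # raw 16-bit PCM as a bytes object

# pcm_bytes can now be sliced into short windows (e.g. 20 ms) and each
# window's amplitude mapped to the neopixel strip while the audio plays.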

Tkinter and some kind of issue when trying to display images

First off, I'm very new to Python and coding in general. I'm using Python, Tkinter, and IDLE version 3.7.3, on an HP Chromebook with Chrome OS version 81.0.4044.141.
from tkinter import *
window = Tk()
window.title('Image Example')
img = PhotoImage(file = 'python.gif')
label = Label(window, image = img)
label.pack()
window.mainloop()
As you can see above, this is the small snippet of code I'm having issues with. As far as I understand, everything is written correctly and the file "python.gif" is in the correct directory. For reference, this is what the image should look like:
python.gif (normal)
But when I run the program, this is what I get:
python.gif (screenshot of running program)
That's the result 99% of the time, but I should mention that on rare occasions the image has displayed correctly on program execution; however, I don't know how to reproduce that. For more context, I've tried other images to see what happened. I found a free .pgm image to try as an example, and on execution I either got the same result, or half of the image would appear correctly while the bottom half (sometimes the top half instead) would be blacked out.
In conclusion, I wanted to ask if anybody has an idea of what's going on. I'm not sure whether this is a hardware issue (I can view all the mentioned images in a normal image-viewing app with no problems) or something to do with Python/Tkinter.
Any assistance is very appreciated! Please and Thank You!
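For what it's worth, the classic cause of blank Tkinter images is the PhotoImage being garbage-collected because nothing keeps a reference to it. The snippet above keeps img at module level, so that may not be the culprit here, but it becomes one as soon as the code moves into a function; the usual defence is to anchor the image to the widget that displays it. A minimal sketch of that pattern:

from tkinter import Tk, Label, PhotoImage

def show_image(window, path):
    img = PhotoImage(file=path)
    label = Label(window, image=img)
    label.image = img  # keep a reference so the image isn't garbage-collected
    label.pack()
    return label

window = Tk()
window.title('Image Example')
show_image(window, 'python.gif')
window.mainloop()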

How do I detect certain things that I said in a speech recognizer script

I am trying to make a voice-activated virtual assistant of sorts using Python, but I am not sure how to detect and distinguish between the different voice commands. Currently it just repeats back to you, "You said [whatever I said]", but I want it to respond differently to different things that I say. I am quite new to Python and don't know what I should do. Does anyone know how I could do this?
You have to define what you want it to do. The last two lines of the snippet below tell the program to do something if the input is "hello": when you run it and say "hello", it gives a different response; if it does not detect that you said "hello", it does nothing. I might recommend finding a project on GitHub where someone has already built an assistant like this, and starting by trying to understand what they did and editing it to the specifications you want. A sketch that extends this to several commands follows the code.
import speech_recognition as sr

sample_rate = 48000
chunk_size = 2048
r = sr.Recognizer()
device_id = 1

with sr.Microphone(device_index=device_id, sample_rate=sample_rate, chunk_size=chunk_size) as source:
    print("Say something...")
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)

text = r.recognize_google(audio)
if text.lower() == "hello":
    print("Hi, how are you?")

About Tkinter, Python 2.7.6 on Linux Mint 17.2

I have 2 functions as below:
def select_audio():
    os.chdir("/home/norman/songbook")
    top1.lower(root)
    name = tkFileDialog.askopenfilename()
    doit = "play " + name
    top1.lift(root)
    os.system(doit)

def select_video():
    os.chdir("/home/norman/Videos")
    top2.lower(root)
    name = tkFileDialog.askopenfilename()
    doit = "mpv --fs " + name
    top2.lift(root)
    os.system(doit)
They are called from buttons that let me choose and play audio files or video files, and they work to some extent.
The videos are in a different directory, at the same level as the audio files.
Whichever I choose first shows the correct directory, so I can play, say, a video; but if I then choose audio after it has finished, the dialog still shows the video directory.
Similarly, if I choose audio first, it still shows the audio directory when I later select videos.
I have no idea why it does this. I am not an experienced programmer, as you can probably tell from the code.
Some suggestions:
Use a raw string to make sure that Python doesn't try to interpret anything following a \ as an escape sequence:
Change os.chdir("/home/norman/whatever") to os.chdir(r"/home/norman/whatever")
It won't solve this problem, but it will spare you future problems.
For tkFileDialog, use the initialdir option:
Change name=tkFileDialog.askopenfilename() to
name=tkFileDialog.askopenfilename(initialdir=r"/home/norman/whatever", parent=root)
A minimal sketch applying this to select_audio follows.
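Here is what select_audio might look like with that change applied (the widget names top1 and root are taken from the question; the if name guard is an added assumption so nothing runs when the dialog is cancelled):

import os
import tkFileDialog  # Python 2 module; in Python 3 use tkinter.filedialog

def select_audio():
    top1.lower(root)
    # open the dialog directly in the songbook directory instead of os.chdir-ing there
    name = tkFileDialog.askopenfilename(initialdir=r"/home/norman/songbook",
                                        parent=root)
    top1.lift(root)
    if name:  # the user may cancel the dialog, which returns an empty string
        os.system("play " + name)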
