I am wondering how to use the IXAudio2Voice::SetChannelVolume function. The documentation tells me to pass the number of channels as the first parameter, but which "number of channels" does it mean, and how can I get that value? Do I have to use the number of input channels, which can be retrieved from the GetVoiceDetails function (see here)?
It is expecting the number of channels in the voice. This should be the same as the channel count the voice was created with, i.e. the value of nChannels in the WAVEFORMATEX struct.
http://msdn.microsoft.com/en-us/library/windows/desktop/dd390970(v=vs.85).aspx
IXAudio2SourceVoice* pSourceVoice;
if( FAILED(hr = pXAudio2->CreateSourceVoice( &pSourceVoice, (WAVEFORMATEX*)&wfx ) ) ) return hr;
Here, wfx.nChannels is the channel count for the voice.
http://msdn.microsoft.com/en-us/library/windows/desktop/ee415828(v=vs.85).aspx
This is my code:
import pyttsx3
# This function is used to process text and speak it out
def speak(lis):
    voiceEngine = pyttsx3.init()
    voiceEngine.getProperty("rate", 100)
    #voice_id = "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\IVONA 2 Voice Brian22"
    #voiceEngine.setProperty('IVONA 2 Brian - British English male voice [22kHz]', voice_id )
    voiceEngine.say(lis)
    voiceEngine.runAndWait()
    voiceEngine.stop()

speak(lis = "Hello i am alpha, your personal digital assistant. how may i be of your assistance")
Looking at the documentation for pyttsx3.engine.Engine.getProperty, it looks like you're giving it too many arguments:
getProperty(name : string) → object
Gets the current value of an engine property.
Parameters: name – Name of the property to query.
Returns: Value of the property at the time of this invocation.
You are giving it an extra argument: 100. You should only be providing the name of the property you want to access: voiceEngine.getProperty("rate"). Unless you were looking for pyttsx3.engine.Engine.setProperty, which does take a second argument: the value to set the property to.
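For example, a minimal corrected sketch of the function, keeping the rate value of 100 from the original snippet but applying it with setProperty:

import pyttsx3

def speak(lis):
    voiceEngine = pyttsx3.init()
    voiceEngine.setProperty("rate", 100)    # set the speaking rate
    rate = voiceEngine.getProperty("rate")  # getProperty takes only the property name
    voiceEngine.say(lis)
    voiceEngine.runAndWait()
    voiceEngine.stop()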
I am trying to solve a challenge on HackerRank. The description is here: https://www.hackerrank.com/challenges/maximum-element
So I tried this code:
query,number=map(int,input().split())
This code works well when the input has exactly two values, but fails when there is only one.
You need to do some input validation before you start processing if you expect users to sometimes not enter the expected values, for example:
user_input = input().split()
if len(user_input) < 2:
    print("At least two parameters are required!")
else:
    try:
        query = int(user_input[0])
        number = int(user_input[1])
    except ValueError:
        print("At least two integer parameters are required!")
You can do even more post-input validation to match your required parameters, or you can combine parts of the validation, for example to accept a single value and fall back to a default for the second (see the sketch below). It all depends on your desired business logic.
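For instance, a minimal sketch that falls back to a default when only one value is entered (the default of 0 is just an illustration):

DEFAULT_NUMBER = 0  # hypothetical default, adjust to your logic

user_input = input().split()
try:
    query = int(user_input[0])
    # fall back to the default when the second value is missing
    number = int(user_input[1]) if len(user_input) > 1 else DEFAULT_NUMBER
except (ValueError, IndexError):
    print("At least one integer parameter is required!")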
I am playing around with some basics of the Audiostream package for Kivy.
I would like to make a simple online input-filter-output system, for example, take in microphone data, impose a band-pass filter, send to speakers.
However, I can't seem to figure out what data format the microphone input is in or how to manipulate it. In the code below, buf is of type string, but how can I get the data out of it to manipulate it [i.e. modified_buf = function(buf)], for example to apply a band-pass filter?
The code currently functions to just send the microphone input directly to the speakers.
Thanks.
from time import sleep
from audiostream import get_input
from audiostream import get_output, AudioSample
#get speakers, create sample and bind to speakers
stream = get_output(channels=2, rate=22050, buffersize=1024)
sample = AudioSample()
stream.add_sample(sample)
#define what happens on mic input with arg as buffer
def mic_callback(buf):
    print 'got', len(buf)
    #HERE: How do I manipulate buf?
    #modified_buf = function(buf)
    #sample.write(modified_buf)
    sample.write(buf)
# get the default audio input (the mic in most cases)
mic = get_input(callback=mic_callback)
mic.start()
sample.play()
sleep(3) #record for 3 seconds
mic.stop()
sample.stop()
The buffer is composed of bytes that need to be interpreted as signed shorts. You can use the struct or array module to get the values. In your example, you have 2 channels (L/R). Let's say you want to turn the right channel's volume down by 20% (i.e. keep 80% of the original signal on the right channel only):
from array import array
def mic_callback(buf):
    # convert our byte buffer into a signed short array
    values = array("h", buf)
    # get the right-channel samples only (interleaved L/R)
    r_values = values[1::2]
    # reduce by 20%; cast back to int, since array("h") only accepts integers
    r_values = map(lambda x: int(x * 0.8), r_values)
    # only an array (not a list) can be assigned to an array slice,
    # so we need to convert the list back to an array
    values[1::2] = array("h", r_values)
    # convert the array back to a byte buffer for the speaker
    sample.write(values.tostring())
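If you eventually want the band-pass filter from the question, the same pattern applies: reinterpret the bytes, process each channel, and convert back. Below is a rough sketch using numpy and scipy (both assumptions, not part of audiostream; the 300-3000 Hz band and the filter order are illustrative, and a real streaming filter should also carry its state between callbacks, e.g. via lfilter's zi argument):

import numpy as np
from scipy import signal

RATE = 22050  # must match the rate passed to get_output above
nyquist = RATE / 2.0
# 2nd-order Butterworth band-pass, 300-3000 Hz (illustrative values)
b, a = signal.butter(2, [300.0 / nyquist, 3000.0 / nyquist], btype='band')

def bandpass(buf):
    # reinterpret the byte string as interleaved signed 16-bit samples
    samples = np.frombuffer(buf, dtype=np.int16).astype(np.float64)
    # deinterleave, filter each channel separately, then reinterleave
    out = np.empty_like(samples)
    out[0::2] = signal.lfilter(b, a, samples[0::2])
    out[1::2] = signal.lfilter(b, a, samples[1::2])
    # clip and convert back to a byte string for the speaker
    return np.clip(out, -32768, 32767).astype(np.int16).tobytes()

Inside mic_callback you would then call sample.write(bandpass(buf)) instead of sample.write(buf).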
How do I get time index (or frame number) in Sphinx 4 when I set it to transcribe an audio file?
The code I'm using looks like this:
audioURL = ...
AudioFileDataSource dataSource = (AudioFileDataSource) cm.lookup("audioFileDataSource");
dataSource.setAudioFile(audioURL, null);
Result result;
while ((result = recognizer.recognize()) != null) {
    Token token = result.getBestToken();
    //DoubleData data = (DoubleData) token.getData();
    //long frameNum = data.getFirstSampleNumber(); // data seems to always be null
    String resultText = token.getWordPath(false, false);
    ...
}
I am trying to get the time of the transcription from the result/token objects, similar to what a subtitler does. I've found Result.getFrameNumber() and Token.getFrameNumber(), but they appear to return the number of frames decoded, not the time (or frame) at which the result was found within the entire audio file. I also looked at AudioFileDataSource.getDuration() [which is private] and the Recognizer classes, but haven't figured out how to get the transcribed time index.
Ideas? :)
The frame number is the time multiplied by the frame rate, which is 100 frames per second.
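Converting a frame index back to a time is therefore simple arithmetic, as in this quick sketch (Sphinx-4 itself is Java, but the conversion is the same in any language):

FRAMES_PER_SECOND = 100  # Sphinx-4's default analysis rate

def frame_to_seconds(frame_number):
    return frame_number / float(FRAMES_PER_SECOND)

frame_to_seconds(250)  # -> 2.5 seconds into the audio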
Anyway, please find the patch for the subtitles demo which returns timings here:
http://sourceforge.net/mailarchive/forum.php?thread_name=1380033926.26218.12.camel%40localhost.localdomain&forum_name=cmusphinx-devel
The patch applies to subversion trunk, not to the 1.0-beta version.
Please note that this part is under major refactoring, so the API will be obsolete soon. However, I hope you will then be able to create subtitles with just a few calls, without all the current complexity.
I am trying to send a message to the broker over a websocket. The message contains numbers that represent sensor data, so the message can be a mix of integers and floats. When I run the code I get TypeError: payload must be a string, bytearray, int, float or None. How can the code be changed to send a message containing integers and floats? I am using CloudMQTT as a broker.
Full code:
import paho.mqtt.client as mqtt
import time
client = mqtt.Client()
client.username_pw_set("User", "Password")
client.connect("Server", "Port")
num_one = 5.83
num_two = -12.46
num_three = 2
message = (num_one, num_two, num_three)
while True:
    client.publish("topic", message)
    time.sleep(1)
It looks like your problem is that the message you're sending is a tuple. You probably want

message = (num_one, num_two, num_three)
message = ','.join(str(x) for x in message)

This converts each number to a string and joins them, separated by commas, into a single string that publish() accepts.
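On the receiving side you can split that string back into numbers, for example with a paho on_message callback (a minimal sketch):

def on_message(client, userdata, msg):
    # split the comma-separated payload back into floats
    values = [float(x) for x in msg.payload.decode().split(",")]
    print(values)

client.on_message = on_message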
Choose an appropriate binary or text-based format for your message and encode your structure in that format; the payload will then be either a bytearray or a string. Unless there's a good reason to roll your own format, I'd suggest SenML: it is barely more complex than most ad-hoc JSON formats, but is sufficiently standardised that you can at least say you're trying to be compatible with other applications.
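For illustration, a sketch of the publish loop using a SenML-style JSON payload (the record names are just placeholders, and 1883 stands in for your broker's port):

import json
import time
import paho.mqtt.client as mqtt

client = mqtt.Client()
client.username_pw_set("User", "Password")
client.connect("Server", 1883)  # the port must be an int, not a string

while True:
    # SenML-style record list: one {"n": name, "v": value} entry per reading
    records = [
        {"n": "num_one", "v": 5.83},
        {"n": "num_two", "v": -12.46},
        {"n": "num_three", "v": 2},
    ]
    client.publish("topic", json.dumps(records))  # a plain string, which publish() accepts
    time.sleep(1)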