ALSA async callbacks? - linux

The ALSA documentation seems to be very lacking. Basically, I need to play sounds asynchronously, be able to stop (all) sounds, and get a callback when one has finished playing successfully.
I can mostly do the first two; it's just the last one I'm having trouble with.
Does anyone know any snippets that may enlighten me?
Further Details:
Basically, the user will be browsing a collection of sounds. When they hover over one, it should play; when they move on to the next one, the first should stop very quickly and the next should play, and so on. This will happen fast. The last one they heard in its entirety should be selected (which is why I need a callback when a whole sound plays successfully; at the moment the one selected isn't necessarily the one they last heard, due to threading).
I don't really want to use any libraries other than libasound.
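A minimal sketch of one way to structure this, independent of the ALSA calls themselves: feed each sound to the device in small chunks from a worker thread, check a per-sound cancel flag between chunks, and fire the callback only if every chunk was written without cancellation. Python is used purely for illustration. `write_chunk` stands in for whatever blocking write is used (for example `snd_pcm_writei` in C), and `Player`, `play`, `stop`, and `on_finished` are hypothetical names, not part of libasound.

```python
import threading
import time


class Player:
    """Plays one sound at a time; starting a new sound cancels the old one."""

    def __init__(self, write_chunk, on_finished):
        self._write_chunk = write_chunk   # stand-in for a blocking PCM write
        self._on_finished = on_finished   # called only for sounds heard in full
        self._lock = threading.Lock()
        self._cancel = None               # cancel flag of the current sound
        self._thread = None

    def play(self, name, chunks):
        self.stop()
        cancel = threading.Event()
        thread = threading.Thread(target=self._run, args=(name, chunks, cancel))
        with self._lock:
            self._cancel, self._thread = cancel, thread
        thread.start()

    def stop(self):
        with self._lock:
            cancel, thread = self._cancel, self._thread
            self._cancel = self._thread = None
        if cancel is not None:
            cancel.set()
            thread.join()  # short wait: the worker checks the flag every chunk

    def _run(self, name, chunks, cancel):
        for chunk in chunks:
            if cancel.is_set():
                return                    # interrupted: no callback
            self._write_chunk(chunk)
        if not cancel.is_set():
            self._on_finished(name)       # runs on the worker thread


# Simulated playback: each chunk "takes" 10 ms instead of touching a device.
finished = []
player = Player(write_chunk=lambda chunk: time.sleep(0.01),
                on_finished=finished.append)
player.play("first", range(50))   # about 0.5 s of simulated audio
time.sleep(0.05)
player.play("second", range(3))   # cancels "first" almost immediately
time.sleep(0.2)
player.stop()
print(finished)
```

Because the callback is only reached after the last chunk, the most recent name it receives is the last sound that played in full. With real ALSA code you would probably also want to wait for the device to drain (`snd_pcm_drain`) before reporting completion, and use `snd_pcm_drop` when stopping, but that part is left out of this sketch.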

Related

Capture voice from a stream and translate it

I was hoping there is something that can do this in a single step, though I realise it may take several.
The problem is that there is a game I follow, but all the major information (devblogs and streams) is delivered mostly in French. Today there is a stream on Twitch that I would love to understand, but I have never had any contact with French outside of the game. I was hoping there would be a way to launch the stream, capture the speech as text, and translate it into English.
So far, the best idea I have come up with is to open Google Translate, turn the volume up, and let the computer talk to itself. However, I was hoping for something that listens to an application or window inside the system directly, without using the actual microphone and speakers.
Enjoy your day, all!

Where can I find documentation for the preview() and show() function for VideoClips?

I just want to see how exactly they work, and I can't seem to find them in either moviepy or pygame's websites. Basically I just want to see at what time a user presses a specific key during a clip, and record that time/possibly insert an image at that time while the movie is playing. I know moviepy does that already to some extent, but it's only for mouse clicks.
Thank you for your time.
I found the source code but no answer. I ended up editing the source code, and while that works, I would much rather do something else than that if possible.
To give a more elaborate answer to the rest of my question: I don't think it is feasible to edit the video file directly WHILE it's playing. I also don't know whether it would be a good idea to save every single frame and just combine them. I found an extremely efficient, but niche, solution: modify the preview frame while it plays, and have that change persist across every new frame. Then I saved JUST the overlay to a file, which I can use however I like.
I have seen no other threads/users actually deal with moviepy in this way, so feel free to PM me or ask on the thread if you want more info.
Source code here

Actions on Google - Close mic without closing the app, or workaround suggestions

Is it possible to close the mic without closing the app?
Or any suggestions to the below explained situation are very welcome:
I've found some posts already asking for this, but they are about a year old, so I wonder whether there's something new.
I'm using conv.close('some message not prompting');. That closes the mic, but also closes the app, which is not what I need.
The functionality I need is the same that AOG takes by default when showing a browsing carousel: it automatically closes the mic (but not the app), and the user can re-open the mic or tap on a suggestion chip to interact with your app directly, without the need to invoke it again.
It was suggested that I add a tail saying "What else can I do for you?" after each reply that doesn't prompt the user for new information, so that I can keep the mic open, but that sounds so unnatural that I really think it defeats the purpose of trying to sound natural with a bot.
There are many situations where you can expect the user to say something even though you're not asking for anything. A simple example is telling a joke: you can expect the user to laugh, criticize, ask for another joke, or make some other comment. In this case, closing the mic (and the app) makes no sense, and adding a "Do you want to hear another joke?" tail doesn't sound good after the joke, especially if you're telling one after another.
The purpose here is not to be rejected by the AOG review team because I'm leaving the mic open.
Any ideas are welcome. Thanks in advance.
You don't need to explicitly prompt for "what next", but you do need to make clear if you're expecting something further from the user. The easiest way to do that is to rotate through some prompts. (Libraries like multivocal make this easier.)
The notion of "closing the microphone without closing the conversation" leads to the question of "OK, how do they close the conversation?" And for the simple one-off scenario that you've described, that isn't always obvious.
That said, there are a few thoughts about how you can approach it depending on your needs.
If there is a reason you need to close the microphone, yet still allow the user to issue a command while still in the Action, you can consider sending a Media object as part of your response. When the playback finishes, your Action will be triggered to let you know, and you can either prompt the user again (and play more audio) or eventually agree to close the conversation. Users would interrupt the audio with "Hey Google" followed by a command in your Action.
Another approach for things that are truly "one off", but where they may want to followup in rarer cases, is to keep track of user state (if you need to reference it in a followup) and close the conversation. The user would be able to "re-start" the conversation if they need to, either through normal invocation or through a deep-linking invocation. This does close the conversation - but makes it easy to restart.
I have written two Google Home apps now, and both were initially rejected because some of my intents did not ask a follow-up question. I agree that always asking a follow-up question can sound a little unnatural. I get my answer logic (on my endpoint) to append a random follow-up question from a predefined list to try to vary things a bit.
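The "rotate through some prompts" and "random follow-up from a predefined list" ideas can be sketched in a few lines. This is an illustration in Python, not code for the Actions on Google client library; `FOLLOW_UPS`, `PromptRotator`, and `with_follow_up` are hypothetical names, and the prompt texts are placeholders.

```python
import random

FOLLOW_UPS = [
    "Want another one?",
    "Shall I keep going?",
    "What would you like next?",
]


class PromptRotator:
    """Hands out prompts in shuffled order, never the same one twice in a row.

    Assumes the prompts are distinct.
    """

    def __init__(self, prompts, rng=None):
        if len(prompts) < 2:
            raise ValueError("need at least two prompts to rotate")
        self._prompts = list(prompts)
        self._rng = rng or random.Random()
        self._queue = []
        self._last = None

    def next(self):
        if not self._queue:
            # Start a new shuffled pass over all prompts.
            self._queue = self._prompts[:]
            self._rng.shuffle(self._queue)
            # Prompts are popped from the end, so avoid a repeat across passes.
            if self._queue[-1] == self._last:
                self._queue[0], self._queue[-1] = self._queue[-1], self._queue[0]
        self._last = self._queue.pop()
        return self._last


def with_follow_up(answer, rotator):
    """Append the next follow-up prompt to an answer that would otherwise end flat."""
    return f"{answer} {rotator.next()}"


rotator = PromptRotator(FOLLOW_UPS)
print(with_follow_up("Why did the chicken cross the road? To get to the other side.", rotator))
```

Shuffling a full pass before repeating anything means every prompt gets used equally often, which tends to sound less mechanical than picking independently at random each time.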

Computer keyboard into piano keyboard with AutoHotkey

I want to be able to use my computer keyboard as a piano keyboard, however the default version of AutoHotkey only supports one "voice" at a time. I tried running an instance for each note, but that doesn't fix it if I press the same note repeatedly.
I found this thread on how this might be solved with the BASS library, but I'm pretty green when it comes to coding and so I'm not certain how to incorporate the library into my simple code.
Here's another similar forum that might solve things, but it has a delay and the overlapping solution doesn't really solve my issue.
This is such a simple idea (play sound when a button is pressed), but somehow it's way out of my depth. Currently my code looks like this:
~1::
SoundPlay, C:\Users\Fires\Downloads\2489__jobro__piano-ff\39187__jobro__piano-ff-040.wav
and so on, with one such hotkey for each note.
Edit:
~a::
FileDelete, %A_ScriptDIR%\Sound1.AHK
FileAppend,
(
SoundPlay, C:\Users\Fires\Desktop\New folder (4)\043.wav, Wait
), %A_ScriptDir%\Sound1.AHK
Run, %A_ScriptDIR%\Sound1.AHK
Return
is what I am using now, but it's still iffy when two are pressed at the same time.
It's likely due to the "Wait". According to the documentation:
https://autohotkey.com/docs/commands/SoundPlay.htm
"If a file is playing and the current script plays a second file, the first file will be stopped so that the second one can play. On some systems, certain file types might stop playing even when an entirely separate script plays a new file."
It looks like this is an "issue" with AutoHotkey, and it's not possible to open multiple "voices" or play simultaneous sounds. It is entirely possible to write a C# program that does the same thing; I've done it before, and I still have bits of the code (it plays MIDI sounds instead of WAVs, but it's the same concept: play notes asynchronously).
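Whatever language or library ends up playing the audio, the underlying idea of several "voices" is mixing: each key press contributes its own samples, and overlapping samples are summed into one output stream. Here is a rough sketch of that idea in Python (not AutoHotkey, and not the C# program mentioned above). `tone`, `mix`, the sample rate, and the note frequencies are illustrative assumptions, and the result is only computed, not sent to a sound card.

```python
import math

SAMPLE_RATE = 8000  # samples per second, kept low for the example


def tone(freq_hz, seconds, amplitude=0.4):
    """Return a sine wave as a list of floats in [-amplitude, amplitude]."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def mix(voices):
    """Sum voices that may start at different times.

    `voices` is a list of (start_sample, samples) pairs. The result is
    clipped to [-1.0, 1.0] so that loud overlaps stay in range.
    """
    length = max((start + len(s) for start, s in voices), default=0)
    out = [0.0] * length
    for start, samples in voices:
        for i, value in enumerate(samples):
            out[start + i] += value
    return [max(-1.0, min(1.0, v)) for v in out]


# The same note pressed twice in quick succession, plus a second note.
c4 = tone(261.63, 0.5)
e4 = tone(329.63, 0.5)
mixed = mix([(0, c4), (1000, c4), (2000, e4)])
print(len(mixed), "samples in the mixed buffer")
```

Note that the repeated press of the same note does not cut off the first one; both copies simply overlap in the mix, which is the behaviour the per-note script approach could not provide.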

online trading bot

I want to code a trading bot for Magic: The Gathering Online. This bot should wait until someone offers to trade, accept, look through the cards available from the other trader (the information is shown on screen), and perform other similar functions. I have several questions:
How can it know that someone is offering a trade?
How can it know that the other trader has a certain card (the information is stored in pictures)?
I just cannot imagine how to do it right now. I have no experience with this; until now I've only written console programs for my physics needs.
First, you should note that some online games forbid bots, as they can give certain players unfair advantages. The MTGO Terms of Service do not seem to say anything about this, though they do put restrictions on anything that might negatively impact the service. They have also said that there is a possibility they will add an API in the future, so they don't seem to be against the idea of automation, but are not supporting it at the moment. Tread carefully here, but it looks like it should be OK to write a bot as long as it is not harmful or abusive. This is not legal advice, and it would be a good idea to ask the folks who run MTGO for permission. Edit: since I wrote this, it has been pointed out that there are lots of bots already, so there should be no problems writing bots.
Assuming that it is not forbidden by the terms of service, but they do not have an API, you will have to find a way to detect what's going on, and control the game automatically. There's a pretty good series of articles on writing poker bots (archived copy), which has some good information on how to inject a DLL into an application, scrape the screen, and control the application. That might provide you with a starting point for doing this sort of thing.
You might also want to look for tools that other people have already written for doing this. It looks like there are several existing MTGO bots, but they all seem a bit sketchy (there have been some reports of them stealing passwords), so be careful there.
Edit
Since this answer still seems to be getting upvotes, I should probably update it with some more useful information. Since writing this, I have found a great UI automation system called Sikuli. It allows you to write programs in Python that automate a GUI. It includes image recognition features which make it very easy to recognize buttons, cards, and other UI elements; you just take a screenshot, crop it down to include just the thing you're interested in, and do fuzzy image matching (so that changing backgrounds and the like doesn't cause the match to fail). It even includes a custom IDE that allows you to embed those screenshots directly in your source code, so you can see exactly what the code is looking for. Here's an example from the documentation (apologies for the code formatting, doing images inline in code is not easy given StackOverflow's restricted subset of HTML):
def resizeApp(app, dx, dy):
    switchApp(app)
    # In the original, a screenshot of the window corner is embedded inside Pattern().
    corner = find(Pattern().targetOffset(3,14))
    drop_point = corner.getTarget().offset(dx, dy)
    dragDrop(corner, drop_point)

resizeApp("Safari", 50, 50)
This is much easier to get started with than the techniques mentioned in the article linked above, of injecting a DLL into the process you are debugging. Sikuli runs entirely at the UI level, so you never have to modify the program you are automating or worry about changes to the internals breaking your script.
One thing it is a bit poor at is handling text; it has OCR features, but they aren't all that good. If the text is selectable, however, you can select the text, copy it, and then look directly at the clipboard.
If I were to write a bot to automate something without a good API or text-based interface, Sikuli is probably the first tool I would reach for.
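To give a feel for what "fuzzy image matching" means, here is a toy sketch in plain Python. It is not Sikuli's implementation or API. It slides a small grayscale template over a larger image and accepts the best position whose average pixel difference stays under a tolerance. `find_template`, the tolerance values, and the list-of-lists image format are assumptions made for the example.

```python
def find_template(image, template, tolerance=10.0):
    """Return (row, col) of the best match, or None if nothing is close enough.

    `image` and `template` are 2-D lists of grayscale values (0-255).
    A position matches when the mean absolute pixel difference is at most
    `tolerance`, so small changes in brightness or background still match.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            total = 0
            for tr in range(th):
                for tc in range(tw):
                    total += abs(image[r + tr][c + tc] - template[tr][tc])
            score = total / (th * tw)
            if score <= tolerance and (best_score is None or score < best_score):
                best_pos, best_score = (r, c), score
    return best_pos


# A cropped "screenshot" of the thing we are looking for.
button = [[200, 210],
          [190, 205]]

# The same button drawn slightly darker at row 1, column 2 of the screen.
screen = [[10, 12, 11, 13, 10],
          [11, 10, 195, 204, 12],
          [12, 11, 186, 199, 10],
          [10, 13, 12, 11, 12]]

print(find_template(screen, button))        # tolerant match
print(find_template(screen, button, 1.0))   # strict tolerance
```

An exact comparison would miss the darker copy of the button; allowing a small average difference is what lets the match survive changes in background or rendering.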
This answer is constructed from my comments.
What you are trying to do is hard, any way you try and do it.
Arguably the easiest way to do it is to totally mimic the user, so the application presses buttons, moves the mouse, etc. The downside is that it depends on being able to recognise the screen.
This is easier if you can alter the game's files, as you can then just re-skin the required cards (change the image, i.e. the texture) to a single unique colour.
The major downside is that you have to keep the game as the top-level window or run the game in a virtual machine, neither of which is ideal.
Another method is to read the process's memory. You may be able to find a list of memory locations, which would make things simpler; otherwise it involves a lot of hard work with a debugger to deduce the memory addresses. It also helps (a lot) to be able to understand assembly.
The third method is to intercept the packets and alter them. This is easier than the method above, as it is (at least for me) easier to reverse engineer the protocol because you have less information to deal with. It is just a matter of setting up a packet sniffer, performing an action with one variable changed (for example, the card), and comparing the differences.
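The comparison step can be as simple as diffing two captures byte by byte. Here is a small sketch, assuming the payloads of two otherwise identical actions have already been saved as `bytes`. The payloads below are made up for illustration and say nothing about the real protocol, and `diff_offsets` is a hypothetical helper.

```python
def diff_offsets(a: bytes, b: bytes):
    """Return a list of (offset, byte_in_a, byte_in_b) where the payloads differ.

    If the lengths differ, the extra tail of the longer payload is reported
    with None for the missing side.
    """
    diffs = []
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else None
        y = b[i] if i < len(b) else None
        if x != y:
            diffs.append((i, x, y))
    return diffs


# Two invented captures: same action, different card.
packet_card_a = bytes.fromhex("01 00 2a 00 10 ff")
packet_card_b = bytes.fromhex("01 00 2b 00 10 ff")

for offset, x, y in diff_offsets(packet_card_a, packet_card_b):
    print(f"offset {offset}: {x:#04x} -> {y:#04x}")
```

Bytes that change only when the card changes are good candidates for the field that encodes the card; repeating the experiment with a third card helps rule out counters or timestamps that change on every packet.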
The thing you need to check is that you are not breaking the EULA. I don't know how the game works, but most of the games I have come across have an EULA that prohibits (i.e. you get banned for) doing any of the things I have mentioned.

Resources