Actions on Google - Close mic without closing the app, or workaround suggestions - dialogflow-es

Is it possible to close the mic without closing the app?
Or any suggestions to the below explained situation are very welcome:
I've found some posts already asking for this, but they are about a year old, so I wonder whether there's something new.
I'm using conv.close('some message not prompting');. That closes the mic, but also closes the app, which is not what I need.
The functionality I need is the same that AOG takes by default when showing a browsing carousel: it automatically closes the mic (but not the app), and the user can re-open the mic or tap on a suggestion chip to interact with your app directly, without the need to invoke it again.
It was suggested that I add a tail such as "What else can I do for you?" after each reply that doesn't prompt the user for new information, so I can keep the mic open. That sounds so unnatural that I really think it defeats the purpose of trying to sound natural with a bot.
There are many situations where you can expect the user to say something even though you're not asking for anything. A simple example is telling a joke: you can expect the user to laugh, criticize, ask for another joke, or make some other comment. In this case, closing the mic (and the app) makes no sense, and adding a "Do you want to hear another joke?" tail doesn't sound good after the joke, especially if you're telling one after another.
The goal is to avoid being rejected by the AOG review team for leaving the mic open.
Any ideas are welcome. Thanks in advance.

You don't need to explicitly prompt for "what next", but you do need to make clear if you're expecting something further from the user. The easiest way to do that is to rotate through some prompts. (Libraries like multivocal make this easier.)
The notion of "closing the microphone without closing the conversation" leads to the question of "ok, how do they close the conversation?" And for the simple one-off scenario that you've described, that isn't always obvious.
That said, there are a few thoughts about how you can approach it depending on your needs.
If there is a reason you need to close the microphone, yet still allow the user to issue a command while still in the Action, you can consider sending a Media object as part of your response. When the playback finishes, your Action will be triggered to let you know, and you can either prompt the user again (and play more audio) or eventually agree to close the conversation. Users would interrupt the audio with "Hey Google" followed by a command in your Action.
Another approach for things that are truly "one off", but where users may want to follow up in rarer cases, is to keep track of user state (if you need to reference it in a follow-up) and close the conversation. The user would be able to "re-start" the conversation if they need to, either through normal invocation or through a deep-linking invocation. This does close the conversation, but makes it easy to restart.

I have written two Google Home apps now, and both were initially rejected whenever one of my intents did not ask a follow-up question. I agree that always asking a follow-up question can sound a little unnatural. My answer logic (on my endpoint) appends a random follow-up question from a pre-defined list to vary things a bit.
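The "random follow-up from a pre-defined list" idea from the answers above could look roughly like the sketch below. It is not tied to any Actions on Google library; the prompt wording and the `with_follow_up` helper are made up for illustration, and the only extra assumption is that you remember the last prompt used so it is not repeated back to back.

```python
import random

# Hypothetical follow-up prompts; the wording is illustrative only.
FOLLOW_UPS = [
    "Want another one?",
    "What would you like to do next?",
    "Anything else?",
]

def with_follow_up(answer, last_prompt=None, rng=random):
    """Append a follow-up prompt, avoiding an immediate repeat of the last one."""
    choices = [p for p in FOLLOW_UPS if p != last_prompt]
    prompt = rng.choice(choices)
    # Return the prompt too, so the caller can store it for the next turn.
    return f"{answer} {prompt}", prompt

reply, used = with_follow_up("Here is a joke.", last_prompt="Anything else?")
```

The returned `reply` is what the endpoint would send back with the mic left open; `used` would be kept in conversation state for the next turn.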

simultaneous text editing

I'm trying to build a web page that lets several users edit a text document simultaneously.
To program something like a chat in Node.js is not very difficult, but working on the same text makes it kinda tricky.
I thought about sending the character position and the changed characters, but if someone types something before that position in the meantime, the change would be placed at the wrong position.
What's the best way to exchange modifications between my clients?
You could use Socket.io to build your real-time application.
I just found a nice blog article about real-time editing, see here.
It's also providing a link to the github project and to an open source online editor project.
Take a look and try to understand how they do stuff like this. Good luck!
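For the wrong-position problem raised in the question, one common idea (the basis of operational transformation) is to shift an incoming edit by whatever concurrent edits were applied before it. The sketch below is a minimal, assumed illustration for inserts only. The `Insert` type and the `transform` and `apply` helpers are hypothetical names, deletes are not handled, and ties at the same position are resolved in favor of the edit that was applied first.

```python
from dataclasses import dataclass

@dataclass
class Insert:
    pos: int   # character offset in the document version the author saw
    text: str

def transform(incoming: Insert, applied: Insert) -> Insert:
    """Shift `incoming` so it targets the same spot after `applied` has run."""
    if applied.pos <= incoming.pos:
        return Insert(incoming.pos + len(applied.text), incoming.text)
    return incoming

def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]

doc = "hello world"
a = Insert(0, ">> ")   # user A types at the start
b = Insert(11, "!")    # user B appends at the end, concurrently
doc = apply(doc, a)
doc = apply(doc, transform(b, a))
print(doc)  # >> hello world!
```

A real editor also needs a server-side ordering of operations and a rule for deletes, which is what the linked projects deal with.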
Two people cannot be manipulating the same object at the same time from a different place. You basically have two choices.
1. Let them take turns with the object
2. Duplicate it if they both want it, but that doesn't sound like it would end well

ALSA async callbacks?

The ALSA documentation seems to be very lacking... Basically, I need to play sounds asynchronously, be able to stop (all) sounds, and get a callback when one has finished playing successfully.
I can mostly do the first two; it's just the last one I'm having trouble with.
Does anyone know any snippets that may enlighten me?
Further Details:
Basically, the user will be browsing a collection of sounds. When they hover over one, it should play; when they move on to the next one, the current one should stop very quickly and the next should play, and so on. This will happen fast. The last one they heard in its entirety should be selected (hence the need for a callback when a whole sound plays successfully; at the moment the one selected isn't necessarily the one they last heard, due to threading).
I don't really want to use any libraries other than libasound.

Sanitizing Input from irc

I was thinking of writing an IRC bot (or bot extension) that lets users play certain text-based games. The bot would start the game and send parts of certain lines the users enter to the game's stdin. Outside the bot's own channel, a regexp would match a game signal; for example, "rbot gamename enter the forest" sends "enter the forest" to the game.
The game's standard output would be cached by the bot and then piped to the channel. For example,
let us rejoice for
the duck has been defeated
gets read into a line cache inside the bot, and then the bot sends it to the appropriate channel as
gamename: let us rejoice for
gamename: the duck has been defeated
But I'm sort of worried about the tricky things people on IRC might do. Would stripping all non-printable characters be safe enough? If a program quits (say they enter the quit command for the game), what happens when you try writing to the file descriptor for that program's stdin (an error)? Any other potential problems?
Note I'm going to run this on Linux or *BSD, so I don't need to worry about Windows-specific things.
Some basics you might want to consider:
It's much safer to allow text through that you know is safe, than to try and filter out text that you think might not be safe. The games probably accept only alpha-numeric characters, so check to see if the input contains only those values, and deny anything else.
Run the bot under an account that has the lowest permissions possible, and as limited access to the rest of the machine as possible. If you can sandbox or virtualize it completely, even better.
You should be watching the PID of the child process for termination, and decide what to do if it exits, restart it or fail further commands, exit the bot, etc.
There are any number of possible security issues whenever exposing services to the network, you would do well to read about general secure programming topics, a quick google search turns up this how-to for example.
It pays to be paranoid. Without following proper secure programming practices, the most you can hope for is that nobody gives an honest try at breaking it.
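The allow-list and child-process points above can be sketched as follows. The allowed character set, the 200-character limit, and the `is_safe_command` and `send_line` helpers are assumptions chosen for illustration, not something the game or any bot framework prescribes. On the question of writing to the stdin of a game that has quit: in Python, that write (or the flush) raises `BrokenPipeError`, a subclass of `OSError`.

```python
import re
import subprocess
import sys

# Assumption: the game only needs letters, digits and spaces, up to 200 chars.
ALLOWED = re.compile(r"[A-Za-z0-9 ]{1,200}")

def is_safe_command(text):
    """Accept only input made entirely of allow-listed characters."""
    return ALLOWED.fullmatch(text) is not None

def send_line(proc, text):
    """Write one validated line to the game's stdin.

    Returns False if the input is rejected or the game is no longer running.
    """
    if not is_safe_command(text):
        return False
    if proc.poll() is not None:      # child has already exited
        return False
    try:
        proc.stdin.write((text + "\n").encode("ascii"))
        proc.stdin.flush()
    except OSError:                  # includes BrokenPipeError (EPIPE)
        return False
    return True
```

Here `proc` is expected to be a `subprocess.Popen` object started with `stdin=subprocess.PIPE`. The bot can treat a `False` return as the cue to restart the game or report that it has ended.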
escaping quotes and pipe will keep you safe from most stuff
" ' |
It doesn't matter where the user input is coming from; it matters how it's used.
The one attack that specifically affects IRC is CRLF injection. This will come up if you echo user input back over IRC. An attacker could try to inject a carriage return (\r) and line feed (\n). This type of injection affects many protocols, including HTTP and SMTP. In the case of IRC, the attacker would be able to force your bot to send a command to the IRCD (like /join or /kick or /ban). Look at an ASCII table and filter out all 0x0A (\n) and 0x0D (\r) bytes. In most cases the newline alone is enough to inject a command, so make sure you filter both.
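A minimal sketch of that filtering, assuming the bot builds raw IRC lines itself. The `irc_safe` and `privmsg` helpers are hypothetical names. Removing NUL as well goes slightly beyond what the answer lists, but the IRC protocol does not allow NUL, CR, or LF inside a message.

```python
def irc_safe(text):
    """Drop CR, LF and NUL so user text cannot end the IRC message early."""
    return text.translate({0x0D: None, 0x0A: None, 0x00: None})

def privmsg(channel, text):
    # Only builds the raw line; sending it over the socket is left out.
    # The channel name is assumed to come from trusted configuration.
    return f"PRIVMSG {channel} :{irc_safe(text)}\r\n"

line = privmsg("#games", "gamename: hi\r\nKICK #games victim")
```

With the filter in place, the injected `KICK` stays inside the text of a single `PRIVMSG` instead of becoming a second command.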
Make sure you read over OWASP A1: Injection, especially if you are using user input in a SQL query or invoking a process on the command line.

online trading bot

I want to code a trading bot for Magic: The Gathering Online. This bot should wait until someone offers to trade, accept, look through the cards available from the other trader (the information is shown on screen), and perform other similar functions. I have several questions:
How can it know that someone is offering a trade?
How can it know that the other trader has some card (the information is stored in pictures)?
I just cannot imagine right now how to do it. I have no experience with this; until now I've only been coding console programs for my physics needs.
First, you should note that some online games forbid bots, as they can give certain players unfair advantages. The MTGO Terms of Service do not seem to say anything about this, though they do put restrictions on anything that might negatively impact the service. They have also said that there is a possibility they will add an API in the future, so they don't seem to be against the idea of automation, but are not supporting it at the moment. Tread carefully here, but it looks like it should be OK to write a bot as long as it is not harmful or abusive. This is not legal advice, and it would be a good idea to ask the folks who run MTGO for permission. Edit: since I wrote this, it has been pointed out that there are lots of bots already, so there should be no problem writing one.
Assuming that it is not forbidden by the terms of service, but they do not have an API, you will have to find a way to detect what's going on, and control the game automatically. There's a pretty good series of articles on writing poker bots (archived copy), which has some good information on how to inject a DLL into an application, scrape the screen, and control the application. That might provide you with a starting point for doing this sort of thing.
You might also want to look for tools that other people have already written for doing this. It looks like there are several existing MTGO bots, but they all seem a bit sketchy (there have been some reports of them stealing passwords), so be careful there.
Edit
Since this answer still seems to be getting upvotes, I should probably update it with some more useful information. Since writing this, I have found a great UI automation system called Sikuli. It allows you to write programs in Python that automate a GUI. It includes image recognition features which make it very easy to recognize buttons, cards, and other UI elements; you just take a screenshot, crop it down to include just the thing you're interested in, and do fuzzy image matching (so that changing backgrounds and the like doesn't cause the match to fail). It even includes a custom IDE that allows you to embed those screenshots directly in your source code, so you can see exactly what the code is looking for. Here's an example from the documentation (apologies for the code formatting, doing images inline in code is not easy given StackOverflow's restricted subset of HTML):
def resizeApp(app, dx, dy):
    switchApp(app)
    # Pattern() took an inline screenshot in the original; the image is not reproduced here
    corner = find(Pattern().targetOffset(3,14))
    drop_point = corner.getTarget().offset(dx, dy)
    dragDrop(corner, drop_point)

resizeApp("Safari", 50, 50)
This is much easier to get started with than the techniques mentioned in the article linked above, of injecting a DLL into the process you are debugging. Sikuli runs entirely at the UI level, so you never have to modify the program you are automating or worry about changes to the internals breaking your script.
One thing it is a bit poor at is handling text; it has OCR features, but they aren't all that good. If the text is selectable, however, you can select the text, copy it, and then look directly at the clipboard.
If I were to write a bot to automate something without a good API or text-based interface, Sikuli is probably the first tool I would reach for.
This answer is constructed from my comments.
What you are trying to do is hard, any way you try and do it.
Arguably the easiest way to do it is to totally mimic the user, so the application presses buttons, moves the mouse, etc. The downside is that it depends on being able to recognise what is on the screen.
This is easier if you can alter the game's files, as you can then just re-skin the required cards (change their image or texture) to a single unique colour.
The major downside is that you have to keep the game as the top-level window or run the game in a virtual machine. Neither of these is ideal.
Another method is to read the process's memory. You may be able to find a list of memory locations, which would make things simpler; otherwise it involves a lot of hard work with a debugger to deduce the memory addresses. It also helps (a lot) to be able to understand assembly.
The third method is to intercept the packets and alter them. This is easier than the method above, as it is (at least for me) easier to reverse engineer the protocol because you have less information to deal with. It is just a matter of setting up a packet sniffer, performing an action with one variable changed (for example, the card), and comparing the differences.
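The "change one variable and compare" step can be sketched as a byte-by-byte diff of two captured payloads. The payloads below are made up for illustration and say nothing about the real MTGO protocol; `diff_offsets` is a hypothetical helper that assumes both captures have the same length.

```python
def diff_offsets(a: bytes, b: bytes):
    """Return (offset, byte_in_a, byte_in_b) for each position that differs."""
    if len(a) != len(b):
        raise ValueError("payload lengths differ; align them before comparing")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Made-up payloads standing in for two captures of the same action
# performed with a different card.
first = bytes.fromhex("01 0a 00 2f ff")
second = bytes.fromhex("01 0a 00 31 ff")
print(diff_offsets(first, second))  # [(3, 47, 49)]
```

The offsets that change between captures are the candidates for the field that encodes the variable you altered.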
The thing you need to check is that you are not breaking the EULA. I don't know how the game works, but most of the games I have come across have an EULA that prohibits (i.e. you get banned for) doing any of the things I have mentioned.

Should loading/startup dialogs be locked on top?

Introduction
I have been so annoyed by applications that have a startup dialog configured as Always on Top.
By start dialog I mean the annoying box that tells you what program you just opened (and probably opened on purpose so useless information), who the program is registered to (most likely you, more uselessness), and some other random application specific information. Some have loading bars that indicate startup progress, but otherwise they seem basically useless except to show that your program is actually starting (to prevent the user from opening 5 instances during the loading process because they think it's not open yet).
The worst though is when this useless information is displayed over all the useful browsers and documents that I may be working on at the time, making me wait until the application is loaded before I can effectively work on something else again.
Most apps have the sense not to do this, but some still continue the practice.
Now that I'm done ranting...
My Question(s)
My question is: why?
What is the point of all this?
Why does/did anyone ever do this?
What was the reasoning behind it?
Is anyone else annoyed by this?
Is there any benefit to the end user or developer to use this technique?
Should I ever use a startup dialog like this and when?
Anyone else have other comments/rants/suggestions to share with the community?
I believe you are talking about the 'splash screen.'
Some reasons for it:
It is often thought of as 'branding' in that it reinforces the company's logo.
It can contain some useful info such as version number.
Most importantly, it gives you the impression that the slow-starting app is doing something rather than just being unresponsive.
I share in your annoyance of Always-on-Top dialogs.
The use of the dialog is to let the user know that the application is actually doing something and hasn't hung. Back in the day, when IO wasn't as fast, it was reasonable to assume that the system is not very responsive while the application is doing its IO, so it was reasonable to bar the user from doing anything else while the application was loading.
Now that IO is faster and more concurrent, there isn't a need for hogging the user's focus and the start-up dialogs should simply be regular dialogs.
Since the main use of the dialog is to indicate application progress, I like a visual indication, such as a progress bar. Eclipse does this well IMO.
It's good to have a startup screen so the user gets some feedback that they actually launched the app.
But putting that screen always on top of existing windows, so the user is sure to see it, is a definite no - you should not presume that your application is more important than the user's web browser, email, etc. Unfortunately many developers have a very self-centric view of the world, and think that their application is the most important thing the user is running.
Just gives visual indication that the app is initializing.
Startup dialogs or splash screens are completely useless for the most part. The only time I think they are of any use is if, as you suggested, a particular program takes a bit of time to load. Some kind of progress indication would be nice.
The only example I can think of is Photoshop. Not strictly necessary but it does take a moment to load.
