Why can't EventData.GetBytes() be called before sending?

I'm working with Azure Event Hubs. Initially, to try to calculate batch size when sending data, I had code similar to the following, which calls EventData.GetBytes:
EventHubClient client; // initialized before the relevant code
EventData curr = new EventData(data);
// Setting a partition key, and other operations.
long itemLength = curr.GetBytes().LongLength;
client.SendAsync(curr);
Unfortunately, I would receive an exception from the SDK code:
The message body cannot be read multiple times. To reuse it store the value after reading.
While removing the ultimately unnecessary call to GetBytes meant that I could send messages, the rationale for this exception is puzzling. Calling GetBytes() twice in a row is an easy way to reproduce the same exception, but even a single call means that the EventData cannot be sent successfully.
It seems likely that a Message is used underneath, and that it throws an exception if the body is read more than once, as Message.GetBody documents; however, there is no documentation to this effect on EventData's methods GetBodyStream, GetBody (with serializer), GetBody, or GetBytes.
I imagine this should either be documented or corrected, since currently it surfaces as an unpleasant surprise on a separate thread.

Have you tried using EventData.SerializedSizeInBytes to get the size? That is a much more accurate way to get the size for batching calculations.
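For illustration, here is a rough sketch of batch-size accounting with SerializedSizeInBytes under the older Microsoft.ServiceBus.Messaging SDK; client, payloads, and maxBatchSizeInBytes are assumed to be defined elsewhere:
long batchSize = 0;
var batch = new List<EventData>();
foreach (byte[] payload in payloads)
{
    var eventData = new EventData(payload);
    // Unlike GetBytes(), reading SerializedSizeInBytes does not consume the body.
    long itemLength = eventData.SerializedSizeInBytes;
    if (batch.Count > 0 && batchSize + itemLength > maxBatchSizeInBytes)
    {
        await client.SendBatchAsync(batch); // flush the current batch
        batch.Clear();
        batchSize = 0;
    }
    batch.Add(eventData);
    batchSize += itemLength;
}
if (batch.Count > 0)
{
    await client.SendBatchAsync(batch);
}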

Do AsyncIO stream writers/readers require manually ensuring that all data is sent/received?

When dealing with sockets, you need to make sure that all data is sent/received, since you may receive incomplete chunks of data when reading. From the docs:
In general, they return when the associated network buffers have been filled (send) or emptied (recv). They then tell you how many bytes they handled. It is your responsibility to call them again until your message has been completely dealt with.
Emphasis mine. It then shows sample implementations that ensure all data has been handled in each direction.
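The kind of loop those sample implementations use looks roughly like this (my sketch, for plain blocking sockets rather than asyncio):
def send_all(sock, data):
    # Keep calling send() until the whole buffer has been handed to the OS.
    total_sent = 0
    while total_sent < len(data):
        sent = sock.send(data[total_sent:])  # may accept only part of the buffer
        if sent == 0:
            raise RuntimeError("socket connection broken")
        total_sent += sent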
Is the same true though when dealing with AsyncIO wrappers over sockets?
For read, it seems to be required as the docs mention that it "[reads] up to n bytes.".
For write though, it seems like as long as you call drain afterwards, you know that it's all sent. The docs don't explicitly say that it must be called repeatedly, and write doesn't return anything.
Is this correct? Do I need to check how much was read using read, but can just drain the StreamWriter and know that everything was sent?
I thought that my above assumptions were correct, then I had a look at the example TCP Client immediately below the method docs:
import asyncio

async def tcp_echo_client(message):
    reader, writer = await asyncio.open_connection(
        '127.0.0.1', 8888)

    print(f'Send: {message!r}')
    writer.write(message.encode())

    data = await reader.read(100)
    print(f'Received: {data.decode()!r}')

    print('Close the connection')
    writer.close()

asyncio.run(tcp_echo_client('Hello World!'))
And it doesn't do any kind of checking. It assumes everything is both read and written the first time.
For read, [checking for incomplete read] seems to be required as the docs mention that it "[reads] up to n bytes.".
Correct, and this is a useful feature for many kinds of processing, as it allows you to read new data as it arrives from the peer and process it incrementally, without having to know how much to expect at any point. If you do know exactly how much data to expect and need to read it in full, you can use readexactly.
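For instance, a sketch assuming a simple length-prefixed frame format:
async def read_frame(reader):
    # readexactly() keeps reading until it has the full count,
    # raising asyncio.IncompleteReadError if EOF arrives first.
    header = await reader.readexactly(4)
    length = int.from_bytes(header, 'big')
    return await reader.readexactly(length)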
For write though, it seems like as long as you call drain afterwards, you know that it's all sent. The docs don't explicitly say that it must be called repeatedly, and write doesn't return anything.
This is partially correct. Yes, asyncio will automatically keep writing the data you give it in the background until all is written, so you don't need to (nor can you) ensure it by checking the return value of write.
However, a sequence of stream.write(data); await stream.drain() will not pause the coroutine until all data has been transmitted to the OS. This is because drain doesn't wait for all data to be written, it only waits until it hits a "low watermark", trying to ensure (misguidedly according to some) that the buffer never becomes empty as long as there are new writes. As far as I know, in current asyncio there is no way to wait until all data has been sent - except for manually tweaking the watermarks, which is inconvenient and which the documentation warns against. The same applies to awaiting the return value of write() introduced in Python 3.8.
This is not as bad as it sounds simply because a successful write itself doesn't guarantee that the data was actually transmitted to, let alone received by the peer - it could be languishing in the socket buffer, or in network equipment along the way. But as long as you can rely on the system to send out the data you gave it as fast as possible, you don't really care whether some of it is in an asyncio buffer or in a kernel buffer. (But you still need to await drain() to ensure backpressure.)
The one time you do care is when you are about to exit the program or the event loop; in that case, a portion of the data being stuck in an asyncio buffer means that the peer will never see it. This is why, starting with 3.7, asyncio provides a wait_closed() method which you can await after calling close() to ensure that all the data has been sent. One could imagine a flush() method that does the same, but without having to actually close the socket (analogous to the method of the same name on file objects, and with equivalent semantics), but currently there are no plans to add it.
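A sketch of that shutdown sequence (Python 3.7+, where writer is an asyncio.StreamWriter):
async def send_and_close(writer, data):
    writer.write(data)
    await writer.drain()        # applies backpressure, but is not a full flush
    writer.close()
    await writer.wait_closed()  # per the above, ensures buffered data is sent
                                # before the program moves on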

Concurrency between Meteor.setTimeout and Meteor.methods

In my Meteor application, to implement a turn-based multiplayer game server, the clients receive the game state via publish/subscribe and can call a Meteor method sendTurn to send turn data to the server (they cannot update the game state collection directly).
var endRound = function(gameRound) {
    // check if gameRound has already ended /
    // if round results have already been determined
    // --> yes: do nothing
    // --> no:
    //     determine round results
    //     update collection
    //     create next gameRound
};
Meteor.methods({
    sendTurn: function(turnParams) {
        // find gameRound data
        // validate turnParams against gameRound
        // store turn (update "gameRound" collection object)
        // have all clients sent in turns for this round?
        //   yes --> call "endRound"
        //   no --> wait for other clients to send turns
    }
});
To implement a time limit, I want to wait for a certain time period (to give clients time to call sendTurn), and then determine the round result - but only if the round result has not already been determined in sendTurn.
How should I implement this time limit on the server?
My naive approach to implement this would be to call Meteor.setTimeout(endRound, <roundTimeLimit>).
Questions:
What about concurrency? I assume I should update collections synchronously (without callbacks) in sendTurn and endRound, but would this be enough to eliminate race conditions? (Reading the 4th comment on the accepted answer to this SO question, which notes that synchronous database operations also yield, I doubt it.)
In that regard, what does "per request" mean in the Meteor docs in my context (the function endRound called by a client method call and/or in server setTimeout)?
In Meteor, your server code runs in a single thread per request, not in the asynchronous callback style typical of Node.
In a multi-server / clustered environment, (how) would this work?
Great question, and it's trickier than it looks. First off I'd like to point out that I've implemented a solution to this exact problem in the following repos:
https://github.com/ldworkin/meteor-prisoners-dilemma
https://github.com/HarvardEconCS/turkserver-meteor
To summarize, the problem basically has the following properties:
Each client sends in some action on each round (you call this sendTurn)
When all clients have sent in their actions, run endRound
Each round has a timer that, if it expires, automatically runs endRound anyway
endRound must execute exactly once per round regardless of what clients do
Now, consider the properties of Meteor that we have to deal with:
Each client can have exactly one outstanding method call to the server at a time (unless this.unblock() is called inside a method); subsequent method calls from that client wait for the first to finish.
All timeout and database operations on the server can yield to other fibers
This means that whenever a method call goes through a yielding operation, values in Node or the database can change. This can lead to the following potential race conditions (these are just the ones I've fixed, but there may be others):
In a 2-player game, for example, two clients call sendTurn at exactly the same time. Both perform a yielding operation to store the turn data. Both methods then check whether 2 players have sent in their turns, find the affirmative, and endRound gets run twice.
A player calls sendTurn right as the round times out. In that case, endRound is called by both the timeout and the player's method, resulting in it running twice again.
Incorrect fixes to the above problems can result in starvation where endRound never gets called.
You can approach this problem in several ways, either synchronizing in Node or in the database.
Since only one Fiber can actually change values in Node at a time, if you don't call a yielding operation you are guaranteed to avoid possible race conditions. So you can cache things like the turn states in memory instead of in the database. However, this requires that the caching is done correctly and doesn't carry over to clustered environments.
Move the endRound code outside of the method call itself, using something else to trigger it. This is the approach I've taken which ensures that only the timer or the final player triggers the end of the round, not both (see here for an implementation using observeChanges).
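As a hypothetical sketch of that idea (collection and field names are invented for illustration, and this is not a complete exactly-once implementation):
// A single server-side observer ends the round; the clients' method calls
// and the timer only update the round document, never call endRound directly.
GameRounds.find({ended: false}).observeChanges({
    changed: function(id, fields) {
        var round = GameRounds.findOne(id);
        if (round.turns.length === round.playerCount) {
            endRound(round);
        }
    }
});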
In a clustered environment you will have to synchronize using only the database, probably with conditional update operations and atomic operators. Something like the following:
var currentVal;
while (true) {
    currentVal = Foo.findOne(id).val; // yields
    if (Foo.update({_id: id, val: currentVal}, {$inc: {val: 1}}) > 0) {
        // Operation went as expected
        // (your code here, e.g. endRound)
        break;
    } else {
        // Race condition detected, try again
    }
}
The above approach is primitive and probably results in bad database performance under high loads; it also doesn't handle timers, but I'm sure with some thinking you can figure out how to extend it to work better.
You may also want to see this timers code for some other ideas. I'm going to extend it to the full setting that you described once I have some time.

Swift - Load information from Core Data faster

Hey, how can I fetch a large amount of data, like 1000 rows, without the app freezing?
I tried this:
dispatch_async(dispatch_get_main_queue(), {
    // code here
})
but when I execute the request self.context.executeFetchRequest, it gives me fatal error: unexpectedly found nil while unwrapping an Optional value. (I also get an error forcing me to add self. in front of the function inside the closure.) I also tried a background queue:
let queue: dispatch_queue_t = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
dispatch_async(queue, { () -> Void in
    // code
})
but I get the same error...
I use an NSFetchRequest, put the results into an NSArray, loop over the results in a for loop, and inside the loop sort the results into dictionaries.
1000 records is not very much for Core Data. Just fetch them on the main thread.
I would advise against "sorting results into dictionaries". You should think about how your app logic interacts with the data and simply fetch the objects you need from the Core Data persistent store.
For example, if you want to display 1000 lines in a table view, use NSFetchedResultsController, which is optimized for this sort of situation - so you will avoid memory and performance issues without any work.
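A minimal sketch of such a setup, assuming a hypothetical "Item" entity with a "name" attribute (using the Swift 1.x-era API that matches the question):
let request = NSFetchRequest(entityName: "Item")
request.sortDescriptors = [NSSortDescriptor(key: "name", ascending: true)]
request.fetchBatchSize = 20 // rows are faulted in as the table scrolls

let frc = NSFetchedResultsController(fetchRequest: request,
    managedObjectContext: context, sectionNameKeyPath: nil, cacheName: nil)

var error: NSError?
if frc.performFetch(&error) {
    // drive the table view from frc.sections / frc.fetchedObjects
}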
If you really need threading with Core Data (which I doubt), I would advise not starting with GCD but using Core Data's own concurrency APIs, such as performBlock and child contexts with their own queues. But most likely you won't have to worry about those.
Finally, your error really refers to some code that you have not posted. It has to do with Swift's optionals. For example, if you declare a variable as var variable: String? (or you use an API that returns such a type), you can unwrap it with variable! if you are sure it is not nil. If it is nil, you will get the above crash.
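A tiny illustration of that crash and the safe alternative:
var name: String? = nil   // an optional that happens to be nil
// println(name!)         // would crash: unexpectedly found nil while unwrapping
if let unwrapped = name { // optional binding unwraps safely
    println(unwrapped)
} else {
    println("name is nil")
}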

Most efficient way to determine if there are messages in an Azure Storage Queue

I'm beginning a project which will involve Azure Queue (not Service Bus).
I'm trying to figure out what is the best practice to find out whether there are messages waiting in the Queue.
AFAIK, there are two methods for that:
Using the ApproximateMessageCount property of the Queue object
Calling GetMessage; if the returned value is null, there are no messages.
Which one is better performance-wise? Is there any difference?
From a billing POV, I understand there is a transaction cost for both of them, is that correct?
Thanks!
GetMessage is both faster and cheaper. GetMessage is also more correct from a logic perspective, since the message count includes both messages that have already been retrieved by another reader and messages that have expired without being deleted.
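A rough sketch with the classic storage SDK (Microsoft.WindowsAzure.Storage; the queue name is a placeholder):
CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
CloudQueue queue = account.CreateCloudQueueClient().GetQueueReference("myqueue");

CloudQueueMessage message = queue.GetMessage();
if (message == null)
{
    // no visible messages right now
}
else
{
    // process it, then delete it so it doesn't reappear
    // once the visibility timeout elapses
    queue.DeleteMessage(message);
}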
I have also used this code in the past:
var cnnString = "the connection string";
var queueName = "the queue name";
var nsManager = NamespaceManager.CreateFromConnectionString(cnnString);
return nsManager.GetQueue(queueName).MessageCount;
That said - this was from about 4 months ago.
Any reason you need to do this (i.e. are you not just consuming messages off the queue?)

Connecting Bluetooth to multiple devices using SPP

I can connect to two devices simultaneously from an Android-based cell phone using SPP, but once I open the input streams (like socket.getInputStream()), one of them returns 0 from the stream, that is, no data is available on that stream.
For example, thread A (thA) and thread B (thB) are connected to device A (devA) and device B (devB) respectively. So thA uses input stream A (inA) to receive data from devA, and thB uses input stream B (inB) to receive data from devB. As follows:
devA --->inA --->thA
devB --->inB --->thB
It works fine if I connect to each device separately. However, in the case of connecting two devices at the same time, then only inA or inB has data on it.
If this has happened to you, please share your experience with me; it would be much appreciated!
Thank you in advance.
YT
Why are you using reflection for createRfcommSocket?
device.getClass().getMethod("createRfcommSocket", new Class[] {int.class});
as opposed to
try {
    mBTSocket = mBTDevice.createRfcommSocketToServiceRecord(UUID_RFCOMM_GENERIC);
} catch (Exception e1) {
    msg("connect(): Failed to bind to RFCOMM by UUID. msg=" + e1.getMessage());
    return false;
}
The reflection can easily be the source of problems. If there is no reason to use it then avoid it at all costs.
Furthermore, if the getClass call fails, then your "m" variable will be null, and you're not trapping for that situation. You should also generalize your exceptions: instead of catching specific exception types, just catch Exception, as in my code snippet above. It's much easier than adding a catch for every possible type of exception that might get thrown.
I'm confused about what you're doing with the handlers, it doesn't make sense to me. Can you remove the handler code to simplify things?
There's just too much complication. Remove all the reflection and the extra catches.
It's good coding practice to make your methods one page or less. When a method is more than a page it is too complicated and it makes reading it AND debugging it very difficult. Reduce the size of your methods by creating other methods to perform common tasks.
Separate your connect() logic, from your I/O logic. You should have a method for sending data, and a method for receiving data, a method for connect(). Then once you get those working, chunk up and create methods for higher level I/O for sending and receiving whole blocks of data. then perfect those methods and keep growing up and up.
In my code, the read, write, connect, and all I/O methods are only 1-20 lines each. Keep them very simple, because your I/O logic is at the core of your app and it needs to be clean, clean, clean.
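As a hypothetical sketch of that structure (field names like mBTDevice and mBTSocket are assumptions based on the snippets above):
private boolean connect() {
    try {
        mBTSocket = mBTDevice.createRfcommSocketToServiceRecord(UUID_RFCOMM_GENERIC);
        mBTSocket.connect();
        return true;
    } catch (Exception e) {
        msg("connect(): " + e.getMessage());
        return false;
    }
}

private boolean send(byte[] data) {
    try {
        mBTSocket.getOutputStream().write(data);
        return true;
    } catch (Exception e) {
        return false;
    }
}

private int receive(byte[] buffer) {
    try {
        // returns the number of bytes read, or -1 at end of stream
        return mBTSocket.getInputStream().read(buffer);
    } catch (Exception e) {
        return -1;
    }
}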
