Runtime errors for Azure Stream Analytics job

I tried to modify an existing Azure Stream Analytics job by adding one more temporary result set.
But when I run the SA job, it throws a runtime error and the watermark delay keeps increasing.
Below is the existing SAQL in the Stream Analytics job:
-- Reading from Event Hub
WITH INPUTDATASET AS (
    SELECT
        udf.udf01([signals2]) AS flat
    FROM [signals2]
    PARTITION BY PartitionId
    WHERE [signals2].ABC IS NOT NULL
),
INPUT1 AS (
    SELECT
        ID,
        SIG1,
        SIG2
    FROM [signals2] AS input
    WHERE GetArrayLength(input.XYZ) >= 1
)
-- Dump the data from the above result sets into Cosmos DB
I tried to add the following temporary result set to the SAQL:
INPUT2 AS (
    SELECT
        ID,
        SIG3,
        SIG4
    FROM [signals2] AS input
    WHERE GetArrayLength(input.XYZ) = 0
)
Now when I start the SA job, it throws a runtime error.
When I fetch the logs, these are the error entries I received:
TimeGenerated,Resource,"Region_s",OperationName,"properties_s",Level
"2020-01-01T01:10:10.085Z",SAJOB01,"Japan West","Diagnostic: Diagnostic Error","{""Error"":null,""Message"":""First Occurred: 01\/01\/2020 01:10:10 | Resource Name: signals2 | Message: Maximum Event Hub receivers exceeded. Only 5 receivers per partition are allowed.\r\nPlease use dedicated consumer group(s) for this input. If there are multiple queries using same input, share your input using WITH clause. \r\n "",""Type"":""DiagnosticMessage"",""Correlation ID"":""xxxx""}",Error
"2020-01-01T01:10:10.754Z",SAJOB01,"Japan West","Receive Events: ","{""Error"":null,""Message"":""We cannot connect to Event Hub partition [25] because the maximum number of allowed receivers per partition in a consumer group has been reached. Ensure that other Stream Analytics jobs or Service Bus Explorer are not using the same consumer group. The following information may be helpful in identifying the connected receivers: Exceeded the maximum number of allowed receivers per partition in a consumer group which is 5. List of connected receivers - AzureStreamAnalytics_xxxx_25, AzureStreamAnalytics_xxxx_25, AzureStreamAnalytics_zzz_25, AzureStreamAnalytics_xxx_25, AzureStreamAnalytics_xxx_25. TrackingId:xxx_B7S2, SystemTracker:eventhub001-ns:eventhub:ehub01~26|consumergrp01, Timestamp:2020-01-01T01:10:10 Reference:xxx, TrackingId:xxx_B7S2, SystemTracker:eventhub001-ns:eventhub:ehub01~26|consumergrp01, Timestamp:2020-01-01T01:10:10, referenceId: xxx_B7S2"",""Type"":""EventHubBasedInputQuotaExceededError"",""Correlation ID"":""xxxx""}",Error...
For this Stream Analytics job, the input signals2 already has a dedicated consumer group (consumergrp01). There are only 3 readers on a partition for this consumer group, yet it still throws the "Maximum Event Hub receivers exceeded" error. Why is that?

Message: Maximum Event Hub receivers exceeded. Only 5 receivers per
partition are allowed.
I think the error message is clear about the root cause of the runtime error. Please refer to this statement in the Event Hubs documentation:
There can be at most 5 concurrent readers on a partition per consumer
group; however it is recommended that there is only one active
receiver on a partition per consumer group. Within a single partition,
each reader receives all of the messages. If you have multiple readers
on the same partition, then you process duplicate messages. You need
to handle this in your code, which may not be trivial. However, it's a
valid approach in some scenarios.
You could follow the suggestions in the error message: "Please use dedicated consumer group(s) for this input. If there are multiple queries using same input, share your input using WITH clause." In your query, INPUTDATASET, INPUT1 and the new INPUT2 each read directly FROM [signals2], and every step that reads directly from the Event Hub input opens its own receivers on each partition of that consumer group; together with receivers that can linger briefly while a job restarts, or anything else reading the same consumer group, that is how you hit the limit of 5 even though you count only 3 readers. So isolate the Event Hub read into a single temporary result set (one WITH step) and have the other temporary result sets, including INPUT2, select from that step instead of from [signals2] directly.

Related

Azure Function with Event Hub trigger receives weird amount of events

I have an Event Hub and an Azure Function connected to it. With small amounts of data everything works well, but when I tested it with 10 000 events, I got very peculiar results.
For test purposes I send the numbers from 0 to 9999 into the Event Hub and log the data in Application Insights and in Service Bus. In the first test I can see in Azure that the hub received exactly 10 000 events, but Service Bus and Application Insights got all messages between 0 and 4500 and only every second message after 4500 (so about 30% were lost). In the second test I got all messages from 0 to 9999, but every second message between 3200 and 3500 was duplicated. I would like to get every message exactly once; what did I do wrong?
public async Task Run([EventHubTrigger("%EventHubName%", Connection = "AzureEventHubConnectionString")] EventData[] events, ILogger log)
{
    int id = _random.Next(1, 100000);
    _context.Log.TraceInfo("Started. Count: " + events.Length + ". " + id); // AI log
    foreach (var message in events)
    {
        // log with ASB
        var mess = new Message();
        mess.Body = message.EventBody.ToArray();
        await queueClient.SendAsync(mess);
    }
    _context.Log.TraceInfo("Completed. " + id); // AI log
}
By using EventData[] events, you are reading events from the hub in batch mode; that's why you see X events being processed at once and then the next batch a moment later. Instead of EventData[] you can simply use EventData, so that each invocation handles a single event (see the sketch after this answer).
When you send events to the hub, check that all events are sent with the same partition key if you want to try batch processing; otherwise they can be split across several partitions, and throughput depends on the TUs (throughput units), PUs (processing units) or CUs (capacity units) of your tier.
Refer to the throughput limits for the Basic, Standard and Premium tiers in the documentation, for example:
Egress: up to 2 MB per second or 4096 events per second.
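For illustration, a minimal sketch of the single-event binding suggested above, assuming the in-process Azure Functions model, the same %EventHubName% and AzureEventHubConnectionString settings as the question, and a Service Bus QueueClient supplied from elsewhere (the class, function and field names are placeholders):

using System.Threading.Tasks;
using Azure.Messaging.EventHubs;       // EventData
using Microsoft.Azure.ServiceBus;      // Message, QueueClient
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class ForwardSingleEvent
{
    // Assumed to be registered via dependency injection in the Functions startup.
    private readonly QueueClient _queueClient;

    public ForwardSingleEvent(QueueClient queueClient) => _queueClient = queueClient;

    // Binding to a single EventData means one invocation per event
    // instead of one invocation per batch.
    [FunctionName("ForwardSingleEvent")]
    public async Task Run(
        [EventHubTrigger("%EventHubName%", Connection = "AzureEventHubConnectionString")] EventData evt,
        ILogger log)
    {
        log.LogInformation("Processing event with sequence number {Sequence}", evt.SequenceNumber);

        // Forward the raw payload to Service Bus, as in the original function.
        await _queueClient.SendAsync(new Message(evt.EventBody.ToArray()));
    }
}

Per-event dispatch is easier to reason about, but it lowers throughput compared with batch processing, so it is a trade-off rather than a universal fix.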
There are a couple of things likely happening, though I can only speculate with the limited context that we have. Knowing more about the testing methodology, tier of your Event Hubs namespace, and the number of partitions in your Event Hub would help.
The first thing to be aware of is that the timing between when an event is published and when it is available in a partition to be read is non-deterministic. When a publish operation completes, the Event Hubs broker has acknowledged receipt of the events and taken responsibility for ensuring they are persisted to multiple replicas and made available in a specific partition. However, it is not a guarantee that the event can immediately be read.
Depending on how you sent the events, the broker may also need to route events from a gateway by performing a round-robin or applying a hash algorithm. If you're looking to optimize the time from publish to availability, taking ownership of partition distribution and publishing directly to a partition can help, as can ensuring that you're publishing with the right degree of concurrency for your host environment and scenario.
With respect to duplication, it's important to be aware that Event Hubs offers an "at least once" guarantee; your consuming application should expect some duplicates and needs to be able to handle them in the way that is appropriate for your application scenario.
Azure Functions uses a set of event processors in its infrastructure to read events. The processors collaborate with one another to share work and distribute the responsibility for partitions between them. Because collaboration takes place using storage as an intermediary to synchronize, there is an overlap of partition ownership when instances are scaled up or scaled down, during which time the potential for duplication is increased.
Functions makes the decision to scale based on the number of events that it sees waiting in partitions to be read. In the case of your test, if your publication pattern increases rapidly and Functions sees "the event backlog" grow to the point that it feels the need to scale by multiple instances, you'll see more duplication than you otherwise would for a period of 10-30 seconds until partition ownership normalizes. To mitigate this, using an approach of gradually increasing speed of publishing over a 1-2 minute period can help to smooth out the scaling and reduce (but not eliminate) duplication.
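Since delivery is at least once, the consuming side has to tolerate replays. Purely as an illustration (not part of the original answer), a small sketch of one way to skip duplicates, assuming the producer stamps a unique id into EventData.Properties under a key called "MessageId" (a made-up convention) and that per-instance, in-memory de-duplication is acceptable:

using System.Collections.Concurrent;
using Azure.Messaging.EventHubs;

// Remembers recently seen message ids so duplicate deliveries can be skipped.
// State is per host instance; a shared store (cache, table, ...) would be needed
// to filter duplicates across scaled-out Function instances.
public class RecentMessageFilter
{
    private readonly ConcurrentDictionary<string, byte> _seen = new();
    private readonly ConcurrentQueue<string> _order = new();
    private readonly int _capacity;

    public RecentMessageFilter(int capacity = 100_000) => _capacity = capacity;

    // Returns true if the event has not been seen before and should be processed.
    public bool ShouldProcess(EventData evt)
    {
        if (!evt.Properties.TryGetValue("MessageId", out var value) || value is not string id)
            return true;                 // nothing to de-duplicate on; process it

        if (!_seen.TryAdd(id, 0))
            return false;                // duplicate delivery; skip it

        _order.Enqueue(id);
        while (_order.Count > _capacity && _order.TryDequeue(out var oldest))
            _seen.TryRemove(oldest, out _);

        return true;
    }
}

The trigger body would then call ShouldProcess(message) before sending to Service Bus and simply skip events that return false.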

What happens to events in the Event Hub after Stream Analytics does its work and routes them to Service Bus?

I have the following scenario:
The event hub (EH1) is configured with a retention policy of 7 days.
Producers publish events to EH1.
The events from EH1 are routed by Stream Analytics (SA) (after performing certain calculations over 1-hour time windows) to Service Bus, which gets both the raw events (as messages) and the summarized calculations.
Let's say that over the 24-hour period of day 1, producers publish 1 million events to EH1.
SA kicks in and routes the raw events as well as the summarized calculations (over 1-hour periods) to Service Bus.
Assume that after day 1, no events are pushed to EH1 for the next 15 days.
Questions:
How long will the 1 million raw events (from day 1) stay in EH1?
Will those 1 million raw events (from day 1) still be there from day 2 (after the 1st hour) through day 7 (because the retention policy is 7 days)? Or will they be gone after day 1, once SA has finished processing them? If neither, what else happens?
What metrics should I look at in EH1 to prove whatever the answer to (1) and (2) is?
First of all, you should take a look at consumer groups.
In short, a consumer (any app or code that receives events from the Event Hub) always reads events through a consumer group (call it cg_1 here). Reading does not remove events from the hub; the consumer simply tracks its own position (checkpoint/offset) within cg_1, so the next time it reads through cg_1 it continues after the events it has already processed and does not read them again.
But if you switch to another consumer group (say you create a new one named cg_2), you can read all of the retained data again, even though it has already been read through cg_1.
So for your questions:
#1:
Since you have configured a retention policy of 7 days, the events (raw data) will be kept in the Event Hub for 7 days, regardless of whether SA has already processed them. If the events have already been received via a consumer group, you will not receive them again via that consumer group (your reader continues from its checkpoint), but you can use another consumer group to receive the data again.
#2:
Similar to question 1: the raw events will be stored in the Event Hub for the retention period you have configured.
#3:
There is no such metric, but you can easily write client code, create a new consumer group, and read the data back to check whether it is still there.
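A minimal sketch of that check, assuming the Azure.Messaging.EventHubs package and a freshly created consumer group on EH1 (here "cg_verify", a placeholder); it reads every partition from the earliest retained event and counts what is still there:

using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs.Consumer;

class RetentionCheck
{
    static async Task Main()
    {
        // Placeholder connection details; the consumer group must already exist on EH1.
        const string connectionString = "<event-hubs-namespace-connection-string>";
        const string eventHubName = "eh1";
        const string consumerGroup = "cg_verify";

        await using var consumer = new EventHubConsumerClient(consumerGroup, connectionString, eventHubName);
        using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
        long count = 0;

        try
        {
            // startReadingAtEarliestEvent: true walks each partition from the oldest
            // retained event, so day-1 events still within the 7-day retention show up.
            await foreach (var partitionEvent in consumer.ReadEventsAsync(
                startReadingAtEarliestEvent: true, cancellationToken: cts.Token))
            {
                count++;
            }
        }
        catch (OperationCanceledException)
        {
            // Expected once the 30-second reading window elapses.
        }

        Console.WriteLine($"Events still readable within retention: {count}");
    }
}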

Stream Analytics query hits size limit

I'm new to Azure Stream Analytics. I have an Event Hub as the input source, and now I'm trying to execute a simple query on this stream. An example query looks like this:
SELECT
    COUNT(*)
INTO [output1]
FROM [input1] TIMESTAMP BY Time
GROUP BY TumblingWindow(second, 10)
So I want to count the events which arrived within a certain time frame.
When executing this query, I always get the following error:
Request exceeded maximum allowed size limit
I have already narrowed down the time window being checked, and I'm certain that the number of events within this time frame is not very big (at most a few hundred), so I'm not sure how to avoid this error.
Do you have a hint?
Thanks!
Request exceeded maximum allowed size limit
This error (I believe it should be more explicit) indicates that you have hit the Azure Stream Analytics resource and object limits.
It's not just about quantity, it's also about size. Please check the size of your source input events, or try to reduce the window size and test again.
1. Does the record size of the source query mean that one event can only have 64 KB, or does this parameter mean 64 K events?
It means the size of one event should be below 64 KB.
Is there a possibility to use Stream Analytics to select only certain
subfields of the event or is the only way to reduce the event size
before it is sent to the event hub?
As far as I know, ASA only collects data and processes it, so the size depends entirely on the source side and your query SQL. Since you need to use COUNT, I'm afraid you have to do something on the Event Hub side. Please refer to my thoughts:
Use an Event Hub trigger in an Azure Function: when an event streams into the Event Hub, the function is triggered, picks only a subset of the key-values, and saves the result into another Event Hub (namespace), purely to reduce the size of the source events (see the sketch below). Since you only need to COUNT records, I think this would work for you.
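A possible sketch of that forwarding function, assuming the in-process Azure Functions model, JSON event bodies, and placeholder names (%SourceHub%, %SlimHub%, the connection settings, and the id/time fields are all invented for illustration):

using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Microsoft.Azure.WebJobs;

public static class SlimEvents
{
    // Reads from the large source hub and forwards a trimmed JSON document
    // (only the fields the downstream ASA query needs) to a second hub.
    [FunctionName("SlimEvents")]
    public static async Task Run(
        [EventHubTrigger("%SourceHub%", Connection = "SourceHubConnection")] EventData[] events,
        [EventHub("%SlimHub%", Connection = "SlimHubConnection")] IAsyncCollector<string> output)
    {
        foreach (var evt in events)
        {
            using var doc = JsonDocument.Parse(evt.EventBody.ToString());
            var root = doc.RootElement;

            // Keep only the properties needed for counting; the names are placeholders
            // and GetProperty will throw if a field is missing from the payload.
            var slim = new
            {
                id = root.GetProperty("id").GetString(),
                time = root.GetProperty("time").GetString()
            };

            await output.AddAsync(JsonSerializer.Serialize(slim));
        }
    }
}

The Stream Analytics job would then use the slimmed-down hub as [input1], keeping each record well under the 64 KB record limit.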

What is the function of a partition in the Microsoft Azure IoT Hub?

When I go to create an IoT Hub, the Azure platform asks for the number of partitions for the IoT Hub. I have read about partitions in the topic "purpose of Azure iot hub device-to-cloud partitions", but I don't understand what the relation is between consumer groups and partitions, and how it relates to reading the data.
Partitions are primarily there to support scaling. The default behavior is that messages sent to the hub are divided over those partitions.
So let's say we have 4 partitions (1-4) containing some messages (A-L):
Partition 1: A, E, I
Partition 2: B, F, J
Partition 3: C, G, K
Partition 4: D, H, L
Let's also say that we have defined 2 consumer groups, C1 and C2. When you start a process to read the messages from the hub, you specify a consumer group (if you don't, the default consumer group is used).
So let us have 2 readers: one (R1) configured to read using C1, and the other (R2) configured to read using C2.
Both readers have access to the same partitions and messages, but each has its own progress tracker. This is the important part!
In a real-world scenario you might have a stream of data, let's say log messages. The requirements are that all log messages have to be written to a database, and that messages with a higher log level need to be sent as a high-priority alert via SMS. If you had just one consumer group (C1, read by R1), all messages would eventually be processed, but if the database writes are slow it could very well take a while between a message being delivered and that message being processed.
Now, if we have 2 consumer groups, the reader (R2) for the second consumer group (C2) can skip all low-log-level messages and only process the critical messages that need to be sent via SMS. This reader will get through the messages a lot faster than the one that has to write every message to a database.
TL;DR: multiple consumer groups can be used to separate slow stream processors from faster stream processors. Each consumer group tracks its own progress in the stream.
So in the end progress may look like this:
Consumer group 1 (doing some time consuming processing)
Partition 1: A, E, I
Partition 2: B, F, J
Partition 3: C, G, K
Partition 4: D, H, L
Consumer group 2 (doing some fast message processing)
Partition 1: A, E, I
Partition 2: B, F, J
Partition 3: C, G, K
Partition 4: D, H, L
Each group marks off its own processed messages: the fast consumer group (2) will be much further along in each partition than the slow consumer group (1).
Edit
If I have two readers in the same consumer group, does each reader have its own progress, or is the progress per consumer group?
Each reader is connected to an Event Hub partition through a consumer group, and progress is stored per partition per consumer group. So in a sense a reader has its own progress, but readers are short-lived: a new instance of a reader connecting to the same partition will continue where the previous reader left off.
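To make the independent progress concrete, a small sketch (not from the original answer) that reads the same IoT Hub data through two consumer groups via the Event Hub-compatible endpoint; the connection details are placeholders, and C1/C2 are assumed to already exist on the hub:

using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs.Consumer;

class ConsumerGroupDemo
{
    // Reads a few seconds' worth of events through the given consumer group.
    static async Task ReadAsync(string consumerGroup)
    {
        const string connectionString = "<event-hub-compatible-endpoint-connection-string>";
        const string eventHubName = "<event-hub-compatible-name>";

        await using var consumer = new EventHubConsumerClient(consumerGroup, connectionString, eventHubName);
        using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));

        try
        {
            // Every consumer group sees every message in every partition;
            // only the reading position it keeps for itself differs.
            await foreach (var pe in consumer.ReadEventsAsync(
                startReadingAtEarliestEvent: true, cancellationToken: cts.Token))
            {
                Console.WriteLine($"[{consumerGroup}] partition {pe.Partition.PartitionId}: {pe.Data.EventBody}");
            }
        }
        catch (OperationCanceledException) { /* reading window elapsed */ }
    }

    static async Task Main()
    {
        // Both groups read the same stream independently, like R1 and R2 above.
        await Task.WhenAll(ReadAsync("C1"), ReadAsync("C2"));
    }
}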
Peter's answer is absolutely correct; let me add a few more words.
A partition in an Event Hub is like a lane on a highway: instead of building one big, wide road we use a lane system so traffic flows easily, and when one lane is blocked the whole highway is not blocked.
Using this technique, Event Hubs/IoT Hub lets us ingest millions of records per second.

Release a group when the number of messages in the group reaches a number defined in another message

I have a batch process, and we receive a START message in a queue and an END message in the same queue. After the START message we receive thousands of messages in 3 other queues, which we filter, enrich, aggregate and finally transform to JSON. (We can call this pipeline MAIN_PIPE.)
After that START message, an adapter reads from the database the total number of elements, delivered in a single message. (We can call this pipeline COUNTER_PIPE.)
And after the END message, once we have processed ALL the messages, we have to send a request to an external service.
So we need to count all the processed (JSON-converted) messages in MAIN_PIPE and compare that count to the number from COUNTER_PIPE.
How can I compare them?
Would you mind also describing how you read from those 3 queues? It isn't clear to me where the correlation is between START and all the messages of the batch. If that is a regular message-driven channel adapter, there is a case where we may start receiving those messages while there is still no START and no info about the count in the DB.
Anyway, I'd make it like this:
The START and END messages, as well as all the messages in the batch, must carry the same correlationKey to let an Aggregator form a batch at the end.
Since the group in this case is based on the count anyway, you have no choice but to send even the messages discarded by the filter to the aggregator. Those can be simple error stubs, so that you can distinguish them properly in the aggregator's release function.
The releaseStrategy of the aggregator must iterate over the group to find the message carrying the count and compare it with the group size + 2 (the START & END messages).
Does that make sense to you?
