Purpose of Azure IoT Hub device-to-cloud partitions

When creating a new Azure IoT Hub, you are asked how many device-to-cloud partitions you need. You can select between 2 and 32 partitions for standard tiers.
I understand that the SKU and number of units determine the maximum daily quota of messages that you can send to IoT Hub, and that it is recommended to shard your devices across multiple IoT hubs to smooth traffic bursts. However, device-to-cloud partitions need clarification.
1>> What is the purpose of those device-to-cloud partitions under a single IoT hub?
2>> How are we supposed to take advantage of those IoT Hub device-to-cloud partitions?
Thanks.

1>> What is the purpose of those device-to-cloud partitions under a single IoT hub?
The partition count is a setting of the Event Hub-compatible messaging endpoint (messages/events) built into Azure IoT Hub. In other words, "partitions" is a concept that belongs to Event Hubs.
Event Hubs is designed to allow a single partition reader per consumer group. A single partition within a consumer group cannot have more than 5 concurrent readers connected at any time. More partitions enable you to have more concurrent readers processing your data, improving your aggregate throughput.
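As a rough illustration of how the partition count bounds reader concurrency, here is a minimal sketch. It only encodes the two figures quoted above (one reader per partition per consumer group as the intended design, five as the hard cap); the function name is made up for this example.

```python
MAX_READERS_PER_PARTITION = 5  # cap per partition within one consumer group, as quoted above


def reader_limits(partition_count: int) -> tuple[int, int]:
    """Return (intended, maximum) concurrent readers for one consumer group."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    intended = partition_count  # one reader per partition
    maximum = partition_count * MAX_READERS_PER_PARTITION
    return intended, maximum


for count in (2, 4, 32):
    intended, maximum = reader_limits(count)
    print(f"{count} partitions: {intended} intended readers, {maximum} at most")
```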
Ref: Built-in endpoint: messages/events and How many partitions do I need?
2>> How are we supposed to take advantage of those IoT Hub device-to-cloud partitions?
Event Hubs has two primary models for event consumption: direct receivers and higher-level abstractions, such as EventProcessorHost. Direct receivers are responsible for their own coordination of access to partitions within a consumer group.
Ref: Event consumers.
More information about the partitioning model of Azure Event Hubs is here.
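To make "their own coordination" more concrete, here is one possible scheme a set of direct receivers could agree on: a static round-robin split so that every partition has exactly one reader. This is only a sketch in plain Python; the function and the worker numbering are hypothetical, and it is not how EventProcessorHost balances partitions internally.

```python
def assign_partitions(partition_ids: list[str], worker_count: int) -> dict[int, list[str]]:
    """Statically split partitions among workers so each partition has one owner."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    assignment: dict[int, list[str]] = {worker: [] for worker in range(worker_count)}
    for index, partition_id in enumerate(partition_ids):
        assignment[index % worker_count].append(partition_id)
    return assignment


partition_ids = [str(i) for i in range(4)]  # a hub created with 4 partitions
plan = assign_partitions(partition_ids, worker_count=2)
print(plan)  # {0: ['0', '2'], 1: ['1', '3']}
```

With more workers than partitions, the extra workers get an empty list, which mirrors the point that the partition count caps useful reader parallelism.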

Related

Azure IoT Hub routing to Event Hub

I created an Event Hub with 32 partitions and am injecting messages into it through IoT Hub message routing.
If I connect Stream Analytics to my Event Hub and look at the input, I get this message: "While sampling data, no data was received from '31' partitions"
I thought that with multiple partitions the messages would be distributed over all of them and not land in just one.
If this is intended, then what is the use of having multiple partitions when using Stream Analytics with Event Hub as input?
I think IoT Hub uses the device ID as the partition key.
So if you receive data from only one device, all of its messages will go to one partition of the Event Hub.
You'll make better use of the partitions by having more devices sending data to IoT Hub.
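The effect can be simulated locally. The sketch below assumes the partition is chosen by hashing the partition key modulo the partition count; SHA-256 is only a stand-in, since the hash the service actually uses is internal, and the device names are invented.

```python
import hashlib
from collections import Counter


def pick_partition(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition index (stand-in hash, not the service's real one)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def partitions_used(device_ids: list[str], partition_count: int) -> Counter:
    """Count how many messages land on each partition."""
    return Counter(pick_partition(device_id, partition_count) for device_id in device_ids)


# 1000 messages from a single device: the key never changes, so one partition gets everything.
one_device = partitions_used(["sensor-1"] * 1000, 32)
# 1000 messages from 1000 different devices: the keys spread over many partitions.
many_devices = partitions_used([f"sensor-{i}" for i in range(1000)], 32)

print(len(one_device), "partition(s) used by one device")
print(len(many_devices), "partition(s) used by many devices")
```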

Azure Event Hub - Partitions usecase question

I'm new to Azure Event Hubs and I'm having a hard time understanding the Partitions.
I have the following scenario:
1 Event Hub Namespace
1 actual Event Hub
2 Partitions in the Event Hub
2 Consumer groups
1 Event Producer
2 Event Consumers, one per Consumer group
The Event Producer sends out 10 events to the Event Hub. The events get distributed to the partitions with a round-robin mechanism. So the Event Hub looks like this:
Partition 1: [0] [2] [5] [6] [8]
Partition 2: [1] [3] [4] [7] [9]
When the Event Consumers start reading, each consumer would end up with only a part of the events, like so:
Consumer 1: Gets events 0,2,5,6,8
Consumer 2: Gets events 1,3,4,7,9
Is it true that a Consumer group can only access a subset of the Partitions?
My assumption is that the Event Hub architecture supports broadcasting of events to multiple consumers. And that every consumer wants all the events.
But it seems to me that Event Hub isn't designed to have all consumers get all the events, but I don't understand why that would be useful.
Can anyone help me understand Partitions?
Each Event Hubs partition is a persistent stream of events that is available to all consumers, regardless of which consumer group they are associated with. Any consumer can read from any partition at any point in the event stream.
Partitions are used to help scale resources to support a greater degree of concurrency and increase throughput for the Event Hub. Generally speaking, the more partitions that are in use, the more concurrent operations the Event Hub can handle. More information can be found in the Event Hubs overview.
My assumption is that the Event Hub architecture supports broadcasting of events to multiple consumers.
Not quite. Consumers are responsible for pulling events from the partitions of an Event Hub; events are not pushed to them. Any consumer with permissions can connect to a partition and read independently. Events are not removed once read; they remain in the partition until their age exceeds the retention period.
But it seems to me that Event Hub isn't designed to have all consumers get all the events
That is not correct. Event Hubs exposes the events for any consumer wishing to read them. Using a client like the EventProcessorClient from the Event Hubs SDK allows an application to consume from all partitions without having to manage each partition consumer individually.
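The pull model described above can be sketched with a toy in-memory model: partitions are append-only lists, and each consumer keeps its own read position per partition. The class names are hypothetical and this does not use the Event Hubs SDK; it only shows why two consumers in different consumer groups both end up with all ten events from the question's scenario.

```python
class Partition:
    """Append-only event stream; reading never removes events."""

    def __init__(self) -> None:
        self.events: list[int] = []

    def append(self, event: int) -> None:
        self.events.append(event)

    def read_from(self, offset: int) -> list[int]:
        return self.events[offset:]


class Consumer:
    """Pulls from every partition and tracks its own offset for each one."""

    def __init__(self, consumer_group: str, partitions: list[Partition]) -> None:
        self.consumer_group = consumer_group
        self.partitions = partitions
        self.offsets = [0] * len(partitions)

    def poll(self) -> list[int]:
        received: list[int] = []
        for index, partition in enumerate(self.partitions):
            batch = partition.read_from(self.offsets[index])
            self.offsets[index] += len(batch)
            received.extend(batch)
        return received


partitions = [Partition(), Partition()]
for event_id in range(10):
    partitions[event_id % 2].append(event_id)  # round-robin producer

consumer_1 = Consumer("group-a", partitions)
consumer_2 = Consumer("group-b", partitions)
seen_1 = sorted(consumer_1.poll())
seen_2 = sorted(consumer_2.poll())
print(seen_1 == seen_2 == list(range(10)))  # True: both consumers see every event
```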

IoT Hub Routing Messages to Only One Partition of Event Hub

I have a data pipeline set up in Azure where I send messages to an IoT Hub, which then routes those messages to an Event Hub. When I read from the Event Hub using the standard EventProcessorHost method, I find that only one of the partitions is being read from. I assume that only one partition is actually having messages routed to it. I have not specified a partition key anywhere and expected that the messages would be routed to all of the partitions of the Event Hub using round robin (as per the documentation at https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-programming-guide).
How can I configure my setup to route messages to all partitions of the event hub?
Like I said in the comment:
Is it possible you are only receiving data from one device? IoT Hub does automatic partitioning based on the deviceId, so the partition affinity might be the cause.

Do multiple units mean multiple connection strings in Azure IoT Hub and Event Hubs?

Are connection strings independent of the number of units we create in IoT Hub and throughput units in Event Hubs?
Is there only one connection string for a single IoT hub, with multiple units created only when scaling is necessary?
Units determine how many messages the IoT Hub/Event Hub can handle in a time period.
So it decides the scale.
More units mean it can handle more data, but it also costs more.
Quote from Event Hubs docs:
Throughput units: Pre-purchased units of capacity that control the throughput capacity of Event Hubs.
Quote from the IoT Hub pricing page:
Number of units depends on number of messages required for your IoT solution. For example, each unit of S1 or B1 IoT Hub can handle 400,000 messages a day.
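For sizing, the quoted figure turns into a simple ceiling division. The sketch below assumes the 400,000 messages per day per S1/B1 unit from the quote and treats every message as one billable message, ignoring any message-size metering rules; the fleet numbers are invented.

```python
import math

MESSAGES_PER_UNIT_PER_DAY = 400_000  # S1/B1 figure from the quote above


def units_needed(device_count: int, messages_per_device_per_day: int) -> int:
    """Smallest number of units whose combined daily quota covers the traffic."""
    total_messages = device_count * messages_per_device_per_day
    return max(1, math.ceil(total_messages / MESSAGES_PER_UNIT_PER_DAY))


# 1000 devices sending one message per minute: 1,440,000 messages per day.
print(units_needed(1000, 24 * 60))  # 4
```

The connection string stays the same in either case; only the unit count on the hub changes.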

Can I create thousands of event hubs in one Azure Event Hubs namespace

I need to send messages from a few thousand devices to a central hub and be able to get a live stream of messages for a specific device from that hub. So far, Azure Event Hubs seems to be the cheapest option in terms of message count. An Event Hubs namespace allows you to create distinct event hubs in it.
Can I create a few thousand such hubs, one per device?
Is it a good idea? What could be the potential drawbacks?
How is the price calculated: per namespace or per event hub? (I think per namespace, but I cannot find this information.)
If per namespace, does it mean that purchased throughput units are shared among all event hubs? If yes, will a single Event Hubs namespace with 1000 event hubs consume the same amount of resources as a single namespace with a single event hub that receives messages from 1000 devices?
No, you are limited to 10 Event Hubs per namespace.
Event Hub per device is not the recommended usage. Usual scenario is to put all messages from all devices to the same Event Hub, and then you can separate them again in the processing side. This will scale much better.
Event Hubs quotas
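Separating the devices again on the processing side can be as simple as grouping by a device identifier carried with each event. A minimal sketch, assuming each event is a dictionary with hypothetical "device_id" and "body" fields:

```python
from collections import defaultdict


def split_by_device(events: list[dict]) -> dict[str, list[str]]:
    """Group a mixed stream of events into one ordered list per device."""
    streams: dict[str, list[str]] = defaultdict(list)
    for event in events:
        streams[event["device_id"]].append(event["body"])
    return dict(streams)


mixed_stream = [
    {"device_id": "dev-1", "body": "t=20.1"},
    {"device_id": "dev-2", "body": "t=18.4"},
    {"device_id": "dev-1", "body": "t=20.3"},
]
per_device = split_by_device(mixed_stream)
print(per_device["dev-1"])  # ['t=20.1', 't=20.3']
```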
Azure Event Hubs is an event ingestion service to which event publishers send events. The events are available in the event hub partitions, to which different consumer groups subscribe. The partitions can be designed to accept only specific kinds of events.
You can also create multiple event hubs within an Event Hubs namespace. You can create a maximum of 10 event hubs per namespace, 32 partitions within an event hub, and 20 consumer groups per event hub. So you can use event hub partitions to separate the events from the event publishers and consume them on the processing side very easily.
The pricing is at the event hub level and not at the namespace level. Depending on the tier you choose, you get different features, such as:
Basic tier:
You can have only 1 consumer group
Standard and Dedicated tier:
You can create up to 20 consumer groups.
For example,
If you choose the Basic or Standard tier and East US as the region, you will be charged $0.028 per million events for ingress and $0.015 per unit/hour for throughput.
If you choose the Dedicated tier, you will be charged $6.849 per hour, which includes unlimited ingress and throughput, but a minimum of 4 hours is charged.
The main advantages of the Dedicated tier are that the message retention period is 7 days, whereas in the Basic and Standard tiers it is just 1 day, and that the message size can be up to 1 MB, whereas in the Basic and Standard tiers it is just 256 KB.
Refer https://azure.microsoft.com/en-in/pricing/details/event-hubs/.
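As a worked example of the Basic/Standard arithmetic, the sketch below plugs in the two rates quoted in this answer. Treat both rates as assumptions: they are the figures given here for East US and may not match the current pricing page, and the traffic numbers are invented.

```python
INGRESS_USD_PER_MILLION_EVENTS = 0.028  # rate quoted in the answer above
THROUGHPUT_USD_PER_UNIT_HOUR = 0.015    # rate quoted in the answer above


def estimated_cost(events: int, throughput_units: int, hours: int) -> float:
    """Ingress charge plus throughput-unit charge, rounded to cents."""
    ingress = events / 1_000_000 * INGRESS_USD_PER_MILLION_EVENTS
    throughput = throughput_units * hours * THROUGHPUT_USD_PER_UNIT_HOUR
    return round(ingress + throughput, 2)


# 100 million events over a 30-day month (720 hours) with 2 throughput units:
# ingress 100 * 0.028 = 2.80, throughput 2 * 720 * 0.015 = 21.60
print(estimated_cost(100_000_000, throughput_units=2, hours=720))  # 24.4
```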
