Stream Analytics to Azure SQL (duplicate data)
My use case:
IoT Hub -> Event Hub -> ASA -> Azure SQL (staging table)
The problem is that I get duplicate data in my staging table.
For testing purposes I sent exactly 10k JSON messages to IoT Hub, but my staging table then contains far more rows, over 40k.
Is there something I have to adjust in Event Hub or in ASA? Is it normal for ASA to process duplicate messages?
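Before changing anything, it can help to check whether the extra rows are exact copies of the same events. The following is a minimal T-SQL sketch for that check. The table name dbo.StagingTable is hypothetical, and the column names DeviceId, Time and Value are assumed to match the fields used in the answer's query; adjust both to the real schema.

```sql
-- Hypothetical staging table and assumed column names; adjust to your schema.
-- Lists every event that appears more than once, most duplicated first.
SELECT
    DeviceId,
    [Time],
    [Value],
    COUNT(*) AS Copies
FROM dbo.StagingTable
GROUP BY
    DeviceId,
    [Time],
    [Value]
HAVING COUNT(*) > 1
ORDER BY Copies DESC;
```

If every event shows roughly the same number of copies, the duplication is probably systematic (for example, the same data reaching the job more than once) rather than occasional redelivery.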
Solution 1:[1]
Is there something I have to adjust in Event Hub or in ASA? Is it normal for ASA to process duplicate messages?
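You can remove the duplicates in the ASA query itself. COUNT(DISTINCT ...) returns the number of unique non-null values, and grouping by the event fields together with System.Timestamp() collapses events that share the same values and the same timestamp into a single row before any further aggregation.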
For example:
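    -- Step 1: collapse events with identical Value, DeviceId and timestamp into one row.
    WITH Temp AS (
        SELECT
            COUNT(DISTINCT Time) AS CountTime,
            Value,
            DeviceId
        FROM
            Input TIMESTAMP BY Time
        GROUP BY
            Value,
            DeviceId,
            System.Timestamp()
    )
    -- Step 2: aggregate the deduplicated rows per device over 10-minute tumbling windows.
    SELECT
        AVG(Value) AS AverageValue,
        DeviceId
    INTO Output
    FROM Temp
    GROUP BY
        DeviceId,
        TumblingWindow(minute, 10)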
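If the staging table should receive the individual events rather than an average, the same grouping idea can be used without the second aggregation step. This is a sketch only, reusing the Input, Output, Time, Value and DeviceId names from the query above; the EventTime alias is hypothetical. It assumes duplicates carry exactly the same timestamp and payload, so copies that differ in any grouped field would still pass through.

```sql
-- Sketch: pass each distinct event through once, without averaging.
-- Events with the same DeviceId, Value and timestamp collapse into one output row.
SELECT
    DeviceId,
    Value,
    System.Timestamp() AS EventTime  -- hypothetical alias for the event timestamp
INTO Output
FROM Input TIMESTAMP BY Time
GROUP BY
    DeviceId,
    Value,
    System.Timestamp()
```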
You can refer to "Remove Duplicates from EventHub in Azure Stream Analytics", "Remove duplicate events in a window", and "Azure Stream Analytics: remove duplicates while aggregating".
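For rows that have already landed in the staging table, duplicates can also be cleaned up on the Azure SQL side. The following T-SQL is a sketch under assumptions: dbo.StagingTable is a hypothetical table name, and a row is treated as a duplicate when DeviceId, Time and Value all match. Run it against a copy of the data first.

```sql
-- Hypothetical table and assumed columns; adjust to your schema.
-- Numbers the rows within each group of identical events, then deletes all but the first.
WITH Ranked AS (
    SELECT
        ROW_NUMBER() OVER (
            PARTITION BY DeviceId, [Time], [Value]
            ORDER BY (SELECT NULL)  -- no preference for which copy is kept
        ) AS RowNum
    FROM dbo.StagingTable
)
DELETE FROM Ranked
WHERE RowNum > 1;
```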
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
