How to sort data based on one of the timestamp fields in a Druid Scan query
I'm using a Druid Scan query with the order parameter set to "ascending". It returns data ordered by the configured timestamp field, serverReceiveTime. I want to sort the data on a different timestamp field, streamingSegmentStartTime. According to the Scan query documentation, there is no sort argument that allows this.
ScanDruidQuery.builder()
    .dataSource(route.getDataSource())
    .intervals(IntervalParser.getIntervals(getSessionsQuery.getStartTime(), getSessionsQuery.getEndTime()))
    .filter(filterTranslator.translate(getSessionsQuery.getFilter()))
    .order(DRUID_DATA_SORT_ORDER)
    .columns(columnList)
    .context(new DruidQueryContext(genericQuery.getRequestId()))
    .limit(getSessionsQuery.getResultSize())
    .offset(NumberUtils.toInt(getSessionsQuery.getNextToken(), 0))
    .build();
Please let me know if there is any way to sort this data by streamingSegmentStartTime on the Druid side.
Solution 1:[1]
I'm not sure what your query is doing, so this might not help, but you can sort by other columns if you use a groupBy query.
Take a look at the sortByDimsFirst query context property of the groupBy query here: https://druid.apache.org/docs/latest/querying/groupbyquery.html#groupby-v2-configurations
If you make streamingSegmentStartTime the first entry in the query's dimensions list and set sortByDimsFirst to true in the query context, I think you can achieve what you want.
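As a rough illustration of that suggestion, here is a minimal sketch that builds the native groupBy query JSON by hand rather than through the builder library used in the question, since I don't know whether that library exposes a groupBy builder. The class name, the data source name "sessions", the interval, and the second dimension "sessionId" are all placeholders I made up for the example. It also assumes streamingSegmentStartTime is stored in a form that sorts correctly (for example a numeric column, or ISO-8601 strings).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Druid or of the builder library in the question.
class GroupBySortSketch {

    // Builds a native Druid groupBy query as a JSON string.
    // The first entry in `dimensions` is the column the rows are sorted by
    // first, because sortByDimsFirst is set to true in the query context.
    static String buildGroupByQuery(String dataSource, String interval, List<String> dimensions) {
        // Quote each dimension name and join them into a JSON array body.
        String dims = dimensions.stream()
                .map(d -> "\"" + d + "\"")
                .collect(Collectors.joining(", "));
        return "{\n"
                + "  \"queryType\": \"groupBy\",\n"
                + "  \"dataSource\": \"" + dataSource + "\",\n"
                + "  \"intervals\": [\"" + interval + "\"],\n"
                + "  \"granularity\": \"all\",\n"
                + "  \"dimensions\": [" + dims + "],\n"
                + "  \"context\": {\"sortByDimsFirst\": true}\n"
                + "}";
    }

    public static void main(String[] args) {
        // Placeholder values; replace with your own data source, interval, and columns.
        String query = buildGroupByQuery(
                "sessions",
                "2023-01-01T00:00:00Z/2023-01-02T00:00:00Z",
                Arrays.asList("streamingSegmentStartTime", "sessionId"));
        System.out.println(query);
    }
}
```

The resulting JSON would be posted to the Broker like any other native query. Keep in mind that a groupBy query collapses rows that share the same dimension values, so every column you need back has to appear in the dimensions list (or as an aggregator).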
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Sergio Ferragut |
