'ClickHouse TTL settings with WHERE condition
I have a table with the following DDL:
CREATE TABLE default.pricing_rate_latest_state_union_test
(
`organization_id` Int32,
`organization_name` String,
`scraping_time` AggregateFunction(max, DateTime),
`amount_from_raw` Float64,
`amount_to_raw` Float64,
`key_param_id` LowCardinality(String),
`amount_from` AggregateFunction(argMax, Float64, DateTime),
`amount_to` AggregateFunction(argMax, Float64, DateTime),
`source` LowCardinality(String)
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/*****/pricing_rate_latest_state_union_test',
'{replica}')
ORDER BY (key_param_id,
amount_from_raw,
amount_to_raw)
SETTINGS index_granularity = 8192;
For which I would like to add a TTL expression on scraping_time column, so that after certain number of days the corresponding records are dropped. This I can do easily with the following:
ALTER TABLE default.pricing_rate_latest_state_union_test
MODIFY TTL finalizeAggregation(scraping_time) + INTERVAL 30 DAY;
This is works fine, but my end goal is to add condition describing which rows need to be deleted after expiration. What I'm trying is this:
ALTER TABLE default.pricing_rate_latest_state_union_test
MODIFY TTL finalizeAggregation(scraping_time) + INTERVAL 30 DAY
DELETE WHERE source = 'some_value';
Unfortunately this lead to the following error:
DB::Exception: Exception happened during execution of mutation '0000000000' with part 'all_0_0_0' reason: 'Code: 10, e.displayText() = DB::Exception: Not found column scraping_time in block. There are only columns: organization_id (version 21.8.13.6 (official build))'. This error maybe retryable or not. In case of unretryable error, mutation can be killed with KILL MUTATION query
According to this documentation it should be possible as there is the following example:
CREATE TABLE table_with_where
(
d DateTime,
a Int
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(d)
ORDER BY d
TTL d + INTERVAL 1 MONTH DELETE WHERE toDayOfWeek(d) = 1;
I believe my case doesn't work that smoothly due to the ReplicatedAggregatingMergeTree, so any help would be greatly appreciated here.
I'm also interested in the meaning of Not found column scraping_time in block. I know what part is in ClickHouse, but what is the block and why it doesn't see the scraping_time, but see the organization_id?
Solution 1:[1]
It looks like a bug or unexpected behavior because TTL is not designed to work with AggregateFunction.
You can construct TTL without WHERE
ALTER TABLE default.pricing_rate_latest_state_union_test
MODIFY TTL if (source = 'some_value', finalizeAggregation(scraping_time) + INTERVAL 30 DAY, toDateTime(0));
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Denny Crane |