'Clickhouse: how to use `Data Skipping Indexes` and `Manipulations With Data Skipping Indices` features in clickhouse?
I'm using the Data Skipping Indexes feature in clickhouse and i got confused about its usage. If i add a data skip index when i create the table like this:
CREATE TABLE MyTable
(
...
INDEX index_time TimeStamp TYPE minmax GRANULARITY 1
)
ENGINE =MergeTree()
...
When i query with TimeStamp filter condition the 'index_time' works. But if i didn't add index when creating table, alternatively, i added the index with Manipulations With Data Skipping Indicesfeature like this:
ALTER TABLEE MyTable ADD INDEX index_time TimeStamp TYPE minmax GRANULARITY 1
Then the index 'index_time' doesn't work.
My database is running on production so i can't recreate the table. I have to use the second way. Can anyone explain why it does not work or i used the feature in a wrong way?
Solution 1:[1]
It's quite right that
OPTIMIZE TABLE my_table_name FINAL;
does recreate the indexes set to the table. But there's some scenarios in a columnar DB where you want to avoid rewriting EVERYTHING. If you just add a single index to an already existing table with lots of data when you just rebuilt the new index which includes two steps:
Step 1 - Define the index
Creating the INDEX itself just defines what the index should do, which reflects in Clickhouse as metadata that's added to the table. Thus there is no index build up really, thus nothing will be faster. It's also a lightweight operation as it won't change data or build up any structures beside the table metadata.
It's important to understand any new incoming data will be indexed on insert, but any pre existing data is not included!
ALTER TABLE my_table_name ADD INDEX my_index(my_expression) TYPE minmax GRANULARITY 1
Note Clickhouse can index expressions, so it could simply be the column name as in the question or a more complex expression (e.g. my_index(price * sold_items * revshare)). The index will work on that expression only of course.
Step 2 - Build up (materialize) the index
After creation of the metadata the index for existing data need to be build up. This action is called materialize and needs to be explicitly triggered. Good thing is you can do this individually for any index that was added or changed. This is a heavy operation as it'll trigger work on the database.
ALTER TABLE my_table_name MATERIALIZE INDEX my_index;
Also have a look at the Clickhouse docs for Manipulating Data Skipping Indices
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | marc_s |
