'How to write a shell script that will alter tables drop partitions in hive for more than 10 days for multiple tables(by importing a hql file)

I wanna write a shell script that will drop partitions for multiple tables in hive that have partitions greater than 10 days, can someone please help me out

I am able to hardcode it for only one table, but I do not want to hardcode anything as I am need to deploy it with oozie, So I want to make everything dynamic



Solution 1:[1]

As a starting point, you can compute for the value of 10 days ago as reference point and pass it to your HQL script.

# shell script

dt_delete=$(date --date="10 days ago" +'%F') # 2022-03-23
hive -hivevar dt_delete=$dt_delete -f "daily_deletion.hql"

In your Hive script, you might need to find a way how to determine whether a partition is 10 days old. You can try to look at your columns and select the partitions to be deleted.

If your Hive table is partitioned by date, then it's going to more straightforward.

// hive script

alter table table_a drop partition(date_partition < "${dt_delete}");
alter table table_b drop partition(date_partition < "${dt_delete}");
alter table table_c drop partition(date_partition < "${dt_delete}");
...

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Clover