SQL Server 2019 database archiving
I need to archive data in a SQL Server 2019 database.
The database is about a terabyte in size. Many of its tables (approximately 70%) hold data that users rarely work with and only need to be able to view occasionally.
There are two suggestions for this:
Option 1: offered by my colleague (DevOps engineer)
Use partitioning by the "modification date" column. The problem is that not a single query uses this column, but my colleague insists that such partitioning will speed up queries thanks to the indexes on the other columns that the queries do use.
I think such partitioning should speed up inserts of new data, because the engine will know which partition to insert into: the one with the latest date. As for SELECT queries, I strongly doubt it helps.
That leads to my first question.
Let's say we have a table with columns A, B, and C.
- The table is partitioned by column A.
- There is an index on column B.
And the query is something like this:
SELECT *
FROM [table]
WHERE B = 'some value';
Will the query be able to work out which partition to read by using the index on B?
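To make the question concrete, here is a minimal T-SQL sketch of that setup, meant for checking the behaviour in an actual execution plan rather than as a definitive answer. All object names (pfByA, psByA, dbo.Demo, IX_Demo_B), the column types, and the boundary dates are assumptions made for illustration. As far as I understand, an index created without its own ON clause is aligned with the table's partition scheme, and partition elimination only happens when the WHERE clause constrains the partitioning column.

```sql
-- Hypothetical names and boundary values, for illustration only.
CREATE PARTITION FUNCTION pfByA (date)
    AS RANGE RIGHT FOR VALUES ('2022-01-01', '2023-01-01');  -- 3 partitions

CREATE PARTITION SCHEME psByA
    AS PARTITION pfByA ALL TO ([PRIMARY]);

CREATE TABLE dbo.Demo
(
    A date        NOT NULL,  -- partitioning column (the "modification date")
    B varchar(50) NOT NULL,  -- column the queries filter on
    C int         NULL
) ON psByA (A);

-- No ON clause: the index is built on the table's partition scheme (aligned),
-- so it is physically split into one B-tree per partition.
CREATE NONCLUSTERED INDEX IX_Demo_B ON dbo.Demo (B);

-- Query 1: filter on B only. Nothing in the predicate identifies a partition.
SELECT * FROM dbo.Demo WHERE B = 'some value';

-- Query 2: filter on B and on the partitioning column A.
SELECT * FROM dbo.Demo WHERE B = 'some value' AND A >= '2023-01-01';
```

Running both queries with the actual execution plan enabled and comparing the "Actual Partition Count" property of the index seek should show the difference: I would expect query 1 to touch every partition and query 2 to touch only the last one.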
Option 2: my own proposal
I'm thinking about moving the "archival data" into a separate database, served by a separate replica of the same application.
The advantages are that the archive database is easy to keep up to date, since the same change sets are applied to it as to the active one, and that nothing in the app needs to change.
The question is whether partitioning and sharding can be used for this.
The idea is this:
- Using a fairly complex script (working out which records should be archived is the hard part) and an additional technical column, we mark the records to be archived.
- Using a partition function on this column, we split the table into two partitions.
- The third step is where I am stuck. I found out how to create a table from a partition file, so I can create the table in a new archive database. But I need to repeat this every six months, which means adding data from a partition or a partition file to an existing table in another database. Is that possible? Am I thinking in the right direction, or is there some other generally accepted way?
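The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: every object name (pfArchived, psArchived, dbo.Orders, dbo.Orders_Staging, ArchiveDb) and the column list are hypothetical, and ArchiveDb.dbo.Orders is assumed to already exist on the same instance with matching columns. To my knowledge, ALTER TABLE ... SWITCH only works between tables in the same database, so the sketch switches the archived partition into a local staging table and then copies the rows across.

```sql
-- Hypothetical names; ArchiveDb.dbo.Orders is assumed to exist already.
CREATE PARTITION FUNCTION pfArchived (tinyint)
    AS RANGE RIGHT FOR VALUES (1);  -- partition 1: IsArchived = 0, partition 2: IsArchived >= 1

CREATE PARTITION SCHEME psArchived
    AS PARTITION pfArchived ALL TO ([PRIMARY]);

CREATE TABLE dbo.Orders
(
    OrderId    int           NOT NULL,
    Payload    nvarchar(200) NULL,
    IsArchived tinyint       NOT NULL  -- the "technical column"
) ON psArchived (IsArchived);

-- Staging table: same columns and nullability, empty, on the same filegroup
-- as the partition that will be switched out.
CREATE TABLE dbo.Orders_Staging
(
    OrderId    int           NOT NULL,
    Payload    nvarchar(200) NULL,
    IsArchived tinyint       NOT NULL
) ON [PRIMARY];

-- Step 1 (placeholder rule): mark the rows to archive.
-- Changing the partitioning column physically moves each row to partition 2.
UPDATE dbo.Orders SET IsArchived = 1 WHERE OrderId < 1000;

-- Step 2: switch the archived partition out. This is a metadata-only operation.
ALTER TABLE dbo.Orders SWITCH PARTITION 2 TO dbo.Orders_Staging;

-- Step 3: append the rows to the existing table in the archive database,
-- then empty the staging table so it is ready for the next run.
INSERT INTO ArchiveDb.dbo.Orders (OrderId, Payload, IsArchived)
SELECT OrderId, Payload, IsArchived
FROM dbo.Orders_Staging;

TRUNCATE TABLE dbo.Orders_Staging;
```

Steps 1 to 3 could then be repeated every six months. The cross-database INSERT is still a full data copy, so this only removes the cost of deleting the rows from the active table, not the cost of moving them.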
I have also tried to do this separation with plain SQL queries, including cross-database ones. Here I'm concerned about the performance of cross-database queries. For reference, just finding the records that need to be archived took three days.
Another idea is to tell the application explicitly, and preferably in one place, which partition to work with. Then I could be sure that splitting into two partitions by the technical column will speed up all queries.
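One way this "single place" might look is a view that always filters on the partitioning column, so every query going through it carries the predicate needed for partition elimination. This is a sketch under assumptions, not a recommendation: the names pfIsArchived, psIsArchived, dbo.Documents, and dbo.ActiveDocuments are hypothetical, and GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical names, for illustration only.
CREATE PARTITION FUNCTION pfIsArchived (tinyint)
    AS RANGE RIGHT FOR VALUES (1);  -- partition 1: active rows, partition 2: archived rows

CREATE PARTITION SCHEME psIsArchived
    AS PARTITION pfIsArchived ALL TO ([PRIMARY]);

CREATE TABLE dbo.Documents
(
    DocumentId int           NOT NULL,
    Title      nvarchar(200) NOT NULL,
    IsArchived tinyint       NOT NULL
) ON psIsArchived (IsArchived);
GO

-- The one place where "work with the active partition only" is stated.
CREATE VIEW dbo.ActiveDocuments
AS
SELECT DocumentId, Title, IsArchived
FROM dbo.Documents
WHERE IsArchived = 0;
GO

-- Application queries target the view instead of the base table.
SELECT DocumentId, Title
FROM dbo.ActiveDocuments
WHERE Title = N'some value';

-- $PARTITION shows which partition each row maps to, to verify the split.
SELECT $PARTITION.pfIsArchived(IsArchived) AS partition_number,
       COUNT(*)                            AS row_count
FROM dbo.Documents
GROUP BY $PARTITION.pfIsArchived(IsArchived);
```

The application would need to read from dbo.ActiveDocuments rather than dbo.Documents, which is a change on the application side, so this conflicts with the goal of not touching the app unless the view can take over the original table name.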
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
