'Extract subset of data from database based on FKs, and import to another database

Consider a database with 50-100 tables that all can trace all their FK dependencies directly and indirectly (up to 4-5 relationships deep) back to "Table1".

In principle, if one record of Table1 were deleted, and cascading deletes were enabled, then a subset of data would be deleted from all tables.

My goal is not to delete that data; instead I'd like to be able to script out (copy, not delete) that subset of data to a file. Then I'd like to be able to import that data into another non-empty database with the same schema. Of course I need to preserve all those relationships and not clash with any existing keys in the second database.

I realize this is a non-trivial task. The closest similar topic I found was here. (Edit: This requirement is new to an existing and successful SQL database. I know it is ugly and possibly not practical.)

My question is: What tools and techniques can I use to simplify this task?

We're using SQL Server and currently use .NET, EF6, C#. The data subset would be about 1-3 MB in size.

Thanks!

Edit: Remove some speculative statements, and added example tables below where I would like to pull out all the data with relationships to a single record in Table1:

Table1
    Table1_Id (PK)
    Data1

Foo
    Foo_Id (PK)
    Table1_Id (FK)
    FooData

Bar
    Bar_Id (PK)
    Table1_Id (FK)
    BarData

Ding
    Ding_Id (PK)
    Foo_Id (FK)
    Bar_Id (FK)
    DingData

Solution 1:^[1]

If you want this to happen OnDelete, my first thought is that you need to use triggers, and you'd either need to put a trigger on every table that is Parent in the relationship chain, or the one trigger on "Table1" would need to call a procedure that goes all the way down the chain.

Since you first want to write the deleted data to a file, I'm guessing that the second database you want to populate isn't on the same network as the first.

So what you can do, using the single trigger method, is have the trigger populate a table that simply has the PK of the row being deleted from "Table1". Then start a job that calls an SSIS package. The package reads the table to find out what row was deleted, and then begins a series of dataflows to create your file(s).

Finally other dataflows would upload the file(s) to the second database. Those dataflows could be in the same package, a different package, same agent job, different agent job, depending on what works best in the details.

Alternately, if you put a trigger on every parent table, you can have that trigger perform a bulk export of the related rows in the child table. Then a scheduled job can run every now and then and import any files it finds into the second database.

Solution 2:^[2]

You can do this with the SQL Data Compare tool, as long as each of the tables have a primary key. The tool can be found here: https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt?view=sql-server-ver15

Solution 3:^[3]

Consider using open-source (MIT license) SqlSizer project. I think you should be able to achieve what you want with that tool.

It allows to easily find the subset of that data, extract it and copy to empty database.

The project website: https://github.com/sqlsizer/mssql-sqlsizer

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Tab Alleman
Solution 2	Bill Blair
Solution 3	marcin.golebiowski

'Extract subset of data from database based on FKs, and import to another database

Solution 1:[1]

Solution 2:[2]

Solution 3:[3]

Sources

Related Questions

Solution 1:^[1]

Solution 2:^[2]

Solution 3:^[3]