'Randomly selecting rows and updating another table per row in Postgres

I'm using postgres 10 and I'm looking to randomise some data.

I start by creating a temporary table and fill it with 1,000 rows of random data.

I then want to merge that into another table that may have less or more rows than the random data.

For each row in my dimension table I want to select a random row from the random data in the temporary table, setting the values in the dimension table to the randomly selected rows values in the temporary table.

eg.

I have a table called reference.tv_shows with the fields Name and Category.

I have a temporary table called random_tv_shows with the fields Name and Category. This data is completely random and consists of 1,000 rows.

I want to go through EACH row in the reference.tv_shows and pick a random row in the random_tv_shows table and set the reference.tv_shows Name and Category to be that of the selected row in random_tv_shows.

I tried running a fairly simple select but it looks as though it evaluates itself once then updates (Or maybe RANDOM() is only random once per TX?).

UPDATE reference.tv_shows SET "Name" = (SELECT "Name" FROM random_tv_shows ORDER BY RANDOM() LIMIT 1)

Is there a way to do this in postgres?



Solution 1:[1]

When I have a test table, with the field a which is an integer,

If I do this:

update test set a=random()*1000;

If wil get random values for every record in my table.

But when I do this:

update test set a=(select random()*1000);

All values for a will be the same.

This is shown in this DBFIDDLE

Because, when updating the table reference.tv_shows, you only want 1 tv_show to be updated, you need to have a unique identifier for every tv_show. currently that info is not available in the question.

EDIT: I tried to reproduce your data (less records, and lack of imagination on categories, but... ?).

When you have a unique id in your tables you can do:

UPDATE tv_shows 
SET Name = rts.Name,
    Category =  rts.Category
FROM tv_shows ts
INNER JOIN (SELECT ROW_NUMBER() OVER () R, Name, Category 
            FROM random_tv_shows
            ORDER BY RANDOM()) rts on rts.R = ts.id
WHERE tv_shows.id = ts.id

see DBFIDDLE

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1