'Iterate on page of returning execute_values

http://initd.org/psycopg/docs/extras.html

psycopg2.extras.execute_values has a parameters page_size. I'm doing an INSERT INTO... ON CONFLICT... with RETURNING ID.

The problem is that the cursor.fetchall() give me back only the last "page", that is, 100 ids (default of page_size).

Without modifying page_size parameters, is it possible to iterate over the results, to get the total number of rows updated ?



Solution 1:[1]

I ran into the same issue. I work around this issue by batching my calls to execute_values(). I'll set my_page_size=1000, then iterate over my values, filling argslist until i have my_page_size items. Then I'll call execute_values(cur, sql, argslist, page_size=my_page_size). And iterate over cur to get those ids.

Solution 2:[2]

Without modifying page_size parameters, is it possible to iterate over the results, to get the total number of rows updated ?

Yes.

try:
    conn = psycopg2.connect(...)
    cur = conn.cursor()
    query = """
        WITH
            items (eggs) AS (VALUES %s), 
            inserted AS (
                INSERT INTO spam (eggs)
                SELECT eggs FROM items
                ON CONFLICT (eggs) DO NOTHING
                RETURNING id
            )
        SELECT id FROM spam
        WHERE eggs IN (SELECT eggs FROM items)
        UNION
        SELECT id FROM inserted
    """
    eggs = (('egg_{}'.format(i % 666),) for i in range(10_000))
    ids = psycopg2.extras.execute_values(cur, query, argslist=eggs, fetch=True)
    # Do whatever with `ids`. `len(ids)` I suppose?
finally:
    if connection:
        cur.close()
        conn.close()

I overkilled query on purpose to address some gotchas:

  • WITH items (eggs) AS (VALUES %s) is done to be able to use argslist in two places at once;
  • RETURNING with ON CONFLICT will return only ids which were actually inserted, conflicting ones are omitted from INSERT' direct results. To solve that all this SELECT ... WHERE ... UNION SELECT mumbo jumbo is done;
  • to get all values which you asked for: ids = psycopg2.extras.execute_values(..., fetch=True).

A horrible interface oddity considering that all other cases are done like

cur.execute(...)  # or other kind of `execute`
rows = cur.fetchall()  # or other kind of `fetch`

So if you want only the number of inserted rows then do

try:
    conn = psycopg2.connect(...)
    cur = conn.cursor()
    query = """
        INSERT INTO spam (eggs)
        VALUES %s
        ON CONFLICT (eggs) DO NOTHING
        RETURNING id
    """
    eggs = (('egg_{}'.format(i % 666),) for i in range(10_000))
    ids = psycopg2.extras.execute_values(cur, query, argslist=eggs, fetch=True)
    print(len(ids)
finally:
    if connection:
        cur.close()
        conn.close()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Brian DeRocher
Solution 2