'how to exclude null values in array_agg like in string_agg using postgres?

If I use array_agg to collect names, I get my names separated by commas, but in case there is a null value, that null is also taken as a name in the aggregate. For example :

SELECT g.id,
       array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END) canonical_users,
       array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END) non_canonical_users
FROM groups g
GROUP BY g.id;

it returns ,Larry,Phil instead of just Larry,Phil (in my 9.1.2, it shows NULL,Larry,Phil).

Instead, if I use string_agg(), it shows me only the names (without empty commas or nulls).

The problem is that I have Postgres 8.4 installed on the server, and string_agg() doesn't work there. Is there any way to make array_agg work similar to string_agg() ?



Solution 1:[1]

With postgresql-9.3 one can do this;

SELECT g.id,
   array_remove(array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END), NULL) canonical_users,
   array_remove(array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END), NULL) non_canonical_users
FROM groups g 
GROUP BY g.id;

Update: with postgresql-9.4;

SELECT g.id,
   array_agg(g.users) FILTER (WHERE g.canonical = 'Y') canonical_users,
   array_agg(g.users) FILTER (WHERE g.canonical = 'N') non_canonical_users
FROM groups g 
GROUP BY g.id;

Update (2022-02-19): also with postgresql-9.4;

This results in an empty array when all values in an array are null instead of returning null;

SELECT g.id,
  coalesce( array_agg(g.users) FILTER (WHERE g.canonical = 'Y'), '{}' ) canonical_users,
  coalesce( array_agg(g.users) FILTER (WHERE g.canonical = 'N'), '{}' ) non_canonical_users
FROM groups g 
GROUP BY g.id;

Solution 2:[2]

If you are looking for a modern answer to the general question of how to remove a NULL from an array, it is:

array_remove(your_array, NULL)

I was specifically curious about performance and wanted to compare this to the best possible alternative:

CREATE OR REPLACE FUNCTION strip_nulls(
    IN array_in ANYARRAY
)
RETURNS anyarray AS
'
SELECT
    array_agg(a)
FROM unnest(array_in) a
WHERE
    a IS NOT NULL
;
'
LANGUAGE sql
;

Doing a pgbench test proved (with high confidence) that array_remove() is a little more than twice as fast. I did my test on double precision numbers with a variety of array sizes (10, 100 and 1000 elements) and random NULLs in between.


It's also worth noting that this can be used to remove blanks ('' != NULL). But the second parameter accepts anyelement, and since it is most likely they you'd be indicating a blank with a string literal, make sure to cast it to the form you want, usually a non-array.

For example:

select array_remove(array['abc', ''], ''::text);

If you try:

select array_remove(array['abc', ''], '');

it will assume that the '' is TEXT[] (array) and will throw this error:

ERROR: malformed array literal: ""

Solution 3:[3]

In solving the general question of removing nulls from array aggregates there are two main ways of attacking the problem: either doing array_agg(unnest(array_agg(x)) or creating a custom aggregate.

The first is of the form shown above:

SELECT 
    array_agg(u) 
FROM (
    SELECT 
        unnest(
            array_agg(v)
        ) as u 
    FROM 
        x
    ) un
WHERE 
    u IS NOT NULL;

The second:

/*
With reference to
http://ejrh.wordpress.com/2011/09/27/denormalisation-aggregate-function-for-postgresql/
*/
CREATE OR REPLACE FUNCTION fn_array_agg_notnull (
    a anyarray
    , b anyelement
) RETURNS ANYARRAY
AS $$
BEGIN

    IF b IS NOT NULL THEN
        a := array_append(a, b);
    END IF;

    RETURN a;

END;
$$ IMMUTABLE LANGUAGE 'plpgsql';

CREATE AGGREGATE array_agg_notnull(ANYELEMENT) (
    SFUNC = fn_array_agg_notnull,
    STYPE = ANYARRAY,
    INITCOND = '{}'
);

Calling the second is (naturally) a little nicer looking than the first:

select array_agg_notnull(v) from x;

Solution 4:[4]

I am adding this even though this this thread is quite old, but I ran into this neat trick that works quite well on small arrays. It runs on Postgres 8.4+ without additional libraries or functions.

string_to_array(array_to_string(array_agg(my_column)))::int[]

The array_to_string() method actually gets rid of the nulls.

Solution 5:[5]

You should wrap your array_agg with array_remove.

SELECT g.id,
       array_remove(array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END), NULL) canonical_users,
       array_remove(array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END), NULL) non_canonical_users
FROM groups g
GROUP BY g.id;

Solution 6:[6]

As has been suggested in the comments you can write a function to replace nulls in an array, however as also pointed out in the thread linked to in the comments, this kind of defeats the efficiency of the aggregate function if you have to create an aggregate, split it then aggregate it again.

I think keeping nulls in the array is just a (perhaps unwanted) feature of Array_Agg. You could use subqueries to avoid this:

SELECT  COALESCE(y.ID, n.ID) ID,
        y.Users,
        n.Users
FROM    (   SELECT  g.ID, ARRAY_AGG(g.Users) AS Users
            FROM    Groups g
            WHERE   g.Canonical = 'Y'
            GROUP BY g.ID
        ) y
        FULL JOIN 
        (   SELECT  g.ID, ARRAY_AGG(g.Users) AS Users
            FROM    Groups g
            WHERE   g.Canonical = 'N'
            GROUP BY g.ID
        ) n
            ON n.ID = y.ID

SQL FIDDLE

Solution 7:[7]

It is very simple, just first of all create a new - (minus) operator for text[]:

CREATE OR REPLACE FUNCTION diff_elements_text
    (
        text[], text[] 
    )
RETURNS text[] as 
$$
    SELECT array_agg(DISTINCT new_arr.elem)
    FROM
        unnest($1) as new_arr(elem)
        LEFT OUTER JOIN
        unnest($2) as old_arr(elem)
        ON new_arr.elem = old_arr.elem
    WHERE old_arr.elem IS NULL
$$ LANGUAGE SQL IMMUTABLE;

CREATE OPERATOR - (
    PROCEDURE = diff_elements_text,
    leftarg = text[],
    rightarg = text[]
);

And simply subtract the array[null]:

select 
    array_agg(x)-array['']
from
    (   select 'Y' x union all
        select null union all
        select 'N' union all
        select '' 
    ) x;

That's all:

{Y, N}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 Community
Solution 4 ced-b
Solution 5 Lulu
Solution 6 GarethD
Solution 7