'MariaDB CTE syntax for parameter list

I'm trying to replace an expensive where in ... query with a CTE. I've done this previously in other sql dialects, but can't get the syntax right for MariaDB/MySQL.

QUERY TO REPLACE

select o from orders as o where o.id in :orderIds

The list of id's is dynamic. It is a Spring Boot application using JPA. Now I know I have to create a native query with the optional parameter like this to get the CTE to work:

@Query("select o from orders as o from where o.id in :orderIds", nativeQuery = true)
List<OrderEntity> findOrdersByIds(@Param("orderIds") List<String> orderIds);

Above you can see the dynamic parameter list of ids being passed in (and used).

My question is regarding how to insert the dynamic list of orderIds into the CTE. Something like:

WITH orderIds (id) as (
SELECT * FROM :orderIds // What to type here?
) SELECT o.* FROM orderIds
  INNER JOIN orders o ON o.id = orderIds.id

As you can see line #2 above is what I need help with. Let me know if you know how.



Solution 1:[1]

So, I managed to get the query right, and did comparisons. TL;DR; the where...in query is faster than the CTE, at least in my measurements. This is what I did, feedback welcome.

I created a new test table with two columns:

CREATE TABLE `perf_table` (
  `system_id` varchar(255) NOT NULL,
  `textcol` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`system_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

I filled it with random data (uuid() and random strings in textcol). I added 10 000 rows, using this call in a for loop:

INSERT INTO perf_table VALUES (UUID(), SUBSTR(MD5(RAND()), 1, 10));

I Created a helper method in code to generate the actual query string, and execute the native query. Here's the CTE version:

WITH cteIds AS ((SELECT '<random-id-1>' AS id) UNION (SELECT '<random-id-2>' AS id))
select p.* from cteIds inner join perf_table p on p.system_id = cteIds.id

Using just 2 ids here is just an example, I generated this string dynamically in code. It's a first naive implementation, but it works.

  1. Try it out with both where...in and CTE, and compare. I picked 30 random values from the perf_table and executed the query on both the first col (primary key) and second col (random content). No indexes were created.

  2. Since the CTE was so much slower I also tried creating a separate table completely, id_table, inserted the 30 ids and did a simple select in the CTE, since I suspected the UNION isn't the optimal way (?).

The query then becomes:

WITH cteIds AS (SELECT * from id_table) select p.* from cteIds inner join perf_table p on p.textcol = cteIds.id

Again, this was just a test, but it still didn't make a difference. All in all, doing a where...in on 30 random values in a 10 000 row table takes around 2 ms. Doing the same with CTE takes 24 ms.

Did I miss anything obvious? I'm asking since the difference is so large.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Henrik Oscarsson