'Get values of a field on previous dates

I have a table with the following data.

fecha SearchPhrase url rank
2022-03-01 keyword1 url1 1
2022-03-01 keyword2 url2 1
2022-03-01 keyword1 url1 1
2022-03-01 keyword3 url3 3
2022-03-01 keyword4 url4 7
2022-02-28 keyword1 url1 3
2022-02-28 keyword1 url1 4
2022-02-28 keyword2 url2 3
2022-02-27 keyword1 url1 3
2022-02-27 keyword2 url2 5
2022-02-27 keyword3 url3 10

Each word can have two values on the same day (I only want the lower one) and a word may not exist on some date.

I want to get, for each word that exists on the last date, the current rank and the previous date's minimum value for each word.

Result desired

fecha SearchPhrase currentRank previusRank
2022-03-01 keyword1 1 3
2022-03-01 keyword2 1 3
2022-03-01 keyword3 3 10
2022-03-01 keyword4 7

I'm using this query

WITH (SELECT max(fecha)
        FROM tableName)AS InitDate
SELECT
    t1.sDate,
    t1.SearchPhrase,
    argMin(ranking, t1.sDate) AS rankDate1,
    any(t2.ranking) AS rankDate2,
    max(t2.sDate) AS Date2
FROM
(
    SELECT
        fecha AS sDate,
       kw as  SearchPhrase,
        any(rank) AS ranking
    FROM tableName
    WHERE fecha = InitDate
    GROUP BY
        fecha,
        SearchPhrase
    ORDER BY
        fecha DESC,
        ranking ASC
) AS t1
any LEFT JOIN
(
    SELECT
        fecha,
      kw as  SearchPhrase,
        any(rank) AS ranking,
        any(fecha) AS sDate
    FROM tableName
    WHERE fecha < InitDate
    GROUP BY
        fecha,
        SearchPhrase
    ORDER BY
        fecha DESC,
        ranking ASC
) AS t2 USING (SearchPhrase)
GROUP BY
    sDate,
    SearchPhrase
LIMIT 10  

But I'm sure it can be done better



Solution 1:[1]

The basic building block, that I've used to address this problem, extracts the minimum value for each "SearchPhrase":

(
SELECT 
    fecha,
    SearchPhrase,
    MIN(`rank`)     AS currentRank
FROM 
    search_infos
GROUP BY
    SearchPhrase,
    fecha
) min_ranks

Given the information contained in this table, I've considered two ways of solving this problem:

  • Using a LAG/LEAD window function, which allows you to retrieve the previous/next value in the time-series (either date or rank value) - as already suggested in the comments. As you can see from the snippet below, the LEAD value allows us to gather the leading rank value, while the ROW_NUMBER is useful to impose a ranking on dates, where the rank=1 will represent the minimum value of date for each "SearchPhrase".

    SELECT
        fecha,
        SearchPhrase,
        currentRank,
        previusRank
    FROM (
        SELECT 
            fecha,
            SearchPhrase,
            currentRank,
            LEAD(currentRank) OVER(
                PARTITION BY SearchPhrase 
                ORDER BY     fecha DESC
            )                   AS previusRank,
            ROW_NUMBER() OVER(
                PARTITION BY SearchPhrase 
                ORDER BY     fecha DESC
            )                   AS fecha_order
        FROM min_ranks 
    ) curr_and_prec_ranks
    WHERE
        fecha_order = 1
    

    Fiddle Link: https://onecompiler.com/mysql/3xy65u37m.

  • Using a JOIN to match current date and previous date for each SearchPhrase. In this case the ROW_NUMBER is always used to keep track of the ranking on the dates and what we do is joining the same table with itself where the values of "SearchPhrase" correspond and at the same time their date rank difference is 1. The final filtering on curr.fecha_order = 1 is useful to select the last date.

    WITH min_ranks_with_fecha_order AS (
        SELECT 
            *,
            ROW_NUMBER() OVER(
                PARTITION BY SearchPhrase 
                ORDER BY     fecha DESC
            )                   AS fecha_order
        FROM min_ranks
    )
    SELECT 
        curr.fecha,
        curr.SearchPhrase,
        curr.currentRank        AS current_rank,
        prec.currentRank        AS previus_rank
    FROM 
        min_ranks_with_fecha_order curr
    LEFT JOIN 
        min_ranks_with_fecha_order prec
    ON 
        curr.SearchPhrase = prec.SearchPhrase
    AND
        curr.fecha_order = prec.fecha_order-1
    WHERE 
        curr.fecha_order = 1
    

    Fiddle Link: https://onecompiler.com/mysql/3xy5rcsw3.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1