'Get values of a field on previous dates
I have a table with the following data.
| fecha | SearchPhrase | url | rank |
|---|---|---|---|
| 2022-03-01 | keyword1 | url1 | 1 |
| 2022-03-01 | keyword2 | url2 | 1 |
| 2022-03-01 | keyword1 | url1 | 1 |
| 2022-03-01 | keyword3 | url3 | 3 |
| 2022-03-01 | keyword4 | url4 | 7 |
| 2022-02-28 | keyword1 | url1 | 3 |
| 2022-02-28 | keyword1 | url1 | 4 |
| 2022-02-28 | keyword2 | url2 | 3 |
| 2022-02-27 | keyword1 | url1 | 3 |
| 2022-02-27 | keyword2 | url2 | 5 |
| 2022-02-27 | keyword3 | url3 | 10 |
Each word can have two values on the same day (I only want the lower one) and a word may not exist on some date.
I want to get, for each word that exists on the last date, the current rank and the previous date's minimum value for each word.
Result desired
| fecha | SearchPhrase | currentRank | previusRank |
|---|---|---|---|
| 2022-03-01 | keyword1 | 1 | 3 |
| 2022-03-01 | keyword2 | 1 | 3 |
| 2022-03-01 | keyword3 | 3 | 10 |
| 2022-03-01 | keyword4 | 7 |
I'm using this query
WITH (SELECT max(fecha)
FROM tableName)AS InitDate
SELECT
t1.sDate,
t1.SearchPhrase,
argMin(ranking, t1.sDate) AS rankDate1,
any(t2.ranking) AS rankDate2,
max(t2.sDate) AS Date2
FROM
(
SELECT
fecha AS sDate,
kw as SearchPhrase,
any(rank) AS ranking
FROM tableName
WHERE fecha = InitDate
GROUP BY
fecha,
SearchPhrase
ORDER BY
fecha DESC,
ranking ASC
) AS t1
any LEFT JOIN
(
SELECT
fecha,
kw as SearchPhrase,
any(rank) AS ranking,
any(fecha) AS sDate
FROM tableName
WHERE fecha < InitDate
GROUP BY
fecha,
SearchPhrase
ORDER BY
fecha DESC,
ranking ASC
) AS t2 USING (SearchPhrase)
GROUP BY
sDate,
SearchPhrase
LIMIT 10
But I'm sure it can be done better
Solution 1:[1]
The basic building block, that I've used to address this problem, extracts the minimum value for each "SearchPhrase":
(
SELECT
fecha,
SearchPhrase,
MIN(`rank`) AS currentRank
FROM
search_infos
GROUP BY
SearchPhrase,
fecha
) min_ranks
Given the information contained in this table, I've considered two ways of solving this problem:
Using a
LAG/LEADwindow function, which allows you to retrieve the previous/next value in the time-series (either date or rank value) - as already suggested in the comments. As you can see from the snippet below, the LEAD value allows us to gather the leading rank value, while theROW_NUMBERis useful to impose a ranking on dates, where the rank=1 will represent the minimum value of date for each "SearchPhrase".SELECT fecha, SearchPhrase, currentRank, previusRank FROM ( SELECT fecha, SearchPhrase, currentRank, LEAD(currentRank) OVER( PARTITION BY SearchPhrase ORDER BY fecha DESC ) AS previusRank, ROW_NUMBER() OVER( PARTITION BY SearchPhrase ORDER BY fecha DESC ) AS fecha_order FROM min_ranks ) curr_and_prec_ranks WHERE fecha_order = 1Fiddle Link: https://onecompiler.com/mysql/3xy65u37m.
Using a
JOINto match current date and previous date for each SearchPhrase. In this case theROW_NUMBERis always used to keep track of the ranking on the dates and what we do is joining the same table with itself where the values of "SearchPhrase" correspond and at the same time their date rank difference is 1. The final filtering oncurr.fecha_order = 1is useful to select the last date.WITH min_ranks_with_fecha_order AS ( SELECT *, ROW_NUMBER() OVER( PARTITION BY SearchPhrase ORDER BY fecha DESC ) AS fecha_order FROM min_ranks ) SELECT curr.fecha, curr.SearchPhrase, curr.currentRank AS current_rank, prec.currentRank AS previus_rank FROM min_ranks_with_fecha_order curr LEFT JOIN min_ranks_with_fecha_order prec ON curr.SearchPhrase = prec.SearchPhrase AND curr.fecha_order = prec.fecha_order-1 WHERE curr.fecha_order = 1Fiddle Link: https://onecompiler.com/mysql/3xy5rcsw3.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
