'How to strip ending date from string using Regex? - SQL

I want to format the strings in a table column, in a specific format.

Input table:

file_paths
my-file-path/wefw/wefw/2022-03-20
my-file-path/wefw/2022-01-02
my-file-path/wef/wfe/wefw/wef/2021-02-03
my-file-path/wef/wfe/wef/

I want to remove everything after the last / sign, if the only thing after it resembles a date (i.e. YYYY-MM-dd or ####-##-##).

Output:

file_paths
my-file-path/wefw/wefw/
my-file-path/wefw/
my-file-path/wef/wfe/wefw/wef/
my-file-path/wef/wfe/wef/

I'm thinking of doing something like:

SELECT regexp_replace(file_paths, 'regex_here', '', 1, 'i')
FROM my_table

I'm unsure of how to write the RegEx for this though. I'm also open to easier methods of string manipulation, if there are any. Thanks in advance!



Solution 1:[1]

You can use

REGEXP_REPLACE ( file_paths, '/[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}$', '/' )

See the regex demo.

The /[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}$ is a POSIX ERE compliant pattern matching

  • / - a slash
  • [0-9]{4} - four digits
  • - - a hyphen
  • [0-9]{1,2} - one or two digits
  • -[0-9]{1,2} - a hyphen and one or two digits
  • $ - end of string.

If your values can contain trailing whitespace, insert [[:space:]]* before $: /[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}[[:space:]]*$.

Solution 2:[2]

You may use this regex:

^(.*?\/)\d{4}-\d{2}-\d{2}$

You may try this query:

select regexp_replace(file_paths, '^(.*?\/)\\d{4}-\\d{2}-\\d{2}$','$1')

Demo

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Wiktor Stribiżew
Solution 2