Python remove digits in the middle of the string

I am trying to iterate through a list of file names in Python and remove the timestamp from each one while keeping the extension.

for item in items:
    print(item.split('_')[0])

This works, but it deletes the extension as well. The string looks like dataset_2020-01-05.txt, and I need it to become dataset.txt (likewise dataset_2020-01-05.zip -> dataset.zip).

I also tried this way

for item in items:
    print(item.split('_')[0] + item.split('.')[-1])

However, some files don't have a timestamp, and the extension gets appended to those as well, so I end up with something like dataset.txt.txt.



Solution 1:[1]

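for item in items:
    # Split on the last dot only, so names containing extra dots still unpack
    front, ext = item.rsplit('.', 1)
    # Keep the part before the first underscore, then re-attach the extension
    print(front.split('_')[0] + '.' + ext)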

or

for item in items:
    ext = item.split('.')[-1]                  # text after the last dot
    front = item.split('.')[0].split('_')[0]   # text before the first dot and first underscore
    print(front + '.' + ext)
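
A variation on the same idea, as a rough sketch: os.path.splitext from the standard library splits off the extension (including the dot), so files without a timestamp pass through unchanged. The function name strip_timestamp is only illustrative, and the sketch assumes the base name itself contains no underscores.

```python
import os

def strip_timestamp(name):
    """Drop everything after the first underscore in the stem, keep the extension."""
    # splitext returns the extension with its leading dot, e.g. '.txt'
    stem, ext = os.path.splitext(name)
    return stem.split('_')[0] + ext

items = ['dataset_2020-01-05.txt', 'dataset_2020-01-05.zip', 'dataset.txt']
for item in items:
    print(strip_timestamp(item))
# prints:
# dataset.txt
# dataset.zip
# dataset.txt
```

Note that this still cuts at the first underscore, so a name like my_data.txt would become my.txt.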

Solution 2:[2]

If your timestamps fall within a known date range, you could check whether the date is present in each name and apply the logic only when it is.

For example, if all your timestamped files contain '2020', check inside the loop:
if '2020' in item:
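
A slightly more general sketch of that check, not tied to one year: it assumes the timestamp is always the last underscore-separated part of the name and is formatted as YYYY-MM-DD, as in the question. The helper names has_timestamp and strip_if_dated are made up for illustration.

```python
from datetime import datetime

def has_timestamp(name):
    """Return True if the part after the last underscore parses as a date."""
    stem = name.rsplit('.', 1)[0]      # drop the extension, if any
    parts = stem.rsplit('_', 1)        # ['dataset', '2020-01-05'] or ['dataset']
    if len(parts) != 2:
        return False
    try:
        datetime.strptime(parts[1], '%Y-%m-%d')
    except ValueError:
        return False
    return True

def strip_if_dated(name):
    """Remove the trailing date only when one is actually present."""
    if not has_timestamp(name):
        return name
    stem, ext = name.rsplit('.', 1)
    return stem.rsplit('_', 1)[0] + '.' + ext

for item in ['dataset_2020-01-05.txt', 'dataset.txt', 'my_data.txt']:
    print(strip_if_dated(item))
```

Files without a date, including ones whose names merely contain an underscore, are returned unchanged.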

Solution 3:[3]

You can use the re module to help with this. For example:

import re

print(re.sub('[0-9_-]', '', 'dataset_2020-01-05.txt'))

Output:

dataset.txt
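
One caveat: the character class above removes every digit, underscore, and hyphen anywhere in the name, so something like data_set1.txt would become dataset.txt too. A narrower sketch, assuming the timestamp always has the form _YYYY-MM-DD and sits right before the extension (or at the end of the name), could look like this. The names TIMESTAMP and remove_timestamp are illustrative only.

```python
import re

# An underscore followed by YYYY-MM-DD, only when it is directly
# followed by a dot (the extension) or by the end of the string.
TIMESTAMP = re.compile(r'_\d{4}-\d{2}-\d{2}(?=\.|$)')

def remove_timestamp(name):
    """Strip a trailing _YYYY-MM-DD timestamp, leaving other characters alone."""
    return TIMESTAMP.sub('', name)

for item in ['dataset_2020-01-05.txt', 'dataset_2020-01-05.zip',
             'dataset.txt', 'data_set1.txt']:
    print(remove_timestamp(item))
# prints:
# dataset.txt
# dataset.zip
# dataset.txt
# data_set1.txt
```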

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Julius
Solution 2    ahthserhsluk
Solution 3    Albert Winestein