Validation: remove currency symbol from price

I have a string that holds a price. The price can come with any of the currency symbols in currency_list. I am trying to remove the currency symbol and return only the price.

So far I can do this for prefix and suffix currency symbols using the code below, and that part works.

I want to add one validation: if the symbol is neither a prefix nor a suffix but sits in the middle, as in "200$434", the function should return "not valid format". I do not understand how to implement this.

currency_list = ['USD', 'UNITED STATES DOLLAR', '$', 'EUR', 'EURO', '€', 'GBP','BRITISH POUND', '£']

Normally the input string can be:

"$1212212"
"1212212EURO"
"1212212"
"1212212 BRITISH POUND"

I need help rejecting values like "1212$343" or "1212212EURO323.23".

Code:

for symb in currency_list:
    if symb in amount:
        data = amount.replace(symb, '')
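To make the problem concrete, here is a small sketch that wraps the loop above in a function (the name strip_symbol_anywhere is made up for illustration). Because replace() removes the symbol wherever it occurs, a misplaced symbol is silently dropped instead of being reported:

```python
currency_list = ['USD', 'UNITED STATES DOLLAR', '$', 'EUR', 'EURO', '€',
                 'GBP', 'BRITISH POUND', '£']

def strip_symbol_anywhere(amount):
    """Hypothetical wrapper around the loop from the question."""
    data = amount
    for symb in currency_list:
        if symb in amount:
            # replace() does not care where the symbol sits in the string
            data = amount.replace(symb, '')
    return data

print(strip_symbol_anywhere('$1212212'))  # 1212212
print(strip_symbol_anywhere('200$434'))   # 200434, the misplaced symbol goes unnoticed
```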


Solution 1:[1]

You can use regex to achieve your purpose.

import re

currency_list = ['USD', 'UNITED STATES DOLLAR', '$', 'EUR', 'EURO', '€', 'GBP', 'BRITISH POUND', '£']

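# prefix (non-digits), a number with an optional decimal part, then the rest
p = re.compile(r'(\D*)(\d+(?:\.\d+)?)(.*)')

def verify_or_get_amount(amount):
    match = p.search(amount)
    if match is None:  # the string contains no digits at all
        print('invalid:', amount)
        return None
    first, mid, last = [i.strip() for i in match.groups()]

    # anything around the number must be exactly one known currency name
    if (first and first not in currency_list) or (last and last not in currency_list):
        print('invalid:', amount)
        return None
    print('amount:', mid)
    return mid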


for i in ['EURO123', 'EURO 123', 'EURO 123.', 'EURO .12', 'EURO 12.12', '$1212212', '1212212EURO', '1212212', '1212212 BRITISH POUND', '1212$343']:
    verify_or_get_amount(i)

Solution 2:[2]

After going through multiple blog posts, I found this answer, which gets the job done.

def validateCurrency(amount):
    new_amount = None
    for cur in currency_list:
        # accept the symbol only at the very start or the very end
        if amount.startswith(cur):
            new_amount = amount[len(cur):].strip()
        elif amount.endswith(cur):
            new_amount = amount[:-len(cur)].strip()

    if new_amount is None:
        return "Not a valid currency string."
    return f"Price after removing symbol is {new_amount}"

# print(validateCurrency('$1212212'))
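One gap worth noting: the prefix/suffix test alone does not check that what remains is a number, so an input such as "$12$34" would still be accepted. A possible extension, as a sketch only (the name strip_currency and the None return value are assumptions, not part of the answer):

```python
currency_list = ['USD', 'UNITED STATES DOLLAR', '$', 'EUR', 'EURO', '€',
                 'GBP', 'BRITISH POUND', '£']

def strip_currency(amount):
    """Return the numeric part of amount, or None if the format is invalid.

    Hypothetical helper: the name and the None return value are assumptions.
    """
    text = amount.strip()
    # Try longer names first so 'EURO' is handled before 'EUR' can match.
    for cur in sorted(currency_list, key=len, reverse=True):
        if text.startswith(cur):
            text = text[len(cur):].strip()
            break
        if text.endswith(cur):
            text = text[:-len(cur)].strip()
            break
    # Whatever is left must be a plain number; a symbol in the middle
    # (or a second symbol) makes this check fail.
    digits = text.replace('.', '', 1)
    return text if digits.isdigit() else None

for value in ['$1212212', '1212212EURO', '1212212 BRITISH POUND',
              '1212$343', '1212212EURO323.23']:
    print(value, '->', strip_currency(value))
```

Returning None for invalid input keeps the check and the extraction in one call; the caller can turn None into a "not valid format" message.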

Solution 3:[3]

Using regex:

import re

# r'\$' is a raw string: a plain '\$' is an invalid escape sequence and triggers a warning
currency_list = ['USD', 'UNITED STATES DOLLAR', r'\$', 'EUR', 'EURO', '€', 'GBP', 'BRITISH POUND', '£']
currencies = '|'.join(currency_list)


c = re.compile(rf'^({currencies})? *(\d+(\.\d+)?) *({currencies})?$')

for i in ['$1212212', '1212212EURO', '1212212', '1212212 BRITISH POUND', '1212$343']:
    match_obj = c.match(i)
    if match_obj:
        print(match_obj.group(2))
    else:
        print('not found')

Output:

1212212
1212212
1212212
1212212
not found

Explanation:

To see the actual pattern, run print(c.pattern), which gives:

^(USD|UNITED STATES DOLLAR|\$|EUR|EURO|€|GBP|BRITISH POUND|£)? *(\d+(\.\d+)?) *(USD|UNITED STATES DOLLAR|\$|EUR|EURO|€|GBP|BRITISH POUND|£)?$

I've escaped $ in the currency_list, because an unescaped $ means "end of string" in a regex.

currencies = '|'.join(currency_list) builds the alternation of possible prefixes or suffixes.

(\d+(\.\d+)?) matches the price and accepts decimals as well. (You can omit the (\.\d+)? part if you only need whole numbers.)

The " *" on either side of the number allows optional spaces, for example in "1212212 BRITISH POUND", where a space follows the number.

The ^ and $ anchors force the whole string to match, which is why "1212$343" is rejected.
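As an alternative to escaping $ by hand, the pattern can be built with re.escape and wrapped in a function. This is only a sketch under a few assumptions: the name extract_price and the None return value are made up here, and re.fullmatch stands in for the explicit ^ and $ anchors.

```python
import re

# Plain, unescaped names; re.escape takes care of '$'.
currency_list = ['USD', 'UNITED STATES DOLLAR', '$', 'EUR', 'EURO', '€',
                 'GBP', 'BRITISH POUND', '£']

# Longest names first, so 'EURO' is tried before 'EUR'.
alternatives = '|'.join(
    re.escape(c) for c in sorted(currency_list, key=len, reverse=True)
)
pattern = re.compile(
    rf'(?:{alternatives})? *(\d+(?:\.\d+)?) *(?:{alternatives})?'
)

def extract_price(text):
    """Hypothetical helper: return the price as a string, or None if invalid."""
    match = pattern.fullmatch(text)
    return match.group(1) if match else None

for value in ['$1212212', '1212212 BRITISH POUND', '12.50€', '1212$343']:
    print(value, '->', extract_price(value))
```

Like the pattern above, this accepts a symbol on both sides at once (for example "$100USD"); if that should be rejected, the function would need an extra check.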

Solution 4:[4]

I am assuming you want a currency validation function that returns True or False:

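import re

def validateCurrency(input):
    input_length = len(input)
    if input.isdigit():
        return False  # digits only, no currency symbol
    # either currency followed by digits, or digits followed by currency
    matches = re.findall(r'(\D+?)(\d+)|(\d+?)(\D+)', input)
    if not matches:
        return False

    total_length = 0
    for i in matches[0]:
        # count a part if it is a known currency (padding spaces ignored) or all digits
        if i.strip() in currency_list or i.isdigit():
            total_length += len(i)
    # valid only if the currency and the digits together cover the whole input
    return total_length == input_length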

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 fitz
Solution 2
Solution 3
Solution 4 Prathamesh Patkar