How to extract a certain substring from a string using regex in Python - dictionary value? [closed]

I have a string 'data' (shown below) that contains some URL content. Using regex, I want to extract only "url":"https://www.google.com". Here's what I've tried so far:

import re
url = re.findall('"url":.*', data)

Any help would be appreciated.

{"__ref":"QXNzZXRDb250cmFjdFR5cGU6NDU0OTE3"},"decimals":null,"relayId":"QXNzZXRUeXBlOjE1MDQ5OTUxMQ==","favoritesCount":10,"isDelisted":false,"isFavorite":false,"isFrozen":false,"hasUnlockableContent":false,"tokenId":"3379","collection":{"__ref":"Q29sbGVjdGlvblR5cGU6MzY3NDA5MQ=="},"orderData":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:orderData"},"name":"Flame River Chumbi","authenticityMetadata":null,"imageUrl":"https:\u002F\u002Flh3.googleusercontent.com\u002FtM1wwmVb1dA5DadG-QXFBd1nWnqi6DRh7fL0nHgjr2eZkSMKIhcPcwSuBLEvRIgOQRuFDKiovy1kpHSGz4NciKw6oSON-_gdz8aXKw","creator":null,"assetOwners(first:1)":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:assetOwners(first:1)"},"animationUrl":null,"backgroundColor":null,"description":"Chumbi are adorable bipedal creatures that inhabit a lush and mysterious forest valley. They have a spiritual connection to the forest and use magical spells to maintain and protect it.","isListable":true,"isReportedSuspicious":false,"traits(first:100)":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:traits(first:100)"},"displayImageUrl":"https:\u002F\u002Flh3.googleusercontent.com\u002FtM1wwmVb1dA5DadG-QXFBd1nWnqi6DRh7fL0nHgjr2eZkSMKIhcPcwSuBLEvRIgOQRuFDKiovy1kpHSGz4NciKw6oSON-_gdz8aXKw","externalLink":null,"isEditableByOwner":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:isEditableByOwner"}, "url":"https:://google.com" ,"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:traits(first:100)","displayImageUrl":"https:\u002F\u002Flh3.googleusercontent.com\u002FtM1wwmVb1dA5DadG-QXFBd1nWnqi6DRh7fL0nHgjr2eZkSMKIhcPcwSuBLEvRIgOQRuFDKiovy1kpHSGz4NciKw6oSON-_gdz8aXKw","externalLink":null,"isEditableByOwner":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:isEditableByOwner"}



Solution 1:[1]

Using regex for this kind of task is really not appropriate. The data appears to be a JSON-encoded string. In Python you can parse it into a dict with json.loads and read the value by its key:

import json

encodedData = '{"ref":{"__ref":"QXNzZXRDb250cmFjdFR5cGU6NDU0OTE3"},"decimals":null,"relayId":"QXNzZXRUeXBlOjE1MDQ5OTUxMQ==","favoritesCount":10,"isDelisted":false,"isFavorite":false,"isFrozen":false,"hasUnlockableContent":false,"tokenId":"3379","collection":{"__ref":"Q29sbGVjdGlvblR5cGU6MzY3NDA5MQ=="},"orderData":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:orderData"},"name":"Flame River Chumbi","authenticityMetadata":null,"imageUrl":"https:\u002F\u002Flh3.googleusercontent.com\u002FtM1wwmVb1dA5DadG-QXFBd1nWnqi6DRh7fL0nHgjr2eZkSMKIhcPcwSuBLEvRIgOQRuFDKiovy1kpHSGz4NciKw6oSON-_gdz8aXKw","creator":null,"assetOwners(first:1)":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:assetOwners(first:1)"},"animationUrl":null,"backgroundColor":null,"description":"Chumbi are adorable bipedal creatures that inhabit a lush and mysterious forest valley. They have a spiritual connection to the forest and use magical spells to maintain and protect it.","isListable":true,"isReportedSuspicious":false,"traits(first:100)":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:traits(first:100)"},"displayImageUrl":"https:\u002F\u002Flh3.googleusercontent.com\u002FtM1wwmVb1dA5DadG-QXFBd1nWnqi6DRh7fL0nHgjr2eZkSMKIhcPcwSuBLEvRIgOQRuFDKiovy1kpHSGz4NciKw6oSON-_gdz8aXKw","externalLink":null,"isEditableByOwner":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:isEditableByOwner"},"url":"https:://google.com","__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:traits(first:100)","displayImageUrl":"https:\u002F\u002Flh3.googleusercontent.com\u002FtM1wwmVb1dA5DadG-QXFBd1nWnqi6DRh7fL0nHgjr2eZkSMKIhcPcwSuBLEvRIgOQRuFDKiovy1kpHSGz4NciKw6oSON-_gdz8aXKw","externalLink":null,"isEditableByOwner":{"__ref":"client:QXNzZXRUeXBlOjE1MDQ5OTUxMQ==:isEditableByOwner"}}'

data = json.loads(encodedData)
url = data['url']

print(url) # https:://google.com

Note: your pasted data seems to be missing part of the JSON. I added {"ref": at the beginning and } at the end to make it parseable.
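
If the string really cannot be repaired into valid JSON, a narrower regex than the one in the question can still pull out the pair. The sketch below is only a fallback, not a replacement for json.loads. The shortened 'data' string is a hypothetical stand-in for the fragment in the question. The pattern assumes the value is a double-quoted string with no escaped quotes inside it. Unlike '"url":.*', which is greedy and runs to the end of the line, [^"]* stops at the closing quote.

```python
import re

# Shortened, hypothetical stand-in for the fragment pasted in the question.
data = (
    '"externalLink":null,"isEditableByOwner":{"__ref":"client:abc"}, '
    '"url":"https:://google.com" ,"__ref":"client:xyz"'
)

# Match the key "url", a colon (with optional whitespace), then a quoted value.
# [^"]* consumes characters up to, but not including, the closing quote.
pattern = re.compile(r'"url"\s*:\s*"([^"]*)"')

match = pattern.search(data)
pair = match.group(0) if match else None   # the whole "url":"..." pair
value = match.group(1) if match else None  # just the URL text

print(pair)
print(value)
```

Because the pattern starts with the opening quote of "url", it does not match keys such as "imageUrl" or "displayImageUrl". re.search returns only the first match; use pattern.findall(data) if the key can appear more than once.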

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 homer