Get PDB ID + chain ID from UniProt ID?
I have a list of UniProt IDs and need the corresponding PDB IDs together with their chain IDs. With the code given on the UniProt website I can get the PDB IDs, but not the chain information.
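import urllib.parse
import urllib.request

# Example input: space-separated UniProt accessions (taken from the output below)
UniProtIDs = 'A0A075B6N1 A0A075B6T6'

url = 'https://www.uniprot.org/uploadlists/'
params = {
    'from': 'ACC+ID',
    'to': 'PDB_ID',
    'format': 'tab',
    'query': UniProtIDs
}

# URL-encode the form fields and send them as a POST body
data = urllib.parse.urlencode(params)
data = data.encode('utf-8')
req = urllib.request.Request(url, data)

# Append the tab-separated mapping to a text file
with open('UniProt_PDB_IDs.txt', 'a') as f:
    with urllib.request.urlopen(req) as q:
        response = q.read()
        f.write(response.decode('utf-8'))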
This code gives me the following output:
From To
A0A075B6N1 5HHM
A0A075B6N1 5HHO
A0A075B6N1 5NQK
A0A075B6T6 1AO7
A0A075B6T6 4ZDH
For the protein A0A075B6N1 with PDB ID 5HHM, the chains are E and J, so I need a way to also retrieve the chains and get something like this:
A0A075B6N1 5HHM_E
A0A075B6N1 5HHM_J
A0A075B6N1 5HHO_E
A0A075B6N1 5NQK_B
It doesn't have to be in this format; later I convert it into a dictionary with the UniProt IDs as keys and the PDB IDs as values.
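For reference, the dictionary conversion could look roughly like the sketch below. It assumes whitespace-separated lines of the form `ACCESSION PDBID_CHAIN` with an optional `From To` header; the function name `build_mapping` is made up for illustration.

```python
from collections import defaultdict


def build_mapping(text):
    """Group 'PDBID_CHAIN' values by UniProt accession.

    Assumes each data line holds exactly two whitespace-separated
    fields; the 'From To' header and malformed lines are skipped.
    """
    mapping = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or fields[0] == "From":
            continue
        accession, pdb_chain = fields
        mapping[accession].append(pdb_chain)
    return dict(mapping)


# Sample input copied from the desired output shown above
sample = """From To
A0A075B6N1 5HHM_E
A0A075B6N1 5HHM_J
A0A075B6N1 5HHO_E
A0A075B6N1 5NQK_B
"""
mapping = build_mapping(sample)
```

Each key then holds a list, since one UniProt entry can map to several PDB chains.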
Thank you for your help in advance!
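One possible direction, sketched here under assumptions rather than as a confirmed solution: UniProt entries list chain information in their PDB cross-references, in a form like `E/J=22-112`. The sketch assumes the entry is available as JSON from `https://rest.uniprot.org/uniprotkb/<accession>.json`, with a `uniProtKBCrossReferences` list whose PDB items carry a `Chains` property. The helper names (`parse_chain_ids`, `pdb_chains_from_entry`, `fetch_pdb_chains`) are hypothetical.

```python
import json
import urllib.request

# Assumption: the UniProt REST API serves single entries as JSON here.
UNIPROT_ENTRY_URL = "https://rest.uniprot.org/uniprotkb/{accession}.json"


def parse_chain_ids(chains_field):
    """Turn a 'Chains' value such as 'E/J=22-112' into ['E', 'J']."""
    chain_ids = []
    for segment in chains_field.split(","):
        # Drop the residue range after '=' and split the chain names on '/'
        names = segment.split("=")[0].strip()
        for name in names.split("/"):
            if name and name not in chain_ids:
                chain_ids.append(name)
    return chain_ids


def pdb_chains_from_entry(entry):
    """Collect 'PDBID_CHAIN' strings from one parsed UniProt JSON entry."""
    results = []
    for xref in entry.get("uniProtKBCrossReferences", []):
        if xref.get("database") != "PDB":
            continue
        props = {p["key"]: p["value"] for p in xref.get("properties", [])}
        for chain in parse_chain_ids(props.get("Chains", "")):
            results.append(f"{xref['id']}_{chain}")
    return results


def fetch_pdb_chains(accessions):
    """Map each accession to its 'PDBID_CHAIN' list (one request per ID)."""
    mapping = {}
    for accession in accessions:
        url = UNIPROT_ENTRY_URL.format(accession=accession)
        with urllib.request.urlopen(url) as response:
            entry = json.load(response)
        mapping[accession] = pdb_chains_from_entry(entry)
    return mapping
```

Calling `fetch_pdb_chains(["A0A075B6N1", "A0A075B6T6"])` would return a dictionary keyed by UniProt ID, which matches the structure wanted later. The parsing is kept separate from the download so it can be checked without network access.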
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
