UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte in Python, even though the file is encoded in UTF-8
I am getting a decoding error in the code below even though the file is already in UTF-8. The error is raised on the loop line for row in reader: shown below. Please explain how I can solve this issue.
def loadPhishTank(self):
    db = RedirectDB(self.runConfig)
    phishFile = self.runConfig.phishTankLocation+'phishTank-'+self.runConfig.day+'.csv'
    if (not os.path.exists(self.runConfig.phishTankLocation)):
        os.makedirs(self.runConfig.phishTankLocation)
    self.downloadPhishTank(phishFile)
    with open(phishFile, encoding='utf-8') as fin:
        reader = csv.DictReader(fin)
        urls = {}
        for row in reader:
            url = row['url']
            meta = {'phish_detail_url':row['phish_detail_url'],
                    'submission_time':row['submission_time'],
                    'src':'phishTank'}
            urls[url] = meta
    samples = random.sample(list(urls.keys()),self.runConfig.phishTankSampleSize)
    for sample in samples:
        db.addUrlsFromList(sample,urls[sample],'phishTank',self.runConfig)
    db.close()
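
One possible cause, which the traceback alone cannot confirm: byte 0xff at position 0 is how a UTF-16 little-endian byte-order mark (FF FE) begins, so the downloaded file may not actually be UTF-8. The sketch below reads the first few bytes and picks an encoding from the byte-order mark before handing the file to csv.DictReader. The helper names sniff_encoding and read_rows are hypothetical and not part of the original code. If the file turns out not to be text at all, this approach will not help.

```python
import codecs
import csv

# Byte-order marks, checked longest first because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(path, default="utf-8"):
    """Guess a text encoding from the file's byte-order mark, if any.

    Hypothetical helper: falls back to `default` when no BOM is found.
    """
    with open(path, "rb") as raw:
        head = raw.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def read_rows(path):
    """Read a CSV file into a list of dicts using the sniffed encoding."""
    # newline="" follows the csv module's recommendation for file objects.
    with open(path, encoding=sniff_encoding(path), newline="") as fin:
        return list(csv.DictReader(fin))
```

In loadPhishTank, this would amount to replacing encoding='utf-8' with encoding=sniff_encoding(phishFile) in the open call. Printing the first bytes with open(phishFile, 'rb').read(4) is a quick way to check what the file really starts with.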
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
