UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte in Python, even though the file is encoded in UTF-8

I am getting a decoding error in the code below, even though the file is already in UTF-8. Please explain how I can solve this issue. The error is raised at the loop "for row in reader:".

import csv
import os
import random

def loadPhishTank(self):
    db = RedirectDB(self.runConfig)
    phishFile = self.runConfig.phishTankLocation + 'phishTank-' + self.runConfig.day + '.csv'
    if not os.path.exists(self.runConfig.phishTankLocation):
        os.makedirs(self.runConfig.phishTankLocation)
    self.downloadPhishTank(phishFile)
    with open(phishFile, encoding='utf-8') as fin:
        reader = csv.DictReader(fin)
        urls = {}
        # The UnicodeDecodeError is raised while iterating over the reader.
        for row in reader:
            url = row['url']
            meta = {'phish_detail_url': row['phish_detail_url'],
                    'submission_time': row['submission_time'],
                    'src': 'phishTank'}
            urls[url] = meta
        samples = random.sample(list(urls.keys()), self.runConfig.phishTankSampleSize)
        for sample in samples:
            db.addUrlsFromList(sample, urls[sample], 'phishTank', self.runConfig)
    db.close()
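
A note on the likely cause, offered as a sketch rather than a confirmed diagnosis: the byte 0xff can never appear in valid UTF-8, so a file that starts with it is not UTF-8, whatever it is assumed to be. A leading FF FE pair is the UTF-16 little-endian byte-order mark (BOM), which is a common reason for this exact message. The helper below, sniff_encoding, is a hypothetical name and not part of the original code. It reads the first bytes of the file and picks a codec from the BOM, falling back to UTF-8 when no BOM is present.

```python
import codecs

# BOMs are checked longest-first so that UTF-32 LE (FF FE 00 00)
# is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(path, default="utf-8"):
    """Guess a codec name from the file's byte-order mark (hypothetical helper).

    Returns `default` when the file does not start with a known BOM.
    """
    with open(path, "rb") as raw:
        head = raw.read(4)  # the longest BOM checked here is 4 bytes
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default
```

With this helper, the open call in loadPhishTank could become open(phishFile, encoding=sniff_encoding(phishFile), newline=''). If no BOM is found and the error persists, printing open(phishFile, 'rb').read(16) shows what was actually downloaded, which may not be the CSV that was expected.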

python utf-8

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
