How to decode Scrapy responses? Scrapy response differs from Requests module response
I've been using Scrapy for about a year, and this is the first time I've run into an encoding problem. I don't set any headers or cookies on my requests because I use an external proxy rotation service, and I've already verified that the request itself is made correctly (UTF-8).
When I make the request with the "requests" module, the source code comes back correctly decoded, and I can extract what I need with a scrapy.Selector. When I use Scrapy, the decoding goes wrong: several special characters come out garbled, and I have to encode the text as latin-1 and then decode it again as UTF-8:
cache_response = response
try:
    response = response.text.encode("latin-1").decode("utf-8")
    response = Selector(text=response)
except Exception as e:
    LOGGER.error("Exception Encoding")
    response = cache_response
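The round trip above only helps when the text started out as UTF-8 bytes that were then decoded with a single-byte codec. A minimal standard-library sketch of that mechanism, using a sample string from this question (the variable names are illustrative, and the assumption is that the server really sends UTF-8):

```python
# Illustrative only: reproduce the garbling with plain Python, no Scrapy needed.
original = "Certificação energética"

# UTF-8 bytes, as the server is assumed to send them.
raw = original.encode("utf-8")

# Decoding those bytes with the wrong single-byte codec produces mojibake.
garbled = raw.decode("latin-1")

# Re-encoding with the same wrong codec recovers the raw bytes,
# which can then be decoded with the right codec.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired)
```

If the text was instead decoded with a codec other than latin-1, the encode step can raise, which is presumably why the snippet above is wrapped in try/except.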
Scrapy vs. Requests (same headers):
Requests
<li>Certificação energética:</li>
Scrapy
<li>CertificaÃ§Ã£o energÃ©tica:</li>
Scrapy: response.css('span[id="foo"]::text').get() returns: 107 m² construÃ\xaddos, 64 m² úteis
How can I make Scrapy return a correctly decoded response object?
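One possible approach, assuming the body bytes are valid UTF-8 and Scrapy merely picked a different encoding, is to decode the raw bytes directly instead of repairing the already-decoded text. In a Scrapy callback the raw bytes are available as response.body; the helper name decode_body and its fallback codec below are hypothetical, not part of Scrapy.

```python
def decode_body(body: bytes, fallback: str = "latin-1") -> str:
    """Decode raw response bytes as UTF-8, falling back to a single-byte codec."""
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 after all. The default fallback, latin-1,
        # maps every byte value, so this second decode cannot raise.
        return body.decode(fallback)


# Stand-in for response.body in a Scrapy callback (assumed to be UTF-8 bytes).
body = "107 m² construídos, 64 m² úteis".encode("utf-8")
html = decode_body(body)
print(html)
```

In a spider, the result could be passed on as Selector(text=decode_body(response.body)). Calling response.replace(encoding="utf-8") on the response may be a shorter way to get the same effect, but that has not been checked against the site in question.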
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
