Requests: Proxy not working in people_also_ask module

I am scraping Google search results using the people_also_ask module. The module itself doesn't have a way to use proxies, so I manually added proxies to its code. When I got blocked by Google, I printed the response status, and it said that my IP address was banned from sending requests. The code I added to the people_also_ask module to use proxies is:

    proxies = {
        'http': "http://username:password@ip:port"
    }
    response = SESSION.get(URL, params=params, headers=HEADERS, proxies=proxies)

I know it is an illegal activity, but I want to know why this happens, mainly for educational purposes. I think the code that extracts the data is irrelevant, so I am only adding simple code that sends requests using the people_also_ask module:

import people_also_ask as paa

queries = ["how to boil eggs", "how to make cake", "price of poco f1", "price of wooden table", "best soap in us", "how much tesla worth"]
for query in queries:
    questions = paa.get_related_questions(query, 40)

Note: The changes were made in the first function, search(), in google.py of the people_also_ask module.

Note: I can run searches from my browser without any problem. Why does Google let me search from the browser but block requests from the script?



Solution 1:[1]

The answer is quite simple. Although it is a proxy service, it doesn't guarantee 100% anonymity. When you send the HTTP GET request via the proxy server, the request sent by your program to the proxy server is:

GET http://www.whatsmybrowser.org/ HTTP/1.1
Host: www.whatsmybrowser.org
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.10.0

Now, when the proxy server sends this request to the actual destination, it sends:

GET http://www.whatsmybrowser.org/ HTTP/1.1
Host: www.whatsmybrowser.org
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.10.0
Via: 1.1 naxserver (squid/3.1.8)
X-Forwarded-For: 122.126.64.43
Cache-Control: max-age=18000
Connection: keep-alive

As you can see, the proxy puts your IP address (in my case, 122.126.64.43) in the X-Forwarded-For HTTP header, so the website knows that the request was sent on behalf of 122.126.64.43.
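To check a set of received headers for this kind of leak, something like the following minimal sketch could be used. The helper name `revealing_headers` and the list of header names it looks for are assumptions made for illustration, and the sample dictionary simply repeats the headers from the example above.

```python
# Header names (lowercase) that commonly reveal a proxy or the original
# client address. This list is an assumption for illustration, not exhaustive.
REVEALING_HEADERS = ("x-forwarded-for", "forwarded", "via")


def revealing_headers(received_headers):
    """Return the headers that expose the proxy or the client IP."""
    return {
        name: value
        for name, value in received_headers.items()
        if name.lower() in REVEALING_HEADERS
    }


# Headers as the destination received them in the example above.
received = {
    "Host": "www.whatsmybrowser.org",
    "Accept-Encoding": "gzip, deflate",
    "Accept": "*/*",
    "User-Agent": "python-requests/2.10.0",
    "Via": "1.1 naxserver (squid/3.1.8)",
    "X-Forwarded-For": "122.126.64.43",
    "Cache-Control": "max-age=18000",
    "Connection": "keep-alive",
}

leaks = revealing_headers(received)
print(leaks)
```

To see what a real destination receives, the same check could be run on the server side of an endpoint you control that logs or echoes the incoming request headers. If the result is empty for requests sent through your proxy, the proxy is probably not adding these headers.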

Read more about this header and its standardized replacement, Forwarded, at: https://www.rfc-editor.org/rfc/rfc7239

If you want to host your own Squid proxy server and stop it from setting the X-Forwarded-For header, read: http://www.squid-cache.org/Doc/config/forwarded_for/
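One more thing that may be worth checking in the snippet from the question: requests picks a proxy from the `proxies` dictionary by the scheme of the URL being requested, and the dictionary there only has an 'http' key. If the module's URL is an https:// address (an assumption, since the question does not show it), that entry would not be selected. The sketch below imitates the lookup with a hypothetical helper, `proxy_for`. It is a simplification, not the library's actual code, and it ignores per-host keys and proxy environment variables.

```python
from urllib.parse import urlparse


def proxy_for(url, proxies):
    """Hypothetical, simplified stand-in for scheme-based proxy lookup."""
    scheme = urlparse(url).scheme
    return proxies.get(scheme)


# Placeholder credentials and address, as in the question.
only_http = {"http": "http://username:password@ip:port"}
both = {
    "http": "http://username:password@ip:port",
    "https": "http://username:password@ip:port",
}

# Assumed target URL; the question does not show the module's URL constant.
target = "https://www.google.com/search"

print(proxy_for(target, only_http))  # None: no entry for the https scheme
print(proxy_for(target, both))
```

With only the 'http' key, an https request would go out directly from your own IP address, which would also match the ban message you saw.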

I don't take any credit for this answer. I copied it from the following post: "Python Requests module - proxy not working".

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 farhan jatt