Scraping live stock price using Google Finance

I recently started learning Python, and one of my first projects is to get live stock prices from Google Finance using BeautifulSoup. Basically, I look up a stock and set a price alert.

Here is what my code looks like:

import requests
import time
import tkinter
import tkinter.messagebox  # the messagebox submodule must be imported explicitly
from bs4 import BeautifulSoup

def st_Price(symbol):
    baseurl = 'http://google.com/finance/quote/'
    URL = baseurl + symbol + ":NSE?hl=en&gl=in"
    
    page = requests.get(URL)
    
    soup = BeautifulSoup(page.content, 'html.parser')
    
    results = soup.find(class_="YMlKec fxKbKc")

    result = results.__str__()
    #print(result)

    res = result.split("₹")[1].split("<")[0]

    res_flt = float(res.replace(",",""))
      
    return res_flt
        
def main():
    
    sym = input("Enter Stock Symbol : ")
    price = input("Enter desired price : ")
    
    
    x = st_Price(sym)
    
    while x < float(price):
        print(x)
        t1 = time.perf_counter()
        x = st_Price(sym)
        t2 = time.perf_counter()
        print("Internal refresh time is {}".format(t2-t1))
    else:
        print("The Stock {} achieved price greater than {}".format(sym,x))
        root = tkinter.Tk()
        root.geometry("150x150")
        tkinter.messagebox.showinfo(title="Price Alert",message="Stock Price {} greater Than {}".format(x,price))
        root.destroy()
    

if __name__ == "__main__":
    main()

I am looking up the following class in the page HTML:

[Image: HTML element for the stock]

The code works perfectly fine, but it takes too long to fetch the information:

Enter Stock Symbol : INFY

Enter desired price : 1578
1574.0
Internal refresh time is 9.915285099999892
1574.0
Internal refresh time is 7.2284357999997155

I am not very familiar with HTML. By referring to online documentation, I was able to figure out how to scrape the necessary part.

Is there any way to reduce the time it takes to fetch the data?



Solution 1:[1]

Have a look at the SelectorGadget Chrome extension to grab CSS selectors by clicking on the desired element in your browser.

Also, the requests library identifies itself with a default user-agent of python-requests, so websites can tell that the request comes from a bot or a script rather than a real user. Check what your own user-agent is and pass it in the request headers.
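As a rough illustration of the user-agent point, the sketch below prints what requests would send by default and shows one way to override it. The browser string is only an example value, and carrying the header on a `requests.Session` is an assumption of this sketch rather than something the answer requires. No request is actually sent.

```python
import requests

# Example browser user-agent; substitute the one your own browser reports.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36"
)

# What requests sends when no User-Agent header is supplied.
default_ua = requests.utils.default_user_agent()
print(default_ua)  # e.g. python-requests/<version>

# A Session applies its headers to every request made through it.
session = requests.Session()
session.headers.update({"User-Agent": BROWSER_UA})
print(session.headers["User-Agent"])
```

A Session also reuses the underlying connection for requests to the same host, which may help when polling the same page repeatedly.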

To get just the current price, use the CSS selector .AHmHk .fxKbKc with the bs4 select_one() method. Keep in mind that these class names could change in the future.

from bs4 import BeautifulSoup
import requests, lxml

headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36",
        }

html = requests.get("https://www.google.com/finance/quote/INFY:NSE", headers=headers, timeout=30)
soup = BeautifulSoup(html.text, "lxml")

# ".zzDege" selects the company title; the price is under ".AHmHk .fxKbKc"
current_price = soup.select_one(".AHmHk .fxKbKc").text
print(current_price)

# ₹1,860.50
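Because the class names can change, a more defensive variant might check for a missing element and convert the text to a number in one place. The following is a minimal sketch under stated assumptions: `extract_price` is a hypothetical helper, and `SAMPLE_HTML` is a hand-written stand-in for the page, so the real markup may differ.

```python
from bs4 import BeautifulSoup

# Static stand-in for the page; the real markup may differ.
SAMPLE_HTML = """
<div class="AHmHk">
  <span><div class="YMlKec fxKbKc">₹1,860.50</div></span>
</div>
"""


def extract_price(html: str, selector: str = ".AHmHk .fxKbKc") -> float:
    """Return the price found by `selector` as a float (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one(selector)
    if node is None:
        # select_one returns None when nothing matches, e.g. after a class rename.
        raise ValueError(f"No element matches selector {selector!r}")
    # Keep digits and the decimal point, dropping the currency symbol and commas.
    cleaned = "".join(ch for ch in node.get_text() if ch.isdigit() or ch == ".")
    return float(cleaned)


print(extract_price(SAMPLE_HTML))  # 1860.5
```

Raising a clear error here means a selector change shows up as a readable message instead of an AttributeError on None.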

Code (with a full example in the online IDE) to scrape the current price and the right panel data:

from bs4 import BeautifulSoup
import requests, lxml, json
from itertools import zip_longest


def scrape_google_finance(ticker: str):
    # https://docs.python-requests.org/en/master/user/quickstart/#custom-headers
    # https://www.whatismybrowser.com/detect/what-is-my-user-agent
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36",
        }

    html = requests.get(f"https://www.google.com/finance/quote/{ticker}", headers=headers, timeout=30)
    soup = BeautifulSoup(html.text, "lxml")
    
    ticker_data = {"right_panel_data": {},
                    "ticker_info": {}}
    
    ticker_data["ticker_info"]["title"] = soup.select_one(".zzDege").text
    ticker_data["ticker_info"]["current_price"] = soup.select_one(".AHmHk .fxKbKc").text
    
    right_panel_keys = soup.select(".gyFHrc .mfs7Fc")
    right_panel_values = soup.select(".gyFHrc .P6K39c")
    
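    for key, value in zip_longest(right_panel_keys, right_panel_values):
        # zip_longest pads the shorter list with None, so skip unpaired items
        if key is None or value is None:
            continue
        key_value = key.text.lower().replace(" ", "_")

        ticker_data["right_panel_data"][key_value] = value.text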
    
    return ticker_data
    

data = scrape_google_finance(ticker="INFY:NSE")

# ensure_ascii=False to display the Indian Rupee symbol (₹)
print(json.dumps(data, indent=2, ensure_ascii=False))
print(data["right_panel_data"].get("ceo"))

Outputs:

{
  "right_panel_data": {
    "previous_close": "₹1,882.95",
    "day_range": "₹1,857.15 - ₹1,889.60",
    "year_range": "₹1,311.30 - ₹1,953.90",
    "market_cap": "7.89T INR",
    "p/e_ratio": "36.60",
    "dividend_yield": "1.61%",
    "primary_exchange": "NSE",
    "ceo": "Salil Parekh",
    "founded": "Jul 2, 1981",
    "headquarters": "Bengaluru, KarnatakaIndia",
    "website": "infosys.com",
    "employees": "292,067"
  },
  "ticker_info": {
    "title": "Infosys Ltd",
    "current_price": "₹1,860.50"
  }
}
Salil Parekh

If you want to scrape more data with a line-by-line explanation, see my blog post, Scrape Google Finance Ticker Quote Data in Python.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1