Percentage of Google N-grams containing a particular word

I'm trying to use the Google Ngram API to find out what percentage of, say, 2-, 3-, or 4-grams contain a particular word, for example 'happy'.

Just using the query for 'happy', which gives the percentage of 1-grams that the word 'happy' accounts for, should give a reasonable rough estimate, but I want to be more precise.
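As a rough sanity check on how the two quantities relate, here is a back-of-the-envelope sketch. It assumes (and this is only an assumption) that each of the n positions in an n-gram holds the word independently with probability p, where p is the 1-gram frequency. Real text is not independent across positions, so this is an approximation at best:

$$
P(\text{n-gram contains the word}) = 1 - (1 - p)^n \approx n\,p \quad \text{for small } p
$$

Under that assumption, the share of 2-grams containing 'happy' would be about twice the 1-gram frequency, the share of 3-grams about three times, and so on.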

For example, this is the 1-gram query for a particular year:

https://books.google.com/ngrams/graph?content=happy&year_start=2018&year_end=2019&corpus=26&smoothing=3&case_insensitive=true

I see that you can download the raw frequency counts for all of the 1- to 5-grams, so if all else fails I can get the answer that way, but this seemed like a fairly natural question for the standard API to answer.
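For the raw-data fallback, a minimal sketch of the calculation might look like the following. It assumes tab-separated rows of the form `ngram<TAB>year<TAB>match_count<TAB>volume_count` (the layout I understand the 2012 downloads to use; other releases may differ), and the function name `share_containing` and the sample rows are made up for illustration. It also ignores part-of-speech-tagged tokens such as `happy_ADJ`.

```python
def share_containing(lines, word, year):
    """Fraction of n-gram occurrences in `year` whose n-gram contains `word`.

    Assumes each line is: ngram<TAB>year<TAB>match_count<TAB>volume_count
    (an assumed layout; adjust the parsing if the files differ).
    """
    word = word.lower()
    containing = 0  # occurrences of n-grams that include the word
    total = 0       # occurrences of all n-grams in the chosen year
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed rows
        ngram, row_year, match_count = fields[0], int(fields[1]), int(fields[2])
        if row_year != year:
            continue
        total += match_count
        # Case-insensitive match on whole tokens, not substrings
        tokens = [t.lower() for t in ngram.split(" ")]
        if word in tokens:
            containing += match_count
    return containing / total if total else 0.0


# Toy rows, invented purely to show the calculation
sample = [
    "happy birthday\t2018\t30\t12",
    "very happy\t2018\t20\t9",
    "the cat\t2018\t50\t20",
    "happy birthday\t2017\t99\t40",
]
print(share_containing(sample, "happy", 2018))
```

On the real data, `lines` would have to stream over every file for the chosen n, since the denominator is the total count of all n-grams in that year.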

I thought the query might be something like 'happy *', but this only returns the top 10 2-grams starting with 'happy'.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
