Gmail API: handling a large number of requests

I am trying to benchmark how long some operations on Gmail messages take against a mailbox containing a large number of emails. I am using OAuth and a Gmail account with over 200 GB of messages, and the way I am currently going about it is fairly naive: I loop until there is no longer a nextPageToken in the list-messages response. When a page of results comes in, I iterate through its messages and use each message ID to fetch the full email (in RAW format). The page size is set to 50.

The issue I keep running into is an "Insufficient memory" exception in my application. I am not entirely sure why, as I completely process each page before moving on to the next request.

The process flow goes something like this:

request a list of messages -> process the page of 50 messages (each individual message requires a request for its full raw content) -> print the message data to the screen -> request the next page with the previous response's NextPageToken -> continue until no token is left.

// Here is the core logic in the main function that handles getting the list of messages and calling the helper function.

using (StreamWriter oFile = new StreamWriter(Path.Combine(Directory.GetCurrentDirectory(), backupFile)))
{
    ListMessagesResponse response = null;

    while (true)
    {
        // Stop once the last page has been processed.
        if (response != null && string.IsNullOrWhiteSpace(response.NextPageToken))
            break;

        Google.Apis.Gmail.v1.UsersResource.MessagesResource.ListRequest req = service.Users.Messages.List("me");
        req.IncludeSpamTrash = false;
        req.MaxResults = 50;

        if (response != null)
        {
            req.PageToken = response.NextPageToken;
        }

        try
        {
            response = req.Execute();
        }
        catch (Exception except)
        {
            oFile.Write(except + "\n\n");
            break; // if the list request fails, the old response would be re-processed below
        }

        // Messages can be null on an empty page.
        if (response.Messages == null)
            continue;

        foreach (Message message in response.Messages)
        {
            if (DisplayMessage(service, oFile, message) == false)
            {
                break;
            }
        }
    }
}


// Here is the helper function to display the messages
public static bool DisplayMessage(GmailService _service, StreamWriter _fileWriter, Message _message)
{
    var emailInfoRequest = _service.Users.Messages.Get("me", _message.Id);
    emailInfoRequest.Format = Google.Apis.Gmail.v1.UsersResource.MessagesResource.GetRequest.FormatEnum.Raw;

    Message emailInfoResponse = null;
    try
    {
        emailInfoResponse = emailInfoRequest.Execute();
    }
    catch (Exception except)
    {
        _fileWriter.Write(except + "\n\n");
        return false; // report the failure instead of always returning true
    }

    try
    {
        // Note: with RAW format, emailInfoResponse.Raw holds the entire base64url-encoded
        // message, so serializing builds a second potentially multi-megabyte string.
        string email = Newtonsoft.Json.JsonConvert.SerializeObject(emailInfoResponse, Formatting.None);
        Console.WriteLine(email + "\n");
    }
    catch (Exception except)
    {
        _fileWriter.Write(except + "\n\n");
    }

    return true;
}

What am I missing here when it comes to memory buildup? After processing 100+ pages of messages I start to see "Insufficient memory" exceptions, and a few requests later my application crashes. From what I understand, after DisplayMessage completes for a message, shouldn't that message's data go out of scope? Is garbage collection just having a hard time keeping up with the rate of requests?
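To illustrate the per-message allocations I am asking about: here is a sketch (not tested at scale, assuming the same Newtonsoft.Json and Gmail data types as above) that serializes the response straight into the StreamWriter instead of materializing the whole JSON string first:

```csharp
// Sketch: stream one message's JSON directly into the backup file, avoiding the
// intermediate multi-megabyte string per RAW email. `WriteMessageJson` is a
// hypothetical helper, not part of the code above.
private static readonly Newtonsoft.Json.JsonSerializer Serializer = new Newtonsoft.Json.JsonSerializer();

public static void WriteMessageJson(StreamWriter fileWriter, Message emailInfoResponse)
{
    using (var jsonWriter = new Newtonsoft.Json.JsonTextWriter(fileWriter))
    {
        jsonWriter.CloseOutput = false; // keep the underlying StreamWriter open for later messages
        Serializer.Serialize(jsonWriter, emailInfoResponse);
    }
    fileWriter.Write("\n");
}
```

My thinking is that large short-lived strings like these land on the large object heap, which is collected less eagerly, but I am not sure that is the whole story.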

Also, I would appreciate any advice on how to optimize processing this number of requests. I understand that Google recommends batching requests and using gzip compression (https://developers.google.com/gmail/api/guides/performance), but from what I understand those are primarily network-speed improvements. At the moment I am more focused on getting my application to complete without crashing locally, so I can look at those enhancements afterwards.
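For reference, my understanding of how those two recommendations map onto the .NET client is roughly this (a sketch, assuming `credential` is an already-obtained user credential and `messageIds` is one page of IDs; the application name is made up):

```csharp
// gzip is enabled on the client itself via the service initializer.
var service = new GmailService(new Google.Apis.Services.BaseClientService.Initializer
{
    HttpClientInitializer = credential,
    ApplicationName = "GmailBenchmark", // hypothetical name
    GZipEnabled = true                  // ask for gzip-compressed responses
});

// Batching queues many Get requests into one HTTP round trip.
var batch = new Google.Apis.Requests.BatchRequest(service);
foreach (string id in messageIds)
{
    var get = service.Users.Messages.Get("me", id);
    get.Format = Google.Apis.Gmail.v1.UsersResource.MessagesResource.GetRequest.FormatEnum.Raw;
    // The callback fires once per queued request when the batch completes.
    batch.Queue<Message>(get, (message, error, index, httpMessage) =>
    {
        if (error != null) Console.Error.WriteLine(error.Message);
    });
}
await batch.ExecuteAsync();
```

I realize this does not address the memory problem by itself, which is why I want to get the crash sorted first.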

Thanks!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow