How to extract PDF text with Google Vision API in C#

I want to use Google Vision to extract the contents of a PDF as text. My PDF includes a table that I want to extract (BlockType = table).

I am not sure how to do that in C# though.

I installed the Google.Cloud.Vision.V1 NuGet package and tried the DetectDocumentText method, but it seems to accept only images.

var client = new ImageAnnotatorClientBuilder
{
    CredentialsPath = @"myjsonfile.json"
}.Build();

Image image = Image.FromUri("https://storage.cloud.google.com/pathtomyfile.pdf");

TextAnnotation response = client.DetectDocumentText(image); // Getting error for a bad image.

Then I looked for file-based methods and found BatchAnnotateFilesAsync, but I am not sure how to build the BatchAnnotateFilesRequest object it requires, and I can't find any examples in C#.

Can anyone help me figure out how to extract the text of table blocks from a PDF document?

Thanks in advance.



Solution 1:[1]

private string ScanPDFWithGoogle(string path)
{
    try
    {
        // DetectText accepts image files (PNG, JPEG, ...) only. This was tried with a PNG;
        // a PDF has to go through BatchAnnotateFiles or AsyncBatchAnnotateFiles instead.
        var image = Google.Cloud.Vision.V1.Image.FromFile(path);

        // Path to the service-account JSON key, read from the app settings.
        var credentialPath = ConfigurationManager.AppSettings["GOOGLE_APPLICATION_CREDENTIALS"];
        Log.Write("Using credentials from: " + credentialPath);

        // Build the client with explicit credentials (same builder as in the question).
        ImageAnnotatorClient client = new ImageAnnotatorClientBuilder
        {
            CredentialsPath = credentialPath
        }.Build();

        // DetectText returns a list of EntityAnnotation results.
        var response = client.DetectText(image);
        return response.ToString();
    }
    catch (Exception ex)
    {
        Log.Write("Error calling the Vision API: " + ex.Message);
        Log.Write(ex.StackTrace);
        throw; // rethrow without resetting the stack trace
    }
}
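
The snippet above only handles images, so here is a rough sketch of how the BatchAnnotateFilesRequest from the question might be built for a PDF. It assumes the Google.Cloud.Vision.V1 package, a PDF already uploaded to Cloud Storage, and a gs:// URI (the https://storage.cloud.google.com/... form is not a Cloud Storage URI). The class name PdfOcrSketch, the method name PrintPdfText, and the page numbers are made up for illustration. The synchronous call handles at most five pages per request; longer documents need AsyncBatchAnnotateFiles, which writes its output to a bucket.

```csharp
using System;
using System.Linq;
using Google.Cloud.Vision.V1;

public static class PdfOcrSketch
{
    // Hypothetical helper: OCR a few pages of a PDF stored in Cloud Storage.
    // gcsUri is expected to look like "gs://bucket-name/file.pdf" (placeholder).
    public static void PrintPdfText(string gcsUri, string credentialsPath)
    {
        ImageAnnotatorClient client = new ImageAnnotatorClientBuilder
        {
            CredentialsPath = credentialsPath
        }.Build();

        var request = new BatchAnnotateFilesRequest
        {
            Requests =
            {
                new AnnotateFileRequest
                {
                    InputConfig = new InputConfig
                    {
                        GcsSource = new GcsSource { Uri = gcsUri },
                        MimeType = "application/pdf"
                    },
                    Features =
                    {
                        new Feature { Type = Feature.Types.Type.DocumentTextDetection }
                    },
                    // 1-based page numbers; pages 1 and 2 are an arbitrary choice.
                    Pages = { 1, 2 }
                }
            }
        };

        BatchAnnotateFilesResponse batch = client.BatchAnnotateFiles(request);

        foreach (AnnotateFileResponse file in batch.Responses)
        {
            // Each inner response corresponds to one processed page of the PDF.
            foreach (AnnotateImageResponse pageResponse in file.Responses)
            {
                if (pageResponse.Error != null)
                {
                    Console.WriteLine("Page failed: " + pageResponse.Error.Message);
                    continue;
                }

                TextAnnotation annotation = pageResponse.FullTextAnnotation;
                if (annotation == null)
                {
                    continue; // no text detected on this page
                }

                Console.WriteLine(annotation.Text);

                // Keep only blocks that Vision labelled as tables.
                foreach (Page page in annotation.Pages)
                {
                    foreach (Block block in page.Blocks)
                    {
                        if (block.BlockType != Block.Types.BlockType.Table)
                        {
                            continue;
                        }

                        // Rebuild the block text word by word from its symbols.
                        var words = block.Paragraphs
                            .SelectMany(p => p.Words)
                            .Select(w => string.Concat(w.Symbols.Select(s => s.Text)));
                        Console.WriteLine("Table block: " + string.Join(" ", words));
                    }
                }
            }
        }
    }
}
```

A call might look like PdfOcrSketch.PrintPdfText("gs://my-bucket/my-file.pdf", @"myjsonfile.json"), where the bucket and file names are placeholders. Be aware that Vision may report table regions as ordinary text blocks, so the Table filter might match nothing; in that case the table has to be reconstructed from the bounding boxes of the text blocks.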

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jeremy Caney