How to retrieve AWS Textract response JSON using the Java SDK
I am using AWS Textract to OCR images and create a searchable PDF as outlined in this AWS blog post. The basic request code looks like this:
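import com.amazonaws.services.textract.AmazonTextract;
import com.amazonaws.services.textract.AmazonTextractClientBuilder;
import com.amazonaws.services.textract.model.*;
import java.util.List;
// Build a Textract client with the default credentials and region.
AmazonTextract client = AmazonTextractClientBuilder.standard().build();
DetectDocumentTextRequest request = new DetectDocumentTextRequest()
        .withDocument(new Document()
                .withBytes(imageBytes)); // imageBytes is a java.nio.ByteBuffer
DetectDocumentTextResult result = client.detectDocumentText(request);
List<Block> blocks = result.getBlocks();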
This works well; however, I would also like to write out and keep the original response JSON, which contains all the information about what was detected and where.
Is there a way to get the response JSON using the Java SDK?
Solution 1:[1]
The AWS SDK doesn't return the response JSON to you in raw form. The assumption may have been that it wouldn't be needed once it has been converted to a DetectDocumentTextResult object.
You can, however, convert the DetectDocumentTextResult object to JSON yourself (example), which should provide identical values. Note that the field names will not be identical to those in the raw response (e.g. DocumentMetadata vs. documentMetadata), but the values of those fields will be the same.
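To illustrate why the field names come out differently, here is a minimal sketch of getter-based serialization using only the JDK. The classes Metadata, Block, and Result are hypothetical stand-ins for the SDK's model types, not the real ones, and toJson is a simplified helper written for this example. With the real SDK you would more likely hand the DetectDocumentTextResult to a JSON library (for instance Jackson's ObjectMapper.writeValueAsString, assuming Jackson is on your classpath), which derives property names from getters in a similar way.

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ResultToJson {

    // Hypothetical stand-ins for the SDK model classes (assumption, not the real types).
    public static class Metadata {
        public Integer getPages() { return 1; }
    }

    public static class Block {
        public String getBlockType() { return "LINE"; }
        public String getText() { return "Hello"; }
    }

    public static class Result {
        public Metadata getDocumentMetadata() { return new Metadata(); }
        public List<Block> getBlocks() { return Collections.singletonList(new Block()); }
    }

    // Quotes a string, escaping only backslashes and double quotes (a simplification).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Serializes strings, numbers, booleans, lists, and getter-based beans to JSON.
    public static String toJson(Object value) {
        if (value == null) return "null";
        if (value instanceof String) return quote((String) value);
        if (value instanceof Number || value instanceof Boolean) return value.toString();
        if (value instanceof List) {
            StringBuilder sb = new StringBuilder("[");
            for (Object item : (List<?>) value) {
                if (sb.length() > 1) sb.append(',');
                sb.append(toJson(item));
            }
            return sb.append(']').toString();
        }
        // Bean: derive property names from public getters, sorted for stable output.
        Map<String, String> props = new TreeMap<>();
        for (Method m : value.getClass().getMethods()) {
            String name = m.getName();
            boolean isGetter = name.startsWith("get") && name.length() > 3
                    && m.getParameterCount() == 0 && !name.equals("getClass");
            if (!isGetter) continue;
            // getDocumentMetadata -> documentMetadata (lower-case first letter).
            String prop = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            try {
                props.put(prop, toJson(m.invoke(value)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not read property " + prop, e);
            }
        }
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (sb.length() > 1) sb.append(',');
            sb.append(quote(e.getKey())).append(':').append(e.getValue());
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson(new Result()));
    }
}
```

Because the keys are derived from getter names, the output uses documentMetadata and blocks rather than the capitalized names of the raw response, which matches the naming difference described above.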
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Cosmittus |
