Compare two strings and produce a score for how closely they match
I want to be able to validate that two objects match to a certain degree, say on a scale from 0 to 1.
Assume there is a Person object:
    public class Person
    {
        public Guid Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int Age { get; set; }
    }
Now, I have a database with verified Person details. For every transaction, new person data is acquired through OCR processing (for example from a credit card or another document), so it is likely to differ slightly from the full data stored in the database: there might be missing characters, additional whitespace, etc.
What I want is to compare the data acquired from OCR against a specific person from the database and get a match score. I already know which Person object it is, so there is no need to validate against all the records. I could then test the data and pick a threshold that is acceptable for person validation.
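To make the goal concrete, here is a minimal sketch of the kind of score described above. It assumes a normalized Levenshtein (edit distance) similarity per field, combined with weights. The names `PersonMatcher`, `Similarity`, and `Score`, the trim-and-uppercase normalization, and the weights 0.4 / 0.4 / 0.2 are all hypothetical choices for illustration, not part of any library.

```csharp
using System;

// Mirrors the Person class from the question.
public class Person
{
    public Guid Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

public static class PersonMatcher
{
    // Levenshtein edit distance (insert, delete, substitute), two-row dynamic programming.
    public static int Levenshtein(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Swap rows so that prev always holds the most recently completed row.
            var tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Assumed normalization: trim and ignore case. Adjust to the OCR noise you actually see.
    private static string Normalize(string s)
    {
        return (s ?? string.Empty).Trim().ToUpperInvariant();
    }

    // Similarity in [0, 1]: 1 means identical after normalization.
    public static double Similarity(string expected, string actual)
    {
        string a = Normalize(expected);
        string b = Normalize(actual);
        int maxLen = Math.Max(a.Length, b.Length);
        if (maxLen == 0) return 1.0; // two empty strings are treated as a full match
        return 1.0 - (double)Levenshtein(a, b) / maxLen;
    }

    // Weighted overall score in [0, 1]. The weights are illustrative assumptions.
    public static double Score(Person verified, Person ocr)
    {
        double first = Similarity(verified.FirstName, ocr.FirstName);
        double last = Similarity(verified.LastName, ocr.LastName);
        double age = verified.Age == ocr.Age ? 1.0 : 0.0;
        return 0.4 * first + 0.4 * last + 0.2 * age;
    }
}
```

For example, comparing a verified "Jonathan" with an OCR result "Jonathn" gives an edit distance of 1 over a maximum length of 8, so the field similarity is 0.875. With matching last name and age, the overall score under these assumed weights is about 0.95, which could then be compared against a chosen threshold.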
Initially, I thought I could use Elasticsearch for this (I have never used it before, but did some reading and research about it today). I would index the verified person objects and then search for a given customer with rules like: it must match on Id, and it should match on FirstName, LastName, etc., with weights or boosting for particular fields. But the more I think about it, the more it looks like overengineering. So I have a few questions:
- Is Elasticsearch the right tool for this problem, given that I already know which verified person I want to fetch and compare against the OCR result, and all I need is an algorithm to evaluate the relevance of that match?
- If it is, which ES API should I use? I have looked at full-text matching, multi-field matching with query boosting, and many others, but I'm not sure which (if any) would address my problem.
- If ES is not a good solution for this, is there another open source tool or library that would be better suited to this kind of problem? I'm working in a .NET environment.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
