How to obtain the full GPE for named entity recognition using NLTK? It misses the full name or full city

How can I avoid duplicate names, capture full names, and fix location errors when doing NER with NLTK?

import nltk

# Requires NLTK's tokenizer, POS tagger, and NE chunker data (installed via nltk.download()).
sentence = 'Mark, Anitha and Ann Hathway are working at Crazybook. Mark Anthony arrived from Ghana and the second person moved from India to Crazy Bel Technologies in San diego before arriving here in Mountain View'

for sent in nltk.sent_tokenize(sentence):
    # ne_chunk returns a tree; named entities are the subtrees that carry a label
    for chunk in nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(sent))):
        if hasattr(chunk, 'label'):
            # Print the entity label followed by the words it spans
            print(chunk.label(), ' '.join(c[0] for c in chunk))

PERSON Mark
PERSON Anitha
PERSON Ann Hathway
ORGANIZATION Crazybook
PERSON Mark
PERSON Anthony
GPE Ghana
GPE India
PERSON Crazy Bel
GPE San
GPE Mountain

Issue #1: as the output shows, the first Mark, the second Mark, and Anthony all refer to the same person in this context (Mark Anthony). How can this be detected?
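To make Issue #1 concrete, here is a minimal sketch of one possible post-processing heuristic, not an NLTK feature: first merge PERSON chunks that sit directly next to each other, then map each shorter mention to a longer name that contains all of its tokens. The function names `merge_adjacent` and `link_to_full_names` are hypothetical, and the token offsets are written by hand from the example sentence rather than produced by NLTK.

```python
from typing import Dict, List, Tuple

# (text, start token index, end token index exclusive)
Mention = Tuple[str, int, int]


def merge_adjacent(mentions: List[Mention]) -> List[Mention]:
    """Join mentions whose token spans touch, e.g. 'Mark' + 'Anthony'."""
    merged: List[Mention] = []
    for text, start, end in mentions:
        if merged and merged[-1][2] == start:
            prev_text, prev_start, _ = merged[-1]
            merged[-1] = (prev_text + " " + text, prev_start, end)
        else:
            merged.append((text, start, end))
    return merged


def link_to_full_names(mentions: List[Mention]) -> Dict[str, str]:
    """Map each name to the longest name that contains all of its tokens."""
    names = [text for text, _, _ in mentions]
    canonical: Dict[str, str] = {}
    for name in names:
        tokens = set(name.split())
        candidates = [
            other for other in names
            if len(other.split()) > len(name.split())
            and tokens <= set(other.split())
        ]
        # Fall back to the name itself when no longer form exists
        canonical[name] = max(candidates, key=lambda n: len(n.split())) if candidates else name
    return canonical


# PERSON mentions from the output above, with hand-derived token offsets
# (assumption: offsets count tokens across both sentences).
person_mentions = [
    ("Mark", 0, 1),
    ("Anitha", 2, 3),
    ("Ann Hathway", 4, 6),
    ("Mark", 11, 12),
    ("Anthony", 12, 13),
]

merged = merge_adjacent(person_mentions)
canonical = link_to_full_names(merged)
print(canonical)
```

This is only a string-matching heuristic: it would wrongly link two different people who share a first name, so real coreference resolution needs more context than this.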

Issue #2: Crazy Bel Technologies is not detected as an ORGANIZATION; only Crazy Bel is picked up, and it is labeled PERSON.

Issue #3: San Diego is missed as a GPE, with only San detected; similarly, only Mountain is detected instead of Mountain View.
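One way to illustrate what Issues #2 and #3 are asking for is a small gazetteer lookup that matches known multi-word names case-insensitively, so that "San diego" is still found. This is a sketch under assumptions: the `GAZETTEER` contents and the `match_gazetteer` helper are hypothetical, and plain whitespace splitting stands in for `nltk.word_tokenize` to keep the example self-contained.

```python
from typing import Dict, List, Tuple

# Hypothetical hand-made gazetteer: lowercased token tuples mapped to labels
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("san", "diego"): "GPE",
    ("mountain", "view"): "GPE",
    ("crazy", "bel", "technologies"): "ORGANIZATION",
}


def match_gazetteer(
    tokens: List[str],
    gazetteer: Dict[Tuple[str, ...], str] = GAZETTEER,
) -> List[Tuple[str, str]]:
    """Return (label, text) pairs, preferring the longest match at each position."""
    if not gazetteer:
        return []
    max_len = max(len(key) for key in gazetteer)
    lowered = [t.lower() for t in tokens]
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(lowered[i:i + n])
            if key in gazetteer:
                found.append((gazetteer[key], " ".join(tokens[i:i + n])))
                i += n
                break
        else:
            i += 1  # no entry starts here; move on
    return found


text = ("Mark Anthony arrived from Ghana and the second person moved from India "
        "to Crazy Bel Technologies in San diego before arriving here in Mountain View")
tokens = text.split()  # stand-in for nltk.word_tokenize
found = match_gazetteer(tokens)
for label, entity in found:
    print(label, entity)
```

The matches could then be used to override or extend the chunks returned by `ne_chunk`. A fixed list only covers names known in advance, so it complements the statistical chunker rather than replacing it.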



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow