Category "nltk"

Remove the initial text when using the nltk.book module in python

I'm learning about NLP and messing around with nltk, but just by importing the module in my program, I get the following text whenever I run the script: *** Intr

Processing British National Corpus (BNC) with NLTK: How to keep spoken texts only?

Following this old(er) post HERE, I wonder if one can keep (extract) spoken samples only? There is a special XML tag (element): is used to represent a spoken t

NLTK find German nouns

I want to extract all German nouns from a German text in lemmatized form with NLTK. I also checked spaCy, but NLTK is much preferred because in English it a

Is there any way to solve re.sub issue?

sub() missing 1 required positional argument: 'string' def preprocess_text(sentence): #Remove punctuations and numbers sentence = re.sub('[^a-zA-Z]', '

Running nltk.download in Azure Synapse notebook ValueError: I/O operation on closed file

I'm experimenting with NLTK in an Azure Synapse notebook. When I try and run nltk.download('stopwords') I get the following error: ValueError: I/O operation on

How to solve missing words in nltk.corpus.words.words()?

I have tried to remove non-English words from a text. The problem is that many other words are absent from the NLTK words corpus. My code: import pandas as pd lst = ['

TypeError: "hypothesis" expects pre-tokenized hypothesis (Iterable[str]):

I am trying to calculate the Meteor score for the following: print (nltk.translate.meteor_score.meteor_score( ["this is an apple", "that is an apple"], "an

Extracting names from a text file using spaCy

I have a text file which contains lines as shown below: Electronically signed : Wes Scott, M.D.; Jun 26 2010 11:10AM CST The patient was referred by Dr. J

Multilingual NLTK for POS Tagging and Lemmatizer

I recently started working with NLP and tried using NLTK and TextBlob to analyze texts. I would like to develop an app that analyzes reviews made by traveler

NLTK agreement with distance metric

I have a task to calculate inter-annotator agreement in multi-label classification, where for each example more than one label can be assigned. I found that NLT

A good dictionary/corpus to crosscheck plural nouns

I am using "nltk" to identify nouns and then "inflect" to find the plural form of the noun. I have added a contingency where the plural form is crosschecked wit

Parsing HTML into sentences - how to handle tables/lists/headings/etc?

How do you go about parsing an HTML page with free text, lists, tables, headings, etc., into sentences? Take this Wikipedia page for example. There is/are: fr

Using NLTK corpora with AWS Lambda functions in Python

I'm encountering a difficulty when using NLTK corpora (in particular stop words) in AWS Lambda. I'm aware that the corpora need to be downloaded and have done s

How to get a parse tree using Python NLTK?

Given the following sentence: The old oak tree from India fell down. How can I get the following parse tree representation of the sentence using Python NLTK?

nltk: how to get inflections of words

I have a list of words, nearly 5000 English words, and for each word I need these inflectional forms: noun: singular and plural verb: infinitive, present simp

pip issue installing almost any library

I have a difficult time using pip to install almost anything. I'm new to coding, so I thought maybe this is something I've been doing wrong and have opted out t
