Category "nlp"

How to generate a sentence around words in Keras?

I know how to generate the next word in Keras with an LSTM, but how do I predict the previous word? For example, if you have two words like "car" and "running", then it sho

I wrote TF-IDF code to analyze an annual report and want to know the importance of specific keywords

import pandas as pd from sklearn.feature_extraction.text import TfidfTransformer from sklearn.feature_extraction.text import TfidfVectorizer import path import

Will NER improve Text Categorization?

I was wondering: if I'm doing text categorization (with spaCy, using their textcat-multi component for example), will those results improve if an NER component

Text Classification on a custom dataset with spaCy v3

I am really struggling to make things work with the new spacy v3 version. The documentation is full. However, I am trying to run a training loop in a script. (I

Add Noise to Background for Voice Separation

I want to implement a voice separation project. I have a voice dataset with no background noise and a dataset of noise, such as engine sound, horn sound

How to get the TF-IDF value of a word across the whole set of documents?

I need a TF-IDF value for a word that is found in a number of documents, not only in a single or specific document. For example, consider this corpus c
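
A minimal sketch of one way to get a corpus-level score for a word: compute its TF-IDF in every document, then average. The example corpus, the function names, the unsmoothed `log(N / df)` weighting, and the choice of the mean as the aggregate are all assumptions for illustration; libraries such as scikit-learn use a smoothed IDF and will give different numbers.

```python
import math

def tfidf_per_document(word, documents):
    """Return the TF-IDF score of `word` in each document (plain, unsmoothed variant)."""
    tokenized = [doc.lower().split() for doc in documents]
    # Document frequency: number of documents that contain the word at least once.
    df = sum(1 for tokens in tokenized if word in tokens)
    if df == 0:
        return [0.0] * len(documents)
    idf = math.log(len(documents) / df)
    # Term frequency is the word count divided by the document length.
    return [
        (tokens.count(word) / len(tokens)) * idf if tokens else 0.0
        for tokens in tokenized
    ]

def corpus_tfidf(word, documents):
    """Aggregate the per-document scores into one value (mean is an assumed choice)."""
    scores = tfidf_per_document(word, documents)
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical corpus, not taken from the question.
corpus = [
    "the car is running",
    "the car is parked",
    "a dog is running fast",
]
per_doc = tfidf_per_document("car", corpus)
mean_score = corpus_tfidf("car", corpus)
```

With this unsmoothed weighting, a word that appears in every document gets an IDF of zero, so its score is zero everywhere.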

Removing Non-English Words from CSV - NLTK

I am relatively new to Python and NLTK. I have Flickr data stored in a CSV and want to remove non-English words from the tags column. I keep getting er

kwic() function returns fewer rows than it should

I'm currently trying to perform a sentiment analysis on a kwic object, but I'm afraid that the kwic() function does not return all rows it should return. I'm no

Question about the structure of "query, key, value" in the Transformer

I'm a beginner at NLP, so I'm trying to reproduce the most basic Transformer ("Attention Is All You Need") code. But I have a question about it. In the MultiHeadAttention l

Tell `kwic()` to ignore stopwords when situating keywords in context?

I once again have a question about the kwic() function from the quanteda package. I want to extract the five words around a specific keyword (in the example bel

Using a target size (torch.Size([2])) that is different to the input size (torch.Size([2, 5])) is deprecated. Please ensure they have the same size

When I use criterion = nn.BCELoss() for my binary classification task, it causes a problem and prints "Using a target size (torch.Size([2])) that is different

Error while creating a model for binary classification for text classification

code: model = create_model() model.compile(optimize=tf.keras.optimizers.Adam(learning_rate=2e-5), loss=tf.keras.losses.BinaryCrossentropy(),

Continuous Bag of Words

I have a question related to the Continuous Bag of Words model. If I have a vocabulary size of 1000, a window size of 2, and the number of nodes in the hidden la

I want to add numeric columns to my tfidf sparse matrix

I tried to do it with sp.hstack() and with

Looping through each row in array to calculate cosine similarity

I have a subset of a dataframe that looks like:

PageNumber  english_only_tags
175         flower architecture people
162         hair red bobbles
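
A rough sketch of looping over rows and scoring each one with cosine similarity on bag-of-words counts. The excerpt does not say what each row is compared against, so the `query` string here is a hypothetical stand-in, and the function name is made up for illustration. Only the two rows shown in the excerpt are used.

```python
import math
from collections import Counter

def cosine_similarity(tags_a, tags_b):
    """Cosine similarity between two whitespace-separated tag strings."""
    a, b = Counter(tags_a.split()), Counter(tags_b.split())
    # Dot product over the tags that both strings share.
    dot = sum(a[tag] * b[tag] for tag in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty tag string has no direction to compare
    return dot / (norm_a * norm_b)

# Rows taken from the excerpt: PageNumber -> english_only_tags.
rows = {
    175: "flower architecture people",
    162: "hair red bobbles",
}

query = "red flower"  # hypothetical comparison target, not from the question

# Loop through each row and score it against the query.
scores = {page: cosine_similarity(tags, query) for page, tags in rows.items()}
```

For a full dataframe, vectorizing the tags column once and computing all similarities in a single matrix operation is usually faster than a Python loop.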

It looks like the config file at 'bert-base-uncased' is not a valid JSON file?

Working fine for months, then I interrupted a "bert-large-cased" download and the following code returns the error in the title: from transformers import BertMo

Value error trying to fit a logistic regression with SentenceTransformer output (embedding)

My code: model = SentenceTransformer('hiiamsid/sentence_similarity_spanish_es') I apply the model to the text column of the data frame prueba['encoder'] = prueb

Is there any way to set a timer or automatically end the serving of infographics in displaCy?

While running the code with displacy, I see the images being created perfectly as expected. They are also projected to a server, the address of which is mention

Extract multiple start dates and end dates from a string in Python?

I am making a resume parser and want to determine the person's years of experience from the experience section, with results like if there are 3 years of e
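
One possible starting point, sketched with the standard `re` module: find `start - end` year pairs and sum their lengths. The pattern, the function names, the sample text, and the handling of "present" (resolved with an explicit `current_year` argument) are assumptions for illustration; real resumes use many more date formats than this covers.

```python
import re

# Matches "2015 - 2018", "2015-2018", "2019 to present" (four-digit years only).
YEAR_RANGE = re.compile(
    r"\b((?:19|20)\d{2})\s*(?:-|to)\s*((?:19|20)\d{2}|present)\b",
    re.IGNORECASE,
)

def extract_year_ranges(text, current_year):
    """Return a list of (start_year, end_year) tuples found in the text."""
    ranges = []
    for start, end in YEAR_RANGE.findall(text):
        # "present" is resolved with the caller-supplied year to stay deterministic.
        end_year = current_year if end.lower() == "present" else int(end)
        ranges.append((int(start), end_year))
    return ranges

def total_years(ranges):
    """Sum the length of every range; overlapping ranges are counted twice."""
    return sum(max(0, end - start) for start, end in ranges)

# Hypothetical experience section, not taken from the question.
text = "Data Analyst, 2015 - 2018. NLP Engineer, 2019 to present."
ranges = extract_year_ranges(text, current_year=2021)
years = total_years(ranges)
```

If jobs can overlap in time, the ranges would need to be merged before summing.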

How to fix spaCy Transformers for spaCy version 3.1

I'm having the following problem. I've been trying to replicate example code from this source: GitHub I'm using Jupyter Lab environment on Linux and spaCy 3.1 #