Category "spacy"

How to mock spacy models / Doc objects for unit tests?

Loading spacy models slows down running my unit tests. Is there a way to mock spacy models or Doc objects to speed up unit tests? Example of a current slow tes

How to get up and running with spaCy for Vietnamese?

I succeeded with English: python -m spacy download en_core_web_lg python -m spacy download en_core_web_sm python -m spacy download en I read https://spacy.io/mod

Extracting names from a text file using Spacy

I have a text file which contains lines as shown below: Electronically signed : Wes Scott, M.D.; Jun 26 2010 11:10AM CST The patient was referred by Dr. J

Spacy train NER using multiprocessing

I am trying to train a custom NER model using spacy. Currently, I have more than 2k records for training and each text consists of more than 100 words, at least

Tokenizing an HTML document

I have an HTML document and I'd like to tokenize it using spaCy while keeping HTML tags as a single token. Here's my code: import spacy from spacy.symbols impo

The best and simplest way to convert labeled text classification data to spaCy v3 format

Let's suppose we have labeled data for text classification in a nice CSV file. We have 2 columns - "text" and "label". I am kind of struggling to understand spa

How to extract relations between entities for stock prediction

I am trying to extract the relation between two entities (entity1 - relation - entity2) from news articles for stock prediction. I have used NER for entity extraction

spaCy library to extract noun phrases - ValueError: [E866] Expected a string or 'Doc' as input, but got: <class 'float'>

Currently I'm trying to extract noun phrases from sentences. The sentences are stored in a column in an Excel file. Here is the code, using Python: import pandas as pd
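One common cause of this error, assuming the Excel column contains empty cells, is that pandas reads those cells as float NaN values, which are neither strings nor Doc objects. A minimal sketch of filtering them out before the values reach the pipeline follows; the names `keep_text_rows` and `column` are hypothetical and not taken from the question.

```python
def keep_text_rows(values):
    """Return only the entries that are non-empty strings.

    Floats (such as the NaN that pandas uses for empty cells) and
    blank strings are dropped, so every remaining item is safe to
    hand to a text-processing pipeline.
    """
    cleaned = []
    for value in values:
        if isinstance(value, str) and value.strip():
            cleaned.append(value)
    return cleaned


# Hypothetical stand-in for a DataFrame column converted to a list.
column = ["The quick brown fox jumps.", float("nan"), "", "A second sentence."]
texts = keep_text_rows(column)
```

The resulting `texts` list contains only strings, so each item can be passed to the pipeline one at a time or in a batch.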

Error during training entity linker model with custom spaCy NER model

I have already trained an Entity Linker (EL) model with spacy's en_core_web_sm model without any problems. But when I train an EL model with a custom NER model,

Cannot install spacy in pycharm

I've tried installing spaCy numerous times. At first I was getting an error noting that C++ needed to be upgraded to version 14. Now I'm getting a number of err

TypeError: add() takes exactly 2 positional arguments (3 given)

Why am I getting this error? Can anyone please tell me, or explain how to use it with a simple example? ------------------------------------------------------------

spacy with joblib library generates _pickle.PicklingError: Could not pickle the task to send it to the workers

I have a large list of sentences (~7 million), and I want to extract the nouns from them. I used the joblib library to parallelize the extraction process, like in