How to map an input sentence to an output sentence in NLP
How can I do sentence-to-sentence mapping?
Example: given the input text "The price of orange has increased", the output text should be "Increase the production of orange".
Should I convert the sentences into vectors and then apply some algorithm, or use cosine similarity?
Solution 1:[1]
What you are looking at seems to be a Seq2Seq mapping problem.
You probably want to use a Denoising AutoEncoder for this task, preferably BART.
You can learn more about autoencoders here.
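On the cosine-similarity part of the question: a similarity measure only scores how close two existing sentences are; it does not produce a new sentence, which is why a generative Seq2Seq model is suggested instead. The sketch below illustrates this with a plain bag-of-words cosine similarity. The helper name `bow_cosine` and the whitespace tokenization are assumptions made for illustration, not part of any library.

```python
import math
from collections import Counter

def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words count vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the words the two sentences share
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

src = "The price of orange has increased"
tgt = "Increase the production of orange"
score = bow_cosine(src, tgt)
print(round(score, 3))  # 0.548: a single number, not an output sentence
```

The score could help retrieve the closest known input from a lookup table of pairs, but it cannot generate a target for an unseen input.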
Here is a quick implementation to get you started:
import logging
import pandas as pd
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs
logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)
# Load the dataset (reading .xlsx files with pandas requires the openpyxl package)
df = pd.read_excel('sample_input.xlsx')
# Rename the input and output columns to 'input_text' and 'target_text',
# the column names the Seq2Seq model expects
df = df.rename(columns={'input text': 'input_text', 'output text': 'target_text'})
# Keep only the two required columns, then split 80/20 into training and evaluation sets
df = df[['input_text', 'target_text']]
train_size = int(len(df) * 0.8)
train_df, eval_df = df.iloc[:train_size], df.iloc[train_size:]
# Configure the model
model_args = Seq2SeqArgs()
model_args.num_train_epochs = 10
model_args.train_batch_size = 16
model_args.eval_batch_size = 8
model_args.evaluate_generated_text = True
model_args.evaluate_during_training = True
model_args.evaluate_during_training_verbose = True
model_args.overwrite_output_dir = True
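# mBART is the multilingual variant of BART; for English-only data,
# encoder_decoder_type="bart" with encoder_decoder_name="facebook/bart-large" is an alternative.
model = Seq2SeqModel(
    encoder_decoder_type="mbart",
    encoder_decoder_name="facebook/mbart-large-cc25",
    use_cuda=True,  # set to False if no CUDA-capable GPU is available
    args=model_args,
)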
# Train the model
model.train_model(train_df, eval_data=eval_df)
# Evaluate the model
result = model.eval_model(eval_df)
# Use the model for prediction
print(model.predict(['The price of orange has increased']))
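The 80/20 split above takes rows in file order, which can skew evaluation if the spreadsheet is sorted in some way. A minimal sketch of a shuffled split using only the standard library follows; the function name `shuffled_split`, the seed, and the placeholder sentence pairs are assumptions made for illustration.

```python
import random

def shuffled_split(pairs, train_fraction=0.8, seed=42):
    """Shuffle (input_text, target_text) pairs reproducibly, then split them."""
    rows = list(pairs)                 # copy so the caller's data is not reordered
    random.Random(seed).shuffle(rows)  # a fixed seed makes the split repeatable
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Placeholder data standing in for the real spreadsheet rows
pairs = [(f"input sentence {i}", f"target sentence {i}") for i in range(10)]
train_rows, eval_rows = shuffled_split(pairs)
print(len(train_rows), len(eval_rows))  # 8 2
```

Each list can then be turned back into a DataFrame with `pd.DataFrame(train_rows, columns=['input_text', 'target_text'])` before being passed to `train_model`.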
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Aamir Syed |
