How to map an input sentence to an output sentence in NLP

How can I do sentence-to-sentence mapping?

Example: given the input text "The price of orange has increased", the output text should be "Increase the production of orange".

Should I convert the sentences into vectors and then apply some algorithm, or should I use cosine similarity?



Solution 1:[1]

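This looks like a sequence-to-sequence (Seq2Seq) mapping problem: the model reads one sentence and generates another.

You probably want to use a denoising autoencoder for this task, preferably BART. The code further down uses mBART, the multilingual variant of BART, through the simpletransformers library.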

You can learn more about autoencoders here.

Here is a quick implementation to get you started:

import logging

import pandas as pd
import torch
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs

logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

# Load the dataset (pd.read_excel needs an Excel engine such as openpyxl)
df = pd.read_excel('sample_input.xlsx')

# Rename the input and output columns to 'input_text' and 'target_text'
df = df.rename(columns={'input text': 'input_text', 'output text': 'target_text'})

# Keep only the two required columns, then split 80/20 into training and evaluation sets
df = df[['input_text', 'target_text']]
train_size = int(len(df) * 0.8)
train_df, eval_df = df[:train_size], df[train_size:]

# Configure the model
model_args = Seq2SeqArgs()
model_args.num_train_epochs = 10
model_args.train_batch_size = 16
model_args.eval_batch_size = 8
model_args.evaluate_generated_text = True
model_args.evaluate_during_training = True
model_args.evaluate_during_training_verbose = True
model_args.overwrite_output_dir = True

model = Seq2SeqModel(
    encoder_decoder_type="mbart",
    encoder_decoder_name="facebook/mbart-large-cc25",
    use_cuda=torch.cuda.is_available(),  # fall back to CPU when no GPU is present
    args=model_args,
)

# Train the model, evaluating on eval_df during training
model.train_model(train_df, eval_data=eval_df)

# Evaluate the model and show the metrics
result = model.eval_model(eval_df)
print(result)

# Use the model for prediction; predict() takes a list of input strings
print(model.predict(['The price of orange has increased']))
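
As for the cosine-similarity idea from the question: that approach does retrieval, not generation. It can only return the output paired with the most similar known input and never produces a new sentence. A minimal sketch of it is below, assuming a simple bag-of-words vector and a small hypothetical list of training pairs (the `PAIRS` data and the function names are made up for illustration; a real setup would more likely use learned sentence embeddings):

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercase the sentence and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical (input, output) pairs; only the first comes from the question.
PAIRS = [
    ("The price of orange has increased", "Increase the production of orange"),
    ("The price of wheat has dropped", "Reduce the production of wheat"),
    ("Demand for rice is falling", "Cut back the rice supply"),
]


def retrieve_output(query, pairs=PAIRS):
    """Return the output paired with the known input closest to the query."""
    query_vec = bag_of_words(query)
    _, best_output = max(
        pairs,
        key=lambda pair: cosine_similarity(query_vec, bag_of_words(pair[0])),
    )
    return best_output


print(retrieve_output("Orange prices have increased"))
```

This works only when the new input closely resembles one already seen, which is why a Seq2Seq model is usually the better fit when the output has to be composed and not looked up.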

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Aamir Syed