What algorithm should I use for the following sequential dataset training?

I have a dataset containing opcode sequences extracted from malware files. I read a paper in which the author trained an RNN such as an LSTM, but first described a preprocessing step that builds a bag of words and uses Word2Vec to convert everything into vector form. I am stuck at this step. Any help would be appreciated.

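import gensim

# sequence_text must be an iterable of token lists, e.g. [["push", "mov"], ["call", "ret"]]
model = gensim.models.Word2Vec()
model.build_vocab(sequence_text, progress_per=1000)
model.train(sequence_text, total_examples=model.corpus_count, epochs=model.epochs)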
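For context, here is a minimal sketch of the preprocessing that would have to happen around that snippet. It assumes each row of the CSV holds one whitespace-separated opcode string, which may not match the real file. The helper names (`tokenize`, `build_vocab`, `encode`), the `<pad>`/`<unk>` tokens, and the sample opcodes are all hypothetical. It only shows how to get token lists (the shape gensim's Word2Vec expects for `sequence_text`) and fixed-length integer sequences that an embedding layer in front of an LSTM could consume; it is not the method from the paper.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens


def tokenize(raw_sequences):
    """Split each whitespace-separated opcode string into a list of tokens."""
    return [seq.lower().split() for seq in raw_sequences]


def build_vocab(token_lists, min_count=1):
    """Map each opcode to an integer id, most frequent first (ties broken by name)."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {PAD: 0, UNK: 1}
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(token_lists, vocab, max_len):
    """Turn token lists into integer id lists, truncated or padded to max_len."""
    encoded = []
    for toks in token_lists:
        ids = [vocab.get(t, vocab[UNK]) for t in toks[:max_len]]
        ids += [vocab[PAD]] * (max_len - len(ids))
        encoded.append(ids)
    return encoded


# Made-up opcode strings standing in for rows of the CSV.
raw = ["push mov call ret", "mov mov xor", "push call"]

sequence_text = tokenize(raw)          # list of token lists, usable as Word2Vec input
vocab = build_vocab(sequence_text)     # opcode -> integer id
X = encode(sequence_text, vocab, max_len=4)
```

With data in this shape, `sequence_text` could be passed to the Word2Vec calls above, and `X` together with a 0/1 malware label per row would be the input to a sequence classifier.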

I will also put a screenshot of the CSV file.

Ultimate goal: I need to determine whether a given sequence belongs to the malware class or not.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
