Multiclass text classification with 1D CNN
I'm working on an authorship-detection task: identifying which of 147 different authors wrote a given text. My features are character n-grams built with TfidfVectorizer from about 110k texts, stored in a data frame. The author names are label-encoded with sklearn's LabelEncoder, which maps each name string to an integer.
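For reference, a minimal sketch of that preprocessing, assuming character n-grams capped at 1000 features (the exact n-gram range and the texts/authors variable names are placeholders, not the original settings):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder

# texts: list of document strings; authors: list of author-name strings (placeholders)
vectorizer = TfidfVectorizer(analyzer='char', ngram_range=(2, 3), max_features=1000)
Data = vectorizer.fit_transform(texts).toarray()  # shape: (n_samples, 1000)

encoder = LabelEncoder()
Labels = encoder.fit_transform(authors)  # author strings -> integers 0..146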
- Data is split into train/test sets with shapes (99817, 1000), (11091, 1000), (99817,), (11091,).
- With the dense model below, my best results came after 7-8 epochs: loss: 0.7225 - accuracy: 0.8070 - val_loss: 1.3828 - val_accuracy: 0.6777; after that the model starts to overfit.
Model:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization, Dropout

model = Sequential([
    Dense(300, activation="relu", input_shape=(Data_train.shape[-1],)),
    Dense(750, activation="relu"),
    BatchNormalization(),
    Dropout(0.5),
    Dense(147, activation="softmax"),
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
history = model.fit(Data_train,
                    Labels_train,
                    epochs=10,
                    shuffle=True,
                    callbacks=[early_stopping])
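The early_stopping callback isn't shown in the snippet above. Since validation metrics are reported, something like the following was presumably used; the monitor and patience values here are assumptions, not the original settings:

from tensorflow.keras.callbacks import EarlyStopping

# Assumed definition: stop once val_loss stops improving and keep the best weights.
early_stopping = EarlyStopping(monitor='val_loss', patience=2, restore_best_weights=True)

Note that the fit call above passes no validation data; the reported val_loss/val_accuracy suggest a validation_split or validation_data argument was used in the actual run but omitted from the snippet.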
I want to try solving this task with a CNN. I found a 1D-CNN example for a similar task and adapted it:
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Embedding, Dropout, Conv1D, MaxPooling1D,
                                     BatchNormalization, Flatten, Dense)
from tensorflow.keras.callbacks import EarlyStopping

vocab_size = 1000
maxlen = 1000
batch_size = 32
embedding_dims = 10
filters = 16
kernel_size = 3
hidden_dims = 250
epochs = 10
early_stopping = EarlyStopping(patience=0.1)  # note: patience should be an integer number of epochs

model = Sequential([
    Embedding(vocab_size, embedding_dims, input_length=maxlen),
    Dropout(0.5),
    Conv1D(filters, kernel_size, padding='valid', activation='relu'),
    MaxPooling1D(),
    BatchNormalization(),
    Conv1D(filters, kernel_size, padding='valid', activation='relu'),
    MaxPooling1D(),
    Flatten(),
    Dense(hidden_dims, activation='relu'),
    Dropout(0.5),
    Dense(147, activation='softmax')
])
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
model.fit(Data_train, Labels_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=0.2,
          callbacks=[early_stopping])
With this I only manage to get the following: from the first epoch to the fifth, accuracy and val_accuracy reach ~0.07 and stay there. After 5 epochs:
- loss: 4.5621 - accuracy: 0.0701 - val_loss: 4.5597 - val_accuracy: 0.0702
Could someone help me improve these models, especially the CNN? Any suggestions are welcome; if I need to provide anything more, please let me know. Thank you.
Solution 1:[1]
I have managed to solve my issue. A Keras Embedding layer looks up integer token indices, whereas TF-IDF features are continuous values, so feeding them through an Embedding discards the signal. Instead of using an embedding layer, I added an extra dimension to my data and labels and passed the input shape directly to the convolutional layer:
Data_train_tmp = tf.expand_dims(Data_train, -1)     # (99817, 1000) -> (99817, 1000, 1)
Labels_train_tmp = tf.expand_dims(Labels_train, 1)  # (99817,) -> (99817, 1)
# First layer of the model, replacing the Embedding:
Conv1D(62, 5, activation='relu', input_shape=(Data_train.shape[1], 1)),
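For context, a minimal sketch of how the fixed model could be assembled end to end. The Conv1D layer is the one from the fix above; the remaining layer sizes, pooling, and training settings are assumptions for illustration, not the author's exact architecture:

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense, Dropout

# Add a channel dimension so Conv1D sees (timesteps=1000, channels=1).
Data_train_tmp = tf.expand_dims(Data_train, -1)

model = Sequential([
    Conv1D(62, 5, activation='relu', input_shape=(Data_train.shape[1], 1)),
    MaxPooling1D(),
    Flatten(),
    Dense(250, activation='relu'),  # assumed hidden size
    Dropout(0.5),
    Dense(147, activation='softmax'),
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
# SparseCategoricalCrossentropy accepts labels of shape (n,), so the
# expand_dims on the labels is optional here.
model.fit(Data_train_tmp, Labels_train, epochs=10, validation_split=0.2)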
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | hixi22745 |