Category "deep-learning"

How to modify ResNet-50 to take a 4-channel input using pre-trained weights in PyTorch?

I would like to change ResNet-50 so that I can switch to a 4-channel input, use the same weights for the RGB channels, and initialize the last channel with a no…
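
A minimal PyTorch sketch of one common approach, assuming torchvision's `resnet50`: rebuild the stem convolution for 4 channels, copy the pretrained RGB filters, and seed the fourth channel (here with the mean of the RGB filters; a random normal init is another option).

```python
import torch
import torchvision

# Load an ImageNet-pretrained ResNet-50 (newer torchvision uses the `weights=` argument instead).
model = torchvision.models.resnet50(pretrained=True)
old_conv = model.conv1   # Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

# Rebuild the stem convolution for 4 input channels with otherwise identical settings.
new_conv = torch.nn.Conv2d(4, old_conv.out_channels,
                           kernel_size=old_conv.kernel_size,
                           stride=old_conv.stride,
                           padding=old_conv.padding,
                           bias=False)

with torch.no_grad():
    new_conv.weight[:, :3] = old_conv.weight                            # reuse the RGB filters
    new_conv.weight[:, 3:] = old_conv.weight.mean(dim=1, keepdim=True)  # seed the 4th channel

model.conv1 = new_conv
out = model(torch.randn(1, 4, 224, 224))   # sanity check: forward pass with a 4-channel input
```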

How to extract relations between entities for stock prediction

I am trying to extract the relation between two entities (entity1 - relation - entity2) from news articles for stock prediction. I have used NER for entity extraction…

Extracting labels after applying softmax

I have a multi-class classification neural network. I apply softmax at the end to get probabilities for my classes. However, now I want to pick the maximum prob…
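
The question does not say which framework is used; assuming PyTorch, the predicted label is the argmax over the class dimension (for NumPy arrays, `np.argmax(probs, axis=1)` behaves the same way). A minimal sketch:

```python
import torch

# A fake batch of softmax outputs: 5 samples, 3 classes, each row sums to 1.
probs = torch.softmax(torch.randn(5, 3), dim=1)

# The predicted label is simply the index of the largest probability in each row.
pred_labels = torch.argmax(probs, dim=1)   # shape (5,), values in {0, 1, 2}
print(pred_labels)
```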

Why is the initialization of weights in darknet done this way?

Hi there! I am studying Mr. Redmon's darknet code from https://github.com/pjreddie/darknet. I found that the initialization of the weights of a connected layer looks like the following…
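
The snippet the question refers to is not shown here; from memory, darknet's connected layer scales a uniform draw by sqrt(2 / inputs), i.e. He-style scaling meant to keep activation variance roughly constant with ReLU-like activations. A NumPy sketch of that scaling (the exact distribution in the C source may differ, so treat this as illustrative):

```python
import numpy as np

def init_connected_weights(inputs, outputs):
    # He-style scaling: variance of the weights shrinks with fan-in,
    # which keeps activations from growing or vanishing layer by layer.
    scale = np.sqrt(2.0 / inputs)
    return scale * np.random.uniform(-1.0, 1.0, size=(outputs, inputs))

w = init_connected_weights(inputs=512, outputs=256)
print(w.shape, w.std())
```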

UnimplementedError: Fused conv implementation does not support grouped convolutions for now

I am trying to build a CNN model to recognise human sketches using the TU-Berlin dataset. I downloaded the PNG zip file, imported the data into Google Colab, and the…
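
This error often appears when the number of channels in the input images does not match what the first Conv2D was built for (for example, 1-channel grayscale PNGs fed into a 3-channel model). A sketch assuming that mismatch is the cause; the image size and class count are illustrative, not taken from the question:

```python
import tensorflow as tf

# Option 1: build the first Conv2D for 1-channel (grayscale) input.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(250, activation="softmax"),   # 250 classes is illustrative
])

# Option 2: keep a 3-channel model and replicate the grayscale channel instead.
gray_batch = tf.random.uniform((8, 224, 224, 1))
rgb_batch = tf.image.grayscale_to_rgb(gray_batch)       # shape (8, 224, 224, 3)
```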

What is the number of layers in EfficientNetB2?

Knowing that the total number of layers in EfficientNet-B0 is 237 and in EfficientNet-B7 the total comes out to 813, what is the total number of layers in EfficientNet-B2?
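
Rather than relying on a table, the layer count can be read off the Keras model directly; a short sketch assuming `tf.keras.applications` is available (TensorFlow 2.3+ ships EfficientNetB2):

```python
import tensorflow as tf

# Build EfficientNet-B2 without downloading weights and count its Keras layer objects.
model = tf.keras.applications.EfficientNetB2(weights=None)
print(len(model.layers))
```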

Pretraining a language model on a small custom corpus

I was curious if it is possible to use transfer learning in text generation and re-train/pre-train it on a specific kind of text. For example, having a pre-trained…
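
A minimal sketch of one way to do this, assuming the Hugging Face transformers library and a pre-trained GPT-2: fine-tune it as a causal language model on a plain-text file. `corpus.txt` is a placeholder path and the hyperparameters are only illustrative.

```python
from transformers import (AutoTokenizer, AutoModelForCausalLM, TextDataset,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Plain-text corpus chunked into blocks of 128 tokens; "corpus.txt" is a placeholder.
train_dataset = TextDataset(tokenizer=tokenizer, file_path="corpus.txt", block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)  # causal LM, no masking

args = TrainingArguments(output_dir="gpt2-custom", num_train_epochs=3,
                         per_device_train_batch_size=4)
Trainer(model=model, args=args, data_collator=collator,
        train_dataset=train_dataset).train()
```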

Variational AutoEncoder - TypeError

I am trying to implement a VAE for MNIST with convolutional layers, using TensorFlow 2.6 and Python 3.9. The code I have is: # Specify latent space dimensions…

Derivatives from a class instance in TF1

I am using the Physics-Informed Neural Networks (PINNs) methodology to solve non-linear PDEs in high dimensions. Specifically, I am using this class: https://git…
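
The referenced class is not shown here; in TF1, derivatives of a network output are usually taken with `tf.gradients` on the graph tensors the class builds, not on the Python object itself. A toy sketch of that pattern (the two-layer network is illustrative, not the class from the link):

```python
import tensorflow as tf   # TF1 graph mode; in TF2 use tf.compat.v1 with v2 behaviour disabled

x = tf.placeholder(tf.float32, shape=[None, 1])
u = tf.layers.dense(tf.layers.dense(x, 20, tf.nn.tanh), 1)   # toy network output u(x)

u_x = tf.gradients(u, x)[0]     # du/dx, a graph tensor usable inside a PDE residual
u_xx = tf.gradients(u_x, x)[0]  # d2u/dx2

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(u_xx, feed_dict={x: [[0.5]]}))
```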

"logits and labels must be broadcastable" error in TensorFlow RNN

I am new to TensorFlow and deep learning. I am trying to see how the loss decreases over 10 epochs in my RNN model, which I created to read a dataset from Kaggle w…
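
Without the full code it is hard to be definitive, but this error usually means the shape of the final layer's logits does not match the labels (e.g. the wrong number of units in the last Dense layer, or one-hot labels paired with a sparse loss). A sketch of one consistent pairing, assuming integer labels and `num_classes` categories; the input shape is illustrative:

```python
import tensorflow as tf

num_classes = 5
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30, 8)),    # (timesteps, features), illustrative
    tf.keras.layers.SimpleRNN(32),
    tf.keras.layers.Dense(num_classes)       # logits: exactly one unit per class
])

# Integer labels in [0, num_classes) pair with the sparse loss;
# one-hot labels would instead pair with CategoricalCrossentropy.
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
```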

How to clean garbage from CUDA in PyTorch?

I trained my neural nets and realized that even after torch.cuda.empty_cache() and gc.collect(), my CUDA device memory is still filled. In Colab notebooks we can see t…
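
`empty_cache()` can only return cached blocks that nothing references any more; tensors still held by Python variables (the model, optimizer state, stored outputs or losses) keep their memory alive. A minimal sketch of the usual cleanup order (needs a CUDA device to run):

```python
import gc
import torch

model = torch.nn.Linear(1000, 1000).cuda()
out = model(torch.randn(256, 1000, device="cuda"))
print(torch.cuda.memory_allocated())

# Drop every Python reference to the GPU tensors first, then collect,
# then release the now-unreferenced cached blocks back to the driver.
del model, out
gc.collect()
torch.cuda.empty_cache()
print(torch.cuda.memory_allocated())   # should now be (close to) zero
```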

Correct Implementation of Dice Loss in TensorFlow / Keras

I've been trying to experiment with region-based Dice loss, but there are so many variations of it on the internet that I could not find tw…
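
One common formulation is the "soft" Dice loss with a smoothing term, sketched below for binary segmentation masks; the variations found online differ mainly in whether the denominator terms are squared and over which axes the mean is taken.

```python
import tensorflow as tf

def dice_loss(y_true, y_pred, smooth=1e-6):
    """Soft Dice loss for binary masks/probabilities in [0, 1], averaged over the batch."""
    y_true = tf.cast(y_true, y_pred.dtype)
    y_true = tf.reshape(y_true, [tf.shape(y_true)[0], -1])   # flatten each sample
    y_pred = tf.reshape(y_pred, [tf.shape(y_pred)[0], -1])
    intersection = tf.reduce_sum(y_true * y_pred, axis=1)
    union = tf.reduce_sum(y_true, axis=1) + tf.reduce_sum(y_pred, axis=1)
    dice = (2.0 * intersection + smooth) / (union + smooth)
    return 1.0 - tf.reduce_mean(dice)
```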

How to get the location of all text present in an image using OpenCV?

I have an image that contains text (numbers and letters). I want to get the location of all the text and numbers present in this image. Also, I want to…
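
OpenCV alone does not read text; a common pattern is to use OpenCV for loading and drawing and pytesseract (Tesseract OCR) for detecting word boxes. A sketch assuming Tesseract is installed and `image.png` is a placeholder path:

```python
import cv2
import pytesseract
from pytesseract import Output

img = cv2.imread("image.png")                       # placeholder path
data = pytesseract.image_to_data(img, output_type=Output.DICT)

# Each detected word comes with its bounding box (left, top, width, height).
for i, text in enumerate(data["text"]):
    if text.strip():                                # skip empty detections
        x, y, w, h = (data["left"][i], data["top"][i],
                      data["width"][i], data["height"][i])
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("boxes.png", img)
```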

LSTM is showing very low accuracy and large loss

I am applying an LSTM to a dataset that has 53,699 entries in the training set and 23,014 entries in the test set. The shape of the input training set is (53699, 4)…

Why does the VQ-VAE require two-stage training?

According to the paper, VQ-VAE goes through two-stage training: first train the encoder and the vector quantization, and then train an auto-regressive model…

Random cropping data augmentation for convolutional neural networks

I am training a convolutional neural network, but have a relatively small dataset, so I am implementing techniques to augment it. Now, this is the first time I a…
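
A minimal torchvision sketch of random cropping as an on-the-fly training transform (applied anew each epoch, so no extra copies of the dataset are stored); the 32x32 size and padding are illustrative:

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),    # random 32x32 crop from a zero-padded image
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Pass as `transform=train_transform` to any torchvision image dataset, e.g.:
# train_set = torchvision.datasets.CIFAR10(root="data", train=True,
#                                          download=True, transform=train_transform)
```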

How to understand masked multi-head attention in transformer

I'm currently studying the code of the Transformer, but I cannot understand the masked multi-head attention in the decoder. The paper says that it is to prevent you from seeing the…
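
Mechanically, the mask is just an upper-triangular matrix that sets the attention scores for future positions to negative infinity before the softmax, so each token can only attend to itself and earlier tokens. A small PyTorch sketch of that step in isolation:

```python
import torch

seq_len = 5
scores = torch.randn(seq_len, seq_len)    # raw attention scores (queries x keys)

# Positions j > i (future tokens) are blocked with -inf, so softmax gives them weight 0.
causal_mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
masked_scores = scores.masked_fill(causal_mask, float("-inf"))
attn_weights = torch.softmax(masked_scores, dim=-1)   # row i attends only to tokens 0..i
print(attn_weights)
```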

Data augmentation in test/validation set?

It is common practice to augment data (add samples programmatically, such as random crops, etc., in the case of a dataset consisting of images) on both training…
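
In the usual setup, random augmentation is applied only to the training split, while validation and test data get deterministic preprocessing so that the evaluation metric is stable; a torchvision sketch of that split (sizes are illustrative):

```python
from torchvision import transforms

# Training: random augmentation, different view of each image every epoch.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Validation/test: deterministic resize + centre crop only, no randomness.
eval_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
```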

Why do we need to call zero_grad() in PyTorch?

Why does zero_grad() need to be called during training? The docstring only says: zero_grad(self): Sets gradients of all model parameters to zero.
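
The reason is that PyTorch accumulates gradients: each backward() call adds into the parameters' .grad fields, so without zeroing them every step would mix in gradients from previous batches. A minimal training-loop sketch with synthetic data showing where the call goes:

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for step in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    optimizer.zero_grad()          # clear gradients accumulated by the previous step
    loss = loss_fn(model(x), y)
    loss.backward()                # adds fresh gradients into each parameter's .grad
    optimizer.step()               # update using only this batch's gradients
```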