I'm trying to implement the CNN code from Andreas Werdich: https://github.com/awerdich/physionet
" The goal of this project was to implement a deep-learning algorithm that classifies electrocardiogram (ECG) recordings from a single-channel handheld ECG device into four distinct categories: normal sinus rhythm (N), atrial fibrillation (A), other rhythm (O), or too noisy to be classified (~). "
Executing the code works fine, but now that the model is trained I'm not sure how to run a prediction on a different ECG signal. He uses ECG signals stored in hdf5 files.
"For each group of data in the hdf5 file representing a single ECG time series, the following metadata was saved as attribute:
baseline voltage in uV
bit depth
gain
sampling frequency
measurement units"
After training I saved the model with
model.save(filepath)
I put it on filedropper: http://www.filedropper.com/ecgcnn
And I have an hdf5 file full with ECG signals that I'd like to predict: http://www.filedropper.com/physioval
I tried using the model.predict function, but it didn't work. I'm not quite sure how to pass in the ECG signal, since the model distinguishes 4 different classes.
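For reference, this is roughly what I tried (a minimal sketch; the file names, the layout of the hdf5 file, and the expected input length are assumptions on my part):

```python
import h5py
import numpy as np
from tensorflow.keras.models import load_model

# Load the model that was saved with model.save(filepath)
model = load_model('ecg_cnn.h5')                # file name is a placeholder
print(model.input_shape, model.output_shape)    # check what predict() expects

# Read one ECG signal from the hdf5 file (assumes each top-level entry is
# the raw time series; the real file may use groups/attributes instead)
with h5py.File('physio_val.h5', 'r') as f:      # file name is a placeholder
    name = list(f.keys())[0]
    signal = np.asarray(f[name], dtype=np.float32)

# Reshape to (batch, timesteps, channels); the signal probably has to be
# padded/truncated to the length the network was trained on
x = signal.reshape(1, -1, 1)

probs = model.predict(x)                        # shape (1, 4): N, A, O, ~
pred_class = np.argmax(probs, axis=1)
print(probs, pred_class)
```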
Does anyone know how I can make the prediction work?
Thanks
Related
Using the steps in the following link, I was able to fine-tune the YAMNet model: https://github.com/tensorflow/models/issues/8425
But I have a problem with tuning the hyperparameters of the YAMNet model.
If I understand it correctly, each audio clip is divided into frames with a length of patch_window_seconds and a hop length of patch_hop_seconds. The input of the model is a batch of these frames. What if one of those frames contains only silence but still gets labeled as our object of interest? Isn't that problematic?
Of course, we can change the patch_window_seconds and patch_hop_seconds parameters in the parameter file, but how can we be sure that each frame ends up containing the audio of the object of interest?
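To make my question concrete, here is how I picture the framing (a standalone sketch; 0.96 s / 0.48 s are the defaults I see in the params file, and the clip length and event interval are made-up numbers):

```python
# Which time span does each patch cover, and how much of the labeled event
# does it actually contain?
patch_window_seconds = 0.96
patch_hop_seconds = 0.48

clip_length = 5.0            # seconds of audio in one labeled example (made up)
event = (2.10, 2.60)         # where the sound of interest actually is (made up)

start = 0.0
patch_idx = 0
while start + patch_window_seconds <= clip_length:
    end = start + patch_window_seconds
    overlap = max(0.0, min(end, event[1]) - max(start, event[0]))
    print(f"patch {patch_idx}: {start:.2f}-{end:.2f}s, "
          f"overlap with event: {overlap:.2f}s")
    # every patch inherits the clip-level label, even when overlap == 0
    start += patch_hop_seconds
    patch_idx += 1
```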
I have achieved good accuracy on the training and validation sets. I have three test sets. For one set, which comes from the same distribution as the training data, the accuracy is good, but for the others it is not. The test sets I used are from another paper in which good accuracy was achieved on all test sets with a simple CNN.
I am training a CNN model (built with Keras). The input data consists of around 10,200 images. There are 120 classes to be classified. Plotting the class frequencies, I can see that the number of samples per class is more or less uniform.
The problem I am facing is that the training loss goes down with epochs, but the validation loss first falls and then keeps increasing. The accuracy plots reflect this: training accuracy finally settles at 0.94, while validation accuracy stays around 0.08.
Basically it's a case of overfitting.
I am using a learning rate of 0.005 and dropout of 0.25.
What measures can I take to get better validation accuracy? Is it possible that the sample size per class is too small and I need data augmentation to get more data points?
Hard to say what the reason could be. First you can try classical regularization techniques like reducing the size of your model, adding dropout, or adding l2/l1 regularizers to the layers. But this is more like randomly guessing the model's hyperparameters and hoping for the best.
The scientific approach would be to look at your model's outputs, try to understand why it produces them, and of course check your pipeline. Did you have a look at the outputs (are they all the same)? Did you preprocess the validation data the same way as the training data? Did you make a stratified train/test split, i.e. keep the class distribution the same in both sets? Is the data shuffled when you feed it to your model?
In the end you have only about 85 images per class, which is really not a lot; compare CIFAR-10 resp. CIFAR-100 with 6000/600 images per class, or ImageNet with 20k classes and 14M images (~500 images per class). So data augmentation could be beneficial as well.
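If you want to try augmentation, here is a minimal sketch with Keras' ImageDataGenerator; the transforms, image size, and directory layout are placeholders you would adapt to your data, and the tiny model is only there to make the snippet self-contained:

```python
from tensorflow.keras import layers, models, regularizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augment only the training data; validation stays untouched apart from rescaling
train_gen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True)
val_gen = ImageDataGenerator(rescale=1.0 / 255)

train_flow = train_gen.flow_from_directory(
    'data/train', target_size=(128, 128), batch_size=32, class_mode='categorical')
val_flow = val_gen.flow_from_directory(
    'data/val', target_size=(128, 128), batch_size=32, class_mode='categorical',
    shuffle=False)

# Small placeholder CNN with dropout and l2 regularization on the dense layer
model = models.Sequential([
    layers.Conv2D(32, 3, activation='relu', input_shape=(128, 128, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(120, activation='softmax',
                 kernel_regularizer=regularizers.l2(1e-4)),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_flow, validation_data=val_flow, epochs=50)
```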
I am working on a project where I have 1024x1024 brain images over time depicting blood flow. A blood flow parameter image is computed from the brain images over time and has the same dimensions (1024 x 1024). My goal is to train a CNN to learn the mapping between the brain images over time and the blood flow parameter image.
I've looked into current CNN architectures, but it seems like most research on CNNs is either done for classification on single images (not images over time) or action recognition on video data, which I'm not sure my problem falls under. If anyone can provide me with any insight or papers I can read on how to train a model on temporal data, with the output being an image (rather than a classification score), that would be immensely helpful.
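For context, the naive baseline I had in mind treats the T time points as input channels of a fully convolutional network that outputs a single-channel map (a sketch only; T, the filter counts, and the MSE loss are my own guesses, not taken from any paper):

```python
from tensorflow.keras import layers, models

T = 20  # number of time points per pixel (a guess; my sequences may differ)

# Fully convolutional net: (1024, 1024, T) -> (1024, 1024, 1) parameter map
inputs = layers.Input(shape=(1024, 1024, T))
x = layers.Conv2D(32, 3, padding='same', activation='relu')(inputs)
x = layers.Conv2D(32, 3, padding='same', activation='relu')(x)
x = layers.Conv2D(64, 3, padding='same', activation='relu')(x)
outputs = layers.Conv2D(1, 1, padding='same')(x)   # per-pixel regression output

model = models.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')        # pixel-wise regression loss
model.summary()
```

An encoder-decoder (U-Net-style) version of this, or something that models the time axis explicitly, is what I am hoping to find references for.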
I'm playing with the SWWAE example, and I put a classification head onto the end of the encoder. I'm running it as both a supervised classifier and an encoder/decoder and it's working fine. However, I'm unsure how to run it in semi-supervised mode. I was thinking that for unlabeled data, I could just set all the output labels to 0? If I understand correctly, for categorical cross entropy, this should mean that there's no error signal to propagate. Is that correct? In this case would I need to make batches of data where each item in the batch is either all unlabeled or all labeled?
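To make the categorical cross entropy point concrete, this is the sanity check I had in mind (plain NumPy, not the SWWAE code; the example probabilities are made up):

```python
import numpy as np

# categorical cross entropy: L = -sum(y_true * log(y_pred))
def cce(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_pred = np.array([[0.1, 0.6, 0.2, 0.1]])        # some softmax output
y_labeled = np.array([[0.0, 1.0, 0.0, 0.0]])     # normal one-hot label
y_unlabeled = np.array([[0.0, 0.0, 0.0, 0.0]])   # all-zero "no label" target

print(cce(y_labeled, y_pred))    # > 0, normal error signal
print(cce(y_unlabeled, y_pred))  # 0.0, every term is multiplied by zero
# Since every term in the sum carries a factor of y_true, the gradient
# w.r.t. y_pred (-y_true / y_pred) is also all zeros for the unlabeled case.
```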
I am currently trying to replicate the work of a paper in which they train a CNN using MFCC features without the DCT performed at the end. This is basically the log of the energies of the filter banks.
I know that Kaldi can compute MFCC features using the make_mfcc.sh script. But can the script somehow be altered to compute the MFCCs without the DCT performed at the end? If not, are there other tools that might be able to do so?
MFCCs are commonly derived as follows:
Take the Fourier transform of (a windowed excerpt of) a signal.
Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows.
Take the logs of the powers at each of the mel frequencies.
Take the discrete cosine transform of the list of mel log powers, as if it were a signal.
The MFCCs are the amplitudes of the resulting spectrum.
You can use the make_fbank script to extract the log filterbank energies.
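If you just want to sanity-check the features outside Kaldi, the same quantities (log mel filterbank energies, i.e. the MFCC pipeline stopped before the DCT) can be computed with librosa; the frame settings and number of mel filters below are placeholders you would match to your Kaldi config:

```python
import numpy as np
import librosa

# Log mel filterbank energies = MFCC pipeline without the final DCT.
# Window/shift lengths and n_mels are placeholders; match your Kaldi config.
y, sr = librosa.load('utt1.wav', sr=16000)   # file name is a placeholder

mel_spec = librosa.feature.melspectrogram(
    y=y, sr=sr,
    n_fft=int(0.025 * sr),        # 25 ms window
    hop_length=int(0.010 * sr),   # 10 ms shift
    n_mels=40,                    # number of triangular mel filters
    power=2.0)                    # power spectrum before the mel mapping

log_fbank = np.log(mel_spec + 1e-10)   # take the logs; stop before the DCT
print(log_fbank.shape)                 # (n_mels, n_frames)
```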