Accuracy Measurement Methods other than PSNR - statistics

In image and data compression, typically PSNR is used to measure the quality of reconstruction. But for situations such as images or visual plots of 1D data, PSNR can produce mathematically good results that are visually bad. Does anyone know of simple metrics that are more related to visual accuracy than mathematical accuracy?
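To make the concern concrete, here is a minimal NumPy sketch (the test signal, the `psnr` helper and the two distortions are assumptions chosen for illustration, not taken from the question). It builds two reconstructions of the same 1D signal with identical mean squared error, so PSNR cannot tell them apart even though one is a barely visible offset and the other is a single large spike.

```python
import numpy as np

def psnr(reference, reconstructed, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals scaled to [0, peak]."""
    mse = np.mean((reference - reconstructed) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical 1D test signal in the range [0.1, 0.9].
t = np.linspace(0.0, 1.0, 1000)
signal = 0.5 + 0.4 * np.sin(2 * np.pi * 3 * t)

# Distortion A: a small constant offset spread over every sample.
offset = signal + 0.01

# Distortion B: one large spike carrying the same total squared error.
spike = signal.copy()
spike[500] += 0.01 * np.sqrt(len(t))

psnr_offset = psnr(signal, offset)
psnr_spike = psnr(signal, spike)
max_err_offset = np.max(np.abs(signal - offset))
max_err_spike = np.max(np.abs(signal - spike))

print(f"PSNR offset: {psnr_offset:.2f} dB, max error: {max_err_offset:.3f}")
print(f"PSNR spike:  {psnr_spike:.2f} dB, max error: {max_err_spike:.3f}")
```

Both reconstructions score the same PSNR, while the maximum error differs by a factor of about 30, which is the kind of gap between mathematical and visual accuracy the question describes.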

Related

sklearn HistGradientBoostingClassifier with large unbalanced data

I've been using Sklearn HistGradientBoostingClassifier to classify some data. My experiment is multi-class classification with single label predictions (20 labels).
My experiment covers two cases. The first case measures the accuracy of these algorithms without data augmentation (around 3,000 unbalanced samples). The second case measures the accuracy with data augmentation (around 12,000 unbalanced samples). I am using default parameters.
In the first case, the HistGradientBoostingClassifier shows an accuracy of around 86.0%. However, with data augmentation, results show weak accuracy, around 23%.
I am wondering if this accuracy was coming from unbalanced datasets, but since there are no features to fix unbalanced datasets for the HistGradientBoostingClassifier algorithm within the Sklearn library, I cannot verify that fact.
Has anyone run into the same kind of problem with a large dataset and HistGradientBoostingClassifier?
Edit: I tried other algorithms with the same data split, and the results seem normal (accuracy around 5% higher with data augmentation). I am wondering why I only get this with HistGradientBoostingClassifier.
Accuracy is a poor metric when dealing with imbalanced data. Suppose I have 90:10 class 0 and class 1. A DummyClassifier that only predicts class 0 will achieve 90% accuracy.
You'll have to look at precision, recall, f1, confusion matrix, and not just accuracy alone.
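The 90:10 example above can be reproduced with a short scikit-learn sketch. The synthetic labels and the single dummy feature are assumptions made only for illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import (
    accuracy_score,
    balanced_accuracy_score,
    confusion_matrix,
    f1_score,
    recall_score,
)

# Synthetic 90:10 labels; the feature column is irrelevant to a dummy model.
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((len(y), 1))

# Always predicts the most frequent class (class 0).
dummy = DummyClassifier(strategy="most_frequent").fit(X, y)
y_pred = dummy.predict(X)

accuracy = accuracy_score(y, y_pred)
recall_minority = recall_score(y, y_pred, pos_label=1, zero_division=0)
f1_minority = f1_score(y, y_pred, pos_label=1, zero_division=0)
balanced_acc = balanced_accuracy_score(y, y_pred)
cm = confusion_matrix(y, y_pred)

print("accuracy:", accuracy)
print("recall (class 1):", recall_minority)
print("F1 (class 1):", f1_minority)
print("balanced accuracy:", balanced_acc)
print(cm)
```

Accuracy looks healthy, but recall and F1 for class 1 are zero and the confusion matrix shows that the minority class is never predicted.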
I have found something that could explain the poor accuracy of HistGradientBoostingClassifier with default parameters on the augmented dataset of roughly 12,000 samples.
I compared HistGradientBoostingClassifier and LightGBM on the same data split (sklearn's HistGradientBoostingClassifier is inspired by Microsoft's LightGBM). HistGradientBoostingClassifier shows a weak accuracy of 24.7%, while LightGBM shows a strong one of 87.5%.
As far as I can read in sklearn's and Microsoft's docs, HistGradientBoostingClassifier "cannot handle properly" unbalanced datasets, while LightGBM can. The latter has this parameter: class_weight (dict, 'balanced' or None, optional (default=None)) (found on that page)
My hypothesis is that the dataset becomes more unbalanced with augmentation and, since HistGradientBoostingClassifier currently has no feature for handling unbalanced data, the algorithm is misled.
Also, as mentioned by Hanafi Haffidz in the comments, the algorithm could tend to overfit with default parameters.

Looking for a loss function sensitive to edges for medical image quality enhancement

For an image-to-image translation task in which I want to generate high-quality images from low-quality images (MRI images), I need a loss function that emphasizes edges and produces images with sharper edges.
Do you have any recommendation for choosing a suitable loss function among PyTorch's loss functions?
https://pytorch.org/docs/stable/nn.html#loss-functions
I would really appreciate it if anyone could provide the code of a predefined loss function for this task.
Thanks
I suppose you are using MSE loss function at the moment? This loss function indeed tends to prefer "smoother" outputs rather than sharp edges.
For image generation tasks, consider using perceptual loss that better correlates with human perception of image quality.
For more details on this loss function, see:
R. Zhang, P. Isola, A. A. Efros, E. Shechtman, O. Wang, "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric" (CVPR 2018).
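Since the question asks for code, here is a minimal PyTorch sketch of a different, simpler idea than the perceptual loss cited above: an L1 pixel loss plus an L1 penalty on Sobel image gradients. The class name `SobelEdgeLoss`, the `edge_weight` parameter, and the assumption of single-channel inputs shaped `(N, 1, H, W)` are all hypothetical choices for illustration, not part of PyTorch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SobelEdgeLoss(nn.Module):
    """L1 pixel loss plus L1 loss on Sobel gradients (single-channel images)."""

    def __init__(self, edge_weight=1.0):
        super().__init__()
        kx = torch.tensor([[-1.0, 0.0, 1.0],
                           [-2.0, 0.0, 2.0],
                           [-1.0, 0.0, 1.0]])
        ky = kx.t()
        # Shape (2, 1, 3, 3): horizontal and vertical gradient filters.
        self.register_buffer("kernels", torch.stack([kx, ky]).unsqueeze(1))
        self.edge_weight = edge_weight

    def _edges(self, img):
        # img: (N, 1, H, W) -> gradients: (N, 2, H, W)
        return F.conv2d(img, self.kernels, padding=1)

    def forward(self, pred, target):
        pixel_term = F.l1_loss(pred, target)
        edge_term = F.l1_loss(self._edges(pred), self._edges(target))
        return pixel_term + self.edge_weight * edge_term

# Usage with random tensors standing in for MRI slices.
torch.manual_seed(0)
target = torch.rand(2, 1, 16, 16)
pred = torch.rand(2, 1, 16, 16, requires_grad=True)

loss_fn = SobelEdgeLoss(edge_weight=1.0)
loss = loss_fn(pred, target)
loss.backward()
print(loss.item())
```

The edge term penalizes blurred boundaries more strongly than a plain pixel loss does; the weight between the two terms would need tuning on real data.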

Linear SVM vs Nonlinear SVM high dimensional data

I am working on a project where I use Spark MLlib's linear SVM (with L2 regularization) to classify some data. I have about 200 positive observations and 150 (generated) negative observations, each with 744 features that represent a person's level of activity in different regions of a house.
I ran some tests, and the "areaUnderROC" metric was 0.991, so the model seems to classify the data I give it quite well.
I did some research and found that linear SVMs work well on high-dimensional data, but I don't understand how something linear can separate my data so well.
I think in 2D, and maybe that is the problem, but looking at the image below, I am 90% sure that my data looks more like a nonlinear problem.
So is it normal that I get good results on the tests? Am I doing something wrong? Should I change the approach?
I think your question is: "Why can a linear SVM classify my high-dimensional data well even though the data should be nonlinear?"
Some data sets look nonlinear in low dimensions, like the example image on the right, but it is hard to say that a data set is definitely nonlinear in high dimensions, because a problem that is nonlinear in n dimensions may be linear in (n+1) dimensions. So I don't know why you are 90% sure your data set is nonlinear when it is a high-dimensional one.
In the end, I think it is normal that you get good results on the test samples: it indicates that your data set is linear or nearly linear in high dimensions, otherwise the model would not work so well. Cross-validation could help you confirm whether your approach is suitable.
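The "nonlinear in n dimensions, linear in n+1" point can be illustrated with a small scikit-learn sketch (scikit-learn's `LinearSVC` stands in for Spark's linear SVM here, and the XOR-style toy data is an assumption for illustration). Four clusters are not linearly separable in 2D, but adding the product of the two coordinates as a third feature makes them separable, and cross-validation confirms it.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# XOR-style layout: opposite corners share a label.
centers = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
cluster_labels = np.array([1, 1, 0, 0])

X_2d = np.vstack([c + 0.2 * rng.standard_normal((100, 2)) for c in centers])
y = np.repeat(cluster_labels, 100)

# Lift to 3D by appending the product x1 * x2 as an extra feature.
X_3d = np.column_stack([X_2d, X_2d[:, 0] * X_2d[:, 1]])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
svm = LinearSVC(C=1.0, max_iter=10000)

acc_2d = cross_val_score(svm, X_2d, y, cv=cv).mean()
acc_3d = cross_val_score(svm, X_3d, y, cv=cv).mean()

print(f"linear SVM, 2 features: {acc_2d:.2f}")
print(f"linear SVM, 3 features: {acc_3d:.2f}")
```

With 744 features, a separating hyperplane may exist along directions that a 2D plot never shows.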

Should sentiment analysis training data be evenly distributed?

If I am training a sentiment classifier off of a tagged dataset where most documents are negative, say ~95%, should the classifier be trained with the same distribution of negative comments? If not, what would be other options to "normalize" the data set?
You don't say what type of classifier you have, but in general you don't have to normalize the distribution of the training set. Usually, the more data the better, but you should always run blind tests to guard against over-fitting.
In your case you will have a strong classifier for negative comments and unless you have a very large sample size, a weaker positive classifier. If your sample size is large enough it won't really matter since you hit a point where you might start over-fitting your negative data anyway.
In short, it's impossible to say for sure without knowing the actual algorithm and the size of the data sets and the diversity within the dataset.
Your best bet is to carve off something like 10% of your training data (randomly) and just see how the classifier performs after being trained on the 90% subset.
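A minimal sketch of that 10% hold-out with scikit-learn follows. The synthetic 95:5 labels and the placeholder document indices are assumptions, and `stratify=` is an addition on top of the purely random split suggested above: it keeps the class ratio the same in both parts, which matters when only 5% of documents are positive.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels: 950 negative (0) and 50 positive (1) documents.
labels = np.array([0] * 950 + [1] * 50)
doc_ids = np.arange(len(labels))  # stand-ins for the actual documents

train_ids, test_ids, y_train, y_test = train_test_split(
    doc_ids,
    labels,
    test_size=0.10,      # carve off 10% for a blind test
    stratify=labels,     # keep the 95:5 ratio in both parts
    random_state=0,
)

print("train size:", len(train_ids), "positives:", int(y_train.sum()))
print("test size:", len(test_ids), "positives:", int(y_test.sum()))
```

On the held-out part, per-class precision and recall say more about the positive class than overall accuracy does.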

Test error lower than training error

Would appreciate your input on this. I am constructing a regression model with the help of genetic programming.
If my RMSE on test data is (much) lower than my RMSE on training data for a 1:5 ratio of data, should I be worried?
The test data is drawn randomly without replacement from a set of 24 data points. The model was built using a genetic programming technique, so the number of features, the modeling framework, etc. vary as I minimize the training RMSE regularized by the number of nodes in the GP tree.
Is the model underfitted? Or should I have minimized MSE instead of RMSE (I thought it would be the same as MSE is positive and the minimum of MSE would coincide with the minimum of RMSE assuming the optimizer is good enough to find the minimum)?
Tks
So your model is trained on 20 out of 24 data points and tested on the 4 remaining data points?
To me it sounds like you need (much) more data, so you can have larger train and test sets. I'm not surprised that the numbers on your test set look off: with so few points the model can hardly learn anything, and four test points give a very noisy error estimate. As a rule of thumb, for machine learning you can never have enough data. Is it possible to gather a larger dataset?
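How noisy a 4-point test set is can be checked with a small simulation. This sketch swaps in a plain least-squares line fit for the genetic programming model, and the data-generating line and noise level are assumptions for illustration. It repeats the 20/4 split many times and counts how often the test RMSE comes out lower than the training RMSE purely by chance.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

rng = np.random.default_rng(0)
n_trials, n_total, n_test = 2000, 24, 4
test_lower = 0

for _ in range(n_trials):
    # Hypothetical data: a noisy straight line with 24 points.
    x = rng.uniform(0.0, 1.0, n_total)
    y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, n_total)

    # Random split without replacement: 4 test points, 20 training points.
    idx = rng.permutation(n_total)
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    # Fit a line on the training part only.
    slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)

    train_rmse = rmse(y[train_idx], slope * x[train_idx] + intercept)
    test_rmse = rmse(y[test_idx], slope * x[test_idx] + intercept)
    if test_rmse < train_rmse:
        test_lower += 1

fraction_test_lower = test_lower / n_trials
print(f"test RMSE < train RMSE in {fraction_test_lower:.0%} of splits")
```

If a large share of random splits shows a lower test error, a single such result says little about underfitting; repeated splits or leave-one-out cross-validation give a steadier picture on 24 points.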
