Is there a difference between computing LAB and RGB avg?

I have a number of RGB pixel sets that I need to compute an average colour for. So far I have simply been averaging each R, G and B value separately to compute the average RGB value and then converting it into LAB for comparison against another colour with the DeltaE 2000 algorithm.
Is there any difference to the final computed LAB average if I instead first convert each individual RGB set to LAB and then average the L, A and B values separately?

Yes, there's a difference. The conversion from RGB to LAB is nonlinear, so averaging the R, G and B values and then converting does not, in general, give the same result as converting each pixel to LAB first and then averaging the L, a and b values: the mean of the converted values only equals the converted mean when the transform is linear. My hunch is that averaging gamma-encoded RGB values will also give you a somewhat darker, washed-out version of the colour.
Note too that quite different RGB triplets can produce very similar-looking colours, so the two averaging methods can disagree noticeably. Some example pairs:
light pink 233 200 244 vs 245 196 234
aqua 40 238 234 vs 99 235 224
light green 21 237 70 vs 78 234 62
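To see the difference concretely, here is a small sketch in plain Python (assuming the standard sRGB-to-XYZ-to-LAB conversion with a D65 white point) that averages two of the colours above both ways:

```python
def srgb_to_linear(c):
    """Undo the sRGB gamma encoding (input 0-255, output 0-1)."""
    c /= 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def rgb_to_lab(rgb):
    """Convert an sRGB triplet to CIE LAB (D65 white point)."""
    r, g, b = (srgb_to_linear(float(v)) for v in rgb)
    # linear sRGB -> CIE XYZ, normalized by the D65 reference white
    x = (0.4124 * r + 0.3576 * g + 0.1805 * b) / 0.95047
    y = (0.2126 * r + 0.7152 * g + 0.0722 * b) / 1.00000
    z = (0.0193 * r + 0.1192 * g + 0.9505 * b) / 1.08883
    f = lambda t: t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116
    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

c1, c2 = (233, 200, 244), (40, 238, 234)

# Method 1: average in RGB first, then convert the mean to LAB
mean_rgb = tuple((a + b) / 2 for a, b in zip(c1, c2))
lab_of_mean = rgb_to_lab(mean_rgb)

# Method 2: convert each colour to LAB first, then average there
lab1, lab2 = rgb_to_lab(c1), rgb_to_lab(c2)
mean_lab = tuple((a + b) / 2 for a, b in zip(lab1, lab2))

print(lab_of_mean)
print(mean_lab)  # differs, because the conversion is nonlinear
```

Running this, the two averages disagree by a couple of units in L alone, which is enough to matter when you feed the result into DeltaE 2000.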

Redrawing Excel figures on gnuplot

I was working in Excel and drew two histograms, shown below. I have been told to redraw them using gnuplot on Windows, which is very new to me.
The original graph that I want to redraw is based on this data:
Area 1 Area 2
Case 1 Case 2 Case 1 Case 2
Parameter 1 36 66 31 72
Parameter 2 57 91 44 85
Parameter 3 62 90 50 85
My file is a text file, and I wrote the above table as follows, as I am not sure how to group the different columns together:
Area Area1 Area1 Area2 Area2
Case Case1 Case2 Case1 Case2
Parameter_1 36 66 31 72
Parameter_2 57 91 44 85
Parameter_3 62 90 50 85
I used the following commands and got a histogram that is grouped in the wrong way.
clear
reset
unset key
set style data histogram
set style fill solid border
set style histogram clustered
plot for [COL=2:5] 'date_mins.tsv' using COL:xticlabels(1) title columnheader
Kindly guide me on how to group the columns together and also how to add the numbers on top of the bars. The graph should be the same as the one Excel generated.
To be honest, I'm regularly puzzled by histograms in gnuplot, and apparently I'm not the only one. In the gnuplot console, check help histograms.
Although there are a few histogram examples on the gnuplot homepage, of course not all possible variations can be covered.
Apparently, this plotting style is a bit confusing to understand, which might explain why there are more than 800 questions on SO about histograms with gnuplot.
I'm not sure if or how you can get your desired histogram efficiently; maybe there is an easy way.
I would do it "manually" with the plotting style with boxes.
Check the example below as a starting point. There are a few strange workarounds included, e.g. capturing the titles into an array during an earlier plot for later use.
Code:
### special histogram
reset session
$Data <<EOD
Area Area1 Area1 Area2 Area2
Case Case1 Case2 Case1 Case2
"Parameter 1" 36 66 31 72
"Parameter 2" 57 91 44 85
"Parameter 3" 62 90 50 85
EOD
set style fill solid noborder
set boxwidth 0.8
set key noautotitle out center bottom horizontal reverse Left samplen 1 width 2
A=2 # Areas
C=2 # Cases
P=3 # Parameters
g=1 # gap
PosX(a,c,p) = ((a-1)*C*(P+g)) + (c-1)*(P+g) + p
PosY(a,c) = column((a-1)*C+c+1)
PosXArea(a) = (PosX(a,C,P)+PosX(a-1,C,P))*0.5
PosXCase(a,c) = (PosX(a,c,P)+PosX(a,c-1,P))*0.5
myColor(p) = int(word("0x5b9bd5 0xed7d31 0xa5a5a5",int(p)))
myValue(a,c) = strcol((a-1)*C+c+1)
set grid y
set xlabel "\n\n\n" # get empty space below the plot
set format x "" # no xtic labels
set yrange[0:]
array Titles[P] # array for titles
plot for [a=1:A] for [c=1:C] $Data u (PosX(a,c,$0)):(PosY(a,c)):(myColor($0+1)) skip 2 w boxes lc rgb var , \
for [a=1:A] for [c=1:C] '' u (PosX(a,c,$0)):(PosY(a,c)):(Titles[int($0+1)]=strcol(1), myValue(a,c)) skip 2 w labels offset 0,0.7, \
for [a=1:A] for [c=1:C] '' u (PosXCase(a,c)):(0):(myValue(a,c)) every ::1::1 w labels offset 0,-1, \
for [a=1:A] '' u (PosXArea(a)):(0):('\n\n'.myValue(a,1)) every ::0::0 w labels offset 0,-1, \
for [p=1:P] keyentry w boxes lc rgb myColor(p) ti Titles[p]
### end of code
Result:

Trimming data from list to fit in a specific shape such as a geodataframe map

I have a bunch of approximated data in a list from which I create a color map. I have overlaid this color map onto a map drawn from a geodataframe containing a shape file (with all the coordinates of the polygon boundaries). In the picture below you can see that the cmap does not conform to the shape of the map.
To make things look cleaner, I would like to somehow 'trim' the edges off of the color map. How could I go about doing this? I have tried using df.total_bounds to conditionally do calculations before the colormap data is made, but this produces the results seen in the photo linked below.
Any solutions or input are appreciated, thanks!
Code snippet of the conditional calculation described above, where geo_minx/y and geo_maxx/y are the min and max (x, y) values taken from the geodataframe's total_bounds attribute:
if geo_minx <= realx <= geo_maxx and geo_miny <= realy <= geo_maxy:
Map with color map over a geodataframe shape file
Edit
Here is the structure of the dataframe that holds each approximated data point with its (x,y) coordinates:
approximated data
X Y approx
0 -124.6 24.6 1.006655
1 -124.6 24.8 1.006655
2 -124.6 25.0 1.006655
3 -124.6 25.2 1.006655
4 -124.6 25.4 1.006655
Here is the map dataframe structure:
<bound method NDFrame.head of STATEFP STATENS AFFGEOID GEOID ... LSAD ALAND AWATER geometry
0 28 01779790 0400000US28 28 ... 00 121533519481 3926919758 MULTIPOLYGON (((-88.50297 30.21523, -88.49176 ...
1 37 01027616 0400000US37 37 ... 00 125923656064 13466071395 MULTIPOLYGON (((-75.72681 35.93584, -75.71827 ...
2 40 01102857 0400000US40 40 ... 00 177662925723 3374587997 POLYGON ((-103.00257 36.52659, -103.00219 36.6...
3 51 01779803 0400000US51 51 ... 00 102257717110 8528531774 MULTIPOLYGON (((-75.74241 37.80835, -75.74151 ...
4 54 01779805 0400000US54 54 ... 00 62266474513 489028543 POLYGON ((-82.64320 38.16909, -82.64300 38.169.
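A bounding-box check only trims the grid to the rectangle around the states; to trim to the actual outline you need a point-in-polygon test. With geopandas this is what `geopandas.clip` or a spatial join (`gpd.sjoin(points, map_df, predicate="within")`) does. The underlying idea can be sketched in plain Python with a ray-casting test; the triangle "state" and the sample points below are made-up stand-ins, not data from the question:

```python
# Sketch: keep only the grid points inside a polygon, not just its bounding box.
def point_in_polygon(x, y, poly):
    """Ray-casting point-in-polygon test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            xcross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < xcross:
                inside = not inside
    return inside

poly = [(0, 0), (10, 0), (5, 10)]   # stand-in "state" polygon
points = [(5, 2), (9, 9), (5, 5)]   # stand-in colormap grid points
kept = [(x, y) for (x, y) in points if point_in_polygon(x, y, poly)]
print(kept)  # (9, 9) is inside the bounding box but outside the polygon
```

In practice, letting geopandas/shapely do the containment test is both faster and more robust (it handles multipolygons and holes), but the filtering logic is the same: drop every (X, Y) row whose point is not within the map geometry.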

Too small values to be displayed by gnuplot

I'm encountering a problem using gnuplot to display some distribution data (in the form of bar charts).
Because of the very high values in my data, the smallest ones cannot be displayed.
For example with these values:
10 1
20 4
21 24
22 77
23 177
24 636
25 1700
26 3433
27 5160
28 7462
29 7883
30 6652
31 4155
32 1989
33 797
34 170
Gnuplot does not display the bars corresponding to 10 and 20 because they are way too small compared to the maximum.
Is there a way to display them, just a little bit, other than using a logarithmic scale?
I was especially thinking of a kind of glowing effect at the top of the bars whose values are not null; can that be done with gnuplot?
Here are the few lines I use to display my data:
set style data boxes
set style fill solid 0.1
plot 'distribution.dat'
And here is what I get for the moment:
distribution bar chart
Thanks in advance
Maybe use the "zeroaxis" representation rather than a plot border, and increase the linewidth used to draw boxes.
set style data boxes
set style fill solid 0.1
set xrange [0:*]
set yrange [-100:*]
set xzeroaxis
set yzeroaxis
set tics nomirror
unset key
unset border
plot 'distribution.dat' linewidth 1.5

How to plot two datasets from two different columns from one file in Gnuplot?

I have one file with two columns containing A/D samples from two sources. All values are within the range 0-1023 (inclusive) and the sources are not dependent on each other. That is, they are completely different.
Sample excerpt from the datafile:
188 631
196 593
203 594
210 593
218 595
225 593
233 594
240 602
247 593
255 594
262 593
269 594
277 593
284 594
All the values in the first column belongs to A/D-source #1, while all the values in the second column belongs to A/D-source #2.
Now, what I want to do is to get two lines/plots, one per A/D source, in the same plot. Since this is my first shot at Gnuplot, I am having a hard time getting it right: no matter what I do, Gnuplot interprets the datafile lines as (X, Y) rather than (Y1, Y2), which is what I want. Doing a plain plot 'datafile' simply dumps all the values in a scattered mess.
How do I tell Gnuplot that this particular file contains two datasets, one in each column?
plot 'datafile' using 0:1, 'datafile' using 0:2
Column 0 is a 'pseudocolumn' that evaluates to the ordinal number of the current data point (usually the line number). If only a single column of data is present the program assumes that x = column(0) and y = column(1). The command given above gives the full specification of what to do with the columns, but a simpler form is also accepted:
plot 'datafile' using 1, '' using 2

Once I've used the regression function in Excel, how do I find out the formula it used (y = mx + b)?

I have a sample data range that has four categories:
foo | bar | bizz | buzz
---------------------------
163 345 456 2435
232 234 457 2435
123 346 234 3673
Foo is the dependent variable; bar, bizz and buzz are the independent variables. I went to Data Analysis => Regression => picked those columns as appropriate, and got all of the regression statistics and some plots that represent it. How do I find the formula that it used, so that I can use it in my predictions in an application?
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.462484844
R Square 0.213892231
Adjusted R Square 0.212161986
Standard Error 2991.441979
Observations 1367
ANOVA
df SS MS F Significance F
Regression 3 3318714896 1106238299 123.6196536 8.06738E-71
Residual 1363 12197112332 8948725.116
Total 1366 15515827228
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 703.0478619 126.1475776 5.5732173 3.01028E-08 455.5834102 950.5123135 455.5834102 950.5123135
Bar 41.53512531 2.493716675 16.65591193 7.6937E-57 36.64318651 46.42706411 36.64318651 46.42706411
Bizz 1.96479128 0.361015402 5.442402932 6.22595E-08 1.256585224 2.672997336 1.256585224 2.672997336
Buzz 16.77200247 5.419776635 3.094592933 0.002010941 6.139994479 27.40401046 6.139994479 27.40401046
RESIDUAL OUTPUT PROBABILITY OUTPUT
Observation Predicted foo Residuals Standard Residuals Percentile foo
1 6780.632281 34894.36772 11.67756172 0.036576445 63
2 6722.069851 28513.93015 9.542318743 0.109729334 63
3 3382.925842 21471.07416 7.185394378 0.182882224 63
Oh hey, my stats class looks 98% less useless now.
According to that output,
foo = 703.0478619 + 41.53512531 * bar + 1.96479128 * bizz + 16.77200247 * buzz
You can see these values where it lists the coefficients/standard errors for Intercept, Bar, Bizz, and Buzz.
I should probably note that the R-squared value (about 0.21) is quite low, which means that most of the variance in foo is not explained by the independent variables.
