I am working on an X-colours palette (kinda like this one) but based on d3's lab colour space implementation. I have read the documentation here, here and here, even had a look at the source code here and I figured that l-value must be within [0,100] but I couldn't find a range for a and b. Does any value work with them or are they bound to a specific range (like [0, 255] for rgb for example)?
The a parameter is a position on the green-red (green-magenta) axis and the b parameter is a position on the blue-yellow axis. It's hard to give a fixed valid range for a and b, because the range that maps to displayable colours depends on the value of L (which is expected to be in the range [0, 100]). This article contains more details. You could also use d3.interpolateLab(a, b). In that call, a and b are the start and end colours, given as anything that can be interpreted as a colour (a string, d3.rgb, d3.hsl).
Need to find nearest float in a table, for each integer 0..99
https://www.excel-easy.com/examples/closest-match.html explains a great technique for finding the CLOSEST number from an array to a constant cell.
I need to perform this for many values (specifically, find nearest to a vertical list of integers 0..99 from within a list of floats).
Array formulas don't allow the compare-to value (the integer) to change as we move down the list of integers; they treat it as a constant location.
I tried Tables, referring to the integers (works) but the formula from the above web site requires an Array operation (F2, control shift Enter), which are not permitted in Tables. Correction: You can enter the formula, control-enter the array function for one cell, copy the formulas, then insert table. Don't change the search cell reference!
Update:
I can still use array operations, but I have to manually copy the desired function into each of the 100 target cells. No biggie.
Fixed typo in formula. See end of question for details about "perfection".
Example code:
AI4=some integer
AJ4=MATCH(MIN(ABS(Table[float_column]-AI4)), ABS(Table[float_column]-AI4), 0)
repeat for subsequent integers in AI5...AI103
Example data:
0.1 <= matches 0
0.5
0.95 <= matches 1
1.51 <= matches 2
2.89
Consider the case where target=5, and 4.5, 5.5 exist in the list. One gives -0.5 and the other +0.5. Searching for ABS(-.5) will give the first one. Either one is decent, unless your data is non-monotonic.
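For comparison, the same nearest-match search can be sketched outside Excel. This is only an illustration, assuming the floats and the integer targets are available as plain Python lists; the function name nearest_rows is hypothetical. It returns 0-based positions, whereas MATCH returns 1-based ones.

```python
def nearest_rows(floats, targets):
    """Return, for each target, the 0-based index of the closest float."""
    rows = []
    for target in targets:
        # min() keeps the first smallest distance, so ties resolve to the
        # earlier row, just as MATCH(..., 0) returns the first hit.
        row = min(range(len(floats)), key=lambda j: abs(floats[j] - target))
        rows.append(row)
    return rows


floats = [0.1, 0.5, 0.95, 1.51, 2.89]   # example data from the question
rows = nearest_rows(floats, range(3))    # targets 0, 1, 2
```

With the tie case described above (4.5 and 5.5 for a target of 5), this sketch also picks the first of the two rows.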
This still needs a better solution.
Thanks in advance!
I had another problem, which pushed to a better solution.
Specifically, since the Y values for the X that I am interested in can be at varying distances in X, I will interpolate X between the X point before and after. Ie search for less than or equal, also greater than or equal, interpolate the desired X, then interpolate the Y values.
I could go a step further and interpolate N - 1 to N + 1, which will give cleaner results for noisy data.
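The bracketing-and-interpolating idea could look roughly like the sketch below. It assumes the X values are sorted ascending and uses plain linear interpolation between the point before and the point after the target; the name interpolate_at is hypothetical and the data in the example is made up.

```python
from bisect import bisect_left


def interpolate_at(xs, ys, target):
    """Linearly interpolate y at `target`; xs must be sorted ascending."""
    if not xs[0] <= target <= xs[-1]:
        raise ValueError("target lies outside the tabulated range")
    hi = bisect_left(xs, target)        # first index with xs[hi] >= target
    if xs[hi] == target:
        return ys[hi]                   # exact hit, nothing to interpolate
    lo = hi - 1                         # last index with xs[lo] < target
    t = (target - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])


# Made-up sample: Y is known at X = 0.5, 1.5, 2.5 and wanted at X = 1 and 2.
xs = [0.5, 1.5, 2.5]
ys = [10.0, 20.0, 40.0]
values = [interpolate_at(xs, ys, x) for x in (1, 2)]
```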
Most of us are familiar with normal distribution curves. For those who are new to front-loaded and back-loaded normal distributions, I will first give some background and then state my problem.
Front-Loaded Distribution: As demonstrated below, it has a rapid start. For example, when more resources are assumed to be consumed early in a project, cost/hours are distributed aggressively at the start of the project.
Back-Loaded Distribution: Contrary to the front-loaded distribution, it starts out with a lower slope and becomes increasingly steep towards the end of the project, for example when most resources are assumed to be consumed late in the project.
In the above charts, the green line is the S-curve, which represents the cumulative distribution (utilization of resources over the proposed time), and the blue columns represent the isolated distribution of resources (cost/hours) in each period.
For reference, I am providing the Bell Curve / standard normal distribution (when Mean=Median) chart (below) and the associated formula to begin with.
Problem Statement: I was able to generate the normal distribution curve (See below with formulae) however I am unable to find a solution for Front loaded or Back Loaded curves.
How can I skew a normal distribution to the right (front-loaded / positively skewed, which means the mean is greater than the median) or to the left (back-loaded / negatively skewed, which means the mean is less than the median)?
Formula Explained:
Cell B8 denotes arbitrarily chosen standard deviation. It affects the kurtosis of normal distribution. In the above screenshot, I am choosing the range of the normal distribution to be from -3SD to 3SD.
Cell B9 to B18 denotes the even distribution of Z-Score using the formula:
=B8-((2*$B$8)/Period)
Cell C9 to C18 denotes the normal distribution on the basis of Z Score and the Amount using the formula:
=(NORMSDIST(B9)-NORMSDIST(B8))*Amount/(1-2*NORMSDIST($B$8))
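As a cross-check, the two formulas above can be sketched in Python. This is a minimal sketch, assuming B8 holds the starting Z-score (-3 in the screenshot) and that Period and Amount are the number of periods and the total to distribute; the function names are hypothetical, and the normal CDF is built from math.erf instead of NORMSDIST.

```python
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution (what NORMSDIST returns)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def bell_allocation(amount, periods, z_start=-3.0):
    """Spread `amount` over `periods` following a symmetric bell curve."""
    step = -2.0 * z_start / periods            # same as -(2*$B$8)/Period
    edges = [z_start + i * step for i in range(periods + 1)]
    covered = 1.0 - 2.0 * norm_cdf(z_start)    # probability mass inside the range
    return [
        (norm_cdf(b) - norm_cdf(a)) * amount / covered
        for a, b in zip(edges, edges[1:])
    ]


allocation = bell_allocation(amount=1000.0, periods=10)
```

Dividing by the covered probability mass is what makes the per-period values add up to the full amount even though the tails beyond the chosen Z range are cut off.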
Update: Following one of the links in the comments, the closest I got is the situation below. The issue is highlighted with the yellow pattern: because of the volatile RAND() function, the charts are not as smooth as they should be. Since my formula above does not create a zigzag pattern, I am sure a skewed normal distribution can be smooth too!
Note:
I am using Excel 2016, so I welcome if any newly introduced formula can solve my problem. Also, I am not hesitant to use UDFs.
The numbers of front-load and back-load distribution are notional. They could vary. I am only interested in shape of resulting chart.
Kindly help !
You can generate the curve using below methods and can use the numbers generated by them for your requirement.
With formulae
The curve
Notes:
If you want to change the bins, you have to drag the cells down or up in order to complete the series
If you want to change the total cost, you can change the multiplier
If you want to change the tilt of the curve, you can change the divider in column C, which is currently set to 2. If it is -2 the tilt changes direction. You can experiment with different numbers; the direction depends on whether the divider is less than or greater than zero
For copy-paste:
=A2+180/($G$3-1)
=RADIANS(A2)
=$G$4*SIN(B2 + SIN(B2)/2)
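The three formulas can be mirrored in a short Python sketch. It assumes column A starts at 0 degrees, that $G$3 is the number of bins and $G$4 the multiplier; the function name tilted_sine_curve and the parameter names are hypothetical.

```python
from math import radians, sin


def tilted_sine_curve(bins, multiplier, divider=2.0):
    """Half sine wave over 0..180 degrees, tilted by the inner sine term."""
    step = 180.0 / (bins - 1)                   # =A2+180/($G$3-1)
    values = []
    for i in range(bins):
        angle = radians(i * step)               # =RADIANS(A2)
        # =$G$4*SIN(B2 + SIN(B2)/2), with the 2 exposed as `divider`
        values.append(multiplier * sin(angle + sin(angle) / divider))
    return values


front = tilted_sine_curve(bins=11, multiplier=100.0, divider=2.0)
back = tilted_sine_curve(bins=11, multiplier=100.0, divider=-2.0)
```

With a positive divider the peak lands before the middle bin, and with a negative divider it lands after it, which matches the note above about the sign deciding the direction of the tilt.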
I used the actual mathematical formulas to arrive at the result. It looks to me like what you wanted to achieve. The orange cells in the 'Skewed' section are the ones which can be changed to vary the degree and direction of skew. Some pictures for demonstration are below, followed by the formulas used.
Formulas in row 5, column
B:=(A5*$A$2)+0 (0 is the mean, you can change as you like)
C:=(1/($A$2* SQRT(2*PI())))*EXP(-(B5^2)/2)
D:=0.5*(1+ERF(B5/SQRT(2)))
E:=$A$1*C5
F: =(A5*$A$2*(1+$F$2*SIN((F4*PI())/(2*$F$4))))+0 (0 is the mean, you can change as you like)
G:=(1/($A$2* SQRT(2*PI())))*EXP(-((F5+$G$2)^2)/2)
H:=0.5*(1+ERF((B5+$G$2)/SQRT(2)))
I:=$A$1*G5
If you want to make sure the bins always have a value in them, you can use the following approach, which uses normal distributions and simply changes the mean and the standard deviation to get a curve that you want.
Changing the mean moves the peak to the left or right. Changing the standard deviation makes the quantities more uniform or more variable. I've used 0-1000 as my default range in the example below, but it should be easy to modify the formula to bring any value you want. NOTE in order to fulfill your requirement that all bins must be non-zero, you need to manually adjust the numbers till you get a curve that suits.
Yellow cells are for data entry, green cells are a count (so if you add bins, they would need to be numbered according to the sequence).
Formula in cell B7 (copied down to cell B16):
=NORMDIST($A7*1000/MAX($A$6:$A$17),$B$3,$B$4,TRUE)-NORMDIST($A6*1000/MAX($A$6:$A$17),$B$3,$B$4,TRUE)
Formula in cell C7 (copied down to cell C16):
=IF(A7=MAX($A$6:$A$17),$C$5-SUM(C$6:C6),ROUND(B7/SUM($B$7:$B$17)*$C$5,0))
Adding new bins is simple enough and is still based on a 0-1000 range, so you don't need to change any numbers other than adding rows and copying down the formulae:
The above example is also showing how a narrow standard deviation and a high mean combine to make the starting bins have very little quantity. But there is still a value (as long as count is big enough).
You may want to pre-define the different skewness selections if this is going to be used by other people (make column B dependent on a lookup, for example) but hopefully this is extensible enough for your needs.
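The binning approach of this answer can be sketched in Python as follows. It is an illustration only, assuming the bins are numbered 1..N over the 0-1000 range, $B$3 is the mean, $B$4 the standard deviation and $C$5 the total count; the function names are hypothetical. Note that Python's round() rounds halves to even, while Excel's ROUND rounds them away from zero, so individual bins can differ by one.

```python
from math import erf, sqrt


def norm_cdf(x, mean, sd):
    """Cumulative normal distribution (NORMDIST with cumulative=TRUE)."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def allocate(total, bins, mean, sd, scale=1000.0):
    """Split `total` into whole-number bins that follow a normal curve."""
    edges = [i * scale / bins for i in range(bins + 1)]
    shares = [
        norm_cdf(b, mean, sd) - norm_cdf(a, mean, sd)
        for a, b in zip(edges, edges[1:])
    ]
    share_sum = sum(shares)
    # All bins except the last are rounded; the last bin takes the remainder
    # so that the counts always add up to `total`.
    counts = [round(s / share_sum * total) for s in shares[:-1]]
    counts.append(total - sum(counts))
    return counts


front_loaded = allocate(total=1000, bins=10, mean=300.0, sd=200.0)
```

Moving the mean below 500 shifts the peak towards the early bins, and moving it above 500 shifts it towards the late ones.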
If you are open to a Python answer, I can give you code that uses SciPy and the pandas library to generate random observations from a skewed normal distribution and then bin (bucket) them for you. The following is a Python script which captures the use case; it is registered as a COM server and so can be created from VBA.
import numpy as np
import pandas as pd
from scipy.stats import skewnorm


class PythonSkewedNormal(object):
    # COM registration details used by win32com
    _reg_clsid_ = "{1583241D-27EA-4A01-ACFB-4905810F6B98}"
    _reg_progid_ = 'SciPyInVBA.PythonSkewedNormal'
    _public_methods_ = ['GeneratePopulation', 'BinnedSkewedNormal']

    def GeneratePopulation(self, a, sz):
        # Fixed seed so that repeated calls return the same sample
        # https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.seed.html
        np.random.seed(10)
        # https://docs.scipy.org/doc/scipy-0.19.1/reference/generated/scipy.stats.skewnorm.html
        return skewnorm.rvs(a, size=sz).tolist()

    def BinnedSkewedNormal(self, a, sz, bins):
        # https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.seed.html
        np.random.seed(10)
        # https://docs.scipy.org/doc/scipy-0.19.1/reference/generated/scipy.stats.skewnorm.html
        pop = skewnorm.rvs(a, size=sz)
        bins2 = np.array(bins)
        bins3 = pd.cut(pop, bins2)
        # Count the observations per bin, keeping the bins in order
        table = pd.Series(bins3).value_counts(sort=False)
        table.index = table.index.astype(str)
        return table.reset_index().values.tolist()


if __name__ == '__main__':
    print("Registering COM server...")
    import win32com.server.register
    win32com.server.register.UseCommandLine(PythonSkewedNormal)
And the VBA client code
Option Explicit
Sub TestPythonSkewedNormal()
    Dim skewedNormal As Object
    Set skewedNormal = CreateObject("SciPyInVBA.PythonSkewedNormal")
    Dim lSize As Long
    lSize = 100
    Dim shtData As Excel.Worksheet
    Set shtData = ThisWorkbook.Worksheets.Item("Sheet3") '<--- change sheet to your circumstances
    shtData.Cells.Clear
    Dim vBins
    vBins = Array(-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5)
    'Stop
    Dim vBinnedData
    vBinnedData = skewedNormal.BinnedSkewedNormal(-5, lSize, vBins)
    Dim rngData As Excel.Range
    Set rngData = shtData.Cells(2, 1).Resize(UBound(vBins) - LBound(vBins), 2)
    rngData.Value2 = vBinnedData
    'Stop
End Sub
Sample output
(-5, -4] 0
(-4, -3] 0
(-3, -2] 4
(-2, -1] 32
(-1, 0] 57
(0, 1] 7
(1, 2] 0
(2, 3] 0
(3, 4] 0
(4, 5] 0
Original code deposited on my blog
Based on @usmanhaq's answer, I made a VBA macro for distribution curve simulation, corrected so that the front- and back-loaded curves scale to 100%.
Click here to go to the GitHub library.
I have a scale, based on which I decide the value of the coefficient for the multiplication. The scale looks as following:
Which means that:
for Category1: when value>=1.000.000 then coef is 1, when value>=500.000 then coef is 0.8 and etc.
Same logic applies for Category2;
Then I have input data in the following format:
Company |MainCat|Sales Amount|
Company1|T1     |   6.500.000|
Company2|T2     |      70.000|
I need to find the corresponding coefficient, the ratio of the coefficient, and the value (=ratio*MaxCoef). Currently, I am finding the coefficient the following way:
- for company1:
=IF(C8>=$D2;$D$1;IF(C8>=$E2;$E$1;IF(C8>=$F2;$F$1;IF(C8>=$G2;$G$1;IF(C8>=$H2;$H$1;IF(C8>=$I2;$I$1))))))
That is literally hardcoded and doesn't look good. Maybe there is a better way of doing this? Any suggestions?
Formula view:
You can use COUNTIF(range, "<"&value) * 0.2, since each stage of the scale adds 0.2 to the coefficient.
For your data: =COUNTIF(D2:H2, "<"&C8) * 0.2, that is, the number of thresholds the value passes multiplied by the step per stage.
The COUNTIF range must stop at H2, because I2 is 0, which is below any value and would always be counted.
To combine the COUNTIF() with a dynamic search for the right category based on MainCat you can MATCH() the MainCat with Code which will give the row where the Code is located and utilize INDIRECT() to apply it as range.
=COUNTIF(INDIRECT("D"&MATCH(B8,B:B,0)&":H"&MATCH(B8,B:B,0)),"<"&C8)*0.2
MATCH(B8,B:B,0) - will match the value on B8 (lets say T1) and return the row 2.
INDIRECT("D"&MATCH(B8,B:B,0)&":H"&MATCH(B8,B:B,0)) = INDIRECT("D"&2&":H"&2) - will turn the text into an actual range to be used by the COUNTIF().
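The counting idea behind the COUNTIF formula can be sketched in Python. Only the thresholds 1.000.000 and 500.000 come from the question; the remaining thresholds, the T2 row and all names below are hypothetical placeholders, since the original scale is not shown.

```python
# Hypothetical scale: per category, the thresholds for coefficients 1 .. 0.2.
SCALE = {
    "T1": [1_000_000, 500_000, 250_000, 100_000, 50_000],
    "T2": [500_000, 250_000, 100_000, 50_000, 10_000],
}


def coefficient(value, category, step=0.2, inclusive=False):
    """Count the thresholds the value passes and multiply by the step.

    inclusive=False mirrors COUNTIF(range, "<"&value): a value exactly on a
    threshold does not count it. inclusive=True mirrors the >= comparisons
    of the nested IF formula instead.
    """
    thresholds = SCALE[category]
    if inclusive:
        passed = sum(1 for t in thresholds if t <= value)
    else:
        passed = sum(1 for t in thresholds if t < value)
    return round(passed * step, 10)   # round away floating-point noise


coef_company1 = coefficient(6_500_000, "T1")
```

The two modes only differ when a value sits exactly on a threshold, which is worth checking against how the scale is meant to be read.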
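Create a table 'Mapping' that contains two columns, 'Category' and 'Coefficient', then use INDEX-MATCH on it as described in https://www.deskbright.com/excel/using-index-match/.
=INDEX(Mapping[Category]; MATCH([Coefficient]; Mapping[Coefficient]; -1))
This example assumes that you put this formula into a table that has a column named 'Coefficient' holding the input value to your multiple IFs.
The trick is the match_type argument: provide either -1 or 1, according to your needs. With 1, MATCH finds the largest value less than or equal to the lookup value and the lookup column must be sorted ascending; with -1, it finds the smallest value greater than or equal to the lookup value and the column must be sorted descending.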
You can do this in VBA. Write your own function which ends in something like that
=MyOwnScale(C8; B8; A2:I3)
The first parameter of your VBA function is the value, the second is the category, and the third is the range with the thresholds. So you can move your cascading IFs into VBA code, and you (and your users) see only a clean function call in the cell.
This may sound a bit odd and maybe I'm just missing the forest for the trees on this question, but is there a way to force the Excel Solver to return only one instance of a result? As a short example imagine that we have some results on the likability of various objects (colors, animals, and shapes). We want the solver to return the three most preferred objects from this list.
Color       |Animal      |Shape          |
Red (400)   |Dog (120)   |Circle (100)   |
Red (400)   |Cat (90)    |Square (75)    |
Blue (90)   |Horse (60)  |Triangle (70)  |
Green (80)  |Snake (30)  |Rectangle (40) |
Yellow (40) |Rabbit (20) |Pentagon (15)  |
The problem is, of course, simplified in this example. Basically, my issue arises in that I want one of each type, namely Red, Dog, and Circle but I keep getting Red, Red (again), and Dog because the total is higher. I want to define a way to prevent Solver from returning two values named the same. I just can't seem to figure it out and Google doesn't seem to produce any viable responses either.
It's unclear how your data is set up, and this could affect how you set up the Solver problem, but here is one method (NB: this method will only work if you have 200 or fewer values to choose from).
Make Column A for "Category". This would have values such as "Color", "Animal", and "Shape".
Column B would be for "Type", and contain the information you provided. (e.g. Dog, Cat, ... Red, Blue, ... Circle, Square, ...)
Column C is the Value or Score for the type shown in Column B, again the information you provided.
Column D has fields that Solver will manipulate, let's call it "Selected". Selected will be a 0 or a 1.
Column E is the result of selection, a simple calculation, =C2*D2, filled down.
Make Cell H2 the sum of Column E. This will be your objective for Solver.
Make G3 through G5 the values in "Category" (Color, Animal, Shape).
Make H3 through H5 the total selected values in each category. That is =SUMIF($A$2:$A$16,"="&G3,$D$2:$D$16) filled down.
The workbook looks like this ...
... from this, you can set up Solver with the following ...
Set Objective: is $H$2
To: is set to Max. (i.e. you are looking for the most preferred)
By Changing Variable Cells: is set to $D$2:$D$16
Subject to the Constraints: has four entries. $D$2:$D$16 = binary; $H$3 = 1; $H$4 = 1; $H$5 = 1
Select a Solving Method: is set to Evolutionary. You can use GRG Nonlinear, but it takes longer.
The dialog looks like this ...
... with the following result, which meets your criteria ...
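To sanity-check what the Solver model should return, the same constraint (exactly one selected item per category, maximum total score) can be brute-forced in Python. This is a sketch for verification only, not a replacement for Solver; the data is the example from the question and the function name is hypothetical.

```python
from itertools import product

# (category, type, score) rows, as laid out in columns A to C.
ITEMS = [
    ("Color", "Red", 400), ("Animal", "Dog", 120), ("Shape", "Circle", 100),
    ("Color", "Red", 400), ("Animal", "Cat", 90), ("Shape", "Square", 75),
    ("Color", "Blue", 90), ("Animal", "Horse", 60), ("Shape", "Triangle", 70),
    ("Color", "Green", 80), ("Animal", "Snake", 30), ("Shape", "Rectangle", 40),
    ("Color", "Yellow", 40), ("Animal", "Rabbit", 20), ("Shape", "Pentagon", 15),
]


def best_one_per_category(items):
    """Try every combination with exactly one item per category."""
    by_category = {}
    for category, name, score in items:
        by_category.setdefault(category, []).append((name, score))
    # product() yields one pick per category, which plays the role of the
    # "category total = 1" constraints in the Solver model.
    best = max(
        product(*by_category.values()),
        key=lambda combo: sum(score for _, score in combo),
    )
    return [name for name, _ in best], sum(score for _, score in best)


names, total = best_one_per_category(ITEMS)
```

Because each category contributes exactly one pick, the duplicated Red row can no longer be selected twice.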
I am having some trouble with applying a colour scale on circles on a scatterplot.
What I did to set up the scale was:
var color = d3.scale.ordinal()
    .domain(d3.extent(dataset, function (d) { return parseFloat(d.Weight); }))
    .range(["#D6F5D6", "#ADEBAD", "#84E184", "#5BD75B","#32CD32"]);
And then apply the color to the circles with:
.attr("fill", function (d) { return color(d.Weight); })
But from the graph I can tell that the colours are not correct, in fact I see some values that are higher that have a lighter colour. I think the problem is that the values get read like strings, and in fact if I do the console log, the values that are not the min or max appear as strings, and I believe that this is the problem why I get wrong color values.
I tried also to set every d.Weight as a number in the domain, like so:
dataset.forEach(function (d) { d.Weight = +d.Weight; });
But it doesn't work either. Attached here's an image of what I'm getting:
The X Axis is set to the Weight, so the colour of the points should be from lighter to darker going left to right, but it clearly isn't:
Any help is appreciated, thanks!
EDIT: Forgot to mention that I tried scale.linear(), but with it I get this result:
Just the first values get picked by the circles
Ordinal scales work differently from linear scales in that no interpolation between domain elements is done. To quote the documentation:
The first element in values will be mapped to the first element in the output range, the second domain value to the second range value, and so on.
This means that your input values are being mapped incorrectly, as you're only using min and max to set up the domain. For your purposes, you probably want a linear scale.
When using a linear scale, you'll have to tell D3 how to interpolate the values of the output range, e.g.
.range(["#D6F5D6", "#ADEBAD", "#84E184", "#5BD75B","#32CD32"])
.interpolate(d3.interpolateRgb);
Note that in this case, you also need to provide the same number of input values as output values:
Although linear scales typically have just two numeric values in their domain, you can specify more than two values for a polylinear scale. In this case, there must be an equivalent number of values in the output range.
In your case, it would be easiest to simply take the min and max and the two extreme colours.
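To make the difference to an ordinal scale concrete, here is a rough Python sketch of what a two-point linear scale with RGB interpolation computes. It is not D3's implementation; the function names and the domain of 50 to 100 are hypothetical, and only the two extreme colours come from the question.

```python
def hex_to_rgb(color):
    """Turn '#RRGGBB' into a tuple of three integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def linear_color_scale(domain, color_range):
    """Map [min, max] linearly onto the line between two colours."""
    lo, hi = domain
    start, end = hex_to_rgb(color_range[0]), hex_to_rgb(color_range[1])

    def scale(value):
        # float() plays the role of parseFloat / the unary plus: string
        # inputs are converted before any arithmetic happens.
        t = (float(value) - lo) / (hi - lo)
        channels = (round(a + (b - a) * t) for a, b in zip(start, end))
        return "#" + "".join(f"{c:02x}" for c in channels)

    return scale


# Hypothetical weight range of 50..100 with the two extreme colours.
color = linear_color_scale((50, 100), ("#D6F5D6", "#32CD32"))
middle = color("75")
```

Every weight gets its own shade between the two extremes, so higher values always come out darker, whereas an ordinal scale only assigns range colours to distinct domain entries in order.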