Background color for excel file with Pandas: problem with single column boundaries (for outliers)


I need to export an Excel file in which the outliers of my dataset are highlighted in yellow. The outliers are determined with the usual lower and upper bounds (Q1 - 1.5*IQR and Q3 + 1.5*IQR).
I've managed to set up the function, but I can't get it to work iteratively so that each column uses its own bounds.
When I export the dataframe to an Excel file, the cells are coloured based on the bounds of only one variable.
Here is my code. Where am I going wrong?
First I calculate the lower and upper bound for each column (except for the subject ID):
for col in dfm.columns.difference(['Sbj']):
    Q1c = dfm[col].quantile(0.25)
    Q3c = dfm[col].quantile(0.75)
    IQRc = Q3c - Q1c
    lowc = Q1c - 1.5*IQRc
    uppc = Q3c + 1.5*IQRc
Then I define the function that colours a single value based on the bounds (not iterated):
def color(v):
    if v < lowc or v > uppc:
        color = 'yellow'
        return 'background-color: %s' % color
Finally I apply the function for each column:
for col in dfm.columns.difference(['Sbj']):
    df_colored = dfm.style.applymap(color)
It is obvious that something is wrong with the iteration.
Many thanks!!
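
A likely reason only one variable's bounds take effect: the first loop overwrites lowc and uppc on every pass, so by the time color runs only the bounds of the last column remain, and applymap then applies that single function to every cell. One way around it is to compute the bounds inside a function that receives a whole column. The following is a minimal sketch, not a tested pandas recipe: the helper name highlight_outliers is hypothetical, and statistics.quantiles with method="inclusive" is used on the assumption that it reproduces the linear interpolation that pandas' quantile uses by default.

```python
from statistics import quantiles

def highlight_outliers(values, color="yellow"):
    """Return one CSS string per value; 1.5*IQR outliers get a background colour."""
    data = list(values)
    # Quartiles of this column only (inclusive method = linear interpolation).
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low, upp = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [f"background-color: {color}" if v < low or v > upp else "" for v in data]

# Plain-list demo; every column would be handled separately in the same way.
styles = highlight_outliers([1, 2, 3, 4, 100])
```

With pandas, Styler.apply (unlike applymap) passes each column to the function as a Series, so a call along the lines of dfm.style.apply(highlight_outliers, subset=dfm.columns.difference(['Sbj'])).to_excel('colored.xlsx') should colour each column against its own bounds. The file name is a placeholder.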

Related

How to calculate with the Poisson-Distribution in Matlab?

I used Excel in the past, but the calculations involving the Poisson distribution took a while, so I switched to SQL. I soon realized that SQL is not a good fit for statistical problems, so I finally decided to switch to Matlab. I'm not used to it at all, though, and my problem is the following:
I've imported a .csv table and have two columns of values, let's say A and B (each a 110 x 1 double).
Both are input values for my Poisson calculations. Since I want to calculate at least the first 20 events, I've created a variable z = 1:20.
When I now calculate, say,
New = poisspdf(z,A),
Matlab reports something like "non-scalar arguments must match in size".
z only has 20 entries, but A and B both have 110 entries. So I expanded it to Z = 1:110 and transposed it:
Znew = Z.'
When I now try to execute the actual calculation:
Results = poisspdf(Znew,A).*poisspdf(Znew,B)
I always get only a 110x1 vector, but what I want is a 20x20 matrix for each record of A and B (based on my original choice of z = 1:20; I only changed to Z = 1:110 because Matlab said the sizes need to match).
Each cell of this 20x20 matrix should contain the result of a slightly different evaluation of poisspdf(Znew,A).*poisspdf(Znew,B).
For example, in the first cell (1,1) I want the result of
poisspdf(0,value of A).*poisspdf(0,value of B),
in cell (1,2): poisspdf(0,value of A).*poisspdf(1,value of B),
in cell (2,1): poisspdf(1,value of A).*poisspdf(0,value of B),
and so on, assuming the format is cell(row, column).
Finally I want to sum up certain parts of each 20x20 matrix and show the result of the summed up parts in new columns.
Is there anybody able to help? Many thanks!
EDIT:
Poisson Matrix in Excel
In Excel there is Poisson-function: POISSON(x, μ, FALSE) = probability density function value f(x) at the value x for the Poisson distribution with mean μ.
In e.g. cell AD313 in the table above there is the following calculation:
=POISSON(0;first value of A;FALSE)*POISSON(0;first value of B;FALSE)
, in cell AD314
=POISSON(1;first value of A;FALSE)*POISSON(0;first value of B;FALSE)
, in cell AE313
=POISSON(0;first value of A;FALSE)*POISSON(1;first value of B;FALSE)
, and so on.

I am not sure if I completely understand your question. I wrote this code that might help you:
clear; clc
% These are the lambdas parameters for the Poisson distribution
lambdaA = 100;
lambdaB = 200;
% Generating Poisson data here
A = poissrnd(lambdaA,110,1);
B = poissrnd(lambdaB,110,1);
% Get the first 20 samples
zA = A(1:20);
zB = B(1:20);
% Perform the calculation
results = repmat(poisspdf(zA,lambdaA),1,20) .* repmat(poisspdf(zB,lambdaB)',20,1);
% Sum
sumFinal = sum(results,2);
Let me know if this is what you were trying to do.
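
For reference, the grid described in the question (rows indexed by the count for A, columns by the count for B, both starting at 0) is an outer product of two Poisson probability vectors. Below is a minimal sketch in plain Python rather than Matlab. The function names, the rates 1.5 and 0.8, and the choice of which parts to sum (below, on, and above the diagonal) are illustrative assumptions, not taken from the question.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_grid(lam_a, lam_b, n=20):
    """n x n matrix with grid[i][j] = P(X_A = i) * P(X_B = j), counts 0..n-1."""
    pa = [poisson_pmf(i, lam_a) for i in range(n)]
    pb = [poisson_pmf(j, lam_b) for j in range(n)]
    return [[a * b for b in pb] for a in pa]

def triangle_sums(grid):
    """Sum the cells below, on, and above the main diagonal."""
    n = len(grid)
    below = sum(grid[i][j] for i in range(n) for j in range(n) if i > j)
    diag = sum(grid[i][i] for i in range(n))
    above = sum(grid[i][j] for i in range(n) for j in range(n) if i < j)
    return below, diag, above

# One record: a single pair of rates (hypothetical values).
grid = poisson_grid(1.5, 0.8)
below, diag, above = triangle_sums(grid)
```

Repeating this for each of the 110 (A, B) pairs gives one grid and one set of sums per record. In Matlab the same outer product can be formed by multiplying a column vector of probabilities by a row vector.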

Spotfire Table Visualization column width

Is there a way to set the column width of a Spotfire Table visualization through a Java or Python script? I am able to change the column width manually, but whenever the value changes through a property control it resets again. I need to set a constant column width. Thanks in advance.

from Spotfire.Dxp.Application.Visuals import TablePlot
# Tab is the (Visualization) parameter passed to the script specify the table to work on
dataTable= Tab.As[TablePlot]().Data.DataTableReference
# Get a handle to the Table plot
table = Tab.As[TablePlot]()
# Get the ColumnCollection for the Table plot
columns = table.TableColumns
# Size all of the columns
for x in columns:
    x.Width = 200

Good question! You can do this pretty simply with an IronPython script (I don't believe it's possible using JavaScript unless you are some kind of wizard, or otherwise hate yourself :)
I'll put all the examples in one code snippet, but obviously you would only want to run one of these loops at a time. The snippet expects a parameter called viz whose value is the TablePlot visualization you wish to modify.
from Spotfire.Dxp.Application.Visuals import TablePlot
v = viz.As[TablePlot]()
# increase all column widths by 10 pixels
for col in v.Columns:
    col.Width = col.Width + 10
# set all column widths to 100 pixels (the default value)
for col in v.Columns:
    col.Width = 100
# set the width of a specific column to 50 pixels
for col in v.Columns:
    if col.Name == "My Column":
        col.Width = 50

How to recreate Excel's "Index(,,Match())" function in SPSS?

I am trying to recreate Excel's "Index(,,Match())" function in SPSS. My data is organized as follows:
The "Position" variables indicate what column (T:V) the value in the "Value" variables should go.
In the 1st row, the positions are in order 1-3, so the values in columns T:V are in the same order as the "Value" variables.
In the second row the positions are 2, 3, 1, so the value in "Value1" should go in column U (the second column in that last block of variables), the value in "Value2" should go in column V, and the value in "Value3" should go in column T. And so on.
After looking into this, it seems that SPSS's Index and Match functions will not help.
Do any Excel/SPSS users know how to accomplish this in SPSS with syntax?

There are probably several ways to approach the problem, depending on how many columns you're dealing with and whether they're all numeric or if there are strings (there's probably a matrix algebra answer, I just can't think of it).
If you only have 3 sets of 3 columns, the simplest approach would be to write 9 (3*3) if-statements (you don't have column names for cols T/U/V, so I'm just referencing their Excel column):
if (Position1 = 1) T = Value1.
if (Position1 = 2) U = Value1.
if (Position1 = 3) V = Value1.
if (Position2 = 1) T = Value2.
if (Position2 = 2) U = Value2.
...
This should work. If you have many more columns, you can also use vector loops to define the sets of variables.

Here's a scalable approach:
vector match(3).
do repeat p = position1 to position3 / v= value1 to value3 / y = #y1 to #y3.
compute y = v*p.
end repeat.
loop #i = 1 to 3.
compute match(#i) = any(#i, #y1 to #y3).
end loop.
exe.
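
To make the intended mapping concrete, here is a small sketch in Python rather than SPSS syntax. It follows the rule stated in the question (the k-th value goes to the output column named by the k-th position). The function name place_by_position and the sample values are made up for illustration.

```python
def place_by_position(positions, values):
    """Put values[k] into output slot positions[k] (positions are 1-based)."""
    out = [None] * len(values)
    for pos, val in zip(positions, values):
        out[pos - 1] = val
    return out

# Second row of the question: positions 2, 3, 1.
row = place_by_position([2, 3, 1], ["a", "b", "c"])
# row is ["c", "a", "b"]: Value3 lands in T, Value1 in U, Value2 in V.
```

Any SPSS solution can be checked against this rule on a few rows by hand before running it on the full file.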

How to do math operation on rows in gnuplot

Say, my data file has two columns and five rows as follows,
1 3
2 5
3 3
4 4
5 2
Now I would like to plot them but with a little math operation on second column. For example,
plot 'test.dat' u 1:($2*)
What I mean by the asterisk is that I would like to plot sqrt(row2^2+row1^2) of the second-column values, which is sqrt(5^2+3^2) for the first two rows. How can I do that? Many thanks!

Usually, one can access only the values of all columns of the current row. Accessing the values of a previous row is possible, but tricky. Basically, you must save the values in temporary variables.
That works in the following way:
In the first row, save the values of both columns and do not plot them (use NaN as value).
In the second row, save the current x-values, use the x-value of the previous row. Then save the current y-value, and compute your value based on the previous row (prevY) and the current row (currY).
This does not plot a point for the last row, since it has no following row. If you also want the last row plotted, e.g. with 0 as the additional value, you must append a final row 0 0 to the data.
In the script I use set macros for better readability of the code:
set macros
prevX = currX = prevY = currY = 0
UsePreviousXvalue = '(($0 == 0) ? (prevX = NaN, currX = $1) : (prevX = currX, currX = $1)), prevX'
AssignYvalue = '(prevY = currY, currY = $2)'
plot 'test.dat' using (@UsePreviousXvalue):(@AssignYvalue, sqrt(prevY**2 + currY**2))
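
As a cross-check on what the plot command produces, the same pairing of consecutive rows can be sketched in Python. This is only an illustration of the arithmetic, not a replacement for the gnuplot script; the function name pairwise_hypot is made up.

```python
import math

def pairwise_hypot(rows):
    """For consecutive rows, pair the earlier x with sqrt(y_prev^2 + y_curr^2)."""
    return [
        (x_prev, math.sqrt(y_prev ** 2 + y_curr ** 2))
        for (x_prev, y_prev), (_, y_curr) in zip(rows, rows[1:])
    ]

# The five rows of test.dat from the question.
data = [(1, 3), (2, 5), (3, 3), (4, 4), (5, 2)]
points = pairwise_hypot(data)
# Four points result; the third is (3, 5.0) because sqrt(3^2 + 4^2) = 5.
```

Five input rows give four points, which matches the remark that the last row is not plotted.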

Black-Scholes formula

I found the Black-Scholes option calculator.
As you can see from the spreadsheet, the adjusted stock price is given by =S*EXP(-q*T). How can I use a function in Excel to find the value of q in my spreadsheet?

If the adjusted stock price is A = Se^(-qT), then A/S = e^(-qT) and S/A = e^(qT).
Therefore, ln(S/A) = qT and so ln(S/A)/T = q as desired.
So your formula should be
=LN(S/A)/T
where S, A, and T are cell references to the values from which you want to compute the value of q.
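
The rearrangement can be checked numerically with a short Python sketch. The function name solve_q and the input values are made up for illustration: A is built from a known q and the formula should return that same q.

```python
import math

def solve_q(s, a, t):
    """Solve a = s * exp(-q * t) for q."""
    return math.log(s / a) / t

# Round trip with made-up inputs: build A from a known q, then recover q.
s, q, t = 100.0, 0.03, 0.5
a = s * math.exp(-q * t)
recovered = solve_q(s, a, t)
```

Note that the formula needs S, A, and T to be positive, otherwise the logarithm or the division is undefined.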
