geom_mark_hull includes empty spaces with no points - ggforce

I am plotting NMDS biplots and using ggforce::geom_mark_hull to draw hulls around points representing fish communities from two sites, faceted by year and season (wet and dry):
Biplot
I found that some of the hulls include empty regions with no points (e.g. the Upstream group in the 2018 wet season and the Upstream group in the 2020 dry season). I have tried adjusting the expand argument, but the problem persists.
Any idea what the reason might be? I would appreciate any comments or help! Thanks in advance!
The raw data files used are fish.spp and fish.env. The code is:
library(vegan)
library(ggplot2)
library(ggforce)
library(concaveman)
fish.nmds.k3 <- metaMDS(fish.spp, distance = "bray", k = 3)
fish.nmds.k3.sites <- as.data.frame(fish.nmds.k3$points)
fish.nmds.k3.sp <- as.data.frame(fish.nmds.k3$species)
(ggplot(fish.nmds.k3.sites, aes(MDS1, MDS2)) +
  ylim(-3, 3) + xlim(-3, 3) +
  geom_mark_hull(aes(fill = fish.env$Site), expand = unit(3, "mm")) +
  geom_point() +
  labs(title = "NMDS Plot of Freshwater Fish Community") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 10),
        panel.grid = element_blank()) +
  facet_grid(rows = vars(fish.env$Year), cols = vars(fish.env$Season))
)

Related

Border/Fill Around Strip Chart Points

Question
I know this question (or a similar one) has been posted before, but I cannot work out what is going on.
I am making a strip chart and have tried both geom_jitter(position = position_jitter(0.065)) and geom_point(position = position_jitter(0.065)), but I cannot seem to put a border around each of my points.
I have only been able to fill them with a colour, regardless of whether I use fill = or colour =.
Code:
ggplot(Titanic.Data, aes(x = as.factor(Survive),
                         y = Age,
                         colour = Survive)) +
  geom_jitter(position = position_jitter(0.065)) +
  labs(x = "Survive",
       y = "Age",
       title = "Strip Chart of Age vs Survive") +
  scale_colour_manual(values = c("DarkBlue", "DarkRed")) +
  theme_test() +
  theme(plot.title = element_text(hjust = 0.5,
                                  face = "bold",
                                  size = 18)) +
  theme(legend.position = "none")
Graph:
Here is what the graph looks like
The data set is huge and I do not know how to attach the file to this thread, so here is a small portion of it:
Titanic Data

In the drc package, drm with fct = L.4 finds wrong intercept parameters, even though the graph is right

I have a problem with the following code.
It calculates the drc curve correctly, but the EC50 is wrong, although the two are closely related...
library(drc)
library(ggplot2)

x <- c(-1, -0.114074, 0.187521, 0.363612, 0.488551, 0.585461, 0.664642, 0.730782, 0.788875, 0.840106, 0.885926, 0.92737, 0.965202, 1)
y <- c(100, 3.978395643, 0.851717911, 0.697307565, 0.512455497, 0.512455497, 0.482273052, 0.479293487, 0.361024717, 0.355324864, 0.303120838, 0.286539832, 0.465692047, 0.358045152)
mat <- cbind(x, y)
df <- as.data.frame(mat)

calc <- drm(
  formula = y ~ x,
  data = df,
  fct = L.4(names = c("hill", "min_value", "max_value", "ec50"))
)

plot <- ggplot(df, aes(x = x, y = y), color = "black") +
  geom_point() +
  labs(x = "x", y = "y") +
  theme(
    axis.title.x = element_text(color = "black", size = 10),
    axis.title.y = element_text(color = "black", size = 10),
    axis.line.x = element_line(color = "black"),
    axis.line.y = element_line(color = "black")
  ) +
  stat_smooth(
    formula = y ~ x,
    method = "drm", color = "black",
    method.args = list(fct = L.4(names = c("hill", "min_value", "max_value", "ec50"))),
    se = FALSE
  ) +
  theme(panel.background = element_rect(fill = "white")) +
  ylim(0, NA)

ec50 <- ED(calc, 50)
print(ec50)
print(calc)
print(plot)
This is the graph I obtain:
But if I print the parameters of the function L.4, I have the following result:
hill:(Intercept) 6.3181
min_value:(Intercept) 0.3943
max_value:(Intercept) 111.0511
ec50:(Intercept) -0.6520
max_value:(Intercept) is obviously wrong (it has to be 100) and, as a consequence, ec50 is wrong too.
I would also add that for other data sets the min_value:(Intercept) is wrong too (with values < 0...).
I cannot find the mistake, because the graph derived from the same L.4 function shows the right values.
Thank you very much for your help!
The four-parameter logistic (4PL) fit assumes a curve that is symmetrical about its inflection point, so the approach to the lower asymptote mirrors the approach to the upper asymptote.
Your data may max out at 100, but the model estimates the upper asymptote above 100 (111) because that is where the actual asymptote lies, not at the end of your data.
So the graph is drawn through your data, but the estimated parameters force a symmetrical curve onto it, which pushes the upper asymptote up. This also shifts the EC50.
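For reference, and as far as I recall drc's parameterisation (treat this as an assumption rather than a quote from the documentation), L.4 is the symmetric four-parameter logistic; with the renamed parameters above it reads:

f(x) = min_value + (max_value - min_value) / (1 + exp(hill * (x - ec50)))

Because the curve is symmetric about x = ec50, max_value is a free asymptote estimated from the overall shape of the curve, not the maximum of the observed data.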

Getting the Equation From Trendlines Once They've Been Formatted

Good morning,
This problem has been plaguing me since before Christmas. I have a graph to which I apply trendlines, then format each trendline to get a more detailed equation (6 significant figures rather than Excel's default). However, when I come to extract the equation of a trendline, sometimes I get the full 6-sig-fig version and other times I get the default version. I can see no logic behind why this changes from case to case, although I'm sure there is a reason. The code I use is below (I'm also sure there are far more efficient ways to write this; at the moment I'm just trying to get it to work):
' Format the trendline's equation label to scientific notation with 6 significant figures
Set t = My_Chart.Chart.SeriesCollection(X).Trendlines(1)
t.DataLabel.NumberFormat = "0.00000E+00"
' Give Excel a moment to redraw the label before reading it
Application.Wait (Now + TimeValue("0:00:01"))
' Split the equation text in half and re-join it
Eqn_Len = Len(t.DataLabel.Text)
LHS_Len = Round(Eqn_Len / 2)
RHS_Len = Eqn_Len - LHS_Len
LHS = Left(t.DataLabel.Text, LHS_Len)
RHS = Right(t.DataLabel.Text, RHS_Len)
Equation_Formula = LHS & RHS
Cells(95, 1) = Equation_Formula
' Strip the "y=" prefix and rewrite x3/x2 as x^3/x^2
Equation_Formula_Main = Replace(Equation_Formula, "y=", "")
Equation_Formula_Main = Replace(Equation_Formula_Main, "x3", "x^3")
Equation_Formula_Main = Replace(Equation_Formula_Main, "x2", "x^2")
Cells(95, 1).Value = Equation_Formula_Main
Equation_Formula_Main = Right(Equation_Formula_Main, Len(Equation_Formula_Main) - 2)
The trendline I'm using is a 3rd-order polynomial with about 15-30 points, depending on the data. Can anyone see any reason why this would be the case?
To help, I've added an example.
I have defined all the lengths as Double, the variables that contain strings as String, and t as a Trendline.
My current equation is "y = -8.80186E-09X^3 + 1.28815E-04X^2 - 5.17488E-01X + 7.06521E+02". This gives the variable Eqn_Len a value of 42, which is clearly wrong; however, Len(t.DataLabel.Text) is equal to 63. Following the code through, the variable LHS becomes "y = -9E-09X^3 + 0.0001" and the variable RHS becomes "x^2 - 0.5175X + 706.52". You can see that this is Excel's default version of the trendline equation rather than the formatted version it should be.
Thank you for any help you may be able to provide, plus any light you can shed on this rather small but annoying issue.

Apply function to masked region

I have an image like this:
I have both the mask and the original image. I would like to calculate the colour temperature of ONLY the duck region.
Right now, I'm iterating through each row and column of the image below and collecting the pixels whose values are not zero, but I don't think this is the right way to do it. Any suggestions?
What I did was:
import cv2
import numpy as np

xyzImg = cv2.cvtColor(resImage, cv2.COLOR_BGR2XYZ)
x, y, z = cv2.split(xyzImg)

xList = []
yList = []
zList = []

rows = x.shape[0]
cols = x.shape[1]

for i in range(rows):
    for j in range(cols):
        if (x[i][j] != 0) and (y[i][j] != 0) and (z[i][j] != 0):
            xList.append(x[i][j])
            yList.append(y[i][j])
            zList.append(z[i][j])

xAvg = np.mean(xList)
yAvg = np.mean(yList)
zAvg = np.mean(zList)

xs = xAvg / (xAvg + yAvg + zAvg)
ys = yAvg / (xAvg + yAvg + zAvg)
xyChrome = np.array([xs, ys])
But this is very slow and I don't think it's right...
The simplest way would be to use the cv2.mean() function.
It takes two arguments, src (with 1 to 4 channels) and mask, and returns a vector with the mean value of each channel.
Refer to cv2::mean.
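A minimal sketch of that approach; the file names are placeholders (assumptions), and it assumes the mask is a single-channel 8-bit image that is non-zero over the duck region:

import cv2
import numpy as np

# Placeholder file names: the original image and its duck mask from the question.
resImage = cv2.imread("ducks.png")
mask = cv2.imread("ducks_mask.png", cv2.IMREAD_GRAYSCALE)

# Convert to XYZ as in the question.
xyzImg = cv2.cvtColor(resImage, cv2.COLOR_BGR2XYZ)

# Per-channel means computed only over the non-zero pixels of the mask.
# cv2.mean() returns a 4-tuple; the last value is unused for a 3-channel image.
xAvg, yAvg, zAvg, _ = cv2.mean(xyzImg, mask=mask)

# Chromaticity coordinates from the mean tristimulus values.
total = xAvg + yAvg + zAvg
xyChrome = np.array([xAvg / total, yAvg / total])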

Delete meshgrid points between a bounding box points

I am looking for an efficient way to delete the points of a meshgrid that fall inside the bounding boxes of blocks (block1 and block2 in the code). My code is:
import numpy as np

x_max, x_min, y_max, y_min = 156.0, 141.0, 96.0, 80.0
offset = 5
stepSize = 0.2
x = np.arange(x_min - offset, x_max + offset, stepSize)
y = np.arange(y_min - offset, y_max + offset, stepSize)
xv, yv = np.meshgrid(x, y)
# Bounding boxes (and the points inside them) that I want to remove from the mesh
block1 = [(139.78, 86.4), (142.6, 86.4), (142.6, 88.0), (139.78, 88.0)]
block2 = [(154.8, 87.2), (157.6, 87.2), (157.6, 88.8), (154.8, 88.8)]
As per one of the answers, I could generate the required result if I had only one block to remove from the mesh, but it does not work for multiple blocks. What would be an optimized way to remove multiple blocks from the meshgrid? The final figure should look like this:
Mesh
Edit: improved the question and edited the code.
Simply redefine your x and y around your block:
block = np.asarray(block1)  # a single block, converted to an array so the columns can be sliced
block_xmin = np.min(block[:, 0])
block_xmax = np.max(block[:, 0])
block_ymin = np.min(block[:, 1])
block_ymax = np.max(block[:, 1])
X = np.hstack((np.arange(x_min - offset, block_xmin, stepSize),
               np.arange(block_xmax, x_max + offset, stepSize)))
Y = np.hstack((np.arange(y_min - offset, block_ymin, stepSize),
               np.arange(block_ymax, y_max + offset, stepSize)))
XV, YV = np.meshgrid(X, Y)
I think I figured it out based on the explanation by @hpaulj (I cannot upvote his suggestions, probably due to low reputation). I can append the blocks to an allBlocks list and then loop over allBlocks, disabling the points in the mesh as I go. Here is my solution:
x_new = np.copy(xv)
y_new = np.copy(yv)
ori_x = xv[0][0]
ori_y = yv[0][0]
for block in allBlocks:
    block_xmin = np.min((block[0][0], block[1][0]))
    block_xmax = np.max((block[0][0], block[1][0]))
    block_ymin = np.min((block[0][1], block[1][1]))
    block_ymax = np.max((block[0][1], block[3][1]))
    rx_min, rx_max = int((block_xmin - ori_x) / stepSize), int((block_xmax - ori_x) / stepSize)
    ry_min, ry_max = int((block_ymin - ori_y) / stepSize), int((block_ymax - ori_y) / stepSize)
    for i in range(rx_min, rx_max + 1):
        for j in range(ry_min, ry_max + 1):
            x_new[j][i] = np.nan
    for i in range(ry_min, ry_max + 1):
        for j in range(rx_min, rx_max + 1):
            y_new[i][j] = np.nan
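As an alternative to the nested loops, here is a vectorized sketch of the same idea: build a boolean mask of the grid points that fall inside any block and set them to NaN. The (xmin, ymin, xmax, ymax) tuples below are just the corner coordinates of block1 and block2 from the question; everything else follows the variable names used above.

import numpy as np

x_max, x_min, y_max, y_min = 156.0, 141.0, 96.0, 80.0
offset = 5
stepSize = 0.2
x = np.arange(x_min - offset, x_max + offset, stepSize)
y = np.arange(y_min - offset, y_max + offset, stepSize)
xv, yv = np.meshgrid(x, y)

# Axis-aligned bounding boxes as (xmin, ymin, xmax, ymax), taken from block1 and block2.
allBlocks = [(139.78, 86.4, 142.6, 88.0),
             (154.8, 87.2, 157.6, 88.8)]

# Mark every grid point that falls inside any of the boxes.
inside = np.zeros(xv.shape, dtype=bool)
for xmin, ymin, xmax, ymax in allBlocks:
    inside |= (xv >= xmin) & (xv <= xmax) & (yv >= ymin) & (yv <= ymax)

# Disable those points by replacing their coordinates with NaN.
x_new = np.where(inside, np.nan, xv)
y_new = np.where(inside, np.nan, yv)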
