Risk calculation probability - statistics

I have a probability of a breach happening for Company A in 2019: 0.25
I have a probability of a breach happening for Company A historically (2010-2019): 0.10
How can I integrate a model where both probabilities are communicated?

Not possible. A breach in 2019 is also a breach in the period 2010 through 2019 (inclusive), so if there is a 25% chance in 2019, there must be at least a 25% chance over 2010 through 2019.
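One way to spell out the reasoning above, assuming the historical figure is meant as the probability of at least one breach anywhere in 2010 through 2019 (the event symbols $B_y$ are introduced here only for illustration):

```latex
% B_y: the event "Company A is breached in year y" (notation assumed here)
\[
  B_{2019} \subseteq \bigcup_{y=2010}^{2019} B_y
  \quad\Longrightarrow\quad
  P\Bigl(\bigcup_{y=2010}^{2019} B_y\Bigr) \;\ge\; P(B_{2019}) = 0.25 > 0.10 .
\]
```

If 0.10 is instead an average per-year rate over the decade, the two numbers describe different quantities and this inequality does not apply to them.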

Related

Actuarial vs. predicted survival comparison

I have a set of patients and their actuarial 1- and 5-year survival. I have also run their data through a commonly used medical score that calculates survival probability at 1 and 5 years (for example 75% and 55% respectively). I'd like to compare both survival rates.
I calculated the mean predicted survival for all patients at 1 and 5 years as the mean of the predicted survival probabilities. I then calculated the mean actuarial survival by scoring each patient as 100% if alive and 0% if dead at 1 and 5 years. I then compared the means of both groups with a t-test.
I have a feeling that what I am doing is grossly incorrect and goes against all rules of statistics; however, I have not found a solution to my problem anywhere. Maybe someone can help me? R packages and code are welcome.
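A minimal sketch of the comparison described above, in Python rather than R, using made-up numbers purely for illustration. It computes the mean predicted probability and the observed proportion alive, and adds the Brier score (mean squared difference between each predicted probability and the 0/1 outcome), which is a swapped-in alternative to the t-test and not something the question mentions. It assumes every patient's status at 1 year is known, so it ignores censoring.

```python
from statistics import mean

# Hypothetical data for illustration only: predicted 1-year survival
# probabilities from the score, and observed status at 1 year
# (1 = alive, 0 = dead). Not taken from any real cohort.
predicted_1y = [0.75, 0.60, 0.90, 0.55, 0.80, 0.70]
alive_1y = [1, 0, 1, 1, 1, 0]


def mean_predicted(probs):
    """Average predicted survival probability across patients."""
    return mean(probs)


def observed_rate(status):
    """Observed proportion alive (the '100% / 0%' scoring, averaged)."""
    return mean(status)


def brier_score(probs, status):
    """Mean squared error between predictions and 0/1 outcomes (lower is better)."""
    return mean((p - s) ** 2 for p, s in zip(probs, status))


print(mean_predicted(predicted_1y), observed_rate(alive_1y))
print(brier_score(predicted_1y, alive_1y))
```

The first two numbers compare the group-level rates; the Brier score compares predictions with outcomes patient by patient, so two groups with equal means can still score differently.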

Python time series using FB Prophet with covid-19

I have prepared a time series model using FB Prophet for making forecasts. The model forecasts the coming 30 days, and my data ranges from Jan 2019 through Mar 2020 (both months inclusive) with all the dates filled in. The model has been built specifically for the UK market.
I have already taken care of the following:
Seasonality
Holidaying Effect
My question is: how do I account for the current COVID-19 situation in the same model? The cases that I am trying to forecast also depend on the previous data, at least from Jan 2020. So in order to forecast, I need to take into account the current coronavirus situation, which would affect my forecasts in addition to seasonality and the holiday effect.
How should I achieve this?
Haven't tried it but one thing I have in mind is to add covid-19 as a one-time holiday. You can see an example for adding a one-time promo as a pseudo-holiday in the following link: https://towardsdatascience.com/forecasting-in-python-with-facebook-prophet-29810eb57e66
I assume that it can also be applied to a negative effect, such as the one caused by COVID-19.
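A rough sketch of the pseudo-holiday idea, untested against the asker's data. The start date, the 30-day window, and the helper names are assumptions chosen for illustration; the only Prophet features relied on are the `holidays` argument and its `holiday`, `ds`, `lower_window`, and `upper_window` columns.

```python
from datetime import date


def covid_pseudo_holiday(start, days):
    """Build one holidays row covering `days` consecutive days from `start`."""
    return {
        "holiday": "covid_19",      # label is arbitrary
        "ds": start.isoformat(),
        "lower_window": 0,          # no effect before the start date
        "upper_window": days - 1,   # extend the effect over the following days
    }


def build_model(start=date(2020, 3, 23), days=30):
    """Assumed start date and window length; adjust both to the data."""
    # Imported here so the helper above can be used without Prophet installed.
    import pandas as pd
    from prophet import Prophet  # packaged as `fbprophet` in older releases

    holidays = pd.DataFrame([covid_pseudo_holiday(start, days)])
    holidays["ds"] = pd.to_datetime(holidays["ds"])
    return Prophet(holidays=holidays)
```

If the model already uses a holidays table, the COVID row would need to be appended to that table rather than replacing it.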
I have had the same issue with COVID at my work with sales forecasting. The easy solution for me was to add an extra regressor that indicates the COVID period, and use that in my model. Then my future forecasts are not affected by COVID unless I tell the model that they should be.
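The extra-regressor approach might look roughly like the sketch below. The column name `covid`, the start date, and the helper functions are assumptions for illustration, not the answerer's actual code; `add_regressor`, `make_future_dataframe`, and `predict` are Prophet's own methods.

```python
from datetime import date

# Assumption for illustration: the COVID period starts on this date.
# Pick whatever start (and end) fits the series being forecast.
COVID_START = date(2020, 3, 23)


def covid_flag(day, start=COVID_START, end=None):
    """Return 1 if `day` falls inside the COVID period, else 0."""
    if day < start:
        return 0
    if end is not None and day > end:
        return 0
    return 1


def fit_and_forecast(df, periods=30, covid_end=None):
    """`df` is a pandas DataFrame with Prophet's usual `ds` and `y` columns."""
    # Imported here so covid_flag can be used without Prophet installed.
    import pandas as pd
    from prophet import Prophet  # packaged as `fbprophet` in older releases

    df = df.copy()
    df["covid"] = [covid_flag(ts.date(), end=covid_end)
                   for ts in pd.to_datetime(df["ds"])]

    model = Prophet()
    model.add_regressor("covid")
    model.fit(df)

    # The regressor must also be supplied for the future dates; this is
    # where you "tell" the model whether the forecast period is affected.
    future = model.make_future_dataframe(periods=periods)
    future["covid"] = [covid_flag(ts.date(), end=covid_end)
                       for ts in future["ds"]]
    return model.predict(future)
```

Passing a `covid_end` before the forecast window sets the indicator to 0 for future dates, which corresponds to a forecast that is not affected by COVID.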

How to find the maximum and lowest value of a random normal or log-normal distribution?

This is my first question on Stack Overflow so forgive me if I'm not in conformity with some norms. That being said, this is my problem:
Edited:
I have a continuous variable that I can only measure at some points, and I need to assess the probability curve for the maximum and minimum values between each pair of data points. I have the standard deviation, and the variable follows a lognormal distribution; this means the average is a log-mean and the standard deviation is multiplicative.
Example:
Assume a car's speed is normally distributed and there are no traffic laws. At 10 AM the car is travelling at 40 MPH, at 11 AM it is travelling at 60 MPH, and the standard deviation is a 10% change in its speed every hour. There is a 1-hour blackout in between where you have no information, but you should be able to estimate the most probable highest speed the car reached in this time, the most probable lowest speed, and somehow a probability distribution of everything in between. You can even assume that the least likely case is that its speed at 10 AM was its lowest speed and its speed at 11 AM was its highest speed in the period (if the car's speed is truly random at every scale, you can even assume this case approaches the impossible). The outcome is a lognormal distribution which could be used to simulate scenarios regarding that car.
I'm not an expert in statistics and I understand only the basics and some theory. How should I address this problem?
I'm using Python 3.x, in case you know a way to address the problem there.
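One possible way to approach the car example is by simulation. The sketch below is not from the thread and rests on a modelling assumption: the log of the speed follows a Brownian bridge pinned at log(40) and log(60), with the "10% per hour" read as a standard deviation of 0.10 per hour in log terms. The function name and defaults are hypothetical.

```python
import math
import random


def simulate_extremes(v0=40.0, v1=60.0, sigma=0.10, steps=200, paths=1000, seed=0):
    """Monte Carlo draws of (lowest, highest) speed between two observations.

    Assumes log-speed is a Brownian bridge from log(v0) to log(v1) over one
    hour, with standard deviation `sigma` per hour in log terms.
    """
    rng = random.Random(seed)
    a, b = math.log(v0), math.log(v1)
    dt = 1.0 / steps
    lows, highs = [], []
    for _path in range(paths):
        # Unpinned random walk W on the grid t = 0, dt, ..., 1.
        w = [0.0]
        for _step in range(steps):
            w.append(w[-1] + rng.gauss(0.0, sigma * math.sqrt(dt)))
        # The two observed speeds are known exactly, so start from them.
        lo, hi = min(v0, v1), max(v0, v1)
        for i in range(1, steps):
            t = i * dt
            # Bridge: subtracting t * W(1) pins the path at log(v1) when t = 1.
            x = a + (b - a) * t + (w[i] - t * w[-1])
            v = math.exp(x)
            lo, hi = min(lo, v), max(hi, v)
        lows.append(lo)
        highs.append(hi)
    return lows, highs


lows, highs = simulate_extremes()
median_high = sorted(highs)[len(highs) // 2]
median_low = sorted(lows)[len(lows) // 2]
```

The lists `lows` and `highs` are samples from the distributions of the minimum and maximum, so histograms or quantiles of them give the "probability curve" asked for under these assumptions.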

What is the probability of more than 100 people arriving at the station, if they come based on exponential distribution with 2 mins?

So, I got this problem:
"You have people arriving at the bus station based on an exponential distribution.
You know that the mean of the distribution is 2 mins.
What is the probability that more than 100 people will arrive in 3 hours?"
So I figured out that the problem is that we have to calculate the probability of the actual mean being under 1.8 mins.
But I don't really know how to solve this.
Is it something to do with confidence intervals?
So basically, to get 100 customers in 3 hrs, the arrival rate must be 1.8 min per customer. Using the cumulative distribution function of the exponential distribution, F(t) = 1 - e^(-λt):
Here λ = 0.5 and t = 1.8. As we are looking for more than 100 customers within 3 hrs, the integral runs from 0 to 1.8.
This gives 1 - e^(-0.5*1.8), i.e. 0.5934, as your answer.
You can refer to this link to get hold of the theory and a few examples.
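The arithmetic in the answer above can be checked with a few lines of Python. The same sketch also computes a second quantity that is not part of that answer: if the exponential times with mean 2 minutes are the gaps between successive arrivals, the arrivals form a Poisson process, so the number of people in 180 minutes is Poisson with mean 0.5 × 180 = 90, and the probability of more than 100 arrivals is a Poisson tail. The function names are illustrative.

```python
import math


def answer_cdf_value(rate=0.5, t=1.8):
    """The value used in the answer above: 1 - e^(-rate * t)."""
    return 1.0 - math.exp(-rate * t)


def poisson_tail(mean, k):
    """P(N > k) for N ~ Poisson(mean), summing the pmf term by term."""
    term = math.exp(-mean)  # P(N = 0)
    cdf = term
    for i in range(1, k + 1):
        term *= mean / i    # P(N = i) from P(N = i - 1)
        cdf += term
    return 1.0 - cdf


print(round(answer_cdf_value(), 4))           # 0.5934
p_more_than_100 = poisson_tail(mean=0.5 * 180, k=100)
print(p_more_than_100)
```

The two numbers answer different questions: the first is the probability that a single inter-arrival time is below 1.8 minutes, while the second is the probability that the 3-hour count exceeds 100 under the Poisson-process reading.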

calculating reliability of measurements

I have many measurements of age of the same person. Let's say:
[23 25 32 23 25]
I would like to output a single value and a reliability score of this value. The single value can be the average.
I don't know how to calculate the reliability. The value should be between 0 and 1, where 1 means all ages are equal and a very unreliable measurement should be near 0.
Probably the variance should be used here, but it's not clear to me how to normalize it between 0 and 1 in a meaningful way (1/(x+1) is not very meaningful :)).
Assume some probability distribution (or determine what probability distribution your data fits most accurately). A good choice is a normal distribution, which for discrete data requires a continuity correction. See example here: http://www.milefoot.com/math/stat/pdfc-normaldisc.htm
In your example, the reliability score for the average age of 26 (25.6 rounded to the nearest integer) is simply the probability that X falls in the range (25.5, 26.5).
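A minimal sketch of that calculation using only Python's standard library. It assumes a normal distribution fitted with the sample mean and the sample standard deviation (n - 1 denominator), which is one choice among several; the function name is hypothetical.

```python
from statistics import NormalDist, mean, stdev

ages = [23, 25, 32, 23, 25]


def reliability(values):
    """Return (rounded mean, P(rounded mean - 0.5 < X < rounded mean + 0.5))."""
    mu = mean(values)
    sigma = stdev(values)      # sample standard deviation (assumed choice)
    centre = round(mu)
    if sigma == 0:
        # All measurements identical: treat as perfectly reliable.
        return centre, 1.0
    dist = NormalDist(mu, sigma)
    # Continuity correction: the integer `centre` covers (centre - 0.5, centre + 0.5).
    return centre, dist.cdf(centre + 0.5) - dist.cdf(centre - 0.5)


value, score = reliability(ages)
print(value, score)
```

The score is 1 when all measurements agree and shrinks toward 0 as the spread grows, which matches the scale asked for in the question.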
The easiest way for assessing reliability (or internal consistency) is to use Cronbach's alpha. I guess most statistics software has this method built-in.
https://en.wikipedia.org/wiki/Cronbach%27s_alpha
