I am working on multi-objective Genetic Algorithms. I have, say, 4 objectives, the number of generations is 400, and the population size is 100.
So how many function evaluations will there be?
I mean to say: is it 4*400*100 or 400*100?
If for each chromosome you evaluate 4 functions, then obviously you have a total of 4*400*100 evaluations.
What you might also want to consider is the running time of each of these evaluations, because if 3 of the functions run in O(n) and the fourth runs in O(n^2), the total running time will be bounded by O(number_of_gens*population_size*n^2), and will only be mildly affected by the other three functions on large problem instances.
If you're asking about the number of evaluations as counted by MOO researchers (i.e., you want to know whether your algorithm is better than mine with the same number of evaluations), then the accepted answer is incorrect. In multi-objective optimization, we formally consider the problem not as optimizing k different functions, but as optimizing one vector-valued function.
It's one evaluation per individual, regardless of the dimensionality of the objective space.
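To make that counting convention concrete, here is a minimal sketch with the numbers from the question; the evaluator is a dummy and the names are illustrative, not taken from any particular MOO library.

POP_SIZE, GENERATIONS, N_OBJECTIVES = 100, 400, 4

evaluations = 0

def evaluate(individual):
    """Dummy vector-valued objective: one call = one evaluation, regardless of dimensionality."""
    global evaluations
    evaluations += 1
    return tuple(0.0 for _ in range(N_OBJECTIVES))

for generation in range(GENERATIONS):
    for individual in range(POP_SIZE):
        evaluate(individual)

print(evaluations)  # 400 * 100 = 40000 evaluations by the MOO convention;
                    # 4 * 40000 scalar objective computations happen inside them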
As far as I know, the number of function evaluations of a genetic algorithm can be calculated with the following equation:
Number of function evaluations = size of the main population + [number of new children (from crossover) + number of mutated children (from mutation)] * number of iterations.
I have two related questions on population statistics. I'm not a statistician, but would appreciate pointers to learn more.
I have a process that results from flipping a three-sided coin (results: A, B, C), and I compute the statistic t = (A-C)/(A+B+C). In my problem, I have a set that randomly divides itself into sets X and Y, maybe uniformly, maybe not. I compute t for X and Y. I want to know whether the difference I observe in those two t values is likely due to chance or not.
Now if this were a simple binomial distribution (i.e., I'm just counting who ends up in X or Y), I'd know what to do: I compute n = |X| + |Y|, σ = sqrt(np(1-p)) (and I assume p = 0.5), and then I compare to the normal distribution. So, for example, if I observed |X| = 45 and |Y| = 55, I'd say σ = 5, and so I'd expect this much variation from the mean μ = 50 by chance 68.27% of the time. Alternatively, I'd expect a greater deviation from the mean 31.73% of the time.
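For what it's worth, the arithmetic in that paragraph can be checked with a couple of lines (scipy is just my choice for the normal tail probability):

from scipy.stats import norm

n, p = 100, 0.5
sigma = (n * p * (1 - p)) ** 0.5
print(sigma)            # 5.0, so |X| = 45 is one sigma below the mean of 50
print(2 * norm.sf(1))   # ~0.3173: probability of a deviation greater than one sigma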
There's an intermediate problem, which also interests me and which I think may help me understand the main problem, where I measure some property of the members of A and B. Let's say 25% in A measure positive and 66% in B measure positive. (A and B aren't the same cardinality -- the selection process isn't uniform.) I would like to know whether I should expect this difference by chance.
As a first draft, I computed t as though it were measuring coin flips, but I'm pretty sure that's not actually right.
Any pointers on what the correct way to model this is?
First problem
For the three-sided coin problem, have a look at the multinomial distribution. It's the distribution to use for a "binomial"-type problem with more than 2 outcomes.
Here is the example from Wikipedia (https://en.wikipedia.org/wiki/Multinomial_distribution):
Suppose that in a three-way election for a large country, candidate A received 20% of the votes, candidate B received 30% of the votes, and candidate C received 50% of the votes. If six voters are selected randomly, what is the probability that there will be exactly one supporter for candidate A, two supporters for candidate B and three supporters for candidate C in the sample?
Note: Since we’re assuming that the voting population is large, it is reasonable and permissible to think of the probabilities as unchanging once a voter is selected for the sample. Technically speaking this is sampling without replacement, so the correct distribution is the multivariate hypergeometric distribution, but the distributions converge as the population grows large.
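For a quick numerical check of that example (scipy is just one way to compute it; the same number follows from 6!/(1!*2!*3!) * 0.2^1 * 0.3^2 * 0.5^3):

from scipy.stats import multinomial

# P(one A supporter, two B supporters, three C supporters) in a sample of 6.
print(multinomial.pmf([1, 2, 3], n=6, p=[0.2, 0.3, 0.5]))   # 0.135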
Second problem
The second problem looks like a problem for cross-tabs. Use the "chi-squared test for association" to test whether there is a significant association between your variables, and use the "standardized residuals" of your cross-tab to identify which of the associations are more likely to occur and which are less likely.
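As a hedged sketch of that recipe: the counts below are invented, loosely matching the "25% positive in A, 66% positive in B" numbers from the question, and scipy's chi2_contingency is one way to run the test.

import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[25, 75],    # group A: positive, negative (invented counts)
                  [66, 34]])   # group B: positive, negative (invented counts)

chi2, p, dof, expected = chi2_contingency(table)
print(p)   # a small p-value suggests the association is unlikely to be due to chance

# Pearson residuals of the cross-tab: cells with large |values| drive the association.
print((table - expected) / np.sqrt(expected))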
I'm trying to invent a programming exercise on suffix arrays. I learned the O(n*log(n)^2) construction algorithm and then started playing with random input strings of varying length in order to find out when the naive approach becomes too slow. E.g. I wanted to choose the string length so that people would need to implement the "advanced" algorithm.
Suddenly I found that the naive algorithm (comparison-sorting all the suffixes) is not as slow as O(n^2 * log(n)) suggests. After thinking a bit, I understood that comparing suffixes of a randomly generated string is not O(n) amortized. Really, we usually compare only the first few characters before we hit a difference and return from the comparison function. This of course depends on the size of the alphabet, but in any case it does not depend much on the length of the suffixes.
I tried a simple implementation in PHP that processes a 50000-character string in 2 seconds (despite the slowness of a scripting language). If it really behaved as O(n^2), we'd expect it to take at least several minutes (with 1e7 operations per second and ~1e9 operations total).
So I understand that even if it is O(n^2 * log(n)), the constant factor is a very small fraction of 1, really something close to 0. Or should we describe such complexity as worst-case only?
But what is the amortized time complexity of the naive approach? I'm a bit bewildered about how to assess it.
You seem to be confusing amortized and expected complexity. In this case you are talking about expected complexity. And yes, the stated complexity is computed assuming that a suffix comparison takes O(n). That is the worst case for a suffix comparison; for randomly generated input you will only perform a constant number of character comparisons in most cases. Thus O(n^2*log(n)) is the worst-case complexity.
One more note: on a modern computer you can perform a few billion elementary instructions per second, so it is plausible to execute on the order of 50000^2 operations in 2 seconds. The correct way to benchmark the complexity of an algorithm is to measure the time it takes to complete for inputs of size N, N*2, N*4, ... (as far as you can go) and then fit a function that describes the growth of the running time.
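A rough sketch of that benchmarking procedure applied to the naive construction (my own illustrative Python, not the asker's PHP): time the sort on random strings of doubling length and watch how the runtime grows.

import random
import string
import time
from functools import cmp_to_key

def compare_suffixes(s, i, j):
    # Character-by-character comparison; on a random string this usually
    # stops after only a few characters, which is the effect discussed above.
    n = len(s)
    if i == j:
        return 0
    while i < n and j < n:
        if s[i] != s[j]:
            return -1 if s[i] < s[j] else 1
        i += 1
        j += 1
    return -1 if i == n else 1   # the shorter suffix sorts first

def naive_suffix_array(s):
    return sorted(range(len(s)), key=cmp_to_key(lambda i, j: compare_suffixes(s, i, j)))

n = 5000
while n <= 40000:
    s = ''.join(random.choice(string.ascii_lowercase) for _ in range(n))
    start = time.perf_counter()
    naive_suffix_array(s)
    print(n, round(time.perf_counter() - start, 3))   # watch how the time scales with n
    n *= 2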
If the complexity is O(nlog2(n))...
How do you prove the execution time for data of size 10e7 if we know that for data of size 10e5 the execution time is 0.1 s?
In short: To my knowledge, you don't prove it in this way.
More verbosely:
The thing about complexities is that they are reported in big-O notation, in which any constants and lower-order terms are discarded. For example, the complexity in the question is O(nlog2(n)), but this could be the simplified form of k1 * n * log2(k2 * n + c2) + c1.
These constants cover things like initialization tasks, which take the same time regardless of the number of samples, the proportional time it takes to do the log2(n) part (each of those operations could potentially take 10^6 times longer than the n part), and so on.
In addition to the constants you also have variable factors, such as the hardware on which the algorithm is executed, any additional load on the system, etc.
In order to use this as the basis for an estimate of execution time, you would need enough samples of execution times across problem sizes to estimate both the constants and the variable factors.
For practical purposes one could gather multiple samples of execution times for a sufficiently sizable set of problem sizes, then fit the data with a suitable function based on your complexity formula.
In terms of proving an execution time... not really doable, the best you can hope for is a best fit model and a significant p-value.
Of course, if all you want is a rough guess, you could always assume that the constants and variable factors are 1 or 0 as appropriate and plug in the numbers you have: (0.1s / (10^5 * log2(10^5))) * (10^7 * log2(10^7)) ≈ 14 s.
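A hedged sketch of the "fit a model" approach described above; the timing samples below are invented, and the model keeps only the leading n*log2(n) term plus a constant.

import numpy as np
from scipy.optimize import curve_fit

def model(n, k, c):
    return k * n * np.log2(n) + c

sizes = np.array([1e5, 2e5, 4e5, 8e5])
times = np.array([0.10, 0.22, 0.47, 1.01])   # invented measurements, in seconds

(k, c), _ = curve_fit(model, sizes, times)
print(model(1e7, k, c))                      # extrapolated execution time for n = 1e7

# The back-of-the-envelope guess from above, for comparison:
print(0.1 / (1e5 * np.log2(1e5)) * (1e7 * np.log2(1e7)))   # ~14 s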
Generally speaking, when you are numerically evaluating an integral, say in MATLAB, do I just pick a large number for the bounds, or is there a way to tell MATLAB to "take the limit"?
I am assuming that you just use the large number because different machines would be able to handle numbers of different magnitudes.
I am just wondering if there is a way to improve my code. I am doing lots of expected-value calculations via Monte Carlo and often use the trapezoid method to check myself when my degrees of freedom are small enough.
Strictly speaking, it's impossible to evaluate a numerical integral out to infinity. In most cases, if the integral in question is finite, you can simply integrate over a reasonably large range. For example, the integral of the normal density out to 10 sigma is, for all practical purposes, equal to the same integral evaluated all the way out to infinity.
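As a quick illustration (in Python rather than MATLAB, purely as a sketch): the standard normal density integrated over [-10, 10] is indistinguishable from the integral over the whole real line.

import numpy as np
from scipy.integrate import quad

pdf = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal density

finite, _ = quad(pdf, -10, 10)             # truncated at 10 sigma
infinite, _ = quad(pdf, -np.inf, np.inf)   # quad also accepts infinite limits
print(finite, infinite)                    # both 1.0 to machine precision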
It depends very much on what type of function you want to integrate. If it is "smooth" (no jumps - preferably not in any derivatives either, but that becomes progressively less important) and finite, then you have two main choices (limiting myself to the simplest approaches):
1. If it is periodic - here meaning: if you put the left and right ends together, there are no jumps in value (or derivatives...) there either - then distribute your points evenly over the interval, sample the function values to get the estimated average, and multiply by the length of the interval to get your integral.
2. If it is not periodic: use Legendre integration (Gauss-Legendre quadrature).
Monte Carlo is almost invariably a poor method: it progresses very slowly towards (machine) precision: for each additional significant digit you need roughly 100 times more points!
The two methods above, for periodic and non-periodic "nice" (smooth, etc.) functions, give fair results already with a very small number of sample points and then progress very rapidly towards higher precision: 1 or 2 more points usually add several digits to your precision! This far outweighs the burden of having to throw away the previous result when you make a new attempt with more sample points: you REPLACE the previous set of points with a fresh one, whereas in Monte Carlo you can simply add points to the existing set and refine the outcome.
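To make the comparison concrete, here is a small sketch on my own example integrand, exp(-x) on [0, 1] (exact value 1 - exp(-1)): a handful of Gauss-Legendre nodes against a hundred thousand Monte Carlo samples.

import numpy as np

f = lambda x: np.exp(-x)
exact = 1 - np.exp(-1)

# Gauss-Legendre with 5 nodes, mapped from [-1, 1] to [0, 1].
nodes, weights = np.polynomial.legendre.leggauss(5)
gl = 0.5 * np.sum(weights * f(0.5 * (nodes + 1)))

# Monte Carlo with 100000 uniform samples.
rng = np.random.default_rng(0)
mc = f(rng.uniform(0.0, 1.0, 100_000)).mean()

print(abs(gl - exact), abs(mc - exact))   # GL error ~1e-12, MC error ~1e-3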
'Simple' question: what is the fastest way to calculate the binomial coefficient? Some threaded algorithm?
I'm looking for hints :) - not implementations :)
Well the fastest way, I reckon, would be to read them from a table rather than compute them.
Your requirement of integer accuracy while using a double representation means that C(60,30) is just about as big as you can go, being around 1e17, so (assuming you want C(m,n) for all m up to some limit, and all n <= m) your table would only have around 1800 entries. As for filling the table in, I think Pascal's triangle is the way to go.
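A minimal sketch of that table approach (the limit of 60 is taken from the C(60,30) remark above):

M = 60
C = [[0] * (M + 1) for _ in range(M + 1)]
for m in range(M + 1):
    C[m][0] = C[m][m] = 1
    for n in range(1, m):
        C[m][n] = C[m - 1][n - 1] + C[m - 1][n]   # Pascal's rule

print(C[60][30])   # about 1.18e17; every later query is just a lookup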
According to the multiplicative formula from Wikipedia, C(n, k) = product over i = 1..k of (n - k + i) / i, the fastest way would be to split the range i = 1..k among the threads, give each thread one segment of the range, and have each thread update the final result under a lock. The "academic way" would be to split the range into tasks, each task being to calculate (n - k + i) / i, and then, no matter how many threads you have, they all loop asking for the next task. The first is faster, the second is... academic.
EDIT: further explanation - in both approaches we have some arbitrary number of threads. Usually the number of threads equals the number of processor cores, because there is no benefit in adding more threads. The difference between the two approaches is in what those threads do.
In the first approach, each thread is given N, K, I1, and I2, where I1 and I2 delimit its segment of the range 1..K. Each thread then has all the data it needs, so it calculates its part of the result and, upon finishing, updates the final result.
In the second approach, each thread is given N, K, and access to a synchronized counter that counts from 1 to K. Each thread then acquires one value from this shared counter, calculates one factor of the result, updates the final result, and loops until the counter informs it that there are no more items. If some processor cores happen to be faster than others, this second approach will put all cores to maximum use. The downside of the second approach is too much synchronization, which effectively blocks, say, 20% of the threads all the time.
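Here is a hedged sketch of the first approach (static range split). It uses exact fractions so each segment's partial product stays exact, and it combines the partials at the end instead of updating a shared result under a lock. Note that in CPython the GIL means this mainly illustrates the partitioning scheme rather than giving a real speedup for pure-Python arithmetic.

from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction
from functools import reduce
import operator

def partial_product(n, k, i1, i2):
    # Exact product of (n - k + i) / i for i in the segment [i1, i2].
    prod = Fraction(1)
    for i in range(i1, i2 + 1):
        prod *= Fraction(n - k + i, i)
    return prod

def binomial_threaded(n, k, workers=4):
    k = min(k, n - k)                       # use the shorter of the two products
    if k == 0:
        return 1
    step = -(-k // workers)                 # ceiling division
    segments = [(i, min(i + step - 1, k)) for i in range(1, k + 1, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda seg: partial_product(n, k, *seg), segments)
    return int(reduce(operator.mul, partials, Fraction(1)))   # the full product is an integer

print(binomial_threaded(60, 30))            # about 1.18e17, matching the table discussion above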
Hint: You want to do as few multiplications as possible. The formula is n! / (k! * (n-k)!). You should be able to do fewer than 2m multiplications, where m is the minimum of k and n-k. If you want to work with (fairly) big numbers, you should use a special class for the number representation (Java has BigInteger, for instance).
Here's a way that never overflows if the final result is expressible natively in the machine, involves no multiplications/factorizations, is easily parallelized, and generalizes to BigInteger-types:
First note that the binomial coefficients satisfy the following recurrence (Pascal's rule):
\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}
This yields a straightforward recursion for computing the coefficient: the base cases are \binom{n}{0} and \binom{n}{n}, both of which are 1.
The individual results from the subcalls are integers and if \binom{n}{k} can be represented by an int, they can too; so, overflow is not a concern.
Naively implemented, the recursion leads to repeated subcalls and exponential runtimes.
This can be fixed by caching intermediate results. There are O(n^2) subproblems, each of which can be combined in O(1) time, yielding an O(n^2) complexity bound.
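A minimal sketch of that memoized recursion (Python's lru_cache does the caching here, which is my choice, not the answer's):

from functools import lru_cache

@lru_cache(maxsize=None)
def binom(n, k):
    if k == 0 or k == n:                             # base cases, both equal to 1
        return 1
    return binom(n - 1, k - 1) + binom(n - 1, k)     # Pascal's rule

print(binom(60, 30))   # about 1.18e17; each subproblem is computed only once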
This answer uses Python to print the binomial expansion of (a + b*x)^c, with math.comb supplying the binomial coefficients:
import math

def h(a, b, c):
    # Build the binomial expansion of (a + b*x)**c term by term.
    part = "="
    for x in range(c + 1):
        nCr = math.comb(c, x)                      # binomial coefficient C(c, x)
        coeff = (a ** (c - x)) * (b ** x) * nCr    # coefficient of the x^x term
        part = part + '+' + str(coeff) + 'x^' + str(x)
    print(part)

h(2, 6, 4)