For uncorrelated variables, the covariance should be zero. But take the two variables x = (0, 1) and y = (1, 0). Clearly they are orthogonal and so they are uncorrelated. But the covariance is
((0-0.5)(1-0.5) + (1-0.5)(0-0.5)) / 2 = -0.25, not zero. What is wrong then? Sorry for the stupid question.
"Clearly they are orthogonal and so they are uncorrelated" - vectors can be orthogonal, this has nothing to do with random processes...
They are negatively correlated (one goes up when the other goes down), so everything is all right...
From Wikipedia:
covariance is a measure of how much two random variables change together. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the smaller values, i.e., the variables tend to show similar behavior, the covariance is positive. In the opposite case, when the greater values of one variable mainly correspond to the smaller values of the other, i.e., the variables tend to show opposite behavior, the covariance is negative.
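A quick numerical check of the computation in the question (a minimal sketch using NumPy; this is the population covariance, dividing by N):

```python
import numpy as np

# The two "variables" from the question, each observed twice.
x = np.array([0.0, 1.0])
y = np.array([1.0, 0.0])

# Population covariance: mean of the products of deviations from the means.
cov = np.mean((x - x.mean()) * (y - y.mean()))
print(cov)                             # -0.25: nonzero, so x and y are (negatively) correlated
print(np.cov(x, y, bias=True)[0, 1])   # same value via NumPy's covariance matrix
```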
I have a financial application that deals with "percentages" quite frequently. We are using "decimal" types to avoid rounding errors. My question is:
If I have a quantity representing 76%, is it better to store it as 0.76 or 76?
Furthermore, what are the advantages and/or disadvantages from a code maintainability point of view? Are there conventions for this?
If percentage is only a small part of the problem domain, I would probably stick with a primitive number.
Per cent literally means "per hundred", and a percentage is just a decimal number; e.g. 76 % is equal to 0.76. Thus, in the absence of units-of-measure support in the programming language itself, I would represent a percentage as a decimal number.
This also means that you don't need to perform special arithmetic in order to calculate with percentages, but you will need to multiply by 100 if you need to display the number in percent.
If percentage is a very central part of the problem domain, you should consider avoiding Primitive Obsession and introducing a proper Value Object instead.
Still, even if you introduce a Percent Value Object with conversions to and from primitive numbers, you are still exposed to programmer error, because by itself the number 0.9 could mean either 90 % or 0.9 %, depending on how you choose to interpret it.
In the end, my best advice is to cover your code base with appropriate unit tests, so that you lock the conversion code down.
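As a rough illustration (in Python rather than any particular language; the class and method names here are made up for this sketch), such a Value Object could store the fraction internally and convert only at the edges:

```python
from decimal import Decimal


class Percent:
    """Minimal sketch of a Percent value object.

    The value is stored as a fraction (0.76 for 76 %), so ordinary arithmetic
    such as amount * fraction needs no extra scaling; multiplication by 100
    happens only when formatting for display.
    """

    def __init__(self, fraction: Decimal):
        self.fraction = fraction                  # e.g. Decimal("0.76")

    @classmethod
    def from_percent(cls, value: Decimal) -> "Percent":
        return cls(value / Decimal(100))          # 76 -> 0.76

    def of(self, amount: Decimal) -> Decimal:
        return amount * self.fraction             # 76 % of 200 -> 152

    def __str__(self) -> str:
        return f"{self.fraction * 100} %"


discount = Percent.from_percent(Decimal("76"))
print(discount.of(Decimal("200")), discount)      # 152.00 and "76.00 %"
```

The point of the wrapper is exactly the one made above: the conversion between "76" and "0.76" happens in one place, so the rest of the code never has to guess which interpretation a bare number carries.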
I'm looking for an example of a weakly normalising lambda term.
Am I right in saying that the following:
(λa.b)((λx.xx)(λx.xx))
Reduces to:
b
or:
doesn't terminate (if you try to reduce (λx.xx)(λx.xx))
I wasn't sure whether the first reduction is correct, so I just need some clarification.
Thanks.
If you evaluate the right term first, and keep doing so, it will never reach a normal form, so the term is not strongly normalizing. If you evaluate the left term first, it immediately reaches a normal form, which shows that the term is weakly normalizing. Note that this does not contradict confluence: by the Church-Rosser theorem the untyped lambda calculus is confluent, so every reduction sequence that does terminate ends in the same normal form, b; the point here is that not every reduction sequence terminates.
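Spelled out, the two reduction orders are:
(λa.b)((λx.xx)(λx.xx)) →β b, contracting the outer redex and discarding the argument unevaluated, versus
(λx.xx)(λx.xx) →β (λx.xx)(λx.xx) →β ..., where contracting the inner redex just reproduces it forever.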
Note that you usually talk about whether a rewriting system is normalizing rather than a particular term. This term is thus a counterexample to the strong normalization property of the untyped lambda calculus, but it does not provide positive evidence that ULC is weakly normalizing (and it isn't).
Given a set of 10 symbols and a set of strings (at most 100, each of length at most 20) made up of these symbols, find the maximum-length string over these symbols that does not contain any of the given strings as a substring. If an infinitely long string satisfying the property exists, print -1.
Apart from a brute-force algorithm, which is exponential in time, I have not been able to find a solution.
Any hint on how to approach this problem would be appreciated.
Given a set of strings that need to be matched, my immediate reaction is to use http://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_string_matching_algorithm to create a matcher. This matcher is a finite state machine that accepts one character at a time and tells you which state you end up in next, given that character.
So I think you can reduce the problem to a directed graph with a starting point: delete the states that correspond to pattern matches, then look for the longest route through what remains (http://en.wikipedia.org/wiki/Longest_path). Longest path is hard on general graphs, but here it is easy: if the pruned automaton still contains a cycle reachable from the start state, the answer is -1 (you can loop forever without ever completing a pattern); otherwise the reachable part is a DAG and the longest path can be computed in time linear in the size of the automaton, e.g. by a topological-order DP, as sketched below. Constructing the automaton is also linear, so the whole thing stays cheap.
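Here is a minimal Python sketch of that approach (the function name longest_safe_string and the choice to represent symbols as single characters are mine, for illustration only): build the Aho-Corasick automaton, mark every state that completes a forbidden pattern, then either report -1 if a cycle of safe states is reachable or take the longest path through the remaining DAG.

```python
from collections import deque


def longest_safe_string(symbols, patterns):
    """Length of the longest string over `symbols` containing none of the
    forbidden `patterns` as a substring, or -1 if it can be arbitrarily long."""
    # --- 1. Trie of the forbidden patterns ---------------------------------
    goto = [dict()]      # goto[state][symbol] -> next state
    fail = [0]           # failure links
    bad = [False]        # True if some suffix of this state's string is a pattern
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append(dict())
                fail.append(0)
                bad.append(False)
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        bad[s] = True

    # --- 2. Aho-Corasick: failure links + completed goto function ----------
    q = deque()
    for ch in symbols:
        if ch in goto[0]:
            q.append(goto[0][ch])
        else:
            goto[0][ch] = 0
    while q:
        s = q.popleft()
        bad[s] = bad[s] or bad[fail[s]]
        for ch in symbols:
            if ch in goto[s]:
                nxt = goto[s][ch]
                fail[nxt] = goto[fail[s]][ch]
                q.append(nxt)
            else:
                goto[s][ch] = goto[fail[s]][ch]

    # --- 3. Safe states reachable from the start state ---------------------
    reachable = {0}
    q = deque([0])
    while q:
        s = q.popleft()
        for ch in symbols:
            t = goto[s][ch]
            if not bad[t] and t not in reachable:
                reachable.add(t)
                q.append(t)

    # --- 4. Longest path: -1 if there is a cycle, otherwise DP on the DAG --
    indeg = {s: 0 for s in reachable}
    for s in reachable:
        for ch in symbols:
            t = goto[s][ch]
            if t in reachable:
                indeg[t] += 1
    q = deque(s for s in reachable if indeg[s] == 0)
    order = []
    while q:
        s = q.popleft()
        order.append(s)
        for ch in symbols:
            t = goto[s][ch]
            if t in reachable:
                indeg[t] -= 1
                if indeg[t] == 0:
                    q.append(t)
    if len(order) < len(reachable):
        return -1                           # reachable cycle => arbitrarily long strings
    longest = {s: 0 for s in reachable}     # longest safe path starting at each state
    for s in reversed(order):
        for ch in symbols:
            t = goto[s][ch]
            if t in reachable:
                longest[s] = max(longest[s], 1 + longest[t])
    return longest[0]
```

For instance, with symbols "ab" and forbidden strings ["aa", "ab", "ba", "bb"] this returns 1, while forbidding only ["aa"] gives -1, since "bbb..." can be extended forever.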
It seems Excel knows how to calculate =cos(2^27-1) but fails to calculate =cos(2^27), which returns #NUM!. Does anyone know why?
I have no idea what sort of arithmetic Excel uses internally, but at some point, with a large enough argument, the error left after the mod 2*pi reduction is too substantial to produce a reliable answer. Presumably they picked 2^27 as their cutoff.
This behavior is not well documented. The Sin function documentation indicates that the argument is a Double, and the limits given in the documentation indicate that the double type is stored as a 64-bit number ranging from 4.94E-324 to 1.797E308 (for positive numbers).
I suspect it is not coincidental that 2^27 (134,217,728) bytes is precisely 128 megabytes, and it seems likely that there is an internal limitation for some trig functions (e.g. COS, SIN and TAN, but interestingly, NOT for TANH, etc.). This is not to say that this amount of memory would actually be required; it's just that a programmer's implementation could have some (potentially unnecessary) internal limits on these kinds of inputs.
To get around this silly limit, simply use the following:
=COS(MOD(2^27, 2*PI()))
This works because the limitation does not exist for other operations, and is nowhere to be seen in the Excel Specifications and Limits. :-)
It would be good if the linked documentation provided a description of these limits, but unfortunately it does not.
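Outside of Excel, the same reduction can be checked with a short Python sketch (this illustrates what the MOD() workaround computes, not Excel's internals):

```python
import math

x = 2 ** 27
# Python's math.cos accepts the large argument directly...
direct = math.cos(x)
# ...and reducing it mod 2*pi first, like the MOD() workaround above,
# gives nearly the same value; the small difference comes from 2*pi being
# rounded, which is the precision-loss issue described in the first answer.
reduced = math.cos(math.fmod(x, 2 * math.pi))
print(direct, reduced, abs(direct - reduced))
```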
Question with regard to taking the logarithm of a variable (statistics question):
Say you have a bar graph displaying data for, say, "Cost of Computer Orders by the Population", and you are trying to analyze the data and find a distribution. The raw data does not suggest anything, so you take the logarithm of the variable, and the graph then resembles a normal distribution. I know that a normal distribution is centered around its mean, but what does taking the logarithm of the data indicate?
It seems that you are describing the lognormal distribution: a random variable is said to come from a lognormal distribution if its logarithm is distributed normally.
In practice, this can describe processes where the value cannot go below zero and most of the mass sits near the lower end, with a long right tail (right skewness). For example: salaries, home prices, bone fractures, and number of girlfriends could all be reasonably modeled with a lognormal distribution.
For example, say that on average young adults have had 2.5 girlfriends. A few have never had one, you cannot have a "negative number" of girlfriends, and a few have had 25. However, most young adults will have had between, say, one and three.
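A quick way to see this numerically (a sketch with NumPy; the parameters are made up for illustration):

```python
import numpy as np

# Draw right-skewed lognormal samples and observe that their logarithms
# look symmetric (normal-ish).
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=0.9, sigma=0.5, size=10_000)
logs = np.log(samples)
print(samples.mean(), np.median(samples))  # mean > median: right-skewed
print(logs.mean(), np.median(logs))        # nearly equal: symmetric after the log
```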
If you display the values of x as log(x), the line in the diagram becomes a straight line when the values grow exponentially. This is a statistical trick for quickly checking whether values grow exponentially.
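For example (a tiny sketch with made-up numbers):

```python
import numpy as np

# Exponential growth has constant ratios, so its logarithm has constant
# differences, i.e. it plots as a straight line on a log scale.
t = np.arange(10)
x = 3.0 * 2.0 ** t            # made-up exponential series
print(np.diff(np.log(x)))     # all equal to log(2) ~ 0.693
```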