Smalltalk / Squeak string shallow equality


The following code prints "false":
a := 'aaa'.
b := a deepCopy.
Transcript show: (a == b).
I expect this behavior. My explanation is that deepCopy returns a new object "b" that is completely distinct from "a", and since the "==" operator compares by reference, the result is "false". Is that correct?
However, I do not understand why the following code produces "true":
a := 'aaa'.
b := 'aaa'.
Transcript show: (a == b).
Here we made two assignments to two different objects, "a" and "b", and there shouldn't be any relation between them except the fact that they contain the same value. But if operator "==" compares by reference and not by value, why is the result of this comparison "true"?

The misconception is the same in both cases: the question to ask is not "what happens?" but "what is guaranteed?". The key is that there is no guarantee that 'aaa' == 'aaa', but the compiler and VM are free to do things that way. The same seems true for copying; since strings are immutable, I guess there's nothing to say that copying a string couldn't return the same object!
In your first example, as usual, the best teacher is the image. #deepCopy delegates to #shallowCopy, which at some point evaluates class basicNew: index, and copies the characters into the new object. So, this particular implementation will always create a new object.
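The "what is guaranteed?" point is not specific to Smalltalk. As a rough analogy in Python (not Squeak, and using only the standard copy module), a deep copy of an immutable string may hand back the very same object, while a deep copy of a mutable list has to be a new one. The variable names are made up for illustration, and the string result is CPython behavior rather than a language promise.

```python
import copy

original_string = "aaa"
copied_string = copy.deepcopy(original_string)

original_list = ["aaa"]
copied_list = copy.deepcopy(original_list)

# Equality holds in both cases; that is what a copy promises.
strings_equal = copied_string == original_string
lists_equal = copied_list == original_list

# Identity is a different matter: the mutable list is a fresh object,
# while CPython's copy module returns the immutable string unchanged.
list_is_new_object = copied_list is not original_list
string_is_same_object = copied_string is original_string
```

So "copy" only tells you what you can rely on (an equal value), not whether a new object was allocated.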

In addition to what Sean DeNigris said, the reason why the comparison is true in the second case is that when you execute all three statements together, the compiler tries to be smart: it creates the object for 'aaa' only once and shares it between a and b.
The same happens if you put this into one method *:
Object subclass: #MyClassA
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'MyApp'
!MyClassA methodsFor: 'testing' stamp: nil prior: nil!
testStrings
| a b |
a := 'aaa'.
b := 'aaa'.
^ a == b
! !
MyClassA new testStrings " ==> true"
But this does not happen if they are in different methods:
Object subclass: #MyClassB
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'MyApp'
!MyClassB methodsFor: 'testing' stamp: nil prior: nil!
a
| a |
a := 'aaa'.
^ a
! !
!MyClassB methodsFor: 'testing' stamp: nil prior: nil!
b
| b |
b := 'aaa'.
^ b
! !
!MyClassB methodsFor: 'testing' stamp: nil prior: nil!
testStrings
^ self a == self b
! !
MyClassB new testStrings " ==> false"
That is because in Squeak, literal objects like strings are stored in the method object of the method they are defined in.
*: Technically, every DoIt or PrintIt, that is when you just execute code by keystroke, gets compiled to one method in Squeak.

This is what I know from one of the free Smalltalk books scattered online but I can't find the reference:
As you would expect the instance of a class is a unique object in memory. deepCopy intentionally creates an object first and then stores a copy of the existing instance in it.
However numbers, characters and strings are treated as primitive data types by Smalltalk. When literal data, also referred to as literals, are assigned to variables they are first checked against a local scope dictionary which is invisible to the user and holds literals to check if they have been already added to it. If they haven't they will be added to the dictionary and the variable will point to the dictionary field. If identical literal data has been assigned before, the new variable will only point to the local scope dictionary field that contains the identical literal. This means that two or more variables assigned identical literals are pointing to the same dictionary field and therefore are identical objects. This is why the second comparison in your question is returning true.

Related

Why there is difference between 'is' and '==' with various results in python [duplicate]

My Google-fu has failed me.
In Python, are the following two tests for equality equivalent?
n = 5
# Test one.
if n == 5:
    print('Yay!')
# Test two.
if n is 5:
    print('Yay!')
Does this hold true for objects where you would be comparing instances (a list say)?
Okay, so this kind of answers my question:
L = []
L.append(1)
if L == [1]:
    print('Yay!')
# Holds true, but...
if L is [1]:
    print('Yay!')
# Doesn't.
So == tests value where is tests to see if they are the same object?

is will return True if two variables point to the same object (in memory), == if the objects referred to by the variables are equal.
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
>>> b == a
True
# Make a new copy of list `a` via the slice operator,
# and assign it to variable `b`
>>> b = a[:]
>>> b is a
False
>>> b == a
True
In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:
>>> 1000 is 10**3
False
>>> 1000 == 10**3
True
The same holds true for string literals:
>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> "aa" is intern(x*2)
True
Please see this question as well.
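The intern call in the session above is the Python 2 builtin; in Python 3 the same function lives at sys.intern. A minimal sketch of the idea, with illustrative variable names: strings built at run time are reliably equal by value, and only become the same object once both sides are explicitly interned.

```python
import sys

part = "a"
built_at_runtime = part * 2      # new string object created at run time
literal = "aa"

# Value comparison is the right tool for "same text?".
same_text = built_at_runtime == literal

# sys.intern returns the canonical copy of an equal string, so two
# interned strings with the same text are the same object.
same_object_after_intern = sys.intern(built_at_runtime) is sys.intern(literal)
```

Whether `built_at_runtime is literal` holds without interning depends on the interpreter, which is why the sketch does not rely on it.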

There is a simple rule of thumb to tell you when to use == or is.
== is for value equality. Use it when you would like to know if two objects have the same value.
is is for reference equality. Use it when you would like to know if two references refer to the same object.
In general, when you are comparing something to a simple type, you are usually checking for value equality, so you should use ==. For example, the intention of your example is probably to check whether n has a value equal to 5 (==), not whether n is literally referring to the same object as 5.
Something else to note: because of the way the CPython reference implementation works, you'll get unexpected and inconsistent results if you mistakenly use is to compare for reference equality on integers:
>>> a = 500
>>> b = 500
>>> a == b
True
>>> a is b
False
That's pretty much what we expected: a and b have the same value, but are distinct entities. But what about this?
>>> c = 200
>>> d = 200
>>> c == d
True
>>> c is d
True
This is inconsistent with the earlier result. What's going on here? It turns out the reference implementation of Python caches integer objects in the range -5..256 as singleton instances for performance reasons. Here's an example demonstrating this:
>>> for i in range(250, 260): a = i; print "%i: %s" % (i, a is int(str(i)));
...
250: True
251: True
252: True
253: True
254: True
255: True
256: True
257: False
258: False
259: False
This is another obvious reason not to use is: the behavior is left up to implementations when you're erroneously using it for value equality.

Is there a difference between == and is in Python?
Yes, they have a very important difference.
==: check for equality - the semantics are that equivalent objects (that aren't necessarily the same object) will test as equal. As the documentation says:
The operators <, >, ==, >=, <=, and != compare the values of two objects.
is: check for identity - the semantics are that the object (as held in memory) is the object. Again, the documentation says:
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. Object identity is
determined using the id() function. x is not y yields the inverse
truth value.
Thus, the check for identity is the same as checking for the equality of the IDs of the objects. That is,
a is b
is the same as:
id(a) == id(b)
where id is the builtin function that returns an integer that "is guaranteed to be unique among simultaneously existing objects" (see help(id)) and where a and b are any arbitrary objects.
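A small sketch of that equivalence, assuming all three names stay alive for the whole comparison (ids are only unique among objects that exist at the same time). The names first, alias and clone are illustrative.

```python
first = [1, 2, 3]
alias = first          # second name for the same list
clone = list(first)    # equal contents, separate object

# Each pair holds (result of `is`, result of comparing ids).
alias_identity = (alias is first, id(alias) == id(first))
clone_identity = (clone is first, id(clone) == id(first))

# The clone is still equal by value.
clone_equal = clone == first
```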
Other Usage Directions
You should use these comparisons for their semantics. Use is to check identity and == to check equality.
So in general, we use is to check for identity. This is usually useful when we are checking for an object that should only exist once in memory, referred to as a "singleton" in the documentation.
Use cases for is include:
None
enum values (when using Enums from the enum module)
usually modules
usually class objects resulting from class definitions
usually function objects resulting from function definitions
anything else that should only exist once in memory (all singletons, generally)
a specific object that you want by identity
Usual use cases for == include:
numbers, including integers
strings
lists
sets
dictionaries
custom mutable objects
other builtin immutable objects, in most cases
The general use case for ==, again, is that the object you want may not be the same object; instead it may be an equivalent one.
PEP 8 directions
PEP 8, the official Python style guide for the standard library, also mentions two use cases for is:
Comparisons to singletons like None should always be done with is or
is not, never the equality operators.
Also, beware of writing if x when you really mean if x is not None --
e.g. when testing whether a variable or argument that defaults to None
was set to some other value. The other value might have a type (such
as a container) that could be false in a boolean context!
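The second PEP 8 point can be sketched with a hypothetical function, describe, whose argument defaults to None. An empty list is falsy, so a plain truthiness test cannot tell "nothing was passed" apart from "an empty container was passed"; the identity test against None can.

```python
def describe(items=None):
    """Return a label showing how the optional argument was supplied."""
    if items is None:      # identity test: was anything passed at all?
        return "not provided"
    if not items:          # truthiness test: passed, but empty
        return "provided but empty"
    return "provided with %d item(s)" % len(items)


nothing = describe()
empty = describe([])
filled = describe([1, 2])
```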
Inferring equality from identity
If is is true, equality can usually be inferred - logically, if an object is itself, then it should test as equivalent to itself.
In most cases this logic is true, but it relies on the implementation of the __eq__ special method. As the docs say,
The default behavior for equality comparison (== and !=) is based on
the identity of the objects. Hence, equality comparison of instances
with the same identity results in equality, and equality comparison of
instances with different identities results in inequality. A
motivation for this default behavior is the desire that all objects
should be reflexive (i.e. x is y implies x == y).
and in the interests of consistency, recommends:
Equality comparison should be reflexive. In other words, identical
objects should compare equal:
x is y implies x == y
We can see that this is the default behavior for custom objects:
>>> class Object(object): pass
>>> obj = Object()
>>> obj2 = Object()
>>> obj == obj, obj is obj
(True, True)
>>> obj == obj2, obj is obj2
(False, False)
The contrapositive is also usually true: if some things test as not equal, you can usually infer that they are not the same object.
Since tests for equality can be customized, this inference does not always hold true for all types.
An exception
A notable exception is nan - it always tests as not equal to itself:
>>> nan = float('nan')
>>> nan
nan
>>> nan is nan
True
>>> nan == nan # !!!!!
False
Checking for identity can be a much quicker check than checking for equality (which might require recursively checking members).
But it cannot be substituted for equality where you may find more than one object as equivalent.
Note that comparing lists and tuples for equality first checks whether corresponding elements are the same object, and treats identical elements as equal (because this is a fast check). This can create contradictions if the logic is inconsistent, as it is for nan:
>>> [nan] == [nan]
True
>>> (nan,) == (nan,)
True
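To make the identity shortcut visible, the sketch below adds a second, separately created NaN (the name other_nan is illustrative). With the same object on both sides the containers compare equal; with two distinct NaN objects the comparison falls back to ==, which is False for NaN.

```python
nan = float("nan")
other_nan = float("nan")     # a second, distinct NaN object

nan_equals_itself = nan == nan                     # NaN never equals NaN
same_object_in_lists = [nan] == [nan]              # identity shortcut applies
distinct_objects_in_lists = [nan] == [other_nan]   # falls back to ==
membership_uses_identity = nan in [nan]            # same shortcut for `in`
```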
A Cautionary Tale:
The question is attempting to use is to compare integers. You shouldn't assume that an instance of an integer is the same instance as one obtained by another reference. This story explains why.
A commenter had code that relied on the fact that small integers (-5 to 256 inclusive) are singletons in Python, instead of checking for equality.
Wow, this can lead to some insidious bugs. I had some code that checked if a is b, which worked as I wanted because a and b are typically small numbers. The bug only happened today, after six months in production, because a and b were finally large enough to not be cached. – gwg
It worked in development. It may have passed some unittests.
And it worked in production - until the code checked for an integer larger than 256, at which point it failed in production.
This is a production failure that could have been caught in code review or possibly with a style-checker.
Let me emphasize: do not use is to compare integers.
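As a sketch of the safe pattern, here is a hypothetical helper, counts_match, that compares by value. The integers are built with int() at run time, roughly the way values read from a file or a database would be, so nothing depends on whether the interpreter happens to cache them.

```python
def counts_match(expected, actual):
    """Compare two integer counts by value, never by identity."""
    return expected == actual


# One pair inside CPython's small-integer cache range, one pair far outside it.
small_a, small_b = int("200"), int("200")
large_a, large_b = int("100000"), int("100000")

small_result = counts_match(small_a, small_b)
large_result = counts_match(large_a, large_b)
```

Both calls give the same answer, which is exactly the property the production code in the story was missing.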

== determines if the values are equal, while is determines if they are the exact same object.

What's the difference between is and ==?
== and is are different comparisons! As others already said:
== compares the values of the objects.
is compares the references of the objects.
In Python names refer to objects, for example in this case value1 and value2 refer to an int instance storing the value 1000:
value1 = 1000
value2 = value1
Because value2 refers to the same object is and == will give True:
>>> value1 == value2
True
>>> value1 is value2
True
In the following example the names value1 and value2 refer to different int instances, even if both store the same integer:
>>> value1 = 1000
>>> value2 = 1000
Because the same value (integer) is stored == will be True, that's why it's often called "value comparison". However is will return False because these are different objects:
>>> value1 == value2
True
>>> value1 is value2
False
When to use which?
Generally is is a much faster comparison. That's why CPython caches (or maybe reuses would be the better term) certain objects like small integers, some strings, etc. But this should be treated as an implementation detail that could (even if unlikely) change at any point without warning.
You should only use is if you:
want to check if two objects are really the same object (not just the same "value"). One example can be if you use a singleton object as constant.
want to compare a value to a Python constant. The constants in Python are:
None
True (see note 1)
False (see note 1)
NotImplemented
Ellipsis
__debug__
classes (for example int is int or int is float)
There could be additional constants in built-in modules or 3rd party modules (for example np.ma.masked from the NumPy module).
In every other case you should use == to check for equality.
Can I customize the behavior?
There is some aspect to == that hasn't been mentioned already in the other answers: It's part of Python's "Data model". That means its behavior can be customized using the __eq__ method. For example:
class MyClass(object):
    def __init__(self, val):
        self._value = val

    def __eq__(self, other):
        print('__eq__ method called')
        try:
            return self._value == other._value
        except AttributeError:
            raise TypeError('Cannot compare {0} to objects of type {1}'
                            .format(type(self), type(other)))
This is just an artificial example to illustrate that the method is really called:
>>> MyClass(10) == MyClass(10)
__eq__ method called
True
Note that by default (if no other implementation of __eq__ can be found in the class or the superclasses) __eq__ uses is:
class AClass(object):
    def __init__(self, value):
        self._value = value

>>> a = AClass(10)
>>> b = AClass(10)
>>> a == b
False
>>> a == a
True
So it's actually important to implement __eq__ if you want "more" than just reference-comparison for custom classes!
On the other hand, you cannot customize is checks. They always test only whether both operands are the same object.
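For comparison, here is a sketch of a more conventional __eq__ than the artificial example above. Instead of raising TypeError for foreign types, it returns NotImplemented so Python can fall back to "not equal", and it defines __hash__ so that equal objects behave as one key in sets and dicts. The Money class and its _amount attribute are hypothetical, chosen only for illustration.

```python
class Money(object):
    """Hypothetical value type used only for this illustration."""

    def __init__(self, amount):
        self._amount = amount

    def __eq__(self, other):
        if not isinstance(other, Money):
            # Let Python try the other operand; it ends up as "not equal".
            return NotImplemented
        return self._amount == other._amount

    def __ne__(self, other):
        # Needed on Python 2; Python 3 derives != from __eq__ automatically.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must hash equally so they work in sets and dicts.
        return hash(self._amount)


equal_by_value = Money(10) == Money(10)
same_object = Money(10) is Money(10)
equal_to_plain_int = Money(10) == 10
distinct_in_set = len({Money(10), Money(10)})
```

Here == reports equal amounts, is still reports two separate objects, and comparing against a plain int quietly gives False instead of raising.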
Will these comparisons always return a boolean?
Because __eq__ can be re-implemented or overridden, it's not limited to return True or False. It could return anything (but in most cases it should return a boolean!).
For example with NumPy arrays the == will return an array:
>>> import numpy as np
>>> np.arange(10) == 2
array([False, False, True, False, False, False, False, False, False, False], dtype=bool)
But is checks will always return True or False!
1 As Aaron Hall mentioned in the comments:
Generally you shouldn't do any is True or is False checks because one normally uses these "checks" in a context that implicitly converts the condition to a boolean (for example in an if statement). So doing the is True comparison and the implicit boolean cast is doing more work than just doing the boolean cast - and you limit yourself to booleans (which isn't considered pythonic).
Like PEP8 mentions:
Don't compare boolean values to True or False using ==.
Yes: if greeting:
No: if greeting == True:
Worse: if greeting is True:

They are completely different. is checks for object identity, while == checks for equality (a notion that depends on the two operands' types).
It is only a lucky coincidence that "is" seems to work correctly with small integers (e.g. 5 is 4+1). That is because CPython optimizes the storage of integers in the range -5 to 256 by making them singletons. This behavior is totally implementation-dependent and not guaranteed to be preserved under all manner of minor transformative operations.
For example, Python 3.5 also makes short strings singletons, but slicing them disrupts this behavior:
>>> "foo" + "bar" == "foobar"
True
>>> "foo" + "bar" is "foobar"
True
>>> "foo"[:] + "bar" == "foobar"
True
>>> "foo"[:] + "bar" is "foobar"
False

https://docs.python.org/library/stdtypes.html#comparisons
is tests for identity
== tests for equality
Each (small) integer value is mapped to a single object, so every 3 is identical and equal. This is an implementation detail, though, not part of the language spec.

Your answer is correct. The is operator compares the identity of two objects. The == operator compares the values of two objects.
An object's identity never changes once it has been created; you may think of it as the object's address in memory.
You can control the comparison behaviour of object values by defining a __cmp__ method (Python 2 only) or a rich comparison method like __eq__.

Have a look at Stack Overflow question Python's “is” operator behaves unexpectedly with integers.
What it mostly boils down to is that "is" checks whether they are the same object, not just equal to each other (small integers, from -5 to 256 in CPython, are a special case).

In a nutshell, is checks whether two references point to the same object or not. == checks whether two objects have the same value or not.
a = [1, 2, 3]
b = a        # a and b point to the same object
c = list(a)  # c points to a different object
if a == b:
    print('#')     # output: #
if a is b:
    print('##')    # output: ##
if a == c:
    print('###')   # output: ###
if a is c:
    print('####')  # no output, as c and a point to different objects

The other answers explain the difference between == and is for comparing objects or variables in detail, so I will focus on comparing strings with is and ==, which can give different results, and I urge programmers to use them carefully.
For string comparison, make sure to use == instead of is:
str = 'hello'
if (str is 'hello'):
    print ('str is hello')
if (str == 'hello'):
    print ('str == hello')
Out:
str is hello
str == hello
But in the below example == and is will get different results:
str2 = 'hello sam'
if (str2 is 'hello sam'):
    print ('str2 is hello sam')
if (str2 == 'hello sam'):
    print ('str2 == hello sam')
Out:
str2 == hello sam
Conclusion and Analysis:
Use is carefully to compare between strings.
Since is compares objects, and since in Python 3+ every value, including a string, is an object, let's see what happened in the examples above.
Python has an id function that returns an integer that is unique and constant for an object during its lifetime. The is keyword compares two objects by this identity.
str = 'hello'
id('hello')
> 140039832615152
id(str)
> 140039832615152
But
str2 = 'hello sam'
id('hello sam')
> 140039832615536
id(str2)
> 140039832615792

As John Feminella said, most of the time you will use == and != because your objective is to compare values. I'd just like to categorise what you would do the rest of the time:
There is one and only one instance of NoneType i.e. None is a singleton. Consequently foo == None and foo is None mean the same. However the is test is faster and the Pythonic convention is to use foo is None.
If you are doing some introspection or mucking about with garbage collection or checking whether your custom-built string interning gadget is working or suchlike, then you probably have a use-case for foo is bar.
True and False are also (now) singletons, but there is no use-case for foo == True and no use case for foo is True.

Most answers already address the point. Just as an additional note (based on my understanding and experimenting, not on a documented source), the statement
== if the objects referred to by the variables are equal
from the answers above should, at least for built-in sequences such as lists and tuples, be read as
== if the objects referred to by the variables are equal and the objects belong to the same type/class.
I arrived at this conclusion based on the test below:
list1 = [1,2,3,4]
tuple1 = (1,2,3,4)
print(list1)
print(tuple1)
print(id(list1))
print(id(tuple1))
print(list1 == tuple1)
print(list1 is tuple1)
Here the contents of the list and the tuple are the same, but the type/class is different, so both comparisons print False.
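A short sketch that puts the observation in context: whether == looks at the type is decided by the types involved, not by == itself. Lists and tuples never compare equal to each other, while numbers and sets do compare across related types.

```python
list1 = [1, 2, 3, 4]
tuple1 = (1, 2, 3, 4)

list_vs_tuple = list1 == tuple1                  # list and tuple never compare equal
after_conversion = tuple(list1) == tuple1        # equal once both sides are tuples
int_vs_float = 1 == 1.0                          # numeric types compare across types
set_vs_frozenset = {1, 2} == frozenset([1, 2])   # sets compare by their members
```

So "same type" is a rule of list and tuple comparison, not a general rule of ==.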

When a GString will change its toString representation

I am reading the Groovy closure documentation at https://groovy-lang.org/closures.html#this and have a question about GString behavior.
Closures in GStrings
The document mentioned the following:
Take the following code:
def x = 1
def gs = "x = ${x}"
assert gs == 'x = 1'
The code behaves as you would expect, but what happens if you add:
x = 2
assert gs == 'x = 2'
You will see that the assert fails! There are two reasons for this:
a GString only evaluates lazily the toString representation of values
the syntax ${x} in a GString does not represent a closure but an expression to $x, evaluated when the GString is created.
In our example, the GString is created with an expression referencing x. When the GString is created, the value of x is 1, so the GString is created with a value of 1. When the assert is triggered, the GString is evaluated and 1 is converted to a String using toString. When we change x to 2, we did change the value of x, but it is a different object, and the GString still references the old one.
A GString will only change its toString representation if the values it references are mutating. If the references change, nothing will happen.
My question is about the explanation quoted above. In the example code, 1 is obviously a value, not a reference type, so if this statement is true, shouldn't it update to 2 in the GString?
The next example, listed below, also confuses me a bit (the last part).
Why, if we mutate Sam to change his name to Lucy, is the GString correctly mutated this time?
I expected that it wouldn't change. Why is the behavior so different in the two examples?
class Person {
String name
String toString() { name }
}
def sam = new Person(name:'Sam')
def lucy = new Person(name:'Lucy')
def p = sam
def gs = "Name: ${p}"
assert gs == 'Name: Sam'
p = lucy // if we change p to lucy
assert gs == 'Name: Sam' // the string still evaluates to Sam because it was the value of p when the GString was created
/* I would expect below to be 'Name: Sam' as well
* if previous example is true. According to the
* explanation mentioned previously.
*/
sam.name = 'Lucy' // so if we mutate Sam to change his name to Lucy
assert gs == 'Name: Lucy' // this time the GString is correctly mutated
Why does the comment say 'this time the GString is correctly mutated'? The previous comment just mentioned
the string still evaluates to Sam because it was the value of p when the GString was created
The value of p was 'Sam' when the GString was created, so I think it should not change here?
Thanks for your kind help.

These two examples explain two different use cases. In the first example, the expression "x = ${x}" creates a GString object that internally stores strings = ['x = '] and values = [1]. You can check the internals of this particular GString with println gs.dump():
<org.codehaus.groovy.runtime.GStringImpl@6aa798b strings=[x = , ] values=[1]>
Both objects, the String in the strings array and the Integer in the values array, are immutable. (The values are immutable, not the arrays.) When the x variable is assigned a new value, it points to a different object in memory that is not associated with the 1 stored in the GString.values array. x = 2 is not a mutation; it binds x to a new object. This is not Groovy-specific; this is how Java works. You can try the following pure Java example to see how it works:
List<Integer> list = new ArrayList<>();
Integer number = 2;
list.add(number);
number = 4;
System.out.println(list); // prints: [2]
The use case with a Person class is different. Here you can see how mutation of an object works. When you change sam.name to Lucy, you mutate the internal state of an object stored in the GString.values array. If you instead create a new object and assign it to the sam variable (e.g. sam = new Person(name:"Adam")), it does not affect the internals of the existing GString object. The object that was stored internally in the GString did not mutate. The variable sam in this case just refers to a different object in the memory. When you do sam.name = "Lucy", you mutate the object in the memory, thus GString (which uses a reference to the same object) sees this change. It is similar to the following plain Java use case:
List<List<Integer>> list2 = new ArrayList<>();
List<Integer> nested = new ArrayList<>();
nested.add(1);
list2.add(nested);
System.out.println(list2); // prints: [[1]]
nested.add(3);
System.out.println(list2); // prints: [[1, 3]]
nested = new ArrayList<>();
System.out.println(list2); // prints: [[1, 3]]
You can see that list2 stores the reference to the object in the memory represented by nested variable at the time when nested was added to list2. When you mutated nested list by adding new numbers to it, those changes are reflected in list2, because you mutate an object in the memory that list2 has access to. But when you override nested with a new list, you create a new object, and list2 has no connection with this new object in the memory. You could add integers to this new nested list and list2 won't be affected - it stores a reference to a different object in the memory. (The object that previously could be referred to using nested variable, but this reference was overridden later in the code with a new object.)
GString in this case behaves similarly to the examples with lists I showed you above. If you mutate the state of the interpolated object (e.g. sam.name, or adding integers to nested list), this change is reflected in the GString.toString() that produces a string when the method is called. (The string that is created uses the current state of values stored in the values internal array.) On the other hand, if you override a variable with a new object (e.g. x = 2, sam = new Person(name:"Adam"), or nested = new ArrayList()), it won't change what GString.toString() method produces, because it still uses an object (or objects) that is stored in the memory, and that was previously associated with the variable name you assigned to a new object.
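The same distinction between rebinding a variable and mutating an object can be sketched outside Groovy. The Python below is only an analogy: LazyText is a hypothetical stand-in for a GString that stores a reference to its value and renders it on demand, and Person mirrors the class from the question.

```python
class Person(object):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


class LazyText(object):
    """Hypothetical stand-in for a GString: stores a value, renders on demand."""

    def __init__(self, prefix, value):
        self.prefix = prefix
        self.value = value      # reference captured at creation time

    def __str__(self):
        return self.prefix + str(self.value)


sam = Person("Sam")
lucy = Person("Lucy")

p = sam
text = LazyText("Name: ", p)
first_render = str(text)

p = lucy                        # rebinding p: text still holds the sam object
after_rebinding = str(text)

sam.name = "Lucy"               # mutating the object that text refers to
after_mutation = str(text)
```

Rebinding p leaves the rendered text at "Name: Sam"; only the mutation of the captured object changes it to "Name: Lucy".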

That's almost the whole story, as you can use a Closure for your GString evaluation, so in place of just using the variable:
def gs = "x = ${x}"
You can use a closure that returns the variable:
def gs = "x = ${-> x}"
This means that the value x is evaluated at the time the GString is changed to a String, so this then works (from the original question)
def x = 1
def gs = "x = ${-> x}"
assert gs == 'x = 1'
x = 2
assert gs == 'x = 2'
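As an analogy only (Python, not Groovy), the closure form behaves like storing a function instead of a value: the variable is looked up each time the text is rendered. DeferredText is a hypothetical class written for this sketch.

```python
class DeferredText(object):
    """Hypothetical stand-in for a GString whose placeholder is a closure."""

    def __init__(self, prefix, supplier):
        self.prefix = prefix
        self.supplier = supplier    # a callable, invoked on every render

    def __str__(self):
        return self.prefix + str(self.supplier())


x = 1
text = DeferredText("x = ", lambda: x)
before = str(text)

x = 2                               # rebinding x is now visible to the text
after = str(text)
```

Because the lambda reads x at call time, the second render sees the new value, which matches the behavior of "${-> x}" described above.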

Python operating on big numbers causes margin errors [duplicate]

This question's answers are a community effort. Edit existing answers to improve this post. It is not currently accepting new answers or interactions.
My Google-fu has failed me.
In Python, are the following two tests for equality equivalent?
n = 5
# Test one.
if n == 5:
print 'Yay!'
# Test two.
if n is 5:
print 'Yay!'
Does this hold true for objects where you would be comparing instances (a list say)?
Okay, so this kind of answers my question:
L = []
L.append(1)
if L == [1]:
print 'Yay!'
# Holds true, but...
if L is [1]:
print 'Yay!'
# Doesn't.
So == tests value where is tests to see if they are the same object?

is will return True if two variables point to the same object (in memory), == if the objects referred to by the variables are equal.
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
>>> b == a
True
# Make a new copy of list `a` via the slice operator,
# and assign it to variable `b`
>>> b = a[:]
>>> b is a
False
>>> b == a
True
In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:
>>> 1000 is 10**3
False
>>> 1000 == 10**3
True
The same holds true for string literals:
>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> "aa" is intern(x*2)
True
Please see this question as well.

There is a simple rule of thumb to tell you when to use == or is.
== is for value equality. Use it when you would like to know if two objects have the same value.
is is for reference equality. Use it when you would like to know if two references refer to the same object.
In general, when you are comparing something to a simple type, you are usually checking for value equality, so you should use ==. For example, the intention of your example is probably to check whether x has a value equal to 2 (==), not whether x is literally referring to the same object as 2.
Something else to note: because of the way the CPython reference implementation works, you'll get unexpected and inconsistent results if you mistakenly use is to compare for reference equality on integers:
>>> a = 500
>>> b = 500
>>> a == b
True
>>> a is b
False
That's pretty much what we expected: a and b have the same value, but are distinct entities. But what about this?
>>> c = 200
>>> d = 200
>>> c == d
True
>>> c is d
True
This is inconsistent with the earlier result. What's going on here? It turns out the reference implementation of Python caches integer objects in the range -5..256 as singleton instances for performance reasons. Here's an example demonstrating this:
>>> for i in range(250, 260): a = i; print "%i: %s" % (i, a is int(str(i)));
...
250: True
251: True
252: True
253: True
254: True
255: True
256: True
257: False
258: False
259: False
This is another obvious reason not to use is: the behavior is left up to implementations when you're erroneously using it for value equality.

Is there a difference between == and is in Python?
Yes, they have a very important difference.
==: check for equality - the semantics are that equivalent objects (that aren't necessarily the same object) will test as equal. As the documentation says:
The operators <, >, ==, >=, <=, and != compare the values of two objects.
is: check for identity - the semantics are that the object (as held in memory) is the object. Again, the documentation says:
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. Object identity is
determined using the id() function. x is not y yields the inverse
truth value.
Thus, the check for identity is the same as checking for the equality of the IDs of the objects. That is,
a is b
is the same as:
id(a) == id(b)
where id is the builtin function that returns an integer that "is guaranteed to be unique among simultaneously existing objects" (see help(id)) and where a and b are any arbitrary objects.
Other Usage Directions
You should use these comparisons for their semantics. Use is to check identity and == to check equality.
So in general, we use is to check for identity. This is usually useful when we are checking for an object that should only exist once in memory, referred to as a "singleton" in the documentation.
Use cases for is include:
None
enum values (when using Enums from the enum module)
usually modules
usually class objects resulting from class definitions
usually function objects resulting from function definitions
anything else that should only exist once in memory (all singletons, generally)
a specific object that you want by identity
Usual use cases for == include:
numbers, including integers
strings
lists
sets
dictionaries
custom mutable objects
other builtin immutable objects, in most cases
The general use case for ==, again, is that the object you want may not be the same object; it may instead be an equivalent one.
PEP 8 directions
PEP 8, the official Python style guide for the standard library, also mentions two use cases for is:
Comparisons to singletons like None should always be done with is or
is not, never the equality operators.
Also, beware of writing if x when you really mean if x is not None --
e.g. when testing whether a variable or argument that defaults to None
was set to some other value. The other value might have a type (such
as a container) that could be false in a boolean context!
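The pitfall PEP 8 describes can be sketched as follows. The two helper functions are hypothetical and exist only to contrast the checks; the empty list stands in for "a container that could be false in a boolean context":

```python
def describe(items=None):
    """Report whether the caller passed a value for `items`."""
    if items is not None:  # identity check against the None singleton
        return "got a value with %d item(s)" % len(items)
    return "nothing passed"


def describe_buggy(items=None):
    """Same idea, but truthiness treats an empty container like None."""
    if items:  # an empty list is falsy, so it is mistaken for "not passed"
        return "got a value with %d item(s)" % len(items)
    return "nothing passed"


print(describe([]))        # got a value with 0 item(s)
print(describe_buggy([]))  # nothing passed
```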
Inferring equality from identity
If is is true, equality can usually be inferred - logically, if an object is itself, then it should test as equivalent to itself.
In most cases this logic is true, but it relies on the implementation of the __eq__ special method. As the docs say,
The default behavior for equality comparison (== and !=) is based on
the identity of the objects. Hence, equality comparison of instances
with the same identity results in equality, and equality comparison of
instances with different identities results in inequality. A
motivation for this default behavior is the desire that all objects
should be reflexive (i.e. x is y implies x == y).
and in the interests of consistency, recommends:
Equality comparison should be reflexive. In other words, identical
objects should compare equal:
x is y implies x == y
We can see that this is the default behavior for custom objects:
>>> class Object(object): pass
>>> obj = Object()
>>> obj2 = Object()
>>> obj == obj, obj is obj
(True, True)
>>> obj == obj2, obj is obj2
(False, False)
The contrapositive is also usually true: if two things test as not equal, you can usually infer that they are not the same object.
Since tests for equality can be customized, this inference does not always hold true for all types.
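To make that concrete, here is a deliberately pathological sketch. NeverEqual is a made-up class, not something from the standard library; it shows that an object can be identical to itself and still test as unequal when __eq__ is written that way:

```python
class NeverEqual(object):
    """Hypothetical class whose __eq__ rejects everything, including itself."""

    def __eq__(self, other):
        return False

    def __ne__(self, other):
        return True  # spelled out so Python 2 behaves the same way

    __hash__ = object.__hash__  # keep instances hashable in Python 3


thing = NeverEqual()
print(thing is thing)  # True
print(thing == thing)  # False
```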
An exception
A notable exception is nan - it always tests as not equal to itself:
>>> nan = float('nan')
>>> nan
nan
>>> nan is nan
True
>>> nan == nan # !!!!!
False
Checking for identity can be a much quicker check than checking for equality (which might require recursively checking members).
But it cannot be substituted for equality where you may find more than one object as equivalent.
Note that comparing lists and tuples for equality assumes that identical objects are equal (because this is a fast check). This can create contradictions if the logic is inconsistent, as it is for nan:
>>> [nan] == [nan]
True
>>> (nan,) == (nan,)
True
A Cautionary Tale:
The question is attempting to use is to compare integers. You shouldn't assume that an instance of an integer is the same instance as one obtained by another reference. This story explains why.
A commenter had code that relied on the fact that small integers (-5 to 256 inclusive) are singletons in Python, instead of checking for equality.
Wow, this can lead to some insidious bugs. I had some code that checked if a is b, which worked as I wanted because a and b are typically small numbers. The bug only happened today, after six months in production, because a and b were finally large enough to not be cached. – gwg
It worked in development. It may have passed some unittests.
And it worked in production - until the code checked for an integer larger than 256, at which point it failed in production.
This is a production failure that could have been caught in code review or possibly with a style-checker.
Let me emphasize: do not use is to compare integers.

== determines if the values are equal, while is determines if they are the exact same object.

What's the difference between is and ==?
== and is are different comparisons! As others have already said:
== compares the values of the objects.
is compares the references of the objects.
In Python names refer to objects, for example in this case value1 and value2 refer to an int instance storing the value 1000:
value1 = 1000
value2 = value1
Because value2 refers to the same object is and == will give True:
>>> value1 == value2
True
>>> value1 is value2
True
In the following example the names value1 and value2 refer to different int instances, even if both store the same integer:
>>> value1 = 1000
>>> value2 = 1000
Because the same value (integer) is stored == will be True, that's why it's often called "value comparison". However is will return False because these are different objects:
>>> value1 == value2
True
>>> value1 is value2
False
When to use which?
Generally is is a much faster comparison. That's why CPython caches (or maybe reuses would be the better term) certain objects like small integers, some strings, etc. But this should be treated as an implementation detail that could (even if unlikely) change at any point without warning.
You should only use is if you:
want to check if two objects are really the same object (not just the same "value"). One example can be if you use a singleton object as constant.
want to compare a value to a Python constant. The constants in Python are:
None
True [1]
False [1]
NotImplemented
Ellipsis
__debug__
classes (for example int is int or int is float)
there could be additional constants in built-in modules or 3rd party modules (for example np.ma.masked from the NumPy module)
In every other case you should use == to check for equality.
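The "singleton object as constant" case mentioned above is often seen as a sentinel default. The sketch below is only an illustration; _MISSING and lookup are hypothetical names, not part of any library:

```python
_MISSING = object()  # hypothetical module-level sentinel; exists exactly once


def lookup(mapping, key, default=_MISSING):
    """Return mapping[key]; raise KeyError unless a default was supplied."""
    if key in mapping:
        return mapping[key]
    if default is _MISSING:  # identity check: was anything passed at all?
        raise KeyError(key)
    return default  # may legitimately be None, 0 or ''


settings = {"debug": False}
print(lookup(settings, "debug"))          # False
print(lookup(settings, "verbose", None))  # None
```

Using is here means a caller can pass None (or anything else that compares equal to some other value) as a real default without it being confused with "no default given".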
Can I customize the behavior?
There is an aspect of == that hasn't been mentioned in the other answers: it's part of Python's "Data model". That means its behavior can be customized using the __eq__ method. For example:
class MyClass(object):
    def __init__(self, val):
        self._value = val
    def __eq__(self, other):
        print('__eq__ method called')
        try:
            return self._value == other._value
        except AttributeError:
            raise TypeError('Cannot compare {0} to objects of type {1}'
                            .format(type(self), type(other)))
This is just an artificial example to illustrate that the method is really called:
>>> MyClass(10) == MyClass(10)
__eq__ method called
True
Note that by default (if no other implementation of __eq__ can be found in the class or the superclasses) __eq__ uses is:
class AClass(object):
    def __init__(self, value):
        self._value = value
>>> a = AClass(10)
>>> b = AClass(10)
>>> a == b
False
>>> a == a
True
So it's actually important to implement __eq__ if you want "more" than just reference-comparison for custom classes!
On the other hand, you cannot customize is checks. They always check only whether both operands refer to the same object.
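As an alternative to raising TypeError, a common convention is to return NotImplemented for operands the class does not know how to compare, so Python can try the other operand and then fall back to identity. The following is a sketch under that assumption; Money is a made-up class used only for illustration:

```python
class Money(object):
    """Hypothetical value class compared by amount rather than by identity."""

    def __init__(self, amount):
        self.amount = amount

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented  # let Python try the other operand
        return self.amount == other.amount

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        return hash(self.amount)  # equal objects must hash equally


a = Money(10)
b = Money(10)
print(a == b, a is b)  # True False
print(a == 10)         # False
```

Defining __hash__ alongside __eq__ keeps equal instances interchangeable as dictionary keys and set members.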
Will these comparisons always return a boolean?
Because __eq__ can be re-implemented or overridden, it's not limited to return True or False. It could return anything (but in most cases it should return a boolean!).
For example with NumPy arrays the == will return an array:
>>> import numpy as np
>>> np.arange(10) == 2
array([False, False, True, False, False, False, False, False, False, False], dtype=bool)
But is checks will always return True or False!
[1] As Aaron Hall mentioned in the comments:
Generally you shouldn't do any is True or is False checks because one normally uses these "checks" in a context that implicitly converts the condition to a boolean (for example in an if statement). So doing the is True comparison and the implicit boolean cast is doing more work than just doing the boolean cast - and you limit yourself to booleans (which isn't considered pythonic).
Like PEP8 mentions:
Don't compare boolean values to True or False using ==.
Yes: if greeting:
No: if greeting == True:
Worse: if greeting is True:

They are completely different. is checks for object identity, while == checks for equality (a notion that depends on the two operands' types).
It is only a lucky coincidence that "is" seems to work correctly with small integers (e.g. 5 is 4+1). That is because CPython optimizes the storage of integers in the range -5 to 256 by making them singletons. This behavior is totally implementation-dependent and not guaranteed to be preserved under all manner of minor transformative operations.
For example, Python 3.5 also makes short strings singletons, but slicing them disrupts this behavior:
>>> "foo" + "bar" == "foobar"
True
>>> "foo" + "bar" is "foobar"
True
>>> "foo"[:] + "bar" == "foobar"
True
>>> "foo"[:] + "bar" is "foobar"
False
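If identity between equal strings is really required, Python 3 offers sys.intern, which returns a canonical copy of a string. The sketch below assumes Python 3 (in Python 2 the function was the builtin intern) and builds one string at run time so the result does not depend on compile-time constant folding:

```python
import sys

# Build the string at run time so the compiler cannot fold it into a constant.
parts = ["foo", "bar"]
built = "".join(parts)
literal = "foobar"

print(built == literal)  # True: same characters

# sys.intern returns the canonical copy from the interpreter's intern table,
# so two interned strings with equal contents are the same object.
print(sys.intern(built) is sys.intern(literal))  # True
```

Whether built is literal holds without interning is left out on purpose, because that result is an implementation detail.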

https://docs.python.org/library/stdtypes.html#comparisons
is tests for identity
== tests for equality
In CPython, each small integer value is mapped to a single object, so every 3 is identical and equal. This is an implementation detail, though, not part of the language spec.

Your answer is correct. The is operator compares the identity of two objects. The == operator compares the values of two objects.
An object's identity never changes once it has been created; you may think of it as the object's address in memory.
You can control the comparison behaviour of object values by defining a __cmp__ method (Python 2 only) or a rich comparison method like __eq__.

Have a look at Stack Overflow question Python's “is” operator behaves unexpectedly with integers.
What it mostly boils down to is that "is" checks to see if they are the same object, not just equal to each other (small integers, -5 to 256 in CPython, are a special case).

In a nutshell, is checks whether two references point to the same object or not. == checks whether two objects have the same value or not.
a = [1, 2, 3]
b = a        # a and b point to the same object
c = list(a)  # c points to a different object
if a == b:
    print('#')     # output: #
if a is b:
    print('##')    # output: ##
if a == c:
    print('###')   # output: ###
if a is c:
    print('####')  # no output, as c and a point to different objects

The other answers in this post already explain in detail the difference between == and is when comparing objects or variables, so I will focus on comparing strings with is and ==, which can give different results. I would urge programmers to use them carefully.
For string comparison, make sure to use == instead of is:
str = 'hello'
if (str is 'hello'):
    print ('str is hello')
if (str == 'hello'):
    print ('str == hello')
Out:
str is hello
str == hello
But in the below example == and is will get different results:
str2 = 'hello sam'
if (str2 is 'hello sam'):
    print ('str2 is hello sam')
if (str2 == 'hello sam'):
    print ('str2 == hello sam')
Out:
str2 == hello sam
Conclusion and Analysis:
Use is carefully to compare between strings.
Since is compares objects, and since in Python 3+ every value, including a string, is an object, let's see what happened in the examples above.
In Python, the id function returns a number that is unique and constant for an object during its lifetime. The interpreter compares these ids when it evaluates the is keyword.
str = 'hello'
id('hello')
> 140039832615152
id(str)
> 140039832615152
But
str2 = 'hello sam'
id('hello sam')
> 140039832615536
id(str2)
> 140039832615792

As John Feminella said, most of the time you will use == and != because your objective is to compare values. I'd just like to categorise what you would do the rest of the time:
There is one and only one instance of NoneType, i.e. None is a singleton. Consequently foo == None and foo is None usually give the same result (unless foo's class overrides __eq__). However, the is test is faster and the Pythonic convention is to use foo is None.
If you are doing some introspection or mucking about with garbage collection or checking whether your custom-built string interning gadget is working or suchlike, then you probably have a use-case for foo is bar.
True and False are also (now) singletons, but there is no use-case for foo == True and no use case for foo is True.

Most of the answers already address the point. Just as an additional note (based on my understanding and experimenting, not on a documented source), the statement
== if the objects referred to by the variables are equal
from above answers should be read as
== if the objects referred to by the variables are equal and objects belonging to the same type/class
I arrived at this conclusion based on the test below:
list1 = [1,2,3,4]
tuple1 = (1,2,3,4)
print(list1)
print(tuple1)
print(id(list1))
print(id(tuple1))
print(list1 == tuple1)
print(list1 is tuple1)
Here the contents of the list and tuple are same but the type/class are different.
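A few more cases suggest that the outcome depends on how each type defines equality rather than on a blanket "same type" rule. This is a small sketch using only built-in types; the list and tuple result matches the test above, while numeric types and sets compare across types:

```python
print([1, 2, 3, 4] == (1, 2, 3, 4))  # False: a list never equals a tuple
print(1 == 1.0)                      # True: int and float compare by numeric value
print({1, 2} == frozenset([1, 2]))   # True: set and frozenset compare by members
print(True == 1)                     # True: bool is a subclass of int
```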

Groovy different results on using equals() and == on a GStringImpl

According to the Groovy docs, the == is just a "clever" equals() as it also takes care of avoiding NullPointerException:
Java’s == is actually Groovy’s is() method, and Groovy’s == is a clever equals()!
[...]
But to do the usual equals() comparison, you should prefer Groovy’s ==, as it also takes care of avoiding NullPointerException, independently of whether the left or right is null or not.
So, the == and equals() should return the same value if the objects are not null. However, I'm getting unexpected results on executing the following script:
println "${'test'}" == 'test'
println "${'test'}".equals('test')
The output that I'm getting is:
true
false
Is this a known bug related to GStringImpl or something that I'm missing?

Nice question, the surprising thing about the code above is that
println "${'test'}".equals('test')
returns false. The other line of code returns the expected result, so let's forget about that.
Summary
"${'test'}".equals('test')
The object that equals is called on is of type GStringImpl whereas 'test' is of type String, so they are not considered equal.
But Why?
Obviously the GStringImpl implementation of equals could have been written such that when it is passed a String that contains the same characters as this, it returns true. Prima facie, this seems like a reasonable thing to do.
I'm guessing that the reason it wasn't written this way is because it would violate the equals contract, which states that:
It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
The implementation of String.equals(Object other) will always return false when passed a GStringImpl, so if GStringImpl.equals(Object other) returned true when passed any String, it would be in violation of the symmetric requirement.

In Groovy, a == b first checks for a compareTo method and uses a.compareTo(b) == 0 if one exists. Otherwise it uses equals.
Since Strings and GStrings implement Comparable there is a compareTo method available.
The following prints true, as expected:
println "${'test'}".compareTo('test') == 0
The behaviour of == is documented in the Groovy Language Documentation:
In Java == means equality of primitive types or identity for objects. In Groovy == means equality in all cases. It translates to a.compareTo(b) == 0, when evaluating equality for Comparable objects, and a.equals(b) otherwise. To check for identity (reference equality), use the is method: a.is(b). From Groovy 3, you can also use the === operator (or negated version): a === b (or c !== d).
The full list of operators is provided in the Groovy Language Documentation for operator overloading:
Operator      Method
+             a.plus(b)
-             a.minus(b)
*             a.multiply(b)
/             a.div(b)
%             a.mod(b)
**            a.power(b)
|             a.or(b)
&             a.and(b)
^             a.xor(b)
as            a.asType(b)
a()           a.call()
a[b]          a.getAt(b)
a[b] = c      a.putAt(b, c)
a in b        b.isCase(a)
<<            a.leftShift(b)
>>            a.rightShift(b)
>>>           a.rightShiftUnsigned(b)
++            a.next()
--            a.previous()
+a            a.positive()
-a            a.negative()
~a            a.bitwiseNegate()

Leaving this here as an additional answer, so it can be found easily for Groovy beginners.
I am explicitly transforming the GString to a normal String before comparing it.
println "${'test'}".equals("test");
println "${'test'}".toString().equals("test");
results in
false
true

Smalltalk - Compare two strings for equality

I am trying to compare two strings in Smalltalk, but I seem to be doing something wrong.
I keep getting this error:
Unhandled Exception: Non-boolean receiver. Proceed for truth.
stringOne := 'hello'.
stringTwo := 'hello'.
myNumber := 10.
[stringOne = stringTwo ] ifTrue:[
myNumber := 20].
Any idea what I'm doing wrong?

Try
stringOne = stringTwo
ifTrue: [myNumber := 20]
I don't think you need square brackets in the first line
Found great explanation. Whole thing is here
In Smalltalk, booleans (ie, True or False) are objects: specifically, they're instantiations of the abstract base class Boolean, or rather of its two subclasses True and False. So every boolean has type True or False, and no actual member data. Bool has two virtual functions, ifTrue: and ifFalse:, which take as their argument a block of code. Both True and False override these functions; True's version of ifTrue: calls the code it's passed, and False's version does nothing (and vice-versa for ifFalse:). Here's an example:
a < b
ifTrue: [^'a is less than b']
ifFalse: [^'a is greater than or equal to b']
Those things in square brackets are essentially anonymous functions, by the way. Except they're objects, because everything is an object in Smalltalk. Now, what's happening there is that we call a's "<" method, with argument b; this returns a boolean. We call its ifTrue: and ifFalse: methods, passing as arguments the code we want executed in either case. The effect is the same as that of the Ruby code
if a < b then
puts "a is less than b"
else
puts "a is greater than or equal to b"
end

As others have said, it will work the way you want if you get rid of the first set of square brackets.
But to explain the problem you were running into better:
[stringOne = stringTwo ] ifTrue:[myNumber := 20]
is passing the message ifTrue: to a block, and blocks do not understand that method, only boolean objects do.
If you first evaluate the block, it will evaluate to a true object, which will then know how to respond:
[stringOne = stringTwo] value ifTrue:[myNumber := 20]
Or what you should really do, as others have pointed out:
stringOne = stringTwo ifTrue:[myNumber := 20]
both of which evaluate stringOne = stringTwo to true before sending ifTrue: [...] to it.

[stringOne = stringTwo] is a block, not a boolean. When the block is invoked, perhaps it will result in a boolean. But you are not invoking the block here. Instead, you are merely making the block the receiver of ifTrue:.
Instead, try:
(stringOne = stringTwo) ifTrue: [
myNumber := 20 ].

Should you be blocking the comparison? I would have thought that:
( stringOne = stringTwo ) ifTrue: [ myNumber := 20 ]
would be enough.

but I seem to be doing something wrong
Given that you are using VisualWorks your install should include a doc folder.
Look at the AppDevGuide.pdf - it has a lot of information about programming with VisualWorks and more to the point it has a lot of introductory information about Smalltalk programming.
Look through the Contents table at the beginning, until Chapter 7 "Control Structures", click "Branching" or "Conditional Tests" and you'll be taken to the appropriate section in the pdf that tells you all about Smalltalk if-then-else and gives examples that would have helped you see what you were doing wrong.

I would like to add my two cents:
as blocks are actually lambdas which can be passed around, another good example would be the following method:
do:aBlock ifCondition:aCondition
... some more code ...
aCondition value ifTrue: aBlock.
... some more code ...
aBlock value
...
so the argument to ifTrue:/ifFalse: can actually come from someone else. This kind of passed-in conditions is often useful in "..ifAbsent:" or "..onError:" kind of methods.
(originally meant as a comment, but I could not get the code example to be unformatted)
