Struct equality with arrays

Struct equality with arrays - struct

If I have arrays in a struct as below, I can't compare the equality of the struct because the arrays are mutable? Is there a way to get the equality to pass down to the array so that I get true for a([1,2,3]) == a([1,2,3])? Or is the only way to do this to extend Base.==?
julia> struct a
v
end
julia> a([1,2,3]) == a([1,2,3])
false
julia> a(1) == a(1)
true
julia> [1,2,3] == [1,2,3] # want the equality to work like this for the struct
true
julia> [1,2,3] === [1,2,3]
false

The answer by #miguel raz does not work at all!
This happens since isequal is actually calling == rather than == calling isequal. In the isequal doc you can find explicitely that:
The default implementation of isequal calls ==, so a type that does not involve
floating-point values generally only needs to define ==
Hence the correct code is:
struct A
v
end
import Base.==
==(x::A,y::A) = x.v==y.v
However, a more elegant approach would be to write a generic code that does not rely on having the field v. Since we do not want to overload the default == operator we can define an abstract type that will tell Julia to use our implementation:
abstract type Comparable end
import Base.==
function ==(a::T, b::T) where T <: Comparable
f = fieldnames(T)
getfield.(Ref(a),f) == getfield.(Ref(b),f)
end
Now you can define your own structures that will correctly compare:
struct B <: Comparable
x
y
end
Testing:
julia> b1 = B([1,2],[B(7,[1])]);
julia> b2 = B([1,2],[B(7,[1])])
julia> b1 == b2
true

As #Przemyslaw answered, an overload of == is needed.
For situations where an abstract class implementation doesn't work, the overload can be done as a one-liner (avoiding the import statement):
Base.:(==)(a::A, b::A) = Base.:(==)(a.v, a.v)
It's good to also override isequal (for structs that don't need special NaN and missing value semantics) and hash (for structs that can be used as keys):
Base.isequal(a::A, b::A) = Base.isequal(a.v, b.v)
Base.hash(a::A, h::UInt) = Base.hash(a.v, h)

Related

Why there is difference between 'is' and '==' with various results in python [duplicate]

This question's answers are a community effort. Edit existing answers to improve this post. It is not currently accepting new answers or interactions.
My Google-fu has failed me.
In Python, are the following two tests for equality equivalent?
n = 5
# Test one.
if n == 5:
print 'Yay!'
# Test two.
if n is 5:
print 'Yay!'
Does this hold true for objects where you would be comparing instances (a list say)?
Okay, so this kind of answers my question:
L = []
L.append(1)
if L == [1]:
print 'Yay!'
# Holds true, but...
if L is [1]:
print 'Yay!'
# Doesn't.
So == tests value where is tests to see if they are the same object?

is will return True if two variables point to the same object (in memory), == if the objects referred to by the variables are equal.
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
>>> b == a
True
# Make a new copy of list `a` via the slice operator,
# and assign it to variable `b`
>>> b = a[:]
>>> b is a
False
>>> b == a
True
In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:
>>> 1000 is 10**3
False
>>> 1000 == 10**3
True
The same holds true for string literals:
>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> "aa" is intern(x*2)
True
Please see this question as well.

There is a simple rule of thumb to tell you when to use == or is.
== is for value equality. Use it when you would like to know if two objects have the same value.
is is for reference equality. Use it when you would like to know if two references refer to the same object.
In general, when you are comparing something to a simple type, you are usually checking for value equality, so you should use ==. For example, the intention of your example is probably to check whether x has a value equal to 2 (==), not whether x is literally referring to the same object as 2.
Something else to note: because of the way the CPython reference implementation works, you'll get unexpected and inconsistent results if you mistakenly use is to compare for reference equality on integers:
>>> a = 500
>>> b = 500
>>> a == b
True
>>> a is b
False
That's pretty much what we expected: a and b have the same value, but are distinct entities. But what about this?
>>> c = 200
>>> d = 200
>>> c == d
True
>>> c is d
True
This is inconsistent with the earlier result. What's going on here? It turns out the reference implementation of Python caches integer objects in the range -5..256 as singleton instances for performance reasons. Here's an example demonstrating this:
>>> for i in range(250, 260): a = i; print "%i: %s" % (i, a is int(str(i)));
...
250: True
251: True
252: True
253: True
254: True
255: True
256: True
257: False
258: False
259: False
This is another obvious reason not to use is: the behavior is left up to implementations when you're erroneously using it for value equality.

Is there a difference between == and is in Python?
Yes, they have a very important difference.
==: check for equality - the semantics are that equivalent objects (that aren't necessarily the same object) will test as equal. As the documentation says:
The operators <, >, ==, >=, <=, and != compare the values of two objects.
is: check for identity - the semantics are that the object (as held in memory) is the object. Again, the documentation says:
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. Object identity is
determined using the id() function. x is not y yields the inverse
truth value.
Thus, the check for identity is the same as checking for the equality of the IDs of the objects. That is,
a is b
is the same as:
id(a) == id(b)
where id is the builtin function that returns an integer that "is guaranteed to be unique among simultaneously existing objects" (see help(id)) and where a and b are any arbitrary objects.
Other Usage Directions
You should use these comparisons for their semantics. Use is to check identity and == to check equality.
So in general, we use is to check for identity. This is usually useful when we are checking for an object that should only exist once in memory, referred to as a "singleton" in the documentation.
Use cases for is include:
None
enum values (when using Enums from the enum module)
usually modules
usually class objects resulting from class definitions
usually function objects resulting from function definitions
anything else that should only exist once in memory (all singletons, generally)
a specific object that you want by identity
Usual use cases for == include:
numbers, including integers
strings
lists
sets
dictionaries
custom mutable objects
other builtin immutable objects, in most cases
The general use case, again, for ==, is the object you want may not be the same object, instead it may be an equivalent one
PEP 8 directions
PEP 8, the official Python style guide for the standard library also mentions two use-cases for is:
Comparisons to singletons like None should always be done with is or
is not, never the equality operators.
Also, beware of writing if x when you really mean if x is not None --
e.g. when testing whether a variable or argument that defaults to None
was set to some other value. The other value might have a type (such
as a container) that could be false in a boolean context!
Inferring equality from identity
If is is true, equality can usually be inferred - logically, if an object is itself, then it should test as equivalent to itself.
In most cases this logic is true, but it relies on the implementation of the __eq__ special method. As the docs say,
The default behavior for equality comparison (== and !=) is based on
the identity of the objects. Hence, equality comparison of instances
with the same identity results in equality, and equality comparison of
instances with different identities results in inequality. A
motivation for this default behavior is the desire that all objects
should be reflexive (i.e. x is y implies x == y).
and in the interests of consistency, recommends:
Equality comparison should be reflexive. In other words, identical
objects should compare equal:
x is y implies x == y
We can see that this is the default behavior for custom objects:
>>> class Object(object): pass
>>> obj = Object()
>>> obj2 = Object()
>>> obj == obj, obj is obj
(True, True)
>>> obj == obj2, obj is obj2
(False, False)
The contrapositive is also usually true - if somethings test as not equal, you can usually infer that they are not the same object.
Since tests for equality can be customized, this inference does not always hold true for all types.
An exception
A notable exception is nan - it always tests as not equal to itself:
>>> nan = float('nan')
>>> nan
nan
>>> nan is nan
True
>>> nan == nan # !!!!!
False
Checking for identity can be much a much quicker check than checking for equality (which might require recursively checking members).
But it cannot be substituted for equality where you may find more than one object as equivalent.
Note that comparing equality of lists and tuples will assume that identity of objects are equal (because this is a fast check). This can create contradictions if the logic is inconsistent - as it is for nan:
>>> [nan] == [nan]
True
>>> (nan,) == (nan,)
True
A Cautionary Tale:
The question is attempting to use is to compare integers. You shouldn't assume that an instance of an integer is the same instance as one obtained by another reference. This story explains why.
A commenter had code that relied on the fact that small integers (-5 to 256 inclusive) are singletons in Python, instead of checking for equality.
Wow, this can lead to some insidious bugs. I had some code that checked if a is b, which worked as I wanted because a and b are typically small numbers. The bug only happened today, after six months in production, because a and b were finally large enough to not be cached. – gwg
It worked in development. It may have passed some unittests.
And it worked in production - until the code checked for an integer larger than 256, at which point it failed in production.
This is a production failure that could have been caught in code review or possibly with a style-checker.
Let me emphasize: do not use is to compare integers.

== determines if the values are equal, while is determines if they are the exact same object.

What's the difference between is and ==?
== and is are different comparison! As others already said:
== compares the values of the objects.
is compares the references of the objects.
In Python names refer to objects, for example in this case value1 and value2 refer to an int instance storing the value 1000:
value1 = 1000
value2 = value1
Because value2 refers to the same object is and == will give True:
>>> value1 == value2
True
>>> value1 is value2
True
In the following example the names value1 and value2 refer to different int instances, even if both store the same integer:
>>> value1 = 1000
>>> value2 = 1000
Because the same value (integer) is stored == will be True, that's why it's often called "value comparison". However is will return False because these are different objects:
>>> value1 == value2
True
>>> value1 is value2
False
When to use which?
Generally is is a much faster comparison. That's why CPython caches (or maybe reuses would be the better term) certain objects like small integers, some strings, etc. But this should be treated as implementation detail that could (even if unlikely) change at any point without warning.
You should only use is if you:
want to check if two objects are really the same object (not just the same "value"). One example can be if you use a singleton object as constant.
want to compare a value to a Python constant. The constants in Python are:
None
True1
False1
NotImplemented
Ellipsis
__debug__
classes (for example int is int or int is float)
there could be additional constants in built-in modules or 3rd party modules. For example np.ma.masked from the NumPy module)
In every other case you should use == to check for equality.
Can I customize the behavior?
There is some aspect to == that hasn't been mentioned already in the other answers: It's part of Pythons "Data model". That means its behavior can be customized using the __eq__ method. For example:
class MyClass(object):
def __init__(self, val):
self._value = val
def __eq__(self, other):
print('__eq__ method called')
try:
return self._value == other._value
except AttributeError:
raise TypeError('Cannot compare {0} to objects of type {1}'
.format(type(self), type(other)))
This is just an artificial example to illustrate that the method is really called:
>>> MyClass(10) == MyClass(10)
__eq__ method called
True
Note that by default (if no other implementation of __eq__ can be found in the class or the superclasses) __eq__ uses is:
class AClass(object):
def __init__(self, value):
self._value = value
>>> a = AClass(10)
>>> b = AClass(10)
>>> a == b
False
>>> a == a
So it's actually important to implement __eq__ if you want "more" than just reference-comparison for custom classes!
On the other hand you cannot customize is checks. It will always compare just if you have the same reference.
Will these comparisons always return a boolean?
Because __eq__ can be re-implemented or overridden, it's not limited to return True or False. It could return anything (but in most cases it should return a boolean!).
For example with NumPy arrays the == will return an array:
>>> import numpy as np
>>> np.arange(10) == 2
array([False, False, True, False, False, False, False, False, False, False], dtype=bool)
But is checks will always return True or False!
1 As Aaron Hall mentioned in the comments:
Generally you shouldn't do any is True or is False checks because one normally uses these "checks" in a context that implicitly converts the condition to a boolean (for example in an if statement). So doing the is True comparison and the implicit boolean cast is doing more work than just doing the boolean cast - and you limit yourself to booleans (which isn't considered pythonic).
Like PEP8 mentions:
Don't compare boolean values to True or False using ==.
Yes: if greeting:
No: if greeting == True:
Worse: if greeting is True:

They are completely different. is checks for object identity, while == checks for equality (a notion that depends on the two operands' types).
It is only a lucky coincidence that "is" seems to work correctly with small integers (e.g. 5 == 4+1). That is because CPython optimizes the storage of integers in the range (-5 to 256) by making them singletons. This behavior is totally implementation-dependent and not guaranteed to be preserved under all manner of minor transformative operations.
For example, Python 3.5 also makes short strings singletons, but slicing them disrupts this behavior:
>>> "foo" + "bar" == "foobar"
True
>>> "foo" + "bar" is "foobar"
True
>>> "foo"[:] + "bar" == "foobar"
True
>>> "foo"[:] + "bar" is "foobar"
False

https://docs.python.org/library/stdtypes.html#comparisons
is tests for identity
== tests for equality
Each (small) integer value is mapped to a single value, so every 3 is identical and equal. This is an implementation detail, not part of the language spec though

Your answer is correct. The is operator compares the identity of two objects. The == operator compares the values of two objects.
An object's identity never changes once it has been created; you may think of it as the object's address in memory.
You can control comparison behaviour of object values by defining a __cmp__ method or a rich comparison method like __eq__.

Have a look at Stack Overflow question Python's “is” operator behaves unexpectedly with integers.
What it mostly boils down to is that "is" checks to see if they are the same object, not just equal to each other (the numbers below 256 are a special case).

In a nutshell, is checks whether two references point to the same object or not.== checks whether two objects have the same value or not.
a=[1,2,3]
b=a #a and b point to the same object
c=list(a) #c points to different object
if a==b:
print('#') #output:#
if a is b:
print('##') #output:##
if a==c:
print('###') #output:##
if a is c:
print('####') #no output as c and a point to different object

As the other people in this post answer the question in details the difference between == and is for comparing Objects or variables, I would emphasize mainly the comparison between is and == for strings which can give different results and I would urge programmers to carefully use them.
For string comparison, make sure to use == instead of is:
str = 'hello'
if (str is 'hello'):
print ('str is hello')
if (str == 'hello'):
print ('str == hello')
Out:
str is hello
str == hello
But in the below example == and is will get different results:
str2 = 'hello sam'
if (str2 is 'hello sam'):
print ('str2 is hello sam')
if (str2 == 'hello sam'):
print ('str2 == hello sam')
Out:
str2 == hello sam
Conclusion and Analysis:
Use is carefully to compare between strings.
Since is for comparing objects and since in Python 3+ every variable such as string interpret as an object, let's see what happened in above paragraphs.
In python there is id function that shows a unique constant of an object during its lifetime. This id is using in back-end of Python interpreter to compare two objects using is keyword.
str = 'hello'
id('hello')
> 140039832615152
id(str)
> 140039832615152
But
str2 = 'hello sam'
id('hello sam')
> 140039832615536
id(str2)
> 140039832615792

As John Feminella said, most of the time you will use == and != because your objective is to compare values. I'd just like to categorise what you would do the rest of the time:
There is one and only one instance of NoneType i.e. None is a singleton. Consequently foo == None and foo is None mean the same. However the is test is faster and the Pythonic convention is to use foo is None.
If you are doing some introspection or mucking about with garbage collection or checking whether your custom-built string interning gadget is working or suchlike, then you probably have a use-case for foo is bar.
True and False are also (now) singletons, but there is no use-case for foo == True and no use case for foo is True.

Most of them already answered to the point. Just as an additional note (based on my understanding and experimenting but not from a documented source), the statement
== if the objects referred to by the variables are equal
from above answers should be read as
== if the objects referred to by the variables are equal and objects belonging to the same type/class
. I arrived at this conclusion based on the below test:
list1 = [1,2,3,4]
tuple1 = (1,2,3,4)
print(list1)
print(tuple1)
print(id(list1))
print(id(tuple1))
print(list1 == tuple1)
print(list1 is tuple1)
Here the contents of the list and tuple are same but the type/class are different.

Does `==` for struct check recursively in Julia ? It seems not

There is a behavior I don't understand when checking for equality in Julia concerning "struct" objects.
The documentation states : : "For collections, == is generally called recursively on all contents, though other properties (like the shape for arrays) may also be taken into account". Though it seems for structs it is cast to === or something.
Here is a minimal working exemple :
As expected :
string1 = String("S")
string2 = String("S")
string1 == string2
=> returns true
and :
set1 = Set(["S"])
set2 = Set(["S"])
set1 == set2
=> returns true
BUT ! And that is what I don't understand :
struct StringStruct
f::String
end
stringstruct1 = StringStruct("S")
stringstruct2 = StringStruct("S")
stringstruct1 == stringstruct2
=> returns true
However :
struct SetStruct
f::Set{String}
end
setstruct1 = SetStruct(Set(["S"]))
setstruct2 = SetStruct(Set(["S"]))
setstruct1 == setstruct2
=> returns false
To me, it looks like ===is tested on the elements of the struct.
So my question is : what is the real behavior of == when tested on structs ? Does it cast == or === ? In case it casts == as the documentation states, what is the point I misunderstand ?

For structs by default == falls back to ===, so e.g.:
setstruct1 == setstruct2
is the same as
setstruct1 === setstruct2
So now we go down to the way how === works. And it is defined as:
Determine whether x and y are identical, in the sense that no program could distinguish them.
(I am leaving out the rest of the definition, as I believe this fist sentence builds a nice mental model).
Now clearly stringstruct1 and stringstruct2 are non-distinguishable. They are immutable and contain strings that are immutable in Julia. In particular they have the same hash value (which is not a definitive test, but is a nice mental model here).
julia> hash(stringstruct1), hash(stringstruct2)
(0x9e0bef39ad32ce56, 0x9e0bef39ad32ce56)
Now setstrict1 and setstruct2 are distinguishable. They store different sets, although at the moment of comparison these sets contain the same elements, but they have different memory locations (so in the future they can be different - in short - they are distinguishable). Note that these structs have different hashes in particular:
julia> hash(setstruct1), hash(setstruct2)
(0xe7d0f90913646f29, 0x3b31ce0af9245c64)
Now notice the following:
julia> s = Set(["S"])
Set{String} with 1 element:
"S"
julia> ss1 = SetStruct(s)
SetStruct(Set(["S"]))
julia> ss2 = SetStruct(s)
SetStruct(Set(["S"]))
julia> ss1 == ss2
true
julia> ss1 === ss2
true
julia> hash(ss1), hash(ss2)
(0x9127f7b72f753361, 0x9127f7b72f753361)
This time ss1 and ss2 are passing all the tests, as again they are indistinguishable (if you change ss1 then ss2 changes in sync as they hold the same Set).

While Bogumil's answer explains what is happening let me show how to bring the behavior that you are expecting to your code.
Simply add the following lines.
abstract type Comparable end
import Base.==
==(a::T, b::T) where T <: Comparable =
getfield.(Ref(a),fieldnames(T)) == getfield.(Ref(b),fieldnames(T))
Now you can make your struct to be comparable as you wish:
struct SetStruct <: Comparable
f::Set{String}
end
setstruct1 = SetStruct(Set(["S"]))
setstruct2 = SetStruct(Set(["S"]))
Tesing:
julia> setstruct1 == setstruct2
true

Variable fieldnames in Julia mutable structs

I would like to able to get fields from a mutable struct in Julia using a variable.
e.g.
mutable struct myType
my_field1::Int = 1
my_field2::Int = 2
my_field3::Int = 3
end
And then let's imagine you declare a particular instance of this struct using struct_instance = myType()
How can you extract the value of a field from an instance of this mutable struct in a variable fashion?
Let's say you want to assign the value of my_struct.field[X] to a variable, using a for-loop, so that the particular field you're currently accessing depends on the variable X:
foo = zeros(Int64, 3)
for X = 1:3
foo(X) = struct_instance.field[X]
end
I don't know how to actually implement the above for-loop -- what I wrote above is just pseudo-code above. In MATLAB you would use the following notation, for instance:
foo = zeros(1,3)
for x = 1:3
foo(x) = struct_instance.(sprintf('field%d',x))
end
Thanks in advance.

For the code at the beginning of your example to work you need the Paramaters package, otherwise you are not able to have default values (the code from your example throws an error). I use it very often exactly in situations where I need a struct to use to represent a bunch of variables.
using Parameters
#with_kw mutable struct MyType
my_field1::Int = 1
my_field2::Int = 2
my_field3::Int = 3
end
This generates also a set of keyword methods that can be used to set the fields programmatically when creating the object. Have a look at the following code:
julia> vals = [Symbol("my_field$i") =>10i for i in 2:3 ]
2-element Array{Pair{Symbol,Int64},1}:
:my_field2 => 20
:my_field3 => 30
julia> MyType(;vals...)
MyType
my_field1: Int64 1
my_field2: Int64 20
my_field3: Int64 30
We have created here a set of field names and now we are using it when creating an object. This approach is particularly useful when you could consider using immutable object instead of mutable ones (immutable objects are always much faster).
You can mutate the object using setfield!:
julia> for i in 1:2
setfield!(m,Symbol("my_field$i"), 20i)
end
julia> m
MyType
my_field1: Int64 20
my_field2: Int64 40
my_field3: Int64 30
I think that when you are coming from Matlab this would be the most convenient way to layout your structs.

The function to get fields from a struct is fieldnames:
julia> mutable struct A
x
y
z
end
julia> fieldnames(A)
(:x, :y, :z)
To set those field values* programmatically, you can use setproperty!:
julia> a = A(1,1,1)
A(1,1,1)
julia> a.x
1
julia> setproperty!(a, :x, 2)
2
julia> a.x
2
With a for loop,
julia> for i in fieldnames(A)
setproperty!(a, i, 3)
end
julia> a
A(3,3,3)

Python operating on big numbers causes margin errors [duplicate]

This question's answers are a community effort. Edit existing answers to improve this post. It is not currently accepting new answers or interactions.
My Google-fu has failed me.
In Python, are the following two tests for equality equivalent?
n = 5
# Test one.
if n == 5:
print 'Yay!'
# Test two.
if n is 5:
print 'Yay!'
Does this hold true for objects where you would be comparing instances (a list say)?
Okay, so this kind of answers my question:
L = []
L.append(1)
if L == [1]:
print 'Yay!'
# Holds true, but...
if L is [1]:
print 'Yay!'
# Doesn't.
So == tests value where is tests to see if they are the same object?

is will return True if two variables point to the same object (in memory), == if the objects referred to by the variables are equal.
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
>>> b == a
True
# Make a new copy of list `a` via the slice operator,
# and assign it to variable `b`
>>> b = a[:]
>>> b is a
False
>>> b == a
True
In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:
>>> 1000 is 10**3
False
>>> 1000 == 10**3
True
The same holds true for string literals:
>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> "aa" is intern(x*2)
True
Please see this question as well.

There is a simple rule of thumb to tell you when to use == or is.
== is for value equality. Use it when you would like to know if two objects have the same value.
is is for reference equality. Use it when you would like to know if two references refer to the same object.
In general, when you are comparing something to a simple type, you are usually checking for value equality, so you should use ==. For example, the intention of your example is probably to check whether x has a value equal to 2 (==), not whether x is literally referring to the same object as 2.
Something else to note: because of the way the CPython reference implementation works, you'll get unexpected and inconsistent results if you mistakenly use is to compare for reference equality on integers:
>>> a = 500
>>> b = 500
>>> a == b
True
>>> a is b
False
That's pretty much what we expected: a and b have the same value, but are distinct entities. But what about this?
>>> c = 200
>>> d = 200
>>> c == d
True
>>> c is d
True
This is inconsistent with the earlier result. What's going on here? It turns out the reference implementation of Python caches integer objects in the range -5..256 as singleton instances for performance reasons. Here's an example demonstrating this:
>>> for i in range(250, 260): a = i; print "%i: %s" % (i, a is int(str(i)));
...
250: True
251: True
252: True
253: True
254: True
255: True
256: True
257: False
258: False
259: False
This is another obvious reason not to use is: the behavior is left up to implementations when you're erroneously using it for value equality.

Is there a difference between == and is in Python?
Yes, they have a very important difference.
==: check for equality - the semantics are that equivalent objects (that aren't necessarily the same object) will test as equal. As the documentation says:
The operators <, >, ==, >=, <=, and != compare the values of two objects.
is: check for identity - the semantics are that the object (as held in memory) is the object. Again, the documentation says:
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. Object identity is
determined using the id() function. x is not y yields the inverse
truth value.
Thus, the check for identity is the same as checking for the equality of the IDs of the objects. That is,
a is b
is the same as:
id(a) == id(b)
where id is the builtin function that returns an integer that "is guaranteed to be unique among simultaneously existing objects" (see help(id)) and where a and b are any arbitrary objects.
Other Usage Directions
You should use these comparisons for their semantics. Use is to check identity and == to check equality.
So in general, we use is to check for identity. This is usually useful when we are checking for an object that should only exist once in memory, referred to as a "singleton" in the documentation.
Use cases for is include:
None
enum values (when using Enums from the enum module)
usually modules
usually class objects resulting from class definitions
usually function objects resulting from function definitions
anything else that should only exist once in memory (all singletons, generally)
a specific object that you want by identity
Usual use cases for == include:
numbers, including integers
strings
lists
sets
dictionaries
custom mutable objects
other builtin immutable objects, in most cases
The general use case, again, for ==, is the object you want may not be the same object, instead it may be an equivalent one
PEP 8 directions
PEP 8, the official Python style guide for the standard library also mentions two use-cases for is:
Comparisons to singletons like None should always be done with is or
is not, never the equality operators.
Also, beware of writing if x when you really mean if x is not None --
e.g. when testing whether a variable or argument that defaults to None
was set to some other value. The other value might have a type (such
as a container) that could be false in a boolean context!
Inferring equality from identity
If is is true, equality can usually be inferred - logically, if an object is itself, then it should test as equivalent to itself.
In most cases this logic is true, but it relies on the implementation of the __eq__ special method. As the docs say,
The default behavior for equality comparison (== and !=) is based on
the identity of the objects. Hence, equality comparison of instances
with the same identity results in equality, and equality comparison of
instances with different identities results in inequality. A
motivation for this default behavior is the desire that all objects
should be reflexive (i.e. x is y implies x == y).
and in the interests of consistency, recommends:
Equality comparison should be reflexive. In other words, identical
objects should compare equal:
x is y implies x == y
We can see that this is the default behavior for custom objects:
>>> class Object(object): pass
>>> obj = Object()
>>> obj2 = Object()
>>> obj == obj, obj is obj
(True, True)
>>> obj == obj2, obj is obj2
(False, False)
The contrapositive is also usually true - if somethings test as not equal, you can usually infer that they are not the same object.
Since tests for equality can be customized, this inference does not always hold true for all types.
An exception
A notable exception is nan - it always tests as not equal to itself:
>>> nan = float('nan')
>>> nan
nan
>>> nan is nan
True
>>> nan == nan # !!!!!
False
Checking for identity can be much a much quicker check than checking for equality (which might require recursively checking members).
But it cannot be substituted for equality where you may find more than one object as equivalent.
Note that comparing equality of lists and tuples will assume that identity of objects are equal (because this is a fast check). This can create contradictions if the logic is inconsistent - as it is for nan:
>>> [nan] == [nan]
True
>>> (nan,) == (nan,)
True
A Cautionary Tale:
The question is attempting to use is to compare integers. You shouldn't assume that an instance of an integer is the same instance as one obtained by another reference. This story explains why.
A commenter had code that relied on the fact that small integers (-5 to 256 inclusive) are singletons in Python, instead of checking for equality.
Wow, this can lead to some insidious bugs. I had some code that checked if a is b, which worked as I wanted because a and b are typically small numbers. The bug only happened today, after six months in production, because a and b were finally large enough to not be cached. – gwg
It worked in development. It may have passed some unittests.
And it worked in production - until the code checked for an integer larger than 256, at which point it failed in production.
This is a production failure that could have been caught in code review or possibly with a style-checker.
Let me emphasize: do not use is to compare integers.

== determines if the values are equal, while is determines if they are the exact same object.

What's the difference between is and ==?
== and is are different comparison! As others already said:
== compares the values of the objects.
is compares the references of the objects.
In Python names refer to objects, for example in this case value1 and value2 refer to an int instance storing the value 1000:
value1 = 1000
value2 = value1
Because value2 refers to the same object is and == will give True:
>>> value1 == value2
True
>>> value1 is value2
True
In the following example the names value1 and value2 refer to different int instances, even if both store the same integer:
>>> value1 = 1000
>>> value2 = 1000
Because the same value (integer) is stored == will be True, that's why it's often called "value comparison". However is will return False because these are different objects:
>>> value1 == value2
True
>>> value1 is value2
False
When to use which?
Generally is is a much faster comparison. That's why CPython caches (or maybe reuses would be the better term) certain objects like small integers, some strings, etc. But this should be treated as implementation detail that could (even if unlikely) change at any point without warning.
You should only use is if you:
want to check if two objects are really the same object (not just the same "value"). One example can be if you use a singleton object as constant.
want to compare a value to a Python constant. The constants in Python are:
None
True1
False1
NotImplemented
Ellipsis
__debug__
classes (for example int is int or int is float)
there could be additional constants in built-in modules or 3rd party modules. For example np.ma.masked from the NumPy module)
In every other case you should use == to check for equality.
Can I customize the behavior?
There is some aspect to == that hasn't been mentioned already in the other answers: It's part of Pythons "Data model". That means its behavior can be customized using the __eq__ method. For example:
class MyClass(object):
def __init__(self, val):
self._value = val
def __eq__(self, other):
print('__eq__ method called')
try:
return self._value == other._value
except AttributeError:
raise TypeError('Cannot compare {0} to objects of type {1}'
.format(type(self), type(other)))
This is just an artificial example to illustrate that the method is really called:
>>> MyClass(10) == MyClass(10)
__eq__ method called
True
Note that by default (if no other implementation of __eq__ can be found in the class or the superclasses) __eq__ uses is:
class AClass(object):
def __init__(self, value):
self._value = value
>>> a = AClass(10)
>>> b = AClass(10)
>>> a == b
False
>>> a == a
So it's actually important to implement __eq__ if you want "more" than just reference-comparison for custom classes!
On the other hand you cannot customize is checks. It will always compare just if you have the same reference.
Will these comparisons always return a boolean?
Because __eq__ can be re-implemented or overridden, it's not limited to return True or False. It could return anything (but in most cases it should return a boolean!).
For example with NumPy arrays the == will return an array:
>>> import numpy as np
>>> np.arange(10) == 2
array([False, False, True, False, False, False, False, False, False, False], dtype=bool)
But is checks will always return True or False!
1 As Aaron Hall mentioned in the comments:
Generally you shouldn't do any is True or is False checks because one normally uses these "checks" in a context that implicitly converts the condition to a boolean (for example in an if statement). So doing the is True comparison and the implicit boolean cast is doing more work than just doing the boolean cast - and you limit yourself to booleans (which isn't considered pythonic).
Like PEP8 mentions:
Don't compare boolean values to True or False using ==.
Yes: if greeting:
No: if greeting == True:
Worse: if greeting is True:

They are completely different. is checks for object identity, while == checks for equality (a notion that depends on the two operands' types).
It is only a lucky coincidence that "is" seems to work correctly with small integers (e.g. 5 == 4+1). That is because CPython optimizes the storage of integers in the range (-5 to 256) by making them singletons. This behavior is totally implementation-dependent and not guaranteed to be preserved under all manner of minor transformative operations.
For example, Python 3.5 also makes short strings singletons, but slicing them disrupts this behavior:
>>> "foo" + "bar" == "foobar"
True
>>> "foo" + "bar" is "foobar"
True
>>> "foo"[:] + "bar" == "foobar"
True
>>> "foo"[:] + "bar" is "foobar"
False

https://docs.python.org/library/stdtypes.html#comparisons
is tests for identity
== tests for equality
Each (small) integer value is mapped to a single value, so every 3 is identical and equal. This is an implementation detail, not part of the language spec though

Your answer is correct. The is operator compares the identity of two objects. The == operator compares the values of two objects.
An object's identity never changes once it has been created; you may think of it as the object's address in memory.
You can control comparison behaviour of object values by defining a __cmp__ method or a rich comparison method like __eq__.

Have a look at Stack Overflow question Python's “is” operator behaves unexpectedly with integers.
What it mostly boils down to is that "is" checks to see if they are the same object, not just equal to each other (the numbers below 256 are a special case).

In a nutshell, is checks whether two references point to the same object or not.== checks whether two objects have the same value or not.
a=[1,2,3]
b=a #a and b point to the same object
c=list(a) #c points to different object
if a==b:
print('#') #output:#
if a is b:
print('##') #output:##
if a==c:
print('###') #output:##
if a is c:
print('####') #no output as c and a point to different object

As the other people in this post answer the question in details the difference between == and is for comparing Objects or variables, I would emphasize mainly the comparison between is and == for strings which can give different results and I would urge programmers to carefully use them.
For string comparison, make sure to use == instead of is:
str = 'hello'
if (str is 'hello'):
print ('str is hello')
if (str == 'hello'):
print ('str == hello')
Out:
str is hello
str == hello
But in the below example == and is will get different results:
str2 = 'hello sam'
if (str2 is 'hello sam'):
print ('str2 is hello sam')
if (str2 == 'hello sam'):
print ('str2 == hello sam')
Out:
str2 == hello sam
Conclusion and Analysis:
Use is carefully to compare between strings.
Since is for comparing objects and since in Python 3+ every variable such as string interpret as an object, let's see what happened in above paragraphs.
In python there is id function that shows a unique constant of an object during its lifetime. This id is using in back-end of Python interpreter to compare two objects using is keyword.
str = 'hello'
id('hello')
> 140039832615152
id(str)
> 140039832615152
But
str2 = 'hello sam'
id('hello sam')
> 140039832615536
id(str2)
> 140039832615792

As John Feminella said, most of the time you will use == and != because your objective is to compare values. I'd just like to categorise what you would do the rest of the time:
There is one and only one instance of NoneType i.e. None is a singleton. Consequently foo == None and foo is None mean the same. However the is test is faster and the Pythonic convention is to use foo is None.
If you are doing some introspection or mucking about with garbage collection or checking whether your custom-built string interning gadget is working or suchlike, then you probably have a use-case for foo is bar.
True and False are also (now) singletons, but there is no use-case for foo == True and no use case for foo is True.

Most of them already answered to the point. Just as an additional note (based on my understanding and experimenting but not from a documented source), the statement
== if the objects referred to by the variables are equal
from above answers should be read as
== if the objects referred to by the variables are equal and objects belonging to the same type/class
. I arrived at this conclusion based on the below test:
list1 = [1,2,3,4]
tuple1 = (1,2,3,4)
print(list1)
print(tuple1)
print(id(list1))
print(id(tuple1))
print(list1 == tuple1)
print(list1 is tuple1)
Here the contents of the list and tuple are same but the type/class are different.

Python, perplexity about "is" operator on integers [duplicate]

Why does the following behave unexpectedly in Python?
>>> a = 256
>>> b = 256
>>> a is b
True # This is an expected result
>>> a = 257
>>> b = 257
>>> a is b
False # What happened here? Why is this False?
>>> 257 is 257
True # Yet the literal numbers compare properly
I am using Python 2.5.2. Trying some different versions of Python, it appears that Python 2.3.3 shows the above behaviour between 99 and 100.
Based on the above, I can hypothesize that Python is internally implemented such that "small" integers are stored in a different way than larger integers and the is operator can tell the difference. Why the leaky abstraction? What is a better way of comparing two arbitrary objects to see whether they are the same when I don't know in advance whether they are numbers or not?

Take a look at this:
>>> a = 256
>>> b = 256
>>> id(a) == id(b)
True
>>> a = 257
>>> b = 257
>>> id(a) == id(b)
False
Here's what I found in the documentation for "Plain Integer Objects":
The current implementation keeps an array of integer objects for all integers between -5 and 256. When you create an int in that range you actually just get back a reference to the existing object.
So, integers 256 are identical, but 257 are not. This is a CPython implementation detail, and not guaranteed for other Python implementations.

Python's “is” operator behaves unexpectedly with integers?
In summary - let me emphasize: Do not use is to compare integers.
This isn't behavior you should have any expectations about.
Instead, use == and != to compare for equality and inequality, respectively. For example:
>>> a = 1000
>>> a == 1000 # Test integers like this,
True
>>> a != 5000 # or this!
True
>>> a is 1000 # Don't do this! - Don't use `is` to test integers!!
False
Explanation
To know this, you need to know the following.
First, what does is do? It is a comparison operator. From the documentation:
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. x is not y yields the
inverse truth value.
And so the following are equivalent.
>>> a is b
>>> id(a) == id(b)
From the documentation:
id
Return the “identity” of an object. This is an integer (or long
integer) which is guaranteed to be unique and constant for this object
during its lifetime. Two objects with non-overlapping lifetimes may
have the same id() value.
Note that the fact that the id of an object in CPython (the reference implementation of Python) is the location in memory is an implementation detail. Other implementations of Python (such as Jython or IronPython) could easily have a different implementation for id.
So what is the use-case for is? PEP8 describes:
Comparisons to singletons like None should always be done with is or
is not, never the equality operators.
The Question
You ask, and state, the following question (with code):
Why does the following behave unexpectedly in Python?
>>> a = 256
>>> b = 256
>>> a is b
True # This is an expected result
It is not an expected result. Why is it expected? It only means that the integers valued at 256 referenced by both a and b are the same instance of integer. Integers are immutable in Python, thus they cannot change. This should have no impact on any code. It should not be expected. It is merely an implementation detail.
But perhaps we should be glad that there is not a new separate instance in memory every time we state a value equals 256.
>>> a = 257
>>> b = 257
>>> a is b
False # What happened here? Why is this False?
Looks like we now have two separate instances of integers with the value of 257 in memory. Since integers are immutable, this wastes memory. Let's hope we're not wasting a lot of it. We're probably not. But this behavior is not guaranteed.
>>> 257 is 257
True # Yet the literal numbers compare properly
Well, this looks like your particular implementation of Python is trying to be smart and not creating redundantly valued integers in memory unless it has to. You seem to indicate you are using the referent implementation of Python, which is CPython. Good for CPython.
It might be even better if CPython could do this globally, if it could do so cheaply (as there would a cost in the lookup), perhaps another implementation might.
But as for impact on code, you should not care if an integer is a particular instance of an integer. You should only care what the value of that instance is, and you would use the normal comparison operators for that, i.e. ==.
What is does
is checks that the id of two objects are the same. In CPython, the id is the location in memory, but it could be some other uniquely identifying number in another implementation. To restate this with code:
>>> a is b
is the same as
>>> id(a) == id(b)
Why would we want to use is then?
This can be a very fast check relative to say, checking if two very long strings are equal in value. But since it applies to the uniqueness of the object, we thus have limited use-cases for it. In fact, we mostly want to use it to check for None, which is a singleton (a sole instance existing in one place in memory). We might create other singletons if there is potential to conflate them, which we might check with is, but these are relatively rare. Here's an example (will work in Python 2 and 3) e.g.
SENTINEL_SINGLETON = object() # this will only be created one time.
def foo(keyword_argument=None):
if keyword_argument is None:
print('no argument given to foo')
bar()
bar(keyword_argument)
bar('baz')
def bar(keyword_argument=SENTINEL_SINGLETON):
# SENTINEL_SINGLETON tells us if we were not passed anything
# as None is a legitimate potential argument we could get.
if keyword_argument is SENTINEL_SINGLETON:
print('no argument given to bar')
else:
print('argument to bar: {0}'.format(keyword_argument))
foo()
Which prints:
no argument given to foo
no argument given to bar
argument to bar: None
argument to bar: baz
And so we see, with is and a sentinel, we are able to differentiate between when bar is called with no arguments and when it is called with None. These are the primary use-cases for is - do not use it to test for equality of integers, strings, tuples, or other things like these.

I'm late but, you want some source with your answer? I'll try and word this in an introductory manner so more folks can follow along.
A good thing about CPython is that you can actually see the source for this. I'm going to use links for the 3.5 release, but finding the corresponding 2.x ones is trivial.
In CPython, the C-API function that handles creating a new int object is PyLong_FromLong(long v). The description for this function is:
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object. So it should be possible to change the value of 1. I suspect the behaviour of Python in this case is undefined. :-)
(My italics)
Don't know about you but I see this and think: Let's find that array!
If you haven't fiddled with the C code implementing CPython you should; everything is pretty organized and readable. For our case, we need to look in the Objects subdirectory of the main source code directory tree.
PyLong_FromLong deals with long objects so it shouldn't be hard to deduce that we need to peek inside longobject.c. After looking inside you might think things are chaotic; they are, but fear not, the function we're looking for is chilling at line 230 waiting for us to check it out. It's a smallish function so the main body (excluding declarations) is easily pasted here:
PyObject *
PyLong_FromLong(long ival)
{
// omitting declarations
CHECK_SMALL_INT(ival);
if (ival < 0) {
/* negate: cant write this as abs_ival = -ival since that
invokes undefined behaviour when ival is LONG_MIN */
abs_ival = 0U-(unsigned long)ival;
sign = -1;
}
else {
abs_ival = (unsigned long)ival;
}
/* Fast path for single-digit ints */
if (!(abs_ival >> PyLong_SHIFT)) {
v = _PyLong_New(1);
if (v) {
Py_SIZE(v) = sign;
v->ob_digit[0] = Py_SAFE_DOWNCAST(
abs_ival, unsigned long, digit);
}
return (PyObject*)v;
}
Now, we're no C master-code-haxxorz but we're also not dumb, we can see that CHECK_SMALL_INT(ival); peeking at us all seductively; we can understand it has something to do with this. Let's check it out:
#define CHECK_SMALL_INT(ival) \
do if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) { \
return get_small_int((sdigit)ival); \
} while(0)
So it's a macro that calls function get_small_int if the value ival satisfies the condition:
if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS)
So what are NSMALLNEGINTS and NSMALLPOSINTS? Macros! Here they are:
#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS 257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS 5
#endif
So our condition is if (-5 <= ival && ival < 257) call get_small_int.
Next let's look at get_small_int in all its glory (well, we'll just look at its body because that's where the interesting things are):
PyObject *v;
assert(-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS);
v = (PyObject *)&small_ints[ival + NSMALLNEGINTS];
Py_INCREF(v);
Okay, declare a PyObject, assert that the previous condition holds and execute the assignment:
v = (PyObject *)&small_ints[ival + NSMALLNEGINTS];
small_ints looks a lot like that array we've been searching for, and it is! We could've just read the damn documentation and we would've know all along!:
/* Small integers are preallocated in this array so that they
can be shared.
The integers that are preallocated are those in the range
-NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
So yup, this is our guy. When you want to create a new int in the range [NSMALLNEGINTS, NSMALLPOSINTS) you'll just get back a reference to an already existing object that has been preallocated.
Since the reference refers to the same object, issuing id() directly or checking for identity with is on it will return exactly the same thing.
But, when are they allocated??
During initialization in _PyLong_Init Python will gladly enter in a for loop to do this for you:
for (ival = -NSMALLNEGINTS; ival < NSMALLPOSINTS; ival++, v++) {
Check out the source to read the loop body!
I hope my explanation has made you C things clearly now (pun obviously intented).
But, 257 is 257? What's up?
This is actually easier to explain, and I have attempted to do so already; it's due to the fact that Python will execute this interactive statement as a single block:
>>> 257 is 257
During complilation of this statement, CPython will see that you have two matching literals and will use the same PyLongObject representing 257. You can see this if you do the compilation yourself and examine its contents:
>>> codeObj = compile("257 is 257", "blah!", "exec")
>>> codeObj.co_consts
(257, None)
When CPython does the operation, it's now just going to load the exact same object:
>>> import dis
>>> dis.dis(codeObj)
1 0 LOAD_CONST 0 (257) # dis
3 LOAD_CONST 0 (257) # dis again
6 COMPARE_OP 8 (is)
So is will return True.

It depends on whether you're looking to see if 2 things are equal, or the same object.
is checks to see if they are the same object, not just equal. The small ints are probably pointing to the same memory location for space efficiency
In [29]: a = 3
In [30]: b = 3
In [31]: id(a)
Out[31]: 500729144
In [32]: id(b)
Out[32]: 500729144
You should use == to compare equality of arbitrary objects. You can specify the behavior with the __eq__, and __ne__ attributes.

As you can check in source file intobject.c, Python caches small integers for efficiency. Every time you create a reference to a small integer, you are referring the cached small integer, not a new object. 257 is not an small integer, so it is calculated as a different object.
It is better to use == for that purpose.

I think your hypotheses is correct. Experiment with id (identity of object):
In [1]: id(255)
Out[1]: 146349024
In [2]: id(255)
Out[2]: 146349024
In [3]: id(257)
Out[3]: 146802752
In [4]: id(257)
Out[4]: 148993740
In [5]: a=255
In [6]: b=255
In [7]: c=257
In [8]: d=257
In [9]: id(a), id(b), id(c), id(d)
Out[9]: (146349024, 146349024, 146783024, 146804020)
It appears that numbers <= 255 are treated as literals and anything above is treated differently!

There's another issue that isn't pointed out in any of the existing answers. Python is allowed to merge any two immutable values, and pre-created small int values are not the only way this can happen. A Python implementation is never guaranteed to do this, but they all do it for more than just small ints.
For one thing, there are some other pre-created values, such as the empty tuple, str, and bytes, and some short strings (in CPython 3.6, it's the 256 single-character Latin-1 strings). For example:
>>> a = ()
>>> b = ()
>>> a is b
True
But also, even non-pre-created values can be identical. Consider these examples:
>>> c = 257
>>> d = 257
>>> c is d
False
>>> e, f = 258, 258
>>> e is f
True
And this isn't limited to int values:
>>> g, h = 42.23e100, 42.23e100
>>> g is h
True
Obviously, CPython doesn't come with a pre-created float value for 42.23e100. So, what's going on here?
The CPython compiler will merge constant values of some known-immutable types like int, float, str, bytes, in the same compilation unit. For a module, the whole module is a compilation unit, but at the interactive interpreter, each statement is a separate compilation unit. Since c and d are defined in separate statements, their values aren't merged. Since e and f are defined in the same statement, their values are merged.
You can see what's going on by disassembling the bytecode. Try defining a function that does e, f = 128, 128 and then calling dis.dis on it, and you'll see that there's a single constant value (128, 128)
>>> def f(): i, j = 258, 258
>>> dis.dis(f)
1 0 LOAD_CONST 2 ((128, 128))
2 UNPACK_SEQUENCE 2
4 STORE_FAST 0 (i)
6 STORE_FAST 1 (j)
8 LOAD_CONST 0 (None)
10 RETURN_VALUE
>>> f.__code__.co_consts
(None, 128, (128, 128))
>>> id(f.__code__.co_consts[1], f.__code__.co_consts[2][0], f.__code__.co_consts[2][1])
4305296480, 4305296480, 4305296480
You may notice that the compiler has stored 128 as a constant even though it's not actually used by the bytecode, which gives you an idea of how little optimization CPython's compiler does. Which means that (non-empty) tuples actually don't end up merged:
>>> k, l = (1, 2), (1, 2)
>>> k is l
False
Put that in a function, dis it, and look at the co_consts—there's a 1 and a 2, two (1, 2) tuples that share the same 1 and 2 but are not identical, and a ((1, 2), (1, 2)) tuple that has the two distinct equal tuples.
There's one more optimization that CPython does: string interning. Unlike compiler constant folding, this isn't restricted to source code literals:
>>> m = 'abc'
>>> n = 'abc'
>>> m is n
True
On the other hand, it is limited to the str type, and to strings of internal storage kind "ascii compact", "compact", or "legacy ready", and in many cases only "ascii compact" will get interned.
At any rate, the rules for what values must be, might be, or cannot be distinct vary from implementation to implementation, and between versions of the same implementation, and maybe even between runs of the same code on the same copy of the same implementation.
It can be worth learning the rules for one specific Python for the fun of it. But it's not worth relying on them in your code. The only safe rule is:
Do not write code that assumes two equal but separately-created immutable values are identical (don't use x is y, use x == y)
Do not write code that assumes two equal but separately-created immutable values are distinct (don't use x is not y, use x != y)
Or, in other words, only use is to test for the documented singletons (like None) or that are only created in one place in the code (like the _sentinel = object() idiom).

For immutable value objects, like ints, strings or datetimes, object identity is not especially useful. It's better to think about equality. Identity is essentially an implementation detail for value objects - since they're immutable, there's no effective difference between having multiple refs to the same object or multiple objects.

is is the identity equality operator (functioning like id(a) == id(b)); it's just that two equal numbers aren't necessarily the same object. For performance reasons some small integers happen to be memoized so they will tend to be the same (this can be done since they are immutable).
PHP's === operator, on the other hand, is described as checking equality and type: x == y and type(x) == type(y) as per Paulo Freitas' comment. This will suffice for common numbers, but differ from is for classes that define __eq__ in an absurd manner:
class Unequal:
def __eq__(self, other):
return False
PHP apparently allows the same thing for "built-in" classes (which I take to mean implemented at C level, not in PHP). A slightly less absurd use might be a timer object, which has a different value every time it's used as a number. Quite why you'd want to emulate Visual Basic's Now instead of showing that it is an evaluation with time.time() I don't know.
Greg Hewgill (OP) made one clarifying comment "My goal is to compare object identity, rather than equality of value. Except for numbers, where I want to treat object identity the same as equality of value."
This would have yet another answer, as we have to categorize things as numbers or not, to select whether we compare with == or is. CPython defines the number protocol, including PyNumber_Check, but this is not accessible from Python itself.
We could try to use isinstance with all the number types we know of, but this would inevitably be incomplete. The types module contains a StringTypes list but no NumberTypes. Since Python 2.6, the built in number classes have a base class numbers.Number, but it has the same problem:
import numpy, numbers
assert not issubclass(numpy.int16,numbers.Number)
assert issubclass(int,numbers.Number)
By the way, NumPy will produce separate instances of low numbers.
I don't actually know an answer to this variant of the question. I suppose one could theoretically use ctypes to call PyNumber_Check, but even that function has been debated, and it's certainly not portable. We'll just have to be less particular about what we test for now.
In the end, this issue stems from Python not originally having a type tree with predicates like Scheme's number?, or Haskell's type class Num. is checks object identity, not value equality. PHP has a colorful history as well, where === apparently behaves as is only on objects in PHP5, but not PHP4. Such are the growing pains of moving across languages (including versions of one).

It also happens with strings:
>>> s = b = 'somestr'
>>> s == b, s is b, id(s), id(b)
(True, True, 4555519392, 4555519392)
Now everything seems fine.
>>> s = 'somestr'
>>> b = 'somestr'
>>> s == b, s is b, id(s), id(b)
(True, True, 4555519392, 4555519392)
That's expected too.
>>> s1 = b1 = 'somestrdaasd ad ad asd as dasddsg,dlfg ,;dflg, dfg a'
>>> s1 == b1, s1 is b1, id(s1), id(b1)
(True, True, 4555308080, 4555308080)
>>> s1 = 'somestrdaasd ad ad asd as dasddsg,dlfg ,;dflg, dfg a'
>>> b1 = 'somestrdaasd ad ad asd as dasddsg,dlfg ,;dflg, dfg a'
>>> s1 == b1, s1 is b1, id(s1), id(b1)
(True, False, 4555308176, 4555308272)
Now that's unexpected.

What’s New In Python 3.8: Changes in Python behavior:
The compiler now produces a SyntaxWarning when identity checks (is and
is not) are used with certain types of literals (e.g. strings, ints).
These can often work by accident in CPython, but are not guaranteed by
the language spec. The warning advises users to use equality tests (==
and !=) instead.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Struct equality with arrays - struct

Related

Why there is difference between 'is' and '==' with various results in python [duplicate]

Does `==` for struct check recursively in Julia ? It seems not

Variable fieldnames in Julia mutable structs

Python operating on big numbers causes margin errors [duplicate]

Python, perplexity about "is" operator on integers [duplicate]

Categories

Resources