Can cast pandas Series to `int64` but not to `Int64` - python-3.x

I am stuck with a weird type conversion issue.
I have a pandas DataFrame pp with a column Value. All values are integer-valued but stored as float. It is possible to convert them all to int64, but I get an error when attempting to convert them to nullable Int64. Help!
Some details: pp['Value'] is of dtype float and may contain NaN. For my specific case, all values are integer values (see my debugging attempts below). I want them to become Int64. This has worked for months with thousands of series, but now suddenly one series fails, and I cannot find any non-integer value that would explain this.
What I am trying: convert the Series below to Int64:
print(f"pp['Value']:\n{pp['Value']}")
pp['Value']:
0 3500000.0
1 600000.0
2 400000.0
3 8300000.0
4 5700000.0
5 4400000.0
6 3000000.0
7 2700000.0
8 2000000.0
9 800000.0
10 300000.0
11 300000.0
12 5300000.0
13 2500000.0
14 11000000.0
15 1000000.0
16 18000000.0
17 6250000.0
18 5000000.0
19 4400000.0
20 4200000.0
21 2000000.0
22 1750000.0
23 900000.0
24 4000000.0
25 800000.0
26 9250000.0
27 5200000.0
28 600000.0
29 5700000.0
30 13500000.0
31 10000000.0
32 3300000.0
33 3200000.0
34 2000000.0
35 750000.0
Name: Value, dtype: float64
For some reason, pp['Value'].astype('Int64') raises the error TypeError: cannot safely cast non-equivalent float64 to int64
I tried two alternative approaches, which both work.
A: Convert the series to 'int64' - works like a charm (the numbers really all are integers):
pp['Value'] = pp['Value'].astype('int64')
print(f"pp['Value']:\n{pp['Value']}")
pp['Value']:
0 3500000
1 600000
2 400000
3 8300000
4 5700000
5 4400000
6 3000000
7 2700000
8 2000000
9 800000
10 300000
11 300000
12 5300000
13 2500000
14 11000000
15 1000000
16 18000000
17 6250000
18 5000000
19 4400000
20 4200000
21 2000000
22 1750000
23 900000
24 4000000
25 800000
26 9250000
27 5200000
28 600000
29 5700000
30 13500000
31 10000000
32 3300000
33 3200000
34 2000000
35 750000
Name: Value, dtype: int64
B: Converted each element individually and checked whether any single value has some weird floating-point arithmetic issue. That is not the case either; all values can be converted:
for idx, row in pp.iterrows():
    print(f"{idx}: value = {row['Value']}, residual vs. int: {row['Value']%row['Value']}, int value: {int(row['Value'])}")
0: value = 3500000.0, residual vs. int: 0.0, int value: 3500000
1: value = 600000.0, residual vs. int: 0.0, int value: 600000
2: value = 400000.0, residual vs. int: 0.0, int value: 400000
3: value = 8300000.000000001, residual vs. int: 0.0, int value: 8300000
4: value = 5700000.0, residual vs. int: 0.0, int value: 5700000
5: value = 4400000.0, residual vs. int: 0.0, int value: 4400000
6: value = 3000000.0, residual vs. int: 0.0, int value: 3000000
7: value = 2700000.0, residual vs. int: 0.0, int value: 2700000
8: value = 2000000.0, residual vs. int: 0.0, int value: 2000000
9: value = 800000.0, residual vs. int: 0.0, int value: 800000
10: value = 300000.0, residual vs. int: 0.0, int value: 300000
11: value = 300000.0, residual vs. int: 0.0, int value: 300000
12: value = 5300000.0, residual vs. int: 0.0, int value: 5300000
13: value = 2500000.0, residual vs. int: 0.0, int value: 2500000
14: value = 11000000.0, residual vs. int: 0.0, int value: 11000000
15: value = 1000000.0, residual vs. int: 0.0, int value: 1000000
16: value = 18000000.0, residual vs. int: 0.0, int value: 18000000
17: value = 6250000.0, residual vs. int: 0.0, int value: 6250000
18: value = 5000000.0, residual vs. int: 0.0, int value: 5000000
19: value = 4400000.0, residual vs. int: 0.0, int value: 4400000
20: value = 4200000.0, residual vs. int: 0.0, int value: 4200000
21: value = 2000000.0, residual vs. int: 0.0, int value: 2000000
22: value = 1750000.0, residual vs. int: 0.0, int value: 1750000
23: value = 900000.0, residual vs. int: 0.0, int value: 900000
24: value = 4000000.0, residual vs. int: 0.0, int value: 4000000
25: value = 800000.0, residual vs. int: 0.0, int value: 800000
26: value = 9250000.0, residual vs. int: 0.0, int value: 9250000
27: value = 5200000.0, residual vs. int: 0.0, int value: 5200000
28: value = 600000.0, residual vs. int: 0.0, int value: 600000
29: value = 5700000.0, residual vs. int: 0.0, int value: 5700000
30: value = 13500000.0, residual vs. int: 0.0, int value: 13500000
31: value = 10000000.0, residual vs. int: 0.0, int value: 10000000
32: value = 3300000.0, residual vs. int: 0.0, int value: 3300000
33: value = 3200000.0, residual vs. int: 0.0, int value: 3200000
34: value = 2000000.0, residual vs. int: 0.0, int value: 2000000
35: value = 750000.0, residual vs. int: 0.0, int value: 750000
I am lost... All the values are int. I can convert all values to int. I can convert the whole Series to int64. But when converting to Int64, I get an error. Why? What is wrong here?
Edit note:
pp['Value'] = pp['Value'].round().astype('Int64')
solves the problem. But I would love to understand why. As you can see above, the set is guaranteed to only contain integers; each value is an 'int' down to machine accuracy. Why on earth would the 'non-safe conversion' error be raised?

As Jason suggested in his comment, your edit solves the problem because rounding changes 8300000.000000001 (row 3 in your per-element output) to 8300000.0.
This is important as it means that after the type conversion the two values are still equal, and so they meet the "safe" casting rule for numpy conversions. When converting to 'Int64', pandas uses the numpy.ndarray.astype function and applies this rule. A plain astype('int64') does not perform this equality check and simply truncates the fractional part, which is why approach A succeeds. The details on "safe" casting can be found in the NumPy documentation.
As far as I am aware, there is no way to request that pandas use the numpy function with a different type of casting, so rounding the values first is the solution to your problem.
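The per-element check in the question cannot reveal the offending value, because x % x is zero for any finite non-zero x; taking the remainder modulo 1 exposes the fractional part instead. A minimal sketch in plain Python, using a few of the values printed in the question (the helper name non_integral is made up for illustration):

```python
import math

# A few values from the question; index 2 carries a tiny fractional part.
values = [3500000.0, 600000.0, 8300000.000000001, float("nan")]


def non_integral(values):
    """Return (index, value) pairs that are neither NaN nor whole numbers."""
    return [
        (i, v)
        for i, v in enumerate(values)
        if not math.isnan(v) and not v.is_integer()
    ]


print(non_integral(values))      # lists only the entry at index 2
print(values[2] % values[2])     # 0.0: x % x hides the problem
print(values[2] % 1 != 0)        # True: x % 1 reveals the fractional part
print(float(round(values[2])))   # 8300000.0 after rounding
```

On the Series itself, the analogous check would presumably be something like pp['Value'] % 1 != 0, with NaN entries filtered out first.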

Related

Pytorch training: after each layer, how can I make updates to the output and cast the updated output to next layer? I want to keep different bits

I am doing node classification using the Cora dataset in PyTorch. The model consists of 2 GCN layers, and I want to keep a different precision for the output after each layer. Specifically, after each layer, I convert the output (a float32 tensor) into its binary representation (32 bits). Then I keep only a few of the 32 bits. Then I convert the binary representation back to float32 and feed it to the next layer.
I ran into an in-place operation error and wonder how to solve it.
Error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [2708, 7, 1]], which is output 0 of PowBackward1, is at version 2; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
def forward(self, data):
    torch.autograd.set_detect_anomaly(True)
    x, edge_index = data.x, data.edge_index
    x = self.conv1(x, edge_index)
    bit_x = float2bit(x)
    float_x = bit2float(bit_x)
    x = torch.sigmoid(float_x)
    bit_x = float2bit(x)
    bit_x[super_node,:,2:] = 0
    float_x = bit2float(bit_x)
    x = self.conv2(float_x, edge_index)
    bit_x = float2bit(x)
    float_x = bit2float(bit_x)
    return F.log_softmax(float_x, dim=1)
def bit2float(b, num_e_bits=8, num_m_bits=23, bias=127.):
    #b = bit.clone().detach()
    """Turn input tensor into float.
    Args:
        b : binary tensor. The last dimension of this tensor should be
            the one the binary is at.
        num_e_bits : Number of exponent bits. Default: 8.
        num_m_bits : Number of mantissa bits. Default: 23.
        bias : Exponent bias/ zero offset. Default: 127.
    Returns:
        Tensor: Float tensor. Reduces last dimension.
    """
    expected_last_dim = num_m_bits + num_e_bits + 1
    assert b.shape[-1] == expected_last_dim, "Binary tensors last dimension " \
        "should be {}, not {}.".format(
            expected_last_dim, b.shape[-1])
    # check if we got the right type
    dtype = torch.float32
    if expected_last_dim > 32: dtype = torch.float64
    if expected_last_dim > 64:
        warnings.warn("pytorch can not process floats larger than 64 bits, keep"
                      " this in mind. Your result will be not exact.")
    s = torch.index_select(b, -1, torch.arange(0, 1))
    e = torch.index_select(b, -1, torch.arange(1, 1 + num_e_bits))
    m = torch.index_select(b, -1, torch.arange(1 + num_e_bits,
                                               1 + num_e_bits + num_m_bits))
    # SIGN BIT
    out = ((-1) ** s).squeeze(-1).type(dtype)
    # EXPONENT BIT
    exponents = -torch.arange(-(num_e_bits - 1.), 1.)
    exponents = exponents.repeat(b.shape[:-1] + (1,))
    e_decimal = torch.sum(e * 2 ** exponents, dim=-1) - bias
    out *= 2 ** e_decimal
    # MANTISSA
    matissa = (torch.Tensor([2.]) ** (
        -torch.arange(1., num_m_bits + 1.))).repeat(
        m.shape[:-1] + (1,))
    out *= 1. + torch.sum(m * matissa, dim=-1)
    return out
def float2bit(f, num_e_bits=8, num_m_bits=23, bias=127., dtype=torch.float32):
    #f = float.clone().detach()
    """Turn input tensor into binary.
    Args:
        f : float tensor.
        num_e_bits : Number of exponent bits. Default: 8.
        num_m_bits : Number of mantissa bits. Default: 23.
        bias : Exponent bias/ zero offset. Default: 127.
        dtype : This is the actual type of the tensor that is going to be
            returned. Default: torch.float32.
    Returns:
        Tensor: Binary tensor. Adds last dimension to original tensor for
        bits.
    """
    ## SIGN BIT
    s = torch.sign(f)
    f = f * s
    # turn sign into sign-bit
    s = (s * (-1) + 1.) * 0.5
    s = s.unsqueeze(-1)
    ## EXPONENT BIT
    e_scientific = torch.floor(torch.log2(f))
    e_decimal = e_scientific + bias
    e = integer2bit(e_decimal, num_bits=num_e_bits)
    ## MANTISSA
    m1 = integer2bit(f - f % 1, num_bits=num_e_bits)
    m2 = remainder2bit(f % 1, num_bits=bias)
    m = torch.cat([m1, m2], dim=-1)
    dtype = f.type()
    idx = torch.arange(num_m_bits).unsqueeze(0).type(dtype) \
        + (8. - e_scientific).unsqueeze(-1)
    idx = idx.long()
    m = torch.gather(m, dim=-1, index=idx)
    return torch.cat([s, e, m], dim=-1).type(dtype)

How to safely round-and-clamp from float64 to int64?

This question is about python/numpy, but it may apply to other languages as well.
How can the following code be improved to safely clamp large float values to the
maximum int64 value during conversion? (Ideally, it should still be efficient.)
import numpy as np
def int64_from_clipped_float64(x, dtype=np.int64):
    x = np.round(x)
    x = np.clip(x, np.iinfo(dtype).min, np.iinfo(dtype).max)
    # The problem is that np.iinfo(dtype).max is imprecisely approximated as a
    # float64, and the approximation leads to overflow in the conversion.
    return x.astype(dtype)
for x in [-3.6, 0.4, 1.7, 1e18, 1e25]:
    x = np.array(x, dtype=np.float64)
    print(f'x = {x:<10} result = {int64_from_clipped_float64(x)}')
# x = -3.6 result = -4
# x = 0.4 result = 0
# x = 1.7 result = 2
# x = 1e+18 result = 1000000000000000000
# x = 1e+25 result = -9223372036854775808
The problem is that the largest np.int64 is 2^63 - 1, which is not representable in floating point. The same issue doesn't happen on the other end, because -2^63 is exactly representable.
So do the clipping half in float space (for detection) and half in integer space (for correction):
def int64_from_clipped_float64(x, dtype=np.int64):
    assert x.dtype == np.float64
    limits = np.iinfo(dtype)
    too_small = x <= np.float64(limits.min)
    too_large = x >= np.float64(limits.max)
    ix = x.astype(dtype)
    ix[too_small] = limits.min
    ix[too_large] = limits.max
    return ix
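The representability claim can be checked without NumPy. The sketch below uses plain Python floats (IEEE 754 doubles on common platforms) and a hypothetical scalar helper, clamp_to_int64, that follows the same idea of detecting in float space and correcting in integer space; it is an illustration under those assumptions, not the answer's code.

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)

# 2**63 - 1 needs 63 significant bits, so converting it to a double rounds up to 2**63.
print(float(INT64_MAX) == 2.0**63)          # True
# -2**63 is a power of two and survives the round trip unchanged.
print(int(float(INT64_MIN)) == INT64_MIN)   # True


def clamp_to_int64(x):
    """Round a float to the nearest integer, saturating at the int64 limits."""
    if x != x:  # NaN is the only value that is not equal to itself
        raise ValueError("cannot convert NaN to an integer")
    if x >= float(INT64_MAX):   # compare in float space ...
        return INT64_MAX        # ... but return the exact integer limit
    if x <= float(INT64_MIN):
        return INT64_MIN
    return round(x)


for x in [-3.6, 0.4, 1.7, 1e18, 1e25]:
    print(x, clamp_to_int64(x))
```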
Here is a generalization of the answer by orlp to safely clip-convert from arbitrary floats to arbitrary integers, and to support scalar values as input. The function is also useful for the conversion of np.float32 to np.int32 because it avoids the creation of intermediate np.float64 values, as seen in the timing measurements.
def int_from_float(x, dtype=np.int64):
    x = np.asarray(x)
    assert issubclass(x.dtype.type, np.floating)
    input_is_scalar = x.ndim == 0
    x = np.atleast_1d(x)
    imin, imax = np.iinfo(dtype).min, np.iinfo(dtype).max
    fmin, fmax = x.dtype.type((imin, imax))
    too_small = x <= fmin
    too_large = x >= fmax
    ix = x.astype(dtype)
    ix[too_small] = imin
    ix[too_large] = imax
    return ix.item() if input_is_scalar else ix
print(int_from_float(np.float32(3e9), dtype=np.int32)) # 2147483647
print(int_from_float(np.float32(5e9), dtype=np.uint32)) # 4294967295
print(int_from_float(np.float64(1e25), dtype=np.int64)) # 9223372036854775807
a = np.linspace(0, 5e9, 1_000_000, dtype=np.float32).reshape(1000, 1000)
%timeit int_from_float(np.round(a), dtype=np.int32)
# 100 loops, best of 3: 3.74 ms per loop
%timeit np.clip(np.round(a), np.iinfo(np.int32).min, np.iinfo(np.int32).max).astype(np.int32)
# 100 loops, best of 3: 5.56 ms per loop

xirr: TypeError: 'float' object is not callable

I have a dataframe with columns Date, Cash, Name, KEY. I am trying to compute the XIRR for each KEY group, but when I run my code I get the error "TypeError: 'float' object is not callable".
f1['l'] = list(zip(f1["Date"], f1["Cash"]))
def xirr(transactions):
    years = [(ta[0] - transactions[0][0]).days / 365.0 for ta in transactions]
    residual = 1
    step = 0.05
    guess = 0.05
    epsilon = 0.0001
    limit = 10000
    while abs(residual) > epsilon and limit > 0:
        limit -= 1
        residual = 0.0
        for i, ta in enumerate(transactions):
            residual += ta[1] / pow(guess, years[i])
        if abs(residual) > epsilon:
            if residual > 0:
                guess += step
            else:
                guess -= step
                step /= 2.0
    return guess-1
print(xirr(f1['l'])) #till here it runs
f2 = f1.groupby('KEY').apply(xirr(f1['l'])) # this line is giving error

Utilizing Quadratic Equation to Output Roots or Message saying Undefinable

Finding quadratic roots:
import math
def main():
    print "Hello! This program finds the real solutions to a quadratic"
    print
    a, b, c = input("Please enter the coefficients (a, b, c): ")
    d = (b**2) - (4*a*c) # finding the discriminant
    if d < 0:
        d = -d
    else:
        print "This quadratic equation does not have imaginary roots"
        return
    dRoot = math.sqrt(d)
    root1r = (-b) / (2 * a)
    root1i = dRoot / (2 * a)
    root2r = root1r
    root2i = -root1i
    print "%s+%si , %s+%si" % (root1r, root1i, root2r, root2i)
    print
main()
Sample input:
a, b, c
0.0, 0.0, 0.0
0.0, 0.0, 1.0
0.0, 2.0, 4.0
1.0, 2.0, 1.0
1.0, -5.0, 6.0
1.0, 2.0, 3.0
I need help writing a program that uses the quadratic formula to find the root(s), or that outputs a message saying the roots cannot be found or are undefined, using the a, b, c values above as examples. The code above is what I have so far.
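One way to cover the sample inputs is to branch on a and on the sign of the discriminant before taking the square root. The following is only a sketch in Python 3: the function name real_roots and the choice to raise ValueError for the undefined cases are assumptions made for illustration, not part of the original program.

```python
import math


def real_roots(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 as a sorted list.

    Raises ValueError when the roots are undefined or not real.
    """
    if a == 0:
        if b == 0:
            # 0*x + c = 0 has either no solution or every x as a solution.
            raise ValueError("roots are undefined: not an equation in x")
        return [-c / b]              # linear case: b*x + c = 0
    d = b * b - 4 * a * c            # discriminant
    if d < 0:
        raise ValueError("no real roots: the discriminant is negative")
    if d == 0:
        return [-b / (2 * a)]        # one repeated root
    root = math.sqrt(d)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


samples = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 2.0, 4.0),
           (1.0, 2.0, 1.0), (1.0, -5.0, 6.0), (1.0, 2.0, 3.0)]
for coeffs in samples:
    try:
        print(coeffs, "->", real_roots(*coeffs))
    except ValueError as exc:
        print(coeffs, "->", exc)
```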

Python-3.x range() with step in float format [duplicate]

How do I iterate between 0 and 1 by a step of 0.1?
This says that the step argument cannot be zero:
for i in range(0, 1, 0.1):
    print(i)
Rather than using a decimal step directly, it's much safer to express this in terms of how many points you want. Otherwise, floating-point rounding error is likely to give you a wrong result.
Use the linspace function from the NumPy library (which isn't part of the standard library but is relatively easy to obtain). linspace takes a number of points to return, and also lets you specify whether or not to include the right endpoint:
>>> np.linspace(0,1,11)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
>>> np.linspace(0,1,10,endpoint=False)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
If you really want to use a floating-point step value, use numpy.arange:
>>> import numpy as np
>>> np.arange(0.0, 1.0, 0.1)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
Floating-point rounding error will cause problems, though. Here's a simple case where rounding error causes arange to produce a length-4 array when it should only produce 3 numbers:
>>> np.arange(1, 1.3, 0.1)
array([1. , 1.1, 1.2, 1.3])
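The extra element follows from how the length is derived. NumPy documents the length of arange as ceil((stop - start) / step), and for these arguments the quotient lands just above 3 in binary floating point. A small check in plain Python (no NumPy needed):

```python
import math

start, stop, step = 1, 1.3, 0.1

# Neither 1.3 nor 0.1 is exact in binary, so the quotient is slightly more than 3.
quotient = (stop - start) / step
print(quotient)
print(math.ceil(quotient))   # 4, which matches the length-4 array

# Counting points instead of stepping avoids the ambiguity:
# ask for exactly three values and compute each one from its index.
points = [start + i * step for i in range(3)]
print(points)
```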
range() can only do integers, not floating point.
Use a list comprehension instead to obtain a list of steps:
[x * 0.1 for x in range(0, 10)]
More generally, a generator expression minimizes memory allocations:
xs = (x * 0.1 for x in range(0, 10))
for x in xs:
    print(x)
Building on 'xrange([start], stop[, step])', you can define a generator that accepts and produces any type you choose (stick to types supporting + and <):
>>> def drange(start, stop, step):
...     r = start
...     while r < stop:
...         yield r
...         r += step
...
>>> i0=drange(0.0, 1.0, 0.1)
>>> ["%g" % x for x in i0]
['0', '0.1', '0.2', '0.3', '0.4', '0.5', '0.6', '0.7', '0.8', '0.9', '1']
>>>
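The trailing '1' in that output is accumulated rounding error at work: after ten additions of 0.1 the running value is still slightly below 1.0, so the loop yields once more than intended. A sketch contrasting the accumulating generator with an index-based variant (drange_indexed is a hypothetical name, and it assumes a positive step):

```python
def drange(start, stop, step):
    """Accumulating version from the answer: rounding error adds up."""
    r = start
    while r < stop:
        yield r
        r += step


def drange_indexed(start, stop, step):
    """Compute each value from its index so errors do not accumulate."""
    i = 0
    while True:
        r = start + i * step
        if r >= stop:
            return
        yield r
        i += 1


print(len(list(drange(0.0, 1.0, 0.1))))          # 11 values
print(len(list(drange_indexed(0.0, 1.0, 0.1))))  # 10 values
```

The index-based form still rounds each individual product, so single values such as 3 * 0.1 are not exact, but the error no longer grows with the number of steps.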
Increase the magnitude of i for the loop and then reduce it when you need it.
for i * 100 in range(0, 100, 10):
    print i / 100.0
EDIT: I honestly cannot remember why I thought that would work syntactically
for i in range(0, 11, 1):
    print i / 10.0
That should have the desired output.
NumPy is a bit overkill, I think.
[p/10 for p in range(0, 10)]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
Generally speaking, to do a step-by-1/x up to y you would do
x=100
y=2
[p/x for p in range(0, int(x*y))]
[0.0, 0.01, 0.02, 0.03, ..., 1.97, 1.98, 1.99]
(1/x produced less rounding noise when I tested).
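SciPy has historically re-exported NumPy's arange function (the alias is deprecated in newer SciPy releases, so importing it from numpy is the safer choice), which generalizes Python's range() constructor to satisfy your requirement of float handling.
from scipy import arange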
Similar to R's seq function, this one returns a sequence in any order given the correct step value. The last value is equal to the stop value.
def seq(start, stop, step=1):
    n = int(round((stop - start)/float(step)))
    if n > 1:
        return([start + step*i for i in range(n+1)])
    elif n == 1:
        return([start])
    else:
        return([])
Results
seq(1, 5, 0.5)
[1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
seq(10, 0, -1)
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
seq(10, 0, -2)
[10, 8, 6, 4, 2, 0]
seq(1, 1)
[ 1 ]
The range() built-in function returns a sequence of integer values, I'm afraid, so you can't use it to do a decimal step.
I'd say just use a while loop:
i = 0.0
while i <= 1.0:
    print i
    i += 0.1
If you're curious, Python is converting your 0.1 to 0, which is why it's telling you the argument can't be zero.
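That explanation describes the behaviour reported in the question. In Python 3, range() does not truncate a float step; it rejects float arguments outright with a TypeError, which a short check confirms:

```python
# In Python 3, range() refuses float arguments instead of truncating them.
try:
    range(0, 1, 0.1)
except TypeError as exc:
    message = str(exc)
else:
    message = None

print(message)
```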
Here's a solution using itertools:
import itertools
def seq(start, end, step):
    if step == 0:
        raise ValueError("step must not be 0")
    sample_count = int(abs(end - start) / step)
    return itertools.islice(itertools.count(start, step), sample_count)
Usage Example:
for i in seq(0, 1, 0.1):
    print(i)
[x * 0.1 for x in range(0, 10)]
in Python 2.7x gives you the result of:
[0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9]
but if you use:
[ round(x * 0.1, 1) for x in range(0, 10)]
gives you the desired:
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
import numpy as np
for i in np.arange(0, 1, 0.1):
    print i
Best Solution: no rounding error
>>> step = .1
>>> N = 10 # number of data points
>>> [ x / pow(step, -1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Or, for a set range instead of set data points (e.g. continuous function), use:
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ x / pow(step,-1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
To implement a function: replace x / pow(step, -1) with f( x / pow(step, -1) ), and define f.
For example:
>>> import math
>>> def f(x):
...     return math.sin(x)
...
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ f( x / pow(step,-1) ) for x in range(0, N + 1) ]
[0.0, 0.09983341664682815, 0.19866933079506122, 0.29552020666133955, 0.3894183423086505,
0.479425538604203, 0.5646424733950354, 0.644217687237691, 0.7173560908995228,
0.7833269096274834, 0.8414709848078965]
And if you do this often, you might want to save the generated list r
r=map(lambda x: x/10.0,range(0,10))
for i in r:
    print i
more_itertools is a third-party library that implements a numeric_range tool:
import more_itertools as mit
for x in mit.numeric_range(0, 1, 0.1):
    print("{:.1f}".format(x))
Output
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
This tool also works for Decimal and Fraction.
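The standard library offers the same exactness without a third-party package. A minimal sketch, assuming the step is supplied as a Fraction (the generator name fraction_range is made up for illustration and handles positive steps only):

```python
from fractions import Fraction


def fraction_range(start, stop, step):
    """Yield exact Fractions from start (inclusive) to stop (exclusive)."""
    current, stop, step = Fraction(start), Fraction(stop), Fraction(step)
    if step <= 0:
        raise ValueError("step must be positive")
    while current < stop:
        yield current
        current += step   # exact rational arithmetic, no drift


# Convert to float only at the very end, one value at a time.
values = [float(x) for x in fraction_range(0, 1, Fraction(1, 10))]
print(values)
```

Because the arithmetic is exact, the number of yielded values does not depend on rounding; each float is produced by a single conversion at the end.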
My versions use the original range function to create multiplicative indices for the shift. This allows same syntax to the original range function.
I have made two versions, one using float, and one using Decimal, because I found that in some cases I wanted to avoid the roundoff drift introduced by the floating point arithmetic.
It is consistent with empty set results as in range/xrange.
Passing only a single numeric value to either function will return the standard range output to the integer ceiling value of the input parameter (so if you gave it 5.5, it would return range(6).)
Edit: the code below is now available as package on pypi: Franges
## frange.py
from math import ceil
# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range
def frange(start, stop = None, step = 1):
    """frange generates a set of floating point values over the
    range [start, stop) with step size step
    frange([start,] stop [, step ])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # create a generator expression for the index values
        indices = (i for i in _xrange(0, int((stop-start)/step)))
        # yield results
        for i in indices:
            yield start + step*i
## drange.py
import decimal
from math import ceil
# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range
def drange(start, stop = None, step = 1, precision = None):
    """drange generates a set of Decimal values over the
    range [start, stop) with step size step
    drange([start,] stop, [step [,precision]])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # find precision
        if precision is not None:
            decimal.getcontext().prec = precision
        # convert values to decimals
        start = decimal.Decimal(start)
        stop = decimal.Decimal(stop)
        step = decimal.Decimal(step)
        # create a generator expression for the index values
        # (range needs a plain int, so convert the Decimal count explicitly)
        indices = (
            i for i in _xrange(
                0,
                int(((stop-start)/step).to_integral_value())
            )
        )
        # yield results
        for i in indices:
            yield float(start + step*i)
## testranges.py
import frange
import drange
list(frange.frange(0, 2, 0.5)) # [0.0, 0.5, 1.0, 1.5]
list(drange.drange(0, 2, 0.5, precision = 6)) # [0.0, 0.5, 1.0, 1.5]
list(frange.frange(3)) # [0, 1, 2]
list(frange.frange(3.5)) # [0, 1, 2, 3]
list(frange.frange(0,10, -1)) # []
Lots of the solutions here still had floating point errors in Python 3.6 and didn't do exactly what I personally needed.
The function below takes integers or floats, doesn't require imports and doesn't return floating point errors.
def frange(x, y, step):
    if int(x + y + step) == (x + y + step):
        r = list(range(int(x), int(y), int(step)))
    else:
        f = 10 ** (len(str(step)) - str(step).find('.') - 1)
        rf = list(range(int(x * f), int(y * f), int(step * f)))
        r = [i / f for i in rf]
    return r
Surprised no-one has yet mentioned the recommended solution in the Python 3 docs:
See also:
The linspace recipe shows how to implement a lazy version of range suitable for floating point applications.
Once defined, the recipe is easy to use and does not require numpy or any other external libraries, but functions like numpy.linspace(). Note that rather than a step argument, the third num argument specifies the number of desired values, for example:
print(linspace(0, 10, 5))
# linspace(0, 10, 5)
print(list(linspace(0, 10, 5)))
# [0.0, 2.5, 5.0, 7.5, 10]
I quote a modified version of the full Python 3 recipe from Andrew Barnert below:
import collections.abc
import numbers
class linspace(collections.abc.Sequence):
    """linspace(start, stop, num) -> linspace object
    Return a virtual sequence of num numbers from start to stop (inclusive).
    If you need a half-open range, use linspace(start, stop, num+1)[:-1].
    """
    def __init__(self, start, stop, num):
        if not isinstance(num, numbers.Integral) or num <= 1:
            raise ValueError('num must be an integer > 1')
        self.start, self.stop, self.num = start, stop, num
        self.step = (stop-start)/(num-1)
    def __len__(self):
        return self.num
    def __getitem__(self, i):
        if isinstance(i, slice):
            return [self[x] for x in range(*i.indices(len(self)))]
        if i < 0:
            i = self.num + i
        if i >= self.num:
            raise IndexError('linspace object index out of range')
        if i == self.num-1:
            return self.stop
        return self.start + i*self.step
    def __repr__(self):
        return '{}({}, {}, {})'.format(type(self).__name__,
                                       self.start, self.stop, self.num)
    def __eq__(self, other):
        if not isinstance(other, linspace):
            return False
        return ((self.start, self.stop, self.num) ==
                (other.start, other.stop, other.num))
    def __ne__(self, other):
        return not self==other
    def __hash__(self):
        return hash((type(self), self.start, self.stop, self.num))
This is my solution to get ranges with float steps.
Using this function it's not necessary to import numpy, nor install it.
I'm pretty sure that it could be improved and optimized. Feel free to do it and post it here.
from __future__ import division
from math import log
def xfrange(start, stop, step):
    old_start = start #backup this value
    digits = int(round(log(10000, 10)))+1 #get number of digits
    magnitude = 10**digits
    stop = int(magnitude * stop) #convert from
    step = int(magnitude * step) #0.1 to 10 (e.g.)
    if start == 0:
        start = 10**(digits-1)
    else:
        start = 10**(digits)*start
    data = [] #create array
    #calc number of iterations
    end_loop = int((stop-start)//step)
    if old_start == 0:
        end_loop += 1
    acc = start
    for i in xrange(0, end_loop):
        data.append(acc/magnitude)
        acc += step
    return data
print xfrange(1, 2.1, 0.1)
print xfrange(0, 1.1, 0.1)
print xfrange(-1, 0.1, 0.1)
The output is:
[1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
[-1.0, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0]
For completeness of boutique, a functional solution:
def frange(a,b,s):
    return [] if s > 0 and a > b or s < 0 and a < b or s==0 else [a]+frange(a+s,b,s)
You can use this function:
def frange(start,end,step):
    return map(lambda x: x*step, range(int(start*1./step),int(end*1./step)))
It can be done using Numpy library. arange() function allows steps in float. But, it returns a numpy array which can be converted to list using tolist() for our convenience.
for i in np.arange(0, 1, 0.1).tolist():
    print i
start and stop are inclusive rather than one or the other (usually stop is excluded) and without imports, and using generators
def rangef(start, stop, step, fround=5):
    """
    Yields sequence of numbers from start (inclusive) to stop (inclusive)
    by step (increment) with rounding set to n digits.
    :param start: start of sequence
    :param stop: end of sequence
    :param step: int or float increment (e.g. 1 or 0.001)
    :param fround: float rounding, n decimal places
    :return:
    """
    try:
        i = 0
        while stop >= start and step > 0:
            if i==0:
                yield start
            elif start >= stop:
                yield stop
            elif start < stop:
                if start == 0:
                    yield 0
                if start != 0:
                    yield start
            i += 1
            start += step
            start = round(start, fround)
        else:
            pass
    except TypeError as e:
        yield "type-error({})".format(e)
    else:
        pass
# passing
print(list(rangef(-100.0,10.0,1)))
print(list(rangef(-100,0,0.5)))
print(list(rangef(-1,1,0.2)))
print(list(rangef(-1,1,0.1)))
print(list(rangef(-1,1,0.05)))
print(list(rangef(-1,1,0.02)))
print(list(rangef(-1,1,0.01)))
print(list(rangef(-1,1,0.005)))
# failing: type-error:
print(list(rangef("1","10","1")))
print(list(rangef(1,10,"1")))
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64
bit (AMD64)]
I know I'm late to the party here, but here's a trivial generator solution that's working in 3.6:
def floatRange(*args):
    start, step = 0, 1
    if len(args) == 1:
        stop = args[0]
    elif len(args) == 2:
        start, stop = args[0], args[1]
    elif len(args) == 3:
        start, stop, step = args[0], args[1], args[2]
    else:
        raise TypeError("floatRange accepts 1, 2, or 3 arguments. ({0} given)".format(len(args)))
    for num in start, step, stop:
        if not isinstance(num, (int, float)):
            raise TypeError("floatRange only accepts float and integer arguments. ({0} : {1} given)".format(type(num), str(num)))
    for x in range(int((stop-start)/step)):
        yield start + (x * step)
    return
then you can call it just like the original range()... there's no error handling, but let me know if there is an error that can be reasonably caught, and I'll update. or you can update it. this is StackOverflow.
To counter the float precision issues, you could use the Decimal module.
This demands an extra effort of converting to Decimal from int or float while writing the code, but you can instead pass str and modify the function if that sort of convenience is indeed necessary.
from decimal import Decimal
def decimal_range(*args):
    zero, one = Decimal('0'), Decimal('1')
    if len(args) == 1:
        start, stop, step = zero, args[0], one
    elif len(args) == 2:
        start, stop, step = args + (one,)
    elif len(args) == 3:
        start, stop, step = args
    else:
        raise ValueError('Expected 1 to 3 arguments, got %s' % len(args))
    if not all([type(arg) == Decimal for arg in (start, stop, step)]):
        raise ValueError('Arguments must be passed as <type: Decimal>')
    # neglect bad cases: end the generator without yielding anything
    if (start == stop) or (start > stop and step >= zero) or \
            (start < stop and step <= zero):
        return
    current = start
    while abs(current) < abs(stop):
        yield current
        current += step
Sample outputs -
from decimal import Decimal as D
list(decimal_range(D('2')))
# [Decimal('0'), Decimal('1')]
list(decimal_range(D('2'), D('4.5')))
# [Decimal('2'), Decimal('3'), Decimal('4')]
list(decimal_range(D('2'), D('4.5'), D('0.5')))
# [Decimal('2'), Decimal('2.5'), Decimal('3.0'), Decimal('3.5'), Decimal('4.0')]
list(decimal_range(D('2'), D('4.5'), D('-0.5')))
# []
list(decimal_range(D('2'), D('-4.5'), D('-0.5')))
# [Decimal('2'),
# Decimal('1.5'),
# Decimal('1.0'),
# Decimal('0.5'),
# Decimal('0.0'),
# Decimal('-0.5'),
# Decimal('-1.0'),
# Decimal('-1.5'),
# Decimal('-2.0'),
# Decimal('-2.5'),
# Decimal('-3.0'),
# Decimal('-3.5'),
# Decimal('-4.0')]
Add auto-correction for the possibility of an incorrect sign on step:
def frange(start,step,stop):
    step *= 2*((stop>start)^(step<0))-1
    return [start+i*step for i in range(int((stop-start)/step))]
My solution:
def seq(start, stop, step=1, digit=0):
    x = float(start)
    v = []
    while x <= stop:
        v.append(round(x,digit))
        x += step
    return v
Here is my solution which works fine with float_range(-1, 0, 0.01) and works without floating point representation errors. It is not very fast, but works fine:
from decimal import Decimal
def get_multiplier(_from, _to, step):
    digits = []
    for number in [_from, _to, step]:
        pre = Decimal(str(number)) % 1
        digit = len(str(pre)) - 2
        digits.append(digit)
    max_digits = max(digits)
    return float(10 ** (max_digits))
def float_range(_from, _to, step, include=False):
    """Generates a range list of floating point values over the Range [start, stop]
    with step size step
    include=True - allows to include right value to if possible
    !! Works fine with floating point representation !!
    """
    mult = get_multiplier(_from, _to, step)
    # print mult
    int_from = int(round(_from * mult))
    int_to = int(round(_to * mult))
    int_step = int(round(step * mult))
    # print int_from,int_to,int_step
    if include:
        result = range(int_from, int_to + int_step, int_step)
        result = [r for r in result if r <= int_to]
    else:
        result = range(int_from, int_to, int_step)
    # print result
    float_result = [r / mult for r in result]
    return float_result
print float_range(-1, 0, 0.01,include=False)
assert float_range(1.01, 2.06, 5.05 % 1, True) ==\
    [1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01, 2.06]
assert float_range(1.01, 2.06, 5.05 % 1, False)==\
    [1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01]
I am only a beginner, but I had the same problem, when simulating some calculations. Here is how I attempted to work this out, which seems to be working with decimal steps.
I am also quite lazy and so I found it hard to write my own range function.
Basically what I did is changed my xrange(0.0, 1.0, 0.01) to xrange(0, 100, 1) and used the division by 100.0 inside the loop.
I was also concerned, if there will be rounding mistakes. So I decided to test, whether there are any. Now I heard, that if for example 0.01 from a calculation isn't exactly the float 0.01 comparing them should return False (if I am wrong, please let me know).
So I decided to test if my solution will work for my range by running a short test:
for d100 in xrange(0, 100, 1):
    d = d100 / 100.0
    fl = float("0.00"[:4 - len(str(d100))] + str(d100))
    print d, "=", fl , d == fl
And it printed True for each.
Now, if I'm getting it totally wrong, please let me know.
The trick to avoid the round-off problem is to use a separate number to move through the range, one that starts half a step ahead of start.
# floating point range
def frange(a, b, stp=1.0):
    i = a+stp/2.0
    while i<b:
        yield a
        a += stp
        i += stp
Alternatively, numpy.arange can be used.
My answer is similar to others using map(), without need of NumPy, and without using lambda (though you could). To get a list of float values from 0.0 to t_max in steps of dt:
def xdt(n):
    return dt*float(n)
tlist = map(xdt, range(int(t_max/dt)+1))
