Can cast pandas Series to `int64` but not to `Int64` - python-3.x
I am stuck with a weird type conversion issue..
I have a pandas DataFrame pp with a column Value. All values are whole numbers, stored as float. It is possible to convert them all to int64, but I get an error when attempting to convert to nullable Int64. Help!
Some details: pp['Value'] is of dtype float and may contain NaN. For my specific case, all values are integer values (see my debugging attempts below). I want them to become Int64. This has worked for months with thousands of series, but now suddenly 1 series seems to fail - I cannot find any non-int value that would explain this.
What I am trying: convert the Series below to Int64:
print(f"pp['Value']:\n{pp['Value']}")
pp['Value']:
0 3500000.0
1 600000.0
2 400000.0
3 8300000.0
4 5700000.0
5 4400000.0
6 3000000.0
7 2700000.0
8 2000000.0
9 800000.0
10 300000.0
11 300000.0
12 5300000.0
13 2500000.0
14 11000000.0
15 1000000.0
16 18000000.0
17 6250000.0
18 5000000.0
19 4400000.0
20 4200000.0
21 2000000.0
22 1750000.0
23 900000.0
24 4000000.0
25 800000.0
26 9250000.0
27 5200000.0
28 600000.0
29 5700000.0
30 13500000.0
31 10000000.0
32 3300000.0
33 3200000.0
34 2000000.0
35 750000.0
Name: Value, dtype: float64
For some reason, pp['Value'].astype('Int64') raises the error TypeError: cannot safely cast non-equivalent float64 to int64
I debugged 2 alternative approaches, both of which work.
A: Convert the series to 'int64' - works like a charm (the numbers really all are integers):
pp['Value'] = pp['Value'].astype('int64')
print(f"pp['Value']:\n{pp['Value']}")
pp['Value']:
0 3500000
1 600000
2 400000
3 8300000
4 5700000
5 4400000
6 3000000
7 2700000
8 2000000
9 800000
10 300000
11 300000
12 5300000
13 2500000
14 11000000
15 1000000
16 18000000
17 6250000
18 5000000
19 4400000
20 4200000
21 2000000
22 1750000
23 900000
24 4000000
25 800000
26 9250000
27 5200000
28 600000
29 5700000
30 13500000
31 10000000
32 3300000
33 3200000
34 2000000
35 750000
Name: Value, dtype: int64
B: Converted each element individually, and checked whether any single value does have some weird floating-point arithmetic issue.. not the case either. All values can be converted..
for idx, row in pp.iterrows():
    print(f"{idx}: value = {row['Value']}, residual vs. int: {row['Value']%row['Value']}, int value: {int(row['Value'])}")
0: value = 3500000.0, residual vs. int: 0.0, int value: 3500000
1: value = 600000.0, residual vs. int: 0.0, int value: 600000
2: value = 400000.0, residual vs. int: 0.0, int value: 400000
3: value = 8300000.000000001, residual vs. int: 0.0, int value: 8300000
4: value = 5700000.0, residual vs. int: 0.0, int value: 5700000
5: value = 4400000.0, residual vs. int: 0.0, int value: 4400000
6: value = 3000000.0, residual vs. int: 0.0, int value: 3000000
7: value = 2700000.0, residual vs. int: 0.0, int value: 2700000
8: value = 2000000.0, residual vs. int: 0.0, int value: 2000000
9: value = 800000.0, residual vs. int: 0.0, int value: 800000
10: value = 300000.0, residual vs. int: 0.0, int value: 300000
11: value = 300000.0, residual vs. int: 0.0, int value: 300000
12: value = 5300000.0, residual vs. int: 0.0, int value: 5300000
13: value = 2500000.0, residual vs. int: 0.0, int value: 2500000
14: value = 11000000.0, residual vs. int: 0.0, int value: 11000000
15: value = 1000000.0, residual vs. int: 0.0, int value: 1000000
16: value = 18000000.0, residual vs. int: 0.0, int value: 18000000
17: value = 6250000.0, residual vs. int: 0.0, int value: 6250000
18: value = 5000000.0, residual vs. int: 0.0, int value: 5000000
19: value = 4400000.0, residual vs. int: 0.0, int value: 4400000
20: value = 4200000.0, residual vs. int: 0.0, int value: 4200000
21: value = 2000000.0, residual vs. int: 0.0, int value: 2000000
22: value = 1750000.0, residual vs. int: 0.0, int value: 1750000
23: value = 900000.0, residual vs. int: 0.0, int value: 900000
24: value = 4000000.0, residual vs. int: 0.0, int value: 4000000
25: value = 800000.0, residual vs. int: 0.0, int value: 800000
26: value = 9250000.0, residual vs. int: 0.0, int value: 9250000
27: value = 5200000.0, residual vs. int: 0.0, int value: 5200000
28: value = 600000.0, residual vs. int: 0.0, int value: 600000
29: value = 5700000.0, residual vs. int: 0.0, int value: 5700000
30: value = 13500000.0, residual vs. int: 0.0, int value: 13500000
31: value = 10000000.0, residual vs. int: 0.0, int value: 10000000
32: value = 3300000.0, residual vs. int: 0.0, int value: 3300000
33: value = 3200000.0, residual vs. int: 0.0, int value: 3200000
34: value = 2000000.0, residual vs. int: 0.0, int value: 2000000
35: value = 750000.0, residual vs. int: 0.0, int value: 750000
I am lost... All the values are int. I can convert all values to int. I can convert the whole Series to int64. But when converting to Int64, I get an error. Why? What is wrong here?
Edit note:
pp['Value'] = pp['Value'].round().astype('Int64')
solves the problem.. But I would love to understand why. As you can see above, the set is guaranteed to only contain integers; each value is an 'int' down to machine accuracy.. Why on earth would the 'non-safe conversion' error be raised?
As Jason suggested in his comment, your edit solves the problem because rounding changes 8300000.000000001 (row 3 in your element-wise output) to 8300000.0.
This is important as it means that the values are still equal before and after the type conversion, and so they meet the "safe" casting rule for numpy conversions. When converting to 'Int64', pandas uses the numpy.ndarray.astype function, which applies this rule. A plain astype('int64') works because it simply truncates the fractional part without checking. The details on "safe" casting can be found here.
As far as I am aware, there is no way to request that pandas use the numpy function with a different type of casting, so rounding the values first is the solution to your problem.
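The element-wise check in the question computes `row['Value'] % row['Value']`, which is 0.0 for any non-zero number, so it could not reveal the stray fraction. Here is a minimal sketch of a check that does, using plain Python floats. The short `values` list is only an excerpt chosen for illustration.

```python
# Excerpt of the column; index 3 carries the tiny fractional part.
values = [3500000.0, 600000.0, 400000.0, 8300000.000000001, 5700000.0]

# x % x is 0.0 for every non-zero x, so this check can never fail.
print([v % v for v in values])

# float.is_integer() (or v % 1 != 0) flags values that are not whole numbers.
offenders = [(i, v) for i, v in enumerate(values) if not v.is_integer()]
print(offenders)

# After rounding, every value is a whole number again.
rounded = [float(round(v)) for v in values]
print(all(v.is_integer() for v in rounded))
```

On a real Series the same idea can be expressed as a boolean mask (for example comparing the column with its rounded version) before calling astype('Int64').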
Related
Pytorch training: after each layer, how can I make updates to the output and cast the updated output to next layer? I want to keep different bits
I am doing node classification using the Cora dataset in PyTorch. The model consists of 2 GCN layers, and I want to keep a different precision of the output after each layer. Specifically, after each layer, I convert the output (a float32 tensor) into its binary representation (32 bits). Then I keep only a few of the 32 bits. Then I convert the binary back to float32 and input it to the next layer. I encountered an in-place operation error; how can I solve it?
Error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [2708, 7, 1]], which is output 0 of PowBackward1, is at version 2; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
def forward(self, data):
    torch.autograd.set_detect_anomaly(True)
    x, edge_index = data.x, data.edge_index
    x = self.conv1(x, edge_index)
    bit_x = float2bit(x)
    float_x = bit2float(bit_x)
    x = torch.sigmoid(float_x)
    bit_x = float2bit(x)
    bit_x[super_node,:,2:] = 0
    float_x = bit2float(bit_x)
    x = self.conv2(float_x, edge_index)
    bit_x = float2bit(x)
    float_x = bit2float(bit_x)
    return F.log_softmax(float_x, dim=1)

def bit2float(b, num_e_bits=8, num_m_bits=23, bias=127.):
    #b = bit.clone().detach()
    """Turn input tensor into float.

    Args:
        b : binary tensor. The last dimension of this tensor should be
            the one the binary is at.
        num_e_bits : Number of exponent bits. Default: 8.
        num_m_bits : Number of mantissa bits. Default: 23.
        bias : Exponent bias/ zero offset. Default: 127.
    Returns:
        Tensor: Float tensor. Reduces last dimension.
    """
    expected_last_dim = num_m_bits + num_e_bits + 1
    assert b.shape[-1] == expected_last_dim, "Binary tensors last dimension " \
        "should be {}, not {}.".format(expected_last_dim, b.shape[-1])

    # check if we got the right type
    dtype = torch.float32
    if expected_last_dim > 32: dtype = torch.float64
    if expected_last_dim > 64:
        warnings.warn("pytorch can not process floats larger than 64 bits, keep"
                      " this in mind. Your result will be not exact.")

    s = torch.index_select(b, -1, torch.arange(0, 1))
    e = torch.index_select(b, -1, torch.arange(1, 1 + num_e_bits))
    m = torch.index_select(b, -1, torch.arange(1 + num_e_bits,
                                               1 + num_e_bits + num_m_bits))
    # SIGN BIT
    out = ((-1) ** s).squeeze(-1).type(dtype)
    # EXPONENT BIT
    exponents = -torch.arange(-(num_e_bits - 1.), 1.)
    exponents = exponents.repeat(b.shape[:-1] + (1,))
    e_decimal = torch.sum(e * 2 ** exponents, dim=-1) - bias
    out *= 2 ** e_decimal
    # MANTISSA
    matissa = (torch.Tensor([2.]) ** (
        -torch.arange(1., num_m_bits + 1.))).repeat(
        m.shape[:-1] + (1,))
    out *= 1. + torch.sum(m * matissa, dim=-1)
    return out

def float2bit(f, num_e_bits=8, num_m_bits=23, bias=127., dtype=torch.float32):
    #f = float.clone().detach()
    """Turn input tensor into binary.

    Args:
        f : float tensor.
        num_e_bits : Number of exponent bits. Default: 8.
        num_m_bits : Number of mantissa bits. Default: 23.
        bias : Exponent bias/ zero offset. Default: 127.
        dtype : This is the actual type of the tensor that is going to be
            returned. Default: torch.float32.
    Returns:
        Tensor: Binary tensor. Adds last dimension to original tensor for bits.
    """
    ## SIGN BIT
    s = torch.sign(f)
    f = f * s
    # turn sign into sign-bit
    s = (s * (-1) + 1.) * 0.5
    s = s.unsqueeze(-1)

    ## EXPONENT BIT
    e_scientific = torch.floor(torch.log2(f))
    e_decimal = e_scientific + bias
    e = integer2bit(e_decimal, num_bits=num_e_bits)

    ## MANTISSA
    m1 = integer2bit(f - f % 1, num_bits=num_e_bits)
    m2 = remainder2bit(f % 1, num_bits=bias)
    m = torch.cat([m1, m2], dim=-1)

    dtype = f.type()
    idx = torch.arange(num_m_bits).unsqueeze(0).type(dtype) \
          + (8. - e_scientific).unsqueeze(-1)
    idx = idx.long()
    m = torch.gather(m, dim=-1, index=idx)
    return torch.cat([s, e, m], dim=-1).type(dtype)
How to safely round-and-clamp from float64 to int64?
This question is about python/numpy, but it may apply to other languages as well. How can the following code be improved to safely clamp large float values to the maximum int64 value during conversion? (Ideally, it should still be efficient.)
import numpy as np

def int64_from_clipped_float64(x, dtype=np.int64):
    x = np.round(x)
    x = np.clip(x, np.iinfo(dtype).min, np.iinfo(dtype).max)
    # The problem is that np.iinfo(dtype).max is imprecisely approximated as a
    # float64, and the approximation leads to overflow in the conversion.
    return x.astype(dtype)

for x in [-3.6, 0.4, 1.7, 1e18, 1e25]:
    x = np.array(x, dtype=np.float64)
    print(f'x = {x:<10} result = {int64_from_clipped_float64(x)}')

# x = -3.6 result = -4
# x = 0.4 result = 0
# x = 1.7 result = 2
# x = 1e+18 result = 1000000000000000000
# x = 1e+25 result = -9223372036854775808
The problem is that the largest np.int64 is 2^63 - 1, which is not representable in floating point. The same issue doesn't happen on the other end, because -2^63 is exactly representable. So do the clipping half in float space (for detection) and half in integer space (for correction):
def int64_from_clipped_float64(x, dtype=np.int64):
    assert x.dtype == np.float64
    limits = np.iinfo(dtype)
    too_small = x <= np.float64(limits.min)
    too_large = x >= np.float64(limits.max)
    ix = x.astype(dtype)
    ix[too_small] = limits.min
    ix[too_large] = limits.max
    return ix
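The representability claim can be checked with plain Python integers and floats (Python's float is a 64-bit double). The sketch below is only an illustration: `clamp_to_int64` is a hypothetical scalar helper, not part of NumPy, and it does not handle NaN.

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)

# float() rounds to the nearest double; 2**63 - 1 needs 63 significant bits,
# so it is rounded up to 2**63, one past the largest int64.
print(float(INT64_MAX) == 2.0**63)
print(int(float(INT64_MAX)) - INT64_MAX)

# The lower limit is a power of two and converts exactly.
print(int(float(INT64_MIN)) == INT64_MIN)

def clamp_to_int64(value):
    """Round a float and clamp it to the int64 range (scalar sketch)."""
    # Detect overflow in float space, then correct in integer space.
    if value >= float(INT64_MAX):
        return INT64_MAX
    if value <= float(INT64_MIN):
        return INT64_MIN
    return int(round(value))

print(clamp_to_int64(1e25), clamp_to_int64(-3.6))
```

Using `>=` against the rounded-up float limit is what catches values that would otherwise convert to 2^63 and overflow.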
Here is a generalization of the answer by orlp to safely clip-convert from arbitrary floats to arbitrary integers, and to support scalar values as input. The function is also useful for the conversion of np.float32 to np.int32 because it avoids the creation of intermediate np.float64 values, as seen in the timing measurements.
def int_from_float(x, dtype=np.int64):
    x = np.asarray(x)
    assert issubclass(x.dtype.type, np.floating)
    input_is_scalar = x.ndim == 0
    x = np.atleast_1d(x)
    imin, imax = np.iinfo(dtype).min, np.iinfo(dtype).max
    fmin, fmax = x.dtype.type((imin, imax))
    too_small = x <= fmin
    too_large = x >= fmax
    ix = x.astype(dtype)
    ix[too_small] = imin
    ix[too_large] = imax
    return ix.item() if input_is_scalar else ix

print(int_from_float(np.float32(3e9), dtype=np.int32))  # 2147483647
print(int_from_float(np.float32(5e9), dtype=np.uint32))  # 4294967295
print(int_from_float(np.float64(1e25), dtype=np.int64))  # 9223372036854775807

a = np.linspace(0, 5e9, 1_000_000, dtype=np.float32).reshape(1000, 1000)

%timeit int_from_float(np.round(a), dtype=np.int32)
# 100 loops, best of 3: 3.74 ms per loop

%timeit np.clip(np.round(a), np.iinfo(np.int32).min, np.iinfo(np.int32).max).astype(np.int32)
# 100 loops, best of 3: 5.56 ms per loop
xirr: TypeError: 'float' object is not callable
I have a dataframe with columns Date, Cash, Name, KEY. I am trying to find the xirr by grouping by the key. But when I run my code I get the error "TypeError: 'float' object is not callable".
f1['l'] = list(zip(f1["Date"], f1["Cash"]))

def xirr(transactions):
    years = [(ta[0] - transactions[0][0]).days / 365.0 for ta in transactions]
    residual = 1
    step = 0.05
    guess = 0.05
    epsilon = 0.0001
    limit = 10000
    while abs(residual) > epsilon and limit > 0:
        limit -= 1
        residual = 0.0
        for i, ta in enumerate(transactions):
            residual += ta[1] / pow(guess, years[i])
        if abs(residual) > epsilon:
            if residual > 0:
                guess += step
            else:
                guess -= step
                step /= 2.0
    return guess-1

print(xirr(f1['l']))  # till here it runs
f2 = f1.groupby('KEY').apply(xirr(f1['l']))  # this line is giving error
Utilizing Quadratic Equation to Output Roots or Message saying Undefinable
Finding quadratic roots:
import math

def main():
    print "Hello! This program finds the real solutions to a quadratic"
    print
    a, b, c = input("Please enter the coefficients (a, b, c): ")
    d = (b**2) - (4*a*c)  # finding the discriminant
    if d < 0:
        d = -d
    else:
        print "This quadratic equation does not have imaginary roots"
        return
    dRoot = math.sqrt(d)
    root1r = (-b) / (2 * a)
    root1i = dRoot / (2 * a)
    root2r = root1r
    root2i = -root1i
    print "%s+%si , %s+%si" % (root1r, root1i, root2r, root2i)
    print

main()
Sample inputs:
a, b, c
0.0, 0.0, 0.0
0.0, 0.0, 1.0
0.0, 2.0, 4.0
1.0, 2.0, 1.0
1.0, -5.0, 6.0
1.0, 2.0, 3.0
I need help writing a program that finds the root(s) of a quadratic equation, or outputs a message saying that the roots cannot be found or are undefined, using the given a, b, c values as examples. The code above is what I have so far.
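One possible way to cover the sample inputs is to branch on the degenerate coefficients first and on the discriminant second. The following is a minimal Python 3 sketch rather than a definitive answer; `solve_quadratic` and its message strings are names made up for this illustration, and only real roots are reported.

```python
import math

def solve_quadratic(a, b, c):
    """Return (message, real_roots) for a*x**2 + b*x + c = 0."""
    if a == 0:
        if b == 0:
            # The equation reduces to c = 0: true for every x or for none.
            if c == 0:
                return "undefined: every x is a solution", []
            return "no solution", []
        # Linear case b*x + c = 0.
        return "linear equation, one root", [-c / b]
    d = b * b - 4 * a * c  # discriminant
    if d < 0:
        return "no real roots", []
    if d == 0:
        return "one repeated root", [-b / (2 * a)]
    sqrt_d = math.sqrt(d)
    return "two real roots", [(-b - sqrt_d) / (2 * a), (-b + sqrt_d) / (2 * a)]

samples = [
    (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 2.0, 4.0),
    (1.0, 2.0, 1.0), (1.0, -5.0, 6.0), (1.0, 2.0, 3.0),
]
for coeffs in samples:
    print(coeffs, solve_quadratic(*coeffs))
```

Returning a message together with a (possibly empty) list keeps the "cannot be found" cases separate from the numeric results, so the caller decides how to print them.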
Python-3.x range() with step in float format [duplicate]
How do I iterate between 0 and 1 by a step of 0.1? This says that the step argument cannot be zero:
for i in range(0, 1, 0.1):
    print(i)
Rather than using a decimal step directly, it's much safer to express this in terms of how many points you want. Otherwise, floating-point rounding error is likely to give you a wrong result. Use the linspace function from the NumPy library (which isn't part of the standard library but is relatively easy to obtain). linspace takes a number of points to return, and also lets you specify whether or not to include the right endpoint:
>>> import numpy as np
>>> np.linspace(0,1,11)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
>>> np.linspace(0,1,10,endpoint=False)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
If you really want to use a floating-point step value, use numpy.arange:
>>> import numpy as np
>>> np.arange(0.0, 1.0, 0.1)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
Floating-point rounding error will cause problems, though. Here's a simple case where rounding error causes arange to produce a length-4 array when it should only produce 3 numbers:
>>> np.arange(1, 1.3, 0.1)
array([1. , 1.1, 1.2, 1.3])
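The count-based idea does not depend on NumPy. As a rough sketch, the hypothetical helper below (named `linspace` only by analogy; it is not NumPy's implementation) fixes the number of points up front, so rounding in the step can never change the length of the result.

```python
def linspace(start, stop, num, endpoint=True):
    """Return `num` evenly spaced floats from start towards stop."""
    if num < 1:
        raise ValueError("num must be at least 1")
    if num == 1:
        return [float(start)]
    # With the endpoint included there are num - 1 gaps, otherwise num.
    divisor = num - 1 if endpoint else num
    step = (stop - start) / divisor
    # Multiply by the index instead of accumulating the step.
    return [start + i * step for i in range(num)]

print(linspace(0, 1, 11))
print(linspace(0, 1, 10, endpoint=False))
# The case that trips up arange: exactly three points are returned.
print(linspace(1, 1.3, 3, endpoint=False))
```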
range() can only do integers, not floating point. Use a list comprehension instead to obtain a list of steps:
[x * 0.1 for x in range(0, 10)]
More generally, a generator expression minimizes memory allocations:
xs = (x * 0.1 for x in range(0, 10))
for x in xs:
    print(x)
Building on 'xrange([start], stop[, step])', you can define a generator that accepts and produces any type you choose (stick to types supporting + and <):
>>> def drange(start, stop, step):
...     r = start
...     while r < stop:
...         yield r
...         r += step
...
>>> i0=drange(0.0, 1.0, 0.1)
>>> ["%g" % x for x in i0]
['0', '0.1', '0.2', '0.3', '0.4', '0.5', '0.6', '0.7', '0.8', '0.9', '1']
>>>
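Note the trailing '1' in that output: the generator yields eleven values even though stop is 1.0. A short sketch of why, reusing the same drange definition: repeatedly adding 0.1 accumulates rounding error, so the running total is still slightly below 1.0 after ten additions.

```python
def drange(start, stop, step):
    r = start
    while r < stop:
        yield r
        r += step

accumulated = list(drange(0.0, 1.0, 0.1))
print(len(accumulated))        # one more element than intended
print(accumulated[-1])         # slightly below 1.0, so the loop let it through
print("%g" % accumulated[-1])  # "%g" rounds it to '1' for display

# Multiplying an integer index avoids the accumulated drift in the count.
indexed = [i * 0.1 for i in range(10)]
print(len(indexed))
```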
Increase the magnitude of i for the loop and then reduce it when you need it. for i * 100 in range(0, 100, 10): print i / 100.0 EDIT: I honestly cannot remember why I thought that would work syntactically for i in range(0, 11, 1): print i / 10.0 That should have the desired output.
NumPy is a bit overkill, I think.
[p/10 for p in range(0, 10)]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
Generally speaking, to do a step-by-1/x up to y you would do
x=100
y=2
[p/x for p in range(0, int(x*y))]
[0.0, 0.01, 0.02, 0.03, ..., 1.97, 1.98, 1.99]
(1/x produced less rounding noise when I tested).
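The reduced rounding noise has a simple explanation: p / 10 is a single division of two exactly representable integers, so the result is the closest float to the true quotient, whereas p * 0.1 starts from the already-rounded constant 0.1. A small comparison in Python 3 (where / is true division):

```python
by_multiplication = [p * 0.1 for p in range(10)]
by_division = [p / 10 for p in range(10)]

# Indices where the two approaches disagree.
differs = [p for p in range(10) if by_multiplication[p] != by_division[p]]
print(differs)

print(by_multiplication[3])  # 0.30000000000000004
print(by_division[3])        # 0.3
```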
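scipy exposes a function arange (a re-export of numpy.arange) which generalizes Python's range() constructor to satisfy your requirement of float handling. Note that the top-level scipy aliases of NumPy functions were deprecated in later SciPy releases, so importing arange from numpy directly is the safer choice.
from scipy import arange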
Similar to R's seq function, this one returns a sequence in any order given the correct step value. The last value is equal to the stop value. def seq(start, stop, step=1): n = int(round((stop - start)/float(step))) if n > 1: return([start + step*i for i in range(n+1)]) elif n == 1: return([start]) else: return([]) Results seq(1, 5, 0.5) [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0] seq(10, 0, -1) [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0] seq(10, 0, -2) [10, 8, 6, 4, 2, 0] seq(1, 1) [ 1 ]
The range() built-in function returns a sequence of integer values, I'm afraid, so you can't use it to do a decimal step. I'd say just use a while loop:
i = 0.0
while i <= 1.0:
    print i
    i += 0.1
If you're curious, Python is converting your 0.1 to 0, which is why it's telling you the argument can't be zero.
Here's a solution using itertools:
import itertools

def seq(start, end, step):
    if step == 0:
        raise ValueError("step must not be 0")
    sample_count = int(abs(end - start) / step)
    return itertools.islice(itertools.count(start, step), sample_count)
Usage Example:
for i in seq(0, 1, 0.1):
    print(i)
[x * 0.1 for x in range(0, 10)] in Python 2.7x gives you the result of: [0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9] but if you use: [ round(x * 0.1, 1) for x in range(0, 10)] gives you the desired: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
import numpy as np
for i in np.arange(0, 1, 0.1):
    print i
Best Solution: no rounding error
>>> step = .1
>>> N = 10 # number of data points
>>> [ x / pow(step, -1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Or, for a set range instead of set data points (e.g. continuous function), use:
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ x / pow(step,-1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
To implement a function: replace x / pow(step, -1) with f( x / pow(step, -1) ), and define f. For example:
>>> import math
>>> def f(x):
...     return math.sin(x)
...
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ f( x / pow(step,-1) ) for x in range(0, N + 1) ]
[0.0, 0.09983341664682815, 0.19866933079506122, 0.29552020666133955, 0.3894183423086505, 0.479425538604203, 0.5646424733950354, 0.644217687237691, 0.7173560908995228, 0.7833269096274834, 0.8414709848078965]
And if you do this often, you might want to save the generated list r r=map(lambda x: x/10.0,range(0,10)) for i in r: print i
more_itertools is a third-party library that implements a numeric_range tool:
import more_itertools as mit

for x in mit.numeric_range(0, 1, 0.1):
    print("{:.1f}".format(x))
Output
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
This tool also works for Decimal and Fraction.
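For readers who prefer to stay within the standard library, the Decimal and Fraction idea can be sketched with a plain generator. `exact_range` is a hypothetical helper written for this illustration (it is not more_itertools code) and it assumes a positive step.

```python
from decimal import Decimal
from fractions import Fraction

def exact_range(start, stop, step):
    """Yield start, start + step, ... while below stop (positive step only)."""
    current = start
    while current < stop:
        yield current
        current += step

# Decimal and Fraction add without binary rounding, so accumulation is exact.
decimals = list(exact_range(Decimal("0"), Decimal("1"), Decimal("0.1")))
fracs = list(exact_range(Fraction(0), Fraction(1), Fraction(1, 10)))

print([str(d) for d in decimals])
print([float(f) for f in fracs])
```

Constructing the Decimal values from strings matters here: Decimal(0.1) would capture the binary approximation of 0.1 rather than the decimal value.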
My versions use the original range function to create multiplicative indices for the shift. This allows the same syntax as the original range function. I have made two versions, one using float, and one using Decimal, because I found that in some cases I wanted to avoid the roundoff drift introduced by the floating point arithmetic. It is consistent with empty set results as in range/xrange. Passing only a single numeric value to either function will return the standard range output up to the integer ceiling value of the input parameter (so if you gave it 5.5, it would return range(6)).
Edit: the code below is now available as a package on pypi: Franges
## frange.py
from math import ceil

# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range

def frange(start, stop = None, step = 1):
    """frange generates a set of floating point values over the
    range [start, stop) with step size step

    frange([start,] stop [, step ])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # create a generator expression for the index values
        indices = (i for i in _xrange(0, int((stop-start)/step)))
        # yield results
        for i in indices:
            yield start + step*i

## drange.py
import decimal
from math import ceil

# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range

def drange(start, stop = None, step = 1, precision = None):
    """drange generates a set of Decimal values over the
    range [start, stop) with step size step

    drange([start,] stop, [step [,precision]])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # find precision
        if precision is not None:
            decimal.getcontext().prec = precision
        # convert values to decimals
        start = decimal.Decimal(start)
        stop = decimal.Decimal(stop)
        step = decimal.Decimal(step)
        # create a generator expression for the index values
        # (int() is needed because range() does not accept a Decimal)
        indices = (
            i for i in _xrange(
                0,
                int(((stop-start)/step).to_integral_value())
            )
        )
        # yield results
        for i in indices:
            yield float(start + step*i)

## testranges.py
import frange
import drange

list(frange.frange(0, 2, 0.5)) # [0.0, 0.5, 1.0, 1.5]
list(drange.drange(0, 2, 0.5, precision = 6)) # [0.0, 0.5, 1.0, 1.5]
list(frange.frange(3)) # [0, 1, 2]
list(frange.frange(3.5)) # [0, 1, 2, 3]
list(frange.frange(0,10, -1)) # []
Lots of the solutions here still had floating point errors in Python 3.6 and didn't do exactly what I personally needed. The function below takes integers or floats, doesn't require imports and doesn't return floating point errors.
def frange(x, y, step):
    if int(x + y + step) == (x + y + step):
        r = list(range(int(x), int(y), int(step)))
    else:
        f = 10 ** (len(str(step)) - str(step).find('.') - 1)
        rf = list(range(int(x * f), int(y * f), int(step * f)))
        r = [i / f for i in rf]
    return r
Surprised no-one has yet mentioned the recommended solution in the Python 3 docs:
See also: The linspace recipe shows how to implement a lazy version of range that is suitable for floating point applications.
Once defined, the recipe is easy to use and does not require numpy or any other external libraries, but functions like numpy.linspace(). Note that rather than a step argument, the third num argument specifies the number of desired values, for example:
print(linspace(0, 10, 5))
# linspace(0, 10, 5)
print(list(linspace(0, 10, 5)))
# [0.0, 2.5, 5.0, 7.5, 10]
I quote a modified version of the full Python 3 recipe from Andrew Barnert below:
import collections.abc
import numbers

class linspace(collections.abc.Sequence):
    """linspace(start, stop, num) -> linspace object

    Return a virtual sequence of num numbers from start to stop (inclusive).

    If you need a half-open range, use linspace(start, stop, num+1)[:-1].
    """
    def __init__(self, start, stop, num):
        if not isinstance(num, numbers.Integral) or num <= 1:
            raise ValueError('num must be an integer > 1')
        self.start, self.stop, self.num = start, stop, num
        self.step = (stop-start)/(num-1)
    def __len__(self):
        return self.num
    def __getitem__(self, i):
        if isinstance(i, slice):
            return [self[x] for x in range(*i.indices(len(self)))]
        if i < 0:
            i = self.num + i
        if i >= self.num:
            raise IndexError('linspace object index out of range')
        if i == self.num-1:
            return self.stop
        return self.start + i*self.step
    def __repr__(self):
        return '{}({}, {}, {})'.format(type(self).__name__,
                                       self.start, self.stop, self.num)
    def __eq__(self, other):
        if not isinstance(other, linspace):
            return False
        return ((self.start, self.stop, self.num) ==
                (other.start, other.stop, other.num))
    def __ne__(self, other):
        return not self==other
    def __hash__(self):
        return hash((type(self), self.start, self.stop, self.num))
This is my solution to get ranges with float steps. Using this function it's not necessary to import numpy, nor install it. I'm pretty sure that it could be improved and optimized. Feel free to do it and post it here. from __future__ import division from math import log def xfrange(start, stop, step): old_start = start #backup this value digits = int(round(log(10000, 10)))+1 #get number of digits magnitude = 10**digits stop = int(magnitude * stop) #convert from step = int(magnitude * step) #0.1 to 10 (e.g.) if start == 0: start = 10**(digits-1) else: start = 10**(digits)*start data = [] #create array #calc number of iterations end_loop = int((stop-start)//step) if old_start == 0: end_loop += 1 acc = start for i in xrange(0, end_loop): data.append(acc/magnitude) acc += step return data print xfrange(1, 2.1, 0.1) print xfrange(0, 1.1, 0.1) print xfrange(-1, 0.1, 0.1) The output is: [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0] [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1] [-1.0, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0]
For completeness of boutique, a functional solution: def frange(a,b,s): return [] if s > 0 and a > b or s < 0 and a < b or s==0 else [a]+frange(a+s,b,s)
You can use this function: def frange(start,end,step): return map(lambda x: x*step, range(int(start*1./step),int(end*1./step)))
It can be done using Numpy library. arange() function allows steps in float. But, it returns a numpy array which can be converted to list using tolist() for our convenience. for i in np.arange(0, 1, 0.1).tolist(): print i
start and stop are inclusive rather than one or the other (usually stop is excluded) and without imports, and using generators def rangef(start, stop, step, fround=5): """ Yields sequence of numbers from start (inclusive) to stop (inclusive) by step (increment) with rounding set to n digits. :param start: start of sequence :param stop: end of sequence :param step: int or float increment (e.g. 1 or 0.001) :param fround: float rounding, n decimal places :return: """ try: i = 0 while stop >= start and step > 0: if i==0: yield start elif start >= stop: yield stop elif start < stop: if start == 0: yield 0 if start != 0: yield start i += 1 start += step start = round(start, fround) else: pass except TypeError as e: yield "type-error({})".format(e) else: pass # passing print(list(rangef(-100.0,10.0,1))) print(list(rangef(-100,0,0.5))) print(list(rangef(-1,1,0.2))) print(list(rangef(-1,1,0.1))) print(list(rangef(-1,1,0.05))) print(list(rangef(-1,1,0.02))) print(list(rangef(-1,1,0.01))) print(list(rangef(-1,1,0.005))) # failing: type-error: print(list(rangef("1","10","1"))) print(list(rangef(1,10,"1"))) Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)]
I know I'm late to the party here, but here's a trivial generator solution that's working in 3.6:
def floatRange(*args):
    start, step = 0, 1
    if len(args) == 1:
        stop = args[0]
    elif len(args) == 2:
        start, stop = args[0], args[1]
    elif len(args) == 3:
        start, stop, step = args[0], args[1], args[2]
    else:
        raise TypeError("floatRange accepts 1, 2, or 3 arguments. ({0} given)".format(len(args)))
    for num in start, step, stop:
        if not isinstance(num, (int, float)):
            raise TypeError("floatRange only accepts float and integer arguments. ({0} : {1} given)".format(type(num), str(num)))
    for x in range(int((stop-start)/step)):
        yield start + (x * step)
    return
Then you can call it just like the original range(). There's no error handling, but let me know if there is an error that can be reasonably caught, and I'll update. Or you can update it. This is StackOverflow.
To counter the float precision issues, you could use the Decimal module. This demands an extra effort of converting to Decimal from int or float while writing the code, but you can instead pass str and modify the function if that sort of convenience is indeed necessary. from decimal import Decimal def decimal_range(*args): zero, one = Decimal('0'), Decimal('1') if len(args) == 1: start, stop, step = zero, args[0], one elif len(args) == 2: start, stop, step = args + (one,) elif len(args) == 3: start, stop, step = args else: raise ValueError('Expected 1 or 2 arguments, got %s' % len(args)) if not all([type(arg) == Decimal for arg in (start, stop, step)]): raise ValueError('Arguments must be passed as <type: Decimal>') # neglect bad cases if (start == stop) or (start > stop and step >= zero) or \ (start < stop and step <= zero): return [] current = start while abs(current) < abs(stop): yield current current += step Sample outputs - from decimal import Decimal as D list(decimal_range(D('2'))) # [Decimal('0'), Decimal('1')] list(decimal_range(D('2'), D('4.5'))) # [Decimal('2'), Decimal('3'), Decimal('4')] list(decimal_range(D('2'), D('4.5'), D('0.5'))) # [Decimal('2'), Decimal('2.5'), Decimal('3.0'), Decimal('3.5'), Decimal('4.0')] list(decimal_range(D('2'), D('4.5'), D('-0.5'))) # [] list(decimal_range(D('2'), D('-4.5'), D('-0.5'))) # [Decimal('2'), # Decimal('1.5'), # Decimal('1.0'), # Decimal('0.5'), # Decimal('0.0'), # Decimal('-0.5'), # Decimal('-1.0'), # Decimal('-1.5'), # Decimal('-2.0'), # Decimal('-2.5'), # Decimal('-3.0'), # Decimal('-3.5'), # Decimal('-4.0')]
Add auto-correction for the possibility of an incorrect sign on step:
def frange(start,step,stop):
    step *= 2*((stop>start)^(step<0))-1
    return [start+i*step for i in range(int((stop-start)/step))]
My solution:
def seq(start, stop, step=1, digit=0):
    x = float(start)
    v = []
    while x <= stop:
        v.append(round(x,digit))
        x += step
    return v
Here is my solution which works fine with float_range(-1, 0, 0.01) and works without floating point representation errors. It is not very fast, but works fine: from decimal import Decimal def get_multiplier(_from, _to, step): digits = [] for number in [_from, _to, step]: pre = Decimal(str(number)) % 1 digit = len(str(pre)) - 2 digits.append(digit) max_digits = max(digits) return float(10 ** (max_digits)) def float_range(_from, _to, step, include=False): """Generates a range list of floating point values over the Range [start, stop] with step size step include=True - allows to include right value to if possible !! Works fine with floating point representation !! """ mult = get_multiplier(_from, _to, step) # print mult int_from = int(round(_from * mult)) int_to = int(round(_to * mult)) int_step = int(round(step * mult)) # print int_from,int_to,int_step if include: result = range(int_from, int_to + int_step, int_step) result = [r for r in result if r <= int_to] else: result = range(int_from, int_to, int_step) # print result float_result = [r / mult for r in result] return float_result print float_range(-1, 0, 0.01,include=False) assert float_range(1.01, 2.06, 5.05 % 1, True) ==\ [1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01, 2.06] assert float_range(1.01, 2.06, 5.05 % 1, False)==\ [1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01]
I am only a beginner, but I had the same problem, when simulating some calculations. Here is how I attempted to work this out, which seems to be working with decimal steps. I am also quite lazy and so I found it hard to write my own range function. Basically what I did is changed my xrange(0.0, 1.0, 0.01) to xrange(0, 100, 1) and used the division by 100.0 inside the loop. I was also concerned, if there will be rounding mistakes. So I decided to test, whether there are any. Now I heard, that if for example 0.01 from a calculation isn't exactly the float 0.01 comparing them should return False (if I am wrong, please let me know). So I decided to test if my solution will work for my range by running a short test: for d100 in xrange(0, 100, 1): d = d100 / 100.0 fl = float("0.00"[:4 - len(str(d100))] + str(d100)) print d, "=", fl , d == fl And it printed True for each. Now, if I'm getting it totally wrong, please let me know.
The trick to avoid the round-off problem is to use a separate number to move through the range, one that starts half a step ahead of start.
# floating point range
def frange(a, b, stp=1.0):
    i = a+stp/2.0
    while i<b:
        yield a
        a += stp
        i += stp
Alternatively, numpy.arange can be used.
My answer is similar to others using map(), without need of NumPy, and without using lambda (though you could). To get a list of float values from 0.0 to t_max in steps of dt:
def xdt(n):
    return dt*float(n)

tlist = map(xdt, range(int(t_max/dt)+1))