Using luigi.LocalTarget with PdfPages in python3 - python-3.x

The below code snippet worked fine for python2, but does not work for python3. This code snippet is intended to allow for a luigi workflow to write to a multipage PDF, while still using the LocalTarget context manager that allows for atomicity.
import luigi
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt
test = luigi.LocalTarget('test.pdf')
with test.open('wb') as fh, PdfPages(fh) as outf:
    plt = plt.plot([1, 2, 3], [4, 5, 6])
This works in python2, but in python3 leads to the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-ba62e5b716d2> in <module>
----> 1 with test.open('wb') as fh, PdfPages(fh) as outf:
2 plt = plt.plot([1, 2, 3], [4, 5, 6])
~/miniconda3/envs/cat3.7/lib/python3.7/site-packages/matplotlib/backends/backend_pdf.py in __init__(self, filename, keep_empty, metadata)
2386
2387 """
-> 2388 self._file = PdfFile(filename, metadata=metadata)
2389 self.keep_empty = keep_empty
2390
~/miniconda3/envs/cat3.7/lib/python3.7/site-packages/matplotlib/backends/backend_pdf.py in __init__(self, filename, metadata)
445 self.fh = fh
446 self.currentstream = None # stream object to write to, if any
--> 447 fh.write(b"%PDF-1.4\n") # 1.4 is the first version to have alpha
448 # Output some eight-bit chars as a comment so various utilities
449 # recognize the file as binary by looking at the first few
TypeError: write() argument must be str, not bytes
How can I retain this atomic functionality in python3?

Sorry, I know this isn't the answer you're looking for, but I found this line in the Luigi repository in the definition of LocalTarget:
def open(self, mode='r'):
    rwmode = mode.replace('b', '').replace('t', '')
    ...
https://github.com/spotify/luigi/blob/master/luigi/local_target.py#L159
It seems that the 'b' flag is stripped from the mode, so the file is never opened for byte writing (at least in the current version). I would definitely bring this up with them in the GitHub Issues.
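The symptom can be reproduced without Luigi or matplotlib. The following is a minimal sketch using only the standard library: io.StringIO and io.BytesIO stand in for a handle opened in text mode and in binary mode. The helper name try_write_pdf_header is made up for this illustration, and the sketch says nothing about Luigi's internals beyond the line quoted above.

```python
import io


def try_write_pdf_header(handle):
    """Try to write the bytes header that PdfFile writes first; report the outcome."""
    try:
        handle.write(b"%PDF-1.4\n")
        return "ok"
    except TypeError as exc:
        return f"TypeError: {exc}"


# A text-mode handle only accepts str, so writing bytes fails.
text_result = try_write_pdf_header(io.StringIO())
# A binary-mode handle accepts bytes.
binary_result = try_write_pdf_header(io.BytesIO())

print(text_result)
print(binary_result)
```

Any handle that behaves like a text stream will reject the bytes header in the same way, which matches the traceback in the question.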

I am no expert in the workings of LocalTarget, so I do not know whether there is a reason for removing the b flag or whether this is a bug.
A way to work around this is to wrap the code using the temporary_path function:
import luigi
class BinaryFileExample(luigi.Task):
    def output(self):
        return luigi.LocalTarget("simple_binary_file.extension")
    def run(self):
        with self.output().temporary_path() as my_binary_file_path:
            with open(my_binary_file_path, 'wb') as inner_file:
                newFileBytes = [123, 3, 255, 0, 100]
                for byte in newFileBytes:
                    inner_file.write(byte.to_bytes(1, byteorder='big'))
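The idea behind writing to a temporary path and only moving the file into place on success can be sketched with the standard library alone. This is an illustration of the general write-then-rename pattern, not Luigi's implementation; atomic_path is a hypothetical helper, and the file is placed in a scratch directory created just for the demo.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_path(final_path):
    """Yield a temporary path; move it onto final_path only if the block succeeds.

    Hypothetical helper for illustration, not part of Luigi.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        yield tmp_path
        # os.replace is atomic when source and destination share a filesystem.
        os.replace(tmp_path, final_path)
    except BaseException:
        # On failure, clean up so no partial output is left behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "simple_binary_file.extension")

with atomic_path(target) as tmp:
    with open(tmp, "wb") as fh:
        fh.write(bytes([123, 3, 255, 0, 100]))

with open(target, "rb") as fh:
    written = fh.read()
print(written)
```

Since PdfPages also accepts a filename, the same pattern should carry over to the original question by passing the temporary path to PdfPages instead of an open file handle.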

Related

How to resolve : "IndexError: band index 1 out of range (not in ())". Raster. Rasterio

I am trying to plot MODIS data product MOD09GQ. The following is my code and console output:
import rasterio
from rasterio.plot import show
import numpy as np
import matplotlib.pyplot as plt
filepath1 = '/Users/sayantanmandal/Projects/MODIS/MOD09GQ.A2010200.h26v06.061.2021166023144.hdf'
with rasterio.open(filepath1) as modis:
    print(modis.profile)
    print(modis.crs)
    show(modis)
Console output:
{'driver': 'HDF4', 'dtype': 'float_', 'nodata': None, 'width': 512, 'height': 512, 'count': 0, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0,
0.0, 1.0, 0.0), 'tiled': False}
None
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~/opt/miniconda3/lib/python3.9/site-packages/rasterio/plot.py in show(source, with_bounds, contour, contour_label_kws, ax, title, transform, adjust, **kwargs)
101 # Gather the indexes of the RGB channels in that order
--> 102 rgb_indexes = [source_colorinterp[ci] for ci in
103 (colorinterp.red, colorinterp.green, colorinterp.blue)]
~/opt/miniconda3/lib/python3.9/site-packages/rasterio/plot.py in <listcomp>(.0)
101 # Gather the indexes of the RGB channels in that order
--> 102 rgb_indexes = [source_colorinterp[ci] for ci in
103 (colorinterp.red, colorinterp.green, colorinterp.blue)]
KeyError: <ColorInterp.red: 3>
During handling of the above exception, another exception occurred:
IndexError Traceback (most recent call last)
/var/folders/bt/kqf88mw53h55m9mj35rkwt6h0000gn/T/ipykernel_3220/591497476.py in <module>
3 print(modis.profile)
4 print(modis.crs)
----> 5 show(modis)
~/opt/miniconda3/lib/python3.9/site-packages/rasterio/plot.py in show(source, with_bounds, contour, contour_label_kws, ax, title, transform, adjust, **kwargs)
109
110 except KeyError:
--> 111 arr = source.read(1, masked=True)
112 else:
113 # The source is a numpy array reshape it to image if it has 3+ bands
rasterio/_io.pyx in rasterio._io.DatasetReaderBase.read()
IndexError: band index 1 out of range (not in ())
At first I thought that maybe the image does not have any values for the selected area. But when I open this file in QGIS I do get a multiband image. I may be making wrong assumptions or using the wrong terminology, as this subject is fairly new to me. Any idea what may be causing this error? Thanks.
I'm not familiar with this data product.
How many bands does your data have?
If you look at the profile output, the count is 0.
(If there were only one band, it would be 1.)
The question is whether your data is organized as an array that rasterio understands.
Rasterio expects the layout (band, height, width).
Check what print(modis.read().shape) returns.
If the shape is different, use numpy to rearrange it so that rasterio can understand it.
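As a small illustration of the last point, here is a sketch of rearranging an array into the (band, height, width) layout with NumPy. The input array is made-up data stored as (height, width, band); nothing here is read from the MODIS file in the question.

```python
import numpy as np

# Hypothetical image stored as (height, width, band): 4 rows, 5 columns, 3 bands.
hwb = np.arange(4 * 5 * 3).reshape(4, 5, 3)

# Move the last axis (band) to the front to get (band, height, width).
bhw = np.moveaxis(hwb, -1, 0)

print(hwb.shape)
print(bhw.shape)
```

Note that this only helps when the data was read but has an unexpected axis order; a profile with a count of 0 means no bands were read at all.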
Resolved this.
import rioxarray as rxr
modis = rxr.open_rasterio('/Users/sayantanmandal/Projects/MODIS/MOD09GQ.A2010200.h26v06.061.2021166023144.hdf', masked = True)
type(modis)
Console output:
xarray.core.dataset.Dataset

How can I utilize JAX library on my code with numpy take related error: "NotImplementedError: The 'raise' mode to jnp.take is not supported."

Because I need to speed up my code, I have rewritten it in pure NumPy so that I can compare its runtime as-is and with the JAX accelerator in Python. I don't know whether my code is suitable for acceleration by JAX, but my limited previous experience with JAX encourages me to try vectorizing or parallelizing the prepared NumPy code with it. As an initial test, I put the jax.jit decorator on the function, but it got stuck at the first line of my code and raised the following error in Colab:
<__array_function__ internals> in take(*args, **kwargs)
UnfilteredStackTrace: NotImplementedError: The 'raise' mode to jnp.take is not supported.
The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.
--------------------
The above exception was the direct cause of the following exception:
NotImplementedError Traceback (most recent call last)
<__array_function__ internals> in take(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/jax/_src/numpy/lax_numpy.py in _take(a, indices, axis, out, mode)
5437 elif mode == "raise":
5438 # TODO(phawkins): we have no way to report out of bounds errors yet.
-> 5439 raise NotImplementedError("The 'raise' mode to jnp.take is not supported.")
5440 elif mode == "wrap":
5441 indices = mod(indices, _constant_like(indices, a.shape[axis_idx]))
NotImplementedError: The 'raise' mode to jnp.take is not supported.
I don't know how to make this code work with JAX. This error is related to the np.take function, although I guess it will get stuck again at other lines, e.g. those that contain reduce.
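For context, np.take has a mode parameter whose default in NumPy is "raise", which is the mode named in the error message. The sketch below uses plain NumPy with made-up values to show what the three modes do with an out-of-bounds index; it does not call JAX.

```python
import numpy as np

values = np.array([10, 20, 30, 40])
indices = np.array([0, 3, 5])  # 5 is out of bounds for a length-4 array

# mode="raise" is NumPy's default: an out-of-bounds index is an error.
try:
    np.take(values, indices)
    raised = False
except IndexError:
    raised = True

clipped = np.take(values, indices, mode="clip")  # 5 is clamped to index 3
wrapped = np.take(values, indices, mode="wrap")  # 5 wraps around to 5 % 4 == 1

print(raised, clipped, wrapped)
```

The traceback shows a branch for mode == "wrap" in that JAX version, so the other modes appear to be handled there; only "raise" is rejected.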
The sample code is:
import numpy as np
import jax
pp_ = np.array([[0.75, 0.5, 0.5], [15, 10, 15], [0.5, 3., 0.35], [15, 17, 15]])
rr_ = np.array([1, 3, 2, 5], dtype=np.float64)
gg_ = np.array([-0.48305741, -1])
ee_ = np.array([[0, 2], [1, 3]], dtype=np.int64)
@jax.jit
def JAX_acc(pp_, rr_, gg_, ee_):
    rr_act = np.take(rr_, ee_)
    r_add = np.add.reduce(rr_act, axis=1)
    pc_dis = np.sum((r_add, gg_), axis=0)
    ang_ = np.arccos((rr_act ** 5 + pc_dis[:, None] ** 2) / 1e5)
    pl_rad = rr_act * np.cos(ang_)
    pp_act = np.take(pp_, ee_, axis=0)
    pc_vec = -np.subtract.reduce(pp_act, axis=1)
    pc_ = pp_act[:, 0, :] + pc_vec / np.linalg.norm(pc_vec, axis=1)[:, None] * np.abs(pl_rad[:, 0][:, None])
    return print(pc_dis, pc_, pl_rad)
JAX_acc(pp_, rr_, gg_, ee_)
Main question: Can the JAX library be used for this example? How?
Should I use other functions instead of np.take?
I would appreciate help getting this code to work with JAX.
---------------- solved by the update ----------------
I would be grateful for any other explanations on the following extraneous questions (not needed):
Which math operators (-, +, *, ...) and their NumPy equivalents (np.power, np.sum, ...) will be faster using JAX? Will JAX handle the NumPy functions better (in terms of speed) than the plain operators?
Does JAX in CPU mode need a different coding style than in TPU mode? I haven't used that so far.
Updates:
I have changed the code to use the jnp equivalents, based on @jakedvp's comment, and the problem with np.take is gone:
import jax.numpy as jnp
def JAX_acc_jnp(pp_, rr_, gg_, ee_):
    rr_act = jnp.take(rr_, ee_)
    r_add = jnp.sum(rr_act, axis=1) # .squeeze()
    pc_dis = jnp.add(r_add, gg_)
    ang_ = jnp.arccos((rr_act ** 5 + pc_dis[:, None] ** 2) / 1e5)
    pl_rad = rr_act * jnp.cos(ang_)
    pp_act = jnp.take(pp_, ee_, axis=0)
    pc_vec = jnp.diff(pp_act, axis=1).squeeze()
    pc_ = pp_act[:, 0, :] + pc_vec / jnp.linalg.norm(pc_vec, axis=1)[:, None] * jnp.abs(pl_rad[:, 0][:, None])
    return pc_dis, pc_, pl_rad
For pc_dis and pc_ the results are correct, but pl_rad differs because ang_ comes out with different values, all of which are -1.0927847e-10; perhaps this is because the true values are of the order of e-13 and JAX changed the dtype to float32, I don't know. If so, how can I specify which dtype JAX uses?
larger data sizes: pp_, rr_, gg_, ee_

AttributeError in python: object has no attribute

I started learning Machine Learning and came across Neural Networks. While implementing a program I got this error. I have tried every solution I could find, but no luck. Here's my code:
from numpy import exp, array, random, dot
class neural_network:
    def _init_(self):
        random.seed(1)
        self.weights = 2 * random.random((2, 1)) - 1
    def train(self, inputs, outputs, num):
        for iteration in range(num):
            output = self.think(inputs)
            error = outputs - output
            adjustment = 0.01*dot(inputs.T, error)
            self.weights += adjustment
    def think(self, inputs):
        return (dot(inputs, self.weights))
neural = neural_network()
# The training set
inputs = array([[2, 3], [1, 1], [5, 2], [12, 3]])
outputs = array([[10, 4, 14, 30]]).T
# Training the neural network using the training set.
neural.train(inputs, outputs, 10000)
# Ask the neural network the output
print(neural.think(array([15, 2])))
This is the error I get when running neural.train:
Traceback (most recent call last):
File "neural.py", line 27, in <module>
neural.train(inputs, outputs, 10000)
File "neural.py", line 10, in train
output = self.think(inputs)
File "neural.py", line 16, in think
return (dot(inputs, self.weights))
AttributeError: 'neural_network' object has no attribute 'weights'
Even though the class sets the attribute self.weights, it still says there is no such attribute.
Well, it turns out that your initialization method should be named __init__ (two underscores), not _init_...
So, changing the method to
def __init__(self):
    random.seed(1)
    self.weights = 2 * random.random((2, 1)) - 1
your code works OK:
neural.train(inputs, outputs, 10000)
print(neural.think(array([15, 2])))
# [ 34.]
Your initializer is misspelled: it needs two underscores on each side, __init__(self):, not one, _init_(self):
Otherwise, nice code!
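The difference is easy to see in isolation. The two classes below are made up for this illustration: Python only calls a method named __init__ automatically, so a method named _init_ is an ordinary method that never runs unless it is called explicitly.

```python
class WithTypo:
    def _init_(self):      # single underscores: an ordinary method, never called automatically
        self.weights = [0.0]


class Correct:
    def __init__(self):    # double underscores: called when the object is created
        self.weights = [0.0]


typo_has_weights = hasattr(WithTypo(), "weights")
correct_has_weights = hasattr(Correct(), "weights")

print(typo_has_weights, correct_has_weights)
```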

Compiler error, while creating pdf using pylatex

I am using PyLaTex to create a pdf document. I'm facing some issues with compilers.
I am running my program on macOS High Sierra, and have installed the basic version of MacTeX (MacTeX Download). latexmk is also installed, using
sudo tlmgr install latexmk
For the following starter code, I'm getting an error in the compilers loop. The error log is attached after the code.
import numpy as np
from pylatex import Document, Section, Subsection, Tabular, Math, TikZ, Axis, \
    Plot, Figure, Matrix, Alignat
from pylatex.utils import italic
import os
if __name__ == '__main__':
    # image_filename = os.path.join(os.path.dirname(__file__), 'kitten.jpg')
    geometry_options = {"tmargin": "1cm", "lmargin": "10cm"}
    doc = Document(geometry_options=geometry_options)
    with doc.create(Section('The simple stuff')):
        doc.append('Some regular text and some')
        doc.append(italic('italic text. '))
        doc.append('\nAlso some crazy characters: $&#{}')
        with doc.create(Subsection('Math that is incorrect')):
            doc.append(Math(data=['2*3', '=', 9]))
        with doc.create(Subsection('Table of something')):
            with doc.create(Tabular('rc|cl')) as table:
                table.add_hline()
                table.add_row((1, 2, 3, 4))
                table.add_hline(1, 2)
                table.add_empty_row()
                table.add_row((4, 5, 6, 7))
    doc.generate_pdf('full', clean_tex=False, compiler_args='--latexmk')
Error code:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-10-dbe7f407e095> in <module>()
27
28
---> 29 doc.generate_pdf('full', clean_tex=False, compiler_args='--latexmk')
~/anaconda3/lib/python3.6/site-packages/pylatex/document.py in generate_pdf(self, filepath, clean, clean_tex, compiler, compiler_args, silent)
227
228 for compiler, arguments in compilers:
--> 229 command = [compiler] + arguments + compiler_args + main_arguments
230
231 try:
TypeError: can only concatenate list (not "str") to list
Please help me understand the error and fix the same
Regards,
Looks like a confusion between the compiler keyword argument, which accepts a string and compiler_args which accepts a list.
Maybe something like this is what you're after:
doc.generate_pdf('full', clean_tex=False, compiler='latexmk', compiler_args=['-c'])
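The TypeError itself comes from plain Python list concatenation, as the traceback line `[compiler] + arguments + compiler_args + main_arguments` suggests. Here is a minimal sketch of that failure outside PyLaTeX; the flag names are placeholders invented for the demo, not real latexmk options.

```python
import shlex

command = ["latexmk"]
compiler_args = "--flag-a --flag-b"  # placeholder flags, passed as one string

# A list can only be concatenated with another list, not with a str.
try:
    command + compiler_args
    concatenation_failed = False
except TypeError:
    concatenation_failed = True

# Splitting the string into a list of arguments makes the concatenation valid.
full_command = command + shlex.split(compiler_args)

print(concatenation_failed, full_command)
```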

AttributeError: Filter attribute has no attribute append python 3.x [duplicate]

filter, map, and reduce work perfectly in Python 2. Here is an example:
>>> def f(x):
...     return x % 2 != 0 and x % 3 != 0
...
>>> filter(f, range(2, 25))
[5, 7, 11, 13, 17, 19, 23]
>>> def cube(x):
...     return x*x*x
...
>>> map(cube, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
>>> def add(x,y):
...     return x+y
...
>>> reduce(add, range(1, 11))
55
But in Python 3, I receive the following outputs:
>>> filter(f, range(2, 25))
<filter object at 0x0000000002C14908>
>>> map(cube, range(1, 11))
<map object at 0x0000000002C82B70>
>>> reduce(add, range(1, 11))
Traceback (most recent call last):
File "<pyshell#8>", line 1, in <module>
reduce(add, range(1, 11))
NameError: name 'reduce' is not defined
I would appreciate if someone could explain to me why this is.
You can read about the changes in What's New In Python 3.0. You should read it thoroughly when you move from 2.x to 3.x since a lot has been changed.
The rest of this answer consists of quotes from the documentation.
Views And Iterators Instead Of Lists
Some well-known APIs no longer return lists:
[...]
map() and filter() return iterators. If you really need a list, a quick fix is e.g. list(map(...)), but a better fix is often to use a list comprehension (especially when the original code uses lambda), or rewriting the code so it doesn’t need a list at all. Particularly tricky is map() invoked for the side effects of the function; the correct transformation is to use a regular for loop (since creating a list would just be wasteful).
[...]
Builtins
[...]
Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.
[...]
The functionality of map and filter was intentionally changed to return iterators, and reduce was removed from being a built-in and placed in functools.reduce.
So, for filter and map, you can wrap them with list() to see the results like you did before.
>>> def f(x): return x % 2 != 0 and x % 3 != 0
...
>>> list(filter(f, range(2, 25)))
[5, 7, 11, 13, 17, 19, 23]
>>> def cube(x): return x*x*x
...
>>> list(map(cube, range(1, 11)))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
>>> import functools
>>> def add(x,y): return x+y
...
>>> functools.reduce(add, range(1, 11))
55
>>>
The recommendation now is that you replace your usage of map and filter with generator expressions or list comprehensions. Example:
>>> def f(x): return x % 2 != 0 and x % 3 != 0
...
>>> [i for i in range(2, 25) if f(i)]
[5, 7, 11, 13, 17, 19, 23]
>>> def cube(x): return x*x*x
...
>>> [cube(i) for i in range(1, 11)]
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
>>>
They say that for loops are 99 percent of the time easier to read than reduce, but I'd just stick with functools.reduce.
Edit: The 99 percent figure is pulled directly from the What’s New In Python 3.0 page authored by Guido van Rossum.
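To make the comparison concrete, here is the sum from the question written three ways: with functools.reduce, with an explicit for loop, and with the built-in sum. This is only a side-by-side sketch; which one reads best is a matter of taste.

```python
from functools import reduce
from operator import add

numbers = range(1, 11)

# functools.reduce folds the sequence pairwise with the given function.
total_reduce = reduce(add, numbers)

# The explicit loop that the documentation suggests as the readable alternative.
total_loop = 0
for n in numbers:
    total_loop += n

# For plain addition, the built-in sum covers the same case.
total_sum = sum(numbers)

print(total_reduce, total_loop, total_sum)
```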
As an addendum to the other answers, this sounds like a fine use-case for a context manager that will re-map the names of these functions to ones which return a list and introduce reduce in the global namespace.
A quick implementation might look like this:
from contextlib import contextmanager
@contextmanager
def noiters(*funcs):
    if not funcs:
        funcs = [map, filter, zip] # etc
    from functools import reduce
    globals()[reduce.__name__] = reduce
    for func in funcs:
        globals()[func.__name__] = lambda *ar, func = func, **kwar: list(func(*ar, **kwar))
    try:
        yield
    finally:
        del globals()[reduce.__name__]
        for func in funcs: globals()[func.__name__] = func
With a usage that looks like this:
with noiters(map):
    from operator import add
    print(reduce(add, range(1, 20)))
    print(map(int, ['1', '2']))
Which prints:
190
[1, 2]
Just my 2 cents :-)
Since reduce has been removed from the built-in functions in Python 3, don't forget to import functools in your code. Please look at the code snippet below.
import functools
my_list = [10,15,20,25,35]
sum_numbers = functools.reduce(lambda x ,y : x+y , my_list)
print(sum_numbers)
One of the advantages of map, filter and reduce is how legible they become when you "chain" them together to do something complex. However, the built-in syntax isn't legible and is all "backwards". So, I suggest using the PyFunctional package (https://pypi.org/project/PyFunctional/).
Here's a comparison of the two:
flight_destinations_dict = {'NY': {'London', 'Rome'}, 'Berlin': {'NY'}}
PyFunctional version
Very legible syntax. You can say:
"I have a sequence of flight destinations. Out of which I want to get
the dict key if city is in the dict values. Finally, filter out the
empty lists I created in the process."
from functional import seq # PyFunctional package to allow easier syntax
def find_return_flights_PYFUNCTIONAL_SYNTAX(city, flight_destinations_dict):
    return seq(flight_destinations_dict.items()) \
        .map(lambda x: x[0] if city in x[1] else []) \
        .filter(lambda x: x != [])
Default Python version
It's all backwards. You need to say:
"OK, so, there's a list. I want to filter empty lists out of it. Why?
Because I first got the dict key if the city was in the dict values.
Oh, the list I'm doing this to is flight_destinations_dict."
def find_return_flights_DEFAULT_SYNTAX(city, flight_destinations_dict):
    return list(
        filter(lambda x: x != [],
            map(lambda x: x[0] if city in x[1] else [], flight_destinations_dict.items())
        )
    )
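For comparison, the same lookup can also be written as a list comprehension, which avoids both the nested calls and the placeholder empty lists. This is a sketch of an alternative, not part of the answer above; the function name find_return_flights_comprehension is made up here.

```python
flight_destinations_dict = {'NY': {'London', 'Rome'}, 'Berlin': {'NY'}}


def find_return_flights_comprehension(city, flight_destinations_dict):
    """Return the departure cities whose destination set contains `city`."""
    return [origin
            for origin, destinations in flight_destinations_dict.items()
            if city in destinations]


to_ny = find_return_flights_comprehension('NY', flight_destinations_dict)
to_london = find_return_flights_comprehension('London', flight_destinations_dict)

print(to_ny, to_london)
```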
Here are examples of the filter, map and reduce functions.
numbers = [10,11,12,22,34,43,54,34,67,87,88,98,99,87,44,66]
# Filter
oddNumbers = list(filter(lambda x: x%2 != 0, numbers))
print(oddNumbers)
# Map
multiplyOf2 = list(map(lambda x: x*2, numbers))
print(multiplyOf2)
# Reduce
The reduce function, since it is not commonly used, was removed from the built-in functions in Python 3. It is still available in the functools module, so you can do:
from functools import reduce
sumOfNumbers = reduce(lambda x,y: x+y, numbers)
print(sumOfNumbers)
from functools import reduce
def f(x):
    return x % 2 != 0 and x % 3 != 0
print(*filter(f, range(2, 25)))
# 5 7 11 13 17 19 23
def cube(x):
    return x**3
print(*map(cube, range(1, 11)))
# 1 8 27 64 125 216 343 512 729 1000
def add(x,y):
    return x+y
reduce(add, range(1, 11))
# 55
It works as is. To get the output of map use * or list
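One thing to keep in mind with either approach is that the object returned by map (or filter) is a one-shot iterator. A small sketch with made-up values: once it has been unpacked with * or passed to list, a second pass yields nothing.

```python
squares = map(lambda x: x * x, range(1, 4))

first_pass = list(squares)   # consumes the iterator
second_pass = list(squares)  # nothing left to yield

print(first_pass, second_pass)
```

If the results are needed more than once, store the list rather than the map object.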
