featuretools progress bar when running dfs - featuretools

When using featuretools, is there a way to show a progress bar when running dfs?

Setting the parameter verbose=True in the dfs function call should give you a progress bar.
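As a rough sketch of what that call might look like: the helper name run_dfs_with_progress is made up for illustration, and the keyword target_dataframe_name assumes featuretools 1.x (older releases call it target_entity).

```python
def run_dfs_with_progress(entityset, target_dataframe_name):
    """Run Deep Feature Synthesis with the built-in progress bar enabled.

    Hypothetical helper: the keyword names assume featuretools 1.x.
    """
    # Imported lazily so the module can be loaded without featuretools installed.
    import featuretools as ft

    feature_matrix, feature_defs = ft.dfs(
        entityset=entityset,
        target_dataframe_name=target_dataframe_name,
        verbose=True,  # the switch that turns the progress bar on
    )
    return feature_matrix, feature_defs
```

You would call it with your own EntitySet and the name of the dataframe you want features for.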

Related

Python Progress Bar for non-iterable process

I'm using this Notebook, where section Apply DocumentClassifier is altered as below.
JupyterLab, kernel: conda_mxnet_latest_p37.
tqdm is a progress bar wrapper. It seems to work both on for loops and in the CLI. However, I would like to use it on this line:
classified_docs = doc_classifier.predict(docs_to_classify)
This is an iterative process, but under the bonnet.
How can I apply tqdm to this line?
Code Cell:
doc_dir = "GRIs/"  # contains 2 .pdfs
with open('filt_gri.txt', 'r') as filehandle:
    tags = [current_place.rstrip() for current_place in filehandle.readlines()]
doc_classifier = TransformersDocumentClassifier(model_name_or_path="cross-encoder/nli-distilroberta-base",
                                                task="zero-shot-classification",
                                                labels=tags,
                                                batch_size=2)
# convert to Document using a fieldmap for custom content fields the classification should run on
docs_to_classify = [Document.from_dict(d) for d in docs_sliding_window]
# classify using gpu, batch_size makes sure we do not run out of memory
classified_docs = doc_classifier.predict(docs_to_classify)
According to this TDS article, all Python progress bar libraries work with for loops. Hypothetically, I could alter the predict() function and add a progress bar there, but that is simply too much work.
Note: I'm happy to remove this answer if there is indeed a solution for non-iterably "accessible" processes.
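One possible workaround, sketched under assumptions: if predict() accepts any list of documents and returns one result per document (not confirmed here for this classifier), the call can be split into chunks by hand so that a for loop exists to report progress from. The names predict_in_chunks and on_progress are made up for this sketch.

```python
def predict_in_chunks(predict, items, chunk_size, on_progress=None):
    """Call predict() on successive slices of items and report progress.

    Assumes predict() takes a list and returns a list of the same length.
    """
    results = []
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        results.extend(predict(chunk))
        if on_progress is not None:
            # Report how many items were just processed.
            on_progress(len(chunk))
    return results


# Possible use with tqdm's manual mode (requires tqdm to be installed):
# from tqdm import tqdm
# with tqdm(total=len(docs_to_classify)) as bar:
#     classified_docs = predict_in_chunks(
#         doc_classifier.predict, docs_to_classify, 2, bar.update)
```

The bar then advances once per chunk rather than once per document, so a smaller chunk size gives smoother updates at the cost of more predict() calls.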

Spark Console Progress Bar Missing

How to get 'Stage' Console Progress Bar in Jupyter Notebook?
Progress bar was displayed earlier, but somehow it is not displayed now. I couldn't relate it to an Nbextension either.
Let me know if there is any configuration available to enable it back.
[Stage X:==========> (A + B) / C]
Set spark.ui.showConsoleProgress to true (see the Spark configuration docs).
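A minimal sketch of setting it from PySpark, assuming the option has to be in place before the SparkContext starts (so a session already running in the kernel would need to be stopped first). The helper name build_session and the app name are made up.

```python
# Configuration entries to apply before the session is created.
CONSOLE_PROGRESS_CONF = {"spark.ui.showConsoleProgress": "true"}


def build_session(app_name="console-progress-demo"):
    """Create (or fetch) a SparkSession with the console progress bar enabled."""
    # Imported lazily so this module loads even where pyspark is not installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in CONSOLE_PROGRESS_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```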

Pyspark Display not showing chart in Jupyter

I have the following line of code:
display(df2.groupBy("TransactionDate").sum("Amount").orderBy("TransactionDate"))
Which according to this document:
https://docs.databricks.com/user-guide/visualizations/index.html#visualizations-in-python
Should give me a chart in Jupyter. Instead I get the following output:
DataFrame[TransactionDate: timestamp, sum(Amount): double]
How come?
Which according to this document
...
Should give me a chart in Jupyter.
It should not. display is a feature of the proprietary Databricks platform, not a feature of Spark, so unless you use their notebook flavor (based on Zeppelin, not Jupyter), it won't be available to you.

Parameter to notebook - widget not defined error

I passed a parameter from a scheduled job to a Databricks notebook and tried to read it inside the Python notebook using dbutils.widgets.get().
When I run the scheduled job, I get the error "No Input Widget defined", thrown by the library module "InputWidgetNotDefined".
May I know the reason? Thanks.
To use widgets, you first need to create them in the notebook.
For example, to create a text widget, run:
dbutils.widgets.text('date', '2021-01-11')
After the widget has been created once on a given cluster, it can be used to supply parameters to the notebook on subsequent job runs.
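Put together, creating and reading the widget might look like the sketch below. In a Databricks notebook dbutils is already defined as a global; it is passed in as an argument here only so the function can be tried elsewhere, and the name read_job_parameter is made up.

```python
def read_job_parameter(dbutils, name, default):
    """Create a text widget (if needed) and return its current value.

    The default is used when the job does not supply a value for `name`.
    """
    # Creating the widget first avoids the "No Input Widget defined" error.
    dbutils.widgets.text(name, default)
    return dbutils.widgets.get(name)


# In a notebook cell:
# run_date = read_job_parameter(dbutils, "date", "2021-01-11")
```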

Matplotlib interactive widget from python script

I am adapting this example to interact with a plot using matplotlib's widgets.
The way I normally work is interactive, from within spyder I just call a function that does the plotting.
My goal is to make an executable available to users who do not have Python installed, so as an intermediate step I am wrapping the functions into a script.
I have minimal experience with standalone scripts, so in a nutshell mine looks like this:
import various_modules

def plotting():
    ...
    plot_some_initial_stuff
    sl = Slider()
    plt.show()  # <-------
    def update():
        ...
        ax.set_ydata()
        fig.canvas.draw_idle()
    sl.on_changed(update)
    return

plotting()
So I just define the plotting function and then call it.
I had to add the plt.show() command, which I do not need to have when I'm working from the IPython shell, otherwise doing:
python my_plot.py
would not produce anything. By adding plt.show(), the window shows up with the graphs I define in the initialization part. However, no interaction happens.
What is the correct way of achieving interaction when running a script like I'm doing?
Although I do not know all the details, I learnt that the default mode for scripts is non-interactive.
This means that no plot is generated until show() is called, and that execution of the script does not proceed past that statement until the window is closed.
Therefore, my call to show() was too early and actually blocked the definition of the update function and its connection to the slider's on_changed() event.
It was enough to move show() to the last line of the function definition to obtain the desired behaviour.
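For reference, a small self-contained script with that ordering could look like the sketch below. The sine curve, the slider range and the names build_plot and main are made up for illustration; the point is only that on_changed() is connected before show() is reached.

```python
import math

import matplotlib.pyplot as plt
from matplotlib.widgets import Slider


def build_plot():
    """Create the figure, the slider and the callback, without showing anything."""
    xs = [i * 0.01 for i in range(629)]  # roughly 0 .. 2*pi
    fig, ax = plt.subplots()
    fig.subplots_adjust(bottom=0.25)  # leave room for the slider
    (line,) = ax.plot(xs, [math.sin(x) for x in xs])

    slider_ax = fig.add_axes([0.2, 0.1, 0.6, 0.03])
    slider = Slider(slider_ax, "Frequency", 0.1, 5.0, valinit=1.0)

    def update(value):
        # Redraw the curve for the frequency chosen on the slider.
        line.set_ydata([math.sin(value * x) for x in xs])
        fig.canvas.draw_idle()

    slider.on_changed(update)  # connected before show() is ever called
    return fig, slider, line


def main():
    # Keep references to the slider so it stays responsive.
    fig, slider, line = build_plot()
    plt.show()  # last statement: blocks until the window is closed
```

Calling main() as the last line of the script opens the window with the slider already wired up.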
