Using IronPython to set "Data Limit by Expression" on a visualization

How do I use IronPython to set the Data Limit By Expression field on a visualization?
(I am looking for a simple IronPython script that sets this field; I couldn't find an example on the internet.)

For any type of chart, if this is the only operation you need to do, you can use code like:
from Spotfire.Dxp.Application.Visuals import Visualization
viz = v.As[Visualization]()
print viz.Data.WhereClauseExpression # prints None (Python's null value)
viz.Data.WhereClauseExpression = "[Column] = 'Value'"
print viz.Data.WhereClauseExpression # prints the expression set above
In this example, v is a script parameter pointing to the desired visualization. You could also look it up by name, ID, or some other method.
If you're already manipulating this visualization with a script and just want to add a data limit, you can add this to your existing script without importing the Visualization class. Every visualization type's Data object has this WhereClauseExpression property.
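The lookup by name mentioned above could be sketched roughly as follows. This is a minimal sketch, assuming the page exposes its visuals through Visuals and that each visual has a Title; find_visual_by_title and the title "Sales Chart" are hypothetical names used only for illustration.

```python
def find_visual_by_title(page, title):
    """Return the first visual on the page whose Title matches, or None."""
    for visual in page.Visuals:
        if visual.Title == title:
            return visual
    return None

# Assumed usage inside a Spotfire script (not executed here):
#   from Spotfire.Dxp.Application.Visuals import Visualization
#   v = find_visual_by_title(Document.ActivePageReference, "Sales Chart")
#   if v is not None:
#       v.As[Visualization]().Data.WhereClauseExpression = "[Column] = 'Value'"
```

Returning None when nothing matches lets the calling script decide whether a missing visualization is an error.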

Related

How can I access python variable in Spark SQL?

I have a Python variable created under %python in my Jupyter notebook file in Azure Databricks. How can I access the same variable to make comparisons under %sql? Below is an example:
%python
RunID_Goal = sqlContext.sql("""
SELECT CONCAT(SUBSTRING(RunID,1,6),SUBSTRING(RunID,1,6),'01_') AS RunID_Goal
FROM RunID_Pace""").first()[0]
%sql
SELECT Type, KPIDate, Value
FROM table
WHERE
RunID = RunID_Goal -- this is the variable created under %python that I want to compare against
When I run this, it throws an error:
Error in SQL statement: AnalysisException: cannot resolve 'RunID_Goal' given input columns:
I am new to Azure Databricks and Spark SQL; any help would be appreciated.

One workaround is to use widgets to pass parameters between cells. For example, on the Python side it could look like the following:
# generate test data
import pyspark.sql.functions as F
spark.range(100).withColumn("rnd", F.rand()).write.mode("append").saveAsTable("abc")
# set widgets
import random
vl = random.randint(0, 100)
dbutils.widgets.text("my_val", str(vl))
and then you can refer to the value of the widget inside the SQL code:
%sql
select * from abc where id = getArgument('my_val')
This returns the rows whose id matches the widget value.
Another way is to pass the variable via the Spark configuration. You can set the variable's value like this (note that the variable name should have a prefix, in this case c.):
spark.conf.set("c.var", "some-value")
and then refer to the variable from SQL as ${var-name}:
%sql
select * from table where column = '${c.var}'
One advantage of this approach is that you can also use the variable for table names, etc. The disadvantage is that you need to handle escaping yourself, for example by putting string values into single quotes.

You cannot access this variable. It is explained in the documentation:
When you invoke a language magic command, the command is dispatched to the REPL in the execution context for the notebook. Variables defined in one language (and hence in the REPL for that language) are not available in the REPL of another language. REPLs can share state only through external resources such as files in DBFS or objects in object storage.

Here is another workaround.
# Optional code to use databricks widgets to assign python variables
dbutils.widgets.text('my_str_col_name','my_str_col_name')
dbutils.widgets.text('my_str_col_value','my_str_col_value')
my_str_col_name = dbutils.widgets.get('my_str_col_name')
my_str_col_value = dbutils.widgets.get('my_str_col_value')
# Query with string formatting
query = """
select *
from my_table
where {0} < '{1}'
"""
# Modify query with the values of Python variable
query = query.format(my_str_col_name,my_str_col_value)
# Execute the query
display(spark.sql(query))
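Because the values are formatted straight into the SQL text, it may be worth checking them first. The following is a minimal sketch of one way to do that, not part of the answer above; build_query is a hypothetical helper, and the rule of allowing only plain identifiers (so no qualified names such as db.mytable) is an assumption chosen for simplicity.

```python
import re

# Plain identifiers only: letters, digits and underscores (an assumed rule).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_query(table, column, value):
    """Format the query only if the inputs look safe to interpolate."""
    for name in (table, column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    if "'" in value or "\\" in value:
        raise ValueError("value must not contain quotes or backslashes")
    return "select * from {0} where {1} < '{2}'".format(table, column, value)

query = build_query("my_table", "my_str_col_name", "my_str_col_value")
# In a Databricks notebook this would then be run with:
#   display(spark.sql(query))
```

Rejecting suspicious input is cruder than escaping it, but it avoids depending on the SQL dialect's quoting rules.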

A quick complement to the answers above.
You can use widgets to pass parameters to another cell that uses the %sql magic, as was mentioned:
dbutils.widgets.text("table_name", "db.mytable")
And in the cell where you use this variable, you can use the $ shortcut (getArgument isn't supported):
%sql
select * from $table_name

Getting rid of print "<IPython.core.display.Markdown object>" when using `display`

I'm trying to create nice slides using Jupyter Notebook and RISE. One of my objectives is to display a pandas DataFrame in a Markdown cell in order to have some styling flexibility.
I am using the following code to display my dataframe in a Markdown cell:
{{Markdown(display(df_x))}}
After running this line, I get the following result:
[image: dataframe displayed]
I would like to get rid of the text printed below my dataframe (<IPython.core.display.Markdown object>).
I still haven't found a way to achieve this. Could someone give me a hand?
This is the library I'm working with:
from IPython.display import display

I'm not familiar with the Markdown class, so I'm not sure why you need it, but the text printed in the output cell comes from the fact that Markdown returns an object. Since you're not assigning it to any variable, the notebook's default behavior is to run something like str(your_object), which correctly returns <IPython.core.display.Markdown object>.
So the easiest workaround would be to assign it to some variable, like this:
dummy_var = Markdown(display(df_x))
# or better yet:
_ = Markdown(display(df_x))

Parametrize and loop KQL queries in JupyterLab

My question is how to assign variables within a loop in a KQL magic command in JupyterLab. I refer to Microsoft's document on this subject and will base my question on the code given here:
https://learn.microsoft.com/en-us/azure/data-explorer/kqlmagic
1. First query below
%%kql
StormEvents
| summarize max(DamageProperty) by State
| order by max_DamageProperty desc
| limit 10
2. Second: convert the query result to a dataframe and assign a value to 'statefilter':
df = _kql_raw_result_.to_dataframe()
statefilter = df.loc[0].State
statefilter
3. This is where I would like to modify the above and let statefilter hold multiple values (i.e. several different states):
df = _kql_raw_result_.to_dataframe()
statefilter = df.loc[0:3].State
statefilter
4. Finally, I would like to run my KQL query within a for loop for each of the values in statefilter. The syntax below may not be correct, but it gives an example of what I am looking for:
dfs = [] # an empty list to store dataframes
for state in statefilter:
    %%kql
    let _state = state;
    StormEvents
    | where State in (_state)
    | do some operations here for that specific state
    df = _kql_raw_result_.to_dataframe()
    dfs.append(df) # store the df specific to state in the list
The reason I am not querying all the desired states in a single KQL query is to avoid assigning really large query results to dataframes. This is not an issue for the sample StormEvents table, which has a reasonable size, but my research data consists of many sites and is really big. Therefore I would like to run a KQL query/analysis for each site within a for loop and assign each site's query results to a dataframe. Please let me know if this is possible, or whether there may be other logical ways to do this within KQL.

There are a few ways to do it.
The simplest is to refactor your %%kql cell magic into a %kql line magic.
A line magic can be embedded in a Python cell.
Another option is: from Kqlmagic import kql
The Kqlmagic kql function accepts a KQL cell or line as a string.
You can call kql from Python.
A third way is to call the kql magic via the IPython method (here ip = get_ipython(); the second argument is the magic's line, left empty):
ip.run_cell_magic('kql', '', {your kql magic cell text})
You can call it from Python.
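The loop from the question could then be sketched along these lines. This is only a sketch under stated assumptions: build_state_query and query_each_state are hypothetical helpers, the query body is a placeholder, and run_query stands for whatever callable executes a KQL string and returns a result object with to_dataframe() (for example the kql function imported from Kqlmagic, assuming it returns such a result).

```python
def build_state_query(state):
    """Return the KQL text for a single state (placeholder query body)."""
    if '"' in state or "\\" in state:
        raise ValueError("unsupported character in state name")
    return 'StormEvents | where State == "{0}"'.format(state)

def query_each_state(states, run_query):
    """Run one query per state and collect one dataframe per state."""
    dfs = []
    for state in states:
        result = run_query(build_state_query(state))
        dfs.append(result.to_dataframe())
    return dfs

# Assumed usage in a notebook (not executed here):
#   from Kqlmagic import kql
#   dfs = query_each_state(list(statefilter), kql)
```

Passing the query runner in as an argument keeps the loop independent of how the magic is invoked, so any of the three options above could be plugged in.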

Here is an example using the single-line magic mentioned by Michael, with a return statement that converts the result to JSON. Without the conversion to JSON I wasn't getting anything back.
def testKQL():
    %kql DatabaseName | take 10000
    return _kql_raw_result_.to_dataframe().to_json(orient='records')

Using the Python WITH statement to create temporary variable

Suppose I have pandas data. Any data. I import seaborn to make a colored version of the correlation between variables. Instead of passing the correlation expression into the heatmap function, and instead of creating a one-time variable to store the correlation output, how can I use the with statement to create a temporary variable that no longer exists after the heatmap is plotted?
Doesn't work
# Assume: seaborn = sns, data is heatmappable
with mypandas_df.corr(method="pearson") as heatmap_input:
    # possible other statements
    sns.heatmap(heatmap_input)
    # possible other statements
If this existed, then after seaborn plots the map, heatmap_input would no longer exist as a variable. I would like that functionality.
Long way
# this could be temporary but is now global
tcbtbing = mypandas_df.corr(method="pearson")
sns.heatmap(tcbtbing)
Compact way
sns.heatmap(mypandas_df.corr(method="pearson"))
I'd like to use the with statement (or a similarly short construction) to avoid the long way and the compact way, but leave room for other manipulations, such as to the plot itself.

You need to implement __enter__ and __exit__ for the class you want to use it with.
See: Implementing use of 'with object() as f' in custom class in python
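A minimal sketch of such a class is shown below. Temporary is a hypothetical wrapper name, and a plain list with sum() stands in for the correlation matrix and sns.heatmap so that the example runs without pandas or seaborn. Note that the with statement does not create a new scope, so the name bound by as still exists after the block unless it is deleted explicitly.

```python
class Temporary(object):
    """Wrap any value so it can be used in a with statement."""

    def __init__(self, value):
        self.value = value

    def __enter__(self):
        # The object returned here is what gets bound after "as".
        return self.value

    def __exit__(self, exc_type, exc_value, traceback):
        # Drop our own reference; returning False lets exceptions propagate.
        self.value = None
        return False

with Temporary([1, 2, 3]) as heatmap_input:
    total = sum(heatmap_input)  # stand-in for sns.heatmap(heatmap_input)

# heatmap_input is still bound at this point; remove it explicitly if needed.
del heatmap_input
```

If the goal is only to avoid a lingering global name, wrapping the plotting code in a small function achieves the same effect, since local variables disappear when the function returns.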

How to create a filtering scheme using Iron Python

Is it possible to create a new Filtering Scheme and set it on a page using only IronPython? The reason I am looking into this is that the Web Player currently does not allow us to create Filtering Schemes. I hope to achieve that by executing a script which will be triggered by a Document property change. The name of the filtering scheme will be passed from the JavaScript API by using the SetDocumentProperty method.
The script below adds a new Filtering Scheme but I cannot select it from the Filtering Scheme menu in the Spotfire Analyst, it's nowhere to be seen. What am I missing?
from Spotfire.Dxp.Data import *
from Spotfire.Dxp.Application.Filters import *
Document.ActivePageReference.FilterPanel.Visible = True
# Add a new data filtering selection.
filterings = Document.Data.Filterings
filterings.Add("Test Filtering 1")
for f in filterings:
    print f.Name
After running the above script, I cannot see my newly added filtering scheme in the Filtering Scheme menu in the Analyst.

The issue here is that "filterings" is a variable that you created, not an alias for the filter list. You filled it with the data in the existing filters, but updating filterings afterwards does not update the filters on the page itself.
Change the code to this:
from Spotfire.Dxp.Data import *
from Spotfire.Dxp.Application.Filters import *
Document.ActivePageReference.FilterPanel.Visible = True
# Add a new data filtering selection.
Document.Data.Filterings.Add("Test Filtering 1")
filterings = Document.Data.Filterings
for f in filterings:
    print f.Name
