Delta Live Tables expectations syntax looks different - Databricks

I was looking at some code and saw that expect_all is written as below:
dlt.expect_all(dict_expectations)(dlt_quarantine_view)
My understanding of the syntax was that it takes a dictionary of expectations and executes that; I am not able to understand why the second argument in brackets is there.
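For what it's worth, dlt.expect_all is a decorator factory: calling it with the expectations dictionary returns a decorator, and the second pair of parentheses applies that decorator to the view function. A minimal sketch of that equivalence, assuming the standard Databricks dlt module and using made-up names (rules, quarantine_view) rather than the code from the question:

import dlt

rules = {"valid_id": "id IS NOT NULL", "valid_amount": "amount >= 0"}

def quarantine_view():
    return spark.read.table("raw_events")   # `spark` is provided by the pipeline runtime

# The decorator sugar
#   @dlt.expect_all(rules)
#   def quarantine_view(): ...
# is the same as building the decorator and applying it by hand:
quarantine_view = dlt.expect_all(rules)(quarantine_view)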

Related

Plug Intermediate Variables into Equation

I have a MATLAB symbolic math script that calculates a couple of really large matrices for me, which I then convert into a function via matlabFunction. The problem is that the matrices contain integrals, and when matlabFunction converts them, it creates loads of intermediate terms (about 200, which is why I'm trying not to do this manually) that contain the variable I'm integrating over, causing errors because that variable isn't defined outside the integral() call. In the end I'm left with a bunch of nested equations that look like:
M = integral(@(s) f(t1,t2,t3,t4)) <-- t1 = g(et1,et2,et3,....) <-- et3 = h(s)
They're not all functions of every other one, but there are a lot of dependencies between them because of how MATLAB tried to simplify the expression. The error gets thrown when MATLAB tries to evaluate et3, notices it has an s in it (which is fine inside M because of the integral), then breaks because s isn't defined.
So far I've tried messing with the commands in MATLAB to make them stop generating terms like that, but everything I did seemed to make it worse, so I've moved to Excel, where my current idea is something like:
Start with each equation in its own cell
Break each cell at "=" to get the actual equations on their own
Find each placeholder variable and substitute the actual expression in
I've been trying to write a macro or lambda function for this (admittedly I don't really know how to do either of those well), but I keep getting stuck: I basically need the script to take a list of variables I want to get rid of and just keep substituting until they're all gone, and I'm not really sure how to do this in Excel.
I've also tried a couple of "multiFindAndReplace" scripts online that let you find and replace vectors of data, but none of them seem to work for this case, and the SUBSTITUTE function would end up needing to be nested dozens of times to get it to do what I want.
Does anyone have ideas on what to do here? Or is there a setting I'm missing in MATLAB to suppress all the intermediate terms?
EDIT: As soon as I posted this I had another idea - it looks like MATLAB generates the terms such that the et terms can't depend on one another and that no et term appears in more than one t term.
(also this is my first question, let me know if I put it in the wrong place)
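In case a small script outside Excel is an option, here is a rough Python sketch of the "keep substituting until they're all gone" idea. The definitions dictionary and the expression are made-up stand-ins for the terms matlabFunction generates, and the loop relies on the observation in the edit that the placeholders are not circularly defined:

import re

# Hypothetical stand-ins: map each placeholder name to the expression text
# that matlabFunction generated for it.
definitions = {
    "t1": "g(et1, et2, et3)",
    "et3": "h(s)",
}

def inline_placeholders(expr, definitions):
    # Keep substituting placeholder names until none of them remain.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, definitions)) + r")\b")
    while pattern.search(expr):
        expr = pattern.sub(lambda m: "(" + definitions[m.group(1)] + ")", expr)
    return expr

print(inline_placeholders("integral(@(s) f(t1, t2, t3, t4))", definitions))
# -> integral(@(s) f((g(et1, et2, (h(s)))), t2, t3, t4))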

What is happening in this line of code? What is 0.1,0.1 used for here?

I am currently working on the iris data set at a beginner level and I came across this code. I want to know what is happening in it. I did not understand what explode and the 0.1 values are doing here:
iris['variety'].value_counts().plot.pie((explode)=[0.1,0.1,0.1],autopct='%1.1f%%',shadow=False,figsize=(100,8))
This is badly written code. Actually, the parentheses around explode do absolutely nothing, so it should be written as follows (I shortened the statement a bit for readability):
plot.pie(explode=[0.1,0.1,0.1], autopct='%1.1f%%', shadow=False, figsize=(100,8))
In other words, this is just a named argument like any other.
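For reference, explode offsets each wedge away from the center of the pie by the given fraction of the radius, so 0.1 pulls every wedge out by 10%. A small self-contained sketch, with dummy counts standing in for the three iris varieties:

import pandas as pd
import matplotlib.pyplot as plt

counts = pd.Series([50, 50, 50], index=["Setosa", "Versicolor", "Virginica"])
counts.plot.pie(explode=[0.1, 0.1, 0.1],   # pull each wedge out by 10% of the radius
                autopct='%1.1f%%',         # label each wedge with its percentage, one decimal place
                shadow=False,
                figsize=(8, 8))            # a square figure; the question's (100, 8) is extremely wide
plt.show()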

By what language design principle is an argument interpreted either as a string or as a variable in GNU Octave?

Given an Octave script (in the dynamic-languages sense here) move.m defining function move(direction), it can be invoked from another script (or from the command line) in different ways: move left, move('left') or move(left). While the first two will instantiate direction with the string 'left', the last one will treat left as a variable.
The question is about the formal principle in language definition behind this. I understand that in the first mode the script is invoked as a command, considering that the rest of the command line is just data, not variables (pretty much as in a Linux prompt); while in the last two it is called as a function, interpreting what follows (between parentheses) as either data or variables. If this is a general design criterion among scripting languages, what is the principle behind it?
To answer your question: yes, this is by design, and it's syntactic sugar offered by MATLAB (and hence Octave) for running certain functions that expect only string arguments. Here is the relevant section in the MATLAB manual: https://uk.mathworks.com/help/matlab/matlab_prog/command-vs-function-syntax.html
I should clarify some misconceptions though. First, it's not "data" vs "variables". Any argument supplied in command syntax is simply interpreted as a string. So these two are equivalent:
fprintf("1")
fprintf 1
I.e., in fprintf 1, the 1 is not numeric data. It's a string.
Secondly, not all m files are "scripts". You calling your m file a script caused me some confusion. Your particular file contains a function definition and nothing else, so it's a function, 100%.
The reason this is important here is that all functions can be called either via functional syntax or command syntax (as long as it makes sense in terms of the expected arguments being strings), whereas scripts take no arguments, so there is no functional/command syntax at play; if you were passing 'arguments' to a script, you would be doing something wrong.
I understand that in the first mode, the script is invoked as a command [...]
As far as Octave goes, you are better off forgetting about that distinction. I'm not sure if a "command" ever existed, but it certainly does not exist now. The command syntax is just syntactic sugar in Octave. It makes interactive plot adjustment simpler, since the functions involved mainly take string arguments.
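The "Linux prompt" comparison in the question is a reasonable mental model: with command syntax, every token after the name arrives as text, just like shell arguments. Purely as an illustration of that analogy (this is Python, not Octave), command-line arguments behave the same way:

import sys

# Every argument arrives as a string, even ones that look numeric,
# just as `fprintf 1` passes the char array '1' rather than the number 1.
for arg in sys.argv[1:]:
    print(repr(arg), type(arg).__name__)   # always prints str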

Identifying left recursion in ANTLR4

A grammar (copied from its manual) reports the following left recursions when I uncommented the production casting_type -> constant_primary:
error(119): The following sets of rules are mutually left-recursive [primary, method_call_root, method_call, cast]
and [casting_type, constant_cast, cast, constant_primary, constant_function_call, function_subroutine_call, primary]
and [subroutine_call, function_subroutine_call, constant_function_call, constant_primary, method_call, method_call_root, casting_type, primary, constant_cast, cast]
The above error report has 3 sets of rules. The third set has 2 left-recursions in it:
casting_type,constant_primary,constant_cast,casting_type
casting_type,constant_primary,constant_function_call,function_subroutine_call,subroutine_call,method_call,method_call_root,primary,cast,casting_type
Since this error was reported after I uncommented one production, I think it is reasonable to expect to see at least its rule names (casting_type, constant_primary) in each set. Clearly the first set lacks both of these names, so it cannot contain the recursion introduced by that production. And the second set (I cannot give the full grammar here because it is too long) contains the first recursion above plus some extra names which do not seem relevant.
My question is: why is ANTLR printing the first and second sets of rules?
Is this a bug in ANTLR (I tried 4.6 and 4.7, same result), or is it hinting that I am missing something in these sets?
I saw a similar post elsewhere, where the reported names did not indicate a recursion, but on deeper analysis recursion was found somewhere else.
Probably nobody can really answer your question, not even the authors of ANTLR. To me it looks like you are getting follow-up errors which do not make much sense, because a real error made analysis impossible (or at least can lead to wrong conclusions). Of course there could also be a bug in ANTLR, but I recommend focusing on one of the sets and fixing that (if you can see what makes its rules mutually left-recursive). Maybe the other errors will disappear then, or you will have to analyze again.

How to use values (as Column) in functions (from the functions object) where Scala non-SQL types are expected?

I'd like to understand how I can dynamically add a number of days to a given timestamp. I tried something similar to the example shown below. The issue here is that the second argument is expected to be of type Int, but in my case it returns type Column. How do I unbox this / get the actual value? (The code examples below might not be 100% correct as I'm writing this from the top of my head ... I don't have the actual code with me currently.)
myDataset.withColumn("finalDate",date_add(col("date"),col("no_of_days")))
I tried casting:
myDataset.withColumn("finalDate",date_add(col("date"),col("no_of_days").cast(IntegerType)))
But this did not help either, so how can this be solved?
I did find a workaround by using selectExpr:
myDataset.selectExpr("date_add(date,no_of_days) as finalDate")
While this works, I still would like to understand how to get the same result with withColumn.
withColumn("finalDate", expr("date_add(date,no_of_days)"))
The above syntax should work.
I think it's not possible as you'd have to use two separate similar-looking type systems - Scala's and Spark SQL's.
What you call a workaround using selectExpr is probably the only way to do it, as it keeps you confined to a single type system, Spark SQL's; since the parameters are all defined in Spark SQL's "realm", that's the only possible way.
myDataset.selectExpr("date_add(date,no_of_days) as finalDate")
BTW, you've just shown me another way in which SQL support differs from the Dataset Query DSL: the source of the parameters to functions -- only from structured data sources, only from Scala, or a mixture thereof (as in UDFs and UDAFs). Thanks!
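For what it's worth, the same expr bridge exists in the Python API, which may make the mechanism easier to try out interactively. A quick PySpark sketch (the column names match the question; everything else is made up):

import datetime
from pyspark.sql import SparkSession
from pyspark.sql.functions import expr

spark = SparkSession.builder.master("local[1]").appName("date_add_demo").getOrCreate()
df = spark.createDataFrame(
    [(datetime.date(2017, 1, 1), 3), (datetime.date(2017, 6, 15), 10)],
    ["date", "no_of_days"])

# expr() hands the whole expression to Spark SQL, so both arguments are
# resolved as columns and no plain Int value is ever needed.
df.withColumn("finalDate", expr("date_add(date, no_of_days)")).show()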
