I've created an Oracle trigger which executes an external Python file through DBMS_SCHEDULER.RUN_JOB(), but it executes the Python file first and then inserts the row into the table. I want exactly the opposite order of operations.
CREATE OR REPLACE TRIGGER sample
AFTER INSERT ON client
BEGIN
  DBMS_SCHEDULER.RUN_JOB('JOB_CONTAINING_PYTHON_FILE');
END;
Tell me the right way to do this.
There's a difference between the row(s) being inserted into the table and those rows being visible to another session. Until the data is committed, those inserted rows cannot be seen by any other transaction. If your Python code tries to connect to the database to look at those rows, it won't see them.
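For example, if the scheduled job runs a script along these lines (a minimal sketch assuming cx_Oracle and made-up credentials; this is not your actual job), the query runs in a new session and therefore only sees committed data:
import cx_Oracle

# Hypothetical credentials/DSN, for illustration only.
conn = cx_Oracle.connect("app_user", "app_password", "dbhost/orclpdb1")
cur = conn.cursor()

# This runs in a separate session, so it cannot see the row that fired
# the trigger until that transaction commits.
cur.execute("SELECT COUNT(*) FROM client")
print(cur.fetchone()[0])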
Equally, your transaction can't report back to the client (in your case SQL Developer) that the insert succeeded until the trigger has completed. In this case it needs to wait until the Python call has completed before returning.
Generally triggers are considered 'bad practice', though they do have some good applications. Having a session wait on an external task is also something to avoid. I'd recommend you rethink your approach to whatever you are trying to achieve.
How do you know that?
Think!
you have a table
there's a trigger on that table
trigger fires when you insert data into the table and ...
... calls the Python script
So, how can that script run before inserting a row, if exactly that action - a row being inserted - triggers & runs the Python script?
Unless you prove me wrong, everything should be OK.
I'm not familiar with working in Azure Data Factory, but I have a work requirement to have some processing run in that environment.
I have a stored procedure that creates a result set. I've read about a Lookup activity; that may be what I need to use. I want to call the stored procedure and put the result set into a mass-storage file. Ideally I'd like the process to insert pipe delimiters between the columns, but if no Azure process does that, I can put my own delimiters in the stored procedure directly.
What is the process in Data Factory to use to call a stored procedure and put the data set into a mass-storage file?
TIA
I'm trying to research my options at this point. As mentioned, it appears a Lookup activity may be what I need to use?
You can use copy activity for this.
Here is a demo I have reproduced.
Sample Stored procedure for selecting the data:
create or alter procedure sp1
as
begin
select * from copy_table3
end
In your stored procedure create your result set and select from it.
My sample copy_table3 table:
In the copy activity source, choose the stored procedure option after selecting your SQL dataset.
In the dropdown you can select your stored procedure, and if it has parameters you can import them above.
In the sink I have used a blob CSV (delimited text) dataset, and this is my resulting CSV after execution. If you need pipe-delimited output, you can set the delimited text dataset's column delimiter to a pipe (|) instead of a comma.
I have a Dash application that queries an API, based on a user search query, performs some calculations on the response, then displays the final results to the user on a Dash app. In order to provide a quick response to the user, I am trying to set up a quick result callback and a full result long_callback.
The quick result will grab limited results from the API and display results to the user within 10-15 seconds, while the full results will run in the background, collecting all results (which can take up to 2 minutes), then updates the page with the full results when they are available.
I am curious what the best way to perform this action is, as I have run into forking issues with my current attempt.
My current attempt: using diskcache.Cache() as the backend for the DiskcacheLongCallbackManager and a database txt file to store the availability of results.
I have a database txt file that stores a dictionary, with the keys being the search query and the fields being quick_results: bool, full_results: bool, file_path: str, timestamp: dt (as str).
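For illustration, an entry in that database file looks roughly like this (the query, path, and timestamp values below are made-up placeholders):
{
    "example search query": {
        "quick_results": True,
        "full_results": False,
        "file_path": "results/example_search_query.feather",
        "timestamp": "2022-06-01 10:15:00",
    }
}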
When a search query is entered and submit is pressed, a callback loads the database file as a variable and then checks the dictionary keys for the presence of this search query.
If it finds the query in the keys of the database, it loads the saved feather file from the provided file_path and returns it to the dash app for generation of the page content.
If it does not find the query in the database keys, it requests limited data from the API, runs calculations, saves the DataFrame as a feather file on disk, then creates an entry in the database with the search query (as the key), the file path of the saved feather file, the current timestamp, and sets the quick_results value to True.
It then loads this feather file from the file_path created and returns it to the dash app for generation of the page content.
A long_callback is triggered at the same time as the above callback, with a 20 second sleep to prevent overlap with the quick search. This callback also loads the database file as a variable and checks if the query is present in the database keys.
If found, it then checks if the full results value is True and if the timestamp is more than 0 days old.
If the full results are unavailable or are more than 0 days old, the long_callback requests full results from the API, performs the same calculations, then updates the already existing search query in the database, making the full_results True and the timestamp the time of completion for the full search.
It then loads the feather file from the file_path and returns it to the dash app for generation of the page content.
If the results are available and less than 1 day old, the long callback simply loads the feather file from the provided file_path and returns it to the dash app for generation of the page content.
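For context, here is a stripped-down sketch of how the two callbacks are wired up (component IDs and the placeholder return values are illustrative; the real callbacks do the database/feather work described above):
import time
import diskcache
from dash import Dash, Input, Output, State, html, dcc
from dash.long_callback import DiskcacheLongCallbackManager

cache = diskcache.Cache("./cache")
long_callback_manager = DiskcacheLongCallbackManager(cache)

app = Dash(__name__, long_callback_manager=long_callback_manager)
app.layout = html.Div([
    dcc.Input(id="search"),
    html.Button("Submit", id="submit"),
    html.Div(id="quick-results"),
    html.Div(id="full-results"),
])

@app.callback(
    Output("quick-results", "children"),
    Input("submit", "n_clicks"),
    State("search", "value"),
    prevent_initial_call=True,
)
def quick_search(n_clicks, query):
    # Real version: check the database file, run the limited API request
    # if needed, save the feather file, and build the page content.
    return f"Quick results for {query}"

@app.long_callback(
    output=Output("full-results", "children"),
    inputs=Input("submit", "n_clicks"),
    prevent_initial_call=True,
)
def full_search(n_clicks):
    time.sleep(20)  # avoid overlapping with the quick search
    # Real version: run the full API request if results are missing or stale,
    # update the database entry, then load and return the feather file.
    return "Full results"

if __name__ == "__main__":
    app.run_server(debug=True)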
The problem I am currently facing is that I am getting a weird forking error on the long callback on only one of the conditions for a full search. I currently have the long_callback set up to perform a full search only if the full results flag is False or the results are more than 0 days old. When the full_results flag is False, the callback runs as expected, updates the database and returns the full results. However, when the results are available but more than 0 days old, the callback hits a forking error and is unable to complete.
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec(). Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
I am at a loss as to why the function would run without error on one of the conditions, but then have a forking error on the other condition. The process that runs after both conditions is exactly the same.
By using print statements, I have noticed that this forking error triggers when the function tries to call the requests.get() function on the API.
If this issue is related to how I have set up the background process functionality, I would greatly appreciate some suggestions or assistance on how to do this properly, where I will not face this forking error.
If there is any information I have left out that will be helpful, please let me know and I will try to provide it.
Thank you for any help you can provide.
My development instance of Accumulo became quite messy with a lot of tables created for testing.
I would like to bulk delete a large number of tables.
Is there a way to do it other than deleting the entire instance?
BTW - If it's of any relevance, this instance is just a single machine "cluster".
In the Accumulo shell, you can specify a regular expression for table names to delete by using the -p option of the deletetable command.
I would have commented on original answer, but I lack the reputation (first contribution right here).
It would have been helpful to provide a legal regex example.
The Accumulo shell can only escape certain characters; in particular it will not escape brackets []. If you want to remove every table starting with the string "mytable", the otherwise legal regex commands below produce the following warning/error.
user#instance> deletetable -p mytable[.]*
2016-02-18 10:21:04,704 [shell.Shell] WARN : No tables found that match your criteria
user#instance> deletetable -p mytable[\w]*
2016-02-18 10:21:49,041 [shell.Shell] ERROR: org.apache.accumulo.core.util.BadArgumentException: can only escape single quotes, double quotes, the space character, the backslash, and hex input near index 19
deletetable -p mytable[\w]*
A working shell command would be:
user#instance> deletetable -p mytable.*
There is not currently (as of version 1.7.0) a way to bulk delete many tables in a single call.
Table deletion is actually done in an asynchronous way. The client submits a request to delete the table, and that table will be deleted at some point in the near future. The problem is that after the call to delete the table is performed, the client then waits until the table is deleted. This blocking is entirely artificial and unnecessary, but unfortunately that's how it currently works.
Because each individual table deletion appears to block, a simple loop over the table names to delete them serially is not going to finish quickly. Instead, you should use a thread pool, and issue delete table requests in parallel.
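As a rough illustration (not tested against your instance; the credentials, table names, and exact shell flags are placeholders and may differ by version), you could shell out to the Accumulo shell from a thread pool:
import subprocess
from concurrent.futures import ThreadPoolExecutor

tables = ["mytable1", "mytable2", "mytable3"]  # the tables you want to drop

def delete_table(name):
    # -e executes a single shell command; -f skips the confirmation prompt.
    subprocess.run(
        ["accumulo", "shell", "-u", "user", "-p", "secret",
         "-e", "deletetable -f " + name],
        check=True,
    )

# Issue the delete requests in parallel so the artificial blocking overlaps.
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(delete_table, tables))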
A bulk delete table command would be very useful, though. As an open source project, a feature request on their issue tracker would be most welcome, and any contributions to implement it, even more so.
I have a requirement such that whenever I run my Kettle job, the database connection parameters must be taken dynamically from an Excel source on each run.
Say I have an Excel file with the column names HostName, Username, Database, Password.
I want to pass these connection parameters to my Table Input step dynamically whenever the job runs.
This is what I was trying to do.
You can achieve this by
reading the DB connection parameters from a source (e.g. Excel or in my example a CSV file)
storing the parameters in variables
using the variables in your connection settings.
Proceed as follows:
Create another transformation for setting the variables (you cannot do this in the same transformation that uses it):
In the Set Variables element configure the variables:
In the element reading/writing your data create a new connection and set the connection parameters using ${variable_name}. Note that you have to blindly write ${password} into the appropriate field. Also note that this may be a security issue because the value may show up as plain text in log files!
In your job call the variable transformation first and then the functional part:
All you need is the XLS input and the Set Variables step. Define your variables as valid in the root job and you can then use them when defining the connection in any transformation or job called by that same root job.
The "Copy rows to result" and "Get rows from result" steps are used to send information (rows of data) from one transformation to the next transformation or job within the same parent job. They're not used to send data between steps; that's what the hops are for.
I am writing a plperl function for my trigger. When an INSERT/UPDATE happens, my plperl script runs, and in it I dynamically build a query based on the event I receive. I want to print that query in the terminal when I do the insert/update, but nothing is printed. How can I print it?
Use the elog function to raise notices, e.g. elog(NOTICE, $query); the message is sent back to your client session (and, depending on your logging settings, to the server log). You can also use it to raise full errors with elog(ERROR, ...).