Can data inserted into an Oracle temporary table (from a different session) be accessed from a scheduler job?

I have a program that inserts the current user's information into a temporary table (visible only to that user's session / transaction). I also have an Oracle scheduler job that sends a mail to the admin with the user's information. The user information in the mail is not correct (it contains the wrong values).
1. Can anyone tell me whether the temp table data can be accessed from a scheduler job?
2. If so, which session's data will it use?

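No, it can't. Data in a temporary table is private to the session that inserted it (Reference), and a scheduler job runs in its own session, so it never sees the rows your program inserted.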

Related

Best way to run a script for large userbase?

I have users stored in a PostgreSQL database (~10 M) and I want to send all of them emails.
Currently I have written a Node.js script which fetches users 1000 at a time (OFFSET and LIMIT in SQL) and queues the requests in RabbitMQ. This seems clumsy to me: if the Node process fails at any time, I have to restart it (I am currently keeping track of the number of users skipped per query, and can restart from the last skipped count found in the logs). This might lead to some users receiving duplicate emails and some not receiving any. I could create a new table with a column indicating whether the email has been sent to that person, but in my current situation I can't do so. I can neither create a new table nor add a new column to an existing table. (Seems to me like an idempotency problem?)
How would you approach this problem? Do you think compound indexes might help? Please explain.
The best way to handle this is indeed to store who received an email, so there's no chance of doing it twice.
If you can't add tables or columns to your existing database, just create a new database for this purpose. If you want to be able to recover from crashes, you will need to store who got the email somewhere. So if you are given hard restrictions on not storing this in your main database, get creative with another storage mechanism.
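As a rough illustration of storing who received an email outside the main database, here is a minimal Python sketch. It assumes a separate SQLite file as the ledger; the table name sent_emails and the functions open_ledger, claim and send_all are hypothetical names made up for this example, and send stands in for whatever queues the message (for instance a RabbitMQ publish).

```python
import sqlite3


def open_ledger(path=":memory:"):
    """Open (or create) a separate store recording which users were handled."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sent_emails (user_id INTEGER PRIMARY KEY)"
    )
    conn.commit()
    return conn


def claim(ledger, user_id):
    """Record user_id in the ledger; return True only the first time."""
    cur = ledger.execute(
        "INSERT OR IGNORE INTO sent_emails (user_id) VALUES (?)", (user_id,)
    )
    ledger.commit()
    # rowcount is 0 when the primary key already existed and the insert was ignored
    return cur.rowcount == 1


def send_all(user_ids, ledger, send):
    """Call send(user_id) for every user not yet in the ledger; return the count."""
    sent = 0
    for user_id in user_ids:
        if claim(ledger, user_id):
            send(user_id)
            sent += 1
    return sent
```

Because each user is recorded before send is called, rerunning the script after a crash skips everyone already recorded, so nobody gets a duplicate. The trade-off is that a crash between the claim and the send leaves that one user without an email; recording after sending instead would flip the risk to a possible duplicate.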

How to seed data with CloudKit?

I need to create some records in CloudKit for each user when they start an app.
I can't just write a seed function that creates records, because when the user starts the app on two devices, each device will write its own seed records.
What I want instead is for the first device that writes to CloudKit to create the records, and for the second device to simply update the values of those records, not recreate them.
How can I achieve this?
You have a few options available to you. All of them could potentially lead to race conditions when both devices attempt to write at the same time, but the likelihood of that actually happening is small.
No matter which approach is taken, you should always take the stance of query first. Check if the record exists, update it if needed, then write the new/updated values.
So, in your example:
The first app would query for the record, and create the record - because no record exists.
The second app to launch would query for the record, find it, then do nothing, because the record exists.
Each record in CloudKit maintains a modificationDate. So if you are really concerned about overwriting data that shouldn't be overwritten, you can add additional queries and date checks to determine whether the write should happen.
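The query-first flow with a date check can be sketched as follows. This is plain Python, not CloudKit code: a dict stands in for the remote database, and seed_or_update, record_id and modified_at are hypothetical names. It also assumes both devices address the record by the same fixed identifier.

```python
from datetime import datetime, timezone


def seed_or_update(store, record_id, values, modified_at):
    """Create the record if it is absent; otherwise update it only if newer.

    `store` is a plain dict standing in for the remote database (assumption).
    Returns "created", "updated" or "skipped".
    """
    existing = store.get(record_id)  # query first
    if existing is None:
        store[record_id] = {
            "values": dict(values),
            "modificationDate": modified_at,
        }
        return "created"
    if modified_at > existing["modificationDate"]:
        existing["values"].update(values)
        existing["modificationDate"] = modified_at
        return "updated"
    return "skipped"  # the stored record is at least as recent as ours


store = {}
first_launch = datetime(2020, 1, 1, tzinfo=timezone.utc)
seed_or_update(store, "settings", {"theme": "light"}, first_launch)
```

This does not remove the race mentioned above: two devices that both query before either has written will both take the "created" branch.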

Chaining spark sql queries over temporary views?

We're exploring the possibility to use temporary views in spark and relate this to some actual file storage - or to other temporary views. We want to achieve something like:
1. Some user uploads data to some S3/hdfs file storage.
2. A (temporary) view is defined so that spark sql queries can be run against the data.
3. Some other temporary view (referring to some other data) is created.
4. A third temporary view is created that joins data from (2) and (3).
5. By selecting from the view from (4) the user gets a table that reflects the joined data from (2) and (3). This result can be further processed and stored in a new temporary view and so on.
So we end up with a tree of temporary views, each querying its parent temporary views until they end up loading data from the filesystem. Basically we want to store transformation steps (selecting, joining, filtering, modifying etc.) on the data without storing new versions. The Spark SQL support and temporary views seem like a good fit.
We did some successful testing. The idea is to store the specification of these temporary views in our application and recreate them during startup (as temporary or global views).
We're not sure whether this is a viable solution. One problem is that we need to know how the temporary views are related (which one queries which). We create them like:
sparkSession.sql("select * from other_temp_view").createTempView(name)
So, when this is run, we have to make sure that other_temp_view has already been created in the session. We're not sure how this can be achieved. One idea is to store a timestamp and recreate the views in the same order. This could be OK since our views will most likely have to be "immutable": we're not allowed to change a query that other queries rely on.
Any thoughts would be most appreciated.
I would definitely go with the SessionCatalog object:
https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-SessionCatalog.html
You can access it with spark.sessionState.catalog
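Independently of the catalog, the ordering problem from the question (each view must exist before any view that selects from it) can be handled by storing each view's dependencies next to its SQL and recreating the views in topological order. The sketch below is plain Python and assumes the application keeps a dict of view specifications; creation_order, recreate_views and the create_view callback are hypothetical names, not Spark APIs.

```python
from graphlib import TopologicalSorter


def creation_order(view_specs):
    """Return view names so that every view follows the views it selects from.

    view_specs maps a view name to (sql_text, names_of_views_it_reads).
    Dependencies that are not themselves stored views (for example plain
    tables) are ignored. A cyclic definition raises graphlib.CycleError.
    """
    graph = {
        name: {dep for dep in deps if dep in view_specs}
        for name, (_sql, deps) in view_specs.items()
    }
    return list(TopologicalSorter(graph).static_order())


def recreate_views(view_specs, create_view):
    """Call create_view(name, sql_text) for each view in dependency order."""
    for name in creation_order(view_specs):
        sql_text, _deps = view_specs[name]
        create_view(name, sql_text)
```

At startup, create_view would wrap the call from the question, for example something like lambda name, sql: spark.sql(sql).createTempView(name) in PySpark. Compared with ordering by timestamp, an explicit dependency list also makes it possible to detect cycles and to find which views rely on a given query.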

Restricting access to Excel source data

I have an Excel template which reads data from a source Excel file using VLOOKUP and INDEX/MATCH functions. Is there a way to prevent the end user from accessing the source data file/sheet, e.g. by storing the source file in a remote location and making the lookups read from there?
Depending on what resources are available to you, it may be difficult to prevent users from just going around the restrictions you put in place. Even if the data is in a database table you will need measures in place to prevent users from querying it outside of your Excel template. I don't know your situation, but ideally there would be someone (i.e. database administrator, infosec, back-end developer) who could help engineer a proper solution.
Having said that, I do believe your idea around using MS SQL Server could be a good way to go. You could create stored procedures instead of using sql queries to limit access. See this link for more details:
Managing Permissions with Stored Procedures in SQL Server
In addition, I would be worried about users figuring out other user IDs and arbitrarily accessing data. You could implement some sort of protection by having a mapping table so that there's no way to access information with user IDs. The table would be as follows:
Columns: randomKey, userId, creationDateTime
randomKey is just an x-digit random number/letter sequence
creationDateTime is a timestamp used for timeout purposes
Whenever someone needs a user id you would run a stored procedure that adds a record to the mapping table. You input the user id, the procedure creates a record and returns the key. You provide the user with the key which they enter in your template. A separate stored procedure takes the key and resolves to the user id (using the mapping table) and returns the requested information. These keys expire. Either they can be single use (the procedure deletes the record from the mapping table) or use a timeout (if creationDateTime is more than x hours/days old it will not return data).
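To make the key flow concrete, here is a minimal sketch of the two procedures. It is written in Python against SQLite rather than as SQL Server stored procedures, so it only illustrates the logic; the table name key_map and the functions open_mapping, issue_key and resolve_key are hypothetical, and the now parameter exists only to make the timeout easy to exercise.

```python
import secrets
import sqlite3
import time


def open_mapping():
    """Create an in-memory mapping table with the three columns described above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE key_map ("
        "randomKey TEXT PRIMARY KEY, "
        "userId TEXT NOT NULL, "
        "creationDateTime REAL NOT NULL)"
    )
    return conn


def issue_key(conn, user_id, now=None):
    """Add a mapping record for user_id and return its random key."""
    key = secrets.token_hex(8)  # 16 hex characters
    created = time.time() if now is None else now
    conn.execute("INSERT INTO key_map VALUES (?, ?, ?)", (key, user_id, created))
    conn.commit()
    return key


def resolve_key(conn, key, max_age_seconds=3600, single_use=True, now=None):
    """Return the user id for key, or None if the key is unknown or expired."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT userId, creationDateTime FROM key_map WHERE randomKey = ?",
        (key,),
    ).fetchone()
    if row is None:
        return None
    user_id, created = row
    expired = now - created > max_age_seconds
    if single_use or expired:
        # Single-use keys are deleted on first use; expired keys are cleaned up
        conn.execute("DELETE FROM key_map WHERE randomKey = ?", (key,))
        conn.commit()
    return None if expired else user_id
```

In the setup described above, the caller of resolve_key would go on to fetch the requested information for the returned user id, so the template never handles a real user id directly.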
For the keys, Mark Ransom shared an interesting solution for creating random IDs on which you could base your logic:
Generate 6 Digit unique number
Sounds like a lot of work, but if there is sensitivity around your data it's worth building a more robust process around it. There's probably a better way to approach this, but I hope it at least gives you food for thought.
No, it's not possible.
Moreover, you absolutely NEED these files open to refresh the values in formulas that refer to them. When you open a file with external references, their values are calculated from a local cache (which may not match the actual contents of the remote file). When you open the remote files, the values refresh.

Global Temporary Table

Help me understand how a global temporary table works.
I have a process which is going to be threaded and requires data visible only to that thread's session, so we opted for a global temporary table.
Is it better to leave the global temporary table in place after all threads have completed, or is it wiser to drop the table? This process may be called once or twice a day.
Around four tables are required.
Oracle Temp tables are NOT like SQL Server #temp tables. I can't see any reason to continuously drop/create the tables. The data is gone on a per session basis anyways once the transaction or session is completed (depends on table creation options). If you have multiple threads using the same db session, they will see each other's data. If you have one session per thread, then the data is limited in scope as you mentioned. See example here.
If you do drop a global temporary table and recreate it, this has little impact on other database activity or on server disk I/O, because global temporary tables are created in the temp tablespace, where no archive logs are generated and checkpoints do not update the tempfile header. In that case the table serves only its temporary purpose.
