How do I find the column listing in UniVerse with RetrieVe or SQL?

I've got an issue where a table (file) is set up to return column foo on LIST table and SELECT * FROM table. I need to know the other possible columns in the table. I'm pretty sure this was achieved by setting # (the behavior definition for an unqualified LIST) and #select (the behavior definition for * in every SELECT), but I don't know how to get the full list of columns. How do I read the table schema in uvsh and query for the physical table columns?
Running LIST.ITEM on the table shows me a list of all of the field numbers and values, but how do I find the display name and column name of the numbered fields?

A previous answer I received on SO mentioned LIST DICT as a way to get some metadata, and this was in fact what I wanted. The official documentation uses LIST DICT; on my system I initially thought there was no LIST DICT, but there is, and it requires a file argument. It isn't a separate command either (many commands have spaces in them); instead, in UniVerse 10.1, LIST is defined as:
LIST [ DICT | USING [ DICT ] dictname ] filename [ records | FROM n ]
[ selection ] [ output.limiter ] [ sort ] [ output ] [ report.qualifiers ] [TOXML
[ELEMENTS] [WITHDTD] [XMLMAPPING mapping_file]]
So in summary: the same verb (LIST) that queries the data is used to query the schema, using the same file name.
Originally, when I presumed there was no LIST DICT, I went searching through the VOC file with RetrieVe using LIST VOC WITH NAME MATCHING LIST... I was able to identify a like-named LIST.DICT, a paragraph that displays the contents of dictionaries sorted by record type. This did exactly what I wanted, except the result was an unmanageable list of 400 rows. I don't see documentation for LIST.DICT anywhere, and record qualifiers and report qualifiers don't seem to work on LIST.DICT the way they do on LIST. All of this compounded my confusion; in UniVerse parlance, LIST.DICT is a paragraph (a stored statement), while LIST is the verb I needed.
So now back to my questions:
Any idea on how to make the output of LIST DICT manageable?
You can use report qualifiers and explicitly state columns, either with the positional F# syntax or by naming the columns.
LIST DICT <file> <columns>
On my system, you can get a listing of the field names and their display names, for instance, by issuing
LIST DICT <file> NAME
The NAME comes from the master dictionary, which can be queried using LIST DICT DICT.DICT.
Now, I can see the fields in a nice (fairly clean) list, but I haven't the slightest idea of how to query a file for all of its fields.

Here are the basic variants:
LIST DICT foo NAME
SELECT #ID, NAME FROM DICT foo;
These will give you a physical location that corresponds to the LIST.ITEM verb:
SORT DICT foo WITH TYPE EQ "D" BY LOC LOC NAME
SORT DICT foo WITH TYPE EQ "D" BY LOC LOC NAME TOXML
Note that the "column name" or #ID is displayed by default during a LIST or SORT. TOXML can be useful, but there are a host of other XML features built in.

In UniVerse every file has an associated dictionary file. The dictionary file is basically just a data file and can be treated exactly like one for a variety of purposes. Three things make a dictionary file special:
You access it via the "DICT" keyword in front of the data file name.
The LIST command (and related commands) will use it by default when processing the associated data file.
It has a structure that is defined by the DICT.DICT file, which you need to follow in order for item 2 above to work.
Generally, dictionaries are maintained manually by programmers and database administrators. There are no controls in UniVerse to guarantee that DICT records are created for every field in the associated data file, and no reason you can't have many DICT records for each field. Dictionary items are used to control output formatting and conversions, so it's normal to have multiple DICT items per data field.
Dictionary records can also join data fields together, perform operations on multiple fields and grab data from other files. So at times it's not even clear what data field a DICT record actually relates to.
The only way to come up with a simple list of dictionary items that correspond to a data file is by inspection. Use the LIST DICT filename command and find the entries with the least amount of data manipulation in their formatting fields.
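As a rough starting point for that inspection (this is a sketch; CONV is the conversion field name in the dictionary on my system, so adjust the field names for yours), you can narrow the listing to plain D-type items with no conversion applied:
SORT DICT filename WITH TYPE EQ "D" AND WITH CONV EQ "" BY LOC LOC CONV NAME
Items that survive this filter tend to map one-to-one onto physical fields; anything with a conversion or an I-type is more likely a derived or reformatted view of the data.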

Some more useful statements that could be helpful to you:
SORT DICT filename
(this is the same as list, except the result is sorted)
SORT ONLY DICT filename
(this displays only the dictionary record IDs)

Related

SharePoint: Change item in the source table (lookup table)

I am new to SharePoint. I have multiple lists with lookups. In list A (the target) I have multiple columns from list B (the source). Currently I have data in both lists.
Now I need to change some values for some items (rows) in list B (the source). I am curious whether, if I change the rows in list B, it will affect the existing lookup data in list A (where the previous (old) row data is used).
I am sorry if this is a trivial question. I am new, and the data is too critical to risk a mistake.
Can changing (some) data in the source list cause issues in the target list?
If you change (some) data in the source list, the source data selected by the lookup column will automatically change accordingly, and no error will be reported.

GCP Data Catalog - search columns containing a dot in the column name

Take the public github dataset as an example
SELECT *
FROM `bigquery-public-data.github_repos.commits`
LIMIT 2
There are column names like
difference.old_mode
via search:
column:difference.old_mode
will show no results
So, in this case the period isn't actually part of the column name; it's an indication that you're dealing with a complex type (there's a record/struct column named difference, and within that exists a column named old_mode).
Per the search reference, there's no special syntax documented for complex schemas.
A suggestion might be to leverage a logical AND, like column:(difference,old_mode). It's not as precise as specifying the column relationship, but it should return the results you're interested in.

Python3 Pandas dataframes: beside columns names are there also columns labels?

Many database management systems, such as Oracle, SQL Server or even statistical software like SAS, allow having, beside field names, also field labels.
E.g., in DBMS one may have a table called "Table1" with, among other fields, two fields called "income_A" and "income_B".
Now, in the DBMS logic, "income_A" and "income_B" are the field names.
Beside a name, those two fields can also have plain-English labels associated with them, which clarify their actual meaning, such as "A - Income of households with dependent children where both parents work and have a post-degree level of education" and "B - Income of empty-nester households where only one parent works".
Is there anything like that in Python3 Pandas dataframes?
I mean, I know I can give a dataframe column a "label" (which is, seen from the above DBMS perspective, more like a "name", in the sense that you can use it to refer to the column itself).
But can I also associate a longer description to the column, something that I can choose to display instead of the column "label" in print-outs and reports or that I can save into dataframe exports, e.g., in MS Excel format? Or do I have to do it all using data dictionaries, instead?
It does not seem that there is a way to store such meta info other than in the column name itself. But a column name can be quite verbose; I tested up to 100 characters. Make sure to pass the names as a collection.
Such a long name could be annoying to use for indexing in code. You could use loc/iloc, or assign the name to a string variable for use in indexing.
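One workable pattern (a sketch; the labels dict here is an ordinary Python dict maintained by hand, not a pandas feature) is to keep short, code-friendly column names on the DataFrame and hold the long descriptions in a separate mapping, renaming only at export or report time:

```python
import pandas as pd

# Working DataFrame keeps short names that are convenient for indexing.
df = pd.DataFrame({"income_A": [100, 200], "income_B": [300, 400]})

# Long, human-readable labels live in a plain dict (a small data dictionary).
labels = {
    "income_A": "A - Income of households with dependent children",
    "income_B": "B - Income of empty-nester households",
}

# rename returns a new DataFrame with the verbose labels, leaving df untouched,
# so code keeps using the short names while reports show the long ones.
report = df.rename(columns=labels)
print(report.columns.tolist())
```

The renamed copy can then be passed to to_excel or to_csv for the export, while all the indexing in the rest of the code still uses the short names.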
In[10]: pd.DataFrame([1, 2, 3, 4],columns=['how long can this be i want to know please tell me'])
Out[10]:
how long can this be i want to know please tell me
0 1
1 2
2 3
3 4
This page shows that the columns don't really have any attributes to play with other than the labels:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.columns.html
There is some more info you can get about a dataframe:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.info.html

Using arraylist in decision tables

IBM ODM
I was working on a decision table which has to add elements from its columns to an ArrayList in the action column.
I have an attribute with several names. I would like to add some of those names to an ArrayList so that I can exclude them from execution for a particular rule.
If anyone has other options apart from using a decision table, that would be appreciated.
I created a variable in the Variable List which is of type Array. Then I created a decision table which checks the rule name and adds the "String" (object name) to the array. Now we are all set to exclude that particular String/Object for particular rules: in the definition part of your rule, check "where this xxx is not one of Array". This is useful when you have a lot of Strings/Objects to exclude from execution.

Handling the following use case in Cassandra?

I've been given the task of modelling a simple feed system in Cassandra. Coming from an almost solely SQL background, though, I'm having a bit of trouble figuring it out.
Basically, we have a list of feeds that we're listening to that update periodically. This can be in RSS, JSON, ATOM, XML, etc (depending on the feed).
What we want to do is periodically check for new items in each feed, convert the data into a few formats (i.e. JSON and RSS) and store that in a Cassandra store.
So, in an RBDMS, the structure would be something akin to:
Feed:
feedId
name
URL
FeedItem:
feedItemId
feedId
title
json
rss
created_time
I'm confused as to how to model that data in Cassandra to facilitate simple things such as getting x amount of items for a specific feed in descending created order (which is probably the most common query).
I've heard of one strategy that mentions having a composite key storing, in this example, the created_time as a time-based UUID together with the feed item ID, but I'm still a little confused.
For example, let's say I have a series of rows whose key is basically the feedId. Inside each row, I store a range of columns as mentioned above. The question is: where does the actual data go (i.e., JSON, RSS, title)? Would I have to store all the data for that 'record' as the column value?
I think I'm confusing wide rows and narrow (short?) rows as I like the idea of the composite key but I also want to store other data with each record and I'm not sure how to meld the two together...
You can store everything in one column family. However, if the data for each FeedItem is very large, you can split the data for each FeedItem into another column family.
For example, you can have one column family for Feed, where the columns of each key are FeedItem ids, something like:
Feeds # column family
FeedId1 #key
time-stamp-1-feed-item-id1 #columns have no value, or values are enough info
time-stamp-2-feed-item-id2 #to show summary info in a results list
The Feeds column family allows you to quickly get the last N items from a feed; querying for the last N items doesn't require fetching all the data for each FeedItem, since either nothing is fetched or just a summary.
Then you can use another column family to store the actual FeedItem data,
FeedItems # column family
feed-item-id1 # key
rss # 1 column for each field of a FeedItem
title #
...
Given your SQL background, using CQL should be easier for you to understand.
Cassandra (and NoSQL in general) is very fast, and you don't get real benefits from using a separate related table for feeds; in any case you will not be able to do JOINs. Obviously you can still create two tables if that's more comfortable for you, but you will have to manage linking the data inside your application code.
You can use something like:
CREATE TABLE FeedItem (
    feedItemId ascii PRIMARY KEY,
    feedId ascii,
    feedName ascii,
    feedURL ascii,
    title ascii,
    json ascii,
    rss ascii,
    created_time ascii
);
Here I used ascii fields for everything. You can choose different data types for feedItemId or created_time; the available data types can be found here, and depending on which language and client you are using, it can be transparent or require some more work to make them work.
You may want to add some secondary indexes. For example, if you want to search for feed items from a specific feedId, something like:
SELECT * FROM FeedItem where feedId = '123';
To create the index:
CREATE INDEX FeedItem_feedId ON FeedItem (feedId);
Sorting and ordering, alas, are not easy in Cassandra. Reading here and here can give you some clues on where to start looking; it also really depends on the Cassandra version you're going to use.
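For what it's worth, if you can use CQL 3 (Cassandra 1.2 or later), the common "last N items of a feed, newest first" query can be expressed directly in the schema with a compound primary key and a clustering order. A sketch, with illustrative table and column names (not from the question):
CREATE TABLE FeedItemsByFeed (
    feedId ascii,
    created_time timeuuid,
    title text,
    json text,
    rss text,
    PRIMARY KEY (feedId, created_time)
) WITH CLUSTERING ORDER BY (created_time DESC);

SELECT title, created_time FROM FeedItemsByFeed WHERE feedId = '123' LIMIT 10;
Here feedId is the partition key and created_time is a clustering column, so rows within each feed's partition are stored in descending time order and the LIMIT query is a fast slice with no secondary index needed.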
