Reading columns with different valuetypes - cassandra

A SliceQuery<Long, String, String> says the key type is Long, the column name is String, and the column value is String. When I execute the slice query I get a QueryResult<ColumnSlice<String, String>> and can read all the columns, but the key is a Long, so there must be a column whose value is of type Long. It's a bit confusing to me how type safety works here, since the query result only carries a single column value type.
Also, if there is a column whose value type is not String, a problem must arise.
How can I write a generic SliceQuery that can be used to query columns of different value types?
P.S.: I am new to Cassandra/Hector.
Thanks

Almost. The first type is the type of the row key, as you point out, but the row key is not stored as a column. The row key is stored separately from the columns. This is one of those gotchas that folks coming from the relational DB world (like me) trip over.
As for how to manage column values with different types, there's a two-pronged approach. First, you store the value as a byte array and serialize it yourself. Second, you key off of the column name to tell you which column, and therefore which value type, you're dealing with. Once you know the correct type, you can use the appropriate Serializer to deserialize the byte value into a variable of the correct type. For your own complex objects and special types, you can write your own serializers.
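The two-pronged approach can be sketched outside of Hector. The following is a minimal Python illustration, not Hector's API: the column names, their types, and the CODECS table are all assumptions made up for the example. Every value is stored as raw bytes, and the column name selects which codec turns the bytes back into a typed value.

```python
import struct

# Hypothetical per-column codecs: column name -> (serialize, deserialize).
# The column names and their types are assumptions for illustration only.
CODECS = {
    "name": (lambda s: s.encode("utf-8"), lambda b: b.decode("utf-8")),
    "age": (lambda n: struct.pack(">q", n), lambda b: struct.unpack(">q", b)[0]),
    "score": (lambda x: struct.pack(">d", x), lambda b: struct.unpack(">d", b)[0]),
}


def encode_row(values):
    """Serialize each value to bytes with the codec chosen by its column name."""
    return {col: CODECS[col][0](val) for col, val in values.items()}


def decode_row(raw_columns):
    """Deserialize each byte value with the codec chosen by its column name."""
    return {col: CODECS[col][1](raw) for col, raw in raw_columns.items()}


# Every stored value is plain bytes; the types come back only on decode.
stored = encode_row({"name": "alice", "age": 42, "score": 3.5})
row = decode_row(stored)
```

In Hector terms, the slice query would use a byte-array value type, and the codec table plays the role of picking the right Serializer per column name.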

Related

Querying the same amount of labels in Presto

I have a field in a table named types, and this field can hold one or more type values, like this (they are strings inside [], each wrapped in ""):
["type1","type2"]
["type1"]
["type3","type4","type11"]
This table has 100 distinct type values.
What I need: I want to query this table in a way that ensures I get the same count of each type value, let's say:
100 type1, 100 type2, ..., 100 type100
In the above example, I have 2 type1 values..

Excel Power Query - from web with dynamic worksheet cell value

We have a spreadsheet that gets updated monthly, which queries some data from our server.
The query url looks like this:
http://example.com/?2016-01-31
The returned data is in a json format, like below:
{"CID":"1160","date":"2016-01-31","rate":{"USD":1.22}}
We only need the value of 1.22 from the above and I can get that inserted into the worksheet with no problem.
My questions:
1. How can I use a cell value [containing the date] to pass the date parameter [2016-01-31] into the query and display the result in the cell next to it?
2. There's a long list of dates in a column. Can this query be filled down automatically for each date?
3. When I load the query result to the worksheet, it always loads in pairs [taking up two cells: one says "Value", the other contains the value, which is "1.22" in my case]. Ideally I would only need "1.22", not the title. Can this be removed? [Del won't work, it gives you a "Column 1" instead, or you have to hide the entire row, which messes up the layout.]
I know this is a lot to ask but I've tried a lot of search and reading in the last few days and I have to say the M language beats me.
Thanks in advance.

Convert your Web.Contents() request into a function:
let
    // The function takes the date as its parameter, so the request must use param.
    myFunct = (param as date) =>
        let
            x = Web.Contents(.... & Date.ToText(param) & ....)
        in
            x
in
    myFunct
Reference your data request function from a new query and include any transformations you need (in this case Json.Document, table expansions, and removal of extraneous data). Feel free to delete all the extra data here, including columns that just contain the label 'value'.
Then (assuming your table of domain values already exists) add a custom column like
=Expand(myFunct( [someparameter] ))
edit: got home and got into my bookmarks. Here is a more detailed reference for what you are looking to do: http://datachix.com/2014/05/22/power-query-functions-some-scenarios/

For a table, add a column where you fetch the data and parse the JSON:
let
    tt = #table(
        {"date"},
        {
            {"2017-01-01"},
            {"2017-01-02"},
            {"2017-01-03"}
        }),
    add_col = Table.AddColumn(tt, "USD", each Json.Document(Web.Contents("http://example.com/?date=" & [date]))[rate][USD])
in
    add_col
If you need only one value
Json.Document(Web.Contents("http://example.com/?date="&YOUR_DATE_STRING))[rate][USD]
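For readers who find M hard to follow, the same flow (build one URL per date, parse the JSON, keep only rate.USD) can be sketched in Python. This is only an illustration of the logic, not a replacement for the Power Query solution: fake_fetch is a hypothetical stand-in for the HTTP call and simply returns the response shape shown in the question.

```python
import json


def build_url(date_text):
    # Same URL shape as the M snippet above; example.com is a placeholder host.
    return "http://example.com/?date=" + date_text


def extract_usd(payload):
    """Return only the USD rate from a JSON response body."""
    return json.loads(payload)["rate"]["USD"]


def fake_fetch(url):
    # Hypothetical stand-in for the real HTTP request: echoes the requested
    # date back in the response shape shown in the question.
    date_text = url.split("date=")[1]
    return json.dumps({"CID": "1160", "date": date_text, "rate": {"USD": 1.22}})


# "Fill down": one (date, USD) pair per date in the column, with no extra label.
dates = ["2017-01-01", "2017-01-02", "2017-01-03"]
table = [(d, extract_usd(fake_fetch(build_url(d)))) for d in dates]
```

Indexing straight into the parsed record is what removes the unwanted "Value" label: only the number is kept.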

CQL - Uniqueness of elements in a set of user defined types

C* sets guarantee that all elements in a set are unique. How does this work for user-defined types (UDTs)?
With simple types, the cell name is just the name of the CQL column concatenated with the column value. For example, if we have
CREATE TABLE friendsets (
    user text PRIMARY KEY,
    friends set<text>
);
then the friends are stored as
(column=friends:'doug', value=)
(column=friends:'jon', value=)
What if friends is defined as a set of UDTs (friends set<frozen<Friend>>)? Will the cell names be 'friends' concatenated with the serialized value of Friend?

Cassandra will serialize the value of a frozen type to a BLOB when you save it to a table. The representation on disk should be the same as for any other element type in your set, but Cassandra will be able to deserialize the bytes back into a UDT instance when they are read by a query.
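One way to picture the consequence: if each frozen value is treated as a single blob, two set elements are the same element exactly when their serialized bytes are equal. The Python sketch below illustrates that idea only. The Friend fields (name, age) and the length-prefixed layout are assumptions for the example and are not Cassandra's actual on-disk format.

```python
import struct


def serialize_friend(name, age):
    """Pack a hypothetical Friend(name, age) value into one byte string.

    Each field is written as a 4-byte big-endian length followed by its bytes.
    This layout is illustrative only, not Cassandra's real format.
    """
    name_bytes = name.encode("utf-8")
    age_bytes = struct.pack(">i", age)
    return (struct.pack(">i", len(name_bytes)) + name_bytes
            + struct.pack(">i", len(age_bytes)) + age_bytes)


# The set holds whole serialized values, so uniqueness is per complete UDT value.
friends = set()
friends.add(serialize_friend("doug", 30))
friends.add(serialize_friend("jon", 25))
friends.add(serialize_friend("doug", 30))  # identical bytes, so no new element
friends.add(serialize_friend("doug", 31))  # one field differs, so a new element
```

Under this model, changing any single field of a frozen value produces a different element rather than updating an existing one.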

Filter based on existence in one table and non-existence in another

I have the following data model:
Record: Id, ..., CreateDate
FactA: RecordId, CreateDate
FactB: RecordId, CreateDate
Relationships exist from FactA to Record and FactB to Record.
I've written measures on Records such as this with no issues:
FactA's:=CALCULATE(DISTINCTCOUNT(Records[Id]), FactA)
FactB's:=CALCULATE(DISTINCTCOUNT(Records[Id]), FactB)
Now I'd like a count of Records with FactA but no FactB. In SQL I'd do a LEFT JOIN WHERE FactB.RecordId IS NULL, but I can't figure out how to do something similar in DAX. I've tried:
-- this returns blank, presumably because when there is a FactB then RecordId isn't blank, and when there is no FactB then RecordId is a NULL, which isn't blank either
FactA_No_FactB:=CALCULATE(DISTINCTCOUNT(Records[Id]), FactA, FILTER(FactB, ISBLANK([RecordId])))
-- this returns the long "The value for columns "RecordId" in table "FactB" cannot be determined in the current context" error.
FactA_No_FactB:=CALCULATE(DISTINCTCOUNT(Records[Id]), FILTER(FactA, ISBLANK(FactB[RecordId])))
I've also tried various ways of using RELATED and RELATEDTABLE but I don't really understand enough about DAX and context to know what I'm doing.
Can someone explain how I can write the calculated measure to count Records with FactA but no FactB?
Thanks in advance.
Edit - Workaround
I've come up with this. It looks correct so far, but I'm not sure whether it is the generally correct way to do this:
-- Take the count with FactA and subtract the count of (FactA and FactB)
FactA_No_FactB:=CALCULATE(DISTINCTCOUNT(Records[Id]), FactA) - CALCULATE(DISTINCTCOUNT(Records[Id]), FactA, FactB)

Here's an alternative that might still not be the best way of doing it:
FactA_No_FactB:=CALCULATE(DISTINCTCOUNT(Records[ID]), FILTER(Records,CONTAINS(FactA, FactA[RecordID],Records[ID]) && NOT(CONTAINS(FactB,FactB[RecordID],Records[ID]))))
The difference between my version and yours is that mine returns a value of 1 for those items in A but not B, and BLANK for everything else. Your version returns 1 for those items in A but not B, 0 for those in both A and B, and BLANK for everything else. Depending on your use case, one outcome may be preferable over the other.
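The difference between the two measures can be modelled with plain sets. The Python sketch below is not DAX; the sample ids are made up, and None stands in for BLANK. It follows the behaviour described above: the filter version yields 1 or BLANK per record, while the subtraction version can also yield 0.

```python
records = [1, 2, 3, 4]
fact_a = {1, 2, 3}  # RecordIds that have at least one FactA row (sample data)
fact_b = {2, 4}     # RecordIds that have at least one FactB row (sample data)


def filter_version(record_id):
    """CONTAINS(FactA, ...) && NOT(CONTAINS(FactB, ...)): 1 or BLANK (None)."""
    return 1 if record_id in fact_a and record_id not in fact_b else None


def subtraction_version(record_id):
    """count(A) - count(A and B): 1, 0, or BLANK (None) when not in A at all."""
    if record_id not in fact_a:
        return None
    return 1 - (1 if record_id in fact_b else 0)


per_record = {r: (filter_version(r), subtraction_version(r)) for r in records}

# At the grand-total level both approaches count the same records.
total_filter = sum(1 for r in records if filter_version(r))
total_subtraction = len(fact_a & set(records)) - len(fact_a & fact_b)
```

Record 2 (present in both tables) is where the two versions disagree: BLANK in the filter version, 0 in the subtraction version.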

Lucene BooleanQuery - Must be present in one of two columns

Not sure how to format the query in Lucene. The scenario is that the search term must be present in one of the two columns (either one is fine).
boolQuery.Add(query1, Occur.MUST) 'this one is fine
boolQuery.Add(query2, Occur.SHOULD)
boolQuery.Add(query3, Occur.SHOULD)
Brings up results even when the search term is not present at all in column 2 and column 3.
boolQuery.Add(query2, Occur.MUST)
boolQuery.Add(query3, Occur.SHOULD)
Does not bring up results when the search term is present in column 3 but not in column 2.
How do I format the query so that I get equivalent of this:
where column 1= val1 and (column 2 = val2 or column 3 = val2)

MUST, as the name suggests, makes the occurrence mandatory. SHOULD means optional. The first boolean query will basically match only documents hit by the first clause, but any of them that are also hit by the second or third clause will score higher. To get results that match your desired statement (LINQ, I assume), this should work (using Java; q1, q2 and q3 are your three existing queries, and the mutable BooleanQuery shown here is the older API that newer Lucene versions replace with BooleanQuery.Builder).
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;

// Inner query: at least one of q2 / q3 has to match.
BooleanQuery subQuery = new BooleanQuery();
subQuery.add(q2, Occur.SHOULD);
subQuery.add(q3, Occur.SHOULD);
// Outer query: q1 AND (q2 OR q3).
BooleanQuery q = new BooleanQuery();
q.add(q1, Occur.MUST);
q.add(subQuery, Occur.MUST);
Your confusion probably stems from the fact that the query API implements MUST and SHOULD as unary operators, while in traditional programming languages AND and OR are binary operators.
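The matching rules can be modelled in a few lines of Python to see why the nesting is needed. This is a simplified sketch, not Lucene: it ignores scoring and settings such as a minimum number of SHOULD matches, and the field names and values are taken from the WHERE clause in the question.

```python
def matches(doc, must=(), should=()):
    """Evaluate a boolean query given as lists of clause predicates.

    Every MUST clause has to match. SHOULD clauses are optional when a MUST
    clause is present; with no MUST clause, at least one SHOULD has to match.
    """
    if not all(clause(doc) for clause in must):
        return False
    if must:
        return True
    return any(clause(doc) for clause in should)


def q1(doc):
    return doc["column1"] == "val1"


def q2(doc):
    return doc["column2"] == "val2"


def q3(doc):
    return doc["column3"] == "val2"


def flat(doc):
    # MUST q1, SHOULD q2, SHOULD q3: the SHOULD clauses do not restrict matches.
    return matches(doc, must=[q1], should=[q2, q3])


def nested(doc):
    # MUST q1, MUST (q2 SHOULD / q3 SHOULD): column 1 AND (column 2 OR column 3).
    return matches(doc, must=[q1, lambda d: matches(d, should=[q2, q3])])


only_col1 = {"column1": "val1", "column2": "x", "column3": "y"}
col1_and_col3 = {"column1": "val1", "column2": "x", "column3": "val2"}
```

Here flat(only_col1) is true while nested(only_col1) is false, which mirrors the unwanted results described in the question; both accept col1_and_col3.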

I solved a similar issue using query syntax:
+(col1:{query} OR col2:{query})
This will return the documents that have the value {query} in at least one of the fields.
(Note: I am using the classes Query and MultiFieldQueryParser.)
