GridGain join query with both objects returned

Is it possible to return both objects of a join as the result of a GridGain cache query?
We can get either one of the sides of the join or fields from both (and then use these to retrieve each object separately), but looking at the examples and documentation there seems to be no way to get both objects.
Thanks!

@dejan - In GridGain and Apache Ignite, you can select the _key and _val fields with SqlFieldsQuery to return whole objects. For example:
SqlFieldsQuery sql = new SqlFieldsQuery(
    "select a._key, a._val, b._val from SomeTypeA a, SomeTypeB b " +
    "where a.id = b.otherId");
// Each row holds the selected columns in select order.
try (QueryCursor<List<?>> cursor = cache.query(sql)) {
    for (List<?> row : cursor)
        System.out.println("Row: " + row);
}
Note that in this case the objects will be returned in serialized form.
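Each row of a fields query arrives as a List<?> whose elements follow the order of the select list, so for the query above index 0 is the key of A, index 1 is the A object and index 2 is the B object. A minimal sketch of unpacking such rows into typed triples follows. JoinedRow is a hypothetical holder class, not part of the GridGain or Ignite API, and the casts assume the values have already been deserialized into your own classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one joined row; not part of the GridGain or Ignite API.
final class JoinedRow<K, A, B> {
    final K key;    // a._key
    final A left;   // a._val
    final B right;  // b._val

    JoinedRow(K key, A left, B right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }

    // Columns are read positionally, in the order they appear in the select list.
    // Assumption: the caller knows the runtime types of the three columns.
    @SuppressWarnings("unchecked")
    static <K, A, B> JoinedRow<K, A, B> fromRow(List<?> row) {
        if (row.size() != 3) {
            throw new IllegalArgumentException("Expected 3 columns, got " + row.size());
        }
        return new JoinedRow<>((K) row.get(0), (A) row.get(1), (B) row.get(2));
    }

    // Accepts any iterable of rows, so a query cursor could be passed directly.
    static <K, A, B> List<JoinedRow<K, A, B>> fromRows(Iterable<? extends List<?>> rows) {
        List<JoinedRow<K, A, B>> out = new ArrayList<>();
        for (List<?> row : rows) {
            out.add(JoinedRow.<K, A, B>fromRow(row));
        }
        return out;
    }
}
```

Inside the try block you could then call something like JoinedRow.fromRows(cursor) instead of printing each row.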

First of all, GridGain Open Source edition is now Apache Ignite, so I would recommend switching.
In Ignite, you can return exactly the fields you need from a query using SqlFieldsQuery, like so:
SqlFieldsQuery sql = new SqlFieldsQuery(
    "select fieldA1, fieldA2, fieldB3 from SomeTypeA a, SomeTypeB b " +
    "where a.id = b.otherId");
// Each row holds the selected fields in select order.
try (QueryCursor<List<?>> cursor = cache.query(sql)) {
    for (List<?> row : cursor)
        System.out.println("Row: " + row);
}
In GridGain open source edition, you can use GridGain fields query APIs as well.

Related

ServiceStack.OrmLite Using Limit in SQL.In filter

I have a parent/child table setup - Items/ItemDetails. This part works:
var q = db.From<Item>(); //various where clauses based on request
items = db.Select<Item>(q);
q = q.Select(a => a.ITEM_NO);
itemDetails = db.Select<ItemDetail>(x => Sql.In(x.ITEM_NO, q));
Trying to add paging to improve the performance of this request for large data sets, I'm having trouble getting the .Limit(skip, rows) function to work in the SQL.In statement of the child table.
var q = db.From<Item>().Limit(skip, rows);
items = db.Select<Item>(q);
q = q.Select(a => a.ITEM_NO);
itemDetails = db.Select<ItemDetail>(x => Sql.In(x.ITEM_NO, q));
It works when limiting the results in the first select, but when used in the child data pull I get "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS."
The SQL that comes out changes the where subquery to:
WHERE "ITEM_NO" IN (SELECT * FROM (SELECT "ITEM_NO", ROW_NUMBER() OVER
(ORDER BY "ITEM"."ITEM_NO") As RowNum
FROM "ITEM") AS RowConstrainedResult WHERE RowNum > 5 AND RowNum <= 15)
I understand the SQL error is because I am selecting more than one column in the IN clause. Is there a better way to write this to avoid the error?
Thanks
If you're using SQL Server 2012 or later, you should use SqlServer2012Dialect.Provider, e.g.:
container.Register<IDbConnectionFactory>(c =>
new OrmLiteConnectionFactory(connString, SqlServer2012Dialect.Provider));
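This lets OrmLite use the paging support added in SQL Server 2012 instead of resorting to the windowing-function hack required to implement paging on earlier versions of SQL Server. The paged subquery should then no longer need the extra RowNum column, which is what triggers the error inside the IN clause.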

Cassandra Datastax BatchStatement with Lightweight Transactions

I'm having difficulty combining BatchStatement and Lightweight Transactions using the DataStax Java driver.
Consider the following:
String batch =
    "BEGIN BATCH "
    + "Update mykeyspace.mytable set record_version = 102 where id = '" + id + "' if record_version = 101; "
    // <additional batched statements> are concatenated here
    + "APPLY BATCH";
Row row = session.execute(batch).one();
// The "[applied]" column reports whether the conditional update succeeded.
if (! row.getBool("[applied]")) {
    throw new RuntimeException("Optimistic Lock Failure!");
}
This functions as expected and indicates whether my lightweight transaction succeeded and my batch was applied. All is good. However if I try the same thing using a BatchStatement, I run into a couple of problems:
-- My lightweight transaction "if" clause is ignored and the update is always applied
-- The "Row" result is null making it impossible to execute the final row.getBool("[applied]") check.
String update = "Update mykeyspace.mytable set record_version = ? where id = ? if record_version = ?";
PreparedStatement pStmt = getSession().prepare(update);
BatchStatement batch = new BatchStatement();
batch.add(new BoundStatement(pStmt).bind(newVersion, id, oldVersion));
Row row = session.execute(batch).one(); // <-- Row is null!
if (! row.getBool("[applied]")) {
throw new RuntimeException("Optimistic Lock Failure!");
}
Am I doing this wrong? Or is this a limitation of the DataStax BatchStatement?
I am encountering this same issue. I opened a ticket with DataStax support yesterday and received the following answer:
Currently Lightweight Transactions as PreparedStatements within a BATCH are not supported. This is why you are encountering this issue.
There is nothing on the immediate roadmap to include this feature in Cassandra.
That suggests eliminating the PreparedStatement will work around the issue. I'm going to try that myself, but haven't yet.
[Update]
I've been trying to work around this issue. Based on the earlier feedback, I assumed the restriction was on using a PreparedStatement for the conditional update.
I tried changing my code to not use a PreparedStatement, but that still didn't work when using a BatchStatement that contained a RegularStatement instead of a PreparedStatement.
BatchStatement batchStatement = new BatchStatement();
batchStatement.add(conditionalRegularStatement);
session.executeQuery(batchStatement);
The only thing that seems to work is to do an executeQuery with a raw query string that includes the batch.
session.executeQuery("BEGIN BATCH " + conditionalRegularStatement.getQueryString() + " APPLY BATCH");
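The raw-string workaround above amounts to wrapping the conditional statements in BEGIN BATCH ... APPLY BATCH yourself. A minimal sketch of that wrapping follows, assuming each statement is already a complete CQL string with its values inlined. CqlBatchText and wrap are hypothetical names, not part of the DataStax driver:

```java
import java.util.List;

// Hypothetical helper; it only assembles text and does not talk to Cassandra.
final class CqlBatchText {
    // Joins the statements into one batch string, adding a terminating
    // semicolon to any statement that does not already end with one.
    static String wrap(List<String> statements) {
        StringBuilder sb = new StringBuilder("BEGIN BATCH ");
        for (String statement : statements) {
            String trimmed = statement.trim();
            sb.append(trimmed);
            if (!trimmed.endsWith(";")) {
                sb.append(';');
            }
            sb.append(' ');
        }
        return sb.append("APPLY BATCH").toString();
    }
}
```

The resulting string would be passed to the session as a plain query, and the first returned row checked for "[applied]" as in the question.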
Update:
This issue is resolved in CASSANDRA-7337 which was fixed in C* 2.0.9.
The latest DataStax Enterprise is now on 2.0.11. If you are seeing this issue ==> Upgrade!
The 2nd code snippet doesn't look right to me (there's no constructor taking a string or you've missed the prepared statement).
Can you try instead the following:
String update = "Update mykeyspace.mytable set record_version = ? where id = ? if record_version = ?";
PreparedStatement pStmt = session.prepare(update);
BatchStatement batch = new BatchStatement();
batch.add(pStmt.bind(newVersion, id, recordVersion));
Row row = session.execute(batch).one();
if (! row.getBool("[applied]")) {
throw new RuntimeException("Optimistic Lock Failure!");
}

Most efficient way to read from bottom of Azure Table Storage

I have an Azure table which serves as an event log. I need the most efficient way to read the bottom of the table to retrieve the most recent entries.
What is the most efficient way of doing this?
First of all, I would really advise you to base your partition key on UTC ticks. You can do it in a way that orders all the entities from latest to oldest.
Then, if you want to get, let's say, the 100 latest logs, you just call query.Take(100), where query is an IQueryable from your favorite client (we use Lucifure Stash).
If you want to fetch entities for certain period you write: query.Where(x => x.PartitionKey <= value); or something similar.
The "value" variable has to be constructed based on the way you construct the values for partition key.
Assuming you want to fetch the data for the last 15 minutes, try this pseudo code:
DateTime toDateTime = DateTime.UtcNow;
DateTime fromDateTime = toDateTime.AddMinutes(-15);
string myPartitionKeyFrom = fromDateTime.ToString("yy-MM");
string myPartitionKeyTo = toDateTime.ToString("yy-MM");
string query = "";
if (myPartitionKeyFrom.Equals(myPartitionKeyTo))//In case both time periods fall in same month, then we can directly hit that partition.
{
query += "(PartitionKey eq '" + myPartitionKeyFrom + "') ";
}
else // Otherwise we would need to do a greater than and lesser than stuff.
{
query += "(PartitionKey ge '" + myPartitionKeyFrom + "' and PartitionKey le '" + myPartitionKeyTo + "') ";
}
query += "and (RowKey ge '" + fromDateTime.ToString() + "' and RowKey le '" + toDateTime.ToString() + "')";
If you want to fetch the latest 'n' entries, you need to slightly modify your PartitionKey and RowKey values so that the latest entries are pushed to the top of the table.
For this you need to compute both the keys using DateTime.MaxValue.Subtract(DateTime.UtcNow).Ticks; instead of DateTime.UtcNow.
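To see why the subtraction puts the newest entries first, here is a minimal sketch of the same computation in Java. ReversedTicks is a hypothetical helper, and the two constants are the .NET tick values for the Unix epoch and for DateTime.MaxValue. The key is zero-padded to 19 digits, which is an assumption added here so that string ordering matches numeric ordering:

```java
import java.time.Instant;
import java.util.Locale;

// Hypothetical helper mirroring DateTime.MaxValue.Subtract(DateTime.UtcNow).Ticks.
final class ReversedTicks {
    // .NET ticks are 100 ns units counted from 0001-01-01T00:00:00.
    static final long TICKS_AT_UNIX_EPOCH = 621_355_968_000_000_000L;
    // DateTime.MaxValue.Ticks
    static final long MAX_TICKS = 3_155_378_975_999_999_999L;
    static final long TICKS_PER_SECOND = 10_000_000L;

    // Converts an instant to the equivalent .NET tick count.
    static long toTicks(Instant t) {
        return TICKS_AT_UNIX_EPOCH
                + t.getEpochSecond() * TICKS_PER_SECOND
                + t.getNano() / 100;
    }

    // Later instants give smaller numbers, so they sort first as strings.
    static String key(Instant t) {
        return String.format(Locale.ROOT, "%019d", MAX_TICKS - toTicks(t));
    }
}
```

Because table entities come back sorted by PartitionKey and then RowKey, keys built this way return the most recent entries at the top of an unfiltered query.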
Microsoft provides a SemanticLogging framework that has a specific sink to log to Azure Table.
If you look at the library code, it generates a partition key (in reverse order) based on a DateTime:
static string GeneratePartitionKeyReversed(DateTime dateTime)
{
dateTime = dateTime.AddMinutes(-1.0);
return GetTicksReversed(
new DateTime(dateTime.Year, dateTime.Month, dateTime.Day, dateTime.Hour, dateTime.Minute, 0));
}
static string GetTicksReversed(DateTime dateTime)
{
return (DateTime.MaxValue - dateTime.ToUniversalTime())
.Ticks.ToString("d19", (IFormatProvider)CultureInfo.InvariantCulture);
}
So you can implement the same logic in your application to build your partition key.
If you want to retrieve the logs for a specific date range, you can write a query that looks like this:
var minDate = GeneratePartitionKeyReversed(DateTime.UtcNow.AddHours(-2));
var maxDate = GeneratePartitionKeyReversed(DateTime.UtcNow.AddHours(-1));
// Get the cloud table
var cloudTable = GetCloudTable();
// Build the query
IQueryable<DynamicTableEntity> query = cloudTable.CreateQuery<DynamicTableEntity>();
// condition for max date
query = query.Where(a => string.Compare(a.PartitionKey, maxDate,
StringComparison.Ordinal) >= 0);
// condition for min date
query = query.Where(a => string.Compare(a.PartitionKey, minDate,
StringComparison.Ordinal) <= 0);

Error in Linq: The text data type cannot be selected as DISTINCT because it is not comparable

I have a problem with LINQ. Basically, a third-party database that I need to connect to uses the now-deprecated text data type (I can't change this), and I need to apply a Distinct clause in my LINQ query to results that contain this field.
I don't want to do a ToList() before executing the Distinct() as that will result in thousands of records coming back from the database that I don't require and will annoy the client as they get charged for bandwidth usage. I only need the first 15 distinct records.
Anyway query is below:
var query = (from s in db.tSearches
join sc in db.tSearchIndexes on s.GUID equals sc.CPSGUID
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
where s.Notes != null && a.Attribute == "Featured"
select new FeaturedVacancy
{
Id = s.GUID,
DateOpened = s.DateOpened,
Notes = s.Notes
});
return query.Distinct().OrderByDescending(x => x.DateOpened);
I know I can do a subquery to do the same thing as above (tSearches contains unique records), but I'd rather have a more straightforward solution if available, as I need to change a number of similar queries throughout the code to get this working.
No answers on how to do this so I went with my first suggestion and retrieved the unique records first from tSearch then constructed a subquery with the non unique records and filtered the search results by this subquery. Answer below:
var query = (from s in db.tSearches
where s.DateClosed == null && s.ConfidentialNotes != null
orderby s.DateOpened descending
select new FeaturedVacancy
{
Id = s.GUID,
Notes = s.ConfidentialNotes
});
/* Now filter by our 'Featured' attribute */
var subQuery = from sc in db.tSearchIndexes
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
where a.Attribute == "Featured"
select sc.CPSGUID;
query = query.Where(x => subQuery.Contains(x.Id));
return query;

Subsonic 3 Simple Query inner join sql syntax

I want to perform a simple join on two tables (BusinessUnit and UserBusinessUnit), so I can get a list of all BusinessUnits allocated to a given user.
The first attempt works, but there's no override of Select which allows me to restrict the columns returned (I get all columns from both tables):
var db = new KensDB();
SqlQuery query = db.Select
.From<BusinessUnit>()
.InnerJoin<UserBusinessUnit>( BusinessUnitTable.IdColumn, UserBusinessUnitTable.BusinessUnitIdColumn )
.Where( BusinessUnitTable.RecordStatusColumn ).IsEqualTo( 1 )
.And( UserBusinessUnitTable.UserIdColumn ).IsEqualTo( userId );
The second attempt allows the column name restriction, but the generated SQL contains pluralised table names (?):
SqlQuery query = new Select( new string[] { BusinessUnitTable.IdColumn, BusinessUnitTable.NameColumn } )
.From<BusinessUnit>()
.InnerJoin<UserBusinessUnit>( BusinessUnitTable.IdColumn, UserBusinessUnitTable.BusinessUnitIdColumn )
.Where( BusinessUnitTable.RecordStatusColumn ).IsEqualTo( 1 )
.And( UserBusinessUnitTable.UserIdColumn ).IsEqualTo( userId );
Produces...
SELECT [BusinessUnits].[Id], [BusinessUnits].[Name]
FROM [BusinessUnits]
INNER JOIN [UserBusinessUnits]
ON [BusinessUnits].[Id] = [UserBusinessUnits].[BusinessUnitId]
WHERE [BusinessUnits].[RecordStatus] = @0
AND [UserBusinessUnits].[UserId] = @1
So, two questions:
- How do I restrict the columns returned in method 1?
- Why does method 2 pluralise the table names in the generated SQL (and can I get round this?)
I'm using 3.0.0.3...
So far my experience with 3.0.0.3 suggests that this is not possible yet with the query tool, although it is with version 2.
I think the preferred method (so far) with version 3 is to use a LINQ query with something like:
var busUnits = from b in BusinessUnit.All()
join u in UserBusinessUnit.All() on b.Id equals u.BusinessUnitId
select b;
I ran into the pluralized table names myself, but it was because I'd only re-run one template after making schema changes.
Once I re-ran all the templates, the plural table names went away.
Try re-running all 4 templates and see if that solves it for you.
