NSPredicate SUBQUERY aggregates


In all of the examples I've seen of SUBQUERY, @count is always used, e.g.,
SUBQUERY(employees, $e, $e.lastName == "Smith").@count > 0
So I have three very closely related questions, which work best as a single StackOverflow question:
1. Is there any use for SUBQUERY without @count? If so, I haven't found it.
2. Can any other aggregates be used with SUBQUERY? If so, I haven't been able to get them to work. (See below.)
3. What exactly does SUBQUERY return? The logical thing seems to be a filtered collection of the type of the first parameter. (I'm speaking conceptually here. Obviously the SQL will be something different, as SQL debugging shows pretty plainly.)
This gives an exception, as does every other aggregate I've tried other than @count, which seems to show that no other aggregates can be used:
SUBQUERY(employees, $e, $e.lastName == "Smith").@avg.salary > 75000
(Let's leave aside for the moment whether this is the best way to express such a thing. The question is about SUBQUERY, not about how best to formulate a query.)
Mundi helpfully pointed out that another use for SUBQUERY is nested subqueries. Yes, I'm aware of them and have used them, but this question is really about the result of SUBQUERY. If we think of SUBQUERY as a function, what is its result and in what ways can it be used, other than with @count?
UPDATE
Thanks to Mundi's research, it appears that aggregates like @avg do in fact work with SUBQUERY, particularly with an in-memory filter such as filteredArrayUsingPredicate:, but not with Core Data when the underlying data store is NSSQLiteStoreType.

1. Yes, think of nested subqueries. See Dave DeLong's answer, which explains SUBQUERY in very simple terms.
2. It is not clear why your @avg fails; it should work on any collection whose elements have the attribute the aggregate function needs.
3. See 1.: SUBQUERY returns a collection.
Here is the transcript of an experiment that proves that the subquery works as expected.
import UIKit
import CoreData
class Department: NSManagedObject {
    var name = "Department"
    var employees = Set<Person>()
    convenience init(name: String) {
        self.init()
        self.name = name
    }
}
class Person: NSManagedObject {
    var name: String = "Smith"
    var salary: NSNumber = 0
    convenience init(name: String, salary: NSNumber) {
        self.init()
        self.name = name
        self.salary = salary
    }
}
let department = Department()
department.employees = Set([
    Person(name: "Smith", salary: NSNumber(double: 30000)),
    Person(name: "Smith", salary: NSNumber(double: 60000)) ])
let predicate = NSPredicate(format: "SUBQUERY(employees, $e, $e.name = %@).@avg.salary > 44000", "Smith")
let depts = [department, Department()]
let filtered = (depts as NSArray).filteredArrayUsingPredicate(predicate)
The above returns exactly one department with the two employees. If I substitute 45000 in the predicate, the result will return nothing.
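To make the "SUBQUERY returns a collection" reading concrete, here is a minimal sketch in plain Swift, with no NSPredicate or Core Data involved. It assumes simple value types named Person and Department as stand-ins for the classes in the transcript, and the helper function names are hypothetical. It treats SUBQUERY as a filter whose result is then aggregated, which mirrors only the in-memory behaviour described above and says nothing about how the SQLite store translates the predicate.

```swift
// Plain Swift stand-ins for the Department and Person classes above.
struct Person {
    let name: String
    let salary: Double
}

struct Department {
    let name: String
    let employees: [Person]
}

// Rough equivalent of SUBQUERY(employees, $e, $e.name == "Smith"):
// the result is the filtered collection itself.
func smiths(in department: Department) -> [Person] {
    return department.employees.filter { $0.name == "Smith" }
}

// Rough equivalent of appending .@avg.salary to that collection.
// Returning nil for an empty collection is a choice made for this sketch.
func averageSalary(of people: [Person]) -> Double? {
    guard !people.isEmpty else { return nil }
    return people.reduce(0) { $0 + $1.salary } / Double(people.count)
}

// Keep the departments whose filtered average exceeds the threshold.
func departments(_ all: [Department], withSmithAverageAbove threshold: Double) -> [Department] {
    return all.filter { department in
        guard let average = averageSalary(of: smiths(in: department)) else { return false }
        return average > threshold
    }
}

let staffed = Department(name: "Department", employees: [
    Person(name: "Smith", salary: 30000),
    Person(name: "Smith", salary: 60000),
])
let empty = Department(name: "Department", employees: [])

let above44k = departments([staffed, empty], withSmithAverageAbove: 44000)
let above45k = departments([staffed, empty], withSmithAverageAbove: 45000)
print(above44k.count, above45k.count)  // prints "1 0"
```

The average of the two salaries is exactly 45000, which is why the 44000 threshold matches one department and the 45000 threshold matches none.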

Related

In Core Data, how do I sort an NSFetchRequest by the sum of an attribute of a child entity? (SwiftUI)

I am building an iOS app in SwiftUI for which I have a Core Data model with two entities:
CategoryEntity with attribute: name (String)
ExpenseEntity with attributes: name (String) and amount (Double)
There is a To-many relationship between CategoryEntity and ExpenseEntity (A category can have many expenses).
I’m fetching the categories and showing them in a list together with the sum of the expenses for each category as follows: Link to app screenshot
I would like to add a sort to the fetch request so the categories appear in order depending on the total amount of their expenses. In the example of the previous picture, the order of appearance that I would like to get would be: Tech, Clothes, Food and Transport. I don’t know how to approach this problem. Any suggestions?
In my current implementation of the request, the sorting is done alphabetically:
// My current implementation for fetching the categories
func fetchCategories() {
    let request = NSFetchRequest<CategoryEntity>(entityName: "CategoryEntity")
    let sort = NSSortDescriptor(keyPath: \CategoryEntity.name, ascending: true)
    request.sortDescriptors = [sort]
    do {
        fetchedCategories = try manager.context.fetch(request)
    } catch let error {
        print("Error fetching. \(error.localizedDescription)")
    }
}

You don't have to make another FetchRequest, you can just sort in a computed property like this:
(I assume your fetched results come into a var called fetchedCategories.)
var sortedCategories: [CategoryEntity] {
    return fetchedCategories.sorted(by: { cat1, cat2 in
        cat1.expensesArray.reduce(0, { $0 + $1.amount }) >
            cat2.expensesArray.reduce(0, { $0 + $1.amount })
    })
}
This sorts the fetchedCategories array with a comparison rule that compares the sum of cat1's expenses with the sum of cat2's expenses. The > puts the larger sums first.
Put the computed var directly in the View where you use it.
Wherever you used fetchedCategories in your view before (e.g. in a ForEach), use sortedCategories instead.
This will update in the same way as the fetched results do.

One approach would be to include a derived attribute in your CategoryEntity model description which keeps the totals for you. For example, to sum the values from the amount column within an expenses relationship, the derivation expression would be along the lines of expenses.@sum.amount.
That attribute should be updated whenever you save your managed object context. You'll then be able to sort it just as you would any other attribute, without the performance cost of calculating the expense sum for each category whenever you sort.
Note that this option only really works if you don't have to do any filtering on expenses. For example, if you want to sort based on expenses from 2022 only, but your Core Data store also has expenses from 2021, the derived attribute might not give you the sort order you want.
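If a filtered total is needed, one option is to compute it in memory after fetching. The following is a minimal sketch, assuming plain structs in place of the managed objects and a hypothetical year field on each expense (the original model only has name and amount). The type and function names are made up for illustration.

```swift
// Hypothetical plain-Swift stand-ins for ExpenseEntity and CategoryEntity.
struct Expense {
    let name: String
    let amount: Double
    let year: Int  // assumed field, not part of the original model
}

struct ExpenseCategory {
    let name: String
    let expenses: [Expense]

    // Sum of the amounts, restricted to a single year.
    func total(in year: Int) -> Double {
        return expenses.filter { $0.year == year }.reduce(0) { $0 + $1.amount }
    }
}

// Largest filtered total first.
func sortedByTotal(_ categories: [ExpenseCategory], in year: Int) -> [ExpenseCategory] {
    return categories.sorted { $0.total(in: year) > $1.total(in: year) }
}

let categories = [
    ExpenseCategory(name: "Food", expenses: [
        Expense(name: "Groceries", amount: 120, year: 2022),
        Expense(name: "Dinner", amount: 900, year: 2021),
    ]),
    ExpenseCategory(name: "Tech", expenses: [
        Expense(name: "Laptop", amount: 800, year: 2022),
    ]),
]

let order2022 = sortedByTotal(categories, in: 2022).map { $0.name }
print(order2022)  // prints ["Tech", "Food"]
```

With these sample figures an unfiltered total would put Food first (1020 against 800), which is the mismatch the note above warns about.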

Kentico 10 ObjectQuery join multiple tables

I am basically trying to run a query that gives me all the Users that have purchased a product with a particular SKU. Essentially this SQL here:
SELECT u.FirstName, u.LastName, u.Email
FROM COM_OrderItem oi INNER JOIN COM_Order o ON oi.OrderItemOrderID = o.OrderID
INNER JOIN COM_Customer c ON o.OrderCustomerID = c.CustomerID
INNER JOIN CMS_User u ON c.CustomerUserID = u.UserID
WHERE oi.OrderItemSKUID = 1013
I was trying to use the ObjectQuery API to try and achieve this but have no idea how to do this. The documentation here does not cover the specific type of scenario I am looking for. I came up with this just to try and see if it works but I don't get the three columns I am after in the result:
var test = OrderItemInfoProvider
    .GetOrderItems()
    .Source(orderItems => orderItems.Join<OrderInfo>("OrderItemOrderID", "OrderID"))
    .Source(orders => orders.Join<CustomerInfo>("OrderCustomerID", "CustomerID"))
    .Source(customers => customers.Join<UserInfo>("CustomerUserID", "UserID"))
    .WhereEquals("OrderItemSKUID", 1013)
    .Columns("FirstName", "LastName", "Email")
    .Result;
I know this is definitely wrong and I would like to know the right way to achieve this. Perhaps using ObjectQuery is not the right approach here or maybe I can somehow just use raw SQL. I simply don't know enough about Kentico to understand the best approach here.

Actually, the ObjectQuery you created is correct. I tested it and it is providing the correct results. Are you sure that there are indeed orders in the system, which contain a product with SKUID 1013 (you can check that in the COM_OrderItem database table)?
Also, how are you accessing the results? Iterating through the results should look like this:
foreach (DataRow row in test.Tables[0].Rows)
{
    string firstName = ValidationHelper.GetString(row["FirstName"], "");
    string lastName = ValidationHelper.GetString(row["LastName"], "");
    string email = ValidationHelper.GetString(row["Email"], "");
}

Grouping a dictionary NSFetchRequest by object ID

I need to return a list of objects along with a count of its related objects. It doesn't seem to be possible to do this in a single dictionary fetch request as I am unable to group the fetch results by objectID.
let objectIDExpression = NSExpressionDescription()
objectIDExpression.name = "objectID"
objectIDExpression.expression = NSExpression.expressionForEvaluatedObject()
objectIDExpression.expressionResultType = NSAttributeType.ObjectIDAttributeType
let countExpression = NSExpressionDescription()
countExpression.name = "count"
countExpression.expression = NSExpression(forFunction: "count:", arguments: [NSExpression(forKeyPath: "entries")])
countExpression.expressionResultType = .Integer32AttributeType
let fetchRequest = NSFetchRequest(entityName: "Tag")
fetchRequest.resultType = .DictionaryResultType
fetchRequest.propertiesToFetch = [objectIDExpression, countExpression]
fetchRequest.propertiesToGroupBy = [objectIDExpression]
var error: NSError?
if let results = self.context.executeFetchRequest(fetchRequest, error: &error) {
    println(results)
}
When this request executes it errors with:
'Invalid keypath expression ((<NSExpressionDescription: 0x7f843bf2d470>), name objectID, isOptional 1, isTransient 0, entity (null), renamingIdentifier objectID, validation predicates (
), warnings (
), versionHashModifier (null)
userInfo {
}) passed to setPropertiesToFetch:'
I also tested just passing the "objectID" expression name, but that also fails.
Is there therefore no way to group by object ID?

You can get the required count without using propertiesToGroupBy. CoreData seems to infer the correct scope for the count and uses a sub-SELECT instead (strangely, only if the fetch includes an attribute as well as the objectID and count, see below). For example, I have Tag many-many with Items:
First attempt
I can fetch tagName and the count of items as follows:
NSFetchRequest *fetch = [NSFetchRequest fetchRequestWithEntityName:@"Tag"];
NSExpressionDescription *countED = [NSExpressionDescription new];
countED.expression = [NSExpression expressionWithFormat:@"count:(items)"];
countED.name = @"countOfItems";
countED.expressionResultType = NSInteger64AttributeType;
fetch.resultType = NSDictionaryResultType;
fetch.propertiesToFetch = @[@"tagName", countED];
NSArray *results = [self.context executeFetchRequest:fetch error:nil];
NSLog(@"results is %@", results);
which generates the following SQL:
SELECT t0.ZTAGNAME, (SELECT COUNT(t1.Z_3ITEMS) FROM Z_3TAGS t1 WHERE (t0.Z_PK = t1.Z_8TAGS) ) FROM ZTAG t0
Second attempt
Sadly, it seems CoreData gets confused if I try to select the objectID instead of the tagName:
NSExpressionDescription *selfED = [NSExpressionDescription new];
selfED.expression = [NSExpression expressionForEvaluatedObject];
selfED.name = @"self";
selfED.expressionResultType = NSObjectIDAttributeType;
fetch.resultType = NSDictionaryResultType;
fetch.propertiesToFetch = @[selfED, countED];
generates this SQL:
SELECT t0.Z_ENT, t0.Z_PK, COUNT( t1.Z_3ITEMS) FROM ZTAG t0 LEFT OUTER JOIN Z_3TAGS t1 ON t0.Z_PK = t1.Z_8TAGS
which counts all the rows from the outer join (and suggests that you need to group by the objectID, though we know that won't work).
Final attempt
However, include tagName and objectID, and all is well again:
fetch.propertiesToFetch = @[selfED, @"tagName", countED];
gives:
SELECT t0.Z_ENT, t0.Z_PK, t0.ZTAGNAME, (SELECT COUNT(t1.Z_3ITEMS) FROM Z_3TAGS t1 WHERE (t0.Z_PK = t1.Z_8TAGS) ) FROM ZTAG t0
which seems to do the trick. (Sorry for reverting to Objective-C, and for using different entity/attribute names, but I'm sure you get the picture).
Aside
One other curiosity I discovered is that the second attempt above can also be made to work by counting an attribute of the relationship, rather than the relationship itself:
countED.expression = [NSExpression expressionWithFormat:@"count:(items.itemName)"];
fetch.propertiesToFetch = @[selfED, countED];
gives:
SELECT t0.Z_ENT, t0.Z_PK, (SELECT COUNT(t2.ZITEMNAME) FROM Z_3TAGS t1 JOIN ZITEMS t2 ON t1.Z_3ITEMS = t2.Z_PK WHERE (t0.Z_PK = t1.Z_8TAGS) ) FROM ZTAG t0
which will (I think) give the correct counts provided itemName is not nil.

I played with this for a bit, sure there had to be some way to tell core-data to group by the primary key.
I couldn't figure it out, though I believe it to be possible.
The best I could do was add another unique attribute "uuid" (which I use for all of my entities anyway, for various reasons). You can do this easily enough with NSUUID, or you can take the permanent object ID URI representation and turn it into a string.
Anyway, I think this gives you what you want, but does so by requiring a separate unique attribute.
fetchRequest.propertiesToGroupBy = @[@"uuid"];
I tried a bunch of alternatives as the group-by property but expressionForEvaluatedObject always barfs, and other attempts fell flat.
I'm sure you know this already. Still, it's at least a workaround until someone comes along who has actually done this before, even if you don't use the extra attribute for anything else.
FWIW, here is the SQL...
CoreData: sql: SELECT t0.Z_ENT, t0.Z_PK, COUNT( t1.Z_1ENTRIES), t0.ZUUID
FROM ZTAG t0 LEFT OUTER JOIN Z_1TAGS t1 ON t0.Z_PK = t1.Z_2TAGS
GROUP BY t0.ZUUID
Surely, there has to be a way to tell it to substitute t0.Z_PK in the group-by clause. I would imagine that should be an easy special case for expressionForEvaluatedObject or "objectID" or "self" or "self.objectID".
Good luck, and please report back if you solve it. I'd be very interested.

It is perhaps easier to use an NSFetchedResultsController. You can set the sectionNameKeyPath to group and use the resulting NSIndexPaths to construct your dictionary.
That being said, I do not think that it makes any sense to group by objectID because every object ID is by definition unique. So there will be one instance in each group. This is likely why setting propertiesToGroupBy fails.
So, short answer: no.
E.g.
let fetchRequest = NSFetchRequest(entityName: "Tag")
var output = [(NSManagedObjectID, Int)]()
do {
    // Fetch full Tag objects, then read each relationship count in memory.
    let results = try context.executeFetchRequest(fetchRequest) as! [Tag]
    for tag in results {
        output.append((tag.objectID, tag.entries.count))
    }
} catch {
    print("Fetch failed: \(error)")
}
// output contains tuples with objectID and count
If entries is optional, use tag.entries?.count ?? 0.

How to query this with ORM?

I've been using Kohana for a couple of weeks. One thing I noticed is that Kohana is missing eager loading (as far as I know). Let's say I have the following tables.
Subjects
    id
    name
Chapters
    id
    subject_id
    name
Videos
    id
    chapter_id
    name
When a user opens a subject page, I want to display all the chapters and videos. With ORM, I can do
$tutorials = ORM::factory('subject')->where('id','=', 1)->find();
foreach($tutorials as $tutorial)
{
    $chapters = $tutorial->chapters->find_all();
    foreach($chapters as $chapter)
    {
        $videos = $chapter->videos->find_all();
    }
}
The above code is not efficient since it makes too many queries.
I thought about using join or database query builder, but both of them do not return a model object as their results. I also looked into with(), but it seems like it only works with one-to-one relationship.
Using join on an ORM object returns an ORM object, but it doesn't return the data from the joined tables.
What would be my best option here? I would like to minimize the number of queries and also get ORM objects as a result. Whatever it is, it should return all the columns from tutorials, chapters, and videos.

First of all, your code does more work than it needs to. The ORM method find() returns a single Model_Subject object, so there is nothing to loop over. See:
$chapters = ORM::factory('subject', 1)->chapters->find_all();
foreach($chapters as $chapter)
{
    $videos = $chapter->videos->find_all();
}
With the DB builder you can get this down to just 2 queries. First, get an array of all the chapter ids:
$chapters = DB::select('id')
    ->from('chapters')
    ->where('subject_id', '=', '1')
    ->execute()
    ->as_array(NULL, 'id');
Second, get all the videos for those chapter ids as Model_Video objects:
$videos = DB::select('id')
    ->from('videos')
    ->where('chapter_id', 'IN', $chapters)
    ->as_object('Model_Video')
    ->execute()
    ->as_array();

So I guess you want something like this.
$videos = ORM::factory('Video')
    ->join(array('chapters', 'chapter'), 'LEFT')->on('video.chapter_id', '=', 'chapter.id')
    ->join(array('subjects', 'subject'), 'LEFT')->on('chapter.subject_id', '=', 'subject.id')
    ->where('subject.id', '=', $id)
    ->find_all();
Come to think of it, if the video belongs_to chapter belongs_to subject, try the following using with():
$videos = ORM::factory('Video')
    ->with('chapter:subject') // These are the names of the relationships. `:` is separator
                              // equals $video->chapter->subject;
    ->where('subject.id', '=', $id)
    ->find_all();
With things like this it often helps to think 'backwards'. You need the videos on that subject so start with the videos instead of the subject. :)
EDIT: The drawback of the second approach is that it preloads all the data. It is shorter to write but heavier on the server. I'd use the first one unless I need to know the subject and chapter anyway.

Design a document database schema

I'm vainly attempting to learn how to use object databases. In database textbooks the tradition seems to be to use the example of keeping track of students, courses and classes because it is so familiar and applicable. What would this example look like as an object database? The relational database would look something like
Student
    ID
    Name
    Address
Course
    ID
    Name
    PassingGrade
Class
    ID
    CourseID
    Name
    StartTime
StudentClass
    ID
    ClassID
    StudentID
    Grade
Would you keep StudentClasses inside of Classes which is, in turn, inside Course and then keep Student as a top level entity?
Student
    ID
    Name
    Address
Course
    ID
    Name
    Classes[]
        Name
        StartTime
        Students[]
            StudentID

So you have Courses, Students and Classes, which are parts of Courses and visited by Students? I think the question answers itself if you think about it. Maybe it's clearer if you go away from the pure JSON of MongoDB and look at how you would define it in an ODM (the equivalent of an ORM in RDBs) as document based DBs don't really enforce schemas of their own (example is based on MongoEngine for Python):
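from mongoengine import (
    Document, EmbeddedDocument, StringField, ReferenceField, IntField,
    DateTimeField, ListField, EmbeddedDocumentField,
)
# Top-level collection of students.
class Student(Document):
    name = StringField(max_length=50)
    address = StringField()
# One student's grade in one class; embedded in Class, references Student.
class Attendance(EmbeddedDocument):
    student = ReferenceField(Student)
    grade = IntField(min_value=0, max_value=100)
# A scheduled class; embedded in Course.
class Class(EmbeddedDocument):
    name = StringField(max_length=100)
    start_time = DateTimeField()
    attendance_list = ListField(EmbeddedDocumentField(Attendance))
# Top-level collection of courses, each carrying its classes.
class Course(Document):
    name = StringField(max_length=100)
    classes = ListField(EmbeddedDocumentField(Class))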
This would give you two collections: one for Students and one for Courses. Attendance would be embedded in the Classes and the Classes would be embedded in the Courses. Something like this (pseudocode):
Student = {
    name: String,
    address: String
}
Course = {
    name: String,
    classes: {
        name: String,
        start_time: DateTime,
        attendance_list: {
            student: Student,
            grade: Integer
        }[]
    }[]
}
You could of course put the grade info in the student object, but ultimately there really isn't much you can do to get rid of that extra class.
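The trade-off of embedding attendance inside classes inside courses can be sketched with plain value types. This is a minimal illustration, not MongoEngine or any database API: the Swift struct names, the studentID reference field, the sample data and the grades(forStudent:in:) helper are all hypothetical and only mirror the pseudocode above. It shows that looking up one student's grades means walking every course and class.

```swift
// Top-level "collection" of students.
struct Student {
    let id: Int
    let name: String
    let address: String
}

// Embedded in a class; refers to a student by id rather than copying it.
struct Attendance {
    let studentID: Int
    let grade: Int
}

// Embedded in a course.
struct CourseClass {
    let name: String
    let attendanceList: [Attendance]
}

// Top-level "collection" of courses.
struct Course {
    let name: String
    let classes: [CourseClass]
}

// Collect every grade for one student by walking the embedded documents.
func grades(forStudent studentID: Int, in courses: [Course]) -> [Int] {
    var result: [Int] = []
    for course in courses {
        for courseClass in course.classes {
            for attendance in courseClass.attendanceList where attendance.studentID == studentID {
                result.append(attendance.grade)
            }
        }
    }
    return result
}

let alice = Student(id: 1, name: "Alice", address: "1 Main St")
let courses = [
    Course(name: "Algebra", classes: [
        CourseClass(name: "Monday", attendanceList: [
            Attendance(studentID: alice.id, grade: 90),
            Attendance(studentID: 2, grade: 70),
        ]),
    ]),
    Course(name: "Physics", classes: [
        CourseClass(name: "Tuesday", attendanceList: [
            Attendance(studentID: alice.id, grade: 75),
        ]),
    ]),
]

let aliceGrades = grades(forStudent: alice.id, in: courses)
print(aliceGrades)  // prints [90, 75]
```

If per-student lookups dominate, keeping the grade records on the student side instead would avoid this walk, at the cost of making per-class queries the expensive direction.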

The whole point of an OODBMS is to allow you to design your data model as if it were just in memory. Don't think of it as a database schema problem; think of it as a data modelling problem, on the assumption that you have a whole lot of virtual memory and a finite amount of physical memory. You want to make sure that you don't have to boil an ocean of page faults (or, in fact, database I/O operations) to do the operations that are important.

In a pure OODB, your model is fine.
