Solr - Relative boost of sort and query - search

My document is structured as follows:
{
  "food_group": "Proteins",
  "carbs": "6.295",
  "protein": "13.729",
  "fat": "2.551",
  "calories": 103.0
}
The aim is to fetch documents in an order determined both by a boost on the food_group values the user likes and by a boost on how close calories is to the user's preferred value.
The boost based on food_group is achieved as follows:
(
food_group:"Proteins"^boost1 OR
food_group:"Dairy"^boost2 OR
food_group:"Grains"^boost3
)
However, the moment I add abs(sub(preferred_calories,calories)) asc as a sort to factor in how close calories is to the user's preferred_calories, the boost based on food_group is lost.

Try using a boost function instead of a sort:
bf=div(1,abs(sub(100,calories)))
defType=edismax
q=(food_group:"Proteins"^100 OR food_group:"Dairy" OR food_group:"Grains")
Source : https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Thebf(BoostFunctions)Parameter
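As a sketch, both signals can be combined in a single edismax request instead of adding a sort. The boost values and the preferred-calories value of 100 below are placeholders that need tuning against each other, and recip (which computes a/(m*x+b)) is used instead of div only to avoid a division by zero when calories exactly equals the preferred value:
defType=edismax
q=(food_group:"Proteins"^100 OR food_group:"Dairy"^50 OR food_group:"Grains"^25)
bf=recip(abs(sub(100,calories)),1,1000,1000)
Because bf adds the function value to the relevance score, the food_group boost is preserved and calorie proximity contributes on top of it; the relative weights still need tuning for your data.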

Related

[Shopware6]: How can I add SQL Filter to Criteria?

So, the criteria are already quite powerful, yet I came across a case I can't seem to replicate on the criteria object.
I needed to filter out all entries that were no longer timely.
In a world where you'd be able to mix SQL with the field definition, it would look like this:
...->addFilter(
new RangeFilter('DATEDIFF(NOW(), INTERVAL createdAt DAY)', [RangeFilter::LTE => 1])
)
Unfortunately that doesn't work in our world.
When I pass the criteria to a search function, I only get:
"DATEDIFF(NOW(), INTERVAL createdAt DAY)" is not a field on xyz
I tried to do it with ->addExtensions and several other experiments, but I couldn't get it to work. I resorted to the Doctrine queryBuilder, using query parts, but the data I'm getting is not very clean and not mapped to an ORM entity like it should be.
Is it possible to write a criteria that incorporates native SQL filtering?
The DAL is deliberately designed not to accept SQL statements, as that abstraction is a core concept. Since the DAL offers extensibility for third-party extensions, it should be preferred over raw SQL in most cases. I would suggest writing a lightweight query that only fetches the IDs using your SQL, and then using these pre-filtered IDs to fetch complete data sets through the DAL.
$ids = (new QueryBuilder($connection))
    // ids are stored as binary(16); convert them to the hex strings the Criteria expects
    ->select(['LOWER(HEX(id))'])
    ->from('product')
    ->where('...')
    ->execute()
    ->fetchFirstColumn();
// pass the pre-filtered ids to the DAL
$criteria = new Criteria($ids);
This should offer the best of both worlds: the freedom of raw SQL and the extensibility features of the DAL.
In your specific case you could also just take the current date, subtract the number of days that should have passed, and use this threshold date to compare against the creation date:
$now = new \DateTimeImmutable();
$dateInterval = new \DateInterval('P1D');
$thresholdDate = $now->sub($dateInterval);
// filter to get all with a creation date greater than now -1 day
$filter = new RangeFilter(
'createdAt',
[RangeFilter::GTE => $thresholdDate->format(Defaults::STORAGE_DATE_TIME_FORMAT)]
);

Can't get identifier and max value in Cosmos DB

I would like to do some reporting on my Cosmos DB.
My query is:
Select Max(c.results.score) from c
That works, but when I also want the id of the highest score, I get an exception:
Select c.id, Max(c.results.score) from c
'c.id' is invalid in the select list because it is not contained in an aggregate function
You can execute the following query to achieve what you're asking (though it may not be very efficient in terms of RUs/execution time):
Select TOP 1 c.id, c.results.score from c ORDER BY c.results.score DESC
GROUP BY isn't supported natively in Cosmos DB, so there is no out-of-the-box way to execute this query.
To implement it with the out-of-the-box functionality, you would need to create a new document type that contains the output of your aggregation, e.g.
{
  "id": 1,
  "highestScore": 1000
}
You'd then need a process within your application to keep this up-to-date.
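As a rough sketch of that process, assuming the Node.js SDK (@azure/cosmos) and the field names used above (the account, database, container and summary document names are made up), you could periodically re-run the TOP 1 query and upsert the summary document:
const { CosmosClient } = require("@azure/cosmos");

const client = new CosmosClient({ endpoint: "https://<account>.documents.azure.com", key: "<key>" });
const container = client.database("<database>").container("<container>");

async function refreshHighestScore() {
  // fetch the current top score together with the id of its document
  const { resources } = await container.items
    .query("SELECT TOP 1 c.id, c.results.score FROM c ORDER BY c.results.score DESC")
    .fetchAll();
  if (resources.length === 0) return;
  // upsert the pre-aggregated summary document described above
  await container.items.upsert({
    id: "highestScore",
    documentId: resources[0].id,
    highestScore: resources[0].score
  });
}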
There is also documentdb-lumenize that would allow you to do this using stored procedures. I haven't used it myself but it may be worth looking into as an alternative to the above solution.
Link is:
https://github.com/lmaccherone/documentdb-lumenize

Best way to manage internationalization in database

I've been having some trouble managing i18n in my database.
For now I have just two languages available in my application, but in order to stay scalable I would like to do it the "best" way.
I could have duplicated all fields, like description_fr and description_en, but I wasn't comfortable with that at all. What I've done for now is an external table, call it content, whose structure is like this:
id_ref => entity referenced id (2)
type => table name (university)
field => field of the specific table (description)
lang => which lang (fr, en, es…)
content => and finally the appropriate content.
It may be important to mention that I use Sequelize as my ORM, so I can use useful hooks such as afterFind, afterCreate and afterUpdate. Each time I want to find a resource, for example, my hook retrieves all content rows for that resource and sets the proper values on my object. It works, but I'm not in love with it.
But I have some problems with this:
It considerably increases the number of requests to the database: if I select 50 rows, for example, I have to make 50 more requests, and that is just for one model. If I have nested models, it grows exponentially…
It's also complicated to query by translated content. For example, finding a university with a specific name becomes difficult.
And it's a lot of work to keep everything updated, etc.
So I wonder if it would be a good idea to save the data as JSON directly in the table concerned. Something like:
{
  fr: { name: 'Ma super université' },
  en: { name: 'My kick ass university' }
}
And keep using Sequelize hooks to build and insert the proper data into my object.
What do you think ?
How do you manage this ?
EDIT
I use a MySQL database.
It concerns around 20 fields (across models).
I have to fall back to my default_lang if there is no content set (e.g. event.description in French will be the same as the English one if there is no French content).
I used the npm package sequelize-i18n. It worked pretty well for me with Sequelize 3.23.2; unfortunately it does not seem to support Sequelize 4.x yet.
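If you instead go with the JSON-column idea from the question, a minimal sketch could look like the following (assuming MySQL 5.7.8+ and a Sequelize release that supports the JSON data type for MySQL; the model and helper names are made up):
const Sequelize = require('sequelize');
const sequelize = new Sequelize('db', 'user', 'password', { dialect: 'mysql' });

const DEFAULT_LANG = 'en';

// one JSON column holds every translation, e.g. { fr: { name: '...' }, en: { name: '...' } }
const University = sequelize.define('university', {
  i18n: { type: Sequelize.JSON, allowNull: false, defaultValue: {} }
});

// resolve a field for a language, falling back to the default language
function translate(instance, field, lang) {
  const translations = instance.i18n || {};
  const forLang = translations[lang] || {};
  const fallback = translations[DEFAULT_LANG] || {};
  return forLang[field] !== undefined ? forLang[field] : fallback[field];
}

// usage: translate(university, 'name', 'fr')
Note that searching by a translated value then requires MySQL's JSON functions (e.g. JSON_EXTRACT) in a literal where clause, so the "query by translated content" problem from the question does not fully go away.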

Alternative to skip and limit for mongoose pagination with arbitrary sorting

Let me start by saying that I've read (MongoDB - paging) that using skip and limit for pagination is bad for performance, and that it's better to sort by something like dateCreated and modify the query for each page.
In my case, I'm letting the user specify the parameter to sort by. Some may be alphabetical. Specifying a query for this type of arbitrary sorting seems rather difficult.
Is there a performance-friendly way to do pagination with arbitrary sorting?
Example
mongoose.model('myModel').find({...})
.sort(req.sort)
...
Secondary question: At what scale do I need to worry about this?
I don't think you can do this directly.
In my opinion, the best way is to build your query depending on your req.sort variable.
For example (written in CoffeeScript):
userSort = {name: 1} if req.sort? and req.sort is "name"
userSort = {date: 1} if req.sort? and req.sort is "date"
userSort = {number: 1} if req.sort? and req.sort is "number"
find {}, null, {skip: 0, limit: 0, sort: userSort}, (err, results) ->
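If you want to avoid skip on deeper pages, one common approach is to whitelist the sortable fields and resume after the last document of the previous page, with _id as a tie-breaker for non-unique values; a compound index on the sort field plus _id keeps this efficient. A sketch in plain JavaScript (assuming the client passes back the last document it saw):
// mirror of the whitelist idea above
const SORTABLE_FIELDS = ['name', 'date', 'number'];

function buildPageQuery(sortField, lastDoc) {
  const field = SORTABLE_FIELDS.includes(sortField) ? sortField : '_id';
  const sort = field === '_id' ? { _id: 1 } : { [field]: 1, _id: 1 };

  let filter = {};
  if (lastDoc) {
    // continue strictly after the previous page's last document
    filter = field === '_id'
      ? { _id: { $gt: lastDoc._id } }
      : {
          $or: [
            { [field]: { $gt: lastDoc[field] } },
            { [field]: lastDoc[field], _id: { $gt: lastDoc._id } }
          ]
        };
  }
  return { filter, sort };
}

// usage with mongoose:
// const { filter, sort } = buildPageQuery(req.sort, lastDocFromPreviousPage);
// mongoose.model('myModel').find(filter).sort(sort).limit(pageSize);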

Limit results on an LDAP request with Zend_Ldap

I'm currently working with Zend Framework and I would like to get information from an LDAP directory. For that, I use this code:
$options = array('host' => '...', 'port' => '...', ...);
$ldap = new Zend_Ldap($options);
$query = '(username=' . $_GET['search'] . ')';
$attributes = array('id', 'username', ...);
$searchResults = $ldap->search($query, $ldap->getBaseDn(), Zend_Ldap::SEARCH_SCOPE_SUB, $attributes);
$ldap->disconnect();
There may be many results, so I would like to implement pagination by limiting the number of results. I looked at the parameters of Zend_Ldap's search() function, which has a sort parameter, but found nothing to specify an interval.
Do you have a solution to limit the number of results (as in SQL with LIMIT 0, 200, for example)?
Thank you
A client-requested size limit can be used to limit the number of entries the directory server will return. The client-requested size limit cannot override any server-imposed size limit, however, and the same applies to the time limit. All searches should include a non-zero size limit and time limit; failure to include them is very bad form. See "LDAP: Programming Practices" and "LDAP: Search Practices" for more information.
"Paging" is accomplished using the simple paged results control extension. described in my blog entry: "LDAP: Simple Paged Results".
Alternatively, a search result listener, should your API support it, could be used to handle results as they arrive, which would reduce the memory requirements of your application.
Unfortunately, current releases of PHP don't support the LDAP pagination functions out of the box - see http://sgehrig.wordpress.com/2009/11/06/reading-paged-ldap-results-with-php-is-a-show-stopper/
If you have control of your server environment, there's a patch you can install with PHP 5.3.2 (and possibly others) that will allow you to do this: https://bugs.php.net/bug.php?id=42060.
... or you can wait until 5.4.0 is released for production, which should be in the next few weeks and which will include this feature.
ldap_control_paged_results() and ldap_control_paged_results_response() are the functions you'll want to use if you're going with the patch. I think they have been renamed to the singular ldap_control_paged_result() and ldap_control_paged_result_response() in 5.4.
Good luck!
