Sphinx "reverse" search

We have a website where users post ads for things they want to sell, with parameters such as price, location, title and description. The ads can then be searched with Sphinx, where users can specify a min and max price, a location with a search radius (using Google Maps), etc. Users can choose to save these searches and get emails when new ads appear that match them. Herein lies the problem: we want to perform a reverse search every time an ad is posted. With the ad's price, location, title and description as parameters, we want to search through all the saved "searches" and get the ones that would have found the ad. The min and max price can just be handled in a query, I suppose, along with some quorum syntax to get all ads with at least 2, or maybe just 1, occurrence in the title/description. Our problem lies mostly in the geo-search. How do we find all searches whose "search circles" would include the newly posted location, without performing a search for every saved search?
That is the main question; any comments on our suggested solutions to the other problems are also very welcome. Thank you in advance / Jenny

The standard 'geo-search' support in Sphinx should work just as well on a prospective index as in a normal retrospective search.
First build a Sphinx 'index' of all the saved searches.
Then run a query using the 'ad' as the search query. Rather than filtering on a fixed radius, you use the radius from the attribute (i.e. the radius stored with each saved search). If you are using the API you can't use setFilterRange directly; you need to use setSelect to make a new virtual attribute:
$cl->setSelect("*,IF(@geodist<radius,1,0) as myfilter");
$cl->setFilter('myfilter',array(1));
(And yes, the min/max price can be done with normal filters too, just inverting the logic you would use in a retrospective search.)
The complication is in the 'full-text' query if the saved search is anything more than a single keyword, but you appear to have already figured that part out.
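
To make the inverted logic concrete, here is a minimal sketch in plain Python of the predicate that the geo and price filters above express. It is not Sphinx code: the `SavedSearch` record, its field names, and the use of degrees and meters are assumptions made for illustration. It loops over the saved searches only to show what is being tested per row; in Sphinx the same test is evaluated inside a single query against the index of saved searches. If I recall correctly, Sphinx expects the anchor latitude/longitude in radians and reports the distance in meters, so the stored radius would need to be in meters as well.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


@dataclass
class SavedSearch:
    # Hypothetical layout of one saved search (one "document" in the index).
    search_id: int
    lat: float        # circle center, degrees
    lon: float        # circle center, degrees
    radius_m: float   # search radius, meters
    min_price: float
    max_price: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def matching_searches(ad_lat, ad_lon, ad_price, searches):
    """Return ids of saved searches whose circle and price range contain the ad."""
    hits = []
    for s in searches:
        # Geo test: distance to the ad compared with the radius stored on the search.
        in_circle = haversine_m(ad_lat, ad_lon, s.lat, s.lon) < s.radius_m
        # Price test, inverted: the ad's price must fall inside the search's range.
        in_price = s.min_price <= ad_price <= s.max_price
        if in_circle and in_price:
            hits.append(s.search_id)
    return hits
```

The point of the sketch is the direction of the comparisons: the ad supplies a single point and a single price, and each saved search supplies its own radius and its own price bounds.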

Related

Looking for a unique ID pattern that is easily indexed by search engines

Examples: from Microsoft, "KB2756872"; from the National Vulnerability Database, "CVE-2010-1428"; from Red Hat, "RHSA-2010:0376"; from OIDs, "1.3.6.1.4.1.311"; or from UUIDs/GUIDs, "550e8400-e29b-41d4-a716-446655440000".
I want UIDs to do several jobs for me, as described below.
I develop blog software and have the idea of putting a unique ID in the body of each post, so I can easily tell that a copy in local storage corresponds to a remote published copy.
I also want to post to many different blogging services, so that if one is down the articles remain accessible from another. A link can go dead, but if I add a UID, anyone can try a web search to find the post on another service!
This would also let me gather some statistics on how articles spread. Many sites just replicate content (copywriting and rewriting bots and people) to game search engines. With a UID I can easily identify such sites.
So my question is: how should I form UIDs so that they are easily indexed by search engines (web ones like Google/Yahoo, and corporate ones like Lucene/Solr/Sphinx/Xapian/etc.)?
I know about some search engine limitations, such as:
only parts of 3 or more characters are searchable
random junk like gfh6wytrh6wu56he5gahj763 does not get indexed
so this task is not easy...
Any advice is appreciated (books/blog articles/etc.).
You could use Tag URIs, as defined by RFC 4151.
They are globally unique, and everyone who owned a domain name or an email address for at least a day can mint them.
Note that these URIs only identify, they don’t locate. So a Tag URI doesn’t say anything about where something is published.
Let’s say your site’s domain is "example.com". If you create a blog post, you could create the following Tag URI:
tag:example.com,2012-12:cute-cat
Note that the date in this URI is not a publication date! It must be a (past) date on which you owned the domain (or email address). If you registered your domain in 2003, you could always use Tag URIs starting with tag:example.com,2004: (not "2003", because "2003" would mean "2003-01-01", which might be a time when you didn't own the domain yet), followed by a (unique) string under your control. However, if you like, you could of course always use the publication date. But don't use future dates.
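
As a rough illustration of the rules above, here is a small Python sketch that mints a Tag URI. The function name `mint_tag_uri` and its parameters are made up for this example; the character check is a simplified approximation of what RFC 4151 allows in the trailing part, not a full validator. It writes the date in the shortest form (year, year-month, or full date) and refuses future dates.

```python
import re
from datetime import date

# Simplified approximation of the characters allowed after the second colon.
_SPECIFIC_RE = re.compile(r"[A-Za-z0-9\-._~!$&'()*+,;=:@/?%]*")


def mint_tag_uri(authority, owned_on, specific, today=None):
    """Build 'tag:<authority>,<date>:<specific>' for a domain or email address.

    owned_on must be a date on which you controlled the authority; it is
    not required to be the publication date.
    """
    today = today or date.today()
    if owned_on > today:
        raise ValueError("the date in a Tag URI must not be in the future")
    if not _SPECIFIC_RE.fullmatch(specific):
        raise ValueError("unsupported character in the specific part")
    # Shortest date form: 'YYYY' means Jan 1, 'YYYY-MM' means the 1st of that month.
    if owned_on.month == 1 and owned_on.day == 1:
        stamp = f"{owned_on.year:04d}"
    elif owned_on.day == 1:
        stamp = f"{owned_on.year:04d}-{owned_on.month:02d}"
    else:
        stamp = owned_on.isoformat()
    return f"tag:{authority},{stamp}:{specific}"
```

For instance, `mint_tag_uri("example.com", date(2012, 12, 1), "cute-cat")` gives the URI from the example above. Keep in mind that tags are compared as plain strings, so pick one date form per authority and stick to it.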
You can use a year-and-number-based article identifier, just like CVE identifiers. Since you need revisions as well, you can append a dot and a revision number after the identifier to indicate the version. For example, for an AWesome Blog Service, AWBS-2012-1.0 would refer to the original document, AWBS-2012-1.1 to the first revision, etc.
However, you need to make sure that the AWBS identifiers are unique before you use them. CVEs are assigned manually from a pool. You would probably need some kind of service that assigns AWBS identifiers from a pool. It could be a simple database query.
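
One way the "simple database query" could look, as a minimal sketch: a per-year counter kept in SQLite. The table name `counters`, the function names, and the choice of SQLite are all assumptions for illustration; any store that can increment a number atomically would do.

```python
import sqlite3


def open_pool(path=":memory:"):
    """Open (or create) the hypothetical counter table used to hand out ids."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counters ("
        "year INTEGER PRIMARY KEY, last_seq INTEGER NOT NULL)"
    )
    return conn


def next_article_id(conn, year, prefix="AWBS"):
    """Reserve the next sequence number for the year, e.g. 'AWBS-2012-1'."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO counters (year, last_seq) VALUES (?, 0)", (year,)
        )
        conn.execute(
            "UPDATE counters SET last_seq = last_seq + 1 WHERE year = ?", (year,)
        )
        (seq,) = conn.execute(
            "SELECT last_seq FROM counters WHERE year = ?", (year,)
        ).fetchone()
    return f"{prefix}-{year}-{seq}"


def with_revision(article_id, revision):
    """Append the revision number after a dot, e.g. 'AWBS-2012-1.0'."""
    return f"{article_id}.{revision}"
```

The counter only guarantees uniqueness within one database, so if several blog installations share the same prefix they would need to share the pool as well.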

Facebook Place Search returns no results for new places

After I create a new place in the Facebook app, I use the Graph API to search for the place at the exact same location. However, I cannot get the place I just created, even if I increase the distance to 1000 ft.
My search URL is as follows:
https://graph.facebook.com/search?type=place&center=25.091075, 121.55983449999997&distance=100&limit=100&offset=0&access_token=XXXX
In addition, if I add q="My Place" parameter, I can get the place.
Is it possible to get the new place information without parameter 'q=My Place'?
Most likely, it takes a while for Facebook's search service to index the new place. I would be very surprised if you were still having trouble with this after a day or so.

Sharepoint: Limit search results by field

I have set up a search scope for the current members of a website (a "phone book" type of search). It is set up to automatically suggest limiting search results by people's job titles, adding something like "jobtitle:Manager" to the search query.
For single words ("Manager", "Supervisor", etc.) this works fine, but as soon as the title contains more than one word ("Managing Supervisor"), it returns zero search results. My gut feeling is that it's because when the url is entered as jobtitle:Managing Supervisor, it limits results by jobtitle = Managing, and then Supervisor simply becomes a generic search term.
I tried testing with manually added quotation marks, jobtitle:"Supervising Manager", but they are removed when I land on the search page and the effects are the same.
Is there any way to allow limiting of search results by fields with multiple words?
This is running SharePoint 2007.
Did you try adding a + between the words?

Sharepoint 2010 Search - Auto add property to QueryString

Have a bit of a difficult question which as far as I can see, no one has really managed to fix yet.
Here's the scenario: SharePoint 2010 Enterprise Search Center.
I've created a custom search results page. When people type any word in the search box, I want only those results to be displayed where the value provided by the user matches a specific managed search property.
Now I know a user can search for people with specific criteria by entering, for example,
Continent:Europe in the actual search box. SharePoint will refresh the page with the following added to the query string: k=Continent:Europe and the results will only show people who are from Europe.
So my question is: how can I set this up so that the user does not have to enter Continent:Europe in the search box and can just type Europe?
Thanks
One option is to create your own webpart that acts as the search box and replaces the standard one with your custom search box. The advantage of this is that you can more tightly control the user interface and then set up the query passed to the server (with the "k" parameter). You could prepend "Continent:" before the search term entered to help narrow the search.
Another use for this is to append * to any search term, because the People search does not include partial words by default.
We did this on one site to simplify the input and allow users to search with one text box (without the advanced features) and then users can use the refinements to narrow the search.
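
The query rewriting such a custom search box would do can be sketched as follows. This is only an illustration of building the "k" parameter, written in Python rather than as an actual web part; the function name and the results page path used in the example are hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(results_page, user_text, managed_property="Continent", wildcard=False):
    """Prefix the user's text with '<property>:' and put it in the 'k' parameter."""
    term = user_text.strip()
    if wildcard:
        term += "*"  # optional: also match partial words
    query = f"{managed_property}:{term}"
    # urlencode percent-encodes the colon and the asterisk for the query string.
    return f"{results_page}?{urlencode({'k': query})}"
```

Calling `build_search_url("/search/results.aspx", "Europe")` yields a URL whose decoded query string is k=Continent:Europe, which is what the user would otherwise have had to type by hand.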

Google-Analytics API to track Site Search?

So there's this nifty _trackPageview() api method on a tracker object, but is there a corresponding method that can be used to manually track a search? In other words, _trackPageview() reports to GA that a user hit a page. I want something like _trackSearch("terms") that would report to GA that a user searched for something.
Though not exactly what I was looking for, it seems that one can generate virtual page views to track search results programmatically.
Suppose that you've set up a Site Search parameter called "q", so that when a URI is tracked that contains q=these+are+some+terms, GA will mark it as a search hit. One can use the _trackPageview() method to generate virtual search hits like so:
pageTracker._trackPageview('/custom/search?q=These+are+some+terms')
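
If the virtual path is generated on the server before being handed to the tracking call, the search terms need to be URL-encoded first. A minimal Python sketch, assuming the Site Search parameter is "q" and reusing the '/custom/search' path from the line above (the function name is made up for this example):

```python
from urllib.parse import quote_plus


def virtual_search_path(terms, param="q", base="/custom/search"):
    """Return the virtual page path to pass to _trackPageview() for a search."""
    # quote_plus turns spaces into '+' and escapes characters such as '&' and '='.
    return f"{base}?{param}={quote_plus(terms)}"
```

Without the encoding, a search term containing & or = would be split into extra query parameters and the recorded keyword would be cut short.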
I pass search parameters by GET, so the URL for a search on "TEST" is
http://www.example.com/search?q=TEST
Selecting Content -> Site Search from my analytics account gives me a list of all keywords searched.
To learn more, check the documentation, especially the "How do I set up Site Search for my profile?" page.

Resources