I have a login-protected website. It's an internal application that isn't available to the general public, so it isn't indexed by any search engine.
My application is developed on the Google App Engine.
I would like to add a search engine, but obviously without indexing the site publicly. Is there any solution available from Google, Bing, or others for a situation like this?
Have you done this before? What solution did you choose, and what were your results?
Well, Google has the Google Search Appliance, which is basically a blade server that lives on your internal network and creates a "private" index. But it is meant as an enterprise-caliber solution (translation: expensive).
Which framework is your website running on? You may be able to find an indexing/search module.
To answer the latter part of the question... I've used Xapian in a Django-based website (via the djapian adapter). It basically creates a full-text index. Results are maybe not as good as what Bing or Google would generate, but they are still quite good, and the API is easy to use too.
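To give a rough idea of what "creates a full-text index" means, here is a minimal sketch of an inverted index in plain Python. It is not Xapian's or djapian's API, only an illustration of the underlying idea; the page ids and texts are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each token to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def search(index, query):
    """Return the sorted ids of pages that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        matches &= index.get(token, set())
    return sorted(matches)


# Hypothetical sample pages, for illustration only.
pages = {
    "/wiki/vacation": "How to request vacation days",
    "/wiki/expenses": "How to file an expense report",
}
index = build_index(pages)
```

A real engine such as Xapian adds stemming, ranking, and on-disk storage on top of this, which is why its results are noticeably better than a plain token match.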
I am prototyping a Shopware App right now, where I want to extend the search with our search API. We already have a working plugin in the store for that.
I found those two references for hooks:
https://developer.shopware.com/docs/resources/references/app-reference/webhook-events-reference
https://developer.shopware.com/docs/resources/references/app-reference/script-reference/script-hooks-reference
Seems like there is no webhook for the search at all and just a script-hook for a finished search. In the plugin, we could just extend the ProductSearchRoute and be completely flexible.
Are search extensions not planned right now?
Cheers,
Tobias
I assume you want to alter the criteria for fetching the products. As of today, this is not yet possible with non-self-hosted apps. You could use the app scripts to enrich or replace the contents of an already loaded page, as you already mentioned. Obviously, that comes with some drawbacks regarding performance. The capabilities of apps are being enhanced continuously, though, so there's a chance search manipulation might become possible rather soon.
I might get flagged for this question, but I'll still give it a shot.
Since Google Site Search is being discontinued and we are not interested in the free version, we decided to go with Amazon CloudSearch. The challenge, though, is that it is not straightforward: we have to build a crawler, and some features need to be custom built.
I am trying to find examples of websites that have used ACS successfully, but I can't find anything good. Has anyone tried using Amazon CloudSearch for website search? Our website has more than 15,000 pages.
Our solution is .NET based, so I am thinking of writing a crawler that extracts content on a nightly basis and sends it to Amazon. Would that be the right way?
ACS is based on Solr. If your site is under your control, I think the first step is to extract all the useful content and generate XML/JSON files from it, then use the AWS CLI to upload these documents to ACS. You need to define the index fields before uploading the documents. ACS has REST APIs that let you retrieve the query results.
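As a rough sketch of the "generate JSON files" step, the snippet below turns extracted pages into a CloudSearch-style JSON batch of "add" operations. It is written in Python for brevity, but the same structure applies from .NET. The field names `url`, `title`, and `content` are assumptions and would have to match the index fields defined on your own domain; the sample page is hypothetical.

```python
import hashlib
import json


def to_batch(pages):
    """Build a JSON batch of "add" operations from {url: (title, body)}."""
    batch = []
    for url, (title, body) in pages.items():
        batch.append({
            "type": "add",
            # Document ids only allow a limited character set, so this
            # sketch uses a hash of the URL as a stable id.
            "id": hashlib.sha1(url.encode("utf-8")).hexdigest(),
            # Assumed field names; they must exist as index fields.
            "fields": {"url": url, "title": title, "content": body},
        })
    return json.dumps(batch)


# Hypothetical crawler output, for illustration only.
pages = {"https://example.com/about": ("About us", "We build things.")}
batch_json = to_batch(pages)
```

A nightly job could write `batch_json` to a file and upload it with the AWS CLI, as suggested above. Pages that disappear from the site would need a matching "delete" operation, which this sketch leaves out.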
I have a simple HTML site with 100+ pages or so. I want to add a search bar at the top so the user can search the site. I know about Google Custom Search, but it shows ads unless you pay at least $100. Obviously I'd like ad-less search on my site for free if at all possible!
I've also heard about Lucene/Solr, but they do not actually crawl the site. For that I would apparently need Nutch.
Anyway, the site I have runs on a Microsoft IIS6 server, but I have basically no knowledge as to how Solr, Nutch, etc. get "installed" on the server.
Also: I'd like to point out that I do have a local copy of the site. Perhaps I can do one big initial Nutch "crawl" locally that will create an .xml for Solr? That would help me get "up and running", but probably wouldn't be a good long-term solution.
So should I just use Google Custom Search, or is there a not-extremely-painful-to-implement alternative? The brain hurts, folks.
You did not mention how many search requests you want to handle, but if you use the JSON REST API of Google's Custom Search, you get 100 search queries a day for free and can display the results on your page without any ads.
A simple example request can be found here.
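For orientation, a request to the Custom Search JSON API might be built roughly like this. This is a sketch only: the API key and search engine id are placeholders you would obtain from Google, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query):
    """Return the request URL for one search query."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    return API_ENDPOINT + "?" + urlencode(params)


def extract_results(response):
    """Pull (title, link) pairs out of a parsed JSON response."""
    return [(item["title"], item["link"])
            for item in response.get("items", [])]


# Placeholder credentials, for illustration only.
url = build_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "site search")
```

Fetching `url` (for example from a small server-side script, so the key is not exposed in the page) returns JSON that `extract_results` can turn into a list you render yourself, which is what keeps the results free of ads.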
Here is an easy way that works pretty well, although you may be looking for something more than this.
http://sitecomber.com/getsitecomber/
You can create code to paste into your site in about 2 minutes. It doesn't get easier than that. Search is powered by Google, but results are isolated to your website.
EDIT: This no longer works.
I'm developing a mobile app that links to search result pages in the Android Marketplace app, but I want to avoid returning any adult-related content. The only valid search URI parameters that I can find are:
"details?id=<package_name>"
"search?q=<query>"
"search?q=pub:<publisher_name>"
I think it's pretty weird that Google doesn't offer an option for Safe Search in the Marketplace app, as their web-based Marketplace supports it and the base URI structure is identical to the app's.
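For reference, the three URI patterns listed above could be assembled as in the sketch below. It assumes the app's `market://` URI scheme, which the question does not spell out, and the function names are hypothetical. Note that none of the patterns has a slot for a content-filtering parameter, which is exactly the problem.

```python
from urllib.parse import quote

# Assumed base scheme for links that open in the Marketplace app.
MARKET_BASE = "market://"


def details_uri(package_name):
    """Link to a single app's detail page."""
    return MARKET_BASE + "details?id=" + quote(package_name, safe="")


def search_uri(query):
    """Link to the search results for a free-text query."""
    return MARKET_BASE + "search?q=" + quote(query, safe="")


def publisher_uri(publisher_name):
    """Link to the search results for one publisher's apps."""
    return MARKET_BASE + "search?q=pub:" + quote(publisher_name, safe="")
```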
I created a small web page with sample URLs that demonstrate the issue. All of the examples work on the Android Marketplace web site, but only the last example works in the app.
If Google doesn't offer any URI parameters that invoke Safe Search (and it appears they don't), can you think of any possible work-arounds?
I don't want kids searching for a term like "bears" and stumbling across something like this. [not safe for work]
...which, just for the record, is currently the 9th result of 2,079 matches for that search term.
The Marketplace app has a setting for content filtering (in four levels, just like Safe Search). Is it possible that the app parses all externally called URIs and replaces any Safe Search parameters with its own? After all, I shouldn't be able to override the app's settings. This activity would most likely take place on the client side (it almost has to), so packet sniffing via a WiFi connection may yield some clues.
I'm investigating the possibility of re-using Google Apps/Docs in a local hybrid desktop/browser application.
I've been going through the Google documentation on manipulating docs, e.g. the Spreadsheet. I can't seem to find any info on actually hosting the UI. Is this possible, or does it require some form of permission from Google?
So you basically want to embed a browser control in your application, pointed at the URL of a Google Apps doc? You could use the Google Document List API to retrieve the documents for a user, then use the URLs of those documents in your embedded browser control.
You don't need Google's permission to do that; you're writing a browser with some extra smarts built in.
What do you mean by "hosting the UI?" These apps are HTML/CSS/JavaScript. Are you thinking about embedding them in AIR or Titanium, or in some kind of web control in another app?
I briefly looked into doing this and figured that, if I really wanted to, I could just load the Google Docs page content dynamically and use JavaScript to strip away superfluous elements like the header and footer. Instead, I'll probably just use an open-source alternative, because they have come a long way and I want rich hooks.