Zoom Search Engine-like search engine, but for Linux/UNIX

I recently found the Zoom Search Engine, which struck me as quite interesting, since its software allows for easy decoupling of the indexing process and the searching process.
In other words, you run the indexer on your local machine and then upload the resulting index, along with the PHP files that search it, to your webserver.
So your webserver doesn't have to do the indexing. I have a host in a shared environment where it's best to use as few resources as possible, so this would be great for me. Moreover, I have a mostly unused small server at home (not the machine hosting my website) that I could use for indexing purposes.
However, it runs Linux and is accessible over SSH only, so the Zoom Search Engine is not an option.
Is there something that has the same principle as the Zoom Search Engine (index locally, upload index + PHP to website), but available for a command-line Linux environment?

My recommendation is to have a look at OpenSearchServer, a Lucene-based search engine. It is easy to set up, mature, and stable.
For your requirements:
OpenSearchServer supports both Linux and Windows.
SSH is enough for running OpenSearchServer remotely.
You can crawl the website locally and push the index (the data directory of OpenSearchServer) to your remote machine through replication or through FTP (a sketch of the FTP option follows this list). For a larger index, replication is the best option.
It has a PHP client library, so you can easily enable search in your existing or new application.
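As a concrete illustration of the FTP option above, here is a minimal sketch in PHP that copies a locally built index directory to the webserver. The host, credentials, and paths are placeholders, and a flat directory is assumed for simplicity; in practice any FTP client or rsync would do, the point being simply that the index is plain files you can copy.

    <?php
    // Push a locally built index (plain files on disk) to the webserver over FTP.
    // Host, credentials, and paths below are placeholders for illustration.
    $localDir  = '/home/me/opensearchserver/data/my_index';
    $remoteDir = '/search/data/my_index';

    $conn = ftp_connect('ftp.example.com');
    ftp_login($conn, 'username', 'password');
    ftp_pasv($conn, true);

    foreach (glob($localDir . '/*') as $file) {
        if (is_file($file)) {
            ftp_put($conn, $remoteDir . '/' . basename($file), $file, FTP_BINARY);
        }
    }

    ftp_close($conn);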

Sphinx Search Server: http://sphinxsearch.com/
It fulfils all your needs and is also used by some popular shops such as Craigslist and MySQL.
PHP support is central to Sphinx: the client interfaces are in PHP, while the actual engine is written in C++. It's blazing fast.
I myself use Solr/Lucene, but I give Sphinx +1 for your tasks.
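For example, here is a minimal sketch of querying Sphinx from PHP with the classic sphinxapi.php client. It assumes searchd is already running and that an index named myindex exists; the index name and host/port are placeholders.

    <?php
    // Minimal Sphinx search from PHP using the bundled sphinxapi.php client.
    // The index name "myindex" and the host/port are placeholders.
    require 'sphinxapi.php';

    $cl = new SphinxClient();
    $cl->SetServer('localhost', 9312);       // adjust to your searchd host/port
    $cl->SetMatchMode(SPH_MATCH_EXTENDED2);  // full query syntax

    $result = $cl->Query('linux search engine', 'myindex');
    if ($result === false) {
        echo 'Query failed: ' . $cl->GetLastError() . "\n";
    } elseif (!empty($result['matches'])) {
        foreach ($result['matches'] as $docId => $match) {
            echo "Document $docId (weight {$match['weight']})\n";
        }
    }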

Related

Using Both Elasticsearch and Sphinx

I currently have Sphinx integrated into my website.
Now I am thinking of integrating Elasticsearch for some other search features which I haven't yet built with Sphinx.
I have considered migrating to Elasticsearch entirely, but I do not want to change my existing Sphinx integration right now.
Is there any major issue if I use both search engines?
You will use more resources: more RAM, more disk space, and more CPU. There will also be more contention on the database if both index at the same time, so you need a powerful enough server.
There is also more maintenance, as you would need to maintain two packages, and their updates are likely to be on different schedules.

Which framework and language should I choose to create a simple web-based CRUD program?

I am a network administrator and a former software engineer.
I want to build my own program to keep track of IPs, equipment, etc. Since our company has fewer than
100 pieces of equipment (including PCs and printers), the data to process is small. Can anyone suggest which language and platform suit my needs best?
Hmm... if it were me, I would do a mix of PHP and MySQL for the data backend (CRUD operations), with HTML, CSS, and JavaScript for the front-end UI. This would require Apache, MySQL, and PHP to be installed; these are available on any platform (Windows, OS X, Linux, etc.).
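To make the PHP/MySQL suggestion concrete, here is a minimal sketch of the CRUD side using PDO. The database, table, and column names (inventory, equipment, name, ip_address) are just examples, not anything the question prescribes.

    <?php
    // Minimal CRUD sketch with PDO; all names and credentials are placeholders.
    $pdo = new PDO('mysql:host=localhost;dbname=inventory', 'user', 'password');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Create
    $pdo->prepare('INSERT INTO equipment (name, ip_address) VALUES (?, ?)')
        ->execute(['Printer, 1st floor', '192.168.1.42']);

    // Read
    foreach ($pdo->query('SELECT id, name, ip_address FROM equipment') as $row) {
        echo "{$row['id']}: {$row['name']} ({$row['ip_address']})\n";
    }

    // Update
    $pdo->prepare('UPDATE equipment SET ip_address = ? WHERE id = ?')
        ->execute(['192.168.1.43', 1]);

    // Delete
    $pdo->prepare('DELETE FROM equipment WHERE id = ?')->execute([1]);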
If it's just for your local use, create an Access database.
If web-based, this is a fast and simple way ... XML-based CRUD.

Solr / Lucene / Search Hosting

I need some sort of hosted search API for my website where I can submit content and search it with fuzzy matching, so that spelling mistakes and grammar won't affect the results.
I want to use Solr/Lucene or whatever technology is out there, without needing to install stuff on my server, to reduce setup complexity.
What Solr/Lucene/other search hosting services are there?
I've read some other posts on Stack Overflow, but the services mentioned are either no longer in business or are WordPress extensions that require server installation (i.e. the processing is done on the server).
You might consider Websolr, of which I am a cofounder, which is exactly the sort of service that you describe.
The thing is, Solr is highly dependent on its data model, or rather, how your users search will really affect the way you structure the data model in Solr. As far as I know, there aren't any really good hosting services for Solr yet, because you almost always need to make extensive modifications to the Solr configuration (most notably the schema.xml).
However, with that said, Solr is really easy to get up and running. The example application is bundled with Jetty and runs more or less directly after download.
So unless you have immense scaling issues (read: 5-10+ million documents or a really high queries-per-second load), I'd recommend that you actually install the application on your own server.
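If you do run it yourself, querying Solr from PHP is just an HTTP call to its standard select handler. Here is a minimal sketch, assuming the stock single-core example server on its default port; the trailing ~ in the query asks Lucene for a fuzzy match, which covers the spelling-mistake requirement.

    <?php
    // Query a local Solr instance over HTTP and decode the JSON response.
    // Assumes the single-core example setup on the default port 8983.
    $query = urlencode('acomodation~');  // trailing ~ = fuzzy match in Lucene syntax
    $url   = "http://localhost:8983/solr/select?q={$query}&wt=json";

    $response = json_decode(file_get_contents($url), true);
    foreach ($response['response']['docs'] as $doc) {
        print_r($doc);  // each matching document as an associative array
    }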
Amazon CloudSearch is the best alternative if you do not want to worry about hosting.
http://aws.amazon.com/cloudsearch/
http://docs.amazonwebservices.com/cloudsearch/latest/developerguide/SvcIntro.html
gotosolr - http://gotosolr.com/en
Apache Solr indexes are distributed across two hosting companies.
Security is managed by HTTPS and basic HTTP authentication.
Real-time statistics.
Also ready for agencies, with multi-account and multi-subscription support.
Supports Drupal and WPSOLR (https://wordpress.org/plugins/wpsolr-search-engine/).

Can CouchDB actually be used for a desktop application?

I'm hoping someone can validate or correct my conclusions here.
I'm looking into writing a small side project. I want to create a desktop application for taking notes that will synchronise to a web-server so that multiple installations can be kept in step and data shared and also so that it can be accessed via a browser if necessary.
I've kind of been half-listening to the noises about CouchDB and I've heard mention of "offline functionality", of desktop-couchdb, and of moves to utilise its ability to handle intermittent communications to enable distributed applications in the mobile market. This all led me to believe that it might be an interesting option for providing my data storage and also handling my synchronisation needs, but after spending some time looking around for info on how to get started, my conclusion is that I've got completely the wrong end of the stick and the reality is that:
There's no way of packaging up a CouchDB instance, distributing it as part of a desktop application and running it in the context of that application to provide local storage and synchronisation to a central database.
Am I correct here? If so is there any technology out there that does this sort of thing or am I left just rolling my own local storage and maybe still using CouchDB on the server?
Update (2012/05): check out the new TouchDB projects from Couchbase if you are targeting Mac OS X, iOS, or Android. These actually use SQLite under the hood (at least for now) but can replicate to/from a "real" CouchDB server. Another client-side alternative that is finally starting to mature is PouchDB, which runs in IndexedDB-capable browser engines. Using these, or using them to inspire a similar port to another desktop platform, is now becoming a better-trodden path.
Original answer:
There's no way of packaging up a CouchDB instance, distributing it as part of a desktop application and running it in the context of that application to provide local storage and synchronisation to a central database.
At this point in time, your statement is practically correct, although it is possible to include CouchDB in an app: for an example, see CouchDBX.app, which is a thin wrapper around a prefixed bundle of CouchDB and all its dependencies.
The easiest way to build a CouchDB app is to assume that the user will already have a CouchDB server running. This is easier than it sounds, especially with Couchone's hosting or a prebuilt app like CouchDBX on OS X or DesktopCouch on Ubuntu. The latter is especially interesting because, if I understand correctly, it is included by default with Ubuntu these days and automatically spins up a per-user CouchDB server when you query its port via D-Bus. Something similar could (and should) be done on OS X using launchd and Bonjour.
So, as you write, you would either design your app to store data in a local format and optionally sync with a CouchDB service you provide, or you'd have to build and bundle all of Erlang, SpiderMonkey, and CouchDB together with your app, along with some scripts to make sure it is running when needed. This is possible, but obviously neither of these is ideal, and believe me, you're not the only one wanting a simpler solution for desktop-oriented apps!
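For the synchronisation half, it may help to know that CouchDB replication is a single HTTP call, so whatever you bundle (or assume the user runs) only needs to be able to POST JSON. Here is a minimal sketch, in PHP for consistency with the rest of this page; the database name and target URL are placeholders.

    <?php
    // One-shot push replication via CouchDB's _replicate endpoint.
    // "notes" and the target URL/credentials are placeholders.
    $body = json_encode([
        'source' => 'notes',
        'target' => 'https://user:password@couch.example.com/notes',
    ]);

    $context = stream_context_create(['http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/json\r\n",
        'content' => $body,
    ]]);

    $result = file_get_contents('http://localhost:5984/_replicate', false, $context);
    echo $result;  // JSON with "ok":true and a replication history on success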

Drupal using Solr externally on another machine?

I'd like to use the Drupal Apache Solr Search module with Solr hosted on an external machine. I know that Acquia offers this as a service, but it's not an affordable option for me. I'd like to install Solr on an inexpensive VPS and have all my various Drupal sites, which are on different hosts, access the search functions. Am I barking up the wrong tree?
The issues that Mauricio brings up are valid; however, I'm certain that it's possible to set up Solr on a separate server without problems, for two reasons:
1. We are currently using such a setup.
2. acquia.com offers Solr as a service (which might be a good solution for you if you don't want to deal with running or setting up your own Solr server).
We are currently using a separate Solr server, and because it is in the same local network as the webserver (both on the Rackspace cloud), there are no latency issues.
Security is a concern and should not be taken lightly, since Solr has very little built in.
The easiest type of security to set up is limiting access to the Solr server to only the webserver. There are, however, more flexible security solutions, but they will probably take some more setting up.
Sure, you can do that. But keep these things in mind:
Security: if your Solr instance is not in the same local network as your Drupal sites, you'll have to carefully set up security in your Solr's Tomcat/Jetty. Having a publicly accessible Solr instance could be a major security problem.
Latency: another issue with a remote Solr server is latency. It's not cool if your carefully optimized website takes 2s to return the search results.
Bandwidth between Solr and the Drupal website could also be a problem, but I guess latency is a bigger issue.
