Elasticsearch requirements to run on my server - Node.js

So I am implementing search in my frontend and my backend. I found that the most used search engines are Algolia and Elasticsearch. I decided to go with Elasticsearch since it is open source and less costly, but I am worried it would be a great overhead for my server if I decide not to host it on Elastic Cloud, AWS, or Azure.
My question: will it be a significant overhead for my server, or will its load not be much?

Elasticsearch will run even with 2 GB of RAM; what you need to consider is the usage: how much data you plan to store and how you are going to search it.
It is better to just try it; setting it up is very easy.
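If you want to gauge the load before committing, here is a minimal sketch using the official @elastic/elasticsearch Node.js client against a local single-node instance (the index name and document shape are placeholders): index a few documents, run a search, and watch your server's memory and CPU while you do it.

```typescript
import { Client } from "@elastic/elasticsearch";

// Assumes a local single-node Elasticsearch listening on port 9200.
const client = new Client({ node: "http://localhost:9200" });

async function tryItOut() {
  // Index a sample document ("products" is just a placeholder index name).
  await client.index({
    index: "products",
    document: { name: "red running shoes", price: 59.99 },
  });
  await client.indices.refresh({ index: "products" });

  // Run a simple full-text query and print the hits.
  const result = await client.search({
    index: "products",
    query: { match: { name: "running" } },
  });
  console.log(result.hits.hits);
}

tryItOut().catch(console.error);
```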

Related

MongoDB hosting: remote vs. on the same network

What is the killer reason to use remote DB hosting services for MongoDB (like compose.io) for a Node.js application, versus hosting MongoDB on the same network (in the same datacenter, etc.), for example when using PaaS providers (like modulus.io) which offer "integrated" MongoDB hosting?
How much may speed/performance degrade when using remote DBs over the internet, and how do DB providers solve this? How do I make the right decision on this?
The reason you use something like compose.io is that you don't want to deal with it on your own and would rather have experts who know what they are doing take care of it, ideally with support so you can take further advantage of their expertise. And that's the only reason.
If you use Modulus, which offers this anyway, and you run your application there as well, even better. There is no real reason to run your Node application on Modulus and your MongoDB on a different cloud hosting service.
In practice that probably doesn't matter as much because they all use AWS anyway ;)
Important: if they DON'T run in the same network, make sure your MongoDB is protected properly (!!). If you do run in the same network, just make sure the MongoDB is not accessible from the outside at all, which is definitely the better solution!
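For the remote case, the client side of "protected properly" usually means connecting with credentials over TLS. A rough sketch with the official mongodb Node.js driver, where the host, user, and database names are placeholders, and the server itself still needs auth enabled and its bind address/firewall locked down:

```typescript
import { MongoClient } from "mongodb";

// Placeholder connection string: a remote MongoDB with auth enabled,
// reached over TLS. The server must still restrict access via firewall
// or bindIp so that only your app servers can reach it.
const uri =
  "mongodb://appUser:secretPassword@db.example.com:27017/myapp?authSource=admin&tls=true";

async function main() {
  const client = new MongoClient(uri);
  await client.connect();

  const users = client.db("myapp").collection("users");
  console.log("user count:", await users.countDocuments());

  await client.close();
}

main().catch(console.error);
```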
Hope that helps

Where to store the configuration of an Elastic Beanstalk application?

I created a small Node.js application which runs on AWS Elastic Beanstalk. At the moment the application configuration is stored in a JSON file. I want to create a frontend to manipulate some parts of this configuration, and I have read about the MEAN stack, but Amazon has no MongoDB support. So what is the best practice in AWS Elastic Beanstalk for handling configuration for an application? Storing it in an S3 bucket is very easy, but I think the performance is not very good.
Best regards
How much configuration data are you talking about? If it is a typically small amount, and it only changes once in a while, but you need it available each time the application restarts, S3 is probably the easiest and cheapest option. Spinning up a MongoDB instance just to store a small amount of mostly-read-only data is probably overkill. What makes you think the performance is not very good?
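To give a sense of how little is involved, here is a rough sketch of loading a small JSON config from S3 once at startup with the AWS SDK for JavaScript v3 (the bucket and key names are made up; in practice you would cache the result rather than fetch it on every request):

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

// Placeholder bucket/key; credentials come from the Beanstalk instance role.
const s3 = new S3Client({ region: "us-east-1" });

async function loadConfig(): Promise<Record<string, unknown>> {
  const response = await s3.send(
    new GetObjectCommand({ Bucket: "my-app-config", Key: "config.json" })
  );
  // Body is a stream; transformToString() reads it fully into a string.
  const json = await response.Body!.transformToString();
  return JSON.parse(json);
}

// Load once at startup and reuse, instead of hitting S3 on every request.
loadConfig().then((config) => console.log("loaded config:", config));
```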
AWS usually recommends DynamoDB for such cases, but then you are getting vendor lock-in. Also, the choice of configuration storage depends on your requirements: how fast do new changes need to be applied to the instances?
A good option is to use MySQL as the configuration DB: you avoid vendor lock-in, you can deliver configuration changes as soon as they have been applied, and in the app you can use MySQL's memcached interface.

Solr / Lucene / Search Hosting

I need some sort of hosted search API for my website where I can submit content and search content with fuzzy logic, where spelling mistakes and grammar won't affect results.
I want to use Solr/Lucene or whatever technology is out there, without needing to install stuff on my server, to reduce setup complexity.
What Solr/Lucene/other search hosting services are there?
I've read some other posts on Stack Overflow, but the services mentioned are either no longer in business or are WordPress extensions that require server installation (i.e. the processing is done on the server).
You might consider Websolr, of which I am a cofounder, which is exactly the sort of service that you describe.
The thing is, Solr is highly dependent on its data model. Or rather, how your users search will really affect the way you structure the data model in Solr. As far as I know there aren't any really good hosting services for Solr yet, because you almost always need to make such extensive modifications to the Solr configuration (most notably the schema.xml).
However, with that said, Solr is really easy to get up and running. The example application is bundled with Jetty and runs more or less directly after download.
So unless you have immense scaling issues (read: 5-10+ million documents or a really high queries-per-second load), I'd recommend you actually install the application on your own server.
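For the fuzzy-matching part of the question: once you have a core running, you can send Lucene fuzzy queries (the ~ operator with an edit distance) straight to Solr's select handler. A rough sketch using Node's built-in fetch, where the core name "site" and the field name "content_txt" are placeholders for your own schema:

```typescript
// Query a local Solr core for documents matching a misspelled term.
// "site" (core) and "content_txt" (field) are placeholders for your schema.
async function fuzzySearch(term: string) {
  const params = new URLSearchParams({
    q: `content_txt:${term}~2`, // fuzzy match, max edit distance of 2
    rows: "10",
    wt: "json",
  });
  const response = await fetch(`http://localhost:8983/solr/site/select?${params}`);
  const data = await response.json();
  return data.response.docs;
}

fuzzySearch("elasticsaerch").then((docs) => console.log(docs));
```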
Amazon CloudSearch is the best alternative if you do not want to worry about hosting.
http://aws.amazon.com/cloudsearch/
http://docs.amazonwebservices.com/cloudsearch/latest/developerguide/SvcIntro.html
gotosolr - http://gotosolr.com/en
Apache Solr indexes are distributed across two hosting companies.
Security is managed by HTTPS and basic HTTP authentication.
Real-time statistics.
Also ready for agencies, with multi-account and multi-subscription support.
Supports Drupal and WPSOLR (https://wordpress.org/plugins/wpsolr-search-engine/)

log4j Log Indexing using Solr

We are finding it very hard to monitor the logs spread over a cluster of four managed servers. So, I am trying to build a simple log4j appender which uses the SolrJ API to store the logs in the Solr server. The idea is to leverage Solr's REST interface to build a better GUI which could help us
search the logs and display the previous and next 50 lines or so, and
tail the logs
Being awful at front ends, I am trying to cook up something with GWT (a prototype version). I am planning to host the project on Google Code under the ASL.
I would greatly appreciate it if you could share some insight on:
Whether it makes sense to create a project like this?
Is using Solr for this overkill?
Any suggestions on a web framework/tool which would help me build a tab-based front end for tailing.
You can use a combination of Logstash (for shipping and filtering logs) + Elasticsearch (for indexing and storage) + Kibana (for a pretty GUI).
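To illustrate the "previous/next 50 lines" requirement with that stack, here is a rough sketch of paging through matching log entries in Elasticsearch from Node.js, assuming Logstash-style indices with @timestamp and message fields (the index pattern and search term are placeholders):

```typescript
import { Client } from "@elastic/elasticsearch";

const client = new Client({ node: "http://localhost:9200" });

// Fetch one 50-line "page" of log entries matching a term, newest first.
// "logstash-*" and the field names follow Logstash defaults; adjust to your setup.
async function searchLogs(term: string, page: number) {
  const result = await client.search({
    index: "logstash-*",
    from: page * 50,
    size: 50,
    sort: [{ "@timestamp": "desc" }],
    query: { match: { message: term } },
  });
  return result.hits.hits.map((hit) => hit._source);
}

searchLogs("NullPointerException", 0).then((lines) => console.log(lines));
```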
The Loggly folks have also built Logstash, which can be backed by quite a few things, including Lucene via Elasticsearch. It can forward to Graylog as well.
Totally doable. Many folks have rolled their own. A couple of useful links: there is an online service, www.loggly.com, that does this. They are actually based on Solr as the core storage engine! Obviously they have built a proprietary interface on top.
Another option is http://www.graylog2.org/. It is open source. Not backed by Solr, but still very cool!

Drupal using Solr externally on another machine?

I'd like to use the Drupal Apache Solr Search module with Solr hosted on an external machine. I know that Acquia offers this as a service, but it's not an affordable option for me. I'd like to install Solr on an inexpensive VPS and have all my various Drupal sites, which are on different hosts, access the search functions. Am I barking up the wrong tree?
The issues that Mauricio brings up are valid; however, I'm certain that it's possible to set up Solr on a separate server without problems, for two reasons:
1. We are currently using such a setup.
2. acquia.com offers Solr as a service (which might be a good solution for you if you don't want to deal with running or setting up your own Solr server).
We are currently using a separate Solr server, and because it is in the same local network as the web server (both on Rackspace Cloud) there are no latency issues.
Security is a concern and should not be taken lightly, since Solr has very little built in.
The easiest type of security to set up is limiting access to the Solr server to only the web server. There are, however, more flexible security solutions, but they will probably take more setting up.
Sure, you can do that. But keep these things in mind:
Security: if your Solr instance is not in the same local network as your Drupal sites, you'll have to carefully set up security in your Solr's Tomcat/Jetty. Having a publicly accessible Solr instance could be a major security problem.
Latency: another issue with a remote Solr server is latency. It's not cool if your carefully optimized website takes 2s to return the search results.
Bandwidth between Solr and the Drupal website could also be a problem, but I guess latency is a bigger issue.
