Spring Boot logs missing using crontab command - linux

I run my Spring Boot/Batch Jar to process large number of files in my local environment like below:
java -jar /path/to_my_dir_for_jar/springbatch.jar --spring.profiles.active=qa
And logs get generated in the same directory: /path/to_my_dir_for_jar
No issues whatsoever till here.
Now I ran the same job using linux crontab using (11:30 am, daily) :
30 11 * * * java -jar /path/to_my_dir_for_jar/springbatch.jar --spring.profiles.active=qa
Please note: I have to rely on linux crontab to schedule and not spring cron-expression as Spring Batch wont read the new files in the staging directory after first run.
Now I can NOT find my logs at - /path/to_my_dir_for_jar. In the application there different types of logs are generated such as:
service_log.gnxError.log
service_log.gcnDone.log
service_log.error.log
service_log.done.log
service_log.log
I tried to follow this and some other similar help but none of them seem to be working as most of them say - No such file or directory
Where to look for them?
One possible way is to redirect the output of my crontab expression to a file like below:
30 11 * * * java -jar /path/to_my_dir_for_jar/springbatch.jar --spring.profiles.active=qa >> foobar.log
But this way I will loose my different type of logs as everything will be fused in just one file in this case.
Will I get the logs the way I get while running the jar manually ever using linux crontab? What are my options in this situation?

Related

OpenMPI: ORTE was unable to reliably start one or more daemons

I've been at it for days but could not solve my problem.
I am running:
mpiexec -hostfile ~/machines -nolocal -pernode mkdir -p $dstpath where $dstpath points to current directory and "machines" is a file containing:
node01
node02
node03
node04
This is the error output:
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06177] [[6421,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
[node01:06177] 1 more process has sent help message help-errmgr-base.txt / failed-daemon-launch
[node01:06177] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06181] [[6417,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891
I have 4 machines, node01 to node04. In order to log into these 4 nodes, I have to first log in to node00. I am trying to run some distributed graph functions. The graph software is installed in node01 and is supposed to be synchronised to the other nodes using mpiexec.
What I've done:
Made sure all passwordless login are setup, every machine can ssh to any other machine with no issues.
Have a hostfile in the home directory.
echo $PATH gives /home/myhome/bin:/home/myhome/.local/bin:/usr/include/openmpi:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
echo $LD_LIBRARY_PATH gives
/usr/lib/openmpi/lib
This has previously worked before, but it just suddenly started giving these errors. I got my administrator to install fresh machines but it still gave such errors. I've tried doing it one node at a time but it gave the same errors. I'm not entirely familiar with command line at all so please give me some suggestions. I've tried reinstalling OpenMPI from source and from sudo apt-get install openmpi-bin. I'm on Ubuntu 16.04 LTS.
You should focus on fixing:
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06177] [[6421,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891

How to run a nodejs script every second

I need to run my nodejs script for every second ,Similar to PHP cron jobs. I have tried some nodejs cron libraries like https://github.com/ncb000gt/node-cron but the issue was first run should be manual i:e I have to run the file with cron script for first time manually.
But in php cron jobs, they run by the server so if the apache server running script will automatically start and even if the script return an error for a cycle then script will run again from the beginning from the next cycle
So is there any way to achieve this in nodejs ?
You have two options:
using Node as a daemon, with something like Supervisord to run your node-cron script. This alternative is wasteful on resources such as RAM because Node and Supervisord are running all the time.
using the system's crontab, you can run your script like calling Node on the command line, such as * * * * node /path/to/your/script.js. This alternative is highly efficient but lacks some control, like being able to log the output in case of an error, although you could just pipe the output to a file: node script.js > logfile

Spark is not started automatically on the AWS cluster - how to launch it?

A spark cluster has been launched using the ec2/spark-ec2 script from within the branch-1.4 codebase. I have logged onto it.
I can login to it - and it reflects 1 master, 2 slaves:
11:35:10/sparkup2 $ec2/spark-ec2 -i ~/.ssh/hwspark14.pem login hwspark14
Searching for existing cluster hwspark14 in region us-east-1...
Found 1 master, 2 slaves.
Logging into master ec2-54-83-81-165.compute-1.amazonaws.com...
Warning: Permanently added 'ec2-54-83-81-165.compute-1.amazonaws.com,54.83.81.165' (RSA) to the list of known hosts.
Last login: Tue Jun 23 20:44:05 2015 from c-73-222-32-165.hsd1.ca.comcast.net
__| __|_ )
_| ( / Amazon Linux AMI
___|\___|___|
https://aws.amazon.com/amazon-linux-ami/2013.03-release-notes/
Amazon Linux version 2015.03 is available.
But .. where are they?? The only java processes running are:
Hadoop: NameNode and SecondaryNode
Tachyon: Master and Worker
It is a surprise to me that the Spark Master and Workers are not started. When looking for the processes to start them manually it is not at all obvious where they are located.
Hints on
why spark did not start automatically
and
where the launch scripts live
would be appreciated. (In the meantime i will do an exhaustive
find / -name start-all.sh
And .. survey says:
root#ip-10-151-25-94 etc]$ find / -name start-all.sh
/root/persistent-hdfs/bin/start-all.sh
/root/ephemeral-hdfs/bin/start-all.sh
Which means to me that spark were not even installed??
Update I wonder: is this a bug in 1.4.0? I ran same set of commands in 1.3.1 and the spark cluster came up.
There was a bug in spark 1.4.0 provisioning script which is cloned from github repository by spark-ec2 (https://github.com/mesos/spark-ec2/) with similar symptoms - apache spark haven't started. The reason was - provisioning script failed to download spark archive.
Check was spark downloaded and uncompressed on the master host ls -altr /root/spark there should be several directories there. From your description looks like /root/spark/sbin/start-all.sh script is missing - which is missing there.
Also check the contents of the file cat /tmp/spark-ec2_spark.log it should has information about uncompressing step.
Another thing to try is to run spark-ec2 with other provisioning script branch by adding --spark-ec2-git-branch branch-1.4 into the spark-ec2 command line argument.
Also when you run spark-ec2 save all output and check is there something suspicious:
spark-ec2 <...args...> 2>&1 | tee start.log

Trouble running cron on Joyent

I'm trying to set up a node script to run as a cron job on Joyent. I can run arbitrary commands but node scripts to seem to execute. As an example:
# cron
# call a script every minute
# being specific about the location of node and the script to run
* * * * * /home/node/local/nodejs/bin/node /full/path/to/some-script.js
// node script at /full/path/to/some-script.js
var fs = require('fs');
fs.writeFile('/home/node/node-service/some-script.log', new Date.toString(), 'utf8');
What I expect to see after one minute is a file at /home/node/node-service/some-script.log with content like Mon Jan 21 2013 15:19:11 GMT-0600 but I see nothing. This is still the case even if the script is set to full read, write and execute permissions for all users and whether the crontab is set for the root or node users.
What am I missing?
Thanks
The fourth optional argument to writeFile is a callback to fire when the file system is done writing the file. You can use it to determine the error that is happening, as it's only argument is an error. Refer to the docs here.
It appears to be working now. I'm not sure what I changed that got it working. It may have been a permissions issue.

Runtime.exec() in Hadoop on Azure environment

This question is related to Hadoop on Azure environment.
I am trying to use Runtime.exec() to execute a batch script in the reduce function. I could not get this running in Hadoop on Azure environment while it runs fine in the Hadoop on Linux. I tested the Runtime.exec() code snippet in my desktop (windows 7) environment and it runs fine there. I have made sure that I consume the output and error streams of the sub-process after Runtime.exec().
The batch script contains the below ( a single command):
c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\attempt_201207121317_0024_r_000001_0\work\tool.exe
-f c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\work\11_task_201207121317_0024_r_000001.out
-i c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\attempt_201207121317_0024_r_000001_0\work\input.txt
I distribute the tool.exe and input.txt files using Distributed cache and it creates a symlink from the working directory. tool.exe and input.txt points to the actual files in the jobcache directory.
2012-07-16 04:31:51,613 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /hdfs/mapred/local/taskTracker/distcache/-978619214658189372_-1497645545_209290723/10.73.50.78tool.exe <- \hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\attempt_201207121317_0024_r_000001_0\work\tool.exe
2012-07-16 04:31:51,644 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /hdfs/mapred/local/taskTracker/distcache/-4944695173898834237_1545037473_2085004342/10.73.50.78input.txt <- \hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\attempt_201207121317_0024_r_000001_0\work\input.txt
The reducer gives the below error when it runs.
Command Execution Error: Cannot run program
"cmd /q /c c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\work\11_task_201207121317_0024_r_0000011513543720767963399.bat":
CreateProcess error=2, The system cannot find the file specified
In another case, I tried running the same but without using the absolute paths.. The output stream from the sub-process is shown below:
c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0022\attempt_201207121317_0022_r_000000_0\work>tool.exe -f /hdfs/mapred/local/taskTracker/nabeel/jobcache/job_201207121317_0022/work/1_task_201207121317_0022_r_000000.out
-i input.txt
I do not know how the job working directory paths and distributed cache works in Hadoop on Azure environment. Could you please let me know if I am missing something here (or) there is something I need to take care of while using Runtime.exec() in Hadoop on Azure environment.
Thanks,
.,._
Reply to sender | Reply to group | Reply via web post | Start a New Topic
I am not familiar with Hadoop. But the error message seems to be obvious. It would be better if you can check whether the file exists.
c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\work\11_task_201207121317_0024_r_0000011513543720767963399.bat
Best Regards,
Ming Xu

Resources