Cygwin and Apache Pig - a perplexing pseudo-grunt>

I'm trying to get a working installation of Apache Pig on a Windows PC running Windows Vista, in order to use it as a learning tool; I don't intend to do any serious data processing with Pig on this machine. A single-node, single-JVM -x local setup is all I want.
I come from a Windows background, so UNIX is the big learning curve for me. Following the advice in the online Apache Pig "Getting Started" documentation, I have installed Cygwin and it seems to be working fine. I included the Perl package in my Cygwin download and installation, as Getting Started advises, and that also seems to be working fine: the /bin directory contains perl.exe and I can access all the Perl documentation.
I then downloaded pig-0.11.1, unpacked it with tar -xzvf pig-0.11.1.tar.gz, and spent a few (mostly enjoyable) days using the errors I got from pig -x local as a prompt to study the Bash Reference Manual and work through the pig shell script, which I think I now largely understand. After adjusting the calls to the Cygwin utility cygpath in that script, so that pig.jar is found and the arguments passed to java.exe are converted by cygpath into a form java.exe can understand, I get a grunt prompt. But my whoops of joy have been short-lived.
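To illustrate the kind of conversion involved (this is only a sketch of how cygpath is typically used, not the actual lines from the distributed script, and the variable names are mine):

# Convert Cygwin-style paths to Windows form before handing them to java.exe
PIG_JAR=$(cygpath -w "$PIG_HOME/pig.jar")     # e.g. C:\cygwin\home\Richard\...\pig.jar
CLASSPATH=$(cygpath -w -p "$CLASSPATH")       # -p converts a whole PATH-style list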
In fact, I get the same grunt prompt with pig-0.7.0 downloaded, installed and used out of the box with pig -x local, as RELEASE_NOTES.txt describes, without any tampering with its pig shell script at all. Unfortunately it is the same curious pseudo-grunt prompt I get with pig-0.11.1: the arrow keys move the cursor all over the prompt, indeed all around the screen, even over previous commands given at the dollar prompt, and the return key (preceded by ;) does nothing but jump the cursor to a new line. Text can be typed but not entered, and only ^C and ^\ seem to work, mercifully returning the bash dollar prompt and a little sanity.
From my pig-0.7.0 directory, typing bin/pig -help gives a proper readout:
Apache Pig version 0.7.0 (r941408)
compiled May 05 2010, 11:15:55
USAGE: Pig [options] [-] : Run interactively in grunt shell.
       Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
       Pig [options] [-f[ile]] file : Run cmds found in file.
options include: ... etc.
From my pig-0.7.0 directory, typing bin/pig -x local results in the following response:
13/04/18 10:37:51 INFO pig.Main: Logging error messages to: C:\cygwin\home\Richard\pig_installation\pig-0.7.0\pig_1366277871311.log
2013-04-18 10:37:51,540 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
From any directory, since I have added my pig-0.11.1/bin directory to PATH, typing pig -x local results in the following response:
which: no hadoop in (usr/local/bin:/cygdrive/c/Program Files ... etc.)
2013-04-18 10:48:59,946 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1 (r1459641) compiled Mar 22 2013, 02:13:53
2013-04-18 10:48:59,946 [main] INFO org.apache.pig.Main - Logging error messages to: C:\cygwin\home\Richard\pig_installation\pig-0.7.0\pig_1366278539943.log
2013-04-18 10:48:59,965 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file C:\Users\Richard/.pigbootup not found
2013-04-18 10:49:01,404 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
Is this a fatal error, or am I just missing a trick? The pig shell script in pig-0.11.1 seems to imply that if hadoop is not found, pig.jar or pig-?.!(*withouthadoop).jar (e.g. pig-0.11.1.jar) will do instead, and the documentation tells me that Pig on Windows with Cygwin is supported (for -x local, though not -x mapreduce). Is this pseudo-grunt> prompt a complete mirage, or does it indicate partial success?
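For reference, my reading of the script's fallback is roughly this (a paraphrase, not a verbatim quote; the variable names are approximate):

HADOOP_BIN=$(which hadoop 2>/dev/null)
if [ -z "$HADOOP_BIN" ]; then
    # no hadoop on the PATH: run the bundled jar directly with java
    exec java $JAVA_HEAP_MAX $PIG_OPTS -classpath "$CLASSPATH" org.apache.pig.Main "$@"
else
    # hadoop found: let it supply the classpath and run the withouthadoop jar
    exec "$HADOOP_BIN" jar "$PIG_HOME"/pig-*withouthadoop.jar "$@"
fi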
Postscript to the above: I have followed the Pig Tutorial section of Apache's Getting Started documentation, set the environment variables, edited the pig-0.7.0/tutorial/build.xml file as per the instructions, run the ant command, created the pigtutorial.tar.gz file, moved it, unzipped it, found Pig script 1 and run pig -x local script1-local.pig, and IT WORKS! The output file, part-r-00000, contains no warnings at all, just five columns of records, as expected. A new attempt to get interactive mode with pig -x local, however, results in the same pseudo-grunt> prompt.

Related

Opensips-cli -x command not working in opensips 3.3

I have recently been upgrading my OpenSIPS installation manually from version 2.2 to 3.3.
The upgrade itself is done, but in the old OpenSIPS (2.2) I could list registered SIP users with the opensipsctl ul show command; in version 3.3 opensipsctl appears to be deprecated (I guess, not sure).
So I am trying to get the same details using opensips-cli, but I cannot find the correct commands for showing registrations and dumping the list. I tried to follow the link below but did not find the right command.
https://www.opensips.org/Documentation/Interface-CoreMI-3-0
Also, my opensips-cli -x command is not working, giving the error below (the mi_fifo module is loaded correctly):
# opensips-cli -o output_type=yaml -x mi uptime
ERROR: cannot access fifo file /tmp/opensips_fifo: [Errno 13] Permission denied: '/tmp/opensips_fifo'
ERROR: starting with Linux kernel 4.19, processes can no longer read from FIFO files
ERROR: that are saved in directories with sticky bits (such as /tmp)
ERROR: and are not owned by the same user the process runs with.
ERROR: To fix this, either store the file in a non-sticky bit directory (such as /var/run/opensips),
ERROR: or disable fifo file protection using 'sysctl fs.protected_fifos=0' (NOT RECOMMENDED)
The /tmp/opensips_fifo file itself is created correctly:
# ls -l /tmp/opensips_fifo
prw-rw-rw- 1 opensips opensips 0 Dec 29 06:52 /tmp/opensips_fifo
Using the opensips-cli command I am able to create the database and add tables, but not to perform any -x command.
Can anyone help me find the commands for showing registrations and dumping the list, and offer any suggestions about the -x command not working in opensips-cli?
I had a similar error and I found the following:
If you state in the opensips-cli.cfg file that fifo_file is located at /tmp/opensips_fifo, it will produce this error. Try changing this setting to /var/run/opensips/opensips_fifo.
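Something along these lines (a sketch only; the directory creation/ownership commands and the [default] section name are what I would expect, so check them against your own setup, and restart OpenSIPS afterwards):

# as root: create a non-sticky directory for the fifo, owned by the opensips user
mkdir -p /var/run/opensips
chown opensips:opensips /var/run/opensips

# opensips.cfg: point the mi_fifo module at the new location
modparam("mi_fifo", "fifo_name", "/var/run/opensips/opensips_fifo")

# opensips-cli.cfg: make the client look in the same place
[default]
fifo_file: /var/run/opensips/opensips_fifo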

OpenMPI: ORTE was unable to reliably start one or more daemons

I've been at it for days but could not solve my problem.
I am running:
mpiexec -hostfile ~/machines -nolocal -pernode mkdir -p $dstpath
where $dstpath points to the current directory and "machines" is a file containing:
node01
node02
node03
node04
This is the error output:
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06177] [[6421,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
[node01:06177] 1 more process has sent help message help-errmgr-base.txt / failed-daemon-launch
[node01:06177] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06181] [[6417,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891
I have 4 machines, node01 to node04. In order to log into these 4 nodes, I have to first log in to node00. I am trying to run some distributed graph functions. The graph software is installed in node01 and is supposed to be synchronised to the other nodes using mpiexec.
What I've done:
Made sure passwordless login is set up; every machine can ssh to any other machine with no issues.
Have a hostfile in the home directory.
echo $PATH gives /home/myhome/bin:/home/myhome/.local/bin:/usr/include/openmpi:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
echo $LD_LIBRARY_PATH gives
/usr/lib/openmpi/lib
This previously worked, but it suddenly started giving these errors. I got my administrator to install fresh machines, but it still gives the same errors. I've tried doing it one node at a time, with the same result. I'm not very familiar with the command line, so please give me some suggestions. I've tried reinstalling OpenMPI both from source and with sudo apt-get install openmpi-bin. I'm on Ubuntu 16.04 LTS.
You should focus on fixing:
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06177] [[6421,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891
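As a quick sanity check that each node's non-interactive environment finds the same Open MPI (and hence the same hwloc), a loop along these lines can help (a sketch; the node names come from your hostfile):

for n in node01 node02 node03 node04; do
    echo "== $n =="
    ssh "$n" 'which mpirun orted; ompi_info | head -n 3'   # paths and versions must match on every node
done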

R Command not recognized when submitted with SSH

I am submitting a shell script on a remote host that in turn submits an R script, but I get the error R: command not found or Rscript: command not found (depending on whether I try R CMD BATCH or Rscript).
I have tried submitting in the following ways:
ssh <remote-host> exec $HOME/test_script.sh
ssh <remote-host> `sh $HOME/test_script.sh`
The script test_script.sh contains (have tried Rscript as well):
#!/bin/sh
Rscript --no-save --no-restore $HOME/greetme.R
exit 0
The script greetme.R contains only cat("Hello\n").
The reason I am getting flustered is that when I log into the remote-host and submit the original script with sh $HOME/test_script.sh, it runs as intended.
The system specs and R versions for both the local and remote hosts are identical:
> R.version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          1.0
year           2014
month          04
day            10
svn rev        65387
language       R
version.string R version 3.1.0 (2014-04-10)
nickname       Spring Dance
Why is Linux refusing to recognize the commands?
I would prefer solutions using R CMD BATCH or Rscript but if there are known workarounds using littler or %R_TERM% I would like to hear them too.
I used this related question as reference, as well as the documents referenced in the comments: R.exe, Rcmd.exe, Rscript.exe and Rterm.exe: what's the difference?
EDIT for solution:
As @merlin2011 suggested, once I specified the full path in test_script.sh, everything worked as intended:
#!/bin/sh
/opt/R/bin/Rscript --no-save --no-restore $HOME/greetme.R
exit 0
I found the path using the suggested command:
$ which Rscript
/opt/R/bin/Rscript
It appears that you have a PATH issue, where R is not on your PATH when you try to run the command through ssh.
If you specify the full path to R and Rscript on the remote host, it should resolve the problem.
If you are not sure what the full path is, try logging into the server and running which R to get the path.
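Alternatively (a sketch, assuming bash on the remote side), force a login shell so the remote session sources its profile and picks up the same PATH you get interactively, or export the path inside the script itself:

ssh <remote-host> 'bash -l -c "$HOME/test_script.sh"'

# or, at the top of test_script.sh (/opt/R/bin being the path that "which Rscript" reported above):
PATH=$PATH:/opt/R/bin; export PATH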

Error on neo4j server start on arch linux

I have an Arch Linux setup and installed Neo4j through the Arch User Repository (yaourt -S neo4j). I can run the web console fine (sudo neo4j console, with seemingly normal output and full functionality), but when trying to start the server (sudo neo4j start) I encounter the following error message:
/usr/share/neo4j/bin/utils: line 345: [: -lt: unary operator expected
Using additional JVM arguments: -server -XX:+DisableExplicitGC -Dorg.neo4j.server.properties=/etc/neo4j/neo4j-server.properties -Djava.util.logging.config.file=/etc/neo4j/logging.properties -Dlog4j.configuration=file:/etc/neo4j/log4j.properties -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
Starting Neo4j Server...cat: /run/neo4j/neo4j-service.pid: No such file or directory
process []... waiting for server to be ready. Failed to start within 120 seconds.
Neo4j Server may have failed to start, please check the logs.
rm: cannot remove ‘/run/neo4j/neo4j-service.pid’: No such file or directory
There's no delay before the error message is printed, so it seems to be something other than the timeout. I'm quite new to Neo4j (I worked through a fair bit of the user manual using the web console, but have no development or server-config experience), so I'm not really sure what else might be relevant. Looking through the utils script, the error appears to be where it attempts to su neo4j, yet it also seems to proceed to attempt to start the server. I also tried changing the port it starts on, as in this question, but no change. The only log I can find just has this over and over (with appropriate timestamps):
Oct 15, 2014 1:33:49 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Any help at all would be appreciated!
EDIT:
The line 345 that it's failing on is the end of this snippet:
if [ $UID == 0 ] ; then
OPEN_FILES=`su $NEO4J_USER -c "ulimit -n"`
else
OPEN_FILES=`ulimit -n`
fi
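# if the su above failed, OPEN_FILES is left empty and the next test expands to "[ -lt 40000 ]",
# which is exactly the "unary operator expected" error reported for line 345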
if [ $OPEN_FILES -lt 40000 ]; then
From doing some echo debugging, it seems that su $NEO4J_USER is failing, probably because $NEO4J_USER is set to neo4j, a user that does not exist on my system. I tried setting it to root in one of the config files, but evidently that isn't working properly. Arch is a continual learning experience for me, but I've never before had to add a new user to get software working.
The interesting line here is:
/usr/share/neo4j/bin/utils: line 345: [: -lt: unary operator expected
I assume that is caused by a wrong default shell for the neo4j user. What default is currently set for the neo4j system user? Try to switch that to bash. The startup scripts should work nicely with bash.
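A quick way to check, and to fix it if needed (a sketch, run as root; the /var/lib/neo4j home directory is an assumption, so adjust it to whatever the Arch package uses):

getent passwd neo4j                               # does the user exist, and which shell does it have?
chsh -s /bin/bash neo4j                           # if its shell is nologin/false, switch it to bash
useradd -r -s /bin/bash -d /var/lib/neo4j neo4j   # only if the user is missing entirely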

Runtime.exec() in Hadoop on Azure environment

This question is related to Hadoop on Azure environment.
I am trying to use Runtime.exec() to execute a batch script in the reduce function. I cannot get this running in the Hadoop on Azure environment, while it runs fine in Hadoop on Linux. I also tested the Runtime.exec() code snippet in my desktop (Windows 7) environment and it runs fine there. I have made sure that I consume the output and error streams of the sub-process after Runtime.exec().
The batch script contains the following (a single command):
c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\attempt_201207121317_0024_r_000001_0\work\tool.exe
-f c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\work\11_task_201207121317_0024_r_000001.out
-i c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\attempt_201207121317_0024_r_000001_0\work\input.txt
I distribute the tool.exe and input.txt files using the distributed cache, and it creates symlinks in the working directory; tool.exe and input.txt point to the actual files in the jobcache directory.
2012-07-16 04:31:51,613 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /hdfs/mapred/local/taskTracker/distcache/-978619214658189372_-1497645545_209290723/10.73.50.78tool.exe <- \hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\attempt_201207121317_0024_r_000001_0\work\tool.exe
2012-07-16 04:31:51,644 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /hdfs/mapred/local/taskTracker/distcache/-4944695173898834237_1545037473_2085004342/10.73.50.78input.txt <- \hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\attempt_201207121317_0024_r_000001_0\work\input.txt
The reducer gives the following error when it runs:
Command Execution Error: Cannot run program
"cmd /q /c c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\work\11_task_201207121317_0024_r_0000011513543720767963399.bat":
CreateProcess error=2, The system cannot find the file specified
In another case, I tried running the same thing without using absolute paths. The output stream from the sub-process is shown below:
c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0022\attempt_201207121317_0022_r_000000_0\work>tool.exe -f /hdfs/mapred/local/taskTracker/nabeel/jobcache/job_201207121317_0022/work/1_task_201207121317_0022_r_000000.out
-i input.txt
I do not know how the job working-directory paths and the distributed cache work in the Hadoop on Azure environment. Could you please let me know if I am missing something here, or whether there is something I need to take care of when using Runtime.exec() in the Hadoop on Azure environment?
Thanks,
I am not familiar with Hadoop, but the error message seems clear. It would be better if you can check whether the file actually exists:
c:\hdfs\mapred\local\taskTracker\nabeel\jobcache\job_201207121317_0024\work\11_task_201207121317_0024_r_0000011513543720767963399.bat
Best Regards,
Ming Xu
