Does Spark support multiple users? - apache-spark

I have a 3-node Spark 2.3.1 cluster running at the moment, and I'm also running a Zeppelin server as a normal user, ulab.
From Zeppelin, I ran the commands:
%spark
val file = sc.textFile("file:///mnt/glusterfs/test/testfile")
file.saveAsTextFile("/mnt/glusterfs/test/testfile2")
It reports a lot of error messages, something like:
WARN [2018-09-14 05:44:50,540] ({pool-2-thread-8} NotebookServer.java[afterStatusChange]:2302) - Job 20180907-130718_39068508 is finished, status: ERROR, exception: null, result: %text file: org.apache.spark.rdd.RDD[String] = file:///mnt/glusterfs/test/testfile MapPartitionsRDD[49] at textFile at <console>:51
org.apache.spark.SparkException: Job aborted.
...
... 64 elided
Caused by: java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus{path=file:/mnt/glusterfs/test/testfile2/_temporary/0/task_20180914054253_0050_m_000018/part-00018; isDirectory=false; length=33554979; replication=1; blocksize=33554432; modification_time=1536903780000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to file:/mnt/glusterfs/test/testfile2/part-00018
I found that some temporary files are owned by user root, while others are owned by ulab, as in the following:
bash-4.4# ls -l testfile2
total 32773
drwxr-xr-x 3 ulab ulab 4096 Sep 14 05:42 _temporary
-rw-r--r-- 1 ulab ulab 33554979 Sep 14 05:44 part-00018
bash-4.4# ls -l testfile2/_temporary/
total 4
drwxr-xr-x 210 ulab ulab 4096 Sep 14 05:44 0
bash-4.4# ls -l testfile2/_temporary/0
total 832
drwxr-xr-x 2 root root 4096 Sep 14 05:42 task_20180914054253_0050_m_000000
drwxr-xr-x 2 root root 4096 Sep 14 05:42 task_20180914054253_0050_m_000001
drwxr-xr-x 2 root root 4096 Sep 14 05:42 task_20180914054253_0050_m_000002
drwxr-xr-x 2 root root 4096 Sep 14 05:42 task_20180914054253_0050_m_000003
....
Is there any setting that makes all these temporary files be created by ulab, so that we can use multiple users in the Spark driver to isolate privileges?

You can enable the 'User Impersonate' option for the Spark interpreter, which will start the Spark job as the logged-in user.
Refer to this link for more info.
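To confirm whether a change had the intended effect, it can help to list what is still owned by someone else. The following is only a diagnostic sketch, not a fix: the function name list_foreign_owned is hypothetical, and the example path and user are the ones from the question.

```shell
#!/usr/bin/env bash
# List every entry under a directory that is NOT owned by the expected user.
# Usage: list_foreign_owned <directory> <expected-user>
# (list_foreign_owned is a hypothetical helper name used for illustration.)
list_foreign_owned() {
  local dir="$1" user="$2"
  # "! -user NAME" matches files whose owner differs from NAME.
  find "$dir" ! -user "$user" -print
}

# Example, using the path and user from the question:
# list_foreign_owned /mnt/glusterfs/test/testfile2 ulab
```

An empty result means everything under the directory belongs to the expected user; any path printed was created by a process running as a different user.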

Related

Write dataframe to CSV in spark

I am writing a Spark DataFrame to a CSV file using the code below:
println("Total number of reports: " + reportDf.count())
reportDf
.coalesce(1)
.write.format("com.databricks.spark.csv")
.csv("output/cluster.csv")
And the output is:
Total number of reports: 48720
spark@monikatest:~/output/cluster.csv$ ll
total 12
drwxrwxr-x 2 spark spark 4096 Mar 27 20:56 ./
drwxrwxr-x 3 spark spark 4096 Mar 27 20:56 ../
-rw-r--r-- 1 spark spark 0 Mar 27 20:56 _SUCCESS
-rw-r--r-- 1 spark spark 8 Mar 27 20:56 ._SUCCESS.crc
No data is written to the file; only the _SUCCESS file is present.
Can anyone please suggest how to overcome this error?

Installation Of Cassandra

I have downloaded Cassandra via the terminal, but where are the other folders such as data, conf, lib, doc, etc.?
I can see only some files, as shown in the screenshot.
Where are the other folders?
By "download cassandra via terminal" and your screenshot, I'll assume that you installed Cassandra via apt-get.
From the Apache Cassandra project Wiki, section on Installation from Debian packages:
The default location of configuration files is /etc/cassandra.
The default location of log and data directories is /var/log/cassandra/ and /var/lib/cassandra.
As for the lib directory, check how your $CASSANDRA_HOME is being set:
$ grep CASSANDRA_HOME /etc/init.d/cassandra
CASSANDRA_HOME=/usr/share/cassandra
$ ls -al /usr/share/cassandra/
total 8312
drwxr-xr-x 3 root root 4096 Dec 13 07:57 .
drwxr-xr-x 372 root root 12288 Nov 28 08:51 ..
-rw-r--r-- 1 root root 5962385 Jun 1 2016 apache-cassandra-3.6.jar
lrwxrwxrwx 1 root root 24 Jun 1 2016 apache-cassandra.jar -> apache-cassandra-3.6.jar
-rw-r--r-- 1 root root 1902216 Jun 1 2016 apache-cassandra-thrift-3.6.jar
-rw-r--r-- 1 root root 875 May 31 2016 cassandra.in.sh
drwxr-xr-x 3 root root 12288 Dec 13 07:57 lib
-rw-r----- 1 root root 82123 Oct 20 2015 metrics-core-2.2.0.jar
-rw-r----- 1 root root 9639 Oct 20 2015 metrics-graphite-2.2.0.jar
-rw-r--r-- 1 root root 509144 Jun 1 2016 stress.jar
Note that Cassandra's lib directory is shown in the middle of the directory listing above.

Touch command. permission denied

I was able to connect to my school server via SSH. I had an assignment in which I was supposed to use the touch command to create a new file, yet it keeps returning "Permission denied". Others were able to do the same thing, so why do I keep getting this error?
Below is the terminal session.
Last login: Tue Aug 23 09:16:18 on ttys000
Dominiks-Air:~ fsociety95$ ssh djaneka1@navajo.dtcc.edu
djaneka1@navajo.dtcc.edu's password:
Last login: Tue Aug 23 09:16:35 2016 from pool-72-94-210-193.phlapa.fios.verizon.net
Navajo is Linux shell server provided to staff, faculty, and students. The
operating system is RedHat Enterprise Linux 5.
Alpine, a Pine replacement, has been provided as a mail client. Run "pine"
at the command prompt.
This server also provides web space to users. Web pages can be stored in
the ~/www directory. This is also accessible by mapping a drive in Windows
to \navajo\homepage. The URL for your homepage is
http://user.dtcc.edu/~username/.
Your home directory is also accessible in Windows by mapping to
\navajo\.
If something appears broken or missing, please email path@dtcc.edu.
Could not chdir to home directory /u/d/j/djaneka1: No such file or directory
-bash-3.2$ touch today
touch: cannot touch `today': Permission denied
-bash-3.2$ pwd
/
-bash-3.2$ touch today
touch: cannot touch `today': Permission denied
-bash-3.2$
Edit: here is the result of ls -al
-bash-3.2$ ls -al
total 204
drwxr-xr-x 25 root root 4096 Aug 22 16:50 .
drwxr-xr-x 25 root root 4096 Aug 22 16:50 ..
-rw-r--r-- 1 root root 0 Aug 3 14:01 .autofsck
-rw-r--r-- 1 root root 0 Jan 30 2009 .autorelabel
-rw------- 1 root root 2050 Aug 3 14:00 .bash_history
drwxr-xr-x 2 root root 4096 May 4 04:14 bin
drwxr-xr-x 4 root root 3072 Aug 3 13:57 boot
drwxr-xr-x 11 root root 4060 Aug 3 14:02 dev
drwxr-xr-x 87 root root 12288 Aug 23 10:05 etc
drwxr-xr-x 3 root root 4096 Oct 1 2009 home
drwxr-xr-x 13 root root 12288 Jun 1 04:09 lib
drwx------ 2 root root 16384 Mar 24 2008 lost+found
drwxr-xr-x 3 root root 4096 Oct 1 2009 media
drwxr-xr-x 2 root root 0 Aug 3 14:02 misc
drwxr-xr-x 4 root root 4096 May 26 2012 mnt
drwxr-xr-x 2 root root 0 Aug 3 14:02 net
drwxr-xr-x 9 root root 4096 Jan 5 2009 nsr
drwxrwxr-x 3 root root 4096 Oct 12 2015 opt
dr-xr-xr-x 219 root root 0 Aug 3 14:01 proc
drwxr-x--- 12 root root 4096 Apr 22 10:06 root
drwxr-xr-x 2 root root 12288 Aug 4 04:02 sbin
drwxr-xr-x 2 root root 4096 Oct 1 2009 selinux
drwxr-xr-x 2 root root 4096 Oct 1 2009 srv
drwxr-xr-x 11 root root 0 Aug 3 14:01 sys
drwxrwxrwt 38 root root 4096 Aug 23 10:07 tmp
drwxr-xr-x 34 root root 4096 Jun 21 08:29 u
drwxr-xr-x 14 root root 4096 Apr 16 2010 usr
drwxr-xr-x 24 root root 4096 Apr 16 2010 var
-rw------- 1 root root 2865 Dec 16 2008 .viminfo
-bash-3.2$
EDIT:
Here is what I see after trying touch today in /home
To create a new file in the root directory you need to be recognised as root. That means using the sudo command.
However, for that you would need a password (and sudo rights) that you may not have. If you do, perfect. In any case, I would not recommend adding files to the root directory.
Instead, try the following:
cd home
touch today
This should work, provided you have write permission in that directory.
Still, if you need or want to create today in the root directory, try the following:
sudo touch today
You will then be prompted for a password, which you can type in (if you have it, obviously).
In any case, I suggest reading this, which may be very helpful for you.
I wonder if this was ever truly answered.
If I were looking at it, I would try to see what the system thinks is the home directory of djaneka1, since the account may have been set up only partway, leaving things owned by root that should have been owned by djaneka1.
If you use the pwd command and get back the "/" (root) directory, there is something wrong with your setup.
The message "Could not chdir to home directory /u/d/j/djaneka1: No such file or directory"
tells you it can't find your home directory.
-bash-3.2$ pwd
/
The pwd command showing "/" is just an artifact of the system not being able to find your home directory.
To find what the system thinks is your home directory,
you can search the file /etc/passwd for your login name.
I expect something like this if you do that:
$ fgrep 'djaneka1' /etc/passwd
djaneka1:x:1505:1506::/u/d/j/djaneka1:/bin/bash
since it complained that it couldn't find that directory.
This needs to be fixed by someone who has more rights on the system, such as root.
there is nothing djaneka1 can do a
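The lookup described above can be sketched as a small script. This is an illustration under assumptions, not something taken from the server in question: the function names home_from_passwd_line and check_home are hypothetical, and check_home assumes a system where getent is available (it is part of glibc on Linux).

```shell
#!/usr/bin/env bash
# Print the home-directory field (the 6th colon-separated field)
# of a single passwd entry passed as the first argument.
home_from_passwd_line() {
  printf '%s\n' "$1" | cut -d: -f6
}

# Report whether the home directory recorded for a user actually exists.
# Assumes getent is installed; both function names are hypothetical.
check_home() {
  local entry home
  entry="$(getent passwd "$1")" || { echo "no passwd entry for $1"; return 1; }
  home="$(home_from_passwd_line "$entry")"
  if [ -d "$home" ]; then
    echo "$home exists"
  else
    echo "$home is missing"
  fi
}

# Example: check_home djaneka1
```

If the recorded directory is reported as missing, that matches the "Could not chdir to home directory" message at login, and the directory has to be created (or mounted) by an administrator.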

error while loading shared libraries: libQt5XmlPatterns.so.5:

I have installed a program for my research. When I try to run it, I get:
./procmt_main: error while loading shared libraries: libQt5XmlPatterns.so.5: cannot open shared object file: No such file or directory
The shared libraries appear to be here, but:
root@milenko-HP-Compaq-6830s:/usr/lib/x86_64-linux-gnu# ls -lia libQt5XmlPatterns.so.5
ls: cannot access 'libQt5XmlPatterns.so.5': No such file or directory
I am trying to remove qt4, but:
sudo apt-get remove qt4
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package qt4
root@milenko-HP-Compaq-6830s:/usr/lib/x86_64-linux-gnu# ls -lia qt4
total 88
3016807 drwxr-xr-x 4 root root 4096 Abr 20 19:10 .
2756528 drwxr-xr-x 109 root root 77824 Jun 24 11:24 ..
3147728 drwxr-xr-x 2 root root 4096 Abr 20 19:10 bin
3147729 drwxr-xr-x 15 root root 4096 Abr 20 19:12 plugins
root@milenko-HP-Compaq-6830s:/usr/lib/x86_64-linux-gnu# ls -lia qt5
total 100
3016808 drwxr-xr-x 7 root root 4096 Jun 24 11:24 .
2756528 drwxr-xr-x 109 root root 77824 Jun 24 11:24 ..
3147768 drwxr-xr-x 2 root root 4096 Jun 24 11:24 bin
3147769 drwxr-xr-x 2 root root 4096 Abr 20 19:10 libexec
3020631 drwxr-xr-x 97 root root 4096 Jun 24 11:24 mkspecs
3147770 drwxr-xr-x 16 root root 4096 Abr 20 19:13 plugins
3147771 drwxr-xr-x 11 root root 4096 Abr 20 19:12 qml
I do not understand these linking issues. Why does this file not exist?

conf/hadoop-env.sh file opening in readonly mode

I am trying to configure a single-node Hadoop cluster. I have created a user hadoop and installed Hadoop in my /usr/local/hadoop directory.
Then I ran the following commands:
chown hadoop:hadoop hadoop hadoop-1.0.4
ln -s hadoop-1.0.4/ hadoop
As a result, when I run ls -l
it shows the following:
drwxr-xr-x 2 root root 4096 Jun 16 2012 bin
drwxr-xr-x 2 root root 4096 Jun 16 2012 etc
drwxr-xr-x 2 root root 4096 Jun 16 2012 games
lrwxrwxrwx 1 root root 13 Apr 16 13:20 hadoop -> hadoop-1.0.4/
drwxr-xr-x 13 hadoop hadoop 4096 Oct 3 2012 hadoop-1.0.4
drwxr-xr-x 2 root root 4096 Jun 16 2012 include
drwxr-xr-x 3 root root 4096 Jun 16 2012 lib
lrwxrwxrwx 1 root root 9 Aug 22 2012 man -> share/man
drwxr-xr-x 2 root root 4096 Jun 16 2012 sbin
drwxr-xr-x 6 root root 4096 Jun 16 2012 share
drwxr-xr-x 2 root root 4096 Jun 16 2012 src
So hadoop-1.0.4 has hadoop as its user and group.
Now I switch to the hadoop user using
su - hadoop
so that I can change my conf/hadoop-env.sh file, but it does not work:
hadoop@iu1:/usr/local$ vi conf/hadoop-env.sh
The file opens in read-only mode.
I think it should open in editable mode.
Please help.
Thanks
You need to chown recursively:
chown -R hadoop:hadoop hadoop hadoop-1.0.4
To verify file permissions, run:
ls -l /usr/local/hadoop/conf/
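Besides reading the ls -l output, the current user's write access can be tested directly. A minimal sketch, assuming it is run as the hadoop user; the function name report_writable is hypothetical, and the example path is the one from the answer above.

```shell
#!/usr/bin/env bash
# Report, for each given path, whether the current user may write to it.
# (report_writable is a hypothetical helper name used for illustration.)
report_writable() {
  local path
  for path in "$@"; do
    if [ -w "$path" ]; then
      echo "writable: $path"
    else
      # [ -w ] is also false when the path does not exist.
      echo "not writable (or missing): $path"
    fi
  done
}

# Example, run as the hadoop user:
# report_writable /usr/local/hadoop/conf/hadoop-env.sh
```

If the file is still reported as not writable after the recursive chown, the path itself is worth checking, since a relative path such as conf/hadoop-env.sh resolves against the current directory.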
