Trying to move a csv file from local file system to hadoop file system - linux

I am trying to copy a CSV file from my local file system to Hadoop, but I cannot get it to work, and I am not sure which permissions I need to change. As I understand it, the hdfs superuser does not have access to /home/naya/dataFiles/BlackFriday.csv.
hdfs dfs -put /home/naya/dataFiles/BlackFriday.csv /tmp
# Error: put: Permission denied: user=naya, access=WRITE, inode="/tmp":hdfs:supergroup:drwxr-xr-x
sudo -u hdfs hdfs dfs -put /home/naya/dataFiles/BlackFriday.csv /tmp
# Error: put: `/home/naya/dataFiles/BlackFriday.csv': No such file or directory
Any help is highly appreciated. I want to do this with the command-line utility. I can do it through Cloudera Manager on the Hadoop side, but I want to understand what is happening behind the commands.
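A possible reading of the two errors, hedged because it depends on how this machine is set up: the first failure is on the HDFS side (/tmp in HDFS is owned by hdfs with mode drwxr-xr-x, so naya cannot write there), while the second is probably on the local side. With sudo -u hdfs the client runs as the local hdfs account, and if that account cannot traverse /home/naya, the local file looks as if it does not exist. The Python sketch below only illustrates that local check. The function name blocking_components is made up for this example, and it assumes the other account is neither the owner of any path component nor a member of its group.

```python
import os
import stat

def blocking_components(path):
    """List path components that an unrelated local user could not pass.

    Assumption: the other user is neither the owner nor in the group,
    so only the "others" permission bits are checked.
    """
    path = os.path.abspath(path)
    parts = [p for p in path.split(os.sep) if p]
    blockers = []
    current = os.sep
    for index, part in enumerate(parts):
        current = os.path.join(current, part)
        mode = os.stat(current).st_mode
        is_last = index == len(parts) - 1
        if stat.S_ISDIR(mode) and not is_last:
            # Directories on the way need the execute (search) bit.
            if not mode & stat.S_IXOTH:
                blockers.append((current, "no execute (search) bit for others"))
        elif is_last and not mode & stat.S_IROTH:
            # The final component needs the read bit to be copied.
            blockers.append((current, "no read bit for others"))
    return blockers
```

Run as naya against /home/naya/dataFiles/BlackFriday.csv, this would list any directory (for example a home directory with mode 700) that the hdfs account cannot pass through. If that turns out to be the cause, one common workaround is to copy the file to a location both accounts can read before running the put as hdfs.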

Related

Error when trying to save a Spark DataFrame to an HDFS file

I'm using Ubuntu.
When I try to save a DataFrame to HDFS (Spark, Scala):
processed.write.format("json").save("hdfs://localhost:54310/mydata/enedis/POC/processed.json")
I get this error:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/mydata/enedis/POC":hadoop_amine:supergroup:drwxr-xr-x
You are trying to write data as the root user, but the HDFS directory /mydata/enedis/POC is owned by hadoop_amine, and its mode (drwxr-xr-x) gives write access only to the owner.
Change the permissions on the HDFS directory so that the root user can write to /mydata/enedis/POC.
# log in as the hadoop_amine user, then run the command below (note that 777 makes the directory writable by every user)
hdfs dfs -chmod -R 777 /mydata/enedis/POC
(Or)
Start the Spark shell as the hadoop_amine user; then there is no need to change the permissions of the directory.
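To see why the write is refused, it can help to read the error message field by field. The Python sketch below is a simplified illustration, not how the NameNode actually implements the check: it parses the user, access type, owner, group and mode from a "Permission denied" message and applies the basic owner/group/other rule. The names parse_denial and is_allowed are made up for this example, the caller has to supply the user's groups, and ACLs and the superuser bypass are deliberately ignored.

```python
import re

# Matches the tail of an HDFS "Permission denied" message, for example:
#   user=root, access=WRITE, inode="/mydata/enedis/POC":hadoop_amine:supergroup:drwxr-xr-x
DENIAL = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]*)":(?P<owner>[^:]+):(?P<group>[^:]+):'
    r'(?P<mode>[d-][rwxtT-]{9})'
)

def parse_denial(message):
    """Return the fields of a permission-denied message as a dict, or None."""
    match = DENIAL.search(message)
    return match.groupdict() if match else None

def is_allowed(user, user_groups, owner, group, mode, access):
    """Simplified owner/group/other check (no ACLs, no superuser bypass)."""
    bits = mode[1:]  # drop the leading file-type character
    if user == owner:
        triplet = bits[0:3]
    elif group in user_groups:
        triplet = bits[3:6]
    else:
        triplet = bits[6:9]
    # Treat the sticky-bit letters as plain execute / no execute.
    triplet = triplet.replace("t", "x").replace("T", "-")
    needed = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}
    if access == "ALL":
        flags = "rwx"
    else:
        flags = "".join(needed[part] for part in access.split("_"))
    return all(triplet["rwx".index(flag)] == flag for flag in flags)
```

For the message above, root is neither the owner nor (as assumed here) a member of supergroup, so the "other" bits r-x apply and WRITE is refused, while the same check for hadoop_amine uses the owner bits rwx and passes.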

HDFS + create a symbolic link between an HDFS folder and a local filesystem folder

I searched on Google but could not find an answer.
Is it possible to create a link between an HDFS folder and a local folder?
Example:
We want to create a link between folder_1 in HDFS and the local folder /home/hdfs_mirror.
HDFS folder:
su hdfs
$ hdfs dfs -ls /hdfs_home/folder_1
Linux local folder:
ls /home/hdfs_mirror
I do not think it is possible.
This is because these are two different file systems (HDFS and the local file system).
If you want to keep a local data directory synced to an HDFS directory, you need to use a tool such as Apache Flume.

How to allow the root user to write files into HDFS

I have installed Hadoop on CentOS 7. A daemon service written in Python tries to create a directory in HDFS, but it gets the permission error below.
mkdir: Permission denied: user=root, access=WRITE, inode="/rep_data/store/data/":hadoop:supergroup:drwxr-xr-x
It looks like my service is running under the root account.
So I would like to know how to give the root user permission to create directories and write files.
If you are trying to create a directory directly under the HDFS root (/), you may hit this kind of error. You can create directories in your own HDFS home directory without any issues.
To create a directory under the root, run the command as the HDFS superuser (hdfs in this example; the account may differ on your cluster):
sudo -u hdfs hdfs dfs -mkdir /directory/name
To create a directory in your HDFS home, run the command below:
hdfs dfs -mkdir /user/user_home/directory/name
This is probably happening because you are not the superuser.
A workaround is to enable Access Control Lists (ACLs) in HDFS and grant permissions to your user.
To enable support for ACLs, set dfs.namenode.acls.enabled to true in the NameNode configuration.

Can't copy local file in linux to hadoop

I just installed Hadoop on a Linux VM. I am now following my guide book to copy a local file to Hadoop (the file is saved on the VM desktop). Here is what I did:
hdfs dfs -copyFromLocal filename.csv /user/root
However, I received a message saying
"copyFromLocal: 'filename.csv': no such file or directory"
Can anyone tell me what went wrong and what I should do to fix it?
Thanks!
You need to be in your Desktop folder (the folder that contains the file) so that the relative file name can be found:
cd /root/Desktop
There are two ways to copy a file from the local host into HDFS:
1) copyFromLocal - as you have used
2) put - hadoop fs -put <local file path> <HDFS path>
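The reason the working directory matters is that a bare name such as filename.csv is resolved against the directory the command is run from. A minimal Python sketch of that lookup follows; the helper name local_source_status is hypothetical, and it only inspects the local file system without talking to Hadoop.

```python
import os

def local_source_status(path, cwd=None):
    """Return (absolute_path, exists) for a local source file.

    A relative path is resolved against cwd, or against the current
    working directory when cwd is not given.
    """
    base = cwd if cwd is not None else os.getcwd()
    absolute = os.path.normpath(os.path.join(base, path))
    return absolute, os.path.isfile(absolute)
```

Calling local_source_status("filename.csv") before the upload shows which absolute path would be used. Passing the full path (here presumably /root/Desktop/filename.csv) to copyFromLocal avoids depending on the current directory at all.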

Command to store File on HDFS

Introduction
A Hadoop NameNode and three DataNodes have been installed and are running. The next step is to store a file on HDFS. The following commands have been executed:
hadoop fs -copyFromLocal ubuntu-14.04-desktop-amd64.iso
copyFromLocal: `.': No such file or directory
and
hadoop fs -put ubuntu-14.04-desktop-amd64.iso
put: `.': No such file or directory
without success.
Question
Which command needs to be issued in order to store a file on HDFS?
If no destination path is provided, Hadoop tries to copy the file into your HDFS home directory. In other words, if you are logged in as utrecht, it tries to copy ubuntu-14.04-desktop-amd64.iso to /user/utrecht.
However, this folder does not exist by default (you can normally browse HDFS in a web browser to check).
To make your command work, you have two choices:
1) copy the file elsewhere (/ works, but putting everything there may lead to complications in the future)
2) create the directory you want with hdfs dfs -mkdir /yourFolderPath
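The "`.': No such file or directory" message makes more sense once the missing destination is read as "." relative to the HDFS home directory. The Python sketch below mimics that resolution. It is an illustration only: the function name resolve_hdfs_path is made up, and it assumes home directories live under /user, as in the answer above.

```python
import posixpath

def resolve_hdfs_path(path, user, home_prefix="/user"):
    """Resolve a possibly relative HDFS path against the user's home.

    Assumption: the home directory is <home_prefix>/<user>.
    """
    home = posixpath.join(home_prefix, user)
    if not path:
        # No destination given, as in `hadoop fs -put file`.
        path = "."
    return posixpath.normpath(posixpath.join(home, path))
```

With user set to utrecht, an empty destination or "." resolves to /user/utrecht, a relative name such as data resolves to /user/utrecht/data, and an absolute path such as /tmp is left unchanged. If /user/utrecht has not been created yet, the first two cases fail in the way shown in the question.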

Resources