Uploading files into Hadoop - Linux

I've recently downloaded Oracle VirtualBox and I want to take some data and import it into HDFS. I want to state that I am a complete novice when it comes to these things. I've tried copying the instructions from a Udacity course, which do NOT work.
I apologize if the terminology I'm using is not accurate.
So in my VM space I have the following items:
Computer
Training's Home (Provided by Udacity)
Eclipse
Trash
Inside Training's Home, on the left-hand side under Places, I have:
training
Desktop
File System
Network
Trash
Documents
Pictures
Downloads
On the right-hand side, when I select training, there are many folders, one of which is udacity_training. When I select this there are two folders,
code and data. When I select data there are two further files, access_log.gz and purchases.txt, which contain the data I want to load into HDFS.
Copying the command from the Udacity tutorial, I typed
[training@localhost ~]$ ls access_log.gz purchases.txt
This gave the error messages
ls: cannot access access_log.gz: No such file or directory
ls: cannot access purchases.txt: No such file or directory
I then tried the next line just to see what happens which was
[training@localhost ~]$ hadoop fs -ls
[training@localhost ~]$ hadoop fs -put purchases.txt
again an error saying
put: 'purchases.txt': No such file or directory
What am I doing wrong? I don't really understand command-line prompts; I think they're in Linux, so what I'm typing looks quite alien to me. I want to be able to understand what I'm typing. Could someone help me access the data, and perhaps also provide some info on where I can learn what I'm actually typing into the command line? Any help is greatly appreciated.

Please start learning the basics of Linux & Hadoop commands.
To answer your question, try the options below.
Use the command cd /dir_name to go to the required directory, and then use
hadoop fs -put /file_name /hdfs/path
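For example, based on the folder layout described in the question (the exact path is an assumption; adjust it to whatever pwd and ls show you), the sequence would look something like this:
cd ~/udacity_training/data       # move into the directory that holds the files
ls access_log.gz purchases.txt   # confirm the files are here before uploading
hadoop fs -put purchases.txt     # copy purchases.txt into your HDFS home directory, as in the course
hadoop fs -ls                    # list HDFS to verify the upload
The original ls failed because the shell was sitting in the home directory (~) while the files live several directories deeper; cd moves the shell there first.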

Related

Can someone help me figure out why terminal (MacOS) will no longer find python files I have saved?

So I'm new to learning Python. I have Python 3.8.2 installed on my MacBook Pro. It was working fine for a week, and I was creating code in Atom, saving .py files to a folder on my desktop. Terminal was locating and running those files easily. However, now when I go into Terminal it does not see that any of my files are in the folder, just a hello.py which prints "Hello World", but I do not have any such file in that folder.
For more background, I believe the source of the error is an earlier command of mine. I was unsure of how to quickly navigate Terminal, and as I was in a folder within a folder and wanted to leave the second folder, I thought it was similar to those old command games where you tell them to do something. So in my dumb head I typed "MacBook-Pro:FolderName user$ exit foldername" and clicked return. After this I got a message (which I didn't save, so I don't remember it exactly, but it was a few more lines and I believe it said it exited). I could no longer type in Terminal. I closed the shell and opened a new one, and that's what led to my current issue and why I am here seeking help.
I have included a photo of my atom code, what terminal says now and my folder of files terminal can't find.
Edit: I cannot include pictures yet due to my reputation being low, so it has been changed to a link: photo of issue here
EDIT: To add to this, I created a new save folder, moved one of the old files over to the new folder, and it ran normally. This leads me to believe that I somehow used Terminal to ignore or forget that initial save folder and all its contents. Is that a thing that can happen?
Based on your picture, it looks like the misconception is about how directories work in MacOS. In your terminal, you type cd ~/py4e; in Unix systems, ~ is your home directory, so you’re navigating to the py4e subdirectory under your home directory.
Then, however, you type cd ~/ex_04 (or something), which means you’re trying to navigate to the ex_04 subdirectory under your home directory. This isn’t what you want; you want to navigate to ex_04 under the py4e directory.
In the Terminal, when you’re working in a current folder, you can change to another folder within that current directory by just typing the name of that subdirectory, i.e. cd ex_04 once you have run cd ~/py4e.
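Putting that together (py4e and ex_04 come from your screenshot; hello.py is just an illustrative file name):
cd ~/py4e          # go to py4e under your home directory
cd ex_04           # relative path: ex_04 inside the current directory (py4e)
pwd                # print the full path of where you are now
ls                 # list the files Terminal can actually see here
python3 hello.py   # run a script that lives in this directory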
If you’re just starting out with the command line in MacOS, I would definitely recommend looking up some beginner tutorials online so you can get a better feel of navigating and working in the Terminal.

Windows Linux Subsystem - File permissions when edited outside bash [duplicate]

As the title suggests, if I paste a C file written somewhere else into the root directory of the Linux Subsystem, I can't compile it.
I did a test where I made two differently titled hello world programs: one in vi that I can get into from the bash interface, and one elsewhere. When I compiled the one made in vi, it worked fine. Trying to do so for the one made elsewhere (after pasting it into the root directory), however, resulted in this:
gcc: error: helloWorld2.c: Input/output error
gcc: fatal error: no input files
compilation terminated
Any help with this would be much appreciated.
Do not change Linux files using Windows apps and tools!
Assuming what you meant by "paste a C file written somewhere else into the root directory of the Linux subsystem" is that you pasted your file into %localappdata%\lxss, this is explicitly unsupported. Files natively created via Linux syscalls in this area have UNIX metadata, which files natively created with Windows tools don't have.
Use /mnt/c (and the like) to access your Windows files from Linux; don't try to modify Linux files from Windows.
Quoting from the Microsoft blog linked at the top of this answer (emphasis from the original):
Therefore, be sure to follow these two rules in order to avoid losing files, and/or corrupting your data:
DO store files in your Windows filesystem that you want to create/modify using Windows tools AND Linux tools
DO NOT create / modify Linux files from Windows apps, tools, scripts or consoles
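In practice that means working on the file through the /mnt/c mount. A minimal sketch, assuming the file sits on the Windows desktop of a user named YourName:
cd /mnt/c/Users/YourName/Desktop    # your Windows files, as seen from inside WSL
gcc helloWorld2.c -o helloWorld2    # compile in place; both Windows and Linux can see this file
./helloWorld2                       # run the result
If you would rather have the file in your Linux home directory, copy it across from the Linux side (cp /mnt/c/Users/YourName/Desktop/helloWorld2.c ~/) instead of pasting it into %localappdata%\lxss from Explorer.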
You cannot copy files into the root directory (by default; who knows how Windows bash is set up!). Your gcc error says "no input files", so the copy has most likely failed. Copy the files to your home directory instead, for instance:
cp helloWorld2.c ~/
instead of:
cp helloWorld2.c /

Spring Tool Suite 3.8.2 - Installation on Ubuntu

I managed to install STS 3.8.2 on Ubuntu 16.04 - with a lot of hacking experiments. I have it working, but I am not happy with my solution.
Here is what I had to do:
Extracted the tar file into /opt/sts-bundle.
If you put it anywhere else, like /opt/sts, the TC server fails to start from STS.
With files in /opt/sts-bundle, TC Server still fails to start from STS - permission errors. To get it to work you need to futz around with the permissions of the pivotal-tc-server subdirectories; essentially you need to open them up to your group (the same one running STS) (security hole?).
A local install in your own ~/sts-bundle fails with "files not found" while attempting to back up all the conf files. It still looks in /opt/sts-bundle for all these config files (just to copy them to /backup). You can change the top directory of the server in the STS server properties - but it still looks in /opt/sts-bundle. It seems hard-coded - I don't know where. So you have to create all the config files in the conf directory of the tree rooted at /opt/sts-bundle ("touch" works - creating empty files). TC Server still fails to start, with a "failed to clean" error - with no clue from the detailed message about which files are being "cleaned".
I tried creating a non-privileged user "tcserver" per the suggestion in the Pivotal TC Server docs. I installed to /opt/sts-bundle while logged in as tcserver (with sudo privileges). That fails when I am using STS as a regular developer who is not "tcserver". I could not figure out how to tell TC Server to run under a different user than the one that started STS.
The solution I have working and I am not happy with, starts by extracting the tar.gz file into /opt/sts-bundle, as it wants. Then changing owner and group of sts-bundle to my id and my group (same ones that are used in STS UI). I am not happy with that. It seems wrong to put things in /opt that are owned by a single developer.
I am new to Linux, and I still have some Windows habits that need to be unlearned.
The question is: how do I get the clean solution (installing using a "tcserver" user in the global /opt directory) to work for developers who are not "tcserver"? How should the tcserver user be related to the developers (same group?).
Am I making this problem harder than it should be? What am I missing?
I'm not sure this is what you want, but I don't install the STS bundles in a shared directory as a special user at all. I just install it in my user.home dir, as myself, and launch it from there.
It is very unsophisticated. I just download the tar.gz file, unpack it in my home dir and then launch it from a trivial bash script which looks something like this:
#!/bin/bash
/home/kdvolder/Applications/sts-bundle/sts-*/STS
That script is on my PATH. So I can just type 'STS' in a terminal and STS will start.
I don't have to do anything else and it works.
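Spelled out, the whole setup is roughly this (the archive and directory names are assumptions matching the description above):
mkdir -p ~/Applications
tar -xzf ~/Downloads/spring-tool-suite-3.8.2.RELEASE-*.tar.gz -C ~/Applications   # unpack as yourself, no sudo
chmod +x ~/bin/STS    # the launcher script above, saved somewhere on your PATH
Because everything is owned by your own user, none of the /opt permission problems apply.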
If you are trying to somehow install this so that several different users can run a shared installation then this isn't a good setup. But I think for your own personal laptop or desktop which only you are using, this simple setup is perfectly fine.
For a shared-user env, unfortunately, I don't know how to help you. It could be complicated to sort out all the permissions issues etc. because Eclipse is a complicated beast w.r.t. installation of plugins etc.

Blank SSHFS mount folder

I am attempting to mount a remote directory located on my web server to a directory in my Xubuntu installation hosted in VirtualBox.
I'm using the following command syntax:
sshfs root@*.*.*.*:/var/www Desktop/RemoteMount
Using the file manager, I navigate to the Desktop/RemoteMount directory but find it entirely blank. The SSHFS command above executed with no indication of an error.
Completely by chance, I used the terminal to long-list the contents of the Desktop/RemoteMount directory, and it showed all the data I was expecting to see in the file manager.
Can anyone tell me why the file manager does not show my remotely mounted data and how I might fix it?
Thanks.
You are missing a proper local mount point.
sshfs -o idmap=user mika@192.168.1.2:/home/mika/remotepoint /home/mika/localmountpoint
And the local mount point folder needs to exist.
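Adapted to the paths in the question (the IP stays elided, as in the original; idmap=user makes the remote files show up as owned by your local user):
mkdir -p ~/Desktop/RemoteMount           # create the mount point first
sshfs -o idmap=user root@*.*.*.*:/var/www ~/Desktop/RemoteMount
fusermount -u ~/Desktop/RemoteMount      # unmount when you are done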
Thanks, Mika.

PhpStorm write issues in .idea directory

When I try to save a file to disk within a project directory, I get this error:
java.io.IOException: W:\[projectname]\.idea not found
Some research tells me the (network) location is not writable.
I'm trying to write this file from PhpStorm on Windows 8.
The drive (W:) is a network drive to a Linux machine.
The directory I am trying to write to is chowned to the same user and group that I connect with in Windows.
This is the result of ls -alh:
drwxrwxrwx 2 correct-user correct-user
On Linux and other Unix-like operating systems, files starting with a . are considered 'hidden files' by default. As such, when the Windows-based program creates the .idea directory, it suddenly doesn't see it anymore right afterwards, since it's hidden, even though the creation was successful. You can fix this in your Samba config by adding the following line to the share configuration:
hide dot files = no
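In context, the share definition in smb.conf would look something like this (the share name and path here are placeholders, not taken from the question):
[projects]
    path = /srv/projects
    read only = no
    hide dot files = no
Restart Samba afterwards (e.g. sudo systemctl restart smbd) so the change takes effect.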
In my Samba settings I had added a veto files parameter. Removing this parameter allows me to write dot files again.
Samba describes this setting as follows:
This is a list of files and directories that are neither visible nor accessible.
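For illustration, a veto entry that would produce exactly this symptom might look like the following (the pattern is a guess; Samba separates veto patterns with /):
veto files = /.idea/.git/
Deleting that line, or just the patterns covering the dot directories, makes them visible and writable again.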
