wget -o command output generates more than one file, is it possible to get only one? - linux

I am executing the "wget -o " and because the output is bigger than expected, it is split in more than one file. Is there a way to get only one file? If this is possible I would prefer to use only the command wget.
The command wget that I am executing is:
$ wget -o neighborhoods.json https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json
And the multiple output is:
-rw-rw-r-- 1 ubuntu ubuntu 6652 Mar 4 01:15 neighborhoods.json
-rw-rw-r-- 1 ubuntu ubuntu 4137081 Mar 4 01:15 neighborhoods.json.1

Look carefully at the wget output and you will see what it is (or will be) doing. wget does not split files when they are large; instead, it avoids overwriting files that already exist (it creates a new file rather than touching the existing one).
Delete the two neighborhoods.json* files and start wget again; make sure it finishes without problems, and it will create the single file you asked for. If it is interrupted and you restart it, it will create a new file (appending .1 and so on).
You can pass it the -c option to tell it to continue a broken download if it was interrupted; most of the time this works well (though not always).
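Note that lowercase -o tells wget where to write its log messages, while uppercase -O names the downloaded document itself. As a sketch, the first command below fetches the file into a single, explicitly named output, and the second resumes an interrupted download of the default-named file:
$ wget -O neighborhoods.json https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json
$ wget -c https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json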


Script command: separate input and output

I'm trying to monitor command execution on a shell.
I need to separate the command input from its output, for example:
input:
ls -l /
output:
total 76
lrwxrwxrwx 1 root root 7 Aug 11 10:25 bin -> usr/bin
drwxr-xr-x 3 root root 4096 Aug 11 11:18 boot
drwxr-xr-x 17 root root 3200 Oct 11 11:10 dev
...
Also, I want to do the same if I open another shell, for example after connecting through ssh to another server.
I've been using the script command to do this and it works just fine!
It logs all command input and output even if the shell changes (through ssh, or entering a msfconsole, for example).
Nevertheless, I found two main issues:
For my project, I need to separate (using a decoder) each command from the rest; it would also be great to be able to separate command input and output, for example:
cmd1. pwd ---> /var/
cmd2. echo "hello world" ---> "hello world"
....
Sometimes the script command generates output containing garbage due to shell special characters (colour escape sequences, for example), which I would like to filter out.
So I've been thinking about this, and I guess I could write a simple script that reads the file written by the script command and processes the data.
Nevertheless, I'm not sure what the best approach would be.
I'm evaluating different solutions and I would like to hear different proposals from the community.
Maybe I'm missing something and you know a better tool than the script command, or have an idea I haven't considered.
Best regards,
A useful utility for distinguishing stdout from stderr is annotate-output (install the devscripts package), which sends both stdout and stderr to standard output with helpful little prefixes. For example, let's try counting the characters of a file that exists, plus one that doesn't:
annotate-output wc -c /bin/bash /bin/nosuchshell
Output:
00:29:06 I: Started wc -c /bin/bash /bin/nosuchshell
00:29:06 E: wc: /bin/nosuchshell: No such file or directory
00:29:06 O: 1099016 /bin/bash
00:29:06 O: 1099016 total
00:29:06 I: Finished with exitcode 1
That output could be parsed separately using sed, awk, or even a tee and a few greps.
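For instance, to keep only the stdout lines and strip the prefixes, something like this should work (a sketch based on the output format shown above):
annotate-output wc -c /bin/bash /bin/nosuchshell | sed -n 's/^[0-9:]* O: //p'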

Under Linux, is it possible to gcore a process whose executable has been deleted?

Programming on CentOS 6.6, I deleted an executable (whoops, make clean) while it was running in a screen session.
Now, unrelated, I want to gcore the process to debug something. I have rebuilt the executable, but gcore doesn't accept the replaced file. It knows the original file was deleted and won't let me dump core.
# gcore 15659
core.YGsoec:4: Error in sourced command file:
/home/dev/bin/daemon/destinyd (deleted): No such file or directory.
gcore: failed to create core.15659
# ls -l /proc/15659/exe
lrwxrwxrwx. 1 root root 0 Mar 12 21:33 /proc/15659/exe -> /home/dev/bin/daemon/destinyd (deleted)
# ln -s /proc/15659/exe /home/dev/bin/daemon/destinyd
ln: creating symbolic link `/home/dev/bin/daemon/destinyd': File exists
# rm /proc/15659/exe
rm: remove symbolic link `/proc/15659/exe'? y
rm: cannot remove `/proc/15659/exe': Permission denied
FreeBSD's gcore has an optional argument "executable" which looks promising (as if I could specify a binary to use that is not /proc/15659/exe), but that's of no use to me as Linux's gcore does not have any such argument.
Are there any workarounds? Or will I just have to restart the process (using the recreated executable) and wait for the bug I'm tracking to reproduce itself?
Despite the output of ls -l /proc/15659/exe, the original executable is in fact still available through that path.
So not only was I able to restore the original file with a simple cp (though this was not enough to restore the link and get gcore to work), but I was also able to attach GDB to the process using this path as the executable:
# gdb -p 15659 /proc/15659/exe
and then run the "generate-core-file" command, followed by "detach".
Then, I became free to examine the core file as needed:
# gdb /proc/15659/exe core.15659
In truth I had forgotten about the ability of GDB to generate core files, plus I was anxious about actually attaching GDB to the process because timing was quite important: generating the core file at precisely the right time to catch that bug.
But nos steered me back onto this path and, my fears apparently unfounded, GDB was able to produce a lovely core.15659 for me.
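For an unattended capture, the same steps can be scripted with GDB's batch mode; a sketch, assuming the PID and path from above:
# gdb -p 15659 /proc/15659/exe -batch -ex 'generate-core-file core.15659' -ex detach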

Copying updated files from one server to another after 15 min

I want to copy an updated file from one server to another every 15 minutes, whenever a new file gets generated. I have written code using an expect script. It works fine, but after 15 minutes it copies all the files in the directory, i.e. it replaces everything and copies the latest file as well. I want only the updated file (updated every 15 minutes) to be copied, not all the files.
Here is my script:
while :
do
expect -c "spawn scp -P $Port sftpuser#$IP_APP:/mnt/oam/PmCounters/LBO* Test/;expect \"password\";send \"password\r\";expect eof"
sleep 900
done
Can I use rsync or some other approach, and how?
rsync only copies changed or new files by default. Use, for example, this syntax:
rsync -avz -e ssh remoteuser@remotehost:/remote/dir /local/dir/
That specifies ssh as the remote shell to use (-e ssh), -a enables archive mode, -v gives verbose output, and -z compresses the transfer.
You could run that every 15 minutes by a cronjob.
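A sketch of such a crontab entry (assuming ssh keys are set up so that no password prompt is needed):
# m h dom mon dow   command
*/15 * * * * rsync -avz -e ssh remoteuser@remotehost:/remote/dir /local/dir/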
For the password you can use the RSYNC_PASSWORD environment variable or the --password-file flag; note that these only apply to rsync daemon transfers, so for transfers over ssh key-based authentication is the usual approach.

Cgi-bin script to cat a file owned by a user

I'm using Ubuntu server and I have a cgi-bin script doing the following...
#!/bin/bash
echo Content-type: text/plain
echo ""
cat /home/user/.program/logs/file.log | tail -400 | col -b > /tmp/o.txt
cat /tmp/o.txt
Now if I run this script while I am su, the script fills /tmp/o.txt, and then host.com/cgi-bin/script runs, but it only shows output up to the point where I last ran the script from the CLI.
My Apache error log is showing "permission denied" errors, so I know the user Apache runs as cannot cat this file. I tried using chown, to no avail. Since this file is in a user's home directory, what is the best way to either duplicate it, symlink it, or something else?
I even considered running the script as root from a crontab to sort of "update" the file in /tmp, but that did not work for me. How would somebody experienced with cgi-bin handle access to a file in a user's directory?
The Apache user www-data does not have write access to a temporary file owned by another user.
But in this particular case, no temporary file is required.
tail -n 400 logfile | col -b
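Put together, the CGI script could look like this (a sketch, using the log path from the question):
#!/bin/bash
echo "Content-type: text/plain"
echo ""
# stream the last 400 lines straight to the client; no temporary file needed
tail -n 400 /home/user/.program/logs/file.log | col -b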
However, if Apache is running in a restricted chroot, it also has no access to /home.
The log file needs to be chmod o+r, and all the directories leading down to it need to be chmod o+x. Make sure you understand the implications of this! If the user has a reason to prevent access to one of those intermediate directories, read access to the file itself will not be enough to reach it. (Making www-data the group owner of something is possible in theory, but impractical and pointless, as anybody who finds the CGI script will have access to the file anyway.)
More generally, if you do need a temporary file, the simple fix (not even a workaround) is to generate a unique temporary file name and remove it afterwards.
temp=$(mktemp -t cgi.XXXXXXXX) || exit $?
trap 'rm -f "$temp"' 0
trap 'exit 127' 1 2 15
tail -n 400 logfile | col -b >"$temp"
The first trap makes sure the file is removed when the script terminates. The second makes sure the first trap runs if the script is interrupted or killed.
I would be inclined to change the program that creates the log in the first place and write it to some place visible to Apache - maybe through symbolic links.
For example:
ln -s /var/www/cgi-bin/logs /home/user/.program/logs
So your program continues to write to /home/user/.program/logs but the data actually lands in /var/www/cgi-bin/logs where Apache can read it.

user-data (cloud-init) script not executing on EC2

My user-data script:
#!
set -e -x
echo `whoami`
su root
yum update -y
touch ~/PLEASE_WORK.txt
which is fed in from the command:
ec2-run-instances ami-05355a6c -n 1 -g mongo-group -k mykey -f myscript.sh -t t1.micro -z us-east-1a
but when I check the file /var/log/cloud-init.log, the tail -n 5 is:
[CLOUDINIT] 2013-07-22 16:02:29,566 - cloud-init-cfg[INFO]: cloud-init-cfg ['runcmd']
[CLOUDINIT] 2013-07-22 16:02:29,583 - __init__.py[DEBUG]: restored from cache type DataSourceEc2
[CLOUDINIT] 2013-07-22 16:02:29,686 - cloud-init-cfg[DEBUG]: handling runcmd with freq=None and args=[]
[CLOUDINIT] 2013-07-22 16:02:33,691 - cloud-init-run-module[INFO]: cloud-init-run-module ['once-per-instance', 'user-scripts', 'execute', 'run-parts', '/var/lib/cloud/data/scripts']
[CLOUDINIT] 2013-07-22 16:02:33,699 - __init__.py[DEBUG]: restored from cache type DataSourceEc2
I've also verified that curl http://169.254.169.254/latest/user-data returns my file as intended.
There are no other errors, and none of my script's output appears. How do I get the user-data script to execute correctly on boot?
Actually, cloud-init allows a single shell script as an input (though you may want to use a MIME archive for more complex setups).
The problem with the OP's script is that the first line is incorrect. You should use something like this:
#!/bin/sh
The reason for this is that, while cloud-init uses #! to recognize a user script, the operating system needs a complete shebang line in order to execute the script.
So what's happening in the OP's case is that cloud-init behaves correctly (i.e. it downloads and tries to run the script) but the operating system is unable to actually execute it.
See: Shebang (Unix) on Wikipedia
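For illustration, the OP's script with a complete shebang line might look like this (a sketch; the user-data script already runs as root, so the su line is unnecessary):
#!/bin/bash
set -e -x
whoami
yum update -y
# write to an absolute path rather than relying on ~ being set during early boot
touch /root/PLEASE_WORK.txt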
Cloud-init does not accept plain bash scripts just like that. It's a beast that eats a YAML file that defines your instance (packages, ssh keys, and other stuff).
Using MIME you can also send arbitrary shell scripts, but you have to MIME-encode them.
$ cat my-boothook.txt
#!/bin/sh
echo "Hello World!"
echo "This will run as soon as possible in the boot sequence"
$ cat my-user-script.txt
#!/usr/bin/perl
print "This is a user script (rc.local)\n"
$ cat my-include.txt
# these urls will be read and pulled in as if they were part of the user-data
# comments are allowed. The format is one url per line
http://www.ubuntu.com/robots.txt
http://www.w3schools.com/html/lastpage.htm
$ cat my-upstart-job.txt
description "a test upstart job"
start on stopped rc RUNLEVEL=[2345]
console output
task
script
echo "====BEGIN======="
echo "HELLO From an Upstart Job"
echo "=====END========"
end script
$ cat my-cloudconfig.txt
#cloud-config
ssh_import_id: [smoser]
apt_sources:
- source: "ppa:smoser/ppa"
$ ls
my-boothook.txt my-include.txt my-user-script.txt
my-cloudconfig.txt my-upstart-job.txt
$ write-mime-multipart --output=combined-userdata.txt \
my-boothook.txt:text/cloud-boothook \
my-include.txt:text/x-include-url \
my-upstart-job.txt:text/upstart-job \
my-user-script.txt:text/x-shellscript \
my-cloudconfig.txt
$ ls -l combined-userdata.txt
-rw-r--r-- 1 smoser smoser 1782 2010-07-01 16:08 combined-userdata.txt
The combined-userdata.txt is the file you want to supply as your user-data.
More info here:
https://help.ubuntu.com/community/CloudInit
Also note that this depends heavily on the image you are using. But you say it really is a cloud-init based image, so this applies. There are other cloud initializers that are not named cloud-init, and for those things could be different.
This is a couple of years old now, but for others' benefit: I had the same issue, and it turned out that cloud-init was running twice, from inside /etc/rc3.d. Deleting these files inside that folder allowed the user-data to run correctly:
lrwxrwxrwx 1 root root 22 Jun 5 02:49 S-1cloud-config -> ../init.d/cloud-config
lrwxrwxrwx 1 root root 20 Jun 5 02:49 S-1cloud-init -> ../init.d/cloud-init
lrwxrwxrwx 1 root root 26 Jun 5 02:49 S-1cloud-init-local -> ../init.d/cloud-init-local
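That is, something along the lines of (a sketch; double-check the exact names on your own image first):
rm /etc/rc3.d/S-1cloud-config /etc/rc3.d/S-1cloud-init /etc/rc3.d/S-1cloud-init-local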
The problem is that cloud-init will not allow the user script to run again on the next start-up (user scripts normally run only once per instance).
First remove the cloud-init artifacts by executing:
rm /var/lib/cloud/instances/*/sem/config_scripts_user
And then your userdata must look like this:
#!/bin/bash
echo "hello!"
And then start your instance. It now works (tested).
