Linux terminal web browser

Is there a program I can run in the terminal that will spit out a webpage's contents? Basically, I want to redirect a webpage's output (robots.txt) to a txt file.

Wget has this option, amongst others; this will output the page to standard output:
wget -O - http://www.example.com/robots.txt
and this will write it to a file that you specify:
wget -O /whatever/output/file/you/put/here.txt http://www.example.com/robots.txt
If you want to append multiple commands to one file, you can use the shell's redirection capabilities. This will append the content of the page to the end of the specified file, preserving its previous content:
wget -O - http://www.example.com/robots.txt >> /home/piskvor/yourfile.txt

Telnet has been a well-known (though now forgotten, I guess) tool for looking at a web page. The general idea is to telnet to the HTTP port, type an HTTP/1.1 GET command, and then observe the results on the screen. A good explanation is http://support.microsoft.com/kb/279466
A Google search yields a whole bunch more.
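For example, a minimal manual session might look like this (www.example.com is a placeholder; after the Host line, press Enter on an empty line to end the request headers, and the server prints the response, robots.txt body included):
telnet www.example.com 80
GET /robots.txt HTTP/1.1
Host: www.example.com
Connection: close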


CLI colors disappear when piping into a text file [duplicate]

Various bash commands I use -- fancy diffs, build scripts, etc. -- produce lots of color output.
When I redirect this output to a file and then cat or less the file later, the colorization is gone -- presumably because the act of redirecting the output stripped out the color codes that tell the terminal to change colors.
Is there a way to capture colorized output, including the colorization?
One way to capture colorized output is with the script command. Running script will start a bash session where all of the raw output is captured to a file (named typescript by default).
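For example, a minimal sketch (everything that appears on the terminal between script and exit is captured to colors.txt, escape codes included; ls --color is just a stand-in for your own color-producing commands):
script colors.txt
ls --color
exit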
Redirecting doesn't strip colors, but many commands will detect when they are sending output to a terminal, and will not produce colors by default if not. For example, on Linux ls --color=auto (which is aliased to plain ls in a lot of places) will not produce color codes if outputting to a pipe or file, but ls --color will. Many other tools have similar override flags to get them to save colorized output to a file, but it's all specific to the individual tool.
Even once you have the color codes in a file, you need a tool that leaves them intact in order to see them. less has a -r flag to show file data in "raw" mode, which displays the color codes. Slightly newer versions also have a -R flag which is specifically aware of color codes and displays them properly, with better support for things like line wrapping/trimming than raw mode, because less can tell which things are control codes and which are actually characters going to the screen.
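Putting the two together, a minimal sketch (GNU ls shown; other tools have their own force-color flags):
ls --color=always > listing.txt
less -R listing.txt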
Inspired by the other answers, I started using script. I had to use -c to get it working, though. None of the other approaches, including tee and the various script examples, worked for me.
Context:
Ubuntu 16.04
running behavior tests with behave and starting shell commands during the tests with Python's subprocess.check_call()
Solution:
script --flush --quiet --return /tmp/ansible-output.txt --command "my-ansible-command"
Explanation of the switches:
--flush was needed because otherwise the output arrives in big chunks and is hard to observe live
--quiet suppresses the script tool's own output
-c, --command directly provides the command to execute; piping from my command into script did not work for me (no colors)
--return makes script propagate the exit code of my command, so I know whether my command failed
I found that using script to preserve colors when piping to less doesn't really work (less gets all messed up, and on exit bash is all messed up too) because less is interactive. script seems to really mess up input coming from stdin even after exiting.
So instead of running:
script -q /dev/null cargo build | less -R
I redirect /dev/null to it before piping to less:
script -q /dev/null cargo build < /dev/null | less -R
So now script doesn't mess with stdin and gets me exactly what I want. It's the equivalent of command | less but it preserves colors while also continuing to read new content appended to the file (other methods I tried wouldn't do that).
Some programs remove colorization when they realize the output is not a TTY (i.e. when you redirect them into another program). You can tell some of those to use color forcefully, and tell the pager to turn on colorization, for example by using less -R.
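For example, GNU grep can be told to color its output even when piped (a sketch; 'pattern' and somefile.txt are placeholders):
grep --color=always 'pattern' somefile.txt | less -R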
This question over on Super User helped me when my other answer (involving tee) didn't work. It involves using unbuffer to make the command think it's writing to a terminal.
I installed it using sudo apt install expect tcl rather than sudo apt-get install expect-dev.
I needed to use this method when redirecting the output of apt, ironically.
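The general pattern is a sketch like this (unbuffer runs the command under a pseudo-terminal, so the command believes it is writing to a screen and keeps its colors even though its output goes into a pipe):
unbuffer ls --color=auto | tee listing.txt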
I use tee: pipe the command's output to tee filename and it'll keep the colour. And if you don't want to see the output on the screen (which is what tee is for: showing and redirecting output at the same time), then just send the output of tee to /dev/null:
command | tee filename > /dev/null

Redirect wget screen output to a log file in bash

First of all, thank you everyone for your help. I have the following file that contains a series of URLs:
Salmonella_enterica_subsp_enterica_Typhi https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/717/755/GCF_003717755.1_ASM371775v1/GCF_003717755.1_ASM371775v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Paratyphi_A https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/818/115/GCF_000818115.1_ASM81811v1/GCF_000818115.1_ASM81811v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Paratyphi_B https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/705/GCF_000018705.1_ASM1870v1/GCF_000018705.1_ASM1870v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Infantis https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/011/182/555/GCA_011182555.2_ASM1118255v2/GCA_011182555.2_ASM1118255v2_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Typhimurium_LT2 https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/006/945/GCF_000006945.2_ASM694v2/GCF_000006945.2_ASM694v2_translated_cds.faa.gz
Salmonella_enterica_subsp_diarizonae https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/324/755/GCF_003324755.1_ASM332475v1/GCF_003324755.1_ASM332475v1_translated_cds.faa.gz
Salmonella_enterica_subsp_arizonae https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/635/675/GCA_900635675.1_31885_G02/GCA_900635675.1_31885_G02_translated_cds.faa.gz
Salmonella_bongori https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/006/113/225/GCF_006113225.1_ASM611322v2/GCF_006113225.1_ASM611322v2_translated_cds.faa.gz
And I have to download each URL using wget. I have already managed to download them, but the typical output appears in the shell:
--2021-04-23 02:49:00-- https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/635/675/GCA_900635675.1_31885_G02/GCA_900635675.1_31885_G02_translated_cds.faa.gz
Reusing existing connection to ftp.ncbi.nlm.nih.gov:443.
HTTP request sent, awaiting response... 200 OK
Length: 1097880 (1,0M) [application/x-gzip]
Saving to: ‘GCA_900635675.1_31885_G02_translated_cds.faa.gz’
GCA_900635675.1_31885_G0 100%[=================================>] 1,05M 2,29MB/s in 0,5s
2021-04-23 02:49:01 (2,29 MB/s) - ‘GCA_900635675.1_31885_G02_translated_cds.faa.gz’ saved [1097880/1097880]
I want to redirect that output to a log file. Also, as the files download, I want to decompress them, because they are compressed as .gz. My code is the following:
cat $ncbi_urls_file | while read line
do
    echo " Downloading fasta files from NCBI..."
    awk '{print $2}' | wget -i-
done
wget
wget does have options for logging to a file; from man wget:
Logging and Input File Options
-o logfile
--output-file=logfile
Log all messages to logfile. The messages are normally reported to standard error.
-a logfile
--append-output=logfile
Append to logfile. This is the same as -o, only it appends to logfile instead of overwriting the old log file. If logfile does not exist, a new file is created.
-d
--debug
Turn on debug output, meaning various information important to the developers of Wget if it does not work properly. Your system administrator may have chosen to compile Wget without debug support, in which case -d will not work. Please note that compiling with debug support is always safe---Wget compiled with the debug support will not print any debug info unless requested with -d.
-q
--quiet
Turn off Wget's output.
-v
--verbose
Turn on verbose output, with all the available data. The default output is verbose.
-nv
--no-verbose
Turn off verbose without being completely quiet (use -q for that), which means that error messages and basic information still get printed.
You would need to experiment to get what you need; if you want all the logs in a single file, use -a log.out, which will cause wget to append logging information to that file instead of writing to stderr.
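Applied to the loop from the question, a minimal sketch might look like this (assuming the two-column name/URL format shown above; the gunzip step is an illustrative addition for the decompression part):
echo "Downloading fasta files from NCBI..."
while read -r name url
do
    wget -a download.log "$url"   # append wget's log messages to download.log
    gunzip "${url##*/}"           # decompress the .gz file wget just saved
done < "$ncbi_urls_file"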
Standard output can be redirected to a file in bash using the >> operator (for appending to the file) or the > operator (for truncating / overwriting the file). e.g.
echo hello >> log.txt
will append "hello" to log.txt. If you still want to be able to see the output in your terminal and also write it to a log file, you can use tee:
echo hello | tee log.txt
However, wget outputs most of its basic progress information through standard error rather than standard output. This is actually a very common practice. Displaying progress often involves special characters that overwrite lines (e.g. to update a progress bar), change terminal colors, etc. Terminals can process these characters sensibly in real time, but it rarely makes sense to store them in a file. For this reason, incremental progress output is usually sent to standard error, separate from the output that belongs in a log file, so that each can be redirected on its own.
However, you can still redirect standard error to a log file:
wget example.com 2>> log.txt
Or using tee:
wget example.com 2>&1 | tee log.txt
(2>&1 redirects standard error to standard output, which is then piped to tee).

How to make apache skip specific stdout lines when executing cgi or stop shared library from printing to stdout

My Apache server executes a CGI binary and reads the lines it outputs to stdout. The thing is, every time the CGI binary is executed, some lines are printed as the API's dynamic library loads. Apache can't distinguish those 3 lines from the output that initiates my streaming.
Is there a way to make Apache skip the lines that are keeping it from working, or to stop the shared libraries from printing as they load?
If your site is doing fewer than a couple of requests per second, you can use a simple hack by wrapping your CGI with a shell script that will filter out the problem lines:
Put the code below in your new CGI script (make sure to chmod +x it), and test by running it manually in the shell:
#! /bin/bash
/path/to/old-cgi | egrep -v 'troublestring1|troublestring2|troublestring3'
You must make sure that the trouble strings are unique enough to never legitimately appear in your real output. If you cannot find such strings, you may need a more sophisticated parsing script in place of the egrep filter.
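If the unwanted lines are always just the first three, a simpler variant (a sketch, assuming the count is fixed) skips them by position instead of by content:
#! /bin/bash
/path/to/old-cgi | tail -n +4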

Linux - sh script - download multiple files from FTP

I need a script that can download files from FTP via my sh code.
I have used expect with ftp, but if I do a for loop inside the code, I get
wrong # args: should be "for start test next command"
while executing "for v in "a b c""
My code:
/usr/bin/expect << EXCEPT_SCRIPT
set timeout 1
spawn ftp -n ${HOST}
send "user ${USER} ${PASSWD}\n"
expect "ftp>"
send "bin\n"
expect "ftp>"
send "dir\n"
for v in "${files_to_download[@]}"
do
    ftp_file="${v}.bz2"
    #download file
    echo ${ftp_file}
    #put code to dl here
done
expect "ftp>"
send "bye\n"
EXCEPT_SCRIPT
wget
expect can be tricky to work with, so I'd prefer to use GNU Wget as an alternative. The following should work as long as you don’t have any spaces in any of the arguments.
for v in "${files_to_download[@]}"
do
    ftp_file="${v}.bz2"
    wget --user=${USER} --password=${PASSWD} ${HOST}/${ftp_file}
done
Request multiple resources using only one FTP connection
If it’s an issue, you can avoid having to make multiple FTP connections to the server by using FTP globbing (wildcard characters), e.g.
wget --user=${USER} --password=${PASSWD} "${HOST}/*.bz2"
Make sure the URL is quoted so the shell doesn’t expand the wildcard.
You can also use the -nc, --no-clobber option to avoid re-downloading files that have already been downloaded.
On the other hand, requesting multiple resources (wget url1 url2 ...) does result in multiple FTP logins.
Note: I checked both of the above operations by monitoring Port 21 on my own FTP server with tcpdump.
ncftp
You may also be interested in ncftp. While I haven’t used it in years, I found it – and its ncftpget and ncftpput utilities – much easier to use for command line scripting than the standard UNIX ftp program.
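For example, a single ncftpget invocation can fetch all the .bz2 files over one connection (a sketch; adjust the remote directory to your server's layout):
ncftpget -u "${USER}" -p "${PASSWD}" "${HOST}" . "/remote/dir/*.bz2"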

What's a .sh file?

So I am not experienced in dealing with a plethora of file types, and I haven't been able to find much info on exactly what .sh files are. Here's what I'm trying to do:
I'm trying to download map data sets which are arranged in tiles that can be downloaded individually: http://daymet.ornl.gov/gridded
In order to download a range of tiles at once, they say to download their script, which eventually leads to daymet-nc-retrieval.sh: https://github.com/daymet/scripts/blob/master/Bash/daymet-nc-retrieval.sh
So, what exactly am I supposed to do with this code? The website doesn't provide further instructions, assuming users know what to do with it. I'm guessing you're supposed to paste the code into some other unmentioned application for a browser (using Chrome or Firefox in this case)? It almost looks like something that could be pasted into Firefox/Greasemonkey, but not quite. From a quick Google on the file type I haven't been able to make heads or tails of it.
I'm sure there's a simple explanation of what to do with these files out there, but it seems to be buried in plenty of posts where people already assume you know what to do with them. Is anyone willing to simply say what needs to be done from square one, from getting to the page with the code to actually running it? Thanks.
What is a file with extension .sh?
It is a Bourne shell script. Such scripts are used in many variants of UNIX-like operating systems. They have no "language" of their own and are interpreted by your shell (the interpreter of terminal commands), or, if the first line is in the form
#!/path/to/interpreter
they will use that particular interpreter. Your file has the first line:
#!/bin/bash
and that means it uses the Bourne Again Shell, so-called bash. It is, for all practical purposes, a replacement for good old sh.
Depending upon the interpreter you will have different languages in which the file is written.
Keep in mind that in the UNIX world, it is not the extension of the file that determines what the file is (see "How to execute a shell script" below).
If you come from the world of DOS/Windows, you will be familiar with files that have .bat or .cmd extensions (batch files). They are not similar in content, but are akin in design.
How to execute a shell script
Unlike some unsafe operating systems, *nix does not rely exclusively on extensions to determine what to do with a file. Permissions are also used. This means that if you attempt to run the shell script after downloading it, it will be the same as trying to "run" any text file. The ".sh" extension is there only for your convenience, so you can recognize the file.
You will need to make the file executable. Let's assume that you have downloaded your file as file.sh, you can then run in your terminal:
chmod +x file.sh
chmod is a command for changing a file's permissions, +x sets the execute permission (in this case for everybody), and finally you have your file name.
You can also do it in your GUI. Most of the time you can right-click on the file, select Properties, and set the execute permission there (in Xubuntu, for example, the option is among the file's permission settings).
If you do not wish to change the permissions, you can also force the shell to run the command. In the terminal you can run:
bash file.sh
The shell should be the same as in the first line of your script.
How safe is it?
You may find it weird that you must perform another task manually in order to execute a file, but this is partly because of a strong need for security.
Basically when you download and run a bash script, it is the same thing as somebody telling you "run all these commands in sequence on your computer, I promise that the results will be good and safe". Ask yourself if you trust the party that has supplied this file, ask yourself if you are sure that you have downloaded the file from the same place as you thought, maybe even have a glance inside to see if something looks out of place (although that requires that you know something about *nix commands and bash programming).
Unfortunately apart from the warning above I cannot give a step-by-step description of what you should do to prevent evil things from happening with your computer; so just keep in mind that any time you get and run an executable file from someone you're actually saying, "Sure, you can use my computer to do something".
If you open your second link in a browser you'll see the source code:
#!/bin/bash
# Script to download individual .nc files from the ORNL
# Daymet server at: http://daymet.ornl.gov
[...]
# For ranges use {start..end}
# for individual values, use: 1 2 3 4
for year in {2002..2003}
do
    for tile in {1159..1160}
    do wget --limit-rate=3m http://daymet.ornl.gov/thredds/fileServer/allcf/${year}/${tile}_${year}/vp.nc -O ${tile}_${year}_vp.nc
    # An example using curl instead of wget
    #do curl --limit-rate 3M -o ${tile}_${year}_vp.nc http://daymet.ornl.gov/thredds/fileServer/allcf/${year}/${tile}_${year}/vp.nc
    done
done
So it's a bash script. Got Linux?
In any case, the script is nothing but a series of HTTP retrievals. Both wget and curl are available for most operating systems, and almost all languages have HTTP libraries, so it's fairly trivial to rewrite in any other technology. There are also some Windows ports of bash itself (Git includes one). Last but not least, Windows 10 now has native support for Linux binaries.
.sh files are UNIX (Linux) shell executable files; they are the equivalent (but much more powerful) of .bat files on Windows.
So you need to run them from a Linux console, just by typing their name, the same way you do with .bat files on Windows.
Typically a .sh file is a shell script which you can execute in a terminal. Specifically, the script you mentioned is a bash script, which you can see if you open the file and look at its first line, which is called the shebang or magic line.
I know this is an old question and I probably won't help, but many Linux distributions (e.g., Ubuntu) have a "live CD/USB" function, so if you really need to run this script, you could try booting your computer into Linux. Just burn an .iso to a flash drive (here's how: http://goo.gl/U1wLYA), start your computer with the drive plugged in, and press the appropriate function key for the boot menu. If you choose the "...USB..." entry, you will boot into the OS you just put on the drive.
How do I run .sh scripts?
Give execute permission to your script:
chmod +x /path/to/yourscript.sh
And to run your script:
/path/to/yourscript.sh
Since . refers to the current directory: if yourscript.sh is in the current directory, you can simplify this to:
./yourscript.sh
or with the GUI:
https://askubuntu.com/questions/38661/how-do-i-run-sh-scripts/38666#38666
https://www.cyberciti.biz/faq/run-execute-sh-shell-script/
Open the location in a terminal, then type these commands:
1. chmod +x filename.sh
2. ./filename.sh
That's it.
