I'm researching the wget command in Linux (I'm very new to Linux), and I found this statement, which I don't understand:
GNU Wget is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols. It is a non-interactive commandline tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc.
What does "without X-Windows support" mean?
Also, what I understand about wget is that it downloads something, but how come I can run
wget http://google.com/
and see some weird text on the screen?
A little help here, please.
Wget downloads content to a file, so the text you see in your terminal is just a status log, not the page itself.
Non-interactive means that it doesn't prompt for any input while it works; you specify everything via command-line parameters.
X (and related) handles GUI rendering. See http://en.wikipedia.org/wiki/X_Window_System for details.
It's easier to think of what wget DOESN'T do. Your typical browser reads a URL from a GUI, and when you submit it, the browser generates and sends a request to retrieve an HTML file. It then parses the (text-based) HTML source, sends further requests for content like images, and renders the whole thing in the GUI as a webpage.
Wget just sends the request and downloads the file. It can be told to recursively fetch the links in the source file, so you could download the whole internet with a few keystrokes XD.
It's useful by itself for grabbing graphics and audio files without having to sit through a point-and-click session. You can also pipe the HTML source through a custom sed or perl filter to extract data (like going to the city transit page and converting the schedule info to a spreadsheet format).
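A toy version of that filter idea, assuming a page already saved by wget (the file name and markup here are made up for illustration):

```shell
# Create a stand-in for a page wget might have saved:
cat > schedule.html <<'EOF'
<html><body>
<a href="/route/1">Route 1</a>
<a href="/route/2">Route 2</a>
</body></html>
EOF

# Pull every href value out of the HTML, one per line:
LINKS=$(grep -o 'href="[^"]*"' schedule.html | sed 's/href="//; s/"$//')
echo "$LINKS"
```

For real pages an HTML-aware parser is more robust than grep/sed, but for quick one-off extractions this pattern goes a long way.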
Related
I'm not quite sure if this is possible at all, so I decided to ask here.
I need to automate some things, and I'm looking for a way to reproduce dragging and dropping a local file into a browser window (at a specific URL) via a Terminal command, for uploading.
I use a Mac, but I think a Linux solution will fit here as well.
If there is any solution or module in Bash / Python / Node.js, I will gladly give it a try.
Take a look at the requests package in Python.
With it you can make a POST request and send the information you want to the web server.
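There's no way to script the drag-and-drop gesture itself, but the POST it triggers can be reproduced. A minimal sketch with curl (standing in for Python's requests here; the throwaway local server below only exists to make the example self-contained, and the "file" field name is a placeholder for whatever the real form expects):

```shell
# Throwaway local endpoint standing in for the real upload URL:
python3 - <<'PYEOF' &
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'received %d bytes' % len(body))
    def log_message(self, *args):  # keep the demo quiet
        pass

# Serve exactly one request, then exit:
HTTPServer(('127.0.0.1', 8765), Handler).handle_request()
PYEOF
sleep 1

printf 'sample upload data\n' > upload.txt
# -F builds the same multipart/form-data POST a browser sends when you
# drop a file into an upload form:
RESPONSE=$(curl -s -F "file=@upload.txt" http://127.0.0.1:8765/)
echo "$RESPONSE"
```

Against a real site you would replace the URL with the form's action URL and match its field names.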
I'm using Julia and Gadfly to draw some plots on a remote server (connected through PuTTY), and the plots are supposed to open in my default browser. They open in lynx, and so don't look like anything really. I'm presuming lynx is the default browser on my work server, and I was wondering whether there is any way to open them in Chrome or Firefox instead. I'm not the server administrator and don't have permission to use all commands (i.e. sudo etc.).
When trying to use xdg-utils I get an error saying "command not found", and I don't have any applications in my /usr/.local/applications, nor could I find a mimeapps.list in that directory.
Is there anything I can do to open these plots in another internet browser instead of lynx? Thank you!
The order of preferences
Gadfly plots on Julia's display if it can (for example if you're using an interactive graphical notebook with Jupyter).
If there's no suitable way to render on the REPLDisplay, Gadfly will save the plot into a file, then trigger some platform-specific "open this file" logic.
Julia's own display
This is almost certainly the best option. If you run your Julia code in an environment that knows how to display your plots (such as an interactive graphical notebook with Jupyter), then there's nothing more to do.
If you must run your Julia code from a text prompt, you can use a text-based backend renderer, or deal with the fallback process.
xdg-open
Gadfly's fallback display code uses xdg-open to display plot files on Linux-based systems.
The xdg-open tool is part of a package called xdg-utils. The xdg-utils package contains several commands, but xdg-utils is not itself a command -- that's why trying to run "xdg-utils" fails with "command not found".
xdg-open has its own chain of opening things: it will try the opening procedures specific to GNOME, KDE, or whatever desktop environment you're using. It falls back to something called "perl-shared-mimeinfo".
Another tool in the xdg-utils package is xdg-mime, which can query the current file associations as well as change them. You need administrator privileges to change system-wide associations, but you don't need any special permissions to add your own per-user associations.
Since Gadfly is writing to a file and then asking xdg-open to open the file, you'll need to handle the filetype (rather than a "browser" or URL handler). For HTML files it might look something like this:
$ xdg-mime default mybrowser.desktop text/html
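As a concrete sketch of the per-user association, the .desktop entry might look like this -- the mybrowser.desktop name and the firefox Exec line are placeholders for whatever browser you actually have:

```shell
# Per-user .desktop entries live under ~/.local/share/applications
# (freedesktop.org convention); no root access needed.
mkdir -p "$HOME/.local/share/applications"
cat > "$HOME/.local/share/applications/mybrowser.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=My Browser
Exec=firefox %u
MimeType=text/html;
EOF
# Then point HTML files at it (needs xdg-utils installed):
#   xdg-mime default mybrowser.desktop text/html
```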
Which computer runs the browser?
Now, you mention that you're using SSH and PuTTY to connect to this server. PuTTY provides a text-based interface to your server -- even if the server had a graphical browser like Firefox installed on it, PuTTY couldn't display it. (You'd need something else on your computer that the server could use to draw the browser window.)
It would probably be more comfortable to use your computer's own browser.
So what do I do?
Launching a browser is a bit weird for a server computer anyway, and it can be fiddly to make it happen. So my recommendation would be either:
Skip PuTTY and display directly in a Jupyter notebook.
Save your output as HTML (or SVGJS) somewhere that your computer's browser can access it.
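For the second option, any static file server on the remote machine will do. A sketch, with made-up paths and port:

```shell
# Write the plot (or any HTML output) somewhere servable:
mkdir -p plots
echo '<html><body>my plot</body></html>' > plots/plot.html

# On the server, serve that directory (Ctrl-C to stop):
#   python3 -m http.server 8000 --directory plots
# Then either browse to http://your-server:8000/plot.html from your
# own machine, or tunnel the port through SSH first:
#   ssh -L 8000:localhost:8000 you@your-server
```

With the tunnel in place, http://localhost:8000/plot.html in your local Chrome or Firefox shows the server's file.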
Certain pages require confirmation (e.g. checking a checkbox to accept an agreement) prior to download, for example:
http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
Is it possible to download such a file from the Linux command line (no X server available)?
To download directly, I use wget.
If you want a console-only text web browser, use lynx or links.
If you want to script interaction, consider curl (and also libcurl, which is interfaced to many scripting languages, e.g. python etc).
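As a sketch of the curl route: the download itself is one command, and the "agreement" is often just a cookie or header that the confirmation page sets (the cookie name below is purely illustrative -- inspect the real page to find what it actually sets). The file:// fetch keeps the example runnable without a network:

```shell
# Stand-in for the remote file, so the example needs no network:
printf 'fake installer payload\n' > source.bin

# Scripted download. Against a real site you would add whatever the
# confirmation page sets, e.g.:
#   curl -L -b "license=accepted" -o jdk.tar.gz "https://example.com/jdk"
curl -s -o fetched.bin "file://$PWD/source.bin"
```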
I need to archive complete pages, including any linked images etc., on my Linux server, and I'm looking for the best solution. Is there a way to save all assets and then relink them all to work in the same directory?
I've thought about using curl, but I'm unsure of how to do all of this. Also, will I maybe need PHP-DOM?
Is there a way to use firefox on the server and copy the temp files after the address has been loaded or similar?
Any and all input welcome.
Edit:
It seems as though wget is not going to work, as the files need to be rendered. I have Firefox installed on the server; is there a way to load the URL in Firefox, grab the temp files, and then clear the temp files afterwards?
wget can do that, for example:
wget -r http://example.com/
This will mirror the whole example.com site.
Some interesting options are:
-Dexample.com: do not follow links to other domains
--html-extension: renames pages with text/html content-type to .html
Manual: http://www.gnu.org/software/wget/manual/
Use the following command:
wget -E -k -p http://yoursite.com
Use -E to adjust extensions, -k to convert links so the page loads from your local copy, and -p to download all objects inside the page.
Note that this command does not download other pages hyperlinked from the specified page; it only downloads the objects required to display the specified page properly.
If all the content in the web page was static, you could get around this issue with something like wget:
$ wget -r -l 10 -p http://my.web.page.com/
or some variation thereof.
Since you also have dynamic pages, you cannot in general archive such a web page using wget or any simple HTTP client. A proper archive needs to incorporate the contents of the backend database and any server-side scripts. That means that the only way to do this properly is to copy the backing server-side files. That includes at least the HTTP server document root and any database files.
EDIT:
As a work-around, you could modify your webpage so that a suitably privileged user could download all the server-side files, as well as a text-mode dump of the backing database (e.g. an SQL dump). You should take extreme care to avoid opening any security holes through this archiving system.
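A hedged sketch of that server-side copy (the document root path and database name are assumptions; the tar step below runs against a dummy directory so the example stays self-contained):

```shell
# Stand-in for the HTTP server's document root:
mkdir -p docroot
echo '<html>hi</html>' > docroot/index.html

# Archive the server-side files; in practice docroot would be
# something like /var/www/html:
tar czf site-files.tar.gz docroot

# Plus a text-mode dump of the backing database, e.g. for MySQL:
#   mysqldump --single-transaction mydb > mydb.sql
```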
If you are using a virtual hosting provider, most of them provide some kind of Web interface that allows backing-up the whole site. If you use an actual server, there is a large number of back-up solutions that you could install, including a few Web-based ones for hosted sites.
What's the best way to save a complete webpage on a linux server?
I tried a couple of tools, curl and wget included, but nothing worked up to my expectations.
Finally I found a tool that saves a complete webpage (images, scripts, linked pages... everything included). It's written in Rust and named monolith. Take a look.
It does not save images and other scripts/stylesheets as separate files, but packs them into one HTML file.
For example
If I had to save https://nodejs.org/en/docs/es6 to es6.html with all page requisites packed into one file, I would run:
monolith https://nodejs.org/en/docs/es6 -o es6.html
wget -r http://yoursite.com
Should be sufficient and grab images/media. There are plenty of options you can feed it.
Note: I believe neither wget nor any other program supports downloading images specified through CSS -- so you may need to do that yourself manually.
Here may be some useful arguments: http://www.linuxjournal.com/content/downloading-entire-web-site-wget
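On that CSS note, a rough sketch of doing it manually, assuming a stylesheet wget has already saved (the file name and rules are made up):

```shell
cat > style.css <<'EOF'
body { background: url("img/bg.png"); }
h1   { background: url('img/logo.png'); }
EOF

# Extract the url(...) targets, stripping optional quotes:
CSS_URLS=$(grep -o "url([\"']*[^\"')]*" style.css | sed "s/url([\"']*//")
echo "$CSS_URLS"
# After resolving relative paths against the page URL, these can be
# fed to `wget -i -` to fetch them.
```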
I want to open HTML files from a shell script. I know that Ubuntu has a command, x-www-browser, that will open the default browser on the system. I also found via some Googling that the command is part of the Debian system. I was wondering if the command is available on non-Debian-based distros. If it isn't, is there a standard way of opening an HTML file in the default browser on a Linux OS via the command line? Note that I'm using Bash.
If you want to open an HTML file that is local (and maybe even remote; I'd have to check), you can use xdg-open. This is the rough equivalent of "double-clicking" on a file to open it, so it's not limited to HTML files. Since you want to always open the file in the user's default browser, this is the same as if they had opened it themselves.
Of course, if they have their system set up to open HTML files in a text editor (like I did for a while), this would backfire. But that's pretty rare.
Quick update
I just checked, and xdg-open http://google.com brought up Google in Firefox (my default browser), so it does work for non-local files.
You could use xdg-open.