Creating multi-threading in Linux


I have a Linux computer and want to do the following (two independent actions/processes if possible):
1. Run a series of CLI commands every x minutes against a remote computer to see if a file exists there. If it does, I want to start downloading this file to one of my directories. The files could be big and there might be many remote computers, so ideally each connection should be handled as its own process.
2. Check whether a new file has arrived in my file system. If it has, I want to go through this file, analyze the content with some algorithms and then store the result in a database that I have installed. Then delete the file that was analyzed.
Any recommendations on how to do this the "best" and most reliable way? Scripting? Java/C/etc? Multi threading or just a single process that is looping through the content? The result should be something that should run for months without stopping.
Any suggestions and/or sample code very welcome!
Thanks!
Z

For #1, you can use crontab (http://unixhelp.ed.ac.uk/CGI/man-cgi?crontab+5) to run your script on a schedule.
I think you can use Shell, Python, or Ruby to complete your tasks.

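For downloading, you can use something like this script (not tested and incomplete):
download_and_analyze() {
    local link=$1
    local tmpdir
    tmpdir=$(mktemp -d) || return 1   # private scratch directory for this download
    (
        cd "$tmpdir" || exit 1
        wget -c -t 0 -q "$link"       # -c resume, -t 0 retry without limit, -q quiet
        # analyze algorithm here
    )
    rm -rf "$tmpdir"
}
for A in $REMOTEFILE; do
    download_and_analyze "$A" &       # one background job per remote file
done
wait                                  # block until every background job has finished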
This is just a sketch for you to implement as a shell script. Reliability comes from wget, which retries and resumes interrupted downloads.
You can also use rsync if the file is on another computer accessible by ssh.
cheers
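
A comparable structure in Python, as a rough sketch rather than a tested solution: each remote source is handed to a worker thread, and a failure on one source is recorded without stopping the others. The names process_all and fake_download are made up for illustration; fake_download only stands in for whatever download-and-analyze step you end up with.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_all(sources, handler, max_workers=4):
    """Run handler(source) for every source in parallel worker threads.

    Returns (results, errors), two dicts keyed by source. A failure on one
    source is recorded and does not stop the others.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(handler, src): src for src in sources}
        for future in as_completed(futures):
            src = futures[future]
            try:
                results[src] = future.result()
            except Exception as exc:
                # Record the failure and carry on with the remaining sources.
                errors[src] = exc
    return results, errors


def fake_download(source):
    # Hypothetical stand-in for the real download-and-analyze step.
    if source == "bad-host":
        raise ConnectionError("unreachable")
    return len(source)


if __name__ == "__main__":
    print(process_all(["host-a", "host-b", "bad-host"], fake_download))
```

Threads are enough here because the work is mostly waiting on the network; if the analysis is CPU-heavy, ProcessPoolExecutor from the same module has the same interface.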

Related

How to execute a shell program taking inputs with python?

First of all, I'm using Ubuntu 20.04 and Python 3.8.
I would like to run a program that takes command line inputs. I managed to start the program from python with the os.system() command, but after starting the program it is impossible to send the inputs. The program in question is a product interface application that uses the CubeSat Space Protocol (CSP) as a language. However, the inputs used are encoded in a .c file with their corresponding .h header.
In the shell, it looks like this:
starting the program
In python, it looks like this:
import os
os.chdir('/home/augustin/workspaceGS/gs-sw-nanosoft-product-interface-application-2.5.1')
os.system('./waf')
os.system('./build/csp-client -k/dev/ttyUSB1')
os.system('cmp ident') #cmp ident is typically the kind of command that does not work on python
The output is the same as in the shell but without the "cmp ident" output; that is to say, it's impossible for me to use the csp-client.
As you can probably see, I'm a real beginner trying to be as clear and precise as possible. I can of course try to give more information if needed. Thanks for your help !

It sounds like the pexpect module might be what you're looking for, rather than os.system. It's designed for controlling other applications and interacting with them the way a human would. The documentation for it is available here. But what you want will probably look something like this:
import pexpect
p = pexpect.spawnu("/home/augustin/workspaceGS/gs-sw-nanosoft-product-interface-application-2.5.1/build/csp-client -k/dev/ttyUSB1")
p.expect("csp-client")
p.sendline("cmp ident")
print(p.read())
p.close()

I'll try and give you some hints to get you started - though bear in mind I do not know any of your tools, i.e. waf or csp-client, but hopefully that will not matter.
I'll number my points so you can refer to the steps easily.
Point 1
If waf is a build system, I wouldn't keep running that every time you want to run your csp-client. Just use waf to rebuild when you have changed your code - that should save time.
Point 2
When you change directory to /home/augustin/workspaceGS/gs-sw-nanosoft-product-interface-application-2.5.1 and then run ./build/csp-client you are effectively running:
/home/augustin/workspaceGS/gs-sw-nanosoft-product-interface-application-2.5.1/build/csp-client -k/dev/ttyUSB1
But that is rather annoying, so I would make a symbolic link to that from /usr/local/bin so that you can run it just with:
csp-client -k/dev/ttyUSB1
So, I would make that symlink with:
ln -s /home/augustin/workspaceGS/gs-sw-nanosoft-product-interface-application-2.5.1/build/csp-client /usr/local/bin/csp-client
You MAY need to put sudo at the start of that command. Once you have that, you should be able to just run:
csp-client -k/dev/ttyUSB1
Point 3
Your Python code doesn't work because every os.system() starts a completely new shell, unrelated to the previous line or shell. And the shell that it starts then exits before your next os.system() command.
As a result, the cmp ident command never goes to the csp-client. You really need to send the cmp ident command on the stdin or "standard input" of csp-client. You can do that in Python, it is described here, but it's not all that easy for a beginner.
Instead of that, if you just have a few limited commands you need to send, such as "take a picture", I would make and test complete bash scripts in the Terminal until I got them right, and then just call those from Python. So, I would make a bash script in your HOME directory called, say, csp-snap and put something like this in it:
#!/bin/bash
# Extend PATH so we can find "/usr/local/bin/csp-client"
PATH=$PATH:/usr/local/bin
{
# Tell client to take picture
echo "nanoncam snap"
# Exit csp-client
echo exit
} | csp-client -k/dev/ttyUSB1
Now make that executable (only necessary once) with:
chmod +x $HOME/csp-snap
And then you can test it with:
$HOME/csp-snap
If that works, you can copy the script to /usr/local/bin with:
cp $HOME/csp-snap /usr/local/bin
You may need sudo at the start again.
Then you should be able to take photos from anywhere just with:
csp-snap
Then your Python code becomes easy:
os.system('/usr/local/bin/csp-snap')
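
The stdin approach mentioned in Point 3 could look roughly like this with the standard subprocess module. This is only a sketch: send_commands is a made-up helper name, and it assumes the program reads its commands from standard input when no terminal is attached (the same assumption the bash script above relies on). The demo uses a small Python child process as a stand-in, since I can't run csp-client here.

```python
import subprocess
import sys


def send_commands(argv, commands, timeout=30):
    """Start argv, feed it one command per line on stdin, return its stdout."""
    script = "\n".join(commands) + "\n"
    completed = subprocess.run(
        argv,
        input=script,          # written to the child's standard input
        capture_output=True,   # collect stdout and stderr instead of printing
        text=True,             # work with str rather than bytes
        timeout=timeout,
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in child that upper-cases whatever arrives on stdin.
    echo_upper = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())",
    ]
    print(send_commands(echo_upper, ["cmp ident", "exit"]))
```

For the real client the call would presumably be send_commands(["/usr/local/bin/csp-client", "-k/dev/ttyUSB1"], ["cmp ident", "exit"]). If the program insists on a terminal, pexpect is the better fit.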

When should I pause between shell commands in Linux?

I'm running a script of multiple commands using Linux and I want to know if I should use the sleep/pause script or not?
What's the disadvantage of not using sleep/pause? And will it affect my script?
My script for example will be looking like this :
#!/bin/bash
rm -rf /var/www/testdir/*
echo "Example1 deleted."
cp -r /var/www/testdirOrig/* /var/www/testdir/
echo "Example1 copied original files."
Thanks in advance.

The shell already runs the commands in a script one after another: each command finishes before the next one starts.
Pausing/sleeping is not something you should normally do, except perhaps when busy-waiting for some file to appear (though there are better ways to do that too).
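
The same holds if you drive the commands from Python. A small sketch (run_in_order is just an illustrative name): subprocess.run() blocks until the child process has exited, so no sleep is needed between steps, and check=True stops the sequence at the first failure.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run each command to completion before starting the next.

    subprocess.run() blocks until the child exits, and check=True raises
    CalledProcessError on a non-zero exit, so later steps are skipped.
    """
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_in_order([
        [sys.executable, "-c", "print('step 1 done')"],
        [sys.executable, "-c", "print('step 2 done')"],
    ])
```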

Python - read from a remote logfile that is updated frequently

I have a logfile that is written constantly on a remote networking device (F5 bigip). I have a Linux hopping station from where I can fetch that log file and parse it. I did find a solution that would implement a "tail -f" but I cannot use nice or similar to keep my script running after I log out. What I can do is to run a cronjob and copy over the file every 5 min, let's say. I can process the file I downloaded, but the next time I copy it, it will contain a lot of data I have already seen, so how do I process only what is new? Any help or suggestions are welcome!

Two possible (non-Python) solutions for your problems. If you want to keep a script running on your machine after logout, check nohup in combination with &, like:
nohup my_program > /dev/null 2>&1 &
On a Linux machine you can extract the difference between the two files with
grep -Fxv -f old.txt new.txt > dif.txt
This might be slow if the file is large. The dif.txt file will only contain the new stuff and can be inspected by your program. There also might be a solution involving diff.
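
Since the question asks about Python: another option is to remember how far into the file you got last time and only read from there. A minimal sketch, assuming the log only ever grows by appending (if it is shorter than the saved offset, it is treated as rotated and read from the start). The function name and the state file are illustrative, not part of any library.

```python
import os


def read_new_lines(log_path, state_path):
    """Return the complete lines appended to log_path since the last call.

    The byte offset reached last time is stored in state_path.
    """
    try:
        with open(state_path) as f:
            offset = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0
    if os.path.getsize(log_path) < offset:
        offset = 0  # file shrank: probably rotated, so start over
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
    # Consume complete lines only; a partial last line waits for the next run.
    end = data.rfind(b"\n") + 1
    with open(state_path, "w") as f:
        f.write(str(offset + end))
    return data[:end].decode("utf-8", errors="replace").splitlines()
```

Called from the 5-minute cron job right after the copy, this hands your parser only the lines that arrived since the previous run.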

What's a .sh file?

So I am not experienced in dealing with a plethora of file types, and I haven't been able to find much info on exactly what .sh files are. Here's what I'm trying to do:
I'm trying to download map data sets which are arranged in tiles that can be downloaded individually: http://daymet.ornl.gov/gridded
In order to download a range of tiles at once, they say to download their script, which eventually leads to daymet-nc-retrieval.sh: https://github.com/daymet/scripts/blob/master/Bash/daymet-nc-retrieval.sh
So, what exactly am I supposed to do with this code? The website doesn't provide further instructions, assuming users know what to do with it. I'm guessing you're supposed to paste the code in to some other unmentioned application for a browser (using Chrome or Firefox in this case)? It almost looks like something that could be pasted in to Firefox/Greasemonkey, but not quite. Just by a quick Google on the file type I haven't been able to get heads or tails on it.
I'm sure there's a simple explanation on what to do with these files out there, but it seems to be buried in plenty of posts where people are already assuming you know what to do with these files. Anyone willing to just simply say what needs to be done from square one after getting to the page with the code to actually implementing it? Thanks.

What is a file with extension .sh?
It is a Bourne shell script. They are used in many variations of UNIX-like operating systems. They have no "language" and are interpreted by your shell (interpreter of terminal commands) or if the first line is in the form
#!/path/to/interpreter
they will use that particular interpreter. Your file has the first line:
#!/bin/bash
and that means that it uses Bourne Again Shell, so called bash. It is for all practical purposes a replacement for good old sh.
Depending upon the interpreter you will have different languages in which the file is written.
Keep in mind, that in UNIX world, it is not the extension of the file that determines what the file is (see "How to execute a shell script" below).
If you come from the world of DOS/Windows, you will be familiar with files that have .bat or .cmd extensions (batch files). They are not similar in content, but are akin in design.
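
To make the first-line rule concrete, here is a small Python sketch that reads that line from a file and reports which interpreter it names. read_shebang is an illustrative helper, not a standard function.

```python
def read_shebang(path):
    """Return the interpreter command from a script's first line, or None."""
    with open(path, "rb") as f:
        first = f.readline()
    if not first.startswith(b"#!"):
        return None  # no #! line: the calling shell decides how to run it
    return first[2:].strip().decode("utf-8", errors="replace")
```

For the downloaded script, read_shebang("daymet-nc-retrieval.sh") should give "/bin/bash", whatever the file extension happens to be.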
How to execute a shell script
Unlike some unsafe operating systems, *nix does not rely exclusively on extensions to determine what to do with a file. Permissions are also used. This means that if you attempt to run the shell script after downloading it, it will be the same as trying to "run" any text file. The ".sh" extension is there only for your convenience to recognize that file.
You will need to make the file executable. Let's assume that you have downloaded your file as file.sh, you can then run in your terminal:
chmod +x file.sh
chmod is a command for changing file's permissions, +x sets execute permissions (in this case for everybody) and finally you have your file name.
You can also do it in your GUI. Most of the time you can right click on the file and select properties; in XUbuntu the permissions options look like this:
If you do not wish to change the permissions, you can also force the shell to run the command. In the terminal you can run:
bash file.sh
The shell should be the same as in the first line of your script.
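
For completeness, the same two steps can be done from Python. This is a sketch with made-up helper names: make_executable adds the execute bits much like chmod +x does, and run_script then runs the file by its path and returns what it printed.

```python
import os
import stat
import subprocess


def make_executable(path):
    """Add execute permission for user, group and others (similar to chmod +x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def run_script(path):
    """Run an executable script by path and return its standard output."""
    completed = subprocess.run(
        [os.path.abspath(path)], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

Without the execute bit, run_script fails with PermissionError, which is the Python view of the same rule described above.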
How safe is it?
You may find it weird that you must perform another task manually in order to execute a file. But this is partially because of a strong need for security.
Basically when you download and run a bash script, it is the same thing as somebody telling you "run all these commands in sequence on your computer, I promise that the results will be good and safe". Ask yourself if you trust the party that has supplied this file, ask yourself if you are sure that you have downloaded the file from the same place as you thought, maybe even have a glance inside to see if something looks out of place (although that requires that you know something about *nix commands and bash programming).
Unfortunately apart from the warning above I cannot give a step-by-step description of what you should do to prevent evil things from happening with your computer; so just keep in mind that any time you get and run an executable file from someone you're actually saying, "Sure, you can use my computer to do something".

If you open your second link in a browser you'll see the source code:
#!/bin/bash
# Script to download individual .nc files from the ORNL
# Daymet server at: http://daymet.ornl.gov
[...]
# For ranges use {start..end}
# for individual values, use: 1 2 3 4
for year in {2002..2003}
do
for tile in {1159..1160}
do wget --limit-rate=3m http://daymet.ornl.gov/thredds/fileServer/allcf/${year}/${tile}_${year}/vp.nc -O ${tile}_${year}_vp.nc
# An example using curl instead of wget
#do curl --limit-rate 3M -o ${tile}_${year}_vp.nc http://daymet.ornl.gov/thredds/fileServer/allcf/${year}/${tile}_${year}/vp.nc
done
done
So it's a bash script. Got Linux?
In any case, the script is nothing but a series of HTTP retrievals. Both wget and curl are available for most operating systems and almost all languages have HTTP libraries, so it's fairly trivial to rewrite in any other technology. There are also some Windows ports of bash itself (git includes one). Last but not least, Windows 10 now has native support for Linux binaries.
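
As an illustration of such a rewrite, here is a rough Python version of the two loops using only the standard library. The URL pattern is copied from the script above and may no longer be valid on the server; the function names and the variable parameter are my own additions, and there is no rate limiting like wget's --limit-rate.

```python
from urllib.request import urlretrieve

BASE = "http://daymet.ornl.gov/thredds/fileServer/allcf"


def tile_downloads(years, tiles, variable="vp"):
    """Yield (url, filename) pairs in the same order as the shell loops."""
    for year in years:
        for tile in tiles:
            url = f"{BASE}/{year}/{tile}_{year}/{variable}.nc"
            yield url, f"{tile}_{year}_{variable}.nc"


def download_all(years, tiles):
    """Fetch every tile into the current directory, one after another."""
    for url, filename in tile_downloads(years, tiles):
        urlretrieve(url, filename)


if __name__ == "__main__":
    # Same ranges as {2002..2003} and {1159..1160}; only prints, no download.
    for url, filename in tile_downloads(range(2002, 2004), range(1159, 1161)):
        print(url, "->", filename)
```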

sh files are Unix (Linux) shell script files; they are the equivalent of (but much more powerful than) bat files on Windows.
So you need to run it from a Linux console, by typing its name much as you would with bat files on Windows.

Typically a .sh file is a shell script which you can execute in a terminal. Specifically, the script you mentioned is a bash script, which you can see if you open the file and look in the first line of the file, which is called the shebang or magic line.

I know this is an old question and I probably won't help, but many Linux distributions (e.g., Ubuntu) have a "Live CD/USB" function, so if you really need to run this script, you could try booting your computer into Linux. Just write an .iso to a flash drive (here's how http://goo.gl/U1wLYA), start your computer with the drive plugged in, and press the key that opens the boot menu (usually one of the F keys). If you choose "...USB...", you will boot into the OS you just put on the drive.

How do I run .sh scripts?
Give execute permission to your script:
chmod +x /path/to/yourscript.sh
And to run your script:
/path/to/yourscript.sh
Since . refers to the current directory: if yourscript.sh is in the current directory, you can simplify this to:
./yourscript.sh
or with GUI
https://askubuntu.com/questions/38661/how-do-i-run-sh-scripts/38666#38666
https://www.cyberciti.biz/faq/run-execute-sh-shell-script/

open the location in terminal then type these commands
1. chmod +x filename.sh
2. ./filename.sh
that's it

Webapp update shell script

I feel silly asking this...
I am not an expert on shell scripting, but I am finally in enough of a sysadmin role that I want to do this correctly.
I have a production server that hosts a webapp. Here is my routine.
1 - ssh to server
2 - cd django_src/django_apps/team_proj
3 - svn update
4 - sudo /etc/init.d/apache2 restart
5 - logout
I want to create a shell script for steps 2,3,4.
I can do this, but it will be a very plain and simple bash script simply containing the actual commands I type at the command line.
My question: What is the best way to script this kind of repetitive procedure in bash (Linux, Ubuntu) for a remote server?
Thanks!

The best way is simply as you suggest. Some things you should do for your script would be:
put set -e at the top of the script (after the shebang). This will cause your script to stop if any of the commands fail. So if it cannot cd to the directory, it will not run svn update or restart apache. You can do this programmatically by putting || exit 1 after each command, but if that's all you're doing, you may as well use set -e.
Use full paths in your script. Do not assume the directory that the script is run from. In this specific case, the cd command has a relative path. Use a full (absolute) path, or use an environment variable like $HOME.
You may want to set up sudo so that it can run the command without asking for a password. This makes your script non-interactive which means it can be run in the background and from cron jobs and such.
As time goes by, you may add features and take command line arguments to parameterise the script. But don't bother doing this up front. Just evolve your scripts as you need.

There is nothing wrong with a simple bash script simply containing the actual commands you type at the command line. Don't make it more complicated than necessary.

I'd setup a cron job doing that automatically.

Since you're using python, check out fabric - you can use it to automate these kind of tasks. First install fabric:
$ sudo easy_install fabric
then write your fabric script:
from __future__ import with_statement
from fabric.api import *

def svnupdate():
    with cd('django_src/django_apps/team_proj'):
        run('svn update')
    sudo('/etc/init.d/apache2 restart')
Save as fabfile.py, then run using the fab command:
$ fab -H hostname svnupdate
Tell me that's not cool! :-)

You can do this with the shell (bash, ksh, zsh + ssh + tools), or with a programming language such as Python, Perl, Ruby, PHP or Java: basically any language that supports the SSH protocol and operating system functions. The "best" one is the one you are most comfortable with and know best. If you are doing sysadmin work, the shell is the closest tool to hand. After you have written your script, you can use crontab (cron) or the at command to schedule your task. Check their man pages for more information.

You can easily do the above using bash/Bourne etc.
However, I would take the time and effort to learn Perl (or some similarly powerful scripting language). Why?
the language constructs are much more powerful
there are no end of libraries to interface to the systems/features you want to script
because of the library support, you won't have to spawn off different commands to achieve what you want (possibly valuable on a loaded system)
you can decompose frequently-used scripts into your own libraries for later use
I choose Perl particularly because it's been designed (perhaps designed is too strong a word for Perl) for these sorts of tasks. However, you may want to check out Ruby/Python or other suggestions from SO contributors.

For the basic steps look at camh's answer. If you plan to run the script via cron, then implement some simple logging, e.g. by appending start time of each command with exit code to a textfile which you can later analyze for failures of the script.
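
A minimal sketch of that kind of logging, written in Python for brevity; the name run_logged and the log line format are just assumptions for illustration. Each step's start time and exit code is appended to a text file, and the run stops at the first failure, similar to set -e.

```python
import subprocess
import sys
from datetime import datetime


def run_logged(steps, log_path):
    """Run steps in order, logging the start time and exit code of each.

    Stops at the first non-zero exit code and returns True only if every
    step succeeded.
    """
    with open(log_path, "a") as log:
        for argv in steps:
            started = datetime.now().isoformat(timespec="seconds")
            code = subprocess.run(argv).returncode
            log.write(f"{started}\texit={code}\t{' '.join(argv)}\n")
            log.flush()
            if code != 0:
                return False
    return True


if __name__ == "__main__":
    ok = run_logged(
        [[sys.executable, "-c", "print('pretend this is svn update')"]],
        "update.log",
    )
    print("all steps succeeded" if ok else "a step failed, see update.log")
```

Searching the log for lines without exit=0 then shows when a cron run went wrong and at which command.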

Expect -- scripting interactive applications
Expect is a tool for automating interactive applications such as telnet, ftp, passwd, fsck, rlogin, tip, etc.... Expect can make easy all sorts of tasks that are prohibitively difficult with anything else. You will find that Expect is an absolutely invaluable tool - using it, you will be able to automate tasks that you've never even thought of before - and you'll be able to do this automation quickly and easily.
http://expect.nist.gov
bonus: Your tax dollars at work!

I would probably do something like this...
project_update.sh
#!/bin/bash
#
# $1 - user@host
# $2 - project directory
[[ -z $1 || -z $2 ]] && { echo "usage: $(basename "$0") user@host project_dir"; exit 1; }
declare host=$1 proj_dir=$2
ssh "$host" "cd $proj_dir && svn update && sudo /etc/init.d/apache2 restart" && echo "Success"

Just to add another tip - you should not give users access to some application in an unknown state. svn up might break during the update, users might see a page that's half-new half-old, etc. If you're deploying the whole application at once, I'd suggest doing svn export instead to a new directory and then either mv current old ; mv new current, or even keeping current as a link to the directory you're using now. Still not perfect and not blocking every possible race condition, but it definitely takes less time than svn up on the live copy.
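
The link variant can be made close to atomic by creating the new link under a temporary name and renaming it over the old one. A sketch in Python, assuming a POSIX filesystem and that current is either absent or already a symlink (not a real directory); switch_current is a hypothetical helper name.

```python
import os


def switch_current(link_path, new_target):
    """Point the symlink link_path at new_target in one atomic rename."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)          # clean up after an interrupted earlier run
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)  # rename over the old link in one step
```

After svn export into a fresh directory, a single switch_current call on the current link followed by the apache restart replaces the two mv commands, and a request never sees a half-updated tree through the link.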
