subprocess.check_output(cmd) doesn't execute for tar -C option - python-3.x

I am trying to run the following command using subprocess.check_output(). The command (shown below) works fine when run directly in bash:
tar -C /tmp/models/ -czvf model.tar.gz .
It also runs fine if I don't use the "C" option when run via subprocess.
cmd = ['tar', 'czf', "/tmp/model.tar.gz", "/tmp/models/"]
output = subprocess.check_output(cmd).decode("utf-8").strip() # Works
But when I try to use the -C option with the above tar command, I get an exception which says tar: Must specify one of -c, -r, -t, -u, -x.
cmd = ['tar', 'C', '/tmp/models', 'cf', 'model.tar.gz', '.'] # fails. Other variations of this fail too.
How do I run the above tar command using subprocess correctly? Thanks.
I am using Python 3.8.

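It looks like the dashes (-) have to be specified. Without a leading dash, tar treats only its first argument as a bundle of old-style option letters, so the bundle here is just C, the next word is consumed as its directory, the following word is taken as a file name, and no operation mode is ever seen: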
$ tar C whatever czvf thing.tar .
tar: Must specify one of -c, -r, -t, -u, -x
$ tar C whatever -czvf thing.tar .
tar: could not chdir to 'whatever'
So the command should look like this:
cmd = ['tar', '-C', '/tmp/models', '-cf', 'model.tar.gz', '.']
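The argument order can be checked in a shell before moving it into Python; each word then maps one-to-one onto an element of the list passed to subprocess.check_output(). This is only a sketch: the scratch directory and the file model.bin are made up for illustration and stand in for /tmp/models.

```shell
#!/usr/bin/env bash
# Hypothetical scratch layout standing in for /tmp/models and model.tar.gz.
work=$(mktemp -d)
mkdir "$work/models"
echo "weights" > "$work/models/model.bin"

# Every option carries its own dash, so -C is parsed as "change directory"
# and -czf supplies the operation mode (c) that tar was asking for.
tar -C "$work/models" -czf "$work/model.tar.gz" .

# List the archive: member names are relative to the -C directory.
listing=$(tar -tzf "$work/model.tar.gz")
printf '%s\n' "$listing"

rm -rf "$work"
```

Because of -C, the archive holds ./model.bin rather than the full path to the models directory, which is the same result the bash command in the question produces.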

Related

Bash repeating my directory and can not call the file

I have to use a .sh script to unpack and prepare some databases. The code is the following:
#
# Downloads and unzips all required data for AlphaFold.
#
# Usage: bash download_all_data.sh /path/to/download/directory
set -e
DOWNLOAD_DIR="$1"
for f in $(ls ${DOWNLOAD_DIR}/*.tar.gz)
do
tar --extract --verbose --file="${DOWNLOAD_DIR}/${f}" /
--directory="${DOWNLOAD_DIR}/mmseqs_dbs"
rm "${f}"
BASENAME="$(basename {f%%.*})"
DB_NAME="${BASENAME}_db"
OLD_PWD=$(pwd)
cd "${DOWNLOAD_DIR}/mmseqs_dbs"
mmseqs tar2exprofiledb "${BASENAME}" "${DB_NAME}"
mmseqs createindex "${DB_NAME}" "${DOWNLOAD_DIR}/tmp/"
cd "${OLD_PWD}"
done
When I run the code, I get this error:
(openfold_venv) watson@watson:~/pedro/openfold$ sudo bash scripts/prep_mmseqs_dbs.sh data/
tar: data//data//colabfold_envdb_202108.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
I don't understand why the code repeats my DOWNLOAD_DIR. The correct path should be:
data/colabfold_envdb_202108.tar.gz
and not
data//data//colabfold_envdb_202108.tar.gz
Could anyone help me?
New code:
set -e
DOWNLOAD_DIR="$1"
for f in ${DOWNLOAD_DIR}/*.tar.gz;
do
tar --extract --verbose --file="$f" /
--directory="${DOWNLOAD_DIR}/mmseqs_dbs"
rm "${f}"
BASENAME="$(basename {f%%.*})"
DB_NAME="${BASENAME}_db"
OLD_PWD=$(pwd)
cd "${DOWNLOAD_DIR}/mmseqs_dbs"
mmseqs tar2exprofiledb "${BASENAME}" "${DB_NAME}"
mmseqs createindex "${DB_NAME}" "${DOWNLOAD_DIR}/tmp/"
cd "${OLD_PWD}"
done
To answer your first question: why is it repeating? Because you are repeating it in your code:
for f in ${DOWNLOAD_DIR}/*.tar.gz;
do
tar --extract --verbose --file="${DOWNLOAD_DIR}/$f"
If f is downloads/file.tar.gz, then ${DOWNLOAD_DIR}/${f} will resolve to downloads/downloads/file.tar.gz, because the glob already returns each match with the directory prefix attached.
As to your second question: the escape character is the backslash \, not the forward slash. Your multiline command should look like this:
tar --extract --verbose --file="${f}" \
--directory="${DOWNLOAD_DIR}/mmseqs_dbs"
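For what it's worth, the posted script also writes {f%%.*} without the leading $, so BASENAME would not come out as intended even once tar succeeds. Below is a small dry-run sketch of the loop's bookkeeping only; db_basename and list_archives are names I made up, and nothing here calls tar or mmseqs.

```shell
#!/usr/bin/env bash
# db_basename is a hypothetical helper, not part of the original script.
# It strips the directory first and the extensions second, so a dot in a
# directory name (for example ./data) cannot truncate the result.
db_basename() {
    local name
    name=$(basename "$1")        # data/foo.tar.gz -> foo.tar.gz
    printf '%s\n' "${name%%.*}"  # foo.tar.gz      -> foo
}

# Dry run: print each matching archive path and the name derived from it.
list_archives() {
    local download_dir="${1%/}"  # drop one trailing slash, so data/ -> data
    local f
    for f in "$download_dir"/*.tar.gz; do
        [ -e "$f" ] || continue  # the pattern stays literal when nothing matches
        printf '%s -> %s\n' "$f" "$(db_basename "$f")"
    done
}
```

Running list_archives data/ in the layout from the question should print the single-prefixed path data/colabfold_envdb_202108.tar.gz next to the derived name. In the real loop, the rm "${f}" step is probably safer once the mmseqs commands have finished.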

bash: process one file after the other

I wrote a script that should take the first tar-file, execute script.sh, then the second tar-file and so on.
This is what script.sh looks like:
tarball=(`ls -a | cut -d "." -f 1`)
mkdir ./$tarball
tar -zxvf $tarball.tar -C ./$tarball
I execute script.sh with the following command:
for tarball in ./*.tar; do bash script.sh; done
but the assignment of the variable tarball only picks up the first file and processes it (after the code posted above there are some awk commands that write some output to a file).
How do I make the script move on to the second tar file after the first one has been processed, and so on?
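The for loop never passes its variable to script.sh, and $tarball on an array expands only to the first element, so every iteration processes the same file. You can do the whole loop inside your script, or put just the while loop in the script and pipe whatever list you want into it: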
ls -1 | cut -d "." -f 1 |
while read tarball
do
mkdir "$tarball"
tar -zxvf "${tarball}.tar" -C ./"$tarball"
done
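A variant that avoids parsing ls output is to let the shell glob do the listing. This is a sketch under my own assumptions: extract_all is a made-up name, and each archive is unpacked into a directory named after it inside the same directory.

```shell
#!/usr/bin/env bash
# extract_all is a hypothetical name; it unpacks every *.tar found in the
# given directory into a sibling directory named after the archive.
extract_all() {
    local dir="$1" archive name
    for archive in "$dir"/*.tar; do
        [ -e "$archive" ] || continue     # no matches: the pattern stays literal
        name=$(basename "$archive" .tar)  # ./foo.tar -> foo
        mkdir -p "$dir/$name"
        # No -z here: GNU tar detects compression itself when reading a file.
        tar -xf "$archive" -C "$dir/$name"
    done
}
```

Calling extract_all . from the directory that holds the archives would replace both script.sh and the outer for loop; the awk post-processing could go inside the loop body after the tar line.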

scp multiple files with different names from source and destination

I am trying to scp multiple files from source to destination. The catch is that each source file name differs from its destination file name.
Here is the scp command I am running:
scp /u07/retail/Bundle_de.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_de.properties
I have more than 7 such files and currently run a separate scp for each one, so I would like to combine them into a single scp that transfers all the files.
A few of the scp commands I am running:
$ scp /u07/retail/Bundle_de.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_de.properties
$ scp /u07/retail/Bundle_as.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_as.properties
$ scp /u07/retail/Bundle_pt.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_pt.properties
$ scp /u07/retail/Bundle_op.properties rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_op.properties
I am looking for a way to transfer the above 4 files with a single scp command.
Looks like a straightforward loop in any standard POSIX shell:
for i in de as pt op
do scp "/u07/retail/Bundle_$i.properties" "rgbu_fc@<fc_host>:/u01/projects/MultiSolutionBundle_$i.properties"
done
Alternatively, you could give the files new names locally (copy, link, or move), and then transfer them with a wildcard:
dir=$(mktemp -d)
for i in de as pt op
do cp "/u07/retail/Bundle_$i.properties" "$dir/MultiSolutionBundle_$i.properties"
done
scp "$dir"/* "rgbu_fc@<fc_host>:/u01/projects/"
rm -rf "$dir"
With GNU tar, ssh and bash:
tar -C /u07/retail/ -c Bundle_{de,as,pt,op}.properties | ssh user@remote_host tar -C /u01/projects/ --transform 's/.*/MultiSolution\&/' --show-transformed-names -xv
If you want to use globbing (*) with filenames:
cd /u07/retail/ && tar -c Bundle_*.properties | ssh user@remote_host tar -C /u01/projects/ --transform 's/.*/MultiSolution\&/' --show-transformed-names -xv
-C: change to directory
-c: create a new archive
Bundle_{de,as,pt,op}.properties: bash is expanding this to Bundle_de.properties Bundle_as.properties Bundle_pt.properties Bundle_op.properties before executing tar command
--transform 's/.*/MultiSolution\&/': prepend MultiSolution to filenames
--show-transformed-names: show filenames after transformation
-xv: extract files and verbosely list files processed
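The rename-on-extract step can be tried locally before involving ssh. This is a sketch that assumes GNU tar (for --transform); the two temporary directories are throwaway stand-ins for /u07/retail and /u01/projects, and the file contents are invented.

```shell
#!/usr/bin/env bash
# Local dry run of the rename-on-extract idea; needs GNU tar for --transform.
src=$(mktemp -d)   # stand-in for /u07/retail
dst=$(mktemp -d)   # stand-in for /u01/projects
for i in de as pt op; do
    echo "lang=$i" > "$src/Bundle_$i.properties"
done

# There is no remote shell in between, so & needs no backslash here: the
# sed-style expression reaches tar exactly as written inside the quotes.
# -f - names stdout and stdin explicitly instead of relying on the default.
tar -C "$src" -cf - Bundle_de.properties Bundle_as.properties \
        Bundle_pt.properties Bundle_op.properties |
    tar -C "$dst" --transform 's/.*/MultiSolution&/' -xf -

ls "$dst"
```

As far as I can tell, the backslash in the ssh version is there because the remote shell parses the command line a second time and would otherwise treat & as its background operator.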

Groups of commands in Linux shell

Suppose I have this shell script called cpdir:
(cd $1 ; tar -cf - . ) | (cd $2 ; tar -xvf - )
When I run it, the main shell should create two processes (subshells) that execute the two groups of commands concurrently. How does the shell make sure that both processes change to the appropriate directories before the first process packages the directory contents and sends them to the second process for unpacking?
Why is there no race condition? Is it a rule that the commands within each process execute in order, even though the processes themselves run in parallel?
That is, the first process runs "cd $1", then the second process runs "cd $2" (or does that happen at the same time as the first? I am not sure), then the first process packages everything and finally sends it to the second process.
There is also one small thing I don't know about tar:
tar -cf - .
I know the dot (.) is the content of current directory. However, what's the '-' in the command?
You don't need to use cd because tar has a -C option which tells it to change to a directory. So you can simply use a command such as:
tar -C $1 -cvf - . | tar -C $2 -xvf -
- means stdin/stdout. The first hyphen tells tar to write to stdout. The second one tells tar to read from stdin.
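In many builds of GNU tar, - is the default archive when -f is omitted (the default is chosen at build time and can be overridden with the TAPE environment variable), so you can often shorten your command to: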
tar -C $1 -c . | tar -C $2 -x
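If the directory names may contain spaces, the arguments should be quoted. Here is a sketch of cpdir as a shell function that keeps -f - explicit rather than relying on the built-in default; the function form is my choice, not something from the question.

```shell
#!/usr/bin/env bash
# cpdir as a function; the quotes keep directory names with spaces intact.
cpdir() {
    # -f - names stdout for the writing tar and stdin for the reading tar.
    tar -C "$1" -cf - . | tar -C "$2" -xf -
}
```

It would be called as cpdir "/path/to/source" "/path/to/target", with the target directory already existing.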
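As those groups run in independent processes, it doesn't matter which cd command runs first: each process has its own working directory, so changing it in one process does not affect the other.
Within each group the commands still run in order, and the reading tar simply blocks on the pipe until the writing tar produces data, so there is nothing to race on.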
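The independence of the working directories is easy to observe without tar. In this sketch the two temporary directories are arbitrary, and pwd and read stand in for the two tar processes.

```shell
#!/usr/bin/env bash
# Each parenthesised group runs in its own subshell with its own working
# directory; the parent shell's directory is untouched by either cd.
before=$PWD
a=$(mktemp -d)
b=$(mktemp -d)

# The right-hand group blocks in read until the left-hand group writes a line,
# whichever of the two cd commands happens to run first.
result=$( (cd "$a" && pwd -P) | (cd "$b" && read -r from && printf '%s|%s\n' "$from" "$(pwd -P)") )

after=$PWD
printf 'writer was in: %s\n' "${result%%|*}"
printf 'reader was in: %s\n' "${result##*|}"
printf 'parent stayed in: %s\n' "$after"
```

The writer and reader report different directories, and the parent reports the directory it started in.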
You are piping the commands in your case. The result you expect is not very clear to me.
By the way, my GNU tar has no "-" value for this "-f" option.
So your commands might not be portable.

Makefile can't understand comments

If I put comments (# ...) in my Makefile, make gives me an error and quits. If I remove the comments, the Makefile works fine.
Makefile:1: *** missing separator. Stop.
Make-version: 3.81
Linux: Ubuntu 9.04
The Makefile:
# Backup Makefile
#
# Create backups from various services and the system itself. This
# script is used to perform single backup tasks or a whole backup
# from the system. For more information about this file and how to
# use it, read the README file in the same directory.
BACKUP_ROOT = /srv/backup
ETC_PATH = /srv/config
SVN_PATH = /srv/svn/
TRAC_PATH = /srv/trac/sysinventory
PR10_PATH = /swsd/project/vmimages/...
PR10_MOUNT_PATH = /tmp/temp_sshfs_pr10
MYSQL_USER = "xxx"
MYSQL_PASSWORD = "xxx"
DATE = `date +%F`
help :
cat README
init-environment :
mkdir -p $(BACKUP_ROOT)
mkdir $(BACKUP_ROOT)/tmp
mkdir -p $(PR10_MOUNT_PATH)
backup : backup-mysql backup-configuration backup-svn backup-trac
upload-to-pr10 : mount-pr10
tar cf $(DATE)-backup-blizzard.tar -C $(BACKUP_ROOT) *.-backup.tar.gz
mv $(BACKUP_ROOT)/*-backup-blizzard.tar $(PR10_MOUNT_PATH)/
umount $(PR10_MOUNT_PATH)
mount-pr10 :
su xxx -d "sshfs -o allow_root xxx@xxx:$(PR10_PATH) $(PR10_MOUNT_PATH)"
fusermount -u $(PR10_MOUNT_PATH)
backup-mysql :
mysqldump --comments --user=$(MYSQL_USER) --password=$(MYSQL_PASSWORD) --all-databases --result-file=$(BACKUP_ROOT)/tmp/mysql_dump.sql
tar czf $(BACKUP_ROOT)/$(DATE)-mysql-backup.tar.gz -C
$(BACKUP_ROOT)/tmp/mysql_dump.sql
backup-configuration :
tar czf $(BACKUP_ROOT)/$(DATE)-configuration-backup.tar.gz $(ETC_PATH)/
backup-svn :
svnadmin dump $(SVN_PATH)/repository > $(BACKUP_ROOT)/tmp/svn_repository.dump
tar czf $(BACKUP_ROOT)/$(DATE)-subversion-backup.tar.gz -C $(BACKUP_ROOT)/tmp/svn_repository.dump
backup-trac :
tar czf $(BACKUP_ROOT)/$(DATE)-trac-backup.tar.gz $(TRAC_PATH)/
clean :
rm -f $(BACKUP_ROOT)/tmp/mysql_dump.sql
rm -f $(BACKUP_ROOT)/tmp/svn_repository.dump
rm -f $(BACKUP_ROOT)/*-backup.tar.gz
rm -f $(BACKUP_ROOT)/*-backup-blizzard.tar
Your Makefile works for me (with spaces replaced by tabs), so it sounds like you have a case of stray non-printing chars.
Try inspecting the output of "cat -vet Makefile". That will show where EOL, TAB and other unseen chars are.
You'll want to see something like this:
# Backup Makefile$
#$
# Create backups from various services and the system itself. This$
# script is used to perform single backup tasks or a whole backup$
# from the system. For more information about this file and how to$
# use it, read the README file in the same directory.$
$
BACKUP_ROOT = /srv/backup$
ETC_PATH = /srv/config$
SVN_PATH = /srv/svn/$
TRAC_PATH = /srv/trac/sysinventory$
PR10_PATH = /swsd/project/vmimages/...$
PR10_MOUNT_PATH = /tmp/temp_sshfs_pr10$
$
MYSQL_USER = "xxx"$
MYSQL_PASSWORD = "xxx"$
$
$
DATE = `date +%F`$
$
help :$
^Icat README$
$
$
init-environment :$
^Imkdir -p $(BACKUP_ROOT)$
^Imkdir $(BACKUP_ROOT)/tmp$
^Imkdir -p $(PR10_MOUNT_PATH)$
$
Make sure all commands are preceded by "^I".
You could also try looking for stray chars using something like:
cat -vet Makefile | grep "\^[^I]" --colour=auto
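A complementary check is to list lines that start with a space, since those are the usual suspects when recipe tabs got converted. This is a sketch: space_indented is a made-up helper name, and the throwaway makefile exists only to show the output shape.

```shell
#!/usr/bin/env bash
# space_indented is a hypothetical helper: it prints, with line numbers,
# every line of a makefile that begins with a space instead of a tab.
space_indented() {
    grep -n '^ ' "$1"
}

# Throwaway makefile: the first recipe line uses a tab, the second uses spaces.
mk=$(mktemp)
printf 'help :\n\tcat README\nclean :\n    rm -f out\n' > "$mk"
space_indented "$mk"
```

Only the space-indented recipe line is reported; the tab-indented one passes. Against the real file it would be run as space_indented Makefile.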
You may have used spaces instead of tabs for your comment.
Please post the makefile so we don't have to guess.
