Piped un-tar: kill the pipe when finished extracting a given file from the tarball - linux

There is a big tarball that I am downloading using curl. I am just interested in one file within that tarball. So currently I am piping the output of curl to tar.
$ curl -S http://url/of/big/tarball.tar.gz | tar -xv path/of/one/file
Although it works fine this way, it will still download the humongous tarball completely even when the required file has already been un-tarred. Is there a way to interrupt it automatically when tar has finished extracting the required file?
Edit: For anyone searching the web with the same question: I ended up creating a small bash script.
trap 'kill $(jobs -p)' EXIT
curl -S "${URL}" | tar -C "${OUTPUT_DIR}" -xv "${FILES[@]}" 2>&1 | head -"${FILES_CNT}" > "${CTRL_FILE}" &
# Wait for the required files to show up in the control file
until [[ -s "${CTRL_FILE}" && $(wc -l < "${CTRL_FILE}") -ge "${FILES_CNT}" ]]; do
    sleep 10s
done
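For the record, a shorter route that avoids the control-file polling might be GNU tar's --occurrence option; this is only a sketch and assumes GNU tar. With --occurrence, tar can stop reading the stream once the listed member has been extracted, and curl then exits on the resulting broken pipe (typically exit code 23) instead of downloading the rest of the tarball:
# Sketch, GNU tar assumed: tar stops after the first occurrence of the
# requested member; curl then gets EPIPE on its next write and terminates early.
curl -sS "http://url/of/big/tarball.tar.gz" | tar -zxv --occurrence=1 -f - "path/of/one/file"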


Related

How do I rerun something until it succeeds, in the background, with a pipe on the Linux command line?

Story: I'm following this guide to set up my geth node:
https://github.com/enthusiastics/bsc-archive-snapshot/blob/master/build_archive_node.sh
However, for this code chunk
# Query S3 for all archives and download them in parallel to a new zfs dataset.
while IFS= read -r FILE_NAME; do
    ZFS_NAME=$(echo "$FILE_NAME" | cut -d'.' -f1)
    ARCHIVE_NAMES+=("$ZFS_NAME")
    zfs create -o "mountpoint=/$ZFS_NAME" "tank/$ZFS_NAME"
    bash -c "cd /$ZFS_NAME && aws s3 cp --request-payer=requester '$S3_BUCKET_PATH/bsc/$FILE_NAME' - | /zstd/zstd --long=30 -d | tar -xf -" &
done <<<"$(aws s3 ls --request-payer=requester "$S3_BUCKET_PATH/bsc/" | cut -d' ' -f4)"
To be exact, for the following line:
bash -c "cd /$ZFS_NAME && aws s3 cp --request-payer=requester '$S3_BUCKET_PATH/bsc/$FILE_NAME' - | /zstd/zstd --long=30 -d | tar -xf -" &
some instances would end up with a broken pipe, and an automatic restart would help. So I want to rewrite the code so that it automatically retries until the above line succeeds. I researched a bit and found the until ... do ... done construct. Example:
until passwd ; do echo "Try again" ; done;
However, I could not succeed in incorporating the above idea into the GitHub code. I tried to rewrite it as:
bash -c "cd /$ZFS_NAME && until aws s3 cp --request-payer=requester '$S3_BUCKET_PATH/bsc/$FILE_NAME' - | /zstd/zstd --long=30 -d | tar -xf - ; do echo "Try again" ; done;" &
But it does not work... I tested it on a minimal example:
until passwd | echo "randompipe" ; do echo "Try again" ; done;
but it didn't rerun passwd until it succeeded.
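The underlying issue in that minimal test is that the exit status of a plain pipeline is the status of its last command, so the until loop only ever sees echo's (always successful) status and exits immediately. Below is a sketch of one way around it, assuming bash and the variables defined by the guide's surrounding loop ($ZFS_NAME, $S3_BUCKET_PATH, $FILE_NAME); the sleep interval is arbitrary:
# With pipefail set, the pipeline fails if any stage fails (a broken pipe
# included), so the until loop keeps retrying until the whole
# download | decompress | extract chain succeeds.
bash -c "set -o pipefail
cd /$ZFS_NAME || exit 1
until aws s3 cp --request-payer=requester '$S3_BUCKET_PATH/bsc/$FILE_NAME' - | /zstd/zstd --long=30 -d | tar -xf -; do
    echo 'Try again'
    sleep 5
done" &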

Script will not download files when called by cron or udev rules

So I'm trying to make a script that will download my podcasts upon detecting my smart watch connecting, and transfer them to it. I've configured a udev rule to detect when the watch is connected; it executes /bin/watch_transfer.sh, the code for which is:
#!/usr/bin/env sh
echo "Watch connected at $(date)" >>/tmp/scripts.log
# Download new podcasts
cd /home/pi/Scripts/
./bashpodder.shell >>/tmp/pscripts.log
echo "Upodder should've run by now">>/tmp/scripts.log
# Transfer podcasts
for file in /home/pi/Downloads/podcasts/*
do
    /usr/bin/mtp-sendfile "$file" /Podcasts
    echo "Processing $file" >>/tmp/scripts.log
done
echo "Sent all files" >>/tmp/scripts.log
I know that the file runs when the watch is connected, because /tmp/scripts.log is created and updated, and bashpodder.shell creates the podcast.m3u file, so the bashpodder script is running, but it doesn't download any files to ~/Downloads/podcasts. Bashpodder is a simple podcast downloader (I was using upodder but switched because it didn't seem to work) and mtp-tools is a way to transfer files through MTP. The bashpodder.shell script is below:
# By Linc 10/1/2004
# Find the latest script at http://lincgeek.org/bashpodder
# Revision 1.21 12/04/2008 - Many Contributers!
# If you use this and have made improvements or have comments
# drop me an email at linc dot fessenden at gmail dot com
# and post your changes to the forum at http://lincgeek.org/lincware
# I'd appreciate it!
# Make script crontab friendly:
cd $(dirname $0)
# datadir is the directory you want podcasts saved to:
datadir=/home/pi/Downloads/podcasts
# create datadir if necessary:
mkdir -p $datadir
# Delete any temp file:
rm -f temp.log
# Read the bp.conf file and wget any url not already in the podcast.log file:
while read podcast
do
    file=$(xsltproc parse_enclosure.xsl $podcast 2> /dev/null || wget -q $podcast -O - | tr '\r' '\n' | tr \' \" | sed -n 's/.*url="\([^"]*\)".*/\1/p')
    for url in $file
    do
        echo $url >> temp.log
        if ! grep "$url" podcast.log > /dev/null
        then
            wget -t 10 -U BashPodder -c -q -O $datadir/$(echo "$url" | awk -F'/' {'print $NF'} | awk -F'=' {'print $NF'} | awk -F'?' {'print $1'}) "$url"
        fi
    done
done < bp.conf
# Move dynamically created log file to permanent log file:
cat podcast.log >> temp.log
sort temp.log | uniq > podcast.log
rm temp.log
# Create an m3u playlist:
ls $datadir | grep -v m3u > $datadir/podcast.m3u
I think it might be something to do with permissions, as when I run ./watch_transfer.sh from the terminal it runs perfectly. Thanks in advance for your help.
edit:
After connecting my watch:
Output of $ cat /tmp/scripts.log:
Watch connected at Thu Jul 16 22:25:47 BST 2020
Upodder should've run by now
Processing /home/pi/Downloads/podcasts/podcast.m3u
Sent all files
$ cat /tmp/pscripts.log doesn't output anything, but /tmp/pscripts.log does exist.
Output of $ cat ~/Scripts/temp.log:
http://rasterweb.net/raster/audio/rwaudio20060108.mp3
http://rasterweb.net/raster/audio/rwaudio20051020.mp3
http://rasterweb.net/raster/audio/rwaudio20051017.mp3
http://rasterweb.net/raster/audio/rwaudio20050807.mp3
http://rasterweb.net/raster/audio/rwaudio20050719.mp3
http://rasterweb.net/raster/audio/rwaudio20050615.mp3
http://rasterweb.net/raster/audio/rwaudio20050525.mp3
http://rasterweb.net/raster/audio/rwaudio20050323.mp3
This seems to suggest that bashpodder is running through the urls but not actually downloading them?
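One way to narrow this down might be a purely diagnostic tweak of the wget line in bashpodder.shell: drop -q and log wget's exit status, so the udev-triggered run leaves a trace of what wget actually did. A sketch, where the log path /tmp/bashpodder.err is arbitrary and $url/$datadir come from the surrounding loop:
# Diagnostic sketch: same download as the inner loop above, but with the
# output name precomputed, -q removed, and the exit status recorded.
target="$datadir/$(echo "$url" | awk -F'/' '{print $NF}' | awk -F'=' '{print $NF}' | awk -F'?' '{print $1}')"
wget -t 10 -U BashPodder -c -O "$target" "$url" >>/tmp/bashpodder.err 2>&1
echo "wget exited with status $? for $url" >>/tmp/bashpodder.err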

Linux/sh: How to list files one by one, compress each (by p7zip without save file on disk) and upload to ftp server (by curl/ncftp)?

Linux/sh: How do I list all files one by one in a specific folder,
compress each (with p7zip, without saving the file to disk), and
upload them to an FTP server (with curl/ncftp), keeping the same folder structure?
The script below works perfectly, but I don't want to save a .7z file to disk each time, because I always have to delete them all after uploading.
I would prefer to pipe 7-Zip's stdout straight to curl; how do I do that?
#!/bin/sh
FOLDER="/volume3/backup_3/kopia_nas/tmp"
BACKUP_DIR="/volume3/backup_3/kopia_nas/tmp2"
FTP_HOST=""
FTP_USER=""
FTP_PASS=""
FTP_PORT="21"
PASSWORD="abc123"
FTP_FOLDER="/backup2"
#####################################################################
echo "[$(date +'%d-%m-%Y %H:%M:%S')] starting..."
echo ""
/usr/bin/find "${FOLDER}" -type f | while read line; do
    # echo "$line"          # path+file
    # echo "${line##*/}"    # file
    # echo "${line%/*}"     # path

    /usr/bin/p7zip/7za a "${BACKUP_DIR}${line}.7z" "${line}" -t7z -ms=off -m0=Copy -mhe -mmt -mx0 -p"${PASSWORD}"
    curl -s --disable-epsv -v -T "${BACKUP_DIR}${line}.7z" -u "${FTP_USER}:${FTP_PASS}" "ftp://${FTP_HOST}/${FTP_FOLDER}${line%/*}/" --ftp-create-dirs;
    # -S  show errors
    # -s  silent mode
    # -an no file name
    # -v  verbose
    #/usr/bin/ncftp/ncftpput -m -u -c "${FTP_USER}" -p "${FTP_PASS}" -P "${FTP_PORT}" "${FTP_HOST}" "${FTP_FOLDER}${line%/*}/" "${line##*/}.7z"
    # if [ $? -ne 0 ]; then echo "[$(date +'%d-%m-%Y %H:%M:%S')] Upload failed"; fi
done
#rm -rf "${BACKUP_DIR}/" #delete temporary folder
echo ""
echo "[$(date +'%d-%m-%Y %H:%M:%S')] completed..."
exit 0
I tried this, but it doesn't work for me...
/usr/bin/p7zip/7za a -an -t7z -ms=off -m0=Copy -mhe -mmt -mx0 -so -p"${PASSWORD}" | curl -S --disable-epsv -v -T - -u "${FTP_USER}:${FTP_PASS}" "ftp://${FTP_HOST}/${FTP_FOLDER}${line}/" --ftp-create-dirs;
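Independent of how the 7za half behaves, one thing to note about the curl half: with -T - (reading from stdin) curl has no local file name to append to a directory URL, so the remote file name has to be spelled out in the URL itself. A sketch of that half only; archive_to_stdout is a hypothetical placeholder for whatever stage can write the archive to stdout:
# Sketch: upload stdin to an explicit remote path, mirroring the
# ${FTP_FOLDER}${line}.7z naming the on-disk version would have produced.
archive_to_stdout "${line}" | curl -s --disable-epsv -T - -u "${FTP_USER}:${FTP_PASS}" "ftp://${FTP_HOST}/${FTP_FOLDER}${line}.7z" --ftp-create-dirs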

Update bash script, file check, how?

#!/bin/sh
LOCAL=/var/local
TMP=/var/tmp
URL=http://um10.eset.com/eset_upd
USER=""
PASSWD=""
WGET="wget --user=$USER --password=$PASSWD -t 15 -T 15 -N -nH -nd -q"
UPDATEFILE="update.ver"
cd $LOCAL
CMD="$WGET $URL/$UPDATEFILE"
eval "$CMD" || exit 1;
if [ -n "`file $UPDATEFILE|grep -i rar`" ]; then
    (
        cd $TMP
        rm -f $TMP/$UPDATEFILE
        unrar x $LOCAL/$UPDATEFILE ./
    )
    UPDATEFILE=$TMP/$UPDATEFILE
    URL=`echo $URL|sed -e s:/eset_upd::`
fi
TMPFILE=$TMP/nod32tmpfile
grep file=/ $UPDATEFILE|tr -d \\r > $TMPFILE
FILELIST=`cut -c 6- $TMPFILE`
rm -f $TMPFILE
echo "Downloading updates..."
for FILE in $FILELIST; do
    CMD="$WGET \"$URL$FILE\""
    eval "$CMD"
done
cp $UPDATEFILE $LOCAL/update.ver
perl -i -pe 's/\/download\/\S+\/(\S+\.nup)/\1/g' $LOCAL/update.ver
echo "Done."
So I have this script to download definitions for my antivirus. The only problem is that it downloads all the files every time I run the script. Is it possible to implement some sort of file checking? Let's say, for example,
"if that file is present and has the same file size, skip it"
The -nc argument to wget will not re-fetch files that already exist. It is, however, not compatible with the -N switch. So you'll have to change your WGET line to:
WGET="wget --user=$USER --password=$PASSWD -t 15 -T 15 -nH -nd -q -nc"

How to extract archive from this script (using tar)

I have absolutely no idea how to unpack the created archive, so I'll give you the complete script.
A Debian-based distribution named Univention uses this to back up several files into a tar archive.
The real archive is packed inside a function. The main part where they create the actual tar file is:
cat "$TMPDIR/freeinfo.txt" >> "$TMPDIR/Installinfo.txt" 2>/dev/null
echo >$TMPDIR/endtag.txt
echo "%%%%OXBACKUP_${DATE}_HEADER_ENDTAG" >> "$TMPDIR/endtag.txt"
BACKUPINFO="$BACKUPINFO endtag.txt"
cat 2>/dev/null << EOF > "$TMPDIR/Installinfo.sh"
BACKUPHOSTNAME="$hostname"
BACKUPDOMAINNAME="$domainname"
BACKUPBASEDN="$ldap_base"
BACKUPTIMEZONE="$(cat /etc/timezone)"
BACKUPLANG="$(echo $locale_default)"
BACKUPSAMBADOM="$windows_domain"
BACKUPSAMBAINSTALLED="$SAMBAINSTALLED"
BACKUPOXINTEGRATIONVERSION="$INTEGRATIONVERSION"
BACKUPSECLEVEL="$(univention-config-registry get version/security-patchlevel)"
BACKUPVERSION=2
SECRETFILES="$SECRETFILES"
OTHERFILES="$OTHERFILES"
OXCONFIG="$OXCONFIG"
CRONTABS="$CRONTABS"
CERTFILES="$CERTFILES"
EOF
pstatus=()
#
# the actual backup to stdout
#
sync ; sync ; sync
RETVAL=$(
(tar cO $BACKUPINFO 2>/dev/null
tar cO $SECRETFILES 2>/dev/null
tar cO $OTHERFILES 2>/dev/null
tar cO $OXCONFIG 2>/dev/null
tar cO $CRONTABS 2>/dev/null
tar cO $CERTFILES 2>/dev/null
[ -f $EXTRAFILES ] && tar --no-recursion -T $EXTRAFILES -cO 2>/dev/null
tar --no-recursion --null -T dirlist_mailandfilestore -cO 2>/dev/null
tar --null -T filelist_mailandfilestore -cO 2>/dev/null
tar --no-recursion --null -T dirlist_shares -cO 2>/dev/null
tar --null -T filelist_shares -cO 2>/dev/null
) |
#help us out with smbclient, perl, scp until we get a working curl
case "$BACKUPPROTOCOL" in
##stripped protocol specific stuff ... (*) is the way to go!
(*) dd 2>>${LOGFILE}_${DATE} > ${BACKUPPATH:-$DEFAULTBACKUPPATH}/backup_$DATE && echo "201"
chmod 640 "${BACKUPPATH}/backup_$DATE" >/dev/null 2>&1
chown root:www-data "${BACKUPPATH}/backup_$DATE" >/dev/null 2>&1
if [[ x"$BACKUPPATH" != x && "$BACKUPPATH" != "$DEFAULTBACKUPPATH" ]] ; then
# temporary permissions fix
ln -sf "${BACKUPPATH}/backup_$DATE" "$DEFAULTBACKUPPATH/"
fi
;;
esac
)
The archive is 54 GB on the system, and tar xvf extracts only the first level of the archive. Sorry, it is hard to explain: all in all I only get 40 MB out of the 54 GB, and all the directories that should be in the archive are not extracted.
The use of
RETVAL=$( (tar ...
tar ... ) | dd > foo )
is also totally unknown to me. What does this script do?
I think I found a solution myself (I updated the script a little bit).
The script generates a tag which marks the end of the first archive.
I used
grep -A1 -a -b "HEADER_ENDTAG" backup.tar
The value was 41247795, so:
dd skip=41247795 if=../../backup of=test
Looks like I can now extract the "real" archive. Is there another way to automatically jump to this byte offset, i.e. without running grep manually?
Your script appears to concatenate several tar files together into a single large file.
To extract a single section, I use a shell function / script like this:
File tarsection:
#!/bin/sh
tar_section() {
    local x=1
    while [ $x -lt $1 ]; do
        tar t > /dev/null || echo "Error in section $x" >&2
        x=$(( x + 1 ))
    done
    shift
    tar f - "$@"
}
tarfile="$1"
shift
tar_section "$@" < "$tarfile"
Then you can do (for example, for part 3 of the big file):
tarsection YOUR_54GB_BACKUP_FILE 3 -t | less
cd ...extractlocation
tarsection YOUR_54GB_BACKUP_FILE 3 -x
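If the goal is to get everything out rather than a single numbered section, GNU tar can also read the whole concatenation in one pass: the --ignore-zeros (-i) option tells it not to treat the end-of-archive zero blocks between sections as the end of the file. A minimal sketch, keeping the placeholder names from above (/your/extract/location is arbitrary):
# List all members across every concatenated section, then extract them.
tar --ignore-zeros -tvf YOUR_54GB_BACKUP_FILE
tar --ignore-zeros -xvf YOUR_54GB_BACKUP_FILE -C /your/extract/location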