I would like to introduce a cron task that will gzip files according to the following rule:
Locate files in a folder named '/log' (which can be located anywhere in the filesystem) and
gzip the files, older than 2 days, that have '.log' in the file name.
I have written the script below - which does not work - am I close? What is required to make it work? Thanks.
/usr/bin/find ./logs -mtime +2 -name "*.log*"|xargs gzip
In my crontab, I call:
/usr/sbin/logrotate -s ~/.logrotate/status ~/.logrotate/logrotate.conf
In my ~/.logrotate/logrotate.conf:
# rotate log files weekly
weekly
# keep 4 weeks worth of backlogs
rotate 4
## compression
# gzip(1)
#compresscmd /usr/bin/gzip
#compressoptions -9
#uncompresscmd /usr/bin/gunzip
#compressext .gz
# xz(1)
compresscmd /usr/bin/xz
uncompresscmd /usr/bin/xzdec
compressext .xz
/home/h3xx/.log/*.log /home/h3xx/.log/jack/*.log {
# copy and truncate original (for always-open file handles
# [read: tail -f])
copytruncate
# enable compression
compress
}
/home/h3xx/.usage/*.db {
# back up databases
copy
# enable compression
compress
}
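For reference, a daily crontab entry driving this setup could look something like the following (the 00:30 schedule is just an illustration; cron sets $HOME, so the paths match the ones above):
30 0 * * * /usr/sbin/logrotate -s $HOME/.logrotate/status $HOME/.logrotate/logrotate.conf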
The -name argument takes a glob. Your command would only match files literally named .log. Try -name "*.log".
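Putting that together, a version of your command that should work (assuming the logs really live under ./logs) would be something like:
/usr/bin/find ./logs -type f -name "*.log" -mtime +2 -exec gzip {} +
Using -exec gzip {} + instead of piping to xargs also avoids problems with spaces in file names.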
I need to recognize files with different extensions, even when multiple extensions are combined.
So if my cwd has these files:
file-1.zip
file-2.tar
file-3.tar.gz
file-4.gz
file-5.zip.tar
file-6.tar.gz
file-7.gz
I need to tell bash what to do when the extension (in this case) is:
zip
tar
zip.tar
tar.gz
gz
Because for every extension I need to do different things: if the extension is .tar (only) or .gz (only) I need to do certain things, but if the extension is .tar.gz I need to run another snippet.
Example:
If the filename has a .tar extension I need to do:
# stuff
tar xf filename.tar
# other stuff
If the filename has a .zip.tar extension I need to run more complex code (the code is not totally dependent on the extension; my only objective is to get the full extension of the filename - filename.tar.gz should return .tar.gz instead of .gz or .tar).
Also, is there any way to do this using gawk?
Use a case statement:
case "$filename" in
*.tar.gz) code for .tar.gz ;;
*.gz) code for .gz ;;
*.zip.tar) code for .zip.tar ;;
*.tar) code for .tar ;;
...
esac
Just make sure you put the combined extensions before the single extensions that they contain, because case executes the statements for the first pattern that matches.
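A minimal sketch of that idea, looping over the files in the current directory (the echo lines are just placeholders for your real snippets):
#!/bin/sh
for f in ./*; do
    case "$f" in
        *.tar.gz)  echo "handling $f as .tar.gz" ;;   # your .tar.gz snippet, e.g. tar xzf "$f"
        *.zip.tar) echo "handling $f as .zip.tar" ;;  # your more complex code
        *.tar)     echo "handling $f as .tar" ;;      # e.g. tar xf "$f"
        *.gz)      echo "handling $f as .gz" ;;       # e.g. gunzip "$f"
        *.zip)     echo "handling $f as .zip" ;;      # e.g. unzip "$f"
        *)         echo "skipping $f" ;;
    esac
done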
The file command is a good option for detecting file types; you can then build your logic around its output:
file -i test.*
test.gz: application/x-gzip; charset=binary
test.tar: application/x-tar; charset=binary
test.tar.gz: application/x-gzip; charset=binary
test.zip: application/zip; charset=binary
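A minimal sketch of that logic, assuming a file(1) that supports -b and --mime-type (the exact MIME strings vary between file versions; also note from the output above that a plain .gz and a .tar.gz both report as gzip data, so a name-based check is still needed to tell those two apart):
for f in test.*; do
    mime=$(file -b --mime-type "$f")   # e.g. application/x-tar, application/zip, application/x-gzip
    case "$mime" in
        application/x-tar)                   echo "$f: tar archive" ;;
        application/zip)                     echo "$f: zip archive" ;;
        application/gzip|application/x-gzip) echo "$f: gzip data (could be .gz or .tar.gz)" ;;
        *)                                   echo "$f: unhandled type $mime" ;;
    esac
done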
I'm on a Linux system with limited resources and BusyBox -- this version of tar does not support --append, -r. Is there a workaround that will allow me to [1] append files from directory B to an existing tar of files from directory A after [2] making the B-files appear to have come from directory A? (Later, when someone extracts the files, they should all end up in the same directory A.)
Situation: I have a list of files that I want to tar, but I must process some of these files first. The files might be used by other processes so I don't want to edit them in-place. I want to be conservative when using disk space so my script only copies those files which it needs to change (vs copying them all and then processing some and finally archiving them all with tar -- if I copied them all I might run into disk space issues).
This means the files I want to archive end up in two separate locations. But I want the resulting tar file to appear as if they were all in the same location. Near the end of my script, I end up with two text files listing the A and B files by name.
I think this is straightforward with a full-blown version of tar, but I have to work with the BusyBox version (usage below). Thanks in advance for any ideas!
Usage: tar -[cxtzjaZmvO] [-X FILE] [-f TARFILE] [-C DIR] [FILE]...
Create, extract, or list files from a tar file
Operation:
c Create
x Extract
t List
Options:
f Name of TARFILE ('-' for stdin/out)
C Change to DIR before operation
v Verbose
z (De)compress using gzip
j (De)compress using bzip2
a (De)compress using lzma
Z (De)compress using compress
O Extract to stdout
h Follow symlinks
m Don't restore mtime
exclude File to exclude
X File with names to exclude
T File with names to include
In principle, you just need to append a tar archive containing the additional files to the end of the existing tar file. It is only slightly more complicated than that.
A tar file consists of any number of repetitions of header + file. The header is always a single 512-byte block, and the file is padded to a multiple of 512 bytes, so you can think of these units as being a variable number of 512-byte blocks. Each member is independent; its header starts with the full pathname of the file. So there is no requirement that files in a directory be tarred together.
There is one complication. At the end of the tar file, there are at least two 512-byte blocks completely filled with 0s. When tar is reading a tar file, it will ignore a single zero-filled header, but the second one will cause it to stop reading the file. If it hits EOF, it will complain, so the terminating empty headers are required.
There might be more than two empty blocks, because tar actually writes in units which are a multiple of 512 bytes. GNU tar, for example, by default writes in multiples of 20 512-byte chunks, so the smallest tar file is normally 10240 bytes.
In order to append new data, you need to first truncate the existing file to eliminate the empty blocks.
I believe that if the tar file was produced by busybox, there will only be two empty blocks, but I haven't inspected the code. That would be easy; you only need to truncate the last 1024 bytes of the file before appending the additional files.
For general tar files, it is trickier. If you knew that the files themselves didn't have NUL bytes in them (i.e. they were all simple text files), you could remove empty headers until you found a block with a non-0 byte in it, which wouldn't be too difficult.
What I would do is:
1. Truncate the last 1024 bytes of the tar file.
2. Remember the current size of the tar file.
3. Append a test tar file consisting of the tar of a file with a simple short message.
4. Verify that tar tf correctly shows the test file.
5. Truncate the file back to the remembered length.
6. If tar tf found the test file's name, succeed.
7. If the last 512 bytes of the tar file are all 0s, truncate the last 512 bytes of the file, and return to step 2.
8. Otherwise, fail.
If the above procedure succeeds, you can proceed to append the tar archive with the new files.
I don't know if you have a truncate command. If not, you can use dd to copy a file over the top of an old file at a specified offset (see the seek= operand). dd will truncate the output file automatically at the end of the copy. You can also use dd to read a 512-byte block (see the skip= and count= operands).
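A minimal sketch of that truncation trick with dd (assuming BusyBox dd follows the usual POSIX behaviour of truncating the output at the seek position when conv=notrunc is not given; archive.tar and extra.tar are hypothetical names):
size=$(wc -c < archive.tar)                                 # current size in bytes
dd if=/dev/null of=archive.tar bs=1 seek=$((size - 1024))   # write nothing at that offset: the file is truncated there
cat extra.tar >> archive.tar                                # append the archive holding the B files (it brings its own end-of-archive blocks)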
The best solution is to cut off the last 1024 bytes and concatenate a new tar after them. In order to append one tar to an existing tar file, both must be uncompressed.
For files like:
$ find a b
a
a/file1
b
b/file2
You can:
$ tar -C a -czvf a.tar.gz .
$ gunzip -c a.tar.gz | { head -c -$((512*2)); tar -C b -c .; } | gzip > a+b.tar.gz
With the result:
$ tar -tzvf a+b.tar.gz
drwxr-xr-x 0/0 0 2018-04-20 16:11:00 ./
-rw-r--r-- 0/0 0 2018-04-20 16:11:00 ./file1
drwxr-xr-x 0/0 0 2018-04-20 16:11:07 ./
-rw-r--r-- 0/0 0 2018-04-20 16:11:07 ./file2
Or you can create both tars in a single command:
$ tar -C a -c . | { head -c -$((512*2)); tar -C b -c .; } | gzip > a+b.tar.gz
This only works for archives generated by BusyBox tar, though. As mentioned in the previous answer, GNU tar pads to a multiple of 20 blocks. You need to force the blocking factor to 1 (--blocking-factor=1) in order to know in advance how many blocks to cut:
$ tar --blocking-factor=1 -C a -c . | { head -c -$((512*2)); tar -C b -c .; } | gzip | tar --blocking-factor=1 -tzv
Anyway, GNU tar does have --append. The last --blocking-factor=1 is only needed if you intend to append to the resulting tar again.
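For completeness, with GNU tar the whole thing reduces to --append on an uncompressed archive (you cannot append through gzip, so compress as a final step):
tar -C a -cf ab.tar .     # create the archive from directory A
tar -C b -rf ab.tar .     # append the contents of directory B under the same ./ paths
gzip ab.tar               # produces ab.tar.gz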
Let's say I have different files in a folder that contain data for the same day, such as:
ThisFile_2012-10-01.txt
ThatFile_2012-10-01.txt
AnotherSilly_2012-10-01.txt
InnovativeFilesEH_2012-10-01.txt
How do I append them to each other in a preferred order? Would the below be exactly what I need to type in my shell script? The folder gets the same files every day, but with different dates. Old dates disappear, so every day there are just these 4 files.
InnovativeFilesEH_*.txt >> ThatFile_*.txt
ThisFile_*.txt >> ThatFile_*.txt
AnotherSilly_*.txt >> ThatFile_*.txt
Finally, a use for "cat" as intended :-):
cat InnovativeFilesEH_*.txt ThisFile_*.txt AnotherSilly_*.txt >> ThatFile_*.txt
Assumption:
You want to preserve a specific ordering in which these files are appended.
Using the example you provided:
#!/bin/sh
# First find the actual files we want to operate on
# and save them into shell variables:
final_output_file="Desired_File_Name.txt"
that_file=$(find . -name "ThatFile_*.txt")
inno_file=$(find . -name "InnovativeFilesEH_*.txt")
this_file=$(find . -name "ThisFile_*.txt")
another_silly_file=$(find . -name "AnotherSilly_*.txt")
# Now append the 4 files to Desired_File_Name.txt in the specific order:
cat $that_file > $final_output_file
cat $inno_file >> $final_output_file
cat $this_file >> $final_output_file
cat $another_silly_file >> $final_output_file
Adjust the order in which the files are appended by reordering or modifying the cat statements.
As I have a serious server performance warning when installing drupal-commons (this is an installation profile), I now want to reduce the server load.
Why? When trying to install Drupal Commons I get a message: too many files open, it says!
Well, Drupal and its modules (ab)use too many files! 50,000 files and maybe 5,000 directories is their limit, and that is all they will back up.
So my question: how can I get rid of all those silly translation files and other unnecessary subdivisions of tiny bits of info?
Background: I would expect that file_exists(), during the installation (or bootstrap cycle), is the most expensive built-in PHP function, measured as the total time spent calling the function across all invocations in a single request.
Well, now I am trying to get rid of all the overhead (especially the translation files, the so-called .po files) and the unnecessary files contained in drupal-commons 6.x-2.3, in order to get it running on my server.
I want to get rid of all those silly translation files.
How do I search for all those .po files recursively - with grep, I guess?
Note: I do not know where they are!
linux-vi17:/home/martin/web_technik/drupal/commons_3_jan_12/commons-6.x-2.3/commons-6.x-2.3 # ls
CHANGELOG.txt ._.htaccess install.php modules themes
._CHANGELOG.txt ._includes INSTALL.txt ._profiles ._update.php
COMMONS_RELEASE_NOTES.txt includes ._INSTALL.txt profiles update.php
._COMMONS_RELEASE_NOTES.txt ._index.php LICENSE.txt ._robots.txt UPGRADE.txt
COPYRIGHT.txt index.php ._LICENSE.txt robots.txt ._UPGRADE.txt
._COPYRIGHT.txt INSTALL.mysql.txt MAINTAINERS.txt ._scripts ._xmlrpc.php
._cron.php ._INSTALL.mysql.txt ._MAINTAINERS.txt scripts xmlrpc.php
cron.php INSTALL.pgsql.txt ._misc ._sites
.directory ._INSTALL.pgsql.txt misc sites
.htaccess ._install.php ._modules ._themes
linux-vi17:/home/martin/web_technik/drupal/commons_3_jan_12/commons-6.x-2.3/commons-6.x-2.3 # grep .po
Anyway, I want to remove all .po files with one bash command - is this possible?
But wait: first of all I want to find all the files and list them,
since I then know what I will erase (or remove).
Well, all language translations in Drupal are named with .po -
how do I find them, with grep?
How do I list them - and subsequently, how do I erase them!?
Update:
I did the search with
find -type f -name "*.po"
Well, I found approx. 930 files.
Afterwards I removed them all with
6.x-2.3 # find -type f -name "*.po" -exec rm -f {} \;
A final search with that code
find -type f -name "*.po"
gave no results back, so every .po file was erased!
Many, many thanks for the hints.
greetings
zero
If you want to find all files named *.po in a directory named /some/directory, you can use find:
find /some/directory -type f -name "*.po"
If you want to delete them all in a row (you do have backups, don't you?), then append an action to this command:
find /some/directory -type f -name "*.po" -exec rm -f {} \;
Replace /some/directory with the appropriate value and you should be set.
The issue with "too many open files" isn't normally that there are too many files in the filesystem, but that there is a limit on the number of files an application or user can have open at one time. This issue has been covered on the Drupal forums; for example, see this thread to solve it more permanently/nicely:
http://drupal.org/node/474152
A few more links about open files:
http://www.cyberciti.biz/tips/linux-procfs-file-descriptors.html
http://blog.thecodingmachine.com/content/solving-too-many-open-files-exception-red5-or-any-other-application
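For example, you can inspect the current per-process limit, and raise it for the current shell session, with ulimit (whether you may raise it depends on the hard limit set by the administrator):
ulimit -n        # show the current open-file limit for this shell
ulimit -n 4096   # raise it for this shell and the processes it starts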
I have a list of gzip files:
file1.gz
file2.gz
file3.gz
Is there a way to concatenate or gzip these files into one gzip file
without having to decompress them?
In practice, we will use this in a web database (CGI), where the web application will receive
a query from a user, list all the files matching that query, and present them
back to the user as one batch file.
With gzip files, you can simply concatenate the files together, like so:
cat file1.gz file2.gz file3.gz > allfiles.gz
Per the gzip RFC,
A gzip file consists of a series of "members" (compressed data sets). [...] The members simply appear one after another in the file, with no additional information before, between, or after them.
Note that this is not exactly the same as building a single gzip file of the concatenated data; among other things, all of the original filenames are preserved. However, gunzip seems to handle it as equivalent to a concatenation.
Since existing tools generally ignore the filename headers for the additional members, it's not easily possible to extract individual files from the result. If you want this to be possible, build a ZIP file instead. ZIP and GZIP both use the DEFLATE algorithm for the actual compression (ZIP supports some other compression algorithms as well as an option - method 8 is the one that corresponds to GZIP's compression); the difference is in the metadata format. Since the metadata is uncompressed, it's simple enough to strip off the gzip headers and tack on ZIP file headers and a central directory record instead. Refer to the gzip format specification and the ZIP format specification.
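If you still have the original uncompressed files around, that alternative is as simple as (file names are hypothetical):
zip allfiles.zip file1 file2 file3   # each member stays individually extractable
unzip -l allfiles.zip                # list the members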
Here is what man 1 gzip says about your requirement.
Multiple compressed files can be concatenated. In this case, gunzip will extract all members at once. For example:
gzip -c file1 > foo.gz
gzip -c file2 >> foo.gz
Then
gunzip -c foo
is equivalent to
cat file1 file2
Needless to say, file1 can be replaced by file1.gz.
You must notice this:
gunzip will extract all members at once
So to get the members back individually, you will have to use something additional, or write it yourself if you wish to do so.
However, this is also addressed in the man page:
If you wish to create a single archive file with multiple members so that members can later be extracted independently, use an archiver such as tar or zip. GNU tar supports the -z option to invoke gzip transparently. gzip is designed as a complement to tar, not as a replacement.
Just use cat. It is very fast (0.2 seconds for 500 MB for me).
cat *gz > final
mv final final.gz
You can then read the output with zcat to make sure it's pretty:
zcat final.gz
I tried the other answer of 'gzip -c' but I ended up with garbage when using already-gzipped files as input (I guess it double-compressed them).
PV:
Better yet, if you have it, use 'pv' instead of cat:
pv *gz > final
mv final final.gz
This gives you a progress bar as it works, but does the same thing as cat.
You can create a tar file of these files and then gzip the tar file to create the new gzip file:
tar -cvf newcombined.tar file1.gz file2.gz file3.gz
gzip newcombined.tar
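The two steps can also be combined by letting tar call gzip itself via -z (the -z option is supported by GNU tar and BusyBox tar, among others):
tar -czvf newcombined.tar.gz file1.gz file2.gz file3.gz
As with the two-step version, the already-compressed members will not shrink much further, but they remain individually extractable from the tar.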