grab 2 numbers from file name then insert into command - linux

I'm a bit new to programming in general and I'm not sure how to go about accomplish this task in my bash script.
A quick background: when importing my music library (formerly organized by iTunes) to Banshee, all of the files were duplicated to fit Banshee's number style (ex: 02. instead of 02 ) on top of that, iTunes apparently did not save the ID3 tags to the files, so many of them are blank. So now I've got a few thousand tags to fix and duplicate files to get rid of.
To automate the process, I started learning to write bash scripts. I came up with a script (which you can see here) that does four things: removes unnecessary iTunes files, takes input from user about ID3 Tag information and stores it in variables, clears any present tag info from all files, writes new tags with info taken from user, using a program called eyeD3.
Now, here's where I run into my problem. This script is basically blindly writing info to all mp3 files in the dir. This is fine for tags that all the files have in common - like artist, album, total tracks, year, etc. But I can't tag each individual track number with this method. So I'm still editing the track# tags one at a time, manually. And that's something I really don't want to do 2,000+ times.
The files names all look like this:
01. song1.mp3
02. song2.mp3
03. song3.mp3
The command to write a track number to a tag looks like this:
$ eyeD3 -n 1 "01. song1.mpg"
So... I'm not sure how to go about automating this. I need to grab the first two digits of each file name, store them somewhere, then recall each one into a separate eyeD3 command.

You can loop over the files using globbing, and use substring expansion to capture the first two characters of the filename:
for f in *mp3; do
eyeD3 -n ${f:0:2} "$f"
done

Related

Command line tool to merge textfiles , keeping the common content and adding the rest of the content to it

I have got a folder with notes on different topics, say ~/notes/topica.txt ~/notes/topicb.txt etc... . I have been using them in different computers for some time now so they have diverted their content
So I have now something like
#computerA:
cat ~/notes/topica.txt
Lines added in Computer A
----------
Some common text
#computerB:
cat ~/notes/topica.txt
Lines added in Computer B
----------
Some common text
I want to be able to merge them in a way that I get
cat ~/notes/topica.txt
Lines added in Computer A
Lines added in Computer B
----------
Some common text
I have tried some tools like diff but they are really not achieving the objective I want.
Also I need to do it for an entire folder with a bunch of files. (Im alright with doing a bash script)
Thanks for your help

Unix create multiple files with same name in a directory

I am looking for some kind of logic in linux where I can place files with same name in a directory or file system.
For e.g. i create a file abc.txt, so the next time if any process creates abc.txt it should automatically check and make the file named as abc.txt.1 should be created, then next time abc.txt.2 and so on...
Is there a way to achieve this.
Any logic or third party tools are also welcomed.
You ask,
For e.g. i create a file abc.txt, so the next time if any process
creates abc.txt it should automatically check and make the file named
as abc.txt.1 should be created
(emphasis added). To obtain such an effect automatically, for every process, without explicit provision by processes, it would have to be implemented as a feature of the filesystem containing the files. Such filesystems are called versioning filesystems, though typically the details are slightly different from what you describe. Most importantly, however, although such filesystems exist for Linux, none of them are mainstream. To the best of my knowledge, none of the major Linux distributions even offers one as a distribution-supported option.
Although it's a bit dated, see also Linux file versioning?
You might be able to approximate that for many programs via a customized version of the C standard library, but that's not foolproof, and you should not expect it to have universal effect.
It would be an altogether different matter for an individual process to be coded for such behavior. It would need to check for existing files and choose an appropriate name when opening each new file. In doing so, some care needs to be taken to avoid related race conditions, but it can be done. Details would depend on the language in which you are writing.
You can use BASH expression to achieve this. For example if I wanted to make 10 files all with the same name, but having a unique number value I would do the following:
# touch my_file{01..10}.txt
This would create 10 files starting at 01 all the way to 10. This method is also hand for looping over files in a sequence or if your also creating directories.
Now if i am reading you question right your asking that if you move a file or create a file in a directory. you would want the a script to automatically create a new file for you? If that is the case then just use a test and if there is a file move that file and mark it. Me personally I use time stamps to do so.
Logic:
# The [ -f ] tests if the file is present
if [ -f $MY_FILE_NAME ]; then
# If the file is present move the file and give it the PID
# That way the name will always be unique
mv $MY_FILE_NAME $MY_FILE_NAME_$$
mv $MY_NEW_FILE .
else
# Move or make the file here
mv $MY_NEW_FILE .
fi
As you can see the logic is very simple. Hope this helps.
Cheers
I don't know about Your particular use case, but You may try to look at logrotate:
https://wiki.archlinux.org/index.php/Logrotate

How to code for iterating through multiple files in linux?

I have code which I am trying to update from another example. The aim is to run plink using files of: each chromosome, snp ids, and a file containing only 1 ID which is an individual's ID. Running these files in plink ultimately makes a vcf file per individual for a given chromosome.
I have 22 chromosome files, 1 snp file (which is always the same), and 500 individual files. For each individual I am aiming to make a vcf for each chromosome, so I have 22*500 (11000) vcf files as output.
With doing this at the moment I have tried a bash script with this:
ID=$SGE_TASK_ID
indiv=$SGE_TASK_ID
plink --bed chr${ID}.bed --bim chr${ID}.bim --fam chr${ID}.fam --extract snps.txt
--recode vcf-iid --out output${indiv}chr${ID}vcf --keep-fam individual${indiv}.txt
This runs, however it only runs through 1 individual, giving me 22 chromosome vcf files for that one person, and stops there. How do I make this run for all 500 people, would it be with a for loop? Looking through other questions I haven't been able to find one that matches my question and is in linux, any help would appreciated.
${indiv} would just be a number, so the text file that runs looks like individual1.txt and increases through the 500 individuals (individual1.txt, individual2.txt, individual3.txt)
Assuming that ${indiv} contains no spaces,
for indiv in $(<individuals.data); do
plink [...] individual${indiv}.txt
done
The file individuals.data would name the individuals, separated by spaces or newlines.
If unsure what the Bash shell's $(<...) operator does, try this:
for A in $(<individuals.data); do
echo "[$A]"
done
Note that, as #Kaz has observed, if wish your script to work also in shells other than Bash, then you might write $(cat ...) rather than $(<...)

How to stream log files content that is constantly changing file names in perl?

I a series of applications on Linux systems that I need to basically constantly 'stream' out or even just 'tail' out but the challenge is the filenames are constantly rolling and changing.
The are all date encoded (dates being in different formats) and each then have different incremented formats.
Most of them start with one and increase, but one doesn't have an extension and then adds an extension past the first file and the other increments a number but once hitting 99 rolls to increment a alpha and returns the numeric to 01 and then up again as it rolls so quickly.
I just have the OS level shell scripting, OS command line utilities, and perl available to me to handle this situation for another application to pickup and read these logs.
The new files are always created right when it starts writing to the new file and groups of different logs (some I am reading some I am not) are being written to the same directory so I cannot just pickup anything hitting the directory.
If I simply 'tail -n 1000000 -f |' them today this works fine for the reader application I am using until the file changes and I cannot setup file lists ranges within the reader application, but can pre-process them so they basically appear as a continuous stream to the reader vs. the reader directly invoking commands to read them. A simple Perl log reader like this also work fine for a static filename but not for dynamic ones. It is critical I don't re-process any logs lines and just capture new lines being written to the logs.
I admit I am not any form a Perl guru, and the best answers / clue I've been able to find so far is the use of Perl's Glob function to possibly do this but the examples I've found basically reprocess all of the files on each run then seem to stop.
Example File Names I am dealing with across multiple apps I am trying to handle..
appA_YYMMDD.log
appA_YYMMDD_0001.log
appA_YYMMDD_0002.log
WS01APPB_YYMMDD.log
WS02APPB_YYMMDD.log
WS03AppB_YYMMDD.log
APPCMMDD_A01.log
APPCMMDD_B01.log
YYYYMMDD_001_APPD.log
As denoted above the files do not have the same inode and simply monitoring the directory for change is not possible as a lot of things are written there. On the dev system it has more than 50 logs being written to the directory and thousands of files and I am only trying to retrieve 5. I am seeing if multitail can be made available to try that suggestion but it is not currently available and installing any additional RPMs in the environment is generally a multi-month battle.
ls -i
24792 APPA_180901.log
24805 APPA__180902.log
17011 APPA__180903.log
17072 APPA__180904.log
24644 APPA__180905.log
17081 APPA__180906.log
17115 APPA__180907.log
So really the root of what I am trying to do is simply a continuous stream regardless if the file name changes and not have to run the extract command repeatedly nor have big breaks in the data feed while some script figures out that the file being logged to has changed. I don't need to parse the contents (my other app does that).. Is there an easy way of handling this changing file name?
How about monitoring the log directory for changes with Linux inotify, e.g. Linux::inotify2? Then you could detect when new log files are created, stop reading from the old log file and start reading from the new log file.
Try tailswitch. I created this script to tail log files that are rotated daily and have YYYY-MM-DD on their names. To use this script, you just say:
% tailswitch '*.log'
The quoting prevents the shell from interpreting the glob pattern. The script will perform glob pattern from time to time to switch to a newer file based on its name.

Pattern match and string manipulation

I'm making a file monitor for a folder where I download subtitles. So far, it works like this:
Look for new .rar files in the folder.
If found, extract the subtitles and delete the .rar file
If a single .srt file was extracted, save the file name to a variable.
Now, I'm clueless about how to achieve the next (and final) part of the script:
I want to find a pattern based on the way subtitles are named.
Let's say, the subtitles file can be something like this:
SomeShow.1x03.stuff.srt
some_show s01e03-stuff.srt
some show 1-03 stuff.srt
etc.
I want to get something like: SomeShow 1 3 and based on that, start the video with the name that matches that pattern, which I guess would be a matter of reversing the process that was used to get the Show, season and episode based on the name of the .srt file.
Is this possible at all? It'd be really simple stuff in most languages, but I really need this to be a .bat and I'm clueless about how to approach this... so far all I've managed to do is to remove the extension from the variable.
Thanks in advance.
Batch files are Turing complete - you can do anything in them, but it is usually not wise to go to extremes. You might be able to package a sed or grep or your own binary alongside your .bat file for a good compromise between batchiness and function. If you can assume a suitable operating system, you will have Powershell installed and go that route.
You should recognize that the task is not exactly defined and that the "solution" may need some tweaking and be never robust enough.
For this reason, the richer language you can pick, the further you will get.

Resources