Sed removing last 4 characters in a file text inside a folder - linux

This is a linux problem
So I need to make a folder called "Notes", and inside that folder I need to make 6 files. The files are named with ID number, as follows:
a00144998.txt
a00154667.txt
a00130933.txt
a00143561.txt
a00157888.txt
The first three characters are always "a00", and the 4th and 5th characters are the year. I now need to copy each ID (the file name without the '.txt' extension) into a new file outside the "Notes" folder.
For example, the file a00144998 shows year 14 (from the 4th and 5th characters), so I will copy a00144998 to a new file named "year14.txt" and sort it. Likewise, a00154667 shows year 15, so I will copy a00154667 to a new file named "year15.txt". So at the end, the file "year14.txt" will contain:
a00143561
a00144998
I have found the code below, and it works if the files are not in a folder. But once I create the files inside the folder, it stops working and keeps copying the .txt extension. Any idea? Thanks!
ls ~/Notes/???14????.txt|sed 's/.\[4\]$//'|sort>year14.txt

No need for sed, you can do it all in a bash oneliner:
for f in ~/Notes/a0014*.txt; do f=${f##*/}; echo "${f%.txt}"; done | sort > year14.txt
This loops over every file in ~/Notes matching the glob a0014*.txt and puts each path in the variable f. ${f##*/} strips the directory part, ${f%.txt} strips the .txt extension, and the remaining ID is echoed into the sort.
TLDP has a great guide on string manipulation in bash.
You should also avoid parsing ls. Its output is meant to be read by humans, not by machines.
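A rough sketch of how the loop approach might be extended to read the files from inside Notes and write one sorted file per year. The scratch directory and the sample files are made up for the demonstration, and ${id:3:2} is bash substring expansion, so this assumes bash rather than a plain POSIX sh.

```shell
#!/usr/bin/env bash
# Demo setup (made up for illustration): a scratch directory with a Notes folder.
workdir=$(mktemp -d)
mkdir "$workdir/Notes"
for id in a00144998 a00154667 a00130933 a00143561 a00157888; do
    touch "$workdir/Notes/$id.txt"
done

for f in "$workdir"/Notes/a00*.txt; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
    id=${f##*/}                      # strip the directory part
    id=${id%.txt}                    # strip the .txt extension
    year=${id:3:2}                   # 4th and 5th characters (bash substring)
    echo "$id" >> "$workdir/year$year.txt"
done

for y in "$workdir"/year*.txt; do
    sort -o "$y" "$y"                # sort each year file in place
done

cat "$workdir/year14.txt"
```

Because the IDs are appended with >>, the year files should be removed before a second run, otherwise every ID would be listed twice.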

Related

Linux rename multiple files

I have some files named in a specific pattern, for example ab_2000_1.jpg. In this name, 2000 represents the year and 1 represents the month (1 means January). I have 20 years of monthly files like this.
Now I want to rename every one of them into the following format: ab_2000_1_12.jpg, ab_2000_2_12.jpg, etc.
I know how to rename files using the rename and sed commands, but I want to know how I can loop such a command over all the files.
Any help is highly appreciated.
You can use a for loop to loop over all file names matching a pattern as for file in pattern; do some_commands; done.
You don't need sed to modify the file name in this case. A variable substitution like ${variable%pattern} will remove the shortest string matching pattern from the end of the variable value.
The following example code will remove .jpg from the end of the file name and append _12.jpg to the result.
for file in ab_*_*.jpg
do
mv "$file" "${file%.jpg}_12.jpg"
done
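To try the loop without touching real files, something like the following sketch could be used. The scratch directory and the three sample file names are invented for the demonstration; the rename itself is the same ${file%.jpg} substitution described above.

```shell
#!/usr/bin/env bash
# Demo setup (made up for illustration): a scratch directory with sample files.
demo=$(mktemp -d)
touch "$demo/ab_2000_1.jpg" "$demo/ab_2000_2.jpg" "$demo/ab_2001_12.jpg"

for file in "$demo"/ab_*_*.jpg; do
    [ -e "$file" ] || continue             # nothing matched the pattern
    new="${file%.jpg}_12.jpg"              # drop .jpg, then append _12.jpg
    echo "renaming ${file##*/} -> ${new##*/}"
    mv -- "$file" "$new"
done

ls "$demo"
```

Note that the pattern ab_*_*.jpg also matches names that have already been renamed (ab_2000_1_12.jpg), so running the loop twice would append _12 a second time.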

copy and append specific lines to a file with specific name format?

I am copying some specific lines from one file to another.
grep '^stringmatch' /path/sfile-*.cfg >> /path/nfile-*.cfg
Here is what's happening: it's creating a new file literally called nfile-*.cfg and copying those lines into it. The names sfile-* and nfile-* are generated, and each is followed by a random number; the numbers in the sfile and nfile names need not be the same. Both files already exist, and there is only one of each in the directory. They are not created simultaneously but are generated when a specific command is given. Some lines from one file need to be appended to the other.
I'm guessing you actually want something like
for f in /path/sfile-*.cfg; do
grep '^stringmatch' "$f" >"/path/nfile-${f#/path/sfile-}"
done
This will loop over all sfile matches and create an nfile target file with the same number after the dash as the corresponding source sfile. (The parameter substitution ${variable#prefix} returns the value of variable with any leading match on the pattern prefix removed.)
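A small illustration of what the prefix removal does to a single path. The file name sfile-12345.cfg is a made-up example; no files are read or written here.

```shell
#!/usr/bin/env bash
# Hypothetical source path, used only to show the expansion.
f=/path/sfile-12345.cfg

suffix=${f#/path/sfile-}          # remove the leading "/path/sfile-"
target="/path/nfile-$suffix"      # reuse the remainder for the target name

echo "$suffix"
echo "$target"
```

The first echo prints 12345.cfg and the second prints /path/nfile-12345.cfg.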
If there is only one matching file, the loop will only run once. If there are no matches on the wildcard, the loop will still run once unless you enable nullglob, which changes the shell's globbing behavior so that wildcards with no matches expand into nothing, instead of to the wildcard expression itself. If you don't want to enable nullglob, a common workaround is to add this inside the loop, before the grep:
test -e "$f" || break
If you want the loop to only process the first match if there are several, add break on a line by itself before the done.
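The difference nullglob makes can be checked with a sketch like this one, which counts loop iterations over a pattern that matches nothing. The empty scratch directory is created only for the demonstration, and the sketch assumes bash with its default options (nullglob and failglob off).

```shell
#!/usr/bin/env bash
# An empty scratch directory, so the pattern below cannot match anything.
empty=$(mktemp -d)

count=0
for f in "$empty"/sfile-*.cfg; do
    count=$((count + 1))          # runs once, with f set to the literal pattern
done
echo "without nullglob: $count iteration(s)"

shopt -s nullglob                 # unmatched patterns now expand to nothing
count_ng=0
for f in "$empty"/sfile-*.cfg; do
    count_ng=$((count_ng + 1))    # never runs
done
shopt -u nullglob                 # restore the default
echo "with nullglob: $count_ng iteration(s)"
```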
If I interpret your question correctly, you want to output to an existing nfile, which has a random number in it, but instead the shell is creating a file with an asterisk in it, so literally nfile-*.cfg.
This is happening because the nfile doesn't exist when you first run the command. If the file doesn't exist, bash will fail to expand nfile-*.cfg and will instead use the * as a literal character. This is correct behaviour in bash.
So, it looks like the problem is that the nfile doesn't exist when you start your grep. You'll need to create one.
I'll leave code to others, but I hope the explanation is useful.
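One possible way to append to the single existing nfile, whatever its number is, would be to expand both wildcards into arrays first and check that each matched exactly one file. This is only a sketch: the scratch directory, the numbers 123 and 987, and the file contents are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Demo setup (made up for illustration): one sfile and one existing nfile.
dir=$(mktemp -d)
printf 'stringmatch one\nother line\n' > "$dir/sfile-123.cfg"
printf 'existing\n' > "$dir/nfile-987.cfg"

shopt -s nullglob                 # unmatched patterns become empty arrays
src=("$dir"/sfile-*.cfg)
dst=("$dir"/nfile-*.cfg)
shopt -u nullglob

if [ "${#src[@]}" -eq 1 ] && [ "${#dst[@]}" -eq 1 ]; then
    # Both names are now concrete paths, so >> appends to the real nfile.
    grep '^stringmatch' "${src[0]}" >> "${dst[0]}"
else
    echo "expected exactly one sfile and one nfile" >&2
fi

cat "$dir/nfile-987.cfg"
```

Checking the array lengths up front avoids both failure modes discussed above: a literal nfile-*.cfg being created, and an ambiguous redirect when several files match.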

Paste header line in multiple tsv (tab separated) files

I have multiple .tsv files named as choochoo1.tsv, choochoo2.tsv, ... choochoo(nth).tsv files. I also have a main.tsv file. I want to extract the header line in main.tsv and paste over all choochoo(nth).tsv files. Please note that there are other .tsv files in the directory that I don't want to change or paste header, so I can't do *.tsv and select all the .tsv files (so need to select choochoo string for wanted files). This is what I have tried using bash script, but could not make it work. Please suggest the right way to do it.
for x in *choochoo; do
head -n1 main.tsv > $x
done
You have a problem with the file glob, as well as the redirect:
the file glob will catch things like AAchoochoo but not choochoo1.tsv and not even AAchoochoo.tsv
the redirect will overwrite the existing files instead of adding to them. The redirect command for adding to a file is >>, but that will append text to the end and you want to prepend text in the beginning.
The problem with prepending text to an existing file, is that you have to open the file for both reading and writing and then stream both prepended text and original text, in order - and that is usually where people fail because the shell can't open files like that (there is a slightly more complex way of doing this directly, by opening the file for both reading and writing, but I'm not going to address that further).
You might want to use a temporary file, something like this:
for x in choochoo[0-9]*.tsv; do
mv "$x"{,.orig}
(head -n1 main.tsv; cat "$x.orig") > "$x"
rm "$x.orig"
done
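A variation on the same temporary-file idea, written as a self-contained sketch: the new content is built in a temporary file first and only then moved over the original, so the original is never left half-written. The scratch directory and the sample .tsv contents are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Demo setup (made up for illustration).
dir=$(mktemp -d)
printf 'id\tname\n1\tmain\n' > "$dir/main.tsv"
printf '7\tfoo\n' > "$dir/choochoo1.tsv"
printf '8\tbar\n' > "$dir/choochoo2.tsv"
printf '9\tkeep\n' > "$dir/other.tsv"     # must stay untouched

for x in "$dir"/choochoo[0-9]*.tsv; do
    [ -e "$x" ] || continue               # nothing matched the pattern
    tmp=$(mktemp "$x.XXXXXX")             # temporary file next to the target
    # Header line first, then the original content; replace only on success.
    { head -n 1 "$dir/main.tsv"; cat "$x"; } > "$tmp" && mv "$tmp" "$x"
done

cat "$dir/choochoo1.tsv"
```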

Cygwin - How can I keep original file name when outputting results of a command on a file (cut command)?

I'm using a cut command to split up a file. I need the output of the file to keep the original file name. I will not know the name of the file, just what folder it is located in. I need to ultimately add a suffix and prefix to original file after the cut, which I've got figured out. My issue is that I do not know how to keep the original file name after I output the cut.
cut -d, -f1,2,3 for file in * $file > originalfilename.txt
There should only be 1 file in the "dropbox" folder at one time. So if I can store the variable of that file name somewhere and use later that works for me.
Also if there is a way to just modify the file using cut, rather than needing to output it somewhere this would satisfy my needs too, because I would obviously still have original file name then.
I just started using Cygwin a few days ago so I apologize if there is really an obvious answer to this! I have googled everything and couldn't find what I needed.
The answer is no, unix cut does not offer an in-place option. However you can look at alternate options here
You define a variable to store the name of the file and use that variable in the commands:
orig_file='originalfilename.txt'
cut -d, -f1,2,3 "$orig_file" > "$orig_file.tmp" && mv "$orig_file.tmp" "$orig_file"
echo "The name of the original file is $orig_file"
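If the file name is not known in advance, a glob loop over the folder can pick it up. The following is a sketch under the question's assumption that the folder holds a single comma-separated file; the scratch directory standing in for the "dropbox" folder and the file report.csv are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Demo setup (made up for illustration): a stand-in for the "dropbox" folder.
dropbox=$(mktemp -d)
printf 'a,b,c,d\n1,2,3,4\n' > "$dropbox/report.csv"

for file in "$dropbox"/*; do
    [ -f "$file" ] || continue            # skip if the folder is empty
    name=${file##*/}                      # original file name, kept for later use
    tmp=$(mktemp)
    # cut has no in-place mode, so write to a temporary file and move it back.
    cut -d, -f1,2,3 "$file" > "$tmp" && mv "$tmp" "$file"
    echo "processed $name"
done

cat "$dropbox/report.csv"
```

After the loop, the variable name still holds the original file name, so a prefix or suffix can be added in a later step.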

Merge 2 text files into one, same lines

I have two files, with the following contents:
file2.txt
PRIMERB
PrinceValiant
Priory
PRISTINA
embossed
heavy
incised
light
Outline
ribbon
and
file1.txt
PRIMERB 333
PrinceValiant 581
Priory789
PRISTINA3!1
embossed509
heavy5#
incised999
light5*1
Outline937
ribbon-81
I'd like to combine/merge these two files together so they would be like :
PRIMERB 333 PRIMERB
PrinceValiant 581 PrinceValiant
Priory789 Priory
PRISTINA3!1 PRISTINA
embossed509 embossed
heavy5# heavy
incised999 incised
light5*1 light
Outline937 Outline
ribbon-81 ribbon
How would I do this in notepad++?
Add space characters to the end of the first line of file1 until it is longer than the longest line in file1.
Do a column mode selection of the entire contents of file 2. Do this by holding the ALT key down while dragging the mouse across the file. As you drag you should see a rectangular area of the screen selected. It may be easiest to start the selection before the first character in the first line of file2. Could also do a column mode selection with just the keyboard. Hold the ALT and Shift keys down while moving the cursor with the arrow keys.
Copy the selected text. (Control-C or menu => Edit => Copy or context menu => copy.)
Paste after the spaces added to file1.
Remove unnecessary spaces.
If the existing spaces in file1 and file2 are important you might use a regular expression to alter every line in file2 to have some character or character sequence that does not occur in either file before selecting its contents. For example, find ^ and replace with !!. Then you can use another regular expression to remove only the spaces added by the paste. For example, replace _*!! (space, asterisk, exclamation-mark, exclamation-mark) with _ (space; note that spaces would show incorrectly in these two strings, so they are shown as underscores _ for clarity).
See also the Editing => Column mode editing section of the Notepad++ help pages.
Maybe you can try ConyEdit. It is a cross-editor plugin that works with text editors, including Notepad++.
Follow the steps below:
1. Keep ConyEdit running.
2. Use the cc.gl a command line to push data to array a.
3. Use the cc.gl b command line to push data to array b.
4. Use the cc.p command line to print the contents of array a and array b.
Instead of finding some way of automating this, I think it'll be easier for you to just copy & paste...
But that depends purely on how many rows of text you have in those text files. If they contain fewer than 50 lines, I suggest you just copy (or cut) and paste.
I wouldn't know any way to automate that in Notepad++ anyway.
Edit:
After your request I wrote a quick PHP script that takes the lines from 'file1.txt' and 'file2.txt' and combines it to 'file3.txt'
<?php
$files1 = file('file1.txt'); // read file1.txt
$files2 = file('file2.txt'); // read file2.txt
// Assuming both files have equal amount of rows.
for($x = 0; $x < count($files1); $x++) {
$files1[$x] = str_replace(array("\n", "\r"), "", $files1[$x]);
$files3[$x] = $files1[$x]." ".$files2[$x];
}
$result = implode("", $files3); // combines the array to a single string.
if(file_put_contents('file3.txt', $result)) { // puts the imploded string into file3.txt
echo "Writing to file 'file3.txt' was successfull.";
}
?>
Now I would like to help you as best I can, but I cannot access my own domain at this time, and I have not yet written anything that would let you upload your own files to it.
You can run this on your own by downloading the latest USBWebserver:
1. Extract the files from the .zip you downloaded from the USBWebserver website.
2. Go to the just extracted 'root' folder.
3. Delete everything inside that 'root' folder.
4. Copy the code above and save it as 'index.php' inside the 'root' folder (you can do this with notepad++ too).
5. Move your 'file1.txt' and 'file2.txt' to the same 'root' folder.
6. Go up one folder and execute 'usbwebserver.exe'.
7. Click on 'localhost' when the window pops up.
8. If you get the message: "Writing to file 'file3.txt' was successfull." you should now have 'file3.txt' in that 'root' folder.
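If a Unix-like shell happens to be available (for example Cygwin or Linux), the same line-by-line merge could probably be done without a web server, using the standard paste utility. This is an alternative sketch rather than the approach described above, and it makes the same assumption as the PHP script: both files have the same number of rows. The scratch directory and the shortened sample data are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Demo setup (made up for illustration): two short sample files.
dir=$(mktemp -d)
printf 'PRIMERB 333\nPriory789\nheavy5#\n' > "$dir/file1.txt"
printf 'PRIMERB\nPriory\nheavy\n' > "$dir/file2.txt"

# Join line N of file1 with line N of file2, separated by a single space.
paste -d ' ' "$dir/file1.txt" "$dir/file2.txt" > "$dir/file3.txt"

cat "$dir/file3.txt"
```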
