Using grep to overwrite its current file - linux

I have directories nested within directories, and this is what I am trying to do:
find all files of a specific format, which is .xml
within all these .xml files, read the contents and remove line 3
Line 3 always has the following form: dxflib <Name of whatever folder it is in>.dxb
I tried find -name "*.xml" | xargs grep -v "dxflib" in the terminal (I am using Linux). The command works and displays the filtered results, but it does not write the changes back to the files.
From searching online, it seems I would need to add something like >> output.txt.
Is there any way to make it save to / overwrite the original file?

Removes third line in file:
sed -i '3d' file
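To apply this to every .xml file under a directory tree, the sed call can be combined with find. The following is a minimal sketch, assuming GNU sed and GNU find; the function name remove_dxflib_line3 is made up for illustration. As a safety measure it deletes line 3 only when that line contains dxflib.

```shell
#!/bin/bash
# Hypothetical helper: delete line 3 of every .xml file below a directory,
# but only if that line contains "dxflib".
remove_dxflib_line3() {
    # $1: directory to search (defaults to the current directory)
    # GNU sed -i edits each file in place and counts lines per file.
    find "${1:-.}" -type f -name '*.xml' -exec sed -i '3{/dxflib/d;}' {} +
}
```

Calling remove_dxflib_line3 /path/to/top would edit the files in place, so it may be worth trying it on a copy of the data first.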

Related

How to replace an unknown string in multiple files under Linux?

I want to change multiple different strings across all files in a folder to one new string.
The strings in the text files (all within the same directory) look like this:
file1.json: "url/1.png"
file2.json: "url/2.png"
file3.json: "url/3.png"
etc.
I would need to point them all to a single PNG, i.e., "url/static.png", so all three files have the same URL inside pointing to the same PNG.
How can I do that?
You can use find and sed for this. Make sure you are in the folder containing the files you want to change. The pattern matches any numbered PNG, not only 1.png:
find . -name '*.json' -print0 | xargs -0 sed -i 's|"url/[0-9]*\.png"|"url/static.png"|g'
Alternatively, a bash script:
#!/bin/bash
# for each file with extension .json in the current directory
for currFile in *.json; do
    # extract the file's ordinal from the current filename
    filesOrdinal=$(echo "$currFile" | grep -o "[[:digit:]]\+")
    # use the ordinal to identify the string and replace it in the current file
    sed -i -r 's|url/'"$filesOrdinal"'\.png|url/static.png|' "$currFile"
done
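Since every file ends up with the same target string, the number does not have to be extracted from the file name at all. Below is a minimal sketch under that assumption, using GNU sed with a backup suffix so the originals survive as .bak files; the function name point_urls_to_static is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: rewrite "url/<number>.png" to "url/static.png"
# in every .json file directly inside a directory.
point_urls_to_static() {
    # $1: directory holding the .json files (defaults to the current one)
    # -i.bak keeps a copy of each original file next to the edited one.
    find "${1:-.}" -maxdepth 1 -type f -name '*.json' \
        -exec sed -i.bak 's|"url/[0-9][0-9]*\.png"|"url/static.png"|g' {} +
}
```

Once the results look right, the .bak files can be removed by hand.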

How to replace a string in multiple files in multiple subfolders with different file extensions in linux using command line

I have already followed this question (How to replace a string in multiple files in linux command line).
My question is an extension of it.
I want to search only files with specific extensions, including those in subfolders, rather than every file.
What I have already tried:
grep -rli 'old-word' * | xargs -i@ sed -i 's/old-word/new-word/g' @
My problem: it changes files of every other format as well. I want to search and replace in only one file extension.
Please also add an answer showing how to replace an entire line of a file, not just one word.
Thanks in advance.
The simplest solution is a grep command with --include filters:
grep -rli --include="*.html" --include="*.json" 'old-word' *
The disadvantage of this solution is that you have no clear view of which files are scanned.
A better approach is to tune a find command to locate the desired files first,
using the -regex option to filter file names,
so that you can verify that the correct files are scanned.
Then feed the result of the find command to grep.
Example:
Assuming you are looking for the file extensions txt, pdf and html,
and assuming your search path begins at /home/user/data:
find /home/user/data -regex ".*\.\(html\|txt\|pdf\)$"
Once you have located your files, you can grep each file from the find command above:
find /home/user/data -regex ".*\.\(html\|txt\|pdf\)$" -exec grep -li 'old-word' {} +
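The commands above only list the matching files. A minimal sketch of the replacement step itself follows, assuming GNU grep, xargs and sed. The function names replace_word and replace_line are made up for illustration, matching is case-sensitive, and the old and new strings are assumed to contain no slashes or regex metacharacters.

```shell
#!/bin/bash
# Hypothetical helpers. Arguments: directory, extension, old text, new text.

# Replace every occurrence of the old text in files with the given extension.
replace_word() {
    # grep -Z and xargs -0 pass file names safely; -r skips sed when nothing matches.
    grep -rlZ --include="*.$2" -e "$3" "$1" | xargs -0 -r sed -i "s/$3/$4/g"
}

# Replace each whole line that contains the old text with the new text.
replace_line() {
    grep -rlZ --include="*.$2" -e "$3" "$1" | xargs -0 -r sed -i "s/.*$3.*/$4/"
}
```

For example, replace_word /home/user/data html old-word new-word would touch only .html files below /home/user/data.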

how to run grep from script and store output in a file in the destination directory from bash script

I am trying to filter out lines from a file through a bash script. I am able to find the path of the file from script location by running the command
Fgff=`find $D -maxdepth 1 -type f -name "*.gff"`
I can add a column to the found .gff file by running the command
sed -i '1 s/$/\tsample/; 1! s/$/\t'${D##*/}'/' $Fpsi
However, if I try to filter the file and write the output to another file in the same folder, it does not work.
grep 'ENSG00000155657\|ENSG00000198947' $Fgff > "$Fgff$filtered"
Why is grep not working?
How can I filter all the lines having substring ENSG00000155657 or ENSG00000198947 in file apple.gff at ./dira/dirb/apple.gff and store it in ./dira/dirb/applefiltered.gff?
thanks
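Provided that your $Fgff contains the correct filename, the grep pattern itself is fine: in GNU grep, \| has the lowest precedence, so 'ENSG00000155657\|ENSG00000198947' matches lines containing either ID. The likely problem is the redirection target. $filtered is an unset variable, so "$Fgff$filtered" expands to the name of the input file itself, and the shell truncates that file before grep can read it. To get applefiltered.gff next to apple.gff, build the name from the path without its extension, for example "${Fgff%.gff}filtered.gff".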
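As a sketch of one way to produce the requested output name, assuming GNU grep (where 'A\|B' matches either alternative): the output path is derived from the input path by stripping the .gff suffix and appending filtered.gff. The function name filter_gff is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: keep only the lines mentioning either gene ID and
# write them next to the input file, e.g. apple.gff -> applefiltered.gff.
filter_gff() {
    local in=$1
    # Strip the .gff suffix, then append "filtered.gff".
    local out="${in%.gff}filtered.gff"
    grep 'ENSG00000155657\|ENSG00000198947' "$in" > "$out"
}
```

With the path from the question, filter_gff ./dira/dirb/apple.gff would write ./dira/dirb/applefiltered.gff and leave the input untouched.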

bulk rename pdf files with name from specific line of its content in linux

I have multiple PDF files that I want to rename. The new name should be taken from a specific line (say, the 5th) of each PDF's content. For example, if a file's 5th line contains some string, then some string should become the name of that file, and likewise for the rest of the files. I tried this in the terminal:
for pdf in *.pdf
do
filename=`basename -s .pdf "${pdf}"`
newname=`awk 'NR==5' "${filename}.pdf"`
mv "${pdf}" "${newname}"
done
It renames the files, but the new name is an invalid string. I know the system does not see the file as plain text and images; there are metadata, XML tags and so on. Is there a way to take the content from that line?
Out of the box, bash and its usual utilities cannot read PDF files. However, less can recover the text from a PDF file when it is configured with an input preprocessor such as lesspipe. You could change your script as follows:
for pdf in *.pdf
do
    mv "$pdf" "$(less "$pdf" | sed '5q;d').pdf"
done
Explanation:
less "$pdf": outputs the text part of the PDF file, taking spacing into account.
Run some tests first to check that less returns the desired output.
sed '5q;d': extracts the 5th line of its input.
Optionally, you could use the following command inside the loop to remove blank lines and excess spaces:
mv "$pdf" "$(less "$pdf" | sed -e '/^\s*$/d' -e 's/ \+/ /g' | sed '5q;d').pdf"
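If less is not set up to convert PDFs, the text can be extracted with a dedicated converter instead. The sketch below assumes pdftotext (from poppler-utils) is installed. The helper names fifth_line_name and rename_pdfs are made up for illustration, and replacing slashes with underscores is an added precaution, not part of the original answer.

```shell
#!/bin/bash
# Hypothetical helper: read text on stdin and print the 5th non-blank line,
# with runs of whitespace squeezed and "/" replaced so it is a valid name.
fifth_line_name() {
    sed -e '/^[[:space:]]*$/d' -e 's/[[:space:]][[:space:]]*/ /g' \
        -e 's/^ //' -e 's/ $//' |
        sed -n '5{p;q;}' | tr '/' '_'
}

# Hypothetical helper: rename each PDF in the current directory after
# the 5th non-blank line of its text (assumes pdftotext is available).
rename_pdfs() {
    local pdf new
    for pdf in *.pdf; do
        [ -e "$pdf" ] || continue
        new=$(pdftotext "$pdf" - | fifth_line_name)
        # Skip files without a usable 5th line; -n avoids overwriting.
        [ -n "$new" ] && mv -n -- "$pdf" "$new.pdf"
    done
}
```

Running pdftotext file.pdf - | fifth_line_name on a single file first shows the name that would be chosen before anything is renamed.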

Merge Files and Prepend Filename and Directory

I need to merge files in a directory and include the directory, filename, and line number in each line of the output. I've found many helpful posts about including the filename and line number but not the directory name. Grep -n gets line numbers and I've seen some find commands that get some of the other parts but I can't seem to pull them all together. (I'm using Ubuntu for all of the data processing.)
Imagine two files in directory named "8". (Each directory in the data I have is a number. The data were provided that way.)
file1.txt
John
Paul
George
Ringo
file2.txt
Mick
Keef
Bill
Brian
Charlie
The output should look like this:
8:file1.txt:1:John
8:file1.txt:2:Paul
8:file1.txt:3:George
8:file1.txt:4:Ringo
8:file2.txt:1:Mick
8:file2.txt:2:Keef
8:file2.txt:3:Bill
8:file2.txt:4:Brian
8:file2.txt:5:Charlie
The separators don't have to be colons. Tabs would work just fine.
Thanks much!
If it's just one directory level deep, you could try something like the following. We go into each directory, print each line with its file name and line number, and then prepend the directory name with sed:
$ for x in */; do
    (cd "$x" && grep -Hn '' *) | sed -e "s/^/${x%/}:/"
done
1:c.txt:2:B
1:c.txt:3:C
2:a.txt:1:A
2:a.txt:2:B
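The same output can also be produced without changing directories, by letting awk print the directory name, file name and per-file line number. This is a minimal sketch, assuming the one-level layout from the question (numbered directories containing .txt files); the function name merge_with_prefix is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print every line of every .txt file one level below
# a top directory as "dir:file:line_number:content".
merge_with_prefix() {
    # $1: top-level directory that holds the numbered subdirectories
    find "$1" -mindepth 2 -maxdepth 2 -type f -name '*.txt' | sort |
    while IFS= read -r path; do
        dir=$(basename "$(dirname "$path")")
        file=$(basename "$path")
        # FNR is awk's line number within the current file.
        awk -v d="$dir" -v f="$file" '{ print d ":" f ":" FNR ":" $0 }' "$path"
    done
}
```

Redirecting the result, as in merge_with_prefix . > merged.txt, gives a single merged file; changing ":" to "\t" in the awk print statement would produce tab-separated output instead.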
