UNIX\LINUX: How to add a directory text to each line inside a file? - linux

NOTE: I am only using the shell (the command-line tool of Linux Red Hat EPIC), nothing else.
I have many log files (.txt.gz), and I was able to read all of them just by using:
foreach i (./*/*dumpfiles.txt.gz_*)
> foreach? zcat $i
> foreach? grep "-e" $i
> foreach? END
That is, I go through all those folders looking for files named dumpfiles.txt.gz_*.
The output looks like:
0x4899252 move x -999
0x4899231 move y -0
0x4899222 find scribe
0x4899231 move x -999
etc..
The problem is that I need the directory name added to each line of the file.
I can get the directory with the pwd command.
So my question is: how do I add the directory name to each line of the file?
Example:
(directory) (per line of all files)
machine01 0x4899252 move x -999
machine01 0x4899231 move y -0
machine09 0x4899222 find scribe
machine09 0x4899231 move x -999
etc..
I tried using sed but I can't find the solution... :(
Thanks...

Here's a little Perl script that does what you ask for (the input is the file name):
$file = shift; # file name is the first argument
$path = `pwd`; # current directory
chomp($path); # strip the trailing newline
open(TRY, "<", $file) or die "cannot open $file: $!";
while ($line = <TRY>) { print "$path $line"; } # prefix each line with the directory
close(TRY);
Of course this prints to the screen, but you can redirect the output to a file and rename it to $file at the end of the script.
If you want to run it on the entire directory and everything below it, you can run
find . -type f -exec scriptname {} \;
If you want it to cover the current directory only, add a -maxdepth 1 flag to find right after the '.'.
Update:
This also works (with no script, just a shell line; append the names of the files to edit in place):
perl -pi -e 's/^/$ENV{PWD} /g'
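For comparison, here is a minimal Python sketch of the same idea applied to the question's layout. It assumes each log sits one level below the starting folder (machine01/dumpfiles.txt.gz_..., and so on) and that the name of that folder is the prefix wanted. The function name prefixed_lines and the default pattern are made up for illustration.

```python
import gzip
from pathlib import Path


def prefixed_lines(root, pattern="*/*dumpfiles.txt.gz_*"):
    """Yield every line of each matching gzip file, prefixed with its directory name."""
    for gz_path in sorted(Path(root).glob(pattern)):
        directory = gz_path.parent.name  # e.g. "machine01" (assumed layout)
        # "rt" decompresses the file and decodes it as text
        with gzip.open(gz_path, "rt") as handle:
            for line in handle:
                yield directory + " " + line.rstrip("\n")
```

Calling `for line in prefixed_lines("."): print(line)` from the folder that holds the machine directories would print the prefixed lines, which can then be redirected to a file.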

Related

How to output difference of files from two folders and save the output with the same name on different folder

I have two folders which have same file names, but different contents. So, I am trying to generate a script to get the difference and to see what is being changed. I wrote a script below :
folder1="/opt/dir1"
folder2=`ls/opt/dir2`
find "$folder1/" /opt/dir2/ -printf '%P\n' | sort | uniq -d
for item in `ls $folder1`
do
if [[ $item == $folder2 ]]; then
diff -r $item $folder2 >> output.txt
fi
done
I believe this script should work, but it does not write anything to the output file.
The desired output should be in one file, e.g.:
cat output.txt
diff -r /opt/folder1/file1 /opt/folder2/file1
1387c1387
< ALL X'25' BY SPACE
---
> ALL X'0A' BY SPACE
diff -r /opt/folder1/file2 /opt/folder2/file2
2591c2591
< ALL X'25' BY SPACE
---
> ALL X'0A' BY SPACE
Any help is appreciated!
OK. So it is twofold:
First, get the files in one folder. Never use ls for this; ls is for nice printing in the console. In scripts, use find.
Then, for each file, run some command. A simple while read loop does that.
So:
{
# make find print paths relative to the `/opt/dir1` directory
cd /opt/dir1 &&
# use `%P` so that paths print without the leading `./`
find . -mindepth 1 -type f -printf "%P\n"
} |
while IFS= read -r file; do
# create the matching subdirectory under output/ before writing
mkdir -p output/"$(dirname "$file")"
diff /opt/dir1/"$file" /opt/dir2/"$file" >> output/"$file"
done
Notes:
Always quote your variables.
Why you shouldn't parse the output of ls(1)
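The find-then-loop approach can also be sketched in Python. This is only an illustration under some assumptions: the files are text, a unified diff from difflib is acceptable instead of the classic diff(1) format, and identical files are skipped rather than producing an empty output file. The function name diff_folders is hypothetical.

```python
import difflib
from pathlib import Path


def diff_folders(dir1, dir2, out_dir):
    """Write a unified diff for every file that exists under both dir1 and dir2."""
    dir1, dir2, out_dir = Path(dir1), Path(dir2), Path(out_dir)
    written = []
    for left in sorted(p for p in dir1.rglob("*") if p.is_file()):
        rel = left.relative_to(dir1)  # plays the role of find's %P
        right = dir2 / rel
        if not right.is_file():
            continue  # no counterpart in dir2
        diff = list(difflib.unified_diff(
            left.read_text().splitlines(keepends=True),
            right.read_text().splitlines(keepends=True),
            fromfile=str(left), tofile=str(right)))
        if diff:  # identical files produce no diff lines
            target = out_dir / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text("".join(diff))
            written.append(str(rel))
    return written
```

The returned list names the files that differ, and each diff is saved under out_dir with the same relative name as the source file.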

How to rename file based on parent and child folder name in bash script

I would like to rename a file based on the names of its parent and grandparent directories.
For example:
the test.xml file is located at
/usr/local/data/A/20180101
/usr/local/data/A/20180102
/usr/local/data/B/20180101
How can I save the test.xml files in /usr/local/data/output as
A_20180101_test.xml
A_20180102_test.xml
B_20180101_test.xml
I tried the shell script below, but it does not help.
#!/usr/bin/env bash
target_dir_path="/usr/local/data/output"
for file in /usr/local/data/*/*/test.xml; do
l1="${file%%/*}"
l2="${file#*/}"
l2="${l2%%/*}"
filename="${file##*/}"
target_file_name="${l1}_${l2}_${filename}"
echo cp "$file" "${target_dir_path}/${target_file_name}"
done
Is there anything I am doing wrong in this shell script?
You can use the following command to do this operation:
source_folder="usr/local/data/"; target_folder="target"; find "$source_folder" -type f -name test.xml | awk -v targetF="$target_folder" 'BEGIN{FS="/"; OFS="_"}{printf "%s ", $0; print targetF"/"$(NF-2),$(NF-1),$NF}' | xargs -n2 cp
or on several lines for readability:
source_folder="usr/local/data/"
target_folder="target"
find "$source_folder" -type f -name test.xml |
awk -v targetF="$target_folder" 'BEGIN{FS="/"; OFS="_"}{printf "%s ", $0; print targetF"/"$(NF-2),$(NF-1),$NF}' |
xargs -n2 cp
where
target_folder is your target folder
source_folder is your source folder
the find command searches for all files named test.xml under the source folder
the awk command receives the target folder as a variable; the BEGIN block defines the input field separator (/) and the output field separator (_), and the main block prints the original file name followed by the new one
xargs passes the output to cp two arguments at a time (source, then destination), and the trick is done
TESTED:
TODO:
you just need to set source_folder and target_folder to match your environment, optionally put the commands in a script, and you are good to go!
I've modified your code a little to get it to work. See the comments in the code:
target_dir_path="/usr/local/data/output"
for file in /usr/local/data/*/*/test.xml; do
tmp=${file%/*/*/*} # Strip the last three path components
curr="${file#"$tmp/"}" # Extract wanted part of the filename
mod=${curr//[\/]/_} # Replace forward slash with underscore
mv "$file" "$target_dir_path/$mod" # Move the file
done
If you have the Perl-based rename command:
$ for f in tst/*/*/test.xml; do
rename -n 's|.*/([^/]+)/([^/]+)/(test\.xml)|./$1_$2_$3|' "$f"
done
rename(tst/A/20180101/test.xml, ./A_20180101_test.xml)
rename(tst/A/20180102/test.xml, ./A_20180102_test.xml)
rename(tst/B/20180101/test.xml, ./B_20180101_test.xml)
The -n option is for a dry run; remove it after testing.
Change tst to /usr/local/data and ./ to /usr/local/data/output/ for your use case.
.*/ ignores the leading part of the file path
([^/]+)/([^/]+)/(test\.xml) captures the required portions
$1_$2_$3 re-arranges them as required
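The same renaming rule can be written as a short Python sketch. It assumes the layout from the question (two directory levels above each test.xml) and copies rather than moves. The names target_name and copy_renamed are invented for this example.

```python
import shutil
from pathlib import Path


def target_name(path):
    """Return '<grandparent>_<parent>_<filename>' for the given file path."""
    p = Path(path)
    return p.parent.parent.name + "_" + p.parent.name + "_" + p.name


def copy_renamed(source_root, target_dir, filename="test.xml"):
    """Copy every <source_root>/*/*/<filename> into target_dir under its new name."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(source_root).glob("*/*/" + filename)):
        dest = target_dir / target_name(src)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest.name)
    return copied
```

For the paths in the question, `copy_renamed("/usr/local/data", "/usr/local/data/output")` would be the call to try, ideally on a copy of the data first.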

Finding and deleting files using python script [duplicate]

I am writing a Python script to find and remove all .py files that have corresponding .pyc files.
How can I extract this file list and remove the files?
For example, consider some files in /foo/bar:
file.py
file.pyc
file3.py
file2.py
file2.pyc...etc
I want to delete file.py and file2.py but not file3.py, as it does not have a corresponding .pyc file,
and I want to do this in all folders under '/'.
Is there a one-liner in bash for the same?
P.S.: I am using CentOS 6.8 with Python 2.7.
Here's my solution:
import os

# Collect every .py file under the root directory
ab = []
for roots, dirs, files in os.walk("/home/foo/bar/"):
    for file in files:
        if file.endswith(".py"):
            ab.append(os.path.join(roots, file))
# Build the matching .pyc name for each .py file
bc = []
for i in range(len(ab)):
    bc.append(ab[i] + "c")
# Collect every .pyc file under the same root
xy = []
for roots, dirs, files in os.walk("/home/foo/bar/"):
    for file in files:
        if file.endswith(".pyc"):
            xy.append(os.path.join(roots, file))
# Keep the .py files whose .pyc exists, then delete them
ex = [x[:-1] for x in bc if x in xy]
for i in ex:
    os.remove(i)
P.S.: Newbie in Python scripting.
Bash solution:
#!/bin/bash
find /foo/bar -name "*.py" > file1.txt # all .py files
find /foo/bar -name "*.pyc" > file2.txt # all .pyc files
p=$(wc -l < file1.txt)
for ((c=1; c<=p; c++))
do
    list=$(sed -n "${c}p" file1.txt) # the c-th .py file
    # delete it only if the exact name plus "c" is listed in file2.txt
    if grep -qxF "${list}c" file2.txt
    then
        echo " exist : $list"
        rm -f "$list"
    fi
done
This solution stays close to the operating system.
You could make a shell script from the following commands and invoke it from Python using subprocess.call (How to call a shell script from python code?, Calling an external command in Python)
find . -name "*.pyc" > /tmp/pyc.txt
find . -name "*.py" > /tmp/py.txt
From the entries of these files, remove the file ending using sed; the path is kept so that files are matched per directory and rm can still find them:
sed -i 's/\.[^.]*$//' /tmp/pyc.txt # remove file ending
sed -i 's/\.[^.]*$//' /tmp/py.txt # remove file ending
(https://unix.stackexchange.com/questions/44735/how-to-get-only-filename-using-sed)
awk 'FNR==NR{a[$0];next}($0 in a){print $0 ".py"}' /tmp/pyc.txt /tmp/py.txt > /tmp/rm.txt (https://unix.stackexchange.com/questions/125155/compare-two-files-for-matching-lines-and-store-positive-results)
while IFS= read -r f ; do
rm "$f"
done < /tmp/rm.txt (Unix: How to delete files listed in a file)
The following code will work for a single-layer directory. (Note: I wasn't sure how you wanted to handle multiple layers of folders, e.g. if you have A.py in one folder and A.pyc in another, does it count as having both present, or do they have to be in the same folder? In the latter case, it should be fairly simple to loop through the folders and call this code within each iteration.)
import os

folder_path = "."  # directory to clean; "." (the current directory) is a placeholder
# Produces a sorted list of all files in the directory
dirList = os.listdir(folder_path)
dirList.sort()
# Takes advantage of fact that both py and pyc files will share same name and
# that pyc files will appear immediately after their py counterparts in dirList
lastPyName = ""
for file in dirList:
    if file[-3:] == ".py":
        lastPyName = file[:-3]
    elif file[-4:] == ".pyc":
        if lastPyName == file[:-4]:
            os.remove(os.path.join(folder_path, lastPyName + ".py"))
            os.remove(os.path.join(folder_path, lastPyName + ".pyc")) # In case you want to delete this too
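Looping through the folders, as suggested above, could look like the following sketch. It assumes a .py file only counts as matched when its .pyc sits in the same directory, and it defaults to a dry run so nothing is deleted by accident. The function name remove_py_with_pyc and the dry_run parameter are illustrative, not taken from any answer.

```python
import os


def remove_py_with_pyc(root, dry_run=True):
    """Return .py files that have a .pyc in the same directory; delete them unless dry_run."""
    matched = []
    for folder, _dirs, files in os.walk(root):
        names = set(files)  # fast membership test for the .pyc lookup
        for name in sorted(files):
            if name.endswith(".py") and name + "c" in names:
                path = os.path.join(folder, name)
                matched.append(path)
                if not dry_run:
                    os.remove(path)
    return matched
```

Running it once with the default dry_run=True shows which files would go; passing dry_run=False performs the deletion.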

Bash script to get all file with desired extensions

I'm trying to write a bash script that takes a text file containing some extensions and a folder, and returns an output file listing all files that match the desired extensions, searching recursively in all sub-directories.
The extension list file is my first parameter and the folder is my second.
I have tried:
for i in $1 ; do
find . -name $2\*.$i -print>>result.txt
done
but it doesn't work.
As noted in a comment:
It is not a good idea to write to a hard-coded file name.
The example below fixes only the code given in the question.
Of course, it is even better to call it with
x.sh y . > blabla
and remove the file name from the script itself. But my intention is only to fix the given code, not to rework the question...
The following bash script, named x.sh
#!/bin/bash
: >result.txt # delete old content
while read -r i; do # read a line from file
find "$2" -name "*.$i" -print >>result.txt # for every item do a find
done <"$1" # read from file named with first arg from cmdline
with a text file named y that has the following content
txt
sh
and called with:
./x.sh y .
results in a file result.txt whose content is:
a.txt
b.txt
x.sh
OK, here are some additional hints based on the comments:
If the result file should not collect any content from earlier runs of the script, it can be simplified to:
#!/bin/bash
while read -r i; do # read a line from file
find "$2" -name "*.$i" -print # for every item do a find
done <"$1" >result.txt # read from file named with first arg from cmdline
And as already mentioned:
The hard-coded result.txt could be removed, and the call can be something like
./x.sh y . > result.txt
Give this one-liner command a try.
Replace /mydir with the folder to search.
Change the list of extensions passed as an argument to the egrep command:
find /mydir -type f | egrep "[.]txt$|[.]xml$" >> result.txt
After egrep, separate the extensions with |.
The . character must be escaped as [.], and the trailing $ anchors each extension to the end of the file name.
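For readers who prefer Python, the same task can be sketched as below. It assumes the extension file holds one extension per line without the leading dot, as in the file y above. The function name find_by_extensions is hypothetical.

```python
import os


def find_by_extensions(ext_file, folder):
    """Return all files under folder whose name ends with one of the listed extensions."""
    with open(ext_file) as handle:
        # one extension per line, without the leading dot; blank lines are ignored
        suffixes = tuple("." + line.strip() for line in handle if line.strip())
    matches = []
    for current, _dirs, files in os.walk(folder):
        for name in files:
            if name.endswith(suffixes):  # str.endswith accepts a tuple of suffixes
                matches.append(os.path.join(current, name))
    return sorted(matches)
```

Writing the result with `print("\n".join(find_by_extensions("y", ".")))` and redirecting it on the command line keeps the output file name out of the script, in line with the advice above.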

Insert line number in all files in folder

I have a folder which contains more than 100 files. I want to insert line numbers into each file.
The nl command writes its output to standard output on the terminal, but I want to add line numbers in all files of the folder.
Can you suggest how to do this?
Following on @Gianluca's answer, and using bash instead:
for i in *.c *.h ; do nl "$i" > "$i.numbered" && mv "$i.numbered" "$i" ; done
This replaces all files ending with .c or .h in the current directory with line-numbered versions.
Using tcsh you can do something like
foreach f (*)
nl $f > $f.out
mv $f.out $f
end
You can delete the mv command if you don't want to overwrite the original files.
(Try the script on a copy ;-) )
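A rough Python equivalent of the write-then-rename loop is sketched here. It is not a drop-in replacement for nl: it numbers every line, including blank ones, whereas nl by default numbers only non-empty lines, and it assumes the files are text. The function name number_lines_in_place and the .numbered suffix for the temporary file are choices made for this example.

```python
from pathlib import Path


def number_lines_in_place(folder, pattern="*"):
    """Rewrite each matching file so every line starts with its line number."""
    changed = []
    # sorted() collects the file list first, so temporary files are never picked up
    for path in sorted(Path(folder).glob(pattern)):
        if not path.is_file():
            continue
        lines = path.read_text().splitlines()
        # width 6 plus a tab mimics the default layout of nl
        numbered = ["{:6d}\t{}".format(n, text) for n, text in enumerate(lines, start=1)]
        tmp = path.with_name(path.name + ".numbered")
        tmp.write_text("".join(line + "\n" for line in numbered))
        tmp.replace(path)  # same idea as "mv $i.numbered $i"
        changed.append(path.name)
    return changed
```

As with the shell versions, it is safest to try this on a copy of the folder first.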
