How can I replace part of a path with a fixed string?

I want to get the directory name, with a minor change, from a path string in one line.
For example, if the given string is
'./dir1/dir2/dir3/xxx.txt'
I want to get
'./dir1/fix_string_as_suffix'
I want to combine it with a find command like the one below, though the details aren't fixed yet.
find . -type d \( -name 'vendor' \) -prune -o -type f -name '*.txt' -print | cut -d'/' -f1,2

You can use sed:
s='./dir1/dir2/dir3/xxx.txt'
sed -E 's~^(([^/]*/){2}).*~\1~' <<< "$s"
./dir1/
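If you want the fixed suffix attached in the same pass, the capture can be reused in the replacement. A minimal sketch, using the example suffix from the question:

```shell
s='./dir1/dir2/dir3/xxx.txt'
# Capture the first two path components, drop everything after them,
# and substitute the fixed suffix in the same pass.
result=$(printf '%s\n' "$s" | sed -E 's~^(([^/]*/){2}).*~\1fix_string_as_suffix~')
echo "$result"
```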

Use IFS & printf Builtin
There's more than one way to do what you want within the shell, depending on whether you want to use standard utilities like cut, sed, or awk, or to limit yourself to expansions and builtins found within the shell itself. One approach is to split the path into components, and then rejoin them with the printf command or builtin. For example:
path='./dir1/dir2/dir3/xxx.txt'
IFS='/' read -ra dirs <<< "$path"
printf '%s' "${dirs[0]}/" "${dirs[1]}/" "..." $'\n'
If you want to turn this into a one-liner, join the commands with a semicolon and/or use curly braces for command grouping. However, readability trumps conciseness, especially in shell programming.
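For shells without arrays, a POSIX-leaning sketch of the same split-and-rejoin idea, using read with a here-document (the suffix is the placeholder from the question):

```shell
path='./dir1/dir2/dir3/xxx.txt'
# read splits on IFS; "rest" soaks up everything past the second component.
IFS=/ read -r first second rest <<EOF
$path
EOF
out="$first/$second/fix_string_as_suffix"
echo "$out"
```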

Use the cut command:
#!/bin/bash
path="./dir1/dir2/dir3/xxx.txt"
suffix="some_fixed_string"
prefix=$(echo "$path" | cut -d/ -f1,2)
echo "$prefix/$suffix"
Concatenate the lines above using ; to get it all on one line:
prefix=$(echo "$path" | cut -d/ -f1,2) ; echo "$prefix/$suffix"
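As an aside, the same prefix can be extracted with parameter expansion alone, avoiding the echo | cut pipeline entirely. A sketch, assuming the path always starts with ./:

```shell
path="./dir1/dir2/dir3/xxx.txt"
suffix="fix_string_as_suffix"
# Strip the leading "./", keep everything up to the first remaining "/",
# then reassemble. Assumes the path starts with "./".
rest=${path#./}
prefix="./${rest%%/*}"
result="$prefix/$suffix"
echo "$result"
```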

I think this is what you're looking for:
find . -type d \( -name 'vendor' \) -prune -o -type f -name '*.txt' -exec sh -c 'echo "${1%/*}/fix_string_as_suffix"' fstrasuff {} \;

Actually, I ended up with the command below.
find . -type d \( -name 'vendor' \) -prune -o -type f -name '*.txt' -print | cut -d'/' -f1,2 | uniq | awk '{print $1"/..."}'
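One caveat about this pipeline: uniq only collapses adjacent duplicate lines, so if find emits paths that are not grouped by directory, sort -u is the safer choice. A small illustration of the difference:

```shell
# uniq only removes *adjacent* duplicates; sort -u removes them all.
list='./dir1
./dir2
./dir1'
with_uniq=$(printf '%s\n' "$list" | uniq | wc -l)
with_sort=$(printf '%s\n' "$list" | sort -u | wc -l)
echo "$with_uniq $with_sort"
```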

Related

Sed and grep in multiple files

I want to use "sed" and "grep" to search and replace in multiple files, excluding some directories.
I run this command:
$ grep -RnI --exclude-dir={node_modules,install,build} 'chaine1' /projets/ | sed -i 's/chaine1/chaine2/'
I get this message:
sed: no input files
I also tried with these two commands:
$ grep -RnI --exclude-dir={node_modules,install,build} 'chaine1' . | xargs -0 sed -i 's/chaine2/chaine2/'
$ grep -RnI --exclude-dir={node_modules,install,build} 'chaine2' . -exec sed -i 's/chaine1/chaine2/g' {} \;
But it doesn't work!
Could you help me please?
Thanks in advance.
You want find with -exec. Don't bother running grep; sed will only change lines containing your pattern anyway.
find \( -name node_modules -o -name install -o -name build \) -prune \
-o -type f -exec sed -i 's/chaine1/chaine2/' {} +
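A quick way to convince yourself the pruning works is to run the command against a throwaway directory. This sketch assumes GNU sed (whose -i takes no mandatory suffix argument):

```shell
# Sanity check in a temporary directory: the pruned directory is untouched.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/node_modules"
printf 'chaine1 here\n' > "$tmp/src/a.txt"
printf 'chaine1 here\n' > "$tmp/node_modules/b.txt"
find "$tmp" \( -name node_modules -o -name install -o -name build \) -prune \
    -o -type f -exec sed -i 's/chaine1/chaine2/' {} +
cat "$tmp/src/a.txt"            # file outside the pruned dir is rewritten
cat "$tmp/node_modules/b.txt"   # file under node_modules is left alone
```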
First, the direct output of the grep command is not a list of file paths; each line looks like {file_path}:{line_no}:{content}. So the first thing you need to do is extract the file paths. We can do this with the cut command or with the -l option of grep.
# This will print {file_path}
$ echo {file_path}:{line_no}:{content} | cut -f 1 -d ":"
# This is a better solution, because it only prints each file once even though
# the grep pattern appears at many lines of a file.
$ grep -RlI --exclude-dir={node_modules,install,build} "chaine1" /projets/
Second, sed -i does not read from stdin. We can use xargs to read each file path from stdin and pass it to sed as an argument. You have already done this.
The complete command looks like this:
$ grep -RlI --exclude-dir={node_modules,install,build} "chaine1" /projets/ | xargs -i sed -i 's/chaine1/chaine2/' {}
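If file names may contain spaces, a NUL-delimited variant is safer: GNU grep's -Z (--null) terminates each file name with a NUL byte that xargs -0 understands. A minimal sketch in a throwaway directory (--exclude-dir works here exactly as above; it is omitted only for brevity):

```shell
# NUL-delimited pipeline that survives spaces in file names
# (assumes GNU grep's -Z/--null and GNU sed's -i).
tmp=$(mktemp -d)
printf 'chaine1\n' > "$tmp/my file.txt"
grep -RlIZ 'chaine1' "$tmp" | xargs -0 sed -i 's/chaine1/chaine2/'
cat "$tmp/my file.txt"
```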
Edit: Thanks to @EdMorton's comment, I dug into find. My previous solution walks the files outside the excluded directories once with grep, and then processes the files containing the pattern a second time with sed. However, we can use find to filter files by their path names first, and then use sed to process each file only once.
My find solution is almost the same as @knittl's, but with a bug fixed. Besides, I try to explain why it gets similar results to grep. I still haven't found how to skip binary files the way grep's -I option does.
$ find \( \( -name node_modules -o -name install -o -name build \) -prune -type f \
-o -type f \) -exec echo {} +
or
find \( \( -name node_modules -o -name install -o -name build \) -prune \
-o -type f \) -type f -exec echo {} +
\( -name pat1 -o -name pat2 \) gives paths matching pat1 or pat2 (including both files and directories), where -o means logical or. -prune skips a directory and the files under it. Together they achieve a similar effect to --exclude-dir in grep.
-type f gives paths of regular files.

How do I reformat the output of a BASH script?

I am trying to find specific files in a directory that contain a string.
Code I've written so far:
for x in $(find "$1" -type f -name "*.$2") ;
do
grep -Hrnw $x -e "$3"
done
The output I get:
./crop.py:2:import torch
./crop.py:3:import torch.backends.cudnn as cudnn
I am trying to get spaces on both sides of the colon like this:
./crop.py : 2 : import torch
./crop.py : 3 : import torch.backends.cudnn as cudnn
I am fairly new to programming in BASH. I've tried using the sed command but had no luck with it.
find "$1" -type f -name "*.$2" | xargs grep -Hrnw -e "$3" | sed 's/:/ : /g'
I am trying to get spaces on both sides of the colon
I've tried using the sed command but had no luck with it.
sed 's/:/ : /g'
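One caveat: the global substitution also pads any colons inside the matched line itself. If that matters, restricting the substitution to the first two colons (the separators grep inserted) is safer. A sketch:

```shell
line='./crop.py:2:import torch; d = {1:2}'
# Pad only the first two colons (the file/line separators added by grep),
# leaving any colons inside the matched source line alone.
out=$(printf '%s\n' "$line" | sed -E 's/^([^:]*):([^:]*):/\1 : \2 : /')
echo "$out"
```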
Why are you using a for-loop to iterate over the find results, when you can use find ... -exec?
Like this:
find "$1" -type f -name "*.$2" -exec grep -Hrnw {} -e "$3" \;
(I didn't test this, it might contain some bugs)
Suggesting to try this one-liner (using -exec rather than command substitution, since quotes embedded via -printf are not re-parsed by the shell and file names with spaces would still break):
find "$1" -type f -name "*.$2" -exec grep -Hrnw "$3" {} + | sed 's/:/ : /g'

Unix display info about files matching one of two patterns

I'm trying to display on a Unix system, recursively, all the files that start with an a or end with an a, with some info about them: name, size and last modified.
I tried find . -name "*a" -o -name "a*" and it displays all the files okay, but when I add -printf "%p %s" it displays only one result.
If you want the same action to apply to both patterns, you need to group them with parentheses. Also, you should add a newline to printf, otherwise all of the output will be on one line:
find . \( -name "*a" -o -name "a*" \) -printf "%p %s\n"
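Since the question also asks for the last-modified time, GNU find's %T directives can be added to the same -printf (these are GNU extensions, not POSIX). A sketch against a throwaway directory:

```shell
# Name, size in bytes, and last-modified date in one pass (GNU find).
tmp=$(mktemp -d)
printf 'hi\n' > "$tmp/alpha"
out=$(find "$tmp" -type f \( -name "*a" -o -name "a*" \) \
    -printf "%p %s %TY-%Tm-%Td\n")
echo "$out"
```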
find . -name "*.c" -o -name "*.hh" | xargs ls -l | awk '{print $9,$6,$7,$8,$5}'

list the file and its base directory

I have some files in my folder /home/sample/**/*.pdf, plus *.doc and *.xls etc. ('**' means some sub-sub-directory).
I need a shell script or Linux command to list the files in the following manner.
pdf_docs/xx.pdf
documents/xx.doc
excel/xx.xls
pdf_docs, documents and excel are directories located at various depths in /home/sample, like:
/home/sample/12091/pdf_docs/xx.pdf
/home/sample/documents/xx.doc
/home/excel/V2hm/1001/excel/xx.xls
You can try this:
for i in {*.pdf,*.doc,*.xls}; do find /home/sample/ -name "$i"; done | awk -F/ '{print $(NF-1) "/" $NF}'
I've added a line of awk which prints just the last two fields (separated by '/') of the result.
Something like this?
for i in {*.pdf,*.doc,*.xls}; do
find /home/sample/ -name "$i";
done | perl -lnwe '/([^\/]+\/[^\/]+)$/&&print $1'
How about this?
find /home/sample -type f -regex '^.*\.\(pdf\|doc\|xls\)$'
This takes into account spaces in file names and potential case variations of the extension:
for a in {*.pdf,*.doc,*.xls}; do find /home/sample/ -type f -iname "$a" -exec basename {} \; ; done
EDIT
Edited to take into account only files
You don't need to call out to an external program to chop the pathname like you're looking for:
$ filename=/home/sample/12091/pdf_docs/xx.pdf
$ echo ${filename%/*/*}
/home/sample/12091
$ echo ${filename#${filename%/*/*}?}
pdf_docs/xx.pdf
So,
find /home/sample \( -name \*.doc -o -name \*.pdf -o -name \*.xls \) -print0 |
while read -r -d '' pathname; do
echo "${pathname#${pathname%/*/*}?}"
done

how to find files containing a string using egrep

I would like to find the files containing specific string under linux.
I tried something like this but could not succeed:
find . -name *.txt | egrep mystring
Here you are sending the file names (output of the find command) as input to egrep; you actually want to run egrep on the contents of the files.
Here are a couple of alternatives:
find . -name "*.txt" -exec egrep mystring {} \;
or even better
find . -name "*.txt" -print0 | xargs -0 egrep mystring
Check the find command's help to see what the individual arguments do.
The first approach will spawn a new process for every file, while the second will pass more than one file as argument to egrep; the -print0 and -0 flags are needed to deal with potentially nasty file names (allowing to separate file names correctly even if a file name contains a space, for example).
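A quick demonstration of why the -print0/-0 pairing matters, using a throwaway file whose name contains a space:

```shell
# With -print0 and xargs -0, a file name containing a space stays intact.
tmp=$(mktemp -d)
printf 'mystring\n' > "$tmp/my notes.txt"
matches=$(find "$tmp" -name '*.txt' -print0 | xargs -0 egrep -l mystring)
echo "$matches"
```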
try:
find . -name '*.txt' | xargs egrep mystring
There are two problems with your version:
Firstly, *.txt will first be expanded by the shell, giving you a listing of files in the current directory which end in .txt, so for instance, if you have the following:
[dsm@localhost:~]$ ls *.txt
test.txt
[dsm@localhost:~]$
your find command will turn into find . -name test.txt. Just try the following to illustrate:
[dsm@localhost:~]$ echo find . -name *.txt
find . -name test.txt
[dsm@localhost:~]$
Secondly, egrep does not take filenames from STDIN. To convert them to arguments you need to use xargs
find . -name *.txt | egrep mystring
That will not work, as egrep will be searching for mystring within the output generated by find . -name '*.txt', which is just the paths to the *.txt files.
Instead, you can use xargs:
find . -name '*.txt' | xargs egrep mystring
You could use
find . -iname '*.txt' -exec egrep mystring {} \;
Here's an example that will return the file paths of a all *.log files that have a line that begins with ERROR:
find . -name "*.log" -exec egrep -l '^ERROR' {} \;
There's a recursive option for egrep you can use:
egrep -R "pattern" *.log
If you only want the filenames:
find . -type f -name '*.txt' -exec egrep -l pattern {} \;
If you want filenames and matches:
find . -type f -name '*.txt' -exec egrep pattern {} /dev/null \;
