I need to GREP second column (path name) from the text file. I have a text file which has a md5checksum filepath size . Such as:
ce75d423203a62ed05fe53fe11f0ddcf kart/pan/mango.sh 451b
8e6777b67f1812a9d36c7095331b23e2 kart/hey/local 301376b
e0ddd11b23378510cad9b45e3af89d79 yo/cat/so 293188b
4e0bdbe9bbda41d76018219f3718cf6f asuo/hakl 25416b
the above is the text file, I used grep -Eo '[/]' file.txt but it prints only / , but i want the output like this:
kart/pan/mango.sh
kart/hey/local
yo/cat/so
asuo/hakl
Lastly I have to use GREP.
If you can live with spaces before and after, you can use:
grep -o "\s[[:alnum:]/]*\s"
If you need the spaces removed, you will need some zero-width look-ahead/look-behind which is only available with -P (perl regexes), if you have that you can use:
grep -Po "(?<=\s)[[:alnum:]/]+(?=\s)"
(?<=\s) - look-behind to see if there is a space preceding the string, but not capture it
(?=\s) - look-ahead to see if there is a space after the match, but not capture it
[:alnum:] - match alpha numeric chars
[[:alnum:]/] - match alphanumeric chars and /
+ - match one or more
However, grep is not the right tool for this, cut/sed/awk are way better
cut -d ' ' -f 2
-d ' ' means your delimiter is the space
-f 2 meant you want to only print field number two
Use awk instead.
awk '{print $2}' file.txt
When you are allowed to combine grep with other tools and your input file only has slashes in the second field, you can use
tr " " "\n" < file.txt | grep '/'
How to give all words from one file to tr for searching and deleting in text from another file?
For example, I have a file vocabulary.txt and loveStroty.txt. I'm trying to delete all words that in are vocabulary from love Story.
$ voc="one free" #files look like this strings
$ love="one two free four"
$ tr "$voc" '' <<< $love
Example for output (doesn't matter if it is with separators or with new line separated):
two
four
I'm assuming your input files look like this:
$ cat lovestory.txt
one two free four
$ cat vocabulary.txt
one free
In Bash, I can then use grep, process substitution and tr to remove every word from lovestory.txt that exists in vocabulary.txt like this:
$ grep -vFxf <(tr ' ' '\n' < vocabulary.txt) <(tr ' ' '\n' < lovestory.txt)
two
four
tr ' ' '\n' < file replaces every space in file with a newline; grep -vFx removes matches of complete lines (fixed strings, no regular expressions).
If files are not big enough, you could give sed utility a try:
# Define the text which replaces the searched words
replace="<Replacement string here>"
for word in $(cat /path/to/<file_containing_words>); do
sed -i "s/${word}/${replace}/g" <file_to_be_replaced>
done
So, for your specific example
replace=""
for word in $(cat /path/to/voc); do
sed -i "s/${word}/${replace}/g" /path/to/love
done
With GNU awk for multi-char RS:
$ awk -v RS='\\s+' 'NR==FNR{a[$0];next} !($0 in a)' vocabulary.txt lovestory.txt
two
four
I figured out how to get the line number of the last matching word in the file :
cat -n textfile.txt | grep " b " | tail -1 | cut -f 1
It gave me the value of 1787. So, I passed it manually to the sed command to search for the lines that contains the sentence "blades are down" after that line number and it returned all the lines successfully
sed -n '1787,$s/blades are down/&/p' myfile.txt
Is there a way that I can pass the line number from the first command to the second one through a variable or a file so I can but them in the script to be executed automatically ?
Thank you.
You can do this by just connecting your two commands with xargs. 'xargs -I %' allows you to take the stdin from a previous command and place it whenever you want in the next command. The '%' is where your '1787' will be written:
cat -n textfile.txt | grep " b " | tail -1 | cut -f 1 | xargs -I % sed -n %',$s/blades are down/&/p' myfile.txt
You can use:
command substitution to capture the result of the first command in a variable.
simple string concatenation to use the variable in your sed comand
startLine=$(grep -n ' b ' textfile.txt | tail -1 | cut -d ':' -f1)
sed -n ${startLine}',$s/blades are down/&/p' myfile.txt
You don't strictly need the intermediate variable - you could simply use:
sed $(grep -n ' b ' textfile.txt | tail -1 | cut -d ':' -f1)',$s/blades are down/&/p' myfile.txt`
but it may make sense to do error checking on the result of the command substitution first.
Note that I've streamlined the first command by using grep's -n option, which puts the line number separated with : before each match.
First we can get "half" of the file after the last match of string2, then you can use grep to match all the string1
tac your_file | awk '{ if (match($0, "string2")) { exit ; } else {print;} }' | \
grep "string1"
but the order is reversed if you don't care about the order. But if you do care, just add another tac at the end with a pipe |.
This might work for you (GNU sed):
sed -n '/\n/ba;/ b /h;//!H;$!d;x;//!d;s/$/\n/;:a;/\`.*blades are down.*$/MP;D' file
This reads through the file storing all lines following the last match of the first string (" b ") in the hold space.
At the end of file, it swaps to the hold space, checks that it does indeed have at least one match, then prints out those lines that match the second string ("blades are down").
N.B. it makes the end case (/\n/) possible by adding a new line to the end of the hold space, which will eventually be thrown away. This also caters for the last line edge condition.
How do I join the result of ls -1 into a single line and delimit it with whatever I want?
paste -s -d joins lines with a delimiter (e.g. ","), and does not leave a trailing delimiter:
ls -1 | paste -sd "," -
EDIT: Simply "ls -m" If you want your delimiter to be a comma
Ah, the power and simplicity !
ls -1 | tr '\n' ','
Change the comma "," to whatever you want. Note that this includes a "trailing comma" (for lists that end with a newline)
This replaces the last comma with a newline:
ls -1 | tr '\n' ',' | sed 's/,$/\n/'
ls -m includes newlines at the screen-width character (80th for example).
Mostly Bash (only ls is external):
saveIFS=$IFS; IFS=$'\n'
files=($(ls -1))
IFS=,
list=${files[*]}
IFS=$saveIFS
Using readarray (aka mapfile) in Bash 4:
readarray -t files < <(ls -1)
saveIFS=$IFS
IFS=,
list=${files[*]}
IFS=$saveIFS
Thanks to gniourf_gniourf for the suggestions.
I think this one is awesome
ls -1 | awk 'ORS=","'
ORS is the "output record separator" so now your lines will be joined with a comma.
Parsing ls in general is not advised, so alternative better way is to use find, for example:
find . -type f -print0 | tr '\0' ','
Or by using find and paste:
find . -type f | paste -d, -s
For general joining multiple lines (not related to file system), check: Concise and portable “join” on the Unix command-line.
The combination of setting IFS and use of "$*" can do what you want. I'm using a subshell so I don't interfere with this shell's $IFS
(set -- *; IFS=,; echo "$*")
To capture the output,
output=$(set -- *; IFS=,; echo "$*")
Adding on top of majkinetor's answer, here is the way of removing trailing delimiter(since I cannot just comment under his answer yet):
ls -1 | awk 'ORS=","' | head -c -1
Just remove as many trailing bytes as your delimiter counts for.
I like this approach because I can use multi character delimiters + other benefits of awk:
ls -1 | awk 'ORS=", "' | head -c -2
EDIT
As Peter has noticed, negative byte count is not supported in native MacOS version of head. This however can be easily fixed.
First, install coreutils. "The GNU Core Utilities are the basic file, shell and text manipulation utilities of the GNU operating system."
brew install coreutils
Commands also provided by MacOS are installed with the prefix "g". For example gls.
Once you have done this you can use ghead which has negative byte count, or better, make alias:
alias head="ghead"
Don't reinvent the wheel.
ls -m
It does exactly that.
just bash
mystring=$(printf "%s|" *)
echo ${mystring%|}
This command is for the PERL fans :
ls -1 | perl -l40pe0
Here 40 is the octal ascii code for space.
-p will process line by line and print
-l will take care of replacing the trailing \n with the ascii character we provide.
-e is to inform PERL we are doing command line execution.
0 means that there is actually no command to execute.
perl -e0 is same as perl -e ' '
To avoid potential newline confusion for tr we could add the -b flag to ls:
ls -1b | tr '\n' ';'
It looks like the answers already exist.
If you want
a, b, c format, use ls -m ( Tulains Córdova’s answer)
Or if you want a b c format, use ls | xargs (simpified version of Chris J’s answer)
Or if you want any other delimiter like |, use ls | paste -sd'|' (application of Artem’s answer)
The sed way,
sed -e ':a; N; $!ba; s/\n/,/g'
# :a # label called 'a'
# N # append next line into Pattern Space (see info sed)
# $!ba # if it's the last line ($) do not (!) jump to (b) label :a (a) - break loop
# s/\n/,/g # any substitution you want
Note:
This is linear in complexity, substituting only once after all lines are appended into sed's Pattern Space.
#AnandRajaseka's answer, and some other similar answers, such as here, are O(n²), because sed has to do substitute every time a new line is appended into the Pattern Space.
To compare,
seq 1 100000 | sed ':a; N; $!ba; s/\n/,/g' | head -c 80
# linear, in less than 0.1s
seq 1 100000 | sed ':a; /$/N; s/\n/,/; ta' | head -c 80
# quadratic, hung
sed -e :a -e '/$/N; s/\n/\\n/; ta' [filename]
Explanation:
-e - denotes a command to be executed
:a - is a label
/$/N - defines the scope of the match for the current and the (N)ext line
s/\n/\\n/; - replaces all EOL with \n
ta; - goto label a if the match is successful
Taken from my blog.
If you version of xargs supports the -d flag then this should work
ls | xargs -d, -L 1 echo
-d is the delimiter flag
If you do not have -d, then you can try the following
ls | xargs -I {} echo {}, | xargs echo
The first xargs allows you to specify your delimiter which is a comma in this example.
ls produces one column output when connected to a pipe, so the -1 is redundant.
Here's another perl answer using the builtin join function which doesn't leave a trailing delimiter:
ls | perl -F'\n' -0777 -anE 'say join ",", #F'
The obscure -0777 makes perl read all the input before running the program.
sed alternative that doesn't leave a trailing delimiter
ls | sed '$!s/$/,/' | tr -d '\n'
Python answer above is interesting, but the own language can even make the output nice:
ls -1 | python -c "import sys; print(sys.stdin.read().splitlines())"
You can use:
ls -1 | perl -pe 's/\n$/some_delimiter/'
If Python3 is your cup of tea, you can do this (but please explain why you would?):
ls -1 | python -c "import sys; print(','.join(sys.stdin.read().splitlines()))"
ls has the option -m to delimit the output with ", " a comma and a space.
ls -m | tr -d ' ' | tr ',' ';'
piping this result to tr to remove either the space or the comma will allow you to pipe the result again to tr to replace the delimiter.
in my example i replace the delimiter , with the delimiter ;
replace ; with whatever one character delimiter you prefer since tr only accounts for the first character in the strings you pass in as arguments.
You can use chomp to merge multiple line in single line:
perl -e 'while (<>) { if (/\$/ ) { chomp; } print ;}' bad0 >test
put line break condition in if statement.It can be special character or any delimiter.
Quick Perl version with trailing slash handling:
ls -1 | perl -E 'say join ", ", map {chomp; $_} <>'
To explain:
perl -E: execute Perl with features supports (say, ...)
say: print with a carrier return
join ", ", ARRAY_HERE: join an array with ", "
map {chomp; $_} ROWS: remove from each line the carrier return and return the result
<>: stdin, each line is a ROW, coupling with a map it will create an array of each ROW