I want to find the lines starting with "double" in a text file and store their line numbers in two variables (assume I know there are exactly two such lines). Next, I want to extract the numbers from those lines and store them in two other variables. After that, I want to delete those lines from the file. Could you tell me how to do it?
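To store the line numbers in two variables, var1 and var2, try this:
{ read -r var1; read -r var2; } < <(grep -Fnm 2 double file | cut -d: -f1)
Now var1 and var2 contain the line numbers of the first two lines containing the word double (use grep -n -m 2 '^double' instead to match only lines that start with it). Each read consumes one line of the pipeline's output.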
To "pass them" to two other variables:
foo="$var1"
bar="$var2"
To delete the lines, use sed as shown below (this prints the result to standard output; GNU sed's -i option edits the file in place):
sed "${var1}d;${var2}d;" file
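The question also asks for the numbers contained in the matching lines. The following is a minimal sketch of the whole workflow, not a definitive solution. It assumes each matching line holds one plain decimal number; the sample data and the names num1, num2, and remaining are hypothetical, and grep -o is a common extension rather than strict POSIX.

```shell
#!/usr/bin/env bash
# Hypothetical sample input; the real file is not shown in the question.
file=$(mktemp)
printf '%s\n' 'alpha' 'double 3.14' 'beta' 'double 42' 'gamma' > "$file"

# Line numbers of the lines that start with "double", one per line.
lines=$(grep -n '^double' "$file" | cut -d: -f1)
var1=$(printf '%s\n' "$lines" | sed -n '1p')
var2=$(printf '%s\n' "$lines" | sed -n '2p')

# First number on each of those lines (digits with an optional decimal part).
num1=$(sed -n "${var1}p" "$file" | grep -Eo '[0-9]+(\.[0-9]+)?' | head -n 1)
num2=$(sed -n "${var2}p" "$file" | grep -Eo '[0-9]+(\.[0-9]+)?' | head -n 1)

# Delete both lines; the result goes to stdout and the file itself is unchanged.
remaining=$(sed "${var1}d;${var2}d" "$file")

echo "lines: $var1 $var2, numbers: $num1 $num2"
echo "$remaining"
rm -f "$file"
```

The numbers are extracted before the lines are deleted because the line numbers only make sense against the original file.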
Newbie to unix/shell/bash. I have a file named CellSite whose 6th line is as below:
btsName = "RV74XC038",
I want to extract the string on the 6th line that is between double quotes (i.e. RV74XC038) and save it to a variable. Please note that the 6th line starts with 4 blank spaces, and this string varies from file to file. So I am looking for a solution that extracts the string between the double quotes on the 6th line.
I tried the following, but it does not work:
str2 = sed '6{ s/^btsName = \([^ ]*\) *$/\1/;q } ;d' CellSite;
Any help is much appreciated. TIA.
sed is a stream editor.
For just parsing files, you want to look into awk. Something like this:
awk -F \" '/btsName/ { print $2 }' CellSite
Where:
-F defines a "field separator", in your case the quotation marks "
the entire script consists of:
/btsName/ act only on lines that contain the regex "btsName"
from that line print out the second field; the first field will be everything before the first quotation marks, second field will be everything from the first quotes to the second quotes, third field will be everything after the second quotes
parse through the file named "CellSite"
There are possibly better alternatives, but you would have to show the rest of your file.
Using sed
$ str2=$(sed -n '6s/[^"]*"\([^"]*\).*/\1/p' CellSite)
$ echo "$str2"
RV74XC038
You can use the following awk solution:
btsName=$(awk -F\" 'NR==6{print $2; exit}' CellSite)
Basically, get to the sixth line (NR==6), print the second field value (" is used to split records (lines) into fields) and then exit.
See the online demo:
#!/bin/bash
CellSite='Line 1
Line 2
Line 3
btsName = "NO74NO038",
Line 5
btsName = "RV74XC038","
Line 7
btsName = "no11no000",
'
btsName=$(awk -F\" 'NR==6{print $2; exit}' <<< "$CellSite")
echo "$btsName" # => RV74XC038
This might work for you (GNU sed):
var=$(sed -En '6s/.*"(.*)".*/\1/p;6q' file)
Simplify the regex and turn off implicit printing.
Focus on the 6th line only and print the value between double quotes, then quit.
Bash captures the output of the sed invocation by means of the command substitution $(...), and the extracted value is assigned to the variable var.
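As a quick check of the steps just described, here is a small sketch that runs the same sed command against a hypothetical eight-line sample file (the file contents are invented for illustration; the -E option is available in GNU and BSD sed).

```shell
#!/usr/bin/env bash
# Hypothetical sample file with a quoted value on lines 4, 6 and 8.
file=$(mktemp)
printf '%s\n' 'Line 1' 'Line 2' 'Line 3' '    btsName = "NO74NO038",' \
  'Line 5' '    btsName = "RV74XC038",' 'Line 7' '    btsName = "no11no000",' > "$file"

# -n suppresses automatic printing, the p flag prints only the substituted
# line 6, and 6q stops reading once line 6 has been processed.
var=$(sed -En '6s/.*"(.*)".*/\1/p;6q' "$file")
echo "$var"
rm -f "$file"
```

Only the value from line 6 ends up in var; the quoted values on lines 4 and 8 are never printed.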
I have 2 variables, NUMS and TITLES.
NUMS contains the string
1
2
3
TITLES contains the string
A
B
C
How do I get output that looks like:
1 A
2 B
3 C
paste -d' ' <(echo "$NUMS") <(echo "$TITLES")
Having multi-line strings in variables suggests that you are probably doing something wrong. But you can try
paste -d ' ' <(echo "$nums") - <<<"$titles"
The basic syntax of paste is to read two or more file names; you can use a process substitution to stand in for a file anywhere, and you can use a here string or other redirection to receive one of the "files" on standard input (where the file name is then conventionally replaced with the pseudo-file -).
The default column separator from paste is a tab; you can replace it with a space or some other character with the -d option.
You should avoid upper case for your private variables; see also Correct Bash and shell script variable capitalization
Bash variables can contain even very long strings, but this is often clumsy and inefficient compared to reading straight from a file or pipeline.
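To make the points about the pseudo-file - and the default separator concrete, here is a small sketch. The variable names spaced and tabbed are illustrative only, and the sample values are the ones from the question.

```shell
#!/usr/bin/env bash
nums=$(printf '1\n2\n3')
titles=$(printf 'A\nB\nC')

# First "file": a process substitution. Second "file": standard input (-),
# which is fed by the here string.
spaced=$(paste -d ' ' <(printf '%s\n' "$nums") - <<< "$titles")

# Without -d, paste joins the columns with a tab.
tabbed=$(paste <(printf '%s\n' "$nums") - <<< "$titles")

printf '%s\n' "$spaced"
printf '%s\n' "$tabbed"
```

Both process substitution and here strings are bash features, so this sketch will not run under a plain POSIX sh.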
Convert them to arrays, like this (the unquoted expansions are split on whitespace, so this assumes no entry contains spaces):
NUMS=($NUMS)
TITLES=($TITLES)
Then loop over the indexes of either array, let's say NUMS, like this:
for i in "${!NUMS[@]}"; do
# and echo the desired output
echo "${NUMS[$i]} ${TITLES[$i]}"
done
Awk alternative:
awk 'FNR==NR { map[FNR]=$0; next } { print map[FNR] " " $0 }' <(echo "$NUMS") <(echo "$TITLES")
For the first file/variable (NR==FNR), set up an array called map with the per-file record number (FNR) as the index and the line as the value. Then, for the second file, print the matching entry in the array followed by the line, separated by a space.
I'm trying to read a file line by line, do string manipulations to each line and write the output to a file;
cat fileName | awk '{...}' >> fileOut
The specific string manipulation I am trying to accomplish is to, for each line, firstly print all the content after some index, the same for each line, say X, excluding the terminating newline, then " : ", then the first column, although I could also do this by substring if needed. I have found examples which combine variable declaration of column values, setting them to zero, variable declaration of substrings (with or without terminating on the last index), and combining these with print/f, but in all examples the use of substring and column indexing are mutually exclusive.
In every attempt to substitute one for the other in examples, the content of the first column always seems to simply replace the content of the substring. As I have tried many ways around this, I will provide the most recent attempt;
Say a line of input was "1234 abcd efgh IJKL mnop" and I want to print everything from index 10, then " : " then column 1, my command would look like:
cat fileName | awk '{printf("%s : %s\n",substr($0,10),$1)}' >> fileOut
cat fileName | awk '{A=substr($0,10);B=$1;printf("%s : %s\n",A,B)}' >> fileOut
cat fileName | awk '{print substr($0,10)" : "$1}' >> fileOut
However in every case so far, the string returned starts with the " : " followed by the contents of $1, followed by the substr with the first consistent number of characters removed from the front, e.g.
" : 1234L mnop", when I expect "efgh IJKL mnop : 1234"
Why does using a column overwrite the return of substr?
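One plausible explanation, offered as an assumption because the input file is not shown, is that the file has Windows-style CRLF line endings. In that case substr($0,10) ends with a carriage return, the terminal moves the cursor back to the start of the line, and " : " plus the first column is drawn over the beginning of the substring. A minimal sketch that strips the carriage return first (the sample line is hypothetical):

```shell
#!/usr/bin/env bash
# Feed awk one line that ends in CRLF, then remove the trailing carriage
# return before slicing. awk's substr is 1-based, so "efgh" starts at 11.
fixed=$(printf '1234 abcd efgh IJKL mnop\r\n' |
  awk '{ sub(/\r$/, ""); printf("%s : %s\n", substr($0, 11), $1) }')
echo "$fixed"
```

Piping the output through od -c or cat -A is a simple way to check whether a file really contains carriage returns.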
I have a file containing a lot of words, separated by pipes. I would like to have a script (written in bash or in any other programming language) that is able to replace every word with an incremental unique integer (something like an ID).
From an input like this:
aaa|ccccc|ffffff|iii|j
aaa|ddd|ffffff|iii|j
bb|eeee|hhhhhh|iii|k
I'd like to have something like this
1|3|6|8|9
1|4|6|8|9
2|5|7|8|10
That is: aaa has been replaced by 1, bb has been replaced by 2, and so on.
How to do this? Thanks!
awk to the rescue...
This does the numbering row-wise; I'm not sure it's important enough to make it columnar.
awk -F "|" -v OFS="|" '{
line=sep="";
for(i=1;i<=NF;i++) {
if(!a[$i])a[$i]=++c;
line=line sep a[$i];
sep=OFS
}
print line
}' words
1|2|3|4|5
1|6|3|4|5
7|8|9|4|10
To get the word associations into another file, you can replace
if(!a[$i])a[$i]=++c;
with
if(!a[$i]){
a[$i]=++c;
print $i"="a[$i] > "assoc"
}
You can define an associative array:
declare -A array
Use the words as keys and an incrementing number as the value:
array[aaa]=$n
Then replace the original words with the values.
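The outline above could be fleshed out roughly as follows. This is only a sketch: it requires bash 4 or later for associative arrays, the names ids, n, and output are illustrative, and it numbers words in the order they are first seen (row-wise), so the IDs differ from the column-wise numbering in the question's example.

```shell
#!/usr/bin/env bash
# Sample input taken from the question.
input=$(printf 'aaa|ccccc|ffffff|iii|j\naaa|ddd|ffffff|iii|j\nbb|eeee|hhhhhh|iii|k')

declare -A ids   # word -> numeric ID
n=0
output=""
while IFS='|' read -r -a words; do
  line=""
  for word in "${words[@]}"; do
    # Assign the next ID the first time a word is seen.
    if [[ -z ${ids[$word]+set} ]]; then
      ids[$word]=$((++n))
    fi
    # Prefix a pipe only when the line already has content.
    line+="${line:+|}${ids[$word]}"
  done
  output+="$line"$'\n'
done <<< "$input"
printf '%s' "$output"
```

For large files a pure bash loop like this is much slower than awk, so it is mainly useful when the mapping has to stay available in the shell afterwards.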
I have data in a tab separated file in the following form (filename.tsv):
#a 0 Espert A trius
#b 9 def J
I want to convert the data into the following form (I am introducing <abc> here in every second line):
##<a>
<0 Espert> <abc> <A trius>.
##<b>
<9 def> <abc> <J>.
I am introducing <abc> for every input line. I know how to do this in Python using the csv module, but I am trying to learn Linux commands. Is there a way to do the same in a Linux terminal using commands like grep?
awk seems like the right tool for the job:
awk '{
printf "##<%s>\n<%s %s> <abc> <%s%s%s>.\n",
substr($1,2),
$2,
$3,
$4,
(length($5) ? " " : ""),
$5
}' filename.tsv
awk loops over all lines in the input file and breaks each line into fields by runs of tabs and/or spaces; $1 refers to the first field, $2, to the second, ...
printf functions the same as in C: a format (template) string containing placeholders is followed by corresponding arguments to substitute for the placeholders.
substr($1,2) returns the substring of the 1st field starting at the 2nd character (i.e., a for the 1st line, b for the 2nd) - note that indices in awk are 1-based.
(length($5) ? " " : "") is a C-style ternary expression that returns a single space if the 5th field is nonempty, and an empty string otherwise.
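If the file really is tab-separated with three columns per line (an assumption: the exact column layout is not visible in the question, and the sample rows below are a guess at it), splitting on tabs only makes the ternary unnecessary. A minimal sketch:

```shell
#!/usr/bin/env bash
# Hypothetical three-column layout: "#a<TAB>0 Espert<TAB>A trius".
tsv=$(mktemp)
printf '#a\t0 Espert\tA trius\n#b\t9 def\tJ\n' > "$tsv"

# -F '\t' splits on tabs only, so spaces inside a column are preserved.
result=$(awk -F '\t' '{
  printf "##<%s>\n<%s> <abc> <%s>.\n", substr($1, 2), $2, $3
}' "$tsv")
printf '%s\n' "$result"
rm -f "$tsv"
```

With this field separator, a third column containing several words stays intact, whereas the whitespace-splitting version handles at most two words there.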