Opening a file and replacing text in a loop in Unix/Linux

I have a file abc.csv which is as follows:
sn,test,control
1,file1,file2
2,file3,file4
I have another configuration file, text.cfg, as follows:
model.input1 = /usr/bin/file1
model.input2 = /usr/bin/file2
Now I want to replace file1 and file2 with file3 and file4 respectively, then with file5 and file6, and so on, until abc.csv is exhausted.
My attempt at the problem is:
IFS =","
while read NUM TEST CONTROL
X=""
Y=""
do
echo "Serial Number :$NUM"
echo "Test File :$TEST"
echo "Control File:$CONTROL"
sed -i -e 's/$A/$B/g' text.cfg
X = $TEST
Y = $CONTROL
done < abc.csv
The output should be new config files, as follows.
This is test1.cfg:
model.input1 = /usr/bin/file1
model.input2 = /usr/bin/file2
Then the second file should be test2.cfg:
model.input1 = /usr/bin/file3
model.input2 = /usr/bin/file4
and so on.

You can generate the config files directly from the CSV with awk:
awk -F, 'NR>1{i = 0; while(++i < NF){print "model.input"i" = /usr/bin/"$(i+1) >> "test"$1".cfg"} }' abc.csv
Output:
AMD$ cat test1.cfg
model.input1 = /usr/bin/file1
model.input2 = /usr/bin/file2
AMD$ cat test2.cfg
model.input1 = /usr/bin/file3
model.input2 = /usr/bin/file4
Hope this helps.
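For comparison, here is a corrected version of the shell loop from the question; a minimal sketch, assuming the goal is one testN.cfg per CSV row. (In the original attempt, IFS must be assigned without spaces around =, do must directly follow the while condition, assignments like X = $TEST can't have spaces around =, and single quotes keep $A and $B from expanding inside the sed command.)

#!/bin/bash
# Skip the CSV header, then write one config file per data row.
tail -n +2 abc.csv | while IFS=, read -r NUM TEST CONTROL; do
    printf 'model.input1 = /usr/bin/%s\nmodel.input2 = /usr/bin/%s\n' \
        "$TEST" "$CONTROL" > "test$NUM.cfg"
done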

Related

Print sum of Nth column at the header of a file with existing rows (bash)

I have an input file with billions of records and a header.
The header consists of meta info, the total number of rows, and the sum of the sixth column. I am splitting the file into smaller files, so each new file's header must be updated, since its row count and sixth-column sum change.
This is the sample input:
filename: testFile.text
00|STMT|08-09-2022 13:24:56||5|13.10|SHA2
10|000047290|8ddcf4b2356dfa7f326ca8004a9bdb6096330fc4f3b842a971deaf660a395f65|18-01-2020|12:36:57|3.10|00004729018-01-20201|APP
10|000052736|cce280392023b23df2a00ace4b82db8eb61c112bb14509fb273c523550059317|07-02-2017|16:27:49|2.00|00005273607-02-20171|APP
10|000070355|f2e86d2731d32f9ce960a0f5883e9b688c7e57ab9c2ead86057f98426407d87a|17-07-2019|20:25:02|1.00|00007035517-07-20192|APP
10|000070355|54c1fc2667e160a11ae1dbf54d3ba993475cd33d6ececdd555fb5c07e64a241b|17-07-2019|20:25:02|5.00|00007035517-07-20192|APP
10|000072420|f5dac143082631a1693e0fb5429d3a185abcf3c47b091be2f30cd50b5cf4be11|14-06-2021|20:52:21|2.00|00007242014-06-20212|APP
Expected:
filename: testFile_1.text
00|STMT|08-09-2022 13:24:56||3|6.10|SHA2
10|000047290|8ddcf4b2356dfa7f326ca8004a9bdb6096330fc4f3b842a971deaf660a395f65|18-01-2020|12:36:57|3.10|00004729018-01-20201|APP
10|000052736|cce280392023b23df2a00ace4b82db8eb61c112bb14509fb273c523550059317|07-02-2017|16:27:49|2.00|00005273607-02-20171|APP
10|000070355|f2e86d2731d32f9ce960a0f5883e9b688c7e57ab9c2ead86057f98426407d87a|17-07-2019|20:25:02|1.00|00007035517-07-20192|APP
filename: testFile_2.text
00|STMT|08-09-2022 13:24:56||2|7.00|SHA2
10|000070355|54c1fc2667e160a11ae1dbf54d3ba993475cd33d6ececdd555fb5c07e64a241b|17-07-2019|20:25:02|5.00|00007035517-07-20192|APP
10|000072420|f5dac143082631a1693e0fb5429d3a185abcf3c47b091be2f30cd50b5cf4be11|14-06-2021|20:52:21|2.00|00007242014-06-20212|APP
I am able to split the file and calculate the sum, but I am unable to replace the values in the header.
This is the script I have written:
#!/bin/bash
splitRowCount=$1
transactionColumn=$2
filename=$(basename -- "$3")
extension="${filename##*.}"
nameWithoutExt="${filename%.*}"
echo "splitRowCount: $splitRowCount"
echo "transactionColumn: $transactionColumn"
awk 'NR == 1 { head = $0 } NR % '$splitRowCount' == 2 { filename = "'$nameWithoutExt'_" int((NR-1)/'$splitRowCount')+1 ".'$extension'"; print head > filename } NR != 1 { print >> filename }' $filename
ls *.txt | while read line
do
firstLine=$(head -n 1 $line);
awk -F '|' 'NR !=1 {sum += '$transactionColumn'}END {print sum} ' $line
done
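For the header-replacement step on its own, here is a minimal awk sketch (assuming, as in the samples, that the row count is field 5 and the sum is field 6 of the header, and that testFile_1.text is one of the already-split files):

awk -F'|' -v OFS='|' '
NR==1 { header = $0; next }                 # stash the original header
{ rows++; sum += $6; body = body $0 ORS }   # count rows and sum column 6
END {
    $0 = header; $5 = rows; $6 = sum        # patch the count and sum fields
    printf "%s%s", $0 ORS, body             # updated header, then the records
}' testFile_1.text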
Here's an awk solution for splitting the original file into files of n records. The idea is to accumulate the records until the given count is reached, then generate a file with the updated header and the accumulated records:
n=3
file=./testFile.text
awk -v numRecords="$n" '
BEGIN {
    FS = OFS = "|"
    # Split the input filename into prefix and suffix around the last dot.
    if ( match(ARGV[1],/[^\/]\.[^\/]*$/) ) {
        filePrefix = substr(ARGV[1],1,RSTART)
        fileSuffix = substr(ARGV[1],RSTART+1)
    } else {
        filePrefix = ARGV[1]
        fileSuffix = ""
    }
    # Consume the header line before the main loop sees it.
    if ( (getline headerStr) <= 0 )
        exit 1
    split(headerStr, headerArr)
}
(NR-2) % numRecords == 0 && recordsCount {
    # A full batch is accumulated: emit it with an updated header.
    outfile = filePrefix "_" ++filesCount fileSuffix
    print headerArr[1],headerArr[2],headerArr[3],headerArr[4],recordsCount,recordsSum,headerArr[7] > outfile
    printf("%s", records) > outfile
    close(outfile)
    records = ""
    recordsCount = recordsSum = 0
}
{
    # Accumulate the current record and update the count and sum.
    records = records $0 ORS
    recordsCount++
    recordsSum += $6
}
END {
    # Flush the last (possibly partial) batch.
    if (recordsCount) {
        outfile = filePrefix "_" ++filesCount fileSuffix
        print headerArr[1],headerArr[2],headerArr[3],headerArr[4],recordsCount,recordsSum,headerArr[7] > outfile
        printf("%s", records) > outfile
        close(outfile)
    }
}
' "$file"
With the given sample you'll get:
testFile_1.text
00|STMT|08-09-2022 13:24:56||3|6.1|SHA2
10|000047290|8ddcf4b2356dfa7f326ca8004a9bdb6096330fc4f3b842a971deaf660a395f65|18-01-2020|12:36:57|3.10|00004729018-01-20201|APP
10|000052736|cce280392023b23df2a00ace4b82db8eb61c112bb14509fb273c523550059317|07-02-2017|16:27:49|2.00|00005273607-02-20171|APP
10|000070355|f2e86d2731d32f9ce960a0f5883e9b688c7e57ab9c2ead86057f98426407d87a|17-07-2019|20:25:02|1.00|00007035517-07-20192|APP
testFile_2.text
00|STMT|08-09-2022 13:24:56||2|7|SHA2
10|000070355|54c1fc2667e160a11ae1dbf54d3ba993475cd33d6ececdd555fb5c07e64a241b|17-07-2019|20:25:02|5.00|00007035517-07-20192|APP
10|000072420|f5dac143082631a1693e0fb5429d3a185abcf3c47b091be2f30cd50b5cf4be11|14-06-2021|20:52:21|2.00|00007242014-06-20212|APP
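A note on the design: the records for each chunk are buffered in a string precisely because the header, whose count and sum are only known once the chunk is complete, has to be written first; the close() calls also keep the number of simultaneously open output files bounded.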
With your shown samples, please try the following awk code (written and tested in GNU awk). It defines three awk variables: fileInitials, which holds the output files' base name (e.g. testFile); extension, which holds the output files' extension (e.g. .txt); and lines, which sets how many lines each output file should contain.
You need not run shell plus awk; this can be done in a single awk program, as shown below.
awk -v count="1" -v fileInitials="testFile" -v extension=".txt" -v lines="3" '
BEGIN { FS=OFS="|" }
FNR==1{
    # Capture the header up to (but not including) the row-count and sum
    # fields, and everything after the sum field, so both can be replaced.
    match($0,/^([^|]*\|[^|]*\|[^|]*\|[^|]*)\|[^|]*\|[^|]*(.*)/,arr)
    header1=arr[1]
    header2=arr[2]
    outputFile=(fileInitials count extension)
    next
}
{
    if(prev!="" && prev!=count){
        # The previous chunk is complete: write its header with the
        # recalculated row count and sum, then the buffered records.
        print (header1,rows,sum header2 ORS val) > (outputFile)
        close(outputFile)
        outputFile=(fileInitials count extension)
        sum=0
        rows=0
        val=""
    }
    sum+=$6
    rows++
    val=(val?val ORS:"") $0
    prev=count
    count=(++countline%lines==0?++count:count)
}
END{
    # Flush the last (possibly partial) chunk.
    if(count && val){
        print (header1,rows,sum header2 ORS val) > (outputFile)
        close(outputFile)
    }
}
' Input_file
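Note that the three-argument form of match() used here is a GNU awk extension, so this solution requires gawk rather than a strictly POSIX awk.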

cat one file to multiple files and output multiple files

I'm trying to cat one main-file to multiple single files. The output file should be named main-file_file1.
For example
main-file + file1 = main-file_file1
main-file + file2 = main-file_file2
main-file + file3 = main-file_file3
.
.
.
main-file + fileN = main-file_fileN
I guess I could do
cat main-file file1 > main-file_file1
but I have 100 files to cat to main-file, so that won't be very efficient.
Any suggestions?
You need a bash for loop (assuming your shell is bash)! In your case you would do something like this:
for i in {1..100}; do cat main-file file$i > main-file_file$i; done
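If the files to append don't follow a simple numeric pattern, a glob-based loop works too; a sketch, assuming the input files match file*:

for f in file*; do
    cat main-file "$f" > "main-file_$f"
done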

Search word from certain line GREP

File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0
xmax = 82.7959410430839
tiers? <exists>
size = 1
item []:
item [1]:
class = "IntervalTier"
name = "ortho"
xmin = 0
xmax = 82.7959410430839
intervals: size = 6
intervals [1]:
xmin = 0
xmax = 15.393970521541949
text = "Aj tento rok organizuje Rádio Sud piva. Kto chce súťažiť, nemusí sa nikde registrovať.
intervals [2]:
xmin = 15.393970521541949
xmax = 27.58997052154195
.
.
.
Hi, I am working with hundreds of text files like this.
I want to extract all the xmin = ... values from each file, but only from the 16th line onward, because the xmins at the start are useless, as you can see.
I tried:
cat text.txt | grep xmin
but it shows all lines where xmin appears.
I can't modify the text files, and since I need to work with hundreds of them I need a suitable way to filter them.
Like this:
awk 'FNR>15 && /xmin/' file*
xmin = 0
xmin = 15.393970521541949
It shows all xmin values from line 16 onward.
You can also print the file name of each found xmin:
awk 'FNR>15 && /xmin/ {$1=$1;print FILENAME" -> "$0}' file*
file22 -> xmin = 0
file22 -> xmin = 15.393970521541949
Update: it needs to be FNR (not NR) to work with multiple files.
Using sed and grep to look for "xmin" from the 16th line to the end of a single file:
sed -n '16,$p' foobar.txt | grep "xmin"
In case of multiple files here is a bash script to get the output:
#!/bin/bash
for file in "$1"/*; do
    output=$(sed -n '16,$p' "$file" | grep "xmin")
    if [[ -n $output ]]; then
        echo -e "$file has the following entries:\n$output"
    fi
done
Run the script as bash script.sh /directory/containing/the/files/to/be/searched
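The same from-line-16 idea also works with tail, avoiding sed; a sketch for a single file:

tail -n +16 foobar.txt | grep "xmin"

Here tail -n +16 prints everything from the 16th line onward, and grep then filters for xmin.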

How to find a text string within a config file and add two new lines below with one single line command?

I am trying to fight the POODLE SSL bug, and a lot of configuration files have to be changed by me.
To save myself some work, I want to do it with one-line commands. Thus I need to modify, for example, /etc/lighttpd/lighttpd.conf.
I am looking for ssl.engine = "enable"
and I want to append two new lines BELOW that found string:
ssl.use-sslv2 = "disable"
ssl.use-sslv3 = "disable"
The end result should be:
$SERVER["socket"] == "0.0.0.0:80" {
ssl.engine = "enable"
ssl.use-sslv2 = "disable"
ssl.use-sslv3 = "disable"
ssl.pemfile = "/etc/ssl/cert.pem"
ssl.ca-file = "/etc/ssl/cert.bundle"
}
I tried with awk but I can't get it to work due to the spaces and the carriage returns.
Happy to see some one-liners :)
Here is what I tried (I surely did something wrong here):
awk '{a[NR]=$0}/ssl.engine/{a[NR+1]=a[NR+1]"ssl.use-sslv2 = "disable""}{a[NR+2]"ssl.use-sslv3 = "disable""}END{for(i=1;i<=NR;i++)print a[i]}' /etc/lighttpd/lighttpd.conf
Just print every line, and if a line matches the regexp, print whatever you want:
awk '{print} /ssl\.engine = "enable"/{ print "ssl.use-sslv2 = \"disable\"\nssl.use-sslv3 = \"disable\"" }' file
Note that '.' is an RE metacharacter so it needs to be escaped in the RE context.
If you want to modify the original file then with GNU awk:
awk -i inplace '...' file
and with any awk or any other UNIX tool:
awk '...' file > tmp && mv tmp file
Explicitly:
awk -i inplace '{print} /ssl\.engine = "enable"/{ print "ssl.use-sslv2 = \"disable\"\nssl.use-sslv3 = \"disable\"" }' file
awk '{print} /ssl\.engine = "enable"/{ print "ssl.use-sslv2 = \"disable\"\nssl.use-sslv3 = \"disable\"" }' file > tmp && mv tmp file
How about sed?
sed 's/ssl.engine = "enable"/ssl.engine = "enable"\nssl.use-sslv2 = "disable"\nssl.use-sslv3 = "disable"/' config
Use sed -i to substitute in-place:
sed -i 's/ssl.engine = "enable"/ssl.engine = "enable"\nssl.use-sslv2 = "disable"\nssl.use-sslv3 = "disable"/' config
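sed's a (append) command targets this case directly and avoids restating the matched line in the replacement; a sketch using two append expressions (GNU sed's one-line syntax):

sed -e '/ssl\.engine = "enable"/a ssl.use-sslv2 = "disable"' \
    -e '/ssl\.engine = "enable"/a ssl.use-sslv3 = "disable"' config

Both expressions queue their text when the line matches, and the queued lines are emitted in script order, so the sslv2 line lands before the sslv3 line. Add -i to edit in place, as above.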

Linux awk text file processing

I have a file with a few thousand lines of data; each line looks like a:b:c:d
So for example:
0.0:2000.00:2000.04:2000.02
I want to get all the a's in one file, the b's in a second file, etc. How?
One way: output files will be named fileX, where X is the column number.
Assuming infile with content:
0.0:2000.00:2000.04:2001.02
0.1:2002.00:2000.05:2003.02
0.2:2003.00:2002.04:2004.02
0.3:2001.00:2000.05:2000.03
0.3:2001.00:2000.04:2001.02
0.2:2001.00:2002.04:2000.02
Execute this awk command:
awk '
BEGIN {
    FS = ":";
}
{
    for ( i = 1; i <= NF; i++ ) {
        print $i > "file" i;
    }
}
' infile
Check output files:
head file[1234]
With following result:
==> file1 <==
0.0
0.1
0.2
0.3
0.3
0.2
==> file2 <==
2000.00
2002.00
2003.00
2001.00
2001.00
2001.00
==> file3 <==
2000.04
2000.05
2002.04
2000.05
2000.04
2002.04
==> file4 <==
2001.02
2003.02
2004.02
2000.03
2001.02
2000.02
Look at the awk (or gawk) manual.
You should use the -F: flag to set the field separator to :.
You should use print with > file to send the output to the file you want.
awk -F: '{ for (i = 1; i <= NF; i++) { file = "file." i; print $i > file; } }' input
(awk on Mac OS X 10.7.4 does not permit an expression as the file name; gawk does. The solution shown will work on both.)
What about:
cat filename | cut -d ':' -f1 > a.txt
Then you can use -f2 for the second field and put it in b.txt.
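The cut approach generalizes with a loop, at the cost of one pass over the file per column; a sketch for the four columns of the example (the output names file1.txt through file4.txt are my choice here):

for i in 1 2 3 4; do
    cut -d ':' -f"$i" filename > "file$i.txt"
done

The awk answers above produce all the output files in a single pass, which is preferable for large inputs.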
