i have a file created in windows using notepad:
26453215432460
23543265235421
38654365876325
12354152435243
I have a script which reads every line, writes a command like the one below to another file for each line, and skips blank lines:
CRE:EQU,264532154324600,432460,1;
Now if I save my input file after hitting enter after the last line of numbers, 12354152435243, then the output file contains the command above for all numbers (including the last, 12354152435243):
CRE:EQU,264532154324600,432460,1;
CRE:EQU,235432652354210,235421,1;
CRE:EQU,386543658763250,876325,1;
CRE:EQU,123541524352430,435243,1;
but if I save the file without hitting enter after the last number is keyed in, i.e. after 12354152435243, then after the script executes, the output file does not have the command for the last number:
CRE:EQU,264532154324600,432460,1;
CRE:EQU,235432652354210,235421,1;
CRE:EQU,386543658763250,876325,1;
Can somebody explain the error in the code:
while read LINE
do
    [ -z "$LINE" ] && continue
    IMEI=`echo $LINE | sed 's/ //g' | sed -e 's/[^ -~]//g'`
    END_SERIAL=`echo $IMEI | cut -c9- | sed 's/ //g' | sed -e 's/[^ -~]//g'`
    echo "CRE:EQU,${IMEI}0,${END_SERIAL},${list},,${TODAY};" >> /apps/ins/list.out
done < "${FILE_NAME}"
kindly help
Use
grep . "${FILE_NAME}" | while read LINE
or
while read LINE
do
....
done < <(grep . "${FILE_NAME}")
The grep is less sensitive to line endings, and you get the empty-line skip for free... :)
Honestly, I never tried Windows; all of the above is OK for Unix...
EDIT Explanation:
make the next file:
echo -n -e 'line\n\nanother\nno line ending here>' >file.txt
the file contains 4 lines (although the last "line" is not a "correct" one)
line
another
no line ending here>
Usual shell routines, such as read or wc, look for a line ending. Therefore,
$ wc -l file.txt
3 file.txt
When you grep for '' (the empty string), grep returns every line where it found the string, so
$ grep '' file.txt
prints
line
another
no line ending here>
When grep prints out the found lines, it ensures that one `\n' exists at the end, so
$ grep '' file.txt | wc -l
returns
4
Therefore, for these situations, it is better to use grep with -c (count) and not wc.
$ grep -c '' file.txt
4
Now, the . (dot). The dot means any character. So, when you grep for a ., you get all lines that contain at least one character, and it therefore skips all lines that don't contain any character = skips empty lines. So,
$ grep . file.txt
line
another
no line ending here>
again, with a line ending added to the last line (and the empty line skipped). Remember, the space is a character too, so a line that contains only one space is NOT EMPTY. Counting non-empty lines:
$ grep . file.txt | wc -l
3
or faster
$ grep -c . file.txt
3
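Applied to the original problem, the fix is a minimal change (the file name here is a stand-in):

```shell
# Stand-in for the input file: blank line in the middle, no newline after
# the last number (as saved by Notepad without a final Enter)
printf '26453215432460\n\n12354152435243' > numbers.txt

# "grep ." restores the missing final newline and drops the blank line,
# so a plain read loop now sees every number:
grep . numbers.txt | while read LINE
do
    echo "processing: $LINE"
done
```

Both numbers are printed, including the last one without a trailing newline.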
If you run help read, it says for -d delim: continue until the first character of DELIM is read, rather than newline.
So read will continue until it hits a \n, or the first character of the delimiter you specify with -d delim.
So you probably need to change the delimiter, or you can try read -e.
read reads until a newline has been found, and when it finds one, it returns the line. But if the file ends without a newline, read treats this as an error. So, even though read has set the variable to the line read so far, its return code is set to indicate an error. Now, the while read ... loop body only executes if the command succeeds, which is not the case here. Thus you miss the last line.
To overcome this, you can change the condition to also check whether the variable is non-empty. The condition then succeeds even if read fails, as the variable has already been set with the last line of the file.
This is not related to line endings in different OSes; I mean, it's somewhat related, but the exact root cause is always that read fails to find a newline at the end of the file, so the last line never reaches the loop body.
Below is an example
[[bash_prompt$]]$ echo -ne 'hello\nthere' > log
[[bash_prompt$]]$ while read line; do echo $line; done < log
hello
[[bash_prompt$]]$ while read line || [ -n "$line" ]; do echo $line; done < log
hello
there
[[bash_prompt$]]$
read needs the end of line to read the input. Try
echo -n $'a\nb' | while read x ; do echo $x ; done
It only prints a.
To keep a script from missing the last line of a file:
cat "somefile" | { cat ; echo ; } | while read line; do echo $line; done
Source : My open source project https://sourceforge.net/projects/command-output-to-html-table/
Related
I have a 60GB file that's a single line.
All I need to do is to change the last "," (the last character in the file).
The thing is sed can't process it because it's all in a single line and it fails to allocate memory.
// file.txt
[0] ...mple12,sample13),(sample21,sample22,sample23),
// desired file.txt
[0] ...mple12,sample13),(sample21,sample22,sample23);
I get an error Couldn't re-allocate memory
In such cases, a stream oriented approach might help.
This can easily be achieved with the shell:
# First remove last character
head -c -1 < file.txt > file2.txt
# then add new last character ';' to the end
echo -n ";" >> file2.txt
Please note: If there is a CR at the end of the file, you need to use 'head -c -2' instead.
A one-liner would be:
head -c -1 <file.txt | (cat - ; echo ';') > file2.txt
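For instance, on a small stand-in for the big file (this assumes GNU head, which accepts a negative byte count):

```shell
# Tiny stand-in for the 60GB file: it ends with the ',' we want to replace
printf '(sample21,sample22,sample23),' > file.txt

head -c -1 < file.txt > file2.txt   # copy everything except the last byte
printf ';' >> file2.txt             # append the new last character
cat file2.txt                       # -> (sample21,sample22,sample23);
```

Since head streams the file, memory use stays constant regardless of line length.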
To find the 2nd character it was grep -e '^.[aA]'. Then what will it be for the 4th character? I tried grep -e '^...[aA]', but it went wrong.
grep processes the input line by line. ^.[aA] is true if a or A is the second character on any line.
You can combine grep with head to only inspect the first line:
head -n1 filename | grep '^...[aA]'
But it still wouldn't work for a file whose first line is shorter than four characters:
x
ya
To really check the fourth character in a file, grep is not the best tool.
#! /bin/bash
read -N4 chars < filename
if [[ "${chars:3:1}" == [aA] ]] ; then
    echo Found
fi
But if you try hard enough, you can still use it. E.g., use tr to replace newlines by spaces, then you can run your grep:
tr '\n' ' ' < filename | grep '^...[aA]'
I have files in the below format, generated by another system:
12;453453;TBS;OPPS;
12;453454;TGS;OPPS;
12;453455;TGS;OPPS;
12;453456;TGS;OPPS;
20;787899;THS;CLST;
33;786789;
I have to check whether the last line contains 33; if so, I have to continue and copy the file/files to another location, else discard the file.
currently I am doing as below
tail -1 abc.txt >> c.txt
awk '{print substr($0,0,2)}' c.txt
then the output is saved to another variable and used to decide about copying.
Can anyone suggest another simple way?
Thank you!
Imagine you have the following input file:
$ cat file
a
b
c
d
e
agc
Then you can run the following commands (grep, awk, sed, cut) to get the first 2 chars of the last line:
AWK
$ awk 'END{print substr($0,0,2)}' file
ag
SED
$ sed -n '$s/^\(..\).*/\1/p' file
ag
GREP
$ tail -1 file | grep -oE '^..'
ag
CUT
$ tail -1 file | cut -c '1-2'
ag
BASH SUBSTRING
line=$(tail -1 file); echo ${line:0:2}
All of those commands do what you are looking for. The awk command performs the operation on the last line of the file, so you do not need tail anymore. The sed command extracts the last line of the file and stores it in its pattern buffer, replaces everything that is not the first 2 chars by nothing, and then prints the pattern buffer (the first 2 chars of the last line). Another solution is just to tail the last line of the file and extract the first 2 chars using grep; by piping those 2 commands you can also do it in one step, without using intermediate variables or files.
Now if you want to put everything in one script, this becomes:
$ more file check_2chars.sh
::::::::::::::
file
::::::::::::::
a
b
c
d
e
33abc
::::::::::::::
check_2chars.sh
::::::::::::::
#!/bin/bash
s1=$(tail -1 file | cut -c 1-2) #you can use other commands from this post
s2=33
if [ "$s1" == "$s2" ]
then
    echo "match" #implement the copy/discard logic
fi
Execution:
$ ./check_2chars.sh
match
I will let you implement the copy/discard logic.
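For completeness, here is a hedged sketch of that copy/discard logic; the file name and destination directory are placeholders you would replace with your own:

```shell
#!/bin/bash
# Sketch only: "abc.txt" and "/some/other/location" are placeholder paths
file=abc.txt
dest=/some/other/location

if [ "$(tail -1 "$file" | cut -c 1-2)" = "33" ]; then
    cp "$file" "$dest/"    # last line starts with 33: copy it over
else
    rm -- "$file"          # otherwise discard the file
fi
```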
Given the task of either copying or deleting files based on their contents, shell variables aren't necessary.
Using sed's F (file name) command and xargs, the whole task can be done in just one line:
find | xargs -l sed -n '${/^33/!F}' | xargs -r rm ; cp * dest/dir/
Or preferably, with GNU sed:
sed -sn '${/^33/!F}' * | xargs -r rm ; cp * dest/dir/
Or if all the filenames contain no whitespace:
rm -r $(sed -sn '${/^33/!F}' *) ; cp * dest/dir/
That assumes all the files in the current directory are to be tested.
sed looks at the last line ($) of every file, and runs what's in the curly braces.
If any of those last lines do not begin with 33 (/^33/!), sed outputs just those unwanted file names (F).
Supposing the unwanted files are named foo and baz -- those are piped to xargs which runs rm foo baz.
At this point the only files left should be copied over to dest/dir/: cp * dest/dir/.
It's efficient: cp and rm need only be run once.
If a shell variable must be used, here are two more methods:
Using tail and bash, store the first two chars of the last line in $n:
n="$(tail -1 abc.txt)" n="${n:0:2}"
Here's a more portable POSIX shell version:
n="$(tail -1 abc.txt)" n="${n%${n#??}}"
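To see how the POSIX expansion works, step by step: ${n#??} strips the first two characters from the front, and ${n%...} then removes that remainder from the end, leaving just the first two characters.

```shell
n="33;786789;"        # stand-in for the last line of abc.txt
echo "${n#??}"        # -> ;786789;  (the string minus its first two chars)
n="${n%${n#??}}"      # remove that tail from the end, keeping the first two
echo "$n"             # -> 33
```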
You may explicitly test with sed for the last line ($) starting with 33 (/^33.*/):
echo " 12;453453;TBS;OPPS;
12;453454;TGS;OPPS;
12;453455;TGS;OPPS;
12;453456;TGS;OPPS;
20;787899;THS;CLST;
33;786789;" | sed -n "$ {/^33.*/p}"
33;786789;
If you store the result in a variable, you may test it for being empty or not:
lastline33=$(echo " 12;453453;TBS;OPPS;
12;453454;TGS;OPPS;
12;453455;TGS;OPPS;
12;453456;TGS;OPPS;
20;787899;THS;CLST;
33;786789;" | sed -n "$ {/^33.*/p}")
echo $(test -n "$lastline33" && echo not null || echo null)
not null
You probably want the regular expression to contain the semicolon, because otherwise it would match 330, 331, ... 339, 33401345 and so on. Maybe that case is excluded by the context, but to me it seems a good idea:
lastline33=$(sed -n "$ {/^33;.*/p}" abc.txt)
I want to add a text to the end of the first line of a file using a bash script.
The file is /etc/cmdline.txt, which does not allow line breaks and needs new commands separated by a blank, so the text I want to add really needs to be on the first line.
What i got so far is:
line=' bcm2708.w1_gpio_pin=20'
file=/boot/cmdline.txt
if ! grep -q -x -F -e "$line" <"$file"; then
    printf '%s' "$line\n" >>"$file"
fi
But that appends the text after the line break of the first line, so the result is wrong.
I either need to trim the file content, add my text and a line feed, or somehow just add it to the first line of the file without touching the rest. But my knowledge of bash scripts is not good enough to find a solution here, and all the examples I find online add to the beginning/end of every line in a file, not just the first line.
This sed command will add 123 to the end of the first line of your file.
sed ' 1 s/.*/&123/' yourfile.txt
also
sed '1 s/$/ 123/' yourfile.txt
To write the result back to the same file, you have to use the -i switch:
sed -i ' 1 s/.*/&123/' yourfile.txt
This is a solution to add "ok" at the end of the first line of /etc/passwd; I think you can use this in your script with a little bit of 'tuning':
$ awk 'NR==1{printf "%s %s\n", $0, "ok"}' /etc/passwd
root:x:0:0:root:/root:/bin/bash ok
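Part of that 'tuning' would be keeping the rest of the file, since the one-liner above prints only line 1. A sketch of that variant:

```shell
# Append " ok" to the first line and pass the remaining lines through unchanged
awk 'NR==1{printf "%s %s\n", $0, "ok"; next} {print}' /etc/passwd
```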
To edit a file, you can use ed, the standard editor:
line=' bcm2708.w1_gpio_pin=20'
file=/boot/cmdline.txt
if ! grep -q -x -F -e "$line" <"$file"; then
    ed -s "$file" < <(printf '%s\n' 1 a "$line" . 1,2j w q)
fi
ed's commands:
1: go to line 1
a: append (this will insert after the current line)
We're in insert mode and we're inserting the expansion of $line
.: stop insert mode
1,2j: join lines 1 and 2
w: write
q: quit
This can be used to append a variable to the first line of input:
awk -v suffix="$suffix" '{print NR==1 ? $0 suffix : $0}'
This will work even if the variable could potentially contain regex formatting characters.
Example:
suffix=' [first line]'
cat input.txt | awk -v suffix="$suffix" '{print NR==1 ? $0 suffix : $0}' > output.txt
input.txt:
Line 1
Line 2
Line 3
output.txt:
Line 1 [first line]
Line 2
Line 3
I have a big 150GB CSV file and I would like to remove the first 17 lines and the last 8 lines. I have tried the following, but it seems that's not working right:
sed -i -n -e :a -e '1,8!{P;N;D;};N;ba'
and
sed -i '1,17d'
I wonder if someone can help with sed or awk; a one-liner would be great.
head and tail are better for the job than sed or awk.
tail -n+18 file | head -n-8 > newfile
awk -v nr="$(wc -l < file)" 'NR>17 && NR<(nr-8)' file
All awk:
awk 'NR>y+x{print A[NR%y]} {A[NR%y]=$0}' x=17 y=8 file
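To check the sliding-buffer idea on a numbered sample: x is the number of lines to skip at the front, and y is the size of the ring buffer A that holds back the last y lines.

```shell
# 30 numbered lines; drop the first 17 and the last 8, leaving 18..22
seq 30 | awk 'NR>y+x{print A[NR%y]} {A[NR%y]=$0}' x=17 y=8
# prints 18 through 22, one per line
```

Printing only starts once x+y lines have been read, and each printed line is the one stored y lines earlier, so the final y lines never leave the buffer.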
Try this :
sed '{[/]<n>|<string>|<regex>[/]}d' <fileName>
sed '{[/]<adr1>[,<adr2>][/]d' <fileName>
where
/.../=delimiters
n = line number
string = string found in the line
regex = regular expression corresponding to the searched pattern
addr = address of a line (number or pattern)
d = delete
LENGTH=`wc -l < file`
head -n $((LENGTH-8)) file | tail -n $((LENGTH-17)) > file
Edit: As mtk posted in a comment, this won't work. If you want to use wc and track the file length, you should use:
LENGTH=`wc -l < file`
head -n $((LENGTH-8)) file | tail -n $((LENGTH-8-17)) > file
or:
LENGTH=`wc -l < file`
head -n $((LENGTH-8)) file > file
LENGTH=`wc -l < file`
tail -n $((LENGTH-17)) file > file
which makes this solution less elegant than the one posted by choroba :)
I learnt this today for the shell.
{
    ghead -17 > /dev/null
    sed -n -e :a -e '1,8!{P;N;D;};N;ba'
} < my-bigfile > subset-of
One has to use a non-consuming head, hence the use of ghead from GNU coreutils.
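A sketch of the same pipeline on a numbered stand-in file. Note that ghead is simply GNU head (its name under Homebrew coreutils on macOS); on Linux, plain head works. The trick relies on head leaving the file offset just past line 17, which GNU head does when the input is a seekable regular file (not a pipe):

```shell
# Stand-in for my-bigfile: 30 numbered lines; we expect 18..22 in subset-of
seq 30 > my-bigfile
{
    head -n 17 > /dev/null                  # consume exactly the first 17 lines
    sed -n -e :a -e '1,8!{P;N;D;};N;ba'     # hold back (drop) the last 8 lines
} < my-bigfile > subset-of
cat subset-of
```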
Similar to Thor's answer, but a bit shorter:
sed -i '' -e $'1,17d;:a\nN;19,25ba\nP;D' file.txt
The -i '' tells sed to edit the file in place. (The syntax may be a bit different on your system. Check the man page.)
If you want to delete front lines from the front and tail lines from the end, you'd have to use the following numbers:
1,{front}d;:a\nN;{front+2},{front+tail}ba\nP;D
(I put them in curly braces here, but that's just pseudocode. You'll have to replace them by the actual numbers. Also, it should work with {front+1}, but it doesn't on my machine (macOS 10.12.4). I think that's a bug.)
I'll try to explain how the command works. Here's a human-readable version:
1,17d # delete lines 1 ... 17, goto start
:a # define label a
N # add next line from file to buffer, quit if at end of file
19,25ba # if line number is 19 ... 25, goto start (label a)
P # print first line in buffer
D # delete first line from buffer, go back to start
First we skip 17 lines. That's easy. The rest is tricky, but basically we keep a buffer of eight lines. We only start printing lines when the buffer is full, but we stop printing when we reach the end of the file, so at the end, there are still eight lines left in the buffer that we didn't print - in other words, we deleted them.
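The same logic can be checked with GNU sed's multi-expression form on a 30-line sample; with front=17 and tail=8, the range addresses become front+2=19 and front+tail=25:

```shell
# Delete the first 17 and last 8 lines of a 30-line input
seq 30 | sed -n -e '1,17d' -e ':a' -e 'N' -e '19,25ba' -e 'P' -e 'D'
# prints 18 through 22
```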