How to copy data from one file to another starting from a specific line - linux

I have two files, data.txt and results.txt. Assuming there are 5 lines in data.txt, I want to copy all of those lines and paste them into results.txt starting from line number 4.
Here is a sample:
data.txt file:
stack
ping
dns
ip
remote
results.txt file:
# here are some text
# please do not edit these lines
# blah blah..
this is the 4th line that data should go on.
I've tried sed with various combinations, but I couldn't make it work, and I'm not sure it is the right tool for this anyway.
sed -n '4p' /path/to/file/data.txt > /path/to/file/results.txt
The above command copies line 4 only, which isn't what I'm trying to achieve. As I said above, I need to copy all lines from data.txt and paste them into results.txt starting from line 4, without modifying or overwriting the first 3 lines.
Any help is greatly appreciated.
EDIT:
I want to overwrite the copied data starting from line number 4 in results.txt. So I want to leave the first 3 lines untouched and overwrite the rest of the file with the data copied from data.txt.

Here's a way that works well from cron, with less chance of losing data or corrupting the file:
# preserve first lines of results
head -3 results.txt > results.TMP
# append new data
cat data.txt >> results.TMP
# rename is atomic on the same filesystem, so a crash can't leave a half-written results.txt
mv results.TMP results.txt
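With the sample files above, results.txt ends up as:
# here are some text
# please do not edit these lines
# blah blah..
stack
ping
dns
ip
remote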

You can use process substitution to hand cat a FIFO from which it can read the first three lines:
cat <(head -3 results.txt) data.txt > results.txt
Beware that the > results.txt redirection truncates the file while head may still be reading it, so this can race; writing to a temporary file first is safer.
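If the moreutils package is available, sponge reads all of its input before opening the output file, which avoids that race entirely (this assumes sponge is installed):
cat <(head -3 results.txt) data.txt | sponge results.txt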

Redirecting head's output straight back into results.txt would truncate the file before head reads it, so go through a temporary file:
head -n 3 /path/to/file/results.txt > /path/to/file/results.TMP
cat /path/to/file/data.txt >> /path/to/file/results.TMP
mv /path/to/file/results.TMP /path/to/file/results.txt

If you can use awk:
awk 'NR!=FNR || NR<4' results.txt data.txt
This prints the first 3 lines of results.txt, then every line of data.txt.
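Since awk writes to standard output, update results.txt itself through a temporary file, as in the first answer:
awk 'NR!=FNR || NR<4' results.txt data.txt > results.TMP && mv results.TMP results.txt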

Related

Fastest way to replace a string in the first row of a huge file on the linux command line?

I have a huge plain text file (~500 GB) on a Linux machine. I want to replace a string in the header line (the first row of the file), but every method I know seems slow and inefficient.
example file:
foo apple cat
1 2 2
2 3 4
3 4 6
...
expected file output:
bar apple cat
1 2 2
2 3 4
3 4 6
...
sed:
sed -i '1s/foo/bar/g' file
-i changes the file in place, but this command generates a tmp file on disk and uses it to replace the original one, and that I/O wastes time.
vim:
ex -c '1s/foo/bar/g' -c 'wq' file
ex doesn't generate a tmp file, but it loads the whole file into memory, which also wastes a lot of time.
Is there a better solution that only reads the first row into memory and writes it back to the original file? I know the Linux head command can extract the first line very quickly.
Could you please try the following awk command and let me know if it helps? I couldn't test it, as I don't have a file anywhere near 500 GB. It doesn't use in-place substitution on Input_file, so no hidden temp file is involved; the temp_file below is created explicitly and then renamed over the original.
awk 'FNR==1{$1="bar";print;next} 1' Input_file > temp_file && mv temp_file Input_file
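If the replacement happens to have exactly the same byte length as the original (bar and foo are both 3 bytes), an alternative that touches nothing but the header is to overwrite those bytes in place; a sketch using dd, assuming a same-length replacement at the very start of the file:
# write 3 bytes at offset 0; conv=notrunc leaves the rest of the file untouched
printf 'bar' | dd of=Input_file bs=1 conv=notrunc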

Edit text in a file with a script

I have a file called flw.py and would like to write a bash script that will replace some text in the file (take out the last two lines and add two new lines). I apologize if this seems like a stupid question. A thorough explanation would be appreciated since I am still learning to script. Thanks!
head -n -2 flw.py > tmp # (1)
echo "your first new line here..." >> tmp # (2)
echo "your second new line here...." >> tmp #
mv tmp flw.py # (3)
Explanation:
head normally prints the first ten lines of a file. The -n argument changes the number of lines printed: head -n 15 prints the first 15 lines. A negative number means the opposite: print all lines except the last N. That happens to be exactly what you want: head -n -2 (a quick demo follows this explanation).
Then we redirect the output of our head command to a temporary file named tmp; > does the redirecting magic here. tmp now contains everything from flw.py except the last two lines.
Next we add the two new lines by using the echo command. We append the output of the echo "your first new line here..." to our tmp file. >> appends to an existing file, whereas > will overwrite an existing file.
We do the same thing for the second line we want to append.
Last, we move the tmp file to flw.py and the job is done.
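Note that negative counts are a GNU head extension and won't work with BSD/macOS head; a quick demo (assuming GNU coreutils):
printf '1\n2\n3\n4\n5\n' | head -n -2
# prints 1, 2 and 3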
You can use a single sed command to get the result you expect:
sed -n 'N;$!P;$!D;a\line1\nline2' flw.py
Example:
cat flw.py
1
2
3
4
5
sed -n 'N;$!P;$!D;a\line1\nline2' flw.py
Output :
1
2
3
line1
line2
Note:
Use the -i option to update the file in place.
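For the in-place form, keep -n and add -i (a sketch; GNU sed assumed):
sed -n -i 'N;$!P;$!D;a\line1\nline2' flw.py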

Getting an extra line when outputting to a file?

I'm using a diff command and printing its output to a file. The file keeps getting an extra line at the end that I don't want. How can I prevent it from being there?
The command is as follows:
diff -b <(grep -B 2 -A 1 'bedrock.local' /Applications/MAMP/conf/apache/httpd.conf) /Applications/MAMP/conf/apache/httpd.conf > test.txt
The file being used is here (though I don't think it matters): http://yaharga.com/httpd.txt
At the very least I'd like to know how to check the last line of the file and delete it only if it's blank.
To delete an empty last line you can use sed; it will remove the line only if it's blank:
sed '${/^\s*$/d;}' file
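A quick check that only the trailing blank line is removed (note that \s is a GNU extension; with BSD/macOS sed use [[:space:]] instead):
printf 'a\n\nb\n\n' | sed '${/^[[:space:]]*$/d;}'
# prints a, an empty line, then b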
OK, I did some research with your file on my macOS machine.
I created a file new.conf with touch new.conf and then copied the data from your file into it.
By the way, I checked the file and it didn't have an extra empty line at the bottom.
I wrote a script script.sh with the following:
diff -b <(grep -B 2 -A 1 'bedrock.local' new.conf) new.conf > test.txt
sed -i.bak '1d;s/^>//' test.txt
It diffs what is needed, then deletes the useless first row and strips the leading > from each line, editing test.txt in place (keeping a .bak backup).
I checked again and no extra empty line was present.
Additionally, I would suggest you try deleting the extra line like this: sed -i.bak '$d' test.txt
And check the number of lines before and after with sed = test.txt.
Your text editor probably added this extra line to the file somehow. Try something else, nano for example, or vi.

Filter a file with another file in bash

I have a file with numbers, for example:
$cat file
31038467
32048169
33058564
34088662
35093964
31018168
31138061
31208369
31538163
31798862
and another, for example, with:
$cat file2
31208369
33058564
34088662
31538163
31038467
Then I need another file with the lines that are in the first file but not in the second:
cat $output
35093964
31018168
31138061
31798862
32048169
My real file has about 12,000,000 lines.
How can I do it?
Is
grep -f file2 -v -F -x file1
sufficient?
NOTE 1: Please specify in the question if you actually need this to be time/memory optimized.
NOTE 2: Get rid of any blank lines in file2.
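If time/memory does matter at 12 million lines, a common alternative is sort plus comm, at the cost of sorted rather than original-order output; a sketch:
# -23 suppresses lines unique to file2 and lines common to both, leaving lines only in file
comm -23 <(sort file) <(sort file2) > output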

Search for lines in a file that contain the lines of a second file

So I have a first file with an ID on each line, for example:
458-12-345
466-44-3-223
578-4-58-1
599-478
854-52658
955-12-32
Then I have a second file. It has an ID on each line followed by information, for example:
111-2457-1 0.2545 0.5484 0.6914 0.4222
112-4844-487 0.7475 0.4749 0.1114 0.8413
115-44-48-5 0.4464 0.8894 0.1140 0.1044
....
The first file only has 1,000 lines, with the IDs of the info I need, while the second file has more than 200,000 lines.
I used the following bash command in a fedora with good results:
cat file1.txt | while read line; do cat file2.txt | egrep "^$line\ "; done > file3.txt
However, I'm now trying to replicate the results on Ubuntu, and the output is a blank file. Is there a reason for this not to work on Ubuntu?
Thanks!
You can grep for several strings at once:
grep -f id_file data_file
Assuming that id_file contains all the IDs and data_file contains the IDs and data.
Typical job for awk:
awk 'FNR==NR{i[$1]=1;next} i[$1]{print}' file1 file2
This will print the lines from the second file whose first field appears in the first file. For even more speed, use mawk.
This line works fine for me on Ubuntu:
cat 1.txt | while read line; do cat 2.txt | grep "$line"; done
However, this may be slow, as the second file (200,000 lines) will be grepped 1,000 times (once per line of the first file).
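A single grep pass over the second file is much faster; anchoring the patterns keeps them matching the ID column only (a sketch, assuming every ID is followed by a space as in the sample):
grep -f <(sed 's/^/^/; s/$/ /' 1.txt) 2.txt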
