how to edit a huge text file inline [closed] - linux

I have a huge text file (100 GB) on Linux in which I need to edit a single line.
Clearly this can't be done with a regular text editor.
Is there a way to do this? Basically, jump to the nth line, edit it, and save the file back.

You can use the 'sed' stream editor to edit files of arbitrary size, as it does not need to load the entire file into memory at once. For instance:
sed '54 s/[0-9][0-9]*/gone/' < file_in.txt > file_out.txt
will replace a number found on line 54 with the word 'gone'.
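It also supports editing a file in place with the '-i' option, but I have never tried it on a hundred-gigabyte file. There is no reason it shouldn't work, but note that GNU sed implements '-i' by writing a temporary copy and renaming it over the original, so you need enough free disk space for a second copy of the file.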
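As a rough sketch of that workflow, assuming GNU sed (for '-i' without a backup suffix) and a small stand-in file created with seq instead of the real 100 GB file, you can look at the target line first and then edit it. The `q` command makes sed stop reading as soon as the line has been printed, which matters on a huge file:

```shell
# Stand-in for the huge file: 100 numbered lines (hypothetical demo data).
f=$(mktemp)
seq 1 100 > "$f"

# Print only line 54, then quit without reading the rest of the file.
before=$(sed -n '54{p;q;}' "$f")

# Edit line 54 in place (GNU sed rewrites the file via a temporary copy).
sed -i '54 s/[0-9][0-9]*/gone/' "$f"

after=$(sed -n '54{p;q;}' "$f")
echo "before: $before"
echo "after:  $after"
```

The in-place edit itself still has to copy every line, so it takes roughly as long as copying the file.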

If you know the exact byte offset of the location to edit, and the edit does not change the length of the line, then you could fseek() to that offset, read the line in, change it, and write it back out in place.
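The same idea can be sketched from the shell with dd instead of fseek(). This is only an illustration under stated assumptions: the demo file, the line number `n`, and the replacement text `new` are made up, and the replacement must have exactly the same byte length as the text it overwrites. The byte offset of line n is the combined size of the first n-1 lines.

```shell
# Small demo file; in practice this would be the huge file.
f=$(mktemp)
printf 'alpha\nbravo\ncharlie\n' > "$f"

n=2           # line to edit (1-based)
new='BRAVO'   # replacement text, same byte length as the old "bravo"

# Byte offset where line n starts = bytes in the first n-1 lines.
offset=$(( $(head -n $((n - 1)) "$f" | wc -c) ))

# Overwrite in place: seek skips to the offset (in 1-byte blocks),
# conv=notrunc keeps everything after the written bytes.
printf '%s' "$new" | dd of="$f" bs=1 seek="$offset" conv=notrunc 2>/dev/null

cat "$f"
```

Only the bytes of the edited line are written, so no second copy of the file is needed. Finding the offset still means reading the file up to line n.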

Suppose there is a 6000-line 'example.txt' and you want to change the 3001st line to 'hello world'.
head -n 3000 example.txt > tmp.txt
echo 'hello world' >> tmp.txt
tail -n 2999 example.txt >> tmp.txt
mv tmp.txt example.txt
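A variation on the same head/tail approach, sketched here with a generated demo file, avoids having to know the total line count: `tail -n +K` prints from line K to the end of the file, so only the target line number `n` (a hypothetical variable) is needed.

```shell
# Demo file with 6000 numbered lines standing in for example.txt.
f=$(mktemp)
tmp=$(mktemp)
seq 1 6000 > "$f"

n=3001   # line to replace

{
  head -n $((n - 1)) "$f"      # lines 1 .. n-1
  echo 'hello world'           # the new line n
  tail -n +$((n + 1)) "$f"     # lines n+1 .. end
} > "$tmp"

mv "$tmp" "$f"
sed -n '3000,3002p' "$f"
```

Like the original commands, this writes a full second copy of the file before replacing it.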


How to find all lines which contain at least one of a set of words as a prefix [closed]

I have a text file of words, one per line, called A.
I have another text file B.
How can I find all lines in B that have at least one of the words from A as a prefix?
I was hoping to be able to do this from the command line maybe using grep but any other command line solution would be great too.
For example, if A is
apple
bob
cheese
and B is
aple
bob123
ches
I would like the line bob123 to be returned.
One approach uses bash's process substitution and sed to add a regular expression beginning-of-line ^ anchor to each line of A, and then tells grep to use it as a list of regular expressions to search for:
$ grep -f <(sed 's/^/^/' a.txt) b.txt
bob123
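The grep approach treats every line of A as a regular expression. If the words might contain characters such as `.` or `*`, a literal comparison may be safer. One possible sketch uses awk's `index()`, which returns 1 when the word sits at the very start of the line; the temporary files here simply recreate the A and B from the question.

```shell
# Recreate the sample files from the question.
a=$(mktemp)
b=$(mktemp)
printf 'apple\nbob\ncheese\n' > "$a"
printf 'aple\nbob123\nches\n' > "$b"

# First file: remember every word. Second file: print a line if any
# remembered word occurs at position 1 (a literal prefix match).
matches=$(awk 'FNR == NR { words[$0]; next }
               { for (w in words) if (index($0, w) == 1) { print; break } }' "$a" "$b")

printf '%s\n' "$matches"
```

This loops over every word for each line of B, so with a very large word list the grep version is likely to be faster.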

Replace with sed on csv file [closed]

I have a csv file and I am trying to substitute the last letter for a word...
The input is
1111;AAA;... (more columns);A1a;A
2222;XXX;... (more columns);T3g;B
... (more rows)
...(more rows)
4564;AdA;... (more columns);G1a;A
33321;B1X; ... (more columns);T3g;B
And I want to replace A with "Avocado" and B with "Banana"...
I tried
#sed -e "s/;A$/;C/g" file.csv
But it doesn't work. Any advice, please?
Is the following what you're trying to achieve?
tink#host:~/tmp$ sed 's/A$/Avocado/;s/B$/Banana/' file.csv
1111;AAA;... (more columns);A1a;Avocado
2222;XXX;... (more columns);T3g;Banana
... (more rows)
...(more rows)
4564;AdA;... (more columns);G1a;Avocado
33321;B1X; ... (more columns);T3g;Banana
If that looks correct, and you want to change in-file, add a -i to sed.
If you want a new file, add a > new_file to the end of the line.
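This seems to work:
sed -i 's/A$/Avocado/' file.csv
sed -i 's/B$/Banana/' file.csv
The -i option edits the file in place. The replacement text must not contain a $, because there it is a literal character rather than an end-of-line anchor. The regex works without the ; only as long as the last field is a single letter; if another value could end in A or B, keep the separator in the pattern, as in 's/;A$/;Avocado/'.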
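To see why the separator can matter, here is a small check on made-up rows (the third row, whose last field is "CA", is hypothetical and not from the question). With `;` in the pattern only a last field that is exactly A or B is rewritten:

```shell
# Hypothetical sample rows; the last one ends in "CA" and should stay untouched.
csv=$(mktemp)
printf '1111;AAA;A1a;A\n2222;XXX;T3g;B\n3333;CCA;G1a;CA\n' > "$csv"

# Anchor on the separator as well as the end of the line.
out=$(sed 's/;A$/;Avocado/; s/;B$/;Banana/' "$csv")

printf '%s\n' "$out"
```

Without the `;`, the third row would become `3333;CCA;G1a;CAvocado`.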

How do I search a text file for a list of phrases contained in another text file? [closed]

I have a text file with many redundancies in English, one on each line, for example
in excess of
in order to
in order for
...
I would like to search another text document to see if it contains any of these phrases. If it does, all it needs to do is print the phrase; I can do the rest manually. Can I do this easily on the command line?
Sure, try grep:
grep -F -f phrases.txt doc.txt
phrases.txt
in excess of
in order to
in order for
doc.txt
use grep -f in order to match this line
this line won't be matched
You can use grep -o to print only the matched phrase instead of the entire line:
$ grep -o -F -f phrases.txt doc.txt
in order to
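If the document is long, a de-duplicated list of the phrases that occur may be easier to work through. The following is a sketch only: the document text is invented for the demo, `-i` is added on the assumption that phrases at the start of a sentence should match too, and the matches are lowercased before sorting so each phrase is listed once.

```shell
# The phrase list from the question and a made-up document.
phrases=$(mktemp)
doc=$(mktemp)
printf 'in excess of\nin order to\nin order for\n' > "$phrases"
printf 'In order to win, spend in excess of ten.\nNothing redundant here.\nWe did it in order to test.\n' > "$doc"

# -o: print only the matched text, -i: ignore case, -F: fixed strings.
found=$(grep -o -i -F -f "$phrases" "$doc" | tr '[:upper:]' '[:lower:]' | sort -u)

printf '%s\n' "$found"
```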

remove whole line from file with 2 column from a text file with 1 column in linux [closed]

E.G
File A
abc 123
def 456
ghi 789
File B
def
Resultfile
abc 123
ghi 789
I tried it with sed and grep, but it just won't work. I just started learning Linux and couldn't find anything similar.
Thank you
//*-----------
grep -wvf worked, but now I see that I have a problem with strings that have a "#" in front: those are removed too. If I change it to grep -wxvf, the command doesn't work at all. Do I need a command other than grep?
awk 'FNR==NR{a[$0];next}(!($1 in a)){print}' fileb filea
{a[$0];next} - this block runs while FNR==NR, that is, for every line of fileb. "next" ensures that no further code is executed for that line.
After the last line of fileb, you have an associative array whose keys are the lines of fileb and whose values are empty.
Then processing of the lines in filea starts.
{print} is executed for the lines of filea, but only on the condition (!($1 in a)),
which means a line of filea is printed only if its first field does not exist as a key in the associative array a.
Since I am not able to reply to your comment, I am posting here. In order to delete the lines that begin with "#", try
sed '/^#/d' filea fileb
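The awk command can be tried on the sample data from the question. In this sketch the extra line "#def 000" is made up to mimic the "#" problem described above. Because awk compares the whole first field, "#def" is not the same key as "def", so that line is kept:

```shell
# File A from the question plus one hypothetical "#"-prefixed line; File B as given.
filea=$(mktemp)
fileb=$(mktemp)
printf 'abc 123\ndef 456\n#def 000\nghi 789\n' > "$filea"
printf 'def\n' > "$fileb"

# Load the keys from fileb, then print the lines of filea whose first field is not a key.
result=$(awk 'FNR == NR { a[$0]; next } !($1 in a)' "$fileb" "$filea")

printf '%s\n' "$result"
```

With grep -w, "#" counts as a non-word character, so "def" also matches inside "#def"; that would explain why those lines disappeared with grep -wvf.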

inserting a certain text between in every occurrence of two following tabs [closed]

I am trying to insert NA between every occurrence of two tab characters immediately following each other in a text file. How can I do it with a sed command?
This might work for you (GNU sed):
sed ':a;s/\t\t/\tNA\t/g;ta' file
This covers all occurrences of \t\t throughout a file, including runs of three or more tabs, because the loop repeats until no substitution is made.
Or if you prefer:
sed 's/\t\t/\tNA\t/g;s//\tNA\t/g' file
Like this:
sed 's/xx/xNAx/g' file
where you type x using Control-V TAB
Or, if you have GNU sed, you can type:
sed 's/\t\t/\tNA\t/g' file
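The difference between the single-pass and the looping commands shows up when three or more tabs follow each other. A small sketch, assuming GNU sed (for `\t` and the one-line `:a;...;ta` form) and translating tabs to `|` afterwards so the result is visible:

```shell
# One line with three consecutive tabs between "a" and "b".
line=$(printf 'a\t\t\tb')

# Single pass: after replacing the first pair, only one tab is left to scan.
once=$(printf '%s\n' "$line" | sed 's/\t\t/\tNA\t/g' | tr '\t' '|')

# Loop: repeat the substitution until it no longer matches.
loop=$(printf '%s\n' "$line" | sed ':a;s/\t\t/\tNA\t/g;ta' | tr '\t' '|')

echo "single pass: $once"
echo "loop:        $loop"
```

The single pass leaves one empty field behind (`a|NA||b`), while the loop fills both (`a|NA|NA|b`).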
