Edit huge sql data file - linux

I have a 23GB file and I would like to edit the 23rd line, but I have only 200 MB RAM available on the server. I do not want to open the file entirely because I have left only 20GB available disk space.
How can I do this. I tried to use head, tail sed but it seems it creates a temporary file. Is it possible to do it without a temporary file?

The solution is to edit the file with a hex editor. Hex editors are built to handle huge files, even whole disks and partitions.
You may find hexedit (ncurses based) or ghex (Gnome/Gtk based) useful. They are common utilities, therefore you will most probably find them in your distributions's repo.
All hex editors I have used, use a twin panel view with the left panel showing the bytes of the file in Hex, and the right panel trying to show an Ascii representation when that is possible.
In order to find and edit your 23rd line:
sed -n '23p' my_huge_dump.sql : Will print the contents of this line
sed -n '23p' my_huge_dump.sql | od -A n -t x1 : Will print the contents of this line in hexadecimal format.
or open the file with less -N my_huge_dump.sql and view the contents of line 23. (-N in less enables line numbering)
Now, knowing the content of the 23rd line:
If the text of this line is somewhat unique and different from surrounding lines, you may find it from the right (ascii) panel and navigate to this line with the arrows. In hexedit you use the Tab key to move between the Hex and Ascii panels. In gHex you can use your mouse as well. You may also search for the string you're interested: Move to the Ascii panel and press / in hexedit or use the menu in gHex.
If the line you want to edit has similar contents to other lines and you can't find it in the ascii panel, then you must count the "newline" separators to find the 23rd line. New lines (LF) are represented as 0A in hex. In the ASCII panel, new lines are represented as dots .
Then assuming you found the line you want to edit, you have the following options:
Hopefully, the new content of the 23rd line is shorter or equal in length to the existing content (so you won't need to grow and move the whole file). In this case, you have to enter the Fill-mode i.e. the mode in which you overwrite existing content typing over the old text. This is the default mode in both gHex and hexedit. Move to the location you want to edit and start typing. Pressing Backspace will undo your changes. If the new content is shorter than the existing, you may fill up the line with spaces to avoid truncating the file.
If the new content is longer than the existing one in this line, then you have to enter the Insert mode. You can do that using the Menu in gHex. In hexedit you have to use the EscI keybinding. Then start typing and the new characters will be appended in the current location.
In the first case, it is guaranteed that the editing and saving of the file will be instantaneous since an in-place edit will happen. In the later case, I'm not sure how the growing in size and the moving of following bytes will be handled, but I hope the filesystem uses a larger non-continuous block to move some of the contents and not move the whole file.
If you're happy with your changes, save the file:
Use the menu in gHex
Use Ctrlx in hexedit and answer (Y)es when questioned about whether to save the changes.
Always make sure you have a backup in place!
EDIT: I found out that gHex isn't suitable for your situation, since it tries to load the whole file in memory. hexedit will serve you fine. However, if you want a graphical editor like gHex, but with partial file loading capabilities, you may try wxHexEditor. Check also the Comparison of Hex editors page in Wikipedia.

Liquid Studio Community Edition contains a Large File Editor which can open and edit Terra-byte files on low spec machines, and its free.
It requires enough disk space to copy the file (when writing it back out), but hardly requires any memory.

Related

set tab-stop = 2 in vim permanently for a file

How to set the tab size as 2 for a file permanently in vim as whenever I open a file in other editors like nano or upload the file in github then my indentations are all incorrent whenever I try to resize the tab to 2 for an existing file which has all incorrect indentations. The tab-stop=2 does not permanently resizes the tab and I see all incorrect indentation when I open the same file in nano or view it in github.
Tabs don't have an inherent size so it is up to each program to decide how to display them and there is simply no way to guarantee that a tab will always look the same everywhere.
This is precisely the main issue people have with tabs: you can tell $SOME_TOOL and $SOME_OTHER_TOOL that a tab takes two spaces but that setting can't possibly be carried over to every tool.
Modelines are editor-specific (and they are too intrusive anyway) and Editorconfig is not universally supported so there is really no universal solution beyond using spaces for indentation.

IBM Mainframe copy/paste

Disclaimer: I'm new to using Rumba to access IBM Mainframe.
I have currently set up a library for personal use and I have some code that I want to store in a member of this library, how can I copy/paste from a .txt file on my desktop into this program??? As of right now I can successfully copy/paste one line at a time from documents outside of Rumba.
There are various ways. The best one will depend upon the size of the file/amount of data to be transferred.
If it's only a few lines, block copy and paste should work, but you might have to play with Rumba's 'paste' edit settings such as how to handle new lines, etc.
Bigger files can be transferred with the TSO file transfer program indĀ£file (maybe ind$file on your system) which essentially copies a file to the screen and then Rumba 'scrapes' the screen for data to put into a file (this is for a mainframe-to-PC transfer; for going the other way the operation is reversed). This can be surprisingly quick.
Lastly there's FTP - either from the command line or via a program such as WinSCP.
Edit:
Based on your comment that the files are about 300 lines long, I'd look into using Rumba's file-transfer option using the ind$file utility. Once you have the files on one system, speak to your mainframe tech support team about the best way to get them to the other systems.
If you need help uploading the files, then the tech support team should be your first point of call.
What mainframe editor are you running? TSO/ISPF?
I copy and and paste from ".txt" files into ISPS all the time with no problem.
Select the text you want to copy (in the ".txt" file)
Press CTRL-C
Open the mainframe file using ISPF Edit (option 2).
Enter line command "Inn" at the line where where you want the copy to start.
(This inserts "nn" empty lines to receive the copied data. Personally, I usually use "nn"=20)
Position your cursor at the first character of the first empty line.
Press CTRL-V

Is it possible to insert a file into an exe?

I need to insert a generated file into an exe at the time of download. Currently, I create an "empty" file (filled with a repeating character) and package that with the exe. When it comes time to download, I look at the bytes for the installer, find the file by looking for the repeating character, and insert the generated file.
This process however is not working. The repeating character just does not show in the bytes. But I am certain the file is there as it is unpacked if I run the exe. Am I doing something wrong or is inserting a file into an exe even possible?
Also note that I am using Inno Setup Script v5.5.1 to compile the project into an exe.
If you want to change the contents of a file specified in a [Files] entry and compiled into the setup executable, then you must:
Make a dummy file that is at least as large as the largest content you will want to insert.
Fill the file (or at least the first 64 bytes or so) with something unique and easily distinguishable.
Mark its [Files] entry with the "nocompression noencryption dontverifychecksum" flags.
You should then be able to scan the resulting executable for the marker in #2 and then substitute the data that you want. Note however that doing this might invalidate any digital signature on the setup file, although I haven't tested this to be sure.
Note that if the content you are inserting is smaller than the dummy file size, the extra bytes will still remain on the end of your inserted content. So whatever reads the file will have to have some way to ignore that or to recognise the end of the interesting content.
So, if your are making changes in the existing exe file, and if the text is not much, you can probably use some hex editor and make changes at desired location. If text is more , you might want to include some meaningless bytes, just as fillers.

How do I view a log file generated by screen (screenlog.0)

So I just found out I can create log files of everything I do in screen (C-a H). Sounds like a nice way to keep track of potential goofs in a particular screen session. However, when I went to try it out the logfile is reported as being a binary file (and can't be viewed like a regular text as such). So am I missing something? A quick man page looksee and searching Google (and SO) turns up nothing about this.
So my question is: How do I generate plain text log files in screen?
Assuming the answer is "What a noob... how about you try making them? RTFM." my question becomes: How do I use less to view screen logfiles I've created (since less screenlog.0 does not work on a binary file)?
EDIT: So cat works fine but less complains that the file is binary... why?
SOLUTION: as jcomeau_ictx helpfully pointed out, you can view these logfiles fine with cat or more but with less you must add the -r flag less -r screenlog.0
I just found a screenlog.0 on the net; it is plain text, with some escape sequences. Just 'cat' the file, you should be able to view it just fine.
[after more checking]
Control-A H is what generates the screenlog on my system. And though 'cat' works, you'll miss a lot of data. Use 'more' instead of 'less' to interpolate the escape codes.
I found neither less nor more nor cat to be an ideal solution for viewing screenlog files. All "replay" some of the control character so that e.g. screen deletions as produced by "clear" (don't remember the corresponding control character) are beeing shown, hiding what has been cleared.
What i know works great is: use "view" or "vi", it just shows the control character in escaped notation. Probably any other text editor works, too (not tested).
-L logs to file,
tail -f 'logfilename' to monitor this file

I want to change the way text is represented internally in ANY Text Editor

I want to use a algorithm to reduce memory used to save the particular text file.I don't really know how text is stored but i have an idea in mind.
Would it be better to extend a open source text editor (if yes than which one) or write a text editor myself.
It would be nice if someone could also give me a link or tutorial to some basics on how text editors work and the way data is stored.
Edited to add
To clarify, what I wanted to do is instead of saving duplicates of a word make a hash table and store the address where it needs to be placed.
That way I wouldn't be storing the duplicates.
This would have become specific to a particular text editor.
Update
thanks everyone I got what all of you'll are trying to say. Anyways all i wanted to do is instead of saving duplicates of a word make a hash table and store the address where it needs to be placed.
This was i wouldn't be storing the duplicates.
Yes and this would have become specific to a particular text editor. never realized that.
I want to use a algorithm to reduce memory used to save the particular text file
If you did this you would no longer have a text editor, but instead you would have created some sort of binary file editor.
The whole point of the text file format is that it is universal, meaning any text file can be open in any other text editor.
Emacs handles compression transparently. Just create a text file with .gz extension. Emacs will automatically compress contents of the file during save operation, and decompress when you open the file next time.
Text is basically stored as-is. i.e., every character takes up a byte or two (wide chars), and there is no conversion done on it when it's saved. It might add an end-of-file character or something though. Don't try coming up with your own algorithm to compress these files. That's why zip-files and other archives were created. They're really good at compressing text. If you wanted to add these feature to your text-editor, you'd have to add some sort of post-save hook to zip it, and then put a hook on the open command to unzip it. Unless you wanted to do it by hand every time. Don't try writing the text editor yourself from scratch, unless (maybe) you're writing notepad. Text editors with syntax highlighting aren't very easy to make, even with the proper libraries. I'd say write a plugin for something like Visual Studio or what have you. Or find an open-source text editor.

Resources