Linux - Finding the location of characters in the lines of a text file

Suppose I have a text file:
AAAAABDCBBCDA
AAAAACDABBCDA
AAAAADAABBCDA
AAAAABBCBBCDA
AAAAADCBBBCDA
AAAAAABCBBCDA
Because these lines are short, I can see at a glance which positions (columns) differ. Positions (columns) 1~5 and 9~13 are identical on every line, while positions (columns) 6~8 differ.
Which command can I use to locate the identical parts? (That is, which command would report positions 1 to 5 and 9 to 13 as the result?)
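
The column comparison described in the question can be sketched in a few lines of Python. This is only an illustration, not a standard command: the function names common_columns and to_ranges are made up for this example, and the sample lines are the ones from the question.

```python
def common_columns(lines):
    """Return the 1-based columns whose character is identical in every line."""
    if not lines:
        return []
    # Only compare up to the length of the shortest line.
    width = min(len(line) for line in lines)
    return [i + 1 for i in range(width)
            if len({line[i] for line in lines}) == 1]


def to_ranges(columns):
    """Collapse sorted column numbers into (start, end) runs."""
    ranges = []
    for col in columns:
        if ranges and col == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], col)
        else:
            ranges.append((col, col))
    return ranges


sample = [
    "AAAAABDCBBCDA",
    "AAAAACDABBCDA",
    "AAAAADAABBCDA",
    "AAAAABBCBBCDA",
    "AAAAADCBBBCDA",
    "AAAAAABCBBCDA",
]
print(to_ranges(common_columns(sample)))  # [(1, 5), (9, 13)]
```

To run it on a real file, the lines could be read with something like [line.rstrip("\n") for line in open(path)] and passed to the same functions.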

Related

python - handle strings in a file with some Japanese in it

I have a .c file that I want to open with python 3 to update a specific number on a specific line.
It seems like the most common way to do this would be to read the file in, write each line to a temporary file, when I get to the line I want, modify it, then write it to the temp file and keep going. Once I'm done, write the contents of the temp file back to the original file.
The problem I have is that the comments in the file contain Japanese characters. I know I can still read the file by adding the errors='ignore' argument, which lets me read the lines but drops the Japanese characters completely, and I need to preserve those.
I haven't been able to find a way to do this. Is there any way to read in a file that's partly in Japanese and partly in English?
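
One possible approach, sketched below, is to open the file with its actual encoding instead of ignoring decode errors. The sketch assumes the encoding is known (Shift_JIS, EUC-JP and UTF-8 are common for Japanese source files). The function name replace_line and the file name in the usage comment are hypothetical.

```python
def replace_line(path, line_number, new_text, encoding):
    """Replace one 1-based line of a text file, keeping every other character."""
    # newline="" keeps the original line endings (\n or \r\n) untouched.
    with open(path, "r", encoding=encoding, newline="") as src:
        lines = src.readlines()

    old = lines[line_number - 1]
    # Reuse whatever line ending the old line had.
    if old.endswith("\r\n"):
        ending = "\r\n"
    elif old.endswith("\n"):
        ending = "\n"
    else:
        ending = ""
    lines[line_number - 1] = new_text + ending

    with open(path, "w", encoding=encoding, newline="") as dst:
        dst.writelines(lines)


# Hypothetical usage:
# replace_line("driver.c", 42, "#define VERSION 7", encoding="shift_jis")
```

If the encoding is wrong, Python raises UnicodeDecodeError before anything is written, which is safer than silently dropping characters.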

Find a file with all given strings on Linux

Similar to this question How do I find all files containing specific text on Linux? but I want all the files that contain multiple given strings (these strings not necessarily next to each other or on the same line, just in the same file).
My use case is I am looking at a UI and want to modify the file which controls this particular screen. The codebase though is huge and it is difficult to locate this file. All I have to go on is some of the hardcoded strings on this screen which I would like to do the search on. The strings are quite generic though such as 'Done', 'Close', 'View Details'... Doing a search on any of these strings individually, using the answer from the linked question above, brings back too many results but I think doing the search on all of them together will filter it down enough to find the file.
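
The search described above (every string present somewhere in the same file) could be sketched in Python as follows. The function name files_containing_all and the directory in the usage comment are made up for this example, and files are read as UTF-8 with undecodable bytes ignored, which is an assumption about the codebase.

```python
import os


def files_containing_all(root, needles):
    """Yield paths under root whose contents include every string in needles."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file, skip it
            # The strings may be anywhere in the file, in any order.
            if all(needle in text for needle in needles):
                yield path


# Hypothetical usage:
# for path in files_containing_all("src", ["Done", "Close", "View Details"]):
#     print(path)
```

Each file is read fully into memory, so this suits source trees rather than directories with very large files.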

Delimiting quick-open path with fullstops in Sublime Text 3?

I'm making the move to ST3, and I'm having some trouble. I'd like to be able to delimit the quick-open filepath (⌘ + T) with periods instead of slashes or spaces. However, I can't find the setting to do that.
For example:
component.biz_site_promotions.presentation
should be able to open the file that
component biz_site_promotions presentation
would.
Any help would be greatly appreciated!

There is no setting in Sublime that changes the way this works; the search term is always used to directly match the text in the list items (except for space characters).
Note however that the Goto Anything panel uses fuzzy matching on the text that you're entering, so in many cases trying to enter an entire file name is more time consuming anyway.
As an example, to find the file you're mentioning, you could try entering the text cbspp, which in this case is the first letters of all of the parts of the file name in question.
As you add to the search term, the file list immediately filters down to text that matches what you entered; first only filenames that contain a C, then only filenames that contain a C that is followed somewhere after by a B, and so on.
Depending on the complexity and number of files in your project, you may need to add a few extra characters to narrow the results further (e.g. comb_s_pp). Usually this search method will either land you on the exact file you want, or filter the list so much that the file you want is easy to find and select.
Additionally, when you select an item and there was more than one possible match, Sublime remembers which item you selected for that particular search term and brings it to the top of the search results next time you do it, under the assumption that you want the same thing again.
As you use Sublime more (and with different projects) you will quickly get a handle on what partial search terms work the best for you.
In addition to finding files, you can do other things with that panel as well, such as jumping to a specific line and/or column or searching inside the file for a search term and jumping directly to it. This applies not only to the current file but also the one that you're about to open.
For more complete details, there is a page in the Unofficial Documentation that covers File Navigation with Goto Anything.
As an extra aside, starting with Sublime Text build 3154, the fuzzy searching algorithm handles spaces differently than previous builds.
Historically, spaces in the search term are essentially ignored and the entire input is treated as one search term to be matched character by character.
Starting in build 3154, spaces are handled by splitting up a single search term into multiple search terms, which are applied one after the other.
This allows multiple search terms to hit out of order. For example, index doc in build 3154 will find doc/index.html, but it won't find it in previous versions because the terms aren't in the right order.
As such, assuming you're not currently using such a build (as of right now it's a development build, so only licensed users have access to it), moving forward if you continue to search the way you're searching in your question, you might start getting more results than you expected.
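
The difference between the two behaviors can be illustrated with a simplified model of in-order character matching. This is not Sublime's actual algorithm (which also ranks results). The function names and the example path are made up for the illustration.

```python
def is_subsequence(term, text):
    """True if the characters of term appear in text in the same order."""
    remaining = iter(text.lower())
    # "ch in remaining" consumes the iterator up to the first match,
    # so later characters must be found after earlier ones.
    return all(ch in remaining for ch in term.lower())


def matches_before_3154(query, path):
    # Model of the older behavior: spaces ignored, one term matched in order.
    return is_subsequence(query.replace(" ", ""), path)


def matches_from_3154(query, path):
    # Model of the newer behavior: each space-separated term is matched
    # on its own, so the terms may hit in any order.
    return all(is_subsequence(term, path) for term in query.split())


path = "component/biz_site_promotions/presentation"
print(is_subsequence("cbspp", path))                      # True
print(matches_before_3154("index doc", "doc/index.html"))  # False
print(matches_from_3154("index doc", "doc/index.html"))    # True
```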

Edit huge sql data file

I have a 23GB file and I would like to edit the 23rd line, but I have only 200 MB of RAM available on the server. I do not want to open the file entirely because I have only 20GB of disk space left.
How can I do this? I tried head, tail, and sed, but they seem to create a temporary file. Is it possible to do it without a temporary file?

The solution is to edit the file with a hex editor. Hex editors are built to handle huge files, even whole disks and partitions.
You may find hexedit (ncurses based) or ghex (Gnome/Gtk based) useful. They are common utilities, so you will most probably find them in your distribution's repo.
All hex editors I have used show a twin panel view, with the left panel showing the bytes of the file in hex and the right panel showing an ASCII representation where that is possible.
In order to find and edit your 23rd line:
sed -n '23p' my_huge_dump.sql : Prints the contents of this line. (sed keeps reading to the end of the file; sed -n '23p;23q' my_huge_dump.sql stops after line 23.)
sed -n '23p' my_huge_dump.sql | od -A n -t x1 : Prints the contents of this line in hexadecimal format.
Alternatively, open the file with less -N my_huge_dump.sql and view the contents of line 23. (-N in less enables line numbering.)
Now, knowing the content of the 23rd line:
If the text of this line is somewhat unique and different from surrounding lines, you may find it in the right (ASCII) panel and navigate to this line with the arrow keys. In hexedit you use the Tab key to move between the hex and ASCII panels. In gHex you can use your mouse as well. You may also search for the string you're interested in: move to the ASCII panel and press / in hexedit or use the menu in gHex.
If the line you want to edit has similar contents to other lines and you can't find it in the ASCII panel, then you must count the "newline" separators to find the 23rd line. Newlines (LF) are represented as 0A in hex. In the ASCII panel, newlines are represented as dots (.).
Then assuming you found the line you want to edit, you have the following options:
Hopefully, the new content of the 23rd line is shorter than or equal in length to the existing content (so you won't need to grow and move the whole file). In this case, you have to use fill mode, i.e. the mode in which you type over the existing content. This is the default mode in both gHex and hexedit. Move to the location you want to edit and start typing. Pressing Backspace will undo your changes. If the new content is shorter than the existing content, fill the rest of the line with spaces so that the length of the file stays the same.
If the new content is longer than the existing content of this line, then you have to enter insert mode. You can do that using the menu in gHex. In hexedit you have to use the Esc+I keybinding. Then start typing and the new characters will be inserted at the current location.
In the first case, editing and saving the file is guaranteed to be instantaneous, since the edit happens in place. In the latter case, I'm not sure how the growth in size and the moving of the following bytes will be handled, but I hope the filesystem uses a larger non-contiguous block to move some of the contents rather than moving the whole file.
If you're happy with your changes, save the file:
Use the menu in gHex
Use Ctrl+X in hexedit and answer (Y)es when asked whether to save the changes.
Always make sure you have a backup in place!
EDIT: I found out that gHex isn't suitable for your situation, since it tries to load the whole file in memory. hexedit will serve you fine. However, if you want a graphical editor like gHex, but with partial file loading capabilities, you may try wxHexEditor. Check also the Comparison of Hex editors page in Wikipedia.
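
The same in-place overwrite (the first case above, where the new content is not longer than the old) can also be sketched in Python, without a hex editor or a temporary file. This is an illustration only: the function name overwrite_line_in_place is made up, and the sketch reads whole lines into memory, so a dump with extremely long lines before the target would need chunked scanning instead.

```python
def overwrite_line_in_place(path, line_number, new_text):
    """Overwrite one 1-based line without changing the size of the file.

    new_text is bytes and must not be longer than the existing line;
    shorter text is padded with spaces up to the old length.
    """
    with open(path, "r+b") as fh:
        # Skip the lines before the target; only these are ever read.
        for _ in range(line_number - 1):
            if not fh.readline():
                raise ValueError("file has fewer lines than requested")
        start = fh.tell()
        old = fh.readline().rstrip(b"\r\n")
        if len(new_text) > len(old):
            raise ValueError("new text is longer than the existing line")
        # Go back to the start of the line and type over the old bytes.
        fh.seek(start)
        fh.write(new_text.ljust(len(old), b" "))


# Hypothetical usage:
# overwrite_line_in_place("my_huge_dump.sql", 23, b"-- replaced")
```

Because the bytes are overwritten at their existing offsets, nothing after line 23 is read or moved.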

Liquid Studio Community Edition contains a Large File Editor which can open and edit terabyte-sized files on low-spec machines, and it's free.
It requires enough disk space to copy the file (when writing it back out), but hardly requires any memory.

Fortran77 automatically determine how many lines of text are at the top of a data file

An (old) instrument of mine is generating ASCII data files with text descriptions at the top of the file, before the data. But the number of lines of descriptive text varies from run to run. How can I get Fortran77 to determine this automatically?
Here is an example data file, below the line.
Line of explanatory text.
Notice the possible blank lines.
More text.
The number of lines is NOT the same every time.
1.0, 2.0
2.0, 4.0
3.0, 6.0
4.0, 8.0

[I found the answer myself. Posting here to help others. It is quite annoying having to wait 8 hours to answer my own question, but I understand why the rule exists. Stupid posers!]
A crude but effective solution, if your text never starts with a number (which is my case):
Assume the input file is named Data.dat.
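      integer NumTextLines
      real X
      open(8,file='Data.dat',status='old')
C     Start at -1: the counter is incremented before the first read.
      NumTextLines=-1
   50 NumTextLines=NumTextLines+1
C     A text line cannot be read into X, so control returns to 50.
      read(8,*,err=50) X
C     The first successful read means the data has started.
      close(8)
      open(8,file='Data.dat',status='old')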
Every time the program tries to read a word from a text line into the real variable X, the read statement fails and control jumps back to the statement labeled 50, which increments NumTextLines. Once a read succeeds, the data has started, so NumTextLines is not incremented any further. Close the file and re-open it to start over from the beginning. Now that NumTextLines is known, you can read the text one line at a time and either save it or skip it.
{Above method works on most of my files, but not all. Another way is to read each line into a character*500 variable (say, A), then test the ASCII value of the first element of the character array. But that gets complicated.}
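
The line-by-line alternative mentioned above can be sketched as follows. The sketch is in Python rather than Fortran 77, purely to illustrate the logic, and instead of testing the ASCII value of the first character it tests whether the first token parses as a number. The function name count_header_lines is made up for this example.

```python
def count_header_lines(lines):
    """Count the lines before the first line that starts with a number."""
    count = 0
    for line in lines:
        # Treat commas as separators, as in "1.0, 2.0".
        tokens = line.replace(",", " ").split()
        if tokens:
            try:
                float(tokens[0])
                return count  # first data line reached
            except ValueError:
                pass  # still descriptive text
        # Blank lines and text lines both count as header lines.
        count += 1
    return count


example = [
    "Line of explanatory text.",
    "",
    "Notice the possible blank lines.",
    "More text.",
    "The number of lines is NOT the same every time.",
    "1.0, 2.0",
    "2.0, 4.0",
]
print(count_header_lines(example))  # 5
```

Like the Fortran version, this assumes the descriptive text never starts with a number.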
