HotKey: Shift+Ctrl+F
Correct result:
Error results:
As you can see, the results just show as '<binary>'. I've searched for this problem on Google, but found nothing.
Here is the log text.
Thanks to anyone who can give a suggestion.
The file most likely contains non-UTF-8 encoded characters or binary characters, or the content encoding cannot be guessed. Thus, it is not reliable to show a search result summary.
You need to convert the control characters first:
java -jar replacecontrol.jar *filepath*
replacecontrol.jar
From the Sublime Text forum:
In general, if a file contains any ASCII characters < 32,
except 0x3, 0x9, 0xa, 0xc or 0xd, then it’s considered binary
Reading that, I searched for [\x00-\x09] and found a \x02.
Then I deleted that character, and the problem disappeared.
The next step would have been [\x0B-\x1F], since \x0A is the newline,
but I did not need it.
You can use the regexp [^[:print:]\t\r\n] to find and replace all such characters; after that you can see your results without <binary>.
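As a sketch of how you might locate the offending lines outside the editor (assuming GNU grep with PCRE support via -P; yourfile.txt is a placeholder name):

grep -nP '[^[:print:]\t\r\n]' yourfile.txt

The -n flag prints the line number of every line containing a non-printable character, so you know exactly where to clean up.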
Related
https://www.sec.gov/Archives/edgar/data/1383094/000119312518268345/0001193125-18-268345.txt
The part I'm unsure how to read can be found by using ctrl+f on the following characters: M_]C_X0QQ17AI9#
This is the start of a section I am not familiar with, which contains a long block of text characters with each row starting with "M".
Thanks for your help!
The character sequence you see there is a uuencoded JPEG picture (uuencoding is also why each row of the block starts with "M"). You can recognize this from the header line that opens the block:
begin 644 g61300267.jpg
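If you want to recover the image itself, one option is the uudecode tool (part of GNU sharutils on most Linux distributions); this is a sketch assuming you have saved the filing locally as filing.txt:

uudecode -o picture.jpg filing.txt

uudecode scans the input for the begin 644 ... / end markers and writes the decoded binary to the file named with -o.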
I have a ciphered text file where A=I a=i !=h etc., I know the right substitutions. How can I generate a readable form of the text?
I have read that this is a substitution cipher.
tr 'Aa!' 'Iih'
This performs the following transformations: A→I, a→i, !→h. If you want the other way around as well (A→I, I→A, …), the command is
tr 'Aa!Iih' 'IihAa!'
The N-th character of the first set is converted to the N-th character of the second set. Read man 1 tr for more information.
Please note that GNU tr, which you have on Linux, doesn't really have a concept of multibyte characters, but instead works one byte at a time; so if your substitutions involve non-ASCII multibyte UTF-8 characters, the command won't work as expected.
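In practice you would feed the ciphered file through tr and redirect the output; ciphered.txt and plain.txt are placeholder names here:

tr 'Aa!Iih' 'IihAa!' < ciphered.txt > plain.txt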
Use CyberChef or another encryption tool:
Deciphering is fairly simple. Just add the Substitute operation to the recipe, then enter your key characters and their values so that each key lines up with the value it maps to.
CyberChef was created by GCHQ, the British signals intelligence agency.
A Google search for "solve substitution cipher" yields several websites that can solve it for you, e.g. https://quipqiup.com and https://www.guballa.de/substitution-solver.
I must warn you I'm a beginner. I have a text file in which some lines contain encoding errors. By "error", this is what I get when parsing the file in my linux console (question marks instead of characters):
I want to remove every line showing those "question marks". I tried to grep -v the problematic character, but it doesn't work. The file itself is UTF8 and I guess some of the lines come from texts encoded in another format. I know I could find a way to reconvert them properly, but I just want them gone for now.
Do you have any ideas about how I could do this please?
PS: Some lines contain diacritics which are displayed fine. The "strings" command seems to remove too many "good" lines.
When dealing with mojibake in a character encoding other than plain ANSI, you must check two things:
Is the file really encoded in X? (X being UTF-8 WITHOUT BOM in your case. You could be trying to read UTF-8 WITH BOM, UTF-16, latin-1, etc. as UTF-8, and that would be the problem). Try reading in (not converting to) other encodings and see if any of them fits.
Is your locale or text editor set to read the file as UTF-8? If not, that may be the problem. Check for support and figure out how to change the setting. On Linux, use the locale command to check the current setting and the LANG/LC_ALL environment variables to change it; see the sketch below.
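A minimal sketch of both checks (suspect.txt and the en_US.UTF-8 locale are assumptions; substitute whatever your system actually has):

# 1. Ask for an encoding guess, then try reading the file as latin-1 instead of UTF-8
file -i suspect.txt
iconv -f latin1 -t utf-8 suspect.txt | less

# 2. Check the current locale, then switch this shell session to UTF-8
locale
export LC_ALL=en_US.UTF-8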
I like how Notepad++ for Windows (which also runs perfectly on Linux using Wine) lets you set any encoding to read the file without trying to convert it (of course, if you pick one other than the encoding the file is actually in, you will only see those weird characters). It also has a separate option to convert from one encoding to another, which has been pretty useful to me.
If you are a beginner you may be interested in this article. It explains briefly and clearly the whats, whys and hows of character encoding.
[EDIT] If all of the above fails, even windows-1252 and similar ANSI encodings, I've just learned here how to remove non-ASCII characters using the tr unix command, turning the file into plain ASCII (but be aware that the information in the extra characters is lost in this output and there is no coming back, so keep the input file in case you find a better fix):
tr -cd '\11\12\40-\176' < "$INPUT_FILE" > "$OUTPUT_FILE"
or, if you want to get rid of the whole line:
grep -v -P "[^\11\12\40-\176]" "$INPUT_FILE" > "$OUTPUT_FILE"
[EDIT 2] This answer here gives a pretty good guess of what could be happening if none of the encodings work on your file (Unfortunately the only straight forward solution seems to be removing those problematic characters).
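A possible alternative, assuming the problem bytes are invalid UTF-8 sequences rather than misread valid ones: iconv with the -c flag silently drops anything it cannot convert, which keeps accented characters that are valid UTF-8 while discarding the garbage:

iconv -c -f UTF-8 -t UTF-8 "$INPUT_FILE" > "$OUTPUT_FILE"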
You can use a tiny Perl one-liner like:
perl -pe 's/[^[:ascii:]]+//g;' my_utf8_file.txt
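If you would rather clean the file in place, Perl's -i switch does that, and -i.bak keeps a backup of the original first:

perl -i.bak -pe 's/[^[:ascii:]]+//g;' my_utf8_file.txt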
Here's the problem in a nutshell.
I wrote a text file which I need to print out (in a hurry) that contains central european characters (šđčćž/ŠĐČĆŽ).
Vim's encoding settings are as follows:
set encoding=cp1250
set fileencoding=
Upon printing, out comes garbage. What should be changed to fix that?
I really hate Vim's freakin' 1001 options at a time like this. Can't it do a simple thing and just print what's on screen?!
Check the option printencoding.
The help says it's empty by default, and when the encoding is multi-byte Vim tries to convert the text to the printencoding. Plus, if it's empty, "the conversion will be to latin1". This is what may be causing the trouble; see the sketch below.
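As a guess at a fix (assuming your printer setup can handle the cp1250 code page), tell Vim to use the same encoding for printing as for the file, then print:

:set printencoding=cp1250
:hardcopy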
I'd like to ask: why not use UTF-8?
I have a file that was converted from EBCDIC to ASCII. Where there used to be new lines there are now characters that show up as <85> (a symbol representing a single character, not the four characters it appears to be) and the whole file is on one line. I want to search for them and replace them all with new lines again, but I don't know how.
I tried putting the cursor over one and using * to search for the next occurrence, hoping that it might show up in my / search history. That didn't work, it just searched for the word that followed the <85> character.
I searched Google, but didn't see anything obvious.
My goal is to build a search and replace string like:
:%s/<85>/\n/g
Which currently just gives me:
E486: Pattern not found: <85>
I found "Find & Replace non-printable characters in vim" searching Google. It seems like you should be able to do:
:%s/\%x85/\r/gc
Omit the c to do the replacement without prompting; try it with c first to make sure it is doing what you want.
In Vim, typing :h \%x gives more details. In addition to \%x, you can use \%d, \%o, \%u and \%U for decimal, octal, up to four and up to eight hexadecimal characters.
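For example, all four of the following searches match the same <85> character, since 0x85 is 133 in decimal and 205 in octal:

/\%x85
/\%d133
/\%o205
/\%u0085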
For searching special characters, for example the win1252 bytes displayed as <80>, <90>, <9d> and so on, type:
/\%u80, /\%u90, /\%u9d ...
in the editor.
Similarly for octal, decimal and hex codes, type: /\%oYourCode, /\%dYourCode, /\%xYourCode.
Try this: :%s/<85>/^M/g
Note: both <85> and ^M here must be entered as literal characters, not typed-out text. For the pattern, press Ctrl-V, then type x85 to insert the <85> byte; for the replacement, press Ctrl-V, then Enter to produce ^M, which inserts a line break.
Or, if you don't mind using another tool, awk can do it. Note that the file contains a single 0x85 byte, which Vim merely displays as <85>, so the pattern must be that byte itself; \205 is its octal escape, and LC_ALL=C keeps awk working byte-by-byte:
LC_ALL=C awk '{gsub("\205","\n")}1' file
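tr works just as well, again using the octal escape for the 0x85 byte (fixed_file is a placeholder name):

LC_ALL=C tr '\205' '\n' < file > fixed_file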