I have a ciphered text file where A=I, a=i, !=h, etc., and I know the right substitutions. How can I generate a readable form of the text?
I have read that it's a substitution cipher.
tr 'Aa!' 'Iih'
This performs the following transformations: A→I, a→i, !→h. If you want the other way around as well (A→I, I→A, …), the command is
tr 'Aa!Iih' 'IihAa!'
The N-th character of the first set is converted to the N-th character of the second set. Read man 1 tr for more information.
Please note that GNU tr, which you have on Linux, doesn't really have a concept of multibyte characters, but instead works one byte at a time; so if your substitutions involve non-ASCII multibyte UTF-8 characters, the command won't work as expected.
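For completeness, a minimal invocation might look like this (the file names are placeholders; tr reads standard input and writes standard output):

tr 'Aa!Iih' 'IihAa!' < ciphered.txt > plain.txt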
Use CyberChef or another encryption tool:
Deciphering is fairly simple: just select the Substitute operation and put it into the recipe, then enter your key characters alongside your plaintext values so that keys and values line up in a column.
CyberChef was created by GCHQ, the British signals intelligence agency.
A Google search for "solve substitution cipher" yields several websites that can solve it for you, e.g. https://quipqiup.com and https://www.guballa.de/substitution-solver.
I know I can use :! with a visual selection to pipe selected lines through an external command, but is there a way to do the same for a single word on a line? I need to base64-encode tokens in a config file, and I'm having trouble because the entire line is sent to base64. If I move the word to its own line, I end up with a trailing \n character encoded into the base64 string. I know there's a plugin specifically for this, but in general, is it possible to pipe units of the buffer smaller than entire lines through an external program?
The Ex commands (:! is one of them) are all line-based, because the ex editor on which this mode is based was line-based.
If you need filtering of parts of lines often, I would indeed recommend using one of the plugins. @romainl's answer outlines the (tedious) steps if you want to do this manually - plugins can greatly simplify that:
With the venerable vis.vim, you can use :B !base64
The unimpaired.vim plugin formerly had ]Y / [Y mappings to encode / decode Base64 directly (implemented in Vimscript)
express.vim defines a g= operator that queries for an expression to apply to the selected text. You can use !base64 here.
Ex commands (everything that starts with :) work on lines and there's nothing you can do about that.
Filtering "non-lines" is more involved. You need to:
yank the selection,
escape it if necessary,
run your filter with that selection in a subshell,
clean up the output if necessary,
replace the selection with the output of the filter.
In a nutshell:
c<C-r>=system('echo "<C-r>"" | base64 | tr -d "\n"')<CR>
which is obviously a lot more work than for filtering lines. Map it to something easier.
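If you want something reusable, here is a rough sketch of such a mapping (not taken from the answer above; the function name and the <leader>64 key are arbitrary choices). It passes the selection to base64 on stdin via system()'s second argument, which sidesteps the manual escaping step:

function! s:Base64Selection() abort
  " Save the unnamed register so we can restore it afterwards.
  let saved = @"
  " Reselect and yank the last visual selection.
  normal! gvy
  " Encode it; base64 may wrap its output, so strip all newlines.
  let encoded = substitute(system('base64', @"), '\n', '', 'g')
  " Paste the encoded text over the selection.
  call setreg('"', encoded, 'v')
  normal! gvp
  let @" = saved
endfunction
xnoremap <silent> <leader>64 :<C-u>call <SID>Base64Selection()<CR>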
I am trying to format some text that was converted from UTF-16 to ASCII; the output looks like this:
C^#H^#M^#M^#2^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#
T^#h^#e^#m^#e^# ^#M^#a^#n^#a^#g^#e^#r^# ^#f^#o^#r^# ^#3^#D^#S^#^#^#^#^#^#^#^#^#^#^#^#^#^#
The only text I want out of that is:
CHMM2
Theme Manager for 3DS
So there is a line break "\n" at the end of each line and when I use
tr -cs 'a-zA-Z0-9' 'newtext' infile.txt > outfile.txt
It is stripping the new line as well so all the text ends up in one big string on one line.
Can anyone assist with figuring out how to strip out only the ^#'s while keeping spaces and newlines?
The ^#s are most certainly null characters, \0s, so:
tr -d '\0'
will get rid of them.
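For example, reading from and writing to files as in your own attempt:

tr -d '\0' < infile.txt > outfile.txt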
But this is not really the correct solution. You should simply use the iconv command to convert from UTF-16 to UTF-8 (see its man page for more information). That is, of course, what you're really trying to accomplish here, and this is the correct way to do it.
This is an XY problem. Your problem is not deleting the null characters. Your real problem is how to convert from UTF-16 to either UTF-8, or maybe US-ASCII (and I chose UTF-8, as the conservative answer).
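A minimal iconv invocation might look like this (the exact source encoding is an assumption; plain UTF-16 works if the file has a byte-order mark, otherwise specify UTF-16LE or UTF-16BE, and the file names are placeholders):

iconv -f UTF-16 -t UTF-8 infile.txt > outfile.txt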
I have a task to edit about 5k files.
I must remove all strings like ?ver=2.35.1, where the numbers after the = are random.
As I see it, I need to replace every ?ver=... with an empty string.
I tried with the Linux console but I can't specify the random numbers.
You could use sed.
sed 's/^?ver=[0-9.]\+//' file
Explanation:
^ Asserts that we are at the start.
?ver= Matches the string ?ver=. Here ? is not treated as a regex metacharacter.
[0-9.]\+ Matches one or more digits or dots.
Learn about ed, sed, and gawk and combine them cleverly (e.g. using some for loop in your shell). Read the Advanced Bash-Scripting Guide.
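A sketch of doing that over many files (the *.html glob and GNU sed's -i option are assumptions about your setup; the ^ anchor is dropped and a g flag added so every occurrence on a line is removed - test on a copy first):

find . -name '*.html' -exec sed -i 's/?ver=[0-9.]\+//g' {} +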
I am looking for some best practices as far as handling csv and tab delimited files.
For CSV files I am already doing some formatting if a value contains a comma or double quote but what if the value contains a new line character? Should I leave the new line intact and encase the value in double quotes + escape any double quotes within the value?
Same question for tab delimited files. I assume the answer would be very similar if not the same.
Usually you keep \n unaltered, exploiting the fact that the newline character will be enclosed in a quoted ("...") field. This doesn't create ambiguities, but it's really ugly if you have to look at the file in a normal text editor.
But that is how you should do it, since you don't escape anything inside a string in a CSV except for the double quote itself.
@Jack is right that your best bet is to keep the \n unaltered, since you'll expect it inside double quotes if that is the case.
As with most things, I think consistency here is key. As far as I know, your values only need to be double-quoted if they span multiple lines, contain commas, or contain double quotes. In some implementations I've seen, all values are escaped and double-quoted, since that keeps the parsing algorithm simple (there's never a question of whether to escape and quote when writing, or to unescape and unquote when reading).
This isn't the most space-optimized solution, but makes reading and writing the file a trivial affair, for both your own library and others that may consume it in the future.
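To make that concrete, here is a small made-up CSV: the first record contains a comma, the second an escaped double quote, and the third a value spanning two lines:

id,comment
1,"red, green and blue"
2,"she said ""hello"""
3,"first line
second line"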
For TSV, if you want lossless representation of values, the "Linear TSV" specification is worth considering: http://paulfitz.github.io/dataprotocols/linear-tsv/index.html
For obvious reasons, most such conventions adhere to the following at a minimum:
\n for newline,
\t for tab,
\r for carriage return,
\\ for backslash
Some tools add \0 for NUL.
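For example, under Linear TSV a field whose real value is the two lines "line one" and "line two" is written on a single physical line as:

line one\nline two

with the \n decoded back to a real newline when the file is read, and a literal backslash in the data written as \\.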
Vim's errorformat (for parsing compile/build errors) uses an arcane format borrowed from C.
Trying to set up an errorformat for NAnt seems almost impossible; I've tried for many hours and can't get it. I also see from my searches that a lot of people seem to be having the same problem. A regex to solve this would take minutes to write.
So why does Vim still use this format? It's quite possible that the C parser is faster, but that hardly seems relevant for something that happens once every few minutes at most. Is there a good reason, or is it just a historical artifact?
It's not that Vim uses an arcane format from C. Rather it uses the ideas from scanf, which is a C function. This means that the string that matches the error message is made up of 3 parts:
whitespace
characters
conversion specifications
Whitespace is your tabs and spaces. Characters are the letters, numbers and other normal stuff. Conversion specifications are sequences that start with a '%' (percent) character. In scanf you would typically match an input string against %d or %f to convert to integers or floats. With Vim's error format, you are searching the input string (error message) for files, lines and other compiler specific information.
If you were using scanf to extract an integer from the string "99 bottles of beer", then you would use:
int i;
scanf("%d bottles of beer", &i); // i would be 99, string read from stdin
Now with Vim's errorformat it gets a bit trickier, but it tries to make matching more complex patterns easy: things like multi-line error messages, file names, changing directories, and so on. One of the examples in the help for errorformat is useful:
1 Error 275
2 line 42
3 column 3
4 ' ' expected after '--'
The appropriate error format string has to look like this:
:set efm=%EError\ %n,%Cline\ %l,%Ccolumn\ %c,%Z%m
Here %E tells Vim that it is the start of a multi-line error message. %n is an error number. %C is the continuation of a multi-line message, with %l being the line number, and %c the column number. %Z marks the end of the multiline message and %m matches the error message that would be shown in the status line. You need to escape spaces with backslashes, which adds a bit of extra weirdness.
While it might initially seem easier with a regex, this mini-language is specifically designed to help with matching compiler errors. It has a lot of shortcuts built in: you don't have to think about things like matching multiple lines, multiple digits, or path names (just use %f).
Another thought: how would you map numbers to mean line numbers, or strings to mean files or error messages, if you were to use just a normal regexp? By group position? That might work, but it wouldn't be very flexible. Another way would be named capture groups, but this syntax looks a lot like a shorthand for that anyway. You can actually use regexp wildcards such as .* - in this language it is written %.%#.
OK, so it is not perfect. But it's not impossible either and makes sense in its own way. Get stuck in, read the help and stop complaining! :-)
I would recommend writing a post-processing filter for your compiler that uses regular expressions or whatever, and outputs messages in a simple format that is easy to write an errorformat for. Why learn some new, baroque, single-purpose language unless you have to?
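As a rough sketch of that approach (nothing here is from the answers above: the message format the compiler prints, the script name, and the regex are all assumptions), you could normalize messages to file:line: text and point Vim at the wrapper:

#!/bin/sh
# build-filter.sh - normalize build output to "file:line: message".
# Assumes the compiler behind NAnt prints lines such as
#   src/Foo.cs(12,34): error CS1002: ; expected
nant "$@" 2>&1 |
  sed -n 's/^\(..*\)(\([0-9][0-9]*\),[0-9][0-9]*): */\1:\2: /p'

Then the matching errorformat in Vim stays trivial:

:set makeprg=./build-filter.sh
:set errorformat=%f:%l:\ %m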
According to :help quickfix,
it is also possible to specify (nearly) any Vim supported regular
expression in format strings.
However, the documentation is confusing and I didn't put much time into verifying how well it works and how useful it is. You would still need to use the scanf-like codes to pull out file names, etc.
They are a pain to work with, but to be clear: you can use regular expressions (mostly).
From the docs:
Pattern matching
The scanf()-like "%*[]" notation is supported for backward-compatibility
with previous versions of Vim. However, it is also possible to specify
(nearly) any Vim supported regular expression in format strings.
Since meta characters of the regular expression language can be part of
ordinary matching strings or file names (and therefore internally have to
be escaped), meta symbols have to be written with leading '%':
%\ The single '\' character. Note that this has to be
escaped ("%\\") in ":set errorformat=" definitions.
%. The single '.' character.
%# The single '*'(!) character.
%^ The single '^' character. Note that this is not
useful, the pattern already matches start of line.
%$ The single '$' character. Note that this is not
useful, the pattern already matches end of line.
%[ The single '[' character for a [] character range.
%~ The single '~' character.
When using character classes in expressions (see |/\i| for an overview),
terms containing the "\+" quantifier can be written in the scanf() "%*"
notation. Example: "%\\d%\\+" ("\d\+", "any number") is equivalent to "%*\\d".
Important note: The \(...\) grouping of sub-matches can not be used in format
specifications because it is reserved for internal conversions.
Lol, try looking at the actual Vim source code sometime. It's a nest of C code so old and obscure you'll think you're on an archaeological dig.
As for why Vim uses the C parser, there are plenty of good reasons, starting with the fact that it's pretty universal. But the real reason is that sometime in the past 20 years someone wrote it to use the C parser, and it works. No one changes what works.
If it doesn't work for you the vim community will tell you to write your own. Stupid open source bastards.