Displaying Unicode characters in local web page using Bash [duplicate] - linux

This question already has answers here:
How do you echo a 4-digit Unicode character in Bash?
Hello, I am trying to display the dice Unicode characters on a web server using Bash, but I am finding it more difficult than it should be. In short, I found online that printf '\u0026' works and prints & to my page. However, when I change the number to my desired '\u2680', nothing is displayed. Admittedly I am not very knowledgeable about Linux or Unicode, but I am very confused about why a lower number works and a higher one does not, or what I am doing wrong.

I think I may have found the answer. Because I am echoing everything into an HTML file, I believe the browser is interpreting the characters as HTML rather than the shell handling the Unicode. (Not 100% sure about that, so correct me if I am wrong.)
Either way, by simply putting the HTML character reference for the dice characters into the .sh file I was able to display the characters that I wanted.
(i.e. echo '&#9856;')

First, you need to provide more information about what you are generating with Bash. Post a sample script and specify which operating system and web server you are using.
It is essential that the encoding used by the shell is the same as the default encoding of the web server. The HTML page must also declare the proper encoding in its header.
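As a rough illustration of why '\u0026' appeared but '\u2680' did not, the sketch below (Python, not the asker's Bash script; the function names are made up for this example) compares the two ways of getting U+2680 into a page. The ampersand is plain ASCII, so it survives almost any encoding mismatch, whereas the die face needs three UTF-8 bytes that only render if the shell emits UTF-8 and the page declares it. The numeric character reference sidesteps the encoding question entirely.

```python
import html

AMPERSAND = "\u0026"   # U+0026, an ASCII character
DIE_FACE_1 = "\u2680"  # U+2680 DIE FACE-1

def as_utf8_bytes(ch: str) -> bytes:
    """Bytes a UTF-8 shell would write for this character."""
    return ch.encode("utf-8")

def as_html_reference(ch: str) -> str:
    """Decimal numeric character reference, pure ASCII."""
    return f"&#{ord(ch)};"

print(as_utf8_bytes(AMPERSAND))       # one byte, same in ASCII and UTF-8
print(as_utf8_bytes(DIE_FACE_1))      # three bytes, needs a UTF-8 page
print(as_html_reference(DIE_FACE_1))  # ASCII-only text the browser expands
```

The reference form is what the self-answer above ended up using; declaring charset UTF-8 in the page and running the script under a UTF-8 locale would be the other route.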

Related

Beautiful Soup - meaning of letter 'u' in documentation [duplicate]

Like in:
u'Hello'
My guess is that it indicates "Unicode", is that correct?
If so, since when has it been available?
You're right, see 3.1.3. Unicode Strings.
It's been the syntax since Python 2.0.
Python 3 made them redundant, as the default string type is Unicode. Versions 3.0 through 3.2 removed them, but they were re-added in 3.3+ for compatibility with Python 2, to aid the 2-to-3 transition.
The u in u'Some String' means that your string is a Unicode string.
Q: I'm in a terrible, awful hurry and I landed here from Google Search. I'm trying to write this data to a file, I'm getting an error, and I need the dead simplest, probably flawed, solution this second.
A: You should really read Joel's Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) essay on character sets.
Q: sry no time code pls
A: Fine. try str('Some String') or 'Some String'.encode('ascii', 'ignore'). But you should really read some of the answers and discussion on Converting a Unicode string and this excellent, excellent, primer on character encoding.
My guess is that it indicates "Unicode", is it correct?
Yes.
If so, since when is it available?
Python 2.x.
In Python 3.x the strings use Unicode by default and there's no need for the u prefix. Note: in Python 3.0-3.2, the u is a syntax error. In Python 3.3+ it's legal again to make it easier to write 2/3 compatible apps.
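A quick check of the Python 3.3+ behaviour described above, as a minimal sketch (the variable names are only for illustration): the u prefix is accepted but changes nothing, while a b prefix gives a separate bytes type.

```python
prefixed = u"Hello"   # legal again since Python 3.3, purely for 2/3 compatibility
plain = "Hello"       # already a Unicode string in Python 3
raw = b"Hello"        # bytes, a different type

print(type(prefixed).__name__, type(plain).__name__, type(raw).__name__)
print(prefixed == plain)  # same type, same value
print(plain == raw)       # text and bytes never compare equal in Python 3
```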
I came here because I had funny-char-syndrome on my requests output. I thought response.text would give me a properly decoded string, but in the output I found funny double-chars where German umlauts should have been.
Turns out response.encoding was empty somehow and so response did not know how to properly decode the content and just treated it as ASCII (I guess).
My solution was to get the raw bytes with 'response.content' and manually apply decode('utf_8') to it. The result was schöne Umlaute.
The correctly decoded
für
vs. the improperly decoded
fĂźr
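The effect can be reproduced without any HTTP request. The sketch below assumes the server sent UTF-8 bytes and that they were decoded with a single-byte codec; Latin-1 is used here as an example, and the exact garbage characters depend on which wrong codec is applied. The variable names are illustrative.

```python
raw = "für".encode("utf-8")    # the bytes as they arrive over the wire

wrong = raw.decode("latin-1")  # each UTF-8 byte becomes its own character
right = raw.decode("utf-8")    # the manual fix described above

print(len(raw), repr(wrong), repr(right))
```

With requests, the equivalent of the last decode is response.content.decode('utf_8'), as in the answer above.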
All strings meant for humans should use u"".
I found that the following mindset helps a lot when dealing with Python strings: All Python manifest strings should use the u"" syntax. The "" syntax is for byte arrays, only.
Before the bashing begins, let me explain. Most Python programs start out with using "" for strings. But then they need to support documentation off the Internet, so they start using "".decode and all of a sudden they are getting exceptions everywhere about decoding this and that - all because of the use of "" for strings. In this case, Unicode does act like a virus and will wreak havoc.
But, if you follow my rule, you won't have this infection (because you will already be infected).

Excel VBA on Mac german special characters not encoded correctly (ÄÜÖ)

I have an Excel VBA Script that I originally wrote for Windows (where it works fine) and now had to port to Mac OS. I don't think that it matters but the script is calling cURL to get a JSON Response from a web API which is then parsed, edited and inserted into the spreadsheet.
Some of the fields in the parsed JSON contain special characters like Ä, Ü, Ö (German characters). The script can handle these just fine on Windows but on Mac instead of ÖÜÄ I get other symbols. This breaks the tool as it depends on some vlookup-functions where the values are written by hand (with the correct symbols).
I tried lots of googling but was not able to find anything.
One thing that might be interesting is that the code itself changes on Mac as well! I have some statements printed to the console and even the hardcoded strings that contain a special character are broken as soon as I open the script on a Mac.
The question is about Mac VBA, which makes this a pain. The only solution I have is to send the curl output to a file and then open that file with Workbooks.OpenText and Origin:=65001 (the UTF-8 code page); the whole response then ends up in cell A1, correctly encoded.
I have asked my own question on that, to see if any one has a more recent answer.
How to read UTF8 data output from cURL in popen/fread in VBA on Mac?

How to create a linux terminal ASCII character logo?

I'm a newbie to Linux. I want to create my own ASCII character logo to be displayed on the Linux terminal (it is for pleasure and also to learn). I searched the internet and found that there are tools available for this, for example Figlet, Neofetch, Screenfetch, etc. But I want to know if there is any method to create such a logo other than hard-coding it. If anyone knows, please help.
You're much more likely to get an answer if you explain clearly.
By 'ASCII character logo' you could mean that you want to create your own character (as in letter/number/symbol) and use it in the terminal just like you'd be able to display the 'L' letter. For that you'd need to create your own font, add your character in and set the terminal to use that font. There are plenty of tools online that can help you create a font.
If you mean you simply want to display some ascii art on the terminal, you can use something like this: http://patorjk.com/software/taag/#p=display&f=Graffiti&t=Type%20Something%20. You can save the text to a file on your linux computer, and print it back out again using the cat command.
An example is below:
cat ascii_logo.txt
You also list things like figlet, which will automatically generate the ASCII art for you from some text. I'm not sure why these don't fulfill your needs?

Different file size between powershell and cmd [duplicate]

This question already has answers here:
Using redirection within the script produces a unicode output. How to emit single-byte ASCII text?
I am using a little processconf.js tool to build a configuration.json file from multiple .json files.
Here the command I am using :
node processconf.js file1.json file2.json > configuration.json
I had been using cmd for a while, but today I tried PowerShell, and somehow the same files and the same command give different results.
One file is 33 KB (cmd), the other 66 KB (PowerShell). Looking at the files, they have exactly the same lines and I can't find any visible differences. Why is that?
PowerShell (Windows PowerShell 5.1 and earlier) defaults to UTF-16LE for redirection, while cmd doesn't do Unicode by default for redirection (which may sometimes end up mangling your data as well).
If you don't use the redirection operator in PowerShell but instead Out-File you can specify an encoding, e.g.
node processconf.js file1.json file2.json | Out-File -Encoding Utf8 configuration.json
I think -Encoding Oem would be somewhat the same as the cmd behaviour, but usually doesn't support Unicode and there's a conversion involved.
The redirection operator of course has no provisions for specifying any options, so it's often not the best choice when you care about the exact output format. And since PowerShell, contrary to Unix shells, handles objects, text and random binary data are very different things.
You'd get the same behaviour from cmd if you ran it with cmd /u, by the way.
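The 33 KB versus 66 KB gap is what one would expect from that encoding difference. As a small sketch (the sample JSON text is made up), the same mostly-ASCII content takes one byte per character in UTF-8 and two in UTF-16LE, plus a two-byte byte order mark:

```python
text = '{"name": "example", "enabled": true}\n' * 1000  # hypothetical config content

utf8 = text.encode("utf-8")
utf16le_with_bom = b"\xff\xfe" + text.encode("utf-16-le")  # BOM followed by UTF-16LE

print(len(utf8), len(utf16le_with_bom))
# Decoding either form yields identical text, which is why the files look the same.
print(utf16le_with_bom.decode("utf-16") == utf8.decode("utf-8"))
```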

Linux command line : edit hacked index files [duplicate]

This question already has an answer here:
Hacked Site - SSH to remove a large body of javascript from 200+ files [closed]
I'm unfortunately once more dealing with a hacked site on a Linux Plesk server. While the issue is fixed with FTP access changed (it got down to the famous Filezilla FTP codes hack on a PC) I'd appreciate to know how to edit files as it may take over an hour to restore the site to the most recent backup we have, and I'd be glad to have it back online faster.
The hack is rather simple: a javascript code was inserted in many index* (only index.php it seems) files in the site.
I'm looking for a way to mass-edit the hacked files, knowing that even though the target javascript code is the same, it is called from a number of probably also hacked sites. So while my legitimate index file used to start with
<?php
it now starts like
<script type="text/javascript" src="http://(RANDOMDOMAINHERE)/facebook.php"></script><?php
As that chain contains a variable, could you help me find a sure-fire method to edit all the changed Index files (about 80 found) ?
I have used a SED replace before but this time part of the chain to replace varies, so could I use a wildcard ?
Best regards, thanks for shedding light !
find . -name 'index.php' -print0 |
xargs -0 sed -i '1s#^<script type="text/javascript" src="http://[^"]*/facebook\.php"></script>##'
Should do wonders
the sed command:
1 (match in the first line only)
s#pattern#replacement# (replace pattern by replacement; note that the latter is empty)
^ must match at start of line
[^"]* accepts any run of characters other than a double quote, so the match cannot run past the end of the src attribute (sed has no non-greedy quantifier; in GNU sed, \? means "zero or one", not "shortest match")
Cheers
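For anyone who would rather test the pattern before touching 80 files, here is the same first-line cleanup sketched in Python. It assumes the injected tag has exactly the shape quoted in the question, with only the domain varying; the function name and the sample domain are hypothetical.

```python
import re

# Injected tag at the very start of the file; [^"]* stands in for the varying domain.
INJECTED = re.compile(
    r'\A<script type="text/javascript" src="http://[^"]*/facebook\.php"></script>'
)

def strip_injected(source: str) -> str:
    """Remove the injected script tag if the file starts with it."""
    return INJECTED.sub("", source, count=1)

hacked = (
    '<script type="text/javascript" '
    'src="http://bad.example/facebook.php"></script><?php\necho "hi";\n'
)
print(strip_injected(hacked))
```

Files that do not start with the tag come back unchanged, so running it over every index.php should be harmless to clean files; keeping a backup copy first is still advisable.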
I sincerely hope you're not actually administering a production domain. You should inform your users, get the problem fixed, and offer the users the option of going back to a recent backup that doesn't have the problem.
There is no telling what else has been tampered with.
I'm glad my VPS is somewhere else!
I would fix the cross-site scripting exploit before addressing this problem, or it will all be in vain. Once that's done, a simple search and replace of the blocks of script that contain a common string should be sufficient.

Resources