I'm looking for a function that takes a Char as input and gives the Unicode name of that code point (::Char->String), but I couldn't find any results on Hoogle. I assume that there is no builtin (if there is, please let me know), so I wonder what the best way is to write this function and its inverse (::String->Maybe Char).
I know you'd have to read UnicodeData.txt or a similar document, but I don't know what the best/fastest function would be.
The unicode-names package contains the function
getCharacterName :: Char -> String
First of all, thanks to @TwanVanLaarhoven who provided an excellent answer. I did however need a function that did the reverse of getCharacterName.
What I originally wanted was a function that would read the file and not have it hard-coded, but I eventually realized that that would require unsafe IO operations.
What I decided to do was to copy UnicodeData.txt into notepad++ and use the following regex replacements:
write module UnicodeNames (characterToName,nameToCharacter) where
paste UnicodeData.txt
replace this: ^([\dA-F]+);([^<;>]+).*$|^([\dA-F]+);(?:[^;]*;){9}([^<;>]+).*$
with this: characterToName '\\x$1$3' = "$2$4"
append characterToName _ = ""
paste again
replace this (again): ^([\dA-F]+);([^<;>]+).*$|^([\dA-F]+);(?:[^;]*;){9}([^<;>]+).*$
with this: nameToCharacter "$2$4" = Just '\\x$1$3'
append nameToCharacter _ = Nothing
replace ^.*<.*$ with nothing to remove extra lines.
The file will be incredibly long and take forever to compile :-) In addition to having an inverse function, this method has the advantage of providing more names than the unicode-names package, because it falls back to the Unicode 1.0 name when the main name field is a placeholder such as <control>. The two functions in this file rely on pattern matching to act as a dictionary from char to string and vice versa. I would put my solution on PasteBin or somewhere else if it didn't use a ton of memory.
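For comparison, here is a rough Python sketch of the same two-way lookup, built by parsing the data at run time instead of hard-coding it. It assumes the UnicodeData.txt layout used by the regex above (code point in field 0, name in field 1, Unicode 1.0 name in field 10). SAMPLE and build_tables are illustrative names, not part of any package, and the three sample lines stand in for the real file.

```python
# Three lines in the UnicodeData.txt layout; the real file has many thousands.
SAMPLE = """\
0007;<control>;Cc;0;BN;;;;;N;BELL;;;;
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
"""


def build_tables(text):
    """Return (char_to_name, name_to_char) dictionaries for the given data."""
    char_to_name, name_to_char = {}, {}
    for line in text.splitlines():
        fields = line.split(";")
        if len(fields) < 11:
            continue  # not a complete record
        name = fields[1]
        if name.startswith("<"):
            # Placeholder such as <control>: fall back to the Unicode 1.0 name.
            name = fields[10]
        if not name:
            continue  # no usable name for this code point
        ch = chr(int(fields[0], 16))
        char_to_name[ch] = name
        # Keep the first character seen if a name occurs more than once.
        name_to_char.setdefault(name, ch)
    return char_to_name, name_to_char


char_to_name, name_to_char = build_tables(SAMPLE)
print(char_to_name["A"])
print(name_to_char.get("BELL"))
```

A missing key plays the role of the "" and Nothing fallbacks in the Haskell version: char_to_name.get(ch, "") and name_to_char.get(name) return a default instead of failing.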
I want to use the printing command below in many places of my script. But I need to keep replacing "Survived" with some other string.
print(df.Survived.value_counts())
Can I automate the process by formatting the variable name the same way as a string? So if I want to replace "Survived" with "different", can I use something like:
var = 'different'
text = 'df.{}.value_counts()'.format(var)
print(text)
Unfortunately this prints out "df.different.value_counts()" as a string, while I need to print the value of df.different.value_counts().
I'm pretty sure a lot of IDEs have an option called refactoring, which lets you change a similar piece of code or string on every line where it appears.
In VSCode, for example, you select a part of the code, right-click, and choose the option called Change All Occurrences. This replaces that exact code on every line where it exists.
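But if you want to do what you proposed, then eval('df.{}.value_counts()'.format(var)) is an option, but it is insecure and dangerous, because eval runs whatever code the string contains. Note that ast.literal_eval is not a substitute here: it only evaluates Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, None), so ast.literal_eval('df.{}.value_counts()'.format(var)) raises a ValueError.
A safer approach is to look the attribute up by name with getattr. locals() can do the same for plain names, such as a function called cat, but a dotted path like 'df.different.value_counts' is not a key in locals():
def cat():
    return 1
text = locals()['cat']()                # looks up the plain name 'cat' and calls it
text = getattr(df, var).value_counts()  # looks up the attribute of df named by var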
Found the way: print(df[var].value_counts())
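A minimal sketch of the same idea without pandas, in case it helps to see why no eval is needed. The df object here is a hypothetical stand-in built from SimpleNamespace (not a real DataFrame), and value_counts is an illustrative helper, not the pandas method.

```python
from collections import Counter
from types import SimpleNamespace

# Hypothetical stand-in for a DataFrame: each attribute is one column of values.
df = SimpleNamespace(Survived=[0, 1, 1, 0, 1], Pclass=[3, 1, 3, 3, 2])


def value_counts(table, column):
    """Count the values in the column whose name is given as a string."""
    # getattr resolves the attribute name at run time, so no eval is required.
    return Counter(getattr(table, column))


var = "Survived"
counts = value_counts(df, var)
print(counts)
```

With a real DataFrame, df[var] does the same lookup by column name, which is what the accepted solution uses.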
I have a phrase, where only some words will change, and I need to store those words on a variable.
Example:
phrase = "I cannot connect to server XPTO\TEST for the last five hours"
The only part that will change is XPTO\TEST and I need to store it on a variable so that I can use it later.
Any ideas, or is it possible?
It seems like you need some form of placeholder. If that is the case, you can use string.format or string.gsub.
local t = {name="lua", version="5.3"}
x = string.gsub("$name-$version.tar.gz", "%$(%w+)", t)
--> x="lua-5.3.tar.gz"
With PHP for example you can achieve what you want without any extra work done, because there is a feature called string interpolation (wiki).
But at the same time Lua doesn't have one, that's why you can't do that without extra string post-processing.
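For comparison, a short Python sketch of both directions: filling a placeholder, and recovering the changing part from a finished phrase. It assumes the changing part contains no spaces; the names template, pattern and server are illustrative only.

```python
import re
from string import Template

# Fill a named placeholder, similar in spirit to the gsub call above.
template = Template("I cannot connect to server $server for the last five hours")
phrase = template.substitute(server=r"XPTO\TEST")

# Reverse direction: capture the part that changes.
# Assumption: the server name is a single run of non-space characters.
pattern = re.compile(r"I cannot connect to server (\S+) for the last five hours")
match = pattern.fullmatch(phrase)
server = match.group(1) if match else None

print(phrase)
print(server)
```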
I have to write a MATLAB function with the following description:
function counts = letterStatistics(filename, allowedChar, N)
This function is supposed to open a text file specified by filename and read its entire contents. The contents will be parsed such that any character that isn’t in allowedChar is removed. Finally it will return a count of all N-symbol combinations in the parsed text. This function should be stored in a file named “letterStatistics.m”. Based on my professor's lecture notes, I made a list of the commands to use and how the function should be organized:
1. Begin the function by setting the default value of N to 1 in case:
   a. The user specifies a 0 or negative value of N.
   b. The user doesn’t pass the argument N into the function, i.e., counts = letterStatistics(filename, allowedChar)
2. Using the fopen function, open the file filename for reading in text mode.
3. Using the function fscanf, read in all the contents of the opened file into a string variable.
4. I know there exists a MATLAB function to turn all letters in a string to lower case. Since my analysis will disregard case, I have to use this function on the string of text.
5. Parse this string variable as follows (use logical indexing or regular expressions – do not use for loops):
   a. We want to remove all newline characters without this occurring:
      e.g.
      In my younger and more vulnerable years my father gave me some advice that I've been turning over in my mind ever since.
      In my younger and more vulnerableyears my father gave me some advicethat I’ve been turning over in my mindever since.
      Replace all newline characters (special character \n) with a single space: ' '.
   b. We will treat hyphenated words as two separate words, hence do the same for hyphens '-'.
   c. Remove any character that is not in allowedChar. Hint: use regexprep with an empty string '' as an argument for replace.
   d. Any sequence of two or more blank spaces should be replaced by a single blank space.
6. Use the provided permsRep function, to create a matrix of all possible N-symbol combinations of the symbols in allowedChar.
7. Using the strfind function, count all the N-symbol combinations in the parsed text into an array counts. Do not loop through each character in your parsed text as you would in a C program.
8. Close the opened file using fclose.
HERE IS MY QUESTION: As you can see, I have made this list of what the function is, what it should do, and which commands to use (fclose etc.). The trouble is that I know closing the file involves fclose, but other than that I'm not sure how to execute step 8. The same goes for creating the whole function. I have a vague idea of which commands to use, but I'm unable to produce the actual code. How should I begin? Any guidance or hints would seriously be appreciated, because I'm having programmer's block and am unable to start!
I think that you are new to MATLAB, so the documentation may seem complicated. I guess the root of the problem is a basic understanding of file I/O (input/output). When you open a file using fopen, MATLAB returns an integer handle for that file, generally called a file ID. When you call fclose, you have to tell MATLAB which file to close, so you call fclose with that file ID.
fid = fopen('test.txt', 'w'); % Open for writing; fopen returns -1 on failure
fprintf(fid,'This is a test.\n');
fclose(fid);
fid = 0; % Optional, this will make it clear that the file is not open,
% but it is not necessary since MATLAB will send a not open message anyway
Regarding the function creation the syntax is something like this:
function out = myFcn(x,y)
z = x*y;
fprintf('z=%.0f\n',z); % Print value of z in the command window
out = z>0;
This function checks whether the product of two numbers is positive (that is, both numbers have the same sign and neither is zero) and returns true if it is. If not, it returns false. This may not be the best way to do this test, but it works as an example I guess.
Please comment if this is not what you want to know.
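To make the parsing and counting steps from the question concrete, here is a hedged sketch in Python rather than MATLAB. It takes the text directly instead of a filename so that it is self-contained, and it uses itertools.product where the assignment uses the provided permsRep function. letter_statistics is an illustrative name, and allowed_chars is assumed to be non-empty.

```python
import itertools
import re


def letter_statistics(text, allowed_chars, n=1):
    """Count every n-symbol combination of allowed_chars in the cleaned text."""
    if n < 1:
        n = 1  # default for zero or negative n
    text = text.lower()
    # Newlines and hyphens become single spaces so words do not run together.
    text = re.sub(r"[\n-]", " ", text)
    # Drop every character that is not in allowed_chars.
    text = re.sub("[^" + re.escape(allowed_chars) + "]", "", text)
    # Collapse runs of two or more spaces into one.
    text = re.sub(r" {2,}", " ", text)
    counts = {}
    for combo in itertools.product(allowed_chars, repeat=n):
        gram = "".join(combo)
        # The lookahead counts overlapping matches, as strfind does.
        counts[gram] = len(re.findall("(?=" + re.escape(gram) + ")", text))
    return counts


counts = letter_statistics("Aa-b\nA!", "ab ", 2)
print(counts["aa"], counts["a "], counts["ab"])
```

The same order of operations carries over to MATLAB with lower, regexprep and strfind; only the function names differ.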
I have a structure in MATLAB called dat. I want to rename dat as an existing string.
Existing_str='NewName'
$(Existing_str)=dat
This fails as I don't think MATLAB lets me use the dollar sign in this way. The code below creates a copy of dat literally called Existing_str and destroys the Existing_str in the process.
Existing_str=dat
While the code below generates a colossal empty structure which clearly is not a copy!
eval(Existing_str)=dat
In the task I am actually trying to perform I don't know the name of the existing_str in advance so that is not a solution.
You were almost there with your 'eval'. What you want is:
eval([Existing_str '=dat;']);
This works because you're composing a string inside your square brackets. If you just looked at the resulting string, it would look like NewName=dat; The eval command simply tells Matlab to evaluate the string as if you typed it into the command line.
You can use dynamic field naming (Bas's suggestion), and avoid eval:
For example, if you have just loaded a structure dat from a file 'somefile.ext' with some custom parsing function:
filename = 'somefile.ext'; % presume you actually have a list of files from dir or ls
dat = yourfunction(filename);
[~, name, ~] = fileparts(filename);
alldat.(name)=dat;
This is equivalent to:
alldat.somefile = dat;
Except that we've just automatically taken the name from the filename (in this case just by stripping off the path/extension, but you could do other things depending on the pattern of the filename).
The bonus of this is that you can then, say, with a structure that has fields alldat.file1, alldat.file2, alldat.file3, all of which have a subfield, say, size, do things like this:
names = fieldnames(alldat)
for n = 1:length(names)
    alldat.(names{n}).mean = mean(alldat.(names{n}).size);
end
Every sub-structure now has a field, mean, which contains the mean of the data. If you had a bunch of different named structures you would need to eval everything you wanted to do to them collectively, and the code becomes difficult to read and maintain.
The other option is a cell array. Here's an easy trick:
dat = yourfunction(filename); % or whatever you do to make this structure
alldat{end+1} = dat;
This just appends the new dat onto the end of an existing cell array. {end+1} ensures it doesn't overwrite existing data.
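The dynamic field name approach corresponds to a dictionary keyed by name in most languages. A small Python sketch under that assumption follows; load is a hypothetical placeholder for whatever parsing function produces the structure, and the file names are made up.

```python
from pathlib import Path
from statistics import mean


def load(filename):
    """Hypothetical parser: returns the same dummy record for every file."""
    return {"size": [1.0, 2.0, 3.0]}


all_dat = {}
for filename in ["data/file1.ext", "data/file2.ext"]:
    # Path(...).stem strips the directory and extension, like fileparts.
    all_dat[Path(filename).stem] = load(filename)

# One loop handles every record, with no eval and no per-name variables.
for record in all_dat.values():
    record["mean"] = mean(record["size"])

print(sorted(all_dat))
```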
I'm currently teaching myself Lua for iOS game development, since I've heard lots of very good things about it. I'm really impressed by the level of documentation there is for the language, which makes learning it that much easier.
My problem is that I've found a Lua concept that nobody seems to have a "beginner's" explanation for: nested brackets for quotes. For example, I was taught that long strings with escaped single and double quotes like the following:
string_1 = "This is an \"escaped\" word and \"here\'s\" another."
could also be written without the overall surrounding quotes. Instead one would simply replace them with double brackets, like the following:
string_2 = [[This is an "escaped" word and "here's" another.]]
Those both make complete sense to me. But I can also write the string_2 line with "nested brackets," which include equal signs between both sets of the double brackets, as follows:
string_3 = [===[This is an "escaped" word and "here's" another.]===]
My question is simple. What is the point of the syntax used in string_3? It gives the same result as string_1 and string_2 when given as an input for print(), so I don't understand why nested brackets even exist. Can somebody please help a noob (me) gain some perspective?
It would be used if your string contains a substring that is equal to the delimiter. For example, the following would be invalid:
string_2 = [[This is an "escaped" word, the characters ]].]]
Therefore, in order for it to work as expected, you would need to use a different string delimiter, like in the following:
string_3 = [===[This is an "escaped" word, the characters ]].]===]
I think it's safe to say that not a lot of string literals contain the substring ]], in which case there may never be a reason to use the above syntax.
It helps to, well, nest them:
print [==[malucart[[bbbb]]]bbbb]==]
Will print:
malucart[[bbbb]]]bbbb
But if that's not useful enough, you can use them to put whole programs in a string:
loadstring([===[print "o m g"]===])()
Will print:
o m g
I personally use them for my static/dynamic library implementation. In the case you don't know if the program has a closing bracket with the same amount of =s, you should determine it with something like this:
local c = 0
-- plain find (4th argument true) so "]" and "=" are not treated as pattern characters
while prog:find("]" .. string.rep("=", c) .. "]", 1, true) do
  c = c + 1
end
-- do stuff
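The same level search can be sketched in Python as a helper that wraps arbitrary text in a Lua long bracket. lua_long_string is an illustrative name. Two details are handled here on top of the loop above: text that ends in "]" would otherwise merge with the closing bracket, and Lua drops a newline that comes directly after the opening bracket, so the sketch always emits one.

```python
def lua_long_string(s):
    """Wrap s in a Lua long bracket whose level does not occur inside s."""
    level = 0
    while True:
        closer = "]" + "=" * level + "]"
        # Append all but the last character of the closer, so that text
        # ending in "]" cannot form an early closing bracket.
        if closer not in s + closer[:-1]:
            break
        level += 1
    opener = "[" + "=" * level + "["
    # Lua skips a newline right after the opener, so this keeps s intact
    # even when s itself starts with a newline.
    return opener + "\n" + s + closer


print(lua_long_string('print "o m g"'))
print(lua_long_string("x]]y"))
```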