Why do these two string operations produce different output? - python-3.x

I have a program that is grabbing variables stored on the local file system and storing them in a variable. I then attempt to URL encode them for use in a web API call. I noticed however that several of my calls were producing errors and after researching it appears that the encoding is not working as expected.
This string encoding produces the correct result.
newstring = urllib.parse.quote(u"Müller".encode('utf8'))
print(newstring)
Output
M%C3%83%C2%BCller
However, this code does not produce the correct output
string2 = "Müller"
newstring2 = urllib.parse.quote(string2.encode('utf8'))
print(string2)
Output
Müller
Any idea what the difference is here and how I can fix it so that the second bit of code produces accurate results?

Perhaps you meant to write print(newstring2) in your second example? That will produce the same output as in the first example.
In [1]: string2 = "Müller"
In [2]: print(urllib.parse.quote(string2.encode('utf8')))
M%C3%BCller

Related

Twincat3 ST variables conversion

I have some complicated array of structures and I want to write it into CSV file. So I need "variable to string" conversion.
Beckhoff as always doesn't care about documentation and their INT_TO_STRING function doesn't work (UNEXPECTED INT_TO_STRING TOKEN when I try to write INT_TO_STRING(20) ).
Moreover their string functions works correctly with only 255 chars.
So I need one of following:
working functions or function blocks or library which allows to convert different types to string
something like sprintf without limitations
some functions to convert between number and ascii char (0x55 is letter 'U') in both directions.
btw. Beckhoff gives us some weird CSV example code, but without data conversion (array has already strings in cells).
Thanks in advance!
I tried to use:
INT_TO_STRING() BYTE_TO_STRING() WHATEVER_TO_STRING()
but it is not working. And there is no clue how many arguments it should have or anything. There is no documentation in Beckhoff information System.

Is there a way to compare the format in which a line of a text file was written in Fortran?

I'm developing a Fortran program that must obtain some data from a text file and generate another text file using specific data from the first one.
The input file have many lines written in several specific formats which I know of. Although I know the formats, the lines in this file are generated in a "random way".
It would be much easier to generate the output file if I could compare the format in which each line was written, then I would know exactly what data I can get from that line of the input file to use it in the output file.
What I need is something like, for example, knowing that the format of the line read and stored in the LINHA variable is described in the FORMATO variable, do something like:
    
IF (FORMATO = '(1X, 15,3F8.1,2 (5A, 1X))') THEN
READ (LINHA, '(6X, F8.1)') my_variable
END IF
Because there might be another format such as
'(6A, 2F8.1, F8.6,2 (6A))'
in which, if I use the same READ statement, I will read an F8.1 variable in my_variable, however this value is not the correct one.
A (not so elegant) work-around that I can think of is to read the entire line using the advance = no option of read() and parse each character in the line separately. While doing so, you may count white spaces or other specific characters that you know of and then identify the different formats from there.
It would be helpful if you could give more specifications of the nature of the task.
The best option is to read without format, keeping each line in a character array. Then read the line variable as an internal file with the required format using the variable IOSTAT in order to check if the format is the correct.
INT max_size = 80
CHARACTER(LEN=max_size) :: line
READ(*,*) line
READ(line,'(1X, 15,3F8.1,2 (5A, 1X))',IOSTAT=ios) var1, var2, ...
Problem solved using a mixture of some of the suggestions posted.
I read each of the lines of the input file in an internal variable (RLINFILE) in the format '(A165)'. After that, I read all the contents of the string that I put in this internal variable in several dummy variables, using the format I knew of the lines from where I wanted to get some information (read all the information of the line in the desired and get IOSTAT = 0 guarantee that this is the correct line), so if the result of the reading is ok (IOSTAT = 0), it is because the line I just read was the correct one for the information I wanted, so I store the contents of some of the dummy variables that represent the values that interest me. In the code, the solution looked something like this:
OPEN(UNIT=LU1,FILE=RlinName,STATUS='OLD')
ilin = 0
formato = '(14X,A,1X,F7.1,1X,F7.1,5X,A,1X,A,1X,A,5X,A,I5,1X,A,I3,3F8.1,A,A,A,1X,A,2(1X,F8.2),1X,A,1X,A)'
DO WHILE (.TRUE.)
READ(LU1,'(A165)',END=300) RLINFILE
READ(RLINFILE,formato,IOSTAT=linhaok) dum2_a1,dum2_f1,dum2_f2,dum2_a2,dum2_a3,dum2_a4,dum2_a5,dum2_i1,dum2_a6,dum2_i2,dum2_f3,dum2_f4,dum2_f5,dum2_a7,dum2_a8,dum2_a9,dum2_a10,dum2_f6,dum2_f7,dum2_a11,dum2_a12
IF(linhaok.EQ.0) THEN
ilin = ilin+1
rlin_lshu(ilin) = dum2_a4
rlin_nbpa(ilin) = dum2_i1
rlin_ncir(ilin) = dum2_i2
rlin_ppij(ilin) = dum2_f3
rlin_pqij(ilin) = dum2_f4
rlin_tapn(ilin) = dum2_a7
END IF
END DO
300 CLOSE(UNIT=LU1)
The description of the problem you are trying to solve is a bit vague to me, but the simplest solutions that comes to my mind, given the description of the problem, is to modify the original code that generates the input data file, to write the used Fortran READ format before the data line in the input file. This way, you can read the format as a string and use it in the subsequent data IO in your second code.
If you describe the specific task your tryting to accomplish in more details, perhaps more experienced Fortranners could help.

Python3: Building dictionary from .fasta - "strange" printout for value

I am parsing a .fasta-File containing one big sequence into python by using:
for rec in SeqIO.parse(faFile, "fasta"):
identifier=(rec.id)
sequence=(rec.seq)
Then, I am building a dictionary:
d={identifier:sequence}
When printing sequence only, I get the following result:
CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT
Note: All letters are printed, I made dots to shorten this
When printing the dictionary, I get:
{'NC_003047.1': Seq('CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT', SingleLetterAlphabet())}
Where does the "Seq" and the SingleLetter alphabet come from?
Desired result would be:
{'NC_003047.1':'CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT'}
Update1:
following the link in the comments, I tried
input_file=open(faFile)
d=SeqIO.to_dict(SeqIO.parse(faFile,"fasta"))
resulting in:
{'NC_003047.1': SeqRecord(seq=Seq('CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT', SingleLetterAlphabet()), id='NC_003047.1', name='NC_003047.1', description='NC_003047.1 Sinorhizobium meliloti 1021 chromosome, complete genome', dbxrefs=[])}
So, sadly, this does not help :(
Thanks in advance for your time and effort :)
SeqIO doesn't return a string, it returns an object. When you print it, you print the object's string representation, which in this case is not just the data contained in (some attribute of) the object.
(Some objects are designed so that printing the object will print just the data inside it. This depends on how the library is put together and how the programmer designed its __str__() method. This is probably not useful for you at this point, but might help you understand other related resources you find if you pursue this further.)
I'm not familiar with SeqIO but quick googling suggests you probably want
d={identifier: sequence.seq}
to put just the SeqIO object's seq attribute as the value for this identifier.

MATLAB selecting items considering the end of their name

I have to extract the onset times for a fMRI experiment. I have a nested output called "ResOut", which contains different matrices. One of these is called "cond", and I need the 4th element of it [1,2,3,4]. But I need to know its onset time just when the items in "pict" matrix (inside ResOut file) have a name that ends with "*v.JPG".
Here's the part of the code that I wrote (but it's not working):
for i=1:length(ResOut);
if ResOut(i).cond(4)==1 && ResOut(i).pict== endsWith(*"v.JPG")
What's wrong? Can you halp me to fix it out?
Thank you in advance,
Adriano
It's generally helpful to start with unfamiliar functions by reading their documentation to understand what inputs they are expecting. Per the documentation for endsWith, it expects two inputs: the input text and the pattern to match. In your example, you are only passing it one (incorrectly formatted) string input, so it's going to error out.
To fix this, call the function properly. For example:
filepath = ["./Some Path/mazeltov.jpg"; "~/Some Path/myfile.jpg"];
test = endsWith(filepath, 'v.jpg')
Returns:
test =
2×1 logical array
1
0
Or, more specifically to your code snippet:
endsWith(ResOut(i).pict, 'v.JPG')
Note that there is an optional third input, 'IgnoreCase', which you can pass as a boolean true/false to control whether or not the matching ignores case.

Same for loop, giving out two different results using .write()

this is my first time asking a question so let me know if I am doing something wrong (post wise)
I am trying to create a function that writes into a .txt but i seem to get two very different results between calling it from within a module, and writing the same loop in the shell directly. The code is as follows:
def function(para1, para2): #para1 is a string that i am searching for within para2. para2 is a list of strings
with open("str" + para1 +".txt", 'a'. encoding = 'utf-8') as file:
#opens a file with certain naming convention
n = 0
for word in para2:
if word == para1:
file.write(para2[n-1]+'\n')
print(para2[n-1]) #intentionally included as part of debugging
n+=1
function("targetstr". targettext)
#target str is the phrase I am looking for, targettext is the tokenized text I am
#looking through. this is in the form of a list of strings, that is the output of
#another function, and has already been 'declared' as a variable
when I define this function in the shell, I get the correct words appearing. However, when i call this same function through a module(in the shell), nothing appears in the shell, and the text file shows a bunch of numbers (eg: 's93161), and no new lines.
I have even gone to the extent of including a print statement right after declaration of the function in the module, and commented everything but the print statement, and yet nothing appears in the shell when I call it. However, the numbers still appear in the text file.
I am guessing that there is a problem with how I have defined the parameters or how i cam inputting the parameters when I call the function.
As a reference, here is the desired output:
‘She
Ashley
there
Kitty
Coates
‘Let
let
that
PS: Sorry if this is not very clear as I have very limited knowledge on speaking python
I have found the solution to issue. Turns out that I need to close the shell and restart everything before the compiler recognizes the changes made to the function in the module. Thanks to those who took a look at the issue, and those who tried to help.

Resources