Python equivalent to MATLAB's var = char(fread(fid,100,'char'))'; - python-3.x

I need to read a binary file in Python, for which I already have MATLAB code. I'm converting the MATLAB code to Python line by line, but I'm stuck at the point where I read text data from the binary file: the output is not in readable text format. I'm looking for the Python equivalent of the MATLAB code below.
I tried the struct module in Python to unpack the data, but the output string is not readable straight away into a list.
Matlab code:
var = char(fread(fid,100,'char'))';
Python code that I tried:
tmp = f.read(100)
abc, = struct.unpack('100c',tmp)
But the value of abc is not a regular text string; instead it is something like b'/val1 val2 val3 val4'.
I need to get val1, val2, val3, val4 as strings in a list.

I think the NumPy function fromfile does exactly what you want.
import numpy as np
data = np.fromfile(filename, dtype=np.uint8, count=100, sep='')
count gives the number of items to read (100 bytes here, since the dtype is np.uint8), and the empty sep treats the file as binary. See the documentation for details.
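If the goal is just to get val1 … val4 as strings in a list, plain byte reads plus decode also work; a minimal sketch, assuming the 100 bytes hold whitespace-separated ASCII text (filename is the same placeholder as above):
with open(filename, 'rb') as f:
    chunk = f.read(100)                   # fread(fid, 100, 'char') in MATLAB
text = chunk.decode('ascii')              # char(...) in MATLAB
values = text.split()                     # e.g. ['/val1', 'val2', 'val3', 'val4']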

Related

Converting string to dictionary from an opened file

A text file contains a dictionary, as below:
{
"A":"AB","B":"BA"
}
Below is the code of the Python file:
with open('devices_file') as d:
print (d["A"])
The result should print AB.
As @rassar and @Ivrf suggested in the comments, you can use ast.literal_eval() as well as json.load() to achieve this. Both code snippets below output AB.
Solution with ast.literal_eval():
import ast

with open("devices_file", "r") as d:
    content = d.read()
result = ast.literal_eval(content)
print(result["A"])
Solution with json.load():
import json

with open("devices_file") as d:
    content = json.load(d)
print(content["A"])
Python documentation about ast.literal_eval() and json.load().
Also: I noticed that the code snippet in your question isn't valid Python. The lines inside the with block need to be indented (conventionally with 4 spaces), and by convention there should be no space between print and its parentheses.
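One difference worth knowing when picking between the two: json.load() requires strict JSON (double-quoted strings), while ast.literal_eval() also accepts Python-style single quotes. A small illustration:
import ast
import json

text = "{'A': 'AB'}"                  # a Python literal, not valid JSON
print(ast.literal_eval(text)["A"])    # prints AB
# json.loads(text) would raise json.JSONDecodeError here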

How to index a tuple.txt file using a list.txt file in Python?

I would like to index a tuple .txt file using a list .txt file.
That is, if my tuple file is a .txt file that reads something like:
[[["-0.07636114002660116", "-0.5365621532160825", "-0.39960655510421184", "0.6733612454339026"], ["0.0", "0.0", "0.0", "37.2155259"], ["-0.05958626994915151", "-0.023029990708366282", "-0.24325076433502524", "0.9288248327845068"], ["0.05958626994915151", "0.023029990708366282", "0.24325076433502524", "36.286701067215496"], ["0.09995740879332612", "-0.48667451106459764", "-0.23779637140751794", "0.5508093478212072"], ["-0.2359048187690788", "-0.07291763285985114", "-0.4050609480317191", "1.0513767303972021"], ["-0.3081573380300473", "-0.08270260281220124", "-0.2497148935020871", "1.0220121263617357"], ["0.18471536734852254", "0.04700586284011614", "0.13075317534249653", "36.28656567125096"], ["-0.05287657813840254", "-0.014190902179399766", "-0.04284846553710331", "0.0295"], ["-2.6166252598538904", "1.7701571470098587", "2.171220685416502", "3.833325363231776"]]]
and I have a list.txt file that reads something like:
1
3
4
7
I want to create a new tuple by indexing the first tuple.txt file with the list.txt file.
For instance, in this case, my new tuple (if I save it as new_tuple) should read:
new_tuple
Output:
[[["0.0", "0.0", "0.0", "37.2155259"], ["0.05958626994915151", "0.023029990708366282", "0.24325076433502524", "36.286701067215496"], ["0.09995740879332612", "-0.48667451106459764", "-0.23779637140751794", "0.5508093478212072"], ["0.18471536734852254", "0.04700586284011614", "0.13075317534249653", "36.28656567125096"]]]
Here are the raw .txt files in case they help.
tuple.txt: https://drive.google.com/file/d/1SdFVtxlUDj1XFm6wBUtNUS48dqJQBzwh/view?usp=sharing
list.txt: https://drive.google.com/file/d/1AUSzV5kV3aEL8AhkW-PfKsCiVZ9iyk22/view?usp=sharing
I have little to no idea how to begin. Theoretically this should be possible; however, I am not sure how to write code that is Pythonic enough to get the job done. The actual files I want to use the code on are much larger than the example files above, so efficient, Pythonic code would be very helpful.
You need to know how to:
read a file as a string,
convert a string to a literal,
index a list with another list (here via a list comprehension),
convert a list to a string,
write to a file.
This is one way you could do that:
import ast
tuples = ast.literal_eval(open('tuple.txt').read())[0]
indexes = [int(i) for i in open('list.txt').read().split()]
open('new_tuple.txt', 'w').write(str([tuples[i] for i in indexes]) + '\n')
Test:
>>> import ast
>>> tuples = ast.literal_eval(open('tuple.txt').read())[0]
>>> indexes = [int(i) for i in open('list.txt').read().split()]
>>> open('new_tuple.txt','w').write(str([tuples[i] for i in indexes])+'\n')
319
>>>
$ cat new_tuple.txt
[['0.0', '0.0', '0.0', '37.2155259'], ['0.05958626994915151', '0.023029990708366282', '0.24325076433502524', '36.286701067215496'], ['0.09995740879332612', '-0.48667451106459764', '-0.23779637140751794', '0.5508093478212072'], ['0.18471536734852254', '0.04700586284011614', '0.13075317534249653', '36.28656567125096']]
$
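Since the actual files are much larger: the tuple.txt shown above happens to be valid JSON (double-quoted strings), and json parsing is usually faster than ast.literal_eval on big inputs. A sketch under that assumption, using context managers so the files are closed promptly:
import json

# Assumes tuple.txt is valid JSON and list.txt has one index per line.
with open('tuple.txt') as f:
    tuples = json.load(f)[0]
with open('list.txt') as f:
    indexes = [int(line) for line in f if line.strip()]
with open('new_tuple.txt', 'w') as f:
    f.write(json.dumps([tuples[i] for i in indexes]) + '\n')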

How to use Python to convert backslashes into forward slashes in Windows file paths?

I have a problem converting all the backslashes into forward slashes using Python.
I tried using the os.sep constant as well as the str.replace() method to accomplish my task, but it wasn't 100% successful.
import os
pathA = 'V:\Gowtham\2019\Python\DailyStandup.txt'
newpathA = pathA.replace(os.sep,'/')
print(newpathA)
Expected Output:
'V:/Gowtham/2019/Python/DailyStandup.txt'
Actual Output:
'V:/Gowtham\x819/Python/DailyStandup.txt'
I am not able to see why the number 2019 is converted into \x819. Could someone help me with this?
Your issue is already in pathA: if you print it out, you'll see that it already contains the \x81, since \201 in a string literal denotes the character with octal code 201, which is 81 in hexadecimal (\x81). For more information, take a look at the definition of string literals.
The quick solution is to use raw strings (r'V:\....'). But you should take a look at the pathlib module.
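For completeness, a minimal pathlib sketch: PureWindowsPath understands backslash-separated paths on any OS, and as_posix() renders them with forward slashes (using the raw-string path from the question):
from pathlib import PureWindowsPath

pathA = r'V:\Gowtham\2019\Python\DailyStandup.txt'  # raw string, as above
print(PureWindowsPath(pathA).as_posix())
# V:/Gowtham/2019/Python/DailyStandup.txt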
Using the raw string leads to the correct answer for me.
import os
pathA = r'V:\Gowtham\2019\Python\DailyStandup.txt'
newpathA = pathA.replace(os.sep,'/')
print(newpathA)
Output:
V:/Gowtham/2019/Python/DailyStandup.txt
Try this, using the raw r'your-string' string format.
>>> import os
>>> pathA = r'V:\Gowtham\2019\Python\DailyStandup.txt' # raw string format
>>> newpathA = pathA.replace(os.sep,'/')
Output:
>>> print(newpathA)
V:/Gowtham/2019/Python/DailyStandup.txt

Error in reading an ASCII-encoded CSV file?

I have a CSV file named Qid-NamedEntityMapping.csv with data like this:
Q1000070 b'Myron V. George'
Q1000296 b'Fred (footballer, born 1979)'
Q1000799 b'Herbert Greenfield'
Q1000841 b'Stephen A. Northway'
Q1001203 b'Buddy Greco'
Q100122 b'Kurt Kreuger'
Q1001240 b'Buddy Lester'
Q1001867 b'Fyodor Stravinsky'
The second column is 'ascii' encoded, and even when I read the file with the following code, it is still not read properly:
import chardet
import pandas as pd

def find_encoding(fname):
    r_file = open(fname, 'rb').read()
    result = chardet.detect(r_file)
    charenc = result['encoding']
    return charenc

my_encoding = find_encoding('datasets/KGfacts/Qid-NamedEntityMapping.csv')
df = pd.read_csv('datasets/KGfacts/Qid-NamedEntityMapping.csv',
                 error_bad_lines=False, encoding=my_encoding)
But the output is still not read correctly: everything ends up in a single column. I also tried encoding='UTF-8', but the output is the same.
What can be done to read it properly?
Looks like you have an improperly saved TSV file. Once you circumvent the TAB problem (as suggested in my comment), you can convert the column with names to a more suitable representation.
Let's assume that the second column of the dataframe is called "names". The b'XXX' thing is probably a bytes [mis]representation of a string. Convert it to a bytes object with ast.literal_eval and then decode to a string:
import ast
df["names"].apply(ast.literal_eval).apply(bytes.decode)
#0 Myron...
#1 Fred...
Last but not least, your problem has almost nothing to do with encodings or charsets.
Your issue looks like the CSV is actually tab-separated, so you need sep='\t' in the read_csv call. Otherwise everything is read as a single column, except "born 1979" in the first row, since that is the only cell with a comma in it.
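Putting the two answers together, a sketch of the whole read; the column names qid and names are hypothetical, and the file is assumed to be tab-separated as described above:
import ast
import pandas as pd

# Assumption: two tab-separated columns; the header names are made up here.
df = pd.read_csv('Qid-NamedEntityMapping.csv', sep='\t',
                 names=['qid', 'names'])
# Turn the textual b'...' literal into bytes, then decode to str.
df['names'] = df['names'].apply(ast.literal_eval).apply(bytes.decode)
print(df.head())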

Using a list from subprocess output

I use subprocess and Python 3 in the following script:
import subprocess
proc = subprocess.Popen("fail2ban-client -d".split(' '), stdout=subprocess.PIPE)
out, err = proc.communicate()
out.decode('ascii')
print(out)
The output is the following:
['set', 'syslogsocket', 'auto']
['set', 'loglevel', 'INFO']
['set', 'logtarget', '/var/log/fail2ban.log']
['set', 'dbfile', '/var/lib/fail2ban/fail2ban.sqlite3']
['set', 'dbpurgeage', 86400]
...
My issue is that all this output is not a list; it is just one really big string with newlines.
I have tried to convert each line to a list with this command:
eval(out.decode('ascii').split('\n')[0])
But I don't think this is a good way to do it.
So my question is: how can I convert a string (which looks like a list) into a list?
Though people are generally afraid of using eval, it is a direct way to convert text literals into data (a safer variant is sketched after this answer). Instead of splitting up your (potentially long) string, you can simply read off one line at a time, like so:
import io
f = io.StringIO(out.decode('ascii'))
first_list = eval(f.readline())
second_list = eval(f.readline())
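If the fail2ban output can't be fully trusted, ast.literal_eval is a drop-in replacement for eval here: it only accepts Python literals, so arbitrary expressions in the output cannot execute. A sketch along those lines, with out being the bytes returned by communicate() above:
import ast

# One parsed list per non-empty line of output.
lines = out.decode('ascii').splitlines()
lists = [ast.literal_eval(line) for line in lines if line.strip()]
print(lists[0])   # e.g. ['set', 'syslogsocket', 'auto']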
