Reading file using python is not working properly in Linux - linux

I'm running a python code where we read a fixed width file, which we extracted from ftp server. the code is working on windows without any issue. but when i am running the same code in the linux ec2 instance it's giving an error saying that "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 1819: invalid start byte". but the same code running in windows without any error.since i am not aware about the encoding type of the source file i am passing the encoding type as None. And this also working fine in windows but when we running the code in linux its giving an error saying that "encoding type None is not recognize".
i am using the codecs library to read the file and python version that i am using is 3.7.3
with codecs.open("recode.dat",encoding=None,errors='replace') as open_src:
with open("target_file.dat", 'w+',encoding=None) as open_tgt:
for src_rec in open_src:
new_rec = ''
for f_length in data_type_length:
f_length = int(f_length)
field = '"' + src_rec[:f_length].strip() + '"|'
new_rec += (field)
src_rec = src_rec[f_length:]
open_tgt.write(new_rec[:-1] + '\n')

Related

fs.readFile can't read file paths on windows, wrong encoding?

I'm passing the string "C:\random-folder\ui-debug.log" file path to fs.readFile() on a virtualized Windows 11 installation under macos and a x64 bit electron installation and nodeJS 18 (idk if that matters).
Using the following code:
const filepath = path.resolve(path.normalize(pathStr))
const file = fs.readFileSync(filepath)
but continuously get this error:
TypeError [ERR_INVALID_ARG_VALUE]: The argument 'path' must be a string or Uint8Array without null bytes. Received 'C:\\random-folder\\ui-debug.log\x00'
The file path always seems to be interpreted incorrectly by readFileSync, the \x00 is always added to the end of the path it errors with. This same code works fine on macos.
What is the correct way to read arbitrary file paths with nodejs/electron under Windows? Thanks!
The issue was to do with how the string was being parsed in the first place. Before passing the filepath to readFileSync, it was being converted to ucs2
const pathStr = clipboard.readBuffer('FileNameW').toString('ucs2')
On windows some extra characters were added at the end of the string but was console logged as normal (some internal conversion going on?) so it looked like a normal path in console but internally it had some extra encoded characters on the end.
I solved it with some regex:
const pathStr = clipboard.readBuffer('FileNameW').toString('ucs2').replace(RegExp(String.fromCharCode(0), 'g'), '')
Now readFileSync reads the file just fine!

I cant read the image file in the particular directory

I took some images in my camera and tried to resize them using opencv library but i think that i can't read the images I don't know the reason why.Thank you for the help in advance.
I have a python 3.8 version and the updated opencv library version.Not much of a background I guess.
import os,cv2
count=0
for file in os.listdir('E:\Projects\Python\Resixing images\Images'):
if file.endswith('.jpg'):
print(file)
img=cv2.imread(file)
img2=img.copy()
img2=cv2.resize(img2,(700,700))
name="resize"+str(count)+".jpg"
cv2.imwrite(name,img2)
count+=1
I receive an error message
P_20191107_214848_SRES.jpg
Traceback (most recent call last):
File "E:\Projects\Python\Resixing images\image changing res.py", line 7, in
img2=img.copy()
AttributeError: 'NoneType' object has no attribute 'copy'
[Finished in 9.7s]
Try this:
img=cv2.imread('E:\Projects\Python\Resixing images\Images' + '\' + file)
The problem was that you were sending only the name of the file to the python program, so the program tried to look for the image in the current directory and not at your specified path. The above change should fix the problem.
also, a good idea would be to have 2 // instead of 1 /, just to avoid any format specifier in the middle of things, or you could just use r to mention the path to be raw string
img=cv2.imread('E:\\Projects\\Python\\Resixing images\\Images' + '\\' + file)
img=cv2.imread(r'E:\Projects\Python\Resixing images\Images\' + file)

Python Script got ERROR when switch from Windows to Linux

I write a Python script on Windows and work pretty well, now I just installed "elementary Os" that is a Ubuntu based distro, but some how when I start the script it just crashed... I dont really now how to fixed it.
I let u a part of the script making problem:
memos=open(str(os.getcwd())+'\\LOG\\tres.txt','w')
menf='3)PRESION LATERAL DEL SUELO DE RELLENO\n Ka = '+str(round(Ka,2))+'\nP = '+str(round(P,2))+'\nY = '+str(round(Y,2))+'''
\nHm = '''+str(round(Hm,2))+'\nPm = '+str(round(Pm,2))+'\nYm = '+str(round(Ym,2))
memos.write(menf)
memos.close()
So the deal should be...
memos=open(str(os.getcwd())+'\\LOG\\tres.txt','w')
Because show me an error...
UnicodeEncodeError: 'ascii' codec can't encode character '\xba' in position 199: ordinal not in range(128)
Now, when I change for...
memos=open(str(os.getcwd())+'/LOG/tres.txt','w')
It got me another error...
FileNotFoundError: [Errno 2] No such file or directory: '/home/jojaror/Documentos/Scripts de Python/LOG/tres.txt'
I tryed to solve at my own, but i can't... so, if anyway could help me on this it would be helpful.

Python requests: UnicodeEncodeError: 'charmap' codec can't encode character

I scraped a webpage (name changed in code here) as follows:
import requests
r = requests.get('https://www.samplewebpage.com')
Then I tried to write r.text to a file as follows:
f = open ('filename', 'w')
f.write(r.text)
f.close()
I get an error as:
UnicodeEncodeError: 'charmap' codec can't encode character '\u20b9' in position 158691: character maps to <undefined>
r.encoding shows UTF-8. How to resolve the above?
Have also tried the following:
- few other random webpages and am able to run the code without any error for most.
- instead of r.text used r.content.decode('utf-8', 'ignore') but same error as above
My environment/system specifications:
Python 3.6.4
Windows 8.1 Pro, 64 bit
Default IDLE as installed from https://www.python.org.
Tried with a script in Atom as well, but same error.
Suspecting console encoding mismatch as I read in another similar problem on this forum, I reconfirmed from that the Atom console is set to UTF-8, though I believe console encoding is not the problem here, as I want to write to a file.
Thanks
Try explicitly specifying the file's encoding:
f = open ('filename', 'w', encoding='utf8')
f.write(r.text)
f.close()

Encoding error Python3.5

So I'm encountering a strange encoding error in Python3.5, I'm reading a string consisting html-data, and I'm handling the string like this :
def parseHtml(self,url):
r = requests.get(self.makeUrl())
data = r.text.encode('utf-8').decode('ascii', 'ignore')
self.soup = BeautifulSoup(data,'lxml')
The error happens when I'm trying to print the following:
def extractTable(self):
table = self.soup.findAll("table", { "class" : "messageTable" })
print(table)
I have checked my locale, and tried various variations of encode / decode as stated in previous similar posts on SO. The strangest thing (for me) is that the script works flawlessly on a different machine and on my laptop. But on my Windows Machine (using cygwin to a remote server) and on my Ubuntu install it simply wont run and gives me:
UnicodeEncodeError: 'ascii' codec can't encode character '\xa0' in position 1273: ordinal not in range(128)
Okay, so I moved the file from the remote server to my local-machine and it executed perfectly. I then checked my sys.stdout.encoding :
>>> import sys
>>> sys.stdout.encoding
'ANSI_X3.4-1968'
Clearly something was wrong, so I ended up exporting :
export PYTHONIOENCODING=utf-8
And voìla!

Resources