I'm trying to edit a CSV file by writing to a temporary file and eventually replacing the original with the temp file. I'll have to edit the CSV file multiple times, so I need to be able to reference the temp file by name. I've never used NamedTemporaryFile before and I'm running into a lot of difficulties. The most persistent problem I'm having is writing out the edited lines.
This part goes through and copies over each row, unless specific values are in a specific column, in which case it skips that row.
I have this:
office = 3

temp = tempfile.NamedTemporaryFile(delete=False)

with open(inFile, "rb") as oi, temp:
    r = csv.reader(oi)
    w = csv.writer(temp)
    for row in r:
        if row[office] == "R00" or row[office] == "ALC" or row[office] == "RMS":
            pass
        else:
            w.writerow(row)
and I get this error:
Traceback (most recent call last):
File "H:\jcatoe\Practice Python\pract.py", line 86, in <module>
cleanOfficeCol()
File "H:\jcatoe\Practice Python\pract.py", line 63, in cleanOfficeCol
for row in r:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
So I searched for that error, and the general consensus was that "rb" needs to be "rt", so I tried that and got this error:
Traceback (most recent call last):
File "H:\jcatoe\Practice Python\pract.py", line 86, in <module>
cleanOfficeCol()
File "H:\jcatoe\Practice Python\pract.py", line 67, in cleanOfficeCol
w.writerow(row)
File "C:\Users\jcatoe\AppData\Local\Programs\Python\Python35-32\lib\tempfile.py", line 483, in func_wrapper
return func(*args, **kwargs)
TypeError: a bytes-like object is required, not 'str'
I'm confused because the two errors seem to be telling me to do opposite things.
If you read the tempfile docs, you'll see that NamedTemporaryFile opens its file in 'w+b' mode by default. If you take a closer look at your errors, you'll see that you're getting one on read and one on write. What you need to do is make sure you're opening your input and output files in the same mode.
You can do it like this:
import csv
import tempfile

office = 3

with open(inFile, 'r') as oi, tempfile.NamedTemporaryFile(delete=False, mode='w') as temp:
    reader = csv.reader(oi)
    writer = csv.writer(temp)
    for row in reader:
        if row[office] == "R00" or row[office] == "ALC" or row[office] == "RMS":
            pass
        else:
            writer.writerow(row)
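Since the goal is to replace the original file with the temp file once filtering is done, here is a minimal sketch of that final step, assuming the swap should happen after the with block has closed both handles (os.replace and shutil.move are standard library; which one you need depends on whether the temp file lands on the same drive as inFile):

import os

# temp.name is the path of the NamedTemporaryFile. os.replace()
# overwrites the destination atomically, but only when both paths are
# on the same filesystem; if the system temp directory is on a
# different drive than inFile, use shutil.move() instead.
os.replace(temp.name, inFile)

One more note: per the csv module docs, it's worth opening both files with newline='' so the writer doesn't emit extra blank rows on Windows.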
Related
I am facing problems with code that had been working perfectly fine and ran everything I needed it to. This happens from time to time, but this time I don't know what my problem is. I recently tried to add a sampling frequency so I could control how many times per second my data is read, but since I made those changes I have had nothing but errors, so I deleted the changes I made; now I get errors even though I am using the original code from before.
My electrical connections are fine, so this is not the issue. I also get no errors in the terminal when using i2cget -y 1.
This is my Python code (also using an INA219 sensor):
# Importing libraries
import csv
from ina219 import INA219
from ina219 import DeviceRangeError

SHUNT_OHMS = 0.1

read_ina = INA219(SHUNT_OHMS)
read_ina.configure()

def read_all():
    data = {}
    data['Bus Voltage'] = read_ina.voltage()
    data['Bus Current'] = read_ina.current()
    data['Power'] = read_ina.power()
    data['Shunt Voltage'] = read_ina.shunt_voltage()
    return data

with open('SensorData.csv', 'w') as f:
    data = read_all()
    writer = csv.DictWriter(f, fieldnames=list(data.keys()))
    writer.writeheader()
    exit = False
    while not exit:
        try:
            writer.writerow(data)
            data = read_all()
        except KeyboardInterrupt:
            exit = True
It is supposed to create a csv file that logs the voltage and the other readings in a loop. The code is pretty straightforward. Can anyone help me fix this issue?
This is the error that I keep facing:
Traceback (most recent call last):
File "/home/pi/Downloads/scripts/Assignment2 CreateCSV/SensorData.py", line 40, in <module>
data = read_all()
File "/home/pi/Downloads/scripts/Assignment2 CreateCSV/SensorData.py", line 20, in read_all
data['Bus Voltage'] = read_ina.voltage()
File "/usr/local/lib/python3.5/dist-packages/ina219.py", line 180, in voltage
value = self._voltage_register()
File "/usr/local/lib/python3.5/dist-packages/ina219.py", line 363, in _voltage_register
register_value = self._read_voltage_register()
File "/usr/local/lib/python3.5/dist-packages/ina219.py", line 367, in _read_voltage_register
return self.__read_register(self.__REG_BUSVOLTAGE)
File "/usr/local/lib/python3.5/dist-packages/ina219.py", line 394, in __read_register
register_value = self._i2c.readU16BE(register)
File "/usr/local/lib/python3.5/dist-packages/Adafruit_GPIO/I2C.py", line 190, in readU16BE
return self.readU16(register, little_endian=False)
File "/usr/local/lib/python3.5/dist-packages/Adafruit_GPIO/I2C.py", line 164, in readU16
result = self._bus.read_word_data(self._address,register) & 0xFFFF
File "/usr/local/lib/python3.5/dist-packages/Adafruit_PureIO/smbus.py", line 226, in read_word_data
ioctl(self._device.fileno(), I2C_RDWR, request)
OSError: [Errno 121] Remote I/O error
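(As an aside, the sampling-frequency change described above is usually done with a time.sleep() inside the write loop. A minimal sketch of how that might look against the loop in the code above; the 0.5-second interval is an assumed value, not something from the original post:)

import time

SAMPLE_INTERVAL = 0.5  # seconds between readings -- assumed value

while not exit:
    try:
        writer.writerow(data)
        data = read_all()
        time.sleep(SAMPLE_INTERVAL)  # throttle how often the sensor is polled
    except KeyboardInterrupt:
        exit = True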
I am searching for a way to select rows that contain a given word, so I am using this script:
import pandas
import datetime
df = pandas.read_csv(
    r"C:StockEtablissement_utf8(1)\StockEtablissement_utf8.csv",
    sep=",",
)
communes = ["PERPIGNAN"]
print()
df = df[~df["libelleCommuneEtablissement"].isin(communes)]
print()
My script works well with a normal CSV, but with a heavy CSV (4 GB) the script says:
Traceback (most recent call last):
File "C:lafinessedufiness.py", line 5, in <module>
df = pandas.read_csv(r'C:StockEtablissement_utf8(1)\StockEtablissement_utf8.csv',
File "C:\Users\\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "C:\Users\\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\io\parsers\readers.py", line 680, in read_csv
return _read(filepath_or_buffer, kwds)
File "C:\Users\\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\io\parsers\readers.py", line 581, in _read
return parser.read(nrows)
File "C:\Users\\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\io\parsers\readers.py", line 1250, in read
index, columns, col_dict = self._engine.read(nrows)
File "C:\Users\\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 225, in read
chunks = self._reader.read_low_memory(nrows)
File "pandas\_libs\parsers.pyx", line 805, in pandas._libs.parsers.TextReader.read_low_memory
File "pandas\_libs\parsers.pyx", line 883, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 1026, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas\_libs\parsers.pyx", line 1072, in pandas._libs.parsers.TextReader._convert_tokens
File "pandas\_libs\parsers.pyx", line 1172, in pandas._libs.parsers.TextReader._convert_with_dtype
File "pandas\_libs\parsers.pyx", line 1731, in pandas._libs.parsers._try_int64
MemoryError: Unable to allocate 128. KiB for an array with shape (16384,) and data type int64
Do you know how I can fix this error, please?
The pd.read_csv() function has an option to read the file in chunks, rather than loading it all at once. Use iterator=True and specify a reasonable chunk size (rows per chunk).
import pandas as pd

path = r'C:StockEtablissement_utf8(1)\StockEtablissement_utf8.csv'
it = pd.read_csv(path, sep=',', iterator=True, chunksize=10_000)

communes = ['PERPIGNAN']

filtered_chunks = []
for chunk_df in it:
    chunk_df = chunk_df.query('libelleCommuneEtablissement not in @communes')
    filtered_chunks.append(chunk_df)

df = pd.concat(filtered_chunks)
As you can see, you don't have enough memory available for Pandas to load that file entirely into memory.
One reason is that, judging by Python38-32 in the traceback, you're running a 32-bit version of Python, where 4 gigabytes (or is it 3 gigabytes?) is the limit for memory allocations anyway. If your system is 64-bit, you should switch to the 64-bit version of Python; that's one obstacle out of the way.
If that doesn't help, you'll simply need more memory. You could increase Windows's virtual memory, or buy more actual memory and install it in your system.
If those don't help either, then you'll have to come up with a better approach than loading the big CSV entirely into memory.
For one, if you really only care about rows containing the string PERPIGNAN (no matter the column; you can always re-filter properly in your code), you could run grep PERPIGNAN data.csv > data_perpignan.csv and work with that (assuming you have grep; you can do the same filtering with a short Python script, as sketched below).
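A minimal sketch of that pre-filtering script, assuming the input is data.csv and the header row should always be kept (both assumptions, mirroring the grep example above):

# Keep the header plus any line mentioning PERPIGNAN, writing the
# survivors to a smaller file that pandas can then load whole.
with open("data.csv", encoding="utf-8") as src, \
        open("data_perpignan.csv", "w", encoding="utf-8") as dst:
    for i, line in enumerate(src):
        if i == 0 or "PERPIGNAN" in line:
            dst.write(line)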
Since read_csv() accepts any iterable of lines, you can also just do something like
def lines_from_file_including_strings(file, strings):
    for i, line in enumerate(file):
        if i == 0 or any(string in line for string in strings):
            yield line

communes = ["PERPIGNAN", "PARIS"]

with open("StockEtablissement_utf8.csv") as f:
    df = pd.read_csv(lines_from_file_including_strings(f, communes), sep=",")
for an initial filter.
I am hoping someone can help me with this. After having a nightmare installing numpy on a raspberry pi, I am stuck again!
The gist of what I am trying to do: I have an arduino that sends numbers (bib race numbers entered by hand) over LoRa to the RX of the raspberry pi.
This script is supposed to read the incoming data and print it so I can see it in the terminal. Pandas is then supposed to compare the number against a txt/csv file, and if it matches in the bib number column, append the matched row to a new file.
Now, the first bit (capturing the data and printing) works, and on my Windows PC the second bit worked when I was testing with a fixed number rather than incoming data.
I have basically tried my best to mash the two together so the comparison uses the incoming number instead.
I should also state that the error happened after I pressed 3 on the arduino (which printed in the terminal of the raspberry pi before erroring), which is probably why it is KeyError: '3'.
My code is here
#!/usr/bin/env python3
import serial
import csv
import pandas as pd
#import numpy as np

if __name__ == '__main__':
    ser = serial.Serial('/dev/ttyS0', 9600, timeout=1)
    ser.flush()
    while True:
        if ser.in_waiting > 0:
            line = ser.readline().decode('utf-8').rstrip()
            print(line)
            with open("test_data.csv", "a") as f:
                writer = csv.writer(f, delimiter=",")
                writer.writerow([line])
            df = pd.read_csv("data.txt")
            #out = (line)
            filtered_df = df[line]
            print('Original Dataframe\n---------------\n', df)
            print('\nFiltered Dataframe\n------------------\n', filtered_df)
            filtered_df.to_csv("data_amended.txt", mode='a', index=False, header=False)
            #print(df.to_string())
And my error is here:
Python 3.7.3 (/usr/bin/python3)
>>> %Run piserialmashupv1.py
3
Traceback (most recent call last):
File "/home/pi/.local/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '3'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/pi/piserialmashupv1.py", line 20, in <module>
filtered_df = df[line]
File "/home/pi/.local/lib/python3.7/site-packages/pandas/core/frame.py", line 3455, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/pi/.local/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: '3'
>>>
I have been asked to post the first few lines of data.txt:
_id,firstname,surname,team,info
1, Peter,Smith,,Red Walk (70 miles- 14 mile walk/run + 56 mile cycle)
2, Samantha,Grey,Team Grey,Blue walk (14 mile walk/run)
3, Gary,Parker,,Red Walk (70 miles- 14 mile walk/run + 56 mile cycle)
I think it must be the way I am referencing the incoming rx number?
Any help very much appreciated!
Dave
I have it working; see the final code below.
I know that Pandas just didn't like the way the data was being input originally.
This fixes it. I also had to make sure it knew it was dealing with an integer when filtering; on my first attempt I didn't, and it couldn't filter the data properly.
import serial
import csv
import time
import pandas as pd

if __name__ == '__main__':
    ser = serial.Serial('/dev/ttyS0', 9600, timeout=1)
    ser.flush()
    while True:
        if ser.in_waiting > 0:
            line = ser.readline().decode('utf-8').rstrip()
            print(line)
            with open("test_data.txt", "w") as f:
                writer = csv.writer(f, delimiter=",")
                writer.writerow([line])
            time.sleep(0.1)
            ser.write("Y".encode())
            df = pd.read_csv("data.txt")
            out = df['_id'] == int(line)
            filtered_df = df[out]
            print('Original Dataframe\n---------------\n', df)
            print('\nFiltered Dataframe\n---------\n', filtered_df)
            filtered_df.to_csv("data_amended.txt", mode='a', index=False, header=False)
            time.sleep(0.1)
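For anyone hitting the same KeyError: indexing a DataFrame with a string, df[line], looks up a column with that name, so df['3'] fails because data.txt has no column called '3'. Selecting rows needs a boolean mask instead, and since the serial data arrives as a string, it has to be converted with int(line) before comparing against the integer _id column. A minimal illustration with made-up rows matching the data.txt header:

import pandas as pd

df = pd.DataFrame({'_id': [1, 2, 3], 'firstname': ['Peter', 'Samantha', 'Gary']})

# df['3'] would raise KeyError: '3' -- it looks for a COLUMN named '3'
mask = df['_id'] == int('3')  # convert the received string, then compare
print(df[mask])               # prints the row where _id is 3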
So I copied and pasted a demo program from the book I am using to learn Python:
#!/usr/bin/env python
    import csv

total = 0
priciest = ('', 0, 0, 0)

r = csv.reader(open('purchases.csv'))

for row in r:
    cost = float(row[1]) * float(row[2])
    total += cost
    if cost == priciest[3]:
        priciest = row + [cost]

print("You spent", total)
print("Your priciest purchase was", priciest[1], priciest[0], "at a total cost of", priciest[3])
And I get the error:
Traceback (most recent call last):
File "purchases.py", line 2, in <module>
import csv
File "/Users/Solomon/Desktop/Python/csv.py", line 5, in <module>
r = csv.read(open('purchases.csv'))
AttributeError: 'module' object has no attribute 'read'
Why is this happening? How do I fix it?
Update:
Fixed All The Errors
Now I'm getting:
Traceback (most recent call last):
File "purchases.py", line 6, in <module>
for row in r:
_csv.Error: line contains NULL byte
What was happening in terms of the csv.py:
I had a file with the same code named csv.py, saved in the same directory. I thought that the fact that it was named csv.py was screwing it up, so I started a new file called purchases.py, but forgot to delete csv.py.
Don't name your file csv.py.
When you do, Python will look in your file for the csv code instead of the standard library csv module.
Edit, to include the important note from the comments: if there's a csv.pyc file left over in that directory, you'll have to delete that too. That is Python bytecode, which would be used in place of re-running your csv.py file.
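A quick way to confirm this kind of shadowing is to print the module's path (a minimal sketch; it works for any module you suspect is shadowed):

import csv

# If this prints something like /Users/Solomon/Desktop/Python/csv.py
# instead of a path inside the standard library, a local file is
# shadowing the real module.
print(csv.__file__)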
There is a discrepancy between the code in the traceback of your error:
r = csv.read(open('purchases.csv'))
And the code you posted:
r = csv.reader(open('purchases.csv'))
So which are you using?
At any rate, fix that indentation error in line 2:
#!/usr/bin/env python
import csv
total = 0
And create your csv reader object with a context handler, so as not to leave the file handle open:
with open('purchases.csv') as f:
    r = csv.reader(f)
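Putting both fixes together, a sketch of the corrected script (note that the comparison presumably needs to be > rather than == for priciest to track the most expensive purchase; that change is my assumption, not something stated in the question):

#!/usr/bin/env python
import csv

total = 0
priciest = ('', 0, 0, 0)

with open('purchases.csv') as f:
    r = csv.reader(f)
    for row in r:
        cost = float(row[1]) * float(row[2])
        total += cost
        if cost > priciest[3]:  # assumed fix: '==' only matches the current value, '>' tracks the max
            priciest = row + [cost]

print("You spent", total)
print("Your priciest purchase was", priciest[1], priciest[0], "at a total cost of", priciest[3])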
I'm trying to run a dictionary attack on a zip file, using Pool to increase speed.
But I get the following error in Python 3.6, while the same code works in Python 2.7:
Traceback (most recent call last):
File "zip_crack.py", line 42, in <module>
main()
File "zip_crack.py", line 28, in main
for result in results:
File "/usr/lib/python3.6/multiprocessing/pool.py", line 761, in next
raise value
File "/usr/lib/python3.6/multiprocessing/pool.py", line 450, in _ handle_tasks
put(task)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/usr/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
TypeError: cannot serialize '_io.BufferedReader' object
I tried to search for the same error but couldn't find an answer that helps here.
The code looks like this:
import time
import zipfile
from functools import partial
from multiprocessing import Pool

def crack(pwd, f):
    try:
        key = pwd.strip()
        f.extractall(pwd=key)
        return True
    except:
        pass

z_file = zipfile.ZipFile("../folder.zip")

with open('words.dic', 'r') as passes:
    start = time.time()
    lines = passes.readlines()
    pool = Pool(50)
    results = pool.imap_unordered(partial(crack, f=z_file), lines)
    pool.close()
    for result in results:
        if result:
            pool.terminate()
            break
    pool.join()
I also tried another approach using map
import contextlib

with contextlib.closing(Pool(50)) as pool:
    pool.map(partial(crack, f=z_file), lines)
which worked great and found passwords quickly in Python 2.7, but it throws the same exception in Python 3.6.
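The traceback says the Pool is trying to pickle the ZipFile's underlying _io.BufferedReader in order to send it to the workers, which Python 3 refuses to do. A common workaround is to pass only the zip path and let each worker open the archive itself, so nothing unpicklable crosses the process boundary; a minimal sketch of that approach (not from the original post):

import zipfile
from functools import partial
from multiprocessing import Pool

def crack(pwd, zip_path):
    # Open the archive inside the worker process; ZipFile objects
    # (and their file handles) cannot be pickled across processes.
    try:
        with zipfile.ZipFile(zip_path) as z:
            z.extractall(pwd=pwd.strip().encode())
        return pwd
    except Exception:
        return None

if __name__ == '__main__':
    with open('words.dic') as passes:
        lines = passes.readlines()
    with Pool(8) as pool:
        for result in pool.imap_unordered(partial(crack, zip_path='../folder.zip'), lines):
            if result:
                print('Password found:', result.strip())
                pool.terminate()
                break

Reopening the archive on every attempt costs some speed; a Pool initializer that opens the ZipFile once per worker would be faster, but this keeps the sketch minimal.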