I am trying to run a multiprocessing pool to access multiple URLs and pull data with a specific function simultaneously. Unfortunately, I keep getting duplicate output for each ticker in ticker_list. I want a single line of output for each ticker.
Code:
import csv
import multiprocessing as mp

ticker_list = []
with open('/home/a73nk-xce/Documents/Python/SharkFin/SP500_Fin/SP_StockChar/ticker.csv', 'r', newline='') as csvfile:
    spamreader = csv.reader(csvfile)
    for rows in spamreader:
        pass
    for eachTicker in rows:
        ticker_list.append(eachTicker)

if __name__ == '__main__':
    pool = mp.Pool(20)
    pool.map(IncData, ticker_list)
    pool.terminate()
Output:
[28105000, 16863000, 11242000, 0, 8355000, 87000, 0, 0, 2800000, -15000, 2785000, 395000, 2390000, 864000, 0, 1509000, 0, 0, 0, 0, 1526000, 0, 1526000]
[1262006, 829648, 432358, 0, 252384, 0, 0, 0, 179974, -2082, 177892, 11392, 166500, 45959, 0, 120541, -2171, 0, 0, 0, 118370, 0, 118370]
[6981000, 3388000, 3593000, 719000, 2043000, 0, 0, 0, 831000, -72000, 759000, 113000, 646000, 142000, 0, 504000, 0, 0, 0, 0, 504000, 0, 504000]
[6981000, 3388000, 3593000, 719000, 2043000, 0, 0, 0, 831000, -72000, 759000, 113000, 646000, 142000, 0, 504000, 0, 0, 0, 0, 504000, 0, 504000]
[69269000, 0, 69269000, 0, 55852000, 20058000, 6666000, 0, 13794000, 570000, 28054000, 13690000, 14364000, 686400
As you can see, the output above has duplicates in it, and it does that throughout the entire run of the program.
I'm not sure I understand the problem correctly, as the provided information is incomplete.
If IncData is writing its results to a single file, then the issue is due to concurrent access by the worker processes to that file. If a file is written by more than one process at the same time, the writes will overlap, corrupting the file. This might be the reason for what you call "duplicated lines".
The best approach would be to let the IncData callable return its result to the parent process instead of writing to the file itself. The parent process would then aggregate the results from the workers and write them to a file sequentially.
Something like:
import csv
import multiprocessing as mp

if __name__ == '__main__':
    with mp.Pool(20) as pool:
        results_list = pool.map(IncData, ticker_list)
    with open('/path/to/your/file.csv', 'w', newline='') as csvfile:
        # Write each worker's result as one CSV row
        csv.writer(csvfile).writerows(results_list)
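On the worker side, IncData would then simply return its row instead of writing it anywhere. A minimal sketch (IncData's body isn't shown in the question, so fetch_income_statement is a hypothetical stand-in for its actual scraping logic):

def IncData(ticker):
    # fetch_income_statement is a hypothetical stand-in for the real
    # scraping logic; the point is to return the row, not write it here
    data = fetch_income_statement(ticker)
    return data  # delivered back to the parent through pool.map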
I am trying to render a glTF model with ash (Vulkan) in Rust.
I sent all my data to the GPU and I am seeing this:
Naturally, my suspicion is that the normal data is wrong. So I checked with RenderDoc:
Those seem ok, maybe the attributes are wrong?
All those normals seem to be unit length, so they should be fine. Maybe my pipeline is wrong?
Seems like the correct format and binding (I am sending 3 buffers, binding one to binding 0, one to binding 1, and the last to binding 2; binding 1 has the normals).
The only thing I find that is weird is, if I go to the vertex input pipeline stage and look at the buffers:
This is what the buffer at index 1 shows:
This does not happen for the buffer at index 0 (positions), which also renders properly. So whatever is causing the normals to show up as hex codes here is likely the cause of the bug. But I have no idea why this is happening. As far as I can see, the pipeline and buffers were all set up properly.
You presumably want to use one separate buffer for each vertex attribute (aka a non-interleaved vertex buffer, SoA),
but your VkVertexInputAttributeDescription::offset values [0, 12, 24] are what you would use for one vertex buffer interleaving all attributes (provided that their binding values point to one and the same VkVertexInputBindingDescription).
e.g.
// Interleaved:
// Buffer 0: |Position: R32G32B32_FLOAT, Normal: R32G32B32_FLOAT, Uv: R32G32B32_FLOAT|, * vertex count
VkVertexInputBindingDescription {
    .binding = 0,
    .stride = 12 * 3, // 3 `R32G32B32_FLOAT`s !
    .inputRate = VK_VERTEX_INPUT_RATE_VERTEX
};
// All attributes in the same `binding` == `0`
VkVertexInputAttributeDescription[3] {
    {
        .location = 0,
        .binding = 0,
        .format = VK_FORMAT_R32G32B32_SFLOAT,
        .offset = 0 // [0, 11] portion
    },
    {
        .location = 1,
        .binding = 0,
        .format = VK_FORMAT_R32G32B32_SFLOAT,
        .offset = 12 // [12, 23] portion
    },
    {
        .location = 2,
        .binding = 0,
        .format = VK_FORMAT_R32G32B32_SFLOAT,
        .offset = 24 // [24, 35] portion
    }
};
Your VkVertexInputBindingDescription[1].stride == 12 tells Vulkan that your vertex buffer 1 uses 12 bytes for each vertex, and your VkVertexInputAttributeDescription[1].offset == 12 says the normal value is at offset 12, which is out of bounds.
Same deal with your VkVertexInputAttributeDescription[2].offset == 24 overstepping (by a large amount) VkVertexInputBindingDescription[2].stride == 12.
For using one tightly-packed buffer for each vertex attribute, you need to correctly set your VkVertexInputAttributeDescription[n].offset values to 0, which looks something like:
// Non-interleaved:
// Buffer 0: |Position: R32G32B32_FLOAT|, * vertex count
// Buffer 1: |Normal: R32G32B32_FLOAT|, * vertex count
// Buffer 2: |Uv: R32G32B32_FLOAT|, * vertex count
VkVertexInputBindingDescription[3] {
    {
        .binding = 0,
        .stride = 12,
        .inputRate = VK_VERTEX_INPUT_RATE_VERTEX
    },
    {
        .binding = 1,
        .stride = 12,
        .inputRate = VK_VERTEX_INPUT_RATE_VERTEX
    },
    {
        .binding = 2,
        .stride = 12,
        .inputRate = VK_VERTEX_INPUT_RATE_VERTEX
    }
};
// Each attribute in its own `binding` == `location`
VkVertexInputAttributeDescription[3] {
    {
        .location = 0,
        .binding = 0,
        .format = VK_FORMAT_R32G32B32_SFLOAT,
        .offset = 0 // Whole [0, 11]
    },
    {
        .location = 1,
        .binding = 1,
        .format = VK_FORMAT_R32G32B32_SFLOAT,
        .offset = 0 // Whole [0, 11]
    },
    {
        .location = 2,
        .binding = 2,
        .format = VK_FORMAT_R32G32B32_SFLOAT,
        .offset = 0 // Whole [0, 11]
    }
};
Worth noting is the comment line // vertex stride 12 less than total data fetched 24 that RenderDoc generates in the Buffer Format section, and how it gets generated:
it detects when your last vertex attribute description oversteps its binding description's stride:
if(i + 1 == attrs.size())
{
  // for the last attribute, ensure the total size doesn't overlap stride
  if(attrs[i].byteOffset + cursz > stride && stride > 0)
    return tr("// vertex stride %1 less than total data fetched %2")
        .arg(stride)
        .arg(attrs[i].byteOffset + cursz);
}
I'm at the last stages of completing my program's transition from Tkinter to Gtk. The last step is updating the ListStore in an efficient manner. So far, what I'm trying is working...but for only about 15-120 seconds before it crashes (error code is 139; sometimes a coredump with no other message, sometimes a "Warning: corrupted double-linked list detected"). While it's running, top reports less than 100 MB of RAM in use.
The ListStore contains 51 rows with 25 columns, created as follows:
def init_ListStore():
    # initialize the main ListStore database
    for symbol in cbp_symbols.keys():
        name = cbp_symbols[symbol]['name']
        row = [symbol, name, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
               0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        liststore1.append(row)
    treeview1.set_model(liststore1)
    cell = Gtk.CellRendererText()
    columns = list(my_vars['liststore_main']['column_data'].keys())
    for i in range(liststore1.get_n_columns()):
        treeviewcolumn = Gtk.TreeViewColumn(columns[i])
        treeview1.append_column(treeviewcolumn)
        treeviewcolumn.pack_start(cell, True)
        treeviewcolumn.add_attribute(cell, "text", i)
After this is done, I do the following three steps: update my account information (token balances), get a new message from the websocket (stored as a dictionary), and update the new info on a cell-by-cell basis, something like:
liststore1[row][col.index('Open')] = float(ticker_data["open_24h"])
liststore1[row][col.index('Last')] = float(ticker_data["price"])
liststore1[row][col.index('Low')] = float(ticker_data["low_24h"])
liststore1[row][col.index('High')] = float(ticker_data["high_24h"])
liststore1[row][col.index('Volume')] = float(ticker_data["volume_24h"])
I would like to be able to edit/replace the entire row in one shot; the above seems slow and burdensome. Once the initial creation/updates are done, everything is being done with the same function (there are about 20 total updates to be made with each message).
UPDATE #1: this time I got a more definitive error. The problem is in the lib/python3.8/site-packages/gi/overrides/Gtk.py module, line 1680:
if GTK2 or GTK3:
    _Gtk_main_quit = Gtk.main_quit

    @override(Gtk.main_quit)
    def main_quit(*args):
        _Gtk_main_quit()

    _Gtk_main = Gtk.main

    @override(Gtk.main)
    def main(*args, **kwargs):
        with register_sigint_fallback(Gtk.main_quit):
            with wakeup_on_signal():
                return _Gtk_main(*args, **kwargs)  # << this line
UPDATE #2: I think this is related to multithread locking, but I don't have a clue how that works (the module is imported by the websocket client, not by me). I did try to apply a lock in the main function that parses the incoming socket message, but it didn't help. I do know that if I run the program without the Gtk window active, it never crashes.
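For reference, GTK is not thread-safe, and widget/model updates are expected to happen on the GTK main loop. A minimal sketch of marshalling updates from the websocket thread onto the main loop with GLib.idle_add (the function names are hypothetical stand-ins for the actual callback and row-update code):

from gi.repository import GLib

def apply_update(row, ticker_data):
    # Runs on the GTK main loop, so touching liststore1 here is safe
    liststore1[row][col.index('Last')] = float(ticker_data["price"])
    return False  # False removes the idle callback after one run

def on_ticker_message(row, ticker_data):
    # Called from the websocket thread: don't touch Gtk objects here,
    # just schedule the update on the main loop
    GLib.idle_add(apply_update, row, ticker_data)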
Finally got this working. The trick was to use Gtk.TreeIter to reference the current row, then update the entire row in one shot:
titer = liststore1.get_iter(row)
my_columns = [
    df.columns.get_loc('Last') + 1,
    df.columns.get_loc('Open') + 1,
    df.columns.get_loc('Change') + 1,
    df.columns.get_loc('Change %') + 1,
    df.columns.get_loc('Low') + 1,
    df.columns.get_loc('High') + 1,
    df.columns.get_loc('Value') + 1,
    df.columns.get_loc('Val Open') + 1,
    df.columns.get_loc('Port %') + 1,
    df.columns.get_loc('Cost Basis') + 1,
    df.columns.get_loc('CB Change') + 1,
    df.columns.get_loc('Value') + 1,
    df.columns.get_loc('Val Open') + 1,
    df.columns.get_loc('Profit') + 1,
    df.columns.get_loc('Volume') + 1,
    df.columns.get_loc('30d Avg') + 1,
    df.columns.get_loc('Vol Act %') + 1,
    df.columns.get_loc('Trade Val') + 1
]
my_data = [
    df.at[row, 'Last'],
    df.at[row, 'Open'],
    df.at[row, 'Change'],
    df.at[row, 'Change %'],
    df.at[row, 'Low'],
    df.at[row, 'High'],
    df.at[row, 'Value'],
    df.at[row, 'Val Open'],
    df.at[row, 'Port %'],
    df.at[row, 'Cost Basis'],
    df.at[row, 'CB Change'],
    df.at[row, 'Value'],
    df.at[row, 'Val Open'],
    df.at[row, 'Profit'],
    df.at[row, 'Volume'],
    df.at[row, '30d Avg'],
    df.at[row, 'Vol Act %'],
    df.at[row, 'Trade Val']
]
liststore1.set(titer, my_columns, my_data)
I know this code needs to be streamlined, but I tend to get something working first, then simplify/streamline afterwards.
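For what it's worth, a possible streamlined version of the above (just a sketch: it assumes the df column names map onto the ListStore columns offset by one, as in the code above, and it drops the duplicated 'Value'/'Val Open' entries):

fields = ['Last', 'Open', 'Change', 'Change %', 'Low', 'High',
          'Value', 'Val Open', 'Port %', 'Cost Basis', 'CB Change',
          'Profit', 'Volume', '30d Avg', 'Vol Act %', 'Trade Val']
titer = liststore1.get_iter(row)
# Build both lists in one pass instead of hand-listing every column
my_columns = [df.columns.get_loc(f) + 1 for f in fields]
my_data = [df.at[row, f] for f in fields]
liststore1.set(titer, my_columns, my_data)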
Usually, to read metadata from an image I use PyExifTool, which is very powerful:
import exiftool

exiftool.executable = "exiftool.exe"
img_path = 'test.JPG'
with exiftool.ExifTool() as et:
    metadata = et.get_metadata(img_path)  # read metadata
To edit an image (resize, crop, etc.) I use the Pillow or OpenCV library.
Example:
from PIL import Image

img_path = 'test.JPG'
img = Image.open(img_path)
img = img.resize((100, 100), Image.BILINEAR)  # edit the image
exif = img.info['exif']
img.save('output.jpg', exif=exif)
However, not all of the metadata is in the EXIF tags.
For example:
'Composite:Aperture': 5.6,
'Composite:CircleOfConfusion': 0.0110169622305844,
'Composite:FOV': 73.7398575770811,
'Composite:FocalLength35efl': 24,
'Composite:GPSAltitude': 455.764,
'Composite:GPSLatitude': 41.3867478333333,
'Composite:GPSLongitude': -70.5641583055556,
'Composite:GPSPosition': '41.3867478333333 -70.5641583055556',
'Composite:HyperfocalDistance': 1.25520730117252,
'Composite:ImageSize': '5472 3648',
'Composite:LightValue': 13.2927817492278,
'Composite:Megapixels': 19.961856,
'Composite:ScaleFactor35efl': 2.72727272727273,
'Composite:ShutterSpeed': 0.003125,
'EXIF:ApertureValue': 5.59834346238078,
'EXIF:ColorSpace': 1,
'EXIF:ComponentsConfiguration': '0 3 2 1',
'EXIF:CompressedBitsPerPixel': 3.368460728,
'EXIF:Compression': 6,
'EXIF:Contrast': 0,
'EXIF:CreateDate': '2020:04:11 09:42:31',
'EXIF:CustomRendered': 0,
'EXIF:DateTimeOriginal': '2020:04:11 09:42:31',
'EXIF:DigitalZoomRatio': 'undef',
'EXIF:ExifImageHeight': 3648,
'EXIF:ExifImageWidth': 5472,
'EXIF:ExifVersion': '0230',
'EXIF:ExposureCompensation': -0.34375,
'EXIF:ExposureIndex': 'undef',
'EXIF:ExposureMode': 0,
'EXIF:ExposureProgram': 2,
'EXIF:ExposureTime': 0.003125,
'EXIF:FNumber': 5.6,
'EXIF:FileSource': 3,
'EXIF:Flash': 32,
'EXIF:FlashpixVersion': '0010',
'EXIF:FocalLength': 8.8,
'EXIF:FocalLengthIn35mmFormat': 24,
'EXIF:GPSAltitude': 455.764,
'EXIF:GPSAltitudeRef': 0,
'EXIF:GPSLatitude': 41.3867478333333,
'EXIF:GPSLatitudeRef': 'N',
'EXIF:GPSLongitude': 70.5641583055556,
'EXIF:GPSLongitudeRef': 'W',
'EXIF:GPSVersionID': '2 3 0 0',
'EXIF:GainControl': 0,
'EXIF:ISO': 100,
'EXIF:ImageDescription': 'DCIM\\PANORAMA\\101_0586\\PAN',
'EXIF:InteropIndex': 'R98',
'EXIF:InteropVersion': '0100',
'EXIF:LightSource': 1,
'EXIF:Make': 'DJI',
'EXIF:MaxApertureValue': 2.79917173119039,
'EXIF:MeteringMode': 2,
'EXIF:Model': 'FC6310S',
'EXIF:ModifyDate': '2020:04:11 09:42:31',
'EXIF:Orientation': 1,
'EXIF:ResolutionUnit': 2,
'EXIF:Saturation': 0,
'EXIF:SceneCaptureType': 0,
'EXIF:SceneType': 1,
'EXIF:SerialNumber': '46c76034607b1c0616d67f3643c34a5b',
'EXIF:Sharpness': 0,
'EXIF:ShutterSpeedValue': '0.00312701097912618',
'EXIF:Software': 'v01.08.1719',
'EXIF:SubjectDistance': 0,
'EXIF:SubjectDistanceRange': 0,
'EXIF:ThumbnailImage': '(Binary data 10230 bytes, use -b option to extract)',
'EXIF:ThumbnailLength': 10230,
'EXIF:ThumbnailOffset': 10240,
'EXIF:WhiteBalance': 0,
'EXIF:XPComment': 'Type=N, Mode=P, DE=None',
'EXIF:XPKeywords': 'v01.08.1719;1.3.0;v1.0.0',
'EXIF:XResolution': 72,
'EXIF:YCbCrPositioning': 1,
'EXIF:YResolution': 72,
'ExifTool:ExifToolVersion': 12.05,
'ExifTool:Warning': '[minor] Possibly incorrect maker notes offsets (fix by '
'1783?)',
'File:BitsPerSample': 8,
'File:ColorComponents': 3,
'File:Directory': '.',
'File:EncodingProcess': 0,
'File:ExifByteOrder': 'II',
'File:FileAccessDate': '2020:09:12 11:46:45+01:00',
'File:FileCreateDate': '2020:09:12 11:46:45+01:00',
'File:FileModifyDate': '2020:04:21 09:54:42+01:00',
'File:FileName': 'PANO0001.JPG',
'File:FilePermissions': 666,
'File:FileSize': 8689815,
'File:FileType': 'JPEG',
'File:FileTypeExtension': 'JPG',
'File:ImageHeight': 3648,
'File:ImageWidth': 5472,
'File:MIMEType': 'image/jpeg',
'File:YCbCrSubSampling': '2 1',
'MPF:DependentImage1EntryNumber': 0,
'MPF:DependentImage2EntryNumber': 0,
'MPF:ImageUIDList': '(Binary data 66 bytes, use -b option to extract)',
'MPF:MPFVersion': '0010',
'MPF:MPImageFlags': 8,
'MPF:MPImageFormat': 0,
'MPF:MPImageLength': 255918,
'MPF:MPImageStart': 8433897,
'MPF:MPImageType': 65537,
'MPF:NumberOfImages': 2,
'MPF:PreviewImage': '(Binary data 255918 bytes, use -b option to extract)',
'MPF:TotalFrames': 1,
'MakerNotes:CameraPitch': -90,
'MakerNotes:CameraRoll': 0,
'MakerNotes:CameraYaw': -114.099998474121,
'MakerNotes:Make': 'DJI',
'MakerNotes:Pitch': -10.5,
'MakerNotes:Roll': 5.19999980926514,
'MakerNotes:SpeedX': 0,
'MakerNotes:SpeedY': 0,
'MakerNotes:SpeedZ': 0,
'MakerNotes:Yaw': -114.699996948242,
'SourceFile': 'PANO0001.JPG',
'XMP:About': 'DJI Meta Data',
'XMP:AbsoluteAltitude': '+455.76',
'XMP:AlreadyApplied': False,
'XMP:CalibratedFocalLength': 3666.666504,
'XMP:CalibratedOpticalCenterX': 2736.0,
'XMP:CalibratedOpticalCenterY': 1824.0,
'XMP:CamReverse': 0,
'XMP:CreateDate': '2020:04:11',
'XMP:FlightPitchDegree': -10.5,
'XMP:FlightRollDegree': '+5.20',
'XMP:FlightXSpeed': '+0.00',
'XMP:FlightYSpeed': '+0.00',
'XMP:FlightYawDegree': -114.7,
'XMP:FlightZSpeed': '+0.00',
'XMP:Format': 'image/jpg',
'XMP:GPSLatitude': 41.38674784,
'XMP:GPSLongtitude': -70.56415831,
'XMP:GimbalPitchDegree': -90.0,
'XMP:GimbalReverse': 0,
'XMP:GimbalRollDegree': '+0.00',
'XMP:GimbalYawDegree': -114.1,
'XMP:HasCrop': False,
'XMP:HasSettings': False,
'XMP:Make': 'DJI',
'XMP:Model': 'FC6310S',
'XMP:ModifyDate': '2020:04:11',
'XMP:RelativeAltitude': '+69.60',
'XMP:RtkFlag': 0,
'XMP:SelfData': 'Undefined',
'XMP:Version': 7.0
Any suggestions on:
1. How to edit an image and keep the same metadata?
2. How to edit a specific tag from the original image?
Regarding 2) - Just use PIL.Image for that:
from PIL import Image

image = Image.open('test.JPG')
TAG_ID = 271  # EXIF tag 271 is 'Make'
if hasattr(image, '_getexif'):  # only present in JPEGs
    exif = image._getexif()  # returns None if no EXIF data
    if exif is not None:
        try:
            orientation = exif[TAG_ID]
        except KeyError:
            pass
        else:
            exif[TAG_ID] = 'Canon'
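Regarding 1), one option (not part of the answer above, just a sketch) is to let exiftool itself copy all the metadata from the original onto the file Pillow wrote, using its -TagsFromFile option. This assumes the exiftool binary is on your PATH and reuses the file names from the question:

import subprocess

# Copy every tag group from the original image onto the edited output;
# -overwrite_original avoids leaving an output.jpg_original backup behind
subprocess.run(
    ["exiftool", "-TagsFromFile", "test.JPG", "-all:all",
     "-overwrite_original", "output.jpg"],
    check=True,
)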
I'm trying to create a histogram for each image in a folder and save the plot values of each into a CSV file. The user would enter the folder the images are saved in, and then the files would get created and named accordingly.
files = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]  # loop to get all files from folder
for x in files:
    image = "x + 1"
    img2 = cv2.imread('similarImages/' + directory + '/' + image + '.png', cv2.IMREAD_COLOR)
    histSim = cv2.calcHist([img2], [1], None, [256], [0, 256])  # create histo of each image
    np.savetxt('Test/similarImage' + x + '.csv', histSim, delimiter=',')  # save plots to csv
From my previous knowledge of Python, I've theory-crafted the above code, but in classic fashion, it doesn't work (shocker).
Am I going along the right lines? If not, could I get a nudge in the right direction, and if so, why doesn't it work?
It's been a while since I took on something like this, and as such I am a little rusty. Many thanks, Ben
You can use pandas for this task. If you want to store all the histograms in a single CSV file, you can use a list and append all the histogram values to it like this:
df = []
df.append(cv2.calcHist([img], [1], None, [256], [0, 256])[:, 0])  # check if you want it like this or transposed
Then convert it to a DataFrame using pd.DataFrame and store it as a CSV file using df.to_csv.
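Putting the pieces together, a sketch of that single-file approach (it reuses directory from the question and assumes the images are named 1.png through 10.png):

import cv2
import pandas as pd

rows = []
for i in range(1, 11):
    img = cv2.imread('similarImages/' + directory + '/' + str(i) + '.png', cv2.IMREAD_COLOR)
    # One 256-bin green-channel histogram per image, flattened to one row
    rows.append(cv2.calcHist([img], [1], None, [256], [0, 256])[:, 0])

pd.DataFrame(rows).to_csv('Test/histograms.csv', index=False)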
If you want to save each histogram to its own CSV file, then you can:
histSim = pd.DataFrame(cv2.calcHist([img], [1], None, [256], [0, 256]))
histSim.to_csv('histogram.csv', index=False)
The issue I was having was that the image variable was a string, so when I was adding 1 to it, it was concatenating rather than adding the values. Using integers, and then converting to a string when I needed it in the file path, worked:
image = 0
files = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # loop to get all files from folder
for x in files:
    image = x + 1
    image = str(image)
    img2 = cv2.imread('similarImages/' + directory + '/' + image + '.png', cv2.IMREAD_COLOR)
    histSim = pd.DataFrame(cv2.calcHist([img2], [1], None, [256], [0, 256]))  # create histo of each image
    histSim.to_csv('Test/histogram' + image + '.csv', index=False)
I keep running into the error "string indices must be integers" when using placeholders in my script.
My program is supposed to track the growth of veggies purely by calculation. The idea is that each plant has its own characteristics (e.g. carrotData), but instead of having code for each plantData, I replaced the code with whichPlant (and later whichPlantData) as a temporary placeholder, so that I don't need new code for each plant I have in my garden or want to add at a later point.
This is where I get the error, in the last line (Plant is a class), marked with the ***. When I use carrotData instead of whichPlantData my script works. But as soon as I put in the temporary placeholder whichPlantData it breaks.
What causes this (so that I can avoid doing this in future projects) and how can I fix this?
thanks for the support!!
carrotData = {'plantID': '','plantingTime': dt(year=now.year, month=3, day=1), "dateOfPlanting": 0, "numberOfPlants": 0, "germinationTime": 7, "growthTime": 227, "flowerTime": 247, "harvestTime": 254, "liveCycles": 1, "status": 0}
potatoData = {'plantID': '','plantingTime': dt(year=now.year, month=3, day=1), "dateOfPlanting": 0, "numberOfPlants": 0, "germinationTime": 7, "growthTime": 227, "flowerTime": 247, "harvestTime": 254, "liveCycles": 1, "status": 0}
print ("imported plant datasheets")
#functions:
#if plant is added
def addPlant():
    whichPlant = input("Which plant do you want to add? ")
    n = int(input("How many plants do you want to add? "))
    i = 0
    whichPlantData = whichPlant + "Data"
    if whichPlant in inventory:
        while i < n:
            i += 1
            if whichPlant in plants:
                plants[whichPlant] += 1
            else:
                plants[whichPlant] = 1
        # ***Error*** on the next line
        whichPlant = Plant("", whichPlantData['plantingTime'], dt.now(), n, dt.now() + timedelta(days=whichPlantData['germinationTime']), dt.now() + timedelta(days=whichPlantData['growthTime']), dt.now() + timedelta(days=whichPlantData['flowerTime']), whichPlantData['harvestTime'], whichPlantData['liveCycles'], whichPlantData['status'])
Your problem seems to be with whichPlantData = whichPlant + "Data". whichPlant is a string returned by the input function. I think what you're trying to do is get a dictionary of plant information based on input from the user, and whichPlant + "Data" looks like an attempt to turn whichPlant into the name of a variable pointing to such a dictionary. But just because the string whichPlant + "Data" happens to spell out the variable name carrotData does not make it that variable: it is still just a string, and indexing a string with 'plantingTime' is exactly what raises "string indices must be integers". I would suggest making a list of dictionaries with the information about each plant, then iterating over that list to see if a dictionary's Name key matches the user input.
Similar to this:
plants = [{"Name": "Carrot", 'plantID': '', 'plantingTime': 0, "dateOfPlanting": 0, "numberOfPlants": 0, "germinationTime": 7, "growthTime": 227, "flowerTime": 247, "harvestTime": 254, "liveCycles": 1, "status": 0},
          {"Name": "Potato", 'plantID': '', 'plantingTime': 0, "dateOfPlanting": 0, "numberOfPlants": 0, "germinationTime": 7, "growthTime": 227, "flowerTime": 247, "harvestTime": 254, "liveCycles": 1, "status": 0}]

PlantName = input("Enter a Plant: ")
for plant in plants:
    if plant['Name'] == PlantName:
        print("{}'s germinationTime is {}".format(PlantName, plant["germinationTime"]))
        # DO SOMETHING
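An alternative sketch that keeps the per-plant dicts from the question: map each plant name to its data dict and look it up by the user's input, so whichPlantData ends up being an actual dictionary rather than a string:

# carrotData and potatoData are the dicts defined in the question
plantData = {"carrot": carrotData, "potato": potatoData}

whichPlant = input("Which plant do you want to add? ")
whichPlantData = plantData[whichPlant]  # a dict, so ['plantingTime'] etc. now work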