This code works, but it's returning directory names and filenames. I haven't found a parameter that tells it to return only files or only directories.
Can glob.glob do this, or do I have to call os.something to test if I have a directory or file. In my case, my files all end with .csv, but I would like to know for more general knowledge as well.
In the loop, I'm reading each file, so currently bombing when it tries to open a directory name as a filename.
files = sorted(glob.glob(input_watch_directory + "/**", recursive=True))
for loop_full_filename in files:
print(loop_full_filename)
Results:
c:\Demo\WatchDir\
c:\Demo\WatchDir\2202
c:\Demo\WatchDir\2202\07
c:\Demo\WatchDir\2202\07\01
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_51.csv
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_52.csv
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_53.csv
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_54.csv
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_55.csv
c:\Demo\WatchDir\2202\07\05
c:\Demo\WatchDir\2202\07\05\polygonData_2022_07_05__12_00.csv
c:\Demo\WatchDir\2202\07\05\polygonData_2022_07_05__12_01.csv
Results needed:
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_51.csv
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_52.csv
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_53.csv
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_54.csv
c:\Demo\WatchDir\2202\07\01\polygonData_2022_07_01__15_55.csv
c:\Demo\WatchDir\2202\07\05\polygonData_2022_07_05__12_00.csv
c:\Demo\WatchDir\2202\07\05\polygonData_2022_07_05__12_01.csv
For this specific program, I can just check if the file name contains.csv, but I would like to know in general for future reference.
Line:
files = sorted(glob.glob(input_watch_directory + "/**", recursive=True))
replace with the line:
files = sorted(glob.glob(input_watch_directory + "/**/*.*", recursive=True))
I have the following code:
element_search_field = browser.find_element_by_id(search_field_id)
op.set_preference("browser.download.folderList",2)
op.set_preference("browser.download.manager.showWhenStarting", False)
op.set_preference("browser.download.dir","C:\\Users\\user\\Selenium")
op.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/octet-stream,application/vnd.csv")
downloadcsv = browser.find_element_by_css_selector('#downloadOCTable')
downloadcsv.click();
I am having trouble at the last line downloadcsv.click();. I was hoping that changing "application/octet-stream,application/vnd.csv") from "application/octet-stream,application/vnd.ms-excel") would automatically save the .csv file to the downloads folder but i am still get the DialogBox. Is there anyway I can have it saved without the DiaglogBox?
Edit:
As suggested by #Prophet:
I had made these changes but i am still get the pop up
profile = FirefoxProfile()
profile.set_preference("general.useragent.override", userAgent)
profile.set_preference("browser.download.dir", "C:\\Users\\[user]\\Selenium Options")
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/csv,application/excel,application/vnd.ms-excel,application/vnd.msexcel,text/anytext,text/comma-separated-values,text/csv,text/plain,text/x-csv,application/x-csv,text/x-comma-separated-values,text/tab-separated-values,data:text/csv")
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/xml,text/plain,text/xml,image/jpeg,application/octet-stream,data:text/csv")
profile.set_preference("browser.download.folderList",2)
profile.set_preference("browser.download.manager.showWhenStarting",False)
profile.set_preference("browser.helperApps.neverAsk.openFile","application/csv,application/excel,application/vnd.ms-excel,application/vnd.msexcel,text/anytext,text/comma-separated-values,text/csv,text/plain,text/x-csv,application/x-csv,text/x-comma-separated-values,text/tab-separated-values,data:text/csv")
profile.set_preference("browser.helperApps.neverAsk.openFile","application/xml,text/plain,text/xml,image/jpeg,application/octet-stream,data:text/csv")
profile.set_preference("browser.helperApps.alwaysAsk.force", False)
profile.set_preference("browser.download.useDownloadDir", True)
profile.set_preference("dom.file.createInChild", True)
There are several kinds of csv applications. We can't know what will work for the specific site you are working with.
I have all these preferences set and so far it works for me in all the cases
op.set_preference("browser.download.dir", downloadsPath)
op.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/csv,application/excel,application/vnd.ms-excel,application/vnd.msexcel,text/anytext,text/comma-separated-values,text/csv,text/plain,text/x-csv,application/x-csv,text/x-comma-separated-values,text/tab-separated-values,data:text/csv")
op.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/xml,text/plain,text/xml,image/jpeg,application/octet-stream,data:text/csv")
op.set_preference("browser.download.folderList",2)
op.set_preference("browser.download.manager.showWhenStarting",False)
op.set_preference("browser.helperApps.neverAsk.openFile","application/csv,application/excel,application/vnd.ms-excel,application/vnd.msexcel,text/anytext,text/comma-separated-values,text/csv,text/plain,text/x-csv,application/x-csv,text/x-comma-separated-values,text/tab-separated-values,data:text/csv")
op.set_preference("browser.helperApps.neverAsk.openFile","application/xml,text/plain,text/xml,image/jpeg,application/octet-stream,data:text/csv")
op.set_preference("browser.helperApps.alwaysAsk.force", False)
op.set_preference("browser.download.useDownloadDir", True)
op.set_preference("dom.file.createInChild", True)
downloadsPath here is a path to Downloads folder
You should use the MIME type for .csv as text/csv, you can check rfc4180
The below following type could help you
application/csv
application/x-csv
text/csv
text/comma-separated-values
text/x-comma-separated-values
text/tab-separated-values
text/plain
text/x-csv
So you can check it as below
op.set_preference("browser.helperApps.neverAsk.saveToDisk","text/csv,application/octet-stream")
here is a great discussion happens you can check it
I am currently working with several XML files that require the text of the element mods:namePart changed. I have created a script that should loop through all the XML files I have specified in a particular directory and make the intended changes. However, when I run the script the changes are not reflected in the new files. It executes as expected, and I even get the "namepart changed" output in my console, but the text I want to replace remains the same. PLEASE HELP!! I am extremely new to coding so any tips/comments are welcome. Here is the code I'm using:
list_of_files = glob.glob('/Users/#####/Documents/test_xml_files/*.xml')
for file in list_of_files: xmlObject = ET.parse(file)
root = xmlObject.getroot()
namespaces = {'mods':'http://www.loc.gov/mods/v3'}
for namePart in root.iterfind('mods:name/mods:namePart', namespaces):
if namePart.text == 'Tsukioka, Kōgyo, 1869-1927':
new_namePart = namePart.text.replace('Tsukioka, Kōgyo, 1869-1927', 'Tsukioka Kōgyo, 1869-1927', 1)
namePart.text == new_namePart
print('namepart changed')
else:
continue
nf = open(os.path.join('/Users/####/Documents/updated_test_directory', os.path.basename(file)), 'wb')
xmlString = ET.tostring(root, encoding="utf-8", method="xml", xml_declaration=None)
nf.write(xmlString)
nf.close()
I am running in a very, strange case... Yesterday I wrote a little script. It's goal is to check one condition in a file based on another file. It worked as intended. But since this morning, it doesn't. I haven't changed anything to the best of my knowledge. the code doesn't throw any error. I think the culprit is glob.glob
for file in glob.glob('*private.vcf.gz'):
seen = False
vcf = VCF(file)
print("test")
if not (file == "controlH.g.vcf.gz" or file == "output.g.vcf.gz"):
sample_name = file.split('.')[0]
out = "{}.FalsePositiveRefCallPurged.vcf".format(sample_name)
w = Writer(out, vcf)
for v in vcf:
seen = False
ref = VCF('output.g.vcf.gz')
for r in ref:
if seen :
break
if not seen:
if v.CHROM == r.CHROM:
if v.start == r.start or v.start > r.start and v.start < r.end:
if r.FILTER == "RefCall":
continue#print(str(v))
else:
w.write_record(v)
seen = True
w.close()
Indeed, when simply running
glob.glob('.*private.vcf.gz')
I get an empty list.
Here is the output of bash
ls *.private.vcf.gz
D2A1.private.vcf.gz D3A1.private.vcf.gz D5B3.private.vcf.gz H2C3.private.vcf.gz H4C2.private.vcf.gz
D2B3.private.vcf.gz D4A3.private.vcf.gz H2A3.private.vcf.gz H4A4.private.vcf.gz H5A3.private.vcf.gz
So I am sure the files are there ... I really don't understand why suddenly glob.glob has troubles finding them.
Any help would be greatly appreciated, thanks
Ok, it appearst that a direactory
/.ipynb_checkpoints
was created and my notebook using it as its default directory...
I have a problem with a corrupted excel file. So far I have used 7zip to open it as an archive and extract most of the data. But some important sheets cannot be extracted.
Using the l command of 7zip I get the following output :
7z.exe l -slt "C:\Users\corrupted1.xlsm" xl/worksheets/sheet3.xml
Output:
Listing archive: C:\Users\corrupted1.xlsm
--
Path = C:\Users\corrupted1.xlsm
Type = zip
Physical Size = 11931916
----------
Path = xl\worksheets\sheet3.xml
Folder = -
Size = 57217
Packed Size = 12375
Modified = 1980-01-01 00:00:00
Created =
Accessed =
Attributes = .....
Encrypted = -
Comment =
CRC = 553C3C52
Method = Deflate
Host OS = FAT
Version = 20
However when trying to extract it (or test it for that matter) I get :
7z.exe t -slt "C:\Users\corrupted1.xlsm" xl/worksheets/sheet3.xml
Output:
Processing archive: C:\Users\corrupted1.xlsm
Testing xl\worksheets\sheet3.xml Unsupported Method
Sub items Errors: 1
The method listed above says Deflate, which is the same for all the worksheets.
Is there anything I can do? What kind of corruption is this? Is it the CRC? Can I ignore it somehow or something?
Please help!
Edit:
The following is the error when trying to extract or edit the xml file through 7zip:
Edit 2:
Tried with WinZip as well, getting :
Extracting to "C:\Users\axpavl\AppData\Local\Temp\wzf0b9\"
Use Path: yes Overlay Files: yes
Extracting xl\worksheets\sheet2.xml
Unable to find the local header for xl\worksheets\sheet2.xml.
Severe Error: Cannot find a local header.
This might help:
https://superuser.com/questions/145479/excel-edit-the-xml-inside-an-xlsx-file
and this on too: http://www.techrepublic.com/blog/tr-dojo/recover-data-from-a-damaged-office-file-with-the-help-of-7-zip/