Automatically downloading .csv in Selenium - python-3.x

I have the following code:
element_search_field = browser.find_element_by_id(search_field_id)
op.set_preference("browser.download.folderList",2)
op.set_preference("browser.download.manager.showWhenStarting", False)
op.set_preference("browser.download.dir","C:\\Users\\user\\Selenium")
op.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/octet-stream,application/vnd.csv")
downloadcsv = browser.find_element_by_css_selector('#downloadOCTable')
downloadcsv.click();
I am having trouble at the last line, downloadcsv.click();. I was hoping that changing "application/octet-stream,application/vnd.ms-excel" to "application/octet-stream,application/vnd.csv" would automatically save the .csv file to the downloads folder, but I still get the dialog box. Is there any way to have the file saved without the dialog box?
Edit:
As suggested by @Prophet:
I made these changes, but I still get the pop-up:
profile = FirefoxProfile()
profile.set_preference("general.useragent.override", userAgent)
profile.set_preference("browser.download.dir", "C:\\Users\\[user]\\Selenium Options")
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/csv,application/excel,application/vnd.ms-excel,application/vnd.msexcel,text/anytext,text/comma-separated-values,text/csv,text/plain,text/x-csv,application/x-csv,text/x-comma-separated-values,text/tab-separated-values,data:text/csv")
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/xml,text/plain,text/xml,image/jpeg,application/octet-stream,data:text/csv")
profile.set_preference("browser.download.folderList",2)
profile.set_preference("browser.download.manager.showWhenStarting",False)
profile.set_preference("browser.helperApps.neverAsk.openFile","application/csv,application/excel,application/vnd.ms-excel,application/vnd.msexcel,text/anytext,text/comma-separated-values,text/csv,text/plain,text/x-csv,application/x-csv,text/x-comma-separated-values,text/tab-separated-values,data:text/csv")
profile.set_preference("browser.helperApps.neverAsk.openFile","application/xml,text/plain,text/xml,image/jpeg,application/octet-stream,data:text/csv")
profile.set_preference("browser.helperApps.alwaysAsk.force", False)
profile.set_preference("browser.download.useDownloadDir", True)
profile.set_preference("dom.file.createInChild", True)

There are several MIME types in use for CSV files, and we can't know which one the specific site you are working with sends.
I have all these preferences set, and so far this has worked for me in all cases:
op.set_preference("browser.download.dir", downloadsPath)
op.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/csv,application/excel,application/vnd.ms-excel,application/vnd.msexcel,text/anytext,text/comma-separated-values,text/csv,text/plain,text/x-csv,application/x-csv,text/x-comma-separated-values,text/tab-separated-values,data:text/csv")
op.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/xml,text/plain,text/xml,image/jpeg,application/octet-stream,data:text/csv")
op.set_preference("browser.download.folderList",2)
op.set_preference("browser.download.manager.showWhenStarting",False)
op.set_preference("browser.helperApps.neverAsk.openFile","application/csv,application/excel,application/vnd.ms-excel,application/vnd.msexcel,text/anytext,text/comma-separated-values,text/csv,text/plain,text/x-csv,application/x-csv,text/x-comma-separated-values,text/tab-separated-values,data:text/csv")
op.set_preference("browser.helperApps.neverAsk.openFile","application/xml,text/plain,text/xml,image/jpeg,application/octet-stream,data:text/csv")
op.set_preference("browser.helperApps.alwaysAsk.force", False)
op.set_preference("browser.download.useDownloadDir", True)
op.set_preference("dom.file.createInChild", True)
Here, downloadsPath is the path to the Downloads folder.
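One thing to watch in the snippet above: a Firefox preference holds a single value, so the second set_preference call for the same key replaces the first one rather than adding to it. A minimal sketch of merging the lists into one value and setting it once. merge_mime_lists is a hypothetical helper name, the two short lists are shortened examples, and op is assumed to be the same options object as above.

```python
def merge_mime_lists(*lists):
    """Join comma-separated MIME lists into one string, dropping duplicates."""
    seen = []
    for mime_list in lists:
        for mime in mime_list.split(","):
            mime = mime.strip()
            if mime and mime not in seen:
                seen.append(mime)
    return ",".join(seen)


# Shortened example lists (assumption: use your own full lists here).
csv_types = "application/csv,text/csv,text/plain,application/vnd.ms-excel"
other_types = "application/xml,text/plain,application/octet-stream"

never_ask = merge_mime_lists(csv_types, other_types)
print(never_ask)

# With a real Firefox options/profile object (assumed to be named op):
# op.set_preference("browser.helperApps.neverAsk.saveToDisk", never_ask)
```

The same merged string can be reused for browser.helperApps.neverAsk.openFile.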

You should use text/csv as the MIME type for .csv files; see RFC 4180.
The following types may also help:
application/csv
application/x-csv
text/csv
text/comma-separated-values
text/x-comma-separated-values
text/tab-separated-values
text/plain
text/x-csv
So you can try it as below:
op.set_preference("browser.helperApps.neverAsk.saveToDisk","text/csv,application/octet-stream")
There is also a good discussion of this topic that is worth checking.
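As far as I know, Firefox compares this preference with the Content-Type that the server sends for the download, so it can help to check that header (for example in the browser's network tools) against your list. A small sketch using plain string handling; the header values are made-up examples, and media_type and is_covered are hypothetical helper names.

```python
def media_type(content_type_header):
    """Return the bare MIME type from a Content-Type header value."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_covered(content_type_header, never_ask_value):
    """True if the header's MIME type appears in the comma-separated list."""
    allowed = {m.strip().lower() for m in never_ask_value.split(",") if m.strip()}
    return media_type(content_type_header) in allowed


never_ask = "text/csv,application/octet-stream"
print(is_covered("text/csv; charset=utf-8", never_ask))   # True
print(is_covered("application/vnd.ms-excel", never_ask))  # False
```

If the check comes back False for the type the site really sends, adding that type to the preference is the first thing to try.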

Related

How to change the location?

I'm trying to automate image uploading to Instagram using Selenium in Python. I can get as far as opening the file dialog, but I am not able to change the directory to where the image is located. AutoIt returns an error saying that ToolbarWindow32 can't be detected.
My code:
ActionChains(browser).move_to_element(browser.find_element_by_xpath(
    "/html/body/div[8]/div[2]/div/div/div/div[2]/div[1]/div/div/div[2]/div/button")).click().perform()
handle = "[CLASS:#32770; TITLE:Open]"
autoit.win_wait(handle, 60)
autoit.control_set_text(handle, "ToolbarWindow32", photopath)  # This line gives me the error
autoit.control_set_text(handle, "Edit1", photopath)
autoit.control_click(handle, "Button1")
Take a look at how this is done in _WD_SelectFiles: https://github.com/Danp2/au3WebDriver/blob/master/wd_helper.au3
You should be able to do the same directly with Python and Selenium, without using AutoIt.
Also take a look at:
https://github.com/Danp2/au3WebDriver/blob/master/wd_demo.au3
It contains an example of how to do the same directly in AutoIt with the WebDriver UDF, without opening any FileOpenDialog:
Func DemoUpload()
    ; REMARK This example uses PNG files created in DemoWindows
    ; navigate to "file storing" website
    _WD_Navigate($sSession, "https://www.htmlquick.com/reference/tags/input-file.html")
    ; select single file
    _WD_SelectFiles($sSession, $_WD_LOCATOR_ByXPath, "//section[@id='examples']//input[@name='uploadedfile']", @ScriptDir & "\Screen1.png")
    ; select two files at once
    _WD_SelectFiles($sSession, $_WD_LOCATOR_ByXPath, "//p[contains(text(),'Upload files:')]//input[@name='uploadedfiles[]']", @ScriptDir & "\Screen1.png" & @LF & @ScriptDir & "\Screen2.png")
    ; accept/start uploading
    Local $sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//p[contains(text(),'Upload files:')]//input[2]")
    _WD_ElementAction($sSession, $sElement, 'click')
EndFunc ;==>DemoUpload
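On the Python side, the usual way to do this without AutoIt is to send the file path straight to the page's file input element, so the OS dialog never opens. A minimal sketch under some assumptions: upload_file is a hypothetical helper, the CSS selector is a guess rather than something taken from Instagram's markup, and browser and photopath are the names from the question.

```python
import os


def upload_file(driver, css_selector, file_path):
    """Send a local file path to an <input type="file"> element."""
    # "css selector" is the string value behind selenium's By.CSS_SELECTOR.
    file_input = driver.find_element("css selector", css_selector)
    # send_keys on a file input sets the file without opening the OS dialog.
    absolute_path = os.path.abspath(file_path)
    file_input.send_keys(absolute_path)
    return absolute_path


# Example call, assuming `browser` is a running selenium WebDriver and the
# page contains a file input matching this (guessed) selector:
# upload_file(browser, "input[type='file']", photopath)
```

The file input is often hidden behind a styled button, so it may need to be located directly rather than by clicking the visible button first.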

Why are the changes I am making within my code not writing to the new file I have created?

I am currently working with several XML files that require the text of the element mods:namePart changed. I have created a script that should loop through all the XML files I have specified in a particular directory and make the intended changes. However, when I run the script the changes are not reflected in the new files. It executes as expected, and I even get the "namepart changed" output in my console, but the text I want to replace remains the same. PLEASE HELP!! I am extremely new to coding so any tips/comments are welcome. Here is the code I'm using:
list_of_files = glob.glob('/Users/#####/Documents/test_xml_files/*.xml')
for file in list_of_files:
    xmlObject = ET.parse(file)
    root = xmlObject.getroot()
    namespaces = {'mods':'http://www.loc.gov/mods/v3'}
    for namePart in root.iterfind('mods:name/mods:namePart', namespaces):
        if namePart.text == 'Tsukioka, Kōgyo, 1869-1927':
            new_namePart = namePart.text.replace('Tsukioka, Kōgyo, 1869-1927', 'Tsukioka Kōgyo, 1869-1927', 1)
            namePart.text == new_namePart
            print('namepart changed')
        else:
            continue
    nf = open(os.path.join('/Users/####/Documents/updated_test_directory', os.path.basename(file)), 'wb')
    xmlString = ET.tostring(root, encoding="utf-8", method="xml", xml_declaration=None)
    nf.write(xmlString)
    nf.close()
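A likely cause is the statement namePart.text == new_namePart: the == operator only compares the two values and throws the result away, so the element is never modified even though the print runs. Assignment needs a single =. A self-contained sketch of the fix on an in-memory document; the sample XML is invented for illustration and the variable names are not from the original script.

```python
import xml.etree.ElementTree as ET

# Invented sample document standing in for one of the MODS files.
SAMPLE = (
    '<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">'
    "<mods:name><mods:namePart>Tsukioka, Kōgyo, 1869-1927</mods:namePart></mods:name>"
    "</mods:mods>"
)

namespaces = {"mods": "http://www.loc.gov/mods/v3"}
old_name = "Tsukioka, Kōgyo, 1869-1927"
new_name = "Tsukioka Kōgyo, 1869-1927"

root = ET.fromstring(SAMPLE)
for name_part in root.iterfind("mods:name/mods:namePart", namespaces):
    if name_part.text == old_name:
        # Single "=" assigns; "==" would only compare and change nothing.
        name_part.text = new_name

updated = ET.tostring(root, encoding="unicode")
print(new_name in updated)  # True
```

The rest of the loop (building the output path and writing the bytes) can stay as it is.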

Flask file download: filename persists within the session

I have a flask site and a webform that generates excel files. The problem I'm having is, if I send the user back to the form to submit again, the previous file -- with same file name and data -- is downloaded even though new files are generated in the tmp directory. So, I think this has to do with my session variable.
I add a timestamp to the file name with this function to make sure the file names are unique:
def rightnow():
    return dt.datetime.now().strftime("%m%d%y%h%m%S%f")
In routes.py, here is the call for the download:
@app.route('/download/', methods=['POST','GET'])
def download_file():
    output_file = session.get('new_file', None)
    r = send_file(output_file, attachment_filename=output_file, as_attachment=True)
    return r
This is the code for the script that generates the excel files:
new_file = 'output_' + rightnow() + '.xlsx'
writer = pd.ExcelWriter('tmp/' + new_file, engine='xlsxwriter')
df.to_excel(writer, sheet_name="data")
writer.save()
session['new_file'] = 'tmp/' + new_file
The download statement from the template page:
<a class="button" href="{{url_for('download_file')}}">Download new data</a>
I have a "Submit Again" button tied to simple javascript
<button onclick="goBack()">Submit Again</button>
<script>//for "revise search" button
function goBack() {
    window.history.back();
}
</script>
I have played around with session.clear() with no success.
How can I drop the session when the user clicks the "Submit Again" button so the saved file name is dropped?
EDIT: I checked the variables for the filename and the session variable and they are identical, and different from the file name assigned on download. For instance, the file is named 'output_May0554733504.xlsx' by the script that I wrote -- I can see it in the tmp directory. But when I go to download the file, the filename is different: 'output_May0536794357.xlsx'.
This other file name is not that of a different file in the tmp directory. Any file I download will be 'output_May0536794357.xlsx'.
If session.pop('new_file') doesn't work, you could try session.modified = True to force the change to the session.
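Separately, the format string in rightnow() may not produce what was intended: %m is the month (it appears twice), and %h is not a documented Python directive. On many platforms %h yields the abbreviated month name, which would be consistent with the "May" in the file names above. A sketch assuming hour and minute were meant, using %H and %M; the optional now parameter is added only to make the example reproducible.

```python
import datetime as dt


def rightnow(now=None):
    """Timestamp as month, day, year, hour, minute, second, microsecond."""
    now = now or dt.datetime.now()
    # %H is the 24-hour hour and %M the minute; %m is the month.
    return now.strftime("%m%d%y%H%M%S%f")


print(rightnow(dt.datetime(2021, 5, 5, 14, 7, 33, 504000)))
# prints 050521140733504000
```

This does not by itself explain the stale download, but it makes the generated names easier to compare while debugging.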

PyQt QFileDialog default suffix

I have looked through a bunch of code, but this piece of code doesn't work as expected for me:
export_dialog = QtGui.QFileDialog()
export_dialog.setWindowTitle('Export')
export_dialog.setDirectory(EXPORT_DIR)
export_dialog.setAcceptMode(QtGui.QFileDialog.AcceptSave)
export_dialog.setNameFilter('INI files (*.ini)')
export_dialog.setDefaultSuffix('ini')
export_file, _ = export_dialog.getSaveFileName()
print(export_file)
I'm saving my file without an extension, counting on the configuration above to add it, but it doesn't work: no extension is added.
Any suggestions?
Thanks
export_dialog = QtGui.QFileDialog()
export_dialog.setWindowTitle('Export')
export_dialog.setDirectory(EXPORT_DIR)
export_dialog.setAcceptMode(QtGui.QFileDialog.AcceptSave)
export_dialog.setNameFilter('INI files (*.ini)')
export_dialog.setDefaultSuffix('ini')
if export_dialog.exec_() == QtGui.QFileDialog.Accepted:
    print(export_dialog.selectedFiles()[0])
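getSaveFileName() is a static convenience function that builds its own dialog, so the settings made on export_dialog are not applied to it. Running the configured instance with exec_() and reading selectedFiles() returns the full file path, with the default suffix appended.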
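If you would rather keep the getSaveFileName call, another option is to add the extension yourself after the dialog returns. A minimal sketch with a hypothetical helper, ensure_suffix, that leaves names that already have an extension untouched:

```python
import os


def ensure_suffix(path, suffix="ini"):
    """Append .suffix when the file name has no extension of its own."""
    if not path:
        return path  # dialog was cancelled, nothing to fix
    _, ext = os.path.splitext(path)
    return path if ext else f"{path}.{suffix}"


print(ensure_suffix("settings"))      # settings.ini
print(ensure_suffix("settings.cfg"))  # settings.cfg
```

In the question's code this would be applied to export_file before using it.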

Corrupted Excel File & 7zip

I have a problem with a corrupted excel file. So far I have used 7zip to open it as an archive and extract most of the data. But some important sheets cannot be extracted.
Using the l command of 7zip I get the following output :
7z.exe l -slt "C:\Users\corrupted1.xlsm" xl/worksheets/sheet3.xml
Output:
Listing archive: C:\Users\corrupted1.xlsm
--
Path = C:\Users\corrupted1.xlsm
Type = zip
Physical Size = 11931916
----------
Path = xl\worksheets\sheet3.xml
Folder = -
Size = 57217
Packed Size = 12375
Modified = 1980-01-01 00:00:00
Created =
Accessed =
Attributes = .....
Encrypted = -
Comment =
CRC = 553C3C52
Method = Deflate
Host OS = FAT
Version = 20
However when trying to extract it (or test it for that matter) I get :
7z.exe t -slt "C:\Users\corrupted1.xlsm" xl/worksheets/sheet3.xml
Output:
Processing archive: C:\Users\corrupted1.xlsm
Testing xl\worksheets\sheet3.xml Unsupported Method
Sub items Errors: 1
The method listed above says Deflate, which is the same for all the worksheets.
Is there anything I can do? What kind of corruption is this? Is it the CRC? Can I ignore it somehow or something?
Please help!
Edit:
The following is the error when trying to extract or edit the xml file through 7zip:
Edit 2:
Tried with WinZip as well, getting :
Extracting to "C:\Users\axpavl\AppData\Local\Temp\wzf0b9\"
Use Path: yes Overlay Files: yes
Extracting xl\worksheets\sheet2.xml
Unable to find the local header for xl\worksheets\sheet2.xml.
Severe Error: Cannot find a local header.
This might help:
https://superuser.com/questions/145479/excel-edit-the-xml-inside-an-xlsx-file
and this one too: http://www.techrepublic.com/blog/tr-dojo/recover-data-from-a-damaged-office-file-with-the-help-of-7-zip/
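Since an .xlsm file is a zip archive, Python's standard zipfile module can also be used to see which members still decompress cleanly, so at least those can be salvaged. A rough sketch: readable_members is a hypothetical helper, and the demonstration builds a small archive in memory rather than using the corrupted workbook.

```python
import io
import zipfile


def readable_members(archive):
    """Map each member name to True if it decompresses cleanly, else False."""
    results = {}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            try:
                zf.read(info.filename)  # decompresses and verifies the CRC
                results[info.filename] = True
            except Exception:
                # Bad local header, CRC mismatch, unsupported method, etc.
                results[info.filename] = False
    return results


# Demonstration on an in-memory archive standing in for the workbook.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
    zf.writestr("xl/worksheets/sheet2.xml", "<worksheet/>")

print(readable_members(buffer))
```

With a real file you would pass its path instead of the buffer. If the central directory itself is damaged, opening the archive may fail before any member is checked.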
