How to import a module more than once - python-3.x

I'm creating a script that finds movies on more than one site and plays the movie with subtitles. server.py holds all the information on how to find the websites that host movies, and it contains more than one function. In the same directory as server.py I created a folder that holds several website.py files; each of them contains the rules for locating the movie file on a particular movie website. My problem is that I'm importing functions from server.py into these files, for example:
import server
server.org_link
When I import the same function into the second file, I get an error (AttributeError: module 'server' has no attribute 'org_link').
When I remove the import from the second file and run server.py, everything works normally.
I can't find out what the problem is.
import os
import subprocess
import server  # I'm importing this in the second file
from selenium import webdriver as wb
from selenium.webdriver.firefox.options import Options
# Set option headless to use with Firefox
option = Options()
option.headless = True
# Set the browser WebDriver Firefox
browser = wb.Firefox(options=option)
with browser as driver:
    driver.get(server.org_link)
    element = driver.find_element_by_id('DtsBlkVFQx').get_attribute('innerHTML')
    movie_link = server.hosted_server + '/stream/' + element
    if os.name != 'nt':
        vlc = subprocess.Popen([os.path.join("vlc"), os.path.join(movie_link)])
    else:
        vlc = subprocess.Popen([os.path.join("C:/", "Program Files(x86)", "VideoLAN", "VLC", "vlc.exe"), os.path.join(movie_link)])

Call the function with parentheses:
server.org_link()
instead of referencing it without them:
server.org_link
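A separate possibility worth checking, which the question does not confirm: if server.py itself imports the website files (directly or indirectly), the two modules import each other, and a website file can run before server.py has defined org_link. The sketch below reproduces that kind of failure with two throwaway modules written to a temporary directory. The names demo_server and demo_site are made up for the illustration and are not part of the asker's project.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def reproduce_circular_import():
    """Return the AttributeError message raised by a circular import, or None."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical "server" module: it imports the site module
        # BEFORE it defines org_link.
        Path(tmp, "demo_server.py").write_text(
            "import demo_site\n"
            "org_link = 'http://example.com/movie'\n"
        )
        # Hypothetical "website" module: it reads server.org_link at import time.
        Path(tmp, "demo_site.py").write_text(
            "import demo_server\n"
            "link = demo_server.org_link\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_server")
        except AttributeError as exc:
            # demo_site saw a half-initialised demo_server without org_link.
            return str(exc)
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("demo_server", None)
            sys.modules.pop("demo_site", None)
    return None


message = reproduce_circular_import()
print(message)
```

If this is what happens in the real project, common ways out are to move the shared values into a third module that imports neither of the two, or to do the import inside the function that needs it.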

Related

How to download a PDF file to a specific custom folder with the Selenium Edge WebDriver in Python?

I am using Selenium WebDriver to automate downloading several PDF files. I get the PDF preview window, and now I would like to download the file. How can I accomplish this using Edge as the browser?
Here's what I've got so far, but it's not working.
path = "F:\Anuzz\Desktop\sel\msedgedriver.exe"
options = EdgeOptions()
options.add_experimental_option('prefs', {
"download.default_directory": "F:\Anuzz\Desktop\sel\test.py",
"download.prompt_for_download": False,
"plugins.always_open_pdf_externally": True
})
driver = Edge(path, options=options)
driver.get('https://sscstudy.com/ssc-chsl-paper-pdf-download/')
driver.find_element_by_xpath('//*[@id="post-11490"]/div/div/p[4]/a/strong').click()
NEW (works on Edge)
To use this, you have to install the pyautogui library with the command pip install pyautogui
import time
import pyautogui
from selenium import webdriver
driver = webdriver.Edge()
pdf_url = 'http://www.africau.edu/images/default/sample.pdf'
driver.get(pdf_url)
time.sleep(3)
pyautogui.hotkey('ctrl', 's')
time.sleep(2)
path_and_filename = r'C:\Users\gt\Desktop\test.pdf'
pyautogui.typewrite(path_and_filename)
pyautogui.press('enter')
OLD (works on Chrome)
This is the code I use to automatically download a PDF to a specific path. If you are on Windows, just put your account name in r'C:\Users\...\Desktop'. You also have to put the path of your driver in chromedriver_path. The code below downloads a sample PDF.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
options = webdriver.ChromeOptions()
download_path = r'C:\Users\...\Desktop'
options.add_experimental_option('prefs', {
"download.default_directory": download_path, # change default directory for downloads
"download.prompt_for_download": False, # to auto download the file
"download.directory_upgrade": True,
"plugins.always_open_pdf_externally": True # it will not show PDF directly in chrome
})
chromedriver_path = '...'
driver = webdriver.Chrome(options=options, service=Service(chromedriver_path))
pdf_url = 'http://www.africau.edu/images/default/sample.pdf'
driver.get(pdf_url)
After testing, I think that the problem is mainly caused by the site you provided, which seems to embed other PDF viewers instead of the one that comes with Edge.
So you may need code like the following, which builds the download URL by slicing the link's href:
from selenium import webdriver
from selenium.webdriver.edge import service
import time
edgeOption = webdriver.EdgeOptions()
edgeOption.use_chromium = True
edgeOption.add_argument("start-maximized")
edgeOption.add_experimental_option('prefs', {
"download.default_directory": "C:\\Downloads",
"download.prompt_for_download": False
})
edgeOption.binary_location = r"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe"
s=service.Service(r'C:\Users\Administrator\Desktop\msedgedriver.exe')
driver = webdriver.Edge(service=s, options=edgeOption)
driver.get('https://sscstudy.com/ssc-chsl-paper-pdf-download/')
url = driver.find_element_by_xpath('//*[@id="post-11490"]/div/div/p[4]/a').get_attribute('href')
driver.get("https://drive.google.com/uc?id="+url[32:(len(url)-17)]+"&export=download")
time.sleep(1)
Note: Tested with Selenium 4.1.0 and Edge 101.0.1210.53. Please modify the path of the Edge driver and any other parameters to suit your own setup.
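The slice url[32:(len(url)-17)] above only works while the href has exactly the shape https://drive.google.com/file/d/<id>/view?usp=sharing. As a rough sketch, the same URL can be built with a regular expression instead of fixed offsets. The helper name drive_direct_download_url and the sample ID are made up for illustration, and the sketch assumes the href contains a /file/d/<id> segment.

```python
import re


def drive_direct_download_url(share_url):
    """Build a direct-download URL from a Google Drive share link.

    Assumes the link contains a '/file/d/<id>' segment; raises ValueError otherwise.
    """
    match = re.search(r"/file/d/([^/?#]+)", share_url)
    if match is None:
        raise ValueError(f"no Drive file id found in {share_url!r}")
    # Same target URL as in the answer above, but without hard-coded offsets.
    return "https://drive.google.com/uc?id=" + match.group(1) + "&export=download"


# Hypothetical share link, used only to show the call.
example = "https://drive.google.com/file/d/EXAMPLE_ID_123/view?usp=sharing"
print(drive_direct_download_url(example))
```

In the answer's code, this would replace the slicing expression passed to driver.get.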

Can you actually change the default download directory for an already open chrome session using Selenium on Python?

I'm trying to download some zip files from this page to a specific path, in an already open Chrome browser session, using the code below:
import time
import numpy as np
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
opt = Options() #the variable that will store the selenium options
opt.add_experimental_option("debuggerAddress", "localhost:9222") #this allows bulk-dozer to take control of your Chrome Browser in DevTools mode.
opt.add_experimental_option("prefs", {"download.default_directory": r"C:\Users\ResetStoreX\Downloads\Binance futures data\ADAUSDT-Mark_Prices_Klines_1h_Timeframe"}) #set the path to save the desired zipped data
s = Service(r'C:\Users\ResetStoreX\AppData\Local\Programs\Python\Python39\Scripts\chromedriver.exe') #Use the chrome driver located at the corresponding path
driver = webdriver.Chrome(service=s, options=opt) #execute the chromedriver.exe with the previous conditions
#Why using MarkPrices: https://support.btse.com/en/support/solutions/articles/43000557589-index-price-and-mark-price#:~:text=Index%20Price%20is%20an%20important,of%20cryptocurrencies%20on%20major%20exchanges.&text=Mark%20Price%20is%20the%20price,be%20fair%20and%20manipulation%20resistant.
if driver.current_url == 'https://data.binance.vision/?prefix=data/futures/um/daily/markPriceKlines/ADAUSDT/1h/' :
    number = 2 #initialize an int variable to 2 because the desired web elements in this page starts from 2
    counter = 0
    the_dictionary_links = {}
    while number <= np.size(driver.find_elements(By.XPATH, '//*[@id="listing"]/tr')): #iterate over the tbody array
        data_file_name = driver.find_element(By.XPATH, f'//*[@id="listing"]/tr[{number}]/td[1]/a').text
        if not data_file_name.endswith('CHECKSUM'):
            the_dictionary_links[data_file_name] = driver.find_element(By.XPATH, f'//*[@id="listing"]/tr[{number}]/td[1]/a').get_attribute('href')
            print(f'Saving {data_file_name} and its link for later use')
            counter += 1
        number += 1
    print(counter)
    i = 0
    o = 0
    for i,o in the_dictionary_links.items():
        driver.get(o)
        print(f'Downloading {i}')
        time.sleep(1.8)
Unfortunately, it's not working; it throws the following error:
InvalidArgumentException: invalid argument: cannot parse capability:
goog:chromeOptions from invalid argument: unrecognized chrome option:
prefs
So, I would like to know what could have gone wrong. I wrote the program above based on this solution, but it only seems to work for a new Chrome session, and I need to be able to reset the default download directory when needed for an already open session. Any ideas?
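The error suggests that chromedriver does not accept prefs when it attaches through debuggerAddress. One approach sometimes used in that situation, which does not come from this thread and is untested against this page, is to send the DevTools command Page.setDownloadBehavior through the driver's execute_cdp_cmd method. The sketch below wraps that call; set_download_directory and RecordingDriver are hypothetical names, and RecordingDriver is only a stand-in so the example can run without a browser.

```python
def set_download_directory(driver, directory):
    """Ask an attached Chrome session to save downloads into `directory`.

    `driver` is expected to offer execute_cdp_cmd(cmd, cmd_args), as
    Selenium's Chrome driver does.
    """
    params = {"behavior": "allow", "downloadPath": str(directory)}
    driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
    return params


class RecordingDriver:
    """Stand-in for a real driver; it only records the command it is given."""

    def __init__(self):
        self.calls = []

    def execute_cdp_cmd(self, cmd, cmd_args):
        self.calls.append((cmd, cmd_args))
        return {}


fake = RecordingDriver()
set_download_directory(fake, r"C:\Users\ResetStoreX\Downloads")
print(fake.calls)
```

With a real session, the driver created in the question would be passed in place of RecordingDriver, and the function could be called again whenever the target folder needs to change.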

How to get the docx in the iframe with selenium?

I want to get the document at a URL such as the one below:
document in the iframe
I tried the wget command, but the downloaded file contains no document.
The document contained in the web page can't be printed to a PDF file in Chrome either.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options=options)
target_doc_url = "http://www.ibodao.com/OfficePreview?furl=/Public/uploads/files/2020/0219/5e4cc551729af.docx"
driver.get(target_doc_url)
iframeMsg = driver.find_element_by_id("office_iframe")
driver.switch_to.frame(iframeMsg)
with open('/tmp/target.html','w') as writer:
    writer.write(driver.page_source)
When I open /tmp/target.html, there is no document in it.
How can I get the document in the iframe whose id is office_iframe?
import re
import urllib.request
from selenium import webdriver
driver = webdriver.Chrome()
target_doc_url = "http://www.ibodao.com/OfficePreview?furl=/Public/uploads/files/2020/0219/5e4cc551729af.docx"
driver.get(target_doc_url)
iframeMsg = driver.find_element_by_id("office_iframe")
src=iframeMsg.get_attribute("src")
m = re.search('.*?url=(.+?)/vector-output', src)
doc = m.group(1)
print(doc)
urllib.request.urlretrieve(doc, "a.docx")
This saves the document as a .docx file. The src attribute of the iframe holds the actual document URL; you don't need the vector-output part of the source.
You can also download it manually by going to:
http://static.ibodao.com/Public/uploads/files/2020/0219/5e4cc551729af.docx
It can be made simpler once you have the src, which contains the real URL:
target_url = src.split("=")[1]
urllib.request.urlretrieve(target_url, "target.docx")
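Both snippets above depend on the exact shape of the iframe's src: m.group(1) raises if the pattern does not match, and src.split("=")[1] cuts the URL short if it contains another "=". A slightly more defensive sketch follows. The helper name document_url_from_src and the sample src values are hypothetical, since the real src is not shown in the thread.

```python
import re


def document_url_from_src(src):
    """Return the document URL embedded after 'url=' in an iframe src.

    Stops at an optional '/vector-output' suffix; raises ValueError when
    no 'url=' parameter is present.
    """
    match = re.search(r"url=(.+?)(?:/vector-output|$)", src)
    if match is None:
        raise ValueError(f"no url= parameter found in {src!r}")
    return match.group(1)


# Hypothetical src values, only to illustrate the two shapes handled.
with_suffix = "https://viewer.example.com/preview?url=http://static.ibodao.com/Public/uploads/files/2020/0219/5e4cc551729af.docx/vector-output/index.html"
without_suffix = "https://viewer.example.com/preview?furl=http://files.example.com/a.docx"
print(document_url_from_src(with_suffix))
print(document_url_from_src(without_suffix))
```

The returned string can then be passed to urllib.request.urlretrieve as in the answers above.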

How to stop connecting by proxy/disable proxy in selenium webdriver in python in Windows? I don't want any proxy anymore

Previously, I was using selenium webdriver for web scraping purposes for various websites. But on a whim, I decided to try to use rotating proxy IP addresses for web scraping purposes, because I wanted to learn what that was. For that purpose, I searched online and found this article, and decided to try it out:
https://medium.com/ml-book/multiple-proxy-servers-in-selenium-web-driver-python-4e856136199d
But when I used it in my code, I was not able to reach any website at all; not even the 'get' statement works :(
I get an error message in my Anaconda Spyder console.
Then I removed the code that I had copied from this article. Even then, my code won't stop connecting through the proxy! It simply refuses to stop using the proxy; it's like my code is taking revenge on me.
Here is my code:
import xlrd
import pandas as pd
import datetime as dt
import xlwings as xw
import sys
import math
import xlwt
from xlwt import Workbook
import openpyxl
from openpyxl import load_workbook
from collections import Counter
import shutil as shu
import os
import time
from selenium import webdriver
#from http_request_randomizer.requests.proxy.requestProxy import RequestProxy
sz = ('Coast_Retail_-_Auto_Weekly_Update.xlsx')
sz1 = xlrd.open_workbook(sz)
sz2 = sz1.sheet_by_index(0)
hz='Coast_Retail_-_Auto_Weekly_Update.xlsx'
hz1=load_workbook(hz)
hz2=hz1.worksheets[0]
req_proxy = RequestProxy() #you may get different number of proxy when you run this at each time
proxies = req_proxy.get_proxy_list() #this will create proxy list
PROXY = proxies[0].get_address()
webdriver.DesiredCapabilities.CHROME['proxy']={
"httpProxy":PROXY,
"ftpProxy":PROXY,
"sslProxy":PROXY,
"proxyType":"MANUAL",
}
d = webdriver.Chrome(executable_path=r'R:\Sulaiman\temp_code_vineet\nick\chromedriver.exe')
time.sleep(5)
d.get("https://tfl.compass.inovatec.ca")
time.sleep(5)
un = d.find_element_by_id("UserName")
un.send_keys("vpande")
pw = d.find_element_by_id("Password")
pw.send_keys("v123456A")
sb = d.find_element_by_class_name("red-btn")
sb.click()
time.sleep(5)
qz=[]
for i in range(4,sz2.nrows):
    try:
        if(sz2.cell_value(i,13)=="Booked"):
            fn=sz2.cell_value(i,0)
            ln=sz2.cell_value(i,1)
            fun=fn+" "+ln
            sch = d.find_element_by_class_name("search")
            sch.send_keys(fun)
            sch.send_keys(u'\ue007')
            time.sleep(5)
            d.find_element_by_xpath('//*[@id="body"]/section/div/div[2]/div[1]/div[2]/a[2]').click()
            time.sleep(5)
            x=d.find_element_by_xpath('/html/body/div[5]/section/div/div[2]/div/div[2]/div[2]/div[4]/span[2]').text
            print(x)
            qz.append(x)
            d.get("https://tfl.compass.inovatec.ca")
            time.sleep(5)
    except:
        print("err at "+str(i))
        pass
print(qz)
I guess your main browser is Chrome, so you might be modifying Chrome's settings from Selenium or something like that.
Try ending the session by calling d.close() or d.quit().
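Another thing that may explain why the proxy sticks, offered as an assumption rather than something confirmed in the thread: webdriver.DesiredCapabilities.CHROME is a class-level dictionary, so a 'proxy' entry written into it stays there for as long as the Python process lives, and a console such as Spyder's can keep the same interpreter alive between runs. The sketch below shows the effect with a stand-in class, so it runs without Selenium. DemoCapabilities, clear_proxy and the proxy address are made up for the illustration.

```python
class DemoCapabilities:
    """Hypothetical stand-in for a class that holds a shared capabilities dict."""

    CHROME = {"browserName": "chrome"}


def clear_proxy(capabilities):
    """Remove a previously stored 'proxy' entry and return what was removed."""
    return capabilities.pop("proxy", None)


# "First run": the copied proxy code mutates the shared dictionary.
DemoCapabilities.CHROME["proxy"] = {
    "httpProxy": "203.0.113.5:8080",  # documentation-range address, not a real proxy
    "proxyType": "MANUAL",
}

# "Later run" in the same interpreter: the proxy code is gone, the entry is not.
still_set = "proxy" in DemoCapabilities.CHROME
removed = clear_proxy(DemoCapabilities.CHROME)
print(still_set, removed, DemoCapabilities.CHROME)
```

With the real library, the equivalent would be calling clear_proxy(webdriver.DesiredCapabilities.CHROME) before creating the driver, or simply restarting the console so the module is loaded fresh.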

How do I relocate/disable GeckoDriver's log file in selenium, python 3?

Ahoy, how do I disable GeckoDriver's log file in selenium, python 3?
If that's not possible, how do I relocate it to Temp files?
To relocate the GeckoDriver logs, you can create a directory within your project space, e.g. Log, and use the log_path argument to store the GeckoDriver logs in a file, as follows:
from selenium import webdriver
driver = webdriver.Firefox(executable_path=r'C:\path\to\geckodriver.exe', log_path='./Log/geckodriver.log')
driver.get('https://www.google.co.in')
print("Page Title is : %s" %driver.title)
driver.quit()
ref: 7. WebDriver API > Firefox WebDriver
According to the documentation, you can redirect the log through service_log_path (here to os.path.devnull, which discards it; a path under Temp works the same way):
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
import os
options = Options()
driver = webdriver.Firefox(executable_path=geckodriver_path, service_log_path=os.path.devnull, options=options)
The following arguments are deprecated:
firefox_options – Deprecated argument for options
log_path – Deprecated argument for service_log_path
Using WebDriver(log_path=path.devnull) and WebDriver(service_log_path=path.devnull) are both deprecated at this point in time; both result in a warning.
Using a service object is now the preferred way of doing this:
from os import path
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.webdriver import WebDriver
service = Service(log_path=path.devnull)
driver = WebDriver(service=service)
driver.close()
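For the "relocate it to Temp files" part of the question, the same Service approach can point at the system temporary directory instead of os.devnull. The following is a minimal sketch: geckodriver_log_path and make_firefox_driver are hypothetical helper names, and the log_path keyword is taken from the answer above, so it may differ in other Selenium releases.

```python
import os
import tempfile


def geckodriver_log_path(filename="geckodriver.log"):
    """Return a log file path inside the system temporary directory."""
    return os.path.join(tempfile.gettempdir(), filename)


def make_firefox_driver():
    """Create a Firefox driver whose GeckoDriver log goes to the temp directory.

    Selenium is imported lazily so the path helper can be used without it.
    The log_path keyword follows the answer above; check the installed
    Selenium version, since the accepted keyword may have changed.
    """
    from selenium.webdriver.firefox.service import Service
    from selenium.webdriver.firefox.webdriver import WebDriver

    service = Service(log_path=geckodriver_log_path())
    return WebDriver(service=service)


print(geckodriver_log_path())
```

Calling make_firefox_driver() requires Firefox and GeckoDriver to be installed; the print line only shows where the log would be written.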
You should be using service_log_path; as of today, log_path is deprecated. Example with pytest:
import pytest
from selenium import webdriver
@pytest.mark.unit
@pytest.fixture
def browser(pytestconfig):
    """
    Args:
        pytestconfig (_pytest.config.Config)
    """
    driver_name = pytestconfig.getoption('browser_driver')
    driver = getattr(webdriver, driver_name)
    driver = driver(service_log_path='artifacts/web_driver-%s.log' % driver_name)
    driver.implicitly_wait(10)
    driver.set_window_size(1200, 800)
    yield driver
    driver.quit()
No @hidehara, but I found a way to do it. I looked up the __init__ file in the Selenium2Library directory, in my case: C:\Users\Eigenaardig\AppData\Local\Programs\Python\Lib\site-packages\SeleniumLibrary
There I added these 2 lines:
from selenium import webdriver
driver = webdriver.Firefox(executable_path=r'C:\Users\Eigenaar\eclipse-workspace\test\test\geckodriver.exe', log_path='./Log/geckodriver.log')
I created the directory LOG (in Windows Explorer).
Unfortunately, that started 2 instances.
I then added it in a separate library (.py file),
which looks like this (for test purposes):
import time
import random
from selenium import webdriver
driver = webdriver.Firefox(executable_path=r'C:\Users\specimen\RobotFrameWorkExperienced\RobotLearn\Log\geckodriver.exe', service_log_path='./Log/geckodriver.log')
class CustomLib:
    ROBOT_LIBRARY_SCOPE = 'RobotLearn'
    num = random.randint(1, 18)
    if num % 2 == 0:
        def get_current_time_as_string(self):
            localtime = time.localtime()
            formatted_time = time.strftime("%Y%m%d%H%M%S", localtime)
            return formatted_time
    else:
        def get_current_time_as_string(self):
            localtime = time.localtime()
            formatted_time = time.strftime("%S%m%d%H%M%Y", localtime)
            return formatted_time
But now it opens 2 instances:
1 runs correctly,
1 stays open and does nothing further.
Any help is appreciated.
If none of this works for some reason (which was the case for us),
then go to this (relative) directory:
C:\Users\yourname\AppData\Local\Programs\Python\Python38\Lib\site-packages\SeleniumLibrary\keywords\webdrivertools
There is a file called webdrivertools.py.
On line 157 you can edit:
service_log_path='./robots/robotsiot/Results/Results/Results/geckoresults', executable_path=executable_path,
Advantages:
#1 If you're using something like GitHub and you synchronize a directory, the log files are kept separate.
#2 The log file from the previous run gets overwritten
(if that is what you want; in some cases that is exactly what you need).
Note: the section above applies if you're using Firefox. If you are using another browser, you'll have to edit a different line.
Note 2: this path overrides at a high level, so arguments set in Eclipse->Robot Framework will no longer have any effect.
Use this option with caution: it's a last resort if the other options don't work!
