Hello all,
The following section of code is a snippet from a larger program, using Python 3.x. colorSkusArray has the values:
["https://us.testcompany.com/eng-us/products/test-productv#443", "https://us.testcompany.com/eng-us/products/test-productv#544", "https://us.testcompany.com/eng-us/products/test-productv#387"]
listTwo = []
for a in range(0, len(colorSkusArray)):
    browser.get(colorSkusArray[a])
    print(colorSkusArray[a])  # the link to each SKU's page we'll pull views from
    sBin = browser.find_elements_by_xpath("//*[@id='lightSlider']/li/img")
    listOne = []
    for b in sBin:
        storer = b.get_attribute('src')
        print(storer)  # the src of each img on the SKU's page
        listOne.append(storer)
    print('NEXT ELEMENT')
    listTwo.append(listOne)
    del sBin
    del storer
    del listOne[:]
print(listTwo)
The printout from this reads:
https://us.testcompany.com/eng-us/products/test-productv#443
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM2_Front%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Side%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Interior%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view2.jpg?wid=140&hei=140
NEXT ELEMENT
https://us.testcompany.com/eng-us/products/test-productv#544
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM2_Front%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Side%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Interior%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view2.jpg?wid=140&hei=140
NEXT ELEMENT
https://us.testcompany.com/eng-us/products/test-product#M543
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM2_Front%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Side%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Interior%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view2.jpg?wid=140&hei=140
NEXT ELEMENT
[[], [], []]
The issue I'm having appears(?) to be with the sBin WebElement. What is supposed to happen:
1. The link of each page visited gets printed: SUCCESS. See the address before each of the five following addresses from storer.
2. The link to each of the five views for each product gets printed: UNSUCCESSFUL. The same five links to the same five views are printed three times over; each of the three blocks should have five unique links, but apparently the same five links are being referenced each time.
3. The full list of lists (listTwo) gets printed with all its contents: UNSUCCESSFUL. See the three empty lists in listTwo's printout at the bottom of the output.
Regarding 2): I've been looking at this for close to four hours now and cannot figure out what's going on. All I can guess, after debugging for a while, is that the sBin variable may not be updating properly. I inserted a del statement to reset it at the end of each loop, but this didn't resolve the issue. Otherwise, I don't know why the same five src's keep getting appended, despite a new link being passed into the browser.get() method each time.
Regarding 3): I have printed lists of lists before, so I believe PyCharm should be able to handle this printout. Perhaps those I've printed in the past were different in some way (accounting for the difference in output here), but as far as I am aware, they were exactly the same. I've read about using NumPy for printing arrays, but as far as I can tell, it isn't necessary here.
I am new to Python and Selenium, so all suggestions and comments are appreciated!
I have resolved 2. By placing browser.quit() at the end of the for loop, the sBin WebElement now updates properly with each iteration of the loop. I understand why this works, although I don't understand why it is necessary. It doesn't answer my original question, but it does apparently resolve the issue...
Regarding 3., I still don't know what's wrong. If a single list (e.g. listTwo[0]) prints out its contents just fine, I don't understand why the jagged list just prints empty [].
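For what it's worth, a likely explanation for 3. (not confirmed above) is list aliasing: listTwo stores a reference to the very object that del listOne[:] then clears. A minimal sketch of that behaviour:

inner = [1, 2, 3]
outer = []
outer.append(inner)  # outer holds a reference to inner, not a copy
del inner[:]         # clears the same object that outer[0] points to
print(outer)         # prints [[]]

Since listOne is already re-bound to a fresh list at the top of each iteration, dropping the del listOne[:] line should leave listTwo's contents intact.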
Related
Hello, I am currently working on a script that goes onto a website and automatically adds an item to the cart and purchases it for you. I have a script that works, except the only problem is that it is only able to check out a single item. Here is an example of the script:
Item_code = input('Item code: ')
Size = input('Size: ')

def BOT():
    driver = webdriver.Chrome(executable_path=
    URL = .....
    driver.get(URL)
    while True:
        try:
            driver.find_element_by_css_selector(Item_code).click()
            break
        except NoSuchElementException:
            driver.refresh()
    select = Select(driver.find_element_by_id('s'))
    select.select_by_visible_text(Size)
The script finds the item with the item code that I enter and then selects the size from the user's choice.
I want to be able to type in the code and the size, but if I want the bot to cart two items in different sizes, I want to be able to type a comma and then the next item code and size. For example:
12345, 6789
Large, Medium
I want to somehow specify that if a comma is included, the script reads and uses the code after it once it has used the first one, and repeats that for every comma. So if I wanted to get 3 or even 4 items, all I would have to do is this:
1234, 5678, 7890, etc...
Large, medium, Small, etc...
If anyone could help me out I would really appreciate it. I was thinking of something like:
for , in BOT():
(something like this, but I'm not sure)
I know how to tell the script that if Item_code == ',' then do this, but that would not work, because it needs to be just the comma, and I do not know how to tell it to repeat BOT() a second time and use the second code and size.
If someone could help me out I would really appreciate it, thanks.
(I want to be able to share this bot, since I have already made a GUI for it and all that.)
(Not sure if the executable path will work when sharing.)
I have fixed this problem: I created several custom functions and called them from special if statements, e.g.:

if Item.count(',') == 1:
    Cart1()
    Checkout()
# etc.
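A more general sketch of the same idea (hypothetical names, and it assumes BOT is refactored to take the code and size as parameters) would split the input on commas and loop, so it scales to any number of items without one branch per count:

item_codes = [c.strip() for c in input('Item codes: ').split(',')]
sizes = [s.strip() for s in input('Sizes: ').split(',')]

for code, size in zip(item_codes, sizes):
    BOT(code, size)  # assumes a refactored BOT(code, size) that carts and checks out one item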
I have a set of multiple APIs I need to source data from, and I need four different data categories. This data is then used for reporting purposes in Excel.
I initially created web queries in Excel, but my laptop just crashes because there are too many queries which have to be updated. Do you guys know a smart workaround?
This is an example of the API I will source data from (40 different ones in total)
https://api.similarweb.com/SimilarWebAddon/id.priceprice.com/all
The data points I need are:
EstimatedMonthlyVisits, TopOrganicKeywords, OrganicSearchShare, TrafficSources
Any ideas how I can create an automated report which queries the above data on request?
Thanks so much.
If Excel is crashing under the demand (and that doesn't surprise me), you should consider using Python or R for this task.
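For the Python route, a minimal sketch might look like the following. It assumes the endpoint returns JSON with the four categories as top-level keys; I haven't verified the actual payload, so adjust the key names to whatever the response really contains.

import requests

url = "https://api.similarweb.com/SimilarWebAddon/id.priceprice.com/all"
data = requests.get(url, timeout=30).json()

wanted = ["EstimatedMonthlyVisits", "TopOrganicKeywords",
          "OrganicSearchShare", "TrafficSources"]
report = {key: data.get(key) for key in wanted}  # None for any missing key
print(report)

Repeating that over your 40 URLs and writing the results out to a CSV gives Excel a static file to open instead of dozens of live queries. For the R route, start by installing and loading the packages: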
install.packages("XML")
install.packages("plyr")
install.packages("ggplot2")
install.packages("gridExtra")
require("XML")
require("plyr")
require("ggplot2")
require("gridExtra")
Next we need to set our working directory and parse the XML file as a matter of practice, so we're sure that R can access the data within the file. This is basically reading the file into R. Then, just to confirm that R knows our file is in XML, we check the class. Indeed, R is aware that it's XML.
setwd("C:/Users/Tobi/Documents/R/InformIT") #you will need to change the filepath on your machine
xmlfile=xmlParse("pubmed_sample.xml")
class(xmlfile) #"XMLInternalDocument" "XMLAbstractDocument"
Now we can begin to explore our XML. Perhaps we want to confirm that our HTTP query on Entrez pulled the correct results, just as when we query PubMed's website. We start by looking at the contents of the first node or root, PubmedArticleSet. We can also find out how many child nodes the root has and their names. This process corresponds to checking how many entries are in the XML file. The root's child nodes are all named PubmedArticle.
xmltop = xmlRoot(xmlfile) #gives content of root
class(xmltop)#"XMLInternalElementNode" "XMLInternalNode" "XMLAbstractNode"
xmlName(xmltop) #give name of node, PubmedArticleSet
xmlSize(xmltop) #how many children in node, 19
xmlName(xmltop[[1]]) #name of root's children
To see the first two entries, we can do the following.
# have a look at the content of the first child entry
xmltop[[1]]
# have a look at the content of the 2nd child entry
xmltop[[2]]
Our exploration continues by looking at subnodes of the root. As with the root node, we can list the name and size of the subnodes as well as their attributes. In this case, the subnodes are MedlineCitation and PubmedData.
#Root Node's children
xmlSize(xmltop[[1]]) #number of nodes in each child
xmlSApply(xmltop[[1]], xmlName) #name(s)
xmlSApply(xmltop[[1]], xmlAttrs) #attribute(s)
xmlSApply(xmltop[[1]], xmlSize) #size
We can also separate each of the 19 entries by these subnodes. Here we do so for the first and second entries:
#take a look at the MedlineCitation subnode of 1st child
xmltop[[1]][[1]]
#take a look at the PubmedData subnode of 1st child
xmltop[[1]][[2]]
#subnodes of 2nd child
xmltop[[2]][[1]]
xmltop[[2]][[2]]
The separation of entries is really just us indexing into the tree structure of the XML. We can continue to do this until we exhaust a path—or, in XML terminology, reach the end of the branch. We can do this via the numbers of the child nodes or their actual names:
#we can keep going till we reach the end of a branch
xmltop[[1]][[1]][[5]][[2]] #title of first article
xmltop[['PubmedArticle']][['MedlineCitation']][['Article']][['ArticleTitle']] #same command, but more readable
Finally, we can transform the XML into a more familiar structure—a dataframe. Our command completes with errors due to non-uniform formatting of data and nodes. So we must check that all the data from the XML is properly inputted into our dataframe. Indeed, there are duplicate rows, due to the creation of separate rows for tag attributes. For instance, the ELocationID node has two attributes, ValidYN and EIDType. Take the time to note how the duplicates arise from this separation.
#Turning XML into a dataframe
Madhu2012=ldply(xmlToList("pubmed_sample.xml"), data.frame) #completes with errors: "row names were found from a short variable and have been discarded"
View(Madhu2012) #for easy checking that the data is properly formatted
Madhu2012.Clean=Madhu2012[Madhu2012[25]=='Y',] #gets rid of duplicated rows
Here is a link that should help you get started.
http://www.informit.com/articles/article.aspx?p=2215520
If you have never used R before, it will take a little getting used to, but it's worth it. I've been using it for a few years now and when compared to Excel, I have seen R perform anywhere from a couple hundred percent faster to many thousands of percent faster than Excel. Good luck.
I am trying to write a script that sends pings to IPs between two given addresses and tries to determine whether they are available or not. I have made some progress, but there are things that I still can't solve. For example, I get outputs twice. If you can help me, it would be perfect. Here is my code:
import subprocess

ipfirst = input("1st ip:")
iplast = input("2nd ip:")
currentip = ""
ip_adresses_list = []

"""this function may be my problem, because I didn't understand the response, result structure; if anyone can explain this I'd appreciate it"""
def ip_checker(ip):
    response, result = subprocess.getstatusoutput("ping -c1 -w0.5 " + ip)
    if response == 0:
        print(ip, "ALIVE")
    else:
        print(ip, "NOT ALIVE")
    return ip

"""splitting in order to increase"""
ipfirst = ipfirst.split(".")
ipfirst = list(map(int, ipfirst))
iplast = iplast.split(".")
iplast = list(map(int, iplast))

"""here the while loop increases ipfirst and appends to the list (I used a plain append at first, but it didn't work; it only added the last IP, number-of-IPs times, like [1.1.1.5], [1.1.1.5], [1.1.1.5], [1.1.1.5], and my problem may be occurring because of that ".copy()" structure)"""
while iplast > ipfirst:
    ip_adresses_list.append(ipfirst.copy())
    ipfirst[3] = ipfirst[3] + 1
    ip_adresses_list.append(ipfirst.copy())
    if ipfirst[3] > 254:
        ipfirst[3] = 0
        ipfirst[2] = ipfirst[2] + 1
        ip_adresses_list.append(ipfirst.copy())
        if ipfirst[2] > 254:
            ipfirst[2] = 0
            ipfirst[1] = ipfirst[1] + 1
            ip_adresses_list.append(ipfirst.copy())
            if ipfirst[1] > 254:
                ipfirst[1] = 0
                ipfirst[0] = ipfirst[0] + 1
                ip_adresses_list.append(ipfirst.copy())

"""I rearrange the list to get the IP structure (num1.num2.num3.num4) and feed it into the ping function (ip_checker())"""
for i in ip_adresses_list:
    ip_indice1 = i[0]
    ip_indice2 = i[1]
    ip_indice3 = i[2]
    ip_indice4 = i[3]
    currentip = str(str(ip_indice1) + "." + str(ip_indice2) + "." + str(ip_indice3) + "." + str(ip_indice4))
    ip_checker(currentip)
And if I run this code, I get output like the following; I can't understand why it pings each IP twice, except the first one:
144.122.152.10 NOT ALIVE
144.122.152.11 ALIVE
144.122.152.11 ALIVE
144.122.152.12 ALIVE
144.122.152.12 ALIVE
144.122.152.13 ALIVE
144.122.152.13 ALIVE
The problem is in your while loop that appends to ip_adresses_list: it appends, adds 1, appends, cycles through, then appends the same IP again, adds 1, appends, and so on; that's why you are getting doubles. Simply moving the first append outside the loop fixed it.
Also, your for loop was really redundant; I did the same thing in 2 lines lol
import subprocess

ipfirst = input("1st ip:")
iplast = input("2nd ip:")
currentip = "144.122.152.13"
ip_adresses_list = []

def ip_checker(ip):
    response, result = subprocess.getstatusoutput("ping -c1 -w0.5 " + ip)
    if response == 0:
        print(ip, "ALIVE")
    else:
        print(ip, "NOT ALIVE")
    return ip

# splitting in order to increase
ipfirst = ipfirst.split(".")
ipfirst = list(map(int, ipfirst))
iplast = iplast.split(".")
iplast = list(map(int, iplast))

# the first append now happens once, before the loop, so each IP lands in the list exactly once
ip_adresses_list.append(ipfirst.copy())
while iplast > ipfirst:
    ipfirst[3] = ipfirst[3] + 1
    ip_adresses_list.append(ipfirst.copy())
    if ipfirst[3] > 254:
        ipfirst[3] = 0
        ipfirst[2] = ipfirst[2] + 1
        ip_adresses_list.append(ipfirst.copy())
        if ipfirst[2] > 254:
            ipfirst[2] = 0
            ipfirst[1] = ipfirst[1] + 1
            ip_adresses_list.append(ipfirst.copy())
            if ipfirst[1] > 254:
                ipfirst[1] = 0
                ipfirst[0] = ipfirst[0] + 1
                ip_adresses_list.append(ipfirst.copy())

# rebuild the dotted string and ping each address (the 2-line version of your for loop)
for ip_indice in ip_adresses_list:
    currentip = str(str(ip_indice[0]) + "." + str(ip_indice[1]) + "." + str(ip_indice[2]) + "." + str(ip_indice[3]))
    ip_checker(currentip)
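As a side note, a sketch using the standard-library ipaddress module would sidestep the per-octet carry logic entirely (assuming IPv4 input; ip_checker is the function defined above):

import ipaddress

start = ipaddress.IPv4Address(input("1st ip:"))
end = ipaddress.IPv4Address(input("2nd ip:"))

current = start
while current <= end:
    ip_checker(str(current))  # IPv4Address supports integer addition, so no octet bookkeeping is needed
    current += 1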
I'm new to using Xdebug via the Vim plugin Vdebug.
I'm getting on OK, but I noticed that if I create an array with over 32 elements, the Watch window only shows elements 0-31 (i.e. the first 32). There does not seem to be a way to obtain the next 32, or to tell it to fetch all of them (or 1000 of them or whatever)?
Is this a bug/feature-lack in Vdebug? Is there anything I can do about it?
I'm debugging Drupal, which has very big, complex arrays (which sometimes contain recursive references #sigh), so at first I thought maybe it was iterating, getting into a loop, and hitting a max data limit. But I tried just looking at for ($i=0;$i<50;$i++) $a[] = $i; and this, too, only lists elements 0-31.
I have tried
let g:vdebug_features['max_depth'] = 1000
let g:vdebug_features['max_data'] = 1000000
but they have not made any difference.
Thanks,
After Vdebug is loaded, put this:
let g:vdebug_features = { 'max_children': 128 }
Or whatever you'd like your max to be.
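(One caveat, worth checking against your own setup: assigning the whole dictionary replaces any keys set earlier, such as the max_depth and max_data lines above, so if you need those too, set the key individually with let g:vdebug_features['max_children'] = 128.)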
All credit to romaini for this answer, as it was his comment that led me to find this problem listed as an issue on the Vdebug GitHub repo.
Here is the situation:
The first problem I'm having is with obtaining information from a CSV file. The purpose of the code I'm writing is to get a bunch of information on ZCTAs (zip codes) for a number of different cohorts (there are six currently being used, but the code is meant to be flexible enough to handle any number of cohorts). One file contains the population, by cohort, for each ZCTA. Another file has the number of 'cases' (cases of cancer observed) for each cohort, for each ZCTA. Another file has the crude rate for each cohort for the state of Iowa (the focus of this research), i.e. the rate at which one can 'expect' to see people with cancer in a population, by cohort. There are a couple of other files, but these are the focus, as this is where my issue shows up.
What my code does, initially, is read the population file and get the population of each cohort by ZCTA. Each ZCTA and its information are stored in a list, which is then stored in a nested list of lists containing all of the ZCTAs. The code then gets the crude rate. The crude rate is multiplied by the appropriate cohort population for each ZCTA and summed with all of the other cohorts within each ZCTA, to get the total number of people we can EXPECT to see having cancer in each ZCTA. The population is also summed up. This information is stored in another list, as well as a list containing all of the ZCTAs. This information will be the focus (the list of all of the ZCTAs, each containing the total population and the total number of expected cases).
So, the problem is that I then need to take this newly acquired list, get the number of OBSERVED cases for each cohort, sum those together, append the sum to the appropriate ZCTA, and write it to a new file. I have code implemented that does this fine, EXCEPT that the bottom 22 or so ZCTAs don't get the number of observed cases. I don't know if it is the code or what, but it works for all of the other 906, yet doesn't get the bottom 22.
The reader will find sample data for the files I've discussed (the observed case file, and the output file) at: Gist
Here is the code I'm using:
expectedcsv = open('ExpectedCases.csv', 'w', newline='')
expectedwriter = csv.writer(expectedcsv, delimiter=',')
expectedHeader = ['zcta', 'expected', 'pop', 'observed']
thecasesreader = csv.reader(thecasescsv, delimiter=',')

for zcta in zctaPop:
    caseCounter = 0
    thecasescsv = open('NewCaseFile.csv', 'r', newline='')
    thecasesreader = csv.reader(thecasescsv, delimiter=',')
    for case in thecasesreader:
        if case[0] == zcta[0]:
            for i in range(3, len(case)):
                caseCounter += int(case[i])
    zcta.append(caseCounter)
    expectedwriter.writerow(zcta)

expectedcsv.close()
thecasescsv.close()
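As a hedged aside (column layout assumed from the code above: the ZCTA in column 0 and cohort counts from column 3 onward), one pass over NewCaseFile.csv into a dict would avoid reopening the file once per ZCTA:

import csv

observed_by_zcta = {}
with open('NewCaseFile.csv', newline='') as f:
    for case in csv.reader(f):
        # sum the cohort columns once, keyed by ZCTA
        observed_by_zcta[case[0]] = sum(int(x) for x in case[3:])

for zcta in zctaPop:
    zcta.append(observed_by_zcta.get(zcta[0], 0))  # 0 if a ZCTA has no row in the case file
    expectedwriter.writerow(zcta)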
Something else I would also like to bring up is that later on in the code, the actual purpose of all of this is to create an SMR filter for each grid point. The grid points are somewhat arbitrary; they have been placed (via coordinates) over the entire state of Iowa. The SMR is the number of observed cases divided by the number of expected cases. The threshold, that is, how many expected cases for a particular filter, is set by the user. So, if a user wants a filter created on 150 expected cases (for each grid point), the code goes through each ZCTA, summing up the expected cases until more than 150 are found. The distance to this last ZCTA is the 'radius' of the filter.
To do this, I built a distance matrix (the distance from each grid point to every ZCTA) and then sorted it, nearest to furthest. Because of the size of the file (2300 x 930), I have to read this file line by line and get all of the information from other files. So, starting with the nearest ZCTA, I get the population, expected cases, and observed cases (the problem with this file was discussed above) and add each of these to its respective counter (one for population, one for observed, and one for expected). Then it goes to the next closest ZCTA and does the same, until the threshold is exceeded.
The problem here is that I couldn't use the csv module to read these files, as I was already reading from another file and the index would be lost. So, I had to use a plain filename.read(), which then required some interesting use of maketrans and .translate. I'm not sure it's efficient or works well. Everything seems to be fine, but without the above problem being fixed, it's impossible to tell. I have included the code below, but was wondering if anybody had any better ideas/suggestions?
expectedCSV = open('ExpectedCases.csv', 'r', newline='')
table = str.maketrans('\r', ' ')
content = expectedCSV.read()
expectedCSV.close()
content = content.translate(table)
content = content.split(sep='\n')

newContent = []
for item in content:
    newContent.append(item.split(sep=','))
content = ' '

for item in newContent:
    if item[0] == currentZcta:
        expectedTotal += float(item[1])
        totalPop += float(item[2])
        totalObservedCount += float(item[3])
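A hedged alternative to the read()/maketrans workaround: load ExpectedCases.csv once into a dict keyed by ZCTA (column order assumed from expectedHeader above: zcta, expected, pop, observed), so walking the distance matrix never needs to re-read this file or disturb a reader that is open on another file:

import csv

expected_by_zcta = {}
with open('ExpectedCases.csv', newline='') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row, if one was written
    for row in reader:
        # store (expected, population, observed), keyed by the ZCTA string
        expected_by_zcta[row[0]] = (float(row[1]), float(row[2]), float(row[3]))

# later, while walking the sorted distance matrix:
# expectedTotal += expected_by_zcta[currentZcta][0]
# totalPop += expected_by_zcta[currentZcta][1]
# totalObservedCount += expected_by_zcta[currentZcta][2]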
Also, I couldn't figure out how to color the methods blue and the variables red, as some of the more awesome users of this site do. I would be very much interested in learning how to do that for future posts.
If anybody needs more info or anything clarified to help answer/formulate a solution, please, by all means, ask! Thanks for taking the time to read!
So, I ended up "solving" this by computing the observed along with the expected and population, by opening the file for each ZCTA computed. This did not really solve the issue I was dealing with, but rather found a way around it. I'm somewhat disappointed that more people didn't view and/or respond to this. If someone comes up with an answer to the actual problem, by all means, post it here. -Mike