Returning a non-conflicting selection - python-3.x

I am stuck on the code I am working on and need some help. Imagine you have a list:
[(1,5,7),(2,1,4),(3,0,3),(4,6,10),(5,7,9)]
Each tuple represents a node (ID, start time, finish time). I need my output to be:
[(1,5,7),(2,1,4),(5,7,9)]
so that there is no conflict between times. My code prints:
[(1,5,7),(2,1,4),(3,0,3),(5,7,9)]
and as you can see, (3,0,3) conflicts with (2,1,4).

Initialise an empty list for storing busy times.
Use a for loop to go through the list.
When you get the item (1,5,7), add all the times between 5 and 7 to the busy-times list, so busy times now holds 5, 6, 7.
For each node, check whether any time in its interval already exists in the busy list. If none does, add the node to your non-conflicting selection list and add its times to the busy list.
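The steps above can be sketched in Python. One assumption: intervals are treated as half-open (a node's finish time stays free), so a node may start exactly when another finishes; that is what lets (5,7,9) be kept alongside (1,5,7) in the expected output.

```python
def non_conflicting(nodes):
    """Greedy sketch of the busy-times approach described above."""
    busy = set()       # every time unit already taken by a selected node
    selected = []
    for node_id, start, finish in nodes:
        slots = set(range(start, finish))  # half-open: finish itself stays free
        if busy.isdisjoint(slots):         # no overlap with earlier picks
            selected.append((node_id, start, finish))
            busy |= slots
    return selected

nodes = [(1, 5, 7), (2, 1, 4), (3, 0, 3), (4, 6, 10), (5, 7, 9)]
print(non_conflicting(nodes))  # [(1, 5, 7), (2, 1, 4), (5, 7, 9)]
```

With this rule, (3,0,3) is rejected because 1 and 2 are already busy from (2,1,4), which matches the expected output.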

Related

Is there a faster way to delete table rows so my script doesn't take hours to run?

My script functions fine when there are only a few rows of data to remove. However, the larger the dataset gets, the slower it becomes, to the point of being unusable: deleting 50 table rows took multiple hours. I think the loop that goes through each address in the array is slowing it down, as I can see it deleting one row at a time. However, I am not sure there is a way to delete all the rows in the array without going through a loop.
const rowAddressToRemove = rangeView.getRows().map((r) => r.getRange().getAddress());
rowAddressToRemove.splice(0, 1); // drop the first address (presumably the header row)
const sheet = sourceTable.getWorksheet();
// delete from the bottom up so earlier deletions don't shift the remaining addresses
rowAddressToRemove.reverse().forEach((address) => {
    sheet.getRange(address).delete(ExcelScript.DeleteShiftDirection.up);
});
The current code is working, but it is just slow, and I'm thinking there is something (or some things) horribly optimized in my code that is slowing this down to the point of unusability.
Here is an example of the rowAddressToRemove variable output on the console: (2) ["Pending!A7:G7", "Pending!A8:G8"]
0: "Pending!A7:G7"
1: "Pending!A8:G8"
I don't understand this:
...getRange(address).delete(ExcelScript.DeleteShiftDirection.up)
You say that you want to remove entire rows, so why not opt for something like this:
getRange(address).entireRow.delete
(I don't know whether entireRow.delete needs any arguments, so you might need to treat my proposal as pseudo-code.)
I think your issue is that you're getting the row addresses individually instead of getting the whole address at once. This is for your rowAddressToRemove variable. So instead of having code like:
const rowAddressToRemove = rangeView.getRows().map((r) => r.getRange().getAddress());
You can have code like this:
const rowAddressToRemove = rangeView.getRange().getAddress()
Try it and see whether this makes things faster. If it doesn't, it may also help to put the code for your rangeView variable in the post as well.

Compare two text columns to find partial match and new rows

I have two lists of messages. The first contains short messages and the second is a master file of longer texts that include the short messages from the first list, but also many new messages. I want to find the new entries in the master file (the second list), i.e. those with no partial match.
Something like the above; NO then means they are new errors.
I tried =IF(ISERROR(VLOOKUP("*"&A2&"*",C:C,1,0)),"No","Yes") but it works the other way around: it finds the short messages within the master file's big messages. I want to check the big messages (which have the short messages inside) against the list of short messages, and if there is no (partial) match, label them as new.
This should work, though I currently can't test it:
=IF(SUMPRODUCT(--ISNUMBER(SEARCH($A$2:$A$8,B2)))>0,"YES","NO")
Try:
=IF(OR(ISNUMBER(FIND(" "&$A$2:$A$8&" "," "&B2&" "))),"YES","NO")
Note the use of spaces; otherwise aaa would be found in kkaaa.
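For anyone doing this outside Excel, the same space-padded whole-word check can be sketched in Python (the sample lists below are hypothetical stand-ins for columns A and B):

```python
def label(big_message, short_messages):
    """Return "YES" if any known short message occurs in big_message as a
    whole word (space-padded, mirroring the FIND formula), else "NO"."""
    padded = f" {big_message} "
    if any(f" {s} " in padded for s in short_messages):
        return "YES"
    return "NO"  # no partial match: a new error

shorts = ["disk full", "timeout"]          # column A: known short messages
bigs = ["server timeout on node 3",        # column B: master-file messages
        "unknown fatal error"]
print([label(b, shorts) for b in bigs])    # ['YES', 'NO']
```

As with the formula, the padding keeps aaa from matching inside kkaaa.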

Want to optimize "phone number generator" code by reducing loops

Description of the program:
1. It makes unique random phone numbers based on how many you want: if you pass 100, it makes 100 phone numbers.
2. It creates text files based on the range you pass to it: if you need 100 text files each containing 100 phone numbers, every number should be unique, both within its own file and compared with the other files to be made.
While it creates the phone numbers, it sorts them like below, if that makes sense. This is the format to expect in the text files:
1909911304
1987237347
........... and so on .............
This is the method responsible for doing so:
(Note: I use the make_numbers method as the core of the operation; num_doc_amount is the method that should actually be used.)
def make_numbers(self):
    """dont use this method: this method supports num_doc_amount method"""
    # sorry for this amount of loops, it was inevitable to make the code work
    for number_of_files in range(self.amount_numbs):
        # this loop maintains the pi_digits.txt making (txt)
        number_of_files += 1
        if number_of_files == self.amount_files:
            sys.exit()
        for phone_numbers in range(self.amount_numbs):
            # This loop maintains the amount of phone numbers in each pi_digits.txt
            file = open(f"{self.directory}\\{number_of_files}.{self.format}", 'w')
            for numbers in range(self.amount_numbs):
                # This loop is parallel to the previous one and
                # writes that each number is which one from the
                # whole amount of numbers
                file.write(f"{numbers + 1}. - {self.first_fourz}{choice(nums)}"
                           f"{choice(nums)}{choice(nums)}{choice(nums)}"
                           f"{choice(nums)}{choice(nums)}{choice(nums)}\n")

def num_doc_amount(self):
    """first make an instance and then you can use this method."""
    os.mkdir(f"{self.directory}")  # makes the folder
    for num_of_txt_files in range(self.amount_files):
        # This loop is for number of text files.
        num_of_txt_files += 1
        self.make_numbers()
Note that:
1. The only problem I have is with those parallel loops going together; I don't know if the code can be simplified. (Please let me know if it can be.)
2. The code works and has no error.
If there is any way to simplify this code, please help me. Thank you.

1. The only problem I have is with those parallel loops going together; I don't know if the code can be simplified. (Please let me know if it can be.)
Even if that's not the only problem, there are indeed unnecessarily many loops in the above code. It takes no more than two: one loop over the files and one loop over the numbers; see below.
2. The code works and has no error.
That's not true, since you want all the phone numbers to be unique: the code has no provision to make the written phone numbers unique. The easiest fix is to generate all the unique numbers once at the start.
from random import sample
import os

def num_doc_amount(self):
    """first make an instance and then you can use this method."""
    os.mkdir(self.directory)  # makes the folder
    un = sample(range(10**7), self.amount_files * self.amount_numbs)  # all the unique 7-digit suffixes
    # This loop is for the number of text files:
    for num_of_txt_file in range(self.amount_files):
        with open("%s/%s.%s" % (self.directory, 1 + num_of_txt_file, self.format), 'w') as file:
            # This loop maintains the amount of phone numbers in each .txt:
            for number in range(self.amount_numbs):
                # writes one number from the whole amount of numbers
                file.write("%s. - %s%07d\n" % (1 + number, self.first_fourz, un.pop()))

Is there any way to skip the implicit wait during try/except?

I have a Selenium script that automates signing up on a website. During the process I have driver.implicitly_wait(60), BUT there is a segment of code with a try/except statement that tries to click something and, if the element can't be found, continues. The issue is that if the element isn't there to be clicked, the script waits the full 60 seconds before running the except branch. Is there any way to have it not wait the 60 seconds before doing the except part? Here is my code:
if PROXYSTATUS == False:
    driver.find_element_by_css_selector("img[title='中国大陆']").click()
else:
    try:
        driver.find_element_by_css_selector("img[title='中国大陆']").click()
    except:
        pass
In other words if a proxy is used, a pop up will occasionally display, but sometimes it won't. That's why I need the try/except.
You can use set_page_load_timeout to change the default timeout to a lower value that suits you.
You will still need to wait for some amount of time, otherwise you might simply never click on the element you are looking for, because your script will be faster than the page load.
In the try block you can lower the timeout, say to 10 seconds, with driver.implicitly_wait(10), or even to 0. Place this before the find_element statement in the try block. Then add a finally block that sets it back to 60 with driver.implicitly_wait(60).
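That suggestion can be sketched as a small helper. The function name and timeout values are illustrative, and in real code you would catch selenium's NoSuchElementException rather than a bare Exception:

```python
def click_optional(driver, css, quick_timeout=0, normal_timeout=60):
    """Click an element that may or may not exist, without paying the
    full implicit wait when it is absent. Returns True if clicked."""
    driver.implicitly_wait(quick_timeout)   # don't wait long for an optional element
    try:
        driver.find_element_by_css_selector(css).click()
        return True
    except Exception:                       # NoSuchElementException in practice
        return False
    finally:
        driver.implicitly_wait(normal_timeout)  # restore the usual wait

# In the original script this would replace the try/except:
# if PROXYSTATUS:
#     click_optional(driver, "img[title='中国大陆']")
```

The finally block guarantees the 60-second wait is restored whether or not the click succeeded.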

Create automated report from web data

I have a set of multiple APIs I need to source data from, and I need four different data categories. This data is then used for reporting purposes in Excel.
I initially created web queries in Excel, but my laptop just crashes because there are too many queries to be updated. Do you know a smart workaround?
This is an example of the API I will source data from (40 different ones in total)
https://api.similarweb.com/SimilarWebAddon/id.priceprice.com/all
The data points I need are:
EstimatedMonthlyVisits, TopOrganicKeywords, OrganicSearchShare, TrafficSources
Any ideas how I can create an automated report which queries the above data on request?
Thanks so much.
If Excel is crashing due to the demand, and that doesn't surprise me, you should consider using Python or R for this task.
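On the Python side, a minimal sketch: fetch each site's JSON once and keep only the needed keys. The endpoint URL and response layout here are assumptions taken from the question, so check them against the actual API documentation before relying on them:

```python
import json
from urllib.request import urlopen

FIELDS = ["EstimatedMonthlyVisits", "TopOrganicKeywords",
          "OrganicSearchShare", "TrafficSources"]

def extract_fields(payload, fields=FIELDS):
    """Keep only the data points needed for the report; missing keys become None."""
    return {k: payload.get(k) for k in fields}

def fetch_site(site):
    # hypothetical endpoint, mirroring the URL in the question
    url = f"https://api.similarweb.com/SimilarWebAddon/{site}/all"
    with urlopen(url) as resp:
        return extract_fields(json.load(resp))

# offline demo with a made-up payload:
sample = {"EstimatedMonthlyVisits": {"2019-01-01": 12345},
          "OrganicSearchShare": 0.62,
          "SomethingElse": "ignored"}
print(extract_fields(sample)["OrganicSearchShare"])  # 0.62
```

Looping fetch_site over the 40 sites and writing the rows out with the csv module gives a file Excel can open on demand, replacing the crashing web queries.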
In R, start by installing and loading the packages we'll need:
install.packages("XML")
install.packages("plyr")
install.packages("ggplot2")
install.packages("gridExtra")
require("XML")
require("plyr")
require("ggplot2")
require("gridExtra")
Next we need to set our working directory and parse the XML file as a matter of practice, so we're sure that R can access the data within the file. This is basically reading the file into R. Then, just to confirm that R knows our file is in XML, we check the class. Indeed, R is aware that it's XML.
setwd("C:/Users/Tobi/Documents/R/InformIT") #you will need to change the filepath on your machine
xmlfile=xmlParse("pubmed_sample.xml")
class(xmlfile) #"XMLInternalDocument" "XMLAbstractDocument"
Now we can begin to explore our XML. Perhaps we want to confirm that our HTTP query on Entrez pulled the correct results, just as when we query PubMed's website. We start by looking at the contents of the first node or root, PubmedArticleSet. We can also find out how many child nodes the root has and their names. This process corresponds to checking how many entries are in the XML file. The root's child nodes are all named PubmedArticle.
xmltop = xmlRoot(xmlfile) #gives content of root
class(xmltop)#"XMLInternalElementNode" "XMLInternalNode" "XMLAbstractNode"
xmlName(xmltop) #give name of node, PubmedArticleSet
xmlSize(xmltop) #how many children in node, 19
xmlName(xmltop[[1]]) #name of root's children
To see the first two entries, we can do the following.
# have a look at the content of the first child entry
xmltop[[1]]
# have a look at the content of the 2nd child entry
xmltop[[2]]
Our exploration continues by looking at subnodes of the root. As with the root node, we can list the name and size of the subnodes as well as their attributes. In this case, the subnodes are MedlineCitation and PubmedData.
#Root Node's children
xmlSize(xmltop[[1]]) #number of nodes in each child
xmlSApply(xmltop[[1]], xmlName) #name(s)
xmlSApply(xmltop[[1]], xmlAttrs) #attribute(s)
xmlSApply(xmltop[[1]], xmlSize) #size
We can also separate each of the 19 entries by these subnodes. Here we do so for the first and second entries:
#take a look at the MedlineCitation subnode of 1st child
xmltop[[1]][[1]]
#take a look at the PubmedData subnode of 1st child
xmltop[[1]][[2]]
#subnodes of 2nd child
xmltop[[2]][[1]]
xmltop[[2]][[2]]
The separation of entries is really just us, indexing into the tree structure of the XML. We can continue to do this until we exhaust a path—or, in XML terminology, reach the end of the branch. We can do this via the numbers of the child nodes or their actual names:
#we can keep going till we reach the end of a branch
xmltop[[1]][[1]][[5]][[2]] #title of first article
xmltop[['PubmedArticle']][['MedlineCitation']][['Article']][['ArticleTitle']] #same command, but more readable
Finally, we can transform the XML into a more familiar structure—a dataframe. Our command completes with errors due to non-uniform formatting of data and nodes. So we must check that all the data from the XML is properly inputted into our dataframe. Indeed, there are duplicate rows, due to the creation of separate rows for tag attributes. For instance, the ELocationID node has two attributes, ValidYN and EIDType. Take the time to note how the duplicates arise from this separation.
#Turning XML into a dataframe
Madhu2012=ldply(xmlToList("pubmed_sample.xml"), data.frame) #completes with errors: "row names were found from a short variable and have been discarded"
View(Madhu2012) #for easy checking that the data is properly formatted
Madhu2012.Clean=Madhu2012[Madhu2012[25]=='Y',] #gets rid of duplicated rows
Here is a link that should help you get started.
http://www.informit.com/articles/article.aspx?p=2215520
If you have never used R before, it will take a little getting used to, but it's worth it. I've been using it for a few years now and when compared to Excel, I have seen R perform anywhere from a couple hundred percent faster to many thousands of percent faster than Excel. Good luck.
