I'm new to Python/AnyTree and am trying to get a list of raw ingredients to make a bunch of Harburgers (Hamburgers at Point San Pablo Harbor near San Francisco - you need to check it out if you're in the area!! In fact, the winning answer here gets a free Harburger on me next time you're in town!) But I digress...
The question is how can I get access to the 'qty' and 'uom' fields in the tree?
from anytree import Node, RenderTree, PreOrderIter
Harburger = Node("Harburger", children=[
    Node("RoundRoll", qty=1, uom='ea'),
    Node("GriddleGhee", qty=1, uom='gm'),
    Node("SmashedBurger", qty=5, uom='oz')])
print(RenderTree(Harburger))
Node('/Harburger')
├── Node('/Harburger/RoundRoll', qty=1, uom='ea')
├── Node('/Harburger/GriddleGhee', qty=1, uom='gm')
└── Node('/Harburger/SmashedBurger', qty=5, uom='oz')
So far, so good.
Now I can traverse the tree, like:
#print ingredients for 5 Harburgers
print([node.name for node in PreOrderIter(Harburger)])
['Harburger', 'RoundRoll', 'GriddleGhee', 'SmashedBurger']
How can I modify this command to get qty and uom?
I've tried
print([(node.name, node.qty) for node in PreOrderIter(Harburger)])
only to get errors!
Your code fails because the top-level Node has no qty or uom attributes. It comes first in the pre-order traversal, so node.qty raises an AttributeError before any of the children are reached.
You can fix this in a few ways. One way, which you've commented has worked, is to add the attributes to the root node too.
Another option might be to test for the attributes before using them, with something like:
print([(node.name, node.qty) for node in PreOrderIter(Harburger) if hasattr(node, "qty")])
If you can rely upon your tree only having the two levels (the top-level root node and its children), you can iterate over just the child nodes instead of doing a full traversal. Just use Harburger.children rather than PreOrderIter(Harburger).
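The attribute test can also be written with getattr and a default, which makes it easy to scale the quantities for the 5 Harburgers mentioned in the question. This is only a sketch: Item and walk are hypothetical stand-ins that mimic how anytree's Node stores extra keyword arguments as attributes, so the snippet runs without anytree installed. With anytree you would build the tree with Node and iterate PreOrderIter(Harburger) instead.

```python
class Item:
    """Hypothetical stand-in for anytree.Node: extra keyword arguments become attributes."""

    def __init__(self, name, children=(), **attrs):
        self.name = name
        self.children = list(children)
        self.__dict__.update(attrs)


def walk(node):
    """Pre-order traversal: the node itself first, then each child's subtree."""
    yield node
    for child in node.children:
        yield from walk(child)


harburger = Item("Harburger", children=[
    Item("RoundRoll", qty=1, uom="ea"),
    Item("GriddleGhee", qty=1, uom="gm"),
    Item("SmashedBurger", qty=5, uom="oz"),
])


def shopping_list(root, batches):
    """Return (name, total qty, uom) for every node that carries a qty."""
    return [
        (node.name, node.qty * batches, node.uom)
        for node in walk(root)
        if getattr(node, "qty", None) is not None  # skips the root, which has no qty
    ]


print(shopping_list(harburger, 5))
```

The root is skipped rather than raising, so the same function keeps working if the tree later grows more levels.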
I am trying to perform a basic merge operation to add nonexistent nodes and relationships to my graph by going through a csv file row by row. I'm using py2neo v4, and because there is basically no documentation or examples of how to use py2neo, I can't figure out how to actually get it done. This isn't my real code (it's very complicated to handle many different cases) but its structure is basically like this:
import py2neo as pn
graph = pn.Graph("bolt://localhost:###/", user="neo4j", password="py2neoSux")
matcher = pn.NodeMatcher(graph)
tx = graph.begin()
if matcher.match("Prefecture", name="foo").first() is None:
    previousNode = pn.Node("Type1", name="fo0", yc=1)
else:
    previousNode = matcher.match("Prefecture", name="foo").first()
thisNode = pn.Node("Type2", name="bar", yc=1)
tx.merge(previousNode)
tx.merge(thisNode)
theLink = pn.Relationship(thisNode, "PARTOF", previousNode)
tx.merge(theLink)
tx.commit()
Currently this throws the error
ValueError: Primary label and primary key are required for MERGE operation
the first time it needs to merge a node that it hasn't found (i.e., when creating a node). So then I change the line to:
tx.merge(thisNode,primary_label=list(thisNode.labels)[0], primary_key="name")
This gives me the error IndexError: list index out of range from somewhere deep in the py2neo source code (....site-packages\py2neo\internal\operations.py", line 168, in merge_subgraph at node = nodes[i]). I tried to figure out what was going wrong there, but I couldn't work out where the nodes list comes from through its various connections to other commands.
So, it currently matches and creates a few nodes without problem, but at some point it matches until it needs to create a node and then fails while creating it (even though it is running the same code under the same circumstances in a loop). It made it through all 20 rows in my sample once, but it usually stops on rows 3-5.
I thought it had something to do with the transactions (see comments), but I get the same problem when I merge directly on the graph. Maybe the py2neo merge function is finding more identities for nodes than there are nodes. Maybe there is something wrong with how I specified my primary label and/or key.
Because this error and code are opaque I have no idea how to move forward.
Anybody have any advice or instructions on merging nodes with py2neo?
Of course I'd like to know how to fix my current problem, but more generally I'd like to learn how to use this package. Examples, instructions, real documentation?
I had a similar problem and just finished tearing my hair out over it. Since we got similar error messages while doing similar things, what I learned may apply to you too. In my case, the problem was that I was creating Nodes whose __primarykey__ named a different field from one node to the next.
PSEUDO EXAMPLE:
from py2neo import Node

# in some for loop or complex code
node = Node("Example", name="Test", something="else")
node.__primarykey__ = "name"
# ... code merging or otherwise creating the node goes here ...

# later in the loop you might have done something like this because the "name" field was null
node = Node("Example", something="new")
node.__primarykey__ = "something"
I hope this helps and is clear; I'm still recovering from wrapping my head around it. If it's not clear, let me know and I'll revise.
Good luck.
All.
The following section of code below is a snippet from a larger program, using Python 3.x. The colorSkusArray has values:
[[https://us.testcompany.com/eng-us/products/test-productv#443],[https://us.testcompany.com/eng-us/products/test-productv#544],[https://us.testcompany.com/eng-us/products/test-productv#387]]
listTwo = []
for a in range(0, len(colorSkusArray)):
    browser.get(colorSkusArray[a])
    print(colorSkusArray[a])  # the link to each sku's page we'll pull views from
    sBin = browser.find_elements_by_xpath("//*[@id='lightSlider']/li/img")
    listOne = []
    for b in sBin:
        storer = b.get_attribute('src')
        print(storer)  # the src of each img on the sku's page
        listOne.append(storer)
    print('NEXT ELEMENT')
    listTwo.append(listOne)
    del sBin
    del storer
    del listOne[:]
print(listTwo)
The printout from this reads:
https://us.testcompany.com/eng-us/products/test-productv#443
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM2_Front%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Side%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Interior%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view2.jpg?wid=140&hei=140
NEXT ELEMENT
https://us.testcompany.com/eng-us/products/test-productv#544
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM2_Front%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Side%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Interior%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view2.jpg?wid=140&hei=140
NEXT ELEMENT
https://us.testcompany.com/eng-us/products/test-product#M543
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM2_Front%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Side%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Interior%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view.jpg?wid=140&hei=140
https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/test-productv#443_PM1_Other%20view2.jpg?wid=140&hei=140
NEXT ELEMENT
[[], [], []]
The issue I'm having appears(?) to be with the sBin WebElement. What is supposed to happen:
1) The link of each page visited gets printed: SUCCESS. See the address before each of the 5 following addresses from storer.
2) The link to each view (5) for each product gets printed: UNSUCCESSFUL. See the same 5 links to the same 5 views are printed, three times over. Each of these three blocks should have 5 unique links, but apparently the same 5 links are being referenced three times over.
3) The full list of lists (listTwo) should be printing with all its contents: UNSUCCESSFUL. See the three empty lists in listTwo's printout at the bottom of the output.
Regarding 2): I've been looking at this for close to four hours now, and cannot figure out what's going on. All I can guess after debugging for a while is that the sBin variable may not be updating properly. I inserted a del command to reset it at the end of each loop, but this didn't resolve the issue. Otherwise, I don't know why the same 5 src's keep getting appended, despite a new link being passed into the browser.get() method each time.
Regarding 3): I have printed lists of lists before, so I believe PyCharm should be able to handle this printout. Perhaps those I've printed in the past were different in some way (accounting for the difference in output here), but as far as I am aware, they were exactly the same. I've read about using Numpy for printing arrays, but as far as I can tell, it isn't necessary to print here.
I am new to Python and Selenium, so all suggestions and comments are appreciated!
I have resolved 2. By placing browser.quit() at the end of the for loop, the sBin WebElement is now updating properly with each iteration of the loop. I understand why this works, although I don't understand why it is necessary. It doesn't answer my original question, but it does apparently resolve the issue...
Regarding 3., I still don't know what's wrong. If a list (e.g. List[0]) will print out its contents just fine, I don't understand why a jagged list just prints empty [].
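On point 3, one likely explanation (a reading of the posted code, not something confirmed in this thread) is that listTwo.append(listOne) stores a reference to the list rather than a copy, so the later del listOne[:] empties the very list that was just appended. A minimal sketch without Selenium, using made-up strings in place of the scraped src values:

```python
# Hypothetical stand-ins for the img src values found on each page.
pages = [["a.jpg", "b.jpg"], ["c.jpg"]]

aliased = []
for srcs in pages:
    list_one = []
    for src in srcs:
        list_one.append(src)
    aliased.append(list_one)  # appends a reference to list_one, not a copy
    del list_one[:]           # clears that same list object in place

kept = []
for srcs in pages:
    list_one = []             # rebinding creates a fresh list on every iteration
    for src in srcs:
        list_one.append(src)
    kept.append(list_one)     # nothing clears it afterwards, so the contents survive

print(aliased)  # [[], []]
print(kept)     # [['a.jpg', 'b.jpg'], ['c.jpg']]
```

Because listOne = [] already creates a new list at the top of each iteration, the del listOne[:] line is not needed for resetting and is what empties the stored results.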
I have a set of multiple APIs I need to source data from, and I need four different data categories. This data is then used for reporting purposes in Excel.
I initially created web queries in Excel, but my laptop just crashes because there are too many queries to update. Do you guys know a smart workaround?
This is an example of the API I will source data from (40 different ones in total)
https://api.similarweb.com/SimilarWebAddon/id.priceprice.com/all
The data points I need are:
EstimatedMonthlyVisits, TopOrganicKeywords, OrganicSearchShare, TrafficSources
Any ideas how I can create an automated report which queries the above data on request?
Thanks so much.
If Excel is crashing due to the demand, and that doesn't surprise me, you should consider using Python or R for this task.
install.packages("XML")
install.packages("plyr")
install.packages("ggplot2")
install.packages("gridExtra")
require("XML")
require("plyr")
require("ggplot2")
require("gridExtra")
Next we need to set our working directory and parse the XML file as a matter of practice, so we're sure that R can access the data within the file. This is basically reading the file into R. Then, just to confirm that R knows our file is in XML, we check the class. Indeed, R is aware that it's XML.
setwd("C:/Users/Tobi/Documents/R/InformIT") #you will need to change the filepath on your machine
xmlfile=xmlParse("pubmed_sample.xml")
class(xmlfile) #"XMLInternalDocument" "XMLAbstractDocument"
Now we can begin to explore our XML. Perhaps we want to confirm that our HTTP query on Entrez pulled the correct results, just as when we query PubMed's website. We start by looking at the contents of the first node or root, PubmedArticleSet. We can also find out how many child nodes the root has and their names. This process corresponds to checking how many entries are in the XML file. The root's child nodes are all named PubmedArticle.
xmltop = xmlRoot(xmlfile) #gives content of root
class(xmltop)#"XMLInternalElementNode" "XMLInternalNode" "XMLAbstractNode"
xmlName(xmltop) #give name of node, PubmedArticleSet
xmlSize(xmltop) #how many children in node, 19
xmlName(xmltop[[1]]) #name of root's children
To see the first two entries, we can do the following.
# have a look at the content of the first child entry
xmltop[[1]]
# have a look at the content of the 2nd child entry
xmltop[[2]]
Our exploration continues by looking at subnodes of the root. As with the root node, we can list the name and size of the subnodes as well as their attributes. In this case, the subnodes are MedlineCitation and PubmedData.
#Root Node's children
xmlSize(xmltop[[1]]) #number of nodes in each child
xmlSApply(xmltop[[1]], xmlName) #name(s)
xmlSApply(xmltop[[1]], xmlAttrs) #attribute(s)
xmlSApply(xmltop[[1]], xmlSize) #size
We can also separate each of the 19 entries by these subnodes. Here we do so for the first and second entries:
#take a look at the MedlineCitation subnode of 1st child
xmltop[[1]][[1]]
#take a look at the PubmedData subnode of 1st child
xmltop[[1]][[2]]
#subnodes of 2nd child
xmltop[[2]][[1]]
xmltop[[2]][[2]]
The separation of entries is really just us indexing into the tree structure of the XML. We can continue to do this until we exhaust a path or, in XML terminology, reach the end of the branch. We can do this via the numbers of the child nodes or their actual names:
#we can keep going till we reach the end of a branch
xmltop[[1]][[1]][[5]][[2]] #title of first article
xmltop[['PubmedArticle']][['MedlineCitation']][['Article']][['ArticleTitle']] #same command, but more readable
Finally, we can transform the XML into a more familiar structure—a dataframe. Our command completes with errors due to non-uniform formatting of data and nodes. So we must check that all the data from the XML is properly inputted into our dataframe. Indeed, there are duplicate rows, due to the creation of separate rows for tag attributes. For instance, the ELocationID node has two attributes, ValidYN and EIDType. Take the time to note how the duplicates arise from this separation.
#Turning XML into a dataframe
Madhu2012=ldply(xmlToList("pubmed_sample.xml"), data.frame) #completes with errors: "row names were found from a short variable and have been discarded"
View(Madhu2012) #for easy checking that the data is properly formatted
Madhu2012.Clean=Madhu2012[Madhu2012[25]=='Y',] #gets rid of duplicated rows
Here is a link that should help you get started.
http://www.informit.com/articles/article.aspx?p=2215520
If you have never used R before, it will take a little getting used to, but it's worth it. I've been using it for a few years now and when compared to Excel, I have seen R perform anywhere from a couple hundred percent faster to many thousands of percent faster than Excel. Good luck.
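Since Python was mentioned as the other option, here is a rough sketch of the same idea using only the standard library. It assumes each API returns JSON and that the four data points from the question are top-level keys of the response; that layout is an assumption, not something checked against the real API. The function names and the output path are hypothetical.

```python
import csv
import json
import urllib.request

# The four data points named in the question (assumed to be top-level JSON keys).
FIELDS = [
    "EstimatedMonthlyVisits",
    "TopOrganicKeywords",
    "OrganicSearchShare",
    "TrafficSources",
]


def extract_fields(payload):
    """Pick the report fields out of one decoded JSON response.

    Nested values (dicts or lists) are re-serialised as JSON text so that
    each one fits in a single CSV cell; missing keys become None.
    """
    row = {}
    for field in FIELDS:
        value = payload.get(field)
        row[field] = json.dumps(value) if isinstance(value, (dict, list)) else value
    return row


def fetch_json(url, timeout=30):
    """Download one URL and decode its body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def build_report(urls, out_path):
    """Query every URL once and write one CSV row per API."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["url"] + FIELDS)
        writer.writeheader()
        for url in urls:
            row = extract_fields(fetch_json(url))
            row["url"] = url
            writer.writerow(row)
```

Calling build_report with the list of 40 URLs and a file name such as report.csv would produce a file Excel can open directly, so the workbook no longer has to refresh 40 web queries itself.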
I made a tool to rename view numbers (“Detail Number”) on a sheet based on their location on the sheet. It breaks at the transactions: I'm trying to run two transactions sequentially in Revit Python Shell. I originally did this in Dynamo and it failed in a similar way, so I know it has something to do with transactions.
Transaction #1: Add a suffix (“-x”) to each detail number to ensure the new numbers won’t conflict (1 will be 1-x, 4 will be 4-x, etc)
Transaction #2: Change detail numbers with calculated new number based on viewport location (1-x will be 3, 4-x will be 2, etc)
Better visual explanation here: https://www.docdroid.net/EP1K9Di/161115-viewport-diagram-.pdf.html
Py File here: http://pastebin.com/7PyWA0gV
Attached is the Python file, but essentially what I'm trying to do is:
# <---- Make unique numbers
t = Transaction(doc, 'Rename Detail Numbers')
t.Start()
for i, viewport in enumerate(viewports):
    setParam(viewport, "Detail Number", getParam(viewport, "Detail Number") + "x")
t.Commit()
# <---- Do the thang
t2 = Transaction(doc, 'Rename Detail Numbers')
t2.Start()
for i, viewport in enumerate(viewports):
    setParam(viewport, "Detail Number", detailViewNumberData[i])
t2.Commit()
As I explained in my answer to your comment in the Revit API discussion forum, the behaviour you describe may well be caused by a need to regenerate between the transactions. The first modification does something, and the model needs to be regenerated before the modifications take full effect and are reflected in the parameter values that you query in the second transaction. You are accessing stale data. The Building Coder provides all the nitty gritty details and numerous examples on the need to regenerate.
Summary of this entire thread including both problems addressed:
http://thebuildingcoder.typepad.com/blog/2016/12/need-for-regen-and-parameter-display-name-confusion.html
So this issue actually had nothing to do with transactions or document regeneration. I discovered (with some help :) ) that the problem lay in how I was setting and getting the parameter. "Detail Number", like a lot of parameters, has duplicate versions that share the same descriptive parameter name in a viewport element.
Apparently the reason for this might be legacy issues, though I'm not sure. Thus, when I was trying to get or set the detail number, it occasionally grabbed the wrong, read-only parameter, whose built-in enumeration is "VIEWER_DETAIL_NUMBER". The correct one is "VIEWPORT_DETAIL_NUMBER". This happened because I was looking the parameter up only by its descriptive name, "Detail Number". Getting and setting parameters via the built-in enum instead resolved the issue. See images below.
Please see pdf for visual explanation: https://www.docdroid.net/WbAHBGj/161206-detail-number.pdf.html
I have read the answer at Best strategies for reading J code, but in my console, I am not seeing the box structure. My console has the original command returned.
Please help me with what I have missed.
Thank you.
@Eelvex correctly identifies the underlying mechanisms which control display of results in J. However, 5!:2 only applies to its argument, and (9!:3) 2 will last throughout the session, but be reset to linear display (i.e. the "original command" you're seeing) when you restart J or spin up a new session.
If you want this change to persist, you can configure the session manager with your preference. That way, every time you start a new J session, the display will default to boxed.
How to configure the IDE depends on which version of J you're running, and which frontend you're using. Below is a screenshot of how to configure J6, which currently has the largest install base (though it's been superseded by J7 and J8).
If you tell us which version of J and which frontend you're using, we can give you instructions specific to your environment. Otherwise, the general advice is to add the line (9!:3) 2 to your personal startup script.
Note also that other displays are available:
Linear: "original command returned"
Paren: similar, but command is fully parenthesized
Box: Shows structure of command
Tree: shows relationship between components of command
You can configure the session manager to display commands in one or more of these formats. Try selecting multiple checkboxes in the configuration dialog, or listing out several options in the argument to 9!:3, as in:
(9!:3) 5 2 NB. Linear followed by boxed
+/ % #
+/ % #
+-----+-+-+
|+-+-+|%|#|
||+|/|| | |
|+-+-+| | |
+-----+-+-+
Finally, while these fancy display formats are useful when you're learning J, you're likely to find them less useful as your understanding of the language develops, and ultimately they may become distracting. That's why the default display is "linear", and most J developers end up using it, supplemented by the occasional analysis using 5!:2, 5!:4, or even userland tools like "map display":
http://www.jsoftware.com/pipermail/general/2008-July/032128.html
http://www.jsoftware.com/pipermail/general/2008-July/032123.html
http://www.jsoftware.com/pipermail/general/2008-July/032126.html
You can use a representation like this:
f =: (+/) % #
5!:2 <'f'
┌─────┬─┬─┐
│┌─┬─┐│%│#│
││+│/││ │ │
│└─┴─┘│ │ │
└─────┴─┴─┘
or you can turn on the box representation by setting its global parameter:
(9!:3) 2
f
┌─────┬─┬─┐
│┌─┬─┐│%│#│
││+│/││ │ │
│└─┴─┘│ │ │
└─────┴─┴─┘
Boxed representation (9!:3) 2 was default in previous versions of J.
See also Learning J/Chapter 27