Table values shift after being added to an old QGIS version, but behave normally in the current version (Excel)

I am teaching a man the basics of QGIS for a project he needs to do at his work. He has very little computer knowledge and would like to standardise the work as much as possible (specific workarounds would complicate it too much for him). His QGIS version is 3.16 "Hannover", and as this is a work laptop he does not have permission to download a newer version.
We have been having problems with one specific table. The first few rows are below, written exactly as they appear in the original.
Baum-Nr. Baumart BHD Alter Y X Biotopbaum Klassifizierung Bemerkungen
1 Buche 86 120 49.1356 11.0488 A Altbaum Freistellen !!!
2 Kiefer 45 100 49.13561 11.04883 Hlb,Bs,Th Höhlenbaum
3 Kiefer 32 100 49.13571 11.04579 Hlb,Sw,Th Höhlenbaum
4 Kiefer 74 120 49.13513 11.0495 A Altbaum
After adding it from Excel to QGIS through "add vector layer", the header "Klassifizierung" becomes one of the coordinates and I believe one of the columns is switched (unfortunately, I can't remember the specifics. This is a small side job and I haven't had time to look into it for days. I should have taken a photo, but this isn't possible anymore). We have attempted to copy the column into a new Excel document and transfer it to QGIS again, and this time the headers were shifted one cell to the right, such that "Y" was placed over "X" and "Biotopbaum" over "Klassifizierung", for example.
I could not find a way to fix the import problem on his laptop. He e-mailed me the problematic table and I opened it successfully in my QGIS 3.26 "Buenos Aires".
I believe this may be a problem with his QGIS version, but it is curious that we only encountered it with this one table. All other tables we have worked with have the same headers and the same kind of data in their respective rows.
Is this a known problem, or have other people faced similar situations? Could someone explain what could be causing it? Would there be a way to fix it such that we can successfully import the table without having to edit it in QGIS? That is not a solution the man would accept.
Thank you in advance.

Remove the commas in the Biotopbaum field or replace them with a less common delimiter. In fact, remove all punctuation (e.g., for "Baum-Nr.", remove the period ".").
Also, save the table in CSV format and try importing that.
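If it helps, here is a minimal sketch of that clean-up as a script, assuming Python with pandas is available; the file names are placeholders and the column names are taken from the table above, so adjust both as needed:
import pandas as pd

# Read the original Excel sheet (the file name is only an example)
df = pd.read_excel("baeume.xlsx")

# Replace the commas inside the Biotopbaum field with a less common delimiter
df["Biotopbaum"] = df["Biotopbaum"].str.replace(",", "/")

# Strip punctuation from the header names, e.g. "Baum-Nr." becomes "BaumNr"
df.columns = [c.replace(".", "").replace("-", "") for c in df.columns]

# Write a CSV that QGIS can load as a delimited text layer
df.to_csv("baeume.csv", index=False, encoding="utf-8")
When adding the CSV in QGIS, "Add Delimited Text Layer" also lets you pick the X and Y fields explicitly, which guards against the coordinate columns being misread.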

Related

Create automated report from web data

I have a set of multiple APIs I need to source data from, and I need four different data categories. This data is then used for reporting purposes in Excel.
I initially created web queries in Excel, but my laptop just crashes because there are too many queries which have to be updated. Do you guys know a smart workaround?
This is an example of the API I will source data from (40 different ones in total):
https://api.similarweb.com/SimilarWebAddon/id.priceprice.com/all
The data points I need are:
EstimatedMonthlyVisits, TopOrganicKeywords, OrganicSearchShare, TrafficSources
Any ideas how I can create an automated report which queries the above data on request?
Thanks so much.
If Excel is crashing under the load, which doesn't surprise me, you should consider using Python or R for this task.
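For what it's worth, here is a rough Python sketch of that idea. The URL and field names come from the question, but the assumption that the endpoint returns JSON with those four categories as top-level keys is mine, so check it against an actual response:
import json
import requests
import pandas as pd

api_urls = [
    "https://api.similarweb.com/SimilarWebAddon/id.priceprice.com/all",
    # add the other endpoints here (40 in total)
]
fields = ["EstimatedMonthlyVisits", "TopOrganicKeywords",
          "OrganicSearchShare", "TrafficSources"]

rows = []
for url in api_urls:
    data = requests.get(url, timeout=30).json()
    # keep only the four categories; stringify nested values so each one
    # fits in a single Excel cell (flatten further as needed)
    rows.append({field: json.dumps(data.get(field)) for field in fields})

# one sheet that Excel can use directly for reporting
pd.DataFrame(rows).to_excel("report.xlsx", index=False)
The rest of this answer walks through the R side, using an XML example to show how to pull apart a structured response once you have it.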
install.packages("XML")
install.packages("plyr")
install.packages("ggplot2")
install.packages("gridExtra")
require("XML")
require("plyr")
require("ggplot2")
require("gridExtra")
Next we need to set our working directory and parse the XML file as a matter of practice, so we're sure that R can access the data within the file. This is basically reading the file into R. Then, just to confirm that R knows our file is in XML, we check the class. Indeed, R is aware that it's XML.
setwd("C:/Users/Tobi/Documents/R/InformIT") #you will need to change the filepath on your machine
xmlfile=xmlParse("pubmed_sample.xml")
class(xmlfile) #"XMLInternalDocument" "XMLAbstractDocument"
Now we can begin to explore our XML. Perhaps we want to confirm that our HTTP query on Entrez pulled the correct results, just as when we query PubMed's website. We start by looking at the contents of the first node or root, PubmedArticleSet. We can also find out how many child nodes the root has and their names. This process corresponds to checking how many entries are in the XML file. The root's child nodes are all named PubmedArticle.
xmltop = xmlRoot(xmlfile) #gives content of root
class(xmltop)#"XMLInternalElementNode" "XMLInternalNode" "XMLAbstractNode"
xmlName(xmltop) #give name of node, PubmedArticleSet
xmlSize(xmltop) #how many children in node, 19
xmlName(xmltop[[1]]) #name of root's children
To see the first two entries, we can do the following.
# have a look at the content of the first child entry
xmltop[[1]]
# have a look at the content of the 2nd child entry
xmltop[[2]]
Our exploration continues by looking at subnodes of the root. As with the root node, we can list the name and size of the subnodes as well as their attributes. In this case, the subnodes are MedlineCitation and PubmedData.
#Root Node's children
xmlSize(xmltop[[1]]) #number of nodes in each child
xmlSApply(xmltop[[1]], xmlName) #name(s)
xmlSApply(xmltop[[1]], xmlAttrs) #attribute(s)
xmlSApply(xmltop[[1]], xmlSize) #size
We can also separate each of the 19 entries by these subnodes. Here we do so for the first and second entries:
#take a look at the MedlineCitation subnode of 1st child
xmltop[[1]][[1]]
#take a look at the PubmedData subnode of 1st child
xmltop[[1]][[2]]
#subnodes of 2nd child
xmltop[[2]][[1]]
xmltop[[2]][[2]]
The separation of entries is really just us indexing into the tree structure of the XML. We can continue to do this until we exhaust a path, or, in XML terminology, reach the end of the branch. We can do this via the numbers of the child nodes or their actual names:
#we can keep going till we reach the end of a branch
xmltop[[1]][[1]][[5]][[2]] #title of first article
xmltop[['PubmedArticle']][['MedlineCitation']][['Article']][['ArticleTitle']] #same command, but more readable
Finally, we can transform the XML into a more familiar structure—a dataframe. Our command completes with errors due to non-uniform formatting of data and nodes. So we must check that all the data from the XML is properly inputted into our dataframe. Indeed, there are duplicate rows, due to the creation of separate rows for tag attributes. For instance, the ELocationID node has two attributes, ValidYN and EIDType. Take the time to note how the duplicates arise from this separation.
#Turning XML into a dataframe
Madhu2012=ldply(xmlToList("pubmed_sample.xml"), data.frame) #completes with errors: "row names were found from a short variable and have been discarded"
View(Madhu2012) #for easy checking that the data is properly formatted
Madhu2012.Clean=Madhu2012[Madhu2012[25]=='Y',] #gets rid of duplicated rows
Here is a link that should help you get started.
http://www.informit.com/articles/article.aspx?p=2215520
If you have never used R before, it will take a little getting used to, but it's worth it. I've been using it for a few years now, and I have seen R perform anywhere from a couple of hundred percent to many thousands of percent faster than Excel. Good luck.

2 Sequential Transactions, setting Detail Number (Revit API / Python)

I made a tool to rename view numbers (“Detail Number”) on a sheet based on their location on the sheet. Where this breaks is the transactions: I'm trying to run two transactions sequentially in Revit Python Shell. I originally did this in Dynamo, and that failed in a similar way, so I know it's something to do with transactions.
Transaction #1: Add a suffix (“-x”) to each detail number to ensure the new numbers won’t conflict (1 will be 1-x, 4 will be 4-x, etc)
Transaction #2: Change detail numbers with calculated new number based on viewport location (1-x will be 3, 4-x will be 2, etc)
Better visual explanation here: https://www.docdroid.net/EP1K9Di/161115-viewport-diagram-.pdf.html
Py File here: http://pastebin.com/7PyWA0gV
Attached is the Python file, but essentially what I'm trying to do is:
# <---- Make unique numbers
t = Transaction(doc, 'Rename Detail Numbers')
t.Start()
for i, viewport in enumerate(viewports):
    setParam(viewport, "Detail Number", getParam(viewport, "Detail Number") + "x")
t.Commit()

# <---- Do the thang
t2 = Transaction(doc, 'Rename Detail Numbers')
t2.Start()
for i, viewport in enumerate(viewports):
    setParam(viewport, "Detail Number", detailViewNumberData[i])
t2.Commit()
As I explained in my answer to your comment in the Revit API discussion forum, the behaviour you describe may well be caused by a need to regenerate between the transactions. The first modification does something, and the model needs to be regenerated before the modifications take full effect and are reflected in the parameter values that you query in the second transaction. You are accessing stale data. The Building Coder provides all the nitty gritty details and numerous examples on the need to regenerate.
Summary of this entire thread including both problems addressed:
http://thebuildingcoder.typepad.com/blog/2016/12/need-for-regen-and-parameter-display-name-confusion.html
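In Revit Python Shell terms, a regeneration call can go inside the open transaction, after the writes and before any reads that depend on them. A minimal sketch, reusing the doc, viewports and setParam/getParam names from the question:
t = Transaction(doc, 'Rename Detail Numbers')
t.Start()
for viewport in viewports:
    setParam(viewport, "Detail Number",
             getParam(viewport, "Detail Number") + "x")

# force the model to recompute before reading the values back;
# without this, the reads below may return stale data
doc.Regenerate()

suffixed = [getParam(viewport, "Detail Number") for viewport in viewports]
t.Commit()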
So this issue actually had nothing to do with transactions or doc regeneration. I discovered (with some help :) ) that the problem lay in how I was setting/getting the parameter. "Detail Number", like a lot of parameters, has duplicate versions that share the same descriptive parameter name in a viewport element.
Apparently the reason for this might be legacy issues, though I'm not sure. Thus, when I was trying to get/set the detail number, it was occasionally grabbing the incorrect read-only parameter, the one whose built-in enumeration is called "VIEWER_DETAIL_NUMBER". The correct one is called "VIEWPORT_DETAIL_NUMBER". This was happening because I was trying to get the parameter just by passing the descriptive name "Detail Number". Revising how I get/set parameters via the built-in enum resolved this issue. See the PDF linked below.
Please see pdf for visual explanation: https://www.docdroid.net/WbAHBGj/161206-detail-number.pdf.html
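For anyone hitting the same thing, this is roughly what the change looks like in Revit Python Shell. A sketch only, assuming the usual doc and viewports variables and that the RevitAPI assembly is already referenced:
from Autodesk.Revit.DB import BuiltInParameter, Transaction

def get_detail_number(viewport):
    # target the writable parameter explicitly instead of looking it up
    # by the ambiguous display name "Detail Number"
    return viewport.get_Parameter(
        BuiltInParameter.VIEWPORT_DETAIL_NUMBER).AsString()

def set_detail_number(viewport, value):
    viewport.get_Parameter(
        BuiltInParameter.VIEWPORT_DETAIL_NUMBER).Set(value)

t = Transaction(doc, 'Rename Detail Numbers')
t.Start()
for viewport in viewports:
    set_detail_number(viewport, get_detail_number(viewport) + "x")
t.Commit()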

How can I programmatically copy and paste with source formatting in PowerPoint 2010?

I am currently automating some PowerPoint 2010 functions in Groovy using Scriptom, though this problem may be generic to any PowerPoint automation approach (i.e. more of a "VBA macro" issue than the particular environment I'm using?).
(Scriptom allows you to use ActiveX or COM Windows components from Groovy. Underneath the hood it uses the Jacob library (Java COM Bridge), I believe. The underlying code is similar to what I'd use in a VBA macro or other Microsoft automation component, and is based on the PowerPoint 2010 Object API.)
My current code works well and opens PowerPoint visibly and does a range of functions on it - except for a component where I "copy and paste" slides from one document to another, "keeping source formatting".
I have made two attempts at this copy-and-paste step, each leading to a different problem. I wonder if anyone has thoughts on solving either (or both?) of them:
Method 1: I use the basic "Copy" and "Paste" methods suggested by various others, namely:
sourceSlide.Copy()
destinationSlide = destinationPresentation.Slides.Paste(slideIndex+i-1)
destinationSlide.Design = sourceSlide.Design
destinationSlide.ColorScheme = sourceSlide.ColorScheme
destinationSlide.FollowMasterBackground = sourceSlide.FollowMasterBackground
... and so on copying formats...
That is, I manually copy all the formats across to keep the slide formatting. This is the method used prior to PowerPoint 2010. I've actually got this working; however, to copy the formats I loop over each slide in the "source" slide pack and run the copy/paste code above. In this loop, the following line (alone) is problematic:
destinationSlide.Design = sourceSlide.Design
This line runs incredibly slowly once the destination slide pack has a large number of "Designs" in the SlideMaster. I am copying a source slide pack of 19 slides, each of which has a different SlideMaster design theme (that's how it comes to me). This single line of code takes about 0.01 seconds when copying the first slide over, but by the time it comes to the final slide in the loop, the same line takes over 20 seconds to run each time. Thus, copying the first five slides might take under 1 second, but all 20 slides take around 100 seconds in total, with the latter slides taking longer and longer to run just this single line. The rest of the code races by!
The slow-down isn't linear, and gets even worse beyond 20 slides. It isn't related to the content on the final slide(s); rather, as the number of SlideMaster designs/themes increases, copying across "sourceSlide.Design" slows exponentially. I realise having a different "Design" object for each slide is a bit of a waste, but I don't own the initial source presentation, and they often do come to me like this, with just slight differences between the Designs for each slide. If I remove the "destinationSlide.Design" line, the run time drops from 100+ seconds to around 1 second!
Method 2: To avoid this, and given I'm using PowerPoint 2010, I tried the following code instead:
sourceSlide.Copy()
def destinationPresentation = objPpt.Presentations.Open(destinationFilename)
destinationPresentation.CommandBars.ExecuteMso("PasteSourceFormatting")
I believe this should provide direct access to PowerPoint 2010 "Paste with Source Formatting" functionality. However, this fails with a "null pointer exception" at the ExecuteMso("PasteSourceFormatting") line.
What am I doing wrong? Is there any way to speed up the slow line in Method 1? Why is Method 2 not working at all? It looks like "destinationPresentation.CommandBars" is not null, but the "ExecuteMso" line throws a null pointer exception.
Are there any other suggestions for efficiently copying and pasting slides that would work in a reasonable time frame for 20-100 slides, even if there are multiple different designs/themes?
Thanks, in advance, for any ideas.
The problem with Method 2 is that I was using:
destinationPresentation.CommandBars.ExecuteMso("PasteSourceFormatting")
Whereas this should have been:
destinationPresentation.Application.CommandBars.ExecuteMso("PasteSourceFormatting")
Using this code, I no longer get the null pointer exception.
I've submitted this as an answer to assist anyone who makes a similar error in future.
That being said, I still find that the performance of this approach (Method 2) is not significantly better than the manual "copy and paste formats" method (Method 1). In both cases, the "paste with source formatting" functionality is many, many times slower than a normal "paste" and takes around 2 minutes to paste about 20 slides (each with its own design template). This is reduced to less than a second if I use "destination formatting", or don't have an individual design template for each slide.
This may just be an issue with PowerPoint 2010 performance, however, so I will accept this answer unless someone has more information that would provide a better solution to the performance aspect of the original query.
Not sure it helps, but you can do this sort of thing with Apache POI:
@Grab( 'org.apache.poi:poi-ooxml:3.10-beta1' )
import org.apache.poi.xslf.usermodel.XMLSlideShow

new File( '/tmp/Presentation1.pptx' ).withInputStream { p1 ->
    new File( '/tmp/Presentation2.pptx' ).withInputStream { p2 ->
        // Load our 2 presentations
        inpptx = new XMLSlideShow( p1 )
        outpptx = new XMLSlideShow( p2 )

        // Add slide 1 from inpptx to the end of outpptx
        outpptx.createSlide().importContent( inpptx.slides[ 0 ] )

        // Save it out again to a 3rd presentation
        new File( '/tmp/Presentation3.pptx' ).withOutputStream { out ->
            outpptx.write( out )
        }
    }
}

Increasing the number of suggestions shown in the omnibox

Is it in any way possible to increase the number of suggestions that your extension may show in the omnibox?
By default it looks like 5 rows is the limit. I've read about a command-line switch to change the number of rows (--omnibox-popup-count), but I am really interested in being able to set this dynamically in my extension.
5 rows isn't really enough for the information my extension wants to show.
[Update 2018 - thanks version365 & Aaron!]
"Not hardcoded anymore! chrome://flags/#omnibox-ui-max-autocomplete-matches. Credit goes to version365, answering below: http://stackoverflow.com/a/47806290/234309" – Aaron Thoma
Historical detail: since the removal of the --omnibox-popup-count flag (http://codereview.chromium.org/2013008) in May 2010, the number was hardcoded:
const size_t AutocompleteResult::kMaxMatches = 6;
A discussion two years later on the chromium-discuss mailing list about the 'Omnibox default configuration' "concluded":
"On lower performing machines generating more result would slow down the result display. The current number look like a good balance."
[which is not a very valid argument, considering the probably infinitesimal overhead of retrieving more items than the hard-wired number]
Why the heck is there no Chromium fork that removes UX limitations like this (or the missing tab bar in fullscreen mode)?
Since there is no longer an --omnibox-popup-count flag, you can use another, newer flag instead.
chrome://flags/#omnibox-ui-max-autocomplete-matches lets you select a maximum of 12 rows.
In fact, there is no longer an --omnibox-popup-count flag:
http://code.google.com/p/chromium/issues/detail?id=40083
So I think there is no way to enlarge the omnibox.

Not using colnames when reading .xls files with RODBC

I have another puzzling problem.
I need to read .xls files with RODBC. Basically I need a matrix of all the cells in one sheet, and then use greps and strsplits etc. to get the data out. As each sheet contains multiple tables in a different order, and some text fields with other options in between, I need something that functions like readLines(), but for Excel sheets. I believe RODBC is the best way to do that.
The core of my code is the following function:
.read.info.default <- function(file, sheet){
  fc <- odbcConnectExcel(file) # file connection
  tryCatch({
      x <- sqlFetch(fc,
                    sqtable = sheet,
                    as.is = TRUE,
                    colnames = FALSE,
                    rownames = FALSE)
    },
    error = function(e) {stop(e)},
    finally = close(fc)
  )
  return(x)
}
Yet, whatever I tried, it always takes the first row of the mentioned sheet as the variable names of the returned data frame. I have no clue how to solve that. According to the documentation, colnames=FALSE should prevent that.
I'd like to avoid the xlsReadWrite package. Edit: also the gdata package. The client doesn't have Perl on the system and won't install it.
Edit:
I gave up and went with read.xls() from the xlsReadWrite package. Apart from the name problem, it turned out RODBC can't really read cells with special signs like slashes. A date in the format "dd/mm/yyyy" just gave NA.
Looking at the source code of sqlFetch, sqlQuery and sqlGetResults, I realized the problem is more than likely in the drivers. Somehow the first line of the sheet is seen as a set of column attributes instead of ordinary cells. So instead of column names, they're treated as DB field names. And that's an option you can't set...
Can you use the Perl-based solution in the gdata package instead? That happens to be portable too...
