Understanding document types in MAG and MSA Data dump - azure

I am currently using Microsoft Academic's data dump for a project and am unable to identify the total number of theses and dissertations (T&D) it contains. According to their website, 38% of the data is categorised as type OTHERS (T&D being one of the types it covers). But their 60+ GB CSV dump doesn't explicitly flag the T&D records. Can someone help me with the statistics for T&D, or with how to find them?
I tried their API too, but could not find this information there either.

The Microsoft Academic Graph does not explicitly segment publication sub-types by thesis or dissertation, which means that neither the API nor the website does either.
If this is something you'd like to see added, please let them know by using the "Feedback" link in the lower-right corner on the Microsoft Academic website.

Related

Recognize if object is completely or partially visible with Bing/Azure Cognitive API

I am wondering how to recognize whether an image contains a specific object and whether that object is completely (not partially) visible.
The Cognitive Services Computer Vision API provides a set of tags and a description of the image I send; however, there is no information on whether an object is completely or partially shown.
My goal is to have a service to which I can upload a picture of, say, a car and find out whether the full car is visible or just part of it.
Unfortunately the Computer Vision API is currently unable to perform such a function.
The tags returned do have a 'score' which represents the confidence that this item is in the image. You may find there's some correlation between the confidence and how much of the item is in the image, but you'd need to run some experiments to see how well it matches up. If the object is obscured too much, it may not even be detected at all.
Feel free to drop a suggestion on our User Voice, if you think this would be a useful feature.
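The experiment suggested above could start from something like the following minimal sketch. It assumes the tagging response has already been parsed into a dictionary with a "tags" list whose entries carry "name" and "confidence" keys; that shape, the sample values, and the tag_confidence helper are illustrative assumptions, not output from the real service.

```python
from typing import Optional


def tag_confidence(response: dict, tag_name: str) -> Optional[float]:
    """Return the confidence reported for tag_name, or None if the tag is absent."""
    for tag in response.get("tags", []):
        if tag.get("name", "").lower() == tag_name.lower():
            return tag.get("confidence")
    return None


# Hypothetical responses for two photos of the same car (all values are made up).
full_car = {"tags": [{"name": "car", "confidence": 0.99},
                     {"name": "road", "confidence": 0.80}]}
cropped_car = {"tags": [{"name": "wheel", "confidence": 0.91},
                        {"name": "car", "confidence": 0.55}]}

for label, response in [("full", full_car), ("cropped", cropped_car)]:
    print(label, tag_confidence(response, "car"))
```

Collecting these values over a set of hand-labelled full and partial images would show whether any confidence threshold separates the two groups well enough to be useful.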

FourSquare vs. Google Places vs. Yelp API

I am trying to create an app that will help users find restaurants, movie theaters, malls, etc. to hang out at, based on ratings and distance. Beyond the place itself, I would also like to know more detailed information about it. For example, if I were to look for parks, I would also like to know whether there's a basketball or tennis court there. Ratings and popularity would also be important for prioritizing suggestions.
After looking through all three of the APIs, I could not really find any substantial differences other than their search limits. Could anyone really differentiate each API for me? Maybe even recommend one based on my specific need?
Thanks!
The Foursquare API would fit this use case perfectly because you can supply very specific filters through the API. Also, they have extensive coverage around the world, unlike Google or Yelp.
I would check out the venues/explore endpoint and use a categoryId of Parks. You can use a query parameter of "basketball" or "tennis" to find parks that have courts for these.
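As a rough illustration of the request described above, the sketch below only builds the request URL and sends nothing. The base URL, the credential and version parameter names, and the PARKS_CATEGORY_ID placeholder are assumptions to be checked against the current Foursquare documentation; build_explore_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL of the venues/explore endpoint.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, category_id, query,
                      client_id, client_secret, version="20180101"):
    """Build a venues/explore URL filtered by category and free-text query."""
    params = {
        "ll": f"{lat},{lng}",          # search centre as "lat,lng"
        "categoryId": category_id,     # e.g. the ID of the Parks category
        "query": query,                # e.g. "basketball" or "tennis"
        "client_id": client_id,        # assumed credential parameter names
        "client_secret": client_secret,
        "v": version,                  # assumed API version date parameter
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# PARKS_CATEGORY_ID, MY_ID and MY_SECRET are placeholders, not real values.
url = build_explore_url(40.7, -74.0, "PARKS_CATEGORY_ID", "basketball",
                        "MY_ID", "MY_SECRET")
print(url)
```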

TFS Query results with a list of linked work item IDs in Excel?

How can I get a list of linked work item IDs for a set of work items?
Excel-hosted queries preferred. API Sample is acceptable.
Direct DB table query is acceptable (read-only and unsupported of course!)
Many thanks in advance! -Zephan
MORE INFORMATION
UPDATE: No answers for my original Q so broadening scope of acceptable answers as follows:
Answer for TFS2015 (migrating very shortly) or TFS2013 (potentially useful for TFS2015) is preferred over TFS2010
Coding acceptable if there are any APIs or PowerShell cmdlets (MS or community).
Connecting directly (read-only!) to TFS DB tables is acceptable (source tables and related relationship link table names). Yes, directly referencing TFS DB tables is VERY unsupported, read-only, and "AT YOUR OWN RISK." Still beats having to manually copy/paste data or reconstruct list of links in Excel.
ORIGINAL QUESTION & DETAILS
My team uses TFS2010 (soon 2013 or hoping 2015) and VS2010-2015. I need to support traceability reports and analyze/quantify our coverage of ~300 Test Case work items linked to ~400 Requirement work items. Direct Link and Tree queries are close but don't give me related links on the same row as parent work item. Many thanks in advance for your suggestions and any related code fragments.
Example:
3 test cases (Test1, Test2, Test3)
4 Requirements (Req1, Req2, Req3, Req4)
For simplicity let's just use TFS work item IDs to represent each TestN and ReqN. In actuality, I have a keyword to identify my validation requirements (separate from the thousands of other requirements in this Team Project). The only Test Case work items I care about for this problem are those linked to one or more validation requirements for traceability.
Scenarios:
1:1 (simple) Test1 is linked to Req1
1:2 (1:n) Test2 is linked to Req2 and Req3
2:1 (n:1) Test3 (and Test2) are both linked to Req3
0:1 (Requirement missing Test coverage) Req4 has no test case links
I have a good coverage gap query by creating a Direct Link query for all Requirements then set "linking filters" to Only return items that do not have the specified links.
Desired output (all tests with list of related work items):
|Test1 | Req1 |
|Test2 | Req2, Req3 |
|Test3 | Req3 |
For row #2 I am OK with other separators or even entire list using same separator (.CSV or TAB delimited).
Skip right to answer now if you have a tidy answer. If not then I added considerable RELATED RESEARCH info below to help kick-start an idea that fits the need! Especially since this hasn't been discoverably solved in the last 5 years :-).
RELATED RESEARCH (loooong but may be useful)
1. Visual Studio Queries
Flat Queries should support a list of linked items out of the box, but they do not. The RelatedLinkCount field is handy for knowing whether there are any links to chase, but that's it for flat queries.
Direct Link queries give a list of all direct links, but the related IDs are on rows below the parent work item. I am seriously considering creating a formula that looks at the next X rows to build a list of IDs, but this would be fragile, especially when more than 3 requirements are linked to the same test. Still, it might solve 80% of my tracing needs.
Tree Queries also show links, but on different rows. Additionally they tend to follow just one link type. Ideally I will need list of User Requirements linked to Functional Requirements linked to Test Case(s).
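The "look at the following rows" idea can be sketched outside Excel without a fixed X. The minimal Python sketch below assumes the Direct Link query result has been exported to rows where a parent row has a value in a "Title 1" column and each linked row below it has a value in "Title 2" instead; those column names and the collapse_direct_links helper are assumptions about the export layout, not a TFS feature.

```python
def collapse_direct_links(rows):
    """Turn parent-then-children rows into one (parent_id, [linked_ids]) pair per parent."""
    grouped = []
    for row in rows:
        if row.get("Title 1"):       # a top-level (parent) row starts a new group
            grouped.append((row["ID"], []))
        elif grouped:                # a linked row belongs to the last parent seen
            grouped[-1][1].append(row["ID"])
    return grouped


# The example from the question, using TestN/ReqN in place of work item IDs.
rows = [
    {"ID": "Test1", "Title 1": "Test 1", "Title 2": ""},
    {"ID": "Req1", "Title 1": "", "Title 2": "Requirement 1"},
    {"ID": "Test2", "Title 1": "Test 2", "Title 2": ""},
    {"ID": "Req2", "Title 1": "", "Title 2": "Requirement 2"},
    {"ID": "Req3", "Title 1": "", "Title 2": "Requirement 3"},
    {"ID": "Test3", "Title 1": "Test 3", "Title 2": ""},
    {"ID": "Req3", "Title 1": "", "Title 2": "Requirement 3"},
]

for parent, linked in collapse_direct_links(rows):
    print(f"{parent}\t{', '.join(linked)}")   # tab-delimited, one row per test
```

Because each group simply runs until the next parent row, the number of requirements linked to a test does not matter.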
2. Tools / Plug-ins
SmartExcel4TFS (eDEVTech, http://www.modernrequirements.com/smartexcel4tfs/) has 3 reports it supports, but none get me the core data I need in easily used format. At least it is FREE if you have an MSDN Premium subscription.
Requirements to Tests Trace Matrix is super-interesting. Alas, right now I need to go the other way (Requirements linked to a given test case). Also, it merges cells and has sub-sections that I think are hard to manipulate. (I may revisit this option though.)
Intersection Traceability Matrix report is WAY too wide for a full 300 x 400 grid :-O.
Work Item Decomposition Matrix also didn't give me desired contents. (though frankly I've forgotten this report layout from when I checked ~1 month ago.)
3. TFS API calls
I have actually avoided this route in favor of native Excel solution... but if I can get an example of Excel VBA code (or other code with link to calling within Excel) I may go this route. At this point I don't have time to dig into rolling my own... but this would be cool assuming performance is acceptable.
Relevant API/code fragments:
Retrieving TFS Results from a Tree Query (Blogs.msdn.com 2012.02.22) - Looks like this would get me the data I need, but it is not in Excel so I'd need a bridge example of some sort calling this within Excel.
Retrieving work items and their linked work items in a single query using the TFS APIs (stackoverflow.com 2012.01.12) - Also looks very promising, but not connected to Excel. Gives hints for 2 level and 3 level nested links and performance consideration (don't make second call for each item returned!)
Retrieving work items using the Team Foundation Server API (pwee167.github.io 2012.09.18) - Excellently written introductory walkthrough blog posting to learn how to build an (ASP.Net MVC3) app that calls TFS APIs to run Flat or Tree queries. Start here if writing C# (which I could do but don't have time/justification unless easy example to integrate with Excel).
How can I query work items and their linked changesets in TFS? (stackoverflow.com 2011.05.10) - I don't need changesets but this has VB code to instantiate new TfsTeamProjectCollection which might work directly in Excel VBA (assuming proper reference is found and added)
var projectCollection = new TfsTeamProjectCollection(
    new Uri("http://localhost:8080/tfs"),
    new UICredentialsProvider());
OK, that's everything I have gathered on this problem. Please help contribute with the missing magic tool/snippet or follow the info above to build that last bit I have not had time to prototype & debug. Many thanks in advance!! -Zephan
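For the API route, one possible direction is a work item link query whose result is then grouped by source item. The Python sketch below is only that: it assumes a TFS version that exposes the work item tracking REST API, that the link type reference name is the stock Microsoft.VSTS.Common.TestedBy one, and that the query result has the "workItemRelations" shape shown in the sample. All three are assumptions to verify against your server, the sample IDs are made up, and links_by_source is a hypothetical helper. No request is sent here.

```python
# WIQL for a one-hop link query from Test Cases to the Requirements they test.
# The link type reference name is an assumption and may differ in a customized template.
WIQL = (
    "SELECT [System.Id] FROM WorkItemLinks "
    "WHERE ([Source].[System.WorkItemType] = 'Test Case') "
    "AND ([System.Links.LinkType] = 'Microsoft.VSTS.Common.TestedBy-Reverse') "
    "AND ([Target].[System.WorkItemType] = 'Requirement') "
    "MODE (MustContain)"
)


def links_by_source(wiql_result):
    """Group an assumed link-query result into {source_id: [target_ids]}."""
    grouped = {}
    for relation in wiql_result.get("workItemRelations", []):
        source = relation.get("source")
        target_id = relation["target"]["id"]
        if source is None:                       # top-level row: the source item itself
            grouped.setdefault(target_id, [])
        else:                                    # linked row: target hangs off source
            grouped.setdefault(source["id"], []).append(target_id)
    return grouped


# Made-up result in the assumed shape: 101-103 are tests, 201-203 requirements.
SAMPLE_RESULT = {"workItemRelations": [
    {"source": None, "target": {"id": 101}},
    {"source": {"id": 101}, "target": {"id": 201}},
    {"source": None, "target": {"id": 102}},
    {"source": {"id": 102}, "target": {"id": 202}},
    {"source": {"id": 102}, "target": {"id": 203}},
    {"source": None, "target": {"id": 103}},
    {"source": {"id": 103}, "target": {"id": 203}},
]}

for test_id, req_ids in links_by_source(SAMPLE_RESULT).items():
    print(test_id, ", ".join(str(r) for r in req_ids), sep="\t")
```

The grouped rows can be written to a CSV or tab-delimited file and opened in Excel, and a single query avoids making a second call for each item returned.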

how schema.org can help in nlp

I am working on NLP, collecting interest-based data from web pages.
I came across http://schema.org/ as a source that is said to be helpful for NLP work.
I went through the documentation, from which I can see that it adds tag properties that identify the content of HTML tags.
It may help search engines get specific data for a user query.
It says: Schema.org provides a collection of shared vocabularies webmasters can use to mark up their pages in ways that can be understood by the major search engines: Google, Microsoft, Yandex and Yahoo!
But I don't understand how it can help me as an NLP person. Generally I parse web page content to process it and extract data from it. schema.org may help there, but I don't know how to use it.
Any example or guidance would be appreciated.
Schema.org uses the microdata format for representation. People use microdata for text analytics and for extracting curated content. There can be numerous applications.
Suppose you want to create a news summarization system. You can use the hNews microformat to extract the most relevant content and perform summarization on it.
Suppose you have a review-based search engine in which you want to list the products with the most positive reviews. You can use the hReview microformat to extract the reviews, then perform sentiment analysis on them to identify whether a product has negative or positive reviews.
If you want to create a skill-based resume classifier, extract content with the hResume microformat, which can give you various details such as contact (uses the hCard microformat), experience, achievements, related to this work, education, skills/qualifications, affiliations, publications, performance/skills for performance, etc. You can run a classifier on it to classify CVs with particular skill sets.
Though schema.org does not directly help NLP people, it provides a platform for performing text processing in a better way.
Check out http://en.wikipedia.org/wiki/Microformat#Specific_microformats to see the various microformats; the same page will give you more details.
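To make the extraction step concrete, here is a minimal sketch using only Python's standard library. It reads the itemprop attributes of schema.org microdata (the microformats named above mark content with class names instead), and it assumes well-formed markup. The ItempropExtractor class and the sample review snippet are illustrative, not part of schema.org or of any library.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they cannot wrap text.
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}


class ItempropExtractor(HTMLParser):
    """Collect (itemprop, text) pairs from microdata-annotated HTML."""

    def __init__(self):
        super().__init__()
        self.properties = []   # extracted (name, text) pairs, in closing order
        self._open = []        # itemprop elements that are currently open

    def handle_starttag(self, tag, attrs):
        for entry in self._open:           # count same-named tags nested inside
            if entry["tag"] == tag:
                entry["nested"] += 1
        name = dict(attrs).get("itemprop")
        if name and tag not in VOID_TAGS:
            self._open.append({"tag": tag, "name": name, "chunks": [], "nested": 0})

    def handle_data(self, data):
        for entry in self._open:           # text belongs to every open property
            entry["chunks"].append(data)

    def handle_endtag(self, tag):
        matching = [entry for entry in self._open if entry["tag"] == tag]
        if not matching:
            return
        if matching[-1]["nested"] == 0:    # this end tag closes an itemprop element
            closed = matching.pop()
            self._open = [e for e in self._open if e is not closed]
            text = " ".join("".join(closed["chunks"]).split())
            self.properties.append((closed["name"], text))
        for entry in matching:             # outer elements saw one nested tag close
            entry["nested"] -= 1


SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Review">
  <span itemprop="name">Great phone</span>
  <div itemprop="reviewBody">Battery lasts <b>two days</b>.</div>
</div>
"""

extractor = ItempropExtractor()
extractor.feed(SAMPLE_HTML)
print(extractor.properties)
```

The extracted review body is then plain text that can be passed to a sentiment or summarization step, without first guessing which part of the page holds the review.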
Schema.org is something like a vocabulary or ontology for annotating data, here specifically Web pages.
Extracting microdata from Web pages is a good idea, but is it really used by Web developers? I don't think so; I think the majority of microdata is used by companies such as Google or Yahoo.
In the end, you can find data, but not a lot, and it is mainly used by specific types of websites.
What do you want to extract, and for what type of application? You could probably use another source of data, such as DBpedia or Freebase.
GoodRelations also supports schema.org. You can annotate your content on the fly from the front end, based on the various domain contexts defined. So schema.org is very useful for NLP extraction. One can even use it in HATEOAS services for hypermedia link relations. Metadata (data about data) for any context is good for content and data in general. Alternatives include microformats, RDFa, RDFa Lite, etc. The more context you have the better, as it will turn your data into smart content and help crawler bots understand the data. It also leads further into the web of data and helps with global queries over resource domains. In the long run, such approaches will help with domain adaptation of agents for transfer learning on the web, pretty much making the web of pages an externalized unit of a massive commonsense knowledge base. They also help advertising agencies understand publisher sites and better contextualize ad retargeting.

I can't figure out where to start with GIS application development, or which technology to select

I am very new to GIS development, and to be frank I have no background in it at all. I searched the web, but the tutorials I found seemed to assume the reader has some background knowledge.
The thing is that I am confused about what to read or learn. There seem to be lots of technologies, and I feel lost since some speak about OpenLayers, GeoServer, MapServer, Google Maps, and OpenStreetMap.
So here is what I am supposed to develop. I hope you can advise me on which technology to use and where I should start reading, given that I know almost nothing.
Case 1: a closed system for about 20 users only, who can specify locations on the map; the web application will store the latitude and longitude of the locations and show the markers. I wanted to use the Google Maps API, but I dropped that idea since their license requires you to purchase the service if the system is a closed one. So what technology should I use in such a case? I need a free option. Also, I will only be using a web server, so if the solution involves running my own GeoServer or something like that, I won't be able to do it.
Case 2: I am supposed to display the roads and routes between two given points, and probably add some notes on the map. For this case I can use my own map server/GeoServer, but again I want your suggestions.
Of course, the solution needs to be open source.
Finally, I hope you can tell me what to start reading first.
Start by looking over at https://gis.stackexchange.com/, starting with the tags [web-mapping] and
Some topics in particular you may want to look at are:
https://gis.stackexchange.com/questions/8113/steps-to-start-web-mapping
https://gis.stackexchange.com/questions/8238/where-how-to-learn-about-getting-started-with-web-gis
https://gis.stackexchange.com/questions/13868/looking-for-a-developer-friendly-web-gis
As for skills and tutorials, look at:
https://gis.stackexchange.com/questions/17227/free-gis-workshops-tutorials-and-applied-learning-material
https://gis.stackexchange.com/questions/913/web-gis-development-skill-sets
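For Case 1 in the question (storing latitude/longitude pairs and showing them as markers), one library-agnostic starting point is to have the web server return the stored points as GeoJSON, a format that client-side mapping libraries such as OpenLayers can read. The sketch below is a minimal illustration under that assumption; the to_geojson helper, the record layout, and the sample coordinates are made up for the example.

```python
import json


def to_geojson(locations):
    """Convert stored {"name", "lat", "lon"} records into a GeoJSON FeatureCollection."""
    features = []
    for loc in locations:
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [loc["lon"], loc["lat"]]},
            "properties": {"name": loc.get("name", "")},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up records, as they might come out of the application's database.
stored = [
    {"name": "Office", "lat": 30.0444, "lon": 31.2357},
    {"name": "Warehouse", "lat": 30.0131, "lon": 31.2089},
]

collection = to_geojson(stored)
print(json.dumps(collection, indent=2))
```

This keeps the server side to a plain web application with a database table of points, which fits the constraint of not running a separate map server.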
