Measure multiple distances in google maps/wikimapia - excel

I have an Excel sheet with 583 cities as origins and 8 cities as destinations, and I have to find the distance between each pair of these origins and destinations. Since it will be a cumbersome task, is there a way I can input the origins and destinations from Excel and get the distance between the cities as output?

I believe the easiest way would be to use the Google Maps Distance Matrix API web service.
A request will look like this:
http://maps.googleapis.com/maps/api/distancematrix/json?origins=Vancouver+BC|Seattle&destinations=San+Francisco|Victoria+BC&mode=bicycling&language=fr-FR&key=API_KEY
You can replace the origins and destinations within the URL with your own, separated by pipes (|). It will take some work to copy them over manually, so you might consider exporting the file as a .csv from Excel and using a scripting language to automate the process; see for example Python's urllib package.
Also note that as a free user, you are limited in the number of origin/destination pairs you can put in one URL request.
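A rough sketch of that automation in Python is below. The file names, column layout, and batch size are assumptions to adjust for your own export; you will also need your own API key.

```python
# Minimal sketch: read city names from CSV exports and query the
# Distance Matrix API in small batches (the free tier caps the number
# of origin x destination elements per request).
import csv
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder

def read_cities(path):
    # assumes one city name per row in the first column
    with open(path, newline="", encoding="utf-8") as f:
        return [row[0].strip() for row in csv.reader(f) if row]

origins = read_cities("origins.csv")          # the 583 origin cities
destinations = read_cities("destinations.csv")  # the 8 destination cities

BATCH = 10  # origins per request; tune to stay under the element limit
for i in range(0, len(origins), BATCH):
    batch = origins[i:i + BATCH]
    params = urllib.parse.urlencode({
        "origins": "|".join(batch),
        "destinations": "|".join(destinations),
        "key": API_KEY,
    })
    url = "https://maps.googleapis.com/maps/api/distancematrix/json?" + params
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # rows follow the order of the origins, elements the order of the destinations
    for origin, row in zip(batch, data["rows"]):
        for dest, element in zip(destinations, row["elements"]):
            if element["status"] == "OK":
                print(origin, dest, element["distance"]["text"])
```

From there the output can be written back to a CSV and re-imported into Excel.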

Check out my article on how to get geolocation parameters in Excel using Google services and calculate distances between addresses:
http://www.analystcave.com/excel-calculate-distances-between-addresses/

Related

Sort results in Azure Maps Search

I'm using Azure Maps Search and I'm trying to retrieve all POIs (points of interest) in a location, but I can't find anything in the documentation about how to sort my results, for example by distance.
Has anyone had the same problem?
https://atlas.microsoft.com/search/poi/json?subscription-key=key&api-version=1.0&query=restaurant&lat=45&lon=9
I don't think the current Search POI API provides sorting as part of the API itself, so you'll have to do that in memory afterwards. The results are sorted by "score" (relevancy) by default.
There is no way to order POI results by distance in the API itself, which I guess is what you're looking for here. As per the best practices, you could use nearby search:
https://atlas.microsoft.com/search/address/json?subscription-key={subscription-key}&api-version=1&query=400%20Broad%20Street%2C%20Seattle%2C%20WA&countrySet=US
If you would like straight-line distances you can loop through the results and calculate the distances using the haversine formula. If using the Azure Maps Web SDK, you can use the atlas.math.getDistanceTo function instead. Once you have calculated a distance to each point, you can sort accordingly.
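A minimal sketch of the straight-line approach in Python (atlas.math.getDistanceTo is the JavaScript equivalent in the Web SDK). The sample results list stands in for the "results" array returned by the Search POI call above; the query point matches the lat=45, lon=9 in the example URL.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Stand-in for the "results" array from the Search POI response,
# where each item carries a position with lat/lon.
results = [
    {"poi": {"name": "Restaurant A"}, "position": {"lat": 45.02, "lon": 9.01}},
    {"poi": {"name": "Restaurant B"}, "position": {"lat": 45.50, "lon": 9.20}},
]

query_lat, query_lon = 45.0, 9.0
results.sort(key=lambda r: haversine_km(
    query_lat, query_lon, r["position"]["lat"], r["position"]["lon"]))
```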
If you want to get the driving distance to each point, there are two approaches you can take:
Use the Route Matrix API. This is fairly easy to use, would be less error prone than the second option below, and the response is easy enough to work with. The only negative with this approach is that you will need the S1 pricing tier to access this service, and each cell generates a transaction, which can get expensive fast.
Use the Routing Directions API with a large number of waypoints that go from your origin to each destination and back (A->B->A->C...). This will be a bit more work to understand the results, and if any leg of the route is unrouteable for any reason, the whole route calculation will fail. However, this would be significantly cheaper than option one, as you can use the S0 pricing tier, which has free limits, and it would only generate one transaction in most cases (if you have a large number of locations you might need to break them up and spread them across a few calls). Because this calculates the route from the origin to each destination and back, twice as many calculations are made as you need, which could make this slower than approach 1. When parsing the response you would look at the odd-indexed route legs, as those go from the origin to each destination. In some scenarios it might be desirable to know the travel time from the destinations to the origin (i.e. how long it would take all employees to get to work), in which case the even-numbered legs are what you would want to use.
Again, once you have the distance, or better yet, travel time, you can then sort the results accordingly.

Is there an easy way to find specific text in a PDF, highlight it and print OR save to new file?

What I'm hoping to do is automate the process of mapping out desk locations on a building layout map that is in PDF format.
I work with a deployment team that handles IT equipment requests. Basically, we get requests with a list of user names and their location in the building, i.e. floor number and desk location number.
My current routine is to print out a copy of the PDF floor plan for each floor and manually highlight all the desk locations on the map with a pen before I plan out my route for the day based on request priority (low to high). This can be a bit tedious when we get a large number of requests, so I was wondering if I could just feed Python the list of desk locations and have it generate a PDF with all the locations already highlighted for me, possibly adding some additional comments to the page as well :)
Yes, this is possible. I've deployed it for work so cannot share the code.
Three approaches:
1. cv2 template matching (problem is you'll need to setup each desk as a template)
2. pytesseract (for OCR) with a 'guess & check' algorithm that narrows the field and a fuzzy text match to handle the poor OCR quality (this is slow -- will take several minutes per desk).
3. If the desks are numbered logically you could simply create a coordinate dictionary w/ offsets for 'related' desks (this is the fastest, most accurate method); a rough sketch of this approach follows below.
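Here is a hypothetical sketch of approach 3. The original poster could not share their code, so this is not their implementation: the choice of PyMuPDF (pip install pymupdf), the desk coordinates, and the file names are all assumptions for illustration.

```python
import fitz  # PyMuPDF

# Coordinate dictionary: desk id -> (page index, rectangle on that page).
# Coordinates are in PDF points and would be measured once per floor plan.
DESKS = {
    "3-017": (0, fitz.Rect(120, 340, 160, 370)),
    "3-018": (0, fitz.Rect(170, 340, 210, 370)),
}

def highlight_desks(src_pdf, out_pdf, desk_ids, comments=None):
    """Highlight the requested desks and save a new PDF."""
    comments = comments or {}
    doc = fitz.open(src_pdf)
    for desk in desk_ids:
        page_index, rect = DESKS[desk]
        page = doc[page_index]
        page.add_highlight_annot(rect)           # yellow highlight over the desk
        note = comments.get(desk)
        if note:
            page.add_text_annot(rect.tl, note)   # sticky-note comment next to it
    doc.save(out_pdf)

highlight_desks("floor3.pdf", "floor3_route.pdf",
                ["3-017", "3-018"], {"3-017": "New laptop + dock"})
```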

Azure Search Distance filter with variable distance

Suppose I have the following scenario:
A search UI to allow individuals to find plumbers who are able to service their home location.
When a plumber enters their info into the system, they provide their coordinates and a maximum distance they are willing to travel.
The individual can then enter their home coordinates and should be presented with a list of plumbers who are eligible.
Looking at the Azure Search geo.distance function, I cannot see how to do this. Scenarios where the searcher provides a distance are well covered, but not where the distance is different for each search record.
The documentation provides the following example:
$filter=geo.distance(location, geography'POINT(-122.131577 47.678581)') le 10
This works correctly, but if I try to change the 10 to the maxDistance field, it fails with:
Comparison must be between a field, range variable or function call and a literal value
My requirement seems fairly basic but am now wondering if this is currently possible with Azure Search?
I found an Azure feedback suggestion asking for this feature, but no news on if/when it will be implemented, so it is safe to assume that this scenario is not currently supported.
To add to Paul's answer, one possible workaround is to use a conservatively large constant value instead of referencing the maxDistance field in your $filter expression. Then you can filter the resulting list of plumbers on the client, taking each plumber's max distance into account, to produce the final list.
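A rough sketch of that workaround with the azure-search-documents Python SDK. The index, service, and field names ("location" as an Edm.GeographyPoint, "maxDistanceKm" as the plumber's travel radius in km) are placeholders, not from the question, and the constant 500 km is an arbitrary conservative cap.

```python
import math
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

def km_between(lat1, lon1, lat2, lon2):
    # haversine great-circle distance in kilometres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

client = SearchClient("https://<service>.search.windows.net", "plumbers",
                      AzureKeyCredential("<api-key>"))

home_lat, home_lon = 47.678581, -122.131577

# Coarse server-side filter: assume no plumber travels more than 500 km.
coarse = client.search(
    search_text="*",
    filter=f"geo.distance(location, geography'POINT({home_lon} {home_lat})') le 500")

# Precise client-side filter against each plumber's own max distance.
# "location" is assumed to come back as a GeoJSON point: coordinates = [lon, lat].
eligible = [
    doc for doc in coarse
    if km_between(home_lat, home_lon,
                  doc["location"]["coordinates"][1],
                  doc["location"]["coordinates"][0]) <= doc["maxDistanceKm"]
]
```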

Adwords API BulkMutateJobService Fetch Global Monthly Search Volume For Multiple Keywords

I've just gotten into the Adwords API for an upcoming project and I need something quite simple actually, but I want to go about it the most efficient way.
I need code to retrieve the Global Monthly Search Volume for multiple keywords (in the millions). After reading about BulkMutateJobService, in the Google documentation they say
If you want to perform a very large number of operations (up to 500,000) on your AdWords campaigns and child objects, use BulkMutateJobService
But later on in the page they give limits of
No more than 25 OperationStream objects are allowed.
No more than 10,000 operations are allowed per BulkMutateRequest.
No more than 100 request parts are allowed.
as well as a few others. See source here http://code.google.com/apis/adwords/docs/bulkjobs.html
Now, my questions:
What do these numbers mean? If I have 1 million keywords I need information on, do I only need to perform 2 requests with 500K keywords each?
Also, are there examples of code that does this task?
I only need Global Monthly Search Volume and CPC for each keyword. I've searched online, but I haven't found any good example, or anything leaning in that direction, that utilizes BulkMutateJobService.
Any links, resources, code, advice you can offer? All is appreciated.
The BulkMutateJobService only allows for mutates, or changes, to the account. It does not provide the bulk retrieval of information.
You can fetch monthly search volume for keywords using the TargetingIdeaService. If you use it in STATS mode you can include up to 2500 keywords per request.
Estimated CPC values are obtained from the TrafficEstimatorService. You can request up to 500 keywords per request.
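So the work is mostly batching: roughly 400 STATS requests and 2,000 estimator requests for a million keywords. A minimal sketch of that batching arithmetic only; the actual TargetingIdeaService / TrafficEstimatorService calls (via the googleads client library) are left as placeholders, since their request format is version-specific.

```python
# Split a large keyword list into the per-request limits mentioned above.
keywords = [f"keyword {i}" for i in range(1_000_000)]  # stand-in for your real list

STATS_LIMIT = 2500      # keywords per TargetingIdeaService STATS request
ESTIMATE_LIMIT = 500    # keywords per TrafficEstimatorService request

def chunks(items, size):
    for i in range(0, len(items), size):
        yield items[i:i + size]

print(f"{-(-len(keywords) // STATS_LIMIT)} search-volume requests needed")
print(f"{-(-len(keywords) // ESTIMATE_LIMIT)} CPC-estimate requests needed")

for batch in chunks(keywords, STATS_LIMIT):
    pass  # issue a TargetingIdeaService STATS request for `batch` here

for batch in chunks(keywords, ESTIMATE_LIMIT):
    pass  # issue a TrafficEstimatorService request for `batch` here
```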
FYI, there is an official AdWords API Forum that you can ask questions on.

Integrating with 500+ applications

Our customers use 500+ applications and we would like to integrate these applications with ours. What is the best way to do that? These applications are time registration applications, and common to most of them is that they can export to CSV or similar; some of them are actually home-brewed Excel sheets where time is registered.
The best idea so far is to create our own Excel sheet, which can be used to integrate with all these applications. The integrations could be in the form of cells containing something like ='[c:\export.csv]rawdata'!$A$3, where export.csv is the CSV file exported from the time registration application. Can you see a better way to integrate with all these applications? It should be mentioned that almost all our customers have Microsoft Office.
Edit: Answers to the excellent questions from Pontus Gagge:
How similar are the data in the different applications?
I assume that since they are time registration applications, they will have some similarities, but some will register how long one has worked in total for a whole month, while others will specify it for each day. If Excel is chosen, I believe that many of the differences could be ironed out using basic formulas.
What quality is the data?
The quality of the data can vary, so basic validation must be undertaken. A good approach is also to make it transparent to the customers how our application understands their input, so that they remain responsible for it.
How large amounts of data are you talking about?
There will be information about the time worked for up to 50 employees.
Is the integration one-way only?
Yes
With what frequency should information be transferred?
Once per month (when they need to pay salaries).
How often do the applications themselves change, and how often does your product change?
If their application is a home-brewed Excel sheet, then I assume it will change about once a year (due, for example, to someone's mistake). If it is a proper standard time registration application, then I do not believe they are updated more often than every fifth year or so, as it is a very stable concept.
Should the integration be fully automatic or can your end users trigger a data transfer?
They can certainly trigger the data transfer. The users are often dedicated to the process, so they can be trained to do it, which means they could make up to, say, 30 mouse clicks to run the integration each month.
Will the customers have somebody to monitor the integrations?
As we have many customers, most of them should be able to undertake the integration themselves. We will, though, be able to assist them over the telephone. We cannot undertake the integration ourselves, because we would then be responsible for any errors due to user mistakes, etc.
Does the phrase 'integration spaghetti' mean anything to you...?
I am looking for ideas from the best chefs to cook a nice large portion of that.
You need to come up with a common data format, and a way to translate the individual data formats to the common format. There's really no way around this - any solution you come up with will have to do this in one way or the other. It's the essential complexity of what you're doing.
The bigger issue is actually variances within the source data, in terms of how things like dates are stored, missing columns, etc. Doing a generic conversion for CSV to move columns around is comparatively easy.
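To make the idea concrete, here is a minimal sketch of translating one source format to a common format with a per-source column mapping. The common fields, source names, and mappings are made up for illustration; real mappings would also need the date and duration normalisation mentioned above.

```python
import csv

COMMON_FIELDS = ["employee", "date", "hours"]

# One mapping per source application: common field -> source column name.
SOURCE_MAPPINGS = {
    "vendor_a": {"employee": "Name", "date": "Work date", "hours": "Duration"},
    "vendor_b": {"employee": "Emp",  "date": "Dag",       "hours": "Timer"},
}

def normalise(source, in_path, out_path):
    """Rewrite one exported CSV into the common column layout."""
    mapping = SOURCE_MAPPINGS[source]
    with open(in_path, newline="", encoding="utf-8") as fin, \
         open(out_path, "w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=COMMON_FIELDS)
        writer.writeheader()
        for row in reader:
            writer.writerow({field: row.get(mapping[field], "")
                             for field in COMMON_FIELDS})

normalise("vendor_a", "export.csv", "common_format.csv")
```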
I would also look at CSV and then use an OLEDB connection against the CSV file for importing.
If you try to make something that can interface to any data structure in the universe (and 500 is plenty close enough), it is guaranteed to be a maintenance nightmare. Instead I would approach this from multiple angles:
Devise an interface into which a human can enter this data already in the proper format. With 500+ clients, I'd make this a small, raw but functional browser-based site that users can use to enter this information manually. This is the fall-back: at the end of the day, a human can re-key the information into the site and solve the import issue. Ideally, everyone would use this instead of their own format. Data entry people are cheap.
Similar to above, but expanded, I would develop a standard application or standardize on an off-the-shelf application that can be used to replace their existing format. This might take more time than #1. The goal would be to only do one-time imports of these varying data schemas into the application and be done with them for good.
The nice thing about spreadsheets is that you can do anything anywhere. The bad thing about spreadsheets is that you can do anything anywhere. With CSV or a spreadsheet there is simply no way to enforce data integrity and thus consistency (which is the primary goal) on the data. If the source data is already in a database, then that is obviously simpler.
I would be inclined to use a database format into which each of these files needs to be converted, rather than a spreadsheet (e.g. something like Jet (MDB)). If you have non-Windows users, that will make it harder and you might have to use a spreadsheet. The problem is that it is too easy for the user to change their source structure, break their upload and come crying to you. If a given end user has a resident expert, they can find a way of importing the data into that database format. If you are that expert, then I would, on a case-by-case basis, write something that imports into that database format. XML would be the other choice, but that will likely take more coding than an import/export into a database format.
Standardization of the apps (even having all the sources in a database format instead of a spreadsheet would help) and control over the data schema is the ultimate goal, rather than permitting a gazillion formats. There really is no nice answer other than standardization. Otherwise, you end up writing a converter for every Tom-Dick-and-Harry format, and again when someone changes the source format.
With a multitude of data sources, mapping each one correctly to an intermediate format is not trivial. Regular expressions are good with a finite set of known data formats. A multipass approach can help when data is ambiguous without context (e.g. month and day fields when you have several days of data), and can also help defeat data entry errors. But since this data is connected to salaries, the transfer needs to be reliable.
An import configuration trick
Get the customer to make a set of training data in their application. It should contain a "predefined unique date", and each subsequent data field should hold a number corresponding to the target data field in your application. On import, your application needs to recognise the predefined date, determine the translation required, record this "mapping key", and stop the import. E.g. if you expect "Duration hours" in field two, get the user to enter 2 in the relevant field, which might be called "Attendance hours".
On subsequent runs, and with the mapping definition key, import becomes a fairly easy process of translation (see the sketch after the notes below).
Note on terms
"predefined date" - must be historical, say founding date of your company?, might need to be in PC clock settable range.
"mapping key" - could be string of hex digits and nybble based so tractable to workout
The entered code can be extended to signify required conversions ie customer's application has durations in days and your application expects it in hours.
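A rough sketch of how that training row might be detected, assuming the customer's export is CSV; the sentinel date, target field list, and file name are placeholders.

```python
import csv

SENTINEL_DATE = "1907-04-01"  # the "predefined date", e.g. your company's founding date
TARGET_FIELDS = ["employee", "date", "hours"]  # your application's fields, in order

def detect_mapping(path):
    """Find the training row and return {source column index: target field}."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if SENTINEL_DATE not in row:
                continue
            mapping = {}
            for col, value in enumerate(row):
                if value == SENTINEL_DATE:
                    continue
                if value.strip().isdigit():
                    # the customer typed the 1-based target field number here
                    mapping[col] = TARGET_FIELDS[int(value) - 1]
            return mapping
    raise ValueError("training row with the predefined date not found")

# e.g. {0: 'employee', 2: 'hours'} -- persist this mapping key and reuse it
# on every subsequent import from the same customer.
print(detect_mapping("training_export.csv"))
```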
Interfacing with Windows programs (in order of increasing fragility)
Ye Olde saving as a CSV file
Print to an operating system printer that is set up as a text file/PDF, then scavenge the data out of that
Extract data via the application's interface control, typically ActiveX for several Windows programs, e.g. Matlab's Spreadsheet Link
Read the native file format (xls), e.g. with Matlab's xlsread
Add an additional intermediate spreadsheet sheet that has extended cell references, e.g. ='[filename]rawdata'!$A$3
Have a look at Teiid by JBoss: http://jboss.org/teiid
Also consider using SOA - e.g., if you're on Java, try JBoss SOA platform: http://www.jboss.com/resources/soa/?intcmp=1004
Use a simple XML format. A non-technical person can easily understand a simple XML format (and could even identify basic problems with XML documents that are not well-formed).
Maybe use a DTD (or, even better, an XML schema) to do very basic validation, and then supplement this with an XSL stylesheet to do more validation with better error reporting. (An XSL stylesheet simply converts from XML to something else, so it can generate readable error messages.)
The advantage of this approach is that web browsers such as Internet Explorer can apply the XSL stylesheets. A customer need only spend at most a day enhancing their applications or writing Excel macros to generate the XML data in the format that you specify.
Recent versions of Excel have support for converting spreadsheet data to XML, and can even validate against schemas.
Once the data passes the XSL validation checks, you have validated XML data.
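For the schema-validation half of this suggestion, here is a hedged sketch of a server-side check using lxml (pip install lxml); the schema and file names are placeholders.

```python
from lxml import etree

# Load the agreed schema and the customer's uploaded XML document.
schema = etree.XMLSchema(etree.parse("timesheet.xsd"))
doc = etree.parse("customer_upload.xml")

if schema.validate(doc):
    print("XML data is valid, continue with import")
else:
    # line numbers and messages make decent error reports for the customer
    for error in schema.error_log:
        print(f"line {error.line}: {error.message}")
```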
If you have heaps of data and heaps of money, you could look at existing data management and cleansing tools:
http://www-01.ibm.com/software/data/infosphere/datastage
http://www-01.ibm.com/software/data/infosphere/qualitystage
But even then, you'll likely need to follow kyoryu's suggestion, assuming you have 500+ data formats. The problem isn't on your side: you need them to standardize their output formats if you have no control over their apps. CSV is likely the easiest. You could even send them an Excel template to help them along.