What is the role of feature type in AzureML? - azure

I want to know what is the difference between feature numeric and numeric columns in Azure Machine Learning Studio.
The documentation site states:
Because all columns are initially treated as features, for modules
that perform mathematical operations, you might need to use this
option to prevent numeric columns from being treated as variables.
But nothing more. Not what a feature is, in which modules you need features. Nothing.
I would specifically like to understand whether the Clear feature option in the Fields dropdown of the Edit Metadata module has any effect. Can somebody give me a scenario where this Clear feature operation changes the ML outcome? Thank you
According to the documentation, it ought to have an effect:
Use the Fields option if you want to change the way that Azure Machine
Learning uses the data in a model.
But what can this effect be? Any example might help

As you suspect, setting a column as a feature does have an effect, and it's actually quite important - when training a model, the algorithms will only take into account columns with the feature flag, effectively ignoring the others.
For example, if you have a dataset with columns Feature1, Feature2, and Label and you want to try out just Feature1, you would apply Clear feature to the Feature2 column (while making sure that Feature1 has the feature flag set, of course).
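To make the effect concrete, here is a minimal sketch in plain Python. It is only an analogy for the behaviour described above, not the Azure ML API: the is_feature dictionary, clear_feature, and training_inputs are hypothetical names, and the data values are made up.

```python
# Hypothetical stand-in for a dataset plus per-column metadata.
# This mimics the idea of a feature flag; it is not Azure ML code.
rows = [
    {"Feature1": 1.0, "Feature2": 10.0, "Label": 0},
    {"Feature1": 2.0, "Feature2": 20.0, "Label": 1},
    {"Feature1": 3.0, "Feature2": 30.0, "Label": 1},
]

# All non-label columns start out flagged as features.
is_feature = {"Feature1": True, "Feature2": True, "Label": False}


def clear_feature(flags, column):
    """Return a copy of the flags with the feature flag removed from one column."""
    updated = dict(flags)
    updated[column] = False
    return updated


def training_inputs(rows, flags, label="Label"):
    """Build the inputs a trainer would see: flagged columns only, plus the label."""
    columns = [name for name, flagged in flags.items() if flagged and name != label]
    features = [[row[name] for name in columns] for row in rows]
    labels = [row[label] for row in rows]
    return columns, features, labels


flags = clear_feature(is_feature, "Feature2")
columns, features, labels = training_inputs(rows, flags)
print(columns)   # only Feature1 is left for training
print(features)
```

Feature2 is still present in the dataset after the flag is cleared; it is simply no longer handed to the training step.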

Related

Original Estimates in Azure DevOps - Max value?

Our evolution of using DevOps is continuing (slowly but surely). One thing we've noticed is that some people are trying to put excessive estimates in for their time, but what we really want to be encouraging is for people to be breaking work down into multiple tasks.
Is there a way that we can set our DevOps work items to only accept a maximum value? I've had a look at the 'rules' and there doesn't seem to be anything there to let us do this, and because it's an out of the box field I don't think we can put a value limit against it.
I suppose what I want to understand is whether it would be possible to do this in some way? Could I do something with the existing 'Original Estimate' field or would I have to create a new custom field to have any chance of preventing people from putting in 100 hours for something that's actually more like 2?
If you are also using Boards, you could highlight work items where the original estimate is higher than a certain value. This would not prevent setting these values, but rather encourage the users to put in lower values.
https://learn.microsoft.com/en-us/azure/devops/boards/boards/customize-cards?view=azure-devops
Beware that this might not really help the underlying issue: People must be convinced of the benefits of splitting up tasks, otherwise they will just work around the tooling. Like always putting in the maximum value or not putting in the actual work hours.
Is there a way that we can set our DevOps work items to only accept a
maximum value?
I am afraid that setting the value limit for the Original Estimates field is currently not supported.
As a workaround, you could create a custom field of type Picklist, and then specify the available values in the picklist.
You could add your request for this feature on our UserVoice site, which is our main forum for product suggestions. After the suggestion is raised, you can vote and add your comments for this feedback. The product team would provide updates if they review it.

Handle different layout of document using kofax

I am new to the Kofax TotalAgility solution, but I am well aware of OCR, OMR and recognition mechanisms.
I have two forms in one folder, A and B.
Both of them are identical, but because they were scanned manually there is a slight shift along the axes, say 20 pixels to the right, so the layout differs slightly.
The layouts of Image A and Image B are different; the position of the form on the page is not fixed.
I know that other solutions such as "ABBYY FineReader" provide FlexiLayout, where we can handle this by finding the text and setting up right/left/top/bottom relations to identify zones automatically.
As I have only just started learning Kofax TotalAgility, I am unaware of all the options provided by "Kofax Transformation Designer".
My question is which locator I should use. I am currently working with the Advanced Zone Locator, and for the one document (Image A) that I set as a reference, extraction works properly. But for the other (Image B), the text/box fields are not extracted because of the layout mismatch.
Can anyone point me in the right direction so that I can get this case handled properly?
I know I am asking for a direct option/solution; any help is highly appreciated.
In general, Kofax Transformations has two groups of locators:
Deterministic. You tell the locator precisely what to do, and how to do it (similar to an imperative approach when programming)
Probabilistic. You just tell your locator what to extract, and it works out the rest (based on AI).
Here's a (non-exhaustive) diagram I created the other day:
When working with forms, you might be tempted to rely on forms-specific locators such as the Advanced Zone Locator. While this locator can account for fields "moving around", for example due to images being jolted, zoomed, or distorted, there are certain limitations. Other locators don't have these limitations - the format locator for example allows you to define a certain pattern (a Regular Expression) that should be matched along with a keyword that has to be found somewhere around that pattern.
For your example, you could create a regex like M|F|X, and then define "Gender" as the keyword that needs to be present on the left.
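Outside of Kofax, the idea behind such a keyword-plus-pattern rule can be sketched in a few lines of Python. This is only an illustration of the matching logic, not how the Format Locator is implemented; the function name and the sample text lines are made up.

```python
import re

# The pattern from the example above, restricted to whole words so that
# the "M" in a word such as "Max" is not picked up.
VALUE = re.compile(r"\b(M|F|X)\b")


def extract_after_keyword(line, keyword="Gender"):
    """Return the first M/F/X value found to the right of the keyword, else None."""
    start = line.find(keyword)
    if start < 0:
        return None  # keyword missing: the rule does not fire at all
    match = VALUE.search(line, start + len(keyword))
    return match.group(1) if match else None


print(extract_after_keyword("Name: Max   Gender: F"))
print(extract_after_keyword("Sex: F"))  # keyword changed, so nothing is extracted
```

The second call shows the weakness of any deterministic rule: as soon as the keyword on the form changes, the extraction silently stops working.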
However, any locator that's ruled by determinism follows Murphy's law - at some point that keyword might change. There could be different languages. And maybe additional letters for certain genders might be added; ultimately breaking your extraction logic.
Enter AI - while Murphy's law still applies when using Group Locators, the difference here is that users can train the system to pick up the new data. Said locator will automatically work out the best way to extract that piece of data. If you used a format locator, the customer would need to get back to you to add additional expressions, or have the keywords changed.
In your particular case, I'd try to use a Trainable Group Locator first. If you already know what you're looking for - for example SSNs that you have somewhere in a database, go for the Database Locator. Use Format Locators as a last resort, as tempting as they may be. Advanced Zone Locators are useful when you deal with forms, but I find myself using them almost exclusively for handprint or checkbox recognition.

Optimiser for excel spreadsheet

I'm a mechanical engineer, and I have developed a pretty cool spreadsheet that I use to size some steel members for lifting beams. The drawback is that I need to do some trial and error in the selection of the member until I get one that gets as close to the allowable limits as possible.
What I'm hoping to do is develop a function that, based on a length and a weight that I enter, runs a loop and automatically selects the best member size(s) from a list of the members and their physical properties. Is this possible?
Yes. Depending on the complexity, a simple search through the parameters (less than, more than, etc.) might give you the answer. You can do it quite easily with the pandas library: load the Excel file as a pandas DataFrame (pandas.read_excel()), which then lets you run the searches on that DataFrame object.
If you want to run an optimization algorithm, you should look into SciPy's optimize module to get what you're looking for based on the input data (it handles unconstrained and constrained functions).
Of course, the question as stated is quite general, so I have only pointed out the direction. More info would be better.
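As a rough illustration of the simple search, here is a sketch in plain Python that stays self-contained; with pandas, the same filter would be applied to the DataFrame returned by pandas.read_excel(). Everything specific in it is an assumption: the member table holds made-up values rather than real catalogue data, and the only check is bending of a simply supported beam with a central point load (M = W*L/4), whereas a real lifting beam design would also need deflection, shear, and buckling checks.

```python
# Hypothetical member table: names and properties are illustrative only.
MEMBERS = [
    {"name": "A", "mass_kg_per_m": 13.0, "z_mm3": 75_000},
    {"name": "B", "mass_kg_per_m": 18.0, "z_mm3": 110_000},
    {"name": "C", "mass_kg_per_m": 25.0, "z_mm3": 170_000},
    {"name": "D", "mass_kg_per_m": 31.0, "z_mm3": 230_000},
]


def required_section_modulus(load_n, span_mm, allowable_mpa):
    """Section modulus (mm^3) needed for a central point load on a simple span."""
    moment_nmm = load_n * span_mm / 4.0  # M = W * L / 4
    return moment_nmm / allowable_mpa    # Z = M / sigma_allowable


def pick_member(members, load_n, span_mm, allowable_mpa):
    """Return the lightest member that satisfies the bending check, or None."""
    z_required = required_section_modulus(load_n, span_mm, allowable_mpa)
    adequate = [m for m in members if m["z_mm3"] >= z_required]
    if not adequate:
        return None
    return min(adequate, key=lambda m: m["mass_kg_per_m"])


best = pick_member(MEMBERS, load_n=20_000, span_mm=3_000, allowable_mpa=165)
print(best["name"] if best else "no member is adequate")
```

Picking the lightest adequate member is one possible definition of "best"; ranking by utilisation ratio (required over provided section modulus) would be a small change to the key function.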

Azure Recommendations API's Parameter

I would like to make a recommendation model using the Recommendations API in Microsoft Azure Cognitive Services. I can't understand the three parameters below, which are listed for "Create/Trigger a build." What do these parameters mean?
https://westus.dev.cognitive.microsoft.com/docs/services/Recommendations.V4.0/operations/56f30d77eda5650db055a3d0
EnableModelingInsights: Allows you to compute metrics on the recommendation model. Valid Values: True/False
AllowColdItemPlacement: Indicates if the recommendation should also push cold items via feature similarity. Valid Values: True/False
ReasoningFeatureList: Comma-separated list of feature names to be used for reasoning sentences (e.g. recommendation explanations). Valid Values: Feature names, up to 512 chars
Thank you!
That page is missing references to content mentioned at other locations. See this page for a more complete guide...
https://azure.microsoft.com/en-us/documentation/articles/machine-learning-recommendation-api-documentation/
It describes Cold Items in the Rank Build section in the document as...
Features can enhance the recommendation model, but to do so requires the use of meaningful features. For this purpose a new build was introduced - a rank build. This build will rank the usefulness of features. A meaningful feature is a feature with a rank score of 2 and up. After understanding which of the features are meaningful, trigger a recommendation build with the list (or sublist) of meaningful features. It is possible to use these feature for the enhancement of both warm items and cold items. In order to use them for warm items, the UseFeatureInModel build parameter should be set up. In order to use features for cold items, the AllowColdItemPlacement build parameter should be enabled. Note: It is not possible to enable AllowColdItemPlacement without enabling UseFeatureInModel.
It also describes the ReasoningFeatureList in the Recommendation Reasoning section as...
Recommendation reasoning is another aspect of feature usage. Indeed, the Azure Machine Learning Recommendations engine can use features to provide recommendation explanations (a.k.a. reasoning), leading to more confidence in the recommended item from the recommendation consumer. To enable reasoning, the AllowFeatureCorrelation and ReasoningFeatureList parameters should be setup prior to requesting a recommendation build.
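The dependencies between these parameters, as stated in the quoted passages, can be summarised in a small check. This is only a sketch: check_build_parameters is a hypothetical helper working on a plain dictionary, it does not call the Recommendations API, and the feature names in the example are made up.

```python
def check_build_parameters(params):
    """Return a list of problems with a build-parameter dictionary (empty if none)."""
    problems = []

    # Cold item placement relies on features being used in the model.
    if params.get("AllowColdItemPlacement") and not params.get("UseFeatureInModel"):
        problems.append("AllowColdItemPlacement requires UseFeatureInModel")

    reasoning = params.get("ReasoningFeatureList", "")
    if reasoning:
        # Reasoning needs both AllowFeatureCorrelation and the feature list.
        if not params.get("AllowFeatureCorrelation"):
            problems.append("ReasoningFeatureList requires AllowFeatureCorrelation")
        # The parameter description limits the list to 512 characters.
        if len(reasoning) > 512:
            problems.append("ReasoningFeatureList is longer than 512 characters")

    return problems


print(check_build_parameters({
    "UseFeatureInModel": False,
    "AllowColdItemPlacement": True,
    "ReasoningFeatureList": "Author,Genre",  # hypothetical feature names
}))
```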

OpenMDAO 1.x: recording desvars, constraints and objective

How can you get information about which variables are design vars, objectives or constraints from the information saved by recorders? It would be useful to print this information to a file to track optimization progress during a run. It looks like the RecordingManager.record_iteration doesn't really allow for this at the moment, since you only pass the root system and a metadata dict meant for optimizer settings.
Would it be possible to add an argument to the RecordingManager.record_iteration called e.g. optproblem, which is a dictionary with dictionaries with desvars, constraints and objective?
A simple OptimizationRecorder could then dump out column formatted files with the quantities for easy plotting during the optimisation.
This is something we have on our list of to-do's for the near future. Our currently planned approach is to augment the meta-data of variables (which is already being saved) with labels identifying them as des-vars, objectives, and constraints. Then you could pull that information out as part of a custom case recorder if you want. We plan on doing it this way because it doesn't require modifying the recorder's API at all. I think we'll have something like this implemented in the next month or so.
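The column-formatted dump described in the question could look roughly like the sketch below. It is plain Python and does not use OpenMDAO's recorder API: write_iteration is a hypothetical helper, the optproblem dictionary of dictionaries follows the argument proposed in the question, and all values are assumed to be scalars.

```python
import io

GROUPS = ("desvars", "constraints", "objective")


def write_iteration(stream, iteration, optproblem, write_header=False):
    """Append one whitespace-separated row of scalar values to a text stream."""
    # Fix the column order: group by group, names sorted within each group.
    columns = [(group, name)
               for group in GROUPS
               for name in sorted(optproblem.get(group, {}))]
    if write_header:
        header = " ".join(f"{group}:{name}" for group, name in columns)
        stream.write(f"# iter {header}\n")
    values = " ".join(f"{optproblem[group][name]:.6e}" for group, name in columns)
    stream.write(f"{iteration} {values}\n")


# Example with made-up variable names and values.
buffer = io.StringIO()
state = {"desvars": {"x": 1.0}, "constraints": {"c1": -0.5}, "objective": {"f": 3.25}}
write_iteration(buffer, 0, state, write_header=True)
print(buffer.getvalue())
```

Writing to an open file instead of the StringIO buffer gives a file that can be plotted while the optimisation is still running.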
