Handling different document layouts using Kofax

I am new to the Kofax TotalAgility solution, but I am well aware of OCR, OMR, and recognition mechanisms.
I have two forms in one folder, A and B.
Both of them are identical, but due to manual scanning there is a slight axis shift, say a 20-pixel shift to the right, so the layouts differ slightly.
The layouts of image A and image B are different; the position of the form on the page is not fixed.
I know other solutions like ABBYY FineReader provide FlexiLayout, where we can handle this by finding text and setting up right/left/top/bottom offsets to automatically identify zones.
As I have just started learning Kofax TotalAgility, I am unaware of all the options provided by Kofax Transformation Designer.
My question is: which locator should I use? I am currently working with the Advanced Zone Locator, and for the document I set as the reference (image A), extraction works properly. But for the other (image B), text/box fields are not extracted because of the layout mismatch.
Can anyone point me in the right direction for handling this case properly?
I know I am asking for a direct option/solution; any help is highly appreciated.

In general, Kofax Transformations has two groups of locators:
Deterministic. You tell the locator precisely what to do, and how to do it (similar to an imperative approach when programming)
Probabilistic. You just tell your locator what to extract, and it works out the rest (based on AI).
Here's a (non-exhaustive) diagram I created the other day.
When working with forms, you might be tempted to rely on forms-specific locators such as the Advanced Zone Locator. While this locator can account for fields "moving around", for example due to images being jolted, zoomed, or distorted, there are certain limitations. Other locators don't have these limitations - the format locator for example allows you to define a certain pattern (a Regular Expression) that should be matched along with a keyword that has to be found somewhere around that pattern.
For your example, you could create a regex like M|F|X, and then define "Gender" as the keyword that needs to be present on the left.
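Outside of Transformation Designer, the same idea is easy to express in code. Below is a minimal Python sketch of format-locator-style logic (this is not Kofax's API; the word list with coordinates merely stands in for whatever your OCR step produces): find the pattern, then require the keyword to its left on roughly the same line.

```python
import re

# Assumed OCR output: each word with a top-left coordinate.
# Illustrative data only, not Kofax's object model.
words = [
    {"text": "Name",   "x": 100, "y": 200},
    {"text": "Gender", "x": 100, "y": 240},
    {"text": "M",      "x": 180, "y": 242},
]

PATTERN = re.compile(r"^(M|F|X)$")
KEYWORD = "Gender"

def locate(words, y_tolerance=10):
    """Return the first word matching PATTERN that has KEYWORD to its
    left on (roughly) the same line."""
    for w in words:
        if not PATTERN.match(w["text"]):
            continue
        for k in words:
            if (k["text"] == KEYWORD
                    and k["x"] < w["x"]
                    and abs(k["y"] - w["y"]) <= y_tolerance):
                return w
    return None

print(locate(words))  # -> {'text': 'M', 'x': 180, 'y': 242}
```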
However, any locator that's ruled by determinism follows Murphy's law - at some point that keyword might change. There could be different languages. And maybe additional letters for certain genders might be added; ultimately breaking your extraction logic.
Enter AI - while Murphy's law still applies when using Group Locators, the difference here is that users can train the system to pick up the new data. Said locator will automatically work out the best way to extract that piece of data. If you used a format locator, the customer would need to get back to you to add additional expressions, or have the keywords changed.
In your particular case, I'd try to use a Trainable Group Locator first. If you already know what you're looking for - for example SSNs that you have somewhere in a database, go for the Database Locator. Use Format Locators as a last resort, as tempting as they may be. Advanced Zone Locators are useful when you deal with forms, but I find myself using them almost exclusively for handprint or checkbox recognition.

Related

Do I use foreach for 2 different inspection checks in activity diagram?

I am new to activity diagrams and am currently trying to draw one based on a given description.
I am in doubt about a particular section, as I am unsure whether it should be 'split'.
Under the "Employee", the given description is as follows:
Employee enter in details about physical damage and cleanliness on the machine. For the cleanliness, there must be a statement to indicate that the problem is no longer an issue.
As such, I used a foreach to describe that there should be two checks - physical and cleanliness (see diagram in the link) - before it moves on to the next activity under the System, for the system to record the checks.
Thus, am I on the right track? Thank you in advance for any replies.
Your example is not valid UML. In order to make it proper, you need to enclose the fork/join in an expansion region, like so:
A fork/join does not accept any semantic labels. It just splits the control flow into several parallel flows, which join at the end.
However, this still seems odd, since you would probably have some control for the different inspections being entered. So I'd guess there's a decision which loops through multiple inspection entries. Personally, I use regions only for handling interrupts. ADs are nice to a certain level, but sometimes tabular text (as suggested by Cockburn) is just easier to write and read. Graphical programming is not the ultimate answer (unlike 42).
First, the 'NO' branch of the decision node must lead somewhere (at the end?).
Second, it differs depending on whether you want to show the process for ONE or MULTIPLE inspections. The most logical way is to draw the diagram for a single inspection, because you wrote 'inspection' without an 's'! If you want to represent more than one inspection, you can use decision and merge nodes to build a loop that stops when there are no more inspections.

Internal Search optimization for relevance

My team is using Solr and I have a question regarding it.
There are some search terms which don't give relevant results, or which miss results that should have been displayed. For example:
Searching for Macy's without the apostrophe, i.e. "Macys", doesn't give back any results for Macy's.
Searching for JPMorgan vs. JP Morgan gives different results.
Searching for IBM doesn't show results which contain its full name, i.e. International Business Machines.
How can we improve and optimize such cases so that the fix applies across the board, even to cases we haven't caught beyond these three?
Any suggestions?
All these issues are related to how you process the incoming text for those fields. You'll have to create a filter chain for the field - and possibly use multiple fields for different use cases and prioritize those using qf - that processes the input values to do what you want.
Your first case can be solved by using a PatternReplaceFilter to remove any apostrophes - depending on your use case and tokenizer you might want to use the CharFilter version, as it processes the text before it's split into multiple tokens.
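Conceptually, the char filter just rewrites the raw text before the tokenizer sees it. A rough Python emulation of that step (illustrative only, not actual Solr configuration):

```python
import re

def strip_apostrophes(text: str) -> str:
    # Emulates what a PatternReplaceCharFilterFactory configured with
    # pattern "['\u2019]" and an empty replacement would do to the raw
    # input before tokenization (covers straight and curly apostrophes).
    return re.sub(r"['\u2019]", "", text)

print(strip_apostrophes("Macy's"))  # -> "Macys"
```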
Your second case is a straightforward synonym filter or a WordDelimiterFilter, where you expand JPMorgan to "JP Morgan", or use the WordDelimiterFilter to expand case changes into separate tokens. That'll also allow you to search for JP and get JPMorgan-related entries. These might have different effects on the score; use debugQuery=true to see exactly how each term in your query contributes to it.
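In Python terms, splitting on case changes does something like the following (again just an emulation of the filter's behavior, not Solr code):

```python
import re

def split_on_case_change(token: str) -> list[str]:
    # Emulates WordDelimiterGraphFilter's splitOnCaseChange behaviour:
    # "JPMorgan" -> ["JP", "Morgan"]. The original token is kept too
    # (as with preserveOriginal), so both forms remain searchable.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]+|[a-z]+|\d+", token)
    return parts + [token] if len(parts) > 1 else [token]

print(split_on_case_change("JPMorgan"))  # -> ['JP', 'Morgan', 'JPMorgan']
```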
The third case is in general the same as the second case. You'll have to create a decent synonym word list for the terms used, and this is usually something you build as you get feedback from your users, from existing dictionaries and from domain knowledge. There's also the option of preprocessing text using NLP, or in this case, something as primitive as indexing the initials of any capitalized words after each other could help.
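As a sketch, index-time synonym expansion boils down to a lookup table like the one below (the dictionary mirrors what a synonyms.txt entry such as `IBM, International Business Machines` gives you; this is an emulation, not Solr's actual implementation):

```python
# Tiny index-time synonym expansion, emulating what Solr's
# SynonymGraphFilter does with a two-way synonym entry.
SYNONYMS = {
    "ibm": ["international business machines"],
    "international business machines": ["ibm"],
}

def expand(term: str) -> list[str]:
    # Index the original term plus every synonym mapped to it.
    return [term] + SYNONYMS.get(term.lower(), [])

print(expand("IBM"))  # -> ['IBM', 'international business machines']
```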

Smart search for acronyms in Salesforce

In Salesforce's Service Cloud one can enable the out-of-the-box search function, where the user enters a term and the system searches all parts of the database for a match. I would like to enable smart searching of acronyms, so that if I spell out an organization's name, the search functionality will also search for associated acronyms in the database. For example, if I type in American Automobile Association, I would also get results that contain both "American Automobile Association" and "AAA".
I imagine such a script would involve declaring that if the term being searched contains one or more spaces or periods, take the first letter of the first word and concatenate it with the letters that follow subsequent spaces or periods.
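That rule is straightforward to sketch. Here it is in Python (the real implementation would presumably be Apex or a search preprocessor; the function name is mine):

```python
import re

def acronym(term: str) -> str | None:
    # Split on runs of spaces and/or periods, per the rule described
    # above, then take the first letter of each resulting word.
    words = [w for w in re.split(r"[.\s]+", term) if w]
    if len(words) < 2:
        return None  # single word: no acronym to derive
    return "".join(w[0].upper() for w in words)

print(acronym("American Automobile Association"))  # -> "AAA"
print(acronym("J.P. Morgan"))                      # -> "JPM"
```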
I have unsuccessfully tried to find scripts for this or articles on enabling this functionality in Salesforce. Any guidance would be appreciated.
Interesting question! I don't think there's a straightforward answer but as it's standard search functionality, not 100% programming related - you might want to cross-post it to salesforce.stackexchange.com
Let's start with searchable fields list: https://help.salesforce.com/articleView?id=search_fields_business_accounts.htm&type=0
In Setup there's standard functionality for Synonyms, quite easy to use. It's not a silver bullet though, applies only to certain objects like Knowledge Base (if you use it). Still - it claims to work on Cases too so if there's "AAA" in Case description it should still be good enough?
You could also check out the trick with marking a text field as indexed and/or external ID and adding there all your variations / acronyms: https://success.salesforce.com/ideaView?id=08730000000H6m2 This is more work, to prepare / sanitize your data upfront but it's not a bad idea.
A similar idea would be to use Tags, although that could explode in size very quickly. It's ridiculous to create a tag for every single company.
You can do some really smart things in data deduplication rules. Too much to write it all here, check out the trailhead: https://trailhead.salesforce.com/en/modules/sales_admin_duplicate_management/units/sales_admin_duplicate_management_unit_2 No idea if it impacts search though.
If you suffer from bad address data there are State & Country picklists, no more mess with CA / California / SoCal... https://resources.docs.salesforce.com/204/latest/en-us/sfdc/pdf/state_country_picklists_impl_guide.pdf Might not help with Name problem...
Data.com cleanup might help. Paid service I think, no idea if it affects search too. But if enabling it can bring these common abbreviations into your org - might be better than reinventing the wheel.

Identifying teeth area within a mouth region in an image

I am trying to do an image manipulation wherein the user would be prompted to enclose the mouth portion within an image. Once the user does that, my application should identify the pixels representing the teeth (the color varying from white to yellow), and then I would like to brighten only those pixels. Could anyone give me guidance on how to proceed?
Your question is, quite honestly, very broad, as an adequate answer will touch on a large number of areas.
Nevertheless, what you are trying to attempt is called Pattern Recognition. More specifically, your problem is geared towards image-analysis, dealing mainly in Template Matching:
Template matching is a technique in digital image processing for finding small parts of an image which match a template image. It can be used in manufacturing as a part of quality control, a way to navigate a mobile robot, or as a way to detect edges in images.
The Template Matching page has a C-like language sample algorithm which demonstrates what you are attempting to do (identify a specific color within an image).
As for how to go about this, generally speaking you will have to load an image, store it into an array then try to manipulate it as the algorithm suggests:
One way to perform template matching on color images is to decompose the pixels into their color components and measure the quality of match between the color template and search image using the sum of the absolute differences (SAD) computed for each color separately.
Of course, there are numerous projects in various languages that do that for you. My suggestion is to read up a bit more on the topic, pick a language, and attempt a solution using libraries as necessary.
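To make the SAD idea concrete, here is a naive NumPy sketch of the formula quoted above (purely illustrative and slow; in practice you would reach for something like OpenCV's matchTemplate):

```python
import numpy as np

def sad_match(image: np.ndarray, template: np.ndarray):
    """Slide `template` over `image` (both H x W x 3 uint8 arrays) and
    return the top-left (x, y) with the smallest sum of absolute
    differences, computed per color channel and summed."""
    ih, iw, _ = image.shape
    th, tw, _ = template.shape
    best_pos, best_sad = None, np.inf
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            window = image[y:y + th, x:x + tw].astype(np.int32)
            sad = np.abs(window - template.astype(np.int32)).sum()
            if sad < best_sad:
                best_sad, best_pos = sad, (x, y)
    return best_pos, best_sad

# Toy example: find a 2x2 white patch planted in a random image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, (32, 32, 3), dtype=np.uint8)
img[10:12, 20:22] = 255
tmpl = np.full((2, 2, 3), 255, dtype=np.uint8)
print(sad_match(img, tmpl))  # -> ((20, 10), 0)
```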
One book that you might find very helpful is the classic Phillips, Image Processing in C, even if you don't want to use C. Why? Because it pores over a lot of the algorithmic details of how these techniques work and how to implement them. And it's free, too.

Preferable Tag Cloud Visualization Formats

Out of curiosity, I would love to know which tag cloud formats best serve the purpose of discovering more and more (relevant) content.
I am aware of 3 formats, but don't know which one is the best.
1) The Delicious one - color shading
2) The standard one - font size variations
3) The one on this site - numbers showing importance/usage
So which one do you prefer, and why?
Edit:
Thanks to the answers below, I now have much more understanding of tag cloud visualization techniques.
4) Parallel Tag Clouds - a simple use of the parallel coordinates technique. I find it more organized and readable.
5) Voronoi diagram - more useful for identifying tag relationships and making decisions based on them. It doesn't serve our purpose of discovering relevant content.
6) Mind maps - they are good and can be employed to filter content step by step.
I found some more interesting techniques here - http://www.cs.toronto.edu/~ccollins/research/index.html
I really do think that depends on the content of the information and the audience. What's relevant to one is not relevant to another. If an audience is more specialized, then they will be more likely to think along the same lines, but it would still need to be analyzed and catered to by the content provider.
There are also multiple paths that a person can take to "discover more". Take the tag "DNS" for example. You could drill down to more specific details like "UDP Port 53" and "MX Record", or you could go sideways with terms like "IP address" "Hostname" and "URL". A Voronoi diagram shows clusters, but wouldn't handle the case where general terms could be related to many concepts. Hostname mapping to "DNS", "HTTP", "SSH" etc.
I've noticed that in certain tag clouds there's usually one or two items that are vastly larger than the others. Those sorts of things could be served by a mind map, where one central concept has others radiating out from it.
For the cases of lots of "main topics" where a mind map is inappropriate, there are parallel coordinates but that would be baffling to many net users.
I think that if we found an extremely well organized way of sorting clusters of tags while preserving links between generalities and specificities, that would be somewhat helpful to AI research.
In terms of which I personally prefer, I think the numeric approach is nice because infrequently referenced tags are still presented at a readable font size. I also think SO does it this way because it has vastly more tags to cover than the average size-based cloud.
I would go with #2 out of the options you listed above.
1 - The human eye recognizes and comprehends size differences much more effectively than color when the color scale is along the same spectrum (i.e., various blues as opposed to discrete individual colors).
3 - Requires the user to scan the full list and mathematically compare each individual number while scanning. There is no real meaningful relationship between tags without a lot of work on the user's part.
So, going with #2, there are several considerations to take into account:
Keep the tags alphabetical. This affords the user another method of searching and establishes a known relationship between each (assuming they know the alphabet!). If they're unordered, it's just a crapshoot to find a single one.
If size comparison is absolutely critical (this usually isn't the case, as you can scale up each level by a certain percentage or pixel amount), use a monospaced font. Otherwise, certain letter combinations may end up looking larger than they actually are.
Don't include any commas, pipes, or other dividers. You're already going to have a lot of data in a small area - no need to clutter it up with debris. Space the tags out with a decent amount of padding, of course. Just don't double the number of visual elements by adding more than just the data.
Set a min/max font size and scale between those. There are situations where one tag may be so popular that visually it may appear exponentially larger than the others. Likewise, you don't want a tag to end up rendering at 1px! Set the min/max and adjust between them as necessary (see the sketch below).
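A minimal sketch of that clamped scaling, using a log scale since tag counts tend to be heavy-tailed (the function name and pixel bounds are my own choices):

```python
import math

def font_size(count: int, min_count: int, max_count: int,
              min_px: int = 11, max_px: int = 28) -> int:
    """Map a tag's usage count to a font size clamped to [min_px, max_px].
    Log scaling keeps one runaway tag from dwarfing everything else."""
    if max_count <= min_count:
        return min_px
    t = (math.log(count) - math.log(min_count)) / \
        (math.log(max_count) - math.log(min_count))
    return round(min_px + t * (max_px - min_px))

counts = {"python": 900, "solr": 120, "kofax": 8}
lo, hi = min(counts.values()), max(counts.values())
for tag, c in counts.items():
    print(tag, font_size(c, lo, hi))  # python 28, solr 21, kofax 11
```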
A size-adjusted Voronoi diagram - it shows which tags are inter-related.
My favorite tag cloud format is the Wordle format. It looks great and it also does a pretty good job of fitting a lot of tags in a small space.
