Privacy-protected RPC channels

As the title says, I'm wondering what these are. I know what RPC is, but what do the prefixed adjectives mean?
P.S.: I came across this term while reading about DPAPI on MSDN: http://msdn.microsoft.com/en-IN/library/ms995355.aspx#windataprotection-dpapi_topic04 (bullet point 6).

I am not sure, but maybe they refer to this: https://datatracker.ietf.org/doc/html/draft-ietf-nfsv4-rpcsec-gssv3-04

Related

Smart search for acronyms in Salesforce

In Salesforce's Service Cloud one can enable the out-of-the-box search function, where the user enters a term and the system searches all parts of the database for a match. I would like to enable smart searching of acronyms, so that if I type an organization's name, the search functionality will also search for associated acronyms in the database. For example, if I type in "American Automobile Association", I would also get results that contain "AAA" in addition to those that contain "American Automobile Association".
I imagine such a script would check whether the search term contains one or more spaces or periods and, if so, build the acronym by taking the first letter of the first word and concatenating it with the letters that follow each subsequent space or period.
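In rough C, that rule would look something like the sketch below (purely illustrative; inside Salesforce it would have to be Apex, but the string logic is the same):

    #include <ctype.h>
    #include <stdio.h>

    /* Build an acronym from a name: take the first letter, plus every
     * letter that follows a space or a period, so that
     * "American Automobile Association" becomes "AAA". */
    static void acronym(const char *name, char *out, size_t outsz) {
        size_t n = 0;
        int take_next = 1;                  /* take the very first letter */
        for (const char *p = name; *p != '\0' && n + 1 < outsz; p++) {
            if (*p == ' ' || *p == '.') {
                take_next = 1;              /* next letter starts a new word */
            } else {
                if (take_next && isalpha((unsigned char)*p))
                    out[n++] = (char)toupper((unsigned char)*p);
                take_next = 0;
            }
        }
        out[n] = '\0';
    }

    int main(void) {
        char buf[32];
        acronym("American Automobile Association", buf, sizeof buf);
        printf("%s\n", buf);                /* prints: AAA */
        return 0;
    }

The search would then be run against both the term and the generated acronym.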
I have unsuccessfully tried to find scripts for this or articles on enabling this functionality in Salesforce. Any guidance would be appreciated.
Interesting question! I don't think there's a straightforward answer, but as it's standard search functionality and not 100% programming related, you might want to cross-post it to salesforce.stackexchange.com
Let's start with the searchable fields list: https://help.salesforce.com/articleView?id=search_fields_business_accounts.htm&type=0
In Setup there's standard functionality for Synonyms, which is quite easy to use. It's not a silver bullet, though; it applies only to certain objects, like Knowledge Base (if you use it). Still, it claims to work on Cases too, so if "AAA" appears in a Case description it might be good enough?
You could also check out the trick of marking a text field as indexed and/or external ID and putting all your variations/acronyms in it: https://success.salesforce.com/ideaView?id=08730000000H6m2 This is more work, since you have to prepare/sanitize your data upfront, but it's not a bad idea.
A similar idea would be to use Tags, although that could explode in size very quickly; creating a tag for every single company isn't practical.
You can do some really smart things in data deduplication rules. It's too much to write up here; check out the Trailhead module: https://trailhead.salesforce.com/en/modules/sales_admin_duplicate_management/units/sales_admin_duplicate_management_unit_2 No idea whether it impacts search, though.
If you suffer from bad address data, there are State & Country picklists; no more mess with CA / California / SoCal... https://resources.docs.salesforce.com/204/latest/en-us/sfdc/pdf/state_country_picklists_impl_guide.pdf This might not help with the Name problem, though.
Data.com cleanup might help. It's a paid service, I think, and I have no idea whether it affects search too, but if enabling it can bring these common abbreviations into your org, it might be better than reinventing the wheel.

UML Modelling Question

I am in the process of developing some use cases for a mobile mapping/GPS app. Users will be able to use this app much as they use Google Maps. I was wondering if anyone had valuable input on some possible use cases.
Here are some I came up with myself:
1) Get Current Location
2) Set Destination Location
3) Create Fastest Route
4) View Alternative Routes
5) Traffic Estimation on Routes
If someone could help me elaborate on these, or comment on my direction, that would be helpful!
My first impulse was to flag your question as "too broad", since you're basically asking for help with your requirements analysis. But I'll give a few hints.
Your 5 use cases don't look bad. But they appear to be just a first rough sketch of the functionality of your app, and they need to be refined. A good model, be it UML or anything else, must help its reader gain some insight, and these 5 use cases could be named by any child who has seen a navigation device once in her life. To make them meaningful, questions like the following should be asked; they will probably lead to a more detailed use case analysis:
How are destination locations selected? If there is more than one place called Jacksonville, how will the user be informed, and how will she select the right one? Does selecting the location consist of more than one step, say country - city - road - block, to assist the user?
How do map data get into the application?
What kind of alternative routes are considered and how should they be calculated?
How will traffic data get into the application?
Try to put yourself into the developer's position. Which questions will she need to have clarified to build the right application?

Viewing Linux TCP TCB

I need to find out what information is kept in the TCP transmission control block (TCB); specifically, I need to find out what sequence numbers are used for any particular session.
I have posted to other forums, looked through procfs, searched Google, and sent myself links from lmgtfy (dot) com :) No luck.
If there is no tool and no hints in procfs, would it be possible to somehow find out where this sort of information lives in memory and gather it from there, such as by using dd to copy /dev/mem?
Thanks in advance for any help!
Well, I guess you first need to know what a sequence number is and why it's used; then you can look at a particular implementation of sequence number generation.
The sequence number is a 32-bit field used to number the data in the stream so that every segment can be acknowledged, and acknowledgement is an important TCP feature for maintaining connection reliability. Full details can be found in the TCP RFC (http://www.ietf.org/rfc/rfc793.txt, section 3.3).
Now, if you want to find out how Linux does it, look at net/ipv4/tcp_ipv4.c::tcp_v4_init_sequence(), which generates the ISN (Initial Sequence Number) whenever a new connection is established; how later sequence numbers advance is explained in the RFC. Reading the implementation of tcp_v4_init_sequence() alongside the RFC will help you understand how sequence numbers are used and implemented. Hope this helps!
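If what you actually need is to see the sequence numbers of a particular session, note that the TCB itself isn't exposed through procfs, but the numbers are visible on the wire. Below is a minimal libpcap sketch that prints the sequence number of each captured TCP segment (assumptions: interface eth0, plain Ethernet/IPv4 framing, run as root, compile with -lpcap):

    #include <pcap.h>
    #include <netinet/ip.h>
    #include <netinet/tcp.h>
    #include <arpa/inet.h>
    #include <stdio.h>

    /* Called once per captured packet: skip the 14-byte Ethernet header
     * and the variable-length IP header, then print the TCP sequence number. */
    static void handler(u_char *user, const struct pcap_pkthdr *h,
                        const u_char *bytes) {
        const struct ip *ip = (const struct ip *)(bytes + 14);
        const struct tcphdr *tcp =
            (const struct tcphdr *)((const u_char *)ip + ip->ip_hl * 4);
        printf("%s:%u seq=%u\n", inet_ntoa(ip->ip_src),
               (unsigned)ntohs(tcp->source), (unsigned)ntohl(tcp->seq));
        (void)user; (void)h;
    }

    int main(void) {
        char errbuf[PCAP_ERRBUF_SIZE];
        pcap_t *p = pcap_open_live("eth0", 65535, 0, 1000, errbuf);
        if (p == NULL) { fprintf(stderr, "%s\n", errbuf); return 1; }

        /* Capture only TCP so the handler can assume a TCP header. */
        struct bpf_program fp;
        pcap_compile(p, &fp, "tcp", 1, PCAP_NETMASK_UNKNOWN);
        pcap_setfilter(p, &fp);

        pcap_loop(p, 20, handler, NULL);    /* print 20 segments, then stop */
        pcap_close(p);
        return 0;
    }

From the command line, tcpdump -nS tcp shows the same thing (the -S flag prints absolute rather than relative sequence numbers) without writing any code.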

What algorithm does Freebase use to match by name?

I'm trying to build a local version of the Freebase search API using their quad dumps, and I'm wondering what algorithm they use to match names. As an example, if you go to freebase.com and type in "Hiking", you get:
"Apo Hiking Society"
"Hiking"
"Hiking Georgia"
"Hiking Virginia's national forests"
"Hiking trail"
Wow, a lot of guesses! I hope I don't muddy the waters too much by not guessing too.
The auto-complete box is basically powered by Freebase Suggest, which is powered, in turn, by the Freebase Search service. Strings which are indexed by the search service for matching include: 1) the name, 2) all aliases in the given language, 3) link anchor text from the associated Wikipedia articles and 4) identifiers (called keys by Freebase), which include things like Wikipedia article titles (and redirects).
How the various things are weighted/boosted hasn't been disclosed, but you can get a feel for it by playing with the service for a while. As you can see from the API, there's also the ability to do filtering/weighting by types and other criteria, and this can come into play depending on the context. For example, if you're adding a record label to an album, topics which are typed as record labels will get a boost relative to things which aren't (but you can still get to things of other types, to allow for the use case where your target topic hasn't had the appropriate type applied yet).
So that gives you a little insight into how their service works, but why not build a search service that does what you need since you're starting from scratch anyway?
BTW, pre-Google, the Metaweb search implementation was built on top of Lucene, so you could definitely do worse than using that as your starting point. You can read some of the details in the mailing list archive.
They probably use an inverted index over selected fields, such as the English name, the aliases, and the Wikipedia snippet displayed. In your application you can achieve that using something like Lucene.
For the algorithm side, I find the following paper a good overview:
Zobel and Moffat (2006): "Inverted Files for Text Search Engines".
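To make the inverted index idea concrete, here's a toy sketch in C (fixed-size tables and naive tokenization, purely illustrative; an engine like Lucene layers analyzers, ranking, and compressed postings on top of the same structure):

    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>

    #define MAX_TERMS 256   /* fixed capacities, fine for a demo */
    #define MAX_DOCS  16

    /* One posting list per term: the ids of the documents containing it. */
    static char terms[MAX_TERMS][32];
    static int  postings[MAX_TERMS][MAX_DOCS];
    static int  counts[MAX_TERMS];
    static int  nterms;

    /* Find a term's id, adding the term if it's new. */
    static int term_id(const char *t) {
        for (int i = 0; i < nterms; i++)
            if (strcmp(terms[i], t) == 0) return i;
        strncpy(terms[nterms], t, 31);
        return nterms++;
    }

    /* Tokenize a document on non-letters, lowercase each token,
     * and append the doc id to that token's posting list. */
    static void index_doc(int doc, const char *text) {
        char tok[32]; int n = 0;
        for (const char *p = text;; p++) {
            if (isalpha((unsigned char)*p) && n < 31) {
                tok[n++] = (char)tolower((unsigned char)*p);
            } else if (n > 0) {
                tok[n] = '\0'; n = 0;
                int id = term_id(tok);
                if (counts[id] == 0 || postings[id][counts[id] - 1] != doc)
                    postings[id][counts[id]++] = doc;
            }
            if (*p == '\0') break;
        }
    }

    int main(void) {
        const char *docs[] = { "Apo Hiking Society", "Hiking",
                               "Hiking Georgia", "Hiking trail" };
        for (int d = 0; d < 4; d++) index_doc(d, docs[d]);

        /* A query is now just a posting-list lookup. */
        int id = term_id("hiking");
        for (int i = 0; i < counts[id]; i++)
            printf("match: %s\n", docs[postings[id][i]]);
        return 0;
    }

A query is then a lookup of the term's posting list; for prefix matching in an autocomplete box you'd keep the term dictionary sorted, or in a trie, as the next answer guesses.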
Most likely it's a trie with lexicographical order.
There are a number of algorithms available: Boyer-Moore, Smith-Waterman-Gotoh, Knuth-Morris-Pratt, etc. You might also want to read up on edit distance algorithms such as Levenshtein. You will need to play around to see which best suits your purpose.
One implementation of such algorithms is the SimMetrics library by the University of Sheffield.
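As an illustration, here is the classic dynamic-programming Levenshtein distance in C (just a sketch; libraries like SimMetrics ship tuned versions of this and the other algorithms):

    #include <stdio.h>
    #include <string.h>

    /* Edit distance: the minimum number of single-character insertions,
     * deletions, and substitutions needed to turn a into b.
     * Two rolling rows, so O(len(b)) space; assumes len(b) < 256. */
    static int levenshtein(const char *a, const char *b) {
        int la = (int)strlen(a), lb = (int)strlen(b);
        int prev[256], cur[256];

        for (int j = 0; j <= lb; j++) prev[j] = j;
        for (int i = 1; i <= la; i++) {
            cur[0] = i;
            for (int j = 1; j <= lb; j++) {
                int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
                int del = prev[j] + 1, ins = cur[j - 1] + 1;
                int sub = prev[j - 1] + cost;
                cur[j] = del < ins ? del : ins;
                if (sub < cur[j]) cur[j] = sub;
            }
            memcpy(prev, cur, sizeof(int) * (lb + 1));
        }
        return prev[lb];
    }

    int main(void) {
        printf("%d\n", levenshtein("hiking", "biking"));       /* 1 */
        printf("%d\n", levenshtein("hiking", "hiking trail")); /* 6 */
        return 0;
    }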

Open source projects for email scrubbing: generating structured data from an unstructured source?

I don't know where to start on this one, so hopefully you can clear up my question. I have a project where email will be searched for specific words/patterns and the results stored in a structured manner, something like what TripIt does.
The article states that they developed a DataMapper:
The DataMapper is responsible for taking inbound email messages
addressed to plans [at] tripit.com and transforming them from the
semi-structured format you see in your mail reader into a highly
structured XML document.
There is a comment that also states:
If you're looking to build this yourself, reading a little bit about
Wrappers and Wrapper Induction might be helpful
I Googled and read about wrapper induction, but the definition was just too broad and didn't help me understand how one would go about solving such a problem.
Is there some open source project out there that does similar things?
There are a couple of different things you can do to accomplish this.
The first part, getting access to the email content, I'll not cover here. Basically, I'll assume that you have access to the text of the emails; if you don't, there are libraries that allow you to connect Java to an email box, such as Camel (http://camel.apache.org/mail.html).
So now you've got the email; then what?
A handy thing that could help is that LingPipe (http://alias-i.com/lingpipe/) has an entity recognizer that you can populate with your own terms. Specifically, look at their extraction tutorials and their dictionary extractor (http://alias-i.com/lingpipe/demos/tutorial/ne/read-me.html). Inside the LingPipe dictionary extractor (http://alias-i.com/lingpipe/docs/api/com/aliasi/dict/ExactDictionaryChunker.html) you'd simply load the terms you're interested in and use it to associate labels with an email.
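ExactDictionaryChunker is Java, but the idea it implements fits in a few lines. Here is a toy C sketch of dictionary-based matching (the terms and labels are invented examples, and unlike a real chunker it ignores token boundaries):

    #include <stdio.h>
    #include <string.h>
    #include <strings.h>   /* strncasecmp */

    /* A dictionary entry: a phrase to look for and the label to emit. */
    struct entry { const char *term; const char *label; };

    static const struct entry dict[] = {
        { "United Airlines",     "AIRLINE" },
        { "San Francisco",       "CITY" },
        { "confirmation number", "FIELD" },
    };

    /* Scan the text for every dictionary term, case-insensitively, and
     * report each match with its character offsets and its label. */
    static void chunk(const char *text) {
        for (size_t i = 0; i < sizeof dict / sizeof dict[0]; i++) {
            size_t tlen = strlen(dict[i].term);
            for (const char *p = text; *p != '\0'; p++)
                if (strncasecmp(p, dict[i].term, tlen) == 0)
                    printf("[%ld..%ld] %s -> %s\n",
                           (long)(p - text), (long)(p - text + tlen),
                           dict[i].term, dict[i].label);
        }
    }

    int main(void) {
        chunk("Your United Airlines confirmation number for San Francisco is ...");
        return 0;
    }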
You might also find the following question helpful: Dictionary-Based Named Entity Recognition with zero edit distance: LingPipe, Lucene or what?
This is really a very broad question, but I can try to give you some general ideas, which might be enough to get started. Basically, it sounds like you're talking about an elaborate parsing problem: scanning through the text and looking to apply meaning to specific chunks. Depending on what exactly you're looking for, you might get some good mileage out of a few regular expressions to start; things like phone numbers, email addresses, and dates have fairly standard structures that should be matchable. Other data points might benefit from indicator words; the phrase "departing from" might indicate that what follows is an address. The natural language processing community also has a large tool set available for text processing; check out things like part-of-speech taggers and semantic analyzers if they're appropriate to what you're trying to do.
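For example, a sketch of the regular-expression approach using POSIX regex in C (the patterns are deliberately simplistic and only meant to show the shape of the technique):

    #include <regex.h>
    #include <stdio.h>

    /* Run one compiled pattern over the text and print every match,
     * advancing past each hit before searching again. */
    static void extract(const char *label, const char *pattern,
                        const char *text) {
        regex_t re;
        regmatch_t m;
        if (regcomp(&re, pattern, REG_EXTENDED) != 0) return;
        for (const char *p = text; regexec(&re, p, 1, &m, 0) == 0; p += m.rm_eo)
            printf("%s: %.*s\n", label, (int)(m.rm_eo - m.rm_so), p + m.rm_so);
        regfree(&re);
    }

    int main(void) {
        const char *mail =
            "Departing from JFK on 2012-03-14. Questions? Call 555-867-5309 "
            "or write to support@example.com.";
        extract("date",  "[0-9]{4}-[0-9]{2}-[0-9]{2}", mail);
        extract("phone", "[0-9]{3}-[0-9]{3}-[0-9]{4}", mail);
        extract("email", "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}", mail);
        return 0;
    }

Each extractor is one rule in the iterative loop described next: run it over test data, inspect the misses, and tighten the pattern.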
Armed with those techniques, you can follow a basic iterative development process: For each data point in your expected output structure, define some simple rules for how to capture it. Then, run the application over a batch of test data and see which samples didn't capture that datum. Look at the samples and revise your rules to catch those samples. Repeat until the extractor reaches an acceptable level of accuracy.
Depending on the specifics of your problem, there may be machine learning techniques that can automate much of that process for you.
