Haskell data structure with efficient inexact lookup by key? - haskell

I have data keyed by Data.Time.Calendar.Day and need to look it up efficiently. Some dates are missing; when I look up a missing key, I want to get the data attached to the closest existing key, somewhat like std::map::lower_bound.
Any suggestions for existing libraries that can do this? I searched around for a while and only found maps supporting exact key lookups.
Thanks.

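Did you check Data.Map.Lazy? In particular, you could use lookupLE (the largest key less than or equal to the one given) and lookupGT (the smallest key strictly greater), or their siblings lookupLT and lookupGE; lookupGE is the closest match to std::map::lower_bound. Each of these runs in O(log n), and Data.Map.Strict exports the same functions.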
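A minimal sketch of the lookup described above, assuming "closest" means the smallest difference in days and that ties go to the earlier date. The helper name lookupClosest and the sample data are made up for illustration; only lookupLE, lookupGE, diffDays and fromGregorian come from the containers and time libraries.

```haskell
import qualified Data.Map.Strict as Map
import Data.Time.Calendar (Day, diffDays, fromGregorian)

-- | Hypothetical helper: return the entry whose key is nearest to the
-- requested day. Ties go to the earlier date (an arbitrary choice).
lookupClosest :: Day -> Map.Map Day a -> Maybe (Day, a)
lookupClosest day m =
  case (Map.lookupLE day m, Map.lookupGE day m) of
    (Nothing, Nothing) -> Nothing   -- empty map
    (Just lo, Nothing) -> Just lo   -- nothing at or after the day
    (Nothing, Just hi) -> Just hi   -- nothing at or before the day
    (Just lo, Just hi)
      -- diffDays a b is a - b in days, so both sides are non-negative here.
      | diffDays day (fst lo) <= diffDays (fst hi) day -> Just lo
      | otherwise                                      -> Just hi

-- Made-up sample data with a gap between the two dates.
sample :: Map.Map Day String
sample = Map.fromList
  [ (fromGregorian 2024 1 1,  "a")
  , (fromGregorian 2024 1 10, "b")
  ]

main :: IO ()
main = do
  print (lookupClosest (fromGregorian 2024 1 3) sample)  -- nearer to 1 January
  print (lookupClosest (fromGregorian 2024 1 9) sample)  -- nearer to 10 January
  print (lookupClosest (fromGregorian 2024 2 1) sample)  -- past the last key
```

An exact hit is returned unchanged, because lookupLE then finds the key itself at distance zero. Each call does two O(log n) lookups.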

A suitable combination of Data.Map's splitLookup and findMin/findMax will do the trick.
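One way this combination might look, as a rough sketch. It uses lookupMax and lookupMin instead of the findMax and findMin named above, because the find variants raise an error on an empty map while the lookup variants return Nothing (they need containers 0.5.9 or newer). The helper name neighbours and the sample data are made up for illustration.

```haskell
import qualified Data.Map.Strict as Map
import Data.Time.Calendar (Day, fromGregorian)

-- | Hypothetical helper: the nearest entry before the day, the exact
-- entry if there is one, and the nearest entry after the day.
neighbours :: Day -> Map.Map Day a -> (Maybe (Day, a), Maybe a, Maybe (Day, a))
neighbours day m =
  let (before, exact, after) = Map.splitLookup day m
  in  (Map.lookupMax before, exact, Map.lookupMin after)

main :: IO ()
main = do
  -- Made-up sample data with a gap between the two dates.
  let sample = Map.fromList
        [ (fromGregorian 2024 1 1,  "a")
        , (fromGregorian 2024 1 10, "b")
        ]
  print (neighbours (fromGregorian 2024 1 5) sample)   -- missing key
  print (neighbours (fromGregorian 2024 1 10) sample)  -- exact hit
```

The caller can then pick whichever neighbour suits the application, for example the earlier one for "last known value" semantics.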

Related

Alternative Hash Functions for Dicts

Currently I am using hash(frozenset(my_dict.items())) to hash my dicts.
However, once in a while I seem to have had issues with it (also explained here: https://stackoverflow.com/a/5884123/2516892), and I would like to know of alternative approaches that are considered more stable/safe.
What would be a good recommendation?

Keep a list of hashes

I am working on a small project to keep my skills from completely rusting.
I am generating a lot of hashes (in this case MD5) and I need to check whether I've seen a hash before, so I wanted to keep them in a list.
What's the best way to store them so that I can check whether one already exists prior to doing calculations?
The hash itself is already a key of sorts. Your best bet is a hash table. In a properly implemented hash table, you can check for the existence of a key in constant time on average. Common hash table implementations with this feature are C#'s Dictionary, Python's dict type, PHP arrays (which are actually maps, not arrays), Perl's hashes (%) and Ruby's Hash. If you included details of what language you're working in, an example wouldn't be too hard to look up.
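Since the question does not name a language, here is a rough sketch of the "have I seen this before" check in Haskell. It is an illustration under assumptions, not the answer's exact suggestion: Data.Set from containers is a balanced tree with O(log n) membership rather than a hash table, and Data.HashSet from the unordered-containers package offers member and insert with hashing instead. The function name partitionSeen and the sample digests are made up.

```haskell
import qualified Data.Set as Set

-- | Hypothetical helper: split a list of digests into (first occurrences,
-- repeats), both in input order. Digests are plain strings here.
partitionSeen :: [String] -> ([String], [String])
partitionSeen = go Set.empty
  where
    go _ [] = ([], [])
    go seen (h : hs)
      | h `Set.member` seen =
          -- Already seen: record it as a repeat, leave the set unchanged.
          let (new, dup) = go seen hs in (new, h : dup)
      | otherwise =
          -- First sighting: remember it before processing the rest.
          let (new, dup) = go (Set.insert h seen) hs in (h : new, dup)

main :: IO ()
main = print (partitionSeen ["a1", "b2", "a1", "c3", "b2"])
```

Only the digests in the first list would need the expensive calculation; the second list holds the ones that can be skipped.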

Solr - Enriching the TermsComponent answer

I'm using Solr 3.5.0 (with WebSphere Commerce). While a search is being typed, Commerce uses the suggestion tool to suggest (auto-complete) search terms based on the letters already typed in the search box.
Currently WebSphere Commerce uses Solr's TermsComponent, but one of my new requirements is to be able to enrich the list of suggested terms.
Do you know if there is any way to do that, for example by creating a plain-text dictionary or by using another Solr component?
Thanks for reading,
and for your help.
Regards,
Dekx.
I think a plain-text dictionary probably wouldn't be a usable data source (even if you could use it, searching linearly through a plain-text file would probably be too slow). If you create an index from your dictionary, you could probably incorporate it in the TermsComponent as a shard (see the TermsComponent documentation, under the heading "Distributed Search Support").
I don't believe TermsComponent supports searching multiple fields, so you'll want to make sure the terms in the dictionary use the same field name as the field you want to search (that is, if you are looking at the "name" field in the index, then create a "name" field in your indexed dictionary as well, rather than a "dictionaryentry" field).
To my mind, though, I fail to see what the value of this would be. Generally, the TermsComponent is intended to look at the terms available in the index on that field. "Enriching" it with more data would just provide suggestions that the search won't actually be able to find. Of course, I don't really know about your search implementation, but in most cases that would certainly be my thought.

Best way to query a database for nearby lat/longs?

I have a set of lat/longs stored in a db. I want to query the db and return documents that are within range of another lat/long. I know how to determine the distance between two lat/long pairs, but I don't want to have to do that for every entry in the db. What is the best way to achieve this?
Thanks very much.
Perhaps you could use Geospatial Indexing to achieve this...
If that's no good, I actually built a node.js addon for nearest-neighbor searches, called node-kdtree. It can find the closest n points and is fairly quick, since it is just a wrapper around an underlying C library. But it sounds like it would be a poor choice for your needs, because you would have to pull all of your data out of the DB first in order to process it. With the limited information I have, I suggest that you try the built-in functionality of MongoDB first.

riak search result excerpt

Does anybody know if Riak Search has the ability to generate excerpts with highlighted matches, similar to what Lucene does?
Riak Search doesn't expose this functionality out of the box, but with a little work you can create a rough approximation.
Riak Search allows you to feed search results into a MapReduce job. If you do this, then your Map or Reduce function will also get a list of token positions in the document that matched the query (this is exposed as keydata, http://www.basho.com/search.php?q=keydata). Using these positions, you can write code to mark up the document or excerpt portions of text.
I think this functionality will hardly ever be implemented in Riak, since its philosophy implies that it doesn't care what exactly is stored in the values and therefore does not process them in any meaningful way, except for providing some metadata like indices.

Resources