Filtering search results by relevance in Lucene - c#-4.0

What I want is:
In the search method I will add an extra parameter, say a relevance parameter of type float, to set the cutoff relevance. So if the cutoff is 60%, I want only the items with a relevance higher than 60%.
Here is the current search code.
Say the search text is a
and in the Lucene file system index I have the following descriptions:
1) abcdef
2) abc
3) abcd
For now it fetches all three documents above; I want to fetch only those with a relevance higher than 60%.
// For now I am not using relevanceparam anywhere in the method:
public static string[] Search(string searchText,float relevanceparam)
{
//List of ID
List<string> searchResultID = new List<string>();
IndexSearcher searcher = new IndexSearcher(reader);
Term searchTerm = new Term("Text", searchText);
Query query = new TermQuery(searchTerm);
Hits hits = searcher.Search(query);
for (int i = 0; i < hits.Length(); i++)
{
float r = hits.Score(i);
Document doc = hits.Doc(i);
searchResultID.Add(doc.Get("ID"));
}
return searchResultID.ToArray();
}
Edit:
What if I set a boost on my query,
say: query.SetBoost(1.6); -- is this equivalent to 60 percent?

You can easily do this by ignoring those hits that score less than TopDocs.MaxScore * minRelativeRelevance, where minRelativeRelevance should be a value between 0 and 1.
I've modified your code to match the 3.0.3 release of Lucene.Net, and added a FieldSelector to your call to IndexSearcher.Doc to avoid loading non-required fields.
Calling Query.SetBoost(1.6) would only mean that the score calculated by that query is boosted by 60% (multiplied by 1.6). It may change the ordering of the results if other queries are involved (in a BooleanQuery, for example), but it won't change which results are returned.
public static String[] Search(IndexReader reader, String searchText,
Single minRelativeRelevance) {
var resultIds = new List<String>();
var searcher = new IndexSearcher(reader);
var searchTerm = new Term("Text", searchText);
var query = new TermQuery(searchTerm);
var hits = searcher.Search(query, 100);
var minScore = hits.MaxScore * minRelativeRelevance;
var fieldSelector = new MapFieldSelector("ID");
foreach (var hit in hits.ScoreDocs) {
if (hit.Score >= minScore) {
var document = searcher.Doc(hit.Doc, fieldSelector);
var hitId = document.Get("ID");
resultIds.Add(hitId);
}
}
return resultIds.ToArray();
}
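The cutoff arithmetic can be checked without an index. Here is a minimal sketch, assuming made-up scores; RelevanceFilter and AboveRelativeCutoff are hypothetical names for illustration, not part of Lucene.Net:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RelevanceFilter
{
    // Hypothetical helper, not a Lucene.Net API.
    // Keeps the scores that reach at least minRelativeRelevance (0..1)
    // of the best score in the list.
    public static List<float> AboveRelativeCutoff(IEnumerable<float> scores, float minRelativeRelevance)
    {
        var list = scores.ToList();
        if (list.Count == 0)
            return list; // nothing to compare against

        float minScore = list.Max() * minRelativeRelevance;
        return list.Where(s => s >= minScore).ToList();
    }
}
```

With scores 2.0, 1.5 and 1.0 and a cutoff of 0.6, the threshold is about 1.2, so the first two scores are kept. Note that the cutoff is relative to the best hit of this particular query, not an absolute percentage.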

Related

Azure Table Storage: How can I create a dynamic where clause?

Ok, so I am using Azure Table Storage for the first time in an ASP.NET MVC 3 application.
I have a table entity that has a user ID as its RowKey. I have a list of user IDs and need to get all of the entities that have one of the User IDs.
In traditional SQL it would be a simple OR statement in the where clause that you can dynamically add to:
select * from blah
where userID = '123' or userID = '456' or userID = '789'
but I haven't found the equivalent in the Azure SDK.
Is this possible with Azure Table Storage?
Thanks,
David
The .NET client for Azure Table Storage has features to generate and combine filters,
so you can write your filter expression like this:
string[] split = IDs.Split(",".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
string mainFilter = null;
foreach (var id in split)
{
var filter = TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.Equal, id);
// Combine with Or: a row matches if its RowKey equals any one of the IDs
mainFilter = mainFilter != null ? TableQuery.CombineFilters(mainFilter, TableOperators.Or, filter) : filter;
}
var rangeQuery = new TableQuery<Blah>().Where(mainFilter);
var result = table.ExecuteQuery(rangeQuery);
I am using Windows Azure Storage 7.0.0, and you can use a LINQ query to filter.
Unfortunately, the Contains method is not supported by the Table Service, but you can write a simple method to build your LINQ query dynamically:
public static class ContainsExtension
{
public static Expression<Func<TEntity, bool>> Contains<TEntity,
TProperty>(this IEnumerable<object> values,
Expression<Func<TEntity, TProperty>> expression)
{
// Get the property name
var propertyName = ((PropertyInfo)((MemberExpression)expression.Body).Member).Name;
// Create the parameter expression
var parameterExpression = Expression.Parameter(typeof (TEntity), "e");
// Init the body
Expression mainBody = Expression.Constant(false);
foreach (var value in values)
{
// Create the equality expression
var equalityExpression = Expression.Equal(
Expression.PropertyOrField(parameterExpression, propertyName),
Expression.Constant(value));
// Add to the main body
mainBody = Expression.OrElse(mainBody, equalityExpression);
}
return Expression.Lambda<Func<TEntity, bool>>(mainBody, parameterExpression);
}
}
With that in place you can build dynamic queries easily:
var storageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["TableStorageConnectionString"]);
var tableClient = storageAccount.CreateCloudTableClient();
var table = tableClient.GetTableReference("Blah");
var split = IDs.Split(",".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
// Create a query: in this example I use the DynamicTableEntity class
var query = table.CreateQuery<DynamicTableEntity>()
.Where(split.Contains((DynamicTableEntity d) => d.RowKey));
// Execute the query
var result = query.ToList();
Alrighty, with a bit more digging I found the answer.
You can construct a where filter using the syntax found here: http://msdn.microsoft.com/en-us/library/windowsazure/ff683669.aspx
So for my little example it ended up looking like this:
I have a comma-delimited string of IDs sent to this method:
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["TableStorageConnectionString"]);
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
CloudTable table = tableClient.GetTableReference("Blah");
string[] split = IDs.Split(",".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
string filter = null;
for (int i = 0; i < split.Length; i++)
{
filter += " RowKey eq '" + split[i] + "' ";
if (i < split.Length - 1)
filter += " or ";
}
TableQuery<Blah> rangeQuery = new TableQuery<Blah>().Where(filter);
var result = table.ExecuteQuery(rangeQuery);
Result has the list of goodies I need.
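The string concatenation above can also be sketched with string.Join, which avoids the trailing " or " bookkeeping. This is a sketch only; RowKeyFilter and Build are hypothetical names, and doubling single quotes is the usual OData escaping for string literals:

```csharp
using System;
using System.Linq;

public static class RowKeyFilter
{
    // Hypothetical helper: builds "RowKey eq 'a' or RowKey eq 'b'"
    // from a comma-delimited ID string.
    public static string Build(string ids)
    {
        var clauses = ids
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            // A single quote inside an ID is doubled so it stays inside the literal
            .Select(id => "RowKey eq '" + id.Trim().Replace("'", "''") + "'");

        return string.Join(" or ", clauses);
    }
}
```

The returned string can be passed to TableQuery<Blah>().Where(...) in the same way as the filter variable above. An empty ID list yields an empty string, which the caller should check for before querying.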
One thing to keep in mind is that you wouldn't want to use this on a really large table, because I am filtering only on the RowKey, which causes a table scan. If you filter on the PartitionKey and RowKey together it is more efficient. My table is pretty small (a few hundred records at most), so it shouldn't be an issue.
Hope this helps someone.
David

Dynamics CRM - Accessing Custom Product Option Value

Is there a way to programmatically access the Label & Value fields that have been created for a custom field in MS Dynamics CRM, please?
I have added a custom field called "new_producttypesubcode" which, for example, has 2 options, Trophy = 1000000 and Kit = 10000001.
I am writing an import utility that mirrors products between the customer's website and their CRM, and I want to get a list of all possible product options in the CRM to see if they are matched in the website.
So, in essence I want to...
get the list of possible new_producttypesubcodes and their corresponding values.
Iterate through the product variants in the website.
If the product variant name matches any name in the list of new_producttypesubcodes, then add the value 1000000.
So, if I find a product added to the website that is marked as a "Trophy", and "Trophy" exists in the CRM, then new OptionSetValue(100000001)
I hope that makes sense...
Thanks
This function retrieves a dictionary of possible values localised to the current user. Taken from: CRM 2011 Programatically Finding the Values of Picklists, Optionsets, Statecode, Statuscode and Boolean (Two Options).
static Dictionary<String, int> GetNumericValues(IOrganizationService service, String entity, String attribute)
{
RetrieveAttributeRequest request = new RetrieveAttributeRequest
{
EntityLogicalName = entity,
LogicalName = attribute,
RetrieveAsIfPublished = true
};
RetrieveAttributeResponse response = (RetrieveAttributeResponse)service.Execute(request);
switch (response.AttributeMetadata.AttributeType)
{
case AttributeTypeCode.Picklist:
case AttributeTypeCode.State:
case AttributeTypeCode.Status:
return ((EnumAttributeMetadata)response.AttributeMetadata).OptionSet.Options
.ToDictionary(key => key.Label.UserLocalizedLabel.Label, option => option.Value.Value);
case AttributeTypeCode.Boolean:
Dictionary<String, int> values = new Dictionary<String, int>();
BooleanOptionSetMetadata metaData = ((BooleanAttributeMetadata)response.AttributeMetadata).OptionSet;
values[metaData.TrueOption.Label.UserLocalizedLabel.Label] = metaData.TrueOption.Value.Value;
values[metaData.FalseOption.Label.UserLocalizedLabel.Label] = metaData.FalseOption.Value.Value;
return values;
default:
throw new ArgumentOutOfRangeException();
}
}
So you would then need to do something like:
Dictionary<String, int> values = GetNumericValues(proxy, "your_entity", "new_producttypesubcode");
if(values.ContainsKey("Trophy"))
{
//Do something with the value
// The dictionary stores plain ints, so wrap the value to get an OptionSetValue
int value = values["Trophy"];
OptionSetValue optionSetValue = new OptionSetValue(value);
}
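The matching step from the question (website variant names against CRM option labels) could be sketched as follows. It assumes the label dictionary has already been retrieved; OptionMatcher and Match are hypothetical names, and the comparison ignores case, which is an assumption about how the website names are written:

```csharp
using System;
using System.Collections.Generic;

public static class OptionMatcher
{
    // Hypothetical helper: maps each website variant name to the CRM option value
    // whose label matches it, ignoring case. Names with no matching label are skipped.
    public static Dictionary<string, int> Match(IDictionary<string, int> crmOptions, IEnumerable<string> variantNames)
    {
        // Copy into a case-insensitive dictionary; this throws if two labels differ only by case
        var byLabel = new Dictionary<string, int>(crmOptions, StringComparer.OrdinalIgnoreCase);
        var matches = new Dictionary<string, int>();

        foreach (var name in variantNames)
        {
            int value;
            if (byLabel.TryGetValue(name.Trim(), out value))
                matches[name] = value;
        }
        return matches;
    }
}
```

Each matched value can then be wrapped in new OptionSetValue(value) when the product is written back to the CRM.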
Yes, that data is all stored in the metadata for an attribute (SDK article). You have to retrieve the entity metadata for the entity and then find the attribute in the list. Then cast that attribute to a PicklistAttributeMetadata object and it will contain a list of options. I would mention that typically retrieving Metadata from CRM is an expensive operation, so think about caching.
private static OptionSetMetadata RetrieveOptionSet(IOrganizationService orgService,
string entityName, string attributeName)
{
var entityResponse = (RetrieveEntityResponse)orgService.Execute(
new RetrieveEntityRequest
{ LogicalName = entityName, EntityFilters = EntityFilters.Attributes });
var entityMetadata = entityResponse.EntityMetadata;
for (int i = 0; i < entityMetadata.Attributes.Length; i++)
{
if (attributeName.Equals(entityMetadata.Attributes[i].LogicalName))
{
if (entityMetadata.Attributes[i].AttributeType.Value ==
AttributeTypeCode.Picklist)
{
var attributeMD = (PicklistAttributeMetadata)
entityMetadata.Attributes[i];
return attributeMD.OptionSet;
}
}
}
return null;
}
Here is how to write the options to the console using the above call.
var optionSetMD = RetrieveOptionSet(orgService, "account", "accountcategorycode");
var options = optionSetMD.Options;
for (int i = 0; i < options.Count; i++)
{
Console.WriteLine("Local Label: {0}. Value: {1}",
options[i].Label.UserLocalizedLabel.Label,
options[i].Value.HasValue ? options[i].Value.Value.ToString() : "null");
}
I believe this works for global option set attributes as well, but if you know it is a global option set there is a different message for it that would probably be a bit more efficient (SDK article).

How to Insert/Update into Azure Table using Windows Azure SDK 2.0

I have multiple entities to be stored in the same physical Azure table. I'm trying to Insert/Merge the table entries from a file. I'm trying to find a way to do this without serializing each property or, for that matter, creating custom entities.
While trying the following code, I thought maybe I could use the generic DynamicTableEntity. However, I'm not sure if it helps in an insert operation (most documentation is for replace/merge operations).
The error I get is
HResult=-2146233088
Message=Unexpected response code for operation : 0
Source=Microsoft.WindowsAzure.Storage
Any help is appreciated.
Here's an excerpt of my code
_tableClient = storageAccount.CreateCloudTableClient();
_table = _tableClient.GetTableReference("CloudlyPilot");
_table.CreateIfNotExists();
TableBatchOperation batch = new TableBatchOperation();
....
foreach (var pkGroup in result.Elements("PartitionGroup"))
{
foreach (var entity in pkGroup.Elements())
{
DynamicTableEntity tableEntity = new DynamicTableEntity();
string partitionKey = entity.Elements("PartitionKey").FirstOrDefault().Value;
string rowKey = entity.Elements("RowKey").FirstOrDefault().Value;
Dictionary<string, EntityProperty> props = new Dictionary<string, EntityProperty>();
//if (pkGroup.Attribute("name").Value == "CloudServices Page")
//{
// tableEntity = new CloudServicesGroupEntity (partitionKey, rowKey);
//}
//else
//{
// tableEntity = new CloudServiceDetailsEntity(partitionKey,rowKey);
//}
foreach (var element in entity.Elements())
{
tableEntity.Properties[element.Name.ToString()] = new EntityProperty(element.Value.ToString());
}
tableEntity.ETag = Guid.NewGuid().ToString();
tableEntity.Timestamp = new DateTimeOffset(DateTime.Now.ToUniversalTime());
//tableEntity.WriteEntity(/*WHERE TO GET AN OPERATION CONTEXT FROM?*/)
batch.InsertOrMerge(tableEntity);
}
_table.ExecuteBatch(batch);
batch.Clear();
}
Have you tried using DictionaryTableEntity? This class allows you to dynamically fill the entity as if it were a dictionary (similar to DynamicTableEntity). I tried something like your code and it works:
var batch = new TableBatchOperation();
var entity1 = new DictionaryTableEntity();
entity1.PartitionKey = "abc";
entity1.RowKey = Guid.NewGuid().ToString();
entity1.Add("name", "Steve");
batch.InsertOrMerge(entity1);
var entity2 = new DictionaryTableEntity();
entity2.PartitionKey = "abc";
entity2.RowKey = Guid.NewGuid().ToString();
entity2.Add("name", "Scott");
batch.InsertOrMerge(entity2);
table.ExecuteBatch(batch);
var entities = table.ExecuteQuery<DictionaryTableEntity>(new TableQuery<DictionaryTableEntity>());
One last thing, I see that you're setting the Timestamp and ETag yourself. Remove these two lines and try again.
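Separately from the ETag and Timestamp lines, a table batch must contain entities that all share one PartitionKey, and it can hold at most 100 operations. Whether that applies to the data in the question is not known, but the grouping can be sketched like this. BatchPlanner and Plan are hypothetical names, and each key pair stands in for an entity (Key is the PartitionKey, Value is the RowKey):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchPlanner
{
    // Table storage limit: at most 100 operations per batch
    public const int MaxBatchSize = 100;

    // Hypothetical helper: groups (PartitionKey, RowKey) pairs by partition
    // and cuts each group into chunks of at most MaxBatchSize.
    public static List<List<KeyValuePair<string, string>>> Plan(IEnumerable<KeyValuePair<string, string>> keys)
    {
        var batches = new List<List<KeyValuePair<string, string>>>();
        foreach (var group in keys.GroupBy(k => k.Key))
        {
            var items = group.ToList();
            for (int i = 0; i < items.Count; i += MaxBatchSize)
                batches.Add(items.GetRange(i, Math.Min(MaxBatchSize, items.Count - i)));
        }
        return batches;
    }
}
```

Each inner list would then be turned into one TableBatchOperation and executed on its own.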

Lucene - simpleAnalyzer - How to get matched word(s)?

I can't get the offset of the matched word, or the word itself, using the following algorithm. Any help would be appreciated.
...
Analyzer analyzer = new SimpleAnalyzer();
MemoryIndex index = new MemoryIndex();
QueryParser parser = new QueryParser(Version.LUCENE_30, "content", analyzer);
float score = index.search(parser.parse("+content:" + target));
if(score > 0.0f)
System.out.println("How to know matched word?");
Here is a whole in-memory index and search example. I have just written it for myself and it works perfectly. I understand that you need to store the index in memory, but the question is why you need MemoryIndex for that. You can simply use RAMDirectory instead and your index will be stored in memory, so when you perform your search, the index will be loaded from RAMDirectory (memory).
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_34);
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_34, analyzer);
RAMDirectory directory = new RAMDirectory();
try {
IndexWriter indexWriter = new IndexWriter(directory, config);
Document doc = new Document();
doc.add(new Field("content", text, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_OFFSETS));
indexWriter.addDocument(doc);
indexWriter.optimize();
indexWriter.close();
QueryParser parser = new QueryParser(Version.LUCENE_34, "content", analyzer);
IndexSearcher searcher = new IndexSearcher(directory, true);
IndexReader reader = IndexReader.open(directory, true);
Query query = parser.parse(word);
TopScoreDocCollector collector = TopScoreDocCollector.create(10000, true);
searcher.search(query, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
if (hits != null && hits.length > 0) {
for (ScoreDoc hit : hits) {
int docId = hit.doc;
Document hitDoc = searcher.doc(docId);
TermFreqVector termFreqVector = reader.getTermFreqVector(docId, "content");
TermPositionVector termPositionVector = (TermPositionVector) termFreqVector;
int termIndex = termFreqVector.indexOf(word);
TermVectorOffsetInfo[] termVectorOffsetInfos = termPositionVector.getOffsets(termIndex);
for (TermVectorOffsetInfo termVectorOffsetInfo : termVectorOffsetInfos) {
concordances.add(processor.processConcordance(hitDoc.get("content"), word, termVectorOffsetInfo.getStartOffset(), size));
}
}
}
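reader.close();
searcher.close();
directory.close();
analyzer.close();
} catch (Exception e) {
// Index writing, reading and query parsing all throw checked exceptions
throw new RuntimeException(e);
}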

Best Way of Passing Query Parameters to SolrNet

I have been working on a search using SolrNet, and it is working the way I want it to. I would just like some advice on the best way to pass my query parameters from my web page into SolrNet.
What I would ideally like to do is pass my query string parameters similar to how this site does it: http://www.watchfinder.co.uk/SearchResults.aspx?q=%3a&f_brand=Rolex&f_bracelets=Steel&f_movements=Automatic.
As you can see from the site's query string, it looks like it is being passed into SolrNet directly. Here is how I am doing it at the moment (facet query segment):
public class SoftwareSalesSearcher
{
public static SoftwareSalesSearchResults Facet()
{
ISolrOperations solr = SolrOperationsCache.GetSolrOperations(ConfigurationManager.AppSettings["SolrUrl"]);
//Iterate through querystring to get the required fields to query Solrnet
List<ISolrQuery> queryCollection = new List<ISolrQuery>();
foreach (string key in HttpContext.Current.Request.QueryString.Keys)
{
queryCollection.Add(new SolrQuery(String.Format("{0}:{1}", key, HttpContext.Current.Request.QueryString[key])));
}
var lessThan25 = new SolrQueryByRange("SoftwareSales", 0m, 25m);
var moreThan25 = new SolrQueryByRange("SoftwareSales", 26m, 50m);
var moreThan50 = new SolrQueryByRange("SoftwareSales", 51m, 75m);
var moreThan75 = new SolrQueryByRange("SoftwareSales", 76m, 100m);
QueryOptions options = new QueryOptions
{
Rows = 0,
Facet = new FacetParameters {
Queries = new[] { new SolrFacetQuery(lessThan25), new SolrFacetQuery(moreThan25), new SolrFacetQuery(moreThan50), new SolrFacetQuery(moreThan75) }
},
FilterQueries = queryCollection.ToArray()
};
var results = solr.Query(SolrQuery.All, options);
var searchResults = new SoftwareSalesSearchResults();
List<SoftwareSalesFacetDetail> softwareSalesInformation = new List<SoftwareSalesFacetDetail>();
foreach (var facet in results.FacetQueries)
{
if (facet.Value != 0)
{
SoftwareSalesFacetDetail salesItem = new SoftwareSalesFacetDetail();
salesItem.Price = facet.Key;
salesItem.Value = facet.Value;
softwareSalesInformation.Add(salesItem);
}
}
searchResults.Results = softwareSalesInformation;
searchResults.TotalResults = results.NumFound;
searchResults.QueryTime = results.Header.QTime;
return searchResults;
}
}
At the moment I can't see how to query all my results from my current code by adding the following query string: q=:.
I'm not sure what you mean by "parameters being passed into SolrNet directly". It seems that watchfinder is using some variant of the model binder included in the SolrNet sample app.
Also take a look at the controller in the sample app to see how the SolrNet parameters are built.
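As a rough illustration of that idea, the query string can be split into a main query and a set of field filters before anything is handed to SolrNet. This is a sketch under assumptions: SearchParameters and SearchParametersBinder are hypothetical names written for this example, the f_ prefix is taken from the watchfinder URL in the question, and the match-all default *:* is standard Solr syntax:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;

// Hypothetical container for the parsed query string
public class SearchParameters
{
    public string FreeSearch = "*:*"; // Solr's match-all query as the default
    public Dictionary<string, string> Facets = new Dictionary<string, string>();
}

public static class SearchParametersBinder
{
    // Treats "q" as the main query and every "f_<field>" key as a filter on <field>.
    // Any other key is ignored rather than being sent to Solr as a field name.
    public static SearchParameters Bind(NameValueCollection queryString)
    {
        var result = new SearchParameters();
        foreach (string key in queryString.AllKeys)
        {
            if (key == null) continue;
            string value = queryString[key];

            if (key == "q" && !string.IsNullOrEmpty(value))
                result.FreeSearch = value;
            else if (key.StartsWith("f_", StringComparison.Ordinal))
                result.Facets[key.Substring(2)] = value;
        }
        return result;
    }
}
```

In the Facet method above, FreeSearch would then be the main query passed to solr.Query, and each Facets pair would become one entry in FilterQueries (for example a SolrQueryByField), instead of every query string key being turned into a filter.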
