Searching only in undeleted documents using zend lucene - search

I am not new to Zend Lucene, but I am having trouble searching with it.
I search for documents by number using the code below:
$term = new Zend_Search_Lucene_Index_Term($id, $idFieldName);
$docIds = $index->termDocs($term);
foreach ($docIds as $id) {
    $doc = $index->getDocument($id);
    echo $doc->artist_name;
}
$index->commit();
and I delete a document by number using the code below:
$term = new Zend_Search_Lucene_Index_Term($id, $idFieldName);
$docIds = $index->termDocs($term);
foreach ($docIds as $id) {
    $doc = $index->getDocument($id);
    $index->delete($doc->lyric_id);
}
$index->commit();
When I delete a document, $index->numDocs() shows that the document is deleted, because its return value no longer equals the return value of $index->count(). The problem is that after deleting the document, I can still find it in searches and display the values of its fields.
I checked again after optimizing the index, but the problem persists. I need either to remove a document completely or to search only the documents that have not been deleted from the index.

Loop through the search results and check if the document is deleted. If it is, remove it from the search results.
The Zend_Search_Lucene::isDeleted($id) method may be used to check if a document is deleted.
for ($count = 0; $count < $index->maxDoc(); $count++) {
    if ($index->isDeleted($count)) {
        // Report the loop variable; $id is not defined in this loop.
        echo "Document #$count is deleted.\n";
    }
}
via Building Indexes: Updating Documents
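The filtering step suggested above can be sketched independently of PHP. The following is a minimal JavaScript model, not the Zend API: `makeIndex`, `markDeleted`, and `liveDocs` are hypothetical names, and the model only mimics the behaviour described in the question, where a deleted document can still be fetched by its internal number.

```javascript
// Hypothetical in-memory stand-in for an index; not the Zend_Search_Lucene API.
function makeIndex(docs) {
  const deleted = new Set();
  return {
    // Marks a document as deleted by its internal number (array position).
    markDeleted(id) { deleted.add(id); },
    isDeleted(id) { return deleted.has(id); },
    // Still returns the stored fields, even for a deleted document.
    getDocument(id) { return docs[id]; },
    // Returns the internal numbers of all documents whose field equals value.
    termDocs(field, value) {
      return docs
        .map((doc, id) => (doc[field] === value ? id : -1))
        .filter((id) => id >= 0);
    },
  };
}

// Keep only the hits that have not been marked as deleted.
function liveDocs(index, ids) {
  return ids
    .filter((id) => !index.isDeleted(id))
    .map((id) => index.getDocument(id));
}

const index = makeIndex([
  { lyric_id: "7", artist_name: "A" },
  { lyric_id: "7", artist_name: "B" },
]);
index.markDeleted(0);
const results = liveDocs(index, index.termDocs("lyric_id", "7"));
console.log(results.map((doc) => doc.artist_name));
```

The point of the sketch is only that the deleted check is applied to the internal document number returned by the term lookup, before the document is read.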

Related

Lucene Search for SiteCore get Field content from result

Hi. I have run into a small problem. I am using Lucene Search and I am trying to get the content of a field in the returned result. So far I have only managed to get the IDs of the fields. Right now I get the field IDs like this:
foreach (var i in hit.Template.InnerItem.InnerData.Fields)
{
    hitParagraph = hitParagraph + i.ToString();
}
This gives me the IDs of the fields inside that template, like this:
[{25BED78C-4957-4165-998A-CA1B52F67497}, 20130307T051813][{5DD74568-4D4B-44C1-B513-0AF5F4CDA34F}, vh\branea1][{8CDC337E-A112-42FB-BBB4-4143751E123F}, 51885b42-bf8b-4f26-8259-125d352457f3][{D9CF14B1-FA16-4BA6-9288-E8A174D4D522}, .....
Please help.
Thank you.
I'm not entirely sure what you're after. If it's the content of a specific field, you could just use hit["fieldname"] (assuming hit is a Sitecore item). Or hit.Template.InnerItem["fieldname"] would work, I think.
I think you don't need the InnerData bit - if you want a foreach loop I think you could do it like so:
foreach (Field i in hit.Template.InnerItem.Fields)
{
    hitParagraph += i.Value.ToString();
}
From what I understand from your code, hit is an instance of the Sitecore Item class. To get all the fields from it, use:
hit.Fields.ReadAll();
foreach (Field field in hit.Fields)
{
    // Read the value from hit itself; no separate item variable exists here.
    hitParagraph = hitParagraph + field.Key + ": " + hit[field.Key] + "\n";
}

How to search and sort with CouchDB in one map function

I'm stumbling a bit with my CouchDB knowledge.
I have a database of content that is tagged with an array of tags and has a created date.
I want to create a view that pulls a limited number of newest stories tagged with a specific tag.
For example, the newest 6 stories tagged "Business."
I ran across this question, which seems to get me almost where I need to go, but I'm missing one key element: how to craft the query string to sort by one key while searching by the other.
Here's my map function.
function(doc) {
    if (doc.published == "yes" && doc.type == "news") {
        for (var i = 0; i < doc.tags.length; i++) {
            if (doc.tags[i]) {
                emit([doc.created, doc.tags[i]], doc);
            }
        }
    }
}
So how do I query that view for all documents tagged "Business", newest first, based on created?
The created attribute is in a sortable date format.
First, I would switch the order of your emit:
emit([doc.tags[i], doc.created]);
(leave out doc as well, you can just add include_docs=true to get the entire document, and your view won't take up so much disk-space in the process)
Now you can query for all the stories tagged "Business" by using the following query string:
startkey=["Business"]&endkey=["Business",{}]
You'll get all the documents with the tag business, and they'll be sorted by date.
This takes advantage of view collation, which is the set of rules governing how view indexes are sorted and queried. For complex keys like this, array keys are compared element by element: rows are ordered by the first element, and rows with the same first element are ordered by the second, and so on. This is why the order matters: you must always move from left to right when querying a view index.
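To see why the tag has to come first in the key, here is a small in-memory sketch of the idea. It is not CouchDB's collation implementation: `compareKeys` is a hypothetical comparator that only handles array keys made of strings, and the sample rows are made up.

```javascript
// Made-up rows shaped like the view output: key = [tag, created].
const rows = [
  { key: ["Sports", "2010-01-03"], id: "c" },
  { key: ["Business", "2010-01-02"], id: "b" },
  { key: ["Business", "2010-01-01"], id: "a" },
  { key: ["Sports", "2010-01-01"], id: "d" },
];

// Simplified comparator: compare element by element, left to right.
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  // A shorter key that is a prefix of a longer one sorts first.
  return a.length - b.length;
}

const sorted = [...rows].sort((x, y) => compareKeys(x.key, y.key));

// Rows for one tag end up contiguous and already ordered by date.
const business = sorted
  .filter((row) => row.key[0] === "Business")
  .map((row) => row.id);
console.log(business);
```

With the date first in the key instead, rows for one tag would be scattered through the index and a single startkey/endkey range could not select them.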
If you want the 6 most recent, your querystring will need to change:
descending=true&limit=6&endkey=["Business"]&startkey=["Business",{}]
Note: you need to swap the startkey/endkey values because of how the descending parameter works. See the View reference page on the wiki for further explanation.
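The key values in these query strings are JSON, so they generally need to be URL-encoded when sent over HTTP. The following is a minimal sketch of building the "newest N for a tag" query; `newestByTagQuery` is a hypothetical helper name, not part of any CouchDB client.

```javascript
// Hypothetical helper: builds the query string for the newest `limit` rows of one tag.
function newestByTagQuery(tag, limit) {
  const params = new URLSearchParams();
  params.set("descending", "true");
  params.set("limit", String(limit));
  // With descending=true the range is read from high to low,
  // so startkey carries the upper bound and endkey the lower bound.
  params.set("startkey", JSON.stringify([tag, {}]));
  params.set("endkey", JSON.stringify([tag]));
  return params.toString();
}

const query = newestByTagQuery("Business", 6);
console.log(query);
```

The resulting string can be appended after the `?` of the view URL.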
OK, I think I figured this out, but I'm not quite certain I fully understand it.
I found this story about complex keys and searching and sorting.
My map function looks like this:
function(doc) {
    if (doc.published == "yes" && doc.type == "news") {
        for (var i = 0; i < doc.tags.length; i++) {
            if (doc.tags[i]) {
                emit([doc.tags[i], doc.created], doc);
            }
        }
    }
}
And to query and sort using it, the query looks like this.
http://localhost:5984/database/_design/story/_view/tagged?limit=10&startkey=["Business"]&endkey=["Business",{}]&descending=false
I'm getting the results I want, but I'm not entirely certain I understand it all.

Is it possible to use the search results of one search as the criteria for a new search in NetSuite

Using NetSuite, is it possible to embed a search within another search? I need a search whose criteria effectively use another search's results.
The basic structure of my search is:
Return all non-inventory SKUs starting with a specific prefix, where the number of occurrences of those SKUs in a custom field on Inventory-Part records is greater than 0.
This is then intended to be used for alerts.
I'm not sure how to build this within NetSuite's search builder.
I don't think this pertains to any scripting as m_cheung suggested.
To answer your question, yes this is doable via saved search.
Transaction > Management > Saved Search > New
Select 'Item' from the list
In the criteria section:
Type = 'Non-Inventory Items'
External ID = starts with (...your desired prefix) (NOTE: Assuming that prefix is the external ID from your question)
Select the custom field and set its criterion to "greater than 0".
Save and Run to confirm if this is the desired result.
Using nlapiSearchRecord(recordType, savedSearchId, searchFiltersArray, searchColumnsArray), you can return the results of a search and pass the returned data further into script logic.
For example, if you build search1 using a search filters array and a search columns array and pass these arrays into nlapiSearchRecord for the 'item' record type, you can assign the result of the call to a variable:
var searchresults = nlapiSearchRecord('item', null, searchFiltersArray, searchColumnsArray);
Then, using searchresults (an array of nlobjSearchResult objects), you can pull out the returned search data to use as criteria in search2:
if (searchresults)
{
    // Declare the loop counter so it does not leak into the global scope.
    for (var i = 0; i < searchresults.length; i++)
    {
        var search2FilterAndColumnData = searchresults[i].getAllColumns();
    }
}
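One way to read the approach above: run the first search, collect the internal IDs it returns, and turn them into a filter for the second search. The sketch below assumes SuiteScript 1.0 and takes the two NetSuite calls as parameters so the logic can be followed outside NetSuite. `searchUsingResultsOf` is a hypothetical helper, `custitem_related_sku` is a hypothetical custom field ID, and the 'anyof' operator assumes that field is a list/record field.

```javascript
// search: stands in for nlapiSearchRecord(type, id, filters, columns).
// makeFilter: stands in for `new nlobjSearchFilter(name, join, operator, value)`.
function searchUsingResultsOf(search, makeFilter, firstFilters) {
  // First search; treat a null return as "no results".
  var first = search("item", null, firstFilters, null) || [];
  if (first.length === 0) {
    return [];
  }

  // Collect the internal IDs of the records found by the first search.
  var ids = first.map(function (result) {
    return result.getId();
  });

  // Second search: items whose (hypothetical) custom field points at any of those IDs.
  var secondFilters = [makeFilter("custitem_related_sku", null, "anyof", ids)];
  return search("item", null, secondFilters, null) || [];
}
```

Inside NetSuite the call might look like `searchUsingResultsOf(nlapiSearchRecord, function (n, j, o, v) { return new nlobjSearchFilter(n, j, o, v); }, filters)`, where `filters` holds the criteria for the first search.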
You can use a saved search as the basis for another search in SuiteScript.
Something like:
var arrSearchResult = nlapiSearchRecord( null , SAVED_SEARCH_ID , FILTERS , COLUMNS);

SharePoint : Guessing the attachment path before updating a list item

I have some code that inserts a list item into a list.
I then have this code:
SPFolder folder = web.Folders["Lists"].SubFolders[list.RootFolder.Name].SubFolders["Attachments"].SubFolders[item.ID.ToString()];
foreach (SPFile file in folder.Files)
{
    string attachmentName = this.downloadedMessageID + ".xml";
    if (file.Name == attachmentName)
    {
        SPFieldUrlValue value = new SPFieldUrlValue();
        value.Description = this.downloadedMessageID + ".xml";
        value.Url = this.SiteAddress + file.Url;
        item["ZFO"] = value;
    }
}
This is fine except for one problem: before this code works, I need to call item.Update() to save the item to SharePoint.
But as you can see, there is more work to do after item.Update() is called.
So this means I have:
work
item.Update();
more work
item.Update();
What I really want is just:
work
item.Update();
so that on any failure the whole thing fails or passes at once (almost like a SQL transaction).
What prevents me from doing this is that I need to set a hyperlink in one of the fields of the list item, pointing to an attachment in the item's attachments collection.
Is there any way I can predict this address without having saved the list item to MOSS?
An attachment path depends on the item ID, and I don't believe your item will have an ID until you save it. Have you considered storing the attachments in a document library instead, linked by the field you're trying to set?
Transactional operation isn't exactly SharePoint's strong suit.

Exact phrase search using Lucene.net

I am having trouble searching for an exact phrase using Lucene.NET 2.0.0.4.
For example, I am searching for "scope attribute sets the variable" (including quotes) but receive no matches, even though I have confirmed that the phrase exists in the document.
Can anyone suggest where I am going wrong? Is this even supported by Lucene.NET? As usual, the API documentation is not too helpful, and the few CodeProject articles I've read don't specifically touch on this.
I use the following code to create the index:
Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", true);
Analyzer analyzer = new Lucene.Net.Analysis.SimpleAnalyzer();
IndexWriter indexWriter = new Lucene.Net.Index.IndexWriter(dir, analyzer,true);
//create a document, add in a single field
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field(
"content", File.ReadAllText(@"Documents\100.txt"),
Lucene.Net.Documents.Field.Store.YES,
Lucene.Net.Documents.Field.Index.TOKENIZED);
doc.Add(fldContent);
//write the document to the index
indexWriter.AddDocument(doc);
I then search for a phrase using:
//state the file location of the index
Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", false);
//create an index searcher that will perform the search
IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher(dir);
QueryParser qp = new QueryParser("content", new SimpleAnalyzer());
// txtSearch.Text Contains a phrase such as "this is a phrase"
Query q=qp.Parse(txtSearch.Text);
//execute the query
Lucene.Net.Search.Hits hits = searcher.Search(q);
The target document is about 7 MB plain text.
I have seen this previous question however I don't want a proximity search, just an exact phrase search.
Shashikant Kore's answer is correct: you need to enable term positions.
However, I would recommend not storing the text of the document in the field unless you absolutely need it returned in the search results. Setting the store to 'NO' might help reduce the size of your index a bit.
Lucene.Net.Documents.Field fldContent =
new Lucene.Net.Documents.Field("content",
File.ReadAllText(@"Documents\100.txt"),
Lucene.Net.Documents.Field.Store.NO,
Lucene.Net.Documents.Field.Index.TOKENIZED,
Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);
You have not enabled term positions. Creating the field as follows should solve your problem.
Lucene.Net.Documents.Field fldContent =
new Lucene.Net.Documents.Field("content",
File.ReadAllText(@"Documents\100.txt"),
Lucene.Net.Documents.Field.Store.YES,
Lucene.Net.Documents.Field.Index.TOKENIZED,
Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);
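For intuition on why term positions matter for phrase queries, here is a toy positional index. It is an illustration only, not how Lucene.NET stores or scores anything: `tokenize`, `buildIndex`, and `hasPhrase` are hypothetical names, and the tokenizer is a bare lowercase letter-run split, loosely similar to what SimpleAnalyzer does.

```javascript
// Split text into lowercase runs of letters.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

// Map each term to the list of positions where it occurs.
function buildIndex(text) {
  const positions = new Map();
  tokenize(text).forEach((term, pos) => {
    if (!positions.has(term)) positions.set(term, []);
    positions.get(term).push(pos);
  });
  return positions;
}

// A phrase matches when its terms occur at consecutive positions.
function hasPhrase(index, phrase) {
  const terms = tokenize(phrase);
  if (terms.length === 0) return false;
  const starts = index.get(terms[0]) || [];
  return starts.some((start) =>
    terms.every((term, offset) =>
      (index.get(term) || []).includes(start + offset)
    )
  );
}

const idx = buildIndex("The scope attribute sets the variable for the page.");
console.log(hasPhrase(idx, "scope attribute sets the variable")); // true
console.log(hasPhrase(idx, "attribute scope")); // false
```

Without the position lists, the index could only say that all the words occur somewhere in the document, which is not enough to confirm an exact phrase.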
