MarkLogic search:search not returning snippets


I am doing a search:search on a MarkLogic database. I can search on the term "pineal" and return 297 results with snippets. I can search on "city:Vancouver" and return 83 results with snippets. The query "pineal OR city:Vancouver" returns 374 results with snippets. However, the query "pineal AND city:Vancouver" returns a count of 6 results, but no result elements and no snippets. Any idea why I am not getting result text?
Thanks!
Ravi Har

I seem to have found the problem.
The xml being searched looks like this:
<lecture objectType="lecture">
<city>Vancouver</city>
<state>British Columbia</state>
<country>Canada</country>
<formattedTranscript>
<body class="lecture-transcript" xmlns="http://www.w3.org/1999/xhtml">
...
The city constraint looks like this:
<constraint name="city">
<range type="xs:string" facet="true">
<element ns="" name="city"/>
<facet-option>frequency-order</facet-option>
<facet-option>descending</facet-option>
</range>
</constraint>
I had the following statement in my $options declaration:
<searchable-expression>
//(formattedTranscript|title|city|state|country|objectDate)
</searchable-expression>
When I take this statement out the search returns results as expected. I'm curious why the searchable-expression statement breaks the search results.
Thanks everyone for your comments.

Related

Concatenate long strings from multiple records into one string

I have a situation where I need to concatenate long strings from multiple records in an Oracle database into a single string. These long strings are portions of a larger XML string, and my ultimate goal is to be able to convert this XML into something resembling query results and pull out specific values.
The data would look something like this, with the MSG_LINE_TEXT field being VARCHAR2(4000). So if the total message is less than 4000 characters, then there'd only be one record. In theory, there could be an infinite number of records for each message, although the highest I've seen so far is 14 records, which means I need to be able to handle strings that are at least 56000 characters long.
MESSAGE_ID  MSG_LINE_NUMBER  MSG_LINE_TEXT
----------  ---------------  --------------------------------
17415414    1                Some XML snippet here
17415414    2                Some XML snippet here
17415414    3                Some XML snippet here
17415414    4                Some XML snippet here
The total XML for one MESSAGE_ID might look something like this. There could be many App_Advice_Error tags, although this specific example only contains one.
<tXML>
<Header>
<Source>MANH_prod_wmsweb</Source>
<Action_Type />
<Sequence_Number />
<Company_ID>1</Company_ID>
<Msg_Locale />
<Version />
<Internal_Reference_ID>17415414</Internal_Reference_ID>
<Internal_Date_Time_Stamp>2021-02-09 13:45:22</Internal_Date_Time_Stamp>
<External_Reference_ID />
<External_Date_Time_Stamp />
<User_ID>ESBUSER</User_ID>
<Message_Type>RESPONSE</Message_Type>
</Header>
<Response>
<Persistent_State>0</Persistent_State>
<Error_Type>2</Error_Type>
<Resp_Code>501</Resp_Code>
<Response_Details>
<Application_Advice>
<Shipper_ID />
<Imported_Object_Type>ASN</Imported_Object_Type>
<Response_Type>Error</Response_Type>
<Transaction_Date>2/9/21 13:45</Transaction_Date>
<Application_Ackg_Code>TE</Application_Ackg_Code>
<Business_Unit></Business_Unit>
<Tran_Set_Identifier_Code></Tran_Set_Identifier_Code>
<Transaction_Purpose_Code>11</Transaction_Purpose_Code>
<Imported_Message_Id></Imported_Message_Id>
<Imported_Object_Id>Reference Number Here</Imported_Object_Id>
<Additional_References>
<Additional_Reference_Info>
<Reference_Type>BusinessPartner</Reference_Type>
<Reference_ID></Reference_ID>
</Additional_Reference_Info>
</Additional_References>
<App_Advice_Errors>
<App_Advice_Error>
<App_Error_Text>Some error text here</App_Error_Text>
<Error_Message_Tokens>
<Error_Message_Token>Object that errored out</Error_Message_Token>
</Error_Message_Tokens>
<App_Err_Cond_Code>6100234</App_Err_Cond_Code>
</App_Advice_Error>
</App_Advice_Errors>
<Imported_Data></Imported_Data>
</Application_Advice>
</Response_Details>
</Response>
</tXML>
The values that I'm most interested in pulling out are the App_Err_Cond_Code, Error_Message_Token, and App_Error_Text tags. I had tried using something like this:
extractvalue(xmltype(msg_line_text), '//XPath of Tag')
This works beautifully for stuff where the entire XML is less than 4000 characters, i.e. the entire XML is stored in a single record. The problem comes when there are multiple records, because each individual snippet of XML isn't a valid XML string on its own, and so XMLTYPE throws an error, hence the reason I'm trying to concatenate them all into a single string, which I can then use with the above method.
I've tried a variety of ways to do this - LISTAGG, XMLAGG, SYS_CONNECT_BY_PATH, as well as writing a custom function something like this:
with
  function get_messages(pTranLogID number) return string
  is
    xml varchar2;
  begin
    xml := '';
    for msg in (
      select r.msg_line_text
      from tran_log_response_message r, tran_log t
      where
        t.message_id = r.message_id
        and t.tran_log_id = pTranLogID
      order by r.msg_line_number
    )
    loop
      xml := xml || msg.msg_line_text;
    end loop;
    return 'test';
  end;
select
  tran_log_id, get_messages(tran_log_id)
from
  tran_log
where
  tran_log_id = '20633610';
/
The problem is that every one of these methods complained that the string was too long. Does anyone have any other ideas? Or maybe a better approach to this problem?
Thanks.
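
To make the intended processing order concrete, here is a minimal sketch in Python rather than Oracle SQL. It assumes the rows for one MESSAGE_ID have already been fetched as (MSG_LINE_NUMBER, MSG_LINE_TEXT) pairs; the function names and the shortened sample fragments are made up for illustration. The point is only that the fragments must be joined in line-number order before any XML parsing happens. Inside the database, the joined value would exceed what a VARCHAR2 can hold, so a larger type such as CLOB would be needed there.

```python
import xml.etree.ElementTree as ET


def assemble_message(rows):
    """Join the text fragments of one message in MSG_LINE_NUMBER order.

    rows: iterable of (msg_line_number, msg_line_text) pairs (assumed shape).
    """
    ordered = sorted(rows, key=lambda row: row[0])
    return "".join(text for _, text in ordered)


def extract_errors(xml_text):
    """Parse the complete document and collect the three values of interest."""
    root = ET.fromstring(xml_text)
    errors = []
    for err in root.iter("App_Advice_Error"):
        errors.append({
            "code": err.findtext("App_Err_Cond_Code"),
            "text": err.findtext("App_Error_Text"),
            "tokens": [tok.text for tok in err.iter("Error_Message_Token")],
        })
    return errors


# Shortened, hypothetical fragments: neither one is valid XML on its own.
rows = [
    (2, "<App_Error_Text>Some error text here</App_Error_Text>"
        "<Error_Message_Tokens><Error_Message_Token>Object that errored out"
        "</Error_Message_Token></Error_Message_Tokens>"
        "<App_Err_Cond_Code>6100234</App_Err_Cond_Code>"
        "</App_Advice_Error></App_Advice_Errors></tXML>"),
    (1, "<tXML><App_Advice_Errors><App_Advice_Error>"),
]

errors = extract_errors(assemble_message(rows))
print(errors)
```

Sorting before joining matters because the fragments are only meaningful in MSG_LINE_NUMBER order.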

Excel Len function based on condition

I have two columns in my excel file.
Full ID     Expected Result
159473A1    159473
159696A1    159696
160614A1    160614
43293J1A    43293
43293D1A    43293
43293A2B    43293
43293J2B    43293
43293B2B    43293
What I tried:
=Left(A2,LEN(A2)-2)
159473
159696
160614
43293J
43293D
43293A
43293J
43293B
53202
But as you can see, that does not work, because some of the results still contain a trailing letter:
43293J
43293D
43293A
43293J
43293B
How can I get the expected results shown in the example at the top?

In B2 try:
=LEFT(A2,MATCH(FALSE,INDEX(ISNUMBER(MID(A2,ROW(A$1:INDEX(A:A,LEN(A2))),1)*1),),0)-1)
If you have access to DA-functions (O365), like SEQUENCE:
=LEFT(A2,MATCH(FALSE,ISNUMBER(MID(A2,SEQUENCE(LEN(A2)),1)*1),0)-1)
Note: If some values consist of digits only, MATCH finds no FALSE and returns an error. Appending a non-numeric character guarantees a match: =LEFT(A2,MATCH(FALSE,ISNUMBER(MID(A2&"A",SEQUENCE(LEN(A2)+1),1)*1),0)-1)
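
For readers who find the nested formula hard to follow, the same logic can be sketched outside Excel. This is a Python illustration only, with a made-up function name: walk the string from the left and keep characters until the first non-digit, which is what MATCH(FALSE, ...) locates.

```python
from itertools import takewhile


def leading_digits(full_id: str) -> str:
    """Return the run of digits at the start of full_id (may be the whole string)."""
    return "".join(takewhile(str.isdigit, full_id))


for full_id in ["159473A1", "43293J1A", "43293A2B", "53202"]:
    print(full_id, leading_digits(full_id))
```

A value made up only of digits comes back unchanged here, which is the case the A2&"A" variant of the formula guards against.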

So, very simple in the first instance:
But you have an issue with line 4... so this will help that:
=IF(ISNUMBER(VALUE(LEFT(A3,LEN(A3)-2))),VALUE(LEFT(A3,LEN(A3)-2)),VALUE(LEFT(A3,LEN(A3)-3)))

This formula will work for you. Remember to press Ctrl+Shift+Enter after pasting it, since it is an array formula.
If you share your email, I can send you my worked-out example.
=LEFT(A2,IFERROR(MATCH(1,ISERR(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1)*1,),)-1)

How to combine two queries in Solr with ComplexPhraseQueryParser?

When I search in Solr 4.0 with the following two filter queries separately, it works as expected.
{!complexphrase inOrder=true}employeeName_t:"Mike R*"
empDate_dt:[2016-10-10T00:00:00Z TO 2016-10-10T23:59:59Z]
But I do not get the expected search results when I combine these two queries, irrespective of the order.
{!complexphrase inOrder=true}employeeName_t:"Mike R*" AND empDate_dt:[2016-10-10T00:00:00Z TO 2016-10-10T23:59:59Z]
This query gives me zero search results in Solr:
"response": {
  "numFound":0,
  "start":0,
  "maxScore":0,
  "docs":[]
}
Reversing the order of the two clauses gives me a parse exception instead:
empDate_dt:[2016-10-10T00:00:00Z TO 2016-10-10T23:59:59Z] AND {!complexphrase inOrder=true}employeeName_t:"Mike R*"
"error":{
  "msg": "org.apache.solr.search.SyntaxError: org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'employeeName_t:\"Mike': Lexical error at line 1, column 21. Encountered: after : \"\\"Mike\"",code:400
}
I am using ComplexPhraseQueryParser for partial search in Solr and need to use both queries. Any suggestions would be greatly appreciated.

I suggest using the fq parameter.
Documents are retrieved by the main query ("Mike R*") and then filtered by the date range specified in the fq parameter.
Example:
q={!complexphrase inOrder=true}employeeName_t:"Mike R*"&fq=empDate_dt:["2016-10-10T00:00:00Z" TO "2016-10-10T23:59:59Z"]
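
When the request is built in code, the two parts go in as separate parameters and need URL encoding. Here is a minimal sketch in Python using only the standard library. The function name is hypothetical, and the date range is left unquoted as in the question. No Solr host or core is assumed, since only the query string is produced.

```python
from urllib.parse import parse_qs, urlencode


def build_solr_query_string(name_prefix: str, day: str) -> str:
    """Build an encoded query string with the phrase in q and the dates in fq."""
    params = {
        # Main query: handled by the complexphrase parser.
        "q": '{!complexphrase inOrder=true}employeeName_t:"%s*"' % name_prefix,
        # Filter query: parsed separately from q.
        "fq": "empDate_dt:[%sT00:00:00Z TO %sT23:59:59Z]" % (day, day),
    }
    return urlencode(params)


query_string = build_solr_query_string("Mike R", "2016-10-10")
print(query_string)

# Decoding the string again shows the two parameters as Solr would receive them.
decoded = parse_qs(query_string)
print(decoded["q"][0])
print(decoded["fq"][0])
```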

XQuery data and text() function

Sorry, but even after watching tutorials I still do not understand the difference between data() and text() in XQuery.
Any clarification is appreciated.

text() is a node test that selects text nodes. For example, if we have this structure:
<a>
<b>hello <c>world</c></b>
</a>
//b/text() returns the text node 'hello ', just as //b/element() returns the element c.
data($arg) is a function that returns the atomic value of a node. For example, data(//b) returns 'hello world'. If you use data($arg) on a document that has been validated against a schema, the value keeps its schema type.
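
The same distinction can be seen outside XQuery. As a rough analogy only (Python's ElementTree is not an XQuery engine and has no typed values), the text directly inside b before its first child plays the role of //b/text(), while the concatenated text of b and its descendants corresponds to what data(//b) returns for an untyped document.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<a><b>hello <c>world</c></b></a>")
b = doc.find("b")

# Text directly inside <b>, before its first child element.
first_text_node = b.text

# All text inside <b>, including text of descendant elements.
string_value = "".join(b.itertext())

print(repr(first_text_node))
print(repr(string_value))
```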

Deserialize XMLDocument with encoded characters in attribute names

I'm trying to deserialize XML data into an object with C#. I have always done this using the .NET deserialize method, and that has worked well for most of what I have needed.
Now, though, I have XML that is created by SharePoint, and the attribute names of the data I need to deserialize contain encoded characters, namely:
space, º, ç, ã, : and a hyphen, encoded as
x0020, x00ba, x00e7, x00e3, x003a and x002d respectively.
I'm trying to figure out what I have to put in the AttributeName parameter of the XmlAttribute attribute on my properties.
x0020 converts to a space well, so, for instance, I can use
[XmlAttribute(AttributeName = "ows_Nome Completo")]
to read
ows_Nome_x0020_Completo="MARIA..."
On the other hand, neither
[XmlAttribute(AttributeName = "ows_Motiva_x00e7__x00e3_o_x003a_")]
nor
[XmlAttribute(AttributeName = "ows_Motivação_x003a_")]
nor
[XmlAttribute(AttributeName = "ows_Motivação:")]
allow me to read
ows_Motiva_x00e7__x00e3_o_x003a_="text to read..."
With the first two I get no value returned, and the third gives me a runtime error for invalid characters (the colon).
Is there any way to get this working with .NET deserialization, or do I have to build a specific deserializer for this?
Thanks!

What you are looking at (the "cryptic" data) is an escaped form of the column names: characters that are not allowed in XML names are written as _xHHHH_ sequences. SharePoint uses this to keep attribute names and similar identifiers valid.
There are a few ways of dealing with this. The most elegant is to extract the list schema and match the elements against it. The schema contains all the metadata about your list data. A polished example of a schema can be seen below or here http://www.bendsoft.com/documentation/camelot-php-tools/1_5/packets/schema-and-content-packets/schemas/example-list-view-schema/
If you don't want to walk that path you could start here http://msdn.microsoft.com/en-us/library/35577sxd.aspx
<Field Name="ContentType">
<ID>c042a256-787d-4a6f-8a8a-cf6ab767f12d</ID>
<DisplayName>Content Type</DisplayName>
<Type>Text</Type>
<Required>False</Required>
<ReadOnly>True</ReadOnly>
<PrimaryKey>False</PrimaryKey>
<Percentage>False</Percentage>
<RichText>False</RichText>
<VisibleInView>True</VisibleInView>
<AppendOnly>False</AppendOnly>
<FillInChoice>False</FillInChoice>
<HTMLEncode>False</HTMLEncode>
<Mult>False</Mult>
<Filterable>True</Filterable>
<Sortable>True</Sortable>
<Group>_Hidden</Group>
</Field>
<Field Name="Title">
<ID>fa564e0f-0c70-4ab9-b863-0177e6ddd247</ID>
<DisplayName>Title</DisplayName>
<Type>Text</Type>
<Required>True</Required>
<ReadOnly>False</ReadOnly>
<PrimaryKey>False</PrimaryKey>
<Percentage>False</Percentage>
<RichText>False</RichText>
<VisibleInView>True</VisibleInView>
<AppendOnly>False</AppendOnly>
<FillInChoice>False</FillInChoice>
<HTMLEncode>False</HTMLEncode>
<Mult>False</Mult>
<Filterable>True</Filterable>
<Sortable>True</Sortable>
</Field>
<Field>
...
</Field>

Well... I guess I kind of hacked a way around it, which works for now. I just replaced the _x***_ sequences with nothing and corrected the XmlAttribute names accordingly. The replacement is done by first loading the XML as a string, then replacing, then loading the "clean" text as XML.
But I would still like to know whether it is possible to use some XmlAttribute name for a more direct approach...

Try using System.Xml; XmlConvert.EncodeName and XmlConvert.DecodeName
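
To show what decoding such a name involves, here is a small Python sketch of the _xHHHH_ convention seen in the question. It is an illustration with a made-up function name, not the .NET implementation, and it only handles four-digit hexadecimal sequences.

```python
import re

# One escaped character: underscore, 'x', four hex digits, underscore.
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_name(name: str) -> str:
    """Replace every _xHHHH_ sequence with the character it encodes."""
    return _ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), name)


print(decode_name("ows_Nome_x0020_Completo"))
print(decode_name("ows_Motiva_x00e7__x00e3_o_x003a_"))
```

The second name decodes to a string containing a colon, which matches the runtime error the question reports when that decoded form is used as an attribute name.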

I use a simple function to get the encoded column name:
private string getNameCol(string colName) {
    // Keep at most the first 20 characters before encoding.
    if (colName.Length > 20) colName = colName.Substring(0, 20);
    return System.Xml.XmlConvert.EncodeName(colName);
}
I'm still looking for a way to replace characters like á, é, í, ó, ú. EncodeName doesn't convert these characters.
You can use Replace for them:
.Replace("ó","_x00f3_").Replace("á","_x00e1_")
