In Graylog2, is it possible to use an extractor to create a field which gets displayed with newlines?

I am receiving a message in Graylog which is very long, and would like to split it up in an extractor. I can use the "Replace with regular expression" extractor to replace certain tokens with other symbols (including \n, \\n, <br />, etc) but in the /search view the field is always rendered in a single line with those characters escaped.
Is there a way to render a field populated with an extractor as multiple lines in Graylog2? Or is the only way to send the message as multi-line from the source?

Related

How can I easily get search context around search term with Typesense?

I currently use Typesense to search in an HTML database. When I search for a term, I would like to retrieve N characters before and N characters after the term found in search.
For example, I search for "query" and this is the sentence that matches:
Let's repeat the query we made earlier with a group_by parameter
I would like to easily retrieve a fixed number of letters (or words) before and after the term, so I can show it in the presumably small area where the search results are displayed, without breaking any words.
For this particular example, I would be showing:
..repeat the query we made earlier..
Is there a feature like this in Typesense?
I have checked Typesense's documentation, without any luck.
The feature you're referring to is called snippets/highlights and it's enabled by default. You can control how many words are returned on either side of the matched text using the highlight_affix_num_tokens search parameter, documented under the table here: https://typesense.org/docs/0.23.1/api/search.html#results-parameters
highlight_affix_num_tokens
The number of tokens that should surround the highlighted text on each side. This controls the length of the snippet.
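As a rough illustration of the parameter described above, the following Python sketch builds the path for a search request. The parameters q, query_by and highlight_affix_num_tokens are Typesense search parameters; the collection name "pages" and the field name "content" are hypothetical, and the helper function is only for illustration.

```python
from urllib.parse import urlencode


def build_search_path(collection, query, query_by, affix_tokens):
    """Build a Typesense search path that sets the snippet width."""
    params = {
        "q": query,
        "query_by": query_by,
        # Number of tokens kept on each side of the highlighted match.
        "highlight_affix_num_tokens": affix_tokens,
    }
    return f"/collections/{collection}/documents/search?{urlencode(params)}"


# "pages" and "content" are placeholder names, not taken from the question.
path = build_search_path("pages", "query", "content", affix_tokens=2)
print(path)
```

The snippets themselves come back in the highlight section of each hit, so the affix value only changes how much surrounding text each snippet carries.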

How to truncate text and use the raw filter in Twig

The issue is that the raw Twig filter must go at the end of the chain for it to work correctly and replace the HTML entities with their corresponding characters. This causes a problem as I need to also use the truncate function. The truncation is happening correctly but in the instances where the truncation happens in the middle of one of the HTML entity strings the raw function then fails to remove this entity.
Current solution:
{{ BlogPost.description|striptags|truncate(80)|raw }}
Input string:
<p>It supports your pupils to think like scientists – but that doesn’t mean it's only for science!</p>"
What the current solution achieves:
It supports your pupils to think like scientists – but that doesn&rsq...
What I want to achieve:
It supports your pupils to think like scientists – but that doesn't m...
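The underlying problem is one of ordering: truncating while the text still contains entities can cut an entity in half. The sketch below is Python rather than Twig and only illustrates that ordering (strip tags, decode entities, then truncate). The entity-encoded form of the input string is an assumption based on the truncated output shown above, and plain_preview is a hypothetical helper.

```python
import html
import re


def plain_preview(markup, limit=80):
    """Strip tags, decode entities, then truncate the decoded text."""
    text = re.sub(r"<[^>]+>", "", markup)  # crude tag stripping, enough for a sketch
    text = html.unescape(text)             # decode entities BEFORE cutting
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


# Assumed stored form of the description, with entities still encoded.
sample = ("<p>It supports your pupils to think like scientists &ndash; "
          "but that doesn&rsquo;t mean it's only for science!</p>")
preview = plain_preview(sample, limit=80)
print(preview)
```

Because the cut happens on decoded characters, no partial entity can survive into the output. In a template, the equivalent would be to decode the entities before applying the truncation step.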

Solr exact search with a hyphen

I am trying to search for a term in Solr in the Title that contains only the string 1604-04. But the results come back with anything that contains 1604 or 04. What would the syntax be to force solr to search on the exact string of 1604-04?
You can also use the Classic Tokenizer. The Classic Tokenizer preserves the same behavior as the Standard Tokenizer, with the following exceptions:
Words are split at hyphens, unless there is a number in the word, in which case the token is not split and the numbers and hyphen(s) are preserved.
This means that if someone searches for 1604-04, this tokenizer won't break the search string into two tokens.
If you want exact matches only, use a string field or a text field with a KeywordTokenizer as the tokenizer. These will keep your tokens intact as one single entry, and won't break it up into multiple tokens.
The difference is that if you use a TextField with a KeywordTokenizer, you can still apply other filters, such as a LowercaseFilter, while a string field stores everything verbatim, with no further processing possible.
Your analyzer is splitting "1604-04" into two terms, "1604" and "04". You've already received answers on how to change your analysis to stop that.
Changing your analysis may not be the best solution (I can't be entirely sure based on what you've written). Using a phrase query would be the usual way to do this. You can make a phrase query by wrapping the term in quotes:
field:"1604-04"
This will still analyze and split it into two terms, but it will look for those terms in sequence. So, that query would match "1604-04" and "1604 04", but not "1604 some other stuff 04".
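To make the phrase behavior concrete, here is a toy Python model. The analyze function is a simplified stand-in that splits on anything other than letters and digits; it is not Solr's actual tokenizer, and phrase_match ignores details such as slop and position increments.

```python
import re


def analyze(text):
    """Toy analyzer: split on anything that is not a letter or digit."""
    return re.findall(r"[A-Za-z0-9]+", text)


def phrase_match(query, document):
    """True if the query's tokens appear consecutively in the document."""
    q, d = analyze(query), analyze(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


for doc in ["1604-04", "1604 04", "1604 some other stuff 04"]:
    print(doc, "->", phrase_match("1604-04", doc))
```

The first two documents match because both analyze to the adjacent tokens 1604 and 04; the third does not, because other tokens sit between them.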

Array of attachment type - how to get a filename for highlighted fragment?

I use ElasticSearch to index resources. I create a document for each indexed resource. Each resource can contain meta-data and an array of binary files. I decided to handle these binary files with the attachment type. Meta-data is mapped to simple fields of string type. Binary files are mapped to an array field of attachment type (a field named attachments). Everything works fine: I can find my resources based on the contents of the binary files.
Another ElasticSearch feature I use is highlighting. I managed to configure highlighting for both meta-data and binary files, but...
When I ask for highlighted fragments of my attachments field, I only get fragments of these files, without any information about the source of each fragment (there are many files in the attachments array field). I need a mapping between each highlighted fragment and an element of the attachments array, for instance the name of the file or at least its index in the array.
What I get:
"attachments" => ["Fragment <em>number</em> one", "Fragment <em>number</em> two"]
What I need:
"attachments" => [("file_one.pdf", "Fragment <em>number</em> one"), ("file_two.pdf", "Fragment <em>number</em> two")]
Without such a mapping, the user of the application knows that a particular resource contains files with the keyword but has no indication of which file it is.
Is it possible to achieve what I need using ElasticSearch? How?
Thanks in advance.
So what you want here is to store the filename.
Did you send the filename in your json document? Something like:
{
    "my_attachment" : {
        "_content_type" : "application/pdf",
        "_name" : "resource/name/of/my.pdf",
        "content" : "... base64 encoded attachment ..."
    }
}
If so, you can probably ask for field my_attachment._name.
If that's not the right answer, can you refine your question a little and give a sample JSON document (without the base64 content) and your mapping, if any?
UPDATE:
When a fragment comes from an array of attachments, you can't tell which file it came from, because everything is flattened behind the scenes. If you really need that, you may want to have a look at nested fields instead.
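The effect of flattening can be pictured with a small Python model. This is a deliberately simplified illustration, not Elasticsearch code: each sub-field of the array ends up as its own bag of values, and nothing records which name belongs to which content. The file names and field names below follow the examples in the question and answer.

```python
def flatten(doc, array_field):
    """Toy model of flattening an array of objects: one value bag per sub-field."""
    flat = {}
    for item in doc[array_field]:
        for key, value in item.items():
            # Sets are used on purpose: the pairing between sub-fields is lost.
            flat.setdefault(f"{array_field}.{key}", set()).add(value)
    return flat


resource = {
    "attachments": [
        {"_name": "file_one.pdf", "content": "Fragment number one"},
        {"_name": "file_two.pdf", "content": "Fragment number two"},
    ]
}
flat = flatten(resource, "attachments")
for field in sorted(flat):
    print(field, sorted(flat[field]))
```

In this model a match on "Fragment number two" can only be traced back to the attachments.content bag, not to file_two.pdf, which is the gap that nested fields are meant to close.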

How to retrieve a paragraph based on its content from a Word file using Open XML and C# 4.0?

I am using C# 4.0 and Open XML SDK 2.0 to access a Word file. I want to retrieve a paragraph based on a given text: if the paragraph contains my text, retrieve that paragraph.
FOR EXAMPLE:
Given word is: TEST
Retrieve the paragraphs that contain the word "TEST"
I want to search for the given word in each paragraph. If any matches are found, I want to display those paragraphs. If no match is found, there is no need to get the paragraph.
How do I do this?
The main content of a Word document is stored in the body element.
At the simplest level, paragraphs can be located using LINQ queries performed on the document:
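using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
// documentStream: an open Stream over the .docx file (assumed to exist).
using (WordprocessingDocument document = WordprocessingDocument.Open(documentStream, true)) {
    // Contains matches the text anywhere in the paragraph; Equals would require the whole paragraph to match.
    foreach (Paragraph p in document.MainDocumentPart.Document.Body.Descendants<Paragraph>().Where(para => para.InnerText.Contains("SOME TEXT"))) {
        // Do something with the paragraph.
    }
}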
However, I would advise that the problem is a little more complicated than this, as each paragraph may contain more than one Run (a span of text with uniform formatting), each holding part of the text. It is quite likely that the paragraph where the user entered "SOME TEXT" also contains other runs.
But this should be able to point you in the correct direction.
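To illustrate the point about runs, here is a Python sketch that works directly on the WordprocessingML markup using only the standard library. It joins the text of every w:t element under a paragraph before searching, so a word split across two runs still matches. The sample XML is made up for the example, and paragraphs_containing is a hypothetical helper, not part of the Open XML SDK.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraphs_containing(document_xml, needle):
    """Return the text of each w:p whose combined run text contains needle."""
    root = ET.fromstring(document_xml)
    hits = []
    for para in root.iter(f"{{{W}}}p"):
        # Join every w:t under the paragraph so text split across runs still matches.
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        if needle in text:
            hits.append(text)
    return hits


# Made-up document body: "TEST" is split across two runs in the first paragraph.
sample = (
    f'<w:document xmlns:w="{W}"><w:body>'
    "<w:p><w:r><w:t>This is a TE</w:t></w:r><w:r><w:t>ST paragraph</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Nothing here</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
matches = paragraphs_containing(sample, "TEST")
print(matches)
```

In a real .docx file this markup lives in word/document.xml inside the zip package. Searching each run separately would miss the first paragraph here, since neither run contains "TEST" on its own.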
