Is it possible to concatenate a string to a Solr field value while searching?
Example:
localhost:8983/solr/collection1/select?q=item_type%3Apostings&wt=json&indent=true
Now I have one field, id, and I need to prepend the text "locality_" to every id value, so that I do not have to run a for loop over a large result set.
With Solr 4.0, SOLR-2444 lets you define an alias for a field and apply a transformer to its value.
I have not used this myself, but you can certainly explore it, either by modifying the value with a function query that adds a constant value, or by defining a custom transformer to apply to an aliased field.
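For illustration only, here is a minimal sketch of the custom-transformer route. The class name, the f and v parameter names, and the fl invocation shown in the comments are assumptions for this example, and the exact DocTransformer method signatures differ between Solr versions:

import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.transform.DocTransformer;
import org.apache.solr.response.transform.TransformerFactory;

// Hypothetical factory; registered in solrconfig.xml as, e.g.:
//   <transformer name="prefix" class="com.example.PrefixTransformerFactory"/>
// and requested with fl=id,[prefix f=id v=locality_]  (invocation syntax is illustrative).
public class PrefixTransformerFactory extends TransformerFactory {
  @Override
  public DocTransformer create(String name, SolrParams params, SolrQueryRequest req) {
    final String field = params.get("f", "id");
    final String prefix = params.get("v", "locality_");
    return new DocTransformer() {
      @Override
      public String getName() {
        return name;
      }

      @Override
      public void transform(SolrDocument doc, int docid) {
        Object val = doc.getFieldValue(field);
        if (val != null) {
          // Prepend the constant prefix to the returned value; the indexed data is untouched.
          doc.setField(field, prefix + val);
        }
      }
    };
  }
}

Because the transformer only rewrites the response documents, the stored and indexed id values stay unchanged and no re-indexing is needed.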
I have a complex data type (fraudData) that undesirably has hyphen characters in its field names. I need to remove the hyphens or change them to some other character.
The input schema of the complex object looks like:
I have tried using the "Select" and "Derive Column" data flow functions and adding a custom mapping. It seems both functions have the same mapping interface. My current attempt with Select is:
This gets me close to the desired results. I can use the replace expression to convert hyphens to underscores.
The problem here is that this mapping creates new root level columns outside of the fraudData structure. I would like to preserve the hierarchy of the fraudData structure and modify the column names in place.
If I am unable to modify fraudData in place, is there any way I can take the new columns and merge them into another complex data type?
Update: I do not know the fields of the complex data type in advance; this is a schema drift problem, which is why I have tried the pattern-matching approach. I will not be able to hardcode known sub-column names.
You can rename the sub-columns of a complex data type using the Derived Column transformation and build them back into a complex data type. I tried this with sample data; the approach is below.
A sample complex data type column with two sub-fields is taken, as shown in the image below.
img:1 source data preview
In the Derived Column transformation, for the column fraudData, the expression is given as:
#(fraudData_1_chn=fraudData.{fraudData-1-chn},
fraudData_2_chn=fraudData.{fraudData-2-chn})
img:2 Derived column settings
This expression renames the subfields and nests them under the parent column fraudData.
img:3 Transformed data- Fields are renamed.
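For example, with hypothetical values "A" and "B", a source row such as

fraudData: { fraudData-1-chn: "A", fraudData-2-chn: "B" }

comes out of the Derived Column transformation as

fraudData: { fraudData_1_chn: "A", fraudData_2_chn: "B" }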
Update: To rename sub-columns dynamically
You can use the expression below to rename all the fields under the root column fraudData.
#(each(fraudData, match(true()), replace($$,'-','_') = $$))
This will rename every field, replacing - with _ in the field name.
You can also use pattern matching in the expression.
#(each(fraudData, patternMatch(`fraudData-.+` ), replace($$,'-','_') = $$))
This expression will take only the fields matching the pattern fraudData-.+ and replace - with _ in those field names.
Reference:
Microsoft document on script for hierarchical definition in data flow.
Microsoft document on building schemas using derived column transformation.
How do I execute a wildcard/regex search in Data Catalog (Google Cloud Platform)?
It would make sense to search metadata across column names and tag attributes (and their values).
The current documentation only lists very strict search behavior,
e.g. tag:data_gov_template.hasPII(=true)
What I need is a result for just "PII"; I don't care about specifying the exact template name, etc.
e.g. labels:etl
If I only search for etl, there is no result.
(Are metadata attributes and values not searchable in a direct way?)
From your use case, I understand that you want to search for a particular metadata attribute, like a Tag field named PII, right?
For tagged assets
If you don't care about the template name, you can use the tag:x search facet.
So if all your templates (data_gov_template, data_curator_template, data_etl_template) contain the same Tag field name, has_pii, you can search using:
tag:has_pii
This will return all assets with that metadata attribute, no matter what the template name is.
For columns
You can use the column:x search facet to match a substring of a column name in the schema of the data asset. It does not support nested columns yet.
For labels
You can use the labels:bar search facet for data assets that have a label (with some value) whose key has bar as a substring.
You are also able to search on their values. So yes, the metadata/attributes and values are searchable.
But it is not a regex-style search: it is a substring match when the search facet uses a colon (:), like labels:bar, or an exact match when the search facet uses an equals sign (=), like type=table.
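Putting those facets together with the names from this question (has_pii, pii, and etl are just the example names used above):

tag:has_pii    - assets carrying a has_pii Tag field, regardless of template
column:pii     - assets whose schema has a top-level column name containing "pii"
labels:etl     - assets with a label whose key or value contains "etl" as a substring
type=table     - exact match on the asset type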
The problem is that the long field has the values 120450, 120445, and 120656. Please find the query below.
{"from":0,"size":10,"query":{"nested":{"query":{"bool":{"must":[{"querystring":{"query":"120","fields":["alist.articleId"]}}]}},"path":"alist"}}}_
The response should return all three documents, which are partial matches for 120. Is it possible to achieve this on a long or numeric field?
For partial matching on numerics, you can store them as string values.
Now you can use either of the following (a sketch of the edge_ngram option is shown after this list):
use an edge_ngram (edgeNGram) tokenizer
use a prefix query; your field needs to be marked not_analyzed in this case
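A minimal sketch of the edge_ngram route, assuming a recent Elasticsearch version and that the nested alist.articleId field is re-indexed as analyzed text (on older 1.x/2.x clusters the field type would be string and the tokenizer type edgeNGram); the index name, analyzer name, and gram sizes are illustrative:

PUT /articles
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "digit_edge_ngram": { "type": "edge_ngram", "min_gram": 1, "max_gram": 10 }
      },
      "analyzer": {
        "id_prefix": { "type": "custom", "tokenizer": "digit_edge_ngram" }
      }
    }
  },
  "mappings": {
    "properties": {
      "alist": {
        "type": "nested",
        "properties": {
          "articleId": {
            "type": "text",
            "analyzer": "id_prefix",
            "search_analyzer": "keyword"
          }
        }
      }
    }
  }
}

GET /articles/_search
{
  "query": {
    "nested": {
      "path": "alist",
      "query": { "match": { "alist.articleId": "120" } }
    }
  }
}

With this mapping, 120450, 120445, and 120656 are all indexed with the prefix token 120, so the query above returns all three documents. The prefix-query alternative needs no custom analyzer: keep the field as a not_analyzed string (keyword in newer versions) and use { "prefix": { "alist.articleId": "120" } } inside the same nested query.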
I need to create a long list of complex strings, containing the data of different fields in different places, in order to create explanatory reports.
The only way I have conceived, in Access 2010, is to save the text parts in a table, together with the field names to be used to compose the string to be shown (see the line1 expression in the figure). Briefly:
//field A contains a string with a field name:
A = "[Quantity]"
//query expression:
=EVAL(A)
//returns an error instead of the number contained in the field [Quantity], which is present in the query dataset
I thought of doing an EVAL on a field (A) to obtain the value of the field (B) whose name is contained in field A, but it does not seem to work.
Does any way to do this exist?
Example (very simplified):
Sample query that EVALs a field containing other field names to obtain the values of those fields
Any idea?
PS: Sorry for my English; it is not my mother tongue.
I found an interesting workaround in another forum.
Other people had the same problem using EVAL, but found that it is possible to substitute a placeholder in a string with a field's contents using the REPLACE function.
REPLACE("The value of field Quantity is {Quantity}";"{Quantity}";[Quantity])
(The {} are used only for clarity; they are not needed if one knows that the words to be substituted do not otherwise appear in the string.) Using this code in a query, and nesting as many REPLACEs as there are different fields to use:
REPLACE(REPLACE("<Salutation> <Name>";"<Salutation>";[Salutation]);"<Name>";[Name])
it is possible to embed field names in a string and substitute them with the current value of each field in a query. Of course, the latter example could be done more simply with concatenation (&), but if the string is contained in a field instead of being hardcoded, it can be linked to records as needed:
REPLACE(REPLACE([DescriptiveString];"[Salutation]";[Salutation]);"[Name]";[Name])
Moreover, it is possible to create context-based complex strings, such as:
REPLACE(REPLACE(REPLACE("{Salutation} {Name} {MaidenName}";"{Salutation}";[Salutation]);"{Name}";[Name]);"{MaidenName}";IIF(Isnull([MaidenName]);"";[MaidenName]))
The hard part is enumerating, in the REPLACE calls, all the field placeholders one wants to insert in the string (like {Quantity}, {Salutation}, {Name}, {MaidenName}), whereas with EVAL one could avoid this tedious part, if only it worked.
Not as neat as I would like, but it works.
I am using FieldCacheTermsFilter to filter the results matching a field value, as below.
Filter filter = new FieldCacheTermsFilter("city","toronto");
This works perfectly fine, but it doesn't work if the value has a space or special character in it, as below.
Filter filter = new FieldCacheTermsFilter("city","new york");
Filter filter = new FieldCacheTermsFilter("type","b&b");
Is there a way I can achieve this with any other filter?
PS: I am using FieldCacheTermsFilter because I want to match exactly the word "toronto" and not "greater toronto". I tried using TermFilter, which returns all the records containing toronto.
Your problem isn't the TermsFilter, it's your analyzer. Analysis should suit your needs for the field.
If you need to get exact matches on a whole field, you should index it as a StringField (or use KeywordAnalyzer, or set the field as untokenized). If you index with StandardAnalyzer, there is simply no good way to do what you are asking.
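For illustration, a minimal sketch of the StringField route, assuming Lucene 4.x (where FieldCacheTermsFilter exists); the version constant, directory, and analyzer choices are illustrative, and the field and value names follow the question:

import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.FieldCacheTermsFilter;
import org.apache.lucene.search.Filter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class ExactCityFilterSketch {
  public static void main(String[] args) throws Exception {
    Directory dir = new RAMDirectory();
    // KeywordAnalyzer keeps any analyzed text as a single token; StringField is untokenized anyway.
    IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_47, new KeywordAnalyzer());
    try (IndexWriter writer = new IndexWriter(dir, cfg)) {
      Document doc = new Document();
      // StringField indexes the whole value as one term, so "new york" is stored as a single exact term.
      doc.add(new StringField("city", "new york", Field.Store.YES));
      writer.addDocument(doc);
    }

    // The filter now matches only documents whose city term is exactly "new york".
    Filter filter = new FieldCacheTermsFilter("city", "new york");
    // ... use the filter with an IndexSearcher as in the original code
  }
}

Because StringField is indexed as a single untokenized term, the field cache sees "new york" (or "b&b") as one whole term, which is exactly what an exact-match filter needs; a StandardAnalyzer-tokenized field would instead split it into "new" and "york".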