I am using Nutch 1.14 because I am using the GCS indexer. Here is what I have in nutch-site.xml:
<property>
<name>index.replace.regexp</name>
<value>
urlmatch=.*example.com\/[a-zA-Z0-9-]+
url:category=/https:\/\/www.example.com\/([a-zA-Z0-9-]+)/$1/
</value>
</property>
I am getting the error:
$ grep 'replace' logs/hadoop.log
ERROR replace.ReplaceIndexer - Pattern
url:category=/https:\/\/www.example.com\/([a-zA-Z0-9-]+)/$1/, has invalid flags component
I get the same error when I change the line in nutch-site.xml to:
-url:category=/https:\/\/www.mydomain.com\/([a-zA-Z0-9-]+)/$1/2
I want to put part of the URL into the category field. For example, if the URL is https://www.example.com/testcategory, I would like category to be testcategory.
Thanks.
If I understand correctly, you want to take what comes after the domain (example.com) and place it in a category field, right?
In that case, you have an error in your regex. Since you want to capture the <category> part that follows example.com/, you need to configure your urlmatch with a capture group, like this:
urlmatch=.*example\.com\/([a-zA-Z0-9-]+)
In this case ([a-zA-Z0-9-]+) will create a capture group accessible through $1. And then you can set the field like:
url:category=$1
This would take what was captured by the capture group and place it in a category field.
You can test the regex in: https://regex101.com/r/bMLqOq/1.
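As a quick sanity check of the capture group outside Nutch, here is a minimal Python sketch. It only exercises the regular expression itself, using Python's `re` module rather than the Java regex engine Nutch uses, and the function name `extract_category` is made up for illustration. It says nothing about how the index-replace plugin parses its configuration.

```python
import re

# Same pattern as the urlmatch above; "/" needs no escaping in Python.
CATEGORY_PATTERN = re.compile(r".*example\.com/([a-zA-Z0-9-]+)")


def extract_category(url):
    """Return the first path segment after example.com/, or None if absent."""
    match = CATEGORY_PATTERN.match(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical URL taken from the question.
    print(extract_category("https://www.example.com/testcategory"))
```

A URL with no path segment after the domain returns `None`, so such a page would simply not get a category in this sketch.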
I'm using Hybris version 6.7.0 and I'm stuck with the following problem:
When I try to import products from an Excel file, it gives me the following error ->
I've checked the Excel file and it does, of course, contain the field "Subscription Term*"; it is mandatory, which is why there is an asterisk. It is worth mentioning that this field is custom, so I wrote a custom translator for it. The export part works fine, but while debugging the import part I found a strange fact:
The WorkbookMandatoryColumnsValidator validator calls the method findColumnIndex(typeSystemSheet, sheet, this.prepareSelectedAttribute(mandatoryField)); from DefaultExcelTemplateService. This method returns -1 and the validation fails. I dug into this method and found this line of code:
String attributeDisplayName = this.findAttributeDisplayNameInTypeSystemSheet(typeSystemSheet, selectedAttribute); which returns the string "Subscription Term", as you can see, without an asterisk.
I've checked the other mandatory fields: for e.g. "Catalog version*^" it returns the name with the 2 symbols after it.
The thing is that comparing "Subscription Term" and "Subscription Term*" for string equality returns false, so the validation fails, as you can see here:
attributeDisplayName.equals(this.getCellValue(headerRow.getCell(i))).
Of course, the second value is taken from the Excel file, where the asterisk is present.
If I remove the asterisk from the Excel file, I receive an "Unknown attributes of type ISku" error in the WorkbookTypeCodeAndSelectedAttributeValidator validator:
The asterisk should be present in the Excel file; I've just checked what would be...
It doesn't help me at all to understand what really happens.
I can't understand one thing: what is the source of the "Subscription Term" string? Why is it without an asterisk? Is it a predefined constant somewhere?
From debugging I couldn't figure out where that string comes from.
I do not know for sure, but I expect that string (i.e. Subscription Term) to come from a localization file based on the current backoffice session language (e.g. {extensionName}-locales_en.properties if the current language is en).
Try searching for "Subscription Term" in all properties files.
Maybe, if the attribute is mandatory (i.e. optional="false" in items.xml), Hybris adds an "*" to its name when performing the import.
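To make the mismatch from the question concrete, here is a small Python sketch. It is not Hybris code: the helper names `normalize_header` and `headers_match` are invented for illustration, and the assumption that "*" and "^" are purely decorative trailing markers comes only from the header examples quoted in the question.

```python
# Marker characters seen after header names in the question ("*" and "^").
HEADER_MARKERS = "*^"


def normalize_header(header):
    """Strip surrounding whitespace and any trailing marker characters."""
    return header.strip().rstrip(HEADER_MARKERS).strip()


def headers_match(display_name, cell_value):
    """Compare a display name with a header cell, ignoring trailing markers."""
    return normalize_header(display_name) == normalize_header(cell_value)


if __name__ == "__main__":
    # A plain equality check fails because of the trailing asterisk.
    print("Subscription Term" == "Subscription Term*")
    # Comparing the normalized forms succeeds.
    print(headers_match("Subscription Term", "Subscription Term*"))
    print(headers_match("Catalog version", "Catalog version*^"))
```

This only shows why an exact equals() on the two strings cannot succeed; where the display name should get its markers from is still the open question.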
Check whether you have granted read and write permission on that attribute to that user.
Before doing that, check with the admin user. If there is no issue with the admin user, then it is only a permission issue with that user.
Unfortunately, we have a special folder named "_archive" everywhere in our repository.
This folder has its purpose. But when searching for content/documents, we want to exclude it and all content beneath "_archive".
So what I want is to exclude the path and its members from all user searches. The syntax is easy with FTS:
your_query AND -PATH:"//cm:_archive//*"
To test:
https://www.docdroid.net/RmKj9gB/search-test.pdf.html
Take the PDF and put it into your repo twice:
/some_random_path/search-test.pdf
/some_random_path/_archive/search-test.pdf
In node-browser everything works as expected:
TEXT:"HODOR" AND -PATH:"//cm:_archive//*"
= 1 result
TEXT:"HODOR"
= 2 results
So, my idea was to edit search.get.config.xml and add the exclusion to the list of properties:
<search>
<default-operator>AND</default-operator>
<default-query-template>%(cm:name cm:title cm:description ia:whatEvent
ia:descriptionEvent lnk:title lnk:description TEXT TAG) AND -PATH:"//cm:_archive//*"
</default-query-template>
</search>
But it does not work as intended! As soon as I use 'text:' or 'name:' in the search field, the exclusion seems to be ignored.
What other options do I have? Basically, I just want to add the exclusion to the base query after the default query template is used.
Version is Alfresco Community 5.0.d
Thanks!
I guess you're mistaken about what query templates are meant for. Take a look at the Wiki.
What you're basically doing is programmatically saying: I've got a keyword and I want to match it against the following metadata fields.
By default it will match cm:name cm:title cm:description etc. This can be changed to a custom field or in other cases to ALL.
So putting an extra AND or whatever else in here won't work, because this isn't the actual query that will be built. I could go on more about query templates, but that won't do you any good.
In your case you'll need to modify the search.get webscript of Alfresco, specifically the function getSearchResults(params) in search.lib.js (which gets imported).
Somewhere near the end of that function it does the following:
ftsQuery = '(' + ftsQuery + ') AND -TYPE:"cm:thumbnail" AND -TYPE:"cm:failedThumbnail" AND -TYPE:"cm:rating" AND -TYPE:"st:site"' + ' AND -ASPECT:"st:siteContainer" AND -ASPECT:"sys:hidden" AND -cm:creator:system AND -QNAME:comment\\-*';
Just add your path exclusion to that query and that will do.
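The shape of that change can be sketched in a few lines. The real edit belongs in search.lib.js, which is JavaScript; the Python below only illustrates the string being built, and the function name `exclude_archive` is hypothetical. The exclusion clause itself is the one from the question.

```python
# Exclusion clause taken from the question.
ARCHIVE_EXCLUSION = '-PATH:"//cm:_archive//*"'


def exclude_archive(fts_query):
    """Wrap the user's query in parentheses and AND the exclusion onto it."""
    return "(" + fts_query + ") AND " + ARCHIVE_EXCLUSION


if __name__ == "__main__":
    print(exclude_archive('TEXT:"HODOR"'))
```

Wrapping the user's query in parentheses first matters: without them, an OR inside the user's query could bypass the exclusion because of operator precedence.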
I want to add things such as Size, BuildHost, BuildDate etc. to the rpm query output, but adding them to the spec file results in an "unknown tag" error. How can I do this so that these values are shown when I run the rpm query command?
These tags are determined when the package is built; they cannot be forced to specific values.
For example, BuildHost is hardcoded in rpmbuild and cannot be changed. There is an RFE, https://bugzilla.redhat.com/show_bug.cgi?id=1309367, to allow modifying it from the command line. But right now you cannot change it through any tag in the spec file, nor by passing an option to rpmbuild on the command line.
I assume the same applies to the other values you mentioned.
RPM5 permits arbitrary unique tag names to be added to header metadata.
The tag names are configured in a colon separated list in a macro. Then the new tags can be used in spec files and can be extracted using --queryformat.
All arbitrary tags are string (or string array) valued.
I am trying to run an LDAP query from a Linux machine (CentOS 5.8) against a Windows LDAP server and want to get the 'memberof' detail for a user. In this example, the domain is cm.loc and the user is admin1#cm.loc. Below is the ldapsearch syntax I am using. It returns an error.
Can someone point me in the right direction as to what the correct ldapsearch syntax is to query the memberof detail for a particular account?
Here is what I am using; it returns the error "ldap search ext bad search filter 7".
Where is my syntax wrong?
ldapsearch –x –h 192.168.1.20 –b 'DC=cm,DC=loc' -s base –D 'admin1#cm.loc' -W '(&(objectCategory=Group)(|(memberOf=group1)(memberOf=group2)…))'
Thank You
memberOf is an attribute with DN syntax. group1 is not a DN.
The syntax looks OK, but you need to use the full DN in the memberOf query, and it's still memberOf=, not memberOf: - if you use the colon syntax then you'll get the bad search filter error.
The next thing is that you must escape the search string according to RFC 4515. This generally means that the characters \, *, ( and ) in the search string terms must be escaped as \5c, \2a, \28 and \29 respectively; otherwise you get the same kind of error (bad search term). This is on top of any escaping that the LDAP server may already have applied to the DN.
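The escaping rule above can be sketched as a small Python helper. The function names and the group DN in the example are hypothetical (the real DN depends on where the group lives in your directory). The NUL escape is not mentioned above but is also required by RFC 4515.

```python
# RFC 4515 escapes. The backslash must be handled first so that the
# backslashes introduced by the later replacements are not escaped again.
_ESCAPES = [
    ("\\", r"\5c"),
    ("*", r"\2a"),
    ("(", r"\28"),
    (")", r"\29"),
    ("\x00", r"\00"),  # NUL, also required by RFC 4515
]


def escape_filter_value(value):
    """Escape a value for safe use inside an LDAP search filter."""
    for char, escaped in _ESCAPES:
        value = value.replace(char, escaped)
    return value


def member_of_filter(group_dns):
    """Build an OR filter matching membership in any of the given group DNs."""
    clauses = "".join(
        "(memberOf=" + escape_filter_value(dn) + ")" for dn in group_dns
    )
    return "(|" + clauses + ")"


if __name__ == "__main__":
    # Hypothetical full DN; a bare name such as "group1" would not match.
    print(member_of_filter(["CN=group1,CN=Users,DC=cm,DC=loc"]))
```

The resulting string is what would be passed as the filter argument to ldapsearch, in place of the `(memberOf=group1)` terms.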
I'm using JMeter 2.6, and have the following setup for my test:
-
|-test.jmx
|-myschema.xsd
I've set up an XML Schema Assertion, and typed "myschema.xsd" in the File Name field. Unfortunately, this doesn't work:
HTTP Request
Output schema : error: line=1 col=114 schema_reference.4:
Failed to read schema document 'myschema.xsd', because
1) could not find the document;
2) the document could not be read;
3) the root element of the document is not <xsd:schema>.
I've tried adding several things to the path, including ${__P(user.dir)} (points to the home dir of the user) and ${__BeanShell(pwd())} (doesn't return anything). I got it working by giving the absolute path, but the script is supposed to be used by others, so that's no good.
I could make it use a property value defined in the command line, but I'd like to avoid it as well, for the same reason.
How can I correctly point the Assertion to the schema under these circumstances?
It looks like in this situation you have to do one of the following:
1. Validate your XML against the XSD manually: simply use the corresponding Java code from e.g. a BeanShell Assertion or BeanShell PostProcessor. Here is a pretty nice solution: https://stackoverflow.com/a/16054/993246 (you can use any other you want as well).
2. Dig into JMeter's sources and amend the way the XML Schema file is obtained so that it supports variables in the path (File Name field), like CSV Data Set Config does. The previous way seems to be much easier, though.
3. Run your JMeter test scenario from a shell script or Ant task which first copies your XSD to JMeter's /bin dir before script execution; that way at least the XML Schema Assertion can be used "as is".
If you find any other or better way, please share it.
Hope this helps.
Summary: in the end I've used http://path.to.schema/myschema.xsd as the File Name parameter in the Assertion.
Explanation: following Alies Belik's advice, I've found that the code for setting up the schema looks something like this:
DocumentBuilderFactory parserFactory = DocumentBuilderFactory.newInstance();
...
parserFactory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaSource", xsdFileName);
where xsdFileName is a string (the attribute string is actually a constant, I inlined it for readability).
According to e.g. this page, the attribute, when given as a String, is interpreted as a URI, which includes HTTP URLs. Since I already have the schema accessible through HTTP, I've opted for this solution.
Add 'myschema.xsd' to the \bin directory of your apache-jmeter installation, next to 'ApacheJMeter.jar', or set the 'File Name' of the 'XML Schema Assertion' to the path of your 'myschema.xsd' relative to that directory.
E.g.
JMeter: C:\Users\username\programs\apache-jmeter-2.13\bin\ApacheJMeter.jar
Schema: C:\Users\username\workspace\yourTest\schema\myschema.xsd
File Name: ..\..\..\workspace\yourTest\schema\myschema.xsd
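If you want to double-check such a relative path, a small Python sketch can compute it. It assumes, as this answer does, that JMeter resolves a relative File Name against its \bin directory; the helper name `relative_schema_path` is made up, and the paths are the example ones from above.

```python
import ntpath  # Windows path rules, usable on any platform


def relative_schema_path(jmeter_bin_dir, schema_path):
    """Return schema_path expressed relative to the JMeter bin directory."""
    return ntpath.relpath(schema_path, start=jmeter_bin_dir)


if __name__ == "__main__":
    # Example locations from the answer above.
    bin_dir = r"C:\Users\username\programs\apache-jmeter-2.13\bin"
    schema = r"C:\Users\username\workspace\yourTest\schema\myschema.xsd"
    print(relative_schema_path(bin_dir, schema))
```

Note that this only works when both paths are on the same drive; `ntpath.relpath` raises a `ValueError` otherwise.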