SearchBlox custom paths setup - search

I've got a path like: https://somesite/build/namez/javadoc/index.htm
Where namez - can change.
There are also paths like: https://somesite/build/namez/javadoc/somedoc/index.htm
I need to limit indexing only on https://somesite/build/*/javadoc/index.htm
How do I do this?

You can use an Allow Paths value within the SearchBlox Collection settings as follows:
/javadoc/
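This will limit the crawler to only URLs that contain /javadoc/. Note that this value alone also admits https://somesite/build/namez/javadoc/somedoc/index.htm, since that URL contains /javadoc/ as well.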
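To single out only the index page directly under javadoc, a stricter pattern is needed. Below is a minimal Python sketch of such a pattern. The regular expression and the helper name is_wanted are illustrative assumptions, and whether the Allow Paths field accepts an expression of this shape should be checked against the SearchBlox documentation.

```python
import re

# Hypothetical pattern: exactly one path segment between /build/ and
# /javadoc/index.htm, so deeper pages such as /javadoc/somedoc/ are rejected.
JAVADOC_INDEX = re.compile(r"^https://somesite/build/[^/]+/javadoc/index\.htm$")


def is_wanted(url):
    """Return True only for the top-level javadoc index of a build."""
    return JAVADOC_INDEX.match(url) is not None


for candidate in (
    "https://somesite/build/namez/javadoc/index.htm",
    "https://somesite/build/namez/javadoc/somedoc/index.htm",
):
    print(candidate, is_wanted(candidate))
```

The [^/]+ part stands in for the changing namez segment and cannot span a slash, which is what keeps the somedoc path out.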

Related

Graph API DriveItem: How can I only get query results from the root directory (PREVENT recursive searching)?

https://graph.microsoft.com/v1.0/sites/MyDomain.sharepoint.com,00000000-1111-2222-3333-444444444444/drive/search(q='Matrix')
The above correctly returns all drive files with the word "Matrix" in them within the Shared%20Documents directory for the site's provided site ID (00000000-1111-2222-3333-444444444444).
However, it's recursive: it returns files with the word "Matrix" in them within subfolders too. I only want to query files in the root directory.
How do I search for file names, only within the root directory? I tried changing /drive to /drive/root like below, but it did not make a difference:
https://graph.microsoft.com/v1.0/sites/MyDomain.sharepoint.com,00000000-1111-2222-3333-444444444444/drive/root/search(q='Matrix')
ChatGPT recommended adding the filter $filter=parentReference/path eq '/drive/root':
https://graph.microsoft.com/v1.0/sites/MySite.sharepoint.com,00000000-1111-2222-3333-444444444444/drive/search(q='Matrix')?$filter=parentReference/path eq '/drive/root'
...but I got the error "Only createdDateTime,remoteItem.shared.sharedBy.group.id,remoteItem.shared.sharedBy.user.id is supported for filtering", which ChatGPT didn't know how to get past.
I solved this by obtaining the folder ID of the root folder and using the /drive/items/{folderId}/children endpoint with a $filter instead of /drive/search. I obtained the folder ID of the root folder by copying the id within the parentReference of an item that lies within my root directory from the output of my first command.
Then I queried the files in the root directory with the following format:
https://graph.microsoft.com/v1.0/sites/MyDomain.sharepoint.com,{siteId}/drive/items/{folderId}/children?$filter=startswith(name,'MyWord')
So in my case, the URI ended up looking like below:
https://graph.microsoft.com/v1.0/sites/MyDomain.sharepoint.com,00000000-1111-2222-3333-444444444444/drive/items/01NCSFADN6Y2GOVW7725BZO354PWSELRRZ/children?$filter=startswith(name,'Install')
Unfortunately, I couldn't use the contains function (which functions similarly to /search) and had to use startswith because contains isn't supported on $filter for text fields.
Finally, you can optionally tack on the end whichever field(s) you're interested in retrieving with the select parameter:
&select=name,@microsoft.graph.downloadUrl
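For reference, here is a minimal Python sketch of how a URL of this shape could be assembled. It only builds the string and does not call Graph. The helper name root_children_url and its parameters are hypothetical, and authentication is left out entirely.

```python
from urllib.parse import quote

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def root_children_url(site_id, folder_id, name_prefix, fields=()):
    """Build a children-listing URL filtered by a file name prefix."""
    # OData string literals escape a single quote by doubling it.
    literal = name_prefix.replace("'", "''")
    # Keep the OData punctuation readable; percent-encode everything else.
    filter_expr = quote(f"startswith(name,'{literal}')", safe="(),'")
    url = (
        f"{GRAPH_ROOT}/sites/{site_id}/drive/items/{folder_id}"
        f"/children?$filter={filter_expr}"
    )
    if fields:
        # Optional $select list to trim the response to the named fields.
        url += "&$select=" + ",".join(fields)
    return url


print(root_children_url(
    "MyDomain.sharepoint.com,00000000-1111-2222-3333-444444444444",
    "01NCSFADN6Y2GOVW7725BZO354PWSELRRZ",
    "Install",
    fields=("name",),
))
```

Because the request targets the children of one folder, the listing is not recursive, which is the behaviour that /drive/search could not provide.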

ADF Azure Data-Factory loop over folder syntax - wildcard?

I'm trying to loop over different country folders that each contain a fixed subfolder named survey (e.g. Spain/survey, USA/survey).
Where and how do I need to define a wildcard or parameter for the countries so that I can loop over all the files in the survey folders?
What is the right wildcard syntax (the equivalent of like 'survey%' in SQL)?
I tried several ways to define it with no success and would be happy to get some help on this. Thanks!
If the list of paths is static, you can create a parameter, or store the paths in a SQL database and retrieve them with a Lookup activity.
Pass the output to a ForEach activity, and within the ForEach activity use a Copy activity.
You can parameterize the input dataset to receive the file paths, so you don't need any wildcard characters and can use the actual paths themselves.
Hope this is helpful.

exclude a certain path from all user searches

Unfortunately, we have a special folder named "_archive" throughout our repository.
This folder has its purpose. But when searching for content/documents, we want to exclude it and all content beneath "_archive".
So what I want is to exclude the path and its members from all user searches. The syntax is easy with FTS:
your_query AND -PATH:"//cm:_archive//*"
to test:
https://www.docdroid.net/RmKj9gB/search-test.pdf.html
take the pdf, put it into your repo twice:
/some_random_path/search-test.pdf
/some_random_path/_archive/search-test.pdf
In node-browser everything works as expected:
TEXT:"HODOR" AND -PATH:"//cm:_archive//*"
= 1 result
TEXT:"HODOR"
= 2 results
So, my idea was to edit search.get.config.xml and add the exclusion to the list of properties:
<search>
<default-operator>AND</default-operator>
<default-query-template>%(cm:name cm:title cm:description ia:whatEvent
ia:descriptionEvent lnk:title lnk:description TEXT TAG) AND -PATH:"//cm:_archive//*"
</default-query-template>
</search>
But it does not work as intended! As soon as I use 'text:' or 'name:' in the search field, the exclusion seems to be ignored.
What other options do I have? Basically, I just want to add the exclusion to the base query after the default query template is applied.
Version is Alfresco Community 5.0.d
thanks!
I think you're mistaken about what query templates are meant for. Take a look at the Wiki.
What you're basically saying programmatically is: I've got a keyword, and I want to match it against the following metadata fields.
By default it will match cm:name, cm:title, cm:description, etc. This can be changed to a custom field or, in other cases, to ALL.
So putting an extra AND or anything else there won't work, because this isn't the actual query that will be built. I could go on about query templates, but that won't do you any good.
In your case you'll need to modify Alfresco's search.get webscript, specifically the function getSearchResults(params) in search.lib.js (which gets imported).
Somewhere near the end of that function it does the following:
ftsQuery = '(' + ftsQuery + ') AND -TYPE:"cm:thumbnail" AND -TYPE:"cm:failedThumbnail" AND -TYPE:"cm:rating" AND -TYPE:"st:site"' + ' AND -ASPECT:"st:siteContainer" AND -ASPECT:"sys:hidden" AND -cm:creator:system AND -QNAME:comment\\-*';
Just add your path exclusion to that query and that will do.
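The change amounts to appending one more exclusion clause to the final query string. The sketch below shows the resulting string in Python purely for illustration, since the real edit belongs in the JavaScript of search.lib.js. The helper name exclude_archive is hypothetical.

```python
# The exclusion clause from the question, kept as a single constant.
ARCHIVE_EXCLUSION = '-PATH:"//cm:_archive//*"'


def exclude_archive(fts_query):
    """Wrap a query so the _archive exclusion applies to the whole of it."""
    # Parentheses keep the exclusion from binding to only the last term.
    return f"({fts_query}) AND {ARCHIVE_EXCLUSION}"


print(exclude_archive('TEXT:"HODOR"'))
```

Appending the clause after the template has been expanded is what makes it apply even when the user types 'text:' or 'name:' in the search field.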

node glob pattern include all js except in certain folders

I have a project that looks like this
ls foo/
- file0.js
- a/file1.js
- b/file2.js
- c/file3.js
- d/file4.js
How do I write a glob pattern to exclude the c & d folder but get all other javascript files? I have looked here for an example, but cannot get anything to work.
I imagine the solution would look similar to this:
glob('+(**/*.js|!(c|d))', function(err, files) {
return console.log(files);
});
I am wanting back
- file0.js
- a/file1.js
- b/file2.js
For environments where there isn't a second parameter to set the exclusion, we can achieve it using the pattern demonstrated in the example below:
Pattern
/src/**/!(els)/*.scss
structure
/src/style/kit.scss
/src/style/els/some.scss
/src/style/els/two.scss
result
it will select only /src/style/kit.scss
We can use http://www.globtester.com or https://www.digitalocean.com/community/tools/glob to test quickly online.
update
If we are working with the Gulp task runner, or with other tools or classes that offer a second parameter (or multiple parameters) for exclusion, or that accept multiple glob selectors supporting exclusion, then we can do as the Gulp example below shows:
for Gulp
We pass an array instead of just a string (multiple glob selectors, applied one after another to add more files or to exclude some)
src(['src/style/**/*.{scss,sass}', '!(src/style/els/**)'])
We can have multiple exclusions
watch(['src/style/**/*.{scss,sass}', '!(src/style/els/**)', '!(src/style/_somefileToExclude.scss)'])
In gulp you can use this with any method that supports an array of glob selectors. src and watch are what I used it for.
Note: If you want to exclude a folder and all its subfolders, use ** as above and not **/*, which will not work. If you need specific file types (extensions), you can use **/*.scss, for example.
There is an ignore option I glossed over in the readme:
glob('**/*.js', { ignore: '{c,d}/**' }, cb)
This will exclude both the c and d folders from the match.
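The include-then-ignore idea behind that option can be sketched outside node-glob as well. The Python snippet below is only an illustration of the logic applied to a list of relative paths. It does not use or emulate the glob library, and the helper name select_js is hypothetical.

```python
from pathlib import PurePosixPath


def select_js(paths, ignored_dirs=("c", "d")):
    """Keep .js files, skipping anything under an ignored top-level folder."""
    kept = []
    for raw in paths:
        path = PurePosixPath(raw)
        if path.suffix != ".js":
            continue
        # parts[:-1] holds the directories; an empty tuple means a root file.
        if path.parts[:-1] and path.parts[0] in ignored_dirs:
            continue
        kept.append(raw)
    return kept


files = ["file0.js", "a/file1.js", "b/file2.js", "c/file3.js", "d/file4.js"]
print(select_js(files))
```

With the project layout from the question, this keeps file0.js, a/file1.js and b/file2.js.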

Relative path for JMeter XML Schema?

I'm using JMeter 2.6, and have the following setup for my test:
-
|-test.jmx
|-myschema.xsd
I've set up an XML Schema Assertion, and typed "myschema.xsd" in the File Name field. Unfortunately, this doesn't work:
HTTP Request
Output schema : error: line=1 col=114 schema_reference.4:
Failed to read schema document 'myschema.xsd', because
1) could not find the document;
2) the document could not be read;
3) the root element of the document is not <xsd:schema>.
I've tried adding several things to the path, including ${__P(user.dir)} (points to the home dir of the user) and ${__BeanShell(pwd())} (doesn't return anything). I got it working by giving the absolute path, but the script is supposed to be used by others, so that's no good.
I could make it use a property value defined in the command line, but I'd like to avoid it as well, for the same reason.
How can I correctly point the Assertion to the schema under these circumstances?
It looks like in this situation you have to do one of the following:
1. Validate your XML against the XSD manually: simply use the corresponding Java code from, e.g., a BeanShell Assertion or BeanShell PostProcessor. Here is a pretty nice solution: https://stackoverflow.com/a/16054/993246 (you can use any other you want as well).
2. Dig into JMeter's sources and amend how the XML Schema file is obtained so that variables are supported in the path (File Name field), like CSV Data Set Config does. The previous way seems to be much easier, though.
3. Run your JMeter test scenario from a shell script or Ant task that first copies your XSD to JMeter's /bin dir before script execution; at least then the XML Schema Assertion can be used "as is".
If you find any other or better way, please share it.
Hope this helps.
Summary: in the end I've used http://path.to.schema/myschema.xsd as the File Name parameter in the Assertion.
Explanation: following Alies Belik's advice, I've found that the code for setting up the schema looks something like this:
DocumentBuilderFactory parserFactory = DocumentBuilderFactory.newInstance();
...
parserFactory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaSource", xsdFileName);
where xsdFileName is a string (the attribute string is actually a constant, I inlined it for readability).
According to e.g. this page, the attribute, when in the form of a String, is interpreted as a URI, which includes HTTP URLs. Since I already have the schema accessible through HTTP, I've opted for this solution.
Add 'myschema.xsd' to the \bin directory of your apache-jmeter installation, next to 'ApacheJMeter.jar', or set the 'File Name' in the 'XML Schema Assertion' to the path of your 'myschema.xsd' relative to that directory.
E.g.
JMeter: C:\Users\username\programs\apache-jmeter-2.13\bin\ApacheJMeter.jar
Schema: C:\Users\username\workspace\yourTest\schema\myschema.xsd
File Name: ..\..\..\workspace\yourTest\schema\myschema.xsd
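The relative path in that example can be double-checked mechanically. The Python sketch below uses the standard ntpath module, which applies Windows path rules on any operating system, to derive the path from the bin directory to the schema. The two directories are the example values from above, not required locations.

```python
import ntpath

# Example locations from the answer above.
JMETER_BIN = r"C:\Users\username\programs\apache-jmeter-2.13\bin"
SCHEMA = r"C:\Users\username\workspace\yourTest\schema\myschema.xsd"

# Path to the schema as seen from JMeter's bin directory.
relative = ntpath.relpath(SCHEMA, start=JMETER_BIN)
print(relative)
```

Three parent steps are needed because bin sits three levels below the shared C:\Users\username folder.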

Resources