Emacs Searching

I have a convoluted search request. Let's say that I am searching for a URI pattern. I know the scheme and the authority, say http://mycompany.com.
After this prefix, most of the URIs in my search domain ideally have two path variables:
/Context/Resource. A URI could have more, but it will always have a context.
I would like to find the distinct set of first path variables. I do not care about the second and subsequent path variables. Say the qname for the prefix is myc, and I have this:
myc:/context1/resource1
myc:/context1/resource2
myc:/context2/resource1
myc:/context3/resource1
myc:/context4/resource8
myc:/context1/resource12
I should get context1 through context4. Thank you for your time.

If I understand you correctly,
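(require 'cl-lib)  ; cl-lib replaces the deprecated cl package
;; Collect the first path segment of every myc:/ URI after point, without duplicates.
(cl-remove-duplicates
 (cl-loop while (re-search-forward "myc:/\\(.*?\\)/" nil t)
          collect (match-string-no-properties 1))
 :test #'string=)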

Emacs supports regex searches which are normally bound to C-M-s.
The Emacs manual has a nice section about regular expressions in Emacs.
There is also M-x regexp-builder to help you build the search string with real-time feedback.

Related

make menhir find all alternatives?

I would like to change the behavior of menhir's output in the following way:
I want it to look up all grammatical alternatives, if it finds any, put them in a list, and give me back this ambiguous interpretation. It should not reduce conflicts, just store them.
In the source code of menhir, it seems to me that I have to look in "Engine.ml". The resulting syntactically determined token comes in a variant type item "Accepted v" as a state of a checkpoint of the grammatical automaton. This content is produced earlier by a function "accept env prod", which is part of a bundle of recursive functions that change the states.
Do you have a tip on how I could change these functions to put all the possible results in a list and proceed as if nothing happened? Or do you think that this won't work anyway?
Thanks.

What you are looking for is a GLR parser generator (G is for generalized). Menhir is not such a tool, and I doubt you could modify it easily to do what you want.
However, there is another tool that does exactly what you want: dypgen.

How to get (translatable) strings from specific domain with POEdit

I have been trying for hours to find a way to set up POEdit so that it grabs the text from a specific domain only.
My gettext function looks like this:
function ri($id, $parameters = array(), $domain = 'default', $locale = null)
A sample call:
echo ri('Text %xyz%', array('%xyz%'=>100), 'myDomain');
I will need to grab only the text with the domain myDomain to translate, or at least I want POEdit to put these texts into domain specific files. Is there a way to do it?
I found several questions that are similar but the answers don't really tell me what to do (I think I'm such a noob it must be explained in plain English for me to understand):
How to set gettext text domain in Poedit?
How to get list of translatable messages

After days of searching, I finally found the answer here:
http://sourceforge.net/mailarchive/message.php?msg_id=27691818
xgettext recognizes context in strings and gives a msgctxt field in the *.pot file, which is recognized by translation software as a context and is shown as such.
This can be done in 3 ways:
String in code should be in the format _t('context','string'); and xgettext invocation should be in the form --keyword=_t:1c,2 (this basically explains to xgettext that there are 2 arguments in the keyword function, 1st one is context, 2nd one is string)
String in code in the format _t('string','context'); and xgettext invocation should be in the form --keyword=_t:1,2c
String in the code should be as _t('context|string') and xgettext invocation should be in the form --keyword=_t:1g
So to answer my own question, I added this to the "sources keywords" tab of Poedit:
ri:1,3c
ri is the function name, 1 is the location of the stringid, 3 is the location of the context/domain
Hope this helps someone else, I hate all these cryptic documents

(This is a repost of my answer to the same thing here.)
Neither GNU gettext tools nor Poedit (which uses them) support this particular misuse of gettext.
In gettext, domain is roughly “a piece of software” — a program, a library, a plugin, a theme. As such, it typically resides in a single directory tree and is alone there — or at the very least, if you have multiple pieces=domains, you have them organized sanely into some subdirectories that you can limit the extraction to.
Mixing and matching domains within a single file as you do is not how gettext was intended to be used, and there’s no reasonable solution to handle it other than using your own helper function, e.g. by wrapping all myDomain texts into __mydomain (which you must define, obviously) and adding that to the list of keywords in Poedit when extracting for myDomain and not adding that to the list of keywords for other domains' files.

What's the name for hyphen-separated case?

This is PascalCase: SomeSymbol
This is camelCase: someSymbol
This is snake_case: some_symbol
So my question is whether there is a widely accepted name for this: some-symbol? It's commonly used in URLs.

There isn't really a standard name for this case convention, and there is disagreement over what it should be called.
That said, as of 2019, there is a strong case to be made that kebab-case is winning:
https://trends.google.com/trends/explore?date=all&q=kebab-case,spinal-case,lisp-case,dash-case,caterpillar-case
spinal-case is a distant second, and no other terms have any traction at all.
Additionally, kebab-case has entered the lexicon of several JavaScript code libraries, e.g.:
https://lodash.com/docs/#kebabCase
https://www.npmjs.com/package/kebab-case
https://v2.vuejs.org/v2/guide/components-props.html#Prop-Casing-camelCase-vs-kebab-case
However, there are still other terms that people use. Lisp has used this convention for decades as described in this Wikipedia entry, so some people have described it as lisp-case. Some other forms I've seen include caterpillar-case, dash-case, and hyphen-case, but none of these is standard.
So the answer to your question is: No, there isn't a single widely-accepted name for this case convention analogous to snake_case or camelCase, which are widely-accepted.

It's referred to as kebab-case. See lodash docs.

It's also sometimes known as caterpillar-case

This is the most famous case, and it has many names:
kebab-case: the name most widely adopted by official software
caterpillar-case
dash-case
hyphen-case or hyphenated-case
lisp-case
spinal-case
css-case
slug-case
friendly-url-case

As the character (-) is referred to as "hyphen" or "dash", it seems more natural to name this "dash-case", or "hyphen-case" (less frequently used).
As mentioned in Wikipedia, "kebab-case" is also used. Apparently (see answer) this is because the character would look like a skewer... It needs some imagination though.
Used in lodash lib for example.
Recently, "dash-case" was used by
Angular (https://angular.io/guide/glossary#case-types)
NPM modules
https://www.npmjs.com/package/case-dash (removed ?)
https://www.npmjs.com/package/dasherize

Adding the correct link here: Kebab Case, which is all lowercase with - separating words.

I've always called it, and heard it be called, 'dashcase.'

There is no standardized name.
Libraries like jQuery and lodash refer to it as kebab-case. So does the Vue.js JavaScript framework. However, I am not sure whether it's safe to declare that it's referred to as kebab-case throughout the JavaScript world.

I've always known it as kebab-case.
On a funny note, I've heard people call it a SCREAM-KEBAB when all the letters are capitalized.
Kebab Case Warning
I've always liked kebab-case as it seems the most readable when you need whitespace. However, some programs interpret the dash as a minus sign, and it can cause problems as what you think is a name turns into a subtraction operation.
first-second // first minus second?
ten-2 // ten minus two?
Also, some frameworks parse dashes in kebab cased property. For example, GitHub Pages uses Jekyll, and Jekyll parses any dashes it finds in an md file. For example, a file named 2020-1-2-homepage.md on GitHub Pages gets put into a folder structured as \2020\1\2\homepage.html when the site is compiled.
Snake_case vs kebab-case
A safer alternative to kebab-case is snake_case, or SCREAMING_SNAKE_CASE, as underscores cause less confusion when compared to a minus sign.
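
The minus-sign hazard described above can be seen in JavaScript property access. This is only a small sketch; the object and variable names (style, size) are made up for illustration.

```javascript
// Hypothetical object with both a plain key and a kebab-case key.
const style = { font: 20, 'font-size': 12 };
const size = 5;

// Dot access cannot name a hyphenated key: this parses as (style.font) - size.
const viaDot = style.font-size;

// Bracket access with a string literal reaches the kebab-case key itself.
const viaBracket = style['font-size'];

console.log(viaDot, viaBracket); // prints "15 12"
```

With a snake_case key such as font_size, plain dot access would work and no subtraction would be implied.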

I'd simply say that it was hyphenated.

Worth mentioning, from vim-abolish:
https://github.com/tpope/vim-abolish/blob/master/doc/abolish.txt#L152
dash-case or kebab-case

In Salesforce, it is referred to as kebab-case. See below:
https://developer.salesforce.com/docs/component-library/documentation/lwc/lwc.js_props_names

Here is a more recent discombobulation. Documentation everywhere in AngularJS, as well as Pluralsight courses and books on Angular, all refer to kebab-case as snake-case, not differentiating between the two.
It's too bad caterpillar-case did not stick, because snake_case and caterpillar-case are easily remembered and actually look like what they represent (if you have a good imagination).

My ECMAScript proposal for String.prototype.toKebabCase.
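String.prototype.toKebabCase = function () {
  return this.valueOf()
    .replace(/-/g, ' ')    // treat existing hyphens as word breaks
    .split('')
    // put a space before every character that equals its own upper-case form
    // (upper-case letters, but also digits and punctuation)
    .reduce((str, char) => char.toUpperCase() === char ?
      `${str} ${char}` :
      `${str}${char}`, '')
    .replace(/ * /g, ' ')  // collapse runs of spaces into one
    .trim()
    .replace(/ /g, '-')    // join the words with hyphens
    .toLowerCase();
};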

This casing can also be called a "slug", and the process of turning a phrase into it "slugify".
https://hexdocs.pm/slugify/Slug.html
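
As a rough illustration of what "slugify" usually means, here is a minimal sketch in JavaScript. The function name slugify and its exact rules (ASCII letters and digits only, everything else becomes a hyphen) are assumptions made for this example; it is not the API of the library linked above.

```javascript
// Hypothetical helper: turn a phrase into a lowercase, hyphen-separated slug.
function slugify(phrase) {
  return phrase
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')  // any run of other characters becomes one hyphen
    .replace(/^-+|-+$/g, '');     // drop hyphens left at either end
}

console.log(slugify('Hello, World!')); // prints "hello-world"
```

Real slug libraries typically also transliterate accented and non-Latin characters, which this sketch simply discards.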

In Sphinx Search, how do I add "hashtag" to the charset_table?

I would like people to be able to search #photography as well as photography. Those should be treated as two different words in Sphinx. By default, #photography maps to photography, and I can't search for hashtags.
I read on this page that you can add the hash tag to the charset_table to accomplish this. I am completely clueless on how to do that. I don't know unicode, and I don't know what my charset_table should be.
Can someone tell me what my charset_table should be? Thanks.
# charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
Note: I plan on using real-time index. (not sure if this makes a difference)

It's U+0023 according to the Unicode table. So the final config should look like this:
charset_table = 0..9, A..Z->a..z, _, a..z, U+23, U+410..U+42F->U+430..U+44F, U+430..U+44F
Don't forget about the charset_type variable. AFAIK, this example charset_table is for utf-8. Besides this, you should delete U+23 from the blend_chars variable to allow Sphinx to index it as a legitimate character.

I would like people to be able to search #photography as well as photography. Those should be treated as two different words in Sphinx. By default, #photography maps to photography, and I can't search for hashtags.
Good day.
I think there may be a workaround for your problem, but:
It is a bad idea to pass the user's query directly to the search function.
Before calling the search function in the Sphinx engine, you need to do some processing on the user's string.
For example, you can check the user's string for special characters and delete them from the query. After that, you can call the search function with the processed query.
Good luck.
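
The preprocessing step suggested above could look something like the following sketch. The function name sanitizeQuery and the choice of which characters to keep (letters, digits, underscore, whitespace, and # so that hashtags survive) are assumptions for illustration, not something Sphinx prescribes.

```javascript
// Hypothetical helper: strip special characters from a user query before
// handing it to the search engine, while keeping '#' for hashtag searches.
function sanitizeQuery(userQuery) {
  return userQuery
    .replace(/[^\p{L}\p{N}_#\s]+/gu, ' ')  // replace anything else with a space
    .replace(/\s+/g, ' ')                  // collapse repeated whitespace
    .trim();
}

console.log(sanitizeQuery('#photography & "cats"!')); // prints "#photography cats"
```

Keeping # in the query only helps if the index also treats it as a word character, as in the charset_table answer earlier in this thread.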

Turn off abbreviation in getopt_long (optarg.h)?

Is it possible to turn off abbreviation in getopt_long()? From the man page:
Long option names may be abbreviated if the abbreviation is unique or is an exact match for some defined option.
I want to do this because the specification I have received for a piece of code requires full-length exact match of the flags, and there are many flags.

Codeape,
It appears there isn't a way to disable the abbreviation feature. You aren't alone in wishing for this feature. See: http://sourceware.org/bugzilla/show_bug.cgi?id=6863
Unfortunately, it seems the glibc developers don't want the option, as the bug report linked above was resolved as "WONTFIX". You may be out of luck here :-\

If you use argp_parse() instead of getopt() (highly recommended, BTW), you can access the exact flag entered by the user through
state->argv[ state->next - 2 ]
It's a bit of a hack, but should work.

This is not a perfect solution, but you can check the exact argument given by the user after calling getopt_long() (normally within the switch), like below:
if (strcmp(argv[optind-1], "--longoption") == 0)
optind points to the next argument that you need to process. Thus, you can access the original argument using optind-1.
