What on Earth is findFiles?
It's in the VS Code API documentation you didn't read, under file-system support.
Choice of weapons
VS Code is a Node application, and so are extensions. Everyone knows this. So the first time we need to manipulate a file, we import fs, which works, and we try some file manipulation, which works. At this point we congratulate ourselves and the VS Code team on the transfer of skills, and we stop thinking about it because we know exactly how to do this.
Why would anyone doubt this choice? By all appearances Node fs is an outstanding solution. It's well documented, widely understood and cross-platform. Nothing could possibly go wrong.
Everything has gone wrong
Tempora mutantur, nos et mutamur in illis: times change, and we change with them. Things like Docker and WSL mean that even Windows users on a single host are quite likely to use a remote filesystem. But Node fs, and by extension your extension, only knows about the filesystem in which it was launched.
You need file-system redirection. That implies a proxy at the other end, and some sort of indirection so you can specify which filesystem on what host using which authentication. Once you start thinking about this you wonder why the VS Code team doesn't publish an API for whatever magic they use to do this themselves.
Remember those annoying vscode.Uri objects you have to unpick to get fsPath? The ones returned by all the file system UI interaction events? They have all that metadata in them.
Cut to the chase
If you look at vscode.workspace.fs ("fs"?! that's a clue!) you will find it exports all the methods you need to read and write files and directories. All of these methods take a vscode.Uri. At this point I hope you've connected the dots yourself. Yep, remote filesystem access for the masses (if you can call extension writers the masses).
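For instance, here's a read-modify-write that works identically whether the Uri points at a local disk, WSL, or an SSH remote. A minimal sketch; the helper name is my own:

import * as vscode from "vscode";

// Append a line to a file on whatever filesystem the Uri lives on.
async function appendLine(uri: vscode.Uri, line: string): Promise<void> {
  const bytes = await vscode.workspace.fs.readFile(uri); // Uint8Array
  const text = new TextDecoder().decode(bytes) + line + "\n";
  await vscode.workspace.fs.writeFile(uri, new TextEncoder().encode(text));
}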
You still haven't answered the question, Pete: what is findFiles?
Oh. Well. Okay. Right, I expect you all know what globby is and what it's for. Obviously you can't use it on a remote filesystem. And your extension depends on it. Guess what this is for:
findFiles(
include: GlobPattern,
exclude?: GlobPattern | null,
maxResults?: number,
token?: CancellationToken
): Thenable<Uri[]>
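A basic call looks like this (the patterns here are just examples):

const dlls = await vscode.workspace.findFiles("**/*.dll", "**/node_modules/**");
// dlls is a vscode.Uri[], ready to hand straight to vscode.workspace.fs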
That's globby for remote filesystems, what's your problem?
My problem is it takes a single pattern, not an array of patterns like globby.
While I can work around that by comma-separating the patterns and wrapping them in curly braces, like so: {**/*.dll,**/*.exe}, that falls apart when the patterns already use this syntax, because you're not allowed to nest it. And my users are very likely to want patterns like **/*.{exe,dll,pdb,hex,bin,pdf,png,jpg,jpeg,gif,pfx}
So what is it you want, Pete?
So far this is all just you showing off. What were you hoping for in an answer?
Well, the obvious solution is to expand patterns containing brace expressions into multiple literal patterns, before combining them into a single non-nested expression. But that's boring and hard, and I hate writing regexes.
Or maybe it's not necessary and there's a solution I haven't noticed. I don't know. Help!
[F]alls apart when the patterns already use this syntax. You're not allowed to nest it.
I think what you can do is just process the nested syntax yourself and output non-nested syntax that has the meaning you require.
E.g. suppose {{a,b},c}x means {a,b}x cx, which then means ax bx cx.
Observe that {a,b,c}x also means ax bx cx and so under this notion of nesting, {{a,b},c} is equivalent to {a,b,c}.
If that's the desired semantics, you can just parse the nested notations in your pattern, and flatten any nesting.
A brace expansion library already exists for Node. It is mysteriously named braces and is available from npm or from its repository on GitHub.
It operates on a string and returns an array of strings. It can expand nested brace expressions, so a loop is not necessary for my application.
const toxicExcludes = `{${excludes.join(",")}}`; // may contain nested braces
const safeExcludes = `{${braces.expand(toxicExcludes).join(",")}}`;
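Putting it all together (a sketch; excludes stands for a hypothetical array of user-configured patterns, and braces is the npm package above):

import * as vscode from "vscode";
import braces from "braces";

async function findWithUserExcludes(excludes: string[]): Promise<vscode.Uri[]> {
  const toxicExcludes = `{${excludes.join(",")}}`; // may contain nested braces
  const safeExcludes = `{${braces.expand(toxicExcludes).join(",")}}`; // flattened
  return vscode.workspace.findFiles("**/*", safeExcludes);
}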
Related
Our users can enter questions that get answered by students. Our users need an extensible, flexible way to define the correct answers to these questions (each of which is stored as a simple string).
I would like to expose a library of domain-specific functions that users can call on to describe the correct answer. E.g.:
exact_match("puppy") // means the correct answer is the string 'puppy'
or
contains("yesterday") // means any answer with the word 'yesterday' is correct
The naive implementation would involve eval'ing user-supplied strings in a sandboxed runtime (like a JavaScript VM or Ruby VM). But I'd like to go further and only allow specific functions to be called; any other scripting would be discarded. Such that:
puts("foo"); contains("yesterday")
would be illegal, since we don't expose or allow puts().
How can I constrain the execution environment to only run a whitelist of functions? Or is there a different approach to build this kind of external-facing DSL instead of trying to constrain an existing language to a subset of functions?
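To make that concrete, the effect I'm after is roughly this hypothetical sketch (names and grammar invented): a grammar so small that only whitelisted calls are even expressible, so there's nothing to sandbox.

type Matcher = (answer: string) => boolean;

// The whole "language": whitelisted functions taking one string literal.
const whitelist: Record<string, (arg: string) => Matcher> = {
  exact_match: (arg) => (answer) => answer === arg,
  contains: (arg) => (answer) => answer.includes(arg),
};

function compile(rule: string): Matcher {
  // Accepts only: identifier("string literal"). Nothing else parses.
  const m = /^\s*([a-z_]+)\(\s*"([^"]*)"\s*\)\s*$/.exec(rule);
  if (!m) throw new Error(`unparseable rule: ${rule}`);
  const factory = whitelist[m[1]];
  if (!factory) throw new Error(`function not allowed: ${m[1]}`);
  return factory(m[2]);
}

// compile('contains("yesterday")')("it rained yesterday")  -> true
// compile('puts("foo"); contains("yesterday")')            -> throws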
I would check out MPS by JetBrains if I were you; it's an open-source DSL creation tool. I have never used it myself, but from everything I have seen of it, it's very intuitive, and all of their other products are incredibly powerful.
Just because you're creating a DSL doesn't necessarily mean you have to give the user the ability to enter the code as text.
The key to this is providing a list of method names and your special keyword for them, the "FunCode" tag in the code example below:
Create a mapping from keyword to code, let them define everything they need, and then use it. I would actually build my own XML parser so that it's not hackable, at least not zero-day-exploit hackable.
<strDefs>
  <strDef><strNam>sickStr</strNam>
    <strText>sick</strText><strNum>01</strNum></strDef>
  <strDef><strNam>pupStr</strNam>
    <strText>puppy</strText><strNum>02</strNum></strDef>
</strDefs>
<funDefs>
  <funDef><funCode>pfContainsStr</funCode><funLabel>contains</funLabel>
    <funNum>01</funNum></funDef>
  <funDef><funCode>pfXact</funCode><funLabel>exact_match</funLabel>
    <funNum>02</funNum></funDef>
</funDefs>
<queries>
  <query><fun>01</fun><str>02</str></query>
</queries>
The above XML represents the idea and the structure of what to do, but it should really be realized through a user interface, so the user is constrained. The user-interface code that allows data entry of the above should run on your server, and users only interact with it. Any code that runs in their browser is hackable, because they can just save the page, edit the HTML (and/or JavaScript), and run that, which is their code now, not yours anymore.
You can't really open the door (Pandora's box) and allow just anyone to write just any code and have it evaluated/interpreted by the language parser, because some hacker is going to exploit it. You must lock down the strings, probably by having users enter them into your database in an earlier step, where each string gets its own token that YOU generate (a SQL Server primary key is simple, usable, and secure), but give them a display representation so it's readable to them.
Then give them a list of methods / functions they can use, along with a token (a primary key can also serve here, perhaps with a kind of table prefix) and also a display representation (label).
If you have them put all of their labels into yet another table, you can have SQL make sure that all of their labels are unique to each other in the whole "language", and then you can allow them to try to define their expressions in the language they want to use. This has the advantage that foreign languages can be used, but you don't have to do anything terribly special.
An important piece would be a verify button that translates their expression into unique tokens and back again, checking that the round trip was successful. If it wasn't, there's some kind of ambiguity, and you might allow them the option of using the list of tokens as the source in that case.
If you rely heavily on set-based logic for the underlying foundation of the language and your tables, you should be able to produce a coherent DSL. Many DSL creation problems are problems of integrity, where underlying assumptions are contradictory, unintentionally mutually exclusive, or nonsensical. Truth is an unshakeable foundation; anything else has a lie somewhere that you're trying to build on.
Sudoku is illustrative here. When you screw up a Sudoku, you often don't know that you have done so, and you keep building on that false foundation until, at the completion of the puzzle, one whole string of assumptions disagrees with another. They can't both be true, but you can't tell where you went wrong, because you're too far from the mistake and cannot (easily) work backwards; all the steps taken look correct. A DSL, a database schema, and code are all this way. Baby steps that are double- and even triple-checked, and hopefully "correct by inspection", are the best way to "grow" a DSL, slowly, piece by piece. The best way to not have flaws is to not add them in the first place.
You don't want bugs in your DSL. Keep it spartan. KISS: keep it simple, Spartacus! I have personally found that keeping it set-based, if not overtly then under the covers, accomplishes this very well.
Finally, to be able to think this way, I've studied languages for a long time and have cultivated a curiosity about how languages came to be. Books are a good source of information, as they tend to be higher quality than the internet, which is nevertheless also indispensable. Some of my favorite languages: Forth, Factor, SETL, F#, C#, Visual FoxPro (especially for its embedded SQL), T-SQL, Common LISP, Clojure, and probably my favorite, Dylan, an infix Lisp without parentheses that Apple experimented with and abandoned, with a syntax that seems to me reminiscent of Pascal, which I rather liked. The actual list is much longer than that (and I haven't written code in many of them; I've just studied them or their genesis), but that's enough for now.
One of my favorite books, and immensely interesting for the "people" side of it, is Masterminds of Programming: Conversations with the Creators of Major Programming Languages, by Federico Biancuzzi and Chromatic (O'Reilly).
By the way, don't let them compromise the integrity of your DSL -- require that it is expressible set-based, and things should go well (IMHO). I hope it works out well for you. Add a comment to my answer telling me how it worked out, if you think of it. And don't forget to choose my answer if you think it's the best! We work hard for the money! ;-)
I find the naming convention of Node a little strange. For instance, in the file system module, the read-link function's name is all lower-case:
fs.readlink
But the read-file function's name is camelCased:
fs.readFile
It confused me. After mistyping them a few times, I figured I should ask: is there a naming convention that will help me memorize the API names?
Node's default convention is camelCase.
But functions in the file system module are named after their respective POSIX C interface functions, for example readdir and readlink.
These function names are well known to Linux developers, so it was decided to use them as-is (as a single word), without camelizing.
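To see the split concretely (an illustrative sketch, not an exhaustive list):

import * as fs from "fs";

const cb = (err: NodeJS.ErrnoException | null) => { if (err) console.error(err); };

fs.readlink("/tmp/link", cb);      // POSIX readlink(2): stays all lower-case
fs.readdir("/tmp", cb);            // POSIX readdir(3): stays all lower-case
fs.readFile("/tmp/file", cb);      // Node-original helper: camelCased
fs.writeFile("/tmp/file", "", cb); // Node-original helper: camelCased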
Always go camelCase; almost everyone does.
The Node core does have various inconsistencies of this kind, like the one you mentioned; process has some too (process.get*() vs. process.memoryUsage()), and there are others. But the majority of the core methods are camelCased.
Until you memorize the ones that are not camelCased, I'd say it's a good tip to always develop with the docs open ;)
In LabVIEW, is it possible to tell from within a VI whether an output terminal is wired in the calling VI? Obviously, this would depend on the calling VI, but perhaps there is some way to find the answer for the current invocation of a VI.
In C terms, this would be like defining a function that takes arguments which are pointers to where to store output parameters, but will accept NULL if the caller is not interested in that parameter.
As was said, you can't do this in the natural way, but there's a workaround using data value references (requires LV 2009). It is the same idea as giving a NULL pointer for an output argument: the result is passed in as a data value reference (which is the pointer), and the SubVI checks it for Not a Reference. If it is null, it does nothing.
Here is the SubVI (the True case does nothing, of course):
[VI snippet image]
And here is the calling VI:
[VI snippet image]
The images are VI snippets, so you can drag and drop them onto a diagram to get the code.
I'd suggest you're going about this the wrong way. If the compiler is not smart enough to avoid the calculation on its own, make two versions of this VI: one that does the expensive calculation, and one that does not. Then make a polymorphic VI that allows you to switch between them. You already know at design time which version you want (because you're either wiring the output terminal or not), so just use the correct version of the polymorphic VI.
Alternatively, pass in a variable that switches a Case structure around the expensive section of your calculation on or off.
Like Underflow said, the basic answer is no.
You can have a look here to get what is probably the most official and detailed answer that will ever be provided by NI.
Extending your analogy, you can do this in LV, except LV doesn't have the concept of null that C does. You can see an example of this here.
Note that the code in the link Underflow provided will not work in an executable, because diagrams are stripped by default when building an EXE and because the RTE does not support some of the properties and methods used there.
Sorry, I see I misunderstood the question. I thought you were asking about an input, so the idea I suggested does not apply. The restrictions I pointed do apply, though.
Why do you want to do this? There might be another solution.
Generally, no.
It is possible to do a static analysis on the code using the "scripting" features. This would require pulling the calling hierarchy, and tracking the wire references.
Pulling together a trial of this, there are some difficulties: multiple identical SubVIs on the same diagram are difficult to distinguish, and terminal references appear to be accessible mostly by name, which can lead to collisions with identically named terminals of other VIs.
NI has done a bit of work on a variation of this problem; check out this.
In general, the LV compiler optimizes the machine code in such a way that unused code is not even built into the executable.
This does not apply to subVIs (because there's no way of knowing that you won't try to use the value of the indicators somehow, although LV could do it if it removed the front panel when building an executable, and possibly does), but there is one way to get it to apply to a subVI: inline the subVI, which should allow the compiler to see that the outputs aren't used. You could also set its priority to subroutine, which may have the same effect, but I wouldn't recommend that.
Officially, in-lining is only available in LV 2010, but there are ways of accessing the private VI property in older versions. I wouldn't recommend it, though, and it's likely that 2010 has some optimizations in this area that older versions did not.
P.S. In general, the details of the compiling process are not exposed and vary between LV versions as NI tweaks the compiler. The whole process is supposed to have been given a major upgrade in LV 2010 and there should be a webcast on NI's site with some of the details.
I have been getting very frustrated recently in dealing with a massive bulk of legacy code which I am trying to get familiar with.
Say I try to search for a particular function call: I get loads of results that turn out to be completely irrelevant. Some of them are easy to spot, e.g. a comment saying
// Fixed functionality in foo() so don't need to handle this here any more
But others are much harder to spot manually, because they turn out to be calls from other functions in modules that are only compiled in certain cases, or are part of a much larger block of code that is #if 0'd out in its entirety.
What I'd like is a search tool that would allow me to search for a term and give me the choice to include or exclude commented-out or #if 0'd-out code. The search results would then be displayed alongside a list of the #defines required for each snippet of code to be relevant.
I'm working in C / C++, but other than the specific comment syntax I guess the techniques should be more generally applicable.
Does such a tool exist?
Not entirely what you're after, but I find this quite handy.
GrepWin - A free visual "grep" tool for searching files.
I find it quite helpful because:
It's a separate app (doesn't lock up my editor)
Handles regular expressions
It's fast
Can specify what folder to search, and what file types (handles regexes here too)
Can limit by file size
Can include subdirs (or exclude by regex)
etc.
Almost any decent source browser will let you go to where a function is defined, and/or list all the calls of that function and take you directly to a call site. This will normally be based on a fairly complete parse of the source code so it will ignore comments, code that's excluded by the preprocessor, and so on (in fact, in at least one case, the parser used by the source browser is almost certainly better than the one used in the compiler itself).
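For what it's worth, the core filtering the question asks for is easy to sketch naively. This assumes C-style comments, no nested #if 0, and no comment markers inside string literals; it blanks out dead regions while preserving newlines, so reported line numbers still match the original file:

import { readFileSync } from "fs";

// Blank out a region but keep its newlines, so line numbers stay correct.
const blank = (m: string): string => m.replace(/[^\n]/g, "");

function stripDeadCode(src: string): string {
  return src
    .replace(/\/\*[\s\S]*?\*\//g, blank)  // block comments
    .replace(/\/\/[^\n]*/g, "")           // line comments
    .replace(/^[ \t]*#if\s+0\b[\s\S]*?^[ \t]*#endif[^\n]*/gm, blank); // #if 0 regions
}

const [file, term] = process.argv.slice(2);
stripDeadCode(readFileSync(file, "utf8"))
  .split("\n")
  .forEach((line, i) => {
    if (line.includes(term)) console.log(`${file}:${i + 1}: ${line}`);
  });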
I'm starting on a project where strings are written into the code most of the time. Many strings might only be used in a few places but some strings are common throughout many pages.
Is it a good use of my time to refactor the literals into constants being that the app is pretty well established and runs well? What would be the long-term benefits to doing so?
One common thing to consider would be i18n. If you (or your muckity-mucks) ever want to sell your product in Mexico or France (etc.), you're going to appreciate not having those string literals littered throughout the code base.
EDIT: I realize this doesn't directly answer your question, so I'm voting up some of the other answers re: rule of three, and the like. I understand you're talking about an existing code base, so it's a little late to talk about incorporating i18n from the start. It's so easy to do when you're in the habit from the start.
I like to apply the rule of three when refactoring: if the same thing appears three or more times, then the code needs to be updated.
This is a good use of time only if the project needs to be supported into the future. If you will be regularly maintaining and expanding this system, however, it is a great idea.
1) There is a large degree of risk associated with string literals, as a single misspelling can usually only be detected at run time. The reduced risk of run-time errors is a serious advantage, as they can be embarrassing and frustrating.
2) Also, should they ever need to be changed, for example when they are used to reference another system (like table names, server names, etc.), they can be very difficult to update when those other system names change. Centralize them and it's a trivial issue (see the sketch below).
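As a sketch of point 2 (TypeScript here, with invented names):

// Centralize names that reference another system: a rename becomes a
// one-line change, and a typo like Tables.usres is a compile-time error
// instead of a run-time surprise.
export const Tables = {
  users: "tbl_users",
  orders: "tbl_orders",
} as const;

// elsewhere: db.query(`SELECT COUNT(*) FROM ${Tables.users}`);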
If a string is used in more than one place, refactor it. If it is only used in one place, leave it alone.
If you've refactored out all your common strings, it makes it easier to internationalize/translate them. It's even easier if they're all in properties files, or whatever your language equivalent is.
Is it a good use of my time to refactor the literals into constants being that the app is pretty well established and runs well?
No, you'd better leave it as it is.
What would be the long-term benefits to doing so?
If no one ever touches that code, the benefits are none.
What you can do, however, is avoid adding new literals. But I would pretty much leave the existing ones the way they are.
You could refactor them in your free time if it helps you sleep better.
Probably there are some other bugs already that need your attention. Fix those instead.
Finally, if you manage to add "refactoring" to your task list, go ahead!
I agree with JMD; just keep in mind that there is more to i18n than changing strings (currencies, UI that must be adapted to right-to-left languages, etc.).
Even if you don't wish to i18n your application, it would be useful to refactor your strings, since a string that is used only once today may be reused several times tomorrow, and if it's hardcoded you may not be aware of it and start replicating the string all over the place.
Best let sleeping dogs lie. If you need to change a string that's used eighteen-lumpy times, yeah, go ahead and turn it into a constant somewhere. If you find yourself working in a module that has a string that could be constant-ized, do it if you feel like it. But going through the whole app changing all strings to constants... that should be at the very bottom of the to-do list.