In nushell how can I combine where clauses

I have this nushell query:
open ./s3_actions.csv | where actions !~ 'Acl' | where actions !~ 'Cors' | where actions !~ 'Job' | where actions !~ 'Retention' | where actions !~ 'Legal' | where actions !~ 'Payment' | where actions !~ 'Policy' | where actions !~ 'Configuration' | where actions !~ 'Replicate' | where actions !~ 'AccessPoint'
Is there a way I could combine all the where actions !~ ... clauses into one big list?
The closest I got was here but I don't know how to do nested loops, both would have the same $it variable.

You didn't provide any sample data to validate against, but I assumed something like:
actions,element,node,something
Acl Forward Bar Baz,52206,19075,71281
Legal Stack,31210,21366,52576
Configuration Help Doc,13684,65142,78826
Help Doc,48488,44794,93282
Baz Bar Foo,13140,60057,73197
Foo Baz Bar,62892,63836,126728
,64022,54909,118930
Abc,50240,16109,66348
Def Bar Foo,18680,17847,36526
Acl Bar Vegas,55553,22953,78506
Taco Hamburger Sushi,30311,42345,72656
It's possible there's another way for your particular data source, but the following line (split to multiple lines for improved readability) returns the same results as the multiple where clauses in your question:
open ./s3_actions.csv | where {|it|
    (["Acl", "Cors", "Job", "Retention", "Legal", "Payment", "Policy", "Configuration", "Replicate", "AccessPoint"] |
        reduce -f true { |action, acc|
            ($it.actions !~ $action) && $acc
        })
}
Explanation:
Ultimately, what you are asking for is a way to "reduce" a "list of where actions" down to a single value, true or false. Look to the reduce command for that.
In this particular case:
where is run for each record returned from s3_actions.csv
For each record, reduce is then run for each action in the list. It starts with the accumulator (acc) set to true, but if any of the tests returns false, it will become false when the two are && (and)'ed.
And yes, because there are two blocks here (for where and reduce), you'll need to use a different name for the arguments for each.
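As an aside: if your nushell version ships the all command, the same reduction can be written more directly. A sketch, untested against your data:
open ./s3_actions.csv | where {|it|
    ["Acl", "Cors", "Job", "Retention", "Legal", "Payment", "Policy", "Configuration", "Replicate", "AccessPoint"]
    | all {|action| $it.actions !~ $action }
}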

I'd probably just use the && or and operator like ... | where x == y && a == b && z == d
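Applied to the question's pipeline, that could look like the following sketch (the operator spelling varies between nushell versions, which use and or &&):
open ./s3_actions.csv | where actions !~ 'Acl' and actions !~ 'Cors' and actions !~ 'Job'
And since !~ performs a regex match, a single alternation pattern may also cover the whole list:
open ./s3_actions.csv | where actions !~ 'Acl|Cors|Job|Retention|Legal|Payment|Policy|Configuration|Replicate|AccessPoint'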

Related

Is there a simple way to group multiple lines of a command output based on a beginning and end match?

Is there a simple way to group multiple lines which match a pattern into single lines?
Basically, the output of a command lists something like:
key1 blah blah = dict {
unrelated stuff {
}
something I actually want to match via grep or something
some common end term for key1 I can use as an end pattern match
}
x 100 similar keys
My end-game here in this specific case is to strip an XML of entries which have a specific entry within them. I could do this (and solve a lot of other day-to-day problems) if each entry was its own line instead of multi-line (grep in the matches, sed out the text after the bracket, etc.)
Something like:
print multi-line crap | merge beginningpattern endpattern | grep lines now that everything is merged
Basically the 'merged' command would strip all linefeeds between every new beginningpattern and endpattern (maybe putting a linefeed at the end)
awk and gsub would be the right way, if I understand your question correctly. For example:
required_string=$(awk '
    BEGIN { x=0; y=0 }
    /<yourStartingString>/ { x=1 }
    /<EndingString>/ { x=0 }
    { if (x==1 && y==1) { gsub(/(.*<grepforwhatyouneed>)|(<endgrep>)/, ""); print } }
    { if (x==1 && y==0) y=1 }
' "$i.xml")

Non-greedy matching of tokens from lists in ANTLR4

In a previous question with a simple grammar, I learned to handle IDs that can include keywords from a keyword list. My actual grammar is a little more complex: there are several lists of keywords that are expected in different types of sentences. Here's my attempt at a simple grammar that tells the story:
grammar Hello;
file : ( sentence )* EOF ;
sentence : KEYWORD1 ID+ KEYWORD2 ID+ PERIOD
| KEYWORD3 ID+ KEYWORD4 ID+ PERIOD;
KEYWORD1 : 'hello' | 'howdy' | 'hi' ;
KEYWORD2 : 'bye' | 'goodbye' | 'adios' ;
KEYWORD3 : 'dear' | 'dearest' ;
KEYWORD4 : 'love' | 'yours' ;
PERIOD : '.' ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
So the sentences I want to match are, for example:
hello snickers bar goodbye mars bar.
dear peter this is fun yours james.
and that works great. But I also want to match sentences that contain keywords that would not be expected to terminate the ID+ block. For example
hello hello kitty goodbye my dearest raggedy ann and andy.
hello first appears as KEYWORD1 and then, just following, as part of that first ID+. Following the example of the above linked question, I can fix it like this:
// ugly solution:
fixedSentence : KEYWORD1 a=(ID|KEYWORD1|KEYWORD3|KEYWORD4)+ KEYWORD2 b=(ID|KEYWORD1|KEYWORD2|KEYWORD3|KEYWORD4)+ PERIOD
| KEYWORD3 a=(ID|KEYWORD1|KEYWORD2|KEYWORD3)+ KEYWORD4 b=(ID|KEYWORD1|KEYWORD2|KEYWORD3|KEYWORD4)+ PERIOD;
which works and does exactly what I'd like. In my real language, I've got hundreds of keyword lists, to be used in different types of sentences, so if I try for this approach, I'll certainly make a mistake doing this, and when I create new structures in my language, I have to go back and edit all the others.
What would be nice is to do non-greedy matching from a list, following the ANTLR4 book's examples for comments. So I tried this
// non-greedy matching concept:
KEYWORD : KEYWORD1 | KEYWORD2 | KEYWORD3 | KEYWORD4 ;
niceID : ( ID | KEYWORD ) ;
niceSentence : KEYWORD1 niceID+? KEYWORD2 niceID+? PERIOD
| KEYWORD2 niceID+? KEYWORD3 niceID+? PERIOD;
which I think follows the model for comments (e.g. given on p.81 of the book):
COMMENT : '/*' .*? '*/' -> skip ;
by using the ? to suggest non-greediness. (Though the example is a lexer rule, does that change the meaning here?) fixedSentence works but niceSentence is a failure. Where do I go from here?
To be specific, the errors reported in parsing the hello kitty test sentence above are:
Testing rule sentence:
line 1:6 extraneous input 'hello' expecting ID
line 1:29 extraneous input 'dearest' expecting {'.', ID}
Testing rule fixedSentence: no errors.
Testing rule niceSentence:
line 1:6 extraneous input 'hello' expecting {ID, KEYWORD}
line 1:29 extraneous input 'dearest' expecting {KEYWORD2, ID, KEYWORD}
line 1:57 extraneous input '.' expecting {KEYWORD2, ID, KEYWORD}
And if it helps to see the parse trees, here they are.
Recognize that the parser is ideally suited to handling syntax, i.e., structure, and not semantic distinctions. Whether a keyword is an ID terminator in one context and not in another, both being syntactically equivalent, is inherently semantic.
The typical ANTLR approach to handling semantic ambiguities is to create a parse tree recognizing as many structural distinctions as reasonably possible, and then walk the tree analyzing each node in relation to the surrounding nodes (in this case) to resolve ambiguities.
If this resolves to your parser being
sentences : ( ID+ PERIOD )* EOF ;
then your sentences are essentially free form. The more appropriate tool might be an NLP library - Stanford has a nice one.
Additional
If you define your lexer rules as
KEYWORD1 : 'hello' | 'howdy' | 'hi' ;
KEYWORD2 : 'bye' | 'goodbye' | 'adios' ;
KEYWORD3 : 'dear' | 'dearest' ;
KEYWORD4 : 'love' | 'yours' ;
. . . .
KEYWORD : KEYWORD1 | KEYWORD2 | KEYWORD3 | KEYWORD4 ;
the lexer will never emit a KEYWORD token - 'hello' is consumed and emitted as a KEYWORD1, and the KEYWORD rule is never evaluated. Since the parse tree fails to identify the type of the tokens (apparently), it is not very illuminating. Dump the token stream to see what the lexer is actually doing:
hello hello kitty goodbye my dearest ...
KEYWORD1 KEYWORD1 ID KEYWORD2 ID KEYWORD3 ...
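For example, with the standard grun TestRig alias from the ANTLR4 setup docs (the input file name here is hypothetical):
$ grun Hello file -tokens < sentences.txt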
If you place the KEYWORD rule before the others, then the lexer is going to only emit KEYWORD tokens.
Changing to parser rules
niceID : ( ID | keyword ) ;
keyword : KEYWORD1 | KEYWORD2 | KEYWORD3 | KEYWORD4 ;
will allow this very limited example to work.
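Putting it together, a minimal sketch of the reworked rules (assuming the lexer rules from the question are unchanged, and pairing KEYWORD3 with KEYWORD4 in the second alternative, as in fixedSentence):
keyword : KEYWORD1 | KEYWORD2 | KEYWORD3 | KEYWORD4 ;
niceID : ( ID | keyword ) ;
niceSentence : KEYWORD1 niceID+? KEYWORD2 niceID+? PERIOD
             | KEYWORD3 niceID+? KEYWORD4 niceID+? PERIOD ;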

make a change on the string based on mapping

I have the following string format
str="aaa.[any_1].bbb.[any_2].ccc"
I have the following mapping
map1:
any_1 ==> 1
cny_1 ==> 2
map2
any_2 ==> 1
bny_2 ==> 2
cny_2 ==> 3
What's the best command to execute on the str, taking into account the above mapping, in order to get
$ command $str
aaa.1.bbb.1.ccc
Turn your map files into sed scripts:
sed 's%^%s/%;s% ==> %/%;s%$%/g%' map?
Apply the resulting script to the input string. You can do it directly by process substitution:
sed 's%^%s/%;s% ==> %/%;s%$%/g%' map? | sed -f- <(echo "$str")
Output:
aaa.[1].bbb.[1].ccc
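For reference, the first sed command rewrites the sample map files into this script:
s/any_1/1/g
s/cny_1/2/g
s/any_2/1/g
s/bny_2/2/g
s/cny_2/3/g
The brackets survive in the output because the map keys don't include them; to get aaa.1.bbb.1.ccc exactly, the generated patterns would have to cover the brackets too, e.g. s/\[any_1\]/1/g.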
Update: I now think that I didn't understand the question correctly, and my solution therefore is wrong. I'm leaving it in here because I don't know if parts of this answer will be helpful to your question, but I encourage you to look at the other answers first.
Not sure what you mean. But here's something:
any_1="1"
any_2="2"
str="aaa.${any_1}.bbb.${any_2}.ccc"
echo $str
The curly brackets tell the interpreter where the variable name ends and the normal string resumes. Result:
aaa.1.bbb.2.ccc
You can loop this:
for any_1 in {1..2}; do
    for any_2 in {1..3}; do
        echo aaa.${any_1}.bbb.${any_2}.ccc
    done
done
Here {1..3} represents the numbers 1, 2, and 3. Result:
aaa.1.bbb.1.ccc
aaa.1.bbb.2.ccc
aaa.1.bbb.3.ccc
aaa.2.bbb.1.ccc
aaa.2.bbb.2.ccc
aaa.2.bbb.3.ccc
{
    echo "${str}"
    cat Map1
    cat Map2
} | sed -n '1h;1!H;$!d
x
s/[[:space:]]*==>[[:space:]]*/ /g
:a
s/\[\([^]]*\)\]\(.*\)\n\1 \([^[:cntrl:]]*\)/\3\2/
ta
s/\n.*//p'
You can use several mappings, not limited to two (you could even use find to cat every mapping file found).
This relies on the fact that alias and value contain no spaces (it can be adapted if they do).
I have upvoted @chw21's answer, as it promotes the right tool for the problem scenario. However,
you can devise a Perl-based command based on the following.
#!/usr/bin/perl
use strict;
use warnings;

my $text = join '', <DATA>;
my %myMap = (
    'any_1' => '1',
    'any_2' => '2'
);
$text =~ s/\[([^]]+)\]/replace($1)/ge;
print $text;

sub replace {
    my ($needle) = @_;
    return "[$needle]" if ! exists $myMap{ lc $needle };
    return $myMap{ lc $needle };
}
__DATA__
aaa.[any_1].bbb.[any_2].ccc
The only thing that may require a bit of explanation is the regex: it matches text that comes between square brackets and sends that text to the replace routine. In the replace routine, we look up the mapped value corresponding to the argument.
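Saved as, say, replace.pl (the file name is hypothetical) and run against its own __DATA__ section, it prints:
$ perl replace.pl
aaa.1.bbb.2.ccc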
$ cat tst.awk
BEGIN {
    FS=OFS="."
    m["any_1"]=1; m["cny_1"]=2
    m["any_2"]=1; m["bny_2"]=2; m["cny_2"]=3
    for (i in m) map["["i"]"] = m[i]
}
{
    for (i=1; i<=NF; i++) {
        $i = ($i in map ? map[$i] : $i)
    }
    print
}
$ awk -f tst.awk <<<'aaa.[any_1].bbb.[any_2].ccc'
aaa.1.bbb.1.ccc

how to split the data in the unix file

I have a file on a Unix (Solaris) system with data like below:
[TYPEA]:/home/typeb/file1.dat
[TYPEB]:/home/typeb/file2.dat
[TYPEB]:/home/typeb/file3.dat
[TYPE_C]:/home/type_d/file4.dat
[TYPE_C]:/home/type_d/file5.dat
[TYPE_C]:/home/type_d/file6.dat
I want to separate the headings like below
[TYPEA]
/home/typeb/file1.dat
[TYPEB]
/home/typeb/file2.dat
/home/typeb/file3.dat
[TYPE_C]
/home/type_d/file4.dat
/home/type_d/file5.dat
/home/type_d/file6.dat
Files with similar type have to come under one type.
Please help me with any logic to achieve this without hardcoding.
Assuming the input is sorted by type like in your example,
awk -F : '$1 != prev { print $1 } { print $2; prev=$1 }' file
If there are more than 2 fields you will need to adjust the second clause.
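A sketch of one such adjustment, assuming the path itself may contain colons: save the label before touching the record, then strip everything up to the first colon and print the rest whole:
awk -F: '{ if ($1 != prev) print $1; prev = $1; sub(/^[^:]*:/, ""); print }' file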
sed 'H;$ !b
x
s/\(\(\n\)\(\[[^]]\{1,\}]\):\)/\1\2\1/g
:cycle
s/\(\n\[[^]]\{1,\}]\)\(.*\)\1/\1\2/g
t cycle
s/^\n//' YourFile
This is a POSIX sed version, a bit unreadable due to the presence of [ in the pattern. It allows : inside the label or file/path, but it fails if lines with the same label are separated by a line with a different label (the sample input appears to be ordered).
If you can use perl you will be able to make use of hashes to create a simple data structure:
#! /usr/bin/perl
use warnings;
use strict;

my %h;
while (<>) {
    chomp;
    my ($key, $value) = split /:/;
    $h{$key} = [] unless exists $h{$key};
    push @{ $h{$key} }, $value;
}
foreach my $key (sort keys %h) {
    print "$key\n";
    foreach my $value (@{ $h{$key} }) {
        print "$value\n";
    }
}
In action:
perl script.pl file
[TYPEA]
/home/typeb/file1.dat
[TYPEB]
/home/typeb/file2.dat
/home/typeb/file3.dat
[TYPE_C]
/home/type_d/file4.dat
/home/type_d/file5.dat
/home/type_d/file6.dat
If you like it, there is a whole tutorial on solving this simple problem. It's worth reading.

awk group by and print if matches a condition

I have this structure:
aaa,up
bbb,down
aaa,down
aaa,down
aaa,up
bbb,down
ccc,down
ccc,down
ddd,up
ddd,down
And I would like to have the next output:
aaa,up
bbb,down
ccc,down
ddd,up
So, the first thing is to group by. Then, if at least one line is up, print up; else print down.
So far I have this:
awk -F"," '$2=="up"{arr[$1]++}END{for (a in arr) print a,arr[a]}'
Then I change it to $2=="down" and join the two results into one. But with this, I have duplicated values for ups and downs.
Sometimes, instead of ups and downs, I receive the values 0,1,2,3,4, where 0 and 1 mean the up status.
Thanks in advance.
How about saving the value you see, with a preference for "up"?
awk -F "," '$2 ~ /0^(0|1)$/ { $2 = "up" }
$2 ~ /^[2-9]/ { $2 = "down" }
$2 == "up" || !($1 in a) { a[$1]=$2 }
END { OFS=FS; for(k in a) print k, a[k] }' file | sort
That is, if the value is "up", we always save it. Otherwise, we only save the value if we don't yet have a value for this key.
I'm not sure I grasped your 0,1,2,3,4 requirement. The first two lines now convert a number into either "up" or "down".
It's similar to tripleee's answer, but IMHO it's sufficiently different to merit an answer of its own. In particular, I think the logical flow is clearer: processing is skipped when the variable has already been "upped", and the job of discriminating the different possible types of $2 is handed to a simple user function.
awk -F"," '
function up_p(x){
if(x==0||x=="down") return "down"; else return "up"
}
a[$1]=="up" {next}
{a[$1]=up_p($2)}
END {for(k in a) print k "," a[k]}' file | sort
aaa,up
bbb,down
ccc,down
ddd,up
On second thought, the user function is unnecessary...
awk -F"," '
a[$1]=="up" {next}
{a[$1]=($2==0||$2=="down")?"down":"up"}
END {for(k in a) print k "," a[k]}' file | sort
aaa,up
bbb,down
ccc,down
ddd,up
but it comes down to personal taste so I leave both versions in my answer.
