Beeline usage instead of hive in shell script - linux

In my shell script I am using this query to get the last_value of the column id.
last_val=`beeline -e "select nvl(max(id),0) from testing.1_test"`
The result is
+----------+--+
|   _c0    |
+----------+--+
| 3380901  |
+----------+--+
Now I want to pass this value on as the variable ${last_val}.
When I run echo ${last_val} I want to get 3380901, but I am receiving
+----------+--+
|   _c0    |
+----------+--+
| 3380901  |
+----------+--+
How can I echo just 3380901?
When I used hive as below, I got what I wanted:
last_val=`hive -e "select nvl(max(id),0) from testing.1_test"`
echo ${last_val} gave me 3380901.
Please let me know how I can do this with beeline.

last_val=`beeline --showHeader=false --outputformat=tsv2 -e "select nvl(max(id),0) from testing.1_test"`
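The two flags in the command above do the work: --showHeader=false drops the _c0 header row, and --outputformat=tsv2 prints plain tab-separated values instead of a bordered table. If those flags cannot be used for some reason, the value can also be cut out of the default table output. The following is only a minimal sketch under that assumption: the table_output variable is a hypothetical stand-in for a real beeline call, and the awk filter is not part of the original answer.

```shell
# Hypothetical stand-in for the bordered table that beeline prints by default.
table_output='+----------+--+
|   _c0    |
+----------+--+
| 3380901  |
+----------+--+'

# Rows of interest start with "|". The first such row is the header,
# the second one holds the value; strip the padding spaces from it.
last_val=$(printf '%s\n' "$table_output" |
  awk -F'|' '/^[|]/ { if (++n == 2) { gsub(/ /, "", $2); print $2 } }')

echo "$last_val"
```

In a real script the printf would be replaced by the beeline invocation itself.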

Related

Bash: For Loop & save each output as a new column in a csv

I have a folder with a mixture of file types (.bam, .bam.bai, and .log). I created a for loop to run two commands on each of the .bam files. My current code directs the output of each command into a separate csv file, because I could not figure out how to direct the outputs to separate columns.
TYIA!
Question 1
I want to export the output from both commands into the same csv. How can I alter my code so that the output from my first command is saved as the first column of a csv, and the output from my second command is saved as the second column of the same csv?
Question 2
What is the name of the syntax used to select files in a for loop? For instance, the * in *.bam represents a wildcard. Is this regex? I had a tough time trying to alter this so that only *.bam files were selected for the for loop (and .bam.bai were excluded). I ended up with *[.bam] by guessing and empirically testing my outputs. Are there any websites that do a good job of explaining this syntax and provide lots of examples (coder level: newbie)
Current Code
> ~/Desktop/Sample_Names.csv
> ~/Desktop/Read_Counts.csv
echo "Sample" | cat - > ~/Desktop/Sample_Names.csv
echo "Total_Reads" | cat - > ~/Desktop/Read_Counts.csv
for file in *[.bam]
do
samtools view -c $file >> ~/Desktop/Read_Counts.csv
gawk -v RS="^$" '{print FILENAME}' $file >> ~/Desktop/Sample_Names.csv
done
Current Outputs (truncated)
>Sample_Names.csv
| Sample |
|--------------|
| B40-JV01.bam |
| B40-JV02.bam |
| B40-JV03.bam |
>Read_Counts.csv
| Total_Reads |
|-------------|
| 3835555 |
| 4110463 |
| 144558 |
Desired Output
>Combined_Outputs.csv
| Sample | Total_Reads |
|--------------|-------------|
| B40-JV01.bam | 3835555 |
| B40-JV02.bam | 4110463 |
| B40-JV03.bam | 144558 |
Something like:
echo "Sample,Total_Reads" > Combined_Outputs.csv
for file in *.bam; do
printf "%s,%s\n" "$file" "$(samtools view -c "$file")"
done >> Combined_Outputs.csv
Print one line for each file, and move the output redirection outside of the loop for efficiency.
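On Question 2: the *.bam syntax is shell globbing (filename expansion), not regex. In a glob, [.bam] is a bracket expression that matches one single character out of ".", "b", "a" or "m", so *[.bam] matches any name that ends in one of those characters. It only appeared to work because .bam.bai and .log names end in "i" and "g". A small sketch to see the difference, using made-up file names in a scratch directory:

```shell
# Scratch directory with hypothetical file names.
tmpdir=$(mktemp -d)
touch "$tmpdir/s1.bam" "$tmpdir/s1.bam.bai" "$tmpdir/run.log" "$tmpdir/notes.a"

# *[.bam]: any name whose LAST character is one of . b a m
bracket_matches=$(cd "$tmpdir" && echo *[.bam])

# *.bam: any name that ends in the literal suffix ".bam"
suffix_matches=$(cd "$tmpdir" && echo *.bam)

echo "bracket: $bracket_matches"
echo "suffix:  $suffix_matches"

rm -rf "$tmpdir"
```

The bracket pattern also picks up notes.a, while *.bam selects only s1.bam. The "Pattern Matching" section of the Bash manual covers the full syntax.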

jq select dynamic item where key is an environmental variable and not a bash variable

I have the following json:
{
"feature/EBS_DDS_SC-27428": {
"auth": "http://test123:8080/service.jsp",
"publish": "http://test234:8080/service.jsp",
"general_name": "PG"
},
"feature/EBS_DDS_SC-27428": {
"auth": "http://ab123:8080/service.jsp",
"publish": "http://ab234:8080/service.jsp",
"general_name": "PG1"
}
}
When I do the following, I get the expected result:
jq --raw-output '."feature/EBS_DDS_SC-27428" | .auth'
But the following does not work:
export branch=feature/EBS_DDS_SC-27428
cat input.json | jq --raw-output '."${branch}" | .auth'
I get the following compilation error:
jq: error: syntax error, unexpected '$' (Unix shell quoting issues?) at <top-level>, line 1:
.${branch} | .auth
jq: error: try .["field"] instead of .field for unusually named fields at <top-level>, line 1:
.${branch} | .auth
jq: 2 compile errors
Now I have an environmental variable called branch in my Linux machine
In your specific case I think you have a small bash quoting error causing ${branch} to be treated as a constant. I think you want to quote it like this:
'."'      "${branch}"         '" | .auth'
------    ------------------  -----------
single    double quote so     single
quote     shell expands the   quote
constant  branch variable     constant
Sample Run
$ echo '."'"${branch}"'" | .auth'
."feature/EBS_DDS_SC-27428" | .auth
$ cat input.json | jq --raw-output '."'"${branch}"'" | .auth'
http://ab123:8080/service.jsp
The variable substitution section of the Advanced Bash Scripting Guide is your friend.
I have an environmental variable called branch
Environment variables (as opposed to shell variables) can be dereferenced in jq programs using the env object. In your case, you'd write env.branch.
If you want to make the value of a shell variable accessible in the jq program, it is best to use the --arg command-line option, along the lines of:
jq --arg branch "$branch"
That way, you can reference the value as $branch in the jq program. In the specific case of the question, you could (for example) write:
jq -r --arg branch "$branch" '.[$branch].auth'
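Both approaches from this answer can be tried side by side. This is a sketch that assumes a reasonably recent jq is installed (the env object is not available in very old releases); the json variable is a cut-down copy of the question's data with a single entry, used here instead of input.json.

```shell
# Cut-down sample: one entry taken from the question's JSON.
json='{"feature/EBS_DDS_SC-27428": {"auth": "http://ab123:8080/service.jsp"}}'

export branch=feature/EBS_DDS_SC-27428

# 1) Exported (environment) variable, read through jq's env object.
auth_env=$(printf '%s' "$json" | jq -r '.[env.branch].auth')

# 2) Value passed in explicitly with --arg; this also works for
#    shell variables that were never exported.
auth_arg=$(printf '%s' "$json" | jq -r --arg branch "$branch" '.[$branch].auth')

echo "$auth_env"
echo "$auth_arg"
```

In both cases the jq program stays inside single quotes, so the shell never has to splice text into it.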

Cassandra CLI command with cqlsh -e how to declare

My Cassandra table was created with all column names in upper case. We can select those columns in the cqlsh terminal, but when we run the same query through cqlsh -e we run into a problem with escaping the quote characters.
cqlsh:key1> select "PLAN_ID" ,"A_ADVERTISER_ID" ,"ADVERTISER_NAME" from key1.plan_advertiser where "PLAN_ID" = '382633' and "A_ADVERTISER_ID" = 15019;
PLAN_ID | A_ADVERTISER_ID | ADVERTISER_NAME
---------+-----------------+----------------------
382633 | 15019 | Hanesbrands, Updated
NMH206576286LM:sparklatest0802 KarthikeyanDurairaj$ cqlsh -e 'select "PLAN_ID" ,
"A_ADVERTISER_ID" ,"ADVERTISER_NAME" from key1.plan_advertiser
where "PLAN_ID" = '382633' and "A_ADVERTISER_ID" = 15019'
<stdin>:1:InvalidRequest: Error from server: code=2200 [Invalid query]
message="Invalid INTEGER constant (382633) for "PLAN_ID" of type text"
NMH206576286LM:sparklatest0802 KarthikeyanDurairaj$
Quoting for cqlsh -e can be a little tricky. The shell doesn't allow you to escape single quotes inside a single-quoted string, but it does allow you to escape double quotes inside a double-quoted one. This works for me:
$ bin/cqlsh -u cassdba -p flynnLives -e "SELECT * FROM stackoverflow.plan_advertiser
where \"PLAN_ID\" = '382633' and \"A_ADVERTISER_ID\" = 15019"
PLAN_ID | A_ADVERTISER_ID | ADVERTISER_NAME
---------+-----------------+----------------------
382633 | 15019 | Hanesbrands, Updated
(1 rows)
In this way, we switch from single quotes to double quotes for the CQL statement, use single quotes for column values, and then escape out the double quotes around the column names.
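The error in the question ("Invalid INTEGER constant (382633)") follows from how the shell joins quoted pieces before cqlsh ever sees the statement. A small sketch, with printf standing in for cqlsh so the resulting argument can be inspected (only the WHERE clause is shown):

```shell
# Single-quoted attempt: the quotes around 382633 END and RESTART the
# shell string, so they are removed and never reach the program.
single_quoted=$(printf '%s' 'where "PLAN_ID" = '382633' and "A_ADVERTISER_ID" = 15019')

# Double-quoted version with escaped inner double quotes: the single
# quotes around 382633 are ordinary characters and survive.
double_quoted=$(printf '%s' "where \"PLAN_ID\" = '382633' and \"A_ADVERTISER_ID\" = 15019")

echo "$single_quoted"
echo "$double_quoted"
```

The first line prints 382633 without quotes, which is why the server treats it as an integer; the second keeps '382633' as a text literal.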

Parsing in Linux

I want to parse the compute zones out of the OpenStack command output below:
+-----------------------+----------------------------------------+
| Name                  | Status                                 |
+-----------------------+----------------------------------------+
| internal              | available                              |
| |- controller         |                                        |
| | |- nova-conductor   | enabled :-) 2016-07-07T08:09:57.000000 |
| | |- nova-consoleauth | enabled :-) 2016-07-07T08:10:01.000000 |
| | |- nova-scheduler   | enabled :-) 2016-07-07T08:10:00.000000 |
| | |- nova-cert        | enabled :-) 2016-07-07T08:10:00.000000 |
| Compute01             | available                              |
| |- compute01          |                                        |
| | |- nova-compute     | enabled :-) 2016-07-07T08:09:53.000000 |
| Compute02             | available                              |
| |- compute02          |                                        |
| | |- nova-compute     | enabled :-) 2016-07-07T08:10:00.000000 |
| nova                  | not available                          |
+-----------------------+----------------------------------------+
I want to parse the result as below, keeping only the nodes that have nova-compute:
Compute01;Compute02
I used the command below:
nova availability-zone-list | awk 'NR>2 {print $2}' | grep -v '|' | tr '\n' ';'
but it returns output like this:
;internal;Compute01;Compute02;nova;;
In Perl (and written rather more verbosely than is really necessary):
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my $node;          # Store current node name
my @compute_nodes; # Store known nova-compute nodes
while (<>) {       # Read from STDIN
    # If we find the start of line, followed by a pipe, a space and
    # a series of word characters...
    if (/^\| (\w+)/) {
        # Store the series of word characters (i.e. the node name) in $node
        $node = $1;
    }
    # If we find a line that contains "nova-compute", add the current
    # node name to @compute_nodes
    push @compute_nodes, $node if /nova-compute/;
}
# Print out all of the values in @compute_nodes
say join ';', @compute_nodes;
I detest one-line programs except for the most simple of applications. They are unnecessarily cryptic, they have none of the usual programming support, and they are stored only in the terminal buffer. Want to do the same thing tomorrow? You must start coding again.
Here's a Perl solution. Run it as
$ perl nova-compute.pl command-output.txt
use strict;
use warnings 'all';
my ($node, @nodes);
while ( <> ) {
    $node = $1 if /^ \| \s* (\w+) /x;
    push @nodes, $node if /nova-compute/;
}
print join(';', @nodes), "\n";
output
Compute01;Compute02
Now all of that is saved on disk. It may be run again at any time, modified for similar results, or fixed if you got it wrong. It is also readable. No contest.
$ nova availability-zone-list | awk '/^[|] [^|]/{node=$2} node && /nova-compute/ {s=s ";" node} END{print substr(s,2)}'
Compute01;Compute02
How it works:
/^[|] [^|]/{node=$2}
Any time a line begins with | followed by space followed by a character not |, then save the second field as a node name.
node && /nova-compute/ {s=s ";" node}
If node is non-empty and the current line contains nova-compute, then append node to the string s.
END{print substr(s,2)}
After we have read all the lines, print out string s minus its first character which is a superfluous ;.

How to reuse Cucumber step definition with a table for the last parameter?

This code:
Then %{I should see the following data in the "Feeds" data grid:
| Name |
| #{name} |}
And this one:
Then "I should see the following data in the \"Feeds\" data grid:
| Name |
| #{name} |"
And this:
Then "I should see the following data in the \"Feeds\" data grid:\n| Name |\n| #{name} |"
And even this:
Then <<EOS
I should see the following data in the "Feeds" data grid:
| Name |
| #{name} |
EOS
Gives me:
Your block takes 2 arguments, but the Regexp matched 1 argument.
(Cucumber::ArityMismatchError)
tests/endtoend/step_definitions/instruments_editor_steps.rb:29:in `/^the editor shows "([^"]*)" in the feeds list$/'
melomel-0.6.0/lib/melomel/cucumber/data_grid_steps.rb:59:in `/^I should see the following data in the "([^"]*)" data grid:$/'
tests/endtoend/instruments_editor.feature:11:in `And the editor shows "myFeed" in the feeds list
This one:
Then "I should see the following data in the \"Feeds\" data grid: | Name || #{name} |"
And this one:
Then "I should see the following data in the \"Feeds\" data grid:| Name || #{name} |"
Gives:
Undefined step: "I should see the following data in the "Feeds" data grid:| Name || myFeed |" (Cucumber::Undefined)
./tests/endtoend/step_definitions/instruments_editor_steps.rb:31:in `/^the editor shows "([^"]*)" in the feeds list$/'
tests/endtoend/instruments_editor.feature:11:in `And the editor shows "myFeed" in the feeds list'
I've found the answer myself:
steps %Q{
Then I should see the following data in the "Feeds" data grid:
| Name |
| #{name} |
}
NOTE ON THE ABOVE: might seem obvious, but the new line after the first '{' is soooooo important
Another way:
Given /^My basic step:$/ do |table|
#do table operation
end
Given /^My referring step:$/ do |table|
table.hashes.each do |row|
row_as_table = %{
|prop1|prop2|
|#{row[:prop1]}|#{row[:prop2]}|
}
Given %{My basic step:}, Cucumber::Ast::Table.parse(row_as_table, "", 0)
end
end
You can also write it this way, using #table
Then /^some other step$/ do
Then %{I should see the following data in the "Feeds" data grid:}, table(%{
| Name |
| #{name} |
})
end
Consider using
Given /^events with:-$/ do |table|
Given %{I am on the event admin page}
table.hashes.each do |row|
Given %{an event with:-}, Cucumber::Ast::Table.new([row]).transpose
end
end
I find that much more elegant than building up the table by hand.
events with:- gets a table like this
| Form | Element | Label |
| foo | bar | baz |
and an event with:- gets a table like
| Form | foo |
| Element | bar |
| Label | baz |
