I want one of my several SELECT statements to not print the column headers, just the selected records. Is this possible in Cassandra 3.0?
I tried the below but it returns the column name:
cqlsh -e "select count(1) from system_schema.keyspaces where keyspace_name='test'";
count
-------
1
MySQL has options like -s -N to suppress the same.
There isn't a built-in option in cqlsh that would allow you to suppress the output header from CQL SELECT.
Your best option is to use shell scripting to parse the output. There are several Linux utilities available you can use depending on the outcome you're after. Here are just some examples in a long list of possibilities:
EXAMPLE 1 - To print the first row of results (4th line of the cqlsh output), you can use the awk utility:
$ cqlsh -e "SELECT ... FROM ..." | awk 'NR==4'
EXAMPLE 2 - The sed utility equivalent is:
$ cqlsh -e "SELECT ... FROM ..." | sed -n '4p'
EXAMPLE 3 - If you want to print all the rows, not just the first (assuming your query returns multiple rows):
$ cqlsh -e "SELECT ... FROM ..." | tail -n +4 | head -n -2
The tail -n +4 will print all lines from the 4th onwards and head -n -2 will strip out the last 2 lines (blank line + (# rows) at the end). Cheers!
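If GNU head is not available (a negative count such as head -n -2 is a GNU extension), the same filtering can be sketched in awk alone. This is only a sketch: the helper name cqlsh_rows is made up for illustration, and the sample text merely imitates the layout described above (blank line, header, dashed rule, rows, blank line, row count) so that it runs without a cluster.

```shell
# Hypothetical helper: keep only the data rows of cqlsh output.
# Assumes the layout described above: a blank line, the header, a dashed
# rule, the rows, a blank line and a "(N rows)" footer.
cqlsh_rows() {
  awk 'NR > 3 && NF > 0 && $0 !~ /^\([0-9]+ rows\)$/ {
         sub(/^[ \t]+/, "")   # drop the left padding in front of the value
         print
       }'
}

# Simulated cqlsh output, so the sketch runs without a cluster.
printf '\n count\n-------\n     1\n\n(1 rows)\n' | cqlsh_rows
```

Against a real cluster the call would look like cqlsh -e "SELECT ... FROM ..." | cqlsh_rows.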
Try this option as workaround:
# cqlsh -e "select count(1) from system_schema.keyspaces where keyspace_name='test'" | tail -n +4
0
(1 rows)
@Dexter, for selecting records, why can't you simply leverage SELECT * FROM system_schema.keyspaces where keyspace_name='test';?
What are you trying to achieve here, i.e. the end result?
If you simply want to count the number of records, you could simply leverage DataStax Bulk Loader to perform the count operation.
References:
https://www.datastax.com/blog/datastax-bulk-loader-counting
https://docs.datastax.com/en/dsbulk/docs/dsbulkAbout.html
./dsbulk count -k system_schema -t keyspaces
Alternatively, you could leverage the dsbulk unload -query <...> to selectively unload records based on the query that you pass in.
I need help with this homework. I thought I had basically solved it, but two results do not match. I have "psychology" at line 5 where it's supposed to be at line 1, and I have "finance" as the last row instead of "Political science". The output (autograder) is attached below for clarity.
Can anyone figure out what I'm doing wrong? Any help would be greatly appreciated.
Question is:
write a short shell script to first download the data set with wget from the included url
Next, print out "Major,Total" (the column names of interest) to the screen
Then using cut, sort, and head, print out the n most popular majors (largest Total) in descending order
You will want to use the -k and -t arguments of sort (on top of ones you should already know) to achieve this.
The value of n will be passed into the script as a command-line argument
I attached output differences between mine and the autograder below. My code goes like this:
number=$1
if [ ! -f recent-grads.csv ]; then
wget https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/recent-grads.csv
fi
echo Major,Total
cat recent-grads.csv | sed -i | sort -k4 -n -r -t, recent-grads.csv | cut -d ',' -f 3-4 | head -n ${number}
I am trying to make a small script, and I would like to write a few lines of a file below. I try to write the output of these commands to the fstab file, to automate the assembly ... The problem is that after writing the output of the UUID of the disk, I want to write the data that is after the echo with a tabulator in each space, but I can not put them in any way ... Thank you
blkid |grep "/dev/sdb"|cut -d " " -f2 |sed 's/"//g'|echo "/mnt/discon1 ext4 defaults 0 2">>fstab.bak
The output of the command sends it to the fstab file and passes it to me in the following way
UUID=377055f4-4f83-4326-8b43-a65694de84da
/mnt/discon1 ext4 defaults 0 2
I need after the UUID insert a tabulator and I add the rest of the text
So many ways to parse text.
Assuming this command produces the following output:
$ blkid
UUID=377055f4-4f83-4326-8b43-a65694de84da
/mnt/discon1 ext4 defaults 0 2
We could use awk:
$ blkid | awk -F= '$1=="UUID"{u=$2;next} {print u,$0}' OFS="\t"
377055f4-4f83-4326-8b43-a65694de84da /mnt/discon1 ext4 defaults 0 2
This sets = as a field separator in order to capture the UUID, but for lines without a UUID it'll print the last-captured-UUID along with the current line.
Or sed:
$ blkid | sed -ne $'/UUID=/{s///;h;};H;x;s/\\n/\t/;$p'
377055f4-4f83-4326-8b43-a65694de84da /mnt/discon1 ext4 defaults 0 2
This searches for a UUID line, strips the left-hand-side from it and stores it in sed's "hold buffer". Then for other lines, it simply appends the current line to the hold buffer, swaps it back to the pattern buffer and replaces the newline with a tab. Then at the end of the file, it prints.
Note that the options above use a tab as the first separator, but copy the remaining line verbatim.
Perhaps you want something in bash alone:
$ while IFS== read -r a b; do if [[ $a = UUID ]]; then printf '%s' "$b"; else c=($a); printf '\t%s' "${c[@]}"; fi; done < <(blkid); echo
377055f4-4f83-4326-8b43-a65694de84da /mnt/discon1 ext4 defaults 0 2
Like the awk script, this uses the = character as a field separator in order to capture the UUID. When it finds a UUID, it prints it. If it doesn't find a UUID, it goes through each field and prints it with a preceding tab.
Many many ways to do anything. Pick the one that you think will make the most sense to you a year from now. Try anything you're doing with data that doesn't quite match what you expect. It's important to be able to predict errors.
Break your command into 2 lines: capture the output of blkid into a variable, then echo that variable together with the other text e.g.
blkid=`blkid | grep "/dev/sdb" | cut -d " " -f2 | sed 's/"//g'`
echo "$blkid /mnt/discon1 ext4 defaults 0 2" >> fstab.bak
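If the fields need to be separated by tabs rather than spaces, printf can spell the tabs out explicitly. A minimal sketch of that idea follows; the function name fstab_line is hypothetical, and the UUID and mount point are the sample values from the question, passed in by hand instead of being read from blkid.

```shell
# Hypothetical helper: build one tab-separated fstab entry.
#   $1 = device identifier (for example UUID=...)
#   $2 = mount point
# The filesystem type and options are hard-coded to the values in the question.
fstab_line() {
  printf '%s\t%s\text4\tdefaults\t0\t2\n' "$1" "$2"
}

# Sample values from the question; append ">> fstab.bak" to write to the file.
fstab_line "UUID=377055f4-4f83-4326-8b43-a65694de84da" /mnt/discon1
```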
You can use sed for a lot of things. When you have time left, look at the command beneath:
/sbin/blkid | sed -n '/dev.sda3/ s#UUID="\([^"]*\)#\1\t/mnt/discon1\text4\tdefaults\t0\t2#p'
I need to extract name from a file and delete duplicates.
output.txt:
Server001-1
Server001-2
Server001-3
Server001-4
Server002-1
Server002-2
Server003-1
Server003-2
Server003-3
I need the output to contain only the following:
Server001-1
Server002-1
Server003-1
So, only print first server for every server group (Server00*) and delete the rest in that group.
try simply with awk:
awk -F"-" '!a[$1]++' Input_file
Explanation: -F"-" sets - as the field separator, and a is an array indexed by the first field of the current line. The condition !a[$1] is true only when that first field has not been seen before, so awk prints the line (printing is the default action). The post-increment ++ then raises a[$1] to 1, so later lines with the same first field are not printed.
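To make the mechanics visible, here is the same idea written out in long form. This is only an illustrative sketch: the function name first_per_group and the array name seen are made up, and the input is a shortened copy of the sample data.

```shell
# Hypothetical long-form version of the one-liner: print a line only the
# first time its prefix (the text before the first "-") is encountered.
first_per_group() {
  awk -F- '{
    if (seen[$1] == 0) print   # first time this prefix appears
    seen[$1]++                 # remember it for later lines
  }' "$@"
}

printf '%s\n' Server001-1 Server001-2 Server002-1 Server002-2 Server003-1 |
  first_per_group
```

Because lines are printed as they are read, the output keeps the input order.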
awk -F- 'dat[$1]=="" { dat[$1]=$0 } END { for (i in dat) {print dat[i]}}' filename
result:
Server001-1
Server002-1
Server003-1
This builds an array keyed on the first hyphen-delimited field and stores the complete line only when that key has no entry yet, so only the first line of each group is kept. The END block then loops through the array and prints it. Note that awk does not guarantee the iteration order of for (i in dat), so the output order may differ from the input order.
Simple GNU datamash solution:
datamash -t'-' -g1 first 2 <file
-t'-' - field separator
-g1 - group lines by the 1st field
first 2 - get only first value of the 2 field for each group. Can be also changed to min 2 operation
The output:
Server001-1
Server002-1
Server003-1
Since you've mentioned the string format as Server00*, you can simply use this one :
grep -E 'Server[0-9]+-1$' file
Server[0-9]+ for cases Server1000, Server100000 etc. (\d is not a digit class in grep -E; with GNU grep, -P would accept it)
or even
grep '[-]1$' file
Output for both :
Server001-1
Server002-1
Server003-1
A simple way just 1 command line to get general unique result:
nin output.txt nul "^(\w+)-\d+" -u -w
Explanation:
nul is a non-existing Windows file like /dev/null on Linux.
-u to get unique result, -w to output whole lines. Ignore case ? use -i.
"^(\w+)-\d+" is the same Regex syntax in C++/C#/Java/Scala, etc.
Save to file ? nin output.txt nul "^(\w+)-\d+" -u -w > result.txt
Save to file with summary info ? nin output.txt nul "^(\w+)-\d+" -u -w -I > result.txt
Future automation with nin.exe : Result count = return value %ERRORLEVEL%
nin.exe / nin.gcc* is a single portable exe tool to get difference or intersection keys/lines between 2 files or a pipe and a file. See my open project tools directory of https://github.com/qualiu/msr.
And you can also see colorful built-in usage/examples: https://qualiu.github.io/msr/usage-by-running/nin-Windows.html
I am new to the Linux shell commands, and I am learning sort command.
The input file is as follow:
a 1
b 2
a 0
I want to use the first column as the sort key and use the '-u' option to remove the line "a 0", because it has the same key as the first line and the manual says '-u' keeps only the first of an equal run.
When I used the command sort -k 1 -u text, the result is:
a 0
a 1
b 2
However, when I used the command sort -k 1,1 -u text, the output is:
a 1
b 2
Can anyone tell me what the difference between the two commands is?
-k 1
will sort from field 1 till the end of line.
-k 1,1
will sort only by first field. You defined stop position.
That is the reason why you got different output.
Read KEYDEF in sort man page.
The -k option defines the key as a range of fields, from a start position to an optional end position. With -k1 alone the key runs from field 1 to the end of the line, which is the whole record and therefore no different from the default. With -k1,1 you're asking sort to use only the first field as the key, hence the desired result.
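The difference can be reproduced with the three-line input from the question. In this sketch the helper name sample is made up, and LC_ALL=C is an added assumption to keep the ordering independent of the locale.

```shell
# Hypothetical helper that emits the sample input from the question.
sample() { printf 'a 1\nb 2\na 0\n'; }

# Key runs from field 1 to the end of the line: all three lines differ,
# so -u removes nothing.
sample | LC_ALL=C sort -k 1 -u

# Key is field 1 only: "a 1" and "a 0" compare equal, so -u keeps just
# one of them and two lines remain.
sample | LC_ALL=C sort -k 1,1 -u
```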
I'm stuck in a problem for few days. Here it is maybe u got bigger brains than me!
I have a bunch of CSV files and I want them concatenated into a single .csv file, sorted numerically. The first problem I ran into is with the ID name (I want to sort only by ID).
eg
sort -f *.csv > output.csv This would work if i had standard ids like id001, id002, id010, id100
but my ids are like id1, id2, id10, id100 and this make my sort job inaccurate.
Ok
sort -t, -V *.csv > output.csv - This works perfectly on my test machine (sort --version GNU coreutils 8.5.0), but my live machine at work has sort version 5.3.0 (which doesn't implement -V) and I cannot update it!
I feel like such a noob, and unlucky too.
If you have a better idea please bring it on.
my csv file looks like
cn41 AQ34070YTW CDEAQ34070YTW 9C:B6:54:08:A3:C6 9C:B6:54:08:A3:C4
cn42 AQ34070YTY CDEAQ34070YTY 9C:B6:54:08:A4:22 9C:B6:54:08:A4:20
cn43 AQ34070YV1 CDEAQ34070YV1 9C:B6:54:08:9F:0E 9C:B6:54:08:9F:0C
cn44 AQ34070YV3 CDEAQ34070YV3 9C:B6:54:08:A3:7A 9C:B6:54:08:A3:78
cn45 AQ34070YW7 CDEAQ34070YW7 9C:B6:54:08:25:22 9C:B6:54:08:25:20
This is actually copy / paste from a csv. So let's say, this is my first CSV. and the other one looks like
cn201 AQ34070YTW CDEAQ34070YTW 9C:B6:54:08:A3:C6 9C:B6:54:08:A3:C4
cn202 AQ34070YTY CDEAQ34070YTY 9C:B6:54:08:A4:22 9C:B6:54:08:A4:20
cn203 AQ34070YV1 CDEAQ34070YV1 9C:B6:54:08:9F:0E 9C:B6:54:08:9F:0C
cn204 AQ34070YV3 CDEAQ34070YV3 9C:B6:54:08:A3:7A 9C:B6:54:08:A3:78
cn205 AQ34070YW7 CDEAQ34070YW7 9C:B6:54:08:25:22 9C:B6:54:08:25:20
Looking forward reading you!
Regards
You can use the -kX.Y for column X starting on Y character, together with -n for numeric:
sort -t, -k2.3 -n *csv
Given your sample file, it produces:
$ sort -t, -k2.3 -n file
,id1,aaaaaa,bbbbbbbbbb,cccccccccccc,ddddddd
,id2,aaaaaa,bbbbbbbbbb,cccccccccccc,ddddddd
,id10,aaaaaa,bbbbbbbbbb,cccccccccccc,ddddddd
,id40,aaaaaa,bbbbbbbbbb,cccccccccccc,ddddddd
,id101,aaaaaa,bbbbbbbbbb,cccccccccccc,ddddddd
,id201,aaaaaaaaa,bbbbbbbbbb,ccccccccccc,ddddddd
Update
For your given input, I would do:
$ cat *csv | sort -k1.3 -n
cn41 AQ34070YTW CDEAQ34070YTW 9C:B6:54:08:A3:C6 9C:B6:54:08:A3:C4
cn42 AQ34070YTY CDEAQ34070YTY 9C:B6:54:08:A4:22 9C:B6:54:08:A4:20
cn43 AQ34070YV1 CDEAQ34070YV1 9C:B6:54:08:9F:0E 9C:B6:54:08:9F:0C
cn44 AQ34070YV3 CDEAQ34070YV3 9C:B6:54:08:A3:7A 9C:B6:54:08:A3:78
cn45 AQ34070YW7 CDEAQ34070YW7 9C:B6:54:08:25:22 9C:B6:54:08:25:20
cn201 AQ34070YTW CDEAQ34070YTW 9C:B6:54:08:A3:C6 9C:B6:54:08:A3:C4
cn202 AQ34070YTY CDEAQ34070YTY 9C:B6:54:08:A4:22 9C:B6:54:08:A4:20
cn203 AQ34070YV1 CDEAQ34070YV1 9C:B6:54:08:9F:0E 9C:B6:54:08:9F:0C
cn204 AQ34070YV3 CDEAQ34070YV3 9C:B6:54:08:A3:7A 9C:B6:54:08:A3:78
cn205 AQ34070YW7 CDEAQ34070YW7 9C:B6:54:08:25:22 9C:B6:54:08:25:20
If your CSV format is fixed, you can use the shell equivalent of the decorate-sort-undecorate pattern:
cat *.csv | sed 's/^,id//' | sort -n | sed 's/^/,id/' >output.csv
The -n option is present even in ancient version of sort.
UPDATE: the updated input contains a number with a different prefix, and at a different position in the line. Here is a version that handles both kinds of input, as well as other inputs that have a number somewhere in the line, sorting by the first number:
cat *.csv | sed 's/^\([^0-9]*\)\([0-9][0-9]*\)/\2 \1\2/' \
| sort -n \
| sed 's/^[^ ]* //' > output.csv
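As a quick check of the decorate-sort-undecorate pipeline, the sketch below wraps it in a function and feeds it a few shortened, made-up lines in the style of the question's data. The function name sort_by_first_number is hypothetical.

```shell
# Hypothetical wrapper around the pipeline above.
sort_by_first_number() {
  sed 's/^\([^0-9]*\)\([0-9][0-9]*\)/\2 \1\2/' |  # decorate: copy the first number to the front
    sort -n |                                      # sort on that leading number
    sed 's/^[^ ]* //'                              # undecorate: drop the copied number
}

# Shortened sample lines with mixed prefixes.
printf '%s\n' 'cn201 AQ34070YTW' 'cn41 AQ34070YTY' 'id5 AQ34070YV1' |
  sort_by_first_number
```

One caveat worth testing on real data: a line that contains no digit at all is not decorated, so the final sed would strip its first word instead.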
You could try the -g option:
sort -t, -k 2.3 -g fileName
-t separator
-k key/column
-g general numeric sort