Cassandra: OperationTimedOut: errors={}, last_host=127.0.0.1

I am trying "select count(*) from users;" on Cassandra, but after waiting about 10 seconds I get "OperationTimedOut: errors={}, last_host=127.0.0.1". I am running this in cqlsh. The cqlsh and Cassandra versions are below.
cqlsh 5.0.1 | Cassandra 3.0.8
I found a few solutions on Stack Overflow, but none of them worked. I tried the following.
In the file cassandra/conf/cassandra.yaml I increased a few of the request timeout settings and then restarted Cassandra.
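The settings usually meant by this advice, in a Cassandra 3.x cassandra.yaml, are the ones below; the values shown are the shipped defaults in milliseconds, and a count(*) scan falls under the range timeout:
# cassandra.yaml defaults (milliseconds); raise these for slow queries
read_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 10000
request_timeout_in_ms: 10000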
I created a cqlshrc file in the .cassandra folder and added the following to it:
[connection]
client_timeout = 2000000
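Depending on the cqlsh version, the client-side setting may be named request_timeout rather than client_timeout; if the value above is ignored, the equivalent cqlshrc entry would be:
[connection]
request_timeout = 2000000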
A few solutions on Stack Overflow suggest increasing a timeout field in the cqlsh launcher script, but the file /cassandra/bin/cqlsh doesn't contain any such field, so I didn't change anything there. The cqlsh file content is below.
python -c 'import sys; sys.exit(not (0x020700b0 < sys.hexversion < 0x03000000))' 2>/dev/null \
    && exec python "`python -c "import os;print(os.path.dirname(os.path.realpath('$0')))"`/cqlsh.py" "$@"
for pyver in 2.7; do
    which python$pyver > /dev/null 2>&1 && exec python$pyver "`python$pyver -c "import os;print(os.path.dirname(os.path.realpath('$0')))"`/cqlsh.py" "$@"
done
echo "No appropriate python interpreter found." >&2
exit 1
Solutions please.

If the Cassandra server is otherwise working correctly, the timeout exception is raised because the server can't answer the request in time. Is the 'users' table too large?
One way to count rows in big tables is to maintain a 'counter' column in another table.
For example, we can create a table like this:
CREATE TABLE custom_counters (
    name varchar,
    count counter,
    PRIMARY KEY (name)
);
Every time we insert a user into the 'users' table, we update its counter in the 'custom_counters' table:
UPDATE custom_counters
SET count = count + 1
WHERE name='users';
So, when we need to know the number of users, we simply query that field:
SELECT count FROM custom_counters WHERE name = 'users';
More info here: https://docs.datastax.com/en/cql/3.1/cql/cql_using/use_counter_t.html

Related

I need to List all the available keyspaces in Cassandra and save to a .txt file

Hi all, I am new to Cassandra and got an assignment where I need to list all the available keyspaces in Cassandra and save them to a .txt file.
I have tried every snippet I could find and searched many sites, but I am still unable to succeed.
I have tried the commands below in order to save the available keyspaces to a .txt file.
cqlsh -e 'DESCRIBE KEYSPACE firstkeyspace' > test.txt;
cqlsh -e "DESCRIBE KEYSPACE firstkeyspace" > pathtosomekeyspace.txt
cqlsh -e "DESC KEYSPACE firstkeyspace" > firstkeyspace_schema.txt;
cqlsh -e "DESC KEYSPACES" > firstkeyspace_schema.txt
I am getting the error below and am unable to fix it.
SyntaxException: line 1:0 no viable alternative at input 'cqlsh' ([cqlsh]...)
I have also tried with single quotes, but it is still not working.
I request you all to help me solve this problem.
Thanks in advance.
This error indicates that you're running the commands within cqlsh itself:
SyntaxException: line 1:0 no viable alternative at input 'cqlsh' ([cqlsh]...)
For example:
cqlsh> cqlsh -e "DESCRIBE KEYSPACE ks" > ks.txt ;
SyntaxException: line 1:0 no viable alternative at input 'cqlsh' ([cqlsh]...)
You need to exit out of cqlsh and run the commands at the Linux command line. For example:
$ cqlsh -e "DESCRIBE KEYSPACES" > keyspaces.txt
Don't confuse CQL commands like DESCRIBE KEYSPACES with Linux shell commands. Cheers!
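If the assignment also needs each keyspace's schema in its own file, a small shell loop over the DESC KEYSPACES output works as well; a sketch that relies on cqlsh printing the names whitespace-separated:
# one schema file per keyspace; $(...) word-splits cqlsh's output
for ks in $(cqlsh -e "DESC KEYSPACES"); do
    cqlsh -e "DESC KEYSPACE $ks" > "$ks.txt"
done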
So I see that error when I try to run cqlsh from within cqlsh.
aploetz@cqlsh> cqlsh -u aploetz -p xxxxxxxx -e 'DESCRIBE KEYSPACE stackoverflow' ;
SyntaxException: line 1:0 no viable alternative at input 'cqlsh' ([cqlsh]...)
That's not going to work. Exit out, and run it from your command line, instead.
aploetz@cqlsh> exit
% bin/cqlsh -u aploetz -p xxxxxxxx -e 'DESCRIBE KEYSPACE stackoverflow' > stackoverflow.txt
% head -n 5 stackoverflow.txt
CREATE KEYSPACE stackoverflow WITH replication = {'class': 'NetworkTopologyStrategy', 'SnakesAndArrows': '1'} AND durable_writes = true;
CREATE TABLE stackoverflow.customer_info_by_date (
billing_due_date date,
If you're referring to a HackerRank prompt, or even otherwise, here is what I did to solve it!
HackerRank gave me the example of trying:
cqlsh -e "command" > filename
HOWEVER, this didn't work for me, just as it didn't for you. Instead, do:
COPY system_schema.keyspaces TO 'keyspace.txt';
Here, system_schema.keyspaces is generic to all systems, as it collects all the keyspaces (rather than being one of my named keyspaces).
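Note that COPY ... TO exports CSV, so keyspace.txt will contain every column of system_schema.keyspaces; if only the names are wanted, COPY also accepts a column list:
COPY system_schema.keyspaces (keyspace_name) TO 'keyspace.txt';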

bash arguments list too long on linux but not on mac

Here is the shell script that I run:
for file in "$d/resources/"*; do
    resourceName=$(basename "$file")
    echo "Inserting resource: $resourceName..."
    resource=$(cat "$file")
    # Generate a sequential id
    resourceId=$((resourceId+1))
    # Insert into resources table
    cqlsh -e "INSERT INTO $TENANT_NAME.resources (id, target, lastUpdateDate, lastUpdateUser, algorithmName, resourceName, resourceContent) VALUES ($resourceId, 'template', toTimestamp(now()), null, '$algorithmName', '$resourceName', \$\$$resource\$\$);" $STORAGE_HOST_ADDRESS $STORAGE_HOST_PORT
done
On Mac it works fine, but on Linux it throws "argument list too long" because of the size of $resource. Can someone please tell me how to fix this? Thanks.
Linux has a limit of 128k per argument. macOS has a limit of 256k for arguments+environment.
Write the query to a file instead, and have cqlsh execute the file rather than receiving the whole statement as an argument:
cqlsh -f myqueryfile host port
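Applied to the loop in the question, each iteration can write the INSERT to a temporary file and hand cqlsh only the file name, so the large $resource value never appears on the command line; a sketch using the same variable names as above:
# inside the loop, once resource and resourceId are set
queryfile=$(mktemp)
cat > "$queryfile" <<EOF
INSERT INTO $TENANT_NAME.resources (id, target, lastUpdateDate, lastUpdateUser, algorithmName, resourceName, resourceContent)
VALUES ($resourceId, 'template', toTimestamp(now()), null, '$algorithmName', '$resourceName', \$\$$resource\$\$);
EOF
cqlsh -f "$queryfile" $STORAGE_HOST_ADDRESS $STORAGE_HOST_PORT
rm -f "$queryfile"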

psql return code if zero rows found

I would like for my psql command to fail if zero rows are found:
psql -U postgres -d db -c "select * from user where id=1 and name='Joe';"
I want to be able to check the return value. Return 0 from the process(!) if at least one row exists and return non-zero from the psql process if no such row exists. How can I set a return code if no rows are found?
I don't think psql can do it by itself, but if you just want to see if there are any rows or not with the exit status you could combine it like
psql -U postgres -d db -t -c "select * from user where id=1 and name='Joe'" | egrep .
That will cause egrep to exit with non-zero if it cannot match anything. The -t will make it not print the column headers and summary information, so you may need to tweak this command line if you need that stuff.
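For example, a script can branch on the pipeline's exit status directly, since the status of a pipeline is that of its last command; grep -q performs the same match as egrep . but suppresses the output:
if psql -U postgres -d db -t -c "select * from user where id=1 and name='Joe'" | grep -q .; then
    echo "row found"
else
    echo "no matching row"
fi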

Using Postgres transactions in linux shell script

I'm developing a shell script that loops through a series of Postgres database table names and dumps the table data. For example:
# dump data
psql -h $SRC_IP_ADDRESS -p 5432 -U postgres -c "BEGIN;" AWARE
for i in $TABLES    # $TABLES holds the table names to dump (name assumed)
do
    pg_dump -U postgres -h $IP_ADDRESS -p 5432 -t $i -a --inserts MYDB >> \
    out.sql
done
psql -h $IP_ADDRESS -p 5432 -U postgres -c "COMMIT;" MYDB
I'm worried about concurrent access to the database, however. Since there is no database lock for Postgres, I tried to wrap a BEGIN and COMMIT around the loop (using psql, as shown above). This resulted in an error message from the psql command, saying that:
WARNING: there is no transaction in progress
Is there any way to achieve this? If not, what are the alternatives?
Thanks!
Your script has two main problems.
The first problem is practical: a transaction is part of a specific session, so your first psql command, which just starts a transaction and then exits, has no real effect: the transaction ends when the command completes, and later commands do not share it.
The second problem is conceptual: changes made in transaction X aren't seen by transaction Y until transaction X is committed, but as soon as transaction X is committed, they're immediately seen by transaction Y, even if transaction Y is still in progress. This means that, even if your script did successfully wrap the entire dump in a single transaction, it wouldn't make any difference: the dump could still see inconsistent results from one query to the next. (That is: it's meaningless to wrap a series of SELECTs in a transaction. A transaction is only meaningful if it contains one or more DML statements: UPDATEs, INSERTs, or DELETEs.)
However, you don't really need your shell script to loop over your list of tables; rather, you can give pg_dump all the table names at once by passing multiple -t flags:
pg_dump -U postgres -h $IP_ADDRESS -p 5432 \
-t table1 -t table2 -t table3 -a --inserts MYDB >> out.sql
and according to the documentation, pg_dump "makes consistent backups even if the database is being used concurrently", so you wouldn't need to worry about setting up a transaction even if that did help.
(By the way, the -t flag also supports a glob notation; for example, -t table* would match all tables whose names begin with table.)
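If the table list is long or comes from a file, the flags for that single pg_dump call can be assembled in the shell; a sketch, assuming one table name per line in a file called tables.txt:
# build one -t flag per table name listed in tables.txt (file name assumed)
flags=""
while read -r t; do
    flags="$flags -t $t"
done < tables.txt
# $flags left unquoted so each -t/name pair splits into separate arguments
pg_dump -U postgres -h $IP_ADDRESS -p 5432 $flags -a --inserts MYDB > out.sql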

RRD print the timestamp of the last valid data

I have an RRD database storing ping responses from a wide range of network equipment.
How can I print on the graph the timestamp of the last valid entry in the RRD database, so that when a host is down I can see when it went down?
I use the following to create the RRD file.
rrdtool create terminal_1.rrd -s 60 \
DS:ping:GAUGE:120:0:65535 \
RRA:AVERAGE:0.5:1:2880
Use the lastupdate option of rrdtool.
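For example, this prints the DS name (ping here) plus the epoch timestamp and value of the most recent update:
rrdtool lastupdate terminal_1.rrd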
Another solution exists if you only have one file per host: don't update your RRD when the host is down. You can then see the last update time with a plain ls or stat, as in:
ls -l terminal_1.rrd
stat --format %Y terminal_1.rrd
If you plan to use RRD's caching daemon, you have to use the last command in order to flush the pending updates.
rrdtool last terminal_1.rrd
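If rrdcached is listening on a socket, the daemon can also be addressed explicitly with the --daemon flag; the socket path here is an assumption, adjust it to your installation:
rrdtool last --daemon unix:/var/run/rrdcached.sock terminal_1.rrd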
