Why might a DSBulk load operation stop without any errors? - cassandra

I have created a Cassandra database in DataStax Astra and am trying to load a CSV file using DSBulk in Windows. However, when I run the dsbulk load command, the operation never completes or fails. I receive no error message at all, and I have to manually terminate the operation after several minutes. I have tried to wait it out, and have let the operation run for 30 minutes or more with no success.
I know that a free tier of Astra might run slower, but wouldn't I see at least some indication that it is attempting to load data, even if slowly?
When I run the command, this is the output that is displayed and nothing further:
C:\Users\JT\Desktop\dsbulk-1.8.0\bin>dsbulk load -url test1.csv -k my_keyspace -t test_table -b "secure-connect-path.zip" -u my_user -p my_password -header true
Username and password provided but auth provider not specified, inferring PlainTextAuthProvider
A cloud secure connect bundle was provided: ignoring all explicit contact points.
A cloud secure connect bundle was provided and selected operation performs writes: changing default consistency level to LOCAL_QUORUM.
Operation directory: C:\Users\JT\Desktop\dsbulk-1.8.0\bin\logs\LOAD_20210407-143635-875000
I know that DataStax recently changed Astra so that you need the credentials from a generated token to connect with DSBulk, but I have a classic DB instance that won't accept those token credentials when entered in the dsbulk load command. So, I use my regular username/password.
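For reference, the token-based attempt looked roughly like this, with the Client ID and Client Secret from the generated token in place of the username and password (placeholder values shown):
C:\Users\JT\Desktop\dsbulk-1.8.0\bin>dsbulk load -url test1.csv -k my_keyspace -t test_table -b "secure-connect-path.zip" -u <Client_ID> -p <Client_Secret> -header true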
When I check the DSBulk logs, the only text is the same output displayed in the console, which I have shown in the code block above.
If it means anything, I have the exact same issue when trying to run the dsbulk count operation.
I have the most recent JDK and have set both the JAVA_HOME and PATH variables.
I have also tried adding the dsbulk/bin directory to my PATH variable and had no success with that either.
Do I need to adjust any settings in my Astra instance?
Lastly, is it possible that my basic laptop is simply not powerful enough for this operation, or that it is just running the operation extremely slowly?
Any ideas or help is much appreciated!

Related

Cassandra server-side timeout configuration for "drop table" command

Is there a timeout setting in the cassandra.yaml file used to cause server-side timeouts when issuing a drop table command?
I'm using the following versions of software:
Cassandra database version: 3.11.2
Cassandra datastax java driver version: 3.4.0
I tried changing the cassandra.yaml settings write_request_timeout_in_ms, truncate_request_timeout_in_ms, and request_timeout_in_ms all to 10 ms and then issued a drop table statement via the DataStax Java driver. From my application logs I can see the statement takes about 2 seconds when measured from the client (the client and database are both on my local development machine, doing nothing else but this test) and finishes without a timeout.
I then executed the exact same test but replaced the "drop table" text in the statement with "truncate table" with no other changes and saw the expected timeout "com.datastax.driver.core.exceptions.TruncateException: Error during truncate: Truncate timed out - received only 0 responses".
I tried searching the Cassandra GitHub project but couldn't find a reference in the code showing how the server-side timeouts are applied, so I am hoping someone knows the answer to this question.
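For what it's worth, I can at least confirm which per-type request timeouts the node applies at runtime using nodetool (a diagnostic sketch, assuming nodetool gettimeout is available in this 3.11 build; none of the listed timeout types obviously corresponds to a DROP TABLE schema change):
# Print the currently active server-side timeouts for the request types I changed
nodetool gettimeout write
nodetool gettimeout truncate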

How to increase query specific timeout in VoltDB

In the VoltDB Community Edition, when I upload a CSV file (size: 550 MB) more than 7 times and then perform basic aggregation operations, I get a query timeout.
I then tried to increase the query timeout through the web interface, but it still shows the error “query specific timeout is 10s”.
What should I do if I want to resolve this issue?
What does your configuration / deployment file look like? To increase the query timeout, the following code should be somewhere in your deployment.xml file:
<systemsettings>
<query timeout="30000"/>
</systemsettings>
Where 30000 is 30 seconds, for example. The cluster-wide query timeout is set when you first initialize the database with voltdb init. You could re-initialize with --force, using a new deployment file that contains the section above:
voltdb init --force --config=new_deployment_file.xml
Or you could keep the cluster running and simply use:
voltadmin update new_deployment_file.xml
The section Query Timeout in this part of the docs contains more information as well:
https://docs.voltdb.com/AdminGuide/HostConfigDBOpts.php
Full disclosure: I work at VoltDB.

Update SQLite database without restarting application

I have a NodeJS application that runs on a production server with forever.
That application uses a third-party SQLite database which is updated every morning by a cron-triggered script that downloads the db from an external FTP server.
It took me some time to realise that I need to restart my server every time the file is fetched, otherwise there is no change in the data used by my application (I guess it is cached in memory at startup).
# sync_db.sh
wget -l 0 ftp://$REMOTE_DB_PATH --ftp-user=$USER --ftp-password=$PASSWORD \
--directory-prefix=$DIRECTORY -nH
forever restart 0   # <- Here I hope something nicer...
What can I do to refresh the database without restarting the app?
You must not overwrite a database file that might have some open connection to it (see How To Corrupt An SQLite Database File).
The correct way to overwrite a database is to download to a temporary file, and then copy it to the actual database with the backup API, which takes care of proper transactional behaviour. The simplest way to do this is with the sqlite3 command-line shell:
sqlite3 $DIRECTORY/real.db ".restore $DOWNLOADDIRECTORY/temp.db"
(If your application manually caches data, you still have to tell it to reload it.)
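Putting the two steps together, a revised sync_db.sh might look roughly like this (a sketch; $DOWNLOADDIRECTORY, real.db and temp.db are placeholder names and should match whatever wget actually saves):
# sync_db.sh -- download to a temporary location first, then restore into the live db
wget -l 0 ftp://$REMOTE_DB_PATH --ftp-user=$USER --ftp-password=$PASSWORD \
--directory-prefix=$DOWNLOADDIRECTORY -nH
# Copy the freshly downloaded file into the database the app keeps open,
# using the backup API through the sqlite3 shell
sqlite3 $DIRECTORY/real.db ".restore $DOWNLOADDIRECTORY/temp.db"
The forever restart line then becomes unnecessary, although the app may still need to be told to reload anything it has cached in memory, as noted above.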

knex migration error in node js app

I am using knex to connect with Postgres in my application. I am getting the following error when I run
knex migrate:latest
TimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
at Timeout._onTimeout
Referring to some threads, I understand that I have to add a transacting call, but do I need to add it to all the SQL calls in my app?
The documentation does not give me details about when to add this or why it is a must. My queries are mostly of type "GET" (plain selects), so I am not sure whether those queries need transacting.
It seems to be a library bug, probably.
Generally speaking, any operation, including SELECT, also runs inside a transaction and involves read locking. The database organizes the locking sequence according to the transaction isolation level setting, and READ COMMITTED is usually the default. Rows in a table cannot be deleted while a user is still reading them: the delete (exclusive lock) waits until the select (shared read lock) releases them, even if we never issued an explicit BEGIN TRANSACTION.
For this reason, most database connection libraries support an "auto commit" option that wraps a statement in a transaction by default when no explicit transaction has been opened (or the DBMS session handles it natively), so every request runs inside a transaction block.
Knex does not seem to expose this option explicitly; as far as I can tell it differs between dialects. While reading the code, I found that the Oracle dialect implements it, but the PostgreSQL implementation does not have auto commit. It looks incomplete to me.
The documentation also says you can run a select query without a transacting call. If that leaks many open sessions, then it is obviously a bug; please file a bug report with sample code that reproduces the issue.
Or you could inspect which queries are pending from the database side. Any modern database system can list its sessions and their locking status. I suspect you have mixed plain select calls with transacting() calls, and the plain selects may then be appended to an uncommitted open transaction. You can watch what is happening through the database's admin features.
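For PostgreSQL, a minimal way to do that from the shell is pg_stat_activity (a sketch; app_db is a placeholder database name, adjust credentials as needed):
# List current sessions, their state, when their transaction started, and their query
psql -d app_db -c "SELECT pid, state, xact_start, query FROM pg_stat_activity WHERE datname = 'app_db';"
Sessions stuck in the state 'idle in transaction' would point to exactly the kind of uncommitted open transaction described above.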

Bacula - Director unable to authenticate with Storage daemon

I'm trying to stay sane while configuring Bacula Server on my virtual CentOS Linux release 7.3.1611 to do a basic local backup job.
I prepared all the configurations I found necessary in the conf-files and prepared the mysql database accordingly.
When I want to start a job (local backup for now) I enter the following commands in bconsole:
*Connecting to Director 127.0.0.1:9101
1000 OK: bacula-dir Version: 5.2.13 (19 February 2013)
Enter a period to cancel a command.
*label
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"
Automatically selected Storage: File
Enter new Volume name: MyVolume
Defined Pools:
1: Default
2: File
3: Scratch
Select the Pool (1-3): 2
This returns
Connecting to Storage daemon File at 127.0.0.1:9101 ...
Failed to connect to Storage daemon.
Do not forget to mount the drive!!!
You have messages.
where the message is:
12-Sep 12:05 bacula-dir JobId 0: Fatal error: authenticate.c:120 Director unable to authenticate with Storage daemon at "127.0.0.1:9101". Possible causes:
Passwords or names not the same or
Maximum Concurrent Jobs exceeded on the SD or
SD networking messed up (restart daemon).
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION00260000000000000000 for help.
I double- and triple-checked all the conf files for integrity, names, and passwords. I don't know where else to look for the error.
I will gladly post any parts of the conf files but don't want to blow up this question right away if it might not be necessary. Thank you for any hints.
It might help someone sometime who made the same mistake as I did:
After looking through manual page after manual page I found it was my own mistake. I had set all ports to 9101 (for a reason I don't precisely recall; I guess to troubleshoot another issue earlier) - for the director, the file daemon and the storage daemon.
So I assume the Bacula components must have blocked each other's communication on port 9101. After resetting the default ports (9102 for the file daemon, 9103 for the storage daemon) according to the manual, it worked and I can now back up locally.
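A quick way to double-check which port each daemon actually ended up listening on (a diagnostic sketch, assuming all three daemons run on the same host):
# 9101 = director, 9102 = file daemon, 9103 = storage daemon by default
ss -tlnp | grep -E '910[123]'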
You have to add the director's name from the backup server: edit /etc/bacula/bacula-fd.conf on the remote client (see "List Directors who are permitted to contact this File daemon"):
Director {
Name = BackupServerName-dir
Password = "use *-dir password from the same file"
}
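After editing bacula-fd.conf, the file daemon has to re-read its configuration before the change takes effect; on CentOS 7 that would typically be (service name assumed):
systemctl restart bacula-fd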
