I have an SSTable of 40 GB which I was trying to split using the following command:
bin/sstablesplit --no-snapshot -s 10 keyspace-columnfamily-ka-2466-Data.db
But it deletes the existing 40 GB file and doesn't even split it, without giving any error. What could be the possible reason, or am I doing something wrong here?
Try the -v (verbose) option and send the output:
bin/sstablesplit -v --no-snapshot -s 10 keyspace-columnfamily-ka-2466-Data.db
I am not able to identify what is causing my EC2 disk space to reach 100% of capacity.
I have a script which deletes files in the tmp folder, but still, randomly, my disk capacity sometimes reaches 100%.
I have attached the output of df -i to show the utilization.
Error
PM2 | Error: ENOSPC: no space left on device, write
PM2 | at Object.writeSync (fs.js:679:3)
PM2 | at Object.writeFileSync (fs.js:1393:26)
PM2 | at ProcessContainer (/usr/lib/node_modules/pm2/lib/ProcessContainer.js:70:10)
PM2 | at Object.<anonymous> (/usr/lib/node_modules/pm2/lib/ProcessContainer.js:103:3)
PM2 | at Module._compile (internal/modules/cjs/loader.js:999:30)
I am checking the utilization with the commands df -i and du -h -d 1 (output screenshots attached).
Check the user's .pm2/logs directory; if your Node app has errors or writes many regular logs, this can increase the disk space used.
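For example (assuming PM2 runs under the current user), something like this shows how much space the PM2 logs take and clears them:
du -sh ~/.pm2/logs   # size of the PM2 log directory
pm2 flush            # empty all log files managed by PM2
There is also the pm2-logrotate module (pm2 install pm2-logrotate) if you want the logs rotated automatically.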
I think that 8 GB is too small. You should upgrade your server to allocate more space; this should solve your problem.
If you can't, or don't want to, add disk space, you can take a look at the /var/log directory and delete some old logs. In the long term, you can use logrotate to compress log files and upload the compressed ones to another place, in order to keep /var/log as small as possible.
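As a rough sketch, a drop-in file under /etc/logrotate.d can do this; the log path below is just a placeholder for your application:
# /etc/logrotate.d/myapp  (hypothetical path, adjust to your own logs)
/var/log/myapp/*.log {
    weekly
    rotate 4
    compress
    delaycompress
    missingok
    notifempty
}
You can dry-run it with logrotate -d /etc/logrotate.d/myapp to check the configuration before relying on it.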
UPDATE
Also, I am not a specialist in Ubuntu and snap, but your /snap directory is 2.1 GB in size. You can check whether snap is retaining old versions of snap packages, or whether there is some cache that can be cleared.
Here is a bash script to remove old snap revisions that I found here: https://www.debugpoint.com/clean-up-snap/
#!/bin/bash
# Removes old revisions of snaps
# CLOSE ALL SNAPS BEFORE RUNNING THIS
set -eu
LANG=en_US.UTF-8 snap list --all | awk '/disabled/{print $1, $3}' |
    while read snapname revision; do
        snap remove "$snapname" --revision="$revision"
    done
You can also delete files in /var/lib/snapd/cache; it's a snap cache that can be cleared.
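For example, something like this (use at your own risk):
du -sh /var/lib/snapd/cache          # see how much the snap download cache uses
sudo rm -f /var/lib/snapd/cache/*    # clear it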
But as I said, I am not an Ubuntu specialist, so this is untested.
You can use the du utility:
cd /
du -h -d 1
It will show the disk usage for every folder in /; then you can cd into the biggest ones and repeat the process.
You can also run
du | sort -n
and you'll get (after a while) the size of every folder in the filesystem, ordered by ascending size. In my experience, I'd take a first look at /home, /tmp and /var.
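If your tools are the GNU versions (sort -h understands human-readable sizes), a variation that goes straight to the biggest directories could look like this:
# 20 largest directories up to 3 levels deep, staying on this filesystem
sudo du -xh -d 3 / 2>/dev/null | sort -h | tail -n 20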
I'm puzzled by this problem I'm having on Ubuntu 20.04, where cron is able to run a bash script but the overall outcome is different from when I run it from the shell.
I've looked through all the questions I could find here and on Google, but couldn't find anyone who had the same problem.
Background:
I'm using Pushgateway to store metrics I'm generating through a bash script, and afterwards they are imported automatically into Prometheus.
The end goal is to export a list of running processes with their CPU%, Mem%, etc., similar to the top command.
This is the bash script:
#!/bin/bash
z=$(top -n 1 -bi)
while read -r z
do
    var=$var$(awk 'FNR>7{print "cpu_usage{process=\""$12"\", pid=\""$1"\"}", $9z} FNR>7{print "memory_usage{process=\""$12"\", pid=\""$1"\"}", $10z}')
done <<< "$z"
curl -X POST -H "Content-Type: text/plain" --data "$var
" http://localhost:9091/metrics/job/top/instance/machine
I used to have a version that used ps aux but then I found out that it only shows the average CPU% per process.
As you can see, the command I'm running is top -n 1 -bi, which gives me a snapshot of active processes and their metrics.
I'm using awk to format the data, and FNR>7 because I need to ignore the first 7 lines, which are the summary presented by top.
The bash script is installed in /bin, /usr/bin and /usr/local/bin.
When checking http://localhost:9091/metrics, which is supposed to show me the information gathered, I get this kind of information when running the script from a shell:
cpu_usage{instance="machine",job="top",pid="114468",process="php-fpm74"} 17.6
cpu_usage{instance="machine",job="top",pid="114483",process="php-fpm74"} 11.8
cpu_usage{instance="machine",job="top",pid="126305",process="ffmpeg"} 64.7
And this is the same information when cron is running the same script:
cpu_usage{instance="machine",job="top",pid="114483",process="php-fpm+"} 5
cpu_usage{instance="machine",job="top",pid="126305",process="ffmpeg"} 60
cpu_usage{instance="machine",job="top",pid="128777",process="php"} 15
So, for some reason, when I run it from cron it truncates the process name after 7 characters.
I initially thought it was related to the FNR>7, but even after changing it to 8 or 9 (and using exec bash to re-register the command) it gives the same results; when I run the script manually it works just fine.
Any help would be appreciated!!
I'm trying to load a large CSV file (30 GB) into my cluster. I suspect I might be overloading my Cassandra driver, which causes it to crash at some point during loading. I keep getting a repeated message while it loads the data, until at a certain point it stops with an error that kills the process.
My current loading command is:
dsbulk load -url data.csv -k hotels -t reviews -delim '|' -header true -h '' -port 9042 -maxConcurrentQueries 128
Using -maxConcurrentQueries 128 did not change anything in terms of errors.
Any idea how I can modify my command to make it work?
I have a 1,700-line query to be executed in impala-shell. I created a shell script with the command below:
impala-shell -V -i hostname -q "[QUERY]"
However, when I executed it using sh script.sh, I got the error message "Argument list too long". I am able to run simpler/shorter queries with the impala-shell command.
I also tried to raise the limit by running ulimit -s 65536, but I got the same error.
I suspect the query is simply too long to be passed on the command line.
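For what it's worth, on Linux the limits on command-line argument size can be inspected like this, and a query of that length passed via -q can easily run into them:
getconf ARG_MAX                   # total bytes allowed for arguments plus environment
xargs --show-limits < /dev/null   # GNU xargs also reports the effective limits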
The -f option is the answer. I prepared a separate SQL file and it worked:
impala-shell -V -i hostname -f file.sql
On running this query:
{ "start_absolute":1359695700000, "end_absolute":1422853200000,
"metrics":[{"tags":{"Building_id":["100"]},"name":"meterreadings","group_by":[{"name":"time","group_count":"12","range_size":{"value":"1","unit":"MONTHS"}}],"aggregators":[{"name":"sum","align_sampling":true,"sampling":{"value":"1","unit":"Months"}}]}]}
I am getting the following response:
500 {"errors":["Too many open files"]}
In this link it is written that I should increase the size of file-max.
My file-max output is:
cat /proc/sys/fs/file-max
382994
It is already very large; do I need to increase its limit?
What version are you using? Are you using a lot of group by in your queries?
You may need to restart KairosDB as a workaround.
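Depending on how KairosDB was installed, the restart could look like one of these (the service name and install path below are assumptions):
sudo service kairosdb restart        # if it runs as a system service (assumed name)
# or, for a tarball install (assumed path):
/opt/kairosdb/bin/kairosdb.sh stop
/opt/kairosdb/bin/kairosdb.sh start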
Can you check whether you have deleted (ghost) file handles? (Replace <PID> with the KairosDB process ID in the command line below.)
ls -l /proc/<PID>/fd | grep kairos_cache | grep -v '(delete)' | wc -l
There was a fix in 0.9.5 for unclosed file handles.
There's another fix pending for the next release (1.0.1).
cf. https://github.com/kairosdb/kairosdb/pull/180, https://github.com/kairosdb/kairosdb/issues/132, and https://github.com/kairosdb/kairosdb/issues/175.