Process permanently stuck on D state [closed] - linux

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 9 years ago.
I have an issue with some processes stuck in a D state on Ubuntu 10.04.3 LTS.
They have been in this state since November 5th (today is December 6th). I understand that this is an uninterruptible sleep state, often caused by waiting for data from hardware such as a hard disk. This is a production server, so rebooting is a last resort. Can anyone shed any light on what these processes might be?
This is the output for the D state items from ps -aux
www-data 22851 0.0 0.0 0 0 ? D Nov05 0:00 [2637.64]
www-data 26306 0.0 0.0 4008 12 ? D Nov05 0:00 ./2.6.37
www-data 26373 0.0 0.0 4008 12 ? D Nov05 0:00 ./n2
www-data 26378 0.0 0.0 4008 12 ? D Nov05 0:00 ./n2
This is output of ps axl | awk '$10 ~ /D/' for a little more info.
0 33 22851 1 20 0 0 0 econet D ? 0:00 [2637.64]
1 33 26306 1 20 0 4008 12 ec_dev D ? 0:00 ./2.6.37
1 33 26373 1 20 0 4008 12 ec_dev D ? 0:00 ./n2
1 33 26378 1 20 0 4008 12 ec_dev D ? 0:00 ./n2
Is there a way to kill these? Does having processes in this state when rebooting cause any issues?

This is the dreaded uninterruptible (TASK_UNINTERRUPTIBLE) state of a process. In this state the process does not react to signals until whatever it started waiting for completes.
Unfortunately it is a necessary evil. See "What is an uninterruptable process?".
My answer is to reboot the system.
Does rebooting cause any issues?
Hard to tell; it may or may not. The process in the D state may have crucial updates to perform, which it won't complete if you reboot.
If you really can't afford to reboot, try to find the disk the process is waiting on and check whether it is working properly by opening, closing, reading from and writing to it.

No - you cannot kill them, period. kill -9 does not work either. And it is not a kernel bug; it is by design. All signals are blocked until those processes leave the D state. They either leave the D state or the system gets rebooted. No, rebooting does not have any problem with these guys.
The usual culprits for this kind of problem are removable media devices like a cdrom. The device may be defective or somebody found a way to do something stupid.
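To see which processes are currently in the D state and which kernel function each one is sleeping in, the ps output can be filtered on its STAT column. This is a minimal sketch, assuming a procps-style ps; the helper names filter_d_state and list_d_state are made up for illustration.

```shell
#!/bin/sh
# filter_d_state (hypothetical helper): read "PID STAT WCHAN COMMAND" rows on
# stdin and print only those whose STAT column starts with D.
filter_d_state() {
    awk '$2 ~ /^D/'
}

# list_d_state (hypothetical helper): feed live ps output through the filter.
# "wchan:32" widens the column that names the kernel function the task sleeps in.
list_d_state() {
    ps -eo pid,stat,wchan:32,args | filter_d_state
}
```

Calling list_d_state on the affected machine should print one row per stuck process; an empty result means nothing is in the D state at that moment.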


Copy to SD Card changes the Execute permissions (Linux) [closed]

Closed. This question is not about programming or software development. It is not currently accepting answers.
Closed 17 days ago.
I have a file on /tmp
-rw-r--r-- 1 root root 6782 Jun 30 11:20 DATA_00.csv
when I copy it to SD Card with
cp /tmp/DATA_00.csv /mnt/mmccard/
its execute flag is set!
-rwxr-xr-x 1 root root 6782 Jun 30 11:21 DATA_00.csv
Is this normal?
This is on Linux 2.6.20 ;)
@koyaanisqatsi
Hi, fdisk -l doesn't give me any new information.
In fact, I don't know why it shows more than one partition.
/mnt/mmccard type vfat (rw,sync,fmask=0022,dmask=0022,codepage=cp437,iocharset=iso8859-1)
Disk /dev/mmcblk0p1: 8064 MB, 8064598016 bytes
4 heads, 16 sectors/track, 246112 cylinders
Units = cylinders of 64 * 512 = 32768 bytes
Device Boot Start End Blocks Id System
/dev/mmcblk0p1p1 ? 29216898 55800336 850670010+ 7a Unknown
Partition 1 does not end on cylinder boundary.
/dev/mmcblk0p1p2 ? 25540106 55528404 959625529 72 Unknown
Partition 2 does not end on cylinder boundary.
/dev/mmcblk0p1p3 ? 1 1 0 0 Empty
Partition 3 does not end on cylinder boundary.
/dev/mmcblk0p1p4 438273 438279 221 0 Empty
Partition 4 does not end on cylinder boundary.
The card was formatted as FAT32 with Windows 10.
Hello
Yes, depending on the filesystem on the SD card. I guess it is something like MS FAT16/FAT32?
Run the mount command without any options or parameters to check.
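vfat stores no Unix permission bits, so the driver synthesizes them from the mount options: a regular file appears with mode 0777 minus the bits set in fmask. Here is a small sketch of that arithmetic, using the fmask=0022 from the mount output in the question; the helper name vfat_file_mode is made up for illustration.

```shell
#!/bin/sh
# vfat_file_mode (hypothetical helper): print the octal mode a regular file
# appears with on a vfat mount, given the fmask mount option.
vfat_file_mode() {
    printf '%o\n' $(( 0777 & ~$1 ))
}

vfat_file_mode 0022   # prints 755, i.e. rwxr-xr-x
vfat_file_mode 0133   # prints 644, i.e. rw-r--r--
```

So the execute bits come from the mount options, not from cp. Mounting the card with an option such as fmask=0133 should make files show up without them.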

Restart the computer when the power on time exceeds 30 minutes [closed]

Closed 4 years ago.
I currently control this with uptime.
The computer restarts if the uptime is greater than 1 hour.
But I do not know how to handle the case where the computer has been on for one day or more, because currently I only check the hours.
Is it possible to check days, hours and minutes with uptime?
I need to restart the computer when the power-on time is greater than 1 hour.
If the uptime is 1 day and 0 hours, the check fails.
Sorry for my explanation. It is a script that does a series of things, and at the end there is this function that is responsible for checking this parameter.
Thanks for reading.
Not sure I quite understand your issue.
If you want your computer to ALWAYS reboot after a specific amount of time, which is very unusual, then use cron. Add this to /etc/crontab (alternatively, if there is a /etc/cron.d directory on your machine, you can also create a file /etc/cron.d/reboot with this content) :
@reboot root sleep 1800; /sbin/reboot
(adapt reboot's path to match your system; 1800 is the number of seconds for 30 minutes, change it to whatever delay you need)
On the other hand, you may be writing a script that will reboot your server, and you may want to keep it from working if it is run before 30 minutes of uptime (which makes more sense).
Then, I understand you have difficulties parsing the result of uptime and you should use /proc/uptime which gives your uptime in seconds:
#!/bin/sh
not_before=1800 # Number of seconds - adapt to your needs
uptime=$(cut -d . -f 1 /proc/uptime)
[ "$uptime" -ge "$not_before" ] && exec reboot
echo "Sorry, only $uptime s of uptime; you must wait $((not_before - uptime)) seconds" >&2
exit 1
If you want to do it in a script, use the result of uptime | grep " day" to determine whether to execute things (in an if condition), then do anything you want inside the if.
Make that script executable and put it in crontab to run every 5 minutes or so.
More information on Cron: https://wiki.archlinux.org/index.php/Cron
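Because /proc/uptime reports plain seconds, days, hours and minutes can be derived with integer arithmetic instead of being parsed from the text that uptime prints. This is a minimal sketch; the function name split_uptime is made up for illustration.

```shell
#!/bin/sh
# split_uptime (hypothetical helper): convert a number of seconds into
# "<days>d <hours>h <minutes>m".
split_uptime() {
    secs=$1
    days=$(( secs / 86400 ))
    hours=$(( (secs % 86400) / 3600 ))
    mins=$(( (secs % 3600) / 60 ))
    echo "${days}d ${hours}h ${mins}m"
}

# On Linux, the first field of /proc/uptime is the uptime in seconds.
if [ -r /proc/uptime ]; then
    split_uptime "$(cut -d . -f 1 /proc/uptime)"
fi
```

Comparing the raw number of seconds against a threshold avoids the "1 day and 0 hours" case entirely, since there is no separate days field to forget.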

How to monitor bunch of Ubuntu machines for CPU usage from another machine? [closed]

Closed 8 years ago.
I have a couple of machines, shown below, which are running Ubuntu 12.04, and I need to find the name and PID of any process whose CPU usage is greater than 70%.
Below are the machines as an example -
machineA
machineB
machineC
machineD
I need a shell script that runs every 15 minutes and checks whether any of the above machines has CPU usage greater than 70%. If any machine does, it should send an email with the machine name and the process name along with its PID.
I will be running my shell script from machineX, and I have a passwordless ssh key set up for user david from machineX to all the above machines.
What is the best way to do this kind of monitoring?
I have the command below, which gets me the PID, %CPU and COMMAND of processes whose CPU usage is greater than 70%.
ps aux --sort=-%cpu | awk 'NR==1{print $2,$3,$11}NR>1{if($3>=70) print $2,$3,$11}'
I am not sure how to fully automate this process.
I think you are going about this the wrong way: the %CPU figure shown there is not a reliable measure of sustained load, and a second later that process may not be consuming any resources. It is better to find CPU-hungry processes another way. My preferred way is by CPU time. Have a look at this example:
panos@wintermute:~$ ps xafu
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S 18:53 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 18:53 0:00 \_ [ksoftirqd/
root 5 0.0 0.0 0 0 ? S< 18:53 0:00 \_ [kworker/0:
root 7 0.0 0.0 0 0 ? S 18:53 0:02 \_ [rcu_sched]
TIME is the CPU time a process has consumed. Normal processes do not need much CPU time. So with a simple shell script and a small loop you can gather the information you need. The shell script could look like this:
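#!/bin/sh
# Print PID, TIME and COMMAND of processes with at least 5 minutes of CPU time
date
for i in A B C D; do
    echo "machine${i}"
    # Assumes TIME is printed as MMM:SS, so compare the minutes part as a number
    ssh "machine${i}" ps xau | awk 'NR==1{print $2,$10,$11} NR>1{split($10,t,":"); if (t[1]+0>=5) print $2,$10,$11}'
    echo "-- --"
done
exit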
That will match any process that has consumed at least 5 minutes of CPU time.
AWK does row-oriented editing (and a bunch of other stuff too).
A block of statements enclosed in braces {} is executed for each row of input. This behaviour can be restricted by prefixing the block with a condition (in ordinary C syntax).
NR==1 {} means the block is executed on the first input row. In the example above, fields 2, 10 and 11 from the first input row are printed on a single row.
NR>1 {} means the block is executed for each row after the first one.
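The two patterns can be tried out on a few lines of canned ps-style input. This sketch uses made-up sample rows and a hypothetical helper name, cpu_time_filter. It also splits the TIME field on ":" so that the minutes are compared as a number, which assumes TIME is printed as MMM:SS.

```shell
#!/bin/sh
# cpu_time_filter (hypothetical helper): keep the header row, then only rows
# whose TIME column (field 10) shows at least $1 minutes of CPU time.
cpu_time_filter() {
    awk -v min="$1" '
        NR == 1 { print $2, $10, $11 }          # header row: always printed
        NR > 1  { split($10, t, ":")            # t[1] = minutes, t[2] = seconds
                  if (t[1] + 0 >= min + 0) print $2, $10, $11 }
    '
}

# Made-up sample data in "ps aux" layout.
cpu_time_filter 5 <<'EOF'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 7 0.0 0.0 0 0 ? S 18:53 0:02 [rcu_sched]
david 4242 71.3 1.2 90000 12000 ? R 18:55 12:30 ./busy
EOF
```

With the sample above, only the header and the ./busy row pass the filter, because 0:02 is below the 5 minute threshold.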

Linux amount of swap displayed by "free" is different from "smem" [closed]

Closed 5 years ago.
I am trying to work out where the swap usage comes from, and smem reports a completely different amount of swap usage than free.
free shows the following:
[root@server1 ~/smem-1.3]# free -k
total used free shared buffers cached
Mem: 24554040 24197360 356680 0 510200 14443128
-/+ buffers/cache: 9244032 15310008
Swap: 20980880 2473120 18507760
And smem shows:
PID User Command Swap USS PSS RSS
...
18829 oracle oracle_1 (LOCAL=NO) 0 3.9M 98.3M 10.1G
18813 oracle oracle_1 (LOCAL=NO) 0 3.9M 98.6M 10.1G
18809 oracle oracle_1 (LOCAL=NO) 0 4.1M 99.2M 10.0G
28657 oracle ora_lms0_1 56.0K 54.1M 100.3M 4.2G
29589 oracle ora_lms1_1 964.0K 69.7M 118.9M 4.5G
29886 oracle ora_dbw1_1 5.7M 20.8M 130.9M 10.2G
29857 oracle ora_dbw0_1 4.2M 22.6M 133.0M 10.3G
11075 ccm_user /usr/java/jre1.6/bin/java - 197.8M 133.9M 135.9M 140.7M
21688 bsuser /usr/local/java/bin/java -c 30.7M 145.1M 147.2M 152.1M
29930 oracle ora_lck0_1 2.3M 58.6M 169.8M 1.0G
29901 oracle ora_smon_1 0 78.0M 195.6M 4.3G
15604 oracle /var/oragrid/jdk/jre//bin/j 65.4M 253.9M 254.3M 262.2M
-------------------------------------------------------------------------------
359 10 678.8M 2.5G 13.5G 1.2T
Why does free show "2.4G" while smem shows only 679M? One of them must be showing a wrong result.
I need to find out where the remaining ~1.7G is, or prove that free is showing wrong results.
Last but not least, the kernel is 2.6.18.
Well, the main issue is RSS (resident set size) versus PSS (proportional set size). As http://www.selenic.com/smem/ says, "PSS instead measures each application's "fair share" of each shared area to give a realistic measure". RSS, on the other hand, overestimates by counting a shared memory area in full for every application that maps it. That is why you see the difference. Put simply, smem can apportion shared memory between applications rather than treating each shared area as every application's own.
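To cross-check a per-process swap figure by hand, the Swap: lines of /proc/<pid>/smaps can be summed. This is a minimal sketch, assuming the running kernel exposes Swap: entries in smaps (older kernels may not); the helper name sum_swap_kb is made up for illustration.

```shell
#!/bin/sh
# sum_swap_kb (hypothetical helper): add up the "Swap:" lines of
# smaps-formatted text on stdin and print the total in kB.
sum_swap_kb() {
    awk '$1 == "Swap:" { total += $2 } END { print total + 0 }'
}

# Usage for one process (run as root or as the owner of the process):
#   sum_swap_kb < /proc/18829/smaps
```

Summing this over all PIDs and comparing it with the Swap "used" row of free -k would show how much of the swap in use is attributed to running processes.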

an htop-like tool to display disk activity in linux [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am looking for a Linux command-line tool that would report the disk IO activity. Something similar to htop would be really cool. Has someone heard of something like that?
You could use iotop. It doesn't rely on a kernel patch and works with the stock Ubuntu kernel.
There is a package for it in the Ubuntu repos. You can install it using
sudo apt-get install iotop
nmon shows a nice display of disk activity per device. It is available for Linux.
+- Disk I/O --(/proc/diskstats)-- all data is Kbytes per second --+
|DiskName Busy Read WriteKB|0 |25 |50 |75 100| |
|sda 0% 0.0 127.9|> | |
|sda1 1% 0.0 127.9|> | |
|sda2 0% 0.0 0.0|> | |
|sda5 0% 0.0 0.0|> | |
|sdb 61% 385.6 9708.7|WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWR> | |
|sdb1 61% 385.6 9708.7|WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWR> | |
|sdc 52% 353.6 9686.7|WWWWWWWWWWWWWWWWWWWWWWWWWWR > | |
|sdc1 53% 353.6 9686.7|WWWWWWWWWWWWWWWWWWWWWWWWWWR > | |
|sdd 56% 359.6 9800.6|WWWWWWWWWWWWWWWWWWWWWWWWWWWW> | |
|sdd1 56% 359.6 9800.6|WWWWWWWWWWWWWWWWWWWWWWWWWWWW> | |
|sde 57% 371.6 9574.9|WWWWWWWWWWWWWWWWWWWWWWWWWWWWR> | |
|sde1 57% 371.6 9574.9|WWWWWWWWWWWWWWWWWWWWWWWWWWWWR> | |
|sdf 53% 371.6 9740.7|WWWWWWWWWWWWWWWWWWWWWWWWWWR > | |
|sdf1 53% 371.6 9740.7|WWWWWWWWWWWWWWWWWWWWWWWWWWR > | |
|md0 0% 1726.0 2093.6|>disk busy not available | |
+------------------------------------------------------------------+
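The figures in a display like this come from /proc/diskstats, where (counting from 1) field 3 is the device name, field 6 is sectors read and field 10 is sectors written. This rough sketch turns one snapshot into cumulative kilobytes, assuming the conventional 512-byte sector unit; the helper name diskstats_kb is made up for illustration.

```shell
#!/bin/sh
# diskstats_kb (hypothetical helper): print "name read_kB written_kB" for each
# line of /proc/diskstats-formatted input. Sectors are 512 bytes, so kB = sectors / 2.
diskstats_kb() {
    awk '{ printf "%s %.0f %.0f\n", $3, int($6 / 2), int($10 / 2) }'
}

# Cumulative totals since boot; sample twice and subtract to get a rate.
if [ -r /proc/diskstats ]; then
    diskstats_kb < /proc/diskstats
fi
```

Taking two snapshots a second apart and subtracting the totals gives a per-second rate comparable to the Read and WriteKB columns.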
It is not htop-like, but you could use atop. However, to display disk activity per process, it needs a kernel patch (available from the site). These kernel patches are now obsolete; an optional module is provided only for showing per-process network activity.
Use collectl which has extensive process I/O monitoring including monitoring threads.
Be warned that there are I/O counters for I/O being written to cache and I/O going to disk. collectl reports them separately. If you're not careful you can misinterpret the data. See http://collectl.sourceforge.net/Process.html
Of course, it shows a lot more than just process stats, because you'd want one tool to provide everything rather than a bunch of different ones that display everything in different formats, right?
