perl get linux cached memory number

I'm writing a perl script and I would really like to get the amount of cached memory currently being used on my linux box. When you run "free -m", you get this output:
             total       used       free     shared    buffers     cached
Mem:           496        322        173          0         33        106
-/+ buffers/cache:        183        312
Swap:         1023         25        998
The number under "cached" is the value I want. I've been using Linux::SysInfo, which helps me get a lot of useful information about my box, but it seems to be lacking cached memory. Does anyone know of another module or an elegant way in Perl to get the amount of cached memory on my machine? I know that I could get it by running this:
my $val = `free -m`;
And then running regex on val, but I'd prefer another solution if one exists. Thanks!

Running strace free -m shows that it reads /proc/meminfo:
open("/proc/meminfo", O_RDONLY) = 3
cat /proc/meminfo confirms that this contains the information you're looking for.

I am not sure if you only want a Perl solution, or if any command line solution will be acceptable. Just in case, here is a simple AWK solution:
free -m | awk '/^Mem:/{print $NF}'
that will print the number you are interested in.
You could assign it to a shell variable if necessary:
$ c_val=`free -m | awk '/^Mem:/{print $NF}'`
$ echo $c_val
will display the value to verify.
Explanation of awk command:
/^Mem:/ searches for a line that starts with the string Mem:. If it is found, awk prints the last item on that line, which is the number we are interested in. In awk the line is split into fields based on whitespace: $0 is the whole line, $1 the first field, $2 the second, and so on. The number of fields per line is given by the pre-defined awk NF variable, so we can access the last field on the line with $NF.
We could have also used this awk command:
awk 'NR==2{print $NF}'
which makes use of the pre-defined awk NR variable that contains the current line number. In this case we print the last item (field) on the 2nd line.

You can read it from /proc/meminfo:
perl -ne's/^Cached: *//&&print' /proc/meminfo
or, to get just the value in kB:
perl -anE'/^Cached/&&say$F[1]' /proc/meminfo
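If you would rather have this inside a script than shell out to free, here is a minimal sketch that parses /proc/meminfo into a hash, so Cached and the other fields are available by name (the kernel reports these values in kB):
#!/usr/bin/perl
use strict;
use warnings;

# Parse /proc/meminfo into a hash keyed by field name (Cached, MemFree, ...).
my %meminfo;
open my $fh, '<', '/proc/meminfo' or die "Cannot open /proc/meminfo: $!";
while (<$fh>) {
    $meminfo{$1} = $2 if /^(\w+):\s+(\d+)/;   # values are reported in kB
}
close $fh;

printf "Cached: %d kB (%d MB)\n", $meminfo{Cached}, $meminfo{Cached} / 1024;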

Related

Can we do this in perl?

I want the awk one-liner below to be translated to Perl. Is it possible?
awk '{ for(i=1;i<=NF;i++){if(i==NF){printf("%s\n",$NF);}else {printf("%s\t",$i)}}}' file.txt | awk 'NR > 1'
The first awk command removes the leading empty column and the next one removes the first line.
Below is the head of file.txt
#FILEOUTPUT
1 137442 2324
2326 139767 4169
6491 143936 94
The output I get from those commands is below:
1 137442 2324
2326 139767 4169
6491 143936 94
Thanks,
Karthic
@Alex got the usage of $. right (not a very common Perl idiom, though a useful one as we see), but they didn't handle the extra spaces correctly.
Awk is all about understanding what the fields are and then manipulating them, and as part of that it does a lot of whitespace canonicalization.
Perl, OTOH, usually doesn't involve itself in field separation and a lot of users like to do that themselves, but it does support this Awk behavior using the -a flag.
So a simple implementation of the above Awk line noise might look like this:
perl -anle 'print join("\t",@F) if $. > 1' file.txt
Explanation:
-a: trigger field separation using the default field separator (which works well in this case) or whatever -F says (like Awk).
-n : iterate over the input lines (same as what the outermost {} do in Awk). A common alternative is -p which would mean to iterate over the input lines and then print out whatever the line buffer has after running the code.
-l : When printing, add a new line at the end of the text (makes things like this slightly easier to work with)
-e : here's a script.
Then we just take the separated field array (@F) and join it. Often devs like to address individual fields with $F[<index>], but here we don't need a loop; we can just take the list as is and pass it to join().
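For completeness, here is the same logic written out as a plain script rather than a one-liner (a sketch only; it reads whatever file names you pass on the command line, e.g. perl script.pl file.txt):
#!/usr/bin/perl
use strict;
use warnings;

while (my $line = <>) {
    next if $. == 1;                 # skip the first line, like NR > 1 in Awk
    my @fields = split ' ', $line;   # awk-style split: leading blanks are ignored
    print join("\t", @fields), "\n";
}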

Fast string search in a very large file

What is the fastest method for finding lines in a file that contain a given string? I have a file containing the strings to search for. This small file (smallF) contains about 50,000 lines and looks like:
stringToSearch1
stringToSearch2
stringToSearch3
I have to search for all of these strings in a larger file (about 100 million lines). If any line in this larger file contains one of the search strings, the line is printed.
The best method I have come up with so far is
grep -F -f smallF largeF
But this is not very fast. With just 100 search strings in smallF it takes about 4 minutes. For over 50,000 search strings it will take a lot of time.
Is there a more efficient method?
I once noticed that using -E or multiple -e parameters is faster than using -f. Note that this might not be applicable to your problem, since you are searching for 50,000 strings in a larger file. However, I wanted to show you what can be done and what might be worth testing:
Here is what I noticed in detail:
I have a 1.2 GB file filled with random strings:
>ls -has | grep string
1,2G strings.txt
>head strings.txt
Mfzd0sf7RA664UVrBHK44cSQpLRKT6J0
Uk218A8GKRdAVOZLIykVc0b2RH1ayfAy
BmuCCPJaQGhFTIutGpVG86tlanW8c9Pa
etrulbGONKT3pact1SHg2ipcCr7TZ9jc
.....
Now I want to search for strings "ab", "cd" and "ef" using different grep approaches:
Using grep without flags, search one at a time:
grep "ab" strings.txt > m1.out
2,76s user 0,42s system 96% cpu 3,313 total
grep "cd" strings.txt >> m1.out
2,82s user 0,36s system 95% cpu 3,322 total
grep "ef" strings.txt >> m1.out
2,78s user 0,36s system 94% cpu 3,360 total
So in total the search takes nearly 10 seconds.
Using grep with the -f flag, with the search strings in search.txt:
>cat search.txt
ab
cd
ef
>grep -F -f search.txt strings.txt > m2.out
31,55s user 0,60s system 99% cpu 32,343 total
For some reason this takes nearly 32 seconds.
Now using multiple search patterns with -e
grep -E "ab|cd|ef" strings.txt > m3.out
3,80s user 0,36s system 98% cpu 4,220 total
or
grep --color=auto -e "ab" -e "cd" -e "ef" strings.txt > /dev/null
3,86s user 0,38s system 98% cpu 4,323 total
The third method, using -E, only took 4.22 seconds to search through the file.
Now let's check whether the results are the same:
cat m1.out | sort | uniq > m1.sort
cat m3.out | sort | uniq > m3.sort
diff m1.sort m3.sort
#
The diff produces no output, which means the found results are the same.
You may want to give it a try; otherwise I would advise you to look at the thread "Fastest possible grep" (see the comment from Cyrus).
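If you want to try the alternation route with the full smallF rather than a handful of strings, you don't have to build the pattern by hand. Here is a Perl sketch of the same idea (file names taken from the question; quotemeta guards against regex metacharacters in the search strings, and perl 5.10+ compiles an alternation of plain strings into a trie, which keeps it reasonably fast):
#!/usr/bin/perl
use strict;
use warnings;

# Read the search strings and join them into a single alternation.
open my $small, '<', 'smallF' or die "smallF: $!";
chomp(my @strings = <$small>);
close $small;

my $pattern = join '|', map { quotemeta } grep { length } @strings;
my $re = qr/$pattern/;

# Scan the large file once, printing every line that contains any of the strings.
open my $large, '<', 'largeF' or die "largeF: $!";
while (my $line = <$large>) {
    print $line if $line =~ $re;
}
close $large;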
You may want to try sift or ag. Sift in particular lists some pretty impressive benchmarks versus grep.
Note: I realise the following is not a bash based solution, but given your large search space, a parallel solution is warranted.
If your machine has more than one core/processor, you could compile the following function with Pythran to parallelize the search:
#!/usr/bin/env python

#pythran export search_in_file(str, str)
def search_in_file(long_file_path, short_file_path):
    # Read the large file once so each search is a simple substring test.
    _long = open(long_file_path, "r").read()

    #omp parallel for schedule(guided)
    for _string in open(short_file_path, "r"):
        if _string.strip() in _long:    # strip the trailing newline before testing
            print(_string.strip())

if __name__ == "__main__":
    search_in_file("long_file_path", "short_file_path")
Note: Behind the scenes, Pythran takes Python code and attempts to aggressively compile it into very fast C++.

renaming files using loop in unix

I have a situation here.
I have a lot of files like the ones below in Linux:
SIPTV_FIPTV_ID00$line_T20141003195717_C0000001000_FWD148_IPV_001.DATaac
SIPTV_FIPTV_ID00$line_T20141003195717_C0000001000_FWD148_IPV_001.DATaag
I want to remove the $line and put a counter in its place, from 0001 to 6000 for my 6000 such files.
I also want to remove the trailing 3 characters from each file name after this is done.
After the fix the files should look like:
SIPTV_FIPTV_ID0000001_T20141003195717_C0000001000_FWD148_IPV_001.DAT
SIPTV_FIPTV_ID0000002_T20141003195717_C0000001000_FWD148_IPV_001.DAT
Please help.
With some assumptions, I think this should do it:
1. list of the files is in a file named input.txt, one file per line
2. the code is running in the directory the files are in
3. bash is available
awk '{i++;printf "mv \x27"$0"\x27 ";printf "\x27"substr($0,1,16);printf "%05d", i;print substr($0,22,47)"\x27"}' input.txt | bash
From the command prompt, give the following command:
% printf '%s\n' *.DAT??? | awk '{
  old=$0;
  sub("\\$line",sprintf("%4.4d",++n));
  sub("...$","");
  print "mv \x27" old "\x27 \x27" $1 "\x27"}'
%
and check the output. If it looks OK, pipe the same command to sh:
% printf '%s\n' *.DAT??? | awk '{
  old=$0;
  sub("\\$line",sprintf("%4.4d",++n));
  sub("...$","");
  print "mv \x27" old "\x27 \x27" $1 "\x27"}' | sh
%
A commentary: printf '%s\n' *.DAT??? is meant to give awk the list of filenames you want to modify, one per line; you may want something more elaborate if the example names you gave aren't representative of the whole set. Regarding the awk script itself, I used sprintf to generate a string with the correct number of zeroes to replace $line, and the names are wrapped in single quotes (\x27) in the generated mv commands so the shell doesn't try to expand the literal $line as a variable. The idiom "\\$..." with two backslashes to quote the dollar sign is required by gawk and does no harm in mawk. As a last remark, in similar cases I prefer to do at least a dry run before passing the commands to the shell.
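Since the surrounding thread is Perl-oriented, here is a Perl sketch of the same rename (my own, not from either answer above): it assumes the files sit in the current directory and match the name pattern from the question, uses a 5-digit counter so the result matches the expected output, and only prints the mv commands until you uncomment the rename call.
#!/usr/bin/perl
use strict;
use warnings;

opendir my $dh, '.' or die "Cannot read current directory: $!";
my @files = sort grep { /^SIPTV_FIPTV_ID00\$line_.*\.DAT...$/ } readdir $dh;
closedir $dh;

my $n = 0;
for my $old (@files) {
    my $new = $old;
    $new =~ s/\$line/sprintf '%05d', ++$n/e;   # replace the literal $line with a 5-digit counter
    $new =~ s/...$//;                          # drop the trailing 3 characters
    print "mv '$old' '$new'\n";                # dry run: inspect the commands first
    # rename $old, $new or warn "Cannot rename $old: $!";
}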

How to do something like grep -B to select only one line?

Everything is in the title. Basically, let's say I have this text:
some text lalala
another line
much funny wow grep
I grep for "funny" and I want my output to be "lalala".
Thank you
One possible answer is to use either ed or ex to do this (it is trivial in them):
ed - yourfile <<< 'g/funny/.-2p'
(Or replace ed with ex. You might have red, the restricted editor, too; it can't modify files.) This looks for the pattern /funny/ globally, and whenever it is found, prints the line 2 before the matching line (that's the .-2p part). Or, if you want the most recent line containing 'lalala' before the line matching 'funny':
ed - yourfile <<< 'g/funny/?lalala?p'
The only problem is if you're trying to process standard input rather than a file; then you have to save the standard input to a file and process that file, which spoils the concurrency.
You can't do negative offsets in sed (though GNU sed allows you to do positive offsets, so you could use sed -n '/lalala/,+2p' file to get the 'lalala' to 'funny' lines (which isn't quite what you want) based on finding 'lalala', but you cannot find the 'lalala' lines based on finding 'funny'). Standard sed does not allow offsets at all.
If you need to print just the IP address found on a line 8 lines before the pattern-matching line, you need a slightly more involved ed script, but it is still doable:
ed - yourfile <<< 'g/funny/.-8s/.* //p'
This uses the same basic mechanism to find the right line, then runs a substitute command to remove everything up to the last space on the line and print the modified version. Since there isn't a w command, it doesn't actually modify the file.
Since grep -B N prints all N lines before each match, you'll have to pipe the output into something like grep or Awk to keep just the line you want.
grep -B 2 "funny" file|awk 'NR==1{print $NF; exit}'
You could also just use Awk.
awk -v s="funny" '/[[:space:]]lalala$/{n=NR+2; o=$NF}NR==n && $0~s{print o}' file
For the specific example of an IP address 8 lines before the match as mentioned in your comment:
awk -v s="funny" '
  /[[:space:]][0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$/ {
    n=NR+8
    ip=$NF
  }
  NR==n && $0~s {
    print ip
  }' file
These Awk solutions first find the output field you might want, then print the output only if the word you want exists in the nth following line.
Here's an attempt at a slightly generalized Awk solution. It maintains a circular queue of the last q lines and prints the line at the head of the queue when it sees a match.
#!/bin/sh
: ${q=8}
e=$1
shift
awk -v q="$q" -v e="$e" '{ m[(NR%q)+1] = $0 }
$0 ~ e { print m[((NR+1)%q)+1] }' "${@--}"
Adapting to a different default (I set it to 8) or proper option handling (currently, you'd run it like q=3 ./qgrep regex file) as well as remembering (and hence printing) the entire line should be easy enough.
(I also didn't bother to make it work correctly if you see a match in the first q-1 lines. It will just print an empty line then.)
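And a Perl sketch of the same circular-buffer idea (my own; the script name and the argument handling are made up for the example): keep the last OFFSET lines in an array and print the oldest one whenever the pattern matches.
#!/usr/bin/perl
# Usage: perl before.pl PATTERN OFFSET [FILE...]
use strict;
use warnings;

die "usage: $0 PATTERN OFFSET [FILE...]\n" if @ARGV < 2;
my $pattern = shift;
my $offset  = shift;

my @buf;
while (my $line = <>) {
    # @buf holds the previous $offset lines; $buf[0] is the oldest of them.
    print $buf[0] if @buf == $offset && $line =~ /$pattern/;
    push @buf, $line;
    shift @buf if @buf > $offset;
}
For the example above, perl before.pl funny 2 file prints the whole "some text lalala" line; pull a field out of it if you only want part. Like the Awk version, it prints nothing for a match within the first OFFSET lines.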

My bash script uses so much memory

I was looking for which program was using my memory, i.e. where the leak is.
And I found it: the leak is in a bash script.
But how is that possible? Does a bash script always allocate new space for each variable assignment?
My bash script is like the following; please let me know how I can correct this problem.
CONF="/conf/my.cfg"
HIGHRES="/data/high.dat"
getPeriod()
{
    meas=`head -n 1 $CONF`
    statperiod=`echo $meas`
}

(while true
do
    lastline=`tail -n 1 $HIGHRES | cut -d"," -f2`
    linenumber=`grep -n $lastline $HIGHRES | cut -f1 -d:`
    /bin/stat $linenumber
    getPeriod
    sleep $statperiod
done)
EDIT #1:
The last line of high.dat
2013-02-11,10:59:13,1,0,0,0,0,0,0,0,0,12.340000,0.330000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,24.730000,24.709990,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
I was unable to verify a memory leak with a close approximation of that script, so maybe the leak isn't actually where you think it is. Consider updating your question with much more info, including a complete working example along with what you did to figure out that you had a memory leak.
That said, you have chosen quite an odd way to find out how many lines a file has. The most usual way would be to use the standard wc tool:
$ wc -l < test.txt
19
$
Note: Use < file instead of passing the file name, since the latter will cause the file name to be written to stdout, and you'll then have to edit it away:
$ wc -l test.txt
19 test.txt
$
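And since the rest of this thread is Perl-heavy: if you ever need that line count from Perl rather than from the shell, the classic idiom uses $. in an END block (test.txt as in the example above):
perl -ne 'END { print "$.\n" }' test.txt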
