Possible names for a process in Linux? - linux

I am trying to write a script in which I read from /proc/.../stat. One of the values in the space separated list is the name of the process, which does not interest me for the time being. I would like to read some other value after it. My idea was to move forward a certain number of values using spaces as the separator. A potential problem with this though is that I could have /proc/.../stat containing something like 1234 (asdf asdf) S .... The space in the process name would cause the program to read asdf) instead of S as intended.
So my question is can the process name have spaces in it? If so how could I differentiate between the values in /proc/.../stat?

I, personally, hate the way this file is laid out for precisely the reason you stated. With that said, it is possible to parse it unambiguously no matter what the process name is. This is important, because not only may the process name contain spaces, it may also contain the close-bracket character.
The method I suggest is to manually parse out the process name, and use space delimiting for everything else.
The process name should be defined as starting at the first open-bracket character on the line and ending at the last close-bracket character on the line. Since the other fields on the line don't have a user-controlled format, this should reliably single the process name out, no matter how weirdly the process is named.
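As a rough illustration of that approach, here is a minimal Python sketch. The function name parse_stat_line and the sample line are made up for this example; the only rule taken from the answer is "first open bracket to last close bracket".

```python
def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, name, remaining fields)."""
    open_idx = line.index("(")     # the name starts after the first "("
    close_idx = line.rindex(")")   # the name ends before the last ")"
    pid = int(line[:open_idx].strip())
    name = line[open_idx + 1:close_idx]
    # Everything after the name has no user-controlled content,
    # so plain whitespace splitting is safe from here on.
    rest = line[close_idx + 1:].split()
    return pid, name, rest


# Hypothetical line whose process name contains both a space and a ")".
sample = "1234 (asdf) asdf) S 1 1234 1234 0 -1"
pid, name, rest = parse_stat_line(sample)
print(pid, repr(name), rest[0])
```

With this split, rest[0] is the state field, rest[1] the parent PID, and so on, regardless of what the name contains.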

Related

How to make it so my code remembers what is has written in a text file?

Hello python newbie here.
I have code that prints names into a text file. It takes the names from a website. And on that website, there may be multiple same names. It filters them perfectly without an issue into one name by looking if the name has already written in the text file. But when I run the code again it ignores the names that are already in the text file. It just filters the names it has written on the same session. So my question is how do I make it remember what it has written.
kaupan_nimi = driver.find_element_by_xpath("//span[@class='store_name']").text
with open("mainostetut_yritykset.txt", "r+") as tiedosto:
    if kaupan_nimi in tiedosto:
        print("\033[33mNimi oli jo tiedostossa\033[0m")
    else:
        print("\033[32mUusi asiakas vahvistettu!\033[0m")
        # Writes the company name to the text file
        tiedosto.seek(0)
        data = tiedosto.read(100)
        if len(data) > 0:
            tiedosto.write("\n")
        tiedosto.write(kaupan_nimi)
There is the code that I think is the problem. Please correct me if I am wrong.
There are two main issues with your current code.
The first is that you are likely only going to be able to detect duplicated names if they are back to back. That is, if the prior name that you're seeing again was the very last thing written into the file. That's because all the lines in the file except the last one will have newlines at the end of them, but your names do not have newlines. You're currently looking for an exact match for a name as a line, so you'll only ever have a chance to see that with the last line, since it doesn't have a newline yet. If the list of names you are processing is sorted, the duplicates will naturally be clumped together, but if you add in some other list of names later, it probably won't pick up exactly where the last list left off.
The second issue in your code is that it will tend to clobber anything that gets written more than 100 characters into the file, starting every new line at that point, once it starts filling up a bit.
Let's look at the different parts of your code:
if kaupan_nimi in tiedosto:
This is your duplicate check, it treats the file as an iterator and reads each line, checking if kaupan_nimi is an exact match to any of them. This will always fail for most of the lines in the file because they'll end with "\n" while kaupan_nimi does not.
I would suggest instead reading the file only once per batch of names, and keeping a set of names in your program's memory that you can check your names-to-be-added against. This will be more efficient, and won't require repeated reading from the disk, or run into newline issues.
tiedosto.seek(0)
data = tiedosto.read(100)
if len(data) > 0:
    tiedosto.write("\n")
This code appears to be checking if the file is empty or not. However, it always leaves the file position just past character 100 (or at the end of the file if there were fewer than 100 characters in it so far). You can probably fit several names in that first 100 characters, but after that, you'll always end up with the names starting at index 100 and going on from there. This means you'll get names written on top of each other.
If you take my earlier advice and keep a set of known names, you could check that set to see if it is empty or not. This doesn't require doing anything to the file, so the position you're operating on it can remain at the end all of the time. Another option is to always end every line in the file with a newline so that you don't need to worry about whether to prepend a newline only if the file isn't empty, since you know that at the end of the file you'll always be writing a fresh line. Just follow each name with a newline and you'll always be doing the right thing.
Here's how I'd put things together:
# if possible, do this only once, at the start of the website reading procedure:
with open("mainostetut_yritykset.txt", "r+") as tiedosto:
    known_names = set(name.strip() for name in tiedosto) # names already in the file

    # do the next parts in some kind of loop over the names you want to add
    for kaupan_nimi in something():
        if kaupan_nimi in known_names: # duplicate found
            print("\033[33mNimi oli jo tiedostossa\033[0m")
        else: # not a duplicate
            print("\033[32mUusi asiakas vahvistettu!\033[0m")
            tiedosto.write(kaupan_nimi) # write out the name
            tiedosto.write("\n") # and always add a newline afterwards
            # alternatively, if you can't have a trailing newline at the end, use:
            # if known_names:
            #     tiedosto.write("\n")
            # tiedosto.write(kaupan_nimi)
            known_names.add(kaupan_nimi) # update the set of names
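To see the "remember across runs" behaviour in isolation, here is a small self-contained sketch of the same idea. It is a variation rather than the exact code above: it opens the file once for reading and once in append mode instead of using "r+", and the function name add_new_names, the temporary file and the sample names are all invented for the demonstration.

```python
import os
import tempfile


def add_new_names(path, names):
    """Append names that are not in the file yet; return the names written."""
    # Load what earlier runs wrote; a missing file means nothing is known.
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            known = {line.strip() for line in f if line.strip()}
    else:
        known = set()

    written = []
    # Append mode always writes at the end, so nothing gets overwritten.
    with open(path, "a", encoding="utf-8") as f:
        for name in names:
            if name in known:
                continue              # duplicate, skip it
            f.write(name + "\n")      # every line ends with a newline
            known.add(name)
            written.append(name)
    return written


# Two calls stand in for two separate runs of the script.
path = os.path.join(tempfile.mkdtemp(), "names.txt")
first_run = add_new_names(path, ["Alpha", "Beta", "Alpha"])
second_run = add_new_names(path, ["Beta", "Gamma"])
print(first_run)
print(second_run)
```

The second call skips "Beta" because it is read back from the file, which is the behaviour the question asks for.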

searching elements of list in file

The list is named disks and is shown below:
disks
['5000cca025884d5\n', '5000cca025a1ee6\n']
The file is named p and is shown below:
c0t5000CCA025884D5Cd0 solaris
/scsi_vhci/disk@g5000cca025884d5c
c0t5000CCA025A1EE6Cd0
/scsi_vhci/disk@g5000cca025a1ee6c
c3t50060E8007DB981Ad1
/pci@400/pci@1/pci@0/pci@8/SUNW,emlxs@0/fp@0,0/ssd@w50060e8007db981a,1
c3t50060E8007DB981Ad2
/pci@400/pci@1/pci@0/pci@8/SUNW,emlxs@0/fp@0,0/ssd@w50060e8007db981a,2
c3t50060E8007DB981Ad3
/pci@400/pci@1/pci@0/pci@8/SUNW,emlxs@0/fp@0,0/ssd@w50060e8007db981a,3
c3t50060E8007DB981Ad4
I want to search for the elements of the list in the file.
There are a couple of things to look at here:
I haven't actually used re.match() before, but I can see the first issue: your list of disks has a newline character after every entry, so that will mess up matches. Also, re.match() only matches at the start of the line. Your disk IDs appear after a prefix such as c0t rather than at the start of the line, so you need to search within the line, using re.search(). Finally, you should make the comparison case-insensitive; one option to do this is to make everything lowercase, just as your disks list is.
Try adapting your loop like so:
# .strip() will get rid of newlines and .lower() will make the string lowercase
for line in q:
    if re.search(disks[0].strip(), line.lower()):
        print(line)
If that doesn't fix it, I would try making it print out disks[0].strip() and line for every iteration of the loop (not just when it matches the if clause) to make sure it's reading in what you think it is.
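The loop above only checks disks[0]. A possible way to check every entry of the list at once is sketched below; the helper name find_matches is made up, the sample lines are copied from the question, and re.IGNORECASE is used instead of lowercasing each line.

```python
import re

disks = ['5000cca025884d5\n', '5000cca025a1ee6\n']
lines = [
    "c0t5000CCA025884D5Cd0 solaris",
    "c0t5000CCA025A1EE6Cd0",
    "c3t50060E8007DB981Ad1",
]


def find_matches(disks, lines):
    """Return the lines that contain any disk ID, ignoring case."""
    ids = [d.strip() for d in disks if d.strip()]   # drop newlines and blanks
    if not ids:
        return []
    # re.escape guards against IDs that contain regex metacharacters.
    pattern = re.compile("|".join(re.escape(i) for i in ids), re.IGNORECASE)
    return [line for line in lines if pattern.search(line)]


matches = find_matches(disks, lines)
for line in matches:
    print(line)
```

When reading from a real file, lines can simply be the open file object, since it yields one line per iteration.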

Fortran: odd space-padding string behavior when opening files

I have a Fortran program which reads data from a bunch of input files. The first file contains, among other things, the names of three other files that I will read from, specified in the input file (which I redirect to stdin at execution of the program) as follows
"data/file_1.dat" "data/file2.dat" "data/file_number_3.txt"
They're separated by regular spaces and there's no trailing spaces on the line, just a line break. I read the file names like this:
character*30 fnames(3)
read *, fnames
and then I proceed to read the data, through calling on a function which takes the file name as parameter:
subroutine read_from_data_file(fname)
    implicit none
    character*(*) fname
    open(15,file=fname)
    ! read some data
end subroutine read_from_data_file
! in the main program:
do i=1,3
    call read_from_data_file(trim(fnames(i)))
end do
For the third file, regardless of the order in which I put the file names in the input file, the padding isn't stripped and Fortran tries to open a file with a name like "data/file_number_3.txt ", i.e. with a bunch of trailing spaces. This creates an empty file named data/file_number_3.txt (White Space Conflict) in my folder, and as soon as I try to read from the file the program crashes with an EOF error.
I've tried adding trim() in various places, e.g. open(15,file=trim(fname)), without any success. I assume it has something to do with the fixed length of character arrays in Fortran, but I thought trim() would take care of that. Is that assumption incorrect?
How do I troubleshoot and fix this?
Hmmm. I wonder if there is a final character on the last line of your input file which is not whitespace, such as an EOF marker from a Linux system popping up on a Windows system or vice-versa. Try, if you are on a Linux box, dos2unix; on a Windows box try something else (I'm not sure what).
If that doesn't work, try using the intrinsic IACHAR function to examine each individual character in the misbehaving string and examine the entrails.
Like you, I expect trim to trim trailing whitespace from a string, but not all the characters which are not displayed are regarded as whitespace.
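If it is easier to inspect the input file from outside the Fortran program, the same character-by-character check can be sketched in Python. The helper name describe_odd_chars is invented, and the sample line assumes, purely for illustration, that the input ends with a Windows-style carriage return plus line feed.

```python
def describe_odd_chars(text):
    """Return (index, code point) for every character outside printable ASCII."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if not 32 <= ord(ch) <= 126]


# Hypothetical input line with a CR LF line ending.
sample = '"data/file_1.dat" "data/file2.dat" "data/file_number_3.txt"\r\n'
odd = describe_odd_chars(sample)
print(odd)
```

A code point of 13 (carriage return) just before the final 10 (line feed) would point to a line-ending mismatch. For a real file, read it in binary mode or with open(path, newline="") so that Python does not translate the line endings before you look at them.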
And, while I'm writing, your use of declarations such as
character*30
is obsolescent, the modern alternative is
character(len=30)
and
character(len=*)
is preferred to
character*(*)
EDIT
Have you tried both reading those names from a file and reading them from stdin?

Decrypt obfuscated perl script

I had some spam issues on my server and, after finding and removing some Perl and PHP scripts, I'm down to checking what they really do. Although I'm a senior PHP programmer, I have little experience with Perl. Can anyone give me a hand with the script here:
http://pastebin.com/MKiN8ifp
(It was one long line of code, script was called list.pl)
The start of the script is:
$??s:;s:s;;$?::s;(.*); ]="&\%[=.*.,-))'-,-#-*.).<.'.+-<-~-#,~-.-,.+,~-{-,.<'`.{'`'<-<--):)++,+#,-.{).+,,~+{+,,<)..})<.{.)-,.+.,.)-#):)++,+#,-.{).+,,~+{+,,<)..})<*{.}'`'<-<--):)++,+#,-.{).+:,+,+,',~+*+~+~+{+<+,)..})<'`'<.{'`'<'<-}.<)'+'.:*}.*.'-|-<.+):)~*{)~)|)++,+#,-.{).+:,+,+,',~+*+~+~+{+<+,)..})
It continues with precious few non-punctuation characters until the very end:
0-9\;\\_rs}&a-h;;s;(.*);$_;see;
Replace the s;(.*);$_;see; with print to get this. Replace s;(.*);$_;see; again with print in the first half of the payload to get this, which is the decryption code. The second half of the payload is the code to decrypt, but I can't go any further with it, because as you see, the decryption code is looking for a key in an envvar or a cookie (so that only the script's creator can control it or decode it, presumably), and I don't have that key. This is actually reasonably cleverly done.
For those interested in the nitty gritty... The first part, when de-tangled looks like this:
$? ? s/;s/s;;$?/ :
s/(.*)/...lots of punctuation.../;
The $? at the beginning of the line is the pre-defined variable containing the child error, which no doubt serves only as obfuscation. It will be false, as there can be no child error at this point.
The question mark following it is the start of a ternary operator
CONDITION ? IF_TRUE : IF_FALSE
Which is also added simply to obfuscate. The expression returned for true is a substitution regex, where the / slash delimiter has been replaced with a colon: s:pattern:replacement:. Above, I have put the slashes back. The other expression, which is the one that will be executed, is also a substitution regex, albeit an incredibly long one. Its delimiter is a semi-colon.
This substitution replaces .* in $_ - the default input and pattern-searching space - with a rather large amount of punctuation characters, which represents the bulk of the code. Since .* matches any string, even the empty string, it will simply get inserted into $_, and is for all intents and purposes identical to simply assigning the string to $_, which is what I did:
$_ = q;]="&\%[=.*.,-))'-,-# .......;;
The following lines are a transliteration and another substitution. (I inserted comments to point out the delimiters)
y; -"[%-.:<-#]-`{-}#~\$\\;{\$()*.0-9\;\\_rs}&a-h;;
#^ ^ ^ ^
#1 2 3
(1,2,3 are delimiters, the semi-colon between 2 and 3 is escaped)
The basic gist of it is that various characters and ranges, such as -" (space to double quote), and something that looks like character classes with ranges, [%-.:<-#], but isn't, get transliterated into more legible characters, e.g. curly braces, dollar sign, parentheses, 0-9, etc.
s;(.*);$_;see;
The next substitution is where the magic happens. It is also a substitution with obfuscated delimiters, but with three modifiers: see. The s does nothing in this case, as it only allows the wildcard character . to match a newline. The ee, however, means that the replacement is evaluated twice.
In order to see what I was evaluating, I performed the transliteration and printed the result. I suspect that I somewhere along the line got some characters corrupted, because there were subtle errors, but here's the short (cleaned up) version:
s;(.*);73756220656e6372797074696f6e5f6 .....;; # very long line of alphanumerics
s;(..);chr(hex($1));eg;
s;(.*);$_;see;
s;(.*);704b652318371910023c761a3618265 .....;; # another long line
s;(..);chr(hex($1));eg;
&e_echr(\$_);
s;(.*);$_;see;
The long regexes are once again the data containers, and insert data into $_ to be evaluated as code.
The s/(..)/chr(hex($1))/eg; is starting to look rather legible. It basically reads two characters at a time from $_ and converts each pair from hex to the corresponding character.
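For readers more at home outside Perl, the same pairwise hex decoding can be sketched in Python. The input below is only the visible start of the first data line quoted above, cut back to a whole number of two-character pairs; the rest of that line is elided in the post and is not reconstructed here.

```python
import re

# Visible prefix of the first long hex line, trimmed to complete pairs.
hex_prefix = "73756220656e6372797074696f6e5f"

# Rough equivalent of Perl's s/(..)/chr(hex($1))/eg
decoded = re.sub(r"(..)", lambda m: chr(int(m.group(1), 16)), hex_prefix)
print(decoded)
```

Even this short prefix decodes to the beginning of a Perl subroutine definition, which fits the reading that the long lines are data containers holding code.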
The next to last line &e_echr(\$_); stumped me for a while, but it is a subroutine that is defined somewhere in this evaluated code, as hobbs so aptly was able to decode. The dollar sign is prefixed by backslash, meaning it is a reference to $_: I.e. that the subroutine can change the global variable.
After quite a few evaluations, $_ is run through this subroutine, after which whatever is contained in $_ is evaluated a last time. Presumably this time executing the code. As hobbs said, a key is required, which is taken from the environment %ENV of the machine where the script runs. Which we do not have.
Ask the B::Deparse module to make it (a little more) readable.

How do you delete everything but a specific pattern in Vim?

I have an XML file where I only care about the size attribute of a certain element.
First I used
global!/<proto name="geninfo"/d
to delete all lines that I don't care about. That leaves me a whole bunch of lines that look like this:
<proto name="geninfo" pos="0" showname="General information" size="174">
I want to delete everything but the value for "size."
My plan was to use substitute to get rid of everything not matching 'size="[digit]"', then remove the string 'size' and the quotes, but I can't figure out how to substitute the negation of a string.
Any idea how to do it, or ideas on a better way to achieve this? Basically I want to end up with a file with one number (the size) per line.
You can use matching groups:
:%s/^.*size="\([0-9]*\)".*$/\1/
This will replace lines that contain size="N" by just N and not touch other lines.
Explanation: this will look for a line that contains some arbitrary characters, then somewhere the string size=", then digits, then ", then some more arbitrary characters, then the end of the line. What I did is wrap the digits in (escaped) parentheses. That creates a group. In the second part of the search-and-replace command, I essentially say "I want to replace the whole line with just the contents of that first group" (referred to as \1).
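The same capture-group idea carries over to other regex engines. As a hedged illustration outside Vim, here is the substitution applied in Python to the sample line from the question; note that Python groups use unescaped parentheses.

```python
import re

line = '<proto name="geninfo" pos="0" showname="General information" size="174">'

# Match the whole line, capture only the digits, replace the line with group 1.
size = re.sub(r'^.*size="([0-9]*)".*$', r'\1', line)
print(size)
```

As in the Vim command, a line that does not contain size="..." is left unchanged, because the pattern does not match it.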
:v:size="\d\+":d|%s:.*size="\([^"]\+\)".*:\1:
The first command (up to the |) deletes every line which does not match the size="<SOMEDIGIT(S)>" pattern; the second (%s...) removes everything before and after the value of the size attribute (the surrounding " characters are removed as well).
HTH
