if string not in line - string

I have a list with a bunch of lines (created from the output of 'show cdp neighbors detail'
on a Cisco switch or router).
I want to perform some processing but only on lines that contain one of the following four strings:
important = (
    "show cdp neighbors detail",
    "Device ID: ",
    "IP address: ",
    "Platform: "
)
Right now my iterations work, but I am processing a large number of lines that end up doing nothing. I simply want to skip over the unimportant lines and do my processing only on the four kinds of lines that contain the data I need.
What I want to do (in English) is:
for line in cdp_data_line:
    if the line does not contain any of the four strings in 'important', then continue (skip over the line).
After skipping the unwanted lines, I can test which of the four lines I have left and process accordingly:
    if none of the four strings in 'important' is found in the line, then continue
    if any of those four strings is found in the line, then process:
    elif 'show cdp neighbors detail' in line:
        do something
    elif 'Device ID: ' in line:
        do something
    elif 'IP address: ' in line:
        do something
    elif 'Platform: ' in line:
        do something
    else:
        sys.exit("Invalid prompt for local hostname - exiting")
I am trying to do it like this:
if any(s in line for line in cdp_data_line for s not in important):
    continue
PyCharm says it doesn't know what 's' is, even though I found code snippets
using this technique that work in PyCharm. Not sure why mine is failing.
There has got to be a succinct way of saying: if none of string1, string2, string3, or string4 is in the line, then skip it (continue); otherwise,
if I find any of those four strings, do something.
Any help would be greatly appreciated.
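The attempt above is rejected because every clause of a comprehension has to read `for name in iterable`; `for s not in important` is not valid Python, which is why PyCharm cannot resolve `s`. A minimal sketch of the intended filter follows, assuming `cdp_data_line` is a list of strings. The sample lines and the `handled` list are made up for illustration and stand in for the real data and processing.

```python
important = (
    "show cdp neighbors detail",
    "Device ID: ",
    "IP address: ",
    "Platform: ",
)

# Made-up sample input; real 'show cdp neighbors detail' output will differ.
cdp_data_line = [
    "switch1#show cdp neighbors detail",
    "-------------------------",
    "Device ID: switch2.example.com",
    "  IP address: 10.0.0.2",
    "Platform: cisco WS-C2960, Capabilities: Switch",
    "Interface: GigabitEthernet0/1",
]

handled = []  # hypothetical stand-in for the real per-line processing
for line in cdp_data_line:
    # Skip any line that contains none of the four strings.
    if not any(s in line for s in important):
        continue
    if "show cdp neighbors detail" in line:
        handled.append(("prompt", line))
    elif "Device ID: " in line:
        handled.append(("device", line.split("Device ID: ", 1)[1]))
    elif "IP address: " in line:
        handled.append(("ip", line.split("IP address: ", 1)[1]))
    elif "Platform: " in line:
        handled.append(("platform", line.split("Platform: ", 1)[1]))

print(handled)
```

Because the `any(...)` test has already discarded every other line, the final `else: sys.exit(...)` branch from the pseudocode can no longer be reached and is left out here.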

Related

How do I search for a substring in a string then find the character before the substring in python

I am making a small project in Python that lets you make notes and then read them by using specific arguments. I attempted to write an if statement that checks whether the string has a comma in it; if it does, my Python file should find the comma, then find the character right before that comma and turn it into an integer, so it can read out the notes the user created in a specific user-defined range.
If that didn't make sense, basically all I am saying is that I want to find out which line of code is causing this to not work and return nothing even though notes.txt has content.
Here is what I have in my python file:
if "," not in no_cs:  # no_cs is the string I am searching through
    user_out = int(no_cs[6:len(no_cs) - 1])
    notes = open("notes.txt", "r")  # notes.txt is the file that stores all the notes the user makes
    notes_lines = notes.read().split("\n")  # this is supposed to split all the notes into a list
    try:
        print(notes_lines[user_out])
    except IndexError:
        print("That line does not exist.")
    notes.close()
elif "," in no_cs:
    user_out_1 = int(no_cs.find(',') - 1)
    user_out_2 = int(no_cs.find(',') + 1)
    notes = open("notes.txt", "r")
    notes_lines = notes.read().split("\n")
    print(notes_lines[user_out_1:user_out_2])  # this is SUPPOSED to list all notes in a specific range but doesn't
    notes.close()
Now here is the notes.txt file:
note
note1
note2
note3
And lastly, here is what I get in the console when I run the program and type notes(0,2):
>>> notes(0,2)
jeffv : notes(0,2)
[]
A great way to do this is to use Python's str.partition() method. It splits a string at the first occurrence of a separator and returns a tuple of three parts: (0) the text before the separator, (1) the separator itself, and (2) the text after the separator:
# The whole string we wish to search.. Let's use a
# Monty Python quote since we are using Python :)
whole_string = "We interrupt this program to annoy you and make things \
generally more irritating."
# Here is the first word we wish to split from the entire string
first_split = 'program'
# now we use partition to pick what comes after the first split word
substring_split = whole_string.partition(first_split)[2]
# now we use python to give us the first character after that first split word
first_character = str(substring_split)[0]
# since the above is a space, let's also show the second character so
# that it is less confusing :)
second_character = str(substring_split)[1]
# Output
print("Here is the whole string we wish to split: " + whole_string)
print("Here is the first split word we want to find: " + first_split)
print("Now here is the first word that occurred after our split word: " + substring_split)
print("The first character after the substring split is: " + first_character)
print("The second character after the substring split is: " + second_character)
Output:
Here is the whole string we wish to split: We interrupt this program to annoy you and make things generally more irritating.
Here is the first split word we want to find: program
Now here is the first word that occurred after our split word: to annoy you and make things generally more irritating.
The first character after the substring split is:
The second character after the substring split is: t
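Applied to the question itself, the same method can pull the two numbers out of the command. In the original code, no_cs.find(',') returns the position of the comma inside the command (7 for notes(0,2)), so the slice becomes notes_lines[6:8], which is empty for a four-line file. A sketch, assuming commands always look like notes(N) or notes(N,M); `parse_range` is a hypothetical helper, and the end index is treated as inclusive:

```python
def parse_range(command):
    """Return (start, end) from 'notes(N)' or 'notes(N,M)'."""
    # Keep what lies between the first "(" and the last ")".
    inside = command.partition("(")[2].rpartition(")")[0]
    first, comma, second = inside.partition(",")
    start = int(first)
    # Without a comma, the range is a single line.
    end = int(second) if comma else start
    return start, end


# Contents of notes.txt from the question, already split into lines.
notes_lines = ["note", "note1", "note2", "note3"]

start, end = parse_range("notes(0,2)")
selected = notes_lines[start:end + 1]
print(selected)
```

Converting the text on either side of the comma, rather than the comma's position, is what makes the slice line up with the numbers the user typed.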

How do I get the computer to separate a conjoined string into separate items in a list depending on what it detects?

This is a follow up from a question I asked yesterday which I got brilliant responses for but now I have more problems :P
(How do I get python to detect a right brace, and put a space after that?)
Say I have this string that's in a txt document which I make Python read
!0->{100}!1o^{72}->{30}o^{72}->{30}o^{72}->{30}o^{72}->{30}o^{72}->{30}
I want to separate this conjoined string into individual components that can be indexed after detecting a certain symbol.
If it detects !0, that is considered one item.
If it detects ->{100}, that is considered another item of the list.
It separates all of them into different parts until the computer prints out:
!0, ->{100}, !1, o^{72}, ->{30}
From yesterday's code, I tried a plethora of things.
I tried this technique which separates anything with '}' perfectly but has a hard time separating !0
text = "(->{200}o^{90}->{200}o^{90}->{200}o^{90}!0->{200}!1o^{90})"  # this is an example string
my_string = ""
for character in text:
    my_string += character
    if character == "}":
        my_string += ","  # up until this point, Guimonte's code perfectly splits "}"
    elif character == "0":  # here is where I tried to get it to detect !0. it splits that, but places ',' on all zeroes
        my_string += ","
print(my_string)
The output:
(->{20,0,},o^{90,},->{20,0,},o^{90,},->{20,0,},o^{90,},!0,->{20,0,},!1o^{90,},)
I want the output to instead be:
(->{200}, o^{90}, ->{200}, o^{90}, ->{200}, o^{90}, !0, ->{200}, !1, o^{90})
It separates !0 but it also messes with the other symbols.
I'm starting to approach a checkmate scenario. Is there any way I can get it to split at !0 and !1 as well as at the right brace?
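One possible approach is to describe what a token looks like and let a regular expression collect the tokens, instead of inserting commas character by character. The sketch below assumes only two token shapes occur: "!" followed by digits, or a run of symbols followed by a braced value. The `TOKEN` pattern is an illustration built on that assumption, not something taken from the earlier answers.

```python
import re

# Assumed token shapes: "!" plus digits, or symbols followed by "{...}".
TOKEN = re.compile(r"!\d+|[^!{}()]+\{[^{}]*\}")

text = "(->{200}o^{90}->{200}o^{90}->{200}o^{90}!0->{200}!1o^{90})"

# findall returns the matched tokens in order; the outer parentheses
# match neither alternative, so they are skipped.
tokens = TOKEN.findall(text)
print("(" + ", ".join(tokens) + ")")
```

Since `tokens` is an ordinary list, each component can be indexed directly, for example `tokens[6]` for the `!0` entry in this string.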

Python: remember line index when reading lines from text file

I'm extracting data in a loop from a text file between two strings with Python 3.6. I have multiple pairs of strings and would like to extract the data between each pair; see the code below:
for i in range(0,len(strings1)):
    with open('infile.txt','r') as infile, open('outfile.txt', 'w') as outfile:
        copy = False
        for line in infile:
            if line == strings1[i]:
                copy = True
            elif line == strings2[i]:
                copy = False
            elif copy:
                outfile.write(line)
            continue
To decrease the processing time of the loop, I would like to modify my code so that after it has extracted the data between two strings, let's say strings1[1] and strings2[1], it remembers the line index of strings2[1] and starts the next iteration of the loop at that line index. That way it doesn't have to read the whole file during each iteration. The string lists are built such that a previous string will never occur after the current string, so modifying my code this way won't break the loop.
Does anyone know how to do this?
===========================================================================
EDIT:
I've got a file in a format such as:
the first line
bla bla bla
FIRST some string 1
10 10
15 20
5 2.5
SECOND some string 2
bla bla bla
bla bla bla
FIRST some string 3
10 10
15 20
5 2.5
SECOND some string 4
The file goes on like this for many lines.
I want to extract the data between 'FIRST some string 1' and 'SECOND some string 2', and plot this data. When that is done, I want to do the same for the data between 'FIRST some string 3' and 'SECOND some string 4' (thus also plot the data). All the 'FIRST some string ..' are stored in strings1 list and all the 'SECOND some string ..' are stored in strings2 list.
To decrease computation time, I would like to modify the code so that after the first iteration it knows it can start from the line containing 'SECOND some string 2' rather than from 'the first line', and also so that the first iteration stops as soon as it has found 'SECOND some string 2'.
Does anyone know how to do this? Please let me know if anything is unclear.
The key issue is that you're reopening your files inside the for loop, so every iteration reads the file again from the beginning. I wouldn't open the files inside a for loop; that's very inefficient. You can load the file into memory first and then loop through strings1.
There are some other issues, namely here:
copy = False
for line in infile:
    if line == strings1[i]:
        copy = True
    elif line == strings2[i]:
        copy = False
    elif copy:
        outfile.write(line)
    continue
The elif copy: branch only runs once a line equal to strings1[i] has set copy to True; from then on every line is written to outfile until a line equal to strings2[i] sets copy back to False. Note also that outfile.txt is reopened in 'w' mode on every pass of the outer loop, so each pass truncates whatever the previous pass wrote. Unless this is precisely what you're trying to achieve, the logic doesn't work.
Without a full context it's hard to understand what exactly you're looking for.
But maybe what you want to do instead is simply this:
with open('infile.txt','r') as infile, open('outfile.txt', 'w') as outfile:
    for line in infile.readlines():
        if line.rstrip('\n') in strings1:
            outfile.write(line)
What this code is doing:
1.) Open both files; readlines() then loads the lines of infile into memory.
2.) Iterate through the lines of the infile.
3.) Check if the iterated line, stripping the trailing newline character is in the list strings1, assuming your strings1 is a list that doesn't have any trailing newline characters. If each item in strings1 already has a trailing \n, then don't rstrip the line.
4.) If line occurs in strings1, write the line to outfile.
This looks to be the gist of what you're attempting.
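If the goal is what the edit to the question describes (collect the lines between each FIRST/SECOND pair while reading the file only once), one option is to share a single iterator across all pairs, so that each search resumes where the previous one stopped. A sketch, assuming the markers appear in the file in the same order as in strings1 and strings2; `extract_blocks` is a hypothetical helper, and the sample list stands in for the lines of infile.txt:

```python
def extract_blocks(lines, starts, ends):
    """Collect the lines between each (start, end) marker pair in one pass."""
    it = iter(lines)  # shared iterator: its position survives between pairs
    blocks = []
    for start, end in zip(starts, ends):
        block = []
        copying = False
        for line in it:
            text = line.rstrip("\n")
            if text == start:
                copying = True
            elif copying and text == end:
                break  # stop here; the next pair continues after this line
            elif copying:
                block.append(line)
        blocks.append(block)
    return blocks


sample = [
    "the first line\n", "bla bla bla\n",
    "FIRST some string 1\n", "10 10\n", "15 20\n", "5 2.5\n",
    "SECOND some string 2\n", "bla bla bla\n",
    "FIRST some string 3\n", "10 10\n", "15 20\n", "5 2.5\n",
    "SECOND some string 4\n",
]
strings1 = ["FIRST some string 1", "FIRST some string 3"]
strings2 = ["SECOND some string 2", "SECOND some string 4"]

blocks = extract_blocks(sample, strings1, strings2)
print(blocks)
```

An open file object is itself an iterable of lines, so it can be passed in place of `sample`; each block can then be plotted or written out separately.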

How to find text in a file and get the lines above and below it according to a pattern

How can I find the particular text '12345' in a file and get all the lines above and below it, up to 'Received notification:', using Linux console commands and without hardcoding the number of lines above and below?
Received notification:
Random text
Random text
...
12345
random text
...
Random text
Received notification:
You can use the following approach:
$ awk '/str1/ {p=1}; p; /str2/ {p=0}' file
When awk finds str1, it sets the variable p to 1. The bare p pattern then prints every line for which p is nonzero: a true pattern with no action triggers awk's default action, which is print $0. Otherwise, nothing is printed.
When it finds str2, it sets p back to 0. Because this rule comes after the p pattern, the line in which str2 appears is still printed before printing stops.
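For readers who find the one-liner terse, the same flag logic can be sketched in Python. This only illustrates how p is switched on and off and is not a replacement for the console command; `select_range` is a hypothetical name, and plain substring tests stand in for awk's regular-expression matches.

```python
def select_range(lines, start_text, end_text):
    """Mimic: awk '/str1/ {p=1}; p; /str2/ {p=0}'."""
    selected = []
    p = False
    for line in lines:
        if start_text in line:  # /str1/ {p=1}
            p = True
        if p:                   # p  (print while the flag is set)
            selected.append(line)
        if end_text in line:    # /str2/ {p=0}
            p = False
    return selected


demo = ["before", "str1 here", "middle", "str2 here", "after"]
result = select_range(demo, "str1", "str2")
print(result)
```

The order of the three tests matters in the same way as in the awk program: the end line is appended before the flag is cleared.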

str.format places last variable first in print

The purpose of this script is to parse a text file (sys.argv[1]), extract certain strings, and print them in columns. I start by printing the header. Then I open the file, and scan through it, line by line. I make sure that the line has a specific start or contains a specific string, then I use regex to extract the specific value.
The matching and extraction work fine.
My final print statement doesn't work properly.
import re
import sys
print("{}\t{}\t{}\t{}\t{}".format("#query", "target", "e-value",
                                  "identity(%)", "score"))
with open(sys.argv[1], 'r') as blastR:
    for line in blastR:
        if line.startswith("Query="):
            queryIDMatch = re.match('Query= (([^ ])+)', line)
            queryID = queryIDMatch.group(1)
            queryID.rstrip
        if line[0] == '>':
            targetMatch = re.match('> (([^ ])+)', line)
            target = targetMatch.group(1)
            target.rstrip
        if "Score = " in line:
            eValue = re.search(r'Expect = (([^ ])+)', line)
            trueEvalue = eValue.group(1)
            trueEvalue = trueEvalue[:-1]
            trueEvalue.rstrip()
            print('{0}\t{1}\t{2}'.format(queryID, target, trueEvalue), end='')
The problem occurs when I try to print the columns. When I print the first 2 columns, it works as expected (except that it's still printing new lines):
#query target e-value identity(%) score
YAL002W Paxin1_129011
YAL003W Paxin1_167503
YAL005C Paxin1_162475
YAL005C Paxin1_167442
The 3rd column is a number in scientific notation like 2e-34
But when I add the 3rd column, eValue, it breaks down:
#query target e-value identity(%) score
YAL002W Paxin1_129011
4e-43YAL003W Paxin1_167503
1e-55YAL005C Paxin1_162475
0.0YAL005C Paxin1_167442
0.0YAL005C Paxin1_73182
I have removed all newlines, as far as I know, using the rstrip() method.
At least three problems:
1) queryID.rstrip and target.rstrip are missing the call parentheses (), so the method is never actually called.
2) Something like trueEvalue.rstrip() doesn't mutate the string; you would need
trueEvalue = trueEvalue.rstrip()
if you want to keep the change.
3) This might be a problem, but without seeing your data I can't be 100% sure. The r in rstrip stands for "right". If trueEvalue is 4e-43\n, then trueEvalue.rstrip() would indeed be free of newlines. But your values seem to be something like \n4e-43. If you simply use .strip(), newlines will be removed from both sides.
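The three points can be checked in isolation. A small sketch with a made-up value (the string "4e-43\n" is only an example, not taken from the real BLAST file):

```python
value = "4e-43\n"

value.rstrip        # only names the method; nothing is called
value.rstrip()      # called, but the new string is thrown away
print(repr(value))  # the trailing newline is still there

cleaned = value.rstrip()          # keep the result by rebinding a name
both_sides = "\n4e-43\n".strip()  # strip() trims both ends
print(repr(cleaned), repr(both_sides))
```

Strings are immutable in Python, so every stripping method returns a new string and leaves the original untouched.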
