Split a string of chr into a list (Python)

I just read the 10th line from the file 'text.txt':
>>> import linecache
>>> line = linecache.getline("text.txt", 10)
>>> line
"['\\x02', '\\x03']\n"
I would like to create a list lst that, in this case, holds the two elements '\\x02' and '\\x03':
>>> lst
['\\x02', '\\x03']
I have to repeat this for other lines of the file, which are always formatted like line but may contain more elements.
Any suggestions?
Thank you

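This will take a string in that format with an arbitrary number of elements and convert it to a list.
line = "['\\x02', '\\x03']\n"
# Drop the trailing newline, then the surrounding square brackets
line = line.strip()[1:-1]
# Split on commas, then trim the whitespace and the surrounding quotes from each item
lst = [x.strip()[1:-1] for x in line.split(",")]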
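As a minimal sketch, the same slicing approach can be wrapped in a function so it can be reused for every line. The name split_items is hypothetical, and the sketch assumes that no item contains a comma. It is contrasted with ast.literal_eval from the standard library, which also parses the line but decodes the escape sequences, so it would not give the literal backslash strings asked for in the question.

```python
import ast

def split_items(line):
    # Hypothetical helper: drop the newline and the [ ] brackets,
    # then trim whitespace and the surrounding quotes from each item.
    inner = line.strip()[1:-1]
    if not inner:
        return []
    return [item.strip()[1:-1] for item in inner.split(",")]

line = "['\\x02', '\\x03']\n"

# Items keep the literal backslash, as requested in the question
print(split_items(line))

# literal_eval decodes \x02 and \x03 into single control characters instead
print(ast.literal_eval(line.strip()))
```

Which of the two is appropriate depends on whether the escapes should stay as text or be decoded.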

Related

How to find a substring in a line from a text file and add that line or the characters after the searched string into a list using Python?

I have a MIB dataset of around 10k lines. I want to find a certain string (e.g. "SNMPv2-MIB::sysORID") in the text file and add the whole line containing it to a list. I am running the code in Jupyter Notebook.
I used the code below to search for the string, and it prints the search string along with the next two strings.
import re
basic = open('mibdata.txt')
file = basic.read()
city_name = re.search(r"SNMPv2-MIB::sysORID(?:[^a-zA-Z'-]+[a-zA-Z'-]+){1,2}", file)
city_name = city_name.group()
print(city_name)
Sample lines in file:
SNMPv2-MIB::sysORID.10 = OID: NOTIFICATION-LOG-MIB::notificationLogMIB
SNMPv2-MIB::sysORDescr.1 = STRING: The MIB for Message Processing and Dispatching.
The output expected is
SNMPv2-MIB::sysORID.10 = OID: NOTIFICATION-LOG-MIB::notificationLogMIB
but I get only
SNMPv2-MIB::sysORID.10 = OID: NOTIFICATION-LOG-MIB
The problem with changing the number of strings matched after the search string is that each line contains a different number of strings, so I cannot specify a constant. Instead I want to use '\n' as a delimiter, but I could not find a post that covers this.
P.S. Any other solution is also welcome
EDIT
You can read the file line by line and test each line against a regex.
r"(SNMPv2-MIB::sysORID).*" matches the string in the parentheses and then everything that follows it up to the end of the line.
import re
with open('file.txt') as basic:
    lines = basic.readlines()
pattern = r"(SNMPv2-MIB::sysORID).*"
# Keep the matched text for lines that match, and an empty string otherwise
entries = map(lambda x: re.search(pattern, x).group() if re.search(pattern, x) is not None else "", lines)
non_empty_entries = list(filter(lambda x: x != "", entries))
print(non_empty_entries)
If you are not comfortable with lambdas, the script above reads the text from the file, splits it into lines, and checks each line individually for a regex match.
non_empty_entries is a list of the matched text from every line where the pattern was found.
EDIT vol2
When the regex does not match a line, an empty string is added instead, and the empty strings are filtered out afterwards.
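Since the goal is to keep whole lines, a plain substring test may be enough and avoids the regex entirely. The following is only a sketch under that assumption; matching_lines and SEARCH are hypothetical names, and the sample list stands in for the lines of the real file.

```python
SEARCH = "SNMPv2-MIB::sysORID"

def matching_lines(lines, needle=SEARCH):
    # Keep every line that contains the needle, without its trailing newline
    return [line.rstrip("\n") for line in lines if needle in line]

# Sample lines taken from the question; a file object can be passed the same way
sample = [
    "SNMPv2-MIB::sysORID.10 = OID: NOTIFICATION-LOG-MIB::notificationLogMIB\n",
    "SNMPv2-MIB::sysORDescr.1 = STRING: The MIB for Message Processing and Dispatching.\n",
]
print(matching_lines(sample))
```

An open file object iterates line by line, so it can be passed to matching_lines in place of the sample list.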

I have a single line list and want to convert it to a multi-dimensional list

I have a text file that I converted into a list, but I want it to be a multi-dimensional list. Is there a way to do this easily?
This is my code:
crimefile = open(fileName, 'r')
yourResult = [line.split(',') for line in crimefile.readlines()]
Your code does create a 2-dimensional list (assuming your file has multiple lines of numbers, with the numbers on each line separated by commas). If you want to print each individual list in yourResult, try this:
for row in yourResult:
    print(row)
To access a certain item in each list, for example the first number on each line, simply replace print(row) with print(row[0]).
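One detail worth noting: line.split(',') leaves the trailing newline attached to the last item of each row. A possible sketch that avoids this uses csv.reader from the standard library. The name read_rows is hypothetical, and io.StringIO stands in for the real file here.

```python
import csv
import io

def read_rows(handle):
    # csv.reader splits each line on commas and drops the line terminator
    return [row for row in csv.reader(handle)]

# In-memory stand-in for the crime file from the question
sample = io.StringIO("1,2,3\n4,5,6\n")
rows = read_rows(sample)

print(rows)        # one inner list per line of the file
print(rows[1][0])  # first item of the second line
```

The items are still strings, so they would need to be converted with int or float before doing arithmetic on them.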

Python read file contents into nested list

I have this file that contains something like this:
OOOOOOXOOOO
OOOOOXOOOOO
OOOOXOOOOOO
XXOOXOOOOOO
XXXXOOOOOOO
OOOOOOOOOOO
And I need to read it into a 2D list so it looks like this:
[[O,O,O,O,O,O,X,O,O,O,O],[O,O,O,O,O,X,O,O,O,O,O],[O,O,O,O,X,O,O,O,O,O,O],[X,X,O,O,X,O,O,O,O,O,O],[X,X,X,X,O,O,O,O,O,O,O],[O,O,O,O,O,O,O,O,O,O,O]]
I have this code:
ins = open(filename, "r")
data = []
for line in ins:
    number_strings = line.split() # Split the line on runs of whitespace
    numbers = [(n) for n in number_strings]
    data.append(numbers) # Add the "row" to your list.
return data
But it doesn't seem to be working because the O's and X's do not have spaces between them. Any ideas?
Just use data.append(list(line.rstrip())) inside the loop in place of the split. list accepts a string as its argument and splits it into its individual characters.
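Put together, the suggestion could look like the sketch below. The name read_grid is hypothetical, io.StringIO stands in for the real file, and skipping blank lines is an added assumption rather than something the question asks for.

```python
import io

def read_grid(handle):
    # One inner list per line, one element per character; blank lines are skipped
    return [list(line.rstrip("\n")) for line in handle if line.strip()]

# Small in-memory stand-in for the file of O's and X's
sample = io.StringIO("OOX\nOXO\nXOO\n")
grid = read_grid(sample)

for row in grid:
    print(row)
```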

MATLAB function replacing last part of strings between known characters

I have a text file TF including a set of the following kind of strings:
"linStru.twoZoneBuildingStructure.north.airLeakage.senTem.T",
"linStru.twoZoneBuildingStructure.north.vol.Xi[1]",
"linStru.twoZoneBuildingStructure.south.airLeakage.senTem.T",
"linStru.twoZoneBuildingStructure.south.vol.Xi[1]", "
"linStru.twoZoneBuildingStructure.north_ext.layMul.nMat[1].monoLayer1Nf.T[1]",
"linStru.twoZoneBuildingStructure.north_ext.layMul.nMat[1].monoLayer2Nf.T[2]",
Given a line L, starting from the end let the substring s denote the portion of the string between ," and the first .
To make it clearer, for L=1: s=T, for L=2: s=Xi[1], for L=5: s=T[1], etc.
Given a text file TF in the above format, I want to write a MATLAB function which takes TF and replaces the corresponding s on each line with der(s).
For example, the function should change the above strings as follows:
"linStru.twoZoneBuildingStructure.north.airLeakage.senTem.der(T)",
"linStru.twoZoneBuildingStructure.north.vol.der(Xi[1])",
"linStru.twoZoneBuildingStructure.south.airLeakage.senTem.der(T)",
"linStru.twoZoneBuildingStructure.south.vol.der(Xi[1])", "
"linStru.twoZoneBuildingStructure.north_ext.layMul.nMat[1].monoLayer1Nf.der(T[1])",
"linStru.twoZoneBuildingStructure.north_ext.layMul.nMat[1].monoLayer2Nf.der(T[2])",
How can such a function be written?
Something like
regexprep(TF, '\.([^.]+)",$', '.der($1)",', 'dotexceptnewline', 'lineanchors')
It finds the longest sequence of non-dot characters appearing between a dot before and quote-comma-endline after, and encloses that inside der( ).
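For readers who want to try the pattern outside MATLAB, here is a rough Python analogue using re.sub. It is not the MATLAB call above: re.MULTILINE plays the role of 'lineanchors', and the character class excludes newlines explicitly as an extra precaution. The names PATTERN and wrap_last_segment are hypothetical.

```python
import re

# A dot, then a run of non-dot characters, then quote-comma at the end of a line
PATTERN = re.compile(r'\.([^.\n]+)",$', re.MULTILINE)

def wrap_last_segment(text):
    # Put the captured last segment inside der( )
    return PATTERN.sub(r'.der(\1)",', text)

sample = (
    '"linStru.twoZoneBuildingStructure.north.airLeakage.senTem.T",\n'
    '"linStru.twoZoneBuildingStructure.north.vol.Xi[1]",\n'
)
print(wrap_last_segment(sample))
```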
I see there is a small " typo on the fourth line of your text file. I'm going to remove it to make things simpler.
The simplest way I can see to do this is to iterate through all of your strings, strip the surrounding double quotes and the trailing comma, and find where the last . occurs. Extract the substring after that point, then wrap it in der(). Assuming the strings are in a text file called functions.txt, you can use textread to read in the individual strings:
names = textread('functions.txt', '%s');
names should now be a cell array in which each element is one string, still enclosed in double quotes and followed by a comma. Use findstr to find where every . is located and take the last of those locations. Extract the substring after it and replace it with its der() form. In other words:
out_strings = cell(1, numel(names)); %// To store output strings
for idx = 1 : numel(names)
    %// Extract actual string without quotes and comma
    name_str = names{idx}(2:end-2);
    %// Find the last dot
    dot_locs = findstr(name_str, '.');
    %// Last dot location
    last_dot_loc = dot_locs(end);
    %// Extract substring after dot
    last_string = name_str(last_dot_loc+1:end);
    %// Create new string
    out_strings{idx} = ['"' name_str(1:last_dot_loc) 'der(' last_string ')",'];
end
This is the output I get:
celldisp(out_strings)
out_strings{1} =
"linStru.twoZoneBuildingStructure.north.airLeakage.senTem.der(T)",
out_strings{2} =
"linStru.twoZoneBuildingStructure.north.vol.der(Xi[1])",
out_strings{3} =
"linStru.twoZoneBuildingStructure.south.airLeakage.senTem.der(T)",
out_strings{4} =
"linStru.twoZoneBuildingStructure.south.vol.der(Xi[1])",
out_strings{5} =
"linStru.twoZoneBuildingStructure.north_ext.layMul.nMat[1].monoLayer1Nf.der(T[1])",
out_strings{6} =
"linStru.twoZoneBuildingStructure.north_ext.layMul.nMat[1].monoLayer2Nf.der(T[2])",
The last step is to write each string to a text file. fopen opens a file for writing and returns a file ID associated with that file. Pass this file ID to fprintf to print each string followed by a newline, then close the file by calling fclose with the same file ID. To write a text file called functions_new.txt, we would do:
%// Open up the file and get ID
fid = fopen('functions_new.txt', 'w');
%// For each string we have...
for idx = 1 : numel(out_strings)
    %// Write the string to file and make a new line
    fprintf(fid, '%s\n', out_strings{idx});
end
%// Close the file
fclose(fid);
Another way to do it with regexprep:
str_out = regexprep(str_in, '\.([^\.]+)"$','\.der($1)"');
Example: for
str_in = {'"linStru.twoZoneBuildingStructure.north.airLeakage.senTem.T"'
'"linStru.twoZoneBuildingStructure.north.vol.Xi[1]"'};
this gives
str_out =
'"linStru.twoZoneBuildingStructure.north.airLeakage.senTem.der(T)"'
'"linStru.twoZoneBuildingStructure.north.vol.der(Xi[1])"'

Skipping over array elements of certain types

I have a csv file that gets read into my code where arrays are generated out of each row of the file. I want to ignore all the array elements with letters in them and only worry about changing the elements containing numbers into floats. How can I change code like this:
myValues = []
data = open(text_file,"r")
for line in data.readlines()[1:]:
    myValues.append([float(f) for f in line.strip('\n').strip('\r').split(',')])
so that the last line knows to only try converting numbers into floats, and to skip the letters entirely?
Put another way, given this list,
list = ['2','z','y','3','4']
what command should be given so the code knows not to try converting letters into floats?
You could use try/except:
myVal = []
for i in list:
    try:
        myVal.append(float(i))
    except ValueError:
        # Not a number, so skip it
        pass
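The same idea can be folded into a small helper and applied to each row of the file. This is only a sketch; to_floats is a hypothetical name and the sample row stands in for one line of the csv file. Note that float also accepts strings such as 'nan' and 'inf', so those would be kept rather than skipped.

```python
def to_floats(items):
    # Convert the items that parse as floats and silently skip the rest
    values = []
    for item in items:
        try:
            values.append(float(item))
        except ValueError:
            continue
    return values

# One raw line as it might come from the csv file
row = "2,z,y,3,4\r\n"
print(to_floats(row.strip().split(",")))
```

Inside the original loop, this would replace the list comprehension passed to myValues.append.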
