How to divide text into several parts relative to one character - string

How can I split text at each "^" and save the pieces to different variables?
Here is an example of the data:
DM^126287^8209/2018^INLMDU 39942^70
The number of characters between the "^" signs will often vary, so I have to read from one delimiter to the next.
Do you have any ideas?
I know how to find the positions at which the delimiter occurs; the code is below:
currentWord = "DM^126287^8209/2018^INLMDU 39942^70"
guess = "^"
occurrences = currentWord.count(guess)
indices = [i for i, a in enumerate(currentWord) if a == guess]
print(indices)
But I need to save "8209/2018" to one variable, "INLMDU 39942" to the next variable and "70" to the last variable.
Thank you in advance
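One possible approach, sketched under the assumption that every record has exactly five "^"-separated fields as in the sample above: str.split cuts the string at each "^" no matter how many characters sit between the delimiters. The variable names on the left of the unpacking are hypothetical and not taken from the question.

```python
current_word = "DM^126287^8209/2018^INLMDU 39942^70"

# Split at every "^"; the length of each field does not matter.
fields = current_word.split("^")

# Unpack into separate variables. This assumes exactly five fields;
# the names are illustrative only.
prefix, record_id, reference, container, quantity = fields

print(reference)   # 8209/2018
print(container)   # INLMDU 39942
print(quantity)    # 70
```

If the number of fields can vary, keeping the list and indexing it (for example fields[2]) avoids the unpacking error that a mismatched count would raise.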

Related

how to extract a substring in a text file, when the substring is between two parentheses?

I have a text file that contains sections as shown below
V1('ww', '6deg')
V2('bb', '15meter')
V3('cc','25yards')
.
.
V4('dd', '72cm')
These sections are randomly distributed inside the text file.
Using MATLAB, I need to find all the occurrences of VariableProp(VarName, VarValue) in the file, and change the VarValue.
Any ideas?
Thank you
You can do this with textscan. (You could also probably do it with regexp). Here's a textscan approach:
str = "V4('dd', '72cm')"; % a line from the file
% Call textscan on a single line of text
x = textscan(str, "%[^(](%[^']%[^'])", ...
    MultipleDelimsAsOne=true, Delimiter=[","," ", "'"]);
% x is a 3-element cell array. If we got a match, each element in the
% outer cell is a scalar. Use vertcat to unwrap a layer of cell-ness:
x = vertcat(x{:});
% If we're left with 3 elements, it was a match
isMatch = numel(x) == 3;

I need to use a 'for loop' for this one. The user has to enter a sentence, and any spaces must be replaced with "%"

the input
sentence = input("Please enter a sentence:")
the for loop (incorrect here)
for i in sentence:
print(sentence)
space_loc = sentence.index(" ")
for c in sentence:
print(space_loc)
for b in range(space_loc):
print("%")
confused about how to get the answer out.
You can try using concatenation of strings and slicing in this one.
sentence = input()
After taking the input, simply store the length of your string:
length = len(sentence)
Then iterate over every character in the string. When you find a " ", split the string into two parts around it using slicing, and join the parts with a "%":
for i in range(length):
    if sentence[i] == " ":
        sentence = sentence[:i] + "%" + sentence[i+1:]
Here, sentence[:i] is the part of string before the space and sentence[i+1:] is the part of string after the space.
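As an alternative sketch that still uses a for loop, the loop can walk over the characters themselves and build a new string, which avoids modifying the string while indexing into it. The function name replace_spaces is hypothetical, and a fixed sentence stands in for the input() call so the example runs on its own.

```python
def replace_spaces(sentence):
    """Return a copy of sentence with every space replaced by '%'."""
    result = ""
    for ch in sentence:
        if ch == " ":
            result += "%"   # swap the space for a percent sign
        else:
            result += ch    # keep every other character unchanged
    return result

print(replace_spaces("Hello there coders!"))  # Hello%there%coders!
```

Unlike the split-and-join approach, this keeps one "%" per space, so two consecutive spaces become "%%".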
One way of solving your query:
Code
sentence = input("Please enter a sentence:")
ls=sentence.split() #Creating a list of words present in sentence
new_sentence='%'.join(ls) #Joining the list with '%'
print(new_sentence)
Output
Please enter a sentence:Hello there coders!
Hello%there%coders!
EDIT
I do not understand how exactly you want to use the for loop here. If you just want to include a for loop (no restrictions), then you can do this:
Code
ls=[]
a=0
sentence = input("Please enter a sentence:")
# This loop finds the words in the sentence and stores them in a list.
# Words are determined by checking for white space. Each space is replaced with '%'
for i in range(0,len(sentence)):
    if sentence[i]==' ':
        ls.append(sentence[a:i])
        a=i
        ls.append('%')
ls.append(sentence[a:]) # This is to save the last word
ls1=[]
for i in ls: # Removing any white space inside the list
    j=i.replace(' ','')
    ls1.append(j)
print(''.join(ls1)) # Displaying final output
Again, your question is very open ended and this is just one way of using for loop to get the desired result!

Is there a way to get the substring that lies between two words in a string in Python?

My question is more or less similar to:
Is there a way to substring a string in Python?
but it's more specifically oriented.
How can I get the part of a string that is located between two known words in the initial string?
Example:
myString = "this is the initial string"
Substring = "initial"
knowing that "the" and "string" are the two known words in the string that can be used to get the substring.
Thank you!
You can start with simple string manipulation here. str.index is your best friend there, as it will tell you the position of a substring within a string; and you can also start searching somewhere later in the string:
>>> myString = "this is the initial string"
>>> myString.index('the')
8
>>> myString.index('string', 8)
20
Looking at the slice [8:20], we already get close to what we want:
>>> myString[8:20]
'the initial '
Of course, since we found the beginning position of 'the', we need to account for its length. And finally, we might want to strip whitespace:
>>> myString[8 + 3:20]
' initial '
>>> myString[8 + 3:20].strip()
'initial'
Combined, you would do this:
startIndex = myString.index('the')
substring = myString[startIndex + 3 : myString.index('string', startIndex)].strip()
If you want to look for matches multiple times, then you just need to repeat doing this while looking only at the rest of the string. Since str.index will only ever find the first match, you can use this to scan the string very efficiently:
searchString = 'this is the initial string but I added the relevant string pair a few more times into the search string.'
startWord = 'the'
endWord = 'string'
results = []
index = 0
while True:
    try:
        startIndex = searchString.index(startWord, index)
        endIndex = searchString.index(endWord, startIndex)
        results.append(searchString[startIndex + len(startWord):endIndex].strip())
        # move the index to the end
        index = endIndex + len(endWord)
    except ValueError:
        # str.index raises a ValueError if there is no match; in that
        # case we know that we’re done looking at the string, so we can
        # break out of the loop
        break
print(results)
# ['initial', 'relevant', 'search']
You can also try something like this:
mystring = "this is the initial string"
mystring = mystring.strip().split(" ")
for i in range(1,len(mystring)-1):
    if(mystring[i-1] == "the" and mystring[i+1] == "string"):
        print(mystring[i])
I suggest using a combination of the list, split and join methods.
This should help if you are looking for more than one word in the substring.
Turn the string into a list of words:
words = string.split()
Get the indices of your opening and closing markers, then return the substring:
start = words.index('the')
end = words.index('string')
substring = ' '.join(words[start+1:end])
You may want to check that both markers are present before proceeding, since list.index raises a ValueError when a word is missing.
If your problem gets more complex, i.e. multiple occurrences of the marker pair, I suggest using a regular expression.
import re
substring = ''.join(re.findall(r'the (.+?) string', string))
re.findall returns the matches as separate items in a list; the ''.join call concatenates them, so drop it if you want to keep them separate.
The pattern includes the spaces next to the marker words so that they are left out of the match; you can modify it to your needs as well.
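A minimal sketch of the regular-expression idea wrapped in a reusable function. The helper name between is hypothetical; re.escape is used so that marker words containing special characters are matched literally, and the matches are returned as a list rather than joined.

```python
import re

def between(text, start_word, end_word):
    """Return every substring found between start_word and end_word."""
    # \s+ consumes the whitespace next to each marker, and the lazy
    # group (.+?) stops at the first end_word that follows.
    pattern = re.escape(start_word) + r"\s+(.+?)\s+" + re.escape(end_word)
    return re.findall(pattern, text)

text = "this is the initial string but I added the relevant string pair"
print(between(text, "the", "string"))  # ['initial', 'relevant']
```

Note that this sketch does not enforce word boundaries, so a marker such as "the" would also match inside a longer word like "other"; adding \b around the markers is one way to tighten it.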

how to find two different strings in the same line in matlab

I have a cell obtained from text scan and I want to find the index of lines containing particular string,
fid = fopen('data.txt');
E = textscan(fid, '%s', 'Delimiter', '\n');
I want to know the line numbers (indices) of the lines that contain a specific text, e.g. the rows that have the keyword "2016":
rows = find(contains(E{1},"2016"));
But I want to find the indices of only those lines that contain both keywords, "2016" and "Mathew Perry".
I tried this code, but it does not work:
rows = find(contains(E{1},"2016") && contains(E{1},"Mathew Perry"));
the error I get is:
Operands to the || and && operators must be convertible to logical scalar values.
To find a single string:
idx = strfind(E{1}, '2016');
idx = find(not(cellfun('isempty', idx)));
Use strfind instead of find. You may try the above for each keyword and combine the results with and/or. If that works, no problem; if not, get the indices separately for each keyword and take the intersection of the two sets of indices.

Get only one word from line

How can I take only one word from a line in file and save it in some string variable?
For example, my file has the line "this, line, is, super" and I want to save only the first word ("this") in the variable word. I tried to read it character by character until I reached a ",", but when I run it I get the error "Argument of type 'int' is not iterable". How can I do this?
line = file.readline() # reading "this, line, is, super"
if "," in len(line): # checking, if it contains ','
    for i in line:
        if "," not in line[i]: # while character is not ',' -> this is where I get error
            word += line[i] # add it to my string
You can do it like this, using split():
line = file.readline()
if "," in line:
    split_line = line.split(",")
    first_word = split_line[0]
    print(first_word)
split() will create a list where each element is, in your case, a word. Commas will not be included.
At a glance, you are on the right track, but a few things are wrong, and you can spot them if you always consider what data type is being stored where. For instance, your conditional 'if "," in len(line)' doesn't make sense, because it translates to 'if "," in 21'; testing membership in an integer is what raises the "'int' is not iterable" error.
Secondly, you iterate over each character in line, so your value for i is not what you think. You want the index of the character at that point in your for loop, to check if "," is there, but line[i] is not something like line[0], as you would imagine; it is actually line['t']. It is easy to assume that i is always an integer index into your string, but what you want is a range of integer values, equal to the length of the line, to iterate through, and to look up the associated character at each index.
I have reformatted your code to work the way you intended, returning word = "this", with these clarifications in mind. I hope you find this instructional (there are shorter ways and built-in methods to do this, but understanding indices is crucial in programming). Assuming line is the string "this, line, is, super":
word = "" # start with an empty string so that += has something to extend
if "," in line: # checking that the string, not the number 21, has a comma
    for i in range(0, len(line)): # for each index from 0 up to len(line) - 1
        if line[i] != ",": # e.g. if line[0] does not equal comma
            word += line[i] # add character to your string
        else:
            break # break out of loop when encounter first comma, thus storing only first word
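For comparison, one of the shorter built-in routes could look like the sketch below. The helper name first_field is hypothetical, and a literal string stands in for file.readline() so the example runs without a file. Passing 1 as the second argument to split stops after the first separator, and strip removes surrounding whitespace such as the trailing newline that readline keeps.

```python
def first_field(line, sep=","):
    """Return the text before the first separator, without surrounding whitespace."""
    # split(sep, 1) cuts at most once, so only the first field is isolated.
    return line.split(sep, 1)[0].strip()

print(first_field("this, line, is, super\n"))  # this
print(first_field("no separator here"))        # no separator here
```

When the separator is absent, the whole line is returned, so no separate membership check is needed before calling it.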

Resources