How to check if a certain part of an input is in a list of possible inputs? - python-3.x

Sorry if this has been asked before, I couldn't find the exact answer or a close-enough in other questions.
I want to make a program that checks if any part of a users input is matching to another possible input in a list. And if part of that input was something in a list, that part of the input could be saved as another variable.
correct = ["game","code","text"]
command = input("> ")
if command == "open" + something in correct:
name = thing in correct
doSomething()
So if the command was 'open text', then name would be 'text'.
Is this even possible? Again, sorry if this has already been asked, and sorry if this is rambling and makes no sense.

Assumning you expect the command and its parameters to be separated by a space (or multiple spaces for multiple parameters), you could do this:
correct = ["game","code","text"]
command = input("> ")
subcommands = command.split(" ")
if subcommands[0] == "open" and subcommands[1] in correct:
name = subcommands[1]
doSomething()
This gets more tricky if parameters are supposed to contain spaces themselves, but for simple one-word params, this should do.

Related

Regular expression match / break

I am doing text analysis on SEC filings (e.g., 10-K), and the documents I have are the complete submission. The complete filing submission includes the 10-K, plus several other documents. Each document resides within the tags ‘<DOCUMENT>’ and ‘</DOCUMENT>’.
What I want: To count the number of words in the 10-K only before the first instance of ‘</DOCUMENT>’
How I want to accomplish it: I want to use a for loop, with a regex (regex_end10k) to indicate where to stop the for loop.
What is happening: No matter where I put my regex match break, the program counts all of the words in the entire document. I have no error, however I cannot get the desired results.
How I know this: I have manually trimmed one filing, while retaining the full document (results below). When I manually remove the undesired documents after the first instance of ‘</DOCUMENT>’, I yield about 750,000 fewer words.
Current output
Note: Apparently I don't have enough SO reputation to embed a screenshot in my post; it defaults to a link.
What I have tried: several variations of where to put the regex match break. No matter what, it almost always counts the entire document. I believe that the two functions may be performed over the entire document. I have tried putting the break statement within get_text_from_html() so that count_words() only performs on the 10-K, but I have had no luck.
The code below is a snippet from a larger function. It's purpose is to (1) strip html tags and (2) count the number of words in the text. If I can provide any additional information, please let me know and I'll update my post.
The remaining code (not shown) extracts firm and report identifiers, (e.g., ‘file’ or ‘cik’) from the header section between tags ‘<SEC-HEADER>’ and ‘</SEC-HEADER>’. Using the same logic, when extracting header information, I use a regex match break logic and it works perfectly. I need help trying to understand why this same logic isn’t working when I try to count the number of words and how to correct my code. Any help is appreciated.
regex_end10k = re.compile(r'</DOCUMENT>', re.IGNORECASE)
for line in f:
def get_text_from_html(html:str):
doc = lxml.html.fromstring(html)
for table in doc.xpath('.//table'): # optional: removes tables from HTML source code
table.getparent().remove(table)
for tag in ["a", "p", "div", "br", "h1", "h2", "h3", "h4", "h5"]:
for element in doc.findall(tag):
if element.text:
element.text = element.text + "\n"
else:
element.text = "\n"
return doc.text_content()
to_clean = f.read()
clean = get_text_from_html(to_clean)
#print(clean[:20000])
def count_words(clean):
words = re.findall(r"\b[a-zA-Z\'\-]+\b",clean)
word_count = len(words)
return word_count
header_vars["words"] = count_words(clean)
match = regex_end10k.search(line) # This should do it, but it doesn't.
if match:
break
You dont need regx, just split your orginal string, and then in the part before count the words, simple example above:
text = 'Text before <DOCUMENT> text after'
splited_text = text.split('<DOCUMENT>')
splited_text_before = splited_text[0]
count_words = len(splited_text_before.split())
print(splited_text_before)
print(count_words)
output
Text before
2

Struct name from variable in Matlab

I have created a structure containing a few different fields. The fields contain data from a number of different subjects/participants.
At the beginning of the script I prompt the user to enter the "Subject number" like so:
prompt='Enter the subject number in the format SUB_n: ';
SUB=input(prompt,'s');
Example SUB_34 for the 34th subject.
I want to then name my structure such that it contains this string... i.e. I want the name of my structure to be SUB_34, e.g. SUB_34.field1. But I don't know how to do this.
I know that you can assign strings to a specific field name for example for structure S if I want field1 to be called z then
S=struct;
field1='z';
S.(field1);
works but it does not work for the structure name.
Can anyone help?
Thanks
Rather than creating structures named SUB_34 I would strongly recommend just using an array of structures instead and having the user simply input the subject number.
number = input('Subject Number')
S(number) = data_struct
Then you could simply find it again using:
subject = S(number);
If you really insist on it, you could use the method proposed in the comment by #Sembei using eval to get the struct. You really should not do this though
S = eval([SUB, ';']);
Or to set the structure
eval([SUB, ' = mydata;']);
One (of many) reasons not to do this is that I could enter the following at your prompt:
>> prompt = 'Enter the subject number in the format SUB_n: ';
>> SUB = input(prompt, 's');
>> eval([SUB, ' = mydata;']);
And I enter:
clear all; SUB_34
This would have the unforeseen consequence that it would remove all of your data since eval evaluates the input string as a command. Using eval on user input assumes that the user is never going to ever write something malformed or malicious, accidentally or otherwise.

Lua: Parsing and Manipulating Input with Loops - Looking for Guidance

I am currently attempting to parse data that is sent from an outside source serially. An example is as such:
DATA|0|4|7x5|1|25|174-24|7x5|1|17|TERW|7x5|1|9|08MN|7x5|1|1|_
This data can come in many different lengths, but the first few pieces are all the same. Each "piece" originally comes in with CRLF after, so I've replaced them with string.gsub(input,"\r\n","|") so that is why my input looks the way it does.
The part I would like to parse is:
4|7x5|1|25|174-24|7x5|1|17|TERW|7x5|1|9|08MN|7x5|1|1|_
The "4" tells me that there will be four lines total to create this file. I'm using this as a means to set the amount of passes in the loop.
The 7x5 is the font height.
The 1 is the xpos.
The 25 is the ypos.
The variable data (172-24 in this case) is the text at these parameters.
As you can see, it should continue to loop this pattern throughout the input string received. Now the "4" can actually be any variable > 0; with each number equaling a set of four variables to capture.
Here is what I have so far. Please excuse the loop variable, start variable, and print commands. I'm using Linux to run this function to try to troubleshoot.
function loop_input(input)
var = tonumber(string.match(val, "DATA|0|(%d*).*"))
loop = string.match(val, "DATA|0|")
start = string.match(val, loop.."(%d*)|.*")
for obj = 1, var do
for i = 1, 4 do
if i == 1 then
i = "font" -- want the first group to be set to font
elseif i == 2 then
i = "xpos" -- want the second group to be set to xpos
elseif i == 3 then
i = "ypos" -- want the third group to be set to ypos
else
i = "txt" -- want the fourth group to be set to text
end
obj = font..xpos..ypos..txt
--print (i)
end
objects = objects..obj -- concatenate newly created obj variables with each pass
end
end
val = "DATA|0|4|7x5|1|25|174-24|7x5|1|17|TERW|7x5|1|9|08MN|7x5|1|1|_"
print(loop_input(val))
Ideally, I want to create a loop that, depending on the var variable, will plug in the captured variables between the pipe deliminators and then I can use them freely as I wish. When trying to troubleshoot with parenthesis around my four variables (like I have above), I receive the full list of four variables four times in a row. Now I'm having difficulty actually cycling through the input string and actually grabbing them out as the loop moves down the data string. I was thinking that using the pipes as a means to delineate variables from one another would help. Am I wrong? If it doesn't matter and I can keep the [/r/n]+ instead of each "|" then I am definitely all for that.
I've searched around and found some threads that I thought would help but I'm not sure if tables or splitting the inputs would be advisable. Like these threads:
Setting a variable in a for loop (with temporary variable) Lua
How do I make a dynamic variable name in Lua?
Most efficient way to parse a file in Lua
I'm fairly new to programming and trying to teach myself. So please excuse my beginner thread. I have both the "Lua Reference Manual" and "Programming in Lua" books in paperback which is how I've tried to mock my function(s) off of. But I'm having a problem making the connection.
I thank you all for any input or guidance you can offer!
Cheers.
Try this:
val = "DATA|0|4|7x5|1|25|174-24|7x5|1|17|TERW|7x5|1|9|08MN|7x5|1|1|_"
val = val .. "|"
data = val:match("DATA|0|%d+|(.*)$")
for fh,xpos,ypos,text in data:gmatch("(.-)|(.-)|(.-)|(.-)|") do
print(fh,xpos,ypos,text)
end

Comparring a string with a manually added string

If have been cracking my head over this for a week now:
We have an assignment, where we have 2 options in our program, with option 1, the program asks for a name and a date, and then it generates an email addressed to the give name, with that date.
The second option, we have to paste text in to program, and it will tell us if the 'template' from option 1 is used or not, and it gives you the name, and date.
my question is now: how do I compare the given string, with the manual input string and make that name, and date (could be 2nd of oktober, could be 10/02, could be sunday the 2nd, basically anything that isn't the same as the template) and still make it say the template matches?
I thought: cutting the strings up, comparing them, word for word, but then what? and how?
Since I do not know what language you are programming in, I will give you some examples of what you ask in the languages I know.
Python(2.7):
x = raw_input('Manual String') // get user input, this can be replaced
y = 'this is a string: '+ str(x) // use Str incase of a number or other format of x.
if(y == 'this is a string: doubleo'):
print "The strings are equal!"
C:
use this page:
http://www.tutorialspoint.com/ansi_c/c_strcmp.htm

Is there any way when I choose an option not to clear previously typed data in the input

The problem is this:
In my programme at first the user gets options for a first name - so hopefully he likes something from the options and he chooses it -so far everything is OK!
But then when he types space he starts receiving options for second name and a if he likes something and chooses it - then the Autocomplete just erases the first name. Is there any way I can change that?
hello Rich thank you very much or your response - now i've decided to change my task and here is what I made when a user types for example I character i get all the first names that start with I- so far no problem! ANd when he types the white space and K for example I make request to my web service that gets the middle names that starts with K or the last names that start with K (one of them should start with K for Iwelina), so in this case for Iwelina Ive got RADULSKA KOSEWA and KOSEWA NEDEWA! For the source of autocomplete I concatenate iwelina with (radulska kosewa)and iwelina with (KOSEWA NEDEWA) so at the end I've got IWELINA IELINA RADULSKA KOSEWA and IWELINA KOSEWA NEDEWA!!! the only problem is that when i type Iwelina K i get only IWELINA KOSEWA NEDEWA!!!here is the code for autocomlete
$('#input').autocomplete({
source: function(request, response) {
var matcher = new RegExp( $.ui.autocomplete.escapeRegex(request.term, " "));
var data = $.grep( srcAutoComp, function(value) {
return matcher.test( value.label || value.value || value );
});
response(data);
}
});
if you know how i can change it I will be glad for the help
I don't understand how, when the user begines to type the second name, he's getting results that are only the last name. For example, if he types "Joh" and selects "John" from the options, and then continues to type "John Do", then how is it possible that your drop down gives him results for only the last name, like "Doe"?
At any rate, assuming this is truly happening, you could just combine all combinations of first and last names in your source data and that will show "John Doe" in the drop down when the user types "Joh" selects "John" and then continues to type "John Do".
Another way to do this is with a complicated change to the search and response events to search after a space if it is there, and recombine it with the first string after the search for the last name is complete. If you give me your source data, I could put something together for this.

Resources