Split a string into indexes in a List of Lists

Split a string into indexes in a List of Lists - python-3.x

I have a collection of text that I am trying to parse in order to audit against. In Python3 I am trying to split the white space in a list of lists so that each word gets its own index and then add data that is missing to some of the lists . Ex:
listoflists=[['Port UP UP Description'] , ['Port Description' ],[ 'Port UP UP Description']]
I'd like to make it so that the output of listoflists now contains:
listoflists=[['Port,UP,UP,Description'] , ['Port, Down, Down, Description' ],[ 'Port,UP,UP,Description']]
My goal here is to then use this data to output to prettytable or something like that with rows containing each for indexes from each list and then, phase 2, audit against which rows have 'Ports' showing down.
I tried something like this but I believe my thinking is not quite right here.
for i in range(len(listoflists)):
listoflists=i.split(' ')
for item in items:
new_items.extend(item.split())

Related

Indexing inner lists in "list of lists"

bigList = [list1,list2,list3]
I need a way to get the name of list1 using the zero index
bigList[0] just gives you all the items in the list
My original code prints all the items in the lists while the modified one says the index is out of range. I want it to give me the ilness names not all the symptoms.
Original code
https://i.stack.imgur.com/HDlA9.png
Modified
https://i.stack.imgur.com/5Qdh2.png

String matching keywords and key phrases in Python

I am trying to perform a smart dynamic lookup with strings in Python for a NLP-like task. I have a large amount of similar-structure sentences that I would like to parse through each, and tokenize parts of the sentence. For example, I first parse a string such as "bob goes to the grocery store".
I am taking this string in, splitting it into words and my goal is to look up matching words in a keyword list. Let's say I have a list of single keywords such as "store" and a list of keyword phrases such as "grocery store".
sample = 'bob goes to the grocery store'
keywords = ['store', 'restaurant', 'shop', 'office']
keyphrases = ['grocery store', 'computer store', 'coffee shop']
for word in sample.split():
# do dynamic length lookups
Now the issue is this Sometimes my sentences might be simply "bob goes to the store" instead of "bob goes to the grocery store".
I want to find the keyword "store" for sure but if there are descriptive words such as "grocery" or "computer" before the word store I would like to capture that as well. That is why I have the keyphrases list as well. I am trying to figure out a way to basically capture a keyword at the very least then if there are words related to it that might be a possible "phrase" I want to capture those too.
Maybe an alternative is to have some sort of adjective list instead of a phrase list of multiple words?
How could I go about doing these sort of variable length lookups where I look at more than just a single word if one is captured, or is there an entirely different method I should be considering?

Here is how you can use a nested for loop and a formatted string:
sample = 'bob goes to the grocery store'
keywords = ['store', 'restaurant', 'shop', 'office']
keyphrases = ['grocery', 'computer', 'coffee']
for kw in keywords:
for kp in keyphrases:
if f"{kp} {kw}" in sample:
# Do something

Is there a simple way to remove sublits

I have a list(rs_data) with sublists obtained from a Dataframe, and some rows from Dataframe contain multiple elements, like those:
print(rs_data)
rs1791690, rs1815739, rs2275998
rs6552828
rs1789891
rs1800849, rs2016520, rs2010963, rs4253778
rs1042713, rs1042714, rs4994, rs1801253
I want to obtain a list in which each element (rs….) is separated, something like this:
{'rs1791690', 'rs1815739', 'rs227599', 'rs401681', 'rs2180062', 'rs9018'….}
How can I eliminate sublits or generate a new list without sublists, in which each element is unique.

To generate a new list you could iterate over the old one and throw out the elements you don't like.
Something like this
for i in rs_data:
if i in bad_values:
# do something
else:
# do something else
If you just want to eliminate duplicates it would be the best to use a set
Like this
mynewset = set(rs_data)

Access list element using get()

I'm trying to use get() to access a list element in R, but am getting an error.
example.list <- list()
example.list$attribute <- c("test")
get("example.list") # Works just fine
get("example.list$attribute") # breaks
## Error in get("example.list$attribute") :
## object 'example.list$attribute' not found
Any tips? I am looping over a vector of strings which identify the list names, and this would be really useful.

Here's the incantation that you are probably looking for:
get("attribute", example.list)
# [1] "test"
Or perhaps, for your situation, this:
get("attribute", eval(as.symbol("example.list")))
# [1] "test"
# Applied to your situation, as I understand it...
example.list2 <- example.list
listNames <- c("example.list", "example.list2")
sapply(listNames, function(X) get("attribute", eval(as.symbol(X))))
# example.list example.list2
# "test" "test"

Why not simply:
example.list <- list(attribute="test")
listName <- "example.list"
get(listName)$attribute
# or, if both the list name and the element name are given as arguments:
elementName <- "attribute"
get(listName)[[elementName]]

If your strings contain more than just object names, e.g. operators like here, you can evaluate them as expressions as follows:
> string <- "example.list$attribute"
> eval(parse(text = string))
[1] "test"
If your strings are all of the type "object$attribute", you could also parse them into object/attribute, so you can still get the object, then extract the attribute with [[:
> parsed <- unlist(strsplit(string, "\\$"))
> get(parsed[1])[[parsed[2]]]
[1] "test"

flodel's answer worked for my application, so I'm gonna post what I built on it, even though this is pretty uninspired. You can access each list element with a for loop, like so:
#============== List with five elements of non-uniform length ================#
example.list=
list(letters[1:5], letters[6:10], letters[11:15], letters[16:20], letters[21:26])
#===============================================================================#
#====== for loop that names and concatenates each consecutive element ========#
derp=c(); for(i in 1:length(example.list))
{derp=append(derp,eval(parse(text=example.list[i])))}
derp #Not a particularly useful application here, but it proves the point.
I'm using code like this for a function that calls certain sets of columns from a data frame by the column names. The user enters a list with elements that each represent different sets of column names (each set is a group of items belonging to one measure), and the big data frame containing all those columns. The for loop applies each consecutive list element as the set of column names for an internal function* applied only to the currently named set of columns of the big data frame. It then populates one column per loop of a matrix with the output for the subset of the big data frame that corresponds to the names in the element of the list corresponding to that loop's number. After the for loop, the function ends by outputting that matrix it produced.
Not sure if you're looking to do something similar with your list elements, but I'm happy I picked up this trick. Thanks to everyone for the ideas!
"Second example" / tangential info regarding application in graded response model factor scoring:
Here's the function I described above, just in case anyone wants to calculate graded response model factor scores* in large batches...Each column of the output matrix corresponds to an element of the list (i.e., a latent trait with ordinal indicator items specified by column name in the list element), and the rows correspond to the rows of the data frame used as input. Each row should presumably contain mutually dependent observations, as from a given individual, to whom the factor scores in the same row of the ouput matrix belong. Also, I feel I should add that if all the items in a given list element use the exact same Likert scale rating options, the graded response model may be less appropriate for factor scoring than a rating scale model (cf. http://www.rasch.org/rmt/rmt143k.htm).
'grmscores'=function(ColumnNameList,DataFrame) {require(ltm) #(Rizopoulos,2006)
x = matrix ( NA , nrow = nrow ( DataFrame ), ncol = length ( ColumnNameList ))
for(i in 1:length(ColumnNameList)) #flodel's magic featured below!#
{x[,i]=factor.scores(grm(DataFrame[, eval(parse(text= ColumnNameList[i]))]),
resp.patterns=DataFrame[,eval(parse(text= ColumnNameList[i]))])$score.dat$z1}; x}
Reference
*Rizopoulos, D. (2006). ltm: An R package for latent variable modelling and item response theory analyses, Journal of Statistical Software, 17(5), 1-25. URL: http://www.jstatsoft.org/v17/i05/

MYSQL: Using GROUP BY with string literals

I have the following table with these columns:
shortName, fullName, ChangelistCount
Is there a way to group them by a string literal within their fullName? The fullname represents file directories, so I would like to display results for certain parent folders instead of the individual files.
I tried something along the lines of:
GROUP BY fullName like "%/testFolder/%" AND fullName like "%/testFolder2/%"
However it only really groups by the first match....
Thanks!

Perhaps you want something like:
GROUP BY IF(fullName LIKE '%/testfolder/%', 1, IF(fullName LIKE '%/testfolder2/%', 2, 3))
The key idea to understand is that an expression like fullName LIKE foo AND fullName LIKE bar is that the entire expression will necessarily evaluate to either TRUE or FALSE, so you can only get two total groups out of that.
Using an IF expression to return one of several different values will let you get more groups.
Keep in mind that this will not be particularly fast. If you have a very large dataset, you should explore other ways of storing the data that will not require LIKE comparisons to do the grouping.

You'd have to use a subquery to derive the column values you'd like to ultimately group on:
FROM (SELECT SUBSTR(fullname, ?)AS derived_column
FROM YOUR_TABLE ) x
GROUP BY x.derived_column

Either use when/then conditions or Have another temporary table containing all the matches you wish to find and group. Sample from my database.
Here I wanted to group all users based on their cities which was inside address field.
SELECT ut.* , c.city, ua.*
FROM `user_tracking` AS ut
LEFT JOIN cities AS c ON ut.place_name LIKE CONCAT( "%", c.city, "%" )
LEFT JOIN users_auth AS ua ON ua.id = ut.user_id

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string