Search for Pattern in list : python Regex

After the data analysis, I append the result to a list.
Now I need to separate the results by searching for a pattern and extracting the lines that match it.
Code:
data = []
data.append('\n'.join([' -> '.join(e) for e in paths]))
The list contains this data:
CH_Trans -> St_1 -> WDL
TRANSFER_Trn -> St_1
Access_Ltd -> MPL_Limited
IPIPI -> TLC_Pvt_Ltd
234er -> Three_Star_Services -> Asian_Pharmas -> PPP_Channel
Sonata_Ltd -> Three_Star_Services
Arc_Estates -> Russian_Hosp
A -> B -> C -> D -> E -> F
G -> H
ZN_INTBNKOUT_SET -> -2008_1 -> X
ZZ_1_ -> AA_2 -> AA_3 -> ZZ_1_
XYZ- -> ABC -> XYZ-
SSS -> BBB -> SSS
Rock_8CC -> Russ -> By_sus -> Rock_8CC
Note: display or retrieve the lines that contain at least two -> symbols, for example:
( Txt -> txt -> txt )
I'm trying to get it done with a regex:
import re
for i in data:
    regex = r"\w+\s->\s\w+\s->\s\w+"
    match = re.findall(regex, i, re.MULTILINE)
    print(match)
Regex expressions I tried, without getting the required result:
#\w+\s->\s\w+\s->\s\w+
#\w+\s[-][>]\s\w+\s[-][>]\s\w+
#\w+\s[-][>]\s\w+\s[-][>]\s\w+\s[-][>]\s\w+
Result I got:
['CH_Trans -> St_1 -> WDL', '234er -> Three_Star_Services -> Asian_Pharmas',
'A -> B -> C', 'D -> E -> F', 'ZZ_1_ -> AA_2 -> AA_3',
'SSS -> BBB -> SSS', 'Rock_8CC -> Russ -> By_sus']
The required result I want to obtain is:
----Pattern I------
CH_Trans -> St_1 -> WDL
234er -> Three_Star_Services -> Asian_Pharmas -> PPP_Channel
A -> B -> C -> D -> E -> F
ZN_INTBNKOUT_SET -> -2008_1 -> X
# Pattern II consists of lines whose first and last elements are the same
----Pattern II------
ZZ_1_ -> AA_2 -> AA_3 -> ZZ_1_
XYZ- -> ABC -> XYZ-
SSS -> BBB -> SSS
Rock_8CC -> Russ -> By_sus -> Rock_8CC

Would you please try the following as a starting point:
import re
# assumes each element of data is a single line such as 'A -> B -> C'
regex = r'^\S+(?:\s->\s\S+){2,}$'
for i in data:
    m = re.match(regex, i)
    if m:
        print(m.group())
Results (Pattern I + Pattern II):
CH_Trans -> St_1 -> WDL
234er -> Three_Star_Services -> Asian_Pharmas -> PPP_Channel
A -> B -> C -> D -> E -> F
ZN_INTBNKOUT_SET -> -2008_1 -> X
ZZ_1_ -> AA_2 -> AA_3 -> ZZ_1_
XYZ- -> ABC -> XYZ-
SSS -> BBB -> SSS
Rock_8CC -> Russ -> By_sus -> Rock_8CC
Explanation of the regex ^\S+(?:\s->\s\S+){2,}$:
^\S+ start with non-blank string
(?: ... ) grouping
\s->\s\S+ a blank followed by "->" followed by a blank and non-blank string
{2,} repeats the previous pattern (or group) two or more times
$ end of the string
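Note that the question builds data by appending one newline-joined string, so a line-anchored regex only sees individual lines if it is applied with re.MULTILINE (or after splitting on newlines). The following is a minimal sketch under that assumption; the sample string is a shortened, hypothetical stand-in for the real data, and a literal space is used instead of \s so that a match cannot run across a line break.

```python
import re

# Hypothetical sample: one newline-joined string, as produced by
# data.append('\n'.join(...)) in the question.
data = ["CH_Trans -> St_1 -> WDL\nTRANSFER_Trn -> St_1\nA -> B -> C -> D -> E -> F\nG -> H"]

# With re.MULTILINE, ^ and $ match at the start and end of every line.
chain = re.compile(r'^\S+(?: -> \S+){2,}$', re.MULTILINE)

matches = []
for block in data:
    # The group is non-capturing, so findall returns the whole matched lines.
    matches.extend(chain.findall(block))

print('\n'.join(matches))
```

If data already holds one line per element, the loop with re.match shown above works without re.MULTILINE.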
For pattern II, please try:
regex = r'^(\S+)(?:\s->\s\S+){1,}\s->\s\1$'
for i in data:
    m = re.match(regex, i)
    if m:
        print(m.group())
Results:
ZZ_1_ -> AA_2 -> AA_3 -> ZZ_1_
XYZ- -> ABC -> XYZ-
SSS -> BBB -> SSS
Rock_8CC -> Russ -> By_sus -> Rock_8CC
Explanation of regex r'^(\S+)(?:\s->\s\S+){1,}\s->\s\1$':
- ^(\S+) captures the 1st element and assigns \1 to it
- (?: ... ) grouping
- \s->\s\S+ a blank followed by "->" followed by a blank and non-blank string
- {1,} repeats the previous pattern (or group) one or more times
- \s->\s\1 a blank followed by "->" followed by a blank and the 1st element \1
- $ end of the string
To obtain the result for pattern I, we may need to subtract the pattern II list from the first results.
If we could say:
regex = r'^(\S+)(?:\s->\s\S+){2,}(?<!\1)$'
it would exclude the strings whose last element is the same as the first element, and we could obtain the result for pattern I directly. However, this regex currently fails with an error saying "group references in lookbehind assertions".
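The subtraction described above can be sketched as follows. This is only an illustration: the names lines, chain, cycle and chain_not_cycle are hypothetical, and lines is assumed to hold one path per element. As a possible alternative to the lookbehind, a negative lookahead placed right after the first element appears to give the same pattern I result in a single regex, since Python's re module accepts backreferences inside lookaheads.

```python
import re

# One path per element, transcribed from the question's data.
lines = [
    "CH_Trans -> St_1 -> WDL",
    "TRANSFER_Trn -> St_1",
    "Access_Ltd -> MPL_Limited",
    "IPIPI -> TLC_Pvt_Ltd",
    "234er -> Three_Star_Services -> Asian_Pharmas -> PPP_Channel",
    "Sonata_Ltd -> Three_Star_Services",
    "Arc_Estates -> Russian_Hosp",
    "A -> B -> C -> D -> E -> F",
    "G -> H",
    "ZN_INTBNKOUT_SET -> -2008_1 -> X",
    "ZZ_1_ -> AA_2 -> AA_3 -> ZZ_1_",
    "XYZ- -> ABC -> XYZ-",
    "SSS -> BBB -> SSS",
    "Rock_8CC -> Russ -> By_sus -> Rock_8CC",
]

chain = re.compile(r'^\S+(?:\s->\s\S+){2,}$')           # pattern I + II
cycle = re.compile(r'^(\S+)(?:\s->\s\S+)+\s->\s\1$')    # pattern II only

# Pattern II: first and last elements are the same.
pattern_2 = [s for s in lines if cycle.match(s)]
# Pattern I: three or more elements, minus the pattern II lines.
pattern_1 = [s for s in lines if chain.match(s) and not cycle.match(s)]

# Alternative sketch: reject lines ending in " -> <first element>" up front.
chain_not_cycle = re.compile(r'^(\S+)(?!.*\s->\s\1$)(?:\s->\s\S+){2,}$')
pattern_1_alt = [s for s in lines if chain_not_cycle.match(s)]

print("----Pattern I------")
print('\n'.join(pattern_1))
print("----Pattern II------")
print('\n'.join(pattern_2))
```

Filtering with two regexes keeps each expression easy to read; the lookahead version is more compact but harder to maintain.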

Related

Python Search for Pattern in list of string elements

I'm searching for a pattern in a list of string elements.
So far my code works, but for some data it does not produce the required result.
Code:
ss = '''
X A
B A
A C
A D
E A
A F
'''.strip()
lst = []
for r in ss.split('\n'):
    lst.append(r.split())
paths = []
for e in lst:
    # each row in source data
    pnew = []  # new path
    for p in paths:
        if e[0] in p:  # if start in existing path
            if p.index(e[0]) == len(p)-1:  # if end of path
                p.append(e[1])  # add to path
            else:
                pnew.append(p[:p.index(e[0])+1]+[e[1]])  # copy path then add
            break
    else:  # loop completed, not found
        paths.append(list(e))  # create new path
    if len(pnew):  # copied path
        paths.extend(pnew)  # add copied path
print('\n'.join([' -> '.join(e) for e in paths]))
What I'm getting is:
X -> A -> C
B -> A
X -> A -> D
E -> A
X -> A -> F
What I require is:
B -> A -> C
X -> A -> D
E -> A -> F
X -> A -> C
B -> A -> D
B -> A -> F
X -> A -> F
I'm trying to get the pattern based on Cr and Dr (the Cr and Dr columns are optional):
X A Cr
B A Cr
A C Dr
A D Dr
E A Cr
A F Dr

It's easier to handle this with pandas:
import pandas as pd
from io import StringIO
ss = '''
X A
B A
A C
A D
E A
A F
'''.strip()
df = pd.read_csv(StringIO(ss), sep=' ', names=['source', 'target'])
df = df.merge(df, how='inner', left_on='target', right_on='source')
df = df[['source_x', 'target_x', 'target_y']]
df.apply(lambda x: ' -> '.join(x), axis=1).sort_values()
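For readers without pandas, the same self-join can be sketched in plain Python. This is only an illustration of what the merge does: every edge whose target equals another edge's source is combined into a three-element path. The edges list is a transcription of the sample data, and the variable names are hypothetical.

```python
# Sample edges from the question: (source, target)
edges = [("X", "A"), ("B", "A"), ("A", "C"), ("A", "D"), ("E", "A"), ("A", "F")]

# Inner self-join on target == source, mirroring df.merge(df, ...) above.
paths = sorted(
    f"{src} -> {mid} -> {dst}"
    for src, mid in edges
    for mid2, dst in edges
    if mid == mid2
)

print('\n'.join(paths))
```

With this sample, each of the three edges ending in A is paired with each of the three edges starting at A, giving nine paths.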

Problem to recreate the parser function "many" in Haskell

I am trying to create a calculator, and I have written some functions for the parsing.
I have already defined a type, type Parser a = String -> Maybe (a, String), and a function that takes a Char as argument and returns a Parser Char, like this:
parse1 :: Char -> Parser Char
parse1 _ [] = Nothing -- empty input: nothing to parse
parse1 b (a:as)
  | b == a = Just (b, as)
  | otherwise = Nothing
I would like to create a function which takes a parser as an argument and tries to apply it zero or more times, returning a list of the parsed elements.
many :: Parser a -> Parser [a]
> many (parse1 ' ') " lopesbar"
Just (" ", "lopesbar")
I have already tried this, but it doesn't work (infinite loop):
many :: Parser a -> Parser [a]
many _ [] = Nothing
many func1 s =
  case func1 s of
    Just _ -> many func1 s
    Nothing -> Nothing
I also tried this:
many :: Parser a -> Parser [a]
many func1 s = case func1 s of
  Just (f, a) -> Just ([f], a)
  Nothing -> Nothing
There are no errors, but the output is not what I expect.

Representing a theorem with multiple hypotheses in Lean (propositional logic)

Real beginner's question here. How do I represent a problem with multiple hypotheses in Lean? For example:
Given
A
A→B
A→C
B→D
C→D
Prove the proposition D.
(Problem taken from The Incredible Proof Machine, Session 2 problem 3. I was actually reading Logic and Proof, Chapter 4, Propositional Logic in Lean, but there are fewer exercises available there.)
Obviously this is completely trivial to prove by applying modus ponens twice, my question is how do I represent the problem in the first place?! Here's my proof:
variables A B C D : Prop
example : (( A )
    /\ ( A->B )
    /\ ( A->C )
    /\ ( B->D )
    /\ ( C->D ))
    -> D :=
  assume h,
  have given1: A, from and.left h,
  have given2: A -> B, from and.left (and.right h),
  have given3: A -> C, from and.left (and.right (and.right h)),
  have given4: B -> D, from and.left (and.right (and.right (and.right h))),
  have given5: C -> D, from and.right (and.right (and.right (and.right h))),
  show D, from given4 (given2 given1)
I think I've made far too much of a meal of packaging up the problem and then unpacking it. Could someone show me a better way of representing this problem, please?

I think it is a lot clearer to avoid And in the hypotheses and use -> instead. Here are two equivalent proofs; I prefer the first:
def s2p3 {A B C D : Prop} (ha : A)
(hab : A -> B) (hac : A -> C)
(hbd : B -> D) (hcd : C -> D) : D
:= show D, from (hbd (hab ha))
The second is the same as the first except that it uses example.
I believe you have to name the hypotheses using assume
rather than inside the declaration:
example : A -> (A -> B) -> (A -> C) -> (B -> D) -> (C -> D) -> D :=
assume ha : A,
assume hab : A -> B,
assume hac, -- You can actually just leave the types off the above 2
assume hbd,
assume hcd,
show D, from (hbd (hab ha))
If you want to use the def syntax but the problem is specified using the example syntax, you can pass the definition directly:
example : A -> (A -> B) -> (A -> C)
-> (B -> D) -> (C -> D) -> D := s2p3
Also, when using and in your proof, in the unpacking stage
you unpack given3 and given5 but never use them in your "show" proof.
So you don't need to unpack them, e.g.
example : (( A )
    /\ ( A->B )
    /\ ( A->C )
    /\ ( B->D )
    /\ ( C->D ))
    -> D :=
  assume h,
  have given1: A, from and.left h,
  have given2: A -> B, from and.left (and.right h),
  have given4: B -> D, from and.left (and.right (and.right (and.right h))),
  show D, from given4 (given2 given1)

Identifying input values for which a function does NOT generate a specific output

I built a data structure in form of a function that outputs certain strings in response to certain input strings like this:
type MyDict = String -> String
emptydict :: MyDict
emptydict _ = "not found"
Now I can add entries into this dictionary by doing the following:
addentry :: String -> String -> MyDict -> MyDict
addentry s1 s2 d s
  | s1 == s = s2
  | otherwise = d s
To look for s2's I can simply enter s1 and look in my dictionary:
looky :: String -> MyDict -> String
looky s1 d = d s1 -- gives s2
My goal is now to create another function patternmatch in which I can check which s1's are associated with an s2 that starts with a certain pattern. Now the pattern matching itself isn't the problem, but I am not sure how can I keep track of the entries I entered, i.e. for which input is the output not "not found" ?
My idea was to try to keep track of all the s1's I entered in the addentry function and add them to a separate list. In patternmatch I would feed the list elements to looky, such that I can get back the associated s2's and check whether they match the pattern.
So my questions:
1) Is this list building approach good or is there a better way of identifying the inputs for which a function is defined as something other than "not found"?
2) If it is the right approach, how would I keep track of the s1's? I was thinking something like:
addentry s1 s2 d s
  | last (save s1) == s = s2
  | otherwise = d s1
And then save s1 being a function generating the list with all s1's. last (save s1) would then return the most recent s1. Would appreciate any help on implementing save s1 or other directions going from here. Thanks a lot.

Your design is hard-coded such that the only criterion for finding a key is presenting the exact same key. What you need is a more flexible approach that lets you provide a criterion other than equality. I took the liberty of making your code more general and using more conventional names for the functions:
import Prelude hiding (lookup)
-- instead of k -> Maybe v, we represent the dictionary as
-- (k -> Bool) -> Maybe v where k -> Bool is the criteria
-- on which to match the key. by using Maybe v we can signal
-- that no qualifying key was found by returning Nothing
-- instead of "not found"
newtype Dict k v = Dict ((k -> Bool) -> Maybe v)
empty :: Dict k v
empty = Dict $ const Nothing
-- insert a new key/value pair
insert :: k -> v -> Dict k v -> Dict k v
insert k v d = Dict $ \f -> if f k then Just v else lookupBy f d
-- lookup using the given criteria
lookupBy :: (k -> Bool) -> Dict k v -> Maybe v
lookupBy f (Dict d) = d f
-- lookup using the default criteria (equality with some given key)
lookup :: Eq k => k -> Dict k v -> Maybe v
lookup k = lookupBy (k==)
-- your criteria
startsWith :: String -> String -> Bool
startsWith s = undefined -- TODO
lookupByPrefix :: String -> Dict String v -> Maybe v
lookupByPrefix = lookupBy . startsWith
I should mention that while this is a great exercise for functional programming practice and general brain-expansion, it's a terrible way to implement a map. A list of pairs is equivalent and easier to understand.
As a side note, we can easily define an instance of Functor for this type:
instance Functor (Dict k) where
  fmap f d = Dict $ \g -> fmap f (lookupBy g d)
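The idea of a dictionary that is queried with a predicate on the key, rather than with the key itself, can also be sketched with closures in Python. This is only a rough analogue of the Haskell code above, not a translation of it; the names empty, insert and the sample entries are hypothetical, and None plays the role of Nothing.

```python
def empty(pred):
    """The empty dictionary: no key satisfies any predicate."""
    return None

def insert(key, value, d):
    """Return a new dictionary that checks this key first, then falls back to d."""
    def lookup_by(pred):
        return value if pred(key) else d(pred)
    return lookup_by

# Build a small dictionary; the most recently inserted entry is checked first.
d = insert("apple", "fruit", insert("carrot", "vegetable", empty))

by_key = d(lambda k: k == "carrot")            # lookup by equality
by_prefix = d(lambda k: k.startswith("app"))   # lookup by key prefix
missing = d(lambda k: k == "pear")             # no qualifying key

print(by_key, by_prefix, missing)
```

As with the Haskell version, a plain list of pairs would be simpler in practice; the closure form is mainly useful as an exercise.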

Choosing where newlines are in a multi-line string literal

If I create the following multi-line string literal:
let lit = "A -> B
           C -> D
           E -> F";
It prints out like this:
A -> B
           C -> D
           E -> F
No surprise. However, if I try this:
let lit = "A -> B\
           C -> D\
           E -> F";
I get:
A -> BC -> DE -> F
What I'm trying to get is this:
A -> B
C -> D
E -> F
But this is the best thing I've come up with:
let lit = "A -> B\n\
           C -> D\n\
           E -> F";
Or maybe this:
let lit = vec!["A -> B", "C -> D", "E -> F"].join("\n"); // join replaces the deprecated connect
Both of those feel a little clunky, though not terrible. Just wondering if there's any cleaner way?

Indoc is a procedural macro that does what you want. It stands for "indented document." It provides a macro called indoc!() that takes a multiline string literal and un-indents it so the leftmost non-space character is in the first column.
let lit = indoc! {"
    A -> B
    C -> D
    E -> F"
};
The result is "A -> B\nC -> D\nE -> F" as you asked for.
Whitespace is preserved relative to the leftmost non-space character in the document, so the following preserves 2 spaces before "C":
let lit = indoc! {"
    A -> B
      C -> D
    E -> F"
};
The result is "A -> B\n  C -> D\nE -> F".

I see three other possible solutions:
1) Get rid of the spaces:
let lit = "A -> B
C -> D
E -> F";
This way, you lose the neat display in your code. You could get that back like this:
2) Get rid of the spaces, shift everything down a line, and escape the return.
let lit = "\
A -> B
C -> D
E -> F";
I would explain what that "\" is doing in a comment, though, because it isn't obvious otherwise.
3) Combine these two solutions:
let lit =
"A -> B
C -> D
E -> F";
You can test this at Ideone.

Mostly as an exercise, I mimicked Python's join syntax with the following:
trait CanJoin {
    fn join(&self, in_strings: Vec<&str>) -> String;
}
impl CanJoin for str {
    fn join(&self, in_strings: Vec<&str>) -> String {
        // the slice method join replaces the deprecated connect
        in_strings.join(self)
    }
}
fn main() {
    let vector = vec!["A -> B", "B -> C", "C -> D"];
    let joined = "\n".join(vector);
}
Or as a macro:
macro_rules! join_lines {
    ($($x:tt)*) => {
        {
            // join replaces the deprecated connect
            vec![$($x)*].join("\n")
        }
    }
}
let joined = join_lines!("A -> B", "B -> C", "C -> D");
