extract data between single quotes - python-3.x

I am trying to extract the data between single quotes:
import re
a = 'USA-APA HA-WBS-10.152.08.0/24'
print(re.findall(r'()', a))
Expected output: USA-APA HA-WBS-10.152.08.0/24

What is wrong with it? It is just a string; the single quotes only delimit the literal and are not part of the data:
a = 'USA-APA HA-WBS-10.152.08.0/24'
print(a)
Output:
% python3 test.py
USA-APA HA-WBS-10.152.08.0/24
You might want to look at this also regarding quotes and strings:
Single and Double Quotes | Python

I am not very familiar with Python, but with some quick searching around
I've found that this works:
import re
a = 'USA-APA HA-WBS-10.152.08.0/24'
result = re.findall(r'(.*?)', a)
print("".join(result))
I'm pretty sure there are better ways of solving this, but I'm not familiar with the language.

Related

Using "r" for escape sequence when file in Julia

In Python, putting r in front of a file path avoids problems with escape sequences, for example:
df = pd.read_csv(r"D:\datasets\42133.csv")
However, in Julia the code below raises MethodError: no method matching joinpath(::Regex)
file_path = r"D:\datasets\42133.csv"
df = DataFrame(CSV.File(file_path))
I checked this, and I know that I can change \ to \\ or /. But I wonder why Julia does not allow r"String" here. Is there something like r"String" in Julia?
You are looking for the raw"..." string literal.
julia> raw"D:\datasets\42133.csv"
"D:\\datasets\\42133.csv"
In Julia, r"..." strings create a Regex object.

Python regexp to get substring contains '/\'

I have this string:
ss='/users/parun/kk/jdk/bin/\x1b[01;31m\x1b[kjava\x1b[m\x1b[k'
How can I get only
'/users/parun/kk/jdk/bin'
from the above? I tried:
import re
re.split(r'\/\\')
but it does not work.
A regex search might be the best option here:
import re
ss = '/users/parun/kk/jdk/bin/\x1b[01;31m\x1b[kjava\x1b[m\x1b[k'
path = re.findall(r'^.*/jdk/bin', ss)[0]
print(path) # /users/parun/kk/jdk/bin
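The \x1b[...m fragments in that string look like terminal colour codes, although that is an assumption about where the data came from. If so, an alternative that does not hard-code /jdk/bin is to strip those sequences first and then take the directory part. A minimal sketch, where the name ansi_escape and the exact shape of the escape pattern are my own choices:

```python
import posixpath
import re

ss = '/users/parun/kk/jdk/bin/\x1b[01;31m\x1b[kjava\x1b[m\x1b[k'

# Assumed shape of the colour codes: ESC, '[', optional digits or
# semicolons, then a single letter.
ansi_escape = re.compile(r'\x1b\[[0-9;]*[A-Za-z]')

# Remove the escape sequences, leaving a plain POSIX-style path.
clean = ansi_escape.sub('', ss)
print(clean)

# posixpath.dirname() drops the last component regardless of the host OS.
directory = posixpath.dirname(clean)
print(directory)
```

This keeps working if the directory name changes, at the cost of assuming every escape sequence fits the pattern above.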

python replace \\ with \ in stringpath automatically

How can I replace "\" in a path string with "\\" in Python? I know that \ is the escape character, and r'\' and r"\" also don't work, neither in str.replace() nor in re.sub().
If your objective is to get the correct path you can use the raw string:
r"C:\Users"
# in the console this returns
Out[2]: 'C:\\Users'
# however, if you print it, it will print this:
print(r"C:\Users")
C:\Users
If you want to combine parts of the path dynamically, I recommend the os module from the standard library.
Use it like this:
import os
path = os.path.join(r"first_part_of_path", r"other_part_of_path", "filename.xlsx")
From Python's documentation: "The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation."
https://docs.python.org/3/library/re.html
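To make the quoted passage concrete, here is a small sketch comparing raw and regular literals. The variable names are only for illustration:

```python
# A raw literal keeps the backslash as an ordinary character.
raw = r"\n"
cooked = "\n"

# The raw literal has two characters, the regular one is a single newline.
print(len(raw), len(cooked))

# Both spellings below denote the same string; only the source text differs.
same = r"C:\Users" == "C:\\Users"
print(same)

# repr() shows the escaped form, print() shows the actual characters.
shown = repr(r"C:\Users")
print(shown)
print(r"C:\Users")
```

The doubled backslash seen in the console is therefore only how repr() displays the string; the string itself contains one backslash.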
The snippet below may be what you are looking for:
import re
x = r'this, is a \test'
re.subn(r'\\', r'\\\\', x)  # replaces every single backslash with two
From the standard library, you could also use os.path.normpath.
Example:
import os
myDir = r"path\to\dir"
normalized = os.path.normpath(myDir)
This gives the following:
>>> normalized
'path\\to\\dir'
>>> print(normalized)
path\to\dir
>>> str(normalized)
'path\\to\\dir'
>>> repr(normalized)
"'path\\\\to\\\\dir'"
I just realized that a path written as a regular (non-raw) literal, e.g.
path_str="E:\neural network\Pytorch"
can be repaired with
path_str=path_str.encode('unicode-escape').decode().replace('\\\\', '\\')
which does it automatically, without having to rewrite the literal by hand as
path_str=r"E:\neural network\Pytorch"

How to use python to convert a backslash in to forward slash for naming the filepaths in windows OS?

I have a problem converting all the backslashes into forward slashes using Python.
I tried using os.sep together with str.replace() to accomplish this, but it wasn't 100% successful:
import os
pathA = 'V:\Gowtham\2019\Python\DailyStandup.txt'
newpathA = pathA.replace(os.sep,'/')
print(newpathA)
Expected Output:
'V:/Gowtham/2019/Python/DailyStandup.txt'
Actual Output:
'V:/Gowtham\x819/Python/DailyStandup.txt'
I do not understand why the number 2019 is converted into \x819. Could someone help me with this?
Your issue is already in pathA: if you print it out, you'll see that it already contains \x81, because \201 means the character defined by the octal number 201, which is 81 in hexadecimal (\x81). For more information, take a look at the definition of string literals.
The quick solution is to use raw strings (r'V:\....'). But you should take a look at the pathlib module.
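As a rough illustration of both points, the sketch below checks the octal escape and then converts the path with pathlib. PureWindowsPath is chosen here so the example behaves the same on any operating system; the variable names are my own:

```python
from pathlib import PureWindowsPath

# '\201' is an octal escape, so a non-raw literal silently changes the text.
octal_equals_hex = "\201" == "\x81"
print(octal_equals_hex)

# A raw literal keeps every backslash exactly as written.
path_a = r"V:\Gowtham\2019\Python\DailyStandup.txt"

# PureWindowsPath parses Windows-style paths on any operating system,
# and as_posix() renders the path with forward slashes.
new_path_a = PureWindowsPath(path_a).as_posix()
print(new_path_a)

# A plain replace of the backslash also works and does not depend on
# os.sep, which is '/' on Linux and macOS.
replaced = path_a.replace("\\", "/")
print(replaced)
```

Note that pathA.replace(os.sep, '/') only does something useful on Windows, where os.sep is the backslash.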
Using the raw string leads to the correct answer for me.
import os
pathA = r'V:\Gowtham\2019\Python\DailyStandup.txt'
newpathA = pathA.replace(os.sep,'/')
print(newpathA)
Output:
V:/Gowtham/2019/Python/DailyStandup.txt
Try this, using the raw string format r'your-string'.
>>> import os
>>> pathA = r'V:\Gowtham\2019\Python\DailyStandup.txt' # raw string format
>>> newpathA = pathA.replace(os.sep,'/')
Output:
>>> print(newpathA)
V:/Gowtham/2019/Python/DailyStandup.txt

multiple variable in python regex

I have looked through several related posts and forums for an answer to my question, but found nothing that does what I need.
I am trying to use variables instead of hard-coded values in a regex that searches for any one of several words in a line.
I do get the desired result if I don't use variables:
<http://www.somesite.com/software/sub/a1#Msoffice>
<http://www.somesite.com/software/sub1/a1#vlc>
<http://www.somesite.com/software/sub2/a2#dell>
<http://www.somesite.com/software/sub3/a3#Notepad>
re.search(r"\#Msoffice|#vlc|#Notepad", line)
This regex will return the line which has #Msoffice OR #vlc OR #Notepad.
I tried defining a single variable using re.escape and that worked absolutely fine. However, I have tried many combinations using | and , (pipe and comma) with no success.
Is there any way I can specify #Msoffice, #vlc and #Notepad in different variables, so that I can change them later?
Thanks in advance!!
If I understood you correctly, you'd like to insert variables into your regex.
You are currently using a raw string (r' ') to make the regex more readable; an f-string (f' ') lets you insert any variable with {your_var}, so you can construct the regex as you like:
var1 = '#Msoffice'
var2 = '#vlc'
var3 = '#Notepad'
re.search(f'{var1}|{var2}|{var3}', line)
The most annoying issue is that you have to add a \ before every special character yourself; to look for a literal \, for example, the regex must contain \\
Hope it helped
import re
lines = ["<http://www.somesite.com/software/sub/a1#Msoffice>",
         "<http://www.somesite.com/software/sub1/a1#vlc>",
         "<http://www.somesite.com/software/sub2/a2#dell>",
         "<http://www.somesite.com/software/sub3/a3#Notepad>"]
for line in lines:
    if re.search(r'\b(?:\#{}|\#{}|\#{})\b'.format('Msoffice', 'vlc', 'Notepad'), line):
        print(line)
Output :
<http://www.somesite.com/software/sub/a1#Msoffice>
<http://www.somesite.com/software/sub1/a1#vlc>
<http://www.somesite.com/software/sub3/a3#Notepad>
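Since the question mentions that re.escape already worked for a single variable, one way to extend it to several is to escape each value and join them with |. This is a sketch rather than the answerers' method; the names var1, var2, var3, pattern and matched are only illustrative:

```python
import re

# Hypothetical variables; each one can be changed independently later.
var1 = "#Msoffice"
var2 = "#vlc"
var3 = "#Notepad"

# re.escape() neutralises any regex metacharacters in the values, and
# "|".join() builds the alternation from however many variables there are.
pattern = re.compile("|".join(re.escape(v) for v in (var1, var2, var3)))

lines = ["<http://www.somesite.com/software/sub/a1#Msoffice>",
         "<http://www.somesite.com/software/sub1/a1#vlc>",
         "<http://www.somesite.com/software/sub2/a2#dell>",
         "<http://www.somesite.com/software/sub3/a3#Notepad>"]

# Keep only the lines that contain one of the search terms.
matched = [line for line in lines if pattern.search(line)]
for line in matched:
    print(line)
```

Escaping each value separately means a term containing characters such as . or + is still matched literally.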
