Python3 using f-string with regex in findall


I tried to use re.findall to pull a piece of information out of a large HTML file. The information I need is 0000-01-02-03-04-asdasdad, which sits inside a string like https://xxx/yyy/0000-01-02-03-04-asdasdad/index.css.
How can I do it?
def func(html: str):
    import re
    pattern = r'\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\w+'
    s = fr'https://xxx/yyy/{pattern}/index.css'
    return re.findall(s, html)
# error: TypeError: cannot use a string pattern on a bytes-like object
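The TypeError says that html is a bytes object while the pattern is a str, and re requires both to be the same type. A minimal sketch of one way around it, assuming the page is UTF-8 encoded: decode the bytes before searching. The names find_ids, ID_PATTERN and SAMPLE are made up for this illustration, and the sketch also escapes the dot in index.css and adds a capture group so that only the ID is returned.

```python
import re

# Hypothetical sample input; bytes, as returned by e.g. urlopen(...).read().
SAMPLE = b'<link href="https://xxx/yyy/0000-01-02-03-04-asdasdad/index.css">'

ID_PATTERN = r'\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\w+'


def find_ids(html, encoding="utf-8"):
    """Return every ID that appears inside a matching index.css URL."""
    # Assumption: the page is UTF-8; decode so a str pattern can be used.
    if isinstance(html, (bytes, bytearray)):
        html = html.decode(encoding, errors="replace")
    # The braces inside ID_PATTERN are safe: the f-string only interpolates
    # the variable and does not re-parse its contents.
    url_pattern = rf'https://xxx/yyy/({ID_PATTERN})/index\.css'
    return re.findall(url_pattern, html)


print(find_ids(SAMPLE))
```

The alternative is to keep the input as bytes and build a bytes pattern instead; f-strings cannot produce bytes, so that route needs the pattern encoded after formatting.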

Related

Stripping text in Scrapy

I'm trying to run a spider to extract information from real estate advertisements.
My code:
import scrapy
from ..items import RealestateItem
class AddSpider(scrapy.Spider):
    name = 'Add'
    start_urls = ['https://www.exampleurl.com/2-bedroom-apartment-downtown-4154251/']
    def parse(self, response):
        items = RealestateItem()
        whole_page = response.css('body')
        for item in whole_page:
            Title = response.css(".obj-header-text::text").extract()
            items['Title'] = Title
            yield items
After running in console:
scrapy crawl Add -o Data.csv
In .csv file I get
['\n 2-bedroom-apartment ']
Tried adding strip method to function:
Title = response.css(".obj-header-text::text").extract().strip()
But scrapy returns:
Title = response.css(".obj-header-text::text").extract().strip()
AttributeError: 'list' object has no attribute 'strip'
Is there an easy way to make Scrapy write just this into the .csv file:
2-bedroom-apartment

AttributeError: 'list' object has no attribute 'strip'
You get this error because .extract() returns a list, and .strip() is a method of string.
If that selector always matches ONE item, you can use .get() (or extract_first()) instead of .extract(); it returns the first match as a string rather than a list.
If you need it to return a list, you can loop through the list, calling strip on each item, like:
title = response.css(".obj-header-text::text").extract()
title = [item.strip() for item in title]
You can also use an XPath selector instead of a CSS selector; that way you can use normalize-space to strip whitespace.
title = response.xpath('normalize-space(.//*[@class="obj-header-text"]/text())').extract()
This XPath may need some adjustment; since you didn't post the page source, I couldn't check it.
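The list-based approach can be tried without running a spider. A small sketch, using the exact list from the question as a stand-in for what .extract() returned; clean_text is a hypothetical helper name, not part of Scrapy:

```python
def clean_text(values):
    """Strip each extracted string and drop the ones that end up empty."""
    return [v.strip() for v in values if v.strip()]


# The value .extract() produced in the question.
extracted = ['\n 2-bedroom-apartment ']

cleaned = clean_text(extracted)
# Take the first entry if there is one, mirroring what .get() would give.
title = cleaned[0] if cleaned else None
print(title)
```

Assigning a plain string like this to items['Title'] is what keeps the brackets and quotes out of the CSV.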

Discord.py random choice bug

Code:
# Random Choice
@client.command(aliases=["rand_c"])
async def random_choice(ctx, python_list):
    await ctx.send(random.choice(python_list))
Weird error when I type a proper Python list (["Cats", "Dogs", "No pet"]):
discord.ext.commands.errors.UnexpectedQuoteError: Unexpected quote mark, '"', in non-quoted string
It works fine in regular Python, but why not in discord.py?

All of the inputs to your commands are initially treated as strings. You need to provide a converter function to tell the command what to do with that string:
from ast import literal_eval
@client.command(aliases=["rand_c"])
async def random_choice(ctx, *, python_list: literal_eval):
    await ctx.send(str(python_list))
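What the converter does can be checked outside discord.py. A minimal sketch, assuming the command receives the raw text ["Cats", "Dogs", "No pet"] as a single string; parse_list is a hypothetical wrapper that adds a type check on top of literal_eval:

```python
import random
from ast import literal_eval


def parse_list(argument):
    """Turn the raw command text into a Python list, rejecting anything else."""
    value = literal_eval(argument)  # safely evaluates literals only
    if not isinstance(value, list) or not value:
        raise ValueError("expected a non-empty list literal")
    return value


options = parse_list('["Cats", "Dogs", "No pet"]')
print(random.choice(options))
```

A function like this could be used as the annotation in place of literal_eval, so that input such as a bare number is rejected before random.choice is called.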

Need to remove unwanted symbols from value using replaceAll method

I am getting the value of a field as ["value"].
I want to print only the value, removing the [ and " characters from the result.

That looks like a JSON array of Strings? No idea, as you don't provide any context, but you could do:
import groovy.json.JsonSlurper
def valueField = '["value"]'
def result = new JsonSlurper().parseText(valueField).head()
println result
Prints value

The following script should be what you need; it strips the brackets and the double quotes:
def str = '["value"]'
println(str.replaceAll(/[\[\]"]/,''))

What does .read() actually do, and what gets returned when it .read() is not used?

I know how to use it, but I noticed that if I don't call .read(), I get something different.
It can spit out <_io.TextIOWrapper name='story.txt' mode='r' encoding='UTF-8'>
when using text files.
It can also return <http.client.HTTPResponse object at 0x76521550>
when using urlopen from urllib.
What do those things mean and what does .read() actually do?

Those are "file-like objects" that have a .read() method, and what you are seeing is the repr() of the object, which is a description string. When you call .read() with no argument, it reads everything remaining in the object and returns it, usually as a bytes or Unicode string.
A small, custom example:
class Demo:
    def __repr__(self):
        return '<My Custom Description>'
    def read(self):
        return 'some stuff'
x = Demo()
print(x)
print(x.read())
Output:
<My Custom Description>
some stuff
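The same behaviour can be seen with the standard library's in-memory file objects. A small sketch, where io.StringIO stands in for a text file opened with open() and io.BytesIO stands in for a binary source such as a urlopen() response (both stand-ins are assumptions made so the example needs no files or network):

```python
import io

text_file = io.StringIO("once upon a time")   # stand-in for open('story.txt')
binary_file = io.BytesIO(b"raw bytes")        # stand-in for a urlopen() response

# Printing the object shows its repr(), a description, not its contents.
print(repr(text_file))

first = text_file.read()    # the whole contents, as str
second = text_file.read()   # empty: the read position is already at the end
data = binary_file.read()   # the whole contents, as bytes

print(repr(first), repr(second), repr(data))
```

The second read() returning an empty string is why the result is usually stored in a variable rather than read twice.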

Is there a way to declare a Groovy string format in a variable?

I currently have a fixed format for an asset management code, which uses the Groovy string format using the dollar sign:
def code = "ITN${departmentNumber}${randomString}"
Which will generate a code that looks like:
ITN120AHKXNMUHKL
However, I have a new requirement that the code format must be customizable. I'd like to expose this functionality by allowing the user to set a custom format string such as:
OCP${departmentNumber}XI${randomString}
PAN-${randomString}
Which will output:
OCP125XIBQHNKLAPICH
PAN-XJKLBPPJKLXHNJ
Which Groovy will then interpret and replace with the appropriate variable value. Is this possible, or do I have to manually parse the placeholders and manually do the string.replace?

I believe that GString lazy evaluation fits the bill:
deptNum = "C001"
randomStr = "wot"
def code = "ITN${deptNum}${->randomStr}"
assert code == "ITNC001wot"
randomStr = "qwop"
assert code == "ITNC001qwop"

I think the original poster wants to use a variable as the format string. The answer is that string interpolation only works if the format is a string literal; it seems it has to be translated to lower-level String.format code at compile time. I ended up using sprintf.
baseUrl is a String containing http://example.com/foo/%s/%s, loaded from a property file:
def operation = "tickle"
def target = "dog"
def url = sprintf(baseUrl, operation, target)
url
===> http://example.com/foo/tickle/dog

I believe that in this case you do not need lazy evaluation of a GString; Java's ordinary String.format() would do the trick:
def format = 'ITN%sX%s'
def code = { def departmentNumber, def randomString -> String.format(format, departmentNumber, randomString) }
assert code('120AHK', 'NMUHKL') == 'ITN120AHKXNMUHKL'
format = 'OCP%sXI%s'
assert code('120AHK', 'NMUHKL') == 'OCP120AHKXINMUHKL'
Hope this helps.

For a triple-double-quoted string:
def password = "30+"
def authRequestBody = """
<dto:authTokenRequestDto xmlns:dto="dto.client.auth.soft.com">
<login>support@soft.com</login>
<password>${password}</password>
</dto:authTokenRequestDto>
"""
