Python 3: using an f-string with a regex in re.findall

I am trying to use re.findall to extract a piece of information from a large HTML file. The information I need is 0000-01-02-03-04-asdasdad, which sits inside a string like https://xxx/yyy/0000-01-02-03-04-asdasdad/index.css
How can I do it?
def func(html: str):
    import re
    pattern = r'\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\w+'
    s = fr'https://xxx/yyy/{pattern}/index.css'
    return re.findall(s, html)
# error: TypeError: cannot use a string pattern on a bytes-like object
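That TypeError usually means the html argument is a bytes object (for example, the raw result of reading an HTTP response) while the pattern is a str. Below is a minimal sketch of one way around it, assuming the page is UTF-8 encoded; the name find_ids, the capture group, and the sample input are illustrative and not part of the original code.

```python
import re

# Same ID pattern as in the question, kept in its own variable so its
# braces are not interpreted by the f-string below.
ID_PATTERN = r'\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\w+'

# The parentheses make findall return only the ID, not the whole URL.
URL_PATTERN = re.compile(fr'https://xxx/yyy/({ID_PATTERN})/index\.css')

def find_ids(html):
    """Return every ID found in html, accepting either str or bytes."""
    if isinstance(html, (bytes, bytearray)):
        # Assumption: the document is UTF-8; undecodable bytes are replaced.
        html = html.decode('utf-8', errors='replace')
    return URL_PATTERN.findall(html)

# Hypothetical sample input, passed as bytes to mimic the failing case.
sample = b'<link href="https://xxx/yyy/0000-01-02-03-04-asdasdad/index.css">'
print(find_ids(sample))
```

The alternative is to keep the input as bytes and build a bytes pattern instead, but f-strings cannot be combined with the b prefix, so decoding first tends to be simpler.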

Related

Stripping text in Scrapy

I'm trying to run a spider to extract real estate advertisement information.
My code:
import scrapy
from ..items import RealestateItem
class AddSpider(scrapy.Spider):
    name = 'Add'
    start_urls = ['https://www.exampleurl.com/2-bedroom-apartment-downtown-4154251/']
    def parse(self, response):
        items = RealestateItem()
        whole_page = response.css('body')
        for item in whole_page:
            Title = response.css(".obj-header-text::text").extract()
            items['Title'] = Title
            yield items
After running in console:
scrapy crawl Add -o Data.csv
In .csv file I get
['\n 2-bedroom-apartment ']
Tried adding strip method to function:
Title = response.css(".obj-header-text::text").extract().strip()
But scrapy returns:
Title = response.css(".obj-header-text::text").extract().strip()
AttributeError: 'list' object has no attribute 'strip'
Is there an easy way to make Scrapy write just the following to the .csv file?
2-bedroom-apartment
AttributeError: 'list' object has no attribute 'strip'
You get this error because .extract() returns a list, and .strip() is a method of string.
If that selector always returns ONE item, you could use .get() (or extract_first()) instead of .extract(); this returns the first item as a string instead of a list.
If you need it to return a list, you can loop through the list, calling strip on each item:
title = response.css(".obj-header-text::text").extract()
title = [item.strip() for item in title]
You can also use an XPath selector instead of a CSS selector; that way you can use normalize-space to strip the whitespace.
title = response.xpath('normalize-space(.//*[@class="obj-header-text"]/text())').extract()
This XPath may need some adjustment; since you didn't post the page source, I couldn't check it.
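The list-versus-string distinction can be checked without running a spider. The sketch below uses plain Python and assumes the list from the question's CSV output as a stand-in for what .extract() returned; the variable names are illustrative.

```python
# Stand-in for what response.css(...).extract() returned in the question.
extracted = ['\n 2-bedroom-apartment ']

# .extract() gives a list, so strip each element rather than the list itself.
titles = [text.strip() for text in extracted]

# Taking only the first match (the value .get() would hand back) and
# stripping it; None when nothing matched.
first = extracted[0].strip() if extracted else None

print(titles)
print(first)
```

Storing the single stripped string in the item, rather than the list, is what should keep the brackets and quotes out of the exported .csv.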

Discord.py random choice bug

Code:
# Random Choice
@client.command(aliases=["rand_c"])
async def random_choice(ctx, python_list):
    await ctx.send(random.choice(python_list))
Weird error when I type a proper Python list (["Cats", "Dogs", "No pet"]):
discord.ext.commands.errors.UnexpectedQuoteError: Unexpected quote mark, '"', in non-quoted string
It works fine in regular Python, but why not in discord.py?
All of the inputs to your commands are initially treated as strings. You need to provide a converter function to tell the command what to do with that string:
from ast import literal_eval
@client.command(aliases=["rand_c"])
async def random_choice(ctx, *, python_list: literal_eval):
    await ctx.send(str(python_list))
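What the converter does can be tried outside the bot. This is a small sketch in plain Python, assuming the argument arrives as the single string typed after the command name; raw_argument, options and pick are illustrative names.

```python
import random
from ast import literal_eval

# The text a user would type after the command name; it is still a str here.
raw_argument = '["Cats", "Dogs", "No pet"]'

# literal_eval safely parses Python literals (lists, strings, numbers, ...)
# without executing arbitrary code, unlike eval.
options = literal_eval(raw_argument)

# Once the argument is a real list, random.choice works as in regular Python.
pick = random.choice(options)
print(pick)
```

Inside the command, the same random.choice call could then be applied to python_list before sending the result.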

Need to remove unwanted symbols from value using replaceAll method

I am getting the value of a field as ["value"]
I want to print only the value, removing the [ and " characters from the result.
That looks like a JSON array of Strings? No idea, as you don't provide any context, but you could do:
import groovy.json.JsonSlurper
def valueField = '["value"]'
def result = new JsonSlurper().parseText(valueField).head()
println result
Prints value
The following script should be what you need:
def str = '["value"]'
// Strip the square brackets and the double quotes, leaving only: value
println(str.replaceAll(/[\[\]"]/, ''))

What does .read() actually do, and what gets returned when .read() is not used?

I know how to use it, but I noticed that if I don't call .read(), I get something different.
It can spit out <_io.TextIOWrapper name='story.txt' mode='r' encoding='UTF-8'>
when using text files.
It can also return <http.client.HTTPResponse object at 0x76521550>
when using urlopen from urllib.
What do those things mean and what does .read() actually do?
Those are "file-like objects" that have a .read() method and you are seeing the repr() of the object, which is a description string. When you call .read() it reads the complete contents from the object, usually as a byte or Unicode string.
A small, custom example:
class Demo:
    def __repr__(self):
        return '<My Custom Description>'
    def read(self):
        return 'some stuff'
x = Demo()
print(x)
print(x.read())
Output:
<My Custom Description>
some stuff
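The same behaviour can be seen with a real file-like object from the standard library. The sketch below uses io.StringIO as an in-memory stand-in for an opened text file, which is an assumption made so the example runs without story.txt; the variable names are illustrative.

```python
import io

# In-memory stand-in for an opened text file.
stream = io.StringIO("once upon a time")

description = repr(stream)   # text describing the object, not its contents
contents = stream.read()     # the actual data held by the object
second_read = stream.read()  # the stream is now at its end, so nothing is left

print(description)
print(contents)
print(repr(second_read))
```

The second read returning an empty string shows that .read() consumes the stream from the current position rather than simply returning a stored value.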

Is there a way to declare a Groovy string format in a variable?

I currently have a fixed format for an asset management code, which uses the Groovy string format using the dollar sign:
def code = "ITN${departmentNumber}${randomString}"
Which will generate a code that looks like:
ITN120AHKXNMUHKL
However, I have a new requirement that the code format must be customizable. I'd like to expose this functionality by allowing the user to set a custom format string such as:
OCP${departmentNumber}XI${randomString}
PAN-${randomString}
Which will output:
OCP125XIBQHNKLAPICH
PAN-XJKLBPPJKLXHNJ
Which Groovy will then interpret and replace with the appropriate variable value. Is this possible, or do I have to manually parse the placeholders and manually do the string.replace?
I believe that GString lazy evaluation fits the bill:
deptNum = "C001"
randomStr = "wot"
def code = "ITN${deptNum}${->randomStr}"
assert code == "ITNC001wot"
randomStr = "qwop"
assert code == "ITNC001qwop"
I think the original poster wants to use a variable as the format string. String interpolation only works when the format is a string literal; it seems it has to be translated into lower-level String.format code at compile time. I ended up using sprintf.
Here baseUrl is a String containing http://example.com/foo/%s/%s, loaded from a property file:
def operation = "tickle"
def target = "dog"
def url = sprintf(baseUrl, operation, target)
url
===> http://example.com/foo/tickle/dog
I believe in this case you do not need lazy evaluation of a GString; the normal String.format() of Java would do the trick:
def format = 'ITN%sX%s'
def code = { def departmentNumber, def randomString -> String.format(format, departmentNumber, randomString) }
assert code('120AHK', 'NMUHKL') == 'ITN120AHKXNMUHKL'
format = 'OCP%sXI%s'
assert code('120AHK', 'NMUHKL') == 'OCP120AHKXINMUHKL'
Hope this helps.
For a triple double-quoted string:
def password = "30+"
def authRequestBody = """
<dto:authTokenRequestDto xmlns:dto="dto.client.auth.soft.com">
<login>support@soft.com</login>
<password>${password}</password>
</dto:authTokenRequestDto>
"""
