How do you add an attribute to your CSS selector to target a specific pagination link? - python-3.x

I just got into Scrapy, and I’m aware this is a beginner question, but how do I add an attribute to my CSS selector to target a specific pagination link?
Here is the HTML with the element I’m targeting.
<div class="pagination">
<a rel="prev" href="/collections/all?page=1" class="fa fa-chevron-left prev pagination-icon"></a>
<ul>
<li class="pagination-icon">
1
</li>
<li class="pagination-icon pagination-icon--current">
2
</li>
<li class="pagination-icon">
3
</li>
<li class="pagination-icon">
4
</li>
<li class="pagination-icon pagination-icon--current">
…
</li>
<li class="pagination-icon">
50
</li>
</ul>
I need to follow the link in this line:
<a rel="next" href="/collections/all?page=3" class="fa fa-chevron-right next pagination-icon"></a>
Here is my Scrapy code:
next_page = response.css('div.pagination a::attr(href)').extract_first()
if next_page is not None:
    yield response.follow(next_page, callback=self.parse)
What’s happening is that it’s following this link instead of the other one, because it is the first link inside the “pagination” div:
<a rel="prev" href="/collections/all?page=1" class="fa fa-chevron-left prev pagination-icon"></a>
I can see two differences between the attributes of the two links, both of which sit inside the “pagination” div:
The rel attribute is different; I need the one with “next”.
The class attribute is different; I need “fa fa-chevron-right next pagination-icon”.
I’m pretty sure I can get the correct link by specifying one of the two attributes listed above in my CSS selector. I tried the following CSS selectors, but none of them worked.
div.pagination a.fa fa-chevron-right next pagination-icon a::attr(href) does not work
a.fa fa-chevron-right next pagination-icon a::attr(href) does not work
a.fa fa-chevron-right next pagination-icon::attr(href) does not work
How can I achieve my goal? Why do none of the CSS selectors I tried work?

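You can't select multiple classes with a single dot. In a CSS selector a space is the descendant combinator, so a.fa fa-chevron-right next pagination-icon looks for a <pagination-icon> element nested inside <next>, <fa-chevron-right>, and a.fa elements, which matches nothing. Either chain the classes, each with its own dot (a.fa.fa-chevron-right.next.pagination-icon), or use the attribute syntax "[class='fa fa-chevron-right next pagination-icon']". Either way, if any of those classes is generated dynamically, the selector will break.
The rel attribute is a simpler hook, so try this and see what happens: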
response.css('div.pagination a[rel="next"]::attr(href)').extract_first()

Related

Prestashop 1.7 attribute groups - check stock and apply css to unavailable combination

The Prestashop option to hide unavailable attributes on the product page doesn't work when attribute groups are used, for example color and size (for clothing shops).
I need to keep showing all possible combinations, but grey out (or strike through) the combinations with no stock.
Like this:
I tried several things.
In Prestashop 1.6 the following piece of code worked to apply css class (.out-of-stock-float-left) to unavailable combinations:
{elseif $group.group_type == 'radio'}
<ul id="group_{$id_attribute_group}">
{foreach from=$group.attributes key=id_attribute item=group_attribute}
{if {$group.attributes_quantity[{$id_attribute|intval}]} > 1} <!-- product in stock -->
<li class="input-container float-left">
<input class="input-radio" type="radio" data-product-attribute="{$id_attribute_group}"
name="group[{$id_attribute_group}]"
value="{$id_attribute}"{if $group_attribute.selected} checked="checked"{/if}>
<span class="radio-label">{$group_attribute.name}</span>
</li>
{else} <!-- product out of stock -->
<li class="input-container out-of-stock-float-left">
<input class="input-radio" type="radio" data-product-attribute="{$id_attribute_group}"
name="group[{$id_attribute_group}]"
value="{$id_attribute}"{if $group_attribute.selected} checked="checked"{/if}>
<span class="radio-label">{$group_attribute.name}</span>
</li>
{/if}
{/foreach}
</ul>
{/if}
When changing combinations there is an ajax request. I don't know how to grey out combinations with no stock and make them not clickable.
Thanks

Why does attribute splitting happen in BeautifulSoup?

I'm trying to get the class attribute of the parent element:
<div class="detailMS__incidentRow incidentRow--away odd">
<div class="time-box">45'</div>
<div class="icon-box soccer-ball-own"><span class="icon soccer-ball-own"> </span></div>
<span class=" note-name">(Autogoal)</span><span class="participant-name">
Reynaldo
</span>
</div>
span_autogoal = soup.find('span', class_='note-name')
print(span_autogoal)
print(span_autogoal.find_parent('div')['class'])
# print(span_autogoal.find_parent('div').get('class'))
Output:
<span class="note-name">(Autogoal)</span>
['detailMS__incidentRow', 'incidentRow--away', 'odd']
I know I can do something like this:
print(' '.join(span_autogoal.find_parent('div')['class']))
But I want to know why this is happening, and whether there is a cleaner way to do it.
The other answer is correct. However, if you want a multi-valued attribute returned as a single string, try re-parsing the parent element with the XML parser after you have found it.
from bs4 import BeautifulSoup
data='''<div class="detailMS__incidentRow incidentRow--away odd">
<div class="time-box">45'</div>
<div class="icon-box soccer-ball-own"><span class="icon soccer-ball-own"> </span></div>
<span class=" note-name">(Autogoal)</span><span class="participant-name">
Reynaldo
</span>
</div>'''
soup=BeautifulSoup(data,'lxml')
span_autogoal = soup.find('span', class_='note-name')
print(span_autogoal)
parentdiv=span_autogoal.find_parent('div')
data=str(parentdiv)
soup=BeautifulSoup(data,'xml')
print(soup.div['class'])
Output on console:
<span class="note-name">(Autogoal)</span>
detailMS__incidentRow incidentRow--away odd
According to the BeautifulSoup documentation:
HTML 4 defines a few attributes that can have multiple values. HTML 5
removes a couple of them, but defines a few more. The most common
multi-valued attribute is class (that is, a tag can have more than one
CSS class). Others include rel, rev, accept-charset, headers, and
accesskey. Beautiful Soup presents the value(s) of a multi-valued
attribute as a list:
css_soup = BeautifulSoup('<p class="body"></p>')
css_soup.p['class']
# ["body"]
css_soup = BeautifulSoup('<p class="body strikeout"></p>')
css_soup.p['class']
# ["body", "strikeout"]
So in your case, in <div class="detailMS__incidentRow incidentRow--away odd">, the class attribute is multi-valued.
That's why span_autogoal.find_parent('div')['class'] gives you a list as output.
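As a rough sketch of the options, the snippet below compares the default list result with two ways of getting a single string. It assumes Beautiful Soup 4.8.0 or later (where the multi_valued_attributes constructor argument is available) and uses the built-in html.parser so that lxml is not required; the HTML is a shortened version of the markup from the question.

```python
from bs4 import BeautifulSoup

# Shortened version of the markup from the question.
HTML = (
    '<div class="detailMS__incidentRow incidentRow--away odd">'
    '<span class="note-name">(Autogoal)</span>'
    '</div>'
)

# Default behaviour: class is treated as multi-valued and returned as a list.
soup = BeautifulSoup(HTML, "html.parser")
as_list = soup.div["class"]

# Option 1: join the list back into the original space-separated string.
joined = " ".join(as_list)

# Option 2: disable multi-valued handling, so every attribute is a plain string.
flat_soup = BeautifulSoup(HTML, "html.parser", multi_valued_attributes=None)
as_string = flat_soup.div["class"]

print(as_list)
print(joined)
print(as_string)
```

Joining the list is the smaller change. Turning off multi-valued handling affects every attribute in the parsed document, including class lookups elsewhere in your code.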

How can I add multiple anchors to the same block?

I'm using AsciiDoctor to create an HTML manual. In order to keep existing links valid, I need multiple anchors at the same header.
Basically I want this output:
<a id="historic1"></a>
<a id="historic2"></a>
<h2 id="current">Caption</h2>
While it is possible to create multiple inline anchors like this
Inline [[historic1]] [[historic2]] [[current]] Anchor
Inline <a id="historic1"></a> <a id="historic2"></a> <a id="current"></a> Anchor
it looks like additional anchor macros in front of blocks are simply swallowed:
[[historic1]]
[[historic2]]
[[current]]
== Caption
<h2 id="current">Caption</h2>
So what are my options to have multiple anchors in front of a block?
You can also use the shorthand version of this solution.
[#current]
== [[historic1]][[historic2]]Caption
Now you get all three anchors on the same heading.
The best I could do (tested with Asciidoctor.js 1.5.4):
== anchor:historic1[historic1] anchor:historic2[historic2] anchor:current[current] Caption
Some text
Output:
<h2 id="__a_id_historic1_a_a_id_historic2_a_a_id_current_a_caption"><a id="historic1"></a> <a id="historic2"></a> <a id="current"></a> Caption</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Some text</p>
</div>
</div>
There are two issues:
#840
#1689

ExpressionEngine switch tag working inconsistently

In ExpressionEngine, I'm creating a list with conditionals that is showing some strange behavior. The code below is part of a bigger set:
<li><h4>DERMATOLOGY</h4>
<ul>
{exp:channel:entries channel="specialist" dynamic="no" orderby="sp_order" sort="asc"}
{if sp_specialty == "sp_dermatology"}
<li>
<img src="{sp_headshot}" />
<p>{title}</p>
</li>
{/if}
{/exp:channel:entries}
</ul>
</li>
<li><h4>EMERGENCY AND CRITICAL CARE</h4>
<ul>
{exp:channel:entries channel="specialist" dynamic="no" orderby="sp_order" sort="asc"}
{if sp_specialty == "sp_emergency"}
<li class="{switch='one|two'}">
<img src="{sp_headshot}" />
<p>{title}</p>
</li>
{/if}
{/exp:channel:entries}
</ul>
</li>
What happens, in the case of EMERGENCY AND CRITICAL CARE, is that with the 5 entries I have under that, the classes are returned like this: two, one, one, one, two. Any suggestions on getting the behavior I need?
I see what you mean. The switch variable applies its logic to every entry returned by the entries loop, not just to the ones that pass your conditional. Each loop returns all specialists, the switch alternates across all of them, and the conditional then hides the non-matching ones, which is why the classes on the rendered entries look out of sequence. You could use the search parameter to do the filtering instead, so that each loop returns only the entries you're looking for. Like this:
<li><h4>DERMATOLOGY</h4>
<ul>
{exp:channel:entries channel="specialist" search:sp_specialty="=sp_dermatology" dynamic="no" orderby="sp_order" sort="asc"}
<li>
<img src="{sp_headshot}" />
<p>{title}</p>
</li>
{/exp:channel:entries}
</ul>
</li>
<li><h4>EMERGENCY AND CRITICAL CARE</h4>
<ul>
{exp:channel:entries channel="specialist" search:sp_specialty="=sp_emergency" dynamic="no" orderby="sp_order" sort="asc"}
<li class="{switch='one|two'}">
<img src="{sp_headshot}" />
<p>{title}</p>
</li>
{/exp:channel:entries}
</ul>
</li>
This way each loop returns only the matching entries, which eliminates the need for the conditional and lets the switch variable work as intended: alternating across every entry the loop returns.
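The effect described above can be imitated in plain Python. This is only a simulation of the counting logic, not ExpressionEngine code: the entry order below is hypothetical, chosen so that five emergency entries are interleaved with other specialists, and it assumes (as the answer describes) that the switch advances on every entry the loop returns.

```python
# Hypothetical loop result in sp_order order: five emergency entries
# interleaved with three dermatology entries.
entries = [
    "sp_dermatology",
    "sp_emergency",
    "sp_emergency",
    "sp_dermatology",
    "sp_emergency",
    "sp_dermatology",
    "sp_emergency",
    "sp_emergency",
]

CLASSES = ("one", "two")


def switch_then_filter(entries, specialty):
    """Original template: the switch advances on every entry, the conditional hides some."""
    rendered = []
    for index, entry_specialty in enumerate(entries):
        css_class = CLASSES[index % len(CLASSES)]  # switch counts all entries
        if entry_specialty == specialty:           # {if sp_specialty == "..."}
            rendered.append(css_class)
    return rendered


def filter_then_switch(entries, specialty):
    """Suggested template: the loop returns only matching entries, then the switch alternates."""
    matching = [e for e in entries if e == specialty]  # search:sp_specialty="=..."
    return [CLASSES[index % len(CLASSES)] for index, _ in enumerate(matching)]


print(switch_then_filter(entries, "sp_emergency"))
print(filter_then_switch(entries, "sp_emergency"))
```

With this particular ordering the first function yields the same out-of-sequence pattern reported in the question, while the second alternates cleanly.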

How can I select an element with multiple CSS classes in YUI3?

<ul>
<li class="selected cell">test</li>
<li class="cell">test2</li>
</ul>
How can I select only the element that has both the selected and cell classes?
Y.one('.selected, .cell') <= This selects both li elements, and I just want to select the first one.
Is there something like this?
Y.one('.cell').one('.selected') ???
Use .selected.cell, for example Y.one('.selected.cell').
Notice the lack of space between them.
