Use HTML Table without headers - tabulator

I have a table that doesn't follow the normal guidance for HTML tables.
My best move would be to just create a proper JSON object and use that.
But I'd like to ask whether there are any options for parsing an HTML table "without headers" and defining the headers in Tabulator instead.
I know the case is odd, but I'd just like to hear :-)
Example where there is no thead or th in the HTML source:
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr height="16">
<td colspan="16">
Something
</td>
<td colspan="16">
14
</td>
<td colspan="16">
2020-01-28
</td>
</tr>
</tbody>
</table>

I'm afraid not.
When Tabulator is built on a table element, it parses the HTML to create a JavaScript object for each row of the table, using the column headers as property names.
Without headers it would have no reasonable way to map the column values onto an object.
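For the "create a proper JSON object" route mentioned in the question, a minimal sketch in Python (not Tabulator code) might look like the following. The field names `name`, `count`, and `date` are assumptions made up for illustration, since the source table has nothing to derive them from; the resulting records could then be passed to Tabulator as data, with the columns defined in JavaScript.

```python
import json
from html.parser import HTMLParser

# Hypothetical field names; the source table has no header row to take them from.
FIELDS = ["name", "count", "date"]


class RowCollector(HTMLParser):
    """Collects the stripped text of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html, fields=FIELDS):
    """Map each row's cells onto the given field names, in order."""
    parser = RowCollector()
    parser.feed(html)
    return [dict(zip(fields, row)) for row in parser.rows]


html = """
<table><tbody><tr height="16">
<td colspan="16">Something</td>
<td colspan="16">14</td>
<td colspan="16">2020-01-28</td>
</tr></tbody></table>
"""
records = table_to_records(html)
print(json.dumps(records))
```

The mapping is purely positional, so it only holds as long as every row has its cells in the same order.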

I solved the problem like this, with a hidden first row of empty cells:
<table border="0" cellpadding="0" cellspacing="0">
<tr height="16" hidden>
<td colspan="16">
</td>
<td colspan="16">
</td>
<td colspan="16">
</td>
</tr>
<tr height="16">
<td colspan="16">
Something
</td>
<td colspan="16">
14
</td>
<td colspan="16">
2020-01-28
</td>
</tr>
</table>

Related

beautifulsoup for looping and getting text and Href

I'm in a bit of a pinch here:
It's a really messy ASP site that I am trying to get data from.
I'm trying to use a for loop to get the href and the text of every row in the 4th table on the site, so I first did:
table = soup.findAll('table')[3]
Then from this table I need to get all the text inside the <tr> tags and the hrefs of the <a> tags inside them.
I tried something like this:
for product in table.findAll('tbody'):
    product_title = product.find('tr').text
    product_link = product.find('a')['href']
    print(product_title, product_link)
But I get nothing in return.
The table I'm working on:
<tr bgcolor="#EFEFEF">
<td>
<a href="free.asp?detail=hide&c_id=4342141">
<img align="absmiddle" border="0" hspace="0" src="pic/bullet.gif" vspace="0"/>
</a>
</td>
<td>
4342141
</td>
<td width="10">
</td>
<td>
25.07.2018 09:00
</td>
<td width="10">
</td>
<td>
Ankara
</td>
<td width="10">
-
</td>
<td>
Konya
</td>
<td colspan="2">
</td>
</tr>
<tr bgcolor="#EFEFEF" height="3">
<td colspan="10">
</td>
</tr>
<tr bgcolor="#FFFFFF" height="1">
<td colspan="10">
</td>
</tr>
<tr bgcolor="#DDDDDD" height="6">
<td colspan="10">
</td>
</tr>
<tr bgcolor="#FFFFFF" height="1">
<td colspan="10">
</td>
</tr>
<tr bgcolor="#DEE3E7" height="3">
<td colspan="10">
</td>
</tr>
<tr bgcolor="#DEE3E7">
<td>
<a href="free.asp?detail=hide&c_id=4134123">
<img align="absmiddle" border="0" hspace="0" src="pic/bullet.gif" vspace="0"/>
</a>
</td>
<td>
4134123
</td>
<td width="10">
</td>
<td>
26.07.2018 09:00
</td>
<td width="10">
</td>
<td>
Van
</td>
<td width="10">
-
</td>
<td>
Istanbul
</td>
<td colspan="2">
</td>
</tr>
Instead of extracting text from the table's tbody, you can get all the tr tags directly.
Based on your snippet, you can use the following code to extract the data from the table.
from bs4 import BeautifulSoup

# `text` holds the HTML of the page
soup = BeautifulSoup(text, 'html.parser')
all_products = []
for tr in soup.find_all('tr'):
    # Spacer rows have no text and are skipped
    row_text = tr.get_text(separator=' ', strip=True)
    if row_text:
        a_tag = tr.find('a')
        if a_tag:
            product_link = a_tag.attrs['href']
            all_text = row_text + ' ' + product_link
            all_products.append(all_text.split(' '))
print(all_products)
Output is:
[['4342141', '25.07.2018', '09:00', 'Ankara', '-', 'Konya', 'free.asp?detail=hide&c_id=4342141'], ['4134123', '26.07.2018', '09:00', 'Van', '-', 'Istanbul', 'free.asp?detail=hide&c_id=4134123']]
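Note that splitting the row text on spaces also breaks "25.07.2018 09:00" into two items, and would do the same to any multi-word value. A variant that keeps each cell as one value and stores the link separately could look like the sketch below. It assumes the `bs4` package is installed; the helper name `extract_rows` and the trimmed sample markup are made up for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed sample: one data row and one spacer row from the question.
html = """
<table>
<tr bgcolor="#EFEFEF">
<td><a href="free.asp?detail=hide&c_id=4342141"><img src="pic/bullet.gif"/></a></td>
<td>4342141</td><td width="10"></td><td>25.07.2018 09:00</td>
<td width="10"></td><td>Ankara</td><td width="10">-</td><td>Konya</td>
</tr>
<tr bgcolor="#EFEFEF" height="3"><td colspan="10"></td></tr>
</table>
"""


def extract_rows(markup):
    """Return one dict per row that contains a link; spacer rows are skipped."""
    soup = BeautifulSoup(markup, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        a_tag = tr.find("a", href=True)
        if a_tag is None:
            continue
        # Keep each non-empty cell as a single value.
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append({"cells": [c for c in cells if c], "href": a_tag["href"]})
    return rows


rows = extract_rows(html)
print(rows)
```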

Highlighting Kit/Package components on a picking ticket

I have a customer wanting kit components to be highlighted or specially formatted on a Picking Ticket with an advanced PDF.
I can format the parent with no problem (a custom column field indicating a Kit item, then using <#if> on the form to change to bold), but the customer wants the components to display in a different background colour.
But I can't find a field or criterion to tell the template whether the item in question is a KIT COMPONENT.
Does anyone know how I can achieve this?
Cheers
Steve
Just for those who may be interested, I found a way to differentiate between kit parent and kit component on a Picking Ticket Advanced PDF Template.
Firstly, create a transaction column field. Check Box; Stored Value; Sale Item/IF;HIDDEN;Default is CHECKED.
During sales order entry, this field will be "checked" by default. However, as the kit components do not appear on the sales order entry, they will not inherit the default value and thus will remain NULL.
In the advanced PDF template, I did the following:
<#assign committed="${item.quantitycommitted}"/><#if committed="0"><#assign committed=''/></#if>
<#if item.custcol_notcomponent='T'>
<#if item.custcolitemtype="Kit/Package"><tr style="font-weight:bold">
<#else><tr style="font-weight:normal"></#if>
<td width="15%" class="item" font-size="7pt">${item.item}</td>
<td width="20%" class="item" align="center">${item.binnumber}</td>
<td width="40%" class="item">${item.description}</td>
<td width="8%" class="item" align="center">${item.quantity}</td>
<td width="8%" class="item" align="center">${committed}</td>
<td width="8%" class="item"> </td>
<td width="9%" class="item"> </td>
</tr>
<#else>
<tr>
<td width="15%" class="kititem" font-size="7pt"> ${item.item}</td>
<td width="20%" class="kititem" align="center">${item.binnumber}</td>
<td width="40%" class="kititem">${item.description}</td>
<td width="8%" class="kititem" align="center">${item.quantity}</td>
<td width="8%" class="kititem" align="center">${committed}</td>
<td width="8%" class="kititem"> </td>
<td width="9%" class="kititem"> </td>
</tr>
</#if>
The result is a picking ticket with bold kit parents, greyed and indented kit components, and just regular black text for standard inventory items.
Something like this for the transaction lines:
Hope this helps someone out sometime :)
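The branching in the template above can be summarised in a small Python sketch. This is illustrative only: NetSuite advanced PDF templates are written in FreeMarker, and the function name, return labels, and sample item names here are hypothetical.

```python
def classify_line(not_component, item_type):
    """Mirror the template's branching for one picking-ticket line.

    not_component: value of the custom checkbox column ('T' when checked;
                   kit components never receive the default, so it is empty).
    item_type:     item type string, e.g. 'Kit/Package'.
    """
    if not_component == "T":
        if item_type == "Kit/Package":
            return "kit_parent"    # rendered bold
        return "standard"          # rendered with normal weight
    return "kit_component"         # rendered with the 'kititem' class


# Hypothetical sample lines, in the order they might appear on a ticket.
lines = [
    {"item": "KIT-1", "not_component": "T", "item_type": "Kit/Package"},
    {"item": "PART-A", "not_component": None, "item_type": "Inventory Item"},
    {"item": "WIDGET", "not_component": "T", "item_type": "Inventory Item"},
]
labels = [classify_line(l["not_component"], l["item_type"]) for l in lines]
print(labels)
```

The whole approach rests on the checkbox being empty for components, so anything that sets it on those lines would make them fall into the first branch.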
I tried your solution but it won't indent. Looks like it doesn't go to the <#else> part.
<#if item.custcol_f5_not_component = 'T'>
<#if item.custcol_item_type="Kit/Package"><tr style="font-weight:bold"><#else><tr style="font-weight:normal"></#if>
<td colspan="5">${item.custcol_f5_item}</td>
<td colspan="7">${item.description}</td>
<td colspan="2">${item.custcol_skl_bin_location}</td>
<td align="center" colspan="3">${item.quantity}</td>
<td align="center" colspan="2">${item.units}</td>
<td align="center" colspan="3"><b>${item.quantitycommitted}</b></td>
<td align="center" colspan="3">${item.quantitybackordered}</td>
</tr>
<#else>
<tr>
<td colspan="5" style="padding-left:10px;">${item.custcol_f5_item}</td>
<td colspan="7">${item.description}</td>
<td colspan="2">${item.custcol_skl_bin_location}</td>
<td align="center" colspan="3">${item.quantity}</td>
<td align="center" colspan="2">${item.units}</td>
<td align="center" colspan="3"><b>${item.quantitycommitted}</b></td>
<td align="center" colspan="3">${item.quantitybackordered}</td>
</tr>
</#if>

How to pass product TVs to SimpleCart's scGetCart snippet?

I need some TVs (weight, dimensions, etc) I've associated with my products to appear in the Cart page of my SimpleCart site.
Problem is I have no idea how to do this. I don't understand how the SimpleCart cart is built and there isn't documentation for this.
Would anyone know how I can show TVs associated with each product in the cart output chunk?
The cart snippet has the following code which gets data from the cart and puts it into Chunks:
$sc = $modx->getService('simplecart','SimpleCart',$modx->getOption('simplecart.core_path',null,$modx->getOption('core_path').'components/simplecart/').'model/simplecart/',$scriptProperties);
if (!($sc instanceof SimpleCart)) return '';
 
$controller = $sc->loadController('Cart');
$output = $controller->run($scriptProperties);
The output Chunk looks like:
<div id="simplecart">
<form action="[[~[[*id]]]]" method="post" id="form_cartoverview">
<input type="hidden" name="updatecart" value="true" />
<table>
<tr>
<th class="desc">[[%simplecart.cart.description]]</th>
<th class="price">[[%simplecart.cart.price]]</th>
<th class="quantity">[[%simplecart.cart.quantity]]</th>
[[+cart.total.vat_total:notempty=`<th class="quantity">[[%simplecart.cart.vat]]</th>`:isempty=``]]
<th class="subtotal">[[%simplecart.cart.subtotal]]</th>
<th> </th>
</tr>
[[+cart.wrapper]]
[[+cart.total.discount:notempty=`<tr class="total first discount">
<td colspan="[[+cart.total.vat_total:notempty=`3`:isempty=`2`]]"> </td>
<td class="label">[[%simplecart.cart.discount]]</td>
<td class="value">- [[+cart.total.discount_formatted]]</td>
<td class="extra">[[+cart.total.discount_percent:notempty=`([[+cart.total.discount_percent]]%)`:isempty=` `]]</td>
</tr>`:isempty=``]]
[[+cart.total.vat_total:notempty=`
<tr class="total [[+cart.total.discount:notempty=`second`:isempty=`first`]]">
<td colspan="3"> </td>
<td class="label">[[%simplecart.cart.total_ex_vat]]</td>
<td class="value">[[+cart.total.price_ex_vat_formatted]]</td>
<td class="extra"> </td>
</tr>
[[+cart.vat_rates]]
<tr class="total [[+cart.total.discount:notempty=`third`:isempty=`second`]]">
<td colspan="3"> </td>
<td class="label">[[%simplecart.cart.total_vat]]</td>
<td class="value">[[+cart.total.vat_total_formatted]]</td>
<td class="extra"> </td>
</tr>
<tr class="total [[+cart.total.discount:notempty=`fourth`:isempty=`third`]]">
<td colspan="3"> </td>
<td class="label">[[%simplecart.cart.total_in_vat]]</td>
<td class="value">[[+cart.total.price_formatted]]</td>
<td class="extra"> </td>
</tr>
`:isempty=`
<tr class="total [[+cart.total.discount:notempty=`second`:isempty=`first`]]">
<td colspan="2"> </td>
<td class="label">[[%simplecart.cart.total]]</td>
<td class="value">[[+cart.total.price_formatted]]</td>
<td class="extra"> </td>
</tr>
`]]
</table>
<div class="submit">
<input type="submit" value="[[%simplecart.cart.update]]" />
</div>
</form>
This does appear to be documented:
Product Options (TVs)
and to output them:
Modifying the Product Template
It appears that you would just output them normally: [[*myProductOptions]]
However, since your template is using placeholders, I would also try
[[+cart.myProductOptions]]. If all else fails, you might try debugging the SimpleCart class and dumping the array of product data before it populates the chunk; there might be a clue in there.
Found (through trial and error) you must use:
[[+product.tv.name_of_tv]]

Getting data from Website back to excel

I'm trying to automate a page-scrape program in Excel using VBA, but I'm having difficulty getting the results from the webpage because the fields I want do not have ids. I have copied the source code below; I think the data is contained within a table. How do you get the data using the td class and the span class?
<table>
<tbody>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Date of Liability</span></td>
<td class="vehicledetailstableright"><span class="bodytext">01 07 2014</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Date of First Registration</span></td>
<td class="vehicledetailstableright"><span class="bodytext">02 07 2013</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Year of Manufacture</span></td>
<td class="vehicledetailstableright"><span class="bodytext">2013</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Cylinder Capacity (cc)</span></td>
<td class="vehicledetailstableright"><span class="bodytext">2993cc</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">CO₂ Emissions</span></td>
<td class="vehicledetailstableright"><span class="bodytext">129 g/km</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Fuel Type</span></td>
<td class="vehicledetailstableright"><span id="fueltype" class="bodytext">HEAVY OIL</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Export Marker</span></td>
<td class="vehicledetailstableright"><span id="exportmarker" class="bodytext">N</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Vehicle Status</span></td>
<td class="vehicledetailstableright"><span id="vehiclelicencestatus" class="bodytext">Licence Not Due</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Vehicle Colour</span></td>
<td class="vehicledetailstableright"><span id="colour" class="bodytext">BLUE</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Vehicle Type Approval</span></td>
<td class="vehicledetailstableright"><span class="bodytext">M1</span></td>
</tr>
<tr>
<td class="vehicledetailstableleft"><span class="bodytextbold">Date of Last V5C Issued</span>
</td>
<td class="vehicledetailstableright"><span class="bodytext">No Result Found</span>
</td>
</tr>
Tim is suggesting a code-heavy way to do it, and it is technically correct. I suggest the same thing repeatedly here:
VBA spliting results from html imported table into excel
Basically, use the macro recorder, and then create an HTML query for the data.
See my blog post on this as well.
http://automatic-office.com/?p=344
Many ways to skin the cat, but this is the easy way.
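To illustrate what selecting by class amounts to for the markup in the question, here is a minimal sketch in Python rather than VBA. It pairs each `vehicledetailstableleft` cell with the `vehicledetailstableright` cell that follows it; the class name `DetailsParser` is made up for illustration, and the sample is a trimmed copy of the table above.

```python
from html.parser import HTMLParser


class DetailsParser(HTMLParser):
    """Pairs each 'left' label cell with the 'right' value cell after it."""

    LEFT = "vehicledetailstableleft"
    RIGHT = "vehicledetailstableright"

    def __init__(self):
        super().__init__()
        self.details = {}
        self._side = None    # class of the cell currently being read, if any
        self._text = []
        self._label = None   # last label seen, waiting for its value

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            css = dict(attrs).get("class")
            if css in (self.LEFT, self.RIGHT):
                self._side = css
                self._text = []

    def handle_data(self, data):
        if self._side is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "td" or self._side is None:
            return
        value = "".join(self._text).strip()
        if self._side == self.LEFT:
            self._label = value
        elif self._label is not None:
            self.details[self._label] = value
            self._label = None
        self._side = None


html = """
<table><tbody>
<tr><td class="vehicledetailstableleft"><span class="bodytextbold">Year of Manufacture</span></td>
<td class="vehicledetailstableright"><span class="bodytext">2013</span></td></tr>
<tr><td class="vehicledetailstableleft"><span class="bodytextbold">Fuel Type</span></td>
<td class="vehicledetailstableright"><span id="fueltype" class="bodytext">HEAVY OIL</span></td></tr>
</tbody></table>
"""
parser = DetailsParser()
parser.feed(html)
details = parser.details
print(details)
```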

Using watir to click table element without id

I have a dynamic table in which I would like to use Watir to click an edit button. It doesn't have a traditional Watir identifier that I can select (example: browser.img(:title, "iconname")), and there may be multiple edit icons to choose from. In the past I would get the correct element by querying the database. However, this time there is no database entry to help me select the correct edit link.
In the code below, what I am trying to select is in the section that shows "autogenerated3". I am trying to select either the "onclick" element or the "img src".
Both are selectable items that will click the edit icon.
<div id="certificate_table" style="margin-bottom: 1em">
<table cellspacing="0">
<tbody>
<tr>
<th>Alias</th>
<th>Common Name</th>
<th class="centered">Status</th>
<th class="centered">In Use</th>
<th class="centered">Issued</th>
<th class="centered">Expires</th>
<th class="centered">Days</th>
<th>Actions</th>
</tr>
<tr class="normal">
<td>autogenerated2</td>
<td>default.domain.com</td>
<td class="centered"> Revoked </td>
<td class="centered">
<td class="centered"> 10/18/2013 19:46:34 GMT </td>
<td class="centered"> 10/17/2016 19:46:34 GMT </td>
<td class="centered">N/A</td>
<td>
<a onclick="new Ajax.Request('/media/certificates/edit_certificate/3', {asynchronous:true, evalScripts:true}); return false;" href="#">
<img title="Edit" src="/media/images/icons/edit.gif?1276876449" alt="Edit">
</a>
</td>
</tr>
<tr class="alt">
<td>autogenerated3</td>
<td>autogenerated3.domain.com</td>
<td class="centered"> CSR Issued </td>
<td class="centered">
<td class="centered"> 10/18/2013 20:54:55 GMT </td>
<td class="centered"> 10/17/2016 20:54:55 GMT </td>
<td class="centered">1092 </td>
<td>
<a onclick="new Ajax.Request('/media/certificates/edit_certificate/4', {asynchronous:true, evalScripts:true}); return false;" href="#">
<img title="Edit" src="/media/images/icons/edit.gif?1276876449" alt="Edit">
</a>
<a onclick="new Ajax.Request('/media/certificates/generate_csr/4', {asynchronous:true, evalScripts:true}); return false;" href="#">
<a onclick="new Ajax.Request('/media/certificates/import_certificate_reply/4', {asynchronous:true, evalScripts:true}); return false;" href="#">
<a onclick="if (confirm('Are you sure you want to revoke this certificate?')) { new Ajax.Request('/media/certificates/revoke_certificate/4', {asynchronous:true, evalScripts:true}); }; return false;" href="#">
</td>
</tr>
<tr class="normal">
<td>Original Certificate</td>
<td>localhost.localdomain</td>
<td class="centered"> Self Signed </td>
<td class="centered">
<td class="centered"> 10/03/2013 22:37:02 GMT </td>
<td class="centered"> 10/03/2014 22:37:02 GMT </td>
<td class="centered">347 </td>
<td>
<a onclick="new Ajax.Request('/media/certificates/edit_certificate/1', {asynchronous:true, evalScripts:true}); return false;" href="#">
<img title="Edit" src="/media/images/icons/edit.gif?1276876449" alt="Edit">
</a>
</td>
</tr>
<tr class="alt">
<td>vhost4</td>
<td>vhost4.domain.com</td>
<td class="centered"> Revoked </td>
<td class="centered">
<td class="centered"> 10/18/2013 15:58:01 GMT </td>
<td class="centered"> 10/17/2016 15:58:01 GMT </td>
<td class="centered">N/A</td>
<td>
<a onclick="new Ajax.Request('/media/certificates/edit_certificate/2', {asynchronous:true, evalScripts:true}); return false;" href="#">
<img title="Edit" src="/media/images/icons/edit.gif?1276876449" alt="Edit">
</a>
</td>
</tr>
</tbody>
</table>
I don't have trouble selecting an icon, just trouble selecting the correct one. Both the onclick and image values are selectable items. I saw a post suggesting .last.click, which does select the last icon in the list. Unfortunately, the table lists the data in alphabetical order by Alias name, so the icon I need may not be the last item in the list and I cannot use this method. Suggestions?
b.div(:id, "certificate_table").imgs(:src => "/media/images/icons/edit.gif?1276876449").last.when_present.click
It sounds like you need to find an element based on its siblings, so you might find this blog post useful. The post gives two options you might consider.
If the text you are looking for is unique - ie the row that has the text is definitely going to be the right row, you can find the td, go to the parent row and then get the link.
b.td(:text => 'autogenerated3').parent.link.click
If you need to ensure that the text is in the first column, then you can do:
b.trs.find{ |tr|
  tr.td.exists? and tr.td(:index => 0).text == 'autogenerated3'
}.link.click
The tr.td.exists? is added since some of your rows do not have any tds, which would cause an exception when checking the second criteria.
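The same "find the row by its first cell, then take the link in that row" idea can be sketched outside Watir, for example in Python against static markup. This is not Watir code; the table below is a trimmed, well-formed stand-in for the one in the question, and the shortened `onclick` values and the helper name `find_row_link` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the table from the question.
TABLE = """
<table>
  <tr><th>Alias</th><th>Actions</th></tr>
  <tr><td>autogenerated2</td><td><a href="#" onclick="edit_certificate/3">Edit</a></td></tr>
  <tr><td>autogenerated3</td><td><a href="#" onclick="edit_certificate/4">Edit</a></td></tr>
</table>
"""


def find_row_link(table, alias):
    """Return the first <a> in the row whose first <td> text equals alias."""
    for tr in table.iter("tr"):
        first_td = tr.find("td")
        # Header rows have no <td>; skip them instead of failing.
        if first_td is None:
            continue
        if (first_td.text or "").strip() == alias:
            return tr.find(".//a")
    return None


link = find_row_link(ET.fromstring(TABLE), "autogenerated3")
print(link.get("onclick"))
```

The `first_td is None` check plays the same role as `tr.td.exists?` in the Ruby snippet.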
