Extracting specific values with Postgresql - string

I have a table like this:
Table
<!DOCTYPE html>
<html>
<body>
<table border="1" style="width:100%">
<tr>
<td>email</td>
<td>data</td>
</tr>
<tr>
<td>creator_a#creator.com</td>
<td>"vimeo_profile"=>"", "twitter_profile"=>"", "youtube_profile"=>"", "creator_category"=>"production_company", "facebook_profile"=>"", "linkedin_profile"=>"", "personal_website"=>"", "instagram_profile"=>"", "content_expertise_categories"=>"4,5,8"</td>
</tr>
<tr>
<td>creator_b#creator.com</td>
<td>"twitter_profile"=>"", "creator_category"=>"association", "facebook_profile"=>"", "linkedin_profile"=>"", "personal_website"=>"", "content_expertise_type"=>"image", "content_expertise_categories"=>"4, 6"</td>
</tr>
</table>
</body>
</html>
And I want to query this using PostgreSQL, so I only get the values regarding content_expertise_categories:
Note that the number of values varies. The table has many more entries, so I am looking for a solution that extracts the values regardless of whether there are 2 or 15 of them.
Result
<!DOCTYPE html>
<html>
<body>
<table border="1" style="width:100%">
<tr>
<td>email</td>
<td>data</td>
</tr>
<tr>
<td>creator_a#creator.com</td>
<td>4,5,8</td>
</tr>
<tr>
<td>creator_b#creator.com</td>
<td>4,6</td>
</tr>
</table>
</body>
</html>
I have tried substring but can't get it to work.
Some help would be much appreciated, thanks!

SELECT
email,
(string_to_array(
data::text,'"content_expertise_categories"=>'::text
)
)[2] as data
FROM users
;
Update:
In your example every string lists "content_expertise_categories" last, which is why splitting the string into two pieces is enough. If other key/value pairs can follow it, you'll need an additional split on ',"', taking the [1] element this time.
Remember to cast the "data" column to ::text before passing it to string_to_array, since that function requires a text argument and your column appears to be of a different type.
I believe this query would be more elegant:
select
email,
data->'content_expertise_categories' as data
from h
;
But when I posted the first query, I did not know that you use hstore.
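The splitting approach above depends on where the key sits in the string. As an illustration only, here is a minimal Python sketch of a position-independent alternative: a regular expression that captures the quoted value following the key. This is not PostgreSQL and not the answer's method. The function name `extract_value` is hypothetical, and the sketch assumes values never contain an escaped double quote.

```python
import re

# Sample value shortened from the second row in the question.
data = ('"twitter_profile"=>"", "creator_category"=>"association", '
        '"content_expertise_type"=>"image", '
        '"content_expertise_categories"=>"4, 6"')


def extract_value(hstore_text, key):
    """Return the quoted value stored under key, or None if the key is absent.

    Assumes values contain no escaped double quotes.
    """
    match = re.search(r'"' + re.escape(key) + r'"=>"([^"]*)"', hstore_text)
    return match.group(1) if match else None


categories = extract_value(data, "content_expertise_categories")
# Drop the spaces so "4, 6" matches the "4,6" shown in the desired result.
print(categories.replace(" ", ""))
```

Because the pattern anchors on the key itself, it does not matter whether other pairs come before or after it.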


How to get website generated data to excel [closed]

Do you have any idea how to get this little table from this website to excel?
Normal source code scraping won't work since the results are not stored in the source. Power query doesn't work either...
Edit:
I have tried Power Query. I have some code that downloads data from websites by searching for classes, tags, etc., but all of it searches the source, not the rendered page, so posting that code just to post something is pointless.
I know that getting started with web scraping can be cumbersome and that the volume of information out there can be overwhelming, so I decided to kickstart your efforts in the hope that in future you will at least know where to start.
Inspect the network traffic.
Use your browser's developer tools to inspect the requests being sent when you browse a website. In your case, quite a few requests are sent under the hood when you press search. However, you only need one of them: the XHR request that produces the table as a response.
Inspect the request itself
The request basically consists of a URL, which contains the parameters that you select in the dropdown menus; the headers, which in your case are not essential to the result; and a body, which in your case is empty because all the parameters are contained in the URL.
Inspect the response
The response in your case is HTML. It could have been something else, such as JSON. The data you want is in an HTML table with the ID "quotaTable".
<html>
<head>
<!-- Including version.html for defect CUSTD00035918 Start -->
<meta name="application" content="DDS2-TARIC" />
<meta name="version" content="#REL#" />
<!-- Defect# CUSTD00024730 Start -->
<!-- IPG Rule requires the following 7 metatags in all application pages. Additional metatags e.g. version and application can be added if required by the application. -->
<meta http-equiv="Content-Language" content="en">
<meta name="description" content="DDS2-TARIC Application page">
<meta name="reference" content="DDS2-TARIC Reference">
<meta name="creator" content="DG-TAXUD">
<meta name="classification" content="DDS2-TARIC">
<meta name="keywords" content="DDS2-TARIC, TARIC, DDS2">
<meta name="date" content="">
<!-- Defect# CUSTD00024730 End -->
<!-- Including version.html for defect CUSTD00035918 End -->
</head>
<body style="background-color:#FFFFF0;">
<div id="quotaMarkedUpContainer">
<div class='scroller' id="navigation" align=center>
<table>
<tr>
<td>
</td>
<td>
</td>
</tr>
</table>
</div>
<table id="quotaTable" class="list" width="100%" style="padding-left: 7%; padding-right: 7%;">
<thead>
<tr class="columnHeader">
<th>
Order number
</th>
<th>
Origins
</th>
<th style="text-align: center;">
Start date
</th>
<th style="text-align: center;">
End date
</th>
<th style="text-align: right;">
Balance
</th>
<th/>
</tr>
</thead>
<tr class="oddRow">
<td>
096714
</td>
<td>
<div>
Ukraine
</div>
</td>
<td style="text-align: center;">
01-01-2019
</td>
<td style="text-align: center;">
31-12-2019
</td>
<td style="text-align: right;">
0 Kilogram
</td>
<td>
<a id="quotaLink" href="https://ec.europa.eu/taxation_customs/dds2/taric/quota_tariff_details.jsp?Lang=en&StartDate=2019-01-01&Code=096714" style="color:#3247e8; text-decoration:underline;" class='browse_action_a'>[More info]</a>
</td>
</tr>
</table>
<div class='scroller' id="navigation" align=center>
<table>
<tr>
<td>
</td>
<td>
</td>
</tr>
</table>
</div>
</div>
</body>
</html>
Write the code
For that you will need the following references:
Microsoft WinHTTP Services, version 5.1 (to create and manipulate HTTP requests)
Microsoft HTML Object Library (to manipulate HTML elements)
Here's an example of how to get one of the table's cells:
Option Explicit
Sub getData()
Dim req As New WinHttpRequest
Dim doc As New HTMLDocument
Dim table As HTMLTable
Dim url As String, code As String, year As String, origin As String, status As String, critical As String 'the request's parameters
critical = "" 'you can leave it blank if it's not important to your search
status = "" 'you can leave it blank if it's not important to your search
origin = "UA"
year = "2019"
code = "096714"
url = "https://ec.europa.eu/taxation_customs/dds2/taric/quota_list.jsp?Lang=en&Origin=" & origin & "&Code=" & code & "&Year=" & year & "&Status=" & status & "&Critical=" & critical & "&Expand=true&Offset=0" 'build the URL by concatenating the various parameters
With req
.Open "GET", url, False
.send
doc.body.innerHTML = .responseText 'Assign the HTML response to an HTML document object
'Debug.Print .responseText
End With
Set table = doc.getElementById("quotaTable") 'get the table you're interested in
Debug.Print table.Rows(1).Cells(4).innerText 'print the 5th cell of the 2nd row in the immediate window
End Sub
For demonstration purposes I'm only showing you how to print the contents of one of the table's cells. You can experiment with the above code and modify it to get access to the other elements of the table as well.
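The same steps can be sketched in Python with only the standard library, as an illustration rather than a replacement for the VBA above: build the URL from the parameters, then collect the cells of the table whose ID is "quotaTable". The names `build_url`, `QuotaTableParser`, and `parse_quota_table` are hypothetical. The sample HTML is a shortened copy of the response shown earlier, and the sketch assumes the target table contains no nested tables. Fetching the page is left out, so nothing here touches the network.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

BASE_URL = "https://ec.europa.eu/taxation_customs/dds2/taric/quota_list.jsp"


def build_url(origin, code, year, status="", critical=""):
    """Assemble the request URL from the dropdown parameters."""
    params = {"Lang": "en", "Origin": origin, "Code": code, "Year": year,
              "Status": status, "Critical": critical,
              "Expand": "true", "Offset": "0"}
    return BASE_URL + "?" + urlencode(params)


class QuotaTableParser(HTMLParser):
    """Collect the text of every cell in the table with id="quotaTable"."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_table = False
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == "quotaTable":
            self._in_table = True
        elif self._in_table and tag == "tr":
            self.rows.append([])
        elif self._in_table and tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "table":
            self._in_table = False
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_table and self._in_cell:
            self.rows[-1][-1] += data


def parse_quota_table(html):
    """Return the table as a list of rows, each a list of cleaned cell texts."""
    parser = QuotaTableParser()
    parser.feed(html)
    return [[" ".join(cell.split()) for cell in row] for row in parser.rows]


sample_html = """
<table id="quotaTable" class="list">
<thead><tr><th>Order number</th><th>Origins</th><th>Start date</th>
<th>End date</th><th>Balance</th><th/></tr></thead>
<tr><td>096714</td><td><div>Ukraine</div></td><td>01-01-2019</td>
<td>31-12-2019</td><td>0 Kilogram</td><td>[More info]</td></tr>
</table>
"""

print(build_url("UA", "096714", "2019"))
rows = parse_quota_table(sample_html)
print(rows[1][4])  # 5th cell of the 2nd row, as in the VBA example
```

Returning every row as a list makes it straightforward to write the whole table to a CSV file that Excel can open, instead of printing a single cell.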
I use Chrome, and for me the result is stored in the source. I simply copy the HTML code into an online HTML to CSV converter:
Html to csv online editor
It works for me. If this does not solve your problem, please describe it in more detail.

What does h:dataTable bodyrows attribute mean

Does anybody know what h:dataTable bodyrows means? I tried a simple example, but I don't understand what it's supposed to do.
<h:dataTable bodyrows="d" value="#{index.publishDates}" var="d">
Is this some sort of shortcut for making a table? I don't see any rows because of the bodyrows annotation. If h:column does columns, what does bodyrows do?
I don't understand the documentation.
This must be a comma separated list of integers. Each entry in this list is the row index of the row before which a "tbody" element should be rendered.
In HTML, a <table> can have multiple bodies via <tbody>.
<table>
<tbody>...</tbody>
<tbody>...</tbody>
<tbody>...</tbody>
</table>
By default, a <h:dataTable> generates only one body, like below.
<h:dataTable value="#{[1,2,3,4,5]}" var="i">
<h:column>#{i}</h:column>
</h:dataTable>
<table>
<tbody>
<tr><td>1</td></tr>
<tr><td>2</td></tr>
<tr><td>3</td></tr>
<tr><td>4</td></tr>
<tr><td>5</td></tr>
</tbody>
</table>
The bodyrows attribute can be used to specify a comma-separated string of row indexes, each of which should start a new body.
<h:dataTable value="#{[1,2,3,4,5]}" var="i" bodyrows="0,2,4">
<h:column>#{i}</h:column>
</h:dataTable>
<table>
<tbody>
<tr><td>1</td></tr>
<tr><td>2</td></tr>
</tbody>
<tbody>
<tr><td>3</td></tr>
<tr><td>4</td></tr>
</tbody>
<tbody>
<tr><td>5</td></tr>
</tbody>
</table>
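The grouping rule can be mimicked in a few lines of Python, purely to make the index semantics concrete. This is a sketch of the documented behaviour, not the JSF renderer's code, and the function name `split_into_bodies` is hypothetical. As an assumption of this sketch only, rows that come before the first listed index are kept in a leading group; how JSF renders that case is not covered here.

```python
def split_into_bodies(rows, bodyrows):
    """Group rows into bodies, opening a new body at each listed row index."""
    starts = {int(index) for index in bodyrows.split(",") if index.strip()}
    bodies = []
    for index, row in enumerate(rows):
        # Open a new group at every listed index (and for the very first row).
        if index in starts or not bodies:
            bodies.append([])
        bodies[-1].append(row)
    return bodies


print(split_into_bodies([1, 2, 3, 4, 5], "0,2,4"))  # [[1, 2], [3, 4], [5]]
```

Each inner list corresponds to one <tbody> in the rendered output above.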
See also:
Can we have multiple <tbody> in same <table>?
HTML Dog HTML beginner tutorial - tables
MDN HTML element reference - table

xpages dijit.form.checkbox multiple values

I am using XPages on a Domino 8.5.3 server.
Can a djCheckBox be used with a multi-value field, similar to a checkBoxGroup?
If so, would it be possible to supply a code snippet?
Thanks
dijit.form.CheckBox can deal with only one value, and that's true for djCheckBox too, as it's based on dijit.form.CheckBox.
You could combine several djCheckBox controls and make them look like a checkBoxGroup. Bind every djCheckBox to a viewScope variable initialized from a document item, and write the values back when the document is saved.
Here is an example for UI similarity to checkBoxGroup:
<fieldset
class="xspCheckBox">
<table>
<tbody>
<tr>
<td>
<xe:djCheckBox
label="abcdefg"
id="djCheckBox4"
value="#{viewScope.abcdefg}">
</xe:djCheckBox>
</td>
<td>
<xe:djCheckBox
label="hijklmno"
id="djCheckBox5"
value="#{viewScope.hijklmno}">
</xe:djCheckBox>
</td>
<td>
<xe:djCheckBox
label="pqrstuvwxyz"
id="djCheckBox6"
value="#{viewScope.pqrstuvwxyz}">
</xe:djCheckBox>
</td>
</tr>
</tbody>
</table>
</fieldset>
I am not sure, though, what the reason for your question is and whether it's worth the extra effort.

Adding new xElement after ALL found Descendants

I have an xDocument with multiple different xElements.
I can successfully find a specific xElement by searching via its xAttributes and then Add a new xElement after it using the code below:
xDocument.Descendants("td").LastOrDefault(e => ((string)e.Attribute("ID")) == "3").Add(new XElement("b", "Just a test."));
The problem is that I wish to Add this new xElement after all found instances of the Descendants, not just LastOrDefault or FirstOrDefault.
My xDocument is created dynamically, and there is no way to know beforehand how many 'td' xElements with 'ID' = '3' there are going to be.
Any help would be appreciated.
Thanks
ADDED CODE AS REQUESTED
<html> .... etc....
<body>
<table>
<tr>
<td>Image</td>
<td>Description</td>
<td>Date</td>
</tr>
<tr>
<td ID="1">*.jpg</td>
<td ID="2">some image</td>
<td ID="3">01/01/1901</td> <!-- CHANGING THIS PART OF CODE -->
<!-- THIS TABLE ROW REPEATS AN UNDETERMINED NUMBER OF TIMES RELATING TO THE NUMBER OF FILES CONTAINED IN WHATEVER DIRECTORY IS BEING SEARCHED USING A FOREACH LOOP IN ANOTHER PART OF THE CODE -->
</tr>
</table>
</body>
</html>
So I am trying to add a <b> tag inside the <td> with ID = 3.
This <b> tag also contains a string variable, i.e.
new XElement("b", DateTaken)
and needs to be created at runtime rather than hard-coded, as it relates to each loaded image at the start of the table row.
So I am trying to add this <b> tag to every occurrence of <td> with ID=3, not just the first or the last.
Hope this extra info helps.
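The general idea is to loop over every matching element instead of picking a single one with FirstOrDefault or LastOrDefault. A minimal sketch of that loop follows, written in Python with xml.etree.ElementTree rather than LINQ to XML, so it illustrates the pattern only. The function name `add_bold_to_matches` and the sample markup are hypothetical. In C#, the analogous step would presumably be a foreach over the filtered Descendants("td") sequence, calling Add on each element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with two repeated rows, modelled on the question's table.
markup = """<table>
  <tr><td ID="1">a.jpg</td><td ID="2">some image</td><td ID="3">01/01/1901</td></tr>
  <tr><td ID="1">b.jpg</td><td ID="2">other image</td><td ID="3">02/02/1902</td></tr>
</table>"""


def add_bold_to_matches(root, cell_id, text):
    """Append a <b> child to every <td> whose ID attribute equals cell_id."""
    count = 0
    # Materialise the matches first so the tree is not modified mid-iteration.
    for td in list(root.iter("td")):
        if td.get("ID") == cell_id:
            bold = ET.SubElement(td, "b")
            bold.text = text
            count += 1
    return count


root = ET.fromstring(markup)
added = add_bold_to_matches(root, "3", "Just a test.")
print(added)
```

Collecting the matches into a list before modifying them avoids changing the tree while it is still being traversed.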

JavaFX hide text of column in tableview

I have a tableview and I want to show an image in the first column. My problem is I can't sort the column then. My idea is to set text in the column too and hide the text so it is only for the correct sorting set. Is there a way to do that? Or what other solutions are possible for my problem?
I think this is a perfect example of what you want to do. Let me know if you still have any issues.
Check here
I would have a look at TableColumn.setCellValueFactory() and TableColumn.setCellFactory(). The former is used to provide the actual cell value (used for sorting!), the latter is used to provide the rendering.
In other words: if you need the sort order, you must not change the content, only the cell rendering. The methods mentioned above let you do exactly this.
Hope that helps ...
You could do it with just CSS using text-indent. You would also need to set the image as a CSS background. You did not provide any code for your table, but below is an example:
HTML:
<table width="100%" border="1" cellspacing="1" cellpadding="1">
<tr>
<td class="hidetext image">Text 1</td>
<td>Some text to show</td>
</tr>
<tr>
<td class="hidetext image">Text 2</td>
<td>Some text to show</td>
</tr>
<tr>
<td class="hidetext image">Text 3</td>
<td>Some text to show</td>
</tr>
<tr>
<td class="hidetext image">Text 4</td>
<td>Some text to show</td>
</tr>
</table>
CSS:
.hidetext {text-indent:-9000px}
.image {background:url(http://www.madisoncopy.com/images/jpeg.jpg) no-repeat;}
Note that the text in the left column does not show (it is still there, just indented off the screen).
See this fiddle: http://jsfiddle.net/D297P/
