Hello, I am trying to isolate an image - src=\"https:\/\/steamcommunity-a.akamaihd.net\/economy\/image\/6TMcQ7eX6E0EZl2byXi7vaVKyDk_zQLX05x6eLCFM9neAckxGDf7qU2e2gu64OnAeQ7835Ff5GLNfCk4nReh8DEiv5dbPK47pbcyR_m4DQ68Ofs\/62fx62f\" - from https://steamcommunity.com/market/listings/252490/Alien%20Red/render?currency=1&format=json and put it into an embed. I am using node-fetch to get the overall JSON from the site; how would I go about doing this? I currently have an inefficient way of doing it (reading every character) and it does not work at some points. Any help is appreciated.
There will probably be a better way to do this, but I've thrown this together:
/(?!src=\\?)\"(.*?)\"/g
Just match this against the returned JSON and you will have your URL.
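For instance, a minimal node-fetch sketch along these lines (the simpler pattern src="(.*?)" here is an illustrative variant that just grabs the first src attribute out of results_html):
const fetch = require('node-fetch');
const url = 'https://steamcommunity.com/market/listings/252490/Alien%20Red/render?currency=1&format=json';
fetch(url)
  .then(res => res.json())
  .then(json => {
    // results_html holds the listing markup that contains the <img> tag
    const match = /src="(.*?)"/.exec(json.results_html);
    if (match) {
      console.log(match[1]); // the image URL to use as the embed's image
    }
  });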
If you are making this a static thing (i.e. it will be ONLY this image, and will not change depending on a command or variable), then just put the URL in as a string and make that the embed's image.
If not, then fetch the contents and get the results_html portion of it. Then find where it says "<img" and isolate it until the next ">". Substring that bad boy from where it says src= to where it sees \" (or ", depending on how it is interpreted), and use that substring as the embed's image.
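A rough sketch of that substring approach (the function name is just for illustration); call it with the results_html value from the fetched JSON:
// Sketch of the substring method, given the parsed results_html string
function extractImageUrl(html) {
  // isolate the first <img ...> tag
  const imgStart = html.indexOf('<img');
  const imgTag = html.slice(imgStart, html.indexOf('>', imgStart) + 1);
  // take everything between src=" and the closing quote
  const srcStart = imgTag.indexOf('src="') + 'src="'.length;
  return imgTag.slice(srcStart, imgTag.indexOf('"', srcStart));
}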
I'm trying to fetch a specific part of a YouTube URL. For example, if the user has entered the following URL: https://www.youtube.com/watch?v=KOjE7cQ0FkA
I would fetch the part KOjE7cQ0FkA using the method below:
String url = urlController.text; // https://www.youtube.com/watch?v=KOjE7cQ0FkA
List<String> urlList = url.split("=");
String urlCode = urlList[1];
But sometimes people copy the video link from mobile, and then it appears like this:
https://youtu.be/KOjE7cQ0FkA In such a case, the above code wouldn't work.
So how can I detect which kind of URL the user has entered and perform the split operation accordingly?
Sorry if my question sounds stupid, but I hope you get an idea of what I'm trying to achieve here.
There are many ways to approach this. One way would be to parse the URL (have a look at this SO question).
With this approach you could check if the v query parameter is present (in the desktop URL, watch is part of the path and the ID is carried by v) -- if it is not, the URL is probably in the mobile format.
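A minimal sketch of that check (in JavaScript here; Dart's Uri.parse exposes the same pieces through queryParameters and pathSegments):
// The desktop format carries the ID in the "v" query parameter;
// the mobile youtu.be format carries it as the path.
const parsed = new URL('https://youtu.be/KOjE7cQ0FkA');
const videoId = parsed.searchParams.get('v') || parsed.pathname.slice(1);
console.log(videoId); // KOjE7cQ0FkA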
Another approach would be to define a regex for the Video-ID part of the URL (KOjE7cQ0FkA). That way you can extract the Video-ID regardless of the format of the URL. I would probably go with that approach.
Your regex could look like this: ([a-zA-Z]+(\d[a-zA-Z]+)+)
I used this site to create the regex. You probably need to modify it a little bit. Also, if the ID has a fixed length, that is a great criterion to filter by.
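As a hedged alternative pattern that handles both formats in one go (JavaScript shown; the same regex syntax works in Dart's RegExp):
// Anchors on the two known URL shapes; the 11-character length matches
// current YouTube IDs but is an assumption, not a documented guarantee.
const idPattern = /(?:v=|youtu\.be\/)([\w-]{11})/;
console.log(idPattern.exec('https://www.youtube.com/watch?v=KOjE7cQ0FkA')[1]); // KOjE7cQ0FkA
console.log(idPattern.exec('https://youtu.be/KOjE7cQ0FkA')[1]);                // KOjE7cQ0FkA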
I have a NodeJS application that uses Jade as its template language. On one particular page, a text block is retrieved from the server, which reads the text from a database.
The problem is, the returned text might contain line-breaks and links, and an operator might change this text at any time. How do I make these elements display correctly?
Most answers suggest using a new line:
p
  | this is the start of the para
  a(href='http://example.com') a link
  | and this is the rest of the paragraph
But I cannot do this, since I cannot know where the a element appears. I've solved how to get newlines correct with this trick:
p
  each l in line.description.split(/\n/)
    = l
    br
But I cannot seem to solve how to get links to render correctly. Does anyone know?
Edit:
I am open to any kind of format for links in the database, whatever would solve the issue. For example, say database contains the following text:
Hello!
We would like you to visit [a("http://www.google.com")Google]
Then we would like that to output text that looks like this:
Hello!
We would like you to visit Google
Looks like what you're looking for is unescaped string interpolation. The link does not work in the output because Pug automatically escapes it. Wrap the content you want to insert in !{} and it should stop breaking links. (Disclaimer: make sure you don't leave user input unescaped - this is only a viable option if you know for sure the content of your DB does not contain unwanted HTML/JS code.)
See this CodePen for illustration.
With this approach, you would need to use standard HTML tags (<a>) in your DB text. If you don't want that, you could have a look at Pug filters such as markdown-it (you will still need to un-escape the compilation output of that filter).
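Applied to the loop from the question, the unescaped buffered-code form (!=) does the same job as !{} interpolation; a minimal sketch, assuming the database text contains plain <a> tags rather than the bracket notation above:
p
  each l in line.description.split(/\n/)
    != l
    br
With line.description set to the example text (using <a href="http://www.google.com">Google</a> for the link), this renders the link as a real anchor.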
I need an alternative to Google Custom Search for a website I look after. It has to be something that will crawl a website, index it, allow fiddling with priorities, and then accept search queries via REST or something similar and return XML or JSON etc. It needs to run on a Windows Server instance.
So, I'm up and running with http://www.opensearchserver.com/ and it seems to do the trick, but I can't, for the life of me, work out how to get thumbnail images into the results. I've searched the documentation and read everything I could, but can't find out how to do this (or how to get my head around it).
I'm crawling standard web pages and they all have thumbnail metadata, which I assume should be parseable somehow and included in the JSON results?
Any pointers at all would be very helpful, thanks!
I figured this out; in case anyone else is struggling, here's how I did it. The answer is in the documentation, it's just not that simple.
Read http://www.opensearchserver.com/documentation/faq/crawling/how_to_extract_specific_information_from_web_pages.md - it contains the method.
Assume you have set up a 'web crawler' index, and that your pages carry a thumbnail meta tag like this:
<meta name="thumbnail" content="http://my_cdn.com/news/images/29637.jpg">
Go into Schema / Fields. Add a new field called 'thumbnail' with index: no, store: yes, vector: no, analyser: Text, copy of: blank. Save that.
Now go to Schema / Parser list and edit the HTML parser. Go to 'field mapping' and add a new regex mapping for the thumbnail in the HTML: we map from 'htmlSource' to 'thumbnail' with a matching regex.
My imperfect regex (that works though) is:
htmlSource -> linked in: thumbnail -> captured by:
(?s)<meta name="thumbnail" content="(.*?)">
Now SAVE this and go to Crawl / Manual crawl, enter a URL that has a thumbnail, and check whether the field now appears in the list below once the page is read. If not, check your regex, and check that you actually saved the HTML parser changes.
To get the thumbnail into your results, simply add the field name to the JSON you send with the query:
"returnedFields": [
    "url",
    "thumbnail"
],
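For context, a complete query body could look roughly like this (a sketch only: the exact endpoint and supported fields depend on your OpenSearchServer version and query template, so check the REST documentation for your release):
{
  "query": "news",
  "start": 0,
  "rows": 10,
  "returnedFields": [
    "url",
    "thumbnail"
  ]
}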
I need a way to pass a value between two pages using URL query strings, if possible. However, every time I add "?customquery=customvalue" at the end, it ends up at the website's 404 page.
I want to basically make it look like this.
https://example.com/somedepartment/sample.nsf/page/hello+world?customquery=customvalue
hello+world is a document that is equivalent to a webpage.
I tried this, plus some JavaScript that collects the string after the number sign, and it works:
https://example.com/somedepartment/sample.nsf/page/hello+world#customvalue
However, I couldn't use the hash sign because they told me not to use it and to use another unique symbol instead. I am not aware of any symbol that works the same way as the hash sign. If there is one, please enlighten me.
Apparently, I was able to find an answer.
https://example.com/somedepartment/sample.nsf/page/hello+world?OpenDocument&RandomParam=sample
Now I can pass values by means of this format. Basically, the "OpenDocument" parameter has to come first, before any custom ones.
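To read the custom value on the page, something like this client-side sketch works; URLSearchParams treats the bare OpenDocument token as a parameter with an empty value, and the named parameter can be read directly:
// For .../hello+world?OpenDocument&RandomParam=sample
const params = new URLSearchParams(window.location.search);
console.log(params.get('RandomParam')); // "sample"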
This documentation also helps: http://www.ibm.com/developerworks/lotus/library/ls-Domino_URL_cheat_sheet/
On a website I have access to, there are some input fields. In the sixth field I need to enter an input string from a list of 10000 strings; a new page then appears, for which I just need to count the number of lines. Finally, I would like to end up with a two-column table of input string and number of resulting lines. Since entering the info for all 10000 different strings manually is impractical, I wonder what the best approach is to enter a string into a generic form field and get the resulting text. I have heard about curl, but I am not sure whether it is the easiest option.
P.S.
Example of the interactive way: I type some string of words into Google search and get a new page with the search results. Since I previously entered my Google username and password, the results will probably be filtered according to my profile.
Example of the non-interactive way: a script somehow submits my user information and search query, and saves the search results to a text file. Imagine the same idea but for a more complicated website like this.
What you want to do is send an HTTP POST with specific data. This can be done with any proper HTTP client code, such as libcurl (or the pycurl binding, or even the curl command line tool). On the response from the POST you probably get a redirect and then the results, or you need to do a separate request for the results; then you're done and go back to do the next POST. Repeat until all POSTs are done.
What you may need to take into account is that you may have to deal with cookies and possibly follow a redirect from the POST. A good approach is to record a "manual session" as done with a browser (using Firebug or LiveHTTPHeaders etc.) and then use that recording to help you repeat the same thing with an HTTP client.
A decent tutorial to get some starting up details on this kind of work can be found here: http://curl.haxx.se/docs/httpscripting.html
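As a rough node-fetch sketch of the whole loop (the URL and field name are hypothetical - inspect the real form to find them - and this does not handle the cookies or login step a real site may require):
const fetch = require('node-fetch');

// POST one input string to the (hypothetical) form and count the lines returned
async function countResultLines(input) {
  const res = await fetch('https://example.com/form', {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({ field6: input }).toString(),
  });
  const text = await res.text();
  return text.split('\n').length;
}

// Build the two-column table: input string, number of lines
(async () => {
  const inputs = ['first string', 'second string']; // your 10000 strings here
  for (const input of inputs) {
    console.log(`${input}\t${await countResultLines(input)}`);
  }
})();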
You could also use JMeter to run all the POSTs. You can use the CSV input to supply the 10000 strings, then save the results as XML and extract the necessary data.