Need Direction - Web Bot Creation - web

I want to create something that will look at a specific location on a website and read the value at this location, then take that value and put it into an already created block of text.
What do I need to start researching to create something like that? Simple direction such as key words to Google and such would be extremely helpful.

Do you want to get the value from your own website or someone else's website?
If you want to get the value from an HTML element on your own page, you can easily do so using JavaScript/jQuery.
If you are looking to write a script that parses the HTML from somebody else's website, you will need an HTML parser. If you plan to use Python, look into Beautiful Soup.
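As a rough illustration of the "read a value, drop it into existing text" idea, here is a minimal sketch that uses only Python's standard-library html.parser (Beautiful Soup would make the lookup shorter). The page markup, the id "price", and the template text are all made up for the example; a real script would first download the page, for instance with urllib.request.urlopen.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ElementTextFinder(HTMLParser):
    """Collects the text inside the first element that has a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # greater than 0 while inside the target element
        self.done = False    # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def read_value(html, target_id):
    finder = ElementTextFinder(target_id)
    finder.feed(html)
    return "".join(finder.chunks).strip()


# Hypothetical page and text block, standing in for the real ones.
PAGE = '<html><body><p>Price: <span id="price">42.50</span></p></body></html>'
TEMPLATE = "The current price is {value}."

value = read_value(PAGE, "price")
message = TEMPLATE.format(value=value)
print(message)
```

Useful search terms for going further: "web scraping", "HTML parser", "CSS selector", and "string formatting" or "template" for the text-filling step.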

Related

Converting-punycode-with-dash-character-to-unicode

This is in reference to this topic on the page here:
Converting punycode with dash character to Unicode
//Javascript Punycode converter derived from example in RFC3492.
I don't know where to place the input domain 清华大学.cn to get the JavaScript to work. I am not really a programmer.
I want to use the JS code on this page to convert IDN domain names to Punycode if possible. I'm using a ColdFusion HTML page to process the JS. Then I'll save the Punycode to our SQL database.
Example: 清华大学.cn needs to be converted to Punycode.
I can use any number of online converters, but that won't help. It has to be automated with a script. FYI, the Punycode for 清华大学.cn is xn--xkry9kk1bz66a.cn.
HERE IS MY PROBLEM:
Even after copying the JS code into Dreamweaver, I have no idea where to place the domain 清华大学.cn in the JavaScript code so that it gets converted. I can't see a hint of where the input goes, if there is one. I can usually figure things out if there is some hint about where to begin.
I just need to know where to place the input or someone to tell me this can't be done with the Javascript example on that page.
We are using ColdFusion 19 and SQL on our under-construction domain marketplace website. We want to accept IDN domains for listing, and I am hoping your JS will do what I want.
If I'm totally wrong, then perhaps someone can suggest other JS code that will convert the domain to the correct Punycode.
After searching, I found a close answer that I can at least work with, I hope. I needed an HTML input form to process the JavaScript.
I found that information here.
How to convert domain names with greek characters to an ascii URL?
I then copied the page, inserted the JavaScript as puny.js, and it works. Now I need to figure out how to capture the input "id" and "label for" so I can save the result into SQL using ColdFusion. I'm not sure if this can be done, but at least this somewhat answers my question. Maybe it's the best I'm going to get here on Stack Overflow.
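If running a small server-side script is an option, the conversion itself can be automated without a browser form. The sketch below is not the RFC 3492 JavaScript from the linked page; it assumes Python is available and uses its built-in "idna" codec, which applies the older IDNA 2003 rules to each dot-separated label. The function names are made up for the example. The result should be compared against the value quoted in the question, xn--xkry9kk1bz66a.cn.

```python
def to_punycode(domain):
    """Convert a Unicode domain name to its ASCII (Punycode) form."""
    # The built-in "idna" codec encodes each dot-separated label separately.
    return domain.encode("idna").decode("ascii")


def from_punycode(ascii_domain):
    """Convert an ASCII (Punycode) domain name back to Unicode."""
    return ascii_domain.encode("ascii").decode("idna")


encoded = to_punycode("清华大学.cn")
print(encoded)                 # ASCII form, suitable for storing in a database
print(from_punycode(encoded))  # round-trips back to the original name
```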

Can someone guide me on how to collect a list of URL addresses in the tab using Python?

I'm trying to collect a list of "https://..." addresses and store them in a CSV file. I can do it manually, for example by copying the URLs from the website of interest and pasting them into Excel one by one, but that is tedious and would definitely take a lot of time.
Can someone suggest a faster way?
If you just need the addresses quickly from one page, you can run this JavaScript snippet in your browser's console: Array.from(document.links).forEach(link => console.log(link.href)). It prints the address of every link on that page. (document.links is an HTMLCollection, which has no forEach method of its own, so it is converted to an array first.)
If you want to use Python to scrape the page, I would suggest taking a look at this question on Stack Overflow, which uses the Beautiful Soup library.
If the page loads dynamic content with JavaScript, it's probably better to use something like Selenium; see the relevant Stack Overflow question.
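For a static page, the Python route can be sketched with the standard library alone (Beautiful Soup would shorten the parsing part). The sample HTML, the base URL, and the function names below are placeholders for illustration; in practice the HTML would come from downloading the page, and the output would go to a file opened with open("links.csv", "w", newline="").

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative links such as "/b" into absolute URLs.
                self.links.append(urljoin(self.base_url, href))


def collect_links(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    # Keep only the "https://..." addresses the question asks for.
    return [url for url in collector.links if url.startswith("https://")]


def write_csv(links, out):
    writer = csv.writer(out)
    writer.writerow(["url"])
    writer.writerows([link] for link in links)


# Placeholder page content standing in for a downloaded page.
SAMPLE = (
    '<a href="https://example.com/a">A</a> '
    '<a href="/b">B</a> '
    '<a href="mailto:x@example.com">mail</a>'
)
links = collect_links(SAMPLE, "https://example.com/")
buffer = io.StringIO()  # an in-memory stand-in for a real CSV file
write_csv(links, buffer)
print(buffer.getvalue())
```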

waiting for the website to change something

I am a student, and on my school's website I want to keep checking a certain URL to see whether the class I want to register for is open or not. Is there a way to constantly check the website (busy waiting or otherwise) to see if the class is open? There is a table, Rem, in the user interface that shows the number of places remaining.
Also, what language would you use to solve this problem?
Yes, you can, but you will probably need to write a script that fetches the value from that table.
So something like web scraping should work.
I would definitely use PHP for this.
Google "web scraping" and you can code the script.
I am not sure if this is exactly what will help you, but what you need to do is something similar. See here.
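The answer suggests PHP; purely as an illustration of the checking loop, here is a minimal sketch in Python. It sleeps between checks instead of busy waiting, which is kinder to the school's server. The function name, the parameters, and the fake readings are assumptions for the example; get_remaining stands in for whatever code downloads the page and parses the Rem value.

```python
import time


def wait_until_open(get_remaining, interval_seconds=60, max_checks=None,
                    sleep=time.sleep):
    """Poll until get_remaining() reports at least one free place.

    Returns the number of places found, or None if max_checks is reached.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        remaining = get_remaining()
        checks += 1
        if remaining > 0:
            return remaining
        sleep(interval_seconds)  # pause between checks rather than spinning
    return None


# Stand-in for "download the page and read the Rem value": two checks find
# the class full, and the third finds three places.
fake_readings = iter([0, 0, 3])
result = wait_until_open(lambda: next(fake_readings),
                         interval_seconds=0, max_checks=5)
print(result)
```

It is worth checking the site's terms of use before polling it automatically, and keeping the interval generous.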

can you have "variables" in text in google sites?

Sorry, this is a bad question. I don't even know what the title should be. I'm a total noob at making websites so this might be easy to find but I just don't know the terminology to search for. I cannot find anything about how to do this...
What I want to do is have something like references/variables that I can use in a block of text and that will automatically get replaced with whatever value should be there. The best way I can think of to describe it: if I were using the site as a design doc for a game, I would be able to type [Title] or something similar on any page, and when the page loads, that text would be replaced with whatever my Title is. That way, if I ever change titles, names, classes, races, places, items, etc., they would only have to be changed in one place and the change would be reflected everywhere.
I notice if I add a link to a page it will automatically use the Title of that page as the text of the link. That is almost exactly what I want. Except when I change the Title of the other page the text of the link remains as the original text. It doesn't get updated to the new Title and that is not at all what I want.
Also, I want to do this in Google Sites and as simply as possible. I don't really want to use a database. I was hoping Google Sites would have some kind of functionality for this.
I don't believe this is possible (on Google Sites) and likely you need to consider a hosted solution.
Quoting the answer from this relevant post:
You should consider hosting your solution using Google's App Engine
instead of Google Sites. You can set it up so it uses PHP (see link
below), you can configure it to use your domain name and you get
enough CPU, disk and bandwidth allowance to serve around five million
page views for free each month, if you are serving more than that,
their prices are extremely competitive.
Google App Engine: http://code.google.com/appengine/docs/whatisgoogleappengine.html
How to setup PHP using Google App Engine: http://blog.caucho.com/?p=187
Also, I'm not sure how strong your PHP skills are, but if you're unfamiliar with it, this should help to get you started.
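On a hosted solution, the [Title]-style substitution described in the question comes down to a lookup-and-replace step that runs before the page is served. The answer recommends PHP; the sketch below shows the same idea in Python, with made-up names and values, and is only meant to illustrate the technique.

```python
import re

# Hypothetical values, defined in exactly one place.
VALUES = {"Title": "Skyfall Chronicles", "Hero": "Mira"}

# Matches placeholders written as [Name].
PLACEHOLDER = re.compile(r"\[(\w+)\]")


def fill(text, values):
    """Replace each [Name] with its value; unknown names are left as-is."""
    return PLACEHOLDER.sub(
        lambda match: values.get(match.group(1), match.group(0)), text
    )


page = "Welcome to [Title]. In [Title], you play as [Hero] in [Place]."
print(fill(page, VALUES))
```

Changing the entry for "Title" in VALUES changes every page that uses [Title], which is the single-point-of-change behaviour the question asks for.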

Extract localizable content from a HTML page

I need some advice on the best approach to a feature I need to implement in a project I'm working on.
Basically, I need to be able to extract all localizable content (i.e. all the strings) from an HTML page. I really don't want to have to go and write an HTML parser. The application is written in C#.
Has anybody got any experience with this, or can anyone recommend an existing library that I could use to accomplish this?
Thanks.
You do not have to write your own parser. Fortunately, somebody else already did that.
To parse an HTML file, you can use the HTML Agility Pack.
It gives you a Document Object Model (DOM), which you can walk just like any other DOM. See these examples:
https://web.archive.org/web/20211020001935/https://www.4guysfromrolla.com/articles/011211-1.aspx
http://htmlagilitypack.codeplex.com/wikipage?title=Examples&referringTitle=Home
And this question:
How to use HTML Agility pack
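The application in the question is written in C#, where HTML Agility Pack is the suggested tool. Purely to illustrate the approach (collect the text nodes, skip script and style content, and pick up a few user-visible attributes), here is a sketch using Python's standard-library html.parser. The attribute list and the sample page are assumptions for the example, not a complete definition of "localizable".

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text plus a few user-facing attribute values."""

    SKIP_TAGS = {"script", "style"}            # content that is not prose
    TEXT_ATTRS = {"alt", "title", "placeholder"}  # assumed localizable

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.strings = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        for name, value in attrs:
            if name in self.TEXT_ATTRS and value and value.strip():
                self.strings.append(value.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.strings.append(text)


def extract_strings(html):
    extractor = TextExtractor()
    extractor.feed(html)
    return extractor.strings


# Made-up sample page.
SAMPLE = (
    "<html><head><title>Welcome</title><style>p{color:red}</style></head>"
    '<body><p>Hello <b>world</b></p><img src="a.png" alt="Logo">'
    '<script>var x = "no";</script></body></html>'
)
strings = extract_strings(SAMPLE)
print(strings)
```

Note that inline markup splits a sentence into several strings ("Hello" and "world" above), which a real localization workflow would usually want to keep together.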
