auto fill web form with dynamic data - web

I am trying to create shipping labels for a lot of different customers by filling in forms on the UPS website. Is there a programmatic way of doing this?
This differs from the usual auto-fill scenario because the name, address, etc. fields aren't filled with constants: 100 customers need 100 different forms.
Before I dig into python-mechanize or AutoIt's IE.au3, is there an easier way to do this?

UPDATE 2019-09-09: Generally, would no longer recommend FF.au3 unless you're very much into AutoIt or have tons of legacy code. I'd rather suggest using Selenium with a "proper" programming language.
You could check out FF.au3 for AutoIt. Together with FireFox and MozRepl it allows for web automation, including dynamic websites/forms.
The feature set should be sufficient for your task (e.g. XPath for content extraction and for filling out forms; have a look at the link to get an idea of what it can do). It's also fairly easy to use.
The downside is that it's not the most performant approach and I've encountered a bug, but that doesn't say much. Overall it did work well for me for small or medium-ish projects.
Setup:
Install AutoIt: https://www.autoitscript.com/site/autoit-tools/
Get the FF.au3 lib: https://www.thorsten-willert.de/index.php/software/autoit/ff/ff-au3
Get an old Firefox version <v57 or ESR (see remarks on ff.au3 page above)
Install MozRepl: http://legacycollector.org/firefox-addons/264678/index.html
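Whichever tool ends up driving the browser, the "dynamic data" part is just a loop over customer records. Here is a minimal Python sketch of that loop, assuming the customers live in a CSV file. The column names, the form field names in FIELD_MAP, and the fill_form callback are all hypothetical; the real UPS form will use different names.

```python
import csv
import io

# Hypothetical mapping: CSV column name -> name of the form input to fill.
FIELD_MAP = {
    "name": "ship_to_name",
    "street": "ship_to_address1",
    "city": "ship_to_city",
    "zip": "ship_to_postal",
}


def form_payloads(csv_text, field_map=FIELD_MAP):
    """Yield one {form_field: value} dict per customer row in the CSV text."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield {
            form_field: row[column].strip()
            for column, form_field in field_map.items()
        }


def fill_all(csv_text, fill_form):
    """Call fill_form(payload) once per customer and return how many were done.

    fill_form is the tool-specific part (Selenium, mechanize, ...); it is
    passed in so the data handling can be tested without a browser.
    """
    count = 0
    for payload in form_payloads(csv_text):
        fill_form(payload)
        count += 1
    return count
```

With Selenium, fill_form would typically look up each input by its name, type the value into it, and submit. Keeping that step separate means the data side can be checked on its own before touching the live site.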

Related

Best method to screen-scrape data off of many different websites

I'm looking to scrape public data off of many different local government websites. This data is not provided in any standard format (XML, RSS, etc.) and must be scraped from the HTML. I need to scrape this data and store it in a database for future reference. Ideally the scraping routine would run on a recurring basis and only store the new records in the database. There should be a way for me to detect the new records from the old easily on each of these websites.
My big question is: What's the best method to accomplish this? I've heard some use YQL. I also know that some programming languages make parsing HTML data easier as well. I'm a developer with knowledge in a few different languages and want to make sure I choose the proper language and method to develop this so it's easy to maintain. As the websites change in the future the scraping routines/code/logic will need to be updated so it's important that this will be fairly easy.
Any suggestions?
I would use Perl with modules WWW::Mechanize (web automation) and HTML::TokeParser (HTML parsing).
Otherwise, I would use Python with the Mechanize module (web automation) and the BeautifulSoup module (HTML parsing).
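To give an idea of the parsing half, here is a rough Python sketch that pulls table rows out of a page. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra installs. It assumes the records sit in a plain HTML table, which will not hold for every site.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows of an HTML string as lists of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Each site would still need its own small adapter to say which table and which columns matter, and that adapter is the part that has to change when the site's layout changes.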
I agree with David about Perl and Python. Ruby also has mechanize and is excellent for scraping. The only one I would stay away from is PHP, due to its lack of scraping libraries and clumsy regex functions. As far as YQL goes, it's good for some things, but for scraping it really just adds an extra layer of things that can go wrong (in my opinion).
Well, I would use my own scraping library or the corresponding command line tool.
It can use templates which can scrape most web pages without any actual programming, normalize similar data from different sites to a canonical format and validate that none of the pages has changed its layout...
The command line tool doesn't support databases though; for that you would need to program something...
(on the other hand Webharvest says it supports databases, but it has no templates)
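For the "only store the new records" part of the question, one common approach is to fingerprint each scraped record and let the database reject duplicates. The sketch below uses Python's sqlite3 module. The table layout and the choice to hash every field are assumptions; a site that exposes a stable record ID would be better keyed on that.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the record store; the schema here is illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "fingerprint TEXT PRIMARY KEY, source TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def fingerprint(source, fields):
    """Hash the source name plus whitespace- and case-normalized field values."""
    normalized = "\x1f".join(" ".join(f.split()).lower() for f in fields)
    return hashlib.sha256(f"{source}\x1f{normalized}".encode("utf-8")).hexdigest()


def store_new(conn, source, rows):
    """Insert rows not seen before and return only those new rows."""
    new_rows = []
    with conn:  # commit on success, roll back on error
        for fields in rows:
            cur = conn.execute(
                "INSERT OR IGNORE INTO records VALUES (?, ?, ?)",
                (fingerprint(source, fields), source, "\t".join(fields)),
            )
            if cur.rowcount == 1:  # 0 means the fingerprint already existed
                new_rows.append(fields)
    return new_rows
```

Running store_new on every scheduled scrape is then safe to repeat, since rows already stored are skipped. The normalization in fingerprint is deliberately crude; a record that a site edits in place would count as new under this scheme.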

Can I (relatively easily) test ZK interfaces in Watir?

How easily will Watir interact with a ZK interface? If "not at all" do you have any recommendations for automated testing of the web interface for me?
Edit: Another way to put this would be: can I test a Spring/ZK generated page (Ajax/JScript)? I found another issue too: if at all possible, I need to avoid using a proxy to test (as Sahi does).
Edit: I have been testing ZK interfaces now for quite some time. With a higher knowledge of Watir (and now webdriver) I can say it's definitely possible. Timing isn't usually an issue, but finding the elements certainly can be as the ids are dynamically generated. I recommend a strong, maintainable, object oriented approach with a powerful and dynamic DSL, or you'll be listing every element on the page in a custom built object library of some sort. So... it works, but it needs extra effort.
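To illustrate the dynamic-id problem, here is a small Python sketch of the idea behind such a locator layer: match on attributes that stay the same between page loads and ignore the generated id. The markup, ids, and class names are made up for illustration, and xml.etree stands in for the browser so the example runs on its own.

```python
import xml.etree.ElementTree as ET


def stable_locator(tag, **attrs):
    """Build an XPath that matches on stable attributes instead of the id.

    A trailing underscore is stripped so that `class_` can be used for the
    `class` attribute (class is a reserved word in Python).
    """
    predicates = "".join(
        f"[@{name.rstrip('_')}='{value}']" for name, value in sorted(attrs.items())
    )
    return f".//{tag}{predicates}"


# Two renderings of the same (hypothetical) form: only the ids differ.
PAGE_FIRST_LOAD = (
    "<div id='z_a1'>"
    "<input id='z_a1_3' class='z-textbox' name='street'/>"
    "<button id='z_a1_7' class='z-button'>Save</button>"
    "</div>"
)
PAGE_SECOND_LOAD = (
    "<div id='z_k9'>"
    "<input id='z_k9_3' class='z-textbox' name='street'/>"
    "<button id='z_k9_7' class='z-button'>Save</button>"
    "</div>"
)

STREET_FIELD = stable_locator("input", class_="z-textbox", name="street")


def find_street_field(page_source):
    """Return the street input element regardless of its generated id."""
    return ET.fromstring(page_source).find(STREET_FIELD)
```

In a real test the XPath string would go to the browser tool's own XPath locator rather than to xml.etree. Keeping locators like STREET_FIELD in one place is what stops the suite from turning into a hand-maintained list of every element on the page.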
If you're talking about this: http://zssdemo.zkoss.org/ then take a look at the DOM output. It's atrocious, but it is possible to test it with Watir. I've dealt with some apps that generate awful output like that. It makes for a challenge. :) Search the Watir Google group for testing Ajax; plenty of people do it.
HTH,
Charley

How to organize libraries and links of programming information

I have an email account whose sole purpose it is to store interesting and useful links to programming articles, code, and blog posts. It has become a little knowledgebase of sorts. I can even do a search on it, which is pretty cool.
However, after using this account for a couple of years, I now have 775 links, and it has become this big blob of amorphous information, most of which I have never looked at again. I take comfort in the fact that, if I really needed to, I could find something in there again, if I even remember putting it in there in the first place. But it has developed a "smell," if you will.
How are you organizing your programming library of cool stuff? Do you have a system or tool, and is it better than the way I'm doing it?
I would use something that is made for storing bookmarks. I use delicious.com for all of my bookmarks. The tagging system works perfectly for technology sites because you can tag each page with a specific language or tech abbreviation. This coupled with the Delicious Bookmarks plugin will make it very easy to tag sites and get back to them.
Use one word or abbreviations for languages: java c# vb.net python
Use acronyms for technologies: wpf wcf
I used to use the standard bookmark system in the browser, but since I bounce between various machines and browsers throughout the day I started to use bookmark synchronizers: both Foxmarks and the one that Google came out with. I wasn't completely satisfied with either. Plus, Delicious has a great web interface as well as a decent API that you can extend for your own purposes.
IMHO, using Evernote to store this information is great.
1) you can go back and search through it easily
2) organize by tags and "notebooks collections"
3) available on multiple platforms (even mobile)
4) available as browser plugins (for direct archiving in-browser)
The only drawback is that its copy-paste functionality is a little lacking (it sometimes doesn't import/display the CSS styles correctly).
Otherwise, it's a great alternative to store web "bookmarks" (and also archive the content at the same time).

Good image gallery engines

What are the best open source image gallery engines? Both stand-alone, and for existing frameworks such as Wordpress or Drupal.
Hopefully we can build a good list here over time.
Gallery is the classic choice. It has skins, security layers, heaps of plugins, etc, but can be run with the default settings easily if you want to. I've used it for years.
GOOD QUESTION, lots of people ask this in many web forums so hopefully we will get some good responses to this, and have a good list of solutions.
Personally, I always used to suggest something like Gallery or some other open source script, but recently I have found myself more and more using a simple PHP script which just spits out a list of images (maybe 7 a page) and relies on a Javascript library such as mootools or Ext to provide all the functionality, particularly for small or individual galleries. I'm particularly loving the noobslide mootools class at the moment, which has some lovely gallery effects.
Noobslide
I suppose at the end of the day it's all down to what you need. There will be no one answer that fits all, but a number of different solutions will hopefully show up here that will suit different people's needs.
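The "simple script that lists images a page at a time" approach mentioned above is small enough to sketch. The version below is in Python rather than PHP and is only an illustration: the set of image extensions is an assumption, and the page size of 7 is taken from the answer.

```python
from pathlib import PurePosixPath

# Assumed set of extensions that count as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def gallery_page(filenames, page, per_page=7):
    """Return (images_on_page, total_pages) for a 1-based page number.

    Non-image files are skipped and out-of-range pages are clamped.
    """
    images = sorted(
        name for name in filenames
        if PurePosixPath(name).suffix.lower() in IMAGE_SUFFIXES
    )
    total_pages = max(1, -(-len(images) // per_page))  # ceiling division
    page = min(max(page, 1), total_pages)
    start = (page - 1) * per_page
    return images[start:start + per_page], total_pages
```

The server side only has to emit that list; the Javascript library takes care of the slideshow effects.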

What is a good web-based Grid that accepts Excel clipboard data?

Any good recommendations for a platform agnostic (i.e. Javascript) grid control/plugin that will accept pasted Excel data and can emit Excel-compliant clipboard data during a Copy?
I believe Excel data is formatted as CSV during "normal" clipboard operations.
dhtmlxGrid looks promising, but the online demos don't actually copy contents to my clipboard!
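On the format itself: the plain-text flavor that Excel puts on the clipboard is tab-separated cells, one row per line, rather than comma-separated. Whatever grid you pick, the conversion it has to do looks roughly like this Python sketch. The function names are made up, and real grids also deal with richer clipboard formats that this ignores.

```python
import csv
import io


def parse_clipboard(text):
    """Turn tab-separated clipboard text into a list of rows (lists of str)."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    return [row for row in reader]


def to_clipboard(rows):
    """Turn rows back into tab-separated text with CRLF line endings."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\r\n")
    writer.writerows(rows)
    return buffer.getvalue()
```

Using the csv module instead of a bare split on tabs means cells that contain tabs or line breaks, which arrive wrapped in double quotes, are handled too.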
I'm currently using dhtmlxGrid and we have the Excel copy/paste functionality working. dhtmlXGrid is the most full featured javascript grid package that I've found.
On their website, dhtmlXGrid claims to support Clipboard functionality in the Professional version. (However, I noticed the Sample on their site isn't working on my Firefox. EDIT: It's probably the permissions issue that Nathan mentioned.)
In any case, we had to do some extra work to get the exact Excel copy and paste functionality we wanted. We essentially had to override some of their functionality to get the desired behavior. Their support was pretty good in helping us come up with a solution.
So to answer your question, you should be able to get them to support copy and paste if you purchase the Professional version. I'm just warning you that it may take some additional work to fine tune that behavior.
Overall, I'm happy with dhtmlXGrid. We use a lot of their features. Their support is pretty good. They usually take one day to respond since they are in Europe (I think). And Javascript is by its very nature open source so I can always dive in when I need to.
Not an answer, but a warning: my company bought the 2007 Infragistics ASP.NET controls just for the Grid, and we regret that choice.
The quality of API is horrible (in our opinion at least), making it very hard to program against the grid (for example, inconsistent naming conventions, but this is just an inconvenience, we have complaints about the object model as well).
So I can't say that I know of a better option, I just know I will give a try to something else before paying for Infragistics products again (and the email support we got was horrible as well).
I was wrestling with this problem several years ago (2004 I think). We ran into the problem that Firefox doesn't allow scripts to read the clipboard by default (but you can grant access to the clipboard).
There are other ways of reading the clipboard data as well. Flash, for instance, can read the clipboard. There's a good article on Ajaxian explaining how to do this behind the scenes.
In the end, we couldn't find a web-based Grid that fit the bill, so we had to create our own in a mixture of Actionscript and Javascript.
I'd hate to be Captain Obvious here...but what about a plain old .NET Gridview control? You can copy Excel data into it and out of it...and you can run it on any system with the .NET platform installed.
http://dhtmlx.com/dhxdocs/doku.php?id=dhtmlxgrid:clipboard_operations
