Scraping dynamic data from IE using VBA

I am trying to use VBA to automate orders in Internet Explorer. At this point, I am having trouble using commands like "getElementsByClassName", "getElementById" and so on. I searched for those commands, read and tested some examples, but I still can't figure out how they work. I will state my goal and my doubts.
The HTML code:
<div class="z-row-content" id="s0GQm-cell">
<input class="form-control z-textbox" id="s0GQm" type="text" value="">
</div>
My goal: I just need the id "s0GQm" to target my orders. However, this id is partially dynamic and changes every time I reset the window. The first 4 characters change and only the final "m" stays the same. These dynamic characters are repeated throughout the HTML code, so if I could get them from anywhere, I would just need to update all the ids and everything would run fine.
Therefore, I would like to understand how these functions can be used to identify this exact dynamic code. More specifically, what exactly do these functions do? What is the ".item(0)" or the length of "getElementsByClassName"?
I appreciate any help.
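As a rough illustration of what the question is asking about: in the DOM, getElementsByClassName returns a collection of every element that carries the given class, its length is the number of matches, and .item(0) is the first match. The sketch below mimics that behaviour in Python (not VBA) on the HTML snippet from the question. The names ClassCollector and elements_by_class are hypothetical helpers invented for this example, and it assumes the "z-textbox" class stays stable while the id changes.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects the attributes of every element carrying a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.matches = []  # one attribute dict per matching element

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # An element may carry several space-separated class names.
        classes = (attr_map.get("class") or "").split()
        if self.class_name in classes:
            self.matches.append(attr_map)


def elements_by_class(html, class_name):
    """Hypothetical stand-in for document.getElementsByClassName."""
    collector = ClassCollector(class_name)
    collector.feed(html)
    return collector.matches


page = '''<div class="z-row-content" id="s0GQm-cell">
<input class="form-control z-textbox" id="s0GQm" type="text" value="">
</div>'''

boxes = elements_by_class(page, "z-textbox")
count = len(boxes)          # plays the role of .length
first_id = boxes[0]["id"]   # plays the role of .item(0).id
prefix = first_id[:-1]      # the four characters that change between sessions
```

Once the dynamic prefix is known from one element, the other ids on the page can be rebuilt from it, which is the approach the question describes.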

Related

Pulling Django variable data using Python

I am currently working on a project to grab live match data for eSports (yeah I'm a nerd, aren't we all?). So I am pulling HTML from a page and checking if my favorite team is playing and, if so, going to the page and checking the scores every few minutes to get updates. The part I am now having trouble with is getting the names of the teams and the match IDs. The reason is that when I grab the HTML data, inline JS variable names are included instead of their values, which are what I need. Here is an example of what I am talking about:
<div class="col-xs-12 schedule-{[{schedule.state}]}" ng-if="schedules.length>0" ng-repeat="schedule in schedules track by schedule._id">
I need a way to grab the value of "schedule.state". Hopefully you guys can help. Also if there is already a similar solution I would be happy to be directed to it! Cheers!
EDIT
I have just realized through a little more research that the variables are not JS but Django. The problem is the same, except that I need to get Django variable data instead of JS.

ModX: Display multiple pages on one page -How to implement

I understand I am meant to use Ditto to do this but am unsure where to go from there.
Currently, I have a Template with all my TVs on it along with several pages using the template that are stored under a parent. The Ditto code I am using is:
[!Ditto? &parents=`173` &orderBy=`createdon ASC` &tpl=`showtemp` &display=`100` &total=`100`!]
However, when I view the page I get the error:
&tpl either does not contain any placeholders or is an invalid chunk name, code block, or filename. Please check it.
My chunk ('showtemp') looks like:
<div class="showmedia">
[*showmedia*]
</div>
<div class="showright">
<h2>[*showname*]</h2>
<h2>[*showtime*]</h2>
</div>
As far as the set up goes I am not sure if I am going about it right.
Do I make a Chunk as if it were a normal template with TVs, then replicate it as a proper template, create the resources and go from there?
If someone could give me a step by step on how to do this correctly I would be very grateful! Thanks

You're getting that error message because your placeholder syntax is incorrect in this context.
[*templateVariable*] is correct for displaying the current resource's TVs, but in a chunk to be used within a snippet loop such as in Ditto you need to format them as placeholders like this: [+templateVariable+]
I would recommend going through each step in the following tutorial; it will help you understand all the MODX fundamentals:
http://codingpad.maryspad.com/2009/03/28/building-a-website-with-modx-for-newbies-part-1-introduction/

Cannot locate a text_field with dynamic id

<div id="temp_1333021214801">
<input type="text"/>
</div>
$browser.text_field(:xpath,".//*[@id='temp_1333018770709']/input").set("apple")
I am getting error "unable to locate element", because the ID changes dynamically.
Please help me to set the text in the text field.

It seems your dynamic id is temp_ followed by a run of digits, so this should work given the information above:
browser.div(:id, /temp_\d+/).text_field.set 'something'
The caveat with this solution is that it assumes the id is always temp_ followed by consecutive digits, which matches your sample above. It also assumes that no other div on the page has an id matching /temp_\d+/, which most likely will not be an issue.
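To make the two assumptions in this answer concrete, here is a minimal sketch in Python (not Watir) of matching ids against the same temp_\d+ pattern and checking that the match is unique. The function names are hypothetical, and fullmatch is a slightly stricter choice than a plain regex search, made here only for illustration.

```python
import re

# Same pattern as in the answer: "temp_" followed by one or more digits.
ID_PATTERN = re.compile(r"temp_\d+")


def find_dynamic_ids(ids):
    """Return the ids that consist of exactly temp_ plus digits."""
    return [candidate for candidate in ids if ID_PATTERN.fullmatch(candidate)]


def unique_dynamic_id(ids):
    """Return the single matching id, or raise if the match is ambiguous."""
    matches = find_dynamic_ids(ids)
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, found {len(matches)}")
    return matches[0]
```

If more than one element matches, the locator becomes ambiguous, which is the second caveat mentioned above.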

If you have dynamic IDs I can suggest the following:
Code to object counts. For example
$browser.text_field(:index => 2)
gives the third text_field on the page.
Code to what is around the thing you're trying to find.
$browser.div(:name => 'mydiv').text_field(:index=>2)
gives the third text field in the div called 'mydiv'.
HOWEVER
If your front-end is less-than-testable in this way I highly suggest you put time into thinking over your commitment to automated testing in the first place. Any minor change to the software is going to have you working until 9pm pulling your hair out and rocking back and forth as you update all your scripts, so unless code maintenance is your weekend hobby think about semi-automation or exploratory testing or manual scripts. Talk to development (whomever that might be. It might be you!) or the higher-ups (unless that's you too) to see if it can be made more testable. Also don't use xpaths unless you take some deviant pleasure in it.
Hope that was helpful, I can't do anything specific without the source HTML.

Managing Unregistered User Posts by Screening

I am considering allowing users to post to my site without having them register or provide any identifying information. If each post is sent to a db queue and I then manually screen these posts, what sort of issues might I run into? How might I handle those issues?

Screening every post would be tedious and tiresome, and it would expose the admin to a lot of annoying spam. My suggestion would be to automate as much of the screening as possible. Besides, requiring identifying information does nothing to prevent spam (a bot will just generate it).
A lot of projects implement a recognition system: first the user has to submit 1-2 posts that are approved, then he is identified as a trusted poster by IP and (maybe) a cookie, so his posts appear automatically (and can later be marked as spam).
Some heuristics on the content of the post could also be used (like the number of links in the post) to automatically discard potential spam posts.
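A link-count heuristic like the one mentioned above could be sketched as follows. The threshold of two links and the function names are assumptions chosen for illustration, not values from the answer, and a real filter would likely combine several signals.

```python
import re

# Rough pattern for http/https URLs; deliberately simple.
LINK_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def count_links(text):
    """Count URL-like substrings in a post body."""
    return len(LINK_RE.findall(text))


def looks_like_spam(text, max_links=2):
    """Flag a post when it contains more links than the allowed maximum."""
    return count_links(text) > max_links
```

Posts flagged this way could be dropped or simply pushed to the bottom of the review queue, depending on how aggressive the filter should be.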

The most obvious issue is that you'll get overwhelmed by the number of submissions to screen, if your site is sufficiently popular.
I would make sure to add some admin tools, so you can automatically kill all posts from a particular IP address, or that match a particular regex. That should help get rid of obvious spam faster, but again, you'd have to be behind the wheel for all of that.
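As a minimal sketch of such an admin tool, the function below removes queued posts that come from a given IP address or whose body matches a given regex. The Post structure and the purge function are hypothetical; an actual site would run the equivalent query against its database queue.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    ip: str
    body: str


def purge(queue, ip=None, pattern=None):
    """Return the posts that survive; drop those matching the IP or the regex."""
    regex = re.compile(pattern) if pattern else None
    kept = []
    for post in queue:
        if ip is not None and post.ip == ip:
            continue  # kill everything from this address
        if regex is not None and regex.search(post.body):
            continue  # kill everything matching the pattern
        kept.append(post)
    return kept
```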

Tedium seems to be the greatest concern – screening posts manually is effective against spam (I'm assuming this is what you want to weed out) but very boring.
It could be best fixed with a cup of coffee and nice music to listen to while weeding?

I've found that asking for the answer to a simple question sent to the browser as an image (like "2 + 3 - 4 =", a variant of a 'captcha' but not so annoying), with a wee bit of Javascript, does quite well.
Send your form with the image and answer field, and a hidden field with a "challenge" (some randomly generated string). When the user submits the form, hash the challenge and the answer, and send the result back to the server. The server can check for a valid answer before adding it to the database for review.
It seems like a lot of work up front, but it will save hours of review time. Using jQuery:
<script type="text/javascript">
// Hash function to mask the answer
function answerMask()
{
var a = $('#a').val();
var c = $('#c').val();
var h = hex_md5(hex_md5(a) + c);
$('#a').val(h);
}
</script>
<form onsubmit="answerMask()" action="/cgi-bin/comment.py" method="POST">
<table>
<tr><td>Comment</td><td><input type="text" name="comment" /></td></tr>
<tr><td># put image here #</td><td><input id="a" type="text" name="a" size="30" /></td></tr>
<tr><td><input id="c" type="hidden" value="ddd8c315d759a74c75421055a16f6c52" name="c" /></td><td><input type="submit" value=" Go "></td></tr>
</table>
</form>
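The server-side check is not shown in this answer. A minimal sketch of it, assuming the server remembers the expected answer for each challenge and that the client's hex_md5 hashes UTF-8 text into a lowercase hex digest, might look like this. The function names are hypothetical.

```python
import hashlib
import hmac


def md5_hex(text):
    """Lowercase hex MD5 of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def expected_response(answer, challenge):
    """Mirror the client-side masking: hex_md5(hex_md5(a) + c)."""
    return md5_hex(md5_hex(answer) + challenge)


def is_valid(submitted, answer, challenge):
    """Compare the submitted hash with the expected one in constant time."""
    expected = expected_response(answer, challenge)
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))
```

Here the server would look up the correct answer for the image it sent (for "2 + 3 - 4 =" that is "1") and only queue the comment for review when is_valid returns True.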
Edit update...
I saw this technique on a web site, I'm not sure which one, so this idea isn't mine but you might find it useful.
Provide a form with a challenge field and a comment field. Prefix the challenge with "Pick the third word from: glark snerm hork morf" so the words, and which one to pick, are easy to generate on the server and easy to validate when the form contents come back.
The point is to make the user do something, apply a few brain cells, and more work than it's worth for a script kiddie.
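The word-picking challenge is easy to generate and validate on the server. Here is a minimal sketch, assuming a list of at most four nonsense words; the function names and the ORDINALS table are made up for this example.

```python
import random

ORDINALS = ["first", "second", "third", "fourth"]


def make_challenge(words, rng=random):
    """Build the prompt and return it together with the expected word."""
    index = rng.randrange(len(words))
    prompt = f"Pick the {ORDINALS[index]} word from: {' '.join(words)}"
    return prompt, words[index]


def check_answer(submitted, expected):
    """Accept the answer regardless of case and surrounding whitespace."""
    return submitted.strip().lower() == expected.lower()
```

The expected word would be kept on the server (for example in the session) and compared with the submitted field when the form comes back.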

posts that attempt to look legit but aren't
the sheer volume
These are the issues that I see on my blog.

mailto link for large bodies

I have a page on which a user can choose many different paragraphs. When the link (or button) is clicked, an email will open with all those paragraphs in the body, the address filled in, and the subject filled in. However, the text can be too long for a mailto link.
Any way around this?
We were thinking about having an SP from the SQL Server do it but the user needs a nice way of 'seeing' the email before they blast 50 executive level employees with items that shouldn't be sent...and of course there's the whole thing about doing IT for IT rather than doing software programming. 80(
When you build stuff for IT, it doesn't (some say shouldn't) have to be pretty, just functional. In other words, this isn't the dog food we make, it's just the dog food we have to eat.
We started talking about it and decided that the 'mail form' would give us exactly what we are looking for.
A very different look to let the user know that the gun is loaded and aimed.
The ability to change/add text to the email.
Send a copy to themselves or not.
Can be coded quickly.

By putting the data into a form, I was able to make the body around 1800 characters long before the form stopped working.
The code looked like this:
<form action="mailto:youremail@domain.com">
<input type="hidden" name="Subject" value="Email subject">
<input type="hidden" name="Body" value="Email body">
<input type="submit">
</form>
Edit: The best way to send emails from a web application is of course to do just that: send them directly from the web application instead of relying on the user's mail program. As you've discovered, the protocol for passing information to that program is limited, but a server-based solution would not have those limitations.
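As a rough sketch of the server-based approach, the message can be assembled in the web application, shown to the user for review, and only then handed to a mail server. The example below uses Python's standard email package; build_message and its parameters are hypothetical, and the actual sending step (for example with smtplib) is left out because it depends on the mail server available.

```python
from email.message import EmailMessage


def build_message(sender, recipients, subject, paragraphs, cc_self=False):
    """Assemble a plain-text email from the paragraphs the user selected."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    if cc_self:
        msg["Cc"] = sender  # optional copy to the person sending
    msg["Subject"] = subject
    # No mailto length limit applies here; the body can be as long as needed.
    msg.set_content("\n\n".join(paragraphs))
    return msg
```

The returned message can be rendered on a confirmation page, so the user sees exactly what will go out before anything is sent.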

Does the e-mail content need to be in the e-mail? Could you store the large content somewhere centrally (file-share/FTP site) then just send a link to the content?
This makes the recipient have an extra step, but you have a consistent e-mail size, so won't run into reliability problems due to unexpectedly large or excessive content.
