Different webpages, same URL - Excel

I hope this finds you in good shape.
I'm attempting to scrape data for my colleagues, and I've noticed that different web pages can share the same URL. This causes problems because I can't scrape the data I need. Is there a solution to this?
The Colgate page in question is linked below. The corporate vice-president tab and the leadership tab share the same URL. Can someone tell me how to scrape the names and roles, or how to find the individual URLs?
https://www.colgatepalmolive.com/en-us/who-we-are/our-leadership-team

You’re going to need more complex logic than just screen-scraping. Because the page is driven by object-oriented client-side scripting, these links don’t work the way you think they do.
If you imagine a web page as static HTML, then each link is a discrete URL that the web server receives, interprets, and answers with a page.
But most web pages aren’t static HTML anymore. When you click on the picture for Joe Smith, you are not sending a message to the web server to retrieve and send another static HTML page that contains Joe’s bio. Rather, your click sends a message to the “Joe Smith object” telling it “please display the bio portion of your object.” The message never says “open the Joe Smith bio”; it simply says “open your bio.” How does it know which one to open? The “display your bio” message only gets sent to whichever object the user clicked on. If Joe’s object gets the message, the request is for Joe’s bio. If Jane’s object receives the message, the request is for Jane’s bio.
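As a rough illustration of what this means for scraping: if both tabs are already present in the HTML returned for the one URL and the click only toggles which panel is visible, a single download contains every name and role. The sketch below uses Python's standard html.parser on made-up markup. The class names, the data-tab attribute, and the sample people are all assumptions, not the real structure of the Colgate page. If the page instead fetches the data by script after the click, the names will not be in the initial HTML and this approach will find nothing.

```python
from html.parser import HTMLParser

# Hypothetical markup: both "tabs" live in one document and the click only
# toggles visibility. Class names and the data-tab attribute are assumptions.
SAMPLE_HTML = """
<div class="tab-panel" data-tab="leadership">
  <div class="leader"><span class="name">Jane Doe</span><span class="role">Chief Executive Officer</span></div>
</div>
<div class="tab-panel" data-tab="corporate-vice-presidents" style="display:none">
  <div class="leader"><span class="name">Joe Smith</span><span class="role">Vice President, Finance</span></div>
</div>
"""


class LeaderParser(HTMLParser):
    """Collects (tab, name, role) tuples from the hypothetical markup above."""

    def __init__(self):
        super().__init__()
        self.current_tab = None   # value of the most recent data-tab attribute
        self.field = None         # "name" or "role" while inside such a span
        self.pending_name = None  # name waiting for its matching role
        self.leaders = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "data-tab" in attrs:
            self.current_tab = attrs["data-tab"]
        if tag == "span" and attrs.get("class") in ("name", "role"):
            self.field = attrs["class"]

    def handle_data(self, data):
        text = data.strip()
        if not text or self.field is None:
            return
        if self.field == "name":
            self.pending_name = text
        else:
            self.leaders.append((self.current_tab, self.pending_name, text))
        self.field = None


def extract_leaders(html):
    """Return every (tab, name, role) found in one HTML document."""
    parser = LeaderParser()
    parser.feed(html)
    return parser.leaders


if __name__ == "__main__":
    for tab, name, role in extract_leaders(SAMPLE_HTML):
        print(tab, name, role, sep=" | ")
```

Note that the hidden panel (style="display:none") is extracted just like the visible one, because visibility is decided in the browser, not in the downloaded HTML.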

Related

How to implement logic based on external redirects?

I'm building a website for a client (real estate), and on the website are links to a different website (adverts for properties). My client routinely activates and deactivates these adverts when he rents out a certain property.
The hrefs on my links look something like this:
<a href="https://domain.xx/estate/idxx/des-crip-tion-xx-xx-x-xx/">. If the advert is indeed active, it just takes them to the advert. If it is not active, however, the website in question redirects the user to https://domain.xx/estate-for-rent/city/, effectively sending the users to my client's competition.
I wish to implement some logic where, before handing the users over to the other website, the server checks to see if it is redirected to https://domain.xx/estate-for-rent/city/, or some similar logic, and if so, uses preventDefault, or something, and notifies the user that the advert is not available instead of sending them to the other website.
I wonder if I can use the fact that the resulting URL in the user's browser (after they've been sent to the other website) matches the URL in my href only when the advert is active. Can I somehow get the server to request the URL in my href, see where it gets redirected, and then act on that? On the back end I'm running NodeJS with Express, by the way, and if it matters, I'm relying heavily on EJS for templating. Thanks in advance for any help!
This sounds more like a problem you could solve on the client as opposed to the server. For example, at a high level here's how I would do it:
1. Handle the click event for each link (really simple to do as a catch-all with jQuery).
2. Fire off a HEAD request via AJAX to the destination URL (this would be much more efficient than a GET, but depends on the external service supporting this verb).
3. Use the status code to determine what to do next (e.g. 2xx allow the navigation, 3xx pop a message and block).
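The status-code decision described above can also be made on the server, which is what the question originally asked for. The following is a minimal sketch in Python rather than NodeJS, so treat it as an outline of the logic, not as drop-in Express code. The function names are made up for illustration, and it assumes, as the answer does, that the external site answers HEAD requests. Python's http.client does not follow redirects on its own, so a 3xx status is visible to the caller.

```python
import http.client
from urllib.parse import urlsplit


def classify_response(status):
    """Map an HTTP status code to 'active', 'redirected', or 'unavailable'."""
    if 200 <= status < 300:
        return "active"        # advert page answered directly
    if 300 <= status < 400:
        return "redirected"    # advert is gone; site wants to send us elsewhere
    return "unavailable"       # 4xx/5xx or anything unexpected


def check_advert(url, timeout=5.0):
    """Send one HEAD request to url without following redirects.

    Returns (classification, location), where location is the Location
    header if the server sent one, else None.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return classify_response(response.status), response.getheader("Location")
    finally:
        conn.close()
```

A route handler could call check_advert with the advert URL and either redirect the visitor there when the result is "active" or render a "this advert is no longer available" page otherwise.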

Kentico 8.2 Newsletter Link and unsubscribe link

I have created a contact form under Forms with first name, last name, and email that is designed to sign up people for a newsletter. I then created a page so when people click on the link placed on the home page it takes them to a page with the contact form.
Right now, when I test the subscribe form, the data does go to the "back office" where it can be retrieved. However, the information I entered is still in the text fields and, unless you notice the small flash of the web page, one might think nothing happened.
I'd like to know how (or be directed to somewhere in the Kentico 8.2 Documentation) I can make it so that the fields clear and a message appears saying "You have been subscribed to the newsletter." That message can either appear on a separate page on the web site, or send a message to the user email, or both. In the Email Marketing part under the templates there are Subscribe and Unsubscribe templates, but I don't know how to use those.
The other issue is creating an Unsubscribe link. Ideally that will open up to a new page saying "You have been unsubscribed." Kentico 8.2 has an unsubscribe page you can create where the user enters in an email address and then hits the Unsubscribe Request button, but I'd rather not do that. As it stands, I did create a page with that form and tested it, but it doesn't seem to work.
When you edit your form, under the General tab, there are settings for what happens after the form is submitted:
Display Text
Redirect to URL
Clear Form
Continue Editing.
Currently you're using the standard Forms application for something that can be managed through the Newsletter/Email Campaign module. Read the documentation on how to configure that module instead of using the Forms application.
Essentially the steps you will do are:
Create your newsletter following the directions in the linked documentation.
Place a newsletter subscription webpart on your page template and configure it to the newsletter you want them to subscribe to.
Use the out-of-the-box (OOTB) unsubscribe feature to allow users to unsubscribe from your newsletter. There is no need to add a page to the content tree, although you can if you want to.
If you follow the documentation you should be able to get it set up properly, rather than using an online form.

custom title and description of physical web notification

Reply from: https://github.com/google/physical-web/issues/595
For example, I am transmitting http://www.starbucks.com as the URL.
My phone looks for Physical Web pages, detects www.starbucks.com, and shows it to me in the Physical Web list in Chrome.
As a user, this is how it appears to me at present.
This does not convey much information to me.
The text "Order while you wait" has been taken from the page's meta description (as far as I know), and the title "Starbucks" has been taken from the title tag.
Now, say I could define these parameters myself, for example like this.
Here, I custom-defined the text for the same Starbucks URL that my phone's Physical Web scan found.
This makes the URL more relevant. A user gets a clear message. It also allows stores to convey an effective contextual message.
Is this possible when you use ReactJS and JSX? There is only one HTML file, and the notification always shows the default title from that file. Even if you change it with document.title = "other title", the notification shows the original title and not the new one.
The text shown in the Physical Web notification is strictly given by the target website, and you can influence it only there.
Chrome does not actually analyze the target website. A Google server (the Physical Web Service) analyzes it and provides the information to Chrome. You seem to need to change the title instantly and often, so be careful: the server caches pages it has already resolved.
The website analysis does not execute any JavaScript. It takes only what is written directly in the HTML, so the trick with document.title won't work.
But there is a different way to get the notifications. Look at Google Nearby Notifications. In summary, this works based on Eddystone-UID. You register your UID with the service and configure it to redirect to the target website, and in that configuration you can specify the title and description. Look at the mentioned page for the details.
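To make the point about static analysis concrete, here is a small sketch of a reader that, like the service described above, looks only at the HTML as written and never runs scripts. It is not the Physical Web Service's actual code. It is a stand-in built on Python's standard html.parser, and the sample page is invented for illustration.

```python
from html.parser import HTMLParser

# Invented sample page: the script changes the title at runtime, but a
# reader that does not execute JavaScript never sees that change.
PAGE = """<html><head>
<title>Starbucks</title>
<meta name="description" content="Order while you wait">
<script>document.title = "other title";</script>
</head><body></body></html>"""


class StaticMetadataParser(HTMLParser):
    """Collects the <title> text and the meta description, nothing else."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Script contents also arrive here, but only as text outside <title>.
        if self.in_title:
            self.title += data


def read_static_metadata(html):
    """Return (title, description) as found in the raw HTML."""
    parser = StaticMetadataParser()
    parser.feed(html)
    return parser.title.strip(), parser.description


if __name__ == "__main__":
    print(read_static_metadata(PAGE))
```

For a single-page React app, the consequence is that whatever title and description are written into the one served HTML file are what such a reader will pick up.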

String decode extracted from web log

While working on log analysis, I found a string with odd syntax and contents in the page field of a web log (a webshell?):
/campaign/(f(2ewt_ygmarlagti7sw4tvhj0zk17klgxnhnk1aawgtixm5x-2qmvsvouolvaffrhitumf4wnk496p2dbzmkc3ywfloksiixdtrlawmt78f_mg-45kdzzpdlnogeishkcgtohttp://www.facebook.com/externalhit_uatext.phptelf6gqmu2ia0i1j5lfgmcvw1))/home/index
Could someone guide me on how to decode this string and find a clue? Also, why is the following:
http://www.facebook.com/externalhit_uatext.php
included in the string?
I am quoting https://www.facebook.com/externalhit_uatext.php
Facebook allows its users to send links to interesting web content to other Facebook users. Part of how this works on the Facebook system involves the temporary display of certain images or details related to the web content, such as the title of the web page or the embed tag of a video. Our system retrieves this information only after a user provides us with a link. You may have found this page because a Facebook user sent a link from your website to other Facebook users. If you have any questions or concerns about any links or content sent by one of our users, please contact us at legal#facebook.com.
My guess is that someone posted a link to your website on Facebook and someone clicked on that link (visited your website through it). The (probably) encoded part seems a bit random, though. If I were you, I would post a link to my website on Facebook, click on it, and see whether I get something similar in the log. If it doesn't look like that, I would contact legal#facebook.com to clarify whether it is linked to them.
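Whatever the encoded part turns out to be, a practical first step in the log analysis is to separate the parenthesised segment from the rest of the path so that each can be inspected on its own. The sketch below does only that. It makes no claim about what the payload encodes, the function name is made up, and the sample path is a shortened stand-in for the logged string.

```python
import re

# Matches a path segment of the form "/(x(payload))", where x is one letter.
TOKEN = re.compile(r"/\((?P<kind>[A-Za-z])\((?P<payload>.*?)\)\)(?=/|$)")


def split_token_from_path(path):
    """Return (clean_path, kind, payload); kind and payload are None if absent."""
    match = TOKEN.search(path)
    if match is None:
        return path, None, None
    clean_path = path[:match.start()] + path[match.end():]
    return clean_path, match.group("kind"), match.group("payload")


if __name__ == "__main__":
    # Shortened stand-in for the logged value; not the full original string.
    sample = ("/campaign/(f(2ewt_ygmhttp://www.facebook.com/"
              "externalhit_uatext.phptelf6gq))/home/index")
    clean_path, kind, payload = split_token_from_path(sample)
    print(clean_path)
    print(kind, len(payload))
    print("http://www.facebook.com/externalhit_uatext.php" in payload)
```

Grouping log lines by the cleaned path and by whether the payload contains the Facebook URL would show how often this pattern occurs and on which pages.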

How to block spammers from using my public email api

I am working on a web application that allows users to share content from a web page by clicking on an 'email to friend' link, similar to what Extole is doing here:
http://www.american-giant.com/mens-heavyweight-full-zip-hooded-sweatshirt-product.html
On this page, if you click on the email icon near "REFER & GET $15", you will see a pop-up where you can enter your own email and a friend's email and edit the subject of the email. When you click send, the data is sent to the back end as JSON. They are using a plain, simple URL to do this, i.e. http://refer.american-giant.com/v2/share.
The problem for me is that spammers somehow got hold of my URL (which I can't mention here) and are now using it to spam others with some sort of script. I placed a check in the back-end API to block an IP if more than 5 share requests originate from it, but the spammers seem to have a lot of IPs (more than 30,000 from what I counted in my logs), so they are still able to send lots of email. One possible solution is to use a CAPTCHA to thwart the spamming script. But I am curious how Extole is doing it. They aren't using any CAPTCHAs, and they are well known, so it is unlikely that spammers don't know about their publicly accessible API. Can anyone shed some light on this?
Note:
1. I am using a third party email service to send the emails.
2. Users are not required to sign in as this defeats the purpose of sharing on a simple website
3. Users can edit the subject and body, thus these are sent to the api call and this is what allows the spammers to abuse the api with their own stuff.
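For reference, the per-IP check described in the question can be sketched as a small sliding-window limiter. This is a sketch under assumptions: the question gives the limit of 5 but no time window, so the one-hour window here is invented, and the class and method names are illustrative. The key is an arbitrary string, so the same limiter could be keyed on something other than the IP address, though the question does not say whether that was tried.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within `window_seconds`."""

    def __init__(self, limit=5, window_seconds=3600.0):
        self.limit = limit                    # 5 comes from the question
        self.window_seconds = window_seconds  # assumed; not given in the question
        self._hits = defaultdict(deque)       # key -> timestamps of allowed requests

    def allow(self, key, now):
        """Record a request for key at time now (seconds); return True if allowed."""
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(limit=5, window_seconds=60.0)
    # Six requests from one address within a few seconds.
    print([limiter.allow("203.0.113.7", float(t)) for t in range(6)])
    # A different address is counted separately.
    print(limiter.allow("203.0.113.8", 5.0))
```

The second call in the usage example shows why this check alone does not stop a sender with tens of thousands of addresses: every new key starts with a fresh allowance.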

Resources