Power Query - Yahoo - HTTP 301 Error

I have a problem that was discussed in another thread, but although the author said he solved it, it is (for me at least) quite unclear how he did it.
Other thread: Yahoo finance historical stock price power query returns 301 response
I use Power Query, an add-in for Excel that supports different kinds of queries. One of them grabs website content, which I automated for Yahoo data. Unfortunately, it seems something changed with the Yahoo site index, and I am not able to use my query anymore.
If I try to recreate the query (build it from scratch again), I get the error "HTTP 301". Even with the "normal" query feature of Excel, the Yahoo data is not available anymore.
Hopefully someone is able to help me.
Best wishes,
Andreas

You should set the "user-agent" header to emulate a browser.
For example, Google Chrome:
let
    url = "https://finance.yahoo.com/quote/AAL/history?p=AAL",
    #"user-agent" = "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.86 Safari/537.36",
    web = Web.Contents(url, [Headers = [#"user-agent" = #"user-agent"]]),
    html = Web.Page(web),
    Data0 = html{0}[Data]
in
    Data0

Related

Gmail following links (Gmail-content-sampling)

I'm getting a click event from SendGrid for every link in an email sent to Gmail users, and it is definitely not me clicking the links. I even get the first click event on my SendGrid webhook before SendGrid sends me the delivered event.
This adds needless hits to both the webhook endpoint and the pages the links point to. I can filter out these click events, so I won't get false click counts, by filtering for user agents with "Gmail-content-sampling" in them,
as the user agent supplied is
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246 Gmail-content-sampling
However, it would still needlessly hit the site umpteen thousand times in a few minutes (I don't think that will agree with the standard "click here to validate your email address" links), and I don't want to have to add a user-agent block to every request.
Filtering would also become a problem if other providers do the same thing or Gmail changes the user agent.
I have tried adding rel="nofollow" to the links but it still does it.
Has anyone experienced this, or does anyone know how to get Gmail to f-off?
Edit:
As of 10th August, Google have undone their stuff-up.
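For reference, the user-agent filter described above could look roughly like the sketch below. It is only an illustration: it assumes the webhook body is a JSON array of event objects carrying "event" and "useragent" fields, and the names SCANNER_MARKERS, is_scanner_click and drop_scanner_clicks are made up for the example.

```python
import json

# Substrings that mark automated link scanners rather than real readers.
# Only "Gmail-content-sampling" comes from the question; extend as needed.
SCANNER_MARKERS = ("Gmail-content-sampling",)


def is_scanner_click(event):
    """Return True if a webhook event looks like an automated click."""
    if event.get("event") != "click":
        return False
    user_agent = event.get("useragent", "")
    return any(marker in user_agent for marker in SCANNER_MARKERS)


def drop_scanner_clicks(raw_body):
    """Parse a webhook POST body (assumed JSON array) and drop scanner clicks."""
    events = json.loads(raw_body)
    return [e for e in events if not is_scanner_click(e)]
```

This only cleans up the click counts on the webhook side. It does nothing about the extra hits on the linked pages themselves.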

API Authentication from Google Chrome in Python

I apologize in advance as I'm pretty new to this stuff. I've seen similar questions, but just can't figure out my particular situation.
I'm trying to use an API through Python, but can't figure out how to authenticate, and there is no documentation. I use a service that has a website. The website seems to be powered by an API. Therefore, when I trace all of my network traffic through ctrl-shift-I, I can see all the API calls I need to use as I click through the website. So, even though the API isn't documented, I know all of the endpoints I need.
Once I log in to my account via the website, the API is authenticated. I can then make requests in the browser to the API. However, I can't seem to figure out how to authenticate via Python with the requests library. I'm open to any manner of authenticating, and have even tried using cookies from my browser per other suggestions on Stack Overflow, but I'm very unfamiliar with that method.
Am I completely missing something here? Most of the authentication methods I've tried are ones I found as solutions on Stack Overflow that seem to work for others.
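import requests

# Attempt 1: JSON payload with a browser user agent
# (headers must be a dict of name -> value, not a bare string in braces)
s = requests.Session()
payload = {'usernameOrEmail': 'XXXXXXX', 'password': 'XXXXXXX'}
s.post('https://XXXXXXX', json=payload, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.81 Safari/537.36'})

# Attempt 2: JSON payload with default headers
s = requests.Session()
payload = {'usernameOrEmail': 'XXXXXXX', 'password': 'XXXXXXX'}
s.post('https://XXXXXXX', json=payload)

# Attempt 3: form-encoded payload, certificate verification disabled
url = 'https://XXXXXXX'
values = {'usernameOrEmail': 'XXXXXXX',
          'password': 'XXXXXXX'}
r = requests.post(url, data=values, verify=False)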
I've tried using the URL of the login page, the member overview page that you are forwarded to after logging in, the API URLs/endpoints, etc. I'm 98% sure I have the name/value pairs correct for username and password.
I've even tried a simple GET request with the usernameOrEmail and password fields appended to the URL.
I've gotten 401s, 404s, and most recently a 405, depending on the URL and method combination.
Apologies in advance as I'm sure this is something extremely basic. Am I on the right track with submitting Username and Password or do I need to go the route of using browser cookies?
I'm using Spyder through Anaconda.
Thanks
Updated for Code Producing a 404:
import requests
s = requests.Session()
payload = {'username_or_email': 'XXXXXXX', 'password':'XXXXXXX'}
test=s.post('https://members.onepeloton.com/login', json=payload)
print(test)

There's nothing HTTP-related that you can do in your browser but can't do in Python, except for running JavaScript.
You're probably missing a request or some parameter. Maybe a cookie was set before you started your browser session.
Try recording and emulating all the requests from an incognito tab, that way you're sure you start with the same state as your Python session.
Maybe we can help you more if you can tell us which website you're trying to authenticate to.
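As a rough sketch of that advice: replay the login request you recorded, keep the cookies in one session, then call the API with that same session. Everything specific here is an assumption. The URLs in the usage comment are placeholders, the username_or_email field name is taken from the updated code in the question, and login_then_get is a made-up helper. The session argument is anything that behaves like requests.Session().

```python
# Browser-like User-Agent, copied from the question.
USER_AGENT = ("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/72.0.3626.81 Safari/537.36")


def login_then_get(session, login_url, api_url, username, password):
    """Log in with one session, then reuse its cookies for an API call.

    `session` is expected to behave like requests.Session().
    """
    # headers must be a dict of name -> value.
    headers = {"User-Agent": USER_AGENT}
    payload = {"username_or_email": username, "password": password}

    login = session.post(login_url, json=payload, headers=headers)
    # Stop early on 401/404/405 so the failing step is obvious.
    login.raise_for_status()

    # Cookies set by the login response are stored on the session and
    # sent automatically with this request.
    return session.get(api_url, headers=headers)


# Usage (needs the third-party requests package and real URLs):
#   import requests
#   resp = login_then_get(requests.Session(),
#                         "https://example.com/auth/login",  # placeholder
#                         "https://example.com/api/me",      # placeholder
#                         "XXXXXXX", "XXXXXXX")
#   print(resp.status_code)
```

The login URL to use is whichever one the browser actually POSTs to in the network trace, which is often not the address of the login page itself.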

How to get the real HQ image from the Instagram API?

Firstly, there are already questions on this topic, but none cover my problem entirely, because they either don't give the data I need or don't work properly.
There are services like InstaDP that are able to show you the HQ version of any profile picture from Instagram. Now, I wonder how this is possible?
I did some research and was able to find a higher quality URL when accessing https://www.instagram.com/instagramforbusiness/?__a=1 (see profile_pic_url_hd, answered here). However, InstaDP seems to have a backend that returns a different URL that redirects to a much higher quality image: https://instadp-cors-222621.appspot.com/get-hd?id=1107766105 (see hd_profile_pic_url_info; I extracted the ID for the URL from the result of the ?__a=1 link). I tested this with my personal profile and was able to get the image of myself in an outstanding quality of 1024x1024. However, the ?__a=1 link seems to return only a 320x320 link for my profile picture.
Since InstaDP does not seem to be the only player able to fetch HQ profile pictures, I went ahead and compared the backends of those players. Each service seems to have a different URL to the HQ profile picture of the same Instagram account. So my conclusion is that the Instagram API is involved in all of them.
So I created a client key at https://www.instagram.com/developer/. I was also able to get my auth token and determine my logged in csrftoken for the X-CSRFToken header. Now my question is how to continue?
I found a few answers to this topic stating I should request https://i.instagram.com/api/v1/users/1107766105/info/, but it always returns the login page as HTML.
I tried a REST client that uses my Chrome cookies, having logged in to Instagram beforehand, and I tried setting my HTTP headers to X-CSRFToken:<mycookietoken> and Content-Type:application/json. (If I don't set the CSRFToken it errors, so I need to add it, but if the header is set I get the HTML again, even when the CSRFToken is correct. I don't get an error when the CSRFToken is wrong.)
I also tried setting the Origin, Referer and Host headers to trick Instagram into believing the request came from its own window location, without luck. Setting the Host even causes a 400 Bad Request. Adding my access token to the URL had no effect either (?access-token=########).
To sum up my question: how do those services obtain the profile pictures in such great quality, up to 1024x1024, from the cdninstagram servers?

Although I am late, this might help someone:
Step 1:
The first thing you need in order to get an HD Instagram profile picture is the profile ID. It can be found in the source code of the user's profile page. For example, if you view the source of https://www.instagram.com/abdulhaq0/ and search for "logging_page_id", you will get "profilePage_1285389476". The number following profilePage_ is the ID for this account.
Step 2:
Next, put the ID into the URL https://i.instagram.com/api/v1/users/{ProfileID}/info/ and open it in a browser. In our case the link would be https://i.instagram.com/api/v1/users/1285389476/info/
Step 3:
Now, in the response from the link above, search for "hd_profile_pic_url_info". There you can get the URL of the HD Instagram profile picture.
Hope this helped.
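The three steps above could be sketched in Python roughly like this. It is an illustration under assumptions, not a tested client: the helper names are made up, fetching the pages is left out, and the JSON layout ("user" containing "hd_profile_pic_url_info" with a "url" key) is assumed rather than confirmed by the answer.

```python
import json
import re

INFO_URL = "https://i.instagram.com/api/v1/users/{}/info/"


def extract_profile_id(page_source):
    """Step 1: find the digits after "profilePage_" in the page source."""
    match = re.search(r"profilePage_(\d+)", page_source)
    return match.group(1) if match else None


def info_url(profile_id):
    """Step 2: build the user info URL for a profile ID."""
    return INFO_URL.format(profile_id)


def extract_hd_url(info_json):
    """Step 3: read the HD picture URL from the info response text.

    Assumes {"user": {"hd_profile_pic_url_info": {"url": ...}}}.
    """
    data = json.loads(info_json)
    return data.get("user", {}).get("hd_profile_pic_url_info", {}).get("url")
```

Each helper returns None when the expected marker or key is missing, which is what you would see if the page layout or the response format has changed.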

Steps:
1. Get the Instagram post link, e.g. https://www.instagram.com/p/Bo-Jru-g7Wa/
If you don't have the link, the Instagram API provides a permalink in the result array; the shortcode for the above link is Bo-Jru-g7Wa.
2. Add media?size=l after the URL.
Result: the high-quality image URL:
https://www.instagram.com/p/Bo-Jru-g7Wa/media?size=l
you can see it in action here: https://jsfiddle.net/nmj1z7wo/fiddle URL
This link can be considered shorthand for Instagram image URLs, which are much longer.
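A small sketch of that URL construction, assuming the input is either a full post link or a bare shortcode. The helper name hq_media_url is made up, and whether Instagram still serves this URL is not something the code checks.

```python
import re


def hq_media_url(post_link_or_shortcode):
    """Build the media?size=l URL from a post link or a bare shortcode."""
    match = re.search(r"instagram\.com/p/([^/?#]+)", post_link_or_shortcode)
    # Fall back to treating the input as the shortcode itself.
    shortcode = match.group(1) if match else post_link_or_shortcode.strip("/")
    return "https://www.instagram.com/p/{}/media?size=l".format(shortcode)
```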

I believe sites like https://instadp.site/ that show the hi-res version of the user profile image do not use the official Instagram API.
In the past it was possible to hack Instagram's CDN URLs to change parameters and get the high resolution from them, but nowadays the URLs are signed and if you change any parameters the URL will fail.
So, the only solution they may be using is to emulate a client. There is a popular PHP client for this: https://github.com/mgp25/Instagram-API

The Instagram API has been updated. Now we only get the 320x320 image from
https://www.instagram.com/{username}/?__a=1
Even if you get the user ID from this endpoint, the "user info" endpoint does not return the HD URL in its response, and you now have to pass a header too.
import requests

def get_user_by_user_id(user_id):
    if user_id:
        base_url = "https://i.instagram.com/api/v1/users/{}/info/"
        # Mimic the user agent of the Instagram mobile app
        headers = {
            'user-agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 Instagram 105.0.0.11.118 (iPhone11,8; iOS 12_3_1; en_US; en-US; scale=2.00; 828x1792; 165586599)'
        }
        try:
            res = requests.get(base_url.format(user_id), headers=headers)
            print(res.json())
        except Exception as e:
            # Python 3 exceptions have no .message attribute, so format e itself
            print("getting user failed, due to '{}'".format(e))

# userid must be set to the numeric profile ID before this call
get_user_by_user_id(userid)
You can try this code. I guess websites like InstaDP do not use Instagram's official API.

The easiest and most reliable way of getting an HD (1080x1080) profile pic is the Instaloader CLI:
instaloader USERNAME_OF_INTEREST --no-posts --login YOUR_USERNAME

How do I detect Sony Bravia internet browser?

I'd like to write a detector for the Sony Bravia TV internet browser. I know it's an Opera browser, but I don't know exactly which properties to use to detect it. Does anyone know which Opera version it is, or how I can tell that it's the Sony TV browser?
Thanks a lot

The detection method depends on whether you're doing it on the server side (in which case you have to look at the extra headers that the TV's browser sends) or in client-side JavaScript (in which case you have to look at the navigator.userAgent property).
As for the extra headers, the only information I could find is these example headers:
X-AV-Physical-Unit-Info: pa="BRAVIA KDL-46XBR9";
X-AV-Client-Info: av=5.0; cn="Sony Corporation"; mn="BRAVIA KDL-46XBR9"; mv="1.7";
As you can see above, the TV identifies itself with the extra X-AV-Physical-Unit-Info and X-AV-Client-Info headers.
As for client-side detection of the TV browser, I found this post (referring to Google TV, but still...) in which you can see the content of the navigator.userAgent property on two devices, including a Sony Bravia.
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.127 Large Screen Safari/533.4 GoogleTV/ 162671
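On the server side, a check based on those two headers might look like the sketch below. It assumes the request headers are available as a plain name-to-value mapping and that the model string contains "BRAVIA" as in the example headers. The function name is_sony_bravia is made up, and real TVs may send different values.

```python
def is_sony_bravia(headers):
    """Guess from HTTP request headers whether the client is a Bravia TV.

    `headers` is a mapping of header name -> value.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Look for the model name in either of the extra X-AV-* headers.
    for name in ("x-av-client-info", "x-av-physical-unit-info"):
        if "bravia" in lowered.get(name, "").lower():
            return True
    return False
```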

Sharepoint Webservices - GetUserInfo

I am trying to call GetUserInfo on a SharePoint list (using the SharePoint web services), which seems to work OK, but only if the user I am trying to get details for has already added an item to the list using the actual SharePoint site.
I would like to be able to call GetUserInfo for people who haven't already added an item to the list.
The list itself is open to any NT AUTHORITY\authenticated users to post items. When they add a list item, it seems to add them as a site member, but doesn't seem to add them to a specific group or role (as far as I can see!).
Has anyone else come up against the same problem? Is there a workaround available?

After some digging around I managed to find a way round this one.
The People web service (people.asmx) has a method, ResolvePrincipals, which accepts one or more users' NT logins (or email addresses) and resolves them to their associated SharePoint accounts for the site, including the unique ID of the user, which is what I was after.
The method has a boolean parameter (addToUserInfoList). When it is set to true, the method automatically adds the user to the site (if they don't already exist).
MSDN documentation can be found here:
http://msdn.microsoft.com/en-us/library/people.people.resolveprincipals(v=office.12).aspx
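As an illustration, the SOAP body for such a call could be built like this. Treat it as a sketch under assumptions: the element names and namespace follow my reading of the People.asmx service description, principalType is fixed to "User", and the helper name build_resolve_principals is made up. Sending the request and authenticating are left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SP_NS = "http://schemas.microsoft.com/sharepoint/soap/"


def build_resolve_principals(logins, add_to_user_info_list=True):
    """Return a SOAP envelope for People.asmx ResolvePrincipals as text."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    call = ET.SubElement(body, "{%s}ResolvePrincipals" % SP_NS)

    # One <string> element per NT login or email address to resolve.
    keys = ET.SubElement(call, "{%s}principalKeys" % SP_NS)
    for login in logins:
        ET.SubElement(keys, "{%s}string" % SP_NS).text = login

    ET.SubElement(call, "{%s}principalType" % SP_NS).text = "User"
    # true asks SharePoint to add unknown users to the site's user list.
    flag = ET.SubElement(call, "{%s}addToUserInfoList" % SP_NS)
    flag.text = "true" if add_to_user_info_list else "false"

    return ET.tostring(envelope, encoding="unicode")
```

The resulting text would then be POSTed to the site's people.asmx endpoint with the usual SOAP headers; check the service description on your own server for the exact values.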

Users don't actually get added into the SharePoint user list until they have visited a site.
Are you just after the NT Login ID or something else?
