Python 3.5.2 Iterating a get request

Hoping someone can tell me whether this script is functioning the way I intended, and if not, explain what I am doing wrong.
The RESTful API I am using has a parameter pageSize that ranges from 10 to 50. I used pageSize=50. There is another parameter, pageNumber, that I did not use.
So, I thought this would be the right way to make the GET request:
# Python 3.5.2
import requests

r = requests.get(url, stream=True)
with open("file.txt", 'w', newline='', encoding='utf-8') as fd:
    text_out = r.text
    fd.write(text_out)
UPDATE
I think I understand a bit better. I read the documentation in more detail, but I am still missing how to get the entire data set from the API. Here is some more information:
verbs = requests.options(r.url)
print(verbs.headers)
{'Server': 'ninx', 'Date': 'Sat, 24 Dec 2016 22:50:13 GMT',
'Allow': 'OPTIONS,HEAD,GET', 'Content-Length': '0', 'Connection': 'keep-alive'}
print(r.headers)
{'Transfer-Encoding': 'chunked', 'Vary': 'Accept-Encoding',
'X-Entity-Count': '50', 'Connection': 'keep-alive',
'Content-Encoding': 'gzip', 'Date': 'Sat, 24 Dec 2016 23:59:07 GMT',
'Server': 'ninx', 'Content-Type': 'application/json; charset=UTF-8'}
Should I create a session and use the previously unused pageNumber parameter to create a new url until the 'X-Entity-Count' is zero? Or, is there a better way?

I found a discussion that helped clear this matter up for me, so this updated question should probably be deleted:
API pagination best practices
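The approach proposed in the question (requesting successive pageNumber values until the count drops to zero) can be sketched as follows. This is a minimal sketch, not the API's documented behavior: it assumes that pageNumber starts at 1, that each page body is a JSON list, and that X-Entity-Count reports the number of items on the current page. The function name fetch_all_pages and the max_pages safety limit are hypothetical.

```python
def fetch_all_pages(session, url, page_size=50, max_pages=1000):
    """Collect the items from every page of a paginated JSON API.

    Assumptions (not confirmed by the question): pageNumber starts at 1,
    each page body is a JSON list, and the X-Entity-Count header holds
    the number of items on the page that was just returned.
    """
    items = []
    for page_number in range(1, max_pages + 1):
        response = session.get(
            url, params={"pageSize": page_size, "pageNumber": page_number}
        )
        response.raise_for_status()
        count = int(response.headers.get("X-Entity-Count", "0"))
        if count == 0:
            break  # an empty page means there is nothing left
        items.extend(response.json())
        if count < page_size:
            break  # a short page is taken to be the last page
    return items
```

With requests installed, this would be called as `fetch_all_pages(requests.Session(), url)`. Passing the session in keeps one connection open across pages and makes the loop easy to exercise with a stand-in object.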

Related

Cypress: intercept a network request with compression-type gzip to simulate mapbox

I am currently having trouble mocking a particular request that mapbox-gl makes. When the map is loaded, .pbf files are requested from Mapbox, and I have not been able to mock this.
My guess is that the core issue is an open Cypress bug, issue-16420.
I tried a lot of different intercept variants. I tried all kinds of response headers. I gzipped, compressed, and Brotli-compressed the file that I serve via the fixture. I tried different encodings for the fixture. Nothing worked. One of the interceptors looks basically like this:
cy.intercept({
  method: 'GET',
  url: '**/fonts/v1/mapbox/DIN%20Offc%20Pro%20Italic,Arial%20Unicode%20MS%20Regular/0-255.pbf?*',
}, {
  fixture: 'fonts/italic.arial.0-255.pbf,binary',
  statusCode: 204,
  headers: {
    'Connection': 'keep-alive',
    'Keep-Alive': 'timeout=5',
    'Transfer-Encoding': 'chunked',
    'access-control-allow-origin': '*',
    'access-control-expose-headers': 'Link',
    'age': '11631145',
    'cache-control': 'max-age=31536000',
    'content-encoding': 'compress',
    'content-type': 'application/x-protobuf',
    'date': 'Sat, 19 Feb 2022 20:46:43 GMT',
    'etag': 'W/"b040-+eCb/OHkPqToOcONTDlvpCrjmvs"',
    'via': '1.1 4dd80d99fd5d0f6baaaf5179cd921f72.cloudfront.net (CloudFront)',
    'x-amz-cf-id': '4uY9rjBgR_R12nkfHFrBMLEpNuWygW9DkmODlMEzwJHABTGCGg8pww==',
    'x-amz-cf-pop': 'FRA56-P7',
    'x-cache': 'Hit from cloudfront',
    'x-origin': 'Mbx-Fonts'
  }
}).as('get.0-255.pbf').as('getItalicArial0-255');
Now, even if this is a bug, there has to be some kind of workaround to serve the file in a Cypress test without an active internet connection. It would be great not to have to rely on the network in tests, so all kinds of workarounds and dirty tricks for making this intercept work are welcome.

Stuck with xml download using python. How to handle that?

I need a hint from you about an issue I'm handling. Using requests to do some webscraping in python, the URL gives me a file to download, but when I get the content from the request, I get the following result:
b'"PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiIHN0YW5kYWxvbmU9InllcyI/Pg0KPERhZG9zRWNvbm9taWNvRmluYW5jZWlyb3MgeG1sbnM6eHNpPSJodHRwOi8vd3d3LnczLm9yZy8yMDAxL1hNTFNjaGVtYS1pbnN0YW5jZSI+DQoJPERhZG9zR2VyYWlzPg0KCQk8Tm9tZUZ1bmRvPkZJSSBCVEdQIExPR0lTVElDQTwvTm9tZUZ1bmRvPg0KCQk8Q05QSkZ1bmRvPjExODM5NTkzMDAwMTA5PC9DTlBKRnVuZG8+DQoJCTxOb21lQWRtaW5pc3RyYWRvcj5CVEcgUGFjdHVhbCBTZXJ2acOnb3MgRmluYW5jZWlyb3MgUy5BLiBEVFZNPC9Ob21lQWRtaW5pc3RyYWRvcj4NCgkJPENOUEpBZG1pbmlzdHJhZG9yPjU5MjgxMjUzMDAwMTIzPC9DTlBKQWRtaW5pc3RyYWRvcj4NCgkJPFJlc3BvbnNhdmVsSW5mb3JtYWNhbz5MdWNhcyBNYXNzb2xhPC9SZXNwb25zYXZlbEluZm9ybWFjYW8+DQoJCTxUZWxlZm9uZUNvbnRhdG8+KDExKSAzMzgzLTI1MTM8L1RlbGVmb25lQ29udGF0bz4NCgkJPENvZElTSU5Db3RhPkJSQlRMR0NURjAwMDwvQ29kSVNJTkNvdGE+DQoJCTxDb2ROZWdvY2lhY2FvQ290YT5CVExHMTE8L0NvZE5lZ29jaWFjYW9Db3RhPg0KCTwvRGFkb3NHZXJhaXM+DQoJPEluZm9ybWVSZW5kaW1lbnRvcz4NCgkJPFJlbmRpbWVudG8+DQoJCQk8RGF0YUFwcm92YWNhbz4yMDIxLTEyLTE1PC9EYXRhQXByb3ZhY2FvPg0KCQkJPERhdGFCYXNlPjIwMjEtMTItMTU8L0RhdGFCYXNlPg0KCQkJPERhdGFQYWdhbWVudG8+MjAyMS0xMi0yMzwvRGF0YVBhZ2FtZW50bz4NCgkJCTxWYWxvclByb3ZlbnRvQ290YT4wLjcyPC9WYWxvclByb3ZlbnRvQ290YT4NCgkJCTxQZXJpb2RvUmVmZXJlbmNpYT5Ob3ZlbWJybzwvUGVyaW9kb1JlZmVyZW5jaWE+DQoJCQk8QW5vPjIwMjE8L0Fubz4NCgkJCTxSZW5kaW1lbnRvSXNlbnRvSVI+dHJ1ZTwvUmVuZGltZW50b0lzZW50b0lSPg0KCQk8L1JlbmRpbWVudG8+DQoJCTxBbW9ydGl6YWNhbyB0aXBvPSIiLz4NCgk8L0luZm9ybWVSZW5kaW1lbnRvcz4NCjwvRGFkb3NFY29ub21pY29GaW5hbmNlaXJvcz4="'
and these headers:
{'Date': 'Thu, 13 Jan 2022 13:25:03 GMT', 'Set-Cookie': 'dtCookie=v_4_srv_27_sn_A24AD4C76E5194F3DB0056C40CBABEF7_perc_100000_ol_0_mul_1_app-3A97e61c3a8a7c6a0b_1_rcs-3Acss_0; Path=/; Domain=.bmfbovespa.com.br, JSESSIONID=LWB+pcQEPreUbb+BtwZ9pyOm.sfnNODE01; Path=/fnet; Secure; HttpOnly, TS01871345=011d592ce1f641d52fa6af8d3b5a924eddc7997db2f6611d8d70aeab610f5e34ea2706a45b6f2c35f2b500d01fc681c74e5caa356c; Path=/; HTTPOnly, TS01e3f871=011d592ce1f641d52fa6af8d3b5a924eddc7997db2f6611d8d70aeab610f5e34ea2706a45b6f2c35f2b500d01fc681c74e5caa356c; path=/; domain=.bmfbovespa.com.br; HTTPonly, TS01d1c2dd=011d592ce1f641d52fa6af8d3b5a924eddc7997db2f6611d8d70aeab610f5e34ea2706a45b6f2c35f2b500d01fc681c74e5caa356c; path=/fnet; HTTPonly', 'X-OneAgent-JS-Injection': 'true', 'X-Frame-Options': 'SAMEORIGIN', 'Cache-Control': 'no-cache, no-store, must-revalidate', 'Pragma': 'no-cache', 'Expires': '0', 'Content-Disposition': 'attachment; filename="08706065000169-ACE28022020V01-000083505.xml"', 'Server-Timing': 'dtRpid;desc="258920448"', 'Connection': 'close', 'Content-Type': 'text/xml', 'X-XSS-Protection': '1; mode=block', 'Transfer-Encoding': 'chunked'}
But it works perfectly and downloads the .xml file when I point the browser to the URL https://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=247031, for example, with the following data:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DadosEconomicoFinanceiros xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<DadosGerais>
<NomeFundo>FII BTGP LOGISTICA</NomeFundo>
<CNPJFundo>11839593000109</CNPJFundo>
<NomeAdministrador>BTG Pactual Serviços Financeiros S.A. DTVM</NomeAdministrador>
<CNPJAdministrador>59281253000123</CNPJAdministrador>
<ResponsavelInformacao>Lucas Massola</ResponsavelInformacao>
<TelefoneContato>(11) 3383-2513</TelefoneContato>
<CodISINCota>BRBTLGCTF000</CodISINCota>
<CodNegociacaoCota>BTLG11</CodNegociacaoCota>
</DadosGerais>
<InformeRendimentos>
<Rendimento>
<DataAprovacao>2021-12-15</DataAprovacao>
<DataBase>2021-12-15</DataBase>
<DataPagamento>2021-12-23</DataPagamento>
<ValorProventoCota>0.72</ValorProventoCota>
<PeriodoReferencia>Novembro</PeriodoReferencia>
<Ano>2021</Ano>
<RendimentoIsentoIR>true</RendimentoIsentoIR>
</Rendimento>
<Amortizacao tipo=""/>
</InformeRendimentos>
</DadosEconomicoFinanceiros>
It seems to me that the data is encrypted, but I have no idea how to get the XML so that I can use the data inside it. Can you help me?
Thank you very much.
EDIT:
The example code I've used is quite simple:
Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> url='https://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=247031'
>>> xhtml = requests.get(url,verify=False, headers={'User-Agent':'Mozzila/5.0'})
After that, xhtml.content shows the string above. (There is an HTTPS warning because of verify=False, which I will handle later.)
I have tried a solution using urllib.request, but got the same result.

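The data seems to be base64 encoded rather than encrypted: the body is a base64 string wrapped in double quotes. Try decoding it:
import base64
import requests

url = 'http://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=247031'
response = requests.get(url, verify=False, headers={'User-Agent': 'Mozzila/5.0'})
# Strip the surrounding double quotes, then decode the base64 payload
decoded = base64.b64decode(response.content.strip(b'"'))
# The payload is UTF-8 XML, so decode the bytes to get readable text
print(decoded.decode('utf-8'))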
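Once decoded, the XML can be parsed to reach the individual fields. The sketch below runs offline: instead of calling the server, it simulates the quoted base64 body from a shortened copy of the XML shown in the question. The helper name decode_quoted_base64_xml and the shortened sample are illustrative only.

```python
import base64
import xml.etree.ElementTree as ET


def decode_quoted_base64_xml(body: bytes) -> ET.Element:
    """Decode a base64 body wrapped in double quotes and parse it as XML."""
    payload = body.strip().strip(b'"')  # drop whitespace and the quotes
    xml_bytes = base64.b64decode(payload)
    return ET.fromstring(xml_bytes)


# Shortened sample with the same element names as the XML in the question.
sample_xml = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    "<DadosEconomicoFinanceiros><DadosGerais>"
    "<NomeFundo>FII BTGP LOGISTICA</NomeFundo>"
    "<CodNegociacaoCota>BTLG11</CodNegociacaoCota>"
    "</DadosGerais></DadosEconomicoFinanceiros>"
).encode("utf-8")

# Simulate what response.content looks like: "<base64 text>"
simulated_body = b'"' + base64.b64encode(sample_xml) + b'"'

root = decode_quoted_base64_xml(simulated_body)
print(root.findtext("DadosGerais/CodNegociacaoCota"))  # prints BTLG11
```

With a real response, `decode_quoted_base64_xml(response.content)` would take the place of the simulated body.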

Error 403 while using exchangelib to access Outlook Exchange server to read emails

I'm trying to read emails from the Microsoft Exchange server using EWS and exchangelib in Python for an email classification problem. But I am unable to connect to the exchange server.
I've tried specifying the version and auth_type, using a certificate (which gives an SSL verify error), and using the SMTP address in place of the username, and it still doesn't connect.
Here is my code:
from exchangelib import Credentials, Account, EWSDateTime, EWSTimeZone, Configuration, DELEGATE, IMPERSONATION, NTLM, ServiceAccount, Version, Build
USER_NAME = 'domain\\user12345'
ACCOUNT_EMAIL = 'john.doe#ext.companyname.com'
ACCOUNT_PASSWORD = 'John#1234'
ACCOUNT_SERVER = 'oa.company.com'
creds = Credentials(USER_NAME, ACCOUNT_PASSWORD)
config = Configuration(server=ACCOUNT_SERVER, credentials=creds)
account = Account(primary_smtp_address=ACCOUNT_EMAIL, config=config, autodiscover=False, access_type=DELEGATE)
print('connecting ms exchange server account...')
print(type(account))
print(dir(account))
account.root.refresh()
Here is the error I am getting:
TransportError: Unknown failure
Retry: 0
Waited: 10
Timeout: 120
Session: 26271
Thread: 15248
Auth type: <requests_ntlm.requests_ntlm.HttpNtlmAuth object at 0x00000259AA1BD588>
URL: https://oa.company.com/EWS/Exchange.asmx
HTTP adapter: <requests.adapters.HTTPAdapter object at 0x00000259AA0DB7B8>
Allow redirects: False
Streaming: False
Response time: 0.28100000000085856
Status code: 403
Request headers: {'User-Agent': 'python-requests/2.21.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'Keep-Alive', 'Content-Type': 'text/xml; charset=utf-8', 'Content-Length': '469', 'Authorization': 'NTLM TlRMTVNTUAADAAAAGAAYAG0AAAAOAQ4BhQAAAAwADABYAAAACQAJAGQAAAAAAAAAbQAAABAAEACTAQAANoKJ4gYBsR0AAAAP7Pyb+wBnMdrlhr4FKVqPbklDSUNJQkFOS0xURElQUlUzODE5MAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAJTLmMBLPHowOJZ46XDs+4ABAQAAAAAAAAZdRefQLNUB+Fc6Z26oxvgAAAAAAgAYAEkAQwBJAEMASQBCAEEATgBLAEwAVABEAAEAEgBIAFkARABFAFgAQwBIADAAOAAEACAAaQBjAGkAYwBpAGIAYQBuAGsAbAB0AGQALgBjAG8AbQADADQASABZAEQARQBYAEMASAAwADgALgBpAGMAaQBjAGkAYgBhAG4AawBsAHQAZAAuAGMAbwBtAAUAIABpAGMAaQBjAGkAYgBhAG4AawBsAHQAZAAuAGMAbwBtAAcACAAGXUXn0CzVAQYABAACAAAACgAQAD9EWlwiiUs304wucsxnkyQAAAAAAAAAALfelDwG05hYOMUqY/e60PY=', 'Cookie': 'ClientId=SINZWMOJKWSKDGEKASFG; expires=Fri, 26-Jun-2020 10:13:02 GMT; path=/; HttpOnly'}
Response headers: {'Cache-Control': 'private', 'Server': 'Microsoft-IIS/8.5', 'request-id': 'ae4dee8d-34e0-471c-8252-b8c1056c8ea0', 'X-CalculatedBETarget': 'pqrexch05.domain.com', 'X-DiagInfo': 'PQREXCH05', 'X-BEServer': 'PQREXCH05', 'X-AspNet-Version': '4.0.30319', 'Set-Cookie': 'exchangecookie=681afc8a0905459182363cce9a98d021; expires=Sat, 27-Jun-2020 10:13:02 GMT; path=/; HttpOnly, X-BackEndCookie=S-1-5-21-1343024091-725345543-504838010-1766210=u56Lnp2ejJqBy87Iysqem5nSy8mbnNLLyZ7H0sfIysbSy5vMz8qdzcvPnpzHgYHNz87G0s/I0s3Iq87Pxc7Mxc/N; expires=Sat, 27-Jul-2019 10:13:02 GMT; path=/EWS; secure; HttpOnly', 'Persistent-Auth': 'true', 'X-Powered-By': 'ASP.NET', 'X-FEServer': 'PQREXCH05', 'Date': 'Thu, 27 Jun 2019 10:13:01 GMT', 'Content-Length': '0'}
Request data: b'<?xml version=\'1.0\' encoding=\'utf-8\'?>\n<s:Envelope xmlns:m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types"><s:Header><t:RequestServerVersion Version="Exchange2013_SP1"/></s:Header><s:Body><m:ResolveNames ReturnFullContactData="false"><m:UnresolvedEntry>ICICIBANKLTD\\IPRU38190</m:UnresolvedEntry></m:ResolveNames></s:Body></s:Envelope>'
Response data: b''

You might need to configure access policy for EWS using PowerShell.
For example (to allow all apps to use REST and EWS):
Set-OrganizationConfig -EwsApplicationAccessPolicy EnforceBlockList -EwsBlockList $null
Taken from Microsoft docs on Set-OrganizationConfig.
Please search for EwsApplicationAccessPolicy in the above link for more granular access control examples.

DNS query not specified or too small

I'm trying to write a Python script to test whether a server can answer DNS-over-HTTPS queries.
So, I read this article and tried to make the same request in Python:
import requests
r=requests.get("https://cloudflare-dns.com/dns-query?name=example.com&type=A", headers={"accept":"application/dns-message"})
print(r.url)
print(r.headers)
print(r.status_code)
Here is the output
https://cloudflare-dns.com/dns-query?name=example.com&type=A
{'Access-Control-Allow-Origin': '*', 'Vary': 'Accept-Encoding',
'CF-RAY': '48b33f92aec83e4a-ZRH', 'Expect-CT': 'max-age=604800,
report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"',
'Date': 'Tue, 18 Dec 2018 17:11:23 GMT', 'Transfer-Encoding':
'chunked', 'Server': 'cloudflare', 'Connection': 'keep-alive'}
400
Based on what's written here, a 400 means my DNS query is not specified or too small.
Does anyone see where I'm going wrong?
Thanks

The form you are using to pass parameters (name and type) requires application/dns-json as the MIME type in the Accept header. With application/dns-message, the only parameter is a dns key whose value is the full DNS message in wire format, base64url-encoded.
Compare:
curl -H 'accept: application/dns-json' 'https://cloudflare-dns.com/dns-query?name=example.com&type=AAAA'
(from https://developers.cloudflare.com/1.1.1.1/dns-over-https/json-format/)
with
curl -H 'accept: application/dns-message' -v 'https://cloudflare-dns.com/dns-query?dns=q80BAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB' | hexdump
(from https://developers.cloudflare.com/1.1.1.1/dns-over-https/wireformat/)
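To show where the dns value in the second curl command comes from, here is a minimal sketch that builds a DNS query in wire format and base64url-encodes it without padding. The function names are hypothetical, and the query ID 0xABCD is chosen only because it matches the ID embedded in the curl example. It handles a single question with plain ASCII labels and nothing more.

```python
import base64
import struct


def build_dns_query(name: str, qtype: int = 1, query_id: int = 0) -> bytes:
    """Build a one-question DNS query message (qtype 1 = A, class IN)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, then
    # ANCOUNT, NSCOUNT and ARCOUNT all zero.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question


def doh_get_param(message: bytes) -> str:
    """Encode a DNS message as base64url with the padding removed."""
    return base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")


param = doh_get_param(build_dns_query("www.example.com", query_id=0xABCD))
print(param)  # q80BAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
```

The printed value is the same string used as the dns parameter in the curl command above, so it can be sent as `?dns=<param>` together with the accept: application/dns-message header.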

Request email audit export fails with status 400 and "Premature end of file."

According to https://developers.google.com/admin-sdk/email-audit/#creating_a_mailbox_for_export, I am trying to request the email audit export of a user in G Suite this way:
import httplib2

def requestAuditExport(account):
    # getCredentials() is my own helper (not shown here)
    credentials = getCredentials()
    http = credentials.authorize(httplib2.Http())
    url = 'https://apps-apis.google.com/a/feeds/compliance/audit/mail/export/helpling.com/' + account
    # POST with an Atom content type but without a request body
    status, response = http.request(url, 'POST', headers={'Content-Type': 'application/atom+xml'})
    print(status)
    print(response)
And I get the following result:
{'content-length': '22', 'expires': 'Tue, 13 Dec 2016 14:19:37 GMT', 'date': 'Tue, 13 Dec 2016 14:19:37 GMT', 'x-frame-options': 'SAMEORIGIN', 'transfer-encoding': 'chunked', 'x-xss-protection': '1; mode=block', 'content-type': 'text/html; charset=UTF-8', 'x-content-type-options': 'nosniff', '-content-encoding': 'gzip', 'server': 'GSE', 'status': '400', 'cache-control': 'private, max-age=0', 'alt-svc': 'quic=":443"; ma=2592000; v="35,34"'}
b'Premature end of file.'
I cannot see where the problem is, can someone please give me a hint?
Thanks in advance!
Kay

Fix it by going into the Admin Console, opening the Manage API client access page under Security, and adding the Client ID and the scope needed for the Directory API. For more information, check this document.

Okay, found out what was wrong and fixed it myself. Finally it looks like this:
http = getCredentials().authorize(httplib2.Http())
url = 'https://apps-apis.google.com/a/feeds/compliance/audit/mail/export/helpling.com/'+account
headers = {'Content-Type': 'application/atom+xml'}
xml_data = """<atom:entry xmlns:atom='http://www.w3.org/2005/Atom' xmlns:apps='http://schemas.google.com/apps/2006'> \
<apps:property name='includeDeleted' value='true'/> \
</atom:entry>"""
status, response = http.request(url, 'POST', headers=headers, body=xml_data)
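Not sure if it was about the body or the header. The Content-Type header is the same as in my first attempt, though, so the XML body that was missing before is the likely cause. It works now and I hope it will help others.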
Thanks anyway.
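As an alternative to writing the request body by hand, the same Atom entry can be generated with the standard library. This is only a sketch: the function name build_export_request_body is hypothetical, and it is an assumption that the API accepts this serialization just as it accepts the hand-written string above (the quoting and whitespace differ slightly).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
APPS_NS = "http://schemas.google.com/apps/2006"


def build_export_request_body(properties: dict) -> bytes:
    """Build an atom:entry holding one apps:property per dictionary item."""
    # Register the prefixes so the output uses atom: and apps: names.
    ET.register_namespace("atom", ATOM_NS)
    ET.register_namespace("apps", APPS_NS)
    entry = ET.Element("{%s}entry" % ATOM_NS)
    for name, value in properties.items():
        ET.SubElement(
            entry, "{%s}property" % APPS_NS, {"name": name, "value": value}
        )
    return ET.tostring(entry, encoding="utf-8")


body = build_export_request_body({"includeDeleted": "true"})
print(body.decode("utf-8"))
```

The result would then be passed as `body=body` in the `http.request(...)` call from the answer.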
