DNS query not specified or too small - python-3.x

I'm trying to make a python script to test if a server can answer in DNS-over-HTTPS.
So, I read this article and try to make the same request but in python :
import requests
r=requests.get("https://cloudflare-dns.com/dns-query?name=example.com&type=A", headers={"accept":"application/dns-message"})
print(r.url)
print(r.headers)
print(r.status_code)
Here is the output
https://cloudflare-dns.com/dns-query?name=example.com&type=A
{'Access-Control-Allow-Origin': '*', 'Vary': 'Accept-Encoding',
'CF-RAY': '48b33f92aec83e4a-ZRH', 'Expect-CT': 'max-age=604800,
report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"',
'Date': 'Tue, 18 Dec 2018 17:11:23 GMT', 'Transfer-Encoding':
'chunked', 'Server': 'cloudflare', 'Connection': 'keep-alive'}
400
If I base me on what's written here, my request is not specified or too small.
Does anyone sees where I'm mistaking?
Thanks

The form you are using to pass parameters needs application/dns-json as MIME Accept type. Otherwise for application/dns-message you have only a dns key with the value being the full DNS message encoded.
Compare:
curl -H 'accept: application/dns-json' 'https://cloudflare-dns.com/dns-query?name=example.com&type=AAAA'
(from https://developers.cloudflare.com/1.1.1.1/dns-over-https/json-format/)
with
curl -H 'accept: application/dns-message' -v 'https://cloudflare-dns.com/dns-query?dns=q80BAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB' | hexdump
(from https://developers.cloudflare.com/1.1.1.1/dns-over-https/wireformat/)

Related

Curl HEAD request returns a last-modified header but Node https.request(..., {method: HEAD},...) does not

Apparently there's something I don't know about HEAD requests.
Here's the URL: 'https://theweekinchess.com/assets/files/pgn/eurbli22.pgn', which I'll refer to as <URL> below.
If I curl this, I see a last-modified entry in the headers:
curl --head <URL>
HTTP/2 200
last-modified: Sun, 18 Dec 2022 18:07:16 GMT
accept-ranges: bytes
content-length: 1888745
host-header: c2hhcmVkLmJsdWVob3N0LmNvbQ==
content-type: application/x-chess-pgn
date: Wed, 11 Jan 2023 23:09:14 GMT
server: Apache
But if I make a HEAD request in Node using https, That information is missing:
https.request(<URL>, { method: 'HEAD' }, res => {
console.log([<URL>, res.headers])}).end()
This returns:
[
<URL>
{
date: 'Wed, 11 Jan 2023 23:16:15 GMT',
server: 'Apache',
p3p: 'CP="NOI NID ADMa OUR IND UNI COM NAV"',
'cache-control': 'private, must-revalidate',
'set-cookie': [
'evof3sqa=4b412b5913b38669fc928a0cca9870e4; path=/; secure; HttpOnly'
],
upgrade: 'h2,h2c',
connection: 'Upgrade, Keep-Alive',
'host-header': 'c2hhcmVkLmJsdWVob3N0LmNvbQ==',
'keep-alive': 'timeout=5, max=75',
'content-type': 'text/html; charset=UTF-8'
}
]
I tried axios instead of https:
const response = await axios.head('https://theweekinchess.com/assets/files/pgn/eurbli22.pgn');
console.log({response: response.headers})
And that works (incl. the proper MIME type):
date: 'Thu, 12 Jan 2023 20:00:48 GMT',
server: 'Apache',
upgrade: 'h2,h2c',
connection: 'Upgrade, Keep-Alive',
'last-modified': 'Sun, 18 Dec 2022 18:07:16 GMT',
'accept-ranges': 'bytes',
'content-length': '1888745',
'host-header': 'c2hhcmVkLmJsdWVob3N0LmNvbQ==',
'keep-alive': 'timeout=5, max=75',
'content-type': 'application/x-chess-pgn'
I also tried waiting for re2.on('end', console.log(res.headers)), but same output as before.
I'm going to close this issue and post it instead as a 'bug' on Node's site. I'm sure there's something that needs to be changed in how I'm executing the HEAD request.

Cypress: intercept a network request with compression-type gzip to simulate mapbox

I am currently having trouble with mocking a particular request mapbox-gl is making. When the map is loaded from mapbox pbf-files are being requested and i have not been able to mock this.
My guess is that the core issue is that there seems to be an open bug with cypress issue-16420.
I tried alot of different intercept variants. I tried all kinds of response headers. I gziped, compressed, brd the file that I serve via fixture. I tried different encodings for the fixture. Nothing worked. One of the interceptors looks basically like this
cy.intercept({
method: 'GET',
url: '**/fonts/v1/mapbox/DIN%20Offc%20Pro%20Italic,Arial%20Unicode%20MS%20Regular/0-255.pbf?*',
}, {
fixture: 'fonts/italic.arial.0-255.pbf,binary',
statusCode: 204,
headers: {
'Connection': 'keep-alive',
'Keep-Alive': 'timeout=5',
'Transfer-Encoding': 'chunked',
'access-control-allow-origin': '*',
'access-control-expose-headers': 'Link',
'age': '11631145',
'cache-control': 'max-age=31536000',
'content-encoding': 'compress',
'content-type': 'application/x-protobuf',
'date': 'Sat, 19 Feb 2022 20:46:43 GMT',
'etag': 'W/"b040-+eCb/OHkPqToOcONTDlvpCrjmvs"',
'via': '1.1 4dd80d99fd5d0f6baaaf5179cd921f72.cloudfront.net (CloudFront)',
'x-amz-cf-id': '4uY9rjBgR_R12nkfHFrBMLEpNuWygW9DkmODlMEzwJHABTGCGg8pww==',
'x-amz-cf-pop': 'FRA56-P7',
'x-cache': 'Hit from cloudfront',
'x-origin': 'Mbx-Fonts'
}
}).as('get.0-255.pbf').as('getItalicArial0-255');
Now even if this is a bug there has to be some kind of workaround to serve the file in a cypress test without having an active internet connection. It would be great not having to rely on the network on tests. So all kinds of workarounds and dirty tricks are welcome in making this intercept work.

Stuck with xml download using python. How to handle that?

I need a hint from you about an issue I'm handling. Using requests to do some webscraping in python, the URL gives me a file to download, but when I get the content from the request, I get the following result:
b'"PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiIHN0YW5kYWxvbmU9InllcyI/Pg0KPERhZG9zRWNvbm9taWNvRmluYW5jZWlyb3MgeG1sbnM6eHNpPSJodHRwOi8vd3d3LnczLm9yZy8yMDAxL1hNTFNjaGVtYS1pbnN0YW5jZSI+DQoJPERhZG9zR2VyYWlzPg0KCQk8Tm9tZUZ1bmRvPkZJSSBCVEdQIExPR0lTVElDQTwvTm9tZUZ1bmRvPg0KCQk8Q05QSkZ1bmRvPjExODM5NTkzMDAwMTA5PC9DTlBKRnVuZG8+DQoJCTxOb21lQWRtaW5pc3RyYWRvcj5CVEcgUGFjdHVhbCBTZXJ2acOnb3MgRmluYW5jZWlyb3MgUy5BLiBEVFZNPC9Ob21lQWRtaW5pc3RyYWRvcj4NCgkJPENOUEpBZG1pbmlzdHJhZG9yPjU5MjgxMjUzMDAwMTIzPC9DTlBKQWRtaW5pc3RyYWRvcj4NCgkJPFJlc3BvbnNhdmVsSW5mb3JtYWNhbz5MdWNhcyBNYXNzb2xhPC9SZXNwb25zYXZlbEluZm9ybWFjYW8+DQoJCTxUZWxlZm9uZUNvbnRhdG8+KDExKSAzMzgzLTI1MTM8L1RlbGVmb25lQ29udGF0bz4NCgkJPENvZElTSU5Db3RhPkJSQlRMR0NURjAwMDwvQ29kSVNJTkNvdGE+DQoJCTxDb2ROZWdvY2lhY2FvQ290YT5CVExHMTE8L0NvZE5lZ29jaWFjYW9Db3RhPg0KCTwvRGFkb3NHZXJhaXM+DQoJPEluZm9ybWVSZW5kaW1lbnRvcz4NCgkJPFJlbmRpbWVudG8+DQoJCQk8RGF0YUFwcm92YWNhbz4yMDIxLTEyLTE1PC9EYXRhQXByb3ZhY2FvPg0KCQkJPERhdGFCYXNlPjIwMjEtMTItMTU8L0RhdGFCYXNlPg0KCQkJPERhdGFQYWdhbWVudG8+MjAyMS0xMi0yMzwvRGF0YVBhZ2FtZW50bz4NCgkJCTxWYWxvclByb3ZlbnRvQ290YT4wLjcyPC9WYWxvclByb3ZlbnRvQ290YT4NCgkJCTxQZXJpb2RvUmVmZXJlbmNpYT5Ob3ZlbWJybzwvUGVyaW9kb1JlZmVyZW5jaWE+DQoJCQk8QW5vPjIwMjE8L0Fubz4NCgkJCTxSZW5kaW1lbnRvSXNlbnRvSVI+dHJ1ZTwvUmVuZGltZW50b0lzZW50b0lSPg0KCQk8L1JlbmRpbWVudG8+DQoJCTxBbW9ydGl6YWNhbyB0aXBvPSIiLz4NCgk8L0luZm9ybWVSZW5kaW1lbnRvcz4NCjwvRGFkb3NFY29ub21pY29GaW5hbmNlaXJvcz4="'
and these headers:
{'Date': 'Thu, 13 Jan 2022 13:25:03 GMT', 'Set-Cookie': 'dtCookie=v_4_srv_27_sn_A24AD4C76E5194F3DB0056C40CBABEF7_perc_100000_ol_0_mul_1_app-3A97e61c3a8a7c6a0b_1_rcs-3Acss_0; Path=/; Domain=.bmfbovespa.com.br, JSESSIONID=LWB+pcQEPreUbb+BtwZ9pyOm.sfnNODE01; Path=/fnet; Secure; HttpOnly, TS01871345=011d592ce1f641d52fa6af8d3b5a924eddc7997db2f6611d8d70aeab610f5e34ea2706a45b6f2c35f2b500d01fc681c74e5caa356c; Path=/; HTTPOnly, TS01e3f871=011d592ce1f641d52fa6af8d3b5a924eddc7997db2f6611d8d70aeab610f5e34ea2706a45b6f2c35f2b500d01fc681c74e5caa356c; path=/; domain=.bmfbovespa.com.br; HTTPonly, TS01d1c2dd=011d592ce1f641d52fa6af8d3b5a924eddc7997db2f6611d8d70aeab610f5e34ea2706a45b6f2c35f2b500d01fc681c74e5caa356c; path=/fnet; HTTPonly', 'X-OneAgent-JS-Injection': 'true', 'X-Frame-Options': 'SAMEORIGIN', 'Cache-Control': 'no-cache, no-store, must-revalidate', 'Pragma': 'no-cache', 'Expires': '0', 'Content-Disposition': 'attachment; filename="08706065000169-ACE28022020V01-000083505.xml"', 'Server-Timing': 'dtRpid;desc="258920448"', 'Connection': 'close', 'Content-Type': 'text/xml', 'X-XSS-Protection': '1; mode=block', 'Transfer-Encoding': 'chunked'}
But it works perfectly and download the .xml file when I point the browser to https://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=247031 URL address, for example, with the following data
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DadosEconomicoFinanceiros xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<DadosGerais>
<NomeFundo>FII BTGP LOGISTICA</NomeFundo>
<CNPJFundo>11839593000109</CNPJFundo>
<NomeAdministrador>BTG Pactual Serviços Financeiros S.A. DTVM</NomeAdministrador>
<CNPJAdministrador>59281253000123</CNPJAdministrador>
<ResponsavelInformacao>Lucas Massola</ResponsavelInformacao>
<TelefoneContato>(11) 3383-2513</TelefoneContato>
<CodISINCota>BRBTLGCTF000</CodISINCota>
<CodNegociacaoCota>BTLG11</CodNegociacaoCota>
</DadosGerais>
<InformeRendimentos>
<Rendimento>
<DataAprovacao>2021-12-15</DataAprovacao>
<DataBase>2021-12-15</DataBase>
<DataPagamento>2021-12-23</DataPagamento>
<ValorProventoCota>0.72</ValorProventoCota>
<PeriodoReferencia>Novembro</PeriodoReferencia>
<Ano>2021</Ano>
<RendimentoIsentoIR>true</RendimentoIsentoIR>
</Rendimento>
<Amortizacao tipo=""/>
</InformeRendimentos>
</DadosEconomicoFinanceiros>
It seems to me that the data is cryptographed, but I have no idea how to get the xml data to use the data inside it. Can you help me?
Thank you very much.
EDIT:
The example code I've used is quite simple:
Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> url='fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=247031'
>>> xhtml = requests.get(url,verify=False, headers={'User-Agent':'Mozzila/5.0'})
Then xhtml.content command shows the string. (There is a HTTPS warning due to the verify=False that i will handle after)
I have tried a solution using urllib.request, but got the same result
Data seems to be base64 encoded. Try to decode it:
import requests
import base64
url = 'http://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=247031'
response = requests.get(url,verify=False, headers={'User-Agent':'Mozzila/5.0'})
decoded = base64.b64decode(response.content)
print(decoded)

Request email audit export fails with status 400 and "Premature end of file."

according to https://developers.google.com/admin-sdk/email-audit/#creating_a_mailbox_for_export I am trying to request the email audit export of an user in G Suite this way:
def requestAuditExport(account):
credentials = getCredentials()
http = credentials.authorize(httplib2.Http())
url = 'https://apps-apis.google.com/a/feeds/compliance/audit/mail/export/helpling.com/'+account
status, response = http.request(url, 'POST', headers={'Content-Type': 'application/atom+xml'})
print(status)
print(response)
And I get the following result:
{'content-length': '22', 'expires': 'Tue, 13 Dec 2016 14:19:37 GMT', 'date': 'Tue, 13 Dec 2016 14:19:37 GMT', 'x-frame-options': 'SAMEORIGIN', 'transfer-encoding': 'chunked', 'x-xss-protection': '1; mode=block', 'content-type': 'text/html; charset=UTF-8', 'x-content-type-options': 'nosniff', '-content-encoding': 'gzip', 'server': 'GSE', 'status': '400', 'cache-control': 'private, max-age=0', 'alt-svc': 'quic=":443"; ma=2592000; v="35,34"'}
b'Premature end of file.'
I cannot see where the problem is, can someone please give me a hint?
Thanks in advance!
Kay
Fix it by going intp the Admin Console, Manage API client access page under Security and add the Client ID, scope needed for the Directory API. For more information, check this document.
Okay, found out what was wrong and fixed it myself. Finally it looks like this:
http = getCredentials().authorize(httplib2.Http())
url = 'https://apps-apis.google.com/a/feeds/compliance/audit/mail/export/helpling.com/'+account
headers = {'Content-Type': 'application/atom+xml'}
xml_data = """<atom:entry xmlns:atom='http://www.w3.org/2005/Atom' xmlns:apps='http://schemas.google.com/apps/2006'> \
<apps:property name='includeDeleted' value='true'/> \
</atom:entry>"""
status, response = http.request(url, 'POST', headers=headers, body=xml_data)
Not sure if it was about the body or the header. It works now and I hope it will help others.
Thanks anyway.

Python 3.5.2 Iterating a get request

Hoping someone can tell me whether this script is functioning the way I intended it to, and if not explain what I am doing wrong.
The RESTful API I am using has a parameter pageSize ranging from 10-50. I used pageSize=50. There was another parameter that I did not use called pageNumber
So, I thought this would be the right way to make the get request:
# Python 3.5.2
import requests
r = requests.get(url, stream=True)
with open("file.txt",'w', newline='', encoding='utf-8') as fd:
text_out = r.text
fd.write(text_out)
UPDATE
I think I understand a bit better. I read the documentation in more detail, but I am still missing how to get the entire data set from the API. Here is some more information:
verbs = requests.options(r.url)
print(verbs.headers)
{'Server': 'ninx', 'Date': 'Sat, 24 Dec 2016 22:50:13 GMT',
'Allow': 'OPTIONS,HEAD,GET', 'Content-Length': '0', 'Connection': 'keep-alive'}
print(r.headers)
{'Transfer-Encoding': 'chunked', 'Vary': 'Accept-Encoding',
'X-Entity-Count': '50', 'Connection': 'keep-alive',
'Content-Encoding': 'gzip', 'Date': 'Sat, 24 Dec 2016 23:59:07 GMT',
'Server': 'ninx', 'Content-Type': 'application/json; charset=UTF-8'}
Should I create a session and use the previously unused pageNumber parameter to create a new url until the 'X-Entity-Count' is zero? Or, is there a better way?
I found a discussion that helped clear this matter up for me...this updated question should probably be deleted...
API pagination best practices

Resources