Google Translate API: Multiple input texts - Python

I am struggling to find a way to send multiple input texts to the Google Translate API.
My setup is as follows:
Using urllib.request.build_opener (Python 3)
Google Translate API https://translation.googleapis.com/language/translate/v2
I know that multiple parameters (multiple "q") can be passed, but I don't know how to do that in Python.
I consulted the Google Translate documentation and found this.
My question:
How do I add multiple texts to the input? The following code does not make any sense to me.
data = {'q':'cat', 'q':'dog','source':source,'target':target,'format':'html'}
This is my code.
data = {'q':'This is Text1', 'q':'This is Text2', 'q':'This is Text3', 'source':source,'target':target,'format':'html'}
_req = urllib.request.Request("https://translation.googleapis.com/language/translate/v2?key="+API_KEY)
_req.add_header('Content-length', len(data))
_req.data = urllib.parse.urlencode(data).encode("utf-8")
response = Connector._get(_req,_session)
Connector._get() is in some other file and it internally calls urllib.request.build_opener with data.
Thanks!

To post multiple parameters (with the same name) in Python for an HTTP request, you can use a list for the values. They'll be added to the URL like q=dog&q=cat.
Example:
import requests
headers = {'content-type': 'application/json; charset=utf-8'}
# A list value is expanded into one parameter per item (q=cat&q=dog)
params = {'q': ['cat', 'dog'], 'source': source, 'target': target, 'format': 'html'}
response = requests.post(
    "https://translation.googleapis.com/language/translate/v2?key=",
    headers=headers,
    params=params
)
Specifically, params = {'q': ['cat', 'dog']} is relevant to your question.

I have not tested this myself, but it seems that you should build the data string yourself and pass it as the data argument to your urllib.request call. So something like data = "{\n \'q\':{}\n \'q\':{} {} etc.".format(qstr,qstr, etc...)
After that, you may want to make it less painful to handle several q values.
You could write a loop and build the string with += operations.
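For the urllib route the question asks about, here is a minimal sketch of how repeated "q" fields could be encoded. It assumes the endpoint accepts a form-encoded body, as the question's own code does. The function name build_translate_request and the placeholder key are made up for illustration, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_translate_request(texts, source, target, api_key):
    """Build (but do not send) a POST request with one 'q' field per text."""
    # A list of 2-tuples may repeat a key, which a dict cannot do
    fields = [("q", text) for text in texts]
    fields += [("source", source), ("target", target), ("format", "html")]
    body = urlencode(fields).encode("utf-8")
    url = "https://translation.googleapis.com/language/translate/v2?key=" + api_key
    return Request(url, data=body, method="POST")

# Hypothetical values, for illustration only
req = build_translate_request(["This is Text1", "This is Text2"], "en", "de", "YOUR_API_KEY")
print(req.data)

# Equivalent encoding from a dict with a list value
print(urlencode({"q": ["cat", "dog"]}, doseq=True))
```

There should be no need to set Content-Length by hand here, since urllib fills it in when the data is a bytes object.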

Related

How to run a while loop to run a REST API call until no more results come back in Python

I'm writing a short Python program to request a JSON file using a Rest API call. The API limits me to a relatively small results set (50 or so) and I need to retrieve several thousand result sets. I've implemented a while loop to achieve this and it's working fairly well but I can't figure out the logic for 'continuing the while loop' until there are no more results to retrieve. Right now I've implemented a hard number value but would like to replace it with a conditional that stops the loop if no more results come back. The 'offset' field is the parameter that the API forces you to use to specify which set of results you want in your 50. My logic looks something like...
import requests
import json
from time import sleep
url = "https://someurl"
offsetValue = 0
PARAMS = {'limit':50, 'offset':offsetValue}
headers = {
    "Accept": "application/json"
}
while offsetValue <= 1000:
    response = requests.request(
        "GET",
        url,
        headers=headers,
        params = PARAMS
    )
    testfile = open("testfile.txt", "a")
    testfile.write(json.dumps(json.loads(response.text), sort_keys=True, indent=4, separators=(",", ": ")))
    testfile.close()
    offsetValue = offsetValue + 1
    sleep(1)
So I want to change the condition that controls the while loop from a fixed number to a check of whether the result set returned by the GET request is empty. Hopefully this makes sense.
Your loop can be while True. After each fetch, convert the payload to a dict. If the number of results is 0, then break.
Depending on how the API works, there may be other signals that there is nothing more to fetch, e.g. some HTTP error rather than the result count; you'll have to discover the API's logic for that.
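A minimal sketch of that loop, assuming the API returns an empty list once the offset runs past the last result and that the offset counts items rather than pages. The names fetch_all and fake_fetch_page are hypothetical. The fake function stands in for the real requests call so the loop logic can be tried without a network.

```python
def fetch_all(fetch_page, limit=50):
    """Collect results page by page until a page comes back empty."""
    results = []
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        if not page:
            # Empty result set: nothing more to fetch
            break
        results.extend(page)
        # Assumption: offset is counted in items, not in pages
        offset += len(page)
    return results

def fake_fetch_page(limit, offset):
    """Stand-in for the real API call; serves 120 items in slices."""
    data = list(range(120))
    return data[offset:offset + limit]

all_items = fetch_all(fake_fetch_page)
print(len(all_items))
```

In real use, fetch_page would wrap the GET request with the current limit and offset and return the list of results from the decoded JSON.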

python3, Trying to get an output from my function I defined, need some guidance

I found a pretty cool ASN API tool that lets me supply an AS number and pulls down the subnets related to that ASN.
Here is my rough, partial code. I am defining a function asnfinder that takes ASNNUMBER (which I will supply from another file).
When I call the URL here, it just gives me an n...
What I'm trying to do is append str(ASNNUMBER) to the end of the ?q= parameter in the URL.
Once I do that, I'd like to display the results and write them to a file.
import requests
def asnfinder(ASNNUMBER):
    print('n\n######## Running ASNFinder ########\n')
    url = 'https://api.hackertarget.com/aslookup?q=' + str(ASNNUMBER)
    response = requests.get(url)
The result I'd like to get is the output of the GET request I'm performing.
## Running ASNFinder
n
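Try something like this. The stray n comes from the 'n\n' in the print string, and the response was never printed:
import requests
def asnfinder(ASNNUMBER):
    # '\n\n' instead of 'n\n', so no literal "n" is printed
    print('\n\n######## Running ASNFinder ########\n')
    url = 'https://api.hackertarget.com/aslookup?q=' + str(ASNNUMBER)
    response = requests.get(url)
    data = response.text
    print(data)
    # Open in write mode; with mode 'r', f.write() would fail
    with open('filename', 'w') as f:
        f.write(data)
This should work fine.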
P.S. If it helped ya, please make sure you mark this as the answer :)

Indexing HTML in Elasticsearch via python3

I am new to Elasticsearch. I have to index many HTML files via python3. I've seen many examples of adding info into Elasticsearch, but couldn't actually find anything appropriate for me. Can I index HTML files without extracting all their information in JSON format? I've seen some examples of indexing PDF to Elasticsearch via PHP using pipeline, but could not find something like this for python.
What do you mean by index HTML files to Elasticsearch? What kind of information do you want to send to Elasticsearch?
Yes, it's definitely possible, but please give a bit more detail about what you want to send to Elasticsearch (full HTML pages, only the name, certain information from the HTML files, etc.).
Here is a sample class that might be handy for you:
import json
import requests
# ELK credentials
ELK_HOST = "[hostname]"
ELK_USER = "[elastic_user]"
ELK_PASSWORD = "[elastic_password]"
HEADERS = {
    'host' : '[put hostname again if using redirects ;)]',
    'Content-Type' : 'application/json',
}
class ElasticSearch():
    def __init__(self, host, user, password):
        self._host = host
        self._user = user
        self._password = password
        self._auth = (self._user, self._password)
    def update_index(self, index, data):
        endpoint = str(index) + "/doc/"
        uri = self._host + "/" + endpoint
        # Serialize the dict to a JSON string before posting
        _data = json.dumps(data)
        response = requests.post(uri, headers=HEADERS, auth=self._auth, data=_data)
        return response
es = ElasticSearch(ELK_HOST, ELK_USER, ELK_PASSWORD)
# some random data
data = {"test1": 1, "test2" : 2}
# update index (if it doesn't exist, a new one will be created)
es.update_index("testindex", data)
Hope this helps!
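Since the document body sent to Elasticsearch is JSON either way, one option is to wrap each HTML file in a small JSON document. The sketch below uses only the standard library. The field names title, text and html are made up for illustration, and the text extraction is deliberately simple. The resulting dict could then be passed to something like update_index above.

```python
import json
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect the <title> and the visible text of an HTML document."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def html_to_document(html):
    """Turn raw HTML into a dict that can be serialized and indexed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Hypothetical field names; keep the raw HTML alongside the extracted text
    return {"title": parser.title.strip(), "text": " ".join(parser.chunks), "html": html}

sample = "<html><head><title>Hello</title><style>p{}</style></head><body><p>First</p><p>Second one</p></body></html>"
doc = html_to_document(sample)
payload = json.dumps(doc)
print(doc["title"], "|", doc["text"])
```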

How do we use POST method in Python using urllib.request?

I have to use the POST method with urllib.request in Python and have written the following code for it.
values = {"abcd":"efgh"}
headers = {"Content-Type": "application/json", "Authorization": "Basic"+str(authKey)}
req = urllib.request.Request(url,values,headers=headers,method='POST')
response = urllib.request.urlopen(req)
print(response.read())
I am able to use 'GET' and 'DELETE' but not 'POST'. Could anyone help me solve this?
Thanks
If you really have to use urllib.request for a POST, you have to:
Encode your data using urllib.parse.urlencode() (if sending a form)
Convert the encoded data to bytes
Specify the Content-Type header (application/octet-stream for raw binary data, application/x-www-form-urlencoded for forms, multipart/form-data for forms containing files, and application/json for JSON)
If you do all of this, your code should look like:
import urllib.parse
import urllib.request
req = urllib.request.Request(url,
    urllib.parse.urlencode(data).encode(),
    headers={"Content-Type": "application/x-www-form-urlencoded"}
)
urlopen = urllib.request.urlopen(req)
response = urlopen.read()
(for forms)
or
import json
import urllib.request
req = urllib.request.Request(url,
    json.dumps(data).encode(),
    headers={"Content-Type": "application/json"}
)
urlopen = urllib.request.urlopen(req)
response = urlopen.read()
(for JSON).
Sending files is a bit more complicated.
From urllib.request's official documentation:
For an HTTP POST request method, data should be a buffer in the
standard application/x-www-form-urlencoded format. The
urllib.parse.urlencode() function takes a mapping or sequence of
2-tuples and returns an ASCII string in this format. It should be
encoded to bytes before being used as the data parameter.
Read more:
Python - make a POST request using Python 3 urllib
RFC 7578 - Returning Values from Forms: multipart/form-data
You can use the requests module for this.
import requests
...
url="https://example.com/"
print(url)
data = {'id':"1", 'value': 1}
r = requests.post(url, data=data)
print(r.text)
print(r.status_code, r.reason)
You can send requests without installing any additional packages.
Call this function with your input data and URL. The function returns the response body.
from urllib import request
import json
def make_request(input_data, url):
    # dict to JSON string, then to bytes
    input_data = json.dumps(input_data).encode('utf-8')
    # A POST is sent because data is not None
    req = request.Request(url, data=input_data, headers={'Content-Type': 'application/json'})
    return request.urlopen(req).read().decode('utf-8')
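Applied to the code in the question, a minimal sketch might look like the following. It assumes authKey already holds the base64-encoded credentials and that the server expects JSON. The function name build_json_post and the sample values are hypothetical. The request is only built, so it can be inspected before calling urllib.request.urlopen(req).

```python
import json
import urllib.request

def build_json_post(url, values, auth_key):
    """Build a JSON POST request; the dict must be serialized to bytes first."""
    body = json.dumps(values).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Note the space after "Basic"; auth_key is assumed to be base64-encoded already
        "Authorization": "Basic " + str(auth_key),
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

# Hypothetical URL and key, for illustration only
req = build_json_post("https://example.com/", {"abcd": "efgh"}, "dXNlcjpwYXNz")
print(req.get_method(), req.data)
```

Passing the dict itself as data, as in the question, is the likely cause of the failure, because urllib expects bytes for a request body.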

Alternatives to string.replace() method that allows for multiple sub-string search and replace

Python newbie here, I'm trying to search a video API which, for some reason won't allow me to search video titles with certain characters in the video title such as : or |
Currently, I have a function which calls the video API library and searches by title, which looks like this:
def videoNameExists(vidName):
    vidName = vidName.encode("utf-8")
    bugFixVidName = vidName.replace(":", "")
    search_url ='http://cdn-api.ooyala.com/v2/syndications/49882e719/feed?pcode=1xeGMxOt7GBjZPp2'.format(bugFixVidName) #this URL is altered to protect privacy for this post
Is there an alternative to .replace() (or a way to use it that I'm missing) that would let me search for more than one sub-string at the same time?
Take a look at the Python re module, specifically the re.sub() function.
Here's an example for your case:
import re
def videoNameExists(vidName):
    vidName = vidName.encode("utf-8")
    # bugFixVidName = vidName.replace(":", "")
    bugFixVidName = re.sub(r'[:|]', "", vidName)
    search_url ='http://cdn-api.ooyala.com/v2/syndications/49882e719/feed?pcode=1xeGMxOt7GBjZPp2'.format(bugFixVidName) #this URL is altered to protect privacy for this post
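As a self-contained sketch for Python 3, where the title is a str rather than bytes, here are two ways to drop several characters in one pass. The helper names and the UNWANTED constant are made up for illustration. str.translate is an alternative to re.sub when only single characters need removing.

```python
import re

UNWANTED = ":|"  # characters the API rejects, per the question

def strip_with_regex(title):
    # re.escape keeps characters such as "|" literal inside the character class
    pattern = "[" + re.escape(UNWANTED) + "]"
    return re.sub(pattern, "", title)

def strip_with_translate(title):
    # With a third argument, str.maketrans maps each listed character to None
    return title.translate(str.maketrans("", "", UNWANTED))

print(strip_with_regex("a:b|c"))
print(strip_with_translate("a:b|c"))
```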
