How do I convert this Python code into a function? - python-3.x

I am using a thread pool (ThreadPool) in my script. I have working code that converts an HTML table to JSON.
I am using pandas for the HTML-table-to-JSON conversion.
html_source2 = str(html_source1)
pool = ThreadPool(4)
table = pd.read_html(html_source2)[0]
table= table.loc[:,~table.columns.str.startswith('Unnamed')]
d = (table.to_dict('records'))
print(json.dumps(d,ensure_ascii=False))
results = (json.dumps(d,ensure_ascii=False))
I want something like:
html_source2 = str(html_source1)
pool = ThreadPool(4)
def abcd():
    table = pd.read_html(html_source2)[0]
    table = table.loc[:, ~table.columns.str.startswith('Unnamed')]
    d = table.to_dict('records')
    print(json.dumps(d, ensure_ascii=False))
    results = json.dumps(d, ensure_ascii=False)

You are almost there. The function needs to take an input argument (here html_str) and return the result so that you can use it outside the function.
html_source2 = str(html_source1)
pool = ThreadPool(4)
def abcd(html_str):
    table = pd.read_html(html_str)[0]
    table = table.loc[:, ~table.columns.str.startswith('Unnamed')]
    d = table.to_dict('records')
    print(json.dumps(d, ensure_ascii=False))
    results = json.dumps(d, ensure_ascii=False)
    return results
my_results = abcd(html_source2)
You can also remove the print call if you don't need to see the output inside the function.
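Once the code is a function that takes an argument and returns a value, it can be handed to the thread pool. The following is a minimal sketch, not the asker's actual code: records_to_json is a hypothetical stand-in that skips the pandas step and only serializes already-parsed records, and the inputs list is made-up sample data.

```python
import json
from multiprocessing.pool import ThreadPool


def records_to_json(records):
    # Hypothetical stand-in for abcd(): serialize a list of row dicts to JSON.
    return json.dumps(records, ensure_ascii=False)


# Made-up sample data: each element plays the role of one parsed table.
inputs = [[{"name": "café"}], [{"name": "tea"}]]

# pool.map calls the function once per input and keeps the input order.
with ThreadPool(4) as pool:
    results = pool.map(records_to_json, inputs)

for line in results:
    print(line)
```

Because the function returns its result instead of only printing it, pool.map can collect one JSON string per input.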

It looks like you are not yet familiar with functions, parameters, and how to call functions. You can read about them here: https://www.w3schools.com/python/python_functions.asp
Consider reading it; it's short.

Related

How to mock a method with parameters and return an iterable in Python

I need to test this function with a unit test:
def nlp_extraction(texts, nlp=None):
    extr = []
    for doc in nlp.pipe([texts]):
        extr.append(list([ent.label_, ent.text]) for ent in doc.ents)
    extracao = [list(extr[i]) for i in range(len(extr))]
    extracao = list(chain.from_iterable(extracao))
    extracao = " ".join([item[1] for item in extracao])
    return [texts, extracao]
Initially I wrote this test, and it worked:
def test_nlp_extraction_entrada_correta():
    nlp = loadModel('ner_extract_ingredients')
    result_reference = ['xilitol', 'xilitol']
    texts = 'xilitol'
    result = nlp_extraction(texts, nlp)
    assert result == result_reference
But this test needs to load the model. Since this is a unit test, I would like to mock the responses so that loading an external model can be avoided. I am trying something like this (along with combinations of the commented-out lines):
def test_nlp_extraction_entrada_correta():
    texts = 'xilitol'
    doc = Mock(name="DOC")
    ents = Mock(name="ENTS", label_='xilitol', text="xilitol")
    doc.ents = [ents]
    from nextmock import Mock
    nlp = Mock()
    nlp_mock = Mock()
    nlp.with_args([texts]).returns([doc])
    nlp_mock.pipe = nlp([texts])
    # nlp_mock.pipe.with_args([texts]).returns(doc)
    # nlp_mock.pipe = [Mock(return_value=doc)]
    result = nlp_extraction(texts, nlp=nlp_mock)
    assert result == result_reference
But an error is always raised, saying that the mock object returned by nlp.pipe([texts]) is not iterable. So I need to mock nlp.pipe([texts]) so that it returns the doc object. How can I do this? I am missing something in the process; can someone help me?
As Cpt.Hook said in comments, the solution was achieved using nlp.pipe.return_value = [doc].
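For illustration, here is a minimal sketch of that fix. It uses the standard library's unittest.mock rather than the nextmock package from the question (a swapped-in choice), and the test name and the "INGREDIENT" label are made up for the example.

```python
from itertools import chain
from unittest.mock import Mock


def nlp_extraction(texts, nlp=None):
    # Function under test, copied from the question.
    extr = []
    for doc in nlp.pipe([texts]):
        extr.append(list([ent.label_, ent.text]) for ent in doc.ents)
    extracao = [list(extr[i]) for i in range(len(extr))]
    extracao = list(chain.from_iterable(extracao))
    extracao = " ".join([item[1] for item in extracao])
    return [texts, extracao]


def test_nlp_extraction_with_mock():
    texts = "xilitol"
    # One fake entity with the two attributes the function reads.
    ent = Mock(name="ENT", label_="INGREDIENT", text="xilitol")
    doc = Mock(name="DOC")
    doc.ents = [ent]
    nlp_mock = Mock(name="NLP")
    # Calling nlp_mock.pipe(...) now returns a real list, which is iterable.
    nlp_mock.pipe.return_value = [doc]

    result = nlp_extraction(texts, nlp=nlp_mock)

    assert result == ["xilitol", "xilitol"]
    nlp_mock.pipe.assert_called_once_with([texts])


test_nlp_extraction_with_mock()
```

Setting return_value on nlp_mock.pipe configures what the call returns, whereas assigning a Mock to pipe directly leaves the call result as another non-iterable Mock.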

scipy.optimize.minimize: how to include static data in the function

I am trying to generalize an optimization function using scipy.optimize.
Currently I have written the function this way:
def value_to_optimize(data):
    data_set = np.genfromtxt('myfilepathinstaticmode', delimiter=',', skip_header=1)
    doe = data_set[:, :-1]
    new_data_set = np.vstack((np.array(doe), np.array(data)))
    return result_of_another_function(new_data_set)
def new_data():
    rst = minimize(value_to_optimize, [0, 0])
    return rst.x
The function I am trying to optimize is the first one. To do that I use the second function, which calls minimize with an x0 as the starting point for the optimization.
As you can see, my problem comes from 'myfilepathinstaticmode'. I would like to generalize my function to something like value_to_optimize(filename, data), but at the moment I cannot apply minimize() to it because it only works on numbers.
Any idea how to write it in a generalized manner?
I personally would read the data outside of the objective function and then pass only that data into it:
def value_to_optimize(data, doe):
    new_data_set = np.vstack((np.array(doe), np.array(data)))
    return result_of_another_function(new_data_set)
def new_data():
    data_set = np.genfromtxt('myfilepathinstaticmode', delimiter=',', skip_header=1)
    doe = data_set[:, :-1]
    # Extra arguments for the objective go through args, not bundled with x0
    rst = minimize(value_to_optimize, [0, 0], args=(doe,))
    return rst.x
EDIT: I'm not sure if I understood your question correctly. Alternatively, a more flexible approach would be to use functools.partial to create a function with the filename already bound, which you can then pass to your optimizer.
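A minimal sketch of the functools.partial idea follows. To stay runnable without NumPy or SciPy, it binds already-loaded data rather than a filename, and the body of value_to_optimize is a hypothetical stand-in for result_of_another_function; both are assumptions made for the example.

```python
from functools import partial


def value_to_optimize(data, doe):
    # Hypothetical stand-in for result_of_another_function:
    # squared distance from `data` to the column means of `doe`.
    n_rows = len(doe)
    means = [sum(col) / n_rows for col in zip(*doe)]
    return sum((x - m) ** 2 for x, m in zip(data, means))


# Made-up static data; in the question it would come from np.genfromtxt(filename, ...).
doe = [[1.0, 2.0], [3.0, 4.0]]

# Bind the static argument once; the resulting callable takes only `data`.
objective = partial(value_to_optimize, doe=doe)

# With SciPy installed, the bound function could be passed as:
#     minimize(objective, [0, 0])
print(objective([0.0, 0.0]))
```

The same pattern would work for binding a filename instead of the data, at the cost of re-reading the file on every evaluation of the objective.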
Something like this works, although I'm not convinced about the robustness of the code. The solution: I defined the function value_to_optimize() inside the function new_data(). This way, 'filename' is not a parameter of value_to_optimize() but acts as a kind of "global" within new_data().
def new_data(filename):
    def value_to_optimize(data):
        data_set = np.genfromtxt(filename, delimiter=',', skip_header=1)
        doe = data_set[:, :-1]
        new_data_set = np.vstack((np.array(doe), np.array(data)))
        return result_of_another_function(new_data_set)
    rst = minimize(value_to_optimize, [0, 0])
    return rst.x

Importing a function into another script Indexing error

I am importing one of my sub functions into my main script and I keep getting the following:
IndexError: list index out of range
I am using exactly the same function calls in the script as the function asks for. Does anyone know a way to check whether the function is properly imported?
Below is a sample of the code I am working with:
# main script
from file_with_function import function
output1, output2, output3 = function(roll=15, pitch=30, yaw=45)
# file_with_function.py
print('Argument List: ' + str(sys.argv))
rolla = int(sys.argv[1])
pitcha = int(sys.argv[2])
yawa = int(sys.argv[3])
def function(roll=15, pitch=30, yaw=45):
    # script is here
if __name__ == '__main__':
    function(
        roll=rolla,
        pitch=pitcha,
        yaw=yawa)
There is either an error in the function you're trying to import, or there is something wrong with the data that you are passing into the function. I don't think it's an import error, but without looking at the code I can't really help you.
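One possible cause worth checking, assuming the sys.argv lines sit at module level in the imported file as the sample suggests: Python runs all module-level code on import, so int(sys.argv[1]) would execute when the main script imports the function, and it raises IndexError if the main script was started without arguments. A minimal sketch of keeping the argument parsing out of the import path (the main helper and the placeholder function body are hypothetical):

```python
import sys


def function(roll=15, pitch=30, yaw=45):
    # Placeholder body; the real computation from the question would go here.
    return roll, pitch, yaw


def main(argv):
    # Parse the command-line arguments only when explicitly asked to.
    rolla, pitcha, yawa = (int(arg) for arg in argv[1:4])
    return function(roll=rolla, pitch=pitcha, yaw=yawa)


if __name__ == "__main__":
    # This block is skipped when the module is imported from another script.
    if len(sys.argv) == 4:
        print(main(sys.argv))
    else:
        print("usage: file_with_function.py ROLL PITCH YAW")
```

With this layout, importing function from another script no longer touches sys.argv at all.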

Execute multiple Steps for entries in a List in Python

I am trying to load a list from a .txt file and then execute multiple tasks on every single entry. Unfortunately, the tasks are executed on only one entry instead of all of them.
I load the list from the .txt file with this function:
def load_dir_file():
    directory = os.path.dirname(__file__)
    filename = os.path.join(directory, "law_dir")
    with open(filename, "r", encoding="utf-8") as fin:
        dir_file = fin.readlines()
    return dir_file
This is the code to execute those tasks:
def create_html():
    dir_lst = load_dir_file()
    for dir_link_dirty in dir_lst:
        dir_link = dir_link_dirty.replace('"', "").replace(",", "").replace("\n", "")
        dir_link_code = urllib.request.urlopen(dir_link)
        bs_dir_link_code = BeautifulSoup(dir_link_code, "html5lib")
        h2_a_tag = bs_dir_link_code.h2.a
        html_link = str(dir_link) + "/" + str(h2_a_tag["href"])
    print(dir_lst)
    return html_link
The .txt file currently looks like this:
"https://www.gesetze-im-internet.de/ao_1977",
"https://www.gesetze-im-internet.de/bbg_2009",
"https://www.gesetze-im-internet.de/bdsg_2018"
I am new to programming and am probably getting some very basic points wrong up there. So if you have any recommendations on how I can improve in general, I would more than appreciate it.
Based on your comment above it sounds like you want to return a list of html links not just one. To do that you need that function to build a list and have it return that list. You have a lot going on in create_html, so for illustration purposes I split that function into two: create_html_link_list and create_html_link.
def create_html_link(dir_link_dirty):
    dir_link = dir_link_dirty.replace('"', "").replace(",", "").replace("\n", "")
    dir_link_code = urllib.request.urlopen(dir_link)
    bs_dir_link_code = BeautifulSoup(dir_link_code, "html5lib")
    h2_a_tag = bs_dir_link_code.h2.a
    html_link = str(dir_link) + "/" + str(h2_a_tag["href"])
    return html_link
def create_html_link_list():
    dir_lst = load_dir_file()
    html_link_list = [
        create_html_link(dir_link_dirty)
        for dir_link_dirty in dir_lst
    ]
    return html_link_list
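The cleaning step can also be tested on its own, without any network access. The sketch below is one possible variant, not the answerer's code: clean_link is a hypothetical helper that strips only the surrounding quote, comma, and whitespace, whereas the chained replace calls remove those characters anywhere in the line.

```python
def clean_link(raw_line):
    # Remove the trailing newline, then the trailing comma, then the quotes.
    return raw_line.strip().rstrip(",").strip('"')


# Lines as they would come back from readlines() for the file shown in the question.
raw_lines = [
    '"https://www.gesetze-im-internet.de/ao_1977",\n',
    '"https://www.gesetze-im-internet.de/bbg_2009",\n',
    '"https://www.gesetze-im-internet.de/bdsg_2018"',
]

# Build the full list of cleaned links, one per input line.
links = [clean_link(line) for line in raw_lines]

for link in links:
    print(link)
```

As in the answer above, the list comprehension collects a result for every entry instead of returning only one.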

Periodically running a function in python

I have the following function that gets some data from a web page and stores it in a dictionary, with the time stamp as the key and the data (a list) as the value.
def getData(d):
    page = requests.get('http://www.transportburgas.bg/bg/%D0%B5%D0%BB%D0%B5%D0%BA%D1%82%D1%80%D0%BE%D0%BD%D0%BD%D0%BE-%D1%82%D0%B0%D0%B1%D0%BB%D0%BE')
    soup = BeautifulSoup(page.content, 'html.parser')
    table = soup.find_all("table", class_="table table-striped table-hover")
    rows = table[0].find_all('tr')
    table_head = table[0].find_all('th')
    header = []
    tr_l = []
    rows = []
    for tr in table[0].find_all('tr'):
        value = tr.get_text()
        rows.append(value)
    time_stamp = rows[1].split("\n")[1]
    data = []
    for i in rows[1:]:
        a = i.split("\n")
        if time.strptime(a[1], '%d/%m/%Y %H:%M:%S')[:4] == time.localtime()[:4]:
            data.append(int(a[3]))
        else:
            data.append(np.nan)
    d[time_stamp] = data
The data on the web page gets updated every 5 minutes. I would like to make the function run automatically every 5 minutes. I am trying to do it with time.sleep and this function:
def period_fun(it):
    iterations = it
    while iterations != 0:
        getData(dic)
        time.sleep(300)
        iterations = iterations - 1
However, this function runs only once and I end up with only one item in the dictionary. I have tried it with a simple print (1) instead of the function and it works (1 gets printed several times), but when I implement it with the function it doesn't work.
Would really appreciate any suggestions on the functions or how I could achieve my goal!
Best regards,
Mladen
How about using a library that schedules jobs in a cron-like fashion?
Schedule looks nice, although it does not use cron-like syntax: https://github.com/dbader/schedule
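As a standard-library alternative to a scheduling package, a plain loop can also work. The sketch below is only an illustration under assumptions: run_periodically and its parameters are hypothetical names, and it catches exceptions on the guess that one failing call might otherwise end the loop early, which the question does not confirm.

```python
import time


def run_periodically(task, interval_seconds, iterations, sleep=time.sleep):
    """Call task() `iterations` times, pausing interval_seconds between calls."""
    completed = 0
    for i in range(iterations):
        try:
            task()
            completed += 1
        except Exception as exc:
            # Report the failure and keep going instead of ending the loop.
            print(f"run {i} failed: {exc!r}")
        if i < iterations - 1:
            sleep(interval_seconds)
    return completed


# Usage with a dummy task and a no-op sleep so the example finishes instantly.
calls = []
done = run_periodically(lambda: calls.append(time.time()), 300, 3, sleep=lambda s: None)
print(done, len(calls))
```

In real use the sleep argument would be left at its default, and the task would be something like lambda: getData(dic).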

Resources