Azure FaceAPI limits iteration to 20 items

I have a list of image URLs from which I use the MS Azure Face API to extract features from the photos. The problem is that whenever I iterate over more than 20 URLs, it seems to stop working on every URL after the 20th one. No error is shown. However, when I manually changed the range to iterate over the next 20 URLs, it worked.
Side note: on the free tier, MS Azure Face allows only 20 requests per minute; however, even when I sleep for 60 seconds after every 10 requests, the problem persists.
FYI, I have 360,000 URLs in total, and so far I have made only about 1,000 requests.
Can anyone tell me why this happens and how to solve it? Thank you so much!
My code:

import time
import requests

i = 0
for post in list_post[800:1000]:
    i += 1
    try:
        image_url = post['photo_url']
        headers = {'Ocp-Apim-Subscription-Key': KEY}
        params = {
            'returnFaceId': 'true',
            'returnFaceLandmarks': 'false',
            'returnFaceAttributes': 'age,gender,headPose,smile,facialHair,glasses,emotion,hair,makeup,occlusion,accessories,blur,exposure,noise',
        }
        response = requests.post(face_api_url, params=params, headers=headers, json={"url": image_url})
        post['face_feature'] = response.json()[0]
    except (KeyError, IndexError):
        continue
    if i % 10 == 0:
        time.sleep(60)

The free tier has a maximum of 30,000 requests per month, so your 360,000 faces will take a year to run.
The standard tier costs USD 1 per 1,000 requests, giving a total cost of USD 360. It supports 10 transactions per second.
https://azure.microsoft.com/en-au/pricing/details/cognitive-services/face-api/
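Beyond the monthly quota, note what happens in the posted loop once the 20-requests-per-minute limit is hit: the service returns an error document instead of a list, `response.json()[0]` raises KeyError, and the `continue` in the except block skips the `if i % 10 == 0` check, so the loop never sleeps again and silently discards every URL after the 20th. A minimal sketch of a loop that surfaces errors and throttles unconditionally (reusing face_api_url, KEY, and list_post from the question; the params list is trimmed for brevity):

import time
import requests

for i, post in enumerate(list_post[800:1000], start=1):
    headers = {'Ocp-Apim-Subscription-Key': KEY}
    params = {'returnFaceId': 'true'}  # trimmed attribute list for brevity
    response = requests.post(face_api_url, params=params, headers=headers,
                             json={"url": post['photo_url']})
    body = response.json()
    if isinstance(body, list) and body:
        # A successful detect call returns a JSON array of faces.
        post['face_feature'] = body[0]
    else:
        # Rate-limit and other failures come back as a dict such as
        # {"error": {...}}, which the original except clause swallowed.
        print(i, body)
    if i % 10 == 0:
        time.sleep(60)  # throttle runs even after a failed request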

Related

Lost Clients PBI

I am trying to get the number of lost clients per month. The measure I'm using is the following:
LostClientsRunningTotal =
VAR currdate = MAX('Date'[Date])
VAR turnoverinperiod = [Turnover]
VAR clients =
    ADDCOLUMNS(
        Client,
        "Turnover Until Now",
        CALCULATE(
            [Turnover],
            DATESINPERIOD('Date'[Date], currdate, -1, MONTH)
        ),
        "Running Total Turnover",
        [RunningTotalTurnover]
    )
VAR lostclients =
    FILTER(
        clients,
        [Running Total Turnover] > 0 &&
        [Turnover Until Now] = 0
    )
RETURN
    IF(turnoverinperiod > 0, COUNTROWS(lostclients))
The problem is that I'm getting the running total, and the result it returns is the following:
[screenshot of the running-total result]
What I need is the lost clients per month, so I tried to use the DATEADD function to get the lost clients of the previous month and then subtract the current month's.
The desired result, for Nov-22 for instance, would be 629 (December running total) - 544 (November running total) = 85.
For some reason the DATEADD function is not returning the desired result, and I can't make heads or tails of it.
Can you tell me how I should approach this issue, please? Thank you in advance.

Problem converting video to images using the OpenCV library

I have this code which converts .mov videos to images with the specified frequency (e.g. every 30 seconds here).
import cv2

# the path to take the video from
vidcap = cv2.VideoCapture(r"C:\Users\me\Camera_videos\Images_30sec\PIR-206_7.MOV")

def getFrame(sec):
    vidcap.set(cv2.CAP_PROP_POS_MSEC, sec * 1000)
    hasFrames, image = vidcap.read()
    if hasFrames:
        cv2.imwrite("image" + str(count) + ".jpg", image)  # save frame as JPG file
    return hasFrames

sec = 0
frameRate = 30  # capture an image every 30 seconds
count = 1
success = getFrame(sec)
while success:
    count = count + 1
    sec = sec + frameRate
    sec = round(sec, 2)
    success = getFrame(sec)
I have no problem with smaller files. A 5-minute .mov file, for example, produces 11 images as expected (5 x 60 seconds / 30 seconds = about 10 images, with the first image taken at 0 seconds).
However, when I tried a bigger file, which is 483 MB and about 32 minutes long, I encountered a problem.
It should generate some 32 x 60 / 30 = 64 images.
Instead, it runs and runs, generating some 40,000 images until I manually stop the program. It seems to be stuck on one of the last frames?
I have uploaded both .mov files to my google drive, if anyone wants to have a look.
small file
https://drive.google.com/file/d/1guKtLgM-vwt-5fG3_suJrhVbtwMSjMQe/view?usp=sharing
large file
https://drive.google.com/file/d/1V_HVRM29qwlsU0vCyWiOuBP-tkjdokul/view?usp=sharing
Can somebody advise on what's going on here?
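One defensive pattern worth trying (a sketch, not a confirmed diagnosis: some containers answer seeks past the end of the stream by repeatedly returning the last decoded frame instead of failing): bound the loop by the clip duration, derived from the standard CAP_PROP_FRAME_COUNT and CAP_PROP_FPS properties, so it terminates regardless of what read() reports:

import cv2

vidcap = cv2.VideoCapture(r"C:\Users\me\Camera_videos\Images_30sec\PIR-206_7.MOV")

# Derive the clip length so the loop has a hard upper bound,
# even if read() keeps succeeding past the end of the stream.
fps = vidcap.get(cv2.CAP_PROP_FPS)
frame_count = vidcap.get(cv2.CAP_PROP_FRAME_COUNT)
duration_sec = frame_count / fps if fps else 0.0

sec, count, frameRate = 0, 1, 30
while sec <= duration_sec:
    vidcap.set(cv2.CAP_PROP_POS_MSEC, sec * 1000)
    hasFrames, image = vidcap.read()
    if not hasFrames:
        break
    cv2.imwrite("image" + str(count) + ".jpg", image)
    count += 1
    sec += frameRate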

Push aggregated metrics to the Prometheus Pushgateway from clustered Node JS

I've run my Node apps in a cluster with the Prometheus client implemented. And I've mutated the metrics (for testing purposes), so I've got these results:
# HELP otp_generator_generate_failures A counter counts the number of generate OTP fails.
# TYPE otp_generator_generate_failures counter
otp_generator_generate_failures{priority="high"} 794
# HELP otp_generator_verify_failures A counter counts the number of verify OTP fails.
# TYPE otp_generator_verify_failures counter
otp_generator_verify_failures{priority="high"} 802
# HELP smsblast_failures A counter counts the number of SMS delivery fails.
# TYPE smsblast_failures counter
smsblast_failures{priority="high"} 831
# HELP send_success_calls A counter counts the number of send API success calls.
# TYPE send_success_calls counter
send_success_calls{priority="low"} 847
# HELP send_failure_calls A counter counts the number of send API failure calls.
# TYPE send_failure_calls counter
send_failure_calls{priority="high"} 884
# HELP verify_success_calls A counter counts the number of verify API success calls.
# TYPE verify_success_calls counter
verify_success_calls{priority="low"} 839
# HELP verify_failure_calls A counter counts the number of verify API failure calls.
# TYPE verify_failure_calls counter
verify_failure_calls{priority="high"} 840
FYI: I have adapted the given example from https://github.com/siimon/prom-client/blob/master/example/cluster.js
Then I pushed the metrics, and everything worked fine during the process. But when I checked the gateway portal, it showed different results from the metrics above.
My suspicion is that the pushed metrics come from a single instance instead of being the aggregated metrics of the several running instances. Is that right?
So, has anyone solved this problem?
Oh, this is my push code:
const { Pushgateway, register } = require('prom-client')

const promPg = new Pushgateway(
  config.get('prometheus.pushgateway'),
  { timeout: 5000 },
  register
)

promPg.push({ jobName: 'sms-otp-middleware' }, (err, resp, body) => {
  console.log(`Error: ${err}`)
  console.log(`Body: ${body}`)
  console.log(`Response status: ${resp.statusCode}`)
  res.status(200).send('metrics_pushed')
})

Ramp and Hold Users for Some time and Ramp again

I have the following scenario to load test for a service, and it does not seem to work as expected. My scenario is as follows:
Test with rampUsers(100) over a 15 minute duration
Hold the users for about 10 minutes: holdFor(10 minutes)
Then ramp again with rampUsers(200) over a 15 minute duration
Hold the users for about 10 minutes: holdFor(10 minutes)
Then ramp again with rampUsers(200) over a 15 minute duration
I am trying to use the throttle option for this, but it does not seem to work as expected.
Here are the code snippet combinations I have tried so far:
//NUM_USERS = 300
//DURATION = 15 minutes
//CONSTANT_DURATION = 5 minutes
// Tried with different combinations of NUM_USERS and DURATION but not helpful

scn.inject(
  rampUsers(NUM_USERS * 1) during DURATION,
  constantUsersPerSec(1) during CONSTANT_DURATION,
  rampUsers(NUM_USERS * 2) during DURATION,
  constantUsersPerSec(2) during CONSTANT_DURATION,
  rampUsers(NUM_USERS * 3) during DURATION,
  constantUsersPerSec(3) during CONSTANT_DURATION
)

scn.inject(
  rampUsers(NUM_USERS) during DURATION
).throttle(
  reachRps(NUM_USERS / 4) in (CONSTANT_DURATION),
  holdFor(CONSTANT_DURATION),
  jumpToRps(NUM_USERS / 3),
  holdFor(CONSTANT_DURATION),
  jumpToRps(NUM_USERS / 2),
  holdFor(CONSTANT_DURATION)
)

scn.inject(
  rampUsers(NUM_USERS) during DURATION
).throttle(
  holdFor(CONSTANT_DURATION),
  reachRps(NUM_USERS + NUM_USERS) in (DURATION + DURATION),
  holdFor(CONSTANT_DURATION)
)
Can anyone advise which of these works in this case? I would like to have a graph like this:

To target injection rate, as you stated you want in the comments, you need something like this:
scn.inject(
  rampUsersPerSec(0) to (300) during DURATION,
  constantUsersPerSec(300) during CONSTANT_DURATION,
  rampUsersPerSec(300) to (600) during DURATION,
  constantUsersPerSec(600) during CONSTANT_DURATION,
  ...
)

Trying to scrape and segregate into Headings and Contents. The problem is that both have the same class and tags. How to segregate?

I am trying to web scrape http://www.intermediary.natwest.com/intermediary-solutions/lending-criteria.html, segregating it into two parts, Heading and Content. The problem is that both have the same class and tags. Other than using regex and hard-coding, how can I distinguish them and extract them into two columns in Excel?
In the picture (https://ibb.co/8X5xY9C), and on the website linked above, bold text (except the alphabet letters (A) and the later 'back to top' links) represents a Heading, and the non-bold explanation just below it represents the Content (the content even includes 'li' and 'ul' blocks later in the site, which should come under the respective Heading).
#Code to start with
from bs4 import BeautifulSoup
import requests

url = "http://www.intermediary.natwest.com/intermediary-solutions/lending-criteria.html"
html = requests.get(url)
soup = BeautifulSoup(html.text, "html.parser")
Heading = soup.findAll('strong')
content = soup.findAll('div', {"class": "comp-rich-text"})
The output Excel file should look something like this:
https://i.stack.imgur.com/NsMmm.png
I've thought about it a little more and came up with a better solution. Rather than "crowd" my initial solution, I chose to add a second solution here:
So, thinking about it again, and following my logic of splitting the html by the headlines (essentially breaking it up wherever we find <strong> tags), I chose to convert the sections to strings using .prettify(), split on those specific strings/tags, and read each piece back into BeautifulSoup to pull the text. From what I can see it hasn't missed anything, but you'll have to search through the dataframe to double-check:
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'http://www.intermediary.natwest.com/intermediary-solutions/lending-criteria.html'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}

response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
sections = soup.find_all('div', {'class': 'accordion-section-content'})

results = {}
for section in sections:
    # Split each accordion section on the <strong> tags that mark headlines
    splits = section.prettify().split('<strong>')
    for each in splits:
        try:
            headline, content = each.split('</strong>')[0].strip(), each.split('</strong>')[1]
            headline = BeautifulSoup(headline, 'html.parser').text.strip()
            content = BeautifulSoup(content, 'html.parser').text.strip()
            content_split = content.split('\n')
            content = ' '.join([text.strip() for text in content_split if text != ''])
            results[headline] = content
        except:
            continue

df = pd.DataFrame(results.items(), columns=['Headings', 'Content'])
df.to_csv('C:/test.csv', index=False)
Output:
print (df)
Headings Content
0 Age requirements Applicants must be at least 18 years old at th...
1 Affordability Our affordability calculator is the same one u...
2 Agricultural restriction The only acceptable agricultural tie is where ...
3 Annual percentage rate of charge (APRC) The APRC is all fees associated with the mortg...
4 Adverse credit We consult credit reference agencies to look a...
5 Applicants (number of) The maximum number of applicants is two.
6 Armed Forces personnel Unsecured personal loans are only acceptable f...
7 Back to back Back to back is typically where the vendor has...
8 Customer funded purchase: when the customer has funded the purchase usin...
9 Bridging: residential mortgage applications where the cu...
10 Inherited: a recently inherited property where the benefi...
11 Porting: where a fixed/discounted rate was ported to a ...
12 Repossessed property: where the vendor is the mortgage lender in pos...
13 Part exchange: where the vendor is a large national house bui...
14 Bank statements We accept internet bank statements in paper fo...
15 Bonus For guaranteed bonuses we will consider an ave...
16 British National working overseas Applicants must be resident in the UK. Applica...
17 Builder's Incentives The maximum amount of acceptable incentive is ...
18 Buy-to-let (purpose) A buy-to-let mortgage can be used for: Purcha...
19 Capital Raising - Acceptable purposes permanent home improvem...
20 Buy-to-let (affordability) Buy to Let affordability must be assessed usin...
21 Buy-to-let (eligibility criteria) The property must be in England, Scotland, Wal...
22 Definition of a portfolio landlord We define a portfolio landlord as a customer w...
23 Carer's Allowance Carer's Allowance is paid to people aged 16 or...
24 Cashback Where a mortgage product includes a cashback f...
25 Casual employment Contract/agency workers with income paid throu...
26 Certification of documents When submitting copies of documents, please en...
27 Child Benefit We can accept up to 100% of working tax credit...
28 Childcare costs We use the actual amount the customer has decl...
29 When should childcare costs not be included? There are a number of situations where childca...
.. ... ...
108 Shared equity We lend on the Government-backed shared equity...
109 Shared ownership We do not lend against Shared Ownership proper...
110 Solicitors' fees We have a panel of solicitors for our fees ass...
111 Source of deposit We reserve the right to ask for proof of depos...
112 Sole trader/partnerships We will take an average of the last two years'...
113 Standard variable rate A standard variable rate (SVR) is a type of v...
114 Student loans Repayment of student loans is dependent on rec...
115 Tenure Acceptable property tenure: Feuhold, Freehold,...
116 Term Minimum term is 3 years Residential - Maximum...
117 Unacceptable income types The following forms of income are classed as u...
118 Bereavement allowance: paid to widows, widowers or surviving civil pa...
119 Employee benefit trusts (EBT): this is a tax mitigation scheme used in conjun...
120 Expenses: not acceptable as they're paid to reimburse pe...
121 Housing Benefit: payment of full or partial contribution to cla...
122 Income Support: payment for people on low incomes, working les...
123 Job Seeker's Allowance: paid to people who are unemployed or working 1...
124 Stipend: a form of salary paid for internship/apprentic...
125 Third Party Income: earned by a spouse, partner, parent who are no...
126 Universal Credit: only certain elements of the Universal Credit ...
127 Universal Credit The Standard Allowance element, which is the n...
128 Valuations: day one instruction We are now instructing valuations on day one f...
129 Valuation instruction A valuation will be automatically instructed w...
130 Valuation fees A valuation will always be obtained using a pa...
131 Please note: W hen upgrading the free valuation for a home...
132 Adding fees to the loan Product fees are the only fees which can be ad...
133 Product fee This fee is paid when the mortgage is arranged...
134 Working abroad Previously, we required applicants to be empl...
135 Acceptable - We may consider applications from people who: ...
136 Not acceptable - We will not consider applications from people...
137 Working and Family Tax Credits We can accept up to 100% of Working Tax Credit...
[138 rows x 2 columns]
EDIT: SEE OTHER SOLUTION PROVIDED
It's tricky. I essentially tried to grab the headings, then use them to grab all the text after each heading that precedes the next heading. The code below is a little messy and requires some cleaning up, but hopefully it gets you to a point you can work with, or moves you in the right direction:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import re

url = 'http://www.intermediary.natwest.com/intermediary-solutions/lending-criteria.html'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}

response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
sections = soup.find_all('div', {'class': 'accordion-section-content'})

results = {}
for section in sections:
    headlines = section.find_all('strong')
    headlines = [each.text for each in headlines]
    for i, headline in enumerate(headlines):
        if headline != headlines[-1]:
            next_headline = headlines[i + 1]
        else:
            next_headline = ''
        try:
            find_content = section(text=headline)[0].parent.parent.find_next_siblings()
            if ':' in headline and 'Gifted deposit' not in headline and 'Help to Buy' not in headline:
                content = section(text=headline)[0].parent.nextSibling
                results[headline] = content.strip()
                break
        except:
            find_content = section(text=re.compile(headline))[0].parent.parent.find_next_siblings()
        if find_content == []:
            try:
                find_content = section(text=headline)[0].parent.parent.parent.find_next_siblings()
            except:
                find_content = section(text=re.compile(headline))[0].parent.parent.parent.find_next_siblings()
        content = []
        for sibling in find_content:
            if next_headline not in sibling.text or headline == headlines[-1]:
                content.append(sibling.text)
            else:
                content = '\n'.join(content)
                results[headline.strip()] = content.strip()
                break
        if headline == headlines[-1]:
            content = '\n'.join(content)
            results[headline] = content.strip()

df = pd.DataFrame(results.items())
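Since the question asks for two columns in Excel rather than CSV, either solution's dataframe can be written straight to .xlsx with pandas (a small usage sketch; `results` is the dict built above, and to_excel needs an Excel engine installed):

import pandas as pd

# Reuse the results dict built by either solution above.
df = pd.DataFrame(results.items(), columns=['Headings', 'Content'])

# to_excel requires an engine such as openpyxl (pip install openpyxl).
df.to_excel('lending-criteria.xlsx', index=False)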
