I am trying to scrape data from this link. First, I want to find all headings that are in bold.
I've achieved that using the code below:
import requests
from bs4 import BeautifulSoup
url = 'https://www.emirates.com/pk/english/help/covid-19/dubai-travel-requirements/tourists/'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
headers = []
for sib in soup.findAll('strong'):
    headers.append([sib.text])
The problem is that there is also bold text inside an li tag, and I don't want that as a header. E.g. "If you are flying from India, Pakistan, Nigeria or Bangladesh" is picked up as a header; I don't want it included, because it is in an li tag. How can I solve this?
The next part where I am stuck is that I want to scrape all the text under these headers. To do that, I've written the following code:
main_data = []
data_str = ''
for i in range(0, len(headers)):
    target = soup.find(['h3', 'p'], text=headers[i])
    for sib in target.find_next_siblings():
        if sib.name == "strong":
            break
        else:
            data_str = sib.text + "."
            main_data.append([data_str])
Currently the output is a list of lists, but each tag becomes its own list. The content and headers are also repeated.
The expected output is a list of lists, each containing the text scraped from under one header.
Example:
For header Passengers will need to do COVID‑19 PCR tests only if it is mandated by the country they are travelling to.
main_data[0] = Please check the requirements of the country you are travelling to. The travel regulations change frequently. You may need to take a COVID‑19 PCR test before you depart or another particular type of COVID‑19 test specified by your destination.
This is a list of authorised COVID‑19 test laboratories in Dubai where you can get tested before you travel to your destination.
A solution to the first part could be:
import requests
from bs4 import BeautifulSoup
url = 'https://www.emirates.com/pk/english/help/covid-19/dubai-travel-requirements/tourists/'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
headers = []
for sib in soup.select('p > strong'):
    headers.append([sib.text])
for sib in soup.select('h3 > strong'):
    headers.append([sib.text])
headers
Output:
[['Passengers will need to do COVID-19 PCR tests only if it is mandated by the country they are travelling to.'],
['Rapid COVID-19 testing at Dubai International airport for flights to China'],
['Special COVID-19 PCR test rates for Emirates passengers'],
['Requirements for all passengers arriving in Dubai'],
['Indian Nationals with a normal passport who are travelling to or from India via Dubai can obtain a visa on arrival in Dubai for a maximum stay of 14 days provided they:'],
['Test on arrival'],
['Transiting in Dubai'],
['Test exemptions:'],
['COVID-19 testing laboratories:'],
['Arriving passengers'],
['Vaccination certificate verification'],
['Before you book'],
['Before you travel'],
['When you arrive']]
A solution to the second part:
main_data = []
for i in range(0, len(headers)):
    target = soup.find(['h3', 'p'], text=headers[i])
    text = ''
    for sib in target.find_next_siblings():
        if sib.select_one('p > strong') is not None or sib.select_one('h3 > strong') is not None:
            break
        else:
            text += sib.text
    main_data.append([text])
main_data
Output:
[['Please check the requirements of the country you are travelling to. The travel regulations change frequently. You may need to take a COVID-19 PCR test before you depart or another particular type of COVID-19 test specified by your destination.This is a list of authorised COVID-19 test laboratories in Dubai\ufeff where you can get tested before you travel to your destination.'],
['All passengers, except children under the age of 1, who are travelling to China must have a negative rapid COVID-19 test certificate before travel. You must report to the check-in counter 5 hours before your flight and take this test at Dubai International airport at Emirates Terminal 3 departure area, next to Costa. For further information please refer to the travel requirements for China.'],
['Emirates has expanded its medical partnerships to offer all passengers exclusive home or office COVID-19 PCR testing rates at the following centres:Al Tadawi Medical CentreLocated at Al Masood building, Airport Road, Port Saeed area, Deira.The test costs AED 130 per person. Home or office testing within Dubai costs AED 240 per person. Test results will be available within 24 hours.Prime Medical CentresLocations in Dubai:\nAl Qusais Branch, Damascus StreetPremier Diagnostic and Medical Center, Salah Al Din Street\xa0Prime Corp Medical Center, Salah Al Din Street, DeiraSheikh Zayed Branch, Sheikh Zayed Street, near Noor Islamic BankPrime Specialist Medical Center Sharjah Branch, King Faisal St, Al MajazAjman Branch, Grand Mall, Sheikh Khalifa StThe test costs AED 150 per person. Home or office testing within Dubai for a minimum of two passengers is also available at AED 240 per person. Test results will be available within 24 hours.'],
["All passengers travelling to Dubai from any point of origin (GCC countries included) must hold a negative COVID-19 RT-PCR test certificate for a test taken no more than 72 hours before departure, except for travel from Bangladesh, Ethiopia, India, Nigeria, Pakistan, Sri Lanka, South Africa, Uganda, Vietnam, Zambia (for which specific requirements are stated above). Please see the requirements for travel from India below.The certificate must be a Reverse Transcription-Polymerase Chain Reaction (RT-PCR) test. Other test certificates including antibody tests, NHS COVID Test certificates, Rapid PCR tests and home testing kits are not accepted in Dubai. Travellers must bring an official printed or digital certificate in English or Arabic to check in – SMS certificates are not accepted. PCR certificates in other languages are acceptable if they can be validated at the originating station.COVID-19 RT-PCR test certificates must be issued by an authorised facility in the passenger's departure country. Certificates that have already been presented for travel to another destination can't be used for re-entry even if they are still within the validity period.For passengers arriving from the following countries, it is mandatory that the COVID-19 PCR report includes a QR code linked to the original report for verification purposes. The QR code must be presented at check-in and to representatives of the Dubai Health Authority (DHA) upon arrival in Dubai airports: Indonesia, Sudan, Lebanon, Egypt and Ethiopia."],
['have a visitor visa or a green card issued by the Unites States, ora residence visa issued by the United Kingdom or Europe unionThe visa issued by United States, United Kingdom or Europe union has to be valid for a minimum of 6 months'],
['Passengers arriving in Dubai from the following countries will be required to take another COVID-19 PCR test on arrival at Dubai International airport:Afghanistan, Angola, Argentina, Azerbaijan, Bahrain, Bangladesh, Bosnia & Herzegovina, Brazil, Cambodia, Chile, Croatia, Cyprus, Djibouti, Egypt, Eritrea, Ethiopia, Georgia, Ghana, Greece, Guinea, Hungary, India, Indonesia, Iran, Iraq, Israel, Ivory Coast, Jordan, Kenya, Kuwait, Kyrgyzstan, Lebanon, Malta, Moldova, Montenegro, Morocco, Myanmar, Nepal, Pakistan, Poland, Philippines, Qatar, Rwanda, Russia, Senegal, Slovakia, Somaliland, Somalia, South Africa, South Sudan, Sudan, Syria, Tajikistan, Tanzania, Thailand, Tunisia, Turkey, Turkmenistan, Uganda, Ukraine, Uzbekistan, Zimbabwe.'],
['All transit passengers must complete all the requirements of their final destination.Transit passengers from the following countries must present a negative COVID-19 PCR test certificate for a test taken no more than 72 hours before departure:\xa0Bangladesh, Ethiopia, India, Nigeria, Pakistan, Sri Lanka, South Africa, Uganda, Vietnam, Zambia, IndonesiaAll other transit passengers are not required to present this certificate unless it is mandated by their final destination.'],
["UAE nationals are exempt from taking a COVID-19 PCR test before departing for Dubai. They must be tested on arrival in Dubai, irrespective if they are holding a valid negative COVID-19 RT-PCR certificate from the point of origin.\n\nThis is also applicable for:\nPassengers accompanying a 1st degree UAE nationals' relative or domestic workersDomestic workers escorting a UAE national sponsor during travel.Children under the age of 12 and passengers who have a moderate or severe disability are exempt from taking a COVID-19 RT-PCR test.\nModerate or severe disability includes neurological disorders and intellectual or developmental disabilities. For example: Acute spinal cord injury, Alzheimer's disease, Amyotrophic lateral sclerosis (ALS), Ataxia, Autism spectrum, Bell's palsy, Brain tumours, Cerebral aneurysm, Cerebral palsy, Down Syndrome, Epilepsy and seizuresAll other passengers, including those who are visually impaired, hearing impaired or physically challenged must hold a negative COVID-19 RT-PCR test certificate as per the requirements.There may be specific test exemptions in your country of origin and final destination. Please check the requirements before you travel."],
['The UAE government has specified designated laboratories.\ufeff You can either use the recommended laboratories in the list or any trusted and certified laboratories in your country of origin to get your COVID-19 RT-PCR test.If you are flying from India, Pakistan, Nigeria or Bangladesh , you must get your certificate from one of the labs listed in the designated laboratories document to be accepted on the flight.'],
['Passengers who are planning to travel to Abu Dhabi must comply with the following protocols in place at all Abu Dhabi borders. These procedures may affect travel time.Effective 5 September 2021, Abu Dhabi authorities have revised the rules and updated travel procedures for UAE citizens and residents as well as visitors entering Abu Dhabi.Vaccinated travellersVaccinated travellers from green list destinations must take a COVID-19 PCR test on arrival and on day 6 after arrival but do not have to undergo quarantine.When arriving from other destinations (non-green), they must take a COVID-19 PCR test on arrival and on days 4 and 8 after arrival but do not have to undergo quarantine.The protocol applies to fully vaccinated UAE citizens and residents as well as visitors, which is also documented in the Alhosn App.Unvaccinated travellersUnvaccinated citizens, residents and visitors arriving into Abu Dhabi from green list destinations must take a COVID-19 PCR test on arrival and on days 6 and 9 after arrival but do not have to undergo quarantine.When arriving from other destinations (non-green), they must take a COVID-19 PCR test on arrival, quarantine for 10 days, wear a medically approved wristband and take another COVID-19 PCR test on day 9 of quarantine.To be considered fully vaccinated, individuals must have received two doses of the same vaccine at least 14 days before departure.'],
['Before departure, visitors need to register in the Register Arrivals section of the Federal Authority for Identity and Citizenship (ICA) app, complete the register arrivals form and upload an international vaccination certificate. Visitors will then receive an SMS including a link to download the Alhosn app.Upon arrival in Abu Dhabi, visitors will receive a Unified Identification Number (UID) either at the airport or via ICA app or website. Visitors will then need to download and register on the Alhosn app using the UID and phone number used for ICA registration or when taking a COVID-19 PCR test in the UAE.Visitors will receive a one-time password (OTP) to complete the Alhosn app registration process. Alhosn app allows users to check status, vaccination information, test results and travel test requirements and use a live QR code.These tests and processes are a legal requirement and those failing to follow this process are liable for fines.Find out where to get tested in Dubai before you enter Abu Dhabi\ufeff.'],
['Check if you need a visa.\ufeff Depending on your nationality you can get a visa on arrival, or you can apply for your prearranged visit visa from Dubai Immigration before you travel.'],
['GDRFA\ufeff or ICA\ufeff approval is not required for tourists travelling to the UAE.Passengers arriving from the following countries must follow specific protocols:Bangladesh, Ethiopia, India, Nigeria, Pakistan, Sri Lanka, South Africa, Uganda, Vietnam, ZambiaRequirements for passengers from these countries:A valid negative COVID-19 PCR test certificate with a QR code issued within 48 hours prior to departure from an approved health facilityA rapid PCR test report with a QR code for a test conducted at the departure airport within six hours of departureFor passengers travelling to Dubai as their final destination from Bangladesh, Nigeria, Vietnam and Zambia, travel is currently not possible as there are no rapid PCR testing facilities at the airport.'],
['You may need to take another COVID-19 PCR test on arrival. If you take a test at the airport, you must remain in your hotel or residence until you receive the test result.If the test result is positive, you will be required to undergo isolation and follow the Dubai Health Authority guidelines.You must also download the COVID19 – DXB Smart App iOS\ufeff-Android\ufeff']]
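One thing to note in the output above is that text from adjacent tags gets glued together (e.g. "destination.This is"), because sib.text joins the strings without a separator. Below is a minimal sketch of one way around that. It runs on a small made-up HTML snippet rather than the real page, so the markup, the is_header helper and the sections name are all assumptions for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page structure.
html = """
<div>
  <p><strong>Header one</strong></p>
  <p>First paragraph.</p>
  <ul><li><strong>Bold in list</strong> item text.</li></ul>
  <h3><strong>Header two</strong></h3>
  <p>Second paragraph.</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def is_header(tag):
    # Treat a <p> or <h3> with a direct <strong> child as a header.
    # Bold text inside <li> is never matched, because <li> is not <p>/<h3>.
    return tag.name in ("p", "h3") and tag.find("strong", recursive=False) is not None


sections = {}
for header in soup.find_all(is_header):
    parts = []
    # Walk the following sibling tags until the next header is reached.
    for sib in header.find_next_siblings(True):
        if is_header(sib):
            break
        # A separator keeps text from neighbouring tags from running together.
        parts.append(sib.get_text(" ", strip=True))
    sections[header.get_text(strip=True)] = " ".join(parts)

print(sections)
```

Keying the result by header text also keeps each header paired with its content, which may be easier to work with than two parallel lists.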
I'm using a Stripe Checkout session to create a checkout page for my app.
The default currency for all product prices is 'USD', but I want to convert to other currencies, depending on the browser locale, for customers outside the USA.
Example: the product price is 50 'USD'.
If I live in Germany, the checkout should offer the German payment methods that Stripe supports, and the amount should be converted to 42.33 'EUR'.
stripeServer.checkout.sessions.create({
    customer_email: email,
    line_items: [
        {
            price: 'price_id'
        }
    ],
    payment_method_types: mode === STRIPE.PAYMENT_TYPE.SUBSCRIPTION ? ['card'] : ['card', 'alipay'],
    mode,
    billing_address_collection: 'auto',
    locale: location || 'auto',
});
But right now it only changes the language depending on the location; the price is still in 'USD' instead of 'EUR'.
Stripe checkout session images
I've read the Stripe docs, but they only seem to cover currency conversion for payouts to your bank account: https://stripe.com/docs/payouts
So now, if a customer in India or Singapore wants to pay, I need to create a new price ID in their country's currency. Is there another solution for this?
Thanks for your advice and support.
Stripe doesn't support that kind of automatic currency conversion today, no.
If you want to charge customers in different locations in different currencies, then you have to tell Stripe to charge an amount X in currency Y based on your own specific business logic.
If using Checkout, the idea is that you create a Product and define various Prices for it in different currencies: https://stripe.com/docs/billing/prices-guide.
The amount you use for the local currency is something you as the merchant decide when setting this up. So yes, as you describe, you want to create a new Price for their country/currency combo.
More generally, I want to point out that you probably don’t want to just do a straight conversion; you want to consciously price your product according to local competitors and purchasing power (see, for example, the Big Mac Index). If you Google “price localisation” there are lots of blog posts about this kind of thing. For example, $50 is about 3,600 INR, but the average monthly salary in India is (taking a random example from Google) 16,000 INR. So that "$50" item is a much more expensive proposition for your Indian customer if you just do a straight conversion instead of consciously pricing your product.
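To make the "one Price per currency" idea concrete, here is a rough sketch of the lookup you might run on your server before creating the Checkout session. It is in Python for brevity and makes no Stripe API calls. The price IDs, the country-to-currency table and the function name are all placeholders I made up:

```python
# Hypothetical Price IDs: one Price per currency, each created by you
# with an amount you chose deliberately for that market.
PRICE_BY_CURRENCY = {
    "usd": "price_usd_placeholder",
    "eur": "price_eur_placeholder",
    "inr": "price_inr_placeholder",
    "sgd": "price_sgd_placeholder",
}

# Hypothetical mapping from the customer's country to the currency you charge in.
COUNTRY_TO_CURRENCY = {"US": "usd", "DE": "eur", "IN": "inr", "SG": "sgd"}

DEFAULT_CURRENCY = "usd"


def price_id_for_country(country_code):
    """Return the Price ID to put in line_items for this customer's country."""
    currency = COUNTRY_TO_CURRENCY.get(country_code.upper(), DEFAULT_CURRENCY)
    return PRICE_BY_CURRENCY[currency]


print(price_id_for_country("DE"))  # the EUR placeholder
print(price_id_for_country("BR"))  # not in the table, so the USD placeholder
```

The returned ID is what you would pass as price in line_items. How you detect the customer's country (browser locale, IP, account settings) is up to your own business logic.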
Stripe has announced that they are working on an automatic currency conversion API, so it's worth registering your email on the page below to be informed once the feature is added.
https://stripe.com/docs/payments/checkout/present-local-currencies
I am using BeautifulSoup to parse an HTML page.
I want to get all the text within the <p> tags under this <div id="commentary">
link to image of the HTML content which I want to get
When I use find_all to get all of the <p> tags, the list contains only the first one. I used the following code to count the number of <p> tags present under the <div>. You can clearly see from the above image that there are around 19 <p> tags within the highlighted <div> tag, yet my code prints 1.
content = soup.find('div', attrs={'class':'company-profile'})
points = content.find('div', attrs={'id':'commentary'})
count = 0
for point in points.find_all('p'):
    count = count + 1
print(count)
print(points.text)
I don't know why this is happening and why the find_all method won't return the complete list.
I also tried using points.text to print all of the text within the <div id="commentary"> tag, but it prints the contents of the first <p> tag only.
(mlenv) chirag#debian10:~/ML/Finaments$ python main.py
<class 'bs4.element.Tag'>
State Bank of India is a Fortune 500 company. It is an Indian Multinational, Public Sector banking and financial services statutory body headquartered in Mumbai. It is the largest and oldest bank in India with over 200 years of history.#
1
1
Ratios (Q3FY21)
Capital Adequacy Ratio - 14.50%
Net Interest Margin - 3.34%
Gross NPA - 4.77%
Net NPA - 1.23%
CASA Ratio - 45.15%#
(mlenv) chirag#debian10:~/ML/Finaments$ ^C
(mlenv) chirag#debian10:~/ML/Finaments$
Those 1's are from print(count), and then print(points.text) prints only the content of the first <p> tag.
I have just started using BeautifulSoup, please help me.
You can go after the direct url that has that info. You'll need to pass in there the correct cookies and csrf tokens though:
import requests
from bs4 import BeautifulSoup
url = 'https://www.screener.in/wiki/company/3188/commentary/'
headers= {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36',
'referer': 'https://www.screener.in/company/SBIN/consolidated/',
'x-csrftoken': 'E8zDjm7CtmSqCM2B9rTYPXTcPMJ22w2oynWzWzT4bCgAIaKkt4DmrirBSEPdCP0W',
'cookie': '_gcl_au=1.1.69436223.1621345270; _ga=GA1.2.2056656539.1621345271; _gid=GA1.2.1452432592.1621345271; csrftoken=E8zDjm7CtmSqCM2B9rTYPXTcPMJ22w2oynWzWzT4bCgAIaKkt4DmrirBSEPdCP0W; sessionid=mrdcmrlqpe72dqjrqgtrb2m2v375sjv0; _gat_UA-2456523-7=1'}
response = requests.post(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
count = 0
for point in soup.find_all('p'):
    count = count + 1
print(count)
print(soup.text)
Output:
19
Ratios (Q3FY21)
Capital Adequacy Ratio - 14.50%
Net Interest Margin - 3.34%
Gross NPA - 4.77%
Net NPA - 1.23%
CASA Ratio - 45.15%#
Branch Network
Presently, the bank operates a network of 22,330 branches and ~58,000 ATMs across India. It also operates ~71,000 business correspondent outlets across India.#
Market Share
The bank has a market share of 22.84% in deposits and 19.69% share in advances in India. It has a strong customer base of ~45 crore customers.#
Loan Book
Retail loans account for 39% of the loan book, followed by corporate (37%), SME (14%) and Agriculture (10%).#
Retail Book - Home loans account for 68% of the retail book, followed by xpress credit (22%), auto loans (9%), personal gold loans (2%) and others (9%).#
Exposure
The bank has a well-diversified loan book exposed to various sectors. Top sectors include home loans (23%), infrastructure (15%), services (12%) and agriculture (10%).
~75% of the corporate advances are rated A and better ratings from rating agencies. 38% of the corporate book accounts for PSUs & Govt. departments.#
Segmental NPAs
Presently, the total NPAs of the bank stands at 1,17,244 crores. agriculture segment accounts for the major ratio of NPAs i.e. 13.71% of all loans are NPA. Corporate segment accounts for 59,400 crores worth of NPAs i.e. 51% of total NPAs of the bank.#
International Business
The bank has a global footprint with a network of 233 branches/offices in 32 countries.# It has presence in USA, Canada, Brazil, Russia, Germany, France, Turkey, Australia, Bangladesh, Nepal, Sri Lanka and other countries.#
Presently, Overseas business accounts for 3% of total deposits# and 13% of total advances.#
Government Business
SBI has always been the banker of choice to the government of India and is the market leader in government business. It had turnover of ~52,50,000 lakh crores and commissions of ~3,700 crores from government business in FY20.#
Financial Inclusion Business
The bank has ~71,000 BC outlets which has primary focus on financial inclusion customers.# The bank accounts for 40% of all PMJDY accounts i.e. more than 12 crore accounts.# Presently, the deposits from PMJDY accounts are ~42,500 crores i.e. 1.2% of total deposits of the bank.
Digital Metrics
Increasing digitization resulted in ~40% of asset accounts and ~60% of liability customers added via digital channels in FY21.# 67% of all transactions were initiated through digital channels in 2020 which is up from 58% in the previous year.#
Subsidiaries Operations
The bank owns various subsidiaries which are engaged in related business activities :-
1. SBI Capital Markets Ltd (100% stake) - SBICAP is a leading investment banker, offering investment banking and corporate advisory services to clients across three product categories i.e. project advisory and structured finance, equity capital markets and debt capital markets.
This company further has wholly owned subsidiaries in related businesses viz. SBICAP Securities, SBICAP Trustee Co., SBICAP Ventures & others.#
2. SBI DHFI Ltd (72% stake) - It is a primary dealer and supports the book building process and provide depth and liquidity to secondary markets in G-Sec. It also deals in money market instruments, non G-Sec debt instruments, amongst others.#
3. SBI Cards and Payment Services Ltd (69% stake) - It is a non-banking financial company that offers extensive credit card portfolio to individual cardholders and corporate clients. It has diversified customer acquisition network that enables to engage prospective customers across multiple channels.#
The IPO of SBI Cards was launched in March 2020 wherein the company sold ~13 crore equity shares for a consideration of ₹10,350 crores.#
4. SBI Life Insurance Co. Ltd (57.6% stake) - It is one of the leading life insurance company in India which offers a wide range of individual and group insurance solutions that meet various life stage needs of customers.#
5. SBI Funds Management Pvt Ltd (63% stake) - It is a JV between SBI and AMUNDI (France). It is an asset management company with the fastest CAGR of 33% as against industrial average of 14% in the last 3 years.#
6. SBI General Insurance Company Ltd (70% stake) - It is a general insurance company which focuses on profitable growth in banc-assurance channel along with other distribution channels and line of businesses. It is first non-life insurance company in India to cross 6,000 crores in a decade of operations.#
Amalgamation of Associate Banks
In March 2017, the bank acquired its 5 associate state banks and Bharatiya Mahila Bank by allotting ~13.5 crore equity shares of SBI.#
ISO 3166 defines country codes such as GB, US, FR or RU.
I would like a reasonably definitive association from these country codes to the customary unit of measure for distances between places in those countries.
Specifically on iOS and OS X, the country code can be retrieved from NSLocale:
[[NSLocale currentLocale] objectForKey: NSLocaleCountryCode];
NSLocale also provides a way to see if a country uses metric or non metric units:
const bool useMetric = [[[NSLocale currentLocale] objectForKey: NSLocaleUsesMetricSystem] boolValue];
However, this is not sufficient. For example, in Great Britain (GB) the metric system is widely used, but distances between places continue to be officially measured in miles rather than kilometres.
I also faced this problem :-)
Countries that use the metric system but still use miles:
1. GB is the only exception that still uses miles instead of kilometres.
Note: Canada has also started using kilometres for road transport, although it still uses miles for train and horse transport.
Countries that do not use the metric system:
Liberia, Myanmar and the United States of America.
Note: Myanmar (formerly Burma) is planning to move to the metric system. Currently, Myanmar uses its own system, different from both imperial and metric.
In my app, I check whether the country uses imperial or metric:
if metric: assign kilometres for all countries except Britain
if imperial: assign miles for all countries except Burma
if Burma: assign the Burmese unit
if Britain: assign miles
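The rules above could be sketched roughly like this in Python. The function name, the uses_metric parameter (standing in for something like NSLocaleUsesMetricSystem) and the "local" label for Myanmar's own units are my own assumptions, not an official mapping:

```python
def distance_unit(country_code, uses_metric):
    """Pick a display unit for distances between places.

    country_code: ISO 3166-1 alpha-2 code such as "GB" or "FR".
    uses_metric: whether the locale reports the metric system.
    """
    code = country_code.upper()
    if code == "GB":
        # Britain is metric overall but still uses miles for distances.
        return "miles"
    if code == "MM":
        # Myanmar (Burma) uses its own system; "local" is a placeholder label.
        return "local"
    return "kilometres" if uses_metric else "miles"


print(distance_unit("GB", True))   # miles
print(distance_unit("FR", True))   # kilometres
print(distance_unit("US", False))  # miles
```

Checking the two special cases first keeps the metric/imperial flag as a simple fallback for every other country.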
A chart showing countries that use miles per hour for road speeds is available. It cites Wikipedia's article on miles per hour as its source, which has the following to say:
These include roads in the United Kingdom,[1] the United States,[2] and UK and US territories; American Samoa,[3] the Bahamas,[4] Belize,[5] British Virgin Islands,[6] the Cayman Islands,[7] Dominica,[8] the Falkland Islands,[9] Grenada,[10] Guam,[11] Burma,[12] The N. Mariana Islands,[13] Samoa,[14] St. Lucia,[15] St. Vincent & The Grenadines,[16] St. Helena,[17] St. Kitts & Nevis,[18] Turks & Caicos Islands,[19] the U.S. Virgin Islands,[20][21] Antigua & Barbuda (although km are used for distance),[22] and Puerto Rico (same as former).[22]
I don't see a way to download this as data keyed by ISO 3166 country code, but it's not a huge task to compile one.
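As a starting point, here is a hand-compiled sketch in Python keyed by ISO 3166-1 alpha-2 codes for the places named in the quoted list. It assumes that a place with mph speed limits also measures road distances in miles, except the two the quote flags as using km for distance. That assumption, and the names below, are mine rather than anything definitive:

```python
# ISO 3166-1 alpha-2 codes for the places named in the quoted Wikipedia list.
MPH_SPEED_LIMIT_CODES = {
    "GB", "US", "AS", "BS", "BZ", "VG", "KY", "DM", "FK", "GD", "GU",
    "MM", "MP", "WS", "LC", "VC", "SH", "KN", "TC", "VI", "AG", "PR",
}

# Antigua & Barbuda and Puerto Rico: mph for speed, but km for distance.
KM_DISTANCE_EXCEPTIONS = {"AG", "PR"}


def road_distance_unit(country_code):
    """Return "mi" or "km" for road distances (assumption-based, see above)."""
    code = country_code.upper()
    if code in MPH_SPEED_LIMIT_CODES and code not in KM_DISTANCE_EXCEPTIONS:
        return "mi"
    return "km"


print(road_distance_unit("GB"))  # mi
print(road_distance_unit("FR"))  # km
print(road_distance_unit("PR"))  # km
```

Anything not in the table falls back to kilometres, so new entries only need to be added for the exceptions.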
I'll leave this answer unaccepted in case a better suggestion is available.
Officially, road distances in the UK are in kilometres, but road signs are in miles. Confusing? Yes! When a road engineer gets a plan of a road, everything is in kilometres, and government statistics are in kilometres, but road signs and car odometers are in miles. See https://en.wikipedia.org/wiki/Driver_location_sign for more info.