From the documentation:
Without arguments, return the list of names in the current local scope. With an argument, attempt to return a list of valid attributes for that object.
So I tried to look inside the scrapy module.
scrapy (as in import scrapy) is a module, right, or am I wrong?
>>> dir(scrapy)
NameError: name 'scrapy' is not defined
I'm a complete newbie in Python and am just trying to understand how it works.
How can I look inside modules, as in the documentation examples?
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__loader__', '__name__',
'__package__', '__stderr__', '__stdin__', '__stdout__',
'_clear_type_cache', '_current_frames', '_debugmallocstats', '_getframe',
'_home', '_mercurial', '_xoptions', 'abiflags', 'api_version', 'argv',
'base_exec_prefix', 'base_prefix', 'builtin_module_names', 'byteorder',
'call_tracing', 'callstats', 'copyright', 'displayhook',
'dont_write_bytecode', 'exc_info', 'excepthook', 'exec_prefix',
'executable', 'exit', 'flags', 'float_info', 'float_repr_style',
'getcheckinterval', 'getdefaultencoding', 'getdlopenflags',
'getfilesystemencoding', 'getobjects', 'getprofile', 'getrecursionlimit',
'getrefcount', 'getsizeof', 'getswitchinterval', 'gettotalrefcount',
'gettrace', 'hash_info', 'hexversion', 'implementation', 'int_info',
'intern', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path',
'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1',
'setcheckinterval', 'setdlopenflags', 'setprofile', 'setrecursionlimit',
'setswitchinterval', 'settrace', 'stderr', 'stdin', 'stdout',
'thread_info', 'version', 'version_info', 'warnoptions']
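You need to import the module before calling dir() on it; the NameError means the name scrapy has not been bound in the current session. Try this from your Python interpreter: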
In [1]: import scrapy
In [2]: dir(scrapy)
Out[2]:
['Field',
'FormRequest',
'Item',
'Request',
'Selector',
'Spider',
'__all__',
'__builtins__',
'__cached__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__path__',
'__spec__',
'__version__',
'_txv',
'exceptions',
'http',
'item',
'link',
'selector',
'signals',
'spiders',
'twisted_version',
'utils',
'version_info']
This worked for me in both Python 2 and 3. I have also confirmed that it works in both IPython and the standard interpreter. If it does not work for you even with the import, your environment may be misconfigured in some way, and we can troubleshoot further.
scrapy (as in import scrapy) is a module, right, or am I wrong?
In this case scrapy is a module, and import scrapy is the statement that makes that module available in the scope where the import runs. The "Modules" chapter of the Python tutorial has information on modules and importing them.
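As a minimal sketch of the same idea, the snippet below uses the standard-library json module in place of scrapy so that it runs without extra packages. The helper name public_names and the placeholder name not_imported_module are made up for this illustration.

```python
import importlib


def public_names(module_name):
    """Import a module by name and list its attributes that have no leading underscore."""
    module = importlib.import_module(module_name)
    return [name for name in dir(module) if not name.startswith("_")]


# Calling dir() on a name that was never imported raises NameError,
# which is what happened with dir(scrapy) in the question.
try:
    dir(not_imported_module)  # hypothetical name, deliberately never imported
except NameError as exc:
    message = str(exc)
    print("NameError:", message)

# After the import, dir() can list the module's attributes.
names = public_names("json")
print(names)
```

Filtering out names that start with an underscore is only a readability choice here; plain dir(module) returns the full list, as in the scrapy output above.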
The code below works fine, but I need to scrape multiple URLs and I don't really know how.
It would also be nice to read the URLs from a CSV file, if possible.
Basically, I'm trying to get the redirect link from a search link.
from bs4 import BeautifulSoup
import requests
url = "https://www.tennis-point.fr/index.php?stoken=737F2976&lang=1&cl=search&searchparam=E705Y-0193"
# Getting the webpage, creating a Response object.
response = requests.get(url)
# Extracting the source code of the page.
data = response.text
# Passing the source code to BeautifulSoup to create a BeautifulSoup object for it.
soup = BeautifulSoup(data, 'lxml')
# Extracting all the <a> tags into a list.
tags = soup.find("div", {"class": "productsPicture"}).findAll("a")
# Extracting URLs from the attribute href in the <a> tags.
for tag in tags:
    print(tag.get('href'))
This code fetches every URL (href attribute) on the page:
from bs4 import BeautifulSoup
import requests
url = "https://www.tennis-point.fr/index.php?stoken=737F2976&lang=1&cl=search&searchparam=E705Y-0193"
# Getting the webpage, creating a Response object.
response = requests.get(url)
# Extracting the source code of the page.
data = response.text
# Passing the source code to BeautifulSoup to create a BeautifulSoup object for it.
soup = BeautifulSoup(data, 'html.parser')
# print soup
urls = [item.get("href") for item in soup.find_all("a")]
print(urls)
output:
[u'https://www.tennis-point.fr/frais-d-expedition-et-de-livraison/', u'https://www.tennis-point.fr/garantie-satisfait-ou-rembourse/', u'https://www.tennis-point.fr/protection-des-donnees/', u'tel:+33(0)368331651', u'https://www.tennis-point.fr/', u'https://www.tennis-point.fr/index.php?cl=account&', u'#loginBox', u'https://www.tennis-point.fr/index.php?cl=register&', u'https://www.tennis-point.fr/index.php?cl=account&', u'https://www.tennis-point.fr/index.php?cl=account_order&', u'https://www.tennis-point.fr/index.php?cl=account_user&', u'https://www.tennis-point.fr/index.php?cl=newsletter&', u'https://www.tennis-point.fr/aide-services/', u'#', u'https://www.tennis-point.fr/index.php?cl=account_wishlist&', u'https://www.tennis-point.fr/index.php?cl=basket&', u'#', u'https://www.tennis-point.fr/index.php?cl=basket&', u'#', u'https://www.tennis-point.fr/raquettes-de-tennis/', u'/raquettes-de-tennis/enfants/', u'/raquettes-de-tennis/unisex/', u'https://www.tennis-point.fr/raquettes-de-tennis/', u'https://www.tennis-point.fr/raquettes-de-tennis-raquettes-de-competition/', u'https://www.tennis-point.fr/raquettes-de-tennis-raquettes-polyvalentes/', u'https://www.tennis-point.fr/raquettes-de-tennis-raquettes-confort/', u'https://www.tennis-point.fr/raquettes-de-tennis-raquettes-enfants/', u'https://www.tennis-point.fr/raquettes-de-tennis-raquettes-d-occasion/', u'https://www.tennis-point.fr/raquettes-de-tennis-lots-de-raquettes/', u'https://www.tennis-point.fr/raquettes-de-tennis-accessoires-raquettes/', u'/raquettes-de-tennis/babolat/', u'/raquettes-de-tennis/dunlop/', u'/raquettes-de-tennis/head/', u'/raquettes-de-tennis/kirschbaum/', u'/raquettes-de-tennis/pacific/', u'/raquettes-de-tennis/prince/', u'/raquettes-de-tennis/prokennex/', u'/raquettes-de-tennis/tecnifibre/', u'/raquettes-de-tennis/tennis-point/', u'/raquettes-de-tennis/topspin/', u'/raquettes-de-tennis/tourna/', u'/raquettes-de-tennis/voelkl/', u'/raquettes-de-tennis/wilson/', 
u'/raquettes-de-tennis/yonex/', u'/marques/raquettes-de-tennis/', u'https://www.tennis-point.fr/vetements-de-tennis/', u'/vetements-de-tennis/enfants/', u'/vetements-de-tennis/femmes/', u'/vetements-de-tennis/filles/', u'/vetements-de-tennis/garcons/', u'/vetements-de-tennis/hommes/', u'/vetements-de-tennis/unisex/', u'https://www.tennis-point.fr/vetements-de-tennis/', u'https://www.tennis-point.fr/vetements-de-tennis-robes/', u'https://www.tennis-point.fr/vetements-de-tennis-shirts-tops/', u'https://www.tennis-point.fr/vetements-de-tennis-jupes/', u'https://www.tennis-point.fr/vetements-de-tennis-vestes/', u'https://www.tennis-point.fr/vetements-de-tennis-sweats-hoodies/', u'https://www.tennis-point.fr/vetements-de-tennis-shorts/', u'https://www.tennis-point.fr/vetements-de-tennis-pantalons/', u'https://www.tennis-point.fr/vetements-de-tennis-survetements/', u'https://www.tennis-point.fr/vetements-de-tennis-compression/', u'https://www.tennis-point.fr/vetements-de-tennis-chaussettes/', u'https://www.tennis-point.fr/vetements-de-tennis-sous-vetements/', u'https://www.tennis-point.fr/vetements-de-tennis-accessoires/', u'/vetements-de-tennis/adidas/', u'/vetements-de-tennis/asics/', u'/vetements-de-tennis/babolat/', u'/vetements-de-tennis/bidi-badu/', u'/vetements-de-tennis/bidi-badu-by-kilian-kerner/', u'/vetements-de-tennis/bjoern-borg/', u'/vetements-de-tennis/dunlop/', u'/vetements-de-tennis/erima/', u'/vetements-de-tennis/fila/', u'/vetements-de-tennis/head/', u'/vetements-de-tennis/hydrogen/', u'/vetements-de-tennis/lacoste/', u'/vetements-de-tennis/limited-sports/', u'/vetements-de-tennis/lotto/', u'/vetements-de-tennis/nike/', u'/vetements-de-tennis/puma/', u'/vetements-de-tennis/reebok/', u'/vetements-de-tennis/sergio-tacchini/', u'/vetements-de-tennis/tennis-point/', u'/vetements-de-tennis/tonic/', u'/vetements-de-tennis/under-armour/', u'/vetements-de-tennis/wilson/', u'/vetements-de-tennis/yonex/', u'/marques/vetements-de-tennis/', 
u'https://www.tennis-point.fr/chaussures-de-tennis/', u'/chaussures-de-tennis/enfants/', u'/chaussures-de-tennis/femmes/', u'/chaussures-de-tennis/hommes/', u'/chaussures-de-tennis/unisex/', u'https://www.tennis-point.fr/chaussures-de-tennis/', u'https://www.tennis-point.fr/chaussures-de-tennis-toutes-surfaces/', u'https://www.tennis-point.fr/chaussures-de-tennis-terre-battue/', u'https://www.tennis-point.fr/chaussures-de-tennis-moquette/', u'https://www.tennis-point.fr/chaussures-de-tennis-loisir/', u'https://www.tennis-point.fr/chaussures-de-tennis-accessoires-chaussures/', u'/chaussures-de-tennis/adidas/', u'/chaussures-de-tennis/asics/', u'/chaussures-de-tennis/babolat/', u'/chaussures-de-tennis/erdal/', u'/chaussures-de-tennis/head/', u'/chaussures-de-tennis/ivybands/', u'/chaussures-de-tennis/k-swiss/', u'/chaussures-de-tennis/lotto/', u'/chaussures-de-tennis/mizuno/', u'/chaussures-de-tennis/new-balance/', u'/chaussures-de-tennis/nike/', u'/chaussures-de-tennis/prince/', u'/chaussures-de-tennis/pro-touch/', u'/chaussures-de-tennis/salomon/', u'/chaussures-de-tennis/under-armour/', u'/chaussures-de-tennis/wilson/', u'/chaussures-de-tennis/yonex/', u'/marques/chaussures-de-tennis/', u'https://www.tennis-point.fr/sacs-de-tennis/', u'/sacs-de-tennis/enfants/', u'/sacs-de-tennis/femmes/', u'/sacs-de-tennis/hommes/', u'/sacs-de-tennis/unisex/', u'https://www.tennis-point.fr/sacs-de-tennis/', u'https://www.tennis-point.fr/sacs-de-tennis-sacs-a-raquettes/', u'https://www.tennis-point.fr/sacs-de-tennis-sacs-a-dos/', u'https://www.tennis-point.fr/sacs-de-tennis-sacs-de-sport/', u'https://www.tennis-point.fr/sacs-de-tennis-autres-sacs/', u'/sacs-de-tennis/adidas/', u'/sacs-de-tennis/asics/', u'/sacs-de-tennis/babolat/', u'/sacs-de-tennis/bidi-badu/', u'/sacs-de-tennis/dunlop/', u'/sacs-de-tennis/head/', u'/sacs-de-tennis/lacoste/', u'/sacs-de-tennis/nike/', u'/sacs-de-tennis/prince/', u'/sacs-de-tennis/tecnifibre/', u'/sacs-de-tennis/tennis-point/', 
u'/sacs-de-tennis/topspin/', u'/sacs-de-tennis/under-armour/', u'/sacs-de-tennis/wilson/', u'/sacs-de-tennis/yonex/', u'/marques/sacs-de-tennis/', u'https://www.tennis-point.fr/balles-de-tennis/', u'https://www.tennis-point.fr/balles-de-tennis/', u'https://www.tennis-point.fr/balles-de-tennis-balles-de-competition/', u'https://www.tennis-point.fr/balles-de-tennis-balles-d-entrainement/', u'https://www.tennis-point.fr/balles-de-tennis-balles-geantes/', u'https://www.tennis-point.fr/balles-de-tennis-balles-intermediaires/', u'https://www.tennis-point.fr/balles-de-tennis-balles-lots-de-balles/', u'https://www.tennis-point.fr/balles-de-tennis-balles-officielles-itf/', u'https://www.tennis-point.fr/balles-de-tennis-balles-sans-pression/', u'https://www.tennis-point.fr/balles-de-tennis-lots-de-balles/', u'/balles-de-tennis/babolat/', u'/balles-de-tennis/balls-unlimited/', u'/balles-de-tennis/dunlop/', u'/balles-de-tennis/head/', u'/balles-de-tennis/tennis-point/', u'/balles-de-tennis/tretorn/', u'/balles-de-tennis/wilson/', u'/marques/balles-de-tennis/', u'https://www.tennis-point.fr/cordages-de-tennis/', u'https://www.tennis-point.fr/cordages-de-tennis/', u'https://www.tennis-point.fr/cordages-de-tennis-bobines-cordage/', u'https://www.tennis-point.fr/cordages-de-tennis-cordages-en-set/', u'/cordages-de-tennis/babolat/', u'/cordages-de-tennis/dunlop/', u'/cordages-de-tennis/gamma/', u'/cordages-de-tennis/head/', u'/cordages-de-tennis/isospeed/', u'/cordages-de-tennis/kirschbaum/', u'/cordages-de-tennis/luxilon/', u'/cordages-de-tennis/msv/', u'/cordages-de-tennis/pacific/', u'/cordages-de-tennis/polyfibre/', u'/cordages-de-tennis/prince/', u'/cordages-de-tennis/signum-pro/', u'/cordages-de-tennis/solinco/', u'/cordages-de-tennis/tecnifibre/', u'/cordages-de-tennis/tennis-point/', u'/cordages-de-tennis/topspin/', u'/cordages-de-tennis/tourna/', u'/cordages-de-tennis/weiss-cannon/', u'/cordages-de-tennis/wilson/', u'/cordages-de-tennis/yonex/', 
u'https://www.tennis-point.fr/autres/', u'/autres/femmes/', u'/autres/unisex/', u'https://www.tennis-point.fr/autres/', u'https://www.tennis-point.fr/autres-grips-de-tennis/', u'https://www.tennis-point.fr/autres-padel/', u'https://www.tennis-point.fr/autres-squash/', u'https://www.tennis-point.fr/autres-badminton/', u'https://www.tennis-point.fr/autres-accessoires-pour-entraineurs/', u'https://www.tennis-point.fr/autres-equipement-court-de-tennis/', u'https://www.tennis-point.fr/autres-accessoires/', u'https://www.tennis-point.fr/autres-bons-d-achat/', u'/autres/adidas/', u'/autres/babolat/', u'/autres/bidi-badu/', u'/autres/dunlop/', u'/autres/gamma/', u'/autres/head/', u'/autres/nike/', u'/autres/pacific/', u'/autres/prince/', u'/autres/rehband/', u'/autres/schildkroet-fitness/', u'/autres/sergio-tacchini/', u'/autres/tecnifibre/', u'/autres/tegra/', u'/autres/tomtom/', u'/autres/toolz/', u'/autres/topspin/', u'/autres/tourna/', u'/autres/tretorn/', u'/autres/universal-sport/', u'/autres/wilson/', u'/marques/autres/', u'https://www.tennis-point.fr/marques/', u'https://www.tennis-point.fr/nike/', u'https://www.tennis-point.fr/adidas/', u'https://www.tennis-point.fr/wilson/', u'https://www.tennis-point.fr/head/', u'https://www.tennis-point.fr/babolat/', u'https://www.tennis-point.fr/asics/', u'https://www.tennis-point.fr/bidi-badu/', u'https://www.tennis-point.fr/k-swiss/', u'https://www.tennis-point.fr/babolat/', u'https://www.tennis-point.fr/head/', u'https://www.tennis-point.fr/toolz/', u'https://www.tennis-point.fr/wilson/', u'https://www.tennis-point.fr/2xu/', u'https://www.tennis-point.fr/70love/', u'https://www.tennis-point.fr/adidas/', u'https://www.tennis-point.fr/asics/', u'https://www.tennis-point.fr/atp/', u'https://www.tennis-point.fr/balls-unlimited/', u'https://www.tennis-point.fr/bidi-badu/', u'https://www.tennis-point.fr/bidi-badu-by-kilian-kerner/', u'https://www.tennis-point.fr/bjoern-borg/', u'https://www.tennis-point.fr/boot-doc/', 
u'https://www.tennis-point.fr/cep/', u'https://www.tennis-point.fr/currex/', u'https://www.tennis-point.fr/diadora/', u'https://www.tennis-point.fr/dunlop/', u'https://www.tennis-point.fr/enebe/', u'https://www.tennis-point.fr/energetics/', u'https://www.tennis-point.fr/erdal/', u'https://www.tennis-point.fr/erima/', u'https://www.tennis-point.fr/falke/', u'https://www.tennis-point.fr/fila/', u'https://www.tennis-point.fr/fitbit/', u'https://www.tennis-point.fr/gamma/', u'https://www.tennis-point.fr/garmin/', u'https://www.tennis-point.fr/hydrogen/', u'https://www.tennis-point.fr/isospeed/', u'https://www.tennis-point.fr/ivybands/', u'https://www.tennis-point.fr/k-swiss/', u'https://www.tennis-point.fr/kirschbaum/', u'https://www.tennis-point.fr/lacoste/', u'https://www.tennis-point.fr/limited-sports/', u'https://www.tennis-point.fr/lobster/', u'https://www.tennis-point.fr/lotto/', u'https://www.tennis-point.fr/luxilon/', u'https://www.tennis-point.fr/mikros/', u'https://www.tennis-point.fr/mizuno/', u'https://www.tennis-point.fr/msv/', u'https://www.tennis-point.fr/nasara/', u'https://www.tennis-point.fr/new-balance/', u'https://www.tennis-point.fr/nike/', u'https://www.tennis-point.fr/pacific/', u'https://www.tennis-point.fr/polyfibre/', u'https://www.tennis-point.fr/prince/', u'https://www.tennis-point.fr/pro-touch/', u'https://www.tennis-point.fr/prokennex/', u'https://www.tennis-point.fr/puma/', u'https://www.tennis-point.fr/reebok/', u'https://www.tennis-point.fr/rehband/', u'https://www.tennis-point.fr/salomon/', u'https://www.tennis-point.fr/schildkroet-fitness/', u'https://www.tennis-point.fr/sergio-tacchini/', u'https://www.tennis-point.fr/signum-pro/', u'https://www.tennis-point.fr/solinco/', u'https://www.tennis-point.fr/sports-tutor/', u'https://www.tennis-point.fr/sportsmed/', u'https://www.tennis-point.fr/syneo/', u'https://www.tennis-point.fr/talbot/', u'https://www.tennis-point.fr/tecnifibre/', u'https://www.tennis-point.fr/tegra/', 
u'https://www.tennis-point.fr/tennis-point/', u'https://www.tennis-point.fr/tomtom/', u'https://www.tennis-point.fr/tonic/', u'https://www.tennis-point.fr/topspin/', u'https://www.tennis-point.fr/tourna/', u'https://www.tennis-point.fr/tretorn/', u'https://www.tennis-point.fr/tri-tennis/', u'https://www.tennis-point.fr/under-armour/', u'https://www.tennis-point.fr/universal-sport/', u'https://www.tennis-point.fr/voelkl/', u'https://www.tennis-point.fr/weiss-cannon/', u'https://www.tennis-point.fr/x-bionic/', u'https://www.tennis-point.fr/x-socks/', u'https://www.tennis-point.fr/yonex/', u'https://www.tennis-point.fr/professionnels/', u'https://www.tennis-point.fr/roger-federer/', u'https://www.tennis-point.fr/serena-williams/', u'https://www.tennis-point.fr/rafael-nadal/', u'https://www.tennis-point.fr/victoria-azarenka/', u'https://www.tennis-point.fr/andy-murray/', u'https://www.tennis-point.fr/maria-sharapova/', u'https://www.tennis-point.fr/novak-djokovic/', u'https://www.tennis-point.fr/angelique-kerber/', u'https://www.tennis-point.fr/agnieszka-radwanska/', u'https://www.tennis-point.fr/alexander-zverev/', u'https://www.tennis-point.fr/alize-cornet/', u'https://www.tennis-point.fr/andrea-petkovic/', u'https://www.tennis-point.fr/andy-murray/', u'https://www.tennis-point.fr/angelique-kerber/', u'https://www.tennis-point.fr/', u'https://www.tennis-point.fr/benoit-paire/', u'https://www.tennis-point.fr/bernard-tomic/', u'https://www.tennis-point.fr/borna-coric/', u'https://www.tennis-point.fr/carina-witthoeft/', u'https://www.tennis-point.fr/caroline-garcia/', u'https://www.tennis-point.fr/caroline-wozniacki/', u'https://www.tennis-point.fr/coco-vandeweghe/', u'https://www.tennis-point.fr/david-ferrer/', u'https://www.tennis-point.fr/david-goffin/', u'https://www.tennis-point.fr/dominic-thiem/', u'https://www.tennis-point.fr/dominika-cibulkova/', u'https://www.tennis-point.fr/', u'https://www.tennis-point.fr/elena-vesnina/', u'https://www.tennis-point.fr/', 
u'https://www.tennis-point.fr/eugenie-bouchard/', u'https://www.tennis-point.fr/fabio-fognini/', u'https://www.tennis-point.fr/feliciano-lopez/', u'https://www.tennis-point.fr/fernando-verdasco/', u'https://www.tennis-point.fr/florian-mayer/', u'https://www.tennis-point.fr/frances-tiafoe/', u'https://www.tennis-point.fr/gael-monfils/', u'https://www.tennis-point.fr/garbine-muguruza/', u'https://www.tennis-point.fr/gilles-muller/', u'https://www.tennis-point.fr/gilles-simon/', u'https://www.tennis-point.fr/grigor-dimitrov/', u'https://www.tennis-point.fr/ivo-karlovic/', u'https://www.tennis-point.fr/jack-sock/', u'https://www.tennis-point.fr/jan-lennard-struff/', u'https://www.tennis-point.fr/jelena-ostapenko/', u'https://www.tennis-point.fr/jo-wilfried-tsonga/', u'https://www.tennis-point.fr/johanna-konta/', u'https://www.tennis-point.fr/juan-martin-del-potro/', u'https://www.tennis-point.fr/julia-goerges/', u'https://www.tennis-point.fr/karolina-pliskova/', u'https://www.tennis-point.fr/kei-nishikori/', u'https://www.tennis-point.fr/kevin-anderson/', u'https://www.tennis-point.fr/kiki-bertens/', u'https://www.tennis-point.fr/kristina-mladenovic/', u'https://www.tennis-point.fr/', u'https://www.tennis-point.fr/lucas-pouille/', u'https://www.tennis-point.fr/lucie-afarova/', u'https://www.tennis-point.fr/', u'https://www.tennis-point.fr/magdalena-rybarikova/', u'https://www.tennis-point.fr/marcos-baghdatis/', u'https://www.tennis-point.fr/maria-sharapova/', u'https://www.tennis-point.fr/marin-cilic/', u'https://www.tennis-point.fr/martin-klizan/', u'https://www.tennis-point.fr/milos-raonic/', u'https://www.tennis-point.fr/mischa-zverev/', u'https://www.tennis-point.fr/mona-barthel/', u'https://www.tennis-point.fr/', u'https://www.tennis-point.fr/nick-kyrgios/', u'https://www.tennis-point.fr/novak-djokovic/', u'https://www.tennis-point.fr/pablo-cuevas/', u'https://www.tennis-point.fr/petra-kvitova/', u'https://www.tennis-point.fr/philipp-kohlschreiber/', 
u'https://www.tennis-point.fr/rafael-nadal/', u'https://www.tennis-point.fr/richard-gasquet/', u'https://www.tennis-point.fr/roberto-bautista-agut/', u'https://www.tennis-point.fr/roger-federer/', u'https://www.tennis-point.fr/samantha-stosur/', u'https://www.tennis-point.fr/serena-williams/', u'https://www.tennis-point.fr/simona-halep/', u'https://www.tennis-point.fr/sloane-stephens/', u'https://www.tennis-point.fr/stan-wawrinka/', u'https://www.tennis-point.fr/', u'https://www.tennis-point.fr/tomas-berdych/', u'https://www.tennis-point.fr/victoria-azarenka/', u'https://www.tennis-point.fr/viktor-troicki/', u'https://www.tennis-point.fr/yulia-putintseva/', u'https://www.tennis-point.fr/promos/', u'/promos/enfants/', u'/promos/femmes/', u'/promos/filles/', u'/promos/garcons/', u'/promos/hommes/', u'/promos/unisex/', u'https://www.tennis-point.fr/promos/', u'https://www.tennis-point.fr/promos-raquettes-de-tennis/', u'https://www.tennis-point.fr/promos-vetements-de-tennis/', u'https://www.tennis-point.fr/promos-chaussures-de-tennis/', u'https://www.tennis-point.fr/promos-sacs-de-tennis/', u'https://www.tennis-point.fr/promos-balles-de-tennis/', u'https://www.tennis-point.fr/promos-cordages-de-tennis/', u'https://www.tennis-point.fr/promos-grips/', u'https://www.tennis-point.fr/promos-autres/', u'/promos/2xu/', u'/promos/70love/', u'/promos/adidas/', u'/promos/asics/', u'/promos/atp/', u'/promos/babolat/', u'/promos/bidi-badu/', u'/promos/bidi-badu-by-kilian-kerner/', u'/promos/bjoern-borg/', u'/promos/cep/', u'/promos/diadora/', u'/promos/dunlop/', u'/promos/enebe/', u'/promos/erima/', u'/promos/falke/', u'/promos/fila/', u'/promos/gamma/', u'/promos/head/', u'/promos/hydrogen/', u'/promos/isospeed/', u'/promos/k-swiss/', u'/promos/lacoste/', u'/promos/limited-sports/', u'/marques/promos/', u'javascript:history.back()', u'https://www.tennis-point.fr/', u'/chaussures-de-tennis/', None, u'/39-5/?searchparam=E705Y-0193', u'/40/?searchparam=E705Y-0193', 
u'/40-5/?searchparam=E705Y-0193', u'/41-5/?searchparam=E705Y-0193', u'/42-5/?searchparam=E705Y-0193', u'/42/?searchparam=E705Y-0193', u'/43-5/?searchparam=E705Y-0193', u'/44/?searchparam=E705Y-0193', u'/44-5/?searchparam=E705Y-0193', u'/45/?searchparam=E705Y-0193', u'/46-5/?searchparam=E705Y-0193', u'/46/?searchparam=E705Y-0193', u'/47/?searchparam=E705Y-0193', u'/49/?searchparam=E705Y-0193', u'https://www.tennis-point.fr/?cl=search&searchparam=E705Y-0193&searchparam=E705Y-0193', u'https://www.tennis-point.fr/?cl=search&searchparam=E705Y-0193&searchparam=E705Y-0193&listorderby=oxpos&listorder=desc', u'https://www.tennis-point.fr/?cl=search&searchparam=E705Y-0193&searchparam=E705Y-0193&listorderby=tc_first_stock_date&listorder=desc', u'https://www.tennis-point.fr/?cl=search&searchparam=E705Y-0193&searchparam=E705Y-0193&listorderby=oxprice&listorder=asc', u'https://www.tennis-point.fr/?cl=search&searchparam=E705Y-0193&searchparam=E705Y-0193&listorderby=oxprice&listorder=desc', u'https://www.tennis-point.fr/?cl=search&searchparam=E705Y-0193&searchparam=E705Y-0193&listorderby=_value_uvp&listorder=desc', u'https://www.tennis-point.fr/?cl=search&searchparam=E705Y-0193&searchparam=E705Y-0193&listorderby=oxrating&listorder=desc', u'https://www.tennis-point.fr/asics-gel-game-6-02013802643000.html', u'https://www.tennis-point.fr/asics-gel-game-6-02013802643000.html', u'https://www.tennis-point.fr/avantage-tennis-point/', u'https://www.tennis-point.fr/frais-d-expedition-et-de-livraison/', u'https://www.tennis-point.fr/garantie-satisfait-ou-rembourse/', u'tel:+33 (0) 3 68 33 16 51', u'mailto:info#tennis-point.fr', u'https://www.tennis-point.fr/contact/', u'https://www.trustedshops.com/bewertung/info_X06E4EA65E7491784124112051AF688AD.html', u'https://www.facebook.com/tennispoint.fr/', u'https://www.instagram.com/tennis_point_official/', u'https://www.tennis-point.fr/mentions-legales/', u'https://www.tennis-point.fr/conditions-generales/', 
u'https://www.tennis-point.fr/retractation-formulaire-de-retractation/', u'https://www.tennis-point.fr/protection-des-donnees/', u'https://www.tennis-point.fr/garantie-satisfait-ou-rembourse/', u'https://www.tennis-point.fr/aide-services/', u'https://www.tennis-point.fr/service-retour/', u'https://www.tennis-point.fr/affiliate/', u'https://www.tennis-point.fr/connexionentraineurs/', u'https://www.tennis-point.fr/modes-de-paiement/', u'https://www.tennis-point.fr/frais-d-expedition-et-de-livraison/', u'https://www.tennis-point.fr/retour-reclamation/', u'https://www.tennis-point.fr/index.php?cl=account&', u'https://www.tennis-point.de', u'https://www.tennis-point.com', u'https://www.tennis-point.fr', u'https://www.tennis-point.it', u'https://www.tennis-point.es', u'https://www.tennis-point.nl', u'https://www.tennis-point.be', u'https://www.tennis-point.cz', u'https://www.tennis-point.sk', u'https://www.tennis-point.ch', u'https://www.tennis-point.at', u'https://www.tennis-point.co.uk', u'https://www.tennis-point.dk', u'https://www.tennis-point.se', u'/nike/', u'/adidas/', u'/wilson/', u'/babolat/', u'/head/', u'/asics/', u'/dunlop/', u'/tennis-point/', u'/under-armour/', u'/k-swiss/', u'/yonex/', u'/prince/', u'/lacoste/', u'/lotto/', u'/tretorn/', u'/limited-sports/', u'/fila/', u'/bjoern-borg/', u'/raquettes-de-tennis/', u'/vetements-de-tennis/', u'/chaussures-de-tennis/', u'/sacs-de-tennis/', u'/balles-de-tennis/', u'/cordages-de-tennis/', u'/marques/', u'/professionnels/', u'/promos/', u'https://www.tennis-point.fr/index.php?cl=forgotpwd&', u'https://www.tennis-point.fr/index.php?cl=register&']
You're going to have to scrape each of your URLs individually. If you want to scrape the same data from each page, a simple loop will suffice.
urls = ['url1', 'url2', 'url3']
for u in urls:
    response = requests.get(u)
    data = response.text
    soup = BeautifulSoup(data, 'lxml')
    #
    # Scraping code
    #
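For the CSV part of the question, one possible sketch is shown here. It assumes one URL per row in the first column; the function names read_urls and product_links, the example.com URLs, and the timeout value are all assumptions made for this illustration. The productsPicture class comes from the question's code.

```python
import csv
import io


def read_urls(csv_file, column=0):
    """Return the non-empty values found in one column of an open CSV file."""
    urls = []
    for row in csv.reader(csv_file):
        # Blank lines produce empty rows, so check the length first.
        if len(row) > column and row[column].strip():
            urls.append(row[column].strip())
    return urls


def product_links(url):
    """Fetch one page and return the hrefs inside div.productsPicture (may be empty)."""
    # Imported here so that read_urls can be used without these packages installed.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.text, "html.parser")
    container = soup.find("div", {"class": "productsPicture"})
    if container is None:  # nothing matched on this page
        return []
    return [a.get("href") for a in container.find_all("a")]


# In-memory stand-in for a real file such as open("urls.csv", newline="").
sample = io.StringIO("https://example.com/a\n\nhttps://example.com/b\n")
urls = read_urls(sample)
print(urls)  # ['https://example.com/a', 'https://example.com/b']
```

With a real file, the loop from the answer would then become `for u in urls: print(u, product_links(u))`.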
I'm trying to scrape synonyms from thesaurus.com using Python and they list the synonyms using an unordered list.
from lxml import html
import requests
term = input("Enter in a term to find the synonyms of: ")
page = requests.get('http://www.thesaurus.com/browse/' + term.lower(), allow_redirects=True)
if page.status_code == 200:
    tree = html.fromstring(page.content)
    synonyms = tree.xpath('//div[@class="relevancy-list"]/text()')
    print(synonyms)
else:
    print("No synonyms found!")
My code outputs just blank spaces instead of the synonyms. How do I scrape the actual synonyms instead of the spaces?
/text() selects only the text nodes that are direct children of the matched element. Your current XPath therefore returns just the whitespace inside the div, because the synonyms sit in tags nested deeper inside it.
Using //text() instead selects every text node below the matched element, but that returns ALL text, including parts you don't need.
For your use case, since the synonyms are inside a <span class="text"> tag, you can use this XPath:
//div[@class="relevancy-list"]//span[@class="text"]/text()
which selects the text of every span with class "text" found inside a div with class "relevancy-list".
For the input term "set", the output using that XPath is:
['firm', 'bent', 'stated', 'specified', 'rooted', 'established', 'confirmed', 'pat', 'immovable', 'obstinate', 'ironclad', 'predetermined', 'intent', 'entrenched', 'appointed', 'regular', 'prescribed', 'determined', 'scheduled', 'fixed', 'settled', 'certain', 'customary', 'decisive', 'definite', 'inveterate', 'pigheaded', 'resolute', 'rigid', 'steadfast', 'stubborn', 'unflappable', 'usual', 'concluded', 'agreed', 'resolved', 'stipulated', 'arranged', 'prearranged', 'dead set on', 'hanging tough', 'locked in', 'set in stone', 'solid as a rock', 'stiff-necked', 'well-set', 'immovable', 'entrenched', 'located', 'solid', 'situate', 'stiff', 'placed', 'stable', 'fixed', 'settled', 'situated', 'rigid', 'strict', 'stubborn', 'unyielding', 'hidebound', 'positioned', 'sited', 'jelled', 'hard and fast', 'deportment', 'comportment', 'fit', 'presence', 'mien', 'hang', 'carriage', 'air', 'turn', 'attitude', 'address', 'demeanor', 'position', 'inclination', 'port', 'posture', 'setting', 'scene', 'scenery', 'flats', 'stage set', u'mise en sc\xe8ne', 'series', 'array', 'lot', 'collection', 'batch', 'crowd', 'cluster', 'gang', 'bunch', 'crew', 'circle', 'body', 'coterie', 'faction', 'company', 'bundle', 'outfit', 'band', 'clique', 'mob', 'kit', 'class', 'clan', 'compendium', 'clutch', 'camp', 'sect', 'push', 'organization', 'clump', 'assemblage', 'pack', 'gaggle', 'rat pack', 'locate', 'head', 'prepare', 'fix', 'introduce', 'turn', 'settle', 'lay', 'install', 'put', 'apply', 'post', 'establish', 'wedge', 'point', 'lock', 'affix', 'direct', 'rest', 'seat', 'station', 'plop', 'spread', 'lodge', 'situate', 'plant', 'park', 'bestow', 'train', 'stick', 'plank', 'arrange', 'insert', 'level', 'plunk', 'mount', 'aim', 'cast', 'deposit', 'ensconce', 'fasten', 'embed', 'anchor', 'make fast', 'make ready', 'zero in', 'appoint', 'name', 'schedule', 'make', 'impose', 'stipulate', 'settle', 'determine', 'establish', 'fix', 'specify', 'designate', 'decree', 'resolve', 'rate', 'conclude', 'price', 
'prescribe', 'direct', 'value', 'ordain', 'allocate', 'instruct', 'allot', 'dictate', 'estimate', 'regulate', 'assign', 'arrange', 'lay down', 'agree upon', 'fix price', 'fix', 'stiffen', 'thicken', 'condense', 'jelly', 'clot', 'congeal', 'solidify', 'cake', 'coagulate', 'jell', 'gelatinize', 'crystallize', 'jellify', 'gel', 'become firm', 'gelate', 'drop', 'subside', 'sink', 'vanish', 'dip', 'disappear', 'descend', 'go down', 'initiate', 'begin', 'raise', 'abet', 'provoke', 'instigate', 'commence', 'foment', 'whip up', 'put in motion', 'set on', 'stir up']
Note that you will get the synonyms for all senses of the word.
To get the synonyms per sense, loop over the result of //div[@class="relevancy-list"] and evaluate the relative XPath .//span[@class="text"]/text() on each div found. The leading dot matters: without it, the search starts again from the document root instead of the current div.
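The difference between these XPath expressions can be sketched on a small inline document. The SAMPLE markup below is a hypothetical imitation of the structure described in this answer, not the site's actual HTML.

```python
from lxml import html

# Hypothetical markup: two "senses", each in its own relevancy-list div.
SAMPLE = """<html><body>
<div class="relevancy-list">
  <ul>
    <li><span class="text">firm</span><span class="star">*</span></li>
    <li><span class="text">fixed</span></li>
  </ul>
</div>
<div class="relevancy-list">
  <ul>
    <li><span class="text">group</span></li>
  </ul>
</div>
</body></html>"""

tree = html.fromstring(SAMPLE)

# /text() returns only text directly under each div: here, just whitespace.
direct_text = tree.xpath('//div[@class="relevancy-list"]/text()')
print(direct_text)

# Restricting to span.text yields the synonyms of all senses in one flat list.
all_synonyms = tree.xpath('//div[@class="relevancy-list"]//span[@class="text"]/text()')
print(all_synonyms)  # ['firm', 'fixed', 'group']

# A relative XPath (leading dot) evaluated per div groups the synonyms by sense.
per_sense = [
    div.xpath('.//span[@class="text"]/text()')
    for div in tree.xpath('//div[@class="relevancy-list"]')
]
print(per_sense)  # [['firm', 'fixed'], ['group']]
```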
import requests
from bs4 import BeautifulSoup
term = input("Enter in a term to find the synonyms of: ")
page = requests.get('http://www.thesaurus.com/browse/' + term.lower(), allow_redirects=True)
if page.status_code == 200:
    soup = BeautifulSoup(page.content, 'html.parser')
    get_syn_tag = soup.find('div', {'class': 'relevancy-list'})
    list_items = get_syn_tag.findAll('li')
    synonyms = []  # collects every synonym so the list can be reused later
    for i in list_items:
        synonym = i.find('span', {'class': 'text'}).text
        print(synonym)  # prints a single synonym on each iteration
        synonyms.append(synonym)  # appends the synonym to the list
else:
    print("No synonyms found!")
Finding all the li tags first is more precise; however, in this case the single line below also works, provided the div contains no other span with class "text":
synonym_list = [i.text for i in get_syn_tag.findAll('span', {'class':'text'})]
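One caveat with both variants: soup.find() returns None when no matching div exists, and calling findAll on None raises AttributeError. A minimal defensive sketch follows; the function name synonyms_from_html and the sample markup are invented for this illustration.

```python
from bs4 import BeautifulSoup


def synonyms_from_html(page_html):
    """Return the text of every span.text inside the first div.relevancy-list."""
    soup = BeautifulSoup(page_html, "html.parser")
    container = soup.find("div", {"class": "relevancy-list"})
    if container is None:  # the page has no synonym list
        return []
    return [
        span.get_text(strip=True)
        for span in container.find_all("span", {"class": "text"})
    ]


# Hypothetical markup standing in for a downloaded page.
sample = (
    '<div class="relevancy-list"><ul>'
    '<li><span class="text"> firm </span></li>'
    '<li><span class="text">fixed</span></li>'
    '</ul></div>'
)
found = synonyms_from_html(sample)
missing = synonyms_from_html("<p>no results</p>")
print(found)    # ['firm', 'fixed']
print(missing)  # []
```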