I installed the libpam-ldapd package on Debian 8.5, then configured /etc/nslcd.conf as follows:
# /etc/nslcd.conf
# nslcd configuration file. See nslcd.conf(5)
# for details.
# The user and group nslcd should run as.
uid nslcd
gid nslcd
# The location at which the LDAP server(s) should be reachable.
uri ldap://172.17.192.100
# The search base that will be used for all queries.
base DC=myorg,DC=com
# The LDAP protocol version to use.
ldap_version 3
binddn CN=ldapuser,DC=myorg,DC=com
bindpw secret
# The search scope.
#scope sub
filter passwd (objectClass=person)
map passwd uid sAMAccountName
map passwd uidNumber employeeID
map passwd gidNumber objectSid
filter shadow (objectClass=person)
map shadow uid sAMAccountName
The problem is that when I log into the server as user@myorg.com I get the following log (the bind is successful, but the search fails because of the @myorg.com suffix; note it goes through the nslcd_pam_authc() function):
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_initialize(ldap://172.17.192.100)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_set_rebind_proc()
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_set_option(LDAP_OPT_PROTOCOL_VERSION,3)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_set_option(LDAP_OPT_DEREF,0)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_set_option(LDAP_OPT_TIMELIMIT,0)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_set_option(LDAP_OPT_TIMEOUT,0)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_set_option(LDAP_OPT_NETWORK_TIMEOUT,0)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_set_option(LDAP_OPT_REFERRALS,LDAP_OPT_ON)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_set_option(LDAP_OPT_RESTART,LDAP_OPT_ON)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_simple_bind_s("CN=isldap,DC=TI,DC=ads","***") (uri="ldap://172.17.192.100")
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_result(): end of results (0 total)
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: myldap_search(base="DC=myorg,DC=com", filter="(&(objectClass=person)(sAMAccountName=user@myorg.com))")
nslcd: [8e1f29] <passwd="user@myorg.com"> DEBUG: ldap_result(): end of results (0 total)
nslcd: [e87ccd] DEBUG: connection from pid=9046 uid=0 gid=0
nslcd: [e87ccd] <authc="user@myorg.com"> DEBUG: nslcd_pam_authc("user@myorg.com","sshd","***")
nslcd: [e87ccd] <authc="user@myorg.com"> DEBUG: myldap_search(base="DC=myorg,DC=com", filter="(&(objectClass=person)(sAMAccountName=user@myorg.com))")
nslcd: [e87ccd] <authc="user@myorg.com"> DEBUG: ldap_result(): end of results (0 total)
nslcd: [e87ccd] <authc="user@myorg.com"> DEBUG: myldap_search(base="DC=myorg,DC=com", filter="(&(objectClass=person)(sAMAccountName=user@myorg.com))")
nslcd: [e87ccd] <authc="user@myorg.com"> DEBUG: ldap_result(): end of results (0 total)
nslcd: [e87ccd] <authc="user@myorg.com"> DEBUG: "user@myorg.com": user not found: No such object
If I log in using only user, the search succeeds but the authentication does not (it tries to authenticate with the full DN via the ldap_sasl_bind() function):
nslcd: [8b4567] <host=10.0.2.2> DEBUG: ldap_simple_bind_s("CN=ldapuser,DC=myorg,DC=com","***") (uri="ldap://172.17.192.100")
nslcd: [8b4567] <host=10.0.2.2> DEBUG: ldap_result(): end of results (0 total)
nslcd: [8b4567] <host=10.0.2.2> DEBUG: myldap_search(base="OU=Guatemala Support Team,OU=TI_Service_Accounts,DC=TI,DC=ads", filter="(&(objectClass=ipHost)(ipHostNumber=10.0.2.2))")
nslcd: [8b4567] <host=10.0.2.2> DEBUG: ldap_result(): end of results (0 total)
nslcd: [7b23c6] DEBUG: connection from pid=9099 uid=0 gid=0
nslcd: [7b23c6] <passwd="user"> DEBUG: myldap_search(base="DC=myorg,DC=com", filter="(&(objectClass=person)(sAMAccountName=user))")
nslcd: [7b23c6] <passwd="user"> DEBUG: ldap_initialize(ldap://172.17.192.100)
nslcd: [7b23c6] <passwd="user"> DEBUG: ldap_set_rebind_proc()
nslcd: [7b23c6] <passwd="user"> DEBUG: ldap_simple_bind_s("CN=ldapuser,DC=myorg,DC=com","***") (uri="ldap://172.17.192.100")
nslcd: [7b23c6] <passwd="user"> DEBUG: ldap_result(): CN=User John Doe,DC=myorg,DC=com
nslcd: [7b23c6] <passwd="user"> CN=User John Doe,DC=myorg,DC=com: objectSid: missing
nslcd: [7b23c6] <passwd="user"> DEBUG: ldap_result(): end of results (1 total)
nslcd: [7b23c6] <passwd="user"> DEBUG: myldap_search(base="OU=Guatemala Support Team,OU=TI_Service_Accounts,DC=TI,DC=ads", filter="(&(objectClass=person)(sAMAccountName=user))")
nslcd: [7b23c6] <passwd="user"> DEBUG: ldap_result(): end of results (0 total)
nslcd: [3c9869] DEBUG: connection from pid=9099 uid=0 gid=0
nslcd: [3c9869] <passwd="user"> DEBUG: myldap_search(base="DC=myorg,DC=com", filter="(&(objectClass=person)(sAMAccountName=user))")
nslcd: [3c9869] <passwd="user"> DEBUG: ldap_result(): CN=User John Doe,DC=myorg,DC=com
nslcd: [3c9869] <passwd="user"> CN=User John Doe,DC=myorg,DC=com: objectSid: missing
nslcd: [3c9869] <passwd="user"> DEBUG: ldap_result(): end of results (1 total)
nslcd: [3c9869] <passwd="user"> DEBUG: myldap_search(base="OU=Guatemala Support Team,OU=TI_Service_Accounts,DC=TI,DC=ads", filter="(&(objectClass=person)(sAMAccountName=user))")
nslcd: [3c9869] <passwd="user"> DEBUG: ldap_result(): end of results (0 total)
nslcd: [334873] DEBUG: connection from pid=9099 uid=0 gid=0
nslcd: [334873] <passwd="user"> DEBUG: myldap_search(base="DC=myorg,DC=com", filter="(&(objectClass=person)(sAMAccountName=user))")
nslcd: [334873] <passwd="user"> DEBUG: ldap_result(): CN=User John Doe,DC=myorg,DC=com
nslcd: [334873] <passwd="user"> CN=User John Doe,DC=myorg,DC=com: objectSid: missing
nslcd: [334873] <passwd="user"> DEBUG: ldap_result(): end of results (1 total)
nslcd: [334873] <passwd="user"> DEBUG: myldap_search(base="OU=Guatemala Support Team,OU=TI_Service_Accounts,DC=TI,DC=ads", filter="(&(objectClass=person)(sAMAccountName=user))")
nslcd: [334873] <passwd="user"> DEBUG: ldap_result(): end of results (0 total)
nslcd: [b0dc51] DEBUG: connection from pid=9099 uid=0 gid=0
nslcd: [b0dc51] <authc="user"> DEBUG: nslcd_pam_authc("user","sshd","***")
nslcd: [b0dc51] <authc="user"> DEBUG: myldap_search(base="DC=myorg,DC=com", filter="(&(objectClass=person)(sAMAccountName=user))")
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_initialize(ldap://172.17.192.100)
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_set_rebind_proc()
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_simple_bind_s("CN=ldapuser,DC=myorg,DC=com","***") (uri="ldap://172.17.192.100")
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_result(): CN=User John Doe,DC=myorg,DC=com
nslcd: [b0dc51] <authc="user"> DEBUG: myldap_search(base="CN=User John Doe,DC=myorg,DC=com", filter="(objectClass=*)")
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_initialize(ldap://172.17.192.100)
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_set_rebind_proc()
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_sasl_bind("CN=User John Doe,DC=myorg,DC=com","***") (uri="ldap://172.17.192.100")
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_parse_result() result: Invalid credentials: 80090308: LdapErr: DSID-0C0903D0, comment: AcceptSecurityContext error, data 52e, v2580
nslcd: [b0dc51] <authc="user"> DEBUG: failed to bind to LDAP server ldap://172.17.192.100: Invalid credentials: 80090308: LdapErr: DSID-0C0903D0, comment: AcceptSecurityContext error, data 52e, v2580
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_unbind()
nslcd: [b0dc51] <authc="user"> CN=User John Doe,DC=myorg,DC=com: Invalid credentials
nslcd: [b0dc51] <authc="user"> DEBUG: myldap_search(base="DC=myorg,DC=com", filter="(&(objectClass=person)(sAMAccountName=user))")
nslcd: [b0dc51] <authc="user"> DEBUG: ldap_result(): CN=User John Doe,DC=myorg,DC=com
Question: how should I configure nslcd.conf if I want to:
Log in with user
Search for a sAMAccountName equal to user
Thank you in advance and sorry for the long post.
In /etc/nslcd.conf, try changing (objectClass=person) to
(&(objectCategory=person)(objectClass=user))
-jim
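With that change applied, the filter/map section of /etc/nslcd.conf would look something like this (a sketch based on the configuration shown above; the map lines are unchanged):

```
# Match only real user accounts: objectCategory=person alone also
# matches contact objects, and objectClass=user alone also matches
# computer accounts, so the combination restricts to actual users.
filter passwd (&(objectCategory=person)(objectClass=user))
map    passwd uid       sAMAccountName
map    passwd uidNumber employeeID
map    passwd gidNumber objectSid
filter shadow (&(objectCategory=person)(objectClass=user))
map    shadow uid       sAMAccountName
```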
I'm attempting to use scrapy, which is somewhat new to me. I built (what I thought was) a simple spider which does the following:
import scrapy
from scrapy.spiders import CrawlSpider

class SuperSpider(CrawlSpider):
    name = 'KYM_entries'
    start_urls = ['https://knowyourmeme.com/memes/all/page/1']

    def parse(self, response):
        for entry in response.xpath('/html/body/div[3]/div/div[3]/section'):
            yield {
                # The link to a meme entry page on Know Your Meme
                'entry_link': entry.xpath('./td[2]/a/@href').get()
            }
Then I run the following in a terminal window:
$ scrapy crawl KYM_entries -O practice.csv
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 0.1.43ubuntu1 is an invalid version and will not be supported in a future release
warnings.warn(
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 1.1build1 is an invalid version and will not be supported in a future release
warnings.warn(
2022-12-26 20:08:04 [scrapy.utils.log] INFO: Scrapy 2.7.1 started (bot: KYM_spider)
2022-12-26 20:08:04 [scrapy.utils.log] INFO: Versions: lxml 4.8.0.0, libxml2 2.9.13, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.1.0, Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0], pyOpenSSL 21.0.0 (OpenSSL 3.0.2 15 Mar 2022), cryptography 3.4.8, Platform Linux-5.15.0-56-generic-x86_64-with-glibc2.35
2022-12-26 20:08:04 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'KYM_spider',
'NEWSPIDER_MODULE': 'KYM_spider.spiders',
'ROBOTSTXT_OBEY': True,
'SPIDER_MODULES': ['KYM_spider.spiders']}
2022-12-26 20:08:04 [py.warnings] WARNING: /usr/local/lib/python3.10/dist-packages/scrapy/utils/request.py:231: ScrapyDeprecationWarning: '2.6' is a deprecated value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting.
It is also the default value. In other words, it is normal to get this warning if you have not defined a value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting. This is so for backward compatibility reasons, but it will change in a future version of Scrapy.
See the documentation of the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting for information on how to handle this deprecation.
return cls(crawler)
2022-12-26 20:08:04 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2022-12-26 20:08:04 [scrapy.extensions.telnet] INFO: Telnet Password: 97ac3d17f1e4cea1
2022-12-26 20:08:04 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.feedexport.FeedExporter',
'scrapy.extensions.logstats.LogStats']
2022-12-26 20:08:04 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-12-26 20:08:04 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-12-26 20:08:04 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-12-26 20:08:04 [scrapy.core.engine] INFO: Spider opened
2022-12-26 20:08:04 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-12-26 20:08:04 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-12-26 20:08:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://knowyourmeme.com/robots.txt> (referer: None)
2022-12-26 20:08:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://knowyourmeme.com/memes/all/page/1> (referer: None)
2022-12-26 20:08:05 [scrapy.core.engine] INFO: Closing spider (finished)
2022-12-26 20:08:05 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 466,
'downloader/request_count': 2,
'downloader/request_method_count/GET': 2,
'downloader/response_bytes': 11690,
'downloader/response_count': 2,
'downloader/response_status_count/200': 2,
'elapsed_time_seconds': 0.953839,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2022, 12, 27, 1, 8, 5, 833510),
'httpcompression/response_bytes': 45804,
'httpcompression/response_count': 2,
'log_count/DEBUG': 3,
'log_count/INFO': 10,
'log_count/WARNING': 1,
'memusage/max': 65228800,
'memusage/startup': 65228800,
'response_received_count': 2,
'robotstxt/request_count': 1,
'robotstxt/response_count': 1,
'robotstxt/response_status_count/200': 1,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2022, 12, 27, 1, 8, 4, 879671)}
2022-12-26 20:08:05 [scrapy.core.engine] INFO: Spider closed (finished)
This returns an empty CSV, which I suppose means either something is wrong with the xpath, or there is something wrong with the connection to Know Your Meme. However, beyond the 200 code saying it is connecting to the site, I'm unsure how to troubleshoot what is happening here.
So I have a couple of questions: one directly about my issue, and one more broadly about this output:
Is there a way to see at what point my script is failing to retrieve the specified data in the xpath for this particular case?
Is there a simple guide or reference for how to read scrapy output?
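To see where the selector goes wrong, Scrapy's interactive shell is the usual tool: it fetches the page and lets you try expressions against the actual response (a session sketched below with the spider's start URL; outputs omitted):

```
$ scrapy shell 'https://knowyourmeme.com/memes/all/page/1'
>>> response.status                                         # confirm the fetch worked
>>> view(response)                                          # open what Scrapy actually downloaded in a browser
>>> response.xpath('/html/body/div[3]/div/div[3]/section')  # test the absolute path
>>> response.css('.entry-grid-body .photo::attr(href)').get()  # try a class-based selector
```

If the XPath returns an empty list here, the selector rather than the connection is the problem: an empty CSV together with a 200 response almost always means the expression matched nothing.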
I have looked into your code. There were a few issues with the selectors: I have updated the CSS selector and removed the XPath. The meme URLs are relative, so I have added the urljoin method to make them absolute. I have also added a start_requests method because my version of Scrapy is 2.6.0; if you are using a lower version of Scrapy (e.g. 1.6.0) you can remove this method.
import scrapy
from scrapy import Request
from scrapy.spiders import CrawlSpider

class SuperSpider(CrawlSpider):
    name = 'KYM_entries'
    start_urls = ['https://knowyourmeme.com/memes/all/page/1']

    def start_requests(self):
        yield Request(self.start_urls[0], callback=self.parse)

    def parse(self, response):
        for entry in response.css('.entry-grid-body .photo'):
            yield {
                # The link to a meme entry page on Know Your Meme
                'entry_link': response.urljoin(entry.css('::attr(href)').get())
            }
The code is working fine now. Below is the output.
2022-12-27 13:14:52 [scrapy.core.engine] INFO: Spider opened
2022-12-27 13:14:52 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-12-27 13:14:52 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-12-27 13:14:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://knowyourmeme.com/memes/all/page/1> (referer: None)
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/mayinquangcao'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/this-is-x-bitch-we-clown-in-this-muthafucka-betta-take-yo-sensitive-ass-back-to-y'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/subcultures/choo-choo-charles'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/subcultures/bug-fables-the-everlasting-sapling'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/onii-holding-a-picture'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/vintage-recipe-videos'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/ytpmv-elf'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/i-just-hit-a-dog-going-70mph-on-my-truck'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/women-dodging-accountability'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/grinchs-ultimatum'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/where-is-idos-black-and-white'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/basilisk-time'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/subcultures/rankinbass-productions'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/subcultures/error143'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/whatsapp-university'}
2022-12-27 13:14:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://knowyourmeme.com/memes/all/page/1>
{'entry_link': 'https://knowyourmeme.com/memes/messi-autism-speculation-messi-is-autistic'}
I'm trying to scrape data from this website: https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM
I have made the following script for the initial data:
import scrapy

class WaiascrapSpider(scrapy.Spider):
    name = 'waiascrap'
    allowed_domains = ['clsaa-dc.org']
    start_urls = ['https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM']

    def parse(self, response):
        rows = response.xpath("//tr")
        for row in rows:
            day = rows.xpath("(//tr/td[@class='time']/span)[1]/text()").get()
            time = rows.xpath("//tr/td[@class='time']/span/time/text()").get()
            yield {
                'day': day,
                'time': time,
            }
However, the data I'm getting is repeated, as if I weren't actually iterating through the for loop:
PS C:\Users\gasgu\PycharmProjects\ScrapingProject\projects\waia> scrapy crawl waiascrap
2021-08-20 15:25:11 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: waia)
2021-08-20 15:25:11 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.5, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:26:21) [MSC v.1929 64 bit (AMD64)], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k 25 Mar 2021), cryptography 3.4.7, Platform Windows-10-10.0.19042-SP0
2021-08-20 15:25:11 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2021-08-20 15:25:11 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'waia', 'NEWSPIDER_MODULE': 'waia.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['waia.spiders']}
2021-08-20 15:25:11 [scrapy.extensions.telnet] INFO: Telnet Password: 9299b6be5840b21c
2021-08-20 15:25:11 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2021-08-20 15:25:11 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2021-08-20 15:25:11 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2021-08-20 15:25:11 [scrapy.middleware] INFO: Enabled item pipelines: []
2021-08-20 15:25:11 [scrapy.core.engine] INFO: Spider opened
2021-08-20 15:25:11 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-08-20 15:25:11 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2021-08-20 15:25:12 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://aa-dc.org/robots.txt> (referer: None)
2021-08-20 15:25:13 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> (referer: None)
2021-08-20 15:25:16 [scrapy.core.scraper] DEBUG: Scraped from <200 https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> {'day': 'Sunday', 'time': '6:45 am'}
2021-08-20 15:25:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> {'day': 'Sunday', 'time': '6:45 am'}
2021-08-20 15:25:22 [scrapy.core.scraper] DEBUG: Scraped from <200 https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> {'day': 'Sunday', 'time': '6:45 am'}
2021-08-20 15:25:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> {'day': 'Sunday', 'time': '6:45 am'}
2021-08-20 15:25:29 [scrapy.core.scraper] DEBUG: Scraped from <200 https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> {'day': 'Sunday', 'time': '6:45 am'}
2021-08-20 15:25:32 [scrapy.core.scraper] DEBUG: Scraped from <200 https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> {'day': 'Sunday', 'time': '6:45 am'}
2021-08-20 15:25:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> {'day': 'Sunday', 'time': '6:45 am'}
2021-08-20 15:25:39 [scrapy.core.scraper] DEBUG: Scraped from <200 https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM> {'day': 'Sunday', 'time': '6:45 am'}
EDIT:
It's working now; there was a combination of the errors pointed out by @Prophet and a problem with my XPath.
My working code is below:
import scrapy

class WaiascrapSpider(scrapy.Spider):
    name = 'waiascrap'
    allowed_domains = ['clsaa-dc.org']
    start_urls = ['https://aa-dc.org/meetings?tsml-day=any&tsml-type=IPM']

    def parse(self, response):
        rows = response.xpath("//tr")
        for row in rows:
            day = row.xpath(".//td[@class='time']/span/text()").get()
            time = row.xpath(".//td[@class='time']/span/time/text()").get()
            yield {
                'day': day,
                'time': time,
            }
To select element inside element you have to put a dot . in front of the XPath expression saying "from here".
Otherwise it will bring you the first match of (//tr/td[@class='time']/span)[1]/text() on the entire page each time, as you see.
Also, since you are iterating per each row it should be row.xpath..., not rows.xpath since rows is a list of elements while each row is an element.
Also, note that find_element_by_xpath is Selenium's API, not Scrapy's; with Scrapy selectors you keep calling .xpath() on each row, just with relative expressions:

def parse(self, response):
    rows = response.xpath("//tr")
    for row in rows:
        day = row.xpath(".//td[@class='time']/span/text()").get()
        time = row.xpath(".//td[@class='time']/span/time/text()").get()
        yield {
            'day': day,
            'time': time,
        }
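The effect of that leading dot can be sketched with the standard library's ElementTree, whose limited XPath subset behaves the same way for this case as Scrapy's lxml-backed selectors:

```python
# Minimal stdlib sketch of context-relative selection: a ".//" path
# searches below the element it is called on, not the whole document.
import xml.etree.ElementTree as ET

html = """
<table>
  <tr><td class="time"><span>Sunday</span></td></tr>
  <tr><td class="time"><span>Monday</span></td></tr>
</table>
"""

root = ET.fromstring(html)
days = []
for row in root.findall(".//tr"):
    # Each iteration yields this row's own <span>,
    # not the first <span> on the page.
    span = row.find(".//td[@class='time']/span")
    days.append(span.text)

print(days)  # ['Sunday', 'Monday']
```

The same idea is why the corrected spider above extracts a different day and time for each row instead of repeating the first match.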
I'm creating a web scraper and want to use a callback to scrape sub-pages, but it doesn't seem to work correctly and returns no results. Can anyone help?
Here is my code
import random
import time

import scrapy

# url, base_url, classification, keyword and city are defined elsewhere in the project

class YellSpider(scrapy.Spider):
    name = "yell"
    start_urls = [url, ]

    def parse(self, response):
        pageNum = 0
        pages = len(response.xpath(".//div[@class='col-sm-14 col-md-16 col-lg-14 text-center']/*"))
        for page in range(pages):
            pageNum = page + 1
            for x in range(5):
                num = random.randint(5, 8)
                time.sleep(num)
            for item in response.xpath(".//a[@href][contains(@href,'/#view=map')][contains(@href,'/biz/')]"):
                subcategory = base_url + item.xpath("./@href").extract_first().replace("/#view=map", "")
                sub_req = scrapy.Request(subcategory, callback=self.parse_details)
                yield sub_req
            next_page = base_url + "ucs/UcsSearchAction.do?&selectedClassification=" + classification + "&keywords=" + keyword + "&location=" + city + "&pageNum=" + str(pageNum + 1)
            if next_page:
                yield scrapy.Request(next_page, self.parse)

    def parse_details(self, sub_req):
        for x in range(5):
            num = random.randint(1, 5)
            # time.sleep(num)
        name = sub_req.xpath(".//h1[@class='text-h1 businessCard--businessName']/text()").extract_first()
        address = " ".join(
            sub_req.xpath(".//span[@class='address'][@itemprop='address']/child::node()/text()").extract())
        telephone = sub_req.xpath(".//span[@class='business--telephoneNumber']/text()").extract_first()
        web = sub_req.xpath(
            ".//div[@class='row flexColumns-sm-order-8 floatedColumns-md-right floatedColumns-lg-right floatedColumns-md-19 floatedColumns-lg-19']//a[@itemprop='url']/@href").extract_first()
        hours = ""
        overview = ""
        yield {
            'Name': name,
            'Address': address,
            'Telephone': telephone,
            'Web Site': web
        }
I want the callback to go from the response to parse_details, and I expect to loop over all the ads in sub_req and scrape the data from each one.
Here is the logfile:
2019-06-17 15:50:33 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: Web_Scraper)
2019-06-17 15:50:33 [scrapy.utils.log] INFO: Versions: lxml 4.3.4.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.1, Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 19.0.0 (OpenSSL 1.1.1c 28 May 2019), cryptography 2.7, Platform Windows-10-10.0.16299-SP0
2019-06-17 15:50:33 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'Web_Scraper', 'LOG_FILE': 'output.log', 'NEWSPIDER_MODULE': 'Web_Scraper.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['Web_Scraper.spiders']}
2019-06-17 15:50:33 [scrapy.extensions.telnet] INFO: Telnet Password: 91c423015a2cd984
2019-06-17 15:50:34 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2019-06-17 15:50:34 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2019-06-17 15:50:34 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2019-06-17 15:50:34 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2019-06-17 15:50:34 [scrapy.core.engine] INFO: Spider opened
2019-06-17 15:50:34 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2019-06-17 15:50:34 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2019-06-17 15:50:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/robots.txt> (referer: None)
2019-06-17 15:50:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1> (referer: None)
2019-06-17 15:50:49 [scrapy.dupefilters] DEBUG: Filtered duplicate request: <GET https://www.yell.com//biz/pennine-tuition-services-liverpool-9465687> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/pennine-tuition-services-liverpool-9465687/> from <GET https://www.yell.com//biz/pennine-tuition-services-liverpool-9465687>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/home-maths-tutoring-liverpool-7467622/> from <GET https://www.yell.com//biz/home-maths-tutoring-liverpool-7467622>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/kumon-maths-and-english-wallasey-8945913/> from <GET https://www.yell.com//biz/kumon-maths-and-english-wallasey-8945913>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/maths-tution-wallasey-7574939/> from <GET https://www.yell.com//biz/maths-tution-wallasey-7574939>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/patrick-haslam-tutoring-liverpool-8777349/> from <GET https://www.yell.com//biz/patrick-haslam-tutoring-liverpool-8777349>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/maths-tution-wallasey-8327361/> from <GET https://www.yell.com//biz/maths-tution-wallasey-8327361>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/dr-john-ankers-science-and-maths-tutor-liverpool-8467525/> from <GET https://www.yell.com//biz/dr-john-ankers-science-and-maths-tutor-liverpool-8467525>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/tutor-services-liverpool-liverpool-8134849/> from <GET https://www.yell.com//biz/tutor-services-liverpool-liverpool-8134849>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/advanced-maths-tutorials-liverpool-3755223/> from <GET https://www.yell.com//biz/advanced-maths-tutorials-liverpool-3755223>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/kumon-maths-and-english-liverpool-8903743/> from <GET https://www.yell.com//biz/kumon-maths-and-english-liverpool-8903743>
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/kumon-maths-and-english-study-centre-bebington-wirral-7460985/> from <GET https://www.yell.com//biz/kumon-maths-and-english-study-centre-bebington-wirral-7460985>
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/pennine-tuition-services-liverpool-9465687/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/north-west-tutors-liverpool-901511208/> from <GET https://www.yell.com//biz/north-west-tutors-liverpool-901511208>
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/home-maths-tutoring-liverpool-7467622/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/triple-m-education-prenton-8934754/> from <GET https://www.yell.com//biz/triple-m-education-prenton-8934754>
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/kumon-maths-and-english-wallasey-8945913/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/pennine-tuition-services-liverpool-9465687/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/maths-tution-wallasey-7574939/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/maths-tuition-wallasey-901339881/> from <GET https://www.yell.com//biz/maths-tuition-wallasey-901339881>
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/dr-john-ankers-science-and-maths-tutor-liverpool-8467525/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/home-maths-tutoring-liverpool-7467622/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/tutor-services-liverpool-liverpool-8134849/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/kumon-maths-and-english-wallasey-8945913/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/patrick-haslam-tutoring-liverpool-8777349/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/maths-tution-wallasey-8327361/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/advanced-maths-tutorials-liverpool-3755223/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/maths-tution-wallasey-7574939/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/activett-liverpool-901311152/> from <GET https://www.yell.com//biz/activett-liverpool-901311152>
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/kumon-maths-and-english-liverpool-8903743/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/dr-john-ankers-science-and-maths-tutor-liverpool-8467525/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=4> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/tutor-services-liverpool-liverpool-8134849/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.yell.com/biz/liz-beattie-tutoring-prenton-7618961/> from <GET https://www.yell.com//biz/liz-beattie-tutoring-prenton-7618961>
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/askademia-liverpool-8680035> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/patrick-haslam-tutoring-liverpool-8777349/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/maths-tution-wallasey-8327361/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/kumon-maths-and-english-study-centre-bebington-wirral-7460985/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/north-west-tutors-liverpool-901511208/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/advanced-maths-tutorials-liverpool-3755223/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/triple-m-education-prenton-8934754/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/1-2-1-tutoring-liverpool-901224945> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/kumon-maths-and-english-liverpool-8903743/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/maths-tuition-wallasey-901339881/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/dmw-tuition-ltd-liverpool-6887458> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/askademia-liverpool-8680035>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/wallasey-tuition-wallasey-7390339> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/zenitheducators-liverpool-7791342> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/explore-learning-liverpool-901511688> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/kumon-maths-and-english-study-centre-bebington-wirral-7460985/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/north-west-tutors-liverpool-901511208/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/edes-educational-centre-liverpool-8380869> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/triple-m-education-prenton-8934754/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/love2teach-liverpool-liverpool-9678322> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/activett-liverpool-901311152/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/1-2-1-tutoring-liverpool-901224945>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/maths-tuition-wallasey-901339881/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com//biz/guaranteed-grades-liverpool-7368523> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/dmw-tuition-ltd-liverpool-6887458>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/wallasey-tuition-wallasey-7390339>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/biz/liz-beattie-tutoring-prenton-7618961/> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/zenitheducators-liverpool-7791342>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=1> (referer: https://www.yell.com/ucs/UcsSearchAction.do?&selectedClassification=Tutoring&keywords=Math&location=liverpool&pageNum=4)
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/explore-learning-liverpool-901511688>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/edes-educational-centre-liverpool-8380869>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/love2teach-liverpool-liverpool-9678322>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/activett-liverpool-901311152/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com//biz/guaranteed-grades-liverpool-7368523>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.yell.com/biz/liz-beattie-tutoring-prenton-7618961/>
{'Name': None, 'Address': '', 'Telephone': None, 'Web Site': None}
2019-06-17 15:50:57 [scrapy.core.engine] INFO: Closing spider (finished)
2019-06-17 15:50:57 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 21544,
'downloader/request_count': 45,
'downloader/request_method_count/GET': 45,
'downloader/response_bytes': 257613,
'downloader/response_count': 45,
'downloader/response_status_count/200': 29,
'downloader/response_status_count/301': 16,
'dupefilter/filtered': 51,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2019, 6, 17, 13, 50, 57, 376036),
'item_scraped_count': 25,
'log_count/DEBUG': 71,
'log_count/INFO': 9,
'request_depth_max': 3,
'response_received_count': 29,
'robotstxt/request_count': 1,
'robotstxt/response_count': 1,
'robotstxt/response_status_count/200': 1,
'scheduler/dequeued': 44,
'scheduler/dequeued/memory': 44,
'scheduler/enqueued': 44,
'scheduler/enqueued/memory': 44,
'start_time': datetime.datetime(2019, 6, 17, 13, 50, 34, 114473)}
2019-06-17 15:50:57 [scrapy.core.engine] INFO: Spider closed (finished)
I'm running Chef 12 on Ubuntu 14.1, with self-signed certs used to set up the server. When I try to run knife commands from my client, it fails with the following error; every operation fails the same way. The Chef server logs show no errors or other information during the query.
knife config
[root@ip-10-233-2-40 ~]# cat ~/.chef/knife.rb
log_level :debug
log_location STDOUT
node_name 'admin'
client_key '/root/.chef/admin.pem'
validation_client_name 'dev'
validation_key '/root/.chef/dev-validator.pem'
chef_server_url 'https://chef.example.com/organizations/dev'
syntax_check_cache_path '/root/.chef/syntax_check_cache'
root@ip-10-233-2-177:~/ssl-certs# chef-server-ctl status
run: bookshelf: (pid 1092) 1998s; run: log: (pid 1064) 1998s
run: nginx: (pid 6140) 723s; run: log: (pid 1063) 1998s
run: oc_bifrost: (pid 1077) 1998s; run: log: (pid 1058) 1998s
run: oc_id: (pid 1091) 1998s; run: log: (pid 1061) 1998s
run: opscode-erchef: (pid 1090) 1998s; run: log: (pid 1066) 1998s
run: opscode-expander: (pid 1076) 1998s; run: log: (pid 1060) 1998s
run: opscode-expander-reindexer: (pid 1096) 1998s; run: log: (pid 1059) 1998s
run: opscode-solr4: (pid 1075) 1998s; run: log: (pid 1057) 1998s
run: postgresql: (pid 1085) 1998s; run: log: (pid 1056) 1998s
run: rabbitmq: (pid 1062) 1998s; run: log: (pid 1046) 1998s
run: redis_lb: (pid 6124) 723s; run: log: (pid 1065) 1998s
[root@ip-10-233-2-40 ~]# knife environment create staging
ERROR: The object you are looking for could not be found
/opt/chef/embedded/lib/ruby/2.1.0/net/http/response.rb:325:in `stream_check': undefined method `closed?' for nil:NilClass (NoMethodError)
from /opt/chef/embedded/lib/ruby/2.1.0/net/http/response.rb:199:in `read_body'
from /opt/chef/embedded/lib/ruby/2.1.0/net/http/response.rb:226:in `body'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:499:in `rescue in format_rest_error'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:497:in `format_rest_error'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:459:in `humanize_http_exception'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:418:in `humanize_exception'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:409:in `rescue in run_with_pretty_exceptions'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:400:in `run_with_pretty_exceptions'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:203:in `run'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/application/knife.rb:142:in `run'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/bin/knife:25:in `<top (required)>'
from /usr/bin/knife:54:in `load'
from /usr/bin/knife:54:in `<main>'
Update
[root@ip-10-233-2-40 ~]# knife client list -VV
INFO: Using configuration from /root/.chef/knife.rb
DEBUG: Chef::HTTP calling Chef::HTTP::JSONInput#handle_request
DEBUG: Chef::HTTP calling Chef::HTTP::JSONOutput#handle_request
DEBUG: Chef::HTTP calling Chef::HTTP::CookieManager#handle_request
DEBUG: Chef::HTTP calling Chef::HTTP::Decompressor#handle_request
DEBUG: Chef::HTTP calling Chef::HTTP::Authenticator#handle_request
DEBUG: Signing the request as admin
DEBUG: Chef::HTTP calling Chef::HTTP::RemoteRequestID#handle_request
DEBUG: Using 10.233.0.182:3128 for proxy
DEBUG: Initiating GET to https://chef.example.com/organizations/dev/clients
DEBUG: ---- HTTP Request Header Data: ----
DEBUG: Accept: application/json
DEBUG: Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
DEBUG: X-OPS-SIGN: algorithm=sha1;version=1.0;
DEBUG: X-OPS-USERID: admin
DEBUG: X-OPS-TIMESTAMP: 2015-10-21T17:40:17Z
DEBUG: X-OPS-CONTENT-HASH: 2jmj7l5rSw0yVb/vlWAYkK/YBwk=
DEBUG: X-OPS-AUTHORIZATION-1: m/vlWcZBPE7XUN7qhX6t/T9hXTT+2x/JehpOYq6My1ffEID6n+U+Xc+lHWto
DEBUG: X-OPS-AUTHORIZATION-2: Lq4ZEfNT1ltZkkYZ9Ii8EoF3eajUQmb2buwKMWae3yvxrZ5rgllJPf5q4gy3
DEBUG: X-OPS-AUTHORIZATION-3: IEqUUst+KzmoRHCiC1LeYxKXy+oeo45F4Vw4xHlOWgS0piqXfrmXnkrxs8Um
DEBUG: X-OPS-AUTHORIZATION-4: ZDqdLvcQ10WjoW9Wz4F2+fRh/BdRHjwMF80LVPwrtylf+GbdIhmCU3xxVvOq
DEBUG: X-OPS-AUTHORIZATION-5: w1Z2p03UcpRfMZy1pQV59A0Y3yv57Db5n3PJdjD9TlitNK++/HXcqO3IfO2U
DEBUG: X-OPS-AUTHORIZATION-6: 0QbZYZaeGSkJw0ArQDeffnjbpzAhSXhUfbs+in9tRg==
DEBUG: HOST: chef.example.com:443
DEBUG: X-Ops-Server-API-Version: 1
DEBUG: X-REMOTE-REQUEST-ID: 6a00a52a-7eeb-43d6-920d-fffc685c1b2a
DEBUG: ---- End HTTP Request Header Data ----
/opt/chef/embedded/lib/ruby/2.1.0/net/http/response.rb:119:in `error!': 404 "Not Found" (Net::HTTPServerException)
from /opt/chef/embedded/lib/ruby/2.1.0/net/http/response.rb:128:in `value'
from /opt/chef/embedded/lib/ruby/2.1.0/net/http.rb:915:in `connect'
from /opt/chef/embedded/lib/ruby/2.1.0/net/http.rb:863:in `do_start'
from /opt/chef/embedded/lib/ruby/2.1.0/net/http.rb:852:in `start'
from /opt/chef/embedded/lib/ruby/2.1.0/net/http.rb:1375:in `request'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/http/basic_client.rb:65:in `request'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/http.rb:266:in `block in send_http_request'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/http.rb:298:in `block in retrying_http_errors'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/http.rb:296:in `loop'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/http.rb:296:in `retrying_http_errors'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/http.rb:260:in `send_http_request'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/http.rb:143:in `request'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/http.rb:110:in `get'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/api_client_v1.rb:198:in `list'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife/client_list.rb:38:in `run'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:405:in `block in run_with_pretty_exceptions'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/local_mode.rb:44:in `with_server_connectivity'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:404:in `run_with_pretty_exceptions'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/knife.rb:203:in `run'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/lib/chef/application/knife.rb:142:in `run'
from /opt/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.5.1/bin/knife:25:in `<top (required)>'
from /usr/bin/knife:54:in `load'
from /usr/bin/knife:54:in `<main>'
[root@ip-10-233-2-40 ~]# telnet chef.example.com 443
Trying 10.233.2.177...
Connected to chef.example.com.
Escape character is '^]'.
I have this script for measuring how long a query takes to run:
for (var i = 0; i < 10000; i++) {
    (function(start) {
        Models.User.findOneAsync({
            userId: 'ABCD'
        }, 'age name location')
        .then(function(user) {
            logger.debug(Date.now() - start);
        });
    })(Date.now());
}
As a result, I get a list of ever-increasing times:
2015-05-19T11:09:16.204Z - debug: 4369
2015-05-19T11:09:16.205Z - debug: 4367
2015-05-19T11:09:16.205Z - debug: 4367
2015-05-19T11:09:16.206Z - debug: 4368
2015-05-19T11:09:16.206Z - debug: 4367
2015-05-19T11:09:16.206Z - debug: 4368
2015-05-19T11:09:16.206Z - debug: 4367
2015-05-19T11:09:16.206Z - debug: 4369
2015-05-19T11:09:16.206Z - debug: 4368
2015-05-19T11:09:16.206Z - debug: 4368
2015-05-19T11:09:16.206Z - debug: 4368
2015-05-19T11:09:16.206Z - debug: 4367
2015-05-19T11:09:16.212Z - debug: 4373
2015-05-19T11:09:16.212Z - debug: 4373
2015-05-19T11:09:16.248Z - debug: 4408
2015-05-19T11:09:16.376Z - debug: 4536
2015-05-19T11:09:16.459Z - debug: 4619
2015-05-19T11:09:16.475Z - debug: 4635
2015-05-19T11:09:16.493Z - debug: 4654
2015-05-19T11:09:16.552Z - debug: 4713
2015-05-19T11:09:16.636Z - debug: 4796
2015-05-19T11:09:16.794Z - debug: 4954
2015-05-19T11:09:16.830Z - debug: 4990
2015-05-19T11:09:16.841Z - debug: 5001
2015-05-19T11:09:16.845Z - debug: 5005
2015-05-19T11:09:17.133Z - debug: 5293
2015-05-19T11:09:17.176Z - debug: 5336
2015-05-19T11:09:17.182Z - debug: 5341
2015-05-19T11:09:17.230Z - debug: 5390
2015-05-19T11:09:17.421Z - debug: 5580
2015-05-19T11:09:17.437Z - debug: 5596
2015-05-19T11:09:17.441Z - debug: 5600
2015-05-19T11:09:17.513Z - debug: 5672
2015-05-19T11:09:17.569Z - debug: 5728
2015-05-19T11:09:17.658Z - debug: 5817
2015-05-19T11:09:17.697Z - debug: 5855
2015-05-19T11:09:17.708Z - debug: 5867
2015-05-19T11:09:17.712Z - debug: 5870
2015-05-19T11:09:17.732Z - debug: 5891
2015-05-19T11:09:17.805Z - debug: 5963
2015-05-19T11:09:17.852Z - debug: 6010
2015-05-19T11:09:17.890Z - debug: 6048
2015-05-19T11:09:17.985Z - debug: 6143
2015-05-19T11:09:17.989Z - debug: 6147
2015-05-19T11:09:18.013Z - debug: 6171
2015-05-19T11:09:18.016Z - debug: 6175
2015-05-19T11:09:18.031Z - debug: 6189
2015-05-19T11:09:18.170Z - debug: 6327
2015-05-19T11:09:18.196Z - debug: 6353
2015-05-19T11:09:18.205Z - debug: 6362
2015-05-19T11:09:18.209Z - debug: 6367
2015-05-19T11:09:18.224Z - debug: 6382
2015-05-19T11:09:18.317Z - debug: 6474
2015-05-19T11:09:18.360Z - debug: 6516
2015-05-19T11:09:18.369Z - debug: 6526
2015-05-19T11:09:18.433Z - debug: 6590
2015-05-19T11:09:18.460Z - debug: 6616
2015-05-19T11:09:18.513Z - debug: 6668
2015-05-19T11:09:18.541Z - debug: 6697
2015-05-19T11:09:18.553Z - debug: 6711
2015-05-19T11:09:18.586Z - debug: 6741
2015-05-19T11:09:18.672Z - debug: 6827
2015-05-19T11:09:18.688Z - debug: 6844
2015-05-19T11:09:18.693Z - debug: 6849
2015-05-19T11:09:18.729Z - debug: 6884
2015-05-19T11:09:18.817Z - debug: 6972
2015-05-19T11:09:18.823Z - debug: 6980
2015-05-19T11:09:18.828Z - debug: 6983
2015-05-19T11:09:18.882Z - debug: 7036
2015-05-19T11:09:18.919Z - debug: 7075
2015-05-19T11:09:19.016Z - debug: 7170
2015-05-19T11:09:19.020Z - debug: 7174
2015-05-19T11:09:19.043Z - debug: 7197
2015-05-19T11:09:19.066Z - debug: 7222
2015-05-19T11:09:19.177Z - debug: 7331
2015-05-19T11:09:19.182Z - debug: 7335
2015-05-19T11:09:19.189Z - debug: 7343
2015-05-19T11:09:19.189Z - debug: 7344
2015-05-19T11:09:19.191Z - debug: 7344
2015-05-19T11:09:19.280Z - debug: 7433
2015-05-19T11:09:19.340Z - debug: 7494
2015-05-19T11:09:19.344Z - debug: 7497
2015-05-19T11:09:19.358Z - debug: 7512
2015-05-19T11:09:19.362Z - debug: 7518
2015-05-19T11:09:19.455Z - debug: 7608
2015-05-19T11:09:19.499Z - debug: 7651
2015-05-19T11:09:19.504Z - debug: 7656
2015-05-19T11:09:19.515Z - debug: 7669
2015-05-19T11:09:19.569Z - debug: 7722
2015-05-19T11:09:19.574Z - debug: 7726
2015-05-19T11:09:19.574Z - debug: 7726
2015-05-19T11:09:19.667Z - debug: 7818
2015-05-19T11:09:19.672Z - debug: 7823
2015-05-19T11:09:19.678Z - debug: 7830
2015-05-19T11:09:19.689Z - debug: 7844
2015-05-19T11:09:19.716Z - debug: 7868
2015-05-19T11:09:19.835Z - debug: 7986
2015-05-19T11:09:19.839Z - debug: 7989
2015-05-19T11:09:19.845Z - debug: 7997
2015-05-19T11:09:19.978Z - debug: 8128
2015-05-19T11:09:19.989Z - debug: 8136
2015-05-19T11:09:19.995Z - debug: 8146
2015-05-19T11:09:19.999Z - debug: 8153
2015-05-19T11:09:20.012Z - debug: 8166
2015-05-19T11:09:20.023Z - debug: 8174
2015-05-19T11:09:20.026Z - debug: 8177
2015-05-19T11:09:20.116Z - debug: 8262
2015-05-19T11:09:20.127Z - debug: 8272
2015-05-19T11:09:20.136Z - debug: 8287
2015-05-19T11:09:20.154Z - debug: 8307
2015-05-19T11:09:20.179Z - debug: 8324
2015-05-19T11:09:20.262Z - debug: 8407
2015-05-19T11:09:20.275Z - debug: 8425
2015-05-19T11:09:20.279Z - debug: 8423
2015-05-19T11:09:20.306Z - debug: 8456
2015-05-19T11:09:20.309Z - debug: 8463
2015-05-19T11:09:20.396Z - debug: 8540
2015-05-19T11:09:20.422Z - debug: 8565
2015-05-19T11:09:20.424Z - debug: 8574
2015-05-19T11:09:20.441Z - debug: 8594
2015-05-19T11:09:20.452Z - debug: 8601
2015-05-19T11:09:20.455Z - debug: 8605
2015-05-19T11:09:20.461Z - debug: 8604
2015-05-19T11:09:20.549Z - debug: 8691
2015-05-19T11:09:20.555Z - debug: 8697
2015-05-19T11:09:20.565Z - debug: 8711
2015-05-19T11:09:20.598Z - debug: 8752
2015-05-19T11:09:20.602Z - debug: 8750
2015-05-19T11:09:20.659Z - debug: 8801
2015-05-19T11:09:20.689Z - debug: 8830
2015-05-19T11:09:20.701Z - debug: 8842
2015-05-19T11:09:20.707Z - debug: 8853
2015-05-19T11:09:20.712Z - debug: 8864
2015-05-19T11:09:20.752Z - debug: 8898
2015-05-19T11:09:20.798Z - debug: 8938
2015-05-19T11:09:20.829Z - debug: 8969
2015-05-19T11:09:20.844Z - debug: 8989
2015-05-19T11:09:20.850Z - debug: 8990
2015-05-19T11:09:20.869Z - debug: 9021
2015-05-19T11:09:20.880Z - debug: 9033
2015-05-19T11:09:20.893Z - debug: 9038
2015-05-19T11:09:20.939Z - debug: 9078
2015-05-19T11:09:20.965Z - debug: 9104
2015-05-19T11:09:20.979Z - debug: 9124
2015-05-19T11:09:20.984Z - debug: 9122
2015-05-19T11:09:21.051Z - debug: 9189
2015-05-19T11:09:21.057Z - debug: 9202
2015-05-19T11:09:21.057Z - debug: 9210
2015-05-19T11:09:21.096Z - debug: 9234
2015-05-19T11:09:21.112Z - debug: 9249
2015-05-19T11:09:21.121Z - debug: 9265
2015-05-19T11:09:21.130Z - debug: 9267
2015-05-19T11:09:21.133Z - debug: 9284
2015-05-19T11:09:21.195Z - debug: 9339
2015-05-19T11:09:21.200Z - debug: 9345
2015-05-19T11:09:21.239Z - debug: 9375
2015-05-19T11:09:21.247Z - debug: 9383
2015-05-19T11:09:21.270Z - debug: 9404
2015-05-19T11:09:21.283Z - debug: 9434
2015-05-19T11:09:21.334Z - debug: 9468
2015-05-19T11:09:21.337Z - debug: 9481
2015-05-19T11:09:21.348Z - debug: 9500
2015-05-19T11:09:21.352Z - debug: 9496
2015-05-19T11:09:21.378Z - debug: 9512
2015-05-19T11:09:21.385Z - debug: 9518
2015-05-19T11:09:21.416Z - debug: 9559
2015-05-19T11:09:21.419Z - debug: 9552
2015-05-19T11:09:21.470Z - debug: 9603
2015-05-19T11:09:21.475Z - debug: 9625
2015-05-19T11:09:21.490Z - debug: 9634
2015-05-19T11:09:21.517Z - debug: 9649
2015-05-19T11:09:21.522Z - debug: 9654
2015-05-19T11:09:21.554Z - debug: 9697
2015-05-19T11:09:21.562Z - debug: 9694
2015-05-19T11:09:21.575Z - debug: 9706
2015-05-19T11:09:21.622Z - debug: 9773
2015-05-19T11:09:21.635Z - debug: 9779
2015-05-19T11:09:21.656Z - debug: 9787
2015-05-19T11:09:21.683Z - debug: 9814
2015-05-19T11:09:21.688Z - debug: 9818
2015-05-19T11:09:21.691Z - debug: 9833
2015-05-19T11:09:21.716Z - debug: 9846
2015-05-19T11:09:21.720Z - debug: 9869
2015-05-19T11:09:21.751Z - debug: 9893
2015-05-19T11:09:21.778Z - debug: 9921
2015-05-19T11:09:21.786Z - debug: 9936
2015-05-19T11:09:21.790Z - debug: 9919
2015-05-19T11:09:21.820Z - debug: 9949
2015-05-19T11:09:21.830Z - debug: 9959
2015-05-19T11:09:21.864Z - debug: 10005
2015-05-19T11:09:21.870Z - debug: 9998
2015-05-19T11:09:21.884Z - debug: 10033
2015-05-19T11:09:21.928Z - debug: 10070
2015-05-19T11:09:21.931Z - debug: 10059
2015-05-19T11:09:21.949Z - debug: 10076
2015-05-19T11:09:21.984Z - debug: 10111
2015-05-19T11:09:21.987Z - debug: 10128
2015-05-19T11:09:22.015Z - debug: 10142
2015-05-19T11:09:22.019Z - debug: 10160
2015-05-19T11:09:22.046Z - debug: 10172
2015-05-19T11:09:22.060Z - debug: 10202
2015-05-19T11:09:22.064Z - debug: 10215
2015-05-19T11:09:22.089Z - debug: 10215
2015-05-19T11:09:22.123Z - debug: 10248
2015-05-19T11:09:22.130Z - debug: 10270
2015-05-19T11:09:22.152Z - debug: 10277
2015-05-19T11:09:22.156Z - debug: 10302
2015-05-19T11:09:22.180Z - debug: 10320
2015-05-19T11:09:22.183Z - debug: 10325
2015-05-19T11:09:22.186Z - debug: 10310
2015-05-19T11:09:22.228Z - debug: 10349
2015-05-19T11:09:22.260Z - debug: 10380
OK, so it's not strictly increasing, but it keeps going up.
I would expect a list of roughly equal numbers, since each query should take about the same time.
Does anyone know why the times keep going up?
It's because you're not measuring just the query times.
The basic flow of this code is:
The for loop of 10,000 iterations runs to completion, setting each iteration's start time to the moment that iteration executed.
For the first 5 of those iterations (or whatever pool size you're using for your MongoDB connection), the queries start as soon as their findOneAsync calls are made.
As queries complete, their findOneAsync callbacks are put on the event queue and their connections are returned to the pool, allowing later iterations' queries to start.
So every iteration's measured time includes the time to complete the rest of the for loop after it, the time spent waiting for a pool connection to become available, and the time its findOneAsync callback spent waiting in the event queue.
If you want to get an accurate picture of how long the queries are taking, use MongoDB's profiling support.
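The pool effect above can be demonstrated without MongoDB at all. The sketch below is purely illustrative (all names, the pool size of 5, and the 50 ms "query" duration are made up): a fixed-size "pool" of workers serves 20 jobs, and each job's clock starts when it is queued, as in the original loop, so the measured times grow even though every job's actual work takes the same 50 ms.

```javascript
function simulate(poolSize, jobCount, jobMs) {
    return new Promise(function (resolve) {
        var results = [];
        var launched = 0;
        var finished = 0;
        var starts = [];
        // Every job's start time is recorded up front, when it is queued,
        // mirroring how the original loop captures Date.now() immediately.
        for (var i = 0; i < jobCount; i++) starts.push(Date.now());

        function runNext() {
            if (launched >= jobCount) return;
            var idx = launched++;
            setTimeout(function () {
                // measured time = queueing delay + the actual "query" time
                results[idx] = Date.now() - starts[idx];
                finished++;
                if (finished === jobCount) resolve(results);
                else runNext(); // worker freed: next queued job starts
            }, jobMs);
        }

        // Only poolSize jobs run concurrently, like a driver connection pool.
        for (var k = 0; k < poolSize; k++) runNext();
    });
}

simulate(5, 20, 50).then(function (times) {
    console.log('first:', times[0], 'last:', times[times.length - 1]);
});
```

With 20 jobs and a pool of 5, the jobs run in four waves, so the last measured time comes out roughly four times the first even though every job did identical work.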