Can SonarQube really detect memory leaks?

I'm using SonarQube (v8.9) at work with SonarScanner (v4.2).
I've created two memory leaks, one in JavaScript and one in Python, but SonarScanner did not detect either of them.
These are the snippets:
JS:
beforeMount () {
  Window.test = {
    name: 'home',
    node: document.getElementById('home')
  }
}
Python:
import requests
import gc

def call():
    response = requests.get('https://google.com')
    print("Status code", response.status_code)
    return

def main():
    print("No.of tracked objects before calling get method")
    print(len(gc.get_objects()))
    call()
    print("No.of tracked objects after calling get method")
    print(len(gc.get_objects()))

if __name__ == "__main__":
    main()
The questions are:
Can SonarQube/SonarScanner detect memory leaks?
Can a static analyzer detect memory leaks at all? (Neither Bandit nor Semgrep could detect these.)
Do you have examples of code snippets that create memory leaks that I can use for testing? One more candidate I've been considering is sketched below.
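A sketch of that candidate (my own idea, not taken from any analyzer's documentation): a module-level cache that is never evicted, so every fetched body stays reachable for the lifetime of the process:
import requests

# Unbounded module-level cache: entries are never evicted, so the
# process's memory grows with every distinct URL fetched.
_cache = {}

def fetch(url):
    if url not in _cache:
        _cache[url] = requests.get(url).content
    return _cache[url]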
Thanks

Related

Multiprocessing HTTP get requests with a Tornado/Python3 API Service

I have built two different Tornado/Python 3 API services.
Service1 opens several different threads and sends GET requests to Service2 in a semi-parallel fashion.
However, Service2 (Tornado/Python 3) then proceeds to process the GET requests sequentially.
Code example of Service2:
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Some Microservice v1")

def make_app():
    return tornado.web.Application([
        (r"/v1", MainHandler),
        (r"/v1/addfile", AddHandler, dict(filepaths = filepaths)),
        (r"/v1/getfiles", GetHandler, dict(filepaths = filepaths)),
        (r"/v1/getfile", GetFileHandler, dict(filepaths = filepaths)),
    ])

if __name__ == "__main__":
    app = make_app()
    app.listen(8887, address='127.0.0.1')
    tornado.ioloop.IOLoop.current().start()
I have tried to fork processes manually to achieve better performance, but this did not work.
if __name__ == "__main__":
    # app = make_app()
    # app.listen(8887, address='127.0.0.1')
    # tornado.ioloop.IOLoop.current().start()
    sockets = tornado.netutil.bind_sockets(8887)
    tornado.process.fork_processes(8)
    app = make_app()  # the app still has to be built after forking
    server = tornado.httpserver.HTTPServer(app)
    server.add_sockets(sockets)
    tornado.ioloop.IOLoop.instance().start()
What am I missing here? Do I actually need to go into each Handler class and define coroutines and callbacks?
Grateful for any help,
Thank you.
Working as intended. My Service1 was using up all my cores in the test environment, which is why there were no cores left for Service2.
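For anyone hitting the same wall, a minimal sketch (reusing make_app() from the question) that sizes the fork count to the machine instead of hard-coding 8; in Tornado, passing 0 to fork_processes forks one child per detected CPU core:
import tornado.httpserver
import tornado.ioloop
import tornado.netutil
import tornado.process

if __name__ == "__main__":
    sockets = tornado.netutil.bind_sockets(8887)
    # 0 means "fork one child process per detected CPU core", so the
    # service never claims more workers than the box can serve.
    tornado.process.fork_processes(0)
    app = make_app()  # make_app() as defined in the question
    server = tornado.httpserver.HTTPServer(app)
    server.add_sockets(sockets)
    tornado.ioloop.IOLoop.current().start()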

Failure on unit tests with pytest, tornado and aiopg; any query fails

I have a REST API running on Python 3.7 + Tornado 5, with PostgreSQL as the database, using aiopg with SQLAlchemy Core (via the aiopg.sa binding). For the unit tests I use py.test with pytest-tornado.
All the tests go OK as long as no query to the database is involved; when one is, I get this:
RuntimeError: Task <Task pending ... cb=[IOLoop.add_future.<locals>.<lambda>() at venv/lib/python3.7/site-packages/tornado/ioloop.py:719]> got Future <Future ...> attached to a different loop
The same code works fine outside of the tests; I'm able to handle hundreds of requests so far.
This is part of an @auth decorator which checks the Authorization header for a JWT token, decodes it, gets the user's data and attaches it to the request; this is the part with the query:
partner_id = payload['partner_id']
provided_scopes = payload.get("scope", [])
for scope in scopes:
    if scope not in provided_scopes:
        logger.error(
            'Authentication failed, scopes are not compliant - '
            'required: {} - '
            'provided: {}'.format(scopes, provided_scopes)
        )
        raise ForbiddenException(
            "insufficient permissions or wrong user."
        )

db = self.settings['db']
partner = await Partner.get(db, username=partner_id)

# The user is authenticated at this stage, let's add
# the user info to the request so it can be used
if not partner:
    raise UnauthorizedException('Unknown user from token')

p = Partner(**partner)
setattr(self.request, "partner_id", p.uuid)
setattr(self.request, "partner", p)
The .get() async method from Partner comes from the Base class for all models in the app. This is the .get method implementation:
@classmethod
async def get(cls, db, order=None, limit=None, offset=None, **kwargs):
    """
    Get one instance that will match the criteria
    :param db:
    :param order:
    :param limit:
    :param offset:
    :param kwargs:
    :return:
    """
    if len(kwargs) == 0:
        return None
    if not hasattr(cls, '__tablename__'):
        raise InvalidModelException()

    tbl = cls.__table__
    instance = None
    clause = cls.get_clause(**kwargs)
    query = (tbl.select().where(text(clause)))
    if order:
        query = query.order_by(text(order))
    if limit:
        query = query.limit(limit)
    if offset:
        query = query.offset(offset)

    logger.info(f'GET query executing:\n{query}')
    try:
        async with db.acquire() as conn:
            async with conn.execute(query) as rows:
                instance = await rows.first()
    except DataError as de:
        [...]

    return instance
The .get() method above will either return a model instance (row representation) or None.
It uses the db.acquire() context manager, as described in aiopg's doc here: https://aiopg.readthedocs.io/en/stable/sa.html.
As described in that same doc, the sa.create_engine() method returns a connection pool, so db.acquire() just uses one connection from the pool. I'm sharing this pool with every request in Tornado; they use it to perform queries when they need to (a sketch of that wiring follows).
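A minimal sketch of how that wiring might look, assuming setup_db() and make_app() are roughly like this (the DSN and the handlers list are placeholders):
from aiopg.sa import create_engine
import tornado.web

async def setup_db():
    # create_engine() is a coroutine; the Engine it returns wraps
    # a connection pool
    return await create_engine(dsn='postgresql://...')

def make_app(db):
    # handlers: the application's routing list, elided here.
    # Passing db as a setting makes the pool reachable from any
    # handler as self.settings['db'].
    return tornado.web.Application(handlers, db=db)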
So these are the fixtures I've set up in my conftest.py:
@pytest.fixture
async def db():
    dbe = await setup_db()
    return dbe

@pytest.fixture
def app(db, event_loop):
    """
    Returns a valid testing Tornado Application instance.
    :return:
    """
    app = make_app(db)
    settings.JWT_SECRET = 'its_secret_one'
    return app
I can't find an explanation of why this is happening; Tornado's docs and source make it clear that the asyncio event loop is used by default, and by debugging I can see the event loop is indeed the same one, but for some reason it seems to get closed or stopped abruptly.
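(For reference, this is the kind of throwaway check I used while debugging; a sketch rather than app code:)
import asyncio

def describe_loop(tag):
    # Print the identity and state of the caller's loop, to confirm
    # whether two pieces of code really share the same loop.
    loop = asyncio.get_event_loop()
    print(tag, id(loop), 'running' if loop.is_running() else 'stopped')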
This is one test that fails:
@pytest.mark.gen_test(timeout=2)
def test_score_returns_204_empty(app, http_server, http_client, base_url):
    score_url = '/'.join([base_url, URL_PREFIX, 'score'])
    token = create_token('test', scopes=['score:get'])
    headers = {
        'Authorization': f'Bearer {token}',
        'Accept': 'application/json',
    }
    response = yield http_client.fetch(score_url, headers=headers, raise_error=False)
    assert response.code == 204
This test fails, returning 401 instead of 204: the query in the auth decorator fails with the RuntimeError, which then results in an Unauthorized response.
Any ideas from the async experts here will be very appreciated; I'm quite lost on this!
Well, after a lot of digging, testing and, of course, learning quite a lot about asyncio, I made it work myself. Thanks for the suggestions so far.
The issue was that the asyncio event loop was not running; as @hoefling mentioned, pytest itself does not support coroutines, but pytest-asyncio brings this useful feature to your tests. This is very well explained here: https://medium.com/ideas-at-igenius/testing-asyncio-python-code-with-pytest-a2f3628f82bc
So, without pytest-asyncio, your async code that needs to be tested will look like this:
import asyncio

def test_this_is_an_async_test():
    loop = asyncio.get_event_loop()
    result = loop.run_until_complete(my_async_function(param1, param2, param3))
    assert result == 'expected'
We use loop.run_until_complete() because otherwise the loop will never run; this is how asyncio works by default, and pytest does nothing to make it work differently.
With pytest-asyncio, your test works with the well-known async / await parts:
async def test_this_is_an_async_test(event_loop):
    result = await my_async_function(param1, param2, param3)
    assert result == 'expected'
pytest-asyncio in this case wraps the run_until_complete() call shown above (summarizing heavily), so the event loop runs and is available for your async code to use.
Please note: the event_loop parameter in the second case is not even necessary; pytest-asyncio makes one available for your test.
On the other hand, when you are testing your Tornado app, you usually need an HTTP server up and running during your tests, listening on a well-known port, etc., so the usual way is to write fixtures that provide an HTTP server, a base_url (usually http://localhost: plus an unused port), and so on.
pytest-tornado comes up as a very useful one, as it offers several of these fixtures for you: http_server, http_client, unused_port, base_url, etc.
Also worth mentioning: it provides a pytest mark, gen_test(), which converts any standard test to use coroutines via yield, and can even assert that it will run within a given timeout, like this:
@pytest.mark.gen_test(timeout=3)
def test_fetch_my_data(http_client, base_url):
    result = yield http_client.fetch('/'.join([base_url, 'result']))
    assert len(result.body) == 1000
But this way it does not support async / await, and actually only Tornado's IOLoop is available, via the io_loop fixture (although Tornado's IOLoop uses asyncio underneath by default from Tornado 5.0 on). So you'd need to combine both pytest.mark.gen_test and pytest.mark.asyncio, and in the right order (which is where I failed).
Once I understood better what could be the problem, this was the next approach:
@pytest.mark.gen_test(timeout=2)
@pytest.mark.asyncio
async def test_score_returns_204_empty(http_client, base_url):
    score_url = '/'.join([base_url, URL_PREFIX, 'score'])
    token = create_token('test', scopes=['score:get'])
    headers = {
        'Authorization': f'Bearer {token}',
        'Accept': 'application/json',
    }
    response = await http_client.fetch(score_url, headers=headers, raise_error=False)
    assert response.code == 204
But this is utterly wrong, if you understand how Python's decorator wrappers work. With the code above, pytest-asyncio's coroutine is wrapped in pytest-tornado's yield-based gen.coroutine, which never gets the event loop running... so my tests were still failing with the same problem: every query to the database returned a Future waiting for an event loop to be running.
My updated code, once I realized my silly mistake:
@pytest.mark.asyncio
@pytest.mark.gen_test(timeout=2)
async def test_score_returns_204_empty(http_client, base_url):
    score_url = '/'.join([base_url, URL_PREFIX, 'score'])
    token = create_token('test', scopes=['score:get'])
    headers = {
        'Authorization': f'Bearer {token}',
        'Accept': 'application/json',
    }
    response = await http_client.fetch(score_url, headers=headers, raise_error=False)
    assert response.code == 204
In this case, the gen.coroutine is wrapped inside the pytest-asyncio coroutine, and the event loop runs the coroutines as expected!
But there was still a minor issue that took me a little while to realize, too: pytest-asyncio's event_loop fixture creates a new event loop for every test, while pytest-tornado also creates a new IOLoop. The tests were still failing, but this time with a different error.
The conftest.py file now looks like this; note that I've re-declared the event_loop fixture to use the loop from pytest-tornado's io_loop fixture itself (recall that pytest-tornado creates a new io_loop for each test function):
@pytest.fixture(scope='function')
def event_loop(io_loop):
    loop = io_loop.current().asyncio_loop
    yield loop
    loop.stop()

@pytest.fixture(scope='function')
async def db():
    dbe = await setup_db()
    yield dbe

@pytest.fixture
def app(db):
    """
    Returns a valid testing Tornado Application instance.
    :return:
    """
    app = make_app(db)
    settings.JWT_SECRET = 'its_secret_one'
    yield app
Now all my tests work. I'm a happy man again, and very proud of my now better understanding of the asyncio way of life. Cool!

Not found view doesn't work on Pyramid using traversal

I'm using Pyramid (1.5.7) with traversal, and following the documentation I've tried every possible way to get the "Not Found exception view" working.
from pyramid.view import notfound_view_config, forbidden_view_config, view_config

@notfound_view_config(renderer="error/not_found.jinja2")
def not_found_view(request):
    request.response.status = 404
    return {}

@forbidden_view_config(renderer="error/forbidden.jinja2")
def forbidden_view(request):
    return {}
Using contexts:
from pyramid.view import view_config
from pyramid.httpexceptions import HTTPNotFound, HTTPForbidden, HTTPUnauthorized

@view_config(context=HTTPNotFound, renderer="error/not_found.jinja2")
def not_found_view(request):
    request.response.status = 404
    return {}

@view_config(context=HTTPForbidden, renderer="error/forbidden.jinja2")
def forbidden_view(request):
    return {}
I'm using scan mode, but I've also tried adding a custom function to the configuration:
def main(global_config, **settings):
    config = Configurator()
    config.add_notfound_view(not_found_view)
No luck either; I keep getting the following unhandled exception:
raise HTTPNotFound(msg)
pyramid.httpexceptions.HTTPNotFound: /example-url
Ouch... my bad! I was using a tween which was preventing Pyramid from loading the exception views:
def predispatch_factory(handler, registry):
    # one-time configuration code goes here
    def predispatch(request):
        # code to be executed for each request before
        # the actual application code goes here
        response = handler(request)
        # code to be executed for each request after
        # the actual application code goes here
        return response
    return predispatch
I still don't know why, but after removing this tween everything seems to work as expected.
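A hedged guess for anyone else who hits this: if a custom tween ends up ordered badly relative to Pyramid's built-in exception view tween, the HTTPNotFound raised by the router may never be rendered by the notfound view. Pinning the tween's position explicitly against EXCVIEW might be worth trying (the dotted name below is a placeholder):
from pyramid.tweens import EXCVIEW

# Hypothetical registration: keep the custom tween over EXCVIEW, so
# the exception-view tween stays between it and the application and
# can still render HTTPNotFound/HTTPForbidden as views.
config.add_tween('myapp.tweens.predispatch_factory', over=EXCVIEW)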

Limit parallel processes in CherryPy?

I have a CherryPy server running on a BeagleBone Black. The server generates a simple web page and does local SPI reads/writes (hardware interface). The application is going to be used on a local network with 1-2 clients at a time.
I need to prevent a CherryPy class method from being called twice, i.e. two or more instances running before it completes.
Thoughts?
As saaj commented, a simple threading.Lock() will prevent the handler from being run at the same time by another client. I might also add that using cherrypy.session.acquire_lock() will prevent the same client from running two handlers simultaneously (a short sketch of that follows).
A refresher article on Python locks and synchronization: http://effbot.org/zone/thread-synchronization.htm
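A minimal sketch of the session-lock variant, assuming sessions are enabled and configured for explicit locking (tools.sessions.locking = 'explicit'):
import cherrypy

class App:
    @cherrypy.expose
    def index(self):
        # Take the session lock manually, so the same client cannot
        # run two handlers at once; release it once the work is done.
        cherrypy.session.acquire_lock()
        try:
            # ... do the serialized work here ...
            return '<html></html>'
        finally:
            cherrypy.session.release_lock()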
That said, I would simplify saaj's solution by using a "with" statement in Python, which hides all those lock acquisitions/releases and the try/finally block:
lock = threading.Lock()

@cherrypy.expose
def index(self):
    with lock:
        # do stuff in the handler.
        # this code will only be run by one client at a time
        return '<html></html>'
It is a general synchronization question, though the CherryPy side has a subtlety. CherryPy is a threaded server, so it is sufficient to have an application-level lock, e.g. threading.Lock.
The subtlety is that you can't see the run-or-fail behaviour from within a single browser because of pipelining, keep-alive, or caching. Which one it is is hard to guess, as the behaviour varies between Chromium and Firefox. As far as I can see, CherryPy will try to serialize the processing of requests coming from a single TCP connection, which effectively results in subsequent requests waiting for the active request in a queue. With some trial and error I've found that adding a cache-prevention token leads to the desired behaviour (even though Chromium still sends Connection: keep-alive for XHR where Firefox does not).
If run-or-fail in a single browser isn't important to you, you can safely ignore the previous paragraph and the JavaScript code in the following example.
Update
The cause of the serialization of requests coming from one browser to the same URL doesn't lie on the server side. It's an implementation detail of the browser cache (details). The solution of adding a random query-string parameter, nc, is nonetheless correct.
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import threading
import time

import cherrypy

config = {
    'global' : {
        'server.socket_host' : '127.0.0.1',
        'server.socket_port' : 8080,
        'server.thread_pool' : 8
    }
}

class App:
    lock = threading.Lock()

    @cherrypy.expose
    def index(self):
        return '''<!DOCTYPE html>
            <html>
            <head>
                <title>Lock demo</title>
                <script type='text/javascript' src='http://cdnjs.cloudflare.com/ajax/libs/qooxdoo/3.5.1/q.min.js'></script>
                <script type='text/javascript'>
                    function runTask(wait)
                    {
                        var url = (wait ? '/runOrWait' : '/runOrFail') + '?nc=' + Date.now();
                        var xhr = q.io.xhr(url);
                        xhr.on('loadend', function(xhr)
                        {
                            if(xhr.status == 200)
                            {
                                console.log('success', xhr.responseText)
                            }
                            else if(xhr.status == 503)
                            {
                                console.log('busy');
                            }
                        });
                        xhr.send();
                    }

                    q.ready(function()
                    {
                        q('p a').on('click', function(event)
                        {
                            event.preventDefault();
                            var wait = parseInt(q(event.getTarget()).getData('wait'));
                            runTask(wait);
                        });
                    });
                </script>
            </head>
            <body>
                <p><a href='#' data-wait='0'>Run or fail</a></p>
                <p><a href='#' data-wait='1'>Run or wait</a></p>
            </body>
            </html>
        '''

    def calculate(self):
        time.sleep(8)
        return 'Long task result'

    @cherrypy.expose
    def runOrWait(self, **kwargs):
        self.lock.acquire()
        try:
            return self.calculate()
        finally:
            self.lock.release()

    @cherrypy.expose
    def runOrFail(self, **kwargs):
        locked = self.lock.acquire(False)
        if not locked:
            raise cherrypy.HTTPError(503, 'Task is already running')
        else:
            try:
                return self.calculate()
            finally:
                self.lock.release()

if __name__ == '__main__':
    cherrypy.quickstart(App(), '/', config)

How can I proxy big contents with tornado on Python3?

I am trying to implement an asynchronous HTTP reverse proxy with Tornado on Python 3.
Handler class is as follows:
class RProxyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        backend_url = 'http://backend-host/content.html'  # temporary fixed
        req = tornado.httpclient.HTTPRequest(
            url=backend_url)
        http_client = tornado.httpclient.AsyncHTTPClient()
        http_client.fetch(req, self.backend_callback)

    def backend_callback(self, response):
        self.write(response.body)
        self.finish()
When content.html is small, this code works fine. But with a large content.html, it raises an exception:
ERROR:tornado.general:Reached maximum read buffer size
I found a way to handle large contents with pycurl, though it does not seem to work with Python 3.
In addition, I added the streaming_callback option to HTTPRequest, but the callback isn't called when the backend server disables chunked responses.
How can I handle large contents?
Thanks.
You should be able to pass max_buffer_size to the tornado.httpclient.AsyncHTTPClient() call to set the maximum buffer size. For example, for a 150 MB buffer:
import tornado.ioloop
import tornado.web
from tornado.httpclient import AsyncHTTPClient
from tornado import gen
from tornado.web import asynchronous

class MainHandler(tornado.web.RequestHandler):
    client = AsyncHTTPClient(max_buffer_size=1024*1024*150)

    @gen.coroutine
    @asynchronous
    def get(self):
        response = yield self.client.fetch("http://test.gorillaservers.com/100mb.bin", request_timeout=180)
        self.finish("%s\n" % len(response.body))

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()
Update: Now a full example program.
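If the body is too large to buffer at all, a streaming variant may also be worth trying. The sketch below is my own (not from the answer above): it relays each chunk to the client as it arrives via streaming_callback, assuming the same Tornado-4-era coroutine APIs used elsewhere in this thread and a placeholder backend URL:
import tornado.ioloop
import tornado.web
from tornado import gen
from tornado.httpclient import AsyncHTTPClient, HTTPRequest

class StreamingProxyHandler(tornado.web.RequestHandler):
    @gen.coroutine
    def get(self):
        def on_chunk(chunk):
            # Relay each chunk as it arrives, so the full body never
            # has to fit into the proxy's read buffer.
            self.write(chunk)
            self.flush()

        req = HTTPRequest(
            url='http://backend-host/content.html',  # placeholder backend
            streaming_callback=on_chunk,
            request_timeout=180,
        )
        yield AsyncHTTPClient().fetch(req)
        self.finish()

if __name__ == "__main__":
    app = tornado.web.Application([(r"/", StreamingProxyHandler)])
    app.listen(8888)
    tornado.ioloop.IOLoop.instance().start()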
