How to get script/file name of particular line of code? - python-3.x

I have a generic file, log.py, in which I import Python's logging module and wrap its functions.
Now I need to format the log output as:
(current_script_name) [levelname] message
log.py
import logging
import inspect
import sys
class log(object):
    #file_name = sys.argv[0] #this will give b.py instead of a.py
    file_name = str(inspect.getfile(inspect.currentframe())).split("/")[-1]
    logging.basicConfig(level=logging.INFO, format="("+file_name+") [%(levelname)s] %(message)s")
    def __init__(self):
        pass
    def info(msg):
        logging.info(msg)
    def error(msg):
        logging.error(msg)
    def warning(msg):
        logging.warning(msg)
    def debug(msg):
        logging.debug(msg)
Now I have another file, a.py, which imports the file above and uses it like this:
import log
def some():
    log.info("Hello this is information")
When I call some() from b.py, this produces the output below:
(log.py) [INFO] Hello this is information
But I expect the output below, because the log.info() call is in a.py:
(a.py) [INFO] Hello this is information
Note: I must not pass any extra argument to log.info(msg) in a.py.

One solution would be to pass the file name of the executing script to the logger.
Inside a.py:
log.info("Hello this is information", executing_script=__file__)
(For more information about using the __file__ variable: https://note.nkmk.me/en/python-script-file-path/)
You would have to update all the function definitions in your log class if you do it this way though. Alternatively, you could include the executing script file name in the log class initializer:
class log(object):
    def __init__(self, executing_script):
        self.file_name = executing_script
        # ...
This way you can just reference self.file_name in your logging functions instead of having to pass a value to each one.
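The initializer approach above can be sketched as follows. This is a minimal sketch, not the asker's actual module: the class name Log, the stream parameter, and the use of one named logger per script are assumptions made for illustration.

```python
import logging
import os


class Log:
    """Hypothetical wrapper that prefixes each message with a script name."""

    def __init__(self, executing_script, stream=None):
        # Keep only the base name, e.g. "a.py" instead of "/path/to/a.py".
        self.file_name = os.path.basename(executing_script)
        # One named logger per script; %(name)s then prints the script name.
        self._logger = logging.getLogger(self.file_name)
        self._logger.setLevel(logging.INFO)
        self._logger.propagate = False  # avoid duplicate output via the root logger
        if not self._logger.handlers:
            handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
            handler.setFormatter(
                logging.Formatter("(%(name)s) [%(levelname)s] %(message)s"))
            self._logger.addHandler(handler)

    def info(self, msg):
        self._logger.info(msg)

    def error(self, msg):
        self._logger.error(msg)
```

In a.py this would be used as log = Log(__file__) followed by log.info("Hello this is information"), so the file name is passed once rather than on every call.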

Related

Custom importer to fetch files from web before execution

I'm looking at the documentation here to try and manipulate the way the import statement works. My code uses imports in all forms
import <module>
import <package.module>
from <package> import <module>
from <package.module> import <function>
from <package.module> import *
My goal is: for a certain folder, let's call it myfolder, any import for any module within myfolder (however deep in the structure) should have some preprocessing. No matter how it's imported. Preprocessing in this case is to download the python file from an internal CMS and use that instead of the one on the disk.
I understand the meta_path and path_hooks part, and I think I need to work with the path_hooks to return a FileFinder object to the built-in meta's PathFinder. Here's what I have so far:
import os, sys
class PathhookOverride():
    def __init__(self, path) -> None:
        """
        This will be called when PathFinder() iterates through sys.path_hooks
        """
        relative_path = os.path.relpath(path, os.getcwd())
        if not relative_path.startswith('myfolder'):
            ## We want to override only imports that have myfolder as the first part of the relative path
            raise ImportError
        if os.path.isdir(path):
            ## We know that this is a directory, we don't want to handle this
            print(f'PathhookOverride: {path} is a directory')
            raise ImportError
        dot_separated_path = ".".join(relative_path.split(os.path.sep))
        print(dot_separated_path)
        ## Pull file here later
        cache = sys.path_importer_cache
        raise ImportError ## Go to next path_hook
def change_importer():
    """Inserts the finder into the import machinery"""
    sys.path_hooks.insert(0, PathhookOverride)
from myfolder.package.module import function
Expected output:
When I import my module or function using any of the above formats, I should get the path of the file being imported.
i.e., in the code snippet above, it should print the dot_separated_path:
myfolder.package.module
Actual output:
PathhookOverride: c:\test1\myfolder is a directory
PathhookOverride: c:\test1\myfolder\package is a directory
The override only catches the directories. The paths of the files are never passed to the override hook.
What am I missing? Thanks.
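A note on the output above: path hooks are called with path entries, that is, directories taken from sys.path or from a package's __path__, never with individual module files, which matches what the question observes. The files are located by the finder object that the hook returns. The sketch below returns a FileFinder with a custom loader for directories under myfolder; FetchingLoader, myfolder_hook, and the temporary directory tree are hypothetical, and the download step is only marked by a comment.

```python
import importlib
import os
import sys
import tempfile
from importlib.machinery import SOURCE_SUFFIXES, FileFinder, SourceFileLoader
from pathlib import Path

loaded = []  # dotted names of the modules handled by the custom loader


class FetchingLoader(SourceFileLoader):
    """Hypothetical loader: a real one would fetch the source from the CMS."""

    def exec_module(self, module):
        loaded.append(module.__name__)  # placeholder for the download step
        super().exec_module(module)


def myfolder_hook(path):
    """Path hook: receives directories (sys.path or package __path__ entries)."""
    if "myfolder" not in Path(path).parts or not os.path.isdir(path):
        raise ImportError  # let the next hook handle this entry
    return FileFinder(path, (FetchingLoader, SOURCE_SUFFIXES))


def demo():
    # Build a throwaway myfolder/package/module.py tree to import from.
    root = Path(tempfile.mkdtemp())
    package = root / "myfolder" / "package"
    package.mkdir(parents=True)
    (root / "myfolder" / "__init__.py").write_text("")
    (package / "__init__.py").write_text("")
    (package / "module.py").write_text("def function():\n    return 42\n")

    sys.path_hooks.insert(0, myfolder_hook)
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    from myfolder.package.module import function
    return function()
```

Here the hook sees the myfolder and myfolder/package directories, and the loader sees the dotted names of the modules found inside them. In an existing process, entries already stored in sys.path_importer_cache would keep their old finders until the cache entry is removed.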

Is there a way to exclude lines of code from code coverage measurement using coveragerc?

I am using pytest with a .coveragerc file to measure the code coverage of my source file, which is shown below:
import argparse
import sys
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_dir', type=str, required=True, help='fully qualified directory path')
    # parse_known_args returns a (namespace, unknown_args) tuple
    arguments, _unknown = parser.parse_known_args(sys.argv[1:])
    my_func(arguments.input_dir)
My coveragerc configuration is given below
[report]
exclude_lines =
    if __name__ == .__main__.:
pytest config:
[pytest]
python_files=*_test.py
addopts = --cov='.' --cov-config=.coveragerc
Tests exist that validate the functionality of my_func, but not the lines under the main guard. I would like to exclude all lines within the if __name__ == '__main__': block from code coverage, because that block only initialises the parser and calls another function.
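As an alternative to a regex in .coveragerc, coverage.py recognises an inline pragma: no cover comment by default, and when an excluded line introduces a block (such as an if statement) the whole block is excluded. A minimal sketch follows; the body of my_func and the default value for --input_dir are placeholders that are not in the original. Note that setting exclude_lines replaces the default patterns, so with the configuration above the pragma regex would have to be listed there as well.

```python
import argparse


def my_func(input_dir):
    # Placeholder for the real function, which the tests cover.
    return f"processing {input_dir}"


if __name__ == '__main__':  # pragma: no cover
    # Everything in this block is left out of the coverage report.
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_dir', type=str, default='.',
                        help='fully qualified directory path')
    arguments, _unknown = parser.parse_known_args()
    print(my_func(arguments.input_dir))
```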

How to import variable values from a file that is running?

I am running a Python file, say file1, which imports another Python file, say file2, and calls one of its functions. file2 needs the value of a variable that is defined in file1, and that value is changed at run time before file2 is imported. How do I make file2 access the current value of the variable from file1?
The content of file1 is:
variable = None
if __name__ == '__main__':
    variable = 123
    from file2 import func1
    func1()
The content of file2 is:
from file1 import variable as var
def func1():
    print(var)
When I run file1, I want the function func1 in file2 to print 123, but it prints None. One way to tackle this is to save the content of the variable to an ordinary file whenever it is modified, and to read it back when needed. However, in the application where I use this code the variable is massive, around 300 MB, so I believe writing its content to a text file on every modification would not be efficient enough. How do I do this? (Any suggestions are welcome)
The main script is run with the name __main__, not by its module name. This is also how the if __name__ == '__main__' check works. Importing it by its regular name creates a separate module with the regular content.
If you want to access its attributes, import it as __main__:
from __main__ import variable as var
def func1():
    print(var)
Note that importing __main__ is fragile. On top of duplicating the module, you may end up importing a different module if your program structure changes. If you want to exchange global data, use well-defined module names:
# constants.py
variable = None
# file1.py
if __name__ == '__main__':
    import constants
    constants.variable = 123
    from file2 import func1
    func1()
# file2.py
from constants import variable as var
def func1():
    print(var)
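One caveat with the from ... import form used above: it binds a new name to whatever value the module attribute has at import time, so later reassignments in the shared module are not seen through that name. The sketch below illustrates this with an in-memory stand-in module (demo_constants is a hypothetical name, registered by hand only so the example fits in one file).

```python
import sys
import types

# Stand-in for a constants.py file, created in memory for the demonstration.
demo_constants = types.ModuleType("demo_constants")
demo_constants.variable = None
sys.modules["demo_constants"] = demo_constants

from demo_constants import variable as var  # binds var to the current value (None)
import demo_constants as shared              # keeps a reference to the module

shared.variable = 123  # reassigned after the imports above


def func_snapshot():
    return var  # still the value seen at import time


def func_live():
    return shared.variable  # looked up on the module at call time
```

In the constants.py example above this does not bite, because constants.variable is assigned before file2 is imported; it would if the value were reassigned afterwards, in which case accessing constants.variable inside the function is the safer form.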
Mandatory disclaimer: Ideally, functions do not rely on global variables. Use parameters for passing variables into functions:
# constants.py
variable = None
# file1.py
if __name__ == '__main__':
    from file2 import func1
    func1(123)
# file2.py
from constants import variable
def func1(var=variable):
    print(var)

how to pass command line argument from pytest to code

I am trying to pass arguments from a pytest testcase to a module being tested. For example, using the main.py from Python boilerplate, I can run it from the command line as:
$ python3 main.py
usage: main.py [-h] [-f] [-n NAME] [-v] [--version] arg
main.py: error: the following arguments are required: arg
$ python3 main.py xx
hello world
Namespace(arg='xx', flag=False, name=None, verbose=0)
Now I am trying to do the same with pytest, with the following test_sample.py
(NOTE: the main.py requires command line arguments. But these arguments need to be hardcoded in a specific test, they should not be command line arguments to pytest. The pytest testcase only needs to send these values as command line arguments to main.main().)
import main
def test_case01():
    main.main()
    # I don't know how to pass 'xx' to main.py,
    # so for now I just have one test with no arguments
and running the test as:
pytest -vs test_sample.py
This fails with error messages. I tried to look at other answers for a solution but could not use them. For example, 42778124 suggests creating a separate file run.py, which is not a desirable thing to do. And 48359957 and 40880259 seem to deal more with command line arguments for pytest than with passing command line arguments to the main code.
I don't need pytest to take command line arguments; the arguments can be hardcoded inside a specific test. But these arguments need to be passed as arguments to the main code. Can you give me a test_sample.py that calls main.main() with some arguments?
If you can't modify the signature of the main method, you can use the monkeypatching technique to temporarily replace the arguments with the test data. Example: imagine writing tests for the following program:
import argparse
def main():
    parser = argparse.ArgumentParser(description='Greeter')
    parser.add_argument('name')
    args = parser.parse_args()
    return f'hello {args.name}'
if __name__ == '__main__':
    print(main())
When running it from the command line:
$ python greeter.py world
hello world
To test the main function with some custom data, monkeypatch sys.argv:
import sys
import greeter
def test_greeter(monkeypatch):
    with monkeypatch.context() as m:
        m.setattr(sys, 'argv', ['greeter', 'spam'])
        assert greeter.main() == 'hello spam'
When combined with the parametrizing technique, this makes it easy to test different arguments without modifying the test function:
import sys
import pytest
import greeter
@pytest.mark.parametrize('name', ['spam', 'eggs', 'bacon'])
def test_greeter(monkeypatch, name):
    with monkeypatch.context() as m:
        m.setattr(sys, 'argv', ['greeter', name])
        assert greeter.main() == 'hello ' + name
Now you get three tests, one for each of the arguments:
$ pytest -v test_greeter.py
...
test_greeter.py::test_greeter[spam] PASSED
test_greeter.py::test_greeter[eggs] PASSED
test_greeter.py::test_greeter[bacon] PASSED
A good practice might be to structure the code like this, instead of reading the arguments inside the main function:
# main.py
import argparse
def main(arg1):
    return arg1
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='My awesome script')
    parser.add_argument('word', help='a word')
    args = parser.parse_args()
    main(args.word)
This way, your main function can easily be tested in pytest:
import main
def test_case01():
    main.main(your_hardcoded_arg)
I am not sure you can call a Python script to test it other than through the os module, which might not be good practice.
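A middle ground between the two answers, sketched under the assumption that main's signature may be changed: let main accept an optional argument list and hand it to parse_args. When argv is None, argparse falls back to sys.argv[1:], so command-line behaviour is unchanged. The greeter program is reused from the first answer; the argv parameter is an addition for illustration.

```python
import argparse


def main(argv=None):
    parser = argparse.ArgumentParser(description='Greeter')
    parser.add_argument('name')
    # argv=None makes argparse read sys.argv[1:], as before.
    args = parser.parse_args(argv)
    return f'hello {args.name}'


def test_greeter():
    # Arguments are hardcoded in the test; no monkeypatching is needed.
    assert main(['spam']) == 'hello spam'
```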

Using the globals argument of timeit.timeit

I am attempting to run timeit.timeit in the following class:
from contextlib import suppress
from pathlib import Path
import subprocess
from timeit import timeit
class BackupVolume():
    '''
    Backup a file system on a volume using tar
    '''
    targetFile = "bd.tar.gz"
    srcPath = Path("/BulkData")
    excludes = ["--exclude=VirtualBox VMs/*", # Exclude all the VM stuff
                "--exclude=*.tar*"] # Exclude this tar file
    @classmethod
    def backupData(cls, targetPath="~"): # pylint: disable=invalid-name
        '''
        Runs tar to backup the data in /BulkData so we can reorganize that
        volume. Deletes any old copy of the backup repository.
        Parameters:
        :param str targetPath: Where the backup should be created.
        '''
        # pylint: disable=invalid-name
        # targetPath is a str, so wrap it in Path before joining with "/"
        tarFile\
            = Path((Path(targetPath).expanduser() /
                    cls.targetFile).resolve())
        with suppress(FileNotFoundError):
            tarFile.unlink()
        timeit('subprocess.run(["tar", "-cf", tarFile.as_posix(),'
               'cls.excludes[0], cls.excludes[1], cls.srcPath.as_posix()])',
               number=1, globals=something)
The problem I have is that inside timeit() it cannot interpret subprocess. I believe that the globals argument to timeit() should help but I have no idea how to specify the module namespace. Can someone show me how?
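I think in your case globals=globals() in the timeit call would make subprocess available. Note, however, that the timed statement also refers to the local names tarFile and cls, which globals() does not contain, so those have to be part of the namespace as well (see the custom namespace scenario below).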
Explanation
The globals argument specifies a namespace in which to execute the code. Because subprocess is imported at module level (outside the function, and even outside the class), you can use globals(), which returns the dictionary of the current module's namespace. You can find more information in the documentation.
Super simple example
This example covers three scenarios:
1. Need to access globals
2. Need to access locals
3. Custom namespace
Code to follow the example:
import subprocess
from timeit import timeit
import math
class ExampleClass():
    def performance_glob(self):
        return timeit("subprocess.run('ls')", number = 1, globals = globals())
    def performance_loc(self):
        a = 69
        b = 42
        return timeit("a * b", number = 1, globals = locals())
    def performance_mix(self):
        a = 69
        return timeit("math.sqrt(a)", number = 1, globals = {'math': math, 'a': a})
In performance_glob you are timing something that needs a global import, the subprocess module. If you don't pass the globals namespace, you'll get an error such as NameError: name 'subprocess' is not defined.
Conversely, if you pass globals() for a statement that depends on local values, as in performance_loc, the variables a and b needed by the timed statement won't be in scope. That's why you pass locals() there.
The last one is the general scenario, where you need both local variables of the function and module-level imports. Since the globals parameter accepts any dictionary, you can build a custom one that provides just the necessary keys.
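When the timed statement needs both module-level names and local variables, as in the question (subprocess is global, tarFile and cls are local), one option is to merge the two namespaces; another is to pass a callable, which timeit also accepts. A minimal sketch with hypothetical function names:

```python
import math
from timeit import timeit


def time_with_merged_namespace():
    a = 69
    # Locals override globals on name clashes because they come last.
    namespace = {**globals(), **locals()}
    return timeit("math.sqrt(a)", number=1, globals=namespace)


def time_with_callable():
    a = 69
    # A lambda closes over local and global names, so no namespace is needed.
    return timeit(lambda: math.sqrt(a), number=1)
```

Both functions return the elapsed time in seconds as a float. The callable form adds the overhead of one extra function call per iteration, which matters only for very short statements.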
