seek(0) on Linux /proc/sys/* pseudo-files? - linux

Are there documented standards for the semantics of Linux /proc/sys file descriptors?
Is it proper to use seek(0) on them?
Here's a piece of code which seems to work fine for my tests:
#!/usr/bin/python
from time import sleep
with open('/proc/sys/fs/file-nr','r') as f:
while True:
d = f.readline()
print d.split()[0]
f.seek(0)
sleep(1)
This seems to work. However, I'd like to know if that's the right way to do such things or if I should loop over open() ... read() ... close()
In this particular case I'll be using this with the collectd Python plugin ... so this particular code would be running indefinitely in a daemon. However, I'm interested in the answer for the general class of questions.
(Incidentally is there an "open files/inodes" module/plugin for collectd)?

Yes, it is proper to use lseek(2) and fseek(3) on files on proc pseudo-file system. Calls which aren't appropriate will result and error, thus if python seek (calling presumably lseek/fseek underneath) works, it's appropriate.

Related

How to work around opencv bug crashing on imshow when not in main thread [duplicate]

This has been answered for Android, Objective C and C++ before, but apparently not for Python. How do I reliably determine whether the current thread is the main thread? I can think of a few approaches, none of which really satisfy me, considering it could be as easy as comparing to threading.MainThread if it existed.
Check the thread name
The main thread is instantiated in threading.py like this:
Thread.__init__(self, name="MainThread")
so one could do
if threading.current_thread().name == 'MainThread'
but is this name fixed? Other codes I have seen checked whether MainThread is contained anywhere in the thread's name.
Store the starting thread
I could store a reference to the starting thread the moment the program starts up, i.e. while there are no other threads yet. This would be absolutely reliable, but way too cumbersome for such a simple query?
Is there a more concise way of doing this?
The problem with threading.current_thread().name == 'MainThread' is that one can always do:
threading.current_thread().name = 'MyName'
assert threading.current_thread().name == 'MainThread' # will fail
Perhaps the following is more solid:
threading.current_thread().__class__.__name__ == '_MainThread'
Having said that, one may still cunningly do:
threading.current_thread().__class__.__name__ = 'Grrrr'
assert threading.current_thread().__class__.__name__ == '_MainThread' # will fail
But this option still seems better; "after all, we're all consenting adults here."
UPDATE:
Python 3.4 introduced threading.main_thread() which is much better than the above:
assert threading.current_thread() is threading.main_thread()
UPDATE 2:
For Python < 3.4, perhaps the best option is:
isinstance(threading.current_thread(), threading._MainThread)
The answers here are old and/or bad, so here's a current solution:
if threading.current_thread() is threading.main_thread():
...
This method is available since Python 3.4+.
If, like me, accessing protected attributes gives you the Heebie-jeebies, you may want an alternative for using threading._MainThread, as suggested. In that case, you may exploit the fact that only the Main Thread can handle signals, so the following can do the job:
import signal
def is_main_thread():
try:
# Backup the current signal handler
back_up = signal.signal(signal.SIGINT, signal.SIG_DFL)
except ValueError:
# Only Main Thread can handle signals
return False
# Restore signal handler
signal.signal(signal.SIGINT, back_up)
return True
Updated to address potential issue as pointed out by #user4815162342.

Using 'with' with 'next()' in python 3 [duplicate]

I came across the Python with statement for the first time today. I've been using Python lightly for several months and didn't even know of its existence! Given its somewhat obscure status, I thought it would be worth asking:
What is the Python with statement
designed to be used for?
What do
you use it for?
Are there any
gotchas I need to be aware of, or
common anti-patterns associated with
its use? Any cases where it is better use try..finally than with?
Why isn't it used more widely?
Which standard library classes are compatible with it?
I believe this has already been answered by other users before me, so I only add it for the sake of completeness: the with statement simplifies exception handling by encapsulating common preparation and cleanup tasks in so-called context managers. More details can be found in PEP 343. For instance, the open statement is a context manager in itself, which lets you open a file, keep it open as long as the execution is in the context of the with statement where you used it, and close it as soon as you leave the context, no matter whether you have left it because of an exception or during regular control flow. The with statement can thus be used in ways similar to the RAII pattern in C++: some resource is acquired by the with statement and released when you leave the with context.
Some examples are: opening files using with open(filename) as fp:, acquiring locks using with lock: (where lock is an instance of threading.Lock). You can also construct your own context managers using the contextmanager decorator from contextlib. For instance, I often use this when I have to change the current directory temporarily and then return to where I was:
from contextlib import contextmanager
import os
#contextmanager
def working_directory(path):
current_dir = os.getcwd()
os.chdir(path)
try:
yield
finally:
os.chdir(current_dir)
with working_directory("data/stuff"):
# do something within data/stuff
# here I am back again in the original working directory
Here's another example that temporarily redirects sys.stdin, sys.stdout and sys.stderr to some other file handle and restores them later:
from contextlib import contextmanager
import sys
#contextmanager
def redirected(**kwds):
stream_names = ["stdin", "stdout", "stderr"]
old_streams = {}
try:
for sname in stream_names:
stream = kwds.get(sname, None)
if stream is not None and stream != getattr(sys, sname):
old_streams[sname] = getattr(sys, sname)
setattr(sys, sname, stream)
yield
finally:
for sname, stream in old_streams.iteritems():
setattr(sys, sname, stream)
with redirected(stdout=open("/tmp/log.txt", "w")):
# these print statements will go to /tmp/log.txt
print "Test entry 1"
print "Test entry 2"
# back to the normal stdout
print "Back to normal stdout again"
And finally, another example that creates a temporary folder and cleans it up when leaving the context:
from tempfile import mkdtemp
from shutil import rmtree
#contextmanager
def temporary_dir(*args, **kwds):
name = mkdtemp(*args, **kwds)
try:
yield name
finally:
shutil.rmtree(name)
with temporary_dir() as dirname:
# do whatever you want
I would suggest two interesting lectures:
PEP 343 The "with" Statement
Effbot Understanding Python's
"with" statement
1.
The with statement is used to wrap the execution of a block with methods defined by a context manager. This allows common try...except...finally usage patterns to be encapsulated for convenient reuse.
2.
You could do something like:
with open("foo.txt") as foo_file:
data = foo_file.read()
OR
from contextlib import nested
with nested(A(), B(), C()) as (X, Y, Z):
do_something()
OR (Python 3.1)
with open('data') as input_file, open('result', 'w') as output_file:
for line in input_file:
output_file.write(parse(line))
OR
lock = threading.Lock()
with lock:
# Critical section of code
3.
I don't see any Antipattern here.
Quoting Dive into Python:
try..finally is good. with is better.
4.
I guess it's related to programmers's habit to use try..catch..finally statement from other languages.
The Python with statement is built-in language support of the Resource Acquisition Is Initialization idiom commonly used in C++. It is intended to allow safe acquisition and release of operating system resources.
The with statement creates resources within a scope/block. You write your code using the resources within the block. When the block exits the resources are cleanly released regardless of the outcome of the code in the block (that is whether the block exits normally or because of an exception).
Many resources in the Python library that obey the protocol required by the with statement and so can used with it out-of-the-box. However anyone can make resources that can be used in a with statement by implementing the well documented protocol: PEP 0343
Use it whenever you acquire resources in your application that must be explicitly relinquished such as files, network connections, locks and the like.
Again for completeness I'll add my most useful use-case for with statements.
I do a lot of scientific computing and for some activities I need the Decimal library for arbitrary precision calculations. Some part of my code I need high precision and for most other parts I need less precision.
I set my default precision to a low number and then use with to get a more precise answer for some sections:
from decimal import localcontext
with localcontext() as ctx:
ctx.prec = 42 # Perform a high precision calculation
s = calculate_something()
s = +s # Round the final result back to the default precision
I use this a lot with the Hypergeometric Test which requires the division of large numbers resulting form factorials. When you do genomic scale calculations you have to be careful of round-off and overflow errors.
An example of an antipattern might be to use the with inside a loop when it would be more efficient to have the with outside the loop
for example
for row in lines:
with open("outfile","a") as f:
f.write(row)
vs
with open("outfile","a") as f:
for row in lines:
f.write(row)
The first way is opening and closing the file for each row which may cause performance problems compared to the second way with opens and closes the file just once.
See PEP 343 - The 'with' statement, there is an example section at the end.
... new statement "with" to the Python
language to make
it possible to factor out standard uses of try/finally statements.
points 1, 2, and 3 being reasonably well covered:
4: it is relatively new, only available in python2.6+ (or python2.5 using from __future__ import with_statement)
The with statement works with so-called context managers:
http://docs.python.org/release/2.5.2/lib/typecontextmanager.html
The idea is to simplify exception handling by doing the necessary cleanup after leaving the 'with' block. Some of the python built-ins already work as context managers.
Another example for out-of-the-box support, and one that might be a bit baffling at first when you are used to the way built-in open() behaves, are connection objects of popular database modules such as:
sqlite3
psycopg2
cx_oracle
The connection objects are context managers and as such can be used out-of-the-box in a with-statement, however when using the above note that:
When the with-block is finished, either with an exception or without, the connection is not closed. In case the with-block finishes with an exception, the transaction is rolled back, otherwise the transaction is commited.
This means that the programmer has to take care to close the connection themselves, but allows to acquire a connection, and use it in multiple with-statements, as shown in the psycopg2 docs:
conn = psycopg2.connect(DSN)
with conn:
with conn.cursor() as curs:
curs.execute(SQL1)
with conn:
with conn.cursor() as curs:
curs.execute(SQL2)
conn.close()
In the example above, you'll note that the cursor objects of psycopg2 also are context managers. From the relevant documentation on the behavior:
When a cursor exits the with-block it is closed, releasing any resource eventually associated with it. The state of the transaction is not affected.
In python generally “with” statement is used to open a file, process the data present in the file, and also to close the file without calling a close() method. “with” statement makes the exception handling simpler by providing cleanup activities.
General form of with:
with open(“file name”, “mode”) as file_var:
processing statements
note: no need to close the file by calling close() upon file_var.close()
The answers here are great, but just to add a simple one that helped me:
with open("foo.txt") as file:
data = file.read()
open returns a file
Since 2.6 python added the methods __enter__ and __exit__ to file.
with is like a for loop that calls __enter__, runs the loop once and then calls __exit__
with works with any instance that has __enter__ and __exit__
a file is locked and not re-usable by other processes until it's closed, __exit__ closes it.
source: http://web.archive.org/web/20180310054708/http://effbot.org/zone/python-with-statement.htm

How to idiomatically end an Asyncio Operation in Python

I'm working on code where I have a long running shell command whose output is sent to disk. This command will generate hundreds of GBs per file. I have successfully written code that calls this command asynchronously and successfully yields control (awaits) for it to complete.
I also have code that can asynchronously read that file as it is being written to so that I can process the data contained therein. The problem I'm running into is that I can't find a way to stop the file reader once the shell command completes.
I guess I'm looking for some sort of interrupt I can pass into my writer function once the shell command ends that I can use to tell it to close the file and wrap up the event loop.
Here is my writer function. Right now, it runs forever waiting for new data to be written to the file.
import asyncio
PERIOD = 0.5
async def readline(f):
while True:
data = f.readline()
if data:
return data
await asyncio.sleep(PERIOD)
async def read_zmap_file():
with open('/data/largefile.json'
, mode = 'r+t'
, encoding = 'utf-8'
) as f:
i = 0
while True:
line = await readline(f)
print('{:>10}: {!s}'.format(str(i), line.strip()))
i += 1
loop = asyncio.get_event_loop()
loop.run_until_complete(read_zmap_file())
loop.close()
If my approach is off, please let me know. I'm relatively new to asynchronous programming. Any help would be appreciated.
So, I'd do something like
reader = loop.create_task(read_zmap_file)
Then in your code that manages the shell process once the shell process exits, you can do
reader.cancel()
You can do
loop.run_until_complete(reader)
Alternatively, you could simply set a flag somewhere and use that flag in your while statement. You don't need to use asyncio primitives when something simpler works.
That said, I'd look into ways that your reader can avoid the periodic sleep. if your reader will be able to keep up with the shell command, I'd recommend a pipe, because pipes can be used with select (and thus added to an event loop). Then in your reader you can write to to a file if you need a permanent log. I realize the discussion of avoiding the periodic sleep is beyond the scope of this question, and I don't want to go into more detail than I have, but you di ask for hints on how best to approach async programming.

What makes Python3's print function thread safe?

I've seen on various mailing lists and forums that people keep mentioning that the print function in Python 3 is thread safe. From my own testing, I see no reason to doubt that.
import threading
import time
import random
def worker(letter):
print(letter * 50)
threads = [threading.Thread(target=worker, args=(let,)) for let in "ABCDEFGHIJ"]
for t in threads:
t.start()
for t in threads:
t.join()
When I run it with Python 3, even though some of the lines may be out of order, they are still always on their own lines. With Python 2, however, the output is fairly sporadic. Some lines are joined together or indented. This is also the case when I from __future__ import print_function
Python 2.7 builtin_print <- not thread safe
Python 3.6 builtin_print <- thread safe?
I'm just trying to understand WHY this is the case?
For Python 3.7: The print() function is a builtin, it by default sends output to sys.stdout, the documentation of which says, among other things:
When interactive, stdout and stderr streams are line-buffered.
Otherwise, they are block-buffered like regular text files. You can
override this value with the -u command-line option.
So its really the combination of interactive mode and sys.stderr that is responsible for the behaviour of the print function as demonstrated in the example.
And we can get closer to the truth if the worker function in your example program is changed to
def worker(letter):
print(letter*25, letter*25, sep='\n')
then we get outputs similar to the one below, which clearly shows that print in itself is not thread safe, what you can expect is that individual lines do not get interleaved with each other.
DDDDDDDDDDDDDDDDDDDDDDDDDJJJJJJJJJJJJJJJJJJJJJJJJJ
JJJJJJJJJJJJJJJJJJJJJJJJJ
DDDDDDDDDDDDDDDDDDDDDDDDDGGGGGGGGGGGGGGGGGGGGGGGGG
GGGGGGGGGGGGGGGGGGGGGGGGGAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAHHHHHHHHHHHHHHHHHHHHHHHHH
HHHHHHHHHHHHHHHHHHHHHHHHH
FFFFFFFFFFFFFFFFFFFFFFFFF
IIIIIIIIIIIIIIIIIIIIIIIIICCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
IIIIIIIIIIIIIIIIIIIIIIIII
EEEEEEEEEEEEEEEEEEEEEEEEE
EEEEEEEEEEEEEEEEEEEEEEEEEFFFFFFFFFFFFFFFFFFFFFFFFF
BBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBB
So ultimately thread safety of print is determined by the buffering strategy used.

Jython How to stop script from thread?

I'm looking for some exit code that will be run from a thread but will be able to kill the main script. It's in Jython but I can't use java.lang.System.exit() because I still want the Java app I'm in to run, and sys.exit() isn't working. Ideally I would like to output a message then exit.
My code uses the threading.Timer function to run a function after a certain period of time. Here I'm using it to end a for loop that is executing for longer than 1 sec. Here is my code:
import threading
def exitFunct():
#exit code here
t = threading.Timer(1.0, exitFunct)
t.start()
for i in range(1, 2000):
print i
Well, if you had to, you could call mainThread.stop(). But you shouldn't.
This article explains why what you're trying to do is considered a bad idea.
If you want to kill the current process and you don't care about flushing IO buffers or reseting the terminal, you can use os._exit().
I don't know why they made this so hard.

Resources