FileNotFoundError But The File Is There: Cryptography Edition - python-3.x

I'm working on a script that takes a checksum and directory as inputs.
Without too much background, I'm looking for 'malware' (ie. a flag) in a directory of executables. I'm given the SHA512 sum of the 'malware'. I've gotten it to work (I found the flag), but I ran into an issue with the output after generalizing the function for different cryptographic protocols, encodings, and individual files instead of directories:
FileNotFoundError: [Errno 2] No such file or directory: 'lessecho'
There is indeed a file lessecho in the directory, and as it happens, is close to the file that returns the actual flag. Probably a coincidence. Probably.
Below is my Python script:
#!/usr/bin/python3
import hashlib, sys, os
"""
### TO DO ###
Add other encryption techniques
Include file read functionality
"""
def main(to_check = sys.argv[1:]):
    dir_to_check = to_check[0]
    hash_to_check = to_check[1]
    BUF_SIZE = 65536
    for f in os.listdir(dir_to_check):
        sha256 = hashlib.sha256()
        with open(f, 'br') as f:  # <--- line where the issue occurs
            while True:
                data = f.read(BUF_SIZE)
                if not data:
                    break
                sha256.update(data)
            f.close()
        if sha256.hexdigest() == hash_to_check:
            return f

if __name__ == '__main__':
    k = main()
    print(k)
Credit to Randall for his answer here
Here are some humble trinkets from my native land in exchange for your wisdom.

Your listdir call is giving you bare filenames (e.g. lessecho), but that is within the dir_to_check directory (which I'll call foo for convenience). To open the file, you need to join those two parts of the path back together, to get a proper path (e.g. foo/lessecho). The os.path.join function does exactly that:
for f in os.listdir(dir_to_check):
    sha256 = hashlib.sha256()
    with open(os.path.join(dir_to_check, f), 'br') as f:  # add os.path.join call here!
        ...
There are a few other issues in the code, unrelated to your current error. One is that you're using the same variable name f for both the file name (from the loop) and file object (in the with statement). Pick a different name for one of them, since you need both available (because I assume you intend return f to return the filename, not the recently closed file object).
And speaking of the closed file, you're actually closing the file object twice. The first one happens at the end of the with statement (that's why you use with). The second is your manual call to f.close(). You don't need the manual call at all.
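Putting those fixes together, here is a minimal sketch of the corrected loop (reusing dir_to_check, hash_to_check and BUF_SIZE from the question; the names name and fh are my own):
for name in os.listdir(dir_to_check):
    sha256 = hashlib.sha256()
    with open(os.path.join(dir_to_check, name), 'rb') as fh:
        while True:
            data = fh.read(BUF_SIZE)
            if not data:
                break
            sha256.update(data)
    # no manual close needed; the with block already closed fh
    if sha256.hexdigest() == hash_to_check:
        return name  # the filename, not a closed file object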

Related

How to copy from zip file to a folder without unzipping it?

How to make this code work?
There is a zip file with folders and .png files in it. Folder ".\icons_by_year" is empty. I need to get every file one by one without unzipping it and copy to the root of the selected folder (so no extra folders made).
class ArrangerOutZip(Arranger):
    def __init__(self):
        self.base_source_folder = '\\icons.zip'
        self.base_output_folder = ".\\icons_by_year"

    def proceed(self):
        self.create_and_copy()

    def create_and_copy(self):
        reg_pattern = re.compile('.+\.\w{1,4}$')
        f = open(self.base_source_folder, 'rb')
        zfile = zipfile.ZipFile(f)
        for cont in zfile.namelist():
            if reg_pattern.match(cont):
                with zfile.open(cont) as file:
                    shutil.copyfileobj(file, self.base_output_folder)
        zfile.close()
        f.close()

arranger = ArrangerOutZip()
arranger.proceed()
shutil.copyfileobj uses file objects for source and destination files. To open the destination you need to construct a file path for it. pathlib is part of the standard Python library and is a nice way to handle file paths. And ZipFile.extract does some of the work of creating intermediate output directories for you (plus sets file metadata) and can be used instead of copyfileobj.
One risk of unzipping files is that they can contain absolute or relative paths outside of the target directory you intend (e.g., "../../badvirus.exe"). extract is a bit too lax about that - putting those files in the root of the target directory - so I wrote a little something to reject the whole zip if you are being messed with.
With a few tweaks to make this a testable program:
from pathlib import Path
import re
import zipfile
#import shutil

#class ArrangerOutZip(Arranger):
class ArrangerOutZip:
    def __init__(self, base_source_folder, base_output_folder):
        self.base_source_folder = Path(base_source_folder).resolve(strict=True)
        self.base_output_folder = Path(base_output_folder).resolve()

    def proceed(self):
        self.create_and_copy()

    def create_and_copy(self):
        """Unzip files matching pattern to base_output_folder, raising
        ValueError if any resulting paths are outside of that folder.
        Output folder created if it does not exist."""
        reg_pattern = re.compile(r'.+\.\w{1,4}$')
        with open(self.base_source_folder, 'rb') as f:
            with zipfile.ZipFile(f) as zfile:
                wanted_files = [cont for cont in zfile.namelist()
                                if reg_pattern.match(cont)]
                rebased_files = self._rebase_paths(wanted_files,
                                                   self.base_output_folder)
                for cont, rebased in zip(wanted_files, rebased_files):
                    print(cont, rebased, rebased.parent)
                    # option 1: use shutil
                    #rebased.parent.mkdir(parents=True, exist_ok=True)
                    #with zfile.open(cont) as file, open(rebased, 'wb') as outfile:
                    #    shutil.copyfileobj(file, outfile)
                    # option 2: zipfile does the work for you
                    zfile.extract(cont, self.base_output_folder)

    @staticmethod
    def _rebase_paths(pathlist, target_dir):
        """Rebase relative file paths to target directory, raising
        ValueError if any resulting paths are not within target_dir"""
        target = Path(target_dir).resolve()
        newpaths = []
        for path in pathlist:
            newpath = target.joinpath(path).resolve()
            newpath.relative_to(target)  # raises ValueError if not subpath
            newpaths.append(newpath)
        return newpaths

#arranger = ArrangerOutZip('\\icons.zip', '.\\icons_by_year')
import sys
try:
    arranger = ArrangerOutZip(sys.argv[1], sys.argv[2])
    arranger.proceed()
except IndexError:
    print("usage: test.py zipfile targetdir")
I'd take a look at the zipfile library's getinfo() and also at zipfile.Path for construction, since the constructor can also use paths that way if you intend to do any creation. zipfile.Path constructs a path-like object pointing into the archive, and it appears to be based on pathlib. Assuming you don't need to create zip files, you can ignore zipfile.Path.
However, that's not exactly what I wanted to point out. Rather, consider the following:
ZipFile.getinfo()
There is a collection of examples that I think gets at this exact situation here:
https://www.programcreek.com/python/example/104991/zipfile.getinfo
These examples retrieve a path using getinfo(). It's also clear that NOT every zip file entry has the info.
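For illustration, a minimal sketch of pulling metadata for one entry via ZipFile.getinfo() (the archive and entry names here are hypothetical, and getinfo() raises KeyError for names not in the archive):
import zipfile

with zipfile.ZipFile('icons.zip') as zf:   # hypothetical archive
    info = zf.getinfo('2019/icon.png')     # hypothetical entry name
    print(info.filename, info.file_size, info.date_time)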

Python: command line, sys.argv, "if __name__ == '__main__' "

I have a moderate amount of experience using Python in Jupyter but am pretty clueless about how to use the command line. I have this prompt for a homework assignment -- I understand how the algorithms work, but I don't know how to format everything so it works from the command line in the way that is specified.
The prompt:
Question 1: 80 points
Input: a text file that specifies a travel problem (see travel-input.txt
for the format) and a search algorithm
(more details are below).
python map.py [file] [search] should read
the travel problem from “file” and run the “search” algorithm to find
a solution. It will print the solution and its cost.
search is one of
[DFTS, DFGS, BFTS, BFGS, UCTS, UCGS, GBFTS, GBFGS, ASTS, ASGS]
Here is the template I was given:
from search import ...  # TODO import the necessary classes and methods
import sys

if __name__ == '__main__':
    input_file = sys.argv[1]
    search_algo_str = sys.argv[2]

    # TODO implement
    goal_node = ...  # TODO call the appropriate search function with appropriate parameters

    # Do not change the code below.
    if goal_node is not None:
        print("Solution path", goal_node.solution())
        print("Solution cost", goal_node.path_cost)
    else:
        print("No solution was found.")
So as far as python map.py [file] [search] goes, 'file' refers to travel-input.txt and 'search' refers to one of DFTS, DFGS, BFTS,... etc - a user-specified choice. My questions:
Where do I put my search functions? Should they all just be back-to-back in the same block of code?
How do I get the command line to recognize each function from its four or five-letter code? Is it just the name of the function? If I call it just using those letters, how can the functions receive input?
Do I need to reference the input file anywhere in my code?
Does it matter where I save my files in order for them to be accessible from the command line - .py files, travel-input.txt, etc? I've tried accessing them from the command line, with no success.
Thanks for the help!
The function definitions go before the if __name__ == "__main__" block. To select the correct function you can put them in a dict and use the four-letter abbreviations as keys, i.e.
def dfts_search(...):
    ...

def dfgs_search(...):
    ...

...

if __name__ == "__main__":
    input_file = sys.argv[1]
    search_algo_str = sys.argv[2]
    search_dict = {"DFTS": dfts_search, "DFGS": dfgs_search, ...}
    try:
        func = search_dict[search_algo_str]
        result = func(...)
    except KeyError:
        print(f'{search_algo_str} is an unknown search algorithm')
Not sure what you mean by reference, but input_file already refers to the input file. You will need to write a function to read the file and process the contents.
The location of the files shouldn't matter too much. Putting everything in the same directory is probably easiest. In the command window, just cd to the directory where the files are located and run the script as described in the assignment.

Python avoid partial writes with non-blocking write to named pipe

I am running Python 3.8 on Linux.
In my script, I create a named pipe, and open it as follows:
import os
import posix
import time
file_name = 'fifo.txt'
os.mkfifo(file_name)
f = posix.open(file_name, os.O_RDWR | os.O_NONBLOCK)
os.set_blocking(f, False)
Without yet having opened the file for reading elsewhere (for instance, with cat), I start to write to the file in a loop.
base_line = 'abcdefghijklmnopqrstuvwxyz'
s = base_line * 10000 + '\n'
while True:
    try:
        posix.write(f, s.encode())
    except BlockingIOError as e:
        print("Exception occurred: {}".format(e))
        time.sleep(.5)
When I then go to read from the named pipe with cat, I find that a partial write took place.
I am confused how I can know how many bytes were written in this instance. Since the exception was thrown, I do not have access to the return value (num bytes written). The documentation suggests that BlockingIOError has a property called characters_written, however when I try to access this field an AttributeError is raised.
In summary: How can I either avoid this partial write in the first place, or at least know how much was partially written in this instance?
os.write performs an unbuffered write. The docs state that BlockingIOError only has a characters_written attribute when a buffered write operation would block.
If any bytes were successfully written before the pipe became full, that number of bytes will be returned from os.write. Otherwise, you'll get an exception. Of course, something like a drive failure will also cause an exception, even if some bytes were written. This is no different from how POSIX write works, except instead of returning -1 on error, an exception is raised.
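To make that concrete, here is a minimal sketch that tracks partial writes with raw os.write, reusing the question's fifo setup (the retry delay is arbitrary):
import os
import time

fd = os.open('fifo.txt', os.O_RDWR | os.O_NONBLOCK)  # assumes the fifo already exists
data = (b'abcdefghijklmnopqrstuvwxyz' * 10000) + b'\n'

written = 0
while written < len(data):
    try:
        written += os.write(fd, data[written:])  # returns the number of bytes actually written
    except BlockingIOError:
        time.sleep(.5)  # pipe full and nothing written; wait and retry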
If you don't like dealing with the exception, you can use a wrapper around the file descriptor, such as an io.FileIO object. I've modified your code, since it tries to write the entire buffer every time you loop back to the os.write call (if it failed once, it will fail every time):
import io
import os
import time

base_line = 'abcdefghijklmnopqrstuvwxyz'
data = (base_line * 10000 + '\n').encode()
file_name = 'fifo.txt'
os.mkfifo(file_name)
fd = os.open(file_name, os.O_RDWR | os.O_NONBLOCK)
# os.O_NONBLOCK makes os.set_blocking(fd, False) unnecessary.

with io.FileIO(fd, 'wb') as f:
    written = 0
    while written < len(data):
        n = f.write(data[written:])
        if n is None:
            time.sleep(.5)
        else:
            written += n
BTW, you might use the selectors module instead of time.sleep; I noticed a slight delay when trying to read from the pipe because of the sleep delay, which shouldn't happen if you use the selectors module:
import selectors

with io.FileIO(fd, 'wb') as f:
    written = 0
    sel = selectors.DefaultSelector()
    sel.register(f, selectors.EVENT_WRITE)
    while written < len(data):
        n = f.write(data[written:])
        if n is None:
            # Wait here until we can start writing again.
            sel.select()
        else:
            written += n
    sel.unregister(f)
Some useful information can also be found in the answer to POSIX named pipe (fifo) drops record in nonblocking mode.

Is there a method in Python to "check" if a textfile has been modified or appended? [duplicate]

I have a log file being written by another process which I want to watch for changes. Each time a change occurs I'd like to read the new data in to do some processing on it.
What's the best way to do this? I was hoping there'd be some sort of hook from the PyWin32 library. I've found the win32file.FindNextChangeNotification function but have no idea how to ask it to watch a specific file.
If anyone's done anything like this I'd be really grateful to hear how...
[Edit] I should have mentioned that I was after a solution that doesn't require polling.
[Edit] Curses! It seems this doesn't work over a mapped network drive. I'm guessing windows doesn't 'hear' any updates to the file the way it does on a local disk.
Did you try using Watchdog?
Python API library and shell utilities to monitor file system events.
Directory monitoring made easy with
A cross-platform API.
A shell tool to run commands in response to directory changes.
Get started quickly with a simple example in Quickstart...
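For a concrete starting point, here is a minimal sketch using watchdog to react to modifications of a single log file (the file name test.log and the watched directory are hypothetical):
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class LogChangeHandler(FileSystemEventHandler):
    def on_modified(self, event):
        # Events arrive for the whole directory; filter to the file we care about.
        if event.src_path.endswith('test.log'):  # hypothetical file name
            print('Log file changed:', event.src_path)

observer = Observer()
observer.schedule(LogChangeHandler(), path='.', recursive=False)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()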
If polling is good enough for you, I'd just watch if the "modified time" file stat changes. To read it:
os.stat(filename).st_mtime
(Also note that the Windows native change event solution does not work in all circumstances, e.g. on network drives.)
import os

class Monkey(object):
    def __init__(self):
        self._cached_stamp = 0
        self.filename = '/path/to/file'

    def ook(self):
        stamp = os.stat(self.filename).st_mtime
        if stamp != self._cached_stamp:
            self._cached_stamp = stamp
            # File has changed, so do something...
If you want a multiplatform solution, then check QFileSystemWatcher.
Here is some example code (not sanitized):
from PyQt4 import QtCore

@QtCore.pyqtSlot(str)
def directory_changed(path):
    print('Directory Changed!!!')

@QtCore.pyqtSlot(str)
def file_changed(path):
    print('File Changed!!!')

fs_watcher = QtCore.QFileSystemWatcher(['/path/to/files_1', '/path/to/files_2', '/path/to/files_3'])
fs_watcher.connect(fs_watcher, QtCore.SIGNAL('directoryChanged(QString)'), directory_changed)
fs_watcher.connect(fs_watcher, QtCore.SIGNAL('fileChanged(QString)'), file_changed)
This will not work on Windows (maybe with Cygwin?), but for Unix users you should use the "fcntl" system call. Here is an example in Python. It's mostly the same code if you need to write it in C (same function names):
import time
import fcntl
import os
import signal

FNAME = "/HOME/TOTO/FILETOWATCH"

def handler(signum, frame):
    print("File %s modified" % (FNAME,))

signal.signal(signal.SIGIO, handler)
fd = os.open(FNAME, os.O_RDONLY)
fcntl.fcntl(fd, fcntl.F_SETSIG, 0)
fcntl.fcntl(fd, fcntl.F_NOTIFY,
            fcntl.DN_MODIFY | fcntl.DN_CREATE | fcntl.DN_MULTISHOT)

while True:
    time.sleep(10000)
Check out pyinotify.
inotify replaces dnotify (from an earlier answer) on newer Linux kernels and allows file-level rather than directory-level monitoring.
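For illustration, a minimal pyinotify sketch that watches a single file for modifications (the path is hypothetical):
import pyinotify

class ModHandler(pyinotify.ProcessEvent):
    def process_IN_MODIFY(self, event):
        print('File modified:', event.pathname)

wm = pyinotify.WatchManager()
wm.add_watch('/path/to/file.log', pyinotify.IN_MODIFY)  # hypothetical path
notifier = pyinotify.Notifier(wm, ModHandler())
notifier.loop()  # blocks, dispatching events to the handler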
For watching a single file with polling, and minimal dependencies, here is a fully fleshed-out example, based on answer from Deestan (above):
import os
import sys
import time

class Watcher(object):
    running = True
    refresh_delay_secs = 1

    # Constructor
    def __init__(self, watch_file, call_func_on_change=None, *args, **kwargs):
        self._cached_stamp = 0
        self.filename = watch_file
        self.call_func_on_change = call_func_on_change
        self.args = args
        self.kwargs = kwargs

    # Look for changes
    def look(self):
        stamp = os.stat(self.filename).st_mtime
        if stamp != self._cached_stamp:
            self._cached_stamp = stamp
            # File has changed, so do something...
            print('File changed')
            if self.call_func_on_change is not None:
                self.call_func_on_change(*self.args, **self.kwargs)

    # Keep watching in a loop
    def watch(self):
        while self.running:
            try:
                # Look for changes
                time.sleep(self.refresh_delay_secs)
                self.look()
            except KeyboardInterrupt:
                print('\nDone')
                break
            except FileNotFoundError:
                # Action on file not found
                pass
            except:
                print('Unhandled error: %s' % sys.exc_info()[0])

# Call this function each time a change happens
def custom_action(text):
    print(text)

watch_file = 'my_file.txt'

# watcher = Watcher(watch_file)  # simple
watcher = Watcher(watch_file, custom_action, text='yes, changed')  # also call custom action function
watcher.watch()  # start the watch going
Well after a bit of hacking of Tim Golden's script, I have the following which seems to work quite well:
import os
import win32file
import win32con

path_to_watch = "."         # look at the current directory
file_to_watch = "test.txt"  # look for changes to a file called test.txt

def ProcessNewData(newData):
    print("Text added: %s" % newData)

# Set up the bits we'll need for output
ACTIONS = {
    1: "Created",
    2: "Deleted",
    3: "Updated",
    4: "Renamed from something",
    5: "Renamed to something"
}

FILE_LIST_DIRECTORY = 0x0001
hDir = win32file.CreateFile(
    path_to_watch,
    FILE_LIST_DIRECTORY,
    win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,
    None,
    win32con.OPEN_EXISTING,
    win32con.FILE_FLAG_BACKUP_SEMANTICS,
    None
)

# Open the file we're interested in
a = open(file_to_watch, "r")
# Throw away any existing log data
a.read()

# Wait for new data and call ProcessNewData for each new chunk that's written
while 1:
    # Wait for a change to occur
    results = win32file.ReadDirectoryChangesW(
        hDir,
        1024,
        False,
        win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
        None,
        None
    )
    # For each change, check to see if it's updating the file we're interested in
    for action, file in results:
        full_filename = os.path.join(path_to_watch, file)
        #print(file, ACTIONS.get(action, "Unknown"))
        if file == file_to_watch:
            newText = a.read()
            if newText != "":
                ProcessNewData(newText)
It could probably do with a load more error checking, but for simply watching a log file and doing some processing on it before spitting it out to the screen, this works well.
Thanks everyone for your input - great stuff!
Check my answer to a similar question. You could try the same loop in Python. This page suggests:
import time

while 1:
    where = file.tell()
    line = file.readline()
    if not line:
        time.sleep(1)
        file.seek(where)
    else:
        print(line, end='')  # line already has a newline
Also see the question tail() a file with Python.
This is another modification of Tim Golden's script that runs on Unix-like systems and adds a simple watcher for file modification by using a dict (file => time).
usage: whateverName.py path_to_dir_to_watch
#!/usr/bin/env python

import os, sys, time

def files_to_timestamp(path):
    files = [os.path.join(path, f) for f in os.listdir(path)]
    return dict([(f, os.path.getmtime(f)) for f in files])

if __name__ == "__main__":
    path_to_watch = sys.argv[1]
    print('Watching {}..'.format(path_to_watch))

    before = files_to_timestamp(path_to_watch)

    while 1:
        time.sleep(2)
        after = files_to_timestamp(path_to_watch)

        added = [f for f in after.keys() if not f in before.keys()]
        removed = [f for f in before.keys() if not f in after.keys()]
        modified = []

        for f in before.keys():
            if not f in removed:
                if os.path.getmtime(f) != before.get(f):
                    modified.append(f)

        if added: print('Added: {}'.format(', '.join(added)))
        if removed: print('Removed: {}'.format(', '.join(removed)))
        if modified: print('Modified: {}'.format(', '.join(modified)))

        before = after
Here is a simplified version of Kender's code that appears to do the same trick and does not read the entire file:
# Check file for new data.
import time

f = open(r'c:\temp\test.txt', 'r')

while True:
    line = f.readline()
    if not line:
        time.sleep(1)
        print('Nothing New')
    else:
        print('Call Function: ', line)
Well, since you are using Python, you can just open a file and keep reading lines from it.
f = open('file.log')
If the line read is not empty, you process it.
line = f.readline()
if line:
    # Do what you want with the line
You may be missing that it is ok to keep calling readline at the EOF. It will just keep returning an empty string in this case. And when something is appended to the log file, the reading will continue from where it stopped, as you need.
If you are looking for a solution that uses events, or a particular library, please specify this in your question. Otherwise, I think this solution is just fine.
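Putting that idea together, a minimal tail-follow sketch (the file name and the process_line handler are hypothetical):
import time

def process_line(line):
    print(line, end='')  # hypothetical handler; do what you want with the line

with open('file.log') as f:
    while True:
        line = f.readline()
        if line:
            process_line(line)
        else:
            time.sleep(1)  # at EOF for now; readline will pick up appended data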
The simplest solution for me is using watchdog's watchmedo tool.
From https://pypi.python.org/pypi/watchdog I now have a process that looks for the SQL files in a directory and executes them if necessary.
watchmedo shell-command \
--patterns="*.sql" \
--recursive \
--command='~/Desktop/load_files_into_mysql_database.sh' \
.
As you can see in Tim Golden's article, pointed out by Horst Gutmann, WIN32 is relatively complex and watches directories, not a single file.
I'd like to suggest you look into IronPython, which is a .NET python implementation.
With IronPython you can use all the .NET functionality - including
System.IO.FileSystemWatcher
Which handles single files with a simple Event interface.
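For illustration, a rough IronPython sketch along those lines (assuming an IronPython runtime with the System assembly available; the watched file name is hypothetical):
# IronPython: use .NET's FileSystemWatcher directly
import clr
clr.AddReference('System')
from System.IO import FileSystemWatcher

def on_changed(sender, event_args):
    print('Changed:', event_args.FullPath)

watcher = FileSystemWatcher('.')   # directory to watch
watcher.Filter = 'test.txt'        # hypothetical single file of interest
watcher.Changed += on_changed      # subscribe to the Changed event
watcher.EnableRaisingEvents = True

raw_input('Watching; press Enter to quit\n')  # IronPython is Python 2 based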
This is an example of checking a file for changes. It may not be the best way of doing it, but it sure is short.
Handy tool for restarting application when changes have been made to the source. I made this when playing with pygame so I can see effects take place immediately after file save.
When used in pygame make sure the stuff in the 'while' loop is placed in your game loop aka update or whatever. Otherwise your application will get stuck in an infinite loop and you will not see your game updating.
import os
import sys

file_size_stored = os.stat('neuron.py').st_size

while True:
    try:
        file_size_current = os.stat('neuron.py').st_size
        if file_size_stored != file_size_current:
            restart_program()
    except:
        pass
In case you wanted the restart code which I found on the web. Here it is. (Not relevant to the question, though it could come in handy)
def restart_program():  # restart application
    python = sys.executable
    os.execl(python, python, *sys.argv)
Have fun making electrons do what you want them to do.
Seems that no one has posted fswatch. It is a cross-platform file system watcher. Just install it, run it and follow the prompts.
I've used it with Python and Go programs and it just works.
import os
import threading
import win32con
import win32file

ACTIONS = {
    1: "Created",
    2: "Deleted",
    3: "Updated",
    4: "Renamed from something",
    5: "Renamed to something"
}

FILE_LIST_DIRECTORY = 0x0001

class myThread(threading.Thread):
    def __init__(self, threadID, fileName, directory, origin):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.fileName = fileName
        self.daemon = True
        self.dir = directory
        self.originalFile = origin

    def run(self):
        startMonitor(self.fileName, self.dir, self.originalFile)

def startMonitor(fileMonitoring, dirPath, originalFile):
    hDir = win32file.CreateFile(
        dirPath,
        FILE_LIST_DIRECTORY,
        win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,
        None,
        win32con.OPEN_EXISTING,
        win32con.FILE_FLAG_BACKUP_SEMANTICS,
        None
    )
    # Wait for new data and call ProcessNewData for each new chunk that's
    # written
    while 1:
        # Wait for a change to occur
        results = win32file.ReadDirectoryChangesW(
            hDir,
            1024,
            False,
            win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
            None,
            None
        )
        # For each change, check to see if it's updating the file we're
        # interested in
        for action, file_M in results:
            full_filename = os.path.join(dirPath, file_M)
            #print(file_M, ACTIONS.get(action, "Unknown"))
            if len(full_filename) == len(fileMonitoring) and action == 3:
                # copy to main file
                ...
Since I have it installed globally, my favorite approach is to use nodemon. If your source code is in src, and your entry point is src/app.py, then it's as easy as:
nodemon -w 'src/**' -e py,html --exec python src/app.py
... where -e py,html lets you control what file types to watch for changes.
Here's an example geared toward watching input files that write no more than one line per second but usually a lot less. The goal is to append the last line (most recent write) to the specified output file. I've copied this from one of my projects and just deleted all the irrelevant lines. You'll have to fill in or change the missing symbols.
from PyQt5.QtCore import QFileSystemWatcher, QSettings, QThread
from ui_main_window import Ui_MainWindow  # Qt Creator gen'd

class MainWindow(QMainWindow, Ui_MainWindow):
    def __init__(self, parent=None):
        QMainWindow.__init__(self, parent)
        Ui_MainWindow.__init__(self)

        self._fileWatcher = QFileSystemWatcher()
        self._fileWatcher.fileChanged.connect(self.fileChanged)

    def fileChanged(self, filepath):
        QThread.msleep(300)  # Reqd on some machines, give chance for write to complete
        # ^^ About to test this, may need more sophisticated solution
        with open(filepath) as file:
            lastLine = list(file)[-1]
        destPath = self._filemap[filepath]['dest file']
        with open(destPath, 'a') as out_file:  # a = append
            out_file.writelines([lastLine])
Of course, the encompassing QMainWindow class is not strictly required, i.e. you can use QFileSystemWatcher alone.
Just to put this out there since no one mentioned it: there's a Python module in the Standard Library named filecmp which has a cmp() function that compares two files.
Just make sure you don't do from filecmp import cmp, so as not to shadow the built-in cmp() function in Python 2.x. That's okay in Python 3.x, though, since there's no built-in cmp() function anymore.
Anyway, this is how it's used:
import filecmp
filecmp.cmp(path_to_file_1, path_to_file_2, shallow=True)
The shallow argument defaults to True. If it is True, only the metadata of the files is compared; if it is False, the contents of the files are compared.
Maybe this information will be useful to someone.
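As a hedged illustration, one way to use filecmp.cmp for change detection is to keep a snapshot copy and compare against it (the file names and polling interval are hypothetical):
import filecmp
import shutil
import time

watched = 'my_file.txt'        # hypothetical file to watch
snapshot = 'my_file.snapshot'  # baseline copy for comparison
shutil.copy2(watched, snapshot)

while True:
    time.sleep(1)
    if not filecmp.cmp(watched, snapshot, shallow=False):  # compare contents
        print('File changed')
        shutil.copy2(watched, snapshot)  # re-baseline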
watchfiles (https://github.com/samuelcolvin/watchfiles) is a Python API and CLI that uses the Notify (https://github.com/notify-rs/notify) library written in Rust.
The Rust implementation currently (2022-10-09) supports:
Linux / Android: inotify
macOS: FSEvents or kqueue, see features
Windows: ReadDirectoryChangesW
FreeBSD / NetBSD / OpenBSD / DragonflyBSD: kqueue
All platforms: polling
Binaries available on PyPI (https://pypi.org/project/watchfiles/) and conda-forge (https://github.com/conda-forge/watchfiles-feedstock).
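For illustration, a minimal watchfiles sketch (the watched path is hypothetical):
from watchfiles import watch

# watch() blocks and yields a set of (Change, path) tuples per batch of events.
for changes in watch('./my_dir'):  # hypothetical path
    for change, path in changes:
        print(change, path)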
You can also use a simple library called repyt, here is an example:
repyt ./app.py
Related to 4Oh4's solution above, here is a small change to watch a list of files:
import os
import sys
import time

class Watcher(object):
    running = True
    refresh_delay_secs = 1

    # Constructor
    def __init__(self, watch_files, call_func_on_change=None, *args, **kwargs):
        self._cached_stamp = 0
        self._cached_stamp_files = {}
        self.filenames = watch_files
        self.call_func_on_change = call_func_on_change
        self.args = args
        self.kwargs = kwargs

    # Look for changes
    def look(self):
        for file in self.filenames:
            stamp = os.stat(file).st_mtime
            if not file in self._cached_stamp_files:
                self._cached_stamp_files[file] = 0
            if stamp != self._cached_stamp_files[file]:
                self._cached_stamp_files[file] = stamp
                # File has changed, so do something...
                file_to_read = open(file, 'r')
                value = file_to_read.read()
                print("value from file", value)
                file_to_read.seek(0)
                if self.call_func_on_change is not None:
                    self.call_func_on_change(*self.args, **self.kwargs)

    # Keep watching in a loop
    def watch(self):
        while self.running:
            try:
                # Look for changes
                time.sleep(self.refresh_delay_secs)
                self.look()
            except KeyboardInterrupt:
                print('\nDone')
                break
            except FileNotFoundError:
                # Action on file not found
                pass
            except Exception as e:
                print(e)
                print('Unhandled error: %s' % sys.exc_info()[0])

# Call this function each time a change happens
def custom_action(text):
    print(text)
    # pass

watch_files = ['/Users/mexekanez/my_file.txt', '/Users/mexekanez/my_file1.txt']

# watcher = Watcher(watch_file)  # simple

if __name__ == "__main__":
    watcher = Watcher(watch_files, custom_action, text='yes, changed')  # also call custom action function
    watcher.watch()  # start the watch going
The best and simplest solution is to use pygtail:
https://pypi.python.org/pypi/pygtail
from pygtail import Pygtail
import sys

while True:
    for line in Pygtail("some.log"):
        sys.stdout.write(line)
import inotify.adapters
from datetime import datetime

LOG_FILE = '/var/log/mysql/server_audit.log'

def main():
    start_time = datetime.now()
    while True:
        i = inotify.adapters.Inotify()
        i.add_watch(LOG_FILE)
        for event in i.event_gen(yield_nones=False):
            break
        del i

        with open(LOG_FILE, 'r') as f:
            for line in f:
                entry = line.split(',')
                entry_time = datetime.strptime(entry[0],
                                               '%Y%m%d %H:%M:%S')
                if entry_time > start_time:
                    start_time = entry_time
                    print(entry)

if __name__ == '__main__':
    main()
The easiest solution is to capture two snapshots of the same file after an interval and compare them. You could try something like this:
while True:
    # Capturing two snapshots of models.py, separated by a short interval
    print("Looking for changes in " + app_name.capitalize() + " models.py\nPress 'CTRL + C' to stop the program")
    with open(app_name.capitalize() + '/filename', 'r+') as app_models_file:
        filename_content = app_models_file.read()
    time.sleep(5)
    with open(app_name.capitalize() + '/filename', 'r+') as app_models_file_1:
        filename_content_1 = app_models_file_1.read()

    # Comparing the two snapshots of models.py
    if filename_content == filename_content_1:
        pass
    else:
        print("You made a change in " + app_name.capitalize() + " filename.\n")
        cmd = str(input("Do something with the file?(y/n):"))
        if cmd == 'y':
            pass  # Do Something
        elif cmd == 'n':
            pass  # pass or do something
        else:
            print("Invalid input")
If you're using Windows, create this POLL.CMD file:
@echo off
:top
xcopy /m /y %1 %2 | find /v "File(s) copied"
timeout /T 1 > nul
goto :top
then you can type "poll dir1 dir2" and it will copy all the files from dir1 to dir2 and check for updates once per second.
The "find" is optional, just to make the console less noisy.
This is not recursive. Maybe you could make it recursive using /e on the xcopy.
I don't know of any Windows-specific function. You could try getting the MD5 hash of the file every second/minute/hour (depends on how fast you need it) and comparing it to the last hash. When it differs, you know the file has been changed, and you read out the newest lines.
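A minimal sketch of that hash-polling idea (the file name and interval are hypothetical):
import hashlib
import time

def file_hash(path):
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

path = 'watched.log'  # hypothetical file
last_hash = file_hash(path)

while True:
    time.sleep(1)  # poll once per second
    current = file_hash(path)
    if current != last_hash:
        print('File changed')
        last_hash = current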
I'd try something like this.
try:
    f = open(filePath)
except IOError:
    print("No such file: %s" % filePath)
    input("Press Enter to close window")

try:
    lines = f.readlines()
    while True:
        line = f.readline()
        try:
            if not line:
                time.sleep(1)
            else:
                functionThatAnalisesTheLine(line)
        except Exception as e:
            # handle the exception somehow (for example, log the trace) and raise the same exception again
            input("Press Enter to close window")
            raise e
finally:
    f.close()
The loop checks whether there are new lines since the last time the file was read; if there are, each is read and passed to the functionThatAnalisesTheLine function. If not, the script waits 1 second and retries.

Reopening a closed stringIO object in Python 3

So, I create a StringIO object to treat my string as a file:
>>> a = 'Me, you and them\n'
>>> import io
>>> f = io.StringIO(a)
>>> f.read(1)
'M'
And then I proceed to close the 'file':
>>> f.close()
>>> f.closed
True
Now, when I try to open the 'file' again, Python does not permit me to do so:
>>> p = open(f)
Traceback (most recent call last):
File "<pyshell#166>", line 1, in <module>
p = open(f)
TypeError: invalid file: <_io.StringIO object at 0x0325D4E0>
Is there a way to 'reopen' a closed StringIO object? Or should it be declared again using the io.StringIO() method?
Thanks!
I have a nice hack, which I am currently using for testing (since my code can perform I/O operations, and handing it a StringIO is a nice workaround).
If this problem is a one-time thing:
st = StringIO()
close = st.close          # keep a reference to the real close
st.close = lambda: None   # make close a no-op for now
f(st)                     # some function which does I/O and finally closes st
st.getvalue()             # this is available now
close()
# If you don't want to store the close function, you can also:
StringIO.close(st)
If this is a recurring thing, you can also define a context manager:
import contextlib

@contextlib.contextmanager
def uncloseable(fd):
    """
    Context manager which turns the fd's close operation to no-op for the duration of the context.
    """
    close = fd.close
    fd.close = lambda: None
    yield fd
    fd.close = close
which can be used in the following way:
st = StringIO()
with uncloseable(st):
    f(st)
# Now st is still open!!!
I hope this helps you with your problem, and if not, I hope you will find the solution you are looking for.
Note: This should work exactly the same for other file-like objects.
No, there is no way to re-open an io.StringIO object. Instead, just create a new object with io.StringIO().
Calling close() on an io.StringIO object throws away the "file contents" data, so re-opening couldn't give access to that anyway.
If you need the data, call getvalue() before closing.
See also the StringIO documentation here:
The text buffer is discarded when the close() method is called.
and here:
getvalue()
Return a str containing the entire contents of the buffer.
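As a small illustration, you can save the buffer with getvalue() before closing and build a fresh object from it:
import io

f = io.StringIO('Me, you and them\n')
f.read(1)
data = f.getvalue()    # grab the full contents before closing
f.close()

p = io.StringIO(data)  # a new, open buffer with the same contents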
The builtin open() creates a file object (i.e. a stream), but in your example, f is already a stream.
That's the reason why you get TypeError: invalid file
After the method close() has executed, any stream operation will raise ValueError.
And the documentation does not mention about how to reopen a closed stream.
Maybe you need not close() the stream yet if you want to use (reopen) it again later.
When you call f.close(), you remove the buffer from memory. You're essentially dereferencing and then calling something that is gone; you're looking for a memory location that no longer exists.
Here is what you could do instead:
import io

a = 'Me, you and them\n'
f = io.StringIO(a)
f.read(1)
f.close()

# Put the text from a, without the first char, into a new StringIO.
p = io.StringIO(a[1:])
# do some work with p.
I think your confusion comes from thinking of io.StringIO as a file on the block device. If you used open() and not StringIO, then you would be correct in your example and you could reopen the file. StringIO is not a file; it's the idea of a file object in memory. A file object does have a buffer like a StringIO, but it also exists physically on the block device. A StringIO is just a buffer, a staging area in memory for the data within it. When you call open(), a buffer is created too, but the data still exists on the block device.
Perhaps this is more what you want
fo = open('f.txt', 'w+')
fo.write('Me, you and them\n')
fo.seek(0)  # rewind so the read starts at the beginning
fo.read(1)
fo.close()

# reopen the now closed file f.txt
p = open('f.txt', 'r')
# do stuff with p
p.close()
Here we are writing the string to the block device, so that when we close the file, the information written to it will remain after it's closed. Because this creates a file in the directory the program is run in, it may be a good idea to give the file an extension. For example, you could name the file f.txt instead of f.
