I have a doubt about the following scenario.
Scenario:
A process starts by opening a file in write mode and then enters an infinite loop, e.g. while(1), whose body writes to the opened file.
Problem: What if I delete the opened (or newly created) file soon after the process enters the infinite loop?
In Unix, users cannot really delete files; they can only drop references to them. The kernel deletes the file itself once no references (hard links or open file descriptors) remain.
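You can see that reference counting in action with hard links (a quick sketch; the filenames are just placeholders):
import os

with open('a.txt', 'w') as f:
    f.write('data')
os.link('a.txt', 'b.txt')        # create a second hard link to the same inode
os.remove('a.txt')               # drops one reference; the inode survives
with open('b.txt') as f:
    print(f.read())              # prints 'data'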
From what you're saying, it sounds like in reality you don't want an infinite loop, but rather a while loop with some flag, something to the effect of
while (file exists)
    perform operation
Add a line that checks to see if the file exists during the while loop. If it doesn't exist, kill the loop.
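In Python, that might look something like this (a sketch; 'out.txt' is a placeholder filename, and the existence check is only advisory, since the file could vanish between the check and the write):
import os

path = 'out.txt'
f = open(path, 'w')
while os.path.exists(path):      # run only while the name still exists
    f.write('working...\n')
    f.flush()
f.close()                        # the file was deleted, so stop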
What happens, basically, is that your file disappears.
Try this: create a file test.py and put the following in it:
import os

f = open('out.txt', 'w')                   # Open file for writing
f.write("Hi Mom!")                         # Write something
os.remove('out.txt')                       # Delete the file (unlink its name)
try:
    while True:                            # Do forever
        f.write("Silly English Kanighit!")
except KeyboardInterrupt:                  # Ctrl-C lands here
    f.close()
Then run $ python test.py and hit enter. Ctrl-C will stop the execution. This will open the file, delete it, and then keep writing to the file that no longer exists, for the reasons mentioned above.
However, if you really have a different question such as "How can I prevent my file from being accidentally deleted while I'm writing to it?" or something else, it's probably better to ask that question.
Related
External updates made to the file content between open() and the first read() are not returned in the read() content.
How can I get the latest file content from read()?
I've tried flush() and seek(0), but they didn't help.
https://repl.it/repls/RealGreedyTransfer#main.py
import time

def myfoo(handle):
    print("myfoo started", flush=True)
    time.sleep(50)
    # External updates that happen during this time don't show up in read()
    # handle.flush()
    # handle.seek(0)
    # can't close and re-open the file handle
    print(handle.read())  # <-- Not reading updates made after the file was opened

# Upstream code base passes the file handle under an exclusive fcntl.lockf() lock
handle = open('temp.txt', 'r+')
myfoo(handle)
The issue is with the way the file gets written. Many text editors don't just write to the file in place; they use a different method: they write to a temporary file and then rename it to the original filename. Since renames are atomic in POSIX, in the event of a system crash during saving the old version of the file will still be available, and the new version might or might not be present in the temporary file.
For most purposes, this works as desired. The only exception is a case like this one, where you're holding onto a file handle. Renames, moves, and deletions do not affect open file handles; they stay attached to the file they were opened on, even if that file is no longer reachable from the filesystem. You can experiment with this by opening a file, removing it with rm, and then reading from the handle: it will still show you the contents from before the deletion. On Linux you can also still reach the file through /proc/XX/fd.
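The same experiment in Python (a throwaway sketch; 'demo.txt' is a placeholder name):
import os

f = open('demo.txt', 'w+')       # open for reading and writing
f.write('before delete')
f.flush()
os.remove('demo.txt')            # unlink the name; the open handle keeps the inode alive
f.seek(0)
print(f.read())                  # still prints 'before delete'
f.close()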
Your file handle won’t see changes, unless they are actually written (and flushed) to the same file (without the rename dance). If you’re working with something that writes by renaming, you would need to reopen the file to see the new contents.
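One way to detect that situation (a sketch, not from the original code; it relies on POSIX inode numbers, so it won't behave the same way on Windows) is to reopen whenever the name points at a new inode:
import os

def reopen_if_replaced(handle, path):
    # If the path now points at a different inode than our handle does,
    # the file was replaced via the rename dance, so reopen it.
    if os.stat(path).st_ino != os.fstat(handle.fileno()).st_ino:
        handle.close()
        handle = open(path, 'r')
    return handle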
I know from experience that if I try to open the same file in Vim in multiple terminals at the same time, I get an error. (Maybe because of temporary files?)
And I know from experience that if I open a text file in Python and read through it, I have to reset the pointer when I'm done.
But I've found that if I run the same Python script in multiple terminals at the same time, I don't get any error; it just successfully runs the script in both. How does this work? Doesn't Python need to read my script from the beginning in order to run it? Is the script copied to a temporary file, or something?
I know from experience that if I try to open the same file in Vim in multiple terminals at the same time, I get an error.
That's not actually true. Vim actually will let you open the same file in multiple terminals at the same time; it's just that it gives you a warning first to let you know that this is happening, so you can abort before you make changes. (It's not safe to modify the file concurrently in two different instances of Vim, because the two instances won't coordinate at all.)
Furthermore, Vim will only give you this warning if you try to open the same file for editing in multiple terminals at the same time. It won't complain if you're just opening the file for reading (using the -R flag).
And I know from experience that if I open a text file in Python and read through it, I have to reset the pointer when I'm done.
That's not exactly true, either. If you make multiple separate calls to open, you'll have multiple separate file objects, and each separately maintains its position in the file. So something like
with open('filename.txt', 'r') as first:
    with open('filename.txt', 'r') as second:
        print(first.read())
        print(second.read())
will print the complete contents of filename.txt twice.
The only reason you'd need to reset the position when you're done reading a file is if you want to use the same file object to read the file again, or if you've opened the file in read/write mode (r+ rather than r) and you now want to switch from reading to writing.
But I've found that if I run the same Python script in multiple terminals at the same time, I don't get any error; it just successfully runs the script in both. How does this work? Doesn't Python need to read my script from the beginning in order to run it? Is the script copied to a temporary file, or something?
As I think should now be clear — there's no problem here. There's no reason that two instances of Python can't both read the same script file at the same time. Linux allows that. (And in fact, if you delete the file, Linux will keep the file on disk until all programs that had it open have either closed it or exited.)
In fact, there's also no reason that two processes can't write to the same file at the same time, though here you have to be very careful to avoid the processes causing problems for each other or corrupting the file.
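For example, both writers could take an exclusive advisory lock around each write (a sketch using fcntl.lockf(), the same mechanism mentioned in the earlier question; advisory locks only help if every writer uses them):
import fcntl

with open('shared.log', 'a') as f:
    fcntl.lockf(f, fcntl.LOCK_EX)     # block until we hold the exclusive lock
    f.write('one complete record\n')
    f.flush()                         # push the record out while still locked
    fcntl.lockf(f, fcntl.LOCK_UN)     # let other writers proceed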
The terminal is just running the command you told it to execute; there is no shared file pointer or anything like that.
I wrote some code to pull certain lines from a large text file and noticed some strange things missing, so I ran the following code to make sure the for loop was actually hitting every line in the file:
xf=open("bigFile.txt", r)
xxf=open("newFile.txt",w)
for line in xf:
xxf.write(line)
This ends up not copying all the lines for some reason. Could anyone tell me what I'm not understanding or doing wrong? The resulting file is only about 60-70% as big as it should be. Any insight would be greatly appreciated.
EDIT: Thanks for the input skrrgwasme & Shreevardhan. To clarify, my ultimate goal is not just to copy the file; in my working code I apply some comparisons before writing each line, for example:
for line in xf:
    firstChar = line[:1]
    if firstChar == "1":
        xxf.write(line)
That is why I am using the "for line in file". Should I do this some other way?
To copy a file, it's better to use functions from the shutil module, such as copyfile(), copy(), or copy2().
For example
from shutil import copyfile, copy2
copyfile('bigFile.txt', 'newFile.txt')
or
copy2('bigFile.txt', 'newFile.txt')
You need to close your file. There's no guarantee that buffers you're writing into are being flushed to disk before your script exits. You can do this very easily by using a context manager:
with open("bigFile.txt") as xf, open("newFile.txt", "w") as xxf:
for line in xf:
xxf.write(line)
In your current code, you would write xf.close() and xxf.close(), but using a context manager like this will handle it for you, and even close the files if an exception occurs.
Also, if you really are simply copying the file, you can also use shutil.copyfile().
Please can someone help me create a batch file that detects when it's being copied.
I am pretty good with batch, but all I want to do is put a security warning on my batch program, like this: "Do Not Copy This File Or It Will Be Deleted!", and then have it delete itself when the user tries to copy it (so it can't be stolen, etc.).
A running program can lock the file so that nothing else can open it. I'm not sure how to do this in a batch script, but I assume that there's some way that it could lock itself. But if the file is just sitting there and no other running process has it locked, that won't work.
Why can't you use file permissions to prevent others from accessing the file?
Not that this will prevent anyone who can read the file from copying it anyway...
if not "%computername%#%~df0"=="AKOYA#C:\Users\Stephan\test\4\s.bat" echo this has been copied!! & del %~df0
The most naive, worst way I can think of to replace the contents of a file is:
f = open('file.txt', 'w')
f.write('stuff')
f.close()
Obviously, if that operation fails at some point before closing, you lose the contents of the original file while not necessarily finishing the new content.
So, what is the completely proper way to do this (if there is one)? I imagine it's something like:
f = open('file.txt.tmp', 'w')
f.write('stuff')
f.close()
move('file.txt.tmp', 'file.txt') # dangerous line?
But is that completely atomic and safe? What are the proper commands to actually perform the move? If I have another process with an open connection to file.txt, I assume that it will hold on to its pointer to the original file until it closes. What if another process tries to open up file.txt in the middle of the move?
I don't really care what version of the file my processes get as long as they get a complete, uncorrupted version.
Your implementation of move should use the rename function, which is atomic. Processes opening the file will see either the old or the new contents; there is no in-between state. A process that already has the file open will keep accessing the old version after the move.
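In Python, the pattern usually looks something like this sketch (os.replace() performs the atomic rename; the fsync() call is an extra precaution so the new bytes are on disk before the rename, and the filenames are just illustrative):
import os

def atomic_write(path, data):
    tmp = path + '.tmp'
    with open(tmp, 'w') as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # make sure the bytes have reached the disk
    os.replace(tmp, path)      # atomic rename: readers see old or new, never a mix

atomic_write('file.txt', 'stuff')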