Problem with trying to encrypt .txt files in Python - python-3.x

OK, so I made this simple encryption program (I am new to Python) that encrypts every .txt file in the directory it is run from (for learning purposes).
Here's the code:
import os
import time

txts = list()
for i in os.listdir():
    if i.endswith('.txt'):
        txts.append(i)

for i in txts:
    filer = open(i, 'r')
    st = str()
    for j in filer.read():
        st += chr(ord(j) + 2)
    filew = open(i, 'w')
    filew.write(st)
def decrypt():
    for i in txts:
        filer = open(i, 'r')
        st = str()
        for j in filer.read():
            st += chr(ord(j) - 2)
        filew = open(i, 'w')
        filew.write(st)
So my problem is: it encrypts every single .txt file in the directory except the last one, always. The last file always gets overwritten with nothing, unlike all the others, no matter which .txt file is last. I've checked the txts list and it contains all the .txt files in the directory. But the last file just doesn't get encrypted. Say I put abcd in the file; after my program runs, there won't be a single character left in it.

When you put the "encryption" code in a function, the file objects returned by open go out of scope when the function returns and are garbage collected; part of that process for file objects is flushing the write buffers and closing the files. As your code stands, the last filew stays referenced after the loop ends, so its buffer is never flushed, and the last file, already truncated by open(i, 'w'), is left empty on disk.
According to the documentation:
Warning: Calling f.write() without using the with keyword or calling f.close() might result in the arguments of f.write() not being completely written to the disk, even if the program exits successfully.
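A minimal restructuring with with blocks (a sketch; simply calling filew.close() at the end of each loop iteration would also work) keeps the shift-by-2 scheme from the question but guarantees every file is flushed and closed:

import os

txts = [name for name in os.listdir() if name.endswith('.txt')]

def shift_files(amount):
    # Shift every character in every .txt file by `amount` code points.
    for name in txts:
        with open(name, 'r') as filer:   # closed (and flushed) automatically
            st = ''.join(chr(ord(j) + amount) for j in filer.read())
        with open(name, 'w') as filew:   # truncate only after reading is done
            filew.write(st)

def encrypt():
    shift_files(2)

def decrypt():
    shift_files(-2)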

Related

Why would a python script keep running after the output is generated (strange behavior)?

Background: The purpose of this script is to take eight very large (~7GB) FASTQ files, subsample each, and concatenate each subsample into one "master" FASTQ file. The resulting file is about 60GB. Each file is subsampled to 120,000,000 lines.
The issue: The basic purpose of this script is to output a huge file. I have print statements & time stamps in my code so I know that it goes through the entire script, processes the input files and creates the output files. After I see the final print statement, I go to my directory and see that the output file has been generated, it's the correct size, and it was last modified a while ago, despite the fact that the script is still running. At this point, however, the code has still not finished running, and it will actually stall there for about 2-3 hours before I can enter anything into my terminal again.
My code is behaving like it gets stuck on the last line of the script even after it's finished creating the output file.
I'm hoping someone might be able to identify what's causing this weird behavior. Below is a dummy version of what my script does:
import random
import itertools

infile1 = "sample1_1.fastq"
inFile2 = "sample1_2.fastq"

with open(infile1, 'r') as file_1:
    f1 = file_1.read()
with open(inFile2, 'r') as file_2:
    f2 = file_2.read()

fastq1 = f1.split('\n')
fastq2 = f2.split('\n')

def subsampleFASTQ(compile1, compile2):
    random.seed(42)
    random_1 = random.sample(compile1, 30000000)
    random.seed(42)
    random_2 = random.sample(compile2, 30000000)
    return random_1, random_2

combo1, combo2 = subsampleFASTQ(fastq1, fastq2)

with open('sampleout_1.fastq', 'w') as out1:
    out1.write('\n'.join(str(i) for i in combo1))
with open('sampleout_2.fastq', 'w') as out2:
    out2.write('\n'.join(str(i) for i in combo2))
My ideas of what it could be:
File size is causing some slowness
There is some background process running in this script that won't let it finish (but I have no idea how to debug that; any resources would be appreciated)
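One possibility worth ruling out (an assumption on my part, relating to the "file size" idea above): '\n'.join(str(i) for i in combo1) builds the entire multi-gigabyte output as one string in memory before anything is written, and tearing objects that large down at interpreter exit can itself take a long time. A sketch that streams the output instead, so no giant intermediate string ever exists:

# Hypothetical streaming variant: write each sampled line as we go
# instead of materializing one huge string with '\n'.join(...).
with open('sampleout_1.fastq', 'w') as out1:
    for rec in combo1:
        out1.write(str(rec))
        out1.write('\n')
with open('sampleout_2.fastq', 'w') as out2:
    for rec in combo2:
        out2.write(str(rec))
        out2.write('\n')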

system is not writing into original dictionary file

def update(login_info):
    stids = '001'  # must be a string: `in line` needs str, and the literal 001 is a SyntaxError in Python 3
    file = open('regis.txt', 'r+')
    for line in file:
        if stids in line:
            x = eval(line)
            print(x)
            c = input('what course you would like to update >> ')
            get = x.get(c)
            print('This is your current mark for the course', get)
            mark = input('What is the new mark? >>')
            g = mark.upper()
            x.update({c: g})
            file.write(str(x))
(Screenshots in the original post show the file before writing, the file after writing, and the IDLE session.)
As you can see, the program is not writing the updated data back into the original dictionary file. How can we improve on that? Please explain in detail. Thanks, all.
Python doesn't make a link like that automatically. From Python's perspective, you are reading a regular text file and executing a command built from the line you read; that command creates a dictionary object which has no relationship to the line it was created from. Writing to the file should still work, but by then you have moved a line further (you read the line where the data was, and are now positioned past it).
When you read a file, its position changes. Iterating over the file like that (i.e. for line in file:) implicitly calls next() on it, and for efficiency reasons exact positioning is then disabled (file.tell() will not report the current position during the iteration). So when you wrote to the file, the text ended up appended at the end, and if you test it you will find the loop no longer continues, even though the read position is still only on the second line.
Interleaving reads and writes on the same file handle like this, without seeking in between, is effectively undefined behaviour.
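A sketch of one way around this (my suggestion, not part of the original answer): read all the lines up front, update the matching one in memory, then rewind and rewrite the whole file:

def update(login_info):
    stids = '001'
    with open('regis.txt', 'r+') as file:
        lines = file.readlines()          # read everything before writing anything
        for idx, line in enumerate(lines):
            if stids in line:
                x = eval(line)            # kept from the question; ast.literal_eval would be safer
                c = input('what course you would like to update >> ')
                print('This is your current mark for the course', x.get(c))
                mark = input('What is the new mark? >> ')
                x[c] = mark.upper()
                lines[idx] = str(x) + '\n'
        file.seek(0)                      # rewind before rewriting
        file.writelines(lines)
        file.truncate()                   # drop any leftover tail from the old content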
Beginner Python: Reading and writing to the same file

Why is the same function in python-chess returning different results?

I'm new to working with python-chess and I was perusing the official documentation. I noticed this very weird thing I just can't make sense of. This is from the documentation:
import chess.pgn
pgn = open("data/pgn/kasparov-deep-blue-1997.pgn")
first_game = chess.pgn.read_game(pgn)
second_game = chess.pgn.read_game(pgn)
So as you can see, the exact same function chess.pgn.read_game() causes two different games to show up. I tried with my own pgn file, and sure enough first_game == second_game resulted in False. I also tried third_game = chess.pgn.read_game(pgn), and sure enough that gave me the (presumably) third game from the pgn file. How is this possible? If I'm using the same function, shouldn't it return the same result every time for the same file? Why should the variable name matter (I'm assuming it does), unless programming languages changed overnight or there's a random function built in somewhere?
The only way that this can be possible is if some data is changing. This could be data that chess.pgn.read_game reads from elsewhere, or could be something to do with the object you're passing in.
In Python, file-like objects store where they are in the file. If they didn't, then this code:
with open("/home/wizzwizz4/Documents/TOPSECRET/diary.txt") as f:
line = f.readline()
while line:
print(line, end="")
line = f.readline()
would just print the first line over and over again. When data's read from a file, Python won't give you that data again unless you specifically ask for it.
There are multiple games in this file, stored one after each other. You're passing in the same file each time, but you're not resetting the read cursor to the beginning of the file (f.seek(0)) or closing and reopening the file, so it's going to read the next data available – i.e., the next game.
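To make the cursor behaviour concrete, here is a sketch (reusing the file name from the documentation snippet): seek(0) rewinds so the first game can be read again, and read_game returns None once the file is exhausted, which gives a natural way to loop over every game:

import chess.pgn

with open("data/pgn/kasparov-deep-blue-1997.pgn") as pgn:
    first_game = chess.pgn.read_game(pgn)
    pgn.seek(0)                                  # rewind the read cursor
    first_game_again = chess.pgn.read_game(pgn)  # reads the first game again

    pgn.seek(0)
    while True:                                  # iterate over every game in the file
        game = chess.pgn.read_game(pgn)
        if game is None:                         # read_game returns None at end of file
            break
        print(game.headers.get("White", "?"), "vs", game.headers.get("Black", "?"))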

Output data from subprocess command line by line

I am trying to read a large data file (millions of rows, in a very specific format) using a pre-built (C) routine. I then want to yield the results, line by line, via a generator function.
I can read the file OK, but whereas just running:
<command> <filename>
directly in linux will print the results line by line as it finds them, I've had no luck trying to replicate this within my generator function. It seems to output the entire lot as a single string that I need to split on newline, and of course then everything needs reading before I can yield line 1.
This code will read the file, no problem:
import subprocess
import config

file_cmd = '<command> <filename>'
for rec in subprocess.check_output([file_cmd], shell=True).decode(config.ENCODING).split('\n'):
    yield rec
(ENCODING is set in config.py to iso-8859-1 - it's a Swedish site)
The code I have works, in that it gives me the data, but in doing so, it tries to hold the whole lot in memory. I have larger files than this to process which are likely to blow the available memory, so this isn't an option.
I've played around with bufsize on Popen, but not had any success (and also, I can't decode or split after the Popen, though I guess the fact I need to split right now is actually my problem!).
I think I have this working now, so will answer my own question in the event somebody else is looking for this later ...
import shlex

proc = subprocess.Popen(shlex.split(file_cmd), stdout=subprocess.PIPE)
while True:
    output = proc.stdout.readline()
    if output == b'' and proc.poll() is not None:
        break
    if output:
        yield output.decode(config.ENCODING).strip()
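A slightly shorter variant of the same idea (my own sketch, not part of the original answer): since Python 3.6, Popen accepts an encoding argument, so stdout becomes a text stream that can be iterated over directly:

import shlex
import subprocess

def read_lines(file_cmd, encoding='iso-8859-1'):
    # Yield decoded lines from the command's stdout as they arrive.
    proc = subprocess.Popen(shlex.split(file_cmd),
                            stdout=subprocess.PIPE,
                            encoding=encoding)   # text mode: lines come out as str
    with proc.stdout:
        for line in proc.stdout:                 # iterating reads one line at a time
            yield line.rstrip('\n')
    proc.wait()                                  # reap the child process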

file.read() not working as intended in string comparison

Hello, Stack Overflow.
I've been trying to get the following code to create a .txt file, write some string into it, and then print a message if said string is in the file. This is merely a study for a more complex project, but even given its simplicity, it's still not working.
Code:
import io
file = open("C:\\Users\\...\\txt.txt", "w+") #"..." is the rest of the file destination
file.write('wololo')
if "wololo" in file.read():
print ("ok")
This code always skips the if block, as if there were no "wololo" inside the file, even though I've checked every time and it was properly in there.
I'm not exactly sure what the problem could be, and I've spent a great deal of time searching everywhere for a solution, all to no avail. What could be wrong in this simple code?
Oh, and if I was to search for a string in a much bigger .txt file, would it still be wise to use file.read()?
Thanks!
When you write to your file, the cursor is moved to the end of the file. If you want to read the data afterwards, you'll have to move the cursor back to the beginning of the file, such as:
file = open("txt.txt", "w+")
file.write('wololo')
file.seek(0)
if "wololo" in file.read():
print ("ok")
file.close() # Remember to close the file
If the file is big, you should consider iterating over it line by line instead. This avoids loading the entire file into memory. Also consider using a context manager (the with keyword), so that you don't have to close the file explicitly yourself:
with open('bigdata.txt', 'rb') as ifile:  # use rb mode on Windows for reading
    for line in ifile:
        if b'wololo' in line:             # bytes pattern to match the binary mode
            print('OK')
            break
    else:                                 # for/else: runs only if no break occurred
        print('String not in file')
