".DS" meaning in python - python-3.x

I know it's probably a very silly question, but could someone please tell me what ".DS" means in the following function? Does it have a special meaning in Python, or is it only used in this project and I just didn't get it?
def load_paired_img_wrd(folder, word_vectors, use_word_vectors=True):
    class_names = [fold for fold in os.listdir(folder) if ".DS" not in fold]
    image_list = []
    labels_list = []
    paths_list = []
    for cl in class_names:
        splits = cl.split("_")
        if use_word_vectors:
            vectors = np.array([word_vectors[split] if split in word_vectors else np.zeros(shape=300) for split in splits])
            class_vector = np.mean(vectors, axis=0)
        subfiles = [f for f in os.listdir(folder + "/" + cl) if ".DS" not in f]
        for subf in subfiles:
            full_path = os.path.join(folder, cl, subf)
            img = image.load_img(full_path, target_size=(224, 224))
            x_raw = image.img_to_array(img)
            x_expand = np.expand_dims(x_raw, axis=0)
            x = preprocess_input(x_expand)
            image_list.append(x)
            if use_word_vectors:
                labels_list.append(class_vector)
            paths_list.append(full_path)
    img_data = np.array(image_list)
    img_data = np.rollaxis(img_data, 1, 0)
    img_data = img_data[0]
    return img_data, np.array(labels_list), paths_list

This is probably trying to filter out the junk .DS_Store files that appear on macOS.
A .DS_Store file is created in any directory (folder) accessed by the Finder application.

That's just a string value with no special meaning in Python. It could as easily have been "BS" or "Foo!" and the code would operate in the same way.
In this case, the program checks each directory entry for the substring ".DS" and leaves any matching name out of the list.
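A minimal sketch of the difference between the substring test used in the question and a stricter dot-file test, assuming the only goal is to skip hidden entries such as .DS_Store. The helper name visible_entries and the sample names are hypothetical and exist only for illustration.

```python
import os
import tempfile


def visible_entries(folder):
    """Return the sorted names in `folder`, skipping hidden dot-files such as .DS_Store."""
    return sorted(name for name in os.listdir(folder) if not name.startswith("."))


# Demonstration in a throwaway directory (all names are made up).
with tempfile.TemporaryDirectory() as tmp:
    for name in (".DS_Store", "cat_tabby", "my.DSLR_photos"):
        open(os.path.join(tmp, name), "w").close()
    # The question's approach: drop anything containing ".DS" anywhere in the name.
    substring_filter = sorted(n for n in os.listdir(tmp) if ".DS" not in n)
    # Stricter approach: drop only names that start with a dot.
    dotfile_filter = visible_entries(tmp)
```

The substring test also discards "my.DSLR_photos", because ".DS" occurs in the middle of that name; the dot-file test keeps it.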

Related

Rename multiple files with python

I'm trying to create a program to rename multiple files at once. This would be through Python, and I realize I'm reinventing the wheel, but I'm trying to understand what I'm doing wrong. Any help would be greatly appreciated. Program:
import os
path = "LOCATION"
dir_list = os.listdir(path)
myList = []
for x in dir_list:
    if x.endswith(".mp3"):
        f1 = x.split("-")
        ln1 = f1[0] # Band Name
        ln2 = f1[1] # Album Title
        ln3 = f1[2] # Track number
        ln4 = f1[3] # Song Name
        newname = x.join(ln2 + ln3)
        os.rename(x, newname)
        print(newname)
Your error:
line 14, in <module> os.rename(x, newname) -> FileNotFoundError: [WinError 2] The system cannot find the file specified:
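...is likely due to the path not being included in your os.rename() call. os.listdir() returns bare file names, so os.rename(x, newname) looks for them in the current working directory. I suggest changing os.rename(x, newname) to os.rename(path + x, path + newname), which will solve that issue as long as path ends with a path separator (os.path.join(path, x) avoids that requirement).
I also noticed some odd behavior in the way you were grabbing the song information, so in case you have further issues, here's the code I used to debug your original problem, which seems to produce the result you're going for: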
import os
path = "C:\\Users\\Pepe\\Documents\\StackOverflow\\73430533\\"
dir_list = os.listdir(path)
for x in dir_list:
    if x.endswith(".mp3"):
        # I ignore the ".mp3" to keep the file names clean
        nameDetails = x.split('.mp3')[0].split('-')
        bandName = nameDetails[0]
        albumTitle = nameDetails[1]
        trackNumber = nameDetails[2]
        songName = nameDetails[3]
        # Assuming album title + track number (as in the question); "|" is not allowed in Windows file names
        newName = f"{albumTitle} - {trackNumber}.mp3"
        print(f"Renaming \"{x}\" to \"{newName}\"")
        os.rename(path + x, path + newName)
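The same idea can be sketched with the name parsing separated from the renaming, which makes the parsing easy to check before any file is touched. This assumes names of the form "Band-Album-Track-Song.mp3"; the helpers new_name_for and rename_all are hypothetical names, not part of the original answer.

```python
import os


def new_name_for(filename):
    """Build 'Album - Track.mp3' from 'Band-Album-Track-Song.mp3'; return None if the name does not fit."""
    stem, ext = os.path.splitext(filename)
    parts = stem.split("-")
    if ext.lower() != ".mp3" or len(parts) < 4:
        return None
    album_title, track_number = parts[1].strip(), parts[2].strip()
    return f"{album_title} - {track_number}{ext}"


def rename_all(folder):
    """Rename every matching file in `folder`, always passing full paths to os.rename."""
    for name in os.listdir(folder):
        new_name = new_name_for(name)
        if new_name is not None:
            os.rename(os.path.join(folder, name), os.path.join(folder, new_name))
```

Calling new_name_for on a few sample names first is a cheap way to confirm the pattern before running rename_all on a real folder.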

Plotting multiple lines with a Nested Dictionary, and unknown variables to Line Graph

I was able to find a partial answer to my question, but it did not deal with a dictionary as deeply nested as mine, and as I am still very new to Python I am unsure how to proceed. I currently have a nested dictionary like:
{'140.10': {'46': {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'}, '28': {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'}}}
I want to plot it so that '140.10' becomes the title of the graph, '46' and '28' become the individual lines, the innermost keys (for example '1') go on the y axis, and the final numbers (in this case '-49.50918') go on the x axis. Essentially a graph like the one I generated in Excel from a CSV file that is written in another part of the code:
[image: example line graph, not included]
The problem I am running into is that these keys are autogenerated from a larger CSV file in an earlier part of the script, so I will not know their exact values until the code has been run. I will be running it over various files, each named after its graph, and each file will have different values for:
{key1: {key2_1: {key3_1: value1, key3_2: value2, key3_3: value3}, key2_2: ...}}
I have tried to do something like this:
for filename in os.listdir(Directory):
    if filename.endswith('.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in time_an_dict:
            atom = list(time_an_dict[s])
            ion = time_an_dict[s]
            for f in time_an_dict[s]:
                x_val = []
                y_val = []
                fz = ion[f]
                for i in time_an_dict[s][f]:
                    pos = (fz[i])
                    frame = i
                    y_val.append(frame)
                    x_val.append(pos)
                '''ions = atom
                frame = frames
                position = pos
                plt.plot(frame, position, label = frames)
                plt.xlabel("Frame")
                plt.ylabel("Position")
                plt.show()
                #plt.savefig('{}_Pos.png'.format(s))'''
But it has not run as intended.
I have also tried:
for filename in os.listdir(Directory):
    if filename.endswith('_Atom.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in window_dict:
            name = s + '_Atom.csv'
            time_an_dict[s] = analyze_time(name,window_dict[s])
            new = '{}_A_pos.csv'.format(s)
            ions = list(time_an_dict.values())[0].keys()
            for i in ions:
                x_axis_values = []
                y_axis_values = []
                frame = list(time_an_dict[s][i])
                x_axis_values.append(frame)
                empty = []
                print(x_axis_values)
                for x in frame:
                    values = time_an_dict[s][i][x]
                    empty.append(values)
                y_axis_values.append(empty)
                plt.plot(x_axis_values, y_axis_values, label = x )
            plt.show()
But keep getting the error:
Traceback (most recent call last):
  File "Atoms_pos.py", line 175, in <module>
    plt.plot(x_axis_values, y_axis_values, label = x )
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/pyplot.py", line 2840, in plot
    return gca().plot(
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axes/_axes.py", line 1743, in plot
    lines = [*self._get_lines(*args, data=data, **kwargs)]
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axes/_base.py", line 273, in __call__
    yield from self._plot_args(this, kwargs)
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axes/_base.py", line 394, in _plot_args
    self.axes.xaxis.update_units(x)
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axis.py", line 1466, in update_units
    default = self.converter.default_units(data, self)
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/category.py", line 107, in default_units
    axis.set_units(UnitData(data))
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/category.py", line 176, in __init__
    self.update(data)
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/category.py", line 209, in update
    for val in OrderedDict.fromkeys(data):
TypeError: unhashable type: 'numpy.ndarray'
Here is the remainder of the other parts of the code that generate the files and dictionaries I am using. I was told in another question I asked that this could be helpful.
# importing dependencies
import math
import sys
import pandas as pd
import MDAnalysis as mda
import os
import numpy as np
import csv
import matplotlib.pyplot as plt
################################################################################
###############################################################################
Directory = '/Users/hxb51/Desktop/Q_prof/Displacement_Charge/Blah'
os.chdir(Directory)
################################################################################
''' We are only looking at the positions of the CLAs and SODs and not the DRUDE counterparts. We are assuming the DRUDE
are very close and it is not something that needs to be concerned with'''
def Positions(dcd, topo):
    fields = ['Window', 'ION', 'ResID', 'Location', 'Position', 'Frame', 'Final']
    with open('{}_Atoms.csv'.format(s), 'a') as d:
        writer = csv.writer(d)
        writer.writerow(fields)
        d.close()
    CLAs = u.select_atoms('segid IONS and name CLA')
    SODs = u.select_atoms('segid IONS and name SOD')
    CLA_res = len(CLAs)
    SOD_res = len(SODs)
    frame = 0
    for ts in u.trajectory[-10:]:
        frame +=1
        CLA_pos = CLAs.positions[:,2]
        SOD_pos = SODs.positions[:,2]
        for i in range(CLA_res):
            ids = i + 46
            if CLA_pos[i] < 0:
                with open('{}_Atoms.csv'.format(s), 'a') as q:
                    new_line = [s,'CLA', ids, 'Bottom', CLA_pos[i], frame,10]
                    writes = csv.writer(q)
                    writes.writerow(new_line)
                    q.close()
            else:
                with open('{}_Atoms.csv'.format(s), 'a') as q:
                    new_line = [s,'CLA', ids, 'Top', CLA_pos[i], frame, 10]
                    writes = csv.writer(q)
                    writes.writerow(new_line)
                    q.close()
        for i in range(SOD_res):
            ids = i
            if SOD_pos[i] < 0:
                with open('{}_Atoms.csv'.format(s), 'a') as q:
                    new_line = [s,'SOD', ids, 'Bottom', SOD_pos[i], frame,10]
                    writes = csv.writer(q)
                    writes.writerow(new_line)
                    q.close()
            else:
                with open('{}_Atoms.csv'.format(s), 'a') as q:
                    new_line = [s,'SOD', ids, 'Top', SOD_pos[i], frame, 10]
                    writes = csv.writer(q)
                    writes.writerow(new_line)
                    q.close()
    csv_Data = pd.read_csv('{}_Atoms.csv'.format(s))
    filename = s + '_Atom.csv'
    sorted_df = csv_Data.sort_values(["ION", "ResID", "Frame"],
                                     ascending=[True, True, True])
    sorted_df.to_csv(filename, index = False)
    os.remove('{}_Atoms.csv'.format(s))
''' this function underneath looks at the ResIds, compares them to make sure they are the same and then counts how many
times the ion flip flops around the boundaries'''
def turn_dict(f):
    read = open(f)
    reader = csv.reader(read, delimiter=",", quotechar = '"')
    my_dict = {}
    new_list = []
    for row in reader:
        new_list.append(row)
    for i in range(len(new_list[:])):
        prev = i - 1
        if new_list[i][2] == new_list[prev][2]:
            if new_list[i][3] != new_list[prev][3]:
                if new_list[i][2] in my_dict:
                    my_dict[new_list[i][2]] += 1
                else:
                    my_dict[new_list[i][2]] = 1
    return my_dict
def plot_flips(f):
    dict = turn_dict(f)
    ions = list(dict.keys())
    occ = list(dict.values())
    plt.bar(range(len(dict)), occ, tick_label = ions)
    plt.title("{}".format(s))
    plt.xlabel("Residue ID")
    plt.ylabel("Boundary Crosses")
    plt.savefig('{}_Flip.png'.format(s))
def analyze_time(f, dicts):
    read = open(f)
    reader = csv.reader(read, delimiter=",", quotechar='"')
    new_list = []
    keys = list(dicts.keys())
    time_dict = {}
    pos_matrix = {}
    for row in reader:
        new_list.append(row)
    fields = ['ResID', 'Position', 'Frame']
    with open('{}_A_pos.csv'.format(s), 'a') as k:
        writer = csv.writer(k)
        writer.writerow(fields)
        k.close()
    for i in range(len(new_list[:])):
        if new_list[i][2] in keys:
            with open('{}_A_pos.csv'.format(s), 'a') as k:
                new_line = [new_list[i][2], new_list[i][4], new_list[i][5]]
                writes = csv.writer(k)
                writes.writerow(new_line)
                k.close()
    read = open('{}_A_pos.csv'.format(s))
    reader = csv.reader(read, delimiter=",", quotechar='"')
    time_list = []
    for row in reader:
        time_list.append(row)
    for j in range(len(keys)):
        for i in range(len(time_list[1:])):
            if time_list[i][0] == keys[j]:
                pos_matrix[time_list[i][2]] = time_list[i][1]
        time_dict[keys[j]] = pos_matrix
    return time_dict
window_dict = {}
for filename in os.listdir(Directory):
    s = filename.split('.dcd')[0]
    fors = s + '.txt'
    topos = '/Users/hxb51/Desktop/Q_prof/Displacement_Charge/topo.psf'
    if filename.endswith('.dcd'):
        print('We are starting with {} \n '.format(s))
        u = mda.Universe(topos, filename)
        Positions(filename, topos)
        name = s + '_Atom.csv'
        plot_flips(name)
        window_dict[s] = turn_dict(name)
        continue
time_an_dict = {}
for filename in os.listdir(Directory):
    if filename.endswith('.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in window_dict:
            name = s + '_Atom.csv'
            time_an_dict[s] = analyze_time(name,window_dict[s])
for filename in os.listdir(Directory):
    if filename.endswith('.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in time_an_dict:
            atom = list(time_an_dict[s])
            ion = time_an_dict[s]
            for f in time_an_dict[s]:
                x_val = []
                y_val = []
                fz = ion[f]
                for i in time_an_dict[s][f]:
                    pos = (fz[i])
                    frame = i
                    y_val.append(frame)
                    x_val.append(pos)
                '''ions = atom
                frame = frames
                position = pos
                plt.plot(frame, position, label = frames)
                plt.xlabel("Frame")
                plt.ylabel("Position")
                plt.show()
                #plt.savefig('{}_Pos.png'.format(s))'''
Everything here runs well except the last block of code, which tries to make a graph from the nested dictionary. Any help would be appreciated!
Thanks!
I figured out the answer:
for filename in os.listdir(Directory):
    if filename.endswith('_Atom.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in window_dict:
            name = s + '_Atom.csv'
            time_an_dict[s] = analyze_time(name,window_dict[s])
            new = '{}_A_pos.csv'.format(s)
            ions = list(time_an_dict[s])
            plt.yticks(np.arange(-50, 50, 5))
            plt.xlabel('Frame')
            plt.ylabel('Z axis position(Ang)')
            plt.title([s])
            for i in ions:
                x_value = []
                y_value = []
                time_frame =len(time_an_dict[s][i]) +1
                for frame in range(1,time_frame):
                    frame = str(frame)
                    x_value.append(int(frame))
                    y_value.append(float(time_an_dict[s][i][frame]))
                plt.plot(x_value, y_value, label=[i])
            plt.xticks(np.arange(1, 11, 1))
            plt.legend()
            plt.savefig('{}_Positions.png'.format(s))
            plt.clf()
            os.remove("{}_A_pos.csv".format(s))
From there, combined with the other parts of the code, it produces one graph per '.dcd' file, so it handles more than one file as long as the directory contains more '.dcd' files.
[images: resulting graphs, not included]
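The key step in the answer is converting the string keys and values to numbers before plotting; the earlier TypeError most likely came from passing nested lists of strings, which matplotlib tried to treat as categories. A minimal sketch of that conversion for a dictionary whose keys are not known in advance is given here; the helper name series_from_nested is hypothetical, and the sample data is the dictionary from the question.

```python
def series_from_nested(nested):
    """Yield (title, label, frames, positions) for every inner dict, converted to numbers."""
    for title, lines in nested.items():
        for label, points in lines.items():
            # Sort the frame keys numerically so each line is drawn left to right.
            frames = sorted(points, key=int)
            yield title, label, [int(f) for f in frames], [float(points[f]) for f in frames]


data = {'140.10': {'46': {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'},
                   '28': {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'}}}
series = list(series_from_nested(data))
```

Each tuple can then be handed to matplotlib, for example plt.plot(frames, positions, label=label) for every line that shares a title, followed by plt.title(title) and plt.legend().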

Why can't I split files when generating some TFrecord files?

I'm working on predicting protein structures. As you may know, one protein molecule can have several strands, so I need to split the list of atoms into different TFRecord files by strand name.
The problem is that this code ends up generating several TFRecord files with nothing written in them. They are all blank.
Alternatively, is there a way to split the strands while training my model? Then I could ignore this problem and store the strand name in the TFRecords as a feature.
'''
with all modules imported and no errors raised
'''
def generate_TFrecord(intPosition, endPosition, path):
    CrtS = x #x is the name of the current strand
    path = path + CrtS
    writer = tf.io.TFRecordWriter('%s.tfrecord' %path)
    for i in range(intPosition, endPosition):
        if identifyCoreCarbon(i):
            vectros = getVectors(i)
            features = {}
            '''
            feeding this dict
            '''
            tf_features = tf.train.Features(feature = features)
            tf_example = tf.train.Example(features = tf_features)
            tf_serialized = tf_example.SerializeToString()
            writer.write(tf_serialized)
        '''
        if checkStrand(i) == False:
            writer.write(tf_serialized)
            intPosition = i
        '''
    writer.close()
'''
strand_index is a list of the start points of all the strands
'''
for loop in strand_index:
    generate_TFrecord(loop, endPosition, path)
'''
________division___________
The code below works, but only generates a single tfrecord containing all the atom information.
writer = tf.io.TFRecordWriter('%s.tfrecord' %path)
for i in range(0, endPosition):
    if identifyCoreCarbon(i):
        vectros = getVectors(i)
        features = {}
        '''
        feeding features
        '''
        tf_features = tf.train.Features(feature = features)
        tf_example = tf.train.Example(features = tf_features)
        tf_serialized = tf_example.SerializeToString()
        writer.write(tf_serialized)
writer.close()
'''
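One thing worth checking in the loop above is that every call receives the same endPosition and that the file name depends on x, which is not defined in the snippet. This may or may not be the cause of the blank files. A minimal sketch of deriving a separate (start, stop) range per strand follows, assuming strand_index holds the start position of each strand; the helper name strand_ranges and the sample numbers are hypothetical.

```python
def strand_ranges(strand_starts, end_position):
    """Pair each strand's start with the next strand's start (or the overall end)."""
    starts = sorted(strand_starts)
    stops = starts[1:] + [end_position]
    return list(zip(starts, stops))


# Three strands starting at atoms 0, 120 and 310 in a 500-atom list (made-up numbers).
ranges = strand_ranges([0, 120, 310], 500)
```

Each pair could then be passed to a function like generate_TFrecord together with a distinct output name per strand, so that no two writers cover the same atoms or target the same file.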

how to repeat a script in another code with another input data

I am a newbie in Python. I wrote code that loads a txt file and writes the result to another txt file, and I want to repeat it for my other txt files, all of which are in the same folder. I want to load almost 300 txt files and process them this way, but I don't know how to do that. Thanks.
dat = np.loadtxt('test1.txt')
x = dat[:, 0]
y = dat[:, 2]
peak = LorentzianModel()
constant = ConstantModel()
pars = peak.guess(y, x=x)
pars.update( constant.make_params())
pars['c'].set(1.04066)
mod = peak + constant
out=mod.fit(y, pars, x=x)
comps = out.eval_components(x=x)
writer = (out.fit_report(min_correl=0.25))
path = '/Users/dellcity/Desktop/'
filename = 'output.txt'
with open(path + filename, 'wt') as f:
    f.write(writer)
You need to define a function that takes the filename as a parameter. In the main part of your program, create a loop that finds all the files you want to load and calls the function for each one, e.g.:
import os
import numpy as np
# assuming the model classes in the question come from lmfit
from lmfit.models import ConstantModel, LorentzianModel
def myFunction(filename):
    dat = np.loadtxt(filename)
    x = dat[:, 0]
    y = dat[:, 2]
    peak = LorentzianModel()
    constant = ConstantModel()
    pars = peak.guess(y, x=x)
    pars.update( constant.make_params())
    pars['c'].set(1.04066)
    mod = peak + constant
    out=mod.fit(y, pars, x=x)
    comps = out.eval_components(x=x)
    writer = (out.fit_report(min_correl=0.25))
    path = '/Users/dellcity/Desktop/'
    # separate name so the input filename parameter is not overwritten
    out_name = 'output.txt'
    # open in mode a = append, so every report is added to the same file
    with open(path + out_name, 'at') as f:
        f.write(writer)
# the parameter of os.listdir is the path to your file,
# change to the path of your data files
for filename in os.listdir('.'):
    if filename.endswith(".txt"):
        myFunction(filename)
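If one report per input file is preferred over appending everything to a single output.txt, the loop can derive the output name from the input name. A minimal sketch follows, with the fitting step replaced by a stand-in callable so the example stays self-contained; process_folder, the "_report.txt" suffix, and the sample file names are hypothetical.

```python
import os
import tempfile


def process_folder(in_dir, out_dir, process, suffix=".txt"):
    """Call process(input_path) for each matching file and write the returned text to its own report."""
    written = []
    for name in sorted(os.listdir(in_dir)):
        if not name.endswith(suffix):
            continue
        report = process(os.path.join(in_dir, name))
        out_path = os.path.join(out_dir, os.path.splitext(name)[0] + "_report.txt")
        with open(out_path, "wt") as f:
            f.write(report)
        written.append(out_path)
    return written


# Demonstration with throwaway folders; the lambda stands in for the real fit.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("test1.txt", "test2.txt", "skip.csv"):
        with open(os.path.join(src, name), "w") as f:
            f.write("1 2 3\n")
    paths = process_folder(src, dst, lambda p: "fitted " + os.path.basename(p))
    outputs = [os.path.basename(p) for p in paths]
```

Writing the reports to a different folder than the inputs keeps the loop from picking up its own output files on a later run.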

reading textfile returning empty variable in tensorflow

I have a text file which has 110 rows and 1024 columns of float values. I am trying to load the text file, but nothing seems to be read.
filename = '300_faults.txt'
filename_queue = tf.train.string_input_producer([filename])
reader = tf.TextLineReader()
_,a = reader.read(filename_queue)
#x = np.loadtxt('300_faults.txt') # working
#a = tf.constant(x,tf.float32) # working
model = tf.initialize_all_variables()
with tf.Session() as session:
    session.run(model)
    print(session.run(tf.shape(a)))
Printing the shape of the variable returns [].
Firstly, tf.shape(a) == [] doesn't mean that the variable is empty. All scalars and strings have shape [].
https://www.tensorflow.org/programmers_guide/dims_types
Maybe you can check the rank instead; it would be 0 for scalars and strings.
Other than that, string_input_producer creates a queue, and it needs additional wiring (a coordinator and queue runner threads) to make it work.
Please try this:
filename = '300_faults.txt'
filename_queue = tf.train.string_input_producer([filename])
reader = tf.TextLineReader()
_,a = reader.read(filename_queue)
#x = np.loadtxt('300_faults.txt') # working
#a = tf.constant(x,tf.float32) # working
model = tf.initialize_all_variables()
with tf.Session() as session:
    session.run(model)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    print(session.run(tf.shape(a)))
    print(session.run((a)))
    coord.request_stop()
    coord.join(threads)
