ImportError: cannot import name 'count_hi' from 'countstring' - python-3.x

I have two .py files. When I try to run starting.py it just says "ImportError: cannot import name 'count_hi' from 'countstring' (C:\Users\suspended.mirror\PycharmProjects\blankpage\venv\countstring.py)". How can I get it to run?
I've tried several variations: renaming the import, adding .py, and making sure both .py files are in the same directory, but the error persists.
starting.py
from countstring import count_hi
count_hi("testhi")
countstring.py
class countstring:
    def count_hi(str):
        k = 0
        i = 0
        n = 0
        is_hi = ""
        hi_count = 0
        while k < (len(str) - 1):
            first_letter = str[0 + i]
            second_letter = str[1 + n]
            is_hi = first_letter + second_letter
            i += 1
            n += 1
            k += 1
            if is_hi == "hi":
                hi_count += 1
        print(hi_count)

Python needs to know which directory that module is in. Try using a relative import:
from .countstring import count_hi
Hope this helps!

I actually figured it out myself.
The second module countstring.py didn't need the "class countstring:" at the top. With the class there, count_hi is an attribute of the class rather than a top-level name in the module, so "from countstring import count_hi" can't find it. The module wasn't creating an object anyway; it was just defining a function, so the function should live at module level.
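For reference, a minimal corrected countstring.py could look like this (a sketch that keeps the original print behaviour but collapses the three separate counters into one loop index):

# countstring.py - count_hi lives at module level, so it can be imported directly
def count_hi(s):
    hi_count = 0
    # slide a two-character window over the string
    for k in range(len(s) - 1):
        if s[k:k + 2] == "hi":
            hi_count += 1
    print(hi_count)

With that, "from countstring import count_hi" works as expected.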

Related

Running MPI python script in MPI azure ml pipeline

I'm trying to run a distributed python job through azure ML pipelines using the MpiStep pipeline class, referring to this example: https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb
I implemented the same, but even when I change the node count parameter in the MpiStep class, the running script always shows size (i.e. comm.Get_size()) as 1. Can you please help me with what I'm missing here? Is there any specific setup required on the cluster?
Code snippets:
Pipeline code snippet:
model_dir = model_ds.path('./' + saved_model_blob + '/', data_reference_name='saved_model_path').as_mount()
label_dir = model_ds.path('./' + model_label_blob + '/', data_reference_name='model_label_blob').as_mount()
input_images = result_ds.path('./' + score_blob_name + '/', data_reference_name='Input_images').as_mount()
output_container = 'abc'
inti_container = 'xyz'
distributed_batch_score_step = MpiStep(
    name="batch_scoring",
    source_directory=SCRIPT_FOLDER,
    script_name="batch_scoring_script_mpi.py",
    arguments=["--dataset_path", input_images,
               "--model_name", model_dir,
               "--label_dir", label_dir,
               "--intermediate_data_container", inti_container,
               "--output_container", output_container],
    compute_target=gpu_cluster,
    inputs=[input_images, model_dir, label_dir],
    pip_packages=["tensorflow", "tensorflow-gpu==1.13.1", "pillow", "azure-keyvault", "azure-storage-blob"],
    conda_packages=["mesa-libgl-cos6-x86_64", "mpi4py==3.0.2", "opencv=3.4.2", "scikit-learn=0.21.2"],
    use_gpu=True,
    allow_reuse=False,
    node_count=nodecount_param,
    process_count_per_node=1
)
Python Script code snippet:
def run(input_dataset, comm):
    rank = comm.Get_rank()
    size = comm.Get_size()
    print("Rank:", rank)
    print("Size:", size)  # always shows 1, even when the input node count is > 1
    print(MPI.Get_processor_name())
    file_names = get_file_names(args.dataset_path)
    file_names = sorted(file_names)  # sorted() returns a new list; it does not sort in place
    partition_size = len(file_names) // size
    print("partition_size-->", partition_size)
    partitioned_filenames = file_names[rank * partition_size: (rank + 1) * partition_size]
    print("RANK {} - is processing {} images out of the total {}".format(rank, len(partitioned_filenames),
                                                                         len(file_names)))
    # call to Function 01
    # call to Function 02
    img_names = score_df['image_name'].unique()
    output_batch = pd.DataFrame()
    for i in img_names:
        # call to Function 3
        output_batch = output_batch.append(pp_output, ignore_index=True)
    output_paths_list = comm.gather(output_batch, root=0)
    print("RANK {} - number of pre-aggregated output files {}".format(rank, len(output_batch)))
    print("saved in", currentDT + '\\' + 'data.csv')
    if rank == 0:
        print("RANK {} - number of aggregated output files {}".format(rank, len(output_paths_list)))
    print("RANK {} - end".format(rank))

if __name__ == "__main__":
    with tf.device('/GPU:0'):
        init()
        comm = MPI.COMM_WORLD
        run(args.dataset_path, comm)
It turned out the issue was with the package version: mpi4py was originally installed via conda with conda_packages=["mpi4py==3.0.2"]. It worked after switching the install to pip - pip_packages=["mpi4py"].
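As a quick way to verify an mpi4py install independent of the pipeline above, a minimal sanity check like this can be submitted as the step's script (the file name is just illustrative); each rank should report the same size, equal to node_count * process_count_per_node:

# mpi_check.py - prints this process's rank and the communicator size
from mpi4py import MPI

comm = MPI.COMM_WORLD
print("rank", comm.Get_rank(), "of", comm.Get_size(), "on", MPI.Get_processor_name())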

Python script to move oldest 1000 file into another directory

Here is my code. It reads the input from a config file, moves files to another directory based on a condition, and logs the information to a log file.
import shutil
import configparser
import logging.handlers
import os

# Reading the input configuration
config = configparser.ConfigParser()
config.read("config_input.ini")
src_filepath = config.get("Configuration Inputs", "src_filepath")
dst_filepath = config.get("Configuration Inputs", "dst_filepath")
log_file_name = config.get("Configuration Inputs", "log_file_name")
file_limit = int(config.get("Configuration Inputs", "file_limit"))

if not os.path.exists(dst_filepath):
    os.makedirs(dst_filepath)

onlyfiles_in_dst = next(os.walk(dst_filepath))[2]
file_count_indst = len(onlyfiles_in_dst)
onlyfiles_in_src = next(os.walk(src_filepath))[2]
file_count_insrc = len(onlyfiles_in_src)

def sorted_ls(src_filepath):
    mtime = lambda f: os.stat(os.path.join(src_filepath, f)).st_mtime
    return list(sorted(os.listdir(src_filepath), key=mtime))

move_list = sorted_ls(src_filepath)
#print(move_list)

if file_count_indst < file_limit:
    for mfile in move_list:
        shutil.move(src_filepath + '\\' + mfile, dst_filepath)

# Logging everything
logger = logging.getLogger()
logging.basicConfig(filename=log_file_name, format='%(asctime)s %(message)s', filemode='a')
logger.setLevel(logging.INFO)
logger.info('Number of files moved from source ' + str(len(move_list)))
But the problem is I want to move only the oldest 1000 files from source to destination.
Something like
ls -lrt | head -n 1000
which I cannot do as I am running this script on Windows.
Please suggest a proper way to do it.
Also, please suggest how I can put this under a user-defined class so I can use it in some other program.
Can't a simple counter be the solution?

if file_count_indst < file_limit:
    count = 0
    for mfile in move_list:
        shutil.move(src_filepath + '\\' + mfile, dst_filepath)
        count = count + 1
        if count == 1000:
            break
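To address the second part of the question, here is one way the logic could be wrapped in a reusable class. This is a sketch under the same assumptions as the original script (files ordered by mtime, oldest first); the class and method names are made up for illustration:

import os
import shutil

class OldestFileMover:
    """Move the oldest `limit` files from src to dst (illustrative helper)."""

    def __init__(self, src, dst, limit=1000):
        self.src = src
        self.dst = dst
        self.limit = limit

    def oldest_files(self):
        # plain files only, oldest first by modification time
        names = [f for f in os.listdir(self.src)
                 if os.path.isfile(os.path.join(self.src, f))]
        return sorted(names, key=lambda f: os.stat(os.path.join(self.src, f)).st_mtime)

    def move(self):
        os.makedirs(self.dst, exist_ok=True)
        # slicing replaces the manual counter from the answer above
        to_move = self.oldest_files()[:self.limit]
        for name in to_move:
            shutil.move(os.path.join(self.src, name), self.dst)
        return len(to_move)

# usage from another program:
# moved = OldestFileMover(src_filepath, dst_filepath, file_limit).move()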

How to create folder tree from list with paths

I am trying to put every path of a folder structure I need created into a list and then create all of them with os.makedirs(), but something goes wrong: only the head folders are created, not the subfolders.
def output_folders(trcpaths):
    #trcpaths is a list with several paths, example: ['/home/usr/folder1', '/home/usr/folder2']
    global outputfolders
    outputfolders = []

    #Create Paths
    for x, j in enumerate(trcpaths):
        for i in os.listdir(trcpaths[x]):
            if i.endswith('trc'):
                folderpath1 = (j + '/' + i).split('.')[0] #/home/usr/folder1/outputfolder
                folderpath2 = folderpath1 + '/Steps' #/home/usr/folder1/outputfolder/Steps
                folderpath3 = folderpath2 + '/Step_1' #/home/usr/folder1/outputfolder/Steps/Step_1
                folderpath4 = folderpath2 + '/Step_2'
                folderpath5 = folderpath2 + '/Step_3'
                folderpath6 = folderpath2 + '/Step_4'
                folderpath7 = folderpath2 + '/Threshold'
                outputfolders.append(folderpath1)
                outputfolders.append(folderpath2)
                outputfolders.append(folderpath3)
                outputfolders.append(folderpath4)
                outputfolders.append(folderpath5)
                outputfolders.append(folderpath6)
                outputfolders.append(folderpath7)

    #Create Folders
    for j, i in enumerate(outputfolders):
        print(i)
        if os.path.exists(i):
            if j == 0:
                input('The Output-Folder already exists! Overwrite?')
            shutil.rmtree(i)
            os.makedirs(i)
When I print(i), the right folder paths are printed, but only the "head" folder paths like /home/usr/folder1/outputfolder are actually created, and none of the subsequent ones. Why?
This is what I get:
/home/usr/folder1/outputfolder
/home/usr/folder2/outputfolder
But this is what I need:
/home/usr/folder1/outputfolder
/home/usr/folder1/outputfolder/Steps
/home/usr/folder1/outputfolder/Steps/Step_1
/home/usr/folder1/outputfolder/Steps/Step_2
/home/usr/folder1/outputfolder/Steps/Step_3
/home/usr/folder1/outputfolder/Steps/Step_4
/home/usr/folder1/outputfolder/Steps/Threshold
/home/usr/folder2/outputfolder
/home/usr/folder2/outputfolder/Steps
/home/usr/folder2/outputfolder/Steps/Step_1
/home/usr/folder2/outputfolder/Steps/Step_2
/home/usr/folder2/outputfolder/Steps/Step_3
/home/usr/folder2/outputfolder/Steps/Step_4
/home/usr/folder2/outputfolder/Steps/Threshold
To keep your logic and your coding: with this code,

for j, i in enumerate(outputfolders):
    print(i)
    if os.path.exists(i):
        if j == 0:
            input('The Output-Folder already exists! Overwrite?')
        shutil.rmtree(i)
        os.makedirs(i)

you don't create the folders; you only delete and recreate a folder if it already exists.
I'll add an else to complete the operation:

for j, i in enumerate(outputfolders):
    print(i)
    if os.path.exists(i):
        if j == 0:
            input('The Output-Folder already exists! Overwrite?')
        shutil.rmtree(i)
        os.makedirs(i)
    else:
        os.makedirs(i)
Try this (tested on my Windows machine, but it should work on Linux as well):

import os

NUM_OF_STEPS = 5

def make_output_folders(trc_paths):
    output_folders = []
    for idx, path in enumerate(trc_paths):
        for leaf in os.listdir(path):
            if leaf.endswith('trc') and os.path.isdir(os.path.join(path, leaf)):
                trc_folder = os.path.join(path, leaf)
                output_folders.append(os.path.join(trc_folder, 'output_folder', 'Steps'))
                steps_folder = output_folders[-1]
                for x in range(1, NUM_OF_STEPS):
                    output_folders.append(os.path.join(steps_folder, 'Step_{}'.format(x)))
                output_folders.append(os.path.join(trc_folder, 'output_folder', 'Threshold'))
    for _path in output_folders:
        print(_path)
        if not os.path.exists(_path):
            os.makedirs(_path)

# 'folder_1' contains a sub folder named '1_trc'
# 'folder_2' contains a sub folder named '2_trc'
make_output_folders(['c:\\temp\\55721430\\folder1', 'c:\\temp\\55721430\\folder2'])
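Worth noting for both versions: os.makedirs creates any missing intermediate directories, so it is enough to create just the leaf folders. A minimal sketch using the example paths from the question:

import os

base = '/home/usr/folder1/outputfolder/Steps'
for leaf in ['Step_1', 'Step_2', 'Step_3', 'Step_4', 'Threshold']:
    # exist_ok=True makes reruns safe instead of raising FileExistsError
    os.makedirs(os.path.join(base, leaf), exist_ok=True)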

Python save panda to csv files separately within FOR loop

I would like to save some pandas DataFrame downloads separately as csv files. I am getting an error on the last line.
Is the syntax off?
Kind regards
sortedByISIN = pd.DataFrame()
for i in data['isin'].unique():
    print('Adding ' + i)
    d1 = data[data['isin'] == i]
    d1['next_signal'] = d1['signal'].shift(-1)
    #Shift x periods in the future
    d1['futprice'] = d1['mid'].shift(-6)
    d1['futT'] = d1['creationTimeStamp'].shift(-6)
    d1['move'] = d1.apply(lambda row: (row['futprice'] - row['mid'])/row['mid'] * 10000 if row['futT'] - row['creationTimeStamp'] < 300000 else 0, axis=1)
    d1['signal_transition'] = d1['next_signal'] - d1['signal']
    sortedByISIN = sortedByISIN.append(d1)
    sortedByISIN['period'] = np.floor(sortedByISIN.creationTimeStamp/3600000)
    sortedByISIN.to_csv('Book'%i.csv')  # <- SyntaxError: invalid string formatting
You can also use:
sortedByISIN.to_csv('Book' + str(i) + '.csv')
Or use format:
sortedByISIN.to_csv('Book{}.csv'.format(i))
And for Python 3.6+ it is possible to use f-strings:
sortedByISIN.to_csv(f'Book{i}.csv')
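A side note on the accumulation loop itself: DataFrame.append was removed in pandas 2.0, so on current pandas the append line would need pd.concat instead:
sortedByISIN = pd.concat([sortedByISIN, d1])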

How to convert cmudict-0.7b or cmudict-0.7b.dict in to FST format to use it with phonetisaurus?

I am looking for a simple procedure to generate an FST (finite state transducer) from cmudict-0.7b or cmudict-0.7b.dict, to be used with phonetisaurus.
I tried the following set of commands (phonetisaurus aligner, Google NGramLibrary, and phonetisaurus arpa2wfst) and was able to generate an FST, but it didn't work. I am not sure where I made a mistake or missed a step. I suspect the very first command, phonetisaurus-align, is not correct.
phonetisaurus-align --input=cmudict.dict --ofile=cmudict/cmudict.corpus --seq1_del=false
ngramsymbols < cmudict/cmudict.corpus > cmudict/cmudict.syms
/usr/local/bin/farcompilestrings --symbols=cmudict/cmudict.syms --keep_symbols=1 cmudict/cmudict.corpus > cmudict/cmudict.far
ngramcount --order=8 cmudict/cmudict.far > cmudict/cmudict.cnts
ngrammake --v=2 --bins=3 --method=kneser_ney cmudict/cmudict.cnts > cmudict/cmudict.mod
ngramprint --ARPA cmudict/cmudict.mod > cmudict/cmudict.arpa
phonetisaurus-arpa2wfst-omega --lm=cmudict/cmudict.arpa > cmudict/cmudict.fst
I tried the FST with phonetisaurus-g2p as follows:
phonetisaurus-g2p --model=cmudict/cmudict.fst --nbest=3 --input=HELLO --words
But it didn't return anything.
I'd appreciate any help on this.
It is very important to keep the dictionary in the right format. Phonetisaurus is very sensitive about that: it requires the word and the phonemes to be tab-separated; spaces will not work. It also does not allow the pronunciation-variant numbers CMUSphinx uses, like (2) or (3). You need to clean up the dictionary, for example with a simple Python script, before feeding it into phonetisaurus. Here is the one I use:
#!/usr/bin/python
import sys

if len(sys.argv) != 3:
    print("Split the list on train and test sets")
    print()
    print("Usage: traintest.py file split_count")
    exit()

infile = open(sys.argv[1], "r")
outtrain = open(sys.argv[1] + ".train", "w")
outtest = open(sys.argv[1] + ".test", "w")

cnt = 0
split_count = int(sys.argv[2])
for line in infile:
    items = line.split()
    # strip CMUdict variant markers like "word(2)" down to "word"
    if items[0][-1] == ')':
        items[0] = items[0][:-3]
    if items[0].find("_") > 0:
        continue
    # re-join with a tab between the word and its phonemes
    line = items[0] + '\t' + " ".join(items[1:]) + '\n'
    # every split_count-th line (offset 3) goes to the test set
    if cnt % split_count == 3:
        outtest.write(line)
    else:
        outtrain.write(line)
    cnt = cnt + 1
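For example (the file name is just illustrative), running
python traintest.py cmudict-0.7b.dict 10
writes a tab-separated cmudict-0.7b.dict.train with roughly nine tenths of the entries and a cmudict-0.7b.dict.test with the rest; the cleaned .train file is then what goes into phonetisaurus-align.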
