python: multiple functions or abstract classes when dealing with data flow requirement - python-3.x

I have more of a design question than a coding problem, and I am not sure how to approach it. I have a script, preprocessing.py, that reads a .csv file with a text column that I would like to preprocess by removing punctuation, special characters, etc.
So far I have written a class with several methods, as follows:
import pandas as pd
class Preprocessing(object):
    def __init__(self, file):
        self.my_data = pd.read_csv(file)
    def remove_punctuation(self):
        self.my_data['text'] = self.my_data['text'].str.replace('#', '')
    def remove_hyphen(self):
        self.my_data['text'] = self.my_data['text'].str.replace('-', '')
    def remove_words(self):
        self.my_data['text'] = self.my_data['text'].str.replace('reference', '')
    def save_data(self):
        self.my_data.to_csv('my_data.csv')
def preprocessing(file_my):
    # run every step in a fixed order, then save the result
    f = Preprocessing(file_my)
    f.remove_punctuation()
    f.remove_hyphen()
    f.remove_words()
    f.save_data()
    return f
if __name__ == '__main__':
    preprocessing('/path/to/file.csv')
Although it works fine, I would like to be able to extend the code easily and have smaller classes instead of one large class, so I decided to use an abstract class:
import pandas as pd
from abc import ABC, abstractmethod
my_data = pd.read_csv('/Users/kgz/Desktop/german_web_scraping/file.csv')
class Preprocessing(ABC):
    @abstractmethod
    def processor(self):
        pass
class RemovePunctuation(Preprocessing):
    def processor(self):
        return my_data['text'].str.replace('#', '')
class RemoveHyphen(Preprocessing):
    def processor(self):
        return my_data['text'].str.replace('-', '')
class Removewords(Preprocessing):
    def processor(self):
        return my_data['text'].str.replace('reference', '')
# each subclass is applied to the original data independently
final_result = [cls().processor() for cls in Preprocessing.__subclasses__()]
print(final_result)
So now each class is responsible for one task, but there are a few issues I do not know how to handle, since I am new to abstract classes. First, I am reading the file outside the classes, and I am not sure whether that is good practice. If not, should I pass the data as an argument to the processor function, or have another class that is responsible for reading the data?
Second, having one class with several functions allowed for a flow, so every transformation happened in order (i.e., first punctuation is removed, then hyphens are removed, etc.), but I do not know how to handle this order and dependency with abstract classes.
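One possible way to address both concerns, given here only as a minimal sketch: each step receives the data as an argument and returns the transformed data, and an explicit list fixes the order in which steps run. The names Step, RemoveSubstring and run_pipeline are hypothetical, and plain Python strings stand in for the pandas column so that the sketch has no dependencies; with pandas, the body of process would instead call str.replace on the Series as in the question.

```python
from abc import ABC, abstractmethod


class Step(ABC):
    """A single preprocessing step (hypothetical name)."""

    @abstractmethod
    def process(self, texts):
        """Return a new list of cleaned strings."""


class RemoveSubstring(Step):
    """Removes every occurrence of one fixed substring."""

    def __init__(self, target):
        self.target = target

    def process(self, texts):
        return [text.replace(self.target, "") for text in texts]


def run_pipeline(texts, steps):
    # The output of each step is the input of the next one,
    # so the order of the list is the order of the transformations.
    for step in steps:
        texts = step.process(texts)
    return texts


steps = [RemoveSubstring("#"), RemoveSubstring("-"), RemoveSubstring("reference")]
cleaned = run_pipeline(["#a-b reference", "c-d"], steps)
```

Because the data is passed in rather than read from a module-level variable, reading the file can live wherever is convenient (for example in the calling script), and the ordering no longer depends on the order in which subclasses happen to be defined.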

Related

Python create dynamic class and set multi bases from imported module

I found several examples here, but none is exactly what I am looking for, so I will try to explain.
I tried to achieve my result based on this answer, but it is not what I am looking for:
How can I dynamically create derived classes from a base class
I have a module that holds many classes.
Importing the module:
import importlib
# many classes are available here
forms = importlib.import_module('my_forms')
Now, based on forms, I need to create a new class and add all classes available in forms as bases of my new class.
This is what I tried, but I cannot find a way to assign the bases:
import inspect
def create_DynamicClass():
    class DynamicClass(BaseClass):
        pass
    for form_name, class_name in inspect.getmembers(forms):
        for i in class_name():
            # here the code to add all bases to DynamicClass
            pass
    return DynamicClass()
Example of how the my_forms module looks:
class MyClass1(BaseClass):
    attr1 = 1
    attr2 = 2
    @coroutine
    def prepare(self):
        # some code for each class
        pass
class MyClass2(BaseClass):
    attr3 = 3
    attr4 = 4
    @coroutine
    def prepare(self):
        # some code for each class
        pass
class MyClass3(BaseClass):
    attr5 = 5
    attr6 = 6
    @coroutine
    def prepare(self):
        # some code for each class
        pass
The result I want to achieve is the following. I wrote a static class to show the desired result, but it needs to be dynamic.
I need to create my class dynamically because the my_forms module can contain any number of classes.
# inherits all classes from my_forms module
class MyResultClass(MyClass1, MyClass2, MyClass3):
    # here get all available attributes from all classes
    @coroutine
    def prepare(self):
        # the prepare function of each class needs to run as well
        yield MyClass1().prepare()
        yield MyClass2().prepare()
        yield MyClass3().prepare()
Simply declare the dynamic class with all of your base classes. To do so, put all of your base classes in a list, and unpack the list in the class definition statement with the * operator like this:
def createClass(baseClasses):
    class NewClass(*baseClasses):
        pass
    return NewClass
DynamicClass = createClass([class1, class2, ...])
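As a rough illustration of how this could be combined with collecting the classes from a module, the sketch below gathers every subclass of a base class with inspect.getmembers and passes them as bases to the built-in type(). The module object, BaseClass, and the function name create_class are stand-ins invented for the example; the real my_forms module and its base class are not shown in the question.

```python
import inspect
import types


class BaseClass:
    """Stand-in for the real base class, which the question does not show."""


# Hypothetical stand-in for the imported my_forms module.
my_forms = types.ModuleType("my_forms")


class MyClass1(BaseClass):
    attr1 = 1


class MyClass2(BaseClass):
    attr3 = 3


my_forms.MyClass1 = MyClass1
my_forms.MyClass2 = MyClass2


def create_class(module, base):
    # Keep only the classes that derive from `base`, excluding `base` itself.
    bases = tuple(
        cls
        for _, cls in inspect.getmembers(module, inspect.isclass)
        if issubclass(cls, base) and cls is not base
    )
    # type(name, bases, namespace) builds a class at runtime.
    return type("DynamicClass", bases, {})


DynamicClass = create_class(my_forms, BaseClass)
```

The resulting class inherits the attributes of every collected class through the normal method resolution order, so DynamicClass.attr1 and DynamicClass.attr3 are both available.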
I have managed to find a solution and will post it here. Any recommendations to improve it are appreciated.
forms = importlib.import_module('my_forms')
class Form(BaseForm):
    @coroutine
    def prepare(self):
        for form_name, class_name in inspect.getmembers(forms, inspect.isclass):
            try:
                yield class_name().prepare()
            except TypeError:
                continue
def createClass(meta):
    for form_name, class_name in inspect.getmembers(forms, inspect.isclass):
        try:
            # append each class found in the module to the bases of Form
            Form.__bases__ += (class_name, )
            for field in class_name():
                field_type = fl.SelectField() if hasattr(field, 'choices') else fl.StringField()
                setattr(Form, field.name, field_type)
        except TypeError:
            continue
    return Form(meta=meta)

how to apply factory pattern in python?

I have a file reader class (sample code below) that defines a CHUNK_SIZE class member and an iterator function to read data from a given file. I want to create objects of this class with different chunk sizes. I was thinking of using the factory pattern or something similar to create objects, as in Java or other languages. I created a factory class below so that I can do something like myobj = get_file_reader(2048). Does it make sense to do this in Python, or does Python have a different convention for this? My idea is that if I need to add specific file readers, say csv_reader, text_reader or image_reader, they can inherit from my file_reader class.
Factory class:
class File_Reader_Factory:
    @staticmethod
    def get_File_Reader(size):
        if size is None:
            raise ValueError('size must not be None')
        return File_Reader(size)
class File_Reader:
    def __init__(self, chunk_size):
        # initialize code
        self.CHUNK_SIZE = chunk_size
    def read_file_chunks(self, file_object):
        """Lazy function (generator) to read a file piece by piece."""
        while True:
            data = file_object.read(self.CHUNK_SIZE)
            if not data:
                break
            yield data
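For comparison, a factory in Python is often just a function that looks up a class in a dictionary, with no separate factory class. The following is a minimal sketch under that assumption; the names FileReader, CsvReader, READERS and the kind parameter are hypothetical and only illustrate how specialised readers could inherit from a common reader.

```python
import io


class FileReader:
    """Reads a file-like object in fixed-size chunks."""

    def __init__(self, chunk_size):
        if chunk_size is None or chunk_size <= 0:
            raise ValueError("chunk_size must be a positive integer")
        self.chunk_size = chunk_size

    def read_chunks(self, file_object):
        """Lazy generator that yields the file piece by piece."""
        while True:
            data = file_object.read(self.chunk_size)
            if not data:
                break
            yield data


class CsvReader(FileReader):
    """Hypothetical specialised reader; CSV-specific behaviour would go here."""


# Registry mapping a short name to the class to instantiate.
READERS = {"file": FileReader, "csv": CsvReader}


def get_file_reader(chunk_size, kind="file"):
    try:
        reader_cls = READERS[kind]
    except KeyError:
        raise ValueError(f"unknown reader kind: {kind!r}") from None
    return reader_cls(chunk_size)


reader = get_file_reader(4)
chunks = list(reader.read_chunks(io.StringIO("abcdefghij")))
```

Adding a new reader type then only requires a new subclass and one more entry in the registry.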

Python pro way to make an abstract class allowing each child class to define its own attributes, Python3

I have to model several cases, and each case is realised by a class. I want to make sure that each class has two methods, get_input() and run(). So, in my opinion, I can write a CaseBase class where these two methods are decorated with @abstractmethod. Any child class then has to implement these two methods, which is exactly my goal.
However, due to the nature of my work, each case covers a distinct subject, and it is not easy to define a fixed group of attributes. The attributes should be defined in the __init__ method of a class. That means I don't know exactly which attributes to write in the CaseBase class. All I know is that all child cases must have some common attributes, like self._common_1 and self._common_2.
Therefore, my idea is to also decorate the __init__ method of the CaseBase class with @abstractmethod. See my code below.
from abc import ABC, abstractmethod
from typing import Dict, List
class CaseBase(ABC):
    @abstractmethod
    def __init__(self):
        self._common_1: Dict[str, float] = {}
        self._common_2: List[float] = []
        ...
    @abstractmethod
    def get_input(self, input_data: dict):
        ...
    @abstractmethod
    def run(self):
        ...
class CaseA(CaseBase):
    def __init__(self):
        self._common_1: Dict[str, float] = {}
        self._common_2: List[float] = []
        self._a1: int = 0
        self._a2: str = ''
    def get_input(self, input_data: dict):
        self._common_1 = input_data['common_1']
        self._common_2 = input_data['common_2']
        self._a1 = input_data['a1']
        self._a2 = input_data['a2']
    def run(self):
        print(self._common_1)
        print(self._common_2)
        print(self._a1)
        print(self._a2)
def main():
    case_a = CaseA()
    case_a.get_input(input_data={'common_1': {'c1': 1.1}, 'common_2': [1.1, 2.2], 'a1': 2, 'a2': 'good'})
    case_a.run()
if __name__ == '__main__':
    main()
My question: Is my way a good Python style?
I followed many Python tutorials about how to make an abstract class and child classes. They all give examples where a fixed group of attributes is defined in the __init__ method of the base class. I have also seen approaches that use super().__init__ in the child class to change the attributes defined in the base class or to add new attributes. But I am not sure if that is better (more pro) than my way.
Thanks.
You mostly used the abc module in Python 3.10 correctly, but it doesn't make sense to decorate the constructor with @abstractmethod; it's unnecessary. Each class, derived or not, can and will have its own constructor. If you don't want to duplicate the parent's code but want to do further initialization in the child class constructor, you can call super().__init__(args) within the child class to invoke the constructor of its immediate parent.
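A minimal sketch of that suggestion, reusing the class and attribute names from the question: the base class keeps a regular __init__ that sets the common attributes, and the child calls super().__init__() before adding its own. Returning a tuple from run() instead of printing is a change made here only to keep the example easy to check.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class CaseBase(ABC):
    def __init__(self):
        # Common attributes are defined once, in the base class.
        self._common_1: Dict[str, float] = {}
        self._common_2: List[float] = []

    @abstractmethod
    def get_input(self, input_data: dict):
        ...

    @abstractmethod
    def run(self):
        ...


class CaseA(CaseBase):
    def __init__(self):
        super().__init__()  # sets _common_1 and _common_2
        self._a1: int = 0
        self._a2: str = ""

    def get_input(self, input_data: dict):
        self._common_1 = input_data["common_1"]
        self._common_2 = input_data["common_2"]
        self._a1 = input_data["a1"]
        self._a2 = input_data["a2"]

    def run(self):
        return (self._common_1, self._common_2, self._a1, self._a2)


case_a = CaseA()
case_a.get_input({"common_1": {"c1": 1.1}, "common_2": [1.1, 2.2], "a1": 2, "a2": "good"})
result = case_a.run()
```

CaseBase still cannot be instantiated directly, because get_input and run remain abstract.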

Accessing variables from a method in class A and using it in Class B in python3.5

I have a BaseClass and two classes (Volume and testing) which inherit from BaseClass. The class "Volume" uses a method "driving_style" from another Python module. I am trying to write another method, "test_Score", which needs to access variables computed in the method "driving_style" so that I can use them in further computations. The results will then be used by the class "testing", as shown.
from training import Accuracy
import ComputeData
import model
class BaseClass(object):
    def __init__(self, connections):
        self.Type = 'Stock'
        self.A = connections.A
        self.log = self.B.log
    def getIDs(self, assets):
        ids = pandas.Series(assets.ids, index=assets.B)
        return ids
class Volume(BaseClass):
    def __init__(self, connections):
        BaseClass.__init__(self, connections)
        self.daystrade = 30
        self.high_low = True
    def learning(self, data, rootClass):
        params.daystrade = self.daystrade
        params.high_low = self.high_low
        style = Accuracy.driving_style()
        return self.Object(data.universe, style)
class testing(BaseClass):
    def __init__(self, connections):
        BaseClass.__init__(self, connections)
    def learning(self, data, rootClass):
        test_score = Accuracy.test_score()
        return self.Object(data.universe, test_score)
def driving_style(date, modelDays, params):
    daystrade = params.daystrade
    high_low = params.high_low
    DriveDays = model.DateRange(date, params.daystrade)
    StopBy = ComputeData.instability(DriveDays)
    if high_low:
        style = ma.average(StopBy)
    else:
        style = ma.mean(StopBy)
    return style
def test_score(date, modelDays, params):
    # want to access the following from the method driving_style:
    # DriveDays =
    # StopBy =
    # return test_score, which I compute using the values DriveDays and StopBy
    # and then use in the method learning inside the class testing,
    # which inherits some params from the BaseClass
    ...
You can't use locals from a call to a function that was made elsewhere and has already returned.
A bad solution is to store them as globals that you can read from later (but that get replaced on every new call). A better solution might be to return the relevant info to the caller along with the existing return values (return style, DriveDays, StopBy) and somehow get it to where it needs to go. If necessary, you could wrap the function into a class and store the computed values as attributes on an instance of the class, while keeping the return type the same.
But the best solution is probably to refactor, so the stuff you want is computed by dedicated methods that you can call directly from test_score and driving_style independently, without duplicating code or creating complicated state dependencies.
In short, basically any time you think you need to access locals from another function, you're almost certainly experiencing an XY problem.
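The refactoring described above could look roughly like the sketch below: the intermediate values are computed by small dedicated functions that both callers use, so neither needs the other's locals. Everything here is illustrative. compute_drive_days and compute_stop_by are hypothetical stand-ins for model.DateRange and ComputeData.instability, dates are plain integers, the arithmetic is arbitrary, and compute_test_score plays the role of test_score from the question.

```python
from statistics import mean


def compute_drive_days(date, days_trade):
    # Hypothetical stand-in for model.DateRange(date, params.daystrade).
    return [date - offset for offset in range(days_trade)]


def compute_stop_by(drive_days):
    # Hypothetical stand-in for ComputeData.instability(DriveDays).
    return [day % 7 for day in drive_days]


def driving_style(date, days_trade):
    stop_by = compute_stop_by(compute_drive_days(date, days_trade))
    return mean(stop_by)


def compute_test_score(date, days_trade):
    # Recomputes the same intermediates through the shared helpers
    # instead of reaching into driving_style's local variables.
    drive_days = compute_drive_days(date, days_trade)
    stop_by = compute_stop_by(drive_days)
    return max(stop_by) - min(stop_by)


style = driving_style(10, 3)
score = compute_test_score(10, 3)
```

If the intermediate computation is expensive, the caller can compute drive_days and stop_by once and pass them to both functions as arguments instead.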

QObject and QThread relations

I have a PyQt4 GUI which allows me to import multiple .csv files. I've created a loop that goes through a list of tuples, each with the following parameters: tup = (filename + location of file, filename, bool, bool, set of dates in file).
I've created several classes that my GUI frequently refers to in order to pull parameters from a project's profile. Let's call this class profile(). I also have another class with a lot of formatting functions, for datetimes, text edits, etc. Let's call this class MyFormatting(). Then I created a QThread class that is instantiated for each file in the list; this one is called Import_File(QThread). Let's say this class takes a few parameters in __init__(self, tup).
My goal is to make an independent instance of MyFormatting() and profile() for each Import_File(QThread). I am trying to get my head around how to use the QObject capabilities to solve this, but I keep getting the error that the thread is being destroyed while still running.
for tup in importedlist:
    importfile = Import_File(tup)
    self.connect(importfile, QtCore.SIGNAL('savedfile(PyQt_PyObject)'), self.printstuffpassed)
    importfile.start()
I was thinking of having the two classes be declared as
class MyFormatting(QObject):
    def __init__(self):
        QObject.__init__(self)
    def func1(self, stuff):
        dostuff
    def func2(self):
        morestuff
class profile(QObject):
    def __init__(self):
        QObject.__init__(self)
    def func11(self, stuff):
        dostuff
    def func22(self):
        morestuff
And for the QThread:
class Import_File(QThread):
    def __init__(self, tup):
        QThread.__init__(self)
        common_calc_stuff = self.calc(tup[4])
        f = open(tup[0] + '.csv', 'w')
        self.tup = tup
        # this is where I thought of pulling an instance just for this thread
        self.MF = MyFormatting()
        self.MF_thread = QtCore.QThread()
        self.MF.moveToThread(self.MF_thread)
        self.MF_thread.start()
        self.prof = profile()
        self.prof_thread = QtCore.QThread()
        self.prof.moveToThread(self.prof_thread)
        self.prof_thread.start()
    def func33(self, stuff):
        dostuff
        self.prof.func11(self.tup[4])
    def func44(self):
        morestuff
    def run(self):
        if self.tup[3] == True:
            self.func33
            self.MF.func2
        elif self.tup[3] == False:
            self.func44
        if self.tup[2] == True:
            self.prof.func22
        self.emit(QtCore.SIGNAL('savedfile()'))
Am I thinking about this in totally the wrong way? How can I keep roughly the same structure that I have in the code and still implement the multithreading without having the same resource accessed at the same time, which I think is the reason why my GUI keeps crashing? Or how can I make sure that each instance of those objects gets shut down so that it doesn't interfere with the other instances?

Resources