Combinable filters
I have an initial pool of subjects, and I need to apply a set of general criteria to retain a smaller subset (SS1) of subjects. Then I need to divide this smaller subset (SS1) into yet finer subsets (SS1-A, SS1-B, and the rest). One set of specific criteria will be applied to SS1 to obtain SS1-A, another set of specific criteria will be applied to obtain SS1-B, and the rest will be discarded. The set of criteria/filters needs to be flexible: I would like to add, remove, or combine filters for testing and development, as well as for future client requests.
I created the small piece of structural code below to help me understand and test the implementation of the template method and filter patterns. I use a list and some filters instead of an actual subject pool, but the idea is similar: the list items can be seen as subjects with different attributes.
from abc import ABC, abstractmethod


class DataProcessing(ABC):
    def __init__(self, my_list):
        self.my_list = my_list

    def data_processing_steps(self):
        self.remove_duplicate()
        self.general_filtering()
        self.subject_specific_filtering()
        self.return_list()

    def remove_duplicate(self):
        self.my_list = set(list(self.my_list))

    @abstractmethod
    def general_filtering(self): pass

    def subject_specific_filtering(self): pass

    def return_list(self):
        return self.my_list


class DataProcessing_Project1(DataProcessing):
    def general_filtering(self):
        maxfilter_obj = MaxFilter()
        minfilter_obj = MinFilter()
        CombinedFilter_obj = CombinedFilter(maxfilter_obj, minfilter_obj)
        self.my_list = CombinedFilter_obj.filter(self.my_list)


class DataProcessing_Project1_SubjectA(DataProcessing_Project1):
    def subject_specific_filtering(self):
        twentythreefilter_obj = TwentyThreeFilter()
        self.my_list = twentythreefilter_obj.filter(self.my_list)


class DataProcessing_Project1_SubjectB(DataProcessing_Project1): pass


class Criteria():
    @abstractmethod
    def filter(self, request):
        raise NotImplementedError('Should have implemented this.')


class CombinedFilter(Criteria):
    def __init__(self, filter1, filter2):
        self.filter1 = filter1
        self.filter2 = filter2

    def filter(self, this_list):
        filteredList1 = self.filter1.filter(this_list)
        filteredList2 = self.filter2.filter(filteredList1)
        return filteredList2


class MaxFilter(Criteria):
    def __init__(self, max_val=100):
        self.max_val = max_val

    def filter(self, this_list):
        filteredList = []
        for item in this_list:
            if item <= self.max_val:
                filteredList.append(item)
        return filteredList


class MinFilter(Criteria):
    def __init__(self, min_val=10):
        self.min_val = min_val

    def filter(self, this_list):
        filteredList = []
        for item in this_list:
            if item >= self.min_val:
                filteredList.append(item)
        return filteredList


class TwentyThreeFilter(Criteria):
    def __init__(self): pass

    def filter(self, this_list):
        filteredList = []
        for item in this_list:
            if item != 23:
                filteredList.append(item)
        return filteredList


this_list = [1, 2, 23, 4, 34, 456, 234, 23, 3457, 5, 2]

ob = MaxFilter()
this_list2 = ob.filter(this_list)
print(this_list2)

ob2 = MinFilter()
this_list3 = ob2.filter(this_list2)
print(this_list3)

ob3 = CombinedFilter(ob, ob2)
this_list4 = ob3.filter(this_list)
print(this_list4)

ob4 = DataProcessing_Project1(my_list=this_list)
ob4.data_processing_steps()
print(ob4.return_list())

ob5 = DataProcessing_Project1_SubjectA(my_list=this_list)
ob5.data_processing_steps()
print(ob5.return_list())
# Error: CombinedFilter.__init__() only accepts two filters, so this raises a TypeError
twentythreefilter_obj = TwentyThreeFilter()
ob6 = CombinedFilter(ob, ob2, twentythreefilter_obj)
this_list5 = ob6.filter(this_list)
print(this_list5)
I am fairly new to design patterns. I wonder if this is implemented correctly, and whether there are areas that can be improved.
Also, for ob6 I would like to pass an additional filter as a parameter to CombinedFilter(), but I am not sure how to define __init__ and filter() within the CombinedFilter class so that it can accommodate any number of additional filters.
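For context, this is roughly the kind of variadic CombinedFilter I have in mind, assuming the same Criteria interface as above (part of my question is whether this is the right way to do it):

class CombinedFilter(Criteria):
    def __init__(self, *filters):
        # Accept any number of filter objects.
        self.filters = filters

    def filter(self, this_list):
        # Run each filter on the output of the previous one.
        for f in self.filters:
            this_list = f.filter(this_list)
        return this_list


ob6 = CombinedFilter(MaxFilter(), MinFilter(), TwentyThreeFilter())
print(ob6.filter(this_list))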
Tags: python, python-3.x, object-oriented
asked by KubiK888, edited by 200_success
2 Answers
Your approach is suitable for a language like Java. But in Python? Stop writing classes! This is especially true for your task, where much of the code consists of do-nothing placeholders (the pass-only definitions below) just to allow functionality to be implemented by subclasses.
from abc import ABC, abstractmethod


class DataProcessing(ABC):
    def __init__(self, my_list):
        self.my_list = my_list

    def data_processing_steps(self):
        self.remove_duplicate()
        self.general_filtering()
        self.subject_specific_filtering()
        self.return_list()

    def remove_duplicate(self):
        self.my_list = set(list(self.my_list))

    @abstractmethod
    def general_filtering(self): pass

    def subject_specific_filtering(self): pass

    def return_list(self):
        return self.my_list


class DataProcessing_Project1(DataProcessing):
    def general_filtering(self):
        maxfilter_obj = MaxFilter()
        minfilter_obj = MinFilter()
        CombinedFilter_obj = CombinedFilter(maxfilter_obj, minfilter_obj)
        self.my_list = CombinedFilter_obj.filter(self.my_list)


class DataProcessing_Project1_SubjectA(DataProcessing_Project1):
    def subject_specific_filtering(self):
        twentythreefilter_obj = TwentyThreeFilter()
        self.my_list = twentythreefilter_obj.filter(self.my_list)


class DataProcessing_Project1_SubjectB(DataProcessing_Project1): pass
Furthermore, it's unnatural to have my_list be part of the state of the DataProcessing instance, and it's especially awkward to have to retrieve the result by calling .return_list().
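For comparison, a small wrapper sketch (process_project1 is a hypothetical name, not part of the reviewed code) shows how the call site could read if the pipeline simply returned its result:

def process_project1(raw_list):
    # Hypothetical wrapper around the existing classes: run the pipeline
    # and hand the result straight back to the caller.
    processor = DataProcessing_Project1(my_list=raw_list)
    processor.data_processing_steps()
    return processor.return_list()

result = process_project1(this_list)  # no separate .return_list() at the call site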
Note that in
    def remove_duplicate(self):
        self.my_list = set(list(self.my_list))
… my_list temporarily becomes a set rather than a list. You should have written self.my_list = list(set(self.my_list)) instead.
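A quick snippet to make the difference concrete (the exact ordering of the deduplicated values is an implementation detail and not guaranteed):

my_list = [2, 1, 2, 23]

as_set = set(list(my_list))    # a set: no indexing, not a list any more
as_list = list(set(my_list))   # a deduplicated list again

print(type(as_set))    # <class 'set'>
print(type(as_list))   # <class 'list'>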
Suggested solution
This task is more naturally suited to functional programming. Each filter can be a function that accepts an iterable and returns an iterable. You can then easily combine filters through function composition.
As a bonus, you can take advantage of default parameter values in Python to supply generic processing steps. Then, just use None to indicate an absent processing step.
######################################################################
# Primitive filters
######################################################################

def deduplicator():
    return lambda iterable: list(set(iterable))

def at_least(threshold=10):
    return lambda iterable: [n for n in iterable if n >= threshold]

def at_most(threshold=100):
    return lambda iterable: [n for n in iterable if n <= threshold]

def is_not(bad_value):
    return lambda iterable: [n for n in iterable if n != bad_value]


######################################################################
# Higher-order filters
######################################################################

def compose(*filters):
    def composed(iterable):
        for f in filters:
            if f is not None:
                iterable = f(iterable)
        return iterable
    return composed

def data_processing(
    deduplicate=deduplicator(),
    general=compose(at_least(), at_most()),
    specific=None,
):
    return compose(deduplicate, general, specific)


######################################################################
# Demonstration
######################################################################

this_list = [1, 2, 23, 4, 34, 456, 234, 23, 3457, 5, 2]

ob = at_most()
this_list2 = ob(this_list)
print(this_list2)    # [1, 2, 23, 4, 34, 23, 5, 2]

ob2 = at_least()
this_list3 = ob2(this_list2)
print(this_list3)    # [23, 34, 23]

ob3 = compose(ob, ob2)
this_list4 = ob3(this_list)
print(this_list4)    # [23, 34, 23]

ob4 = data_processing()
print(ob4(this_list))    # [34, 23]

ob5 = data_processing(specific=is_not(23))
print(ob5(this_list))    # [34]

ob6 = compose(ob, ob2, is_not(23))
print(ob6(this_list))    # [34]
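As a quick illustration of the flexibility this buys: a hypothetical new client requirement, say keeping only even values, becomes one more small filter function that composes with everything above (evens_only is made up for this example).

def evens_only():
    # Hypothetical extra filter: keep only even numbers.
    return lambda iterable: [n for n in iterable if n % 2 == 0]

ob7 = data_processing(specific=compose(is_not(23), evens_only()))
print(ob7(this_list))    # [34]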
answered by 200_success
I think you would benefit from viewing your processing steps and criteria as filters that operate on iterables.
Suppose you have a sequence, like a set or a list or a tuple. You could iterate over that sequence like so:
for item in sequence:
    pass
Now suppose you use the iter() built-in function to create an iterator instead. Now you can pass around that iterator, and even extract values from it:
it = iter(sequence)
first_item = next(it)
print_remaining_items(it)
Finally, suppose you take advantage of generator functions and avoid collecting and returning entire lists. You can iterate over the elements of an iterable, inspect the individual values, and yield the ones you choose:
def generator(it):
    for item in it:
        if choose(item):
            yield item
This allows you to process one iterable, and iterate over the results of your function, which makes it another iterable.
Thus, you can build a "stack" of iterables, with your initial sequence (or perhaps just an iterable) at the bottom, and some generator function at each higher level:
ibl = sequence
st1 = generator(ibl)
st2 = generator(st1)
st3 = generator(st2)

for item in st3:
    print(item)  # Will print chosen items from sequence
So how would this work in practice?
Let's start with a simple use case: you have an iterable, and you wish to filter it using one or more simple conditionals.
class FilteredData:
    def __init__(self, ibl):
        self.iterable = ibl
        self.condition = self.yes

    def __iter__(self):
        for item in self.iterable:
            if self.condition(item):
                yield item

    def yes(self, item):
        return True


obj = FilteredData([1, 2, 3, 4])
for item in obj:
    print(item)  # 1, 2, 3, 4

obj.condition = lambda item: item % 2 == 0
for item in obj:
    print(item)  # 2, 4
How can we combine multiple conditions? By "stacking" objects. Wrap one iterable object inside another, and you "compose" the filters:
obj = FilteredData([1, 2, 3, 4])
obj.condition = lambda item: item % 2 == 0

obj2 = FilteredData(obj)
obj2.condition = lambda item: item < 3

for item in obj2:
    print(item)  # 2
Obviously, you can make things more complex. I'd suggest that you not do that until you establish a clear need.
For example, you could pass in the lambda as part of the constructor. Or subclass FilteredData.
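A minimal sketch of that constructor-based variant might look like the following (FilteredData2 and its default "keep everything" condition are just illustrative choices):

class FilteredData2:
    def __init__(self, ibl, condition=None):
        self.iterable = ibl
        # Default condition keeps every item.
        self.condition = condition if condition is not None else (lambda item: True)

    def __iter__(self):
        for item in self.iterable:
            if self.condition(item):
                yield item


evens = FilteredData2([1, 2, 3, 4], condition=lambda item: item % 2 == 0)
small_evens = FilteredData2(evens, condition=lambda item: item < 3)
print(list(small_evens))  # [2]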
As another example, you could "slurp" up the entire input as part of your __iter__ method in order to compute some aggregate value (like the min, max, or average), then yield the values one at a time. It's painful since it consumes O(N) memory instead of just O(1), but sometimes it's necessary. That would require a subclass, or a more complex class.
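For instance, a sketch of an aggregate-based filter that keeps only the values above the average (AboveAverage is a made-up name; it materializes the whole input first, as described):

class AboveAverage:
    def __init__(self, ibl):
        self.iterable = ibl

    def __iter__(self):
        # Slurp the whole input so the aggregate can be computed: O(N) memory.
        items = list(self.iterable)
        average = sum(items) / len(items)
        for item in items:
            if item > average:
                yield item


print(list(AboveAverage([1, 2, 3, 4, 100])))  # [100]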
answered by Austin Hastings