Brightway2 - Importing ecospold1 processes exported from openLCA

I have an ecospold1 dataset exported from openLCA and I would like to import it into Brightway2.
The SingleOutputEcospold1Importer should be able to read the ecospold files, but apparently something is wrong with the file schema.
Given the export formats openLCA offers and the import formats Brightway2 supports, ecospold1 seems to be the only common format. If there is another way to do this, I would be happy to try it.
Code:
import brightway2 as bw
bw.projects.set_current('importing_ecospold1')
bw.bw2setup()  # install the default biosphere database and LCIA methods
fp = "path/to/EcoSpold01"  # directory containing the exported ecospold1 XML files
importer = bw.SingleOutputEcospold1Importer(fp, 'database_name', use_mp=False)
Output:
File ~\miniconda3\envs\playing_with_brightway\lib\site-packages\bw2io\extractors\ecospold1.py:132, in Ecospold1DataExtractor.process_dataset(cls, dataset, filename, db_name)
109 @classmethod
110 def process_dataset(cls, dataset, filename, db_name):
111 ref_func = dataset.metaInformation.processInformation.referenceFunction
112 comments = [
113 ref_func.get("generalComment"),
114 ref_func.get("includedProcesses"),
115 (
116 "Location: ",
117 dataset.metaInformation.processInformation.geography.get("text"),
118 ),
119 (
120 "Technology: ",
121 dataset.metaInformation.processInformation.technology.get("text"),
122 ),
123 (
124 "Time period: ",
125 getattr2(dataset.metaInformation.processInformation, "timePeriod").get(
126 "text"
127 ),
128 ),
129 (
130 "Production volume: ",
131 getattr2(
--> 132 dataset.metaInformation.modellingAndValidation, "representativeness"
133 ).get("productionVolume"),
134 ),
135 (
136 "Sampling: ",
137 getattr2(
138 dataset.metaInformation.modellingAndValidation, "representativeness"
139 ).get("samplingProcedure"),
140 ),
141 (
142 "Extrapolations: ",
143 getattr2(
144 dataset.metaInformation.modellingAndValidation, "representativeness"
145 ).get("extrapolations"),
146 ),
147 (
148 "Uncertainty: ",
149 getattr2(
150 dataset.metaInformation.modellingAndValidation, "representativeness"
151 ).get("uncertaintyAdjustments"),
152 ),
153 ]
155 def get_authors():
156 ai = dataset.metaInformation.administrativeInformation
File src/lxml/objectify.pyx:234, in lxml.objectify.ObjectifiedElement.__getattr__()
File src/lxml/objectify.pyx:453, in lxml.objectify._lookupChildOrRaise()
AttributeError: no such child: {http://www.EcoInvent.org/EcoSpold01}modellingAndValidation

This will be hard to fix in the Stack Overflow format - if your data is not confidential, please file a new brightway2-io issue with some sample data. If your data is confidential, you will need to make up an example project and export that; we need the XML files to fix this.
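In the meantime, a possible workaround is to pre-process the exported XML before importing. This is a minimal sketch, not an official fix: it assumes, based on the traceback above, that the extractor only needs the modellingAndValidation and representativeness elements to exist (it reads their attributes via .get(), which tolerates missing attributes):
from pathlib import Path
from lxml import etree

NS = "{http://www.EcoInvent.org/EcoSpold01}"

def patch_file(path):
    # Give every metaInformation an (empty) modellingAndValidation /
    # representativeness child so the extractor's attribute lookups succeed.
    # The appended element order may not be schema-valid, but bw2io's
    # objectify-based extractor accesses children by name, not position.
    tree = etree.parse(str(path))
    for meta in tree.getroot().iter(NS + "metaInformation"):
        mav = meta.find(NS + "modellingAndValidation")
        if mav is None:
            mav = etree.SubElement(meta, NS + "modellingAndValidation")
        if mav.find(NS + "representativeness") is None:
            etree.SubElement(mav, NS + "representativeness")
    tree.write(str(path), xml_declaration=True, encoding="UTF-8")

for f in Path("path/to/EcoSpold01").glob("*.xml"):
    patch_file(f)
After patching, re-create the importer; the usual bw2io follow-up steps are importer.apply_strategies(), importer.statistics(), and importer.write_database().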

Related

eli5 explain_weights_xgboost KeyError: 'bias'

I am new to xgboost. I trained a model that works pretty well. Now I am trying to use eli5 to see the weights, and I get: KeyError: 'bias'
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-...> in <module>
3 clf6 = model6.named_steps['clf']
4 vec6 = model6.named_steps['transformer']
----> 5 explain_weights_xgboost(clf6, vec=vec6)
~/dev/envs/env3.7/lib/python3.7/site-packages/eli5/xgboost.py in explain_weights_xgboost(xgb, vec, top, target_names, targets, feature_names, feature_re, feature_filter, importance_type)
80 description=DESCRIPTION_XGBOOST,
81 is_regression=is_regression,
---> 82 num_features=coef.shape[-1],
83 )
84
~/dev/envs/env3.7/lib/python3.7/site-packages/eli5/_feature_importances.py in get_feature_importance_explanation(estimator, vec, coef, feature_names, feature_filter, feature_re, top, description, is_regression, estimator_feature_names, num_features, coef_std)
35 feature_filter=feature_filter,
36 feature_re=feature_re,
---> 37 num_features=num_features,
38 )
39 feature_importances = get_feature_importances_filtered(
~/dev/envs/env3.7/lib/python3.7/site-packages/eli5/sklearn/utils.py in get_feature_names_filtered(clf, vec, bias_name, feature_names, num_features, feature_filter, feature_re, estimator_feature_names)
124 feature_names=feature_names,
125 num_features=num_features,
--> 126 estimator_feature_names=estimator_feature_names,
127 )
128 return feature_names.handle_filter(feature_filter, feature_re)
~/dev/envs/env3.7/lib/python3.7/site-packages/eli5/sklearn/utils.py in get_feature_names(clf, vec, bias_name, feature_names, num_features, estimator_feature_names)
77 features are named x0, x1, x2, etc.
78 """
---> 79 if not has_intercept(clf):
80 bias_name = None
81
~/dev/envs/env3.7/lib/python3.7/site-packages/eli5/sklearn/utils.py in has_intercept(estimator)
60 if hasattr(estimator, 'fit_intercept'):
61 return estimator.fit_intercept
---> 62 if hasattr(estimator, 'intercept_'):
63 if estimator.intercept_ is None:
64 return False
~/dev/envs/env3.7/lib/python3.7/site-packages/xgboost/sklearn.py in intercept_(self)
743 .format(self.booster))
744 b = self.get_booster()
--> 745 return np.array(json.loads(b.get_dump(dump_format='json')[0])['bias'])
746
747
KeyError: 'bias'
Thank you!
I had the same issue and fixed it by explicitly specifying the booster argument when creating the estimator:
clf = XGBClassifier(booster='gbtree')
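For context, here is a minimal self-contained sketch of the fix; the toy data and the top-level eli5.explain_weights dispatcher are assumptions, since the question's pipeline is not needed to reproduce it:
from sklearn.datasets import make_classification
from xgboost import XGBClassifier
import eli5

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf = XGBClassifier(booster='gbtree')  # explicit booster avoids KeyError: 'bias'
clf.fit(X, y)
eli5.explain_weights(clf)  # dispatches to the xgboost handler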

FeatureTools TypeError: unhashable type: 'set'

I'm trying this code for featuretools:
features, feature_names = ft.dfs(entityset=es, target_entity='demo',
                                 agg_primitives=['count', 'max', 'time_since_first', 'median',
                                                 'time_since_last', 'avg_time_between',
                                                 'sum', 'mean'],
                                 trans_primitives=['is_weekend', 'year', 'week',
                                                   'divide_by_feature', 'percentile'])
But I got this error:
TypeError Traceback (most recent call last)
<ipython-input-17-89e925ff895d> in <module>
3 agg_primitives = ['count', 'max', 'time_since_first', 'median', 'time_since_last', 'avg_time_between',
4 'sum', 'mean'],
----> 5 trans_primitives = ['is_weekend', 'year', 'week', 'divide_by_feature', 'percentile'])
~/.local/lib/python3.6/site-packages/featuretools/utils/entry_point.py in function_wrapper(*args, **kwargs)
44 ep.on_error(error=e,
45 runtime=runtime)
---> 46 raise e
47
48 # send return value
~/.local/lib/python3.6/site-packages/featuretools/utils/entry_point.py in function_wrapper(*args, **kwargs)
36 # call function
37 start = time.time()
---> 38 return_value = func(*args, **kwargs)
39 runtime = time.time() - start
40 except Exception as e:
~/.local/lib/python3.6/site-packages/featuretools/synthesis/dfs.py in dfs(entities, relationships, entityset, target_entity, cutoff_time, instance_ids, agg_primitives, trans_primitives, groupby_trans_primitives, allowed_paths, max_depth, ignore_entities, ignore_variables, seed_features, drop_contains, drop_exact, where_primitives, max_features, cutoff_time_in_index, save_progress, features_only, training_window, approximate, chunk_size, n_jobs, dask_kwargs, verbose, return_variable_types)
226 n_jobs=n_jobs,
227 dask_kwargs=dask_kwargs,
--> 228 verbose=verbose)
229 return feature_matrix, features
~/.local/lib/python3.6/site-packages/featuretools/computational_backends/calculate_feature_matrix.py in calculate_feature_matrix(features, entityset, cutoff_time, instance_ids, entities, relationships, cutoff_time_in_index, training_window, approximate, save_progress, verbose, chunk_size, n_jobs, dask_kwargs)
265 cutoff_df_time_var=cutoff_df_time_var,
266 target_time=target_time,
--> 267 pass_columns=pass_columns)
268
269 feature_matrix = pd.concat(feature_matrix)
~/.local/lib/python3.6/site-packages/featuretools/computational_backends/calculate_feature_matrix.py in linear_calculate_chunks(chunks, feature_set, approximate, training_window, verbose, save_progress, entityset, no_unapproximated_aggs, cutoff_df_time_var, target_time, pass_columns)
496 no_unapproximated_aggs,
497 cutoff_df_time_var,
--> 498 target_time, pass_columns)
499 feature_matrix.append(_feature_matrix)
500 # Do a manual garbage collection in case objects from calculate_chunk
~/.local/lib/python3.6/site-packages/featuretools/computational_backends/calculate_feature_matrix.py in calculate_chunk(chunk, feature_set, entityset, approximate, training_window, verbose, save_progress, no_unapproximated_aggs, cutoff_df_time_var, target_time, pass_columns)
341 ids,
342 precalculated_features=precalculated_features_trie,
--> 343 training_window=window)
344
345 id_name = _feature_matrix.index.name
~/.local/lib/python3.6/site-packages/featuretools/computational_backends/utils.py in wrapped(*args, **kwargs)
35 def wrapped(*args, **kwargs):
36 if save_progress is None:
---> 37 r = method(*args, **kwargs)
38 else:
39 time = args[0].to_pydatetime()
~/.local/lib/python3.6/site-packages/featuretools/computational_backends/calculate_feature_matrix.py in calc_results(time_last, ids, precalculated_features, training_window)
316 ignored=all_approx_feature_set)
317
--> 318 matrix = calculator.run(ids)
319 return matrix
320
~/.local/lib/python3.6/site-packages/featuretools/computational_backends/feature_set_calculator.py in run(self, instance_ids)
100 precalculated_trie=self.precalculated_features,
101 filter_variable=target_entity.index,
--> 102 filter_values=instance_ids)
103
104 # The dataframe for the target entity should be stored at the root of
~/.local/lib/python3.6/site-packages/featuretools/computational_backends/feature_set_calculator.py in _calculate_features_for_entity(self, entity_id, feature_trie, df_trie, full_entity_df_trie, precalculated_trie, filter_variable, filter_values, parent_data)
187 columns=columns,
188 time_last=self.time_last,
--> 189 training_window=self.training_window)
190
191 # Step 2: Add variables to the dataframe linking it to all ancestors.
~/.local/lib/python3.6/site-packages/featuretools/entityset/entity.py in query_by_values(self, instance_vals, variable_id, columns, time_last, training_window)
271
272 if columns is not None:
--> 273 df = df[columns]
274
275 return df
~/.local/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
2686 return self._getitem_multilevel(key)
2687 else:
-> 2688 return self._getitem_column(key)
2689
2690 def _getitem_column(self, key):
~/.local/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
2693 # get column
2694 if self.columns.is_unique:
-> 2695 return self._get_item_cache(key)
2696
2697 # duplicate columns & possible reduce dimensionality
~/.local/lib/python3.6/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
2485 """Return the cached item, item represents a label indexer."""
2486 cache = self._item_cache
-> 2487 res = cache.get(item)
2488 if res is None:
2489 values = self._data.get(item)
TypeError: unhashable type: 'set'
I also tried the simplest call to deep feature synthesis (dfs), shown below, but it ran into the same error:
features, feature_names = ft.dfs(entityset=es, target_entity='demo')
I'm not really sure why I'm getting this error; any help or recommendation on how to proceed from here is deeply appreciated.
Thanks in advance for your help!
I found a solution: my version had bugs that were fixed by the Featuretools team. Just run pip install directly from master:
pip install --upgrade https://github.com/featuretools/featuretools/zipball/master
This fix has been released in Featuretools 0.9.1. If you upgrade to the latest version of Featuretools, the error will go away.
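To confirm the upgrade took effect, checking the installed version is enough (ft.__version__ is the standard attribute):
import featuretools as ft
print(ft.__version__)  # should print 0.9.1 or later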

Create linear model to check correlation tokenize error

I have data like the sample below, which has 4 continuous columns [x0 to x3] and a binary column y. y takes two values, 1.0 and 0.0. I'm trying to check for correlation between the binary column y and one of the continuous columns, x0, using the CatConCor function below, but I'm getting the error message below. The function creates a linear regression model and calculates the p-value for the residuals with and without the categorical variable. If anyone can point out the issue or how to fix it, it would be very much appreciated.
Data:
x_r x0 x1 x2 x3 y
0 0 0.466726 0.030126 0.998330 0.892770 0.0
1 1 0.173168 0.525810 -0.079341 -0.112151 0.0
2 2 -0.854467 0.770712 0.929614 -0.224779 0.0
3 3 -0.370574 0.568183 -0.928269 0.843253 0.0
4 4 -0.659431 -0.948491 -0.091534 0.706157 0.0
Code:
import numpy as np
import pandas as pd
from time import time
import scipy.stats as stats
from IPython.display import display # Allows the use of display() for DataFrames
# Pretty display for notebooks
%matplotlib inline
###########################################
# Suppress matplotlib user warnings
# Necessary for newer version of matplotlib
import warnings
warnings.filterwarnings("ignore", category = UserWarning, module = "matplotlib")
#
# Display inline matplotlib plots with IPython
from IPython import get_ipython
get_ipython().run_line_magic('matplotlib', 'inline')
###########################################
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# correlation between categorical variable and continuous variable
def CatConCor(df, catVar, conVar):
    import statsmodels.api as sm
    from statsmodels.formula.api import ols
    # subsetting data for one categorical column and one continuous column
    data2 = df.copy()[[catVar, conVar]]
    data2[catVar] = data2[catVar].astype('category')
    mod = ols(conVar + '~' + catVar, data=data2).fit()
    aov_table = sm.stats.anova_lm(mod, typ=2)
    if aov_table['PR(>F)'][0] < 0.05:
        print('Correlated p=' + str(aov_table['PR(>F)'][0]))
    else:
        print('Uncorrelated p=' + str(aov_table['PR(>F)'][0]))
# checking for correlation between categorical and continuous variables
CatConCor(df=train_df,catVar='y',conVar='x0')
Error:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-6-80f83b8c8e14> in <module>()
1 # checking for correlation between categorical and continuous variables
2
----> 3 CatConCor(df=train_df,catVar='y',conVar='x0')
<ipython-input-2-35404ba1d697> in CatConCor(df, catVar, conVar)
103
104 mod = ols(conVar+'~'+catVar,
--> 105 data=data2).fit()
106
107 aov_table = sm.stats.anova_lm(mod, typ=2)
~/anaconda2/envs/py36/lib/python3.6/site-packages/statsmodels/base/model.py in from_formula(cls, formula, data, subset, drop_cols, *args, **kwargs)
153
154 tmp = handle_formula_data(data, None, formula, depth=eval_env,
--> 155 missing=missing)
156 ((endog, exog), missing_idx, design_info) = tmp
157
~/anaconda2/envs/py36/lib/python3.6/site-packages/statsmodels/formula/formulatools.py in handle_formula_data(Y, X, formula, depth, missing)
63 if data_util._is_using_pandas(Y, None):
64 result = dmatrices(formula, Y, depth, return_type='dataframe',
---> 65 NA_action=na_action)
66 else:
67 result = dmatrices(formula, Y, depth, return_type='dataframe',
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/highlevel.py in dmatrices(formula_like, data, eval_env, NA_action, return_type)
308 eval_env = EvalEnvironment.capture(eval_env, reference=1)
309 (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
--> 310 NA_action, return_type)
311 if lhs.shape[1] == 0:
312 raise PatsyError("model is missing required outcome variables")
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/highlevel.py in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
163 return iter([data])
164 design_infos = _try_incr_builders(formula_like, data_iter_maker, eval_env,
--> 165 NA_action)
166 if design_infos is not None:
167 return build_design_matrices(design_infos, data,
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/highlevel.py in _try_incr_builders(formula_like, data_iter_maker, eval_env, NA_action)
60 "ascii-only, or else upgrade to Python 3.")
61 if isinstance(formula_like, str):
---> 62 formula_like = ModelDesc.from_formula(formula_like)
63 # fallthrough
64 if isinstance(formula_like, ModelDesc):
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/desc.py in from_formula(cls, tree_or_string)
162 tree = tree_or_string
163 else:
--> 164 tree = parse_formula(tree_or_string)
165 value = Evaluator().eval(tree, require_evalexpr=False)
166 assert isinstance(value, cls)
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/parse_formula.py in parse_formula(code, extra_operators)
146 tree = infix_parse(_tokenize_formula(code, operator_strings),
147 operators,
--> 148 _atomic_token_types)
149 if not isinstance(tree, ParseNode) or tree.type != "~":
150 tree = ParseNode("~", None, [tree], tree.origin)
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/infix_parser.py in infix_parse(tokens, operators, atomic_types, trace)
208
209 want_noun = True
--> 210 for token in token_source:
211 if c.trace:
212 print("Reading next token (want_noun=%r)" % (want_noun,))
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/parse_formula.py in _tokenize_formula(code, operator_strings)
92 else:
93 it.push_back((pytype, token_string, origin))
---> 94 yield _read_python_expr(it, end_tokens)
95
96 def test__tokenize_formula():
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/parse_formula.py in _read_python_expr(it, end_tokens)
42 origins = []
43 bracket_level = 0
---> 44 for pytype, token_string, origin in it:
45 assert bracket_level >= 0
46 if bracket_level == 0 and token_string in end_tokens:
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/util.py in next(self)
330 else:
331 # May raise StopIteration
--> 332 return six.advance_iterator(self._it)
333 __next__ = next
334
~/anaconda2/envs/py36/lib/python3.6/site-packages/patsy/tokens.py in python_tokenize(code)
33 break
34 origin = Origin(code, start, end)
---> 35 assert pytype not in (tokenize.NL, tokenize.NEWLINE)
36 if pytype == tokenize.ERRORTOKEN:
37 raise PatsyError("error tokenizing input "
AssertionError:
Upgrading patsy to 0.5.1 fixed the issue. I found the tip here:
https://github.com/statsmodels/statsmodels/issues/5343
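For reference, the upgrade itself is a one-liner; the version pin just mirrors the fix mentioned above:
pip install --upgrade "patsy>=0.5.1"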

variance_scaling_initializer() got an unexpected keyword argument 'distribution'

I want to predict values over time (a regression neural network) using Python. I have two outputs and three inputs. When I run the code it gives me the error "variance_scaling_initializer() got an unexpected keyword argument 'distribution'". Can you help me solve the problem?
Here is my code:
n_neurons_1 = 24
n_neurons_2 = 12
n_target = 2
softmax = 2
weight_initializer = tf.contrib.layers.variance_scaling_initializer(mode= "FAN_AVG", distribution ="uniform", scale = softmax)
bias_initializer = tf.zeros_initializer()
w_hidden_1 = tf.Variable(weight_initializer([n_time_dimensions,n_neurons_1]))
bias_hidden_1= tf.Variable(bias_initializer([n_neurons_1]))
w_hidden_2= tf.Variable(weight_initializer([n_neurons_1,n_neurons_2]))
bias_hidden_2 = tf.Variable(bias_initializer([n_neurons_2]))
w_out = tf.Variable(weight_initializer([n_neurons_2,2]))
bias_out = tf.Variable(bias_initializer([2]))
hidden_1 = tf.nn.relu(tf.add(tf.matmul(X, w_hidden_1),bias_hidden_1))
hidden_2 = tf.nn.relu(tf.add(tf.matmul(X, w_hidden_2),bias_hidden_2))
out = tf.transpose(tf.add(tf.matmul(hidden_2, w_out),bias_out))
My dataset is,
date time g p c apparentg
6/8/2018 0:06:15 141 131 136 141
6/8/2018 0:09:25 95 117 95 95
6/8/2018 0:11:00 149 109 139 149
6/8/2018 0:13:50 120 103 95 120
6/8/2018 0:16:20 135 97 105 135
6/8/2018 0:19:00 63 NaN 97 63
6/8/2018 0:20:00 111 NaN 100 111
6/8/2018 0:22:10 115 NaN 115 115
6/8/2018 0:23:40 287 NaN NaN 287
The error is:
TypeError Traceback (most recent call last)
<ipython-input-26-9ceeb97429b1> in <module>()
31 n_target = 2
32 softmax = 2
---> 33 weight_initializer = tf.contrib.layers.variance_scaling_initializer(mode= "FAN_AVG", distribution ="uniform", scale = softmax)
34 bias_initializer = tf.zeros_initializer()
35 w_hidden_1 = tf.Variable(weight_initializer([n_time_dimensions,n_neurons_1]))
TypeError: variance_scaling_initializer() got an unexpected keyword argument 'distribution'
Looking into the documentation (https://www.tensorflow.org/api_docs/python/tf/contrib/layers/variance_scaling_initializer), the signature is:
tf.contrib.layers.variance_scaling_initializer(
factor=2.0,
mode='FAN_IN',
uniform=False,
seed=None,
dtype=tf.float32
)
and
uniform: Whether to use uniform or normal distributed random initialization.
So try
uniform=True
instead of
distribution="uniform"
in your function call
tf.contrib.layers.variance_scaling_initializer(mode="FAN_AVG", distribution="uniform", scale=softmax)
Also, there is no scale= argument in that function; judging from the signature above, factor= is the closest match.
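Putting that together, a corrected call might look like this sketch; mapping the intended scale onto factor= is an assumption:
weight_initializer = tf.contrib.layers.variance_scaling_initializer(
    factor=softmax,  # assumed stand-in for the intended scale= value
    mode="FAN_AVG",
    uniform=True)    # uniform=True replaces distribution="uniform"
Unrelated to the error, note that hidden_2 in your code should be computed from hidden_1, not X:
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1, w_hidden_2), bias_hidden_2))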

Featuretools dfs runtime error

I'm working through the featuretools "predict_next_purchase" demo against my own data. I've created the entity set, and have also created a new pandas DataFrame comprising the labels and times. I'm at the point of using ft.dfs for deep feature synthesis, and I'm getting a RuntimeError: maximum recursion depth exceeded. Below is the stack trace:
feature_matrix, features = ft.dfs(target_entity='projects',
                                  cutoff_time=labels.reset_index().loc[:, ['jobnumber', 'time']],
                                  training_window=inst_defn['training_window'],
                                  entityset=es,
                                  verbose=True)
Stack Trace:
Building features: 0it [00:00, ?it/s]
RuntimeError: maximum recursion depth exceeded
RuntimeErrorTraceback (most recent call last)
<ipython-input-743-f05fc567dd1b> in <module>()
3 training_window=inst_defn['training_window'],
4 entityset=es,
----> 5 verbose=True)
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/synthesis/dfs.pyc in dfs(entities, relationships, entityset, target_entity, cutoff_time, instance_ids, agg_primitives, trans_primitives, allowed_paths, max_depth, ignore_entities, ignore_variables, seed_features, drop_contains, drop_exact, where_primitives, max_features, cutoff_time_in_index, save_progress, features_only, training_window, approximate, verbose)
164 seed_features=seed_features)
165
--> 166 features = dfs_object.build_features(verbose=verbose)
167
168 if features_only:
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/synthesis/deep_feature_synthesis.pyc in build_features(self, variable_types, verbose)
227 self.where_clauses = defaultdict(set)
228 self._run_dfs(self.es[self.target_entity_id], [],
--> 229 all_features, max_depth=self.max_depth)
230
231 new_features = list(all_features[self.target_entity_id].values())
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/synthesis/deep_feature_synthesis.pyc in _run_dfs(self, entity, entity_path, all_features, max_depth)
353 entity_path=list(entity_path),
354 all_features=all_features,
--> 355 max_depth=new_max_depth)
356
357 """
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/synthesis/deep_feature_synthesis.pyc in _run_dfs(self, entity, entity_path, all_features, max_depth)
338 if self._apply_traversal_filters(entity, self.es[b_id],
339 entity_path,
--> 340 forward=False) and
341 b_id not in self.ignore_entities]
342 for b_entity_id in backward_entities:
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/synthesis/deep_feature_synthesis.pyc in _apply_traversal_filters(self, parent_entity, child_entity, entity_path, forward)
429 child_entity=child_entity,
430 target_entity_id=self.target_entity_id,
--> 431 entity_path=entity_path, forward=forward):
432 return False
433
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/synthesis/dfs_filters.pyc in is_valid(self, feature, entity, target_entity_id, child_feature, child_entity, entity_path, forward, where)
53
54 if type(feature) != list:
---> 55 return func(*args)
56
57 else:
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/synthesis/dfs_filters.pyc in apply_filter(self, parent_entity, child_entity, target_entity_id, entity_path, forward)
76 if (parent_entity.id == target_entity_id or
77 es.find_backward_path(parent_entity.id,
---> 78 target_entity_id) is None):
79 return True
80 path = es.find_backward_path(parent_entity.id, child_entity.id)
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/entityset/base_entityset.pyc in find_backward_path(self, start_entity_id, goal_entity_id)
308 is returned if no path exists.
309 """
--> 310 forward_path = self.find_forward_path(goal_entity_id, start_entity_id)
311 if forward_path is not None:
312 return forward_path[::-1]
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/entityset/base_entityset.pyc in find_forward_path(self, start_entity_id, goal_entity_id)
287
288 for r in self.get_forward_relationships(start_entity_id):
--> 289 new_path = self.find_forward_path(r.parent_entity.id, goal_entity_id)
290 if new_path is not None:
291 return [r] + new_path
... last 1 frames repeated, from the frame below ...
/Users/nbernini/OneDrive/PSC/venv/ml20/lib/python2.7/site-packages/featuretools/entityset/base_entityset.pyc in find_forward_path(self, start_entity_id, goal_entity_id)
287
288 for r in self.get_forward_relationships(start_entity_id):
--> 289 new_path = self.find_forward_path(r.parent_entity.id, goal_entity_id)
290 if new_path is not None:
291 return [r] + new_path
RuntimeError: maximum recursion depth exceeded
The issue here is cyclical relationships in your entity set. Currently, Deep Feature Synthesis can only create features when there is one unique path between two entities. If you have an entity with a relationship to itself, you would also get this error.
A future release of Featuretools will offer better support for this use case.
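A quick way to spot the offending schema is to print the entity set's relationship list (a diagnostic sketch; EntitySet exposes relationships as a list) and look for an entity that appears on both sides of a relationship, or for two distinct paths between the same pair of entities:
for r in es.relationships:
    print(r)  # look for self-relationships or duplicate parent/child paths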
