Keras's ImageDataGenerator randomly crashes

I have the following structure, and I want to read the jpg files from test:
./cats_dogs_small
├── test
│ ├── cats <- 1000 images
│ └── dogs <- 1000 images
To read the files, I use the following MWE:
import os
from keras.preprocessing.image import ImageDataGenerator  # missing from the original snippet

train_dir = os.path.join(os.environ['HOME'], 'Documents/cats_dogs_small')
train_dir = os.path.join(train_dir, 'train')
datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

def extract_features(directory):
    generator = datagen.flow_from_directory(directory,
                                            target_size=(150, 150),
                                            batch_size=batch_size,
                                            class_mode='binary')
    features, labels = [], []
    i = 0
    for inputs_batch, labels_batch in generator:
        print(i, end=' ')
        features.append(inputs_batch)
        labels.append(labels_batch)
        i += 1
        if i * batch_size >= generator.samples:
            break  # flow_from_directory loops forever, so stop after one pass
    return features, labels

train_features, train_labels = extract_features(train_dir)
Every time I run it, I get the same error message:
2020-11-19 16:08:56.973416: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-11-19 16:08:56.973436: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Found 2000 images belonging to 2 classes.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Traceback (most recent call last):
File "/~/Documents/keras/untitled0.py", line 30, in <module>
train_features, train_labels = extract_features(train_dir)
File "/~/Documents/keras/untitled0.py", line 25, in extract_features
for inputs_batch, labels_batch in generator:
File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 104, in __next__
return self.next(*args, **kwargs)
File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 116, in next
return self._get_batches_of_transformed_samples(index_array)
File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 227, in _get_batches_of_transformed_samples
img = load_img(filepaths[j],
File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/utils.py", line 114, in load_img
img = pil_image.open(io.BytesIO(f.read()))
File "/anaconda3/envs/keras28/lib/python3.8/site-packages/PIL/Image.py", line 2943, in open
raise UnidentifiedImageError(
UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f41286f3090>
The error is raised at a random point. In the run above the code crashed at batch 60, but sometimes it crashes at 43, 69, or any other number. It seems the problem is not tied to a specific image but to the way I'm using flow_from_directory / ImageDataGenerator.
Keras version: 2.4.3
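One way to test whether a particular file is the culprit (a quick diagnostic sketch, not part of the original post; it assumes Pillow is installed and reuses the train_dir defined above) is to try opening every file with PIL directly, since that is the same library load_img uses:
from pathlib import Path
from PIL import Image

bad = []
for p in Path(train_dir).rglob('*'):
    if p.is_file():
        try:
            with Image.open(p) as im:
                im.load()  # force a full decode, not just the header
        except Exception as exc:  # collect anything PIL cannot read
            bad.append((p, exc))
print('unreadable files:', bad)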

Related

Huggingface tokenizer not able to load model after upgrading python to 3.10

I just updated Python to version 3.10.8. Note that I use JupyterLab.
I had to re-install a lot of packages, but now I get an error when I try to load the tokenizer of a Hugging Face model.
This is my code:
# Import libraries
from transformers import pipeline, AutoTokenizer
# Define checkpoint
model_checkpoint = 'deepset/xlm-roberta-large-squad2'
# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
Note that version of transformers is 4.24.0.
This is the error I get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In [3], line 2
1 # Tokenizer
----> 2 tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
File ~/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:637, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
635 tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
636 if tokenizer_class_fast and (use_fast or tokenizer_class_py is None):
--> 637 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
638 else:
639 if tokenizer_class_py is not None:
File ~/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1777, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
1774 else:
1775 logger.info(f"loading file {file_path} from cache at {resolved_vocab_files[file_id]}")
-> 1777 return cls._from_pretrained(
1778 resolved_vocab_files,
1779 pretrained_model_name_or_path,
1780 init_configuration,
1781 *init_inputs,
1782 use_auth_token=use_auth_token,
1783 cache_dir=cache_dir,
1784 local_files_only=local_files_only,
1785 _commit_hash=commit_hash,
1786 **kwargs,
1787 )
File ~/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1932, in PreTrainedTokenizerBase._from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, use_auth_token, cache_dir, local_files_only, _commit_hash, *init_inputs, **kwargs)
1930 # Instantiate tokenizer.
1931 try:
-> 1932 tokenizer = cls(*init_inputs, **init_kwargs)
1933 except OSError:
1934 raise OSError(
1935 "Unable to load vocabulary from file. "
1936 "Please check that the provided vocabulary is accessible and not corrupted."
1937 )
File ~/.local/lib/python3.10/site-packages/transformers/models/xlm_roberta/tokenization_xlm_roberta_fast.py:155, in XLMRobertaTokenizerFast.__init__(self, vocab_file, tokenizer_file, bos_token, eos_token, sep_token, cls_token, unk_token, pad_token, mask_token, **kwargs)
139 def __init__(
140 self,
141 vocab_file=None,
(...)
151 ):
152 # Mask token behave like a normal word, i.e. include the space before it
153 mask_token = AddedToken(mask_token, lstrip=True, rstrip=False) if isinstance(mask_token, str) else mask_token
--> 155 super().__init__(
156 vocab_file,
157 tokenizer_file=tokenizer_file,
158 bos_token=bos_token,
159 eos_token=eos_token,
160 sep_token=sep_token,
161 cls_token=cls_token,
162 unk_token=unk_token,
163 pad_token=pad_token,
164 mask_token=mask_token,
165 **kwargs,
166 )
168 self.vocab_file = vocab_file
169 self.can_save_slow_tokenizer = False if not self.vocab_file else True
File ~/.local/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py:114, in PreTrainedTokenizerFast.__init__(self, *args, **kwargs)
111 fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
112 elif slow_tokenizer is not None:
113 # We need to convert a slow tokenizer to build the backend
--> 114 fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
115 elif self.slow_tokenizer_class is not None:
116 # We need to create and convert a slow tokenizer to build the backend
117 slow_tokenizer = self.slow_tokenizer_class(*args, **kwargs)
File ~/.local/lib/python3.10/site-packages/transformers/convert_slow_tokenizer.py:1162, in convert_slow_tokenizer(transformer_tokenizer)
1154 raise ValueError(
1155 f"An instance of tokenizer class {tokenizer_class_name} cannot be converted in a Fast tokenizer instance."
1156 " No converter was found. Currently available slow->fast convertors:"
1157 f" {list(SLOW_TO_FAST_CONVERTERS.keys())}"
1158 )
1160 converter_class = SLOW_TO_FAST_CONVERTERS[tokenizer_class_name]
-> 1162 return converter_class(transformer_tokenizer).converted()
File ~/.local/lib/python3.10/site-packages/transformers/convert_slow_tokenizer.py:438, in SpmConverter.__init__(self, *args)
434 requires_backends(self, "protobuf")
436 super().__init__(*args)
--> 438 from .utils import sentencepiece_model_pb2 as model_pb2
440 m = model_pb2.ModelProto()
441 with open(self.original_tokenizer.vocab_file, "rb") as f:
File ~/.local/lib/python3.10/site-packages/transformers/utils/sentencepiece_model_pb2.py:20
18 from google.protobuf import descriptor as _descriptor
19 from google.protobuf import message as _message
---> 20 from google.protobuf import reflection as _reflection
21 from google.protobuf import symbol_database as _symbol_database
24 # ##protoc_insertion_point(imports)
File /usr/lib/python3/dist-packages/google/protobuf/reflection.py:58
56 from google.protobuf.pyext import cpp_message as message_impl
57 else:
---> 58 from google.protobuf.internal import python_message as message_impl
60 # The type of all Message classes.
61 # Part of the public interface, but normally only used by message factories.
62 GeneratedProtocolMessageType = message_impl.GeneratedProtocolMessageType
File /usr/lib/python3/dist-packages/google/protobuf/internal/python_message.py:69
66 import copyreg as copyreg
68 # We use "as" to avoid name collisions with variables.
---> 69 from google.protobuf.internal import containers
70 from google.protobuf.internal import decoder
71 from google.protobuf.internal import encoder
File /usr/lib/python3/dist-packages/google/protobuf/internal/containers.py:182
177 collections.MutableMapping.register(MutableMapping)
179 else:
180 # In Python 3 we can just use MutableMapping directly, because it defines
181 # __slots__.
--> 182 MutableMapping = collections.MutableMapping
185 class BaseContainer(object):
187 """Base container class."""
AttributeError: module 'collections' has no attribute 'MutableMapping'
I tried several solutions (for example, this and this), but none seem to work.
According to this link, I should change collections.Mapping to collections.abc.Mapping, but I wouldn't know where to do it.
Another possible solution is downgrading Python to 3.9, but I would like to keep it as last resort.
How can I fix this?
It turned out to be a problem with the protobuf module. I updated it to the latest version available at the time (4.21.9).
This changed the error to:
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
So I downgraded protobuf to version 3.20.0 and that worked.
For further details, look here.
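If downgrading protobuf is not an option, the second workaround listed in the error message can also be applied from Python; a minimal sketch (the environment variable must be set before transformers, and therefore protobuf, is imported, and the pure-Python implementation is noticeably slower):
import os
os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'  # force the pure-Python protobuf backend

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('deepset/xlm-roberta-large-squad2')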

STAN on Databricks - AttributeError: 'ConsoleBuffer' object has no attribute 'closed'

Running Stan (PyStan) on Databricks 8.2 ML throws the following error.
To reproduce, just run the simple example from https://pystan.readthedocs.io/en/latest/
It seems the ConsoleBuffer class doesn't implement closed. Have others run into this issue? Are there any recommended workarounds? I am currently using a single-node cluster and would ideally prefer not to run this on a local machine.
Stack Trace
AttributeError Traceback (most recent call last)
<command-261559943577864> in <module>
3 "sigma": [15, 10, 16, 11, 9, 11, 10, 18]}
4
----> 5 posterior = stan.build(schools_code, data=schools_data)
6 fit = posterior.sample(num_chains=4, num_samples=1000)
7 eta = fit["eta"] # array with shape (8, 4000)
/databricks/python/lib/python3.8/site-packages/stan/model.py in build(program_code, data, random_seed)
468
469 try:
--> 470 return asyncio.run(go())
471 except KeyboardInterrupt:
472 return # type: ignore
/databricks/python/lib/python3.8/asyncio/runners.py in run(main, debug)
41 events.set_event_loop(loop)
42 loop.set_debug(debug)
---> 43 return loop.run_until_complete(main)
44 finally:
45 try:
/databricks/python/lib/python3.8/asyncio/base_events.py in run_until_complete(self, future)
614 raise RuntimeError('Event loop stopped before Future completed.')
615
--> 616 return future.result()
617
618 def stop(self):
/databricks/python/lib/python3.8/site-packages/stan/model.py in go()
438 async def go():
439 io = ConsoleIO()
--> 440 io.error("<info>Building...</info>")
441 async with stan.common.HttpstanClient() as client:
442 # Check to see if model is in cache.
/databricks/python/lib/python3.8/site-packages/clikit/api/io/io.py in error(self, string, flags)
84 The string is formatted before it is written to the output.
85 """
---> 86 self._error_output.write(string, flags=flags)
87
88 def error_line(self, string, flags=None): # type: (str, Optional[int]) -> None
/databricks/python/lib/python3.8/site-packages/clikit/api/io/output.py in write(self, string, flags, new_line)
59 formatted += "\n"
60
---> 61 self._stream.write(to_str(formatted))
62
63 def write_line(self, string, flags=None): # type: (str, Optional[int]) -> None
/databricks/python/lib/python3.8/site-packages/clikit/io/output_stream/stream_output_stream.py in write(self, string)
19 Writes a string to the stream.
20 """
---> 21 if self.is_closed():
22 raise io.UnsupportedOperation("Cannot write to a closed input.")
23
/databricks/python/lib/python3.8/site-packages/clikit/io/output_stream/stream_output_stream.py in is_closed(self)
114 Returns whether the stream is closed.
115 """
--> 116 return self._stream.closed
AttributeError: 'ConsoleBuffer' object has no attribute 'closed'
After trying some older clusters, I realized that PyStan 3 is a complete rewrite. So one workaround is to go back to pystan==2.19.1.1.
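Note that PyStan 2 exposes a different API from PyStan 3, so after pinning pystan==2.19.1.1 the getting-started example has to be adapted; a rough sketch, assuming the schools_code and schools_data objects from the PyStan documentation example:
import pystan

# PyStan 2.x API: compile the model, then sample
# (stan.build / posterior.sample exist only in PyStan 3)
sm = pystan.StanModel(model_code=schools_code)
fit = sm.sampling(data=schools_data, iter=1000, chains=4)
eta = fit.extract()['eta']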

OpenCV Image Denoising gives: Error: -215:Assertion failed

I'm trying to denoise a really simple image using the code below. When I print out the array of data I get the following structure, which is expected since the image is greyscale:
[[ 62 62 63 ... 29 16 6]
[ 75 90 103 ... 21 16 12]
[ 77 100 118 ... 29 29 30]
...
[ 84 68 56 ... 47 50 53]
[101 94 89 ... 40 44 48]
Here is the code along with the associated error; at this point I'm a little stuck. Any suggestions?
import cv2
from matplotlib import pyplot as plt
img = cv2.imread(path,0)
dst = cv2.fastNlMeansDenoising(img,None,10,10,7,21)
plt.subplot(211),plt.imshow(dst)
plt.subplot(212),plt.imshow(img)
plt.show()
____________________________________________________________________
runfile(___, wdir='G:/James Alexander/Python Programs')
Traceback (most recent call last):
File "<ipython-input-127-ce832752c183>", line 1, in <module>
runfile('G:/James Alexander/Python Programs/Noiseremoval.py', wdir=___)
File "___", line 704, in runfile
execfile(filename, namespace)
File "___", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "___", line 13, in <module>
dst = cv2.fastNlMeansDenoising(img,None,10,10,7,21)
error: OpenCV(4.1.0) C:\projects\opencv-python\opencv\modules\photo\src\denoising.cpp:120: error: (-215:Assertion failed) hn == 1 || hn == cn in function 'cv::fastNlMeansDenoising'
Read the documentation on the Denoising function that you're using. There are two ways to call the function and you seem to be doing a combination of the two.
dst = cv.fastNlMeansDenoising(src[, dst[, h[, templateWindowSize[, searchWindowSize]]]])
or
dst = cv.fastNlMeansDenoising(src, h[, dst[, templateWindowSize[, searchWindowSize[, normType]]]])
You are calling it with (src, dst, h, templateWindowSize, searchWindowSize, normType), which either has too many parameters or has them in the wrong order, depending on which form you want to use.
Change your parameters to:
dst = cv2.fastNlMeansDenoising(img, None, 30, 7, 21)
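For reference, a minimal end-to-end sketch of the corrected first form (the file path is a placeholder and h=10 is an arbitrary strength; cmap='gray' just keeps matplotlib from applying a false-colour map to the greyscale data):
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('image.png', 0)  # placeholder path, read as greyscale

# first overload: src, dst, h, templateWindowSize, searchWindowSize
dst = cv2.fastNlMeansDenoising(img, None, 10, 7, 21)

plt.subplot(211), plt.imshow(dst, cmap='gray')
plt.subplot(212), plt.imshow(img, cmap='gray')
plt.show()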

theano index out of bound and shape mistaken in theano.scan

I'm writing my code in dl4mt (an open-source neural machine translation tool), and I've run into some weird problems with theano.scan.
The following code extracts a sub-tensor from a 3D tensor according to some indices.
indicesSub in line 1098 has shape (n_sample, window),
sl in line 1099 has shape (n_sample,),
cc_ in line 1100 has shape (n_timestep, n_sample, dimctx),
and the result cc_result should have shape (n_sample, window, dimctx).
So in line 1116, where cc_ has already been dimshuffled to shape (n_sample, n_timestep, dimctx), the three inputs are all sequences.
In the inner loop (_sub_step), indices is used as the sequence parameter and the other two are non_sequences. The error occurs during training; I've included it at the end. But when I use a trick to train a pseudo model and use it for testing, the error doesn't show up, which is really weird because training and testing share the same code.
The error itself is also weird. It says the input shapes are (17, 256), (256,), and (), respectively, but according to my code they should be (17, 256), (21,), and (), where 21 is the window size.
I'm wondering whether something could change the inputs to scan, maybe a None input? I'm not sure. Please help me if you have any idea. Thank you.
1098 indicesSub = indices_mask_[src_positions] # n_samples, window
1099 sl = sntlens.reshape([sntlens.shape[0], ]) # (n_sample, )
1100 ccshuffle = cc_.dimshuffle(1, 0, 2) # n_sample, n_timestep, dimctx
1101
1102 def _step_index(indices, cc_sub, sntlen_in):
1103 def _sub_step(indice_step, cc_step, len_step):
1104 # indice_step is a scalar
1105 # cc_step is (ntimestep * dimctx)
1106 # sntlen_in is a scalar
1107 r = tensor.switch(tensor.lt(indice_step, 0), 0, 1)
1108 l = tensor.switch(tensor.ge(indice_step, len_step), 0, 1)
1109 rt = ifelse(tensor.lt(r * l, 1), tensor.zeros([cc_step.shape[1], ]), cc_step[tensor.cast(indice_step, 'int64')])
1110 return rt
1111 ret, updt = theano.scan(_sub_step,
1112 sequences=indices,
1113 non_sequences=[cc_sub, sntlen_in])
1114 return ret
1115
1116 cc_result, upd = theano.scan(_step_index,
1117 sequences=[indicesSub, ccshuffle, sl])
1118
Error info:
24 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
25 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 951, in rval
26 r = p(n, [x[0] for x in i], o)
27 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 940, in <lambda>
28 self, node)
29 File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4316)
30 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
31 reraise(exc_type, exc_value, exc_trace)
32 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
33 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 951, in rval
34 r = p(n, [x[0] for x in i], o)
35 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 940, in <lambda>
36 self, node)
37 File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4316)
38 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
39 reraise(exc_type, exc_value, exc_trace)
40 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
41 IndexError: index out of bounds
42 Apply node that caused the error: GpuIncSubtensor{Inc;int64}(GpuElemwise{add,no_inplace}.0, if{inplace,gpu}.0, ScalarFromTensor.0)
43 Toposort index: 5
44 Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, vector), Scalar(int64)]
45 Inputs shapes: [(17, 256), (256,), ()]
46 Inputs strides: [(256, 1), (1,), ()]
47 Inputs values: ['not shown', 'not shown', 17]
48 Outputs clients: [['output']]
49
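For reference, here is a minimal, self-contained sketch (Theano 0.8-era API, independent of the dl4mt code above) of how scan slices each entry of sequences along its first axis while passing non_sequences through whole at every step, which is the behaviour the nested scan above relies on:
import numpy as np
import theano
import theano.tensor as T

seq = T.matrix('seq')      # sliced row by row, one row per scan step
fixed = T.vector('fixed')  # handed to every step unchanged

def step(row, non_seq):
    return row + non_seq

result, updates = theano.scan(step, sequences=seq, non_sequences=fixed)
f = theano.function([seq, fixed], result)

print(f(np.ones((3, 4), dtype=theano.config.floatX),
        np.arange(4, dtype=theano.config.floatX)))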

How to run auto.arima using rpy2

I want to call R's auto.arima function from Python. I think I have not yet fully understood this interface. Can someone help me send a time series object to R, call the forecast-related functions, and get the results back?
This is what I have done so far:
import pandas as pd
from rpy2.robjects import r
from rpy2.robjects import pandas2ri

# create a Python time series
df = pd.DataFrame()
count = range(1, 51)
df['count'] = count
df['date'] = pd.date_range('2016-01-01', '2016-02-19')
df.set_index('date', inplace=True)
df.sort_index(inplace=True)

pandas2ri.activate()
r_timeseries = pandas2ri.py2ri(df)
r('fit <- auto.arima(r_timeseries)')
I think I have to import some R packages (like forecast). I'm not sure how to do that from Python, how to properly pass the Python time series object to R, and so on.
In [63]: r_ts = pandas2ri.py2ri(df)
In [64]: r_ts
Out[64]:
<DataFrame - Python:0x1126a93f8 / R:0x7ff7bfa51bc8>
[IntVector]
X0: <class 'rpy2.robjects.vectors.IntVector'>
<IntVector - Python:0x1126a96c8 / R:0x7ff7be1af1c0>
[ 1, 2, 3, ..., 48, 49, 50]
And when I attempt to call forecast:
In [83]: x = r('forecast(r_ts)')
/Library/Python/2.7/site-packages/rpy2/robjects/functions.py:106: UserWarning: Error in forecast(r_ts) : object 'r_ts' not found
res = super(Function, self).__call__(*new_args, **new_kwargs)
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-83-0765ffc30741> in <module>()
----> 1 x = r('forecast(r_ts)')
/Library/Python/2.7/site-packages/rpy2/robjects/__init__.pyc in __call__(self, string)
319 def __call__(self, string):
320 p = _rparse(text=StrSexpVector((string,)))
--> 321 res = self.eval(p)
322 return conversion.ri2py(res)
323
/Library/Python/2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
176 v = kwargs.pop(k)
177 kwargs[r_k] = v
--> 178 return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
179
180 pattern_link = re.compile(r'\\link\{(.+?)\}')
/Library/Python/2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
104 for k, v in kwargs.items():
105 new_kwargs[k] = conversion.py2ri(v)
--> 106 res = super(Function, self).__call__(*new_args, **new_kwargs)
107 res = conversion.ri2ro(res)
108 return res
RRuntimeError: Error in forecast(r_ts) : object 'r_ts' not found
I tried the following as well:
In [99]: f = r('forecast.auto.arima(r_ts)')
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-99-1c4610d2740d> in <module>()
----> 1 f = r('forecast.auto.arima(r_ts)')
/Library/Python/2.7/site-packages/rpy2/robjects/__init__.pyc in __call__(self, string)
319 def __call__(self, string):
320 p = _rparse(text=StrSexpVector((string,)))
--> 321 res = self.eval(p)
322 return conversion.ri2py(res)
323
/Library/Python/2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
176 v = kwargs.pop(k)
177 kwargs[r_k] = v
--> 178 return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
179
180 pattern_link = re.compile(r'\\link\{(.+?)\}')
/Library/Python/2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
104 for k, v in kwargs.items():
105 new_kwargs[k] = conversion.py2ri(v)
--> 106 res = super(Function, self).__call__(*new_args, **new_kwargs)
107 res = conversion.ri2ro(res)
108 return res
RRuntimeError: Error in eval(expr, envir, enclos) :
could not find function "forecast.auto.arima"
You could try what I do:
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
pandas2ri.activate()
ro.r('library(forecast)')
rdf = pandas2ri.py2ri(df)
ro.globalenv['r_timeseries'] = rdf
pred = ro.r('as.data.frame(forecast(auto.arima(r_timeseries),h=5))')
This way, you can handle pred as a data frame, like this:
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
51 51 51 51 51 51
52 52 52 52 52 52
53 53 53 53 53 53
54 54 54 54 54 54
55 55 55 55 55 55
In the first attempt you are telling R to use a variable r_ts that it does not know anything about (the name r_ts is defined in your Python namespace), and in the second attempt you add on top of that a function name R does not know either. Both error messages report precisely this problem.
Your first attempt could be rewritten as:
x = r('forecast')(r_ts)
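As an alternative to evaluating R code strings, the forecast package can also be loaded and called through rpy2's importr interface; a sketch (reusing the df built in the question; importr exposes auto.arima as auto_arima because '.' is translated to '_'):
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri

pandas2ri.activate()
forecast = importr('forecast')   # equivalent to library(forecast) on the R side

r_ts = pandas2ri.py2ri(df)       # df as defined in the question
fit = forecast.auto_arima(r_ts)  # R's auto.arima
pred = forecast.forecast(fit, h=5)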
