M4A Tagging Issues - id3

I'm having an issue with inserting M4A atoms into a file. Since the original file does not have a udta structure, I added it using an existing M4A file I had lying around as a guide.
Here's what I did to add the atoms:
Build a udta atom in memory
Update the size of the moov atom to include the size of the udta atom
Copy the file up to the end of the first trak atom
Insert my udta atom
Copy the rest as usual.
The only real difference between the original file and tagged file is that the mdat atom has been moved down a little to accommodate the tags. This leads me to believe that there's some reference to that position in the other atoms, but I couldn't find one.
Here is the output from AtomicParsley:
Original File:
Atom ftyp # 0 of size: 36, ends # 36
Atom moov # 36 of size: 30156, ends # 30192
Atom mvhd # 44 of size: 108, ends # 152
Atom iods # 152 of size: 33, ends # 185
Atom trak # 185 of size: 30007, ends # 30192
Atom tkhd # 193 of size: 92, ends # 285
Atom mdia # 285 of size: 29907, ends # 30192
Atom mdhd # 293 of size: 32, ends # 325
Atom hdlr # 325 of size: 37, ends # 362
Atom minf # 362 of size: 29830, ends # 30192
Atom smhd # 370 of size: 16, ends # 386
Atom dinf # 386 of size: 36, ends # 422
Atom dref # 394 of size: 28, ends # 422
Atom stbl # 422 of size: 29770, ends # 30192
Atom stts # 430 of size: 24, ends # 454
Atom stsd # 454 of size: 106, ends # 560
Atom mp4a # 470 of size: 90, ends # 560
Atom esds # 506 of size: 54, ends # 560
Atom stsz # 560 of size: 26888, ends # 27448
Atom stsc # 27448 of size: 40, ends # 27488
Atom stco # 27488 of size: 2704, ends # 30192
Atom mdat # 30192 of size: 2495503, ends # 2525695
Modified file:
Atom ftyp # 0 of size: 36, ends # 36
Atom moov # 36 of size: 30323, ends # 30359
Atom mvhd # 44 of size: 108, ends # 152
Atom iods # 152 of size: 33, ends # 185
Atom trak # 185 of size: 30007, ends # 30192
Atom tkhd # 193 of size: 92, ends # 285
Atom mdia # 285 of size: 29907, ends # 30192
Atom mdhd # 293 of size: 32, ends # 325
Atom hdlr # 325 of size: 37, ends # 362
Atom minf # 362 of size: 29830, ends # 30192
Atom smhd # 370 of size: 16, ends # 386
Atom dinf # 386 of size: 36, ends # 422
Atom dref # 394 of size: 28, ends # 422
Atom stbl # 422 of size: 29770, ends # 30192
Atom stts # 430 of size: 24, ends # 454
Atom stsd # 454 of size: 106, ends # 560
Atom mp4a # 470 of size: 90, ends # 560
Atom esds # 506 of size: 54, ends # 560
Atom stsz # 560 of size: 26888, ends # 27448
Atom stsc # 27448 of size: 40, ends # 27488
Atom stco # 27488 of size: 2704, ends # 30192
Atom udta # 30192 of size: 167, ends # 30359
Atom meta # 30200 of size: 159, ends # 30359
Atom ilst # 30212 of size: 147, ends # 30359
Atom ©ART # 30220 of size: 35, ends # 30255
Atom data # 30228 of size: 27, ends # 30255
Atom ©nam # 30255 of size: 63, ends # 30318
Atom data # 30263 of size: 55, ends # 30318
Atom ©alb # 30318 of size: 41, ends # 30359
Atom data # 30326 of size: 33, ends # 30359
Atom mdat # 30359 of size: 2495503, ends # 2525862
Another thing of note is that the tagged file I'm using as a reference has an hdlr atom under udta->meta, but adding a copy of that atom didn't help either. If I manually remove the udta atom and restore the original size of moov, the file works again.
When I try to play the tagged file, I get these errors in various programs:
mplayer:
[aac # 0x204d720] channel element 0.0 is not allocated
[aac # 0x204d720] channel element 0.0 is not allocated
[aac # 0x204d720] channel element 3.13 is not allocated
[aac # 0x204d720] channel element 2.14 is not allocated
[aac # 0x204d720] channel element 2.9 is not allocated
[aac # 0x204d720] Prediction is not allowed in AAC-LC.
[aac # 0x204d720] channel element 3.1 is not allocated
[aac # 0x204d720] channel element 0.3 is not allocated
....
totem:
** Message: Error: Could not decode stream.
gstfaad.c(1319): gst_faad_chain (): /GstPlayBin2:play/GstURIDecodeBin:uridecodebin0/GstDecodeBin2:decodebin20/GstFaad:faad0:
decoding error: Bitstream value not allowed by specification
banshee:
[Error 08:26:27.610] GStreamer stream error: Decode
[Error 08:26:27.960] GStreamer stream error: Decode
[Error 08:26:28.252] GStreamer resource error: NotFound
Oh, how I wish the other 99% of programs recognized ID3 tags on files other than MP3s...

Thanks to xdelta3, I was able to locate the difference between the original file and a file that had been tagged with tagging software and then had its tags manually removed.
The issue appears to be the 'stco' atom, which is a list of chunks stored as absolute file offsets. Bingo! Since I inserted the tags ahead of mdat, those offsets are now invalid. More coding to do. (sigh)
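The fix this implies is to add the number of inserted bytes to every entry in stco. Here is a minimal sketch, assuming 32-bit atom sizes and a 32-bit stco table (a co64 table or 64-bit atom sizes would need extra handling). The helper names iter_atoms and shift_stco are made up for illustration and do not come from any library:

```python
import struct

# Container atoms that have to be descended into to reach stco.
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}

def iter_atoms(buf, start, end):
    """Yield (type, offset, size) for each atom in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", buf, pos)
        if size < 8:
            # size 0 (to end of file) and size 1 (64-bit) are not handled here
            break
        yield kind, pos, size
        pos += size

def shift_stco(buf, delta, start=0, end=None):
    """Add delta to every chunk offset of every stco atom, in place.

    buf must be a mutable buffer such as a bytearray.
    Returns the number of offsets that were patched.
    """
    end = len(buf) if end is None else end
    patched = 0
    for kind, pos, size in iter_atoms(buf, start, end):
        if kind in CONTAINERS:
            patched += shift_stco(buf, delta, pos + 8, pos + size)
        elif kind == b"stco":
            # Layout: size(4) type(4) version+flags(4) entry_count(4) offsets(4 each)
            (count,) = struct.unpack_from(">I", buf, pos + 12)
            for i in range(count):
                at = pos + 16 + 4 * i
                (offset,) = struct.unpack_from(">I", buf, at)
                struct.pack_into(">I", buf, at, offset + delta)
                patched += 1
    return patched
```

For the files above, delta would be 167 (the size of the inserted udta atom), applied to a bytearray holding the tagged file before it is written out.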

Related

Huggingface tokenizer not able to load model after upgrading python to 3.10

I just updated Python to version 3.10.8. Note that I use JupyterLab.
I had to reinstall a lot of packages, but now I get an error when I try to load the tokenizer of a Hugging Face model.
This is my code:
# Import libraries
from transformers import pipeline, AutoTokenizer
# Define checkpoint
model_checkpoint = 'deepset/xlm-roberta-large-squad2'
# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
Note that the version of transformers is 4.24.0.
This is the error I get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In [3], line 2
1 # Tokenizer
----> 2 tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
File ~/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:637, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
635 tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
636 if tokenizer_class_fast and (use_fast or tokenizer_class_py is None):
--> 637 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
638 else:
639 if tokenizer_class_py is not None:
File ~/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1777, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
1774 else:
1775 logger.info(f"loading file {file_path} from cache at {resolved_vocab_files[file_id]}")
-> 1777 return cls._from_pretrained(
1778 resolved_vocab_files,
1779 pretrained_model_name_or_path,
1780 init_configuration,
1781 *init_inputs,
1782 use_auth_token=use_auth_token,
1783 cache_dir=cache_dir,
1784 local_files_only=local_files_only,
1785 _commit_hash=commit_hash,
1786 **kwargs,
1787 )
File ~/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1932, in PreTrainedTokenizerBase._from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, use_auth_token, cache_dir, local_files_only, _commit_hash, *init_inputs, **kwargs)
1930 # Instantiate tokenizer.
1931 try:
-> 1932 tokenizer = cls(*init_inputs, **init_kwargs)
1933 except OSError:
1934 raise OSError(
1935 "Unable to load vocabulary from file. "
1936 "Please check that the provided vocabulary is accessible and not corrupted."
1937 )
File ~/.local/lib/python3.10/site-packages/transformers/models/xlm_roberta/tokenization_xlm_roberta_fast.py:155, in XLMRobertaTokenizerFast.__init__(self, vocab_file, tokenizer_file, bos_token, eos_token, sep_token, cls_token, unk_token, pad_token, mask_token, **kwargs)
139 def __init__(
140 self,
141 vocab_file=None,
(...)
151 ):
152 # Mask token behave like a normal word, i.e. include the space before it
153 mask_token = AddedToken(mask_token, lstrip=True, rstrip=False) if isinstance(mask_token, str) else mask_token
--> 155 super().__init__(
156 vocab_file,
157 tokenizer_file=tokenizer_file,
158 bos_token=bos_token,
159 eos_token=eos_token,
160 sep_token=sep_token,
161 cls_token=cls_token,
162 unk_token=unk_token,
163 pad_token=pad_token,
164 mask_token=mask_token,
165 **kwargs,
166 )
168 self.vocab_file = vocab_file
169 self.can_save_slow_tokenizer = False if not self.vocab_file else True
File ~/.local/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py:114, in PreTrainedTokenizerFast.__init__(self, *args, **kwargs)
111 fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
112 elif slow_tokenizer is not None:
113 # We need to convert a slow tokenizer to build the backend
--> 114 fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
115 elif self.slow_tokenizer_class is not None:
116 # We need to create and convert a slow tokenizer to build the backend
117 slow_tokenizer = self.slow_tokenizer_class(*args, **kwargs)
File ~/.local/lib/python3.10/site-packages/transformers/convert_slow_tokenizer.py:1162, in convert_slow_tokenizer(transformer_tokenizer)
1154 raise ValueError(
1155 f"An instance of tokenizer class {tokenizer_class_name} cannot be converted in a Fast tokenizer instance."
1156 " No converter was found. Currently available slow->fast convertors:"
1157 f" {list(SLOW_TO_FAST_CONVERTERS.keys())}"
1158 )
1160 converter_class = SLOW_TO_FAST_CONVERTERS[tokenizer_class_name]
-> 1162 return converter_class(transformer_tokenizer).converted()
File ~/.local/lib/python3.10/site-packages/transformers/convert_slow_tokenizer.py:438, in SpmConverter.__init__(self, *args)
434 requires_backends(self, "protobuf")
436 super().__init__(*args)
--> 438 from .utils import sentencepiece_model_pb2 as model_pb2
440 m = model_pb2.ModelProto()
441 with open(self.original_tokenizer.vocab_file, "rb") as f:
File ~/.local/lib/python3.10/site-packages/transformers/utils/sentencepiece_model_pb2.py:20
18 from google.protobuf import descriptor as _descriptor
19 from google.protobuf import message as _message
---> 20 from google.protobuf import reflection as _reflection
21 from google.protobuf import symbol_database as _symbol_database
24 # ##protoc_insertion_point(imports)
File /usr/lib/python3/dist-packages/google/protobuf/reflection.py:58
56 from google.protobuf.pyext import cpp_message as message_impl
57 else:
---> 58 from google.protobuf.internal import python_message as message_impl
60 # The type of all Message classes.
61 # Part of the public interface, but normally only used by message factories.
62 GeneratedProtocolMessageType = message_impl.GeneratedProtocolMessageType
File /usr/lib/python3/dist-packages/google/protobuf/internal/python_message.py:69
66 import copyreg as copyreg
68 # We use "as" to avoid name collisions with variables.
---> 69 from google.protobuf.internal import containers
70 from google.protobuf.internal import decoder
71 from google.protobuf.internal import encoder
File /usr/lib/python3/dist-packages/google/protobuf/internal/containers.py:182
177 collections.MutableMapping.register(MutableMapping)
179 else:
180 # In Python 3 we can just use MutableMapping directly, because it defines
181 # __slots__.
--> 182 MutableMapping = collections.MutableMapping
185 class BaseContainer(object):
187 """Base container class."""
AttributeError: module 'collections' has no attribute 'MutableMapping'
I tried several solutions (for example, this and this), but none seem to work.
According to this link, I should change collections.Mapping to collections.abc.Mapping, but I wouldn't know where to do it.
Another possible solution is downgrading Python to 3.9, but I would like to keep that as a last resort.
How can I fix this?
It turned out to be a problem with the protobuf module. I updated it to the latest version at the time (4.21.9).
This changed the error to:
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
So I downgraded protobuf to version 3.20.0 and that worked.
For further details, look here.
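To see which protobuf release an environment actually picks up, a small check like the following can help. This is only a sketch: the function names are hypothetical, and the rule "4.x needs a downgrade" is an assumption taken from the error text above, not from the transformers documentation. importlib.metadata is part of the standard library from Python 3.8.

```python
from importlib import metadata

def major_version(version):
    """Return the leading integer of a version string such as '4.21.9'."""
    return int(version.split(".")[0])

def protobuf_needs_downgrade(version):
    # Assumption, based on the error message above: protobuf 4.x rejects
    # the older generated _pb2 file, while 3.20.x and lower accept it.
    return major_version(version) >= 4

try:
    installed = metadata.version("protobuf")
except metadata.PackageNotFoundError:
    installed = None

if installed is not None:
    verdict = "downgrade" if protobuf_needs_downgrade(installed) else "ok"
    print(f"protobuf {installed}: {verdict}")
```

Note that the first traceback imported protobuf from /usr/lib/python3/dist-packages, so it is worth running the check from the same kernel that JupyterLab uses.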

STAN on Databricks - AttributeError: 'ConsoleBuffer' object has no attribute 'closed'

Running Stan (pystan) on Databricks 8.2 ML throws the error shown below.
To reproduce, just run the simple example from https://pystan.readthedocs.io/en/latest/.
It seems the ConsoleBuffer class doesn't implement closed. Have others run into this issue? Are there any recommended workarounds? I am currently using a single-node cluster and would ideally not run this on a local machine.
Stack Trace
AttributeError Traceback (most recent call last)
<command-261559943577864> in <module>
3 "sigma": [15, 10, 16, 11, 9, 11, 10, 18]}
4
----> 5 posterior = stan.build(schools_code, data=schools_data)
6 fit = posterior.sample(num_chains=4, num_samples=1000)
7 eta = fit["eta"] # array with shape (8, 4000)
/databricks/python/lib/python3.8/site-packages/stan/model.py in build(program_code, data, random_seed)
468
469 try:
--> 470 return asyncio.run(go())
471 except KeyboardInterrupt:
472 return # type: ignore
/databricks/python/lib/python3.8/asyncio/runners.py in run(main, debug)
41 events.set_event_loop(loop)
42 loop.set_debug(debug)
---> 43 return loop.run_until_complete(main)
44 finally:
45 try:
/databricks/python/lib/python3.8/asyncio/base_events.py in run_until_complete(self, future)
614 raise RuntimeError('Event loop stopped before Future completed.')
615
--> 616 return future.result()
617
618 def stop(self):
/databricks/python/lib/python3.8/site-packages/stan/model.py in go()
438 async def go():
439 io = ConsoleIO()
--> 440 io.error("<info>Building...</info>")
441 async with stan.common.HttpstanClient() as client:
442 # Check to see if model is in cache.
/databricks/python/lib/python3.8/site-packages/clikit/api/io/io.py in error(self, string, flags)
84 The string is formatted before it is written to the output.
85 """
---> 86 self._error_output.write(string, flags=flags)
87
88 def error_line(self, string, flags=None): # type: (str, Optional[int]) -> None
/databricks/python/lib/python3.8/site-packages/clikit/api/io/output.py in write(self, string, flags, new_line)
59 formatted += "\n"
60
---> 61 self._stream.write(to_str(formatted))
62
63 def write_line(self, string, flags=None): # type: (str, Optional[int]) -> None
/databricks/python/lib/python3.8/site-packages/clikit/io/output_stream/stream_output_stream.py in write(self, string)
19 Writes a string to the stream.
20 """
---> 21 if self.is_closed():
22 raise io.UnsupportedOperation("Cannot write to a closed input.")
23
/databricks/python/lib/python3.8/site-packages/clikit/io/output_stream/stream_output_stream.py in is_closed(self)
114 Returns whether the stream is closed.
115 """
--> 116 return self._stream.closed
AttributeError: 'ConsoleBuffer' object has no attribute 'closed'
After trying some old clusters, I realized that pystan 3 is a complete rewrite. So one workaround is to go back to pystan==2.19.1.1.
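Another idea, untested on Databricks, is to wrap the notebook's output streams in a proxy that supplies the missing closed attribute before calling stan.build. This is a sketch under that assumption only. ClosedAwareStream and patch_std_streams are hypothetical names, and clikit may rely on other stream attributes that the proxy does not cover:

```python
import sys

class ClosedAwareStream:
    """Proxy that adds a `closed` attribute to a stream object lacking one."""

    def __init__(self, stream):
        self._stream = stream

    @property
    def closed(self):
        # Fall back to False when the wrapped stream has no `closed` attribute.
        return getattr(self._stream, "closed", False)

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself,
        # so write(), flush() and friends go to the wrapped stream.
        return getattr(self._stream, name)

def patch_std_streams():
    """Wrap sys.stdout and sys.stderr if they have no `closed` attribute."""
    for name in ("stdout", "stderr"):
        stream = getattr(sys, name)
        if not hasattr(stream, "closed"):
            setattr(sys, name, ClosedAwareStream(stream))
```

Calling patch_std_streams() once at the top of the notebook would be the intended use.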

Keras ImageDataGenerator randomly crashes

I have the following structure, where I want to read jpg files from test.
./cats_dogs_small
├── test
│ ├── cats <- 1000 images
│ └── dogs <- 1000 images
To read the files, I use the following MWE:
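import os
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

train_dir = os.path.join(os.environ['HOME'], 'Documents/cats_dogs_small')
train_dir = os.path.join(train_dir, 'train')
datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

def extract_features(directory):
    generator = datagen.flow_from_directory(directory,
                                            target_size=(150, 150),
                                            batch_size=batch_size,
                                            class_mode='binary')
    features, labels = [], []
    for i, (inputs_batch, labels_batch) in enumerate(generator):
        print(i, end=' ')
        features.append(inputs_batch)
        labels.append(labels_batch)
        # The generator loops forever, so stop after one pass over the data
        if i + 1 >= len(generator):
            break
    return np.concatenate(features), np.concatenate(labels)

train_features, train_labels = extract_features(train_dir)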
Every time I run it, I get the same error message:
2020-11-19 16:08:56.973416: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-11-19 16:08:56.973436: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Found 2000 images belonging to 2 classes.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Traceback (most recent call last):
File "/~/Documents/keras/untitled0.py", line 30, in <module>
train_features, train_labels = extract_features(train_dir)
File "/~/Documents/keras/untitled0.py", line 25, in extract_features
for inputs_batch, labels_batch in generator:
File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 104, in __next__
return self.next(*args, **kwargs)
File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 116, in next
return self._get_batches_of_transformed_samples(index_array)
File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 227, in _get_batches_of_transformed_samples
img = load_img(filepaths[j],
File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/utils.py", line 114, in load_img
img = pil_image.open(io.BytesIO(f.read()))
File "/anaconda3/envs/keras28/lib/python3.8/site-packages/PIL/Image.py", line 2943, in open
raise UnidentifiedImageError(
UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f41286f3090>
The error is raised at a random point. In the run posted here the code crashed at batch 60, but it sometimes crashes at 43, 69, or any other number. It seems the problem is not related to a specific image, but to the way I'm using flow_from_directory / ImageDataGenerator.
Keras version: 2.4.3
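One thing worth ruling out: flow_from_directory shuffles by default (shuffle=True), so a single unreadable file would land in a different batch on every run. Below is a rough standard-library sketch for spotting such files. It assumes the dataset holds only JPEG or PNG images, and find_suspect_images is a made-up helper name. It only checks the leading bytes, so it can miss files that are damaged further in:

```python
import os

# Leading bytes of the formats assumed to be in the dataset (JPEG, PNG).
SIGNATURES = (b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n")

def find_suspect_images(root):
    """Return paths under root that are empty or lack a known image signature."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                head = f.read(8)
            # bytes.startswith accepts a tuple of candidate prefixes
            if not head.startswith(SIGNATURES):
                suspects.append(path)
    return suspects
```

Running find_suspect_images(train_dir) and inspecting the listed files would show whether a stray non-image file is sitting in one of the class folders.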

Conversion of a CSV file to different form

I have a CSV file with the following contents
01"815732013.0"1brand1"[100 76 64 ... 153 139 94]"
01"815732025.0"1female1"[183 192 201 ... 18 10 0]"
01"815732027.0"1male1"[204 214 221 ... 214 221 255]"
all in one column.
I need the contents in four columns like this,
col1 col2 col3 col4
01 "815732013.0" 1brand1 "[100 76 64 ... 153 139 94]"
01 "815732025.0" 1female1 "[183 192 201 ... 18 10 0]"
01 "815732027.0" 1male1 "[204 214 221 ... 214 221 255]"
How can I change this, using Python, Excel, or any other tool?
If you don't need the double quotes in the output file, then you should be fine with splitting the lines on the double quotes:
import csv
import io
text = '''01"815732013.0"1brand1"[100 76 64 ... 153 139 94]"
01"815732025.0"1female1"[183 192 201 ... 18 10 0]"
01"815732027.0"1male1"[204 214 221 ... 214 221 255]"'''
# newline='' lets the csv module control the line endings
with io.StringIO(text) as f, open('output.csv', 'w', newline='') as of:
    writer = csv.writer(of, delimiter=',', quotechar='"')
    for line in f:
        # split on the double quotes and drop the empty strings
        line = [r for r in line.strip().split('"') if r]
        writer.writerow(line)
This snippet of code is pretty straightforward. You're basically splitting on the double quotes and discarding empty strings.
If you wish your output file to contain the quotes, then you may have to use a regular expression to capture the fields:
import csv
import io
import re
text = '''01"815732013.0"1brand1"[100 76 64 ... 153 139 94]"
01"815732025.0"1female1"[183 192 201 ... 18 10 0]"
01"815732027.0"1male1"[204 214 221 ... 214 221 255]"'''
with io.StringIO(text) as f, open('output.csv', 'w', newline='') as of:
    # groups: digits, quoted field, word characters, quoted field
    pat = re.compile(r'(\d+)(".+?")(\w+)(".+")')
    # csv.writer doubles the embedded quotes when it writes these fields
    writer = csv.writer(of, delimiter=',', quotechar='"')
    for line in f:
        line = pat.sub(r'\1;\2;\3;\4', line.strip()).split(';')
        writer.writerow(line)
This is very similar to the previous snippet, with the only difference being the regular expression. The expression groups the different fields according to your desired output. Those groups are used to generate a set of row values, which are passed to the writer.writerow method to write the row in your destination file.
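If the quotes should appear literally in the output (csv.writer with quotechar='"' doubles any quote characters inside a field), the fields can also be captured and joined by hand. A minimal sketch, assuming every line has exactly the four-field layout shown in the question; split_fields is a hypothetical helper name:

```python
import re

# digits, quoted field, word characters, quoted field
FIELDS = re.compile(r'(\d+)(".+?")(\w+)(".+")')

def split_fields(line):
    """Split one input line into its four fields, keeping the quotes."""
    match = FIELDS.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unexpected line format: {line!r}")
    return list(match.groups())

rows = [
    '01"815732013.0"1brand1"[100 76 64 ... 153 139 94]"',
    '01"815732025.0"1female1"[183 192 201 ... 18 10 0]"',
]
for row in rows:
    # join with a comma; the quotes stay exactly as they were in the input
    print(",".join(split_fields(row)))
```

Raising on lines that do not match makes malformed rows visible instead of silently writing them as a single column.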
I hope this proves useful.

theano index out of bound and shape mistaken in theano.scan

I'm working on code in dl4mt (an open-source neural machine translation tool).
I ran into some strange problems with theano.scan.
The following code extracts a sub-tensor from a 3D tensor according to some indices.
In line 1098, "indicesSub" has shape (n_sample, window).
In line 1099, "sl" has shape (n_sample, ).
In line 1100, "cc_" has shape (n_timestep, n_sample, dimctx).
The result, cc_result, should have shape (n_sample, window, dimctx).
So in line 1116, where cc_ has already been dimshuffled to shape (n_sample, n_timestep, dimctx), all three inputs are sequences.
In the inner loop (_sub_step), indices is passed as the sequence parameter and the other two are non_sequences. The error occurs during training; I have put the traceback at the end. But when I use a trick to train a pseudo model and then test with it, the error doesn't appear, which is really strange because training and testing share the same code.
The error itself is strange, too. It says the input shapes are (17, 256), (256, ), and (), respectively. But according to my code, the input shapes should be (17, 256), (21, ), and (), where 21 is the window size.
I'm wondering whether something could change the input of scan. Maybe a None input? I'm not sure. Please help if you have any idea. Thank you.
1098 indicesSub = indices_mask_[src_positions] # n_samples, window
1099 sl = sntlens.reshape([sntlens.shape[0], ]) # (n_sample, )
1100 ccshuffle = cc_.dimshuffle(1, 0, 2) # n_sample, n_timestep, dimctx
1101
1102 def _step_index(indices, cc_sub, sntlen_in):
1103     def _sub_step(indice_step, cc_step, len_step):
1104         # indice_step is a scalar
1105         # cc_step is (ntimestep * dimctx)
1106         # sntlen_in is a scalar
1107         r = tensor.switch(tensor.lt(indice_step, 0), 0, 1)
1108         l = tensor.switch(tensor.ge(indice_step, len_step), 0, 1)
1109         rt = ifelse(tensor.lt(r * l, 1), tensor.zeros([cc_step.shape[1], ]), cc_step[tensor.cast(indice_step, 'int64')])
1110         return rt
1111     ret, updt = theano.scan(_sub_step,
1112                             sequences=indices,
1113                             non_sequences=[cc_sub, sntlen_in])
1114     return ret
1115
1116 cc_result, upd = theano.scan(_step_index,
1117                              sequences=[indicesSub, ccshuffle, sl])
1118
Error info:
24 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
25 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 951, in rval
26 r = p(n, [x[0] for x in i], o)
27 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 940, in <lambda>
28 self, node)
29 File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4316)
30 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
31 reraise(exc_type, exc_value, exc_trace)
32 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
33 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 951, in rval
34 r = p(n, [x[0] for x in i], o)
35 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 940, in <lambda>
36 self, node)
37 File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4316)
38 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
39 reraise(exc_type, exc_value, exc_trace)
40 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
41 IndexError: index out of bounds
42 Apply node that caused the error: GpuIncSubtensor{Inc;int64}(GpuElemwise{add,no_inplace}.0, if{inplace,gpu}.0, ScalarFromTensor.0)
43 Toposort index: 5
44 Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, vector), Scalar(int64)]
45 Inputs shapes: [(17, 256), (256,), ()]
46 Inputs strides: [(256, 1), (1,), ()]
47 Inputs values: ['not shown', 'not shown', 17]
48 Outputs clients: [['output']]
49

Resources