ete3 error : could not be translated into taxids! - Bioinformatics - python-3.5

I am using ete3(http://etetoolkit.org/) package in Python within a bioinformatics pipeline I wrote myself.
While running this script, I get the following error. I have used this script a lot for other datasets which don't have any issues and have not given any errors. I am using Python3.5 and miniconda. Any fixes/insights to resolve this error will be appreciated.
[Error]
Traceback (most recent call last):
File "/Users/d/miniconda2/envs/py35/bin/ete3", line 11, in <module>
load_entry_point('ete3==3.1.1', 'console_scripts', 'ete3')()
File "/Users/d/miniconda2/envs/py35/lib/python3.5/site-packages/ete3/tools/ete.py", line 95, in main
_main(sys.argv)
File "/Users/d/miniconda2/envs/py35/lib/python3.5/site-packages/ete3/tools/ete.py", line 268, in _main
args.func(args)
File "/Users/d/miniconda2/envs/py35/lib/python3.5/site-packages/ete3/tools/ete_ncbiquery.py", line 168, in run
collapse_subspecies=args.collapse_subspecies)
File "/Users/d/miniconda2/envs/py35/lib/python3.5/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 434, in get_topology
lineage = id2lineage[sp]
KeyError: 3

Continuing from the comment section for better formatting.
Assuming that the sp contains 3 as suggested by the error message (do check this yourself). You can inspect the ete3 code (current version) following its definition, you can trace it to line:
def get_lineage_translator(self, taxids):
"""Given a valid taxid number, return its corresponding lineage track as a
hierarchically sorted list of parent taxids.
So I went to https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi
and checked if 3 is valid taxid and it appears that it is not.
# relevant section from ncbi taxonomy browser
No result found in the Taxonomy database for taxonomy id
3
It appears to me that your only option is to trace how the 3 gets computed. Because the root cause is simply that taxid 3 is not valid taxid number as required by the function.

Related

Failed to get followers count of 'b'metapodcode'' ~empty list

With the June 2022 update, there have been some changes in Instagram's APIs. There was a discussion here about changing or updating this code. You can find the discussion here then i did some research on this topic and i found the fix here but this code is written in another javascript language, if the code here is integrated into instapy, it seems that all the problem will be solved. Instapy is an application that I love, but I've been looking for a solution to this problem for days and I'm not good at programming languages. I'm trying to get help here as a last resort. I'm waiting for your help
INFO [2022-07-22 03:20:06] [metapodcod] Failed to get following count of 'b'metapodcod'' ~empty list
WARNING [2022-07-22 03:20:06] [metapodcod] Unable to save account progress, skipping data update
b"'NoneType' object has no attribute 'get'"
INFO [2022-07-22 03:20:07] [metapodcod] Sessional Live Report:
|> No any statistics to show
[Session lasted 2.5 minutes]
OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
INFO [2022-07-22 03:20:07] [metapodcod] Session ended!
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
Traceback (most recent call last):
File "C:\Users\metapodcod\InstaPy\yeni.py", line 41, in <module>
with smart_run(session):
File "C:\Users\metapodcod\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 135, in __enter__
return next(self.gen)
File "C:\Users\metapodcod\InstaPy\instapy\util.py", line 1983, in smart_run
session.login()
File "C:\Users\metapodcod\InstaPy\instapy\instapy.py", line 475, in login
self.followed_by = log_follower_num(self.browser, self.username, self.logfolder)
File "C:\Users\metapodcod\InstaPy\instapy\print_log_writer.py", line 21, in log_follower_num
followed_by = getUserData("graphql.user.edge_followed_by.count", browser)
File "C:\Users\metapodcod\InstaPy\instapy\util.py", line 501, in getUserData
get_key = shared_data.get("entry_data").get("ProfilePage")
AttributeError: 'NoneType' object has no attribute 'get'

Error using MultiWorkerMirroredStrategy to train object detection research model ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8

I'm trying to train research model ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8 using the MultiWorkerMirroredStrategy (by setting --num_workers=2 in the invocation of model_main_tf2.py). I'm trying to train across two workers (0 and 1), each with a single GPU. However, when I attempt this I get the following error, always on worker 1:
Traceback (most recent call last):
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\distribute\input_lib.py", line 553, in __next__
return self.get_next()
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\distribute\input_lib.py", line 610, in get_next
return self._get_next_no_partial_batch_handling(name)
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\distribute\input_lib.py", line 642, in _get_next_no_partial_batch_handling
replicas.extend(self._iterators[i].get_next_as_list(new_name))
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\distribute\input_lib.py", line 1594, in get_next_as_list
return self._format_data_list_with_options(self._iterator.get_next())
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\data\ops\multi_device_iterator_ops.py", line 580, in get_next
result.append(self._device_iterators[i].get_next())
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 889, in get_next
return self._next_internal()
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 819, in _next_internal
ret = gen_dataset_ops.iterator_get_next(
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\ops\gen_dataset_ops.py", line 2922, in iterator_get_next
_ops.raise_from_not_ok_status(e, name)
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\framework\ops.py", line 7186, in raise_from_not_ok_status
raise core._status_to_exception(e) from None # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence [Op:IteratorGetNext]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\JS\Desktop\Tensorflow\models\research\object_detection\model_main_tf2.py", line 114, in <module>
tf.compat.v1.app.run()
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\platform\app.py", line 36, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\absl\app.py", line 312, in run
_run_main(main, args)
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\absl\app.py", line 258, in _run_main
sys.exit(main(argv))
File "C:\Users\JS\Desktop\Tensorflow\models\research\object_detection\model_main_tf2.py", line 105, in main
model_lib_v2.train_loop(
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\object_detection\model_lib_v2.py", line 605, in train_loop
load_fine_tune_checkpoint(
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\object_detection\model_lib_v2.py", line 401, in load_fine_tune_checkpoint
_ensure_model_is_built(model, input_dataset, unpad_groundtruth_tensors)
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\object_detection\model_lib_v2.py", line 161, in _ensure_model_is_built
features, labels = iter(input_dataset).next()
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\distribute\input_lib.py", line 549, in next
return self.__next__()
File "C:\Users\JS\.conda\envs\tensor2\lib\site-packages\tensorflow\python\distribute\input_lib.py", line 555, in __next__
raise StopIteration
StopIteration
Worker 0 eventually fails after detecting that worker 1 has gone down.
This error happens regardless of the physical machines on which the two workers run. In other words I see it if I'm running both workers on a single machine (using localhost) OR different machines on the same network.
Based on the trace in the error messages, the error appears to be occurring whenever the training loop attempts to iterate over the training data generated by strategy.experimental_distribute_datasets_from_function. Note that if I change the strategy to MirroredStrategy it runs fine on a single machine (no other changes made). I'm not sure if I'm doing something wrong or if there is a bug in the object detection API.
My setup on both machines is identical (I basically followed the setup instructions on the object detection web-site):
Windows 10
Tensorflow 2.8.0
Cuda Toolkit 11.2
cudnn 8.1
Has anyone ever seen this error before? If so, is there a way around it?
Ok, I think I understand the issue. In the object detection library there is a file called dataset_builder.py that builds the training dataset from the TFRecord stored in the file specified in the pipeline.config file (in the input_path item of the tf_record_input_reader). The function that actually reads the TFRecord file is _read_dataset_internal. This function treats the input_path of the pipeline config as a LIST OF FILES and then applies a sharding function (passed as an argument) to divide the files between the replicas doing the training (one replica per worker). Since my input_path only specified a single TFRecord file it was assigned to the first replica and the other replicas were given empty filenames!! Thus only the first replica actually had an input dataset to work with, hence the crash.
The solution was to split the training data across two files (two TFRecords) and then set the input_path in pipeline.config to be a list of paths rather than a single path. Once I did this it appears as though the model trained successfully (at least it didn't crash).
I'm not sure if this is a bug in the object detection code or not. I assumed that if I only had one training record (visible to both workers) that both workers would use it and just batch the data accordingly. I'm just not sure if the assumption itself is wrong or if the assumption is correct and the code is wrong.
Anyway, I this helps anyone who might be wrestling with the same issue.

Unable to download file (Web Scraping) - OSError [Errorno22] - invalid argument

I wrote a program in Python 3 which scrapes and download the pages of wikipedia category with certain depth and places them in a directory.
The problem which I am facing is, "suppose during the execution of code, if the algorithm encounters any page of wikipedia having special character like (*, #, $ etc.), then the algorithm fails with the below mentioned message in terms of error trace ".
An example of the special character wiki page is as follows:
https://en.wikipedia.org/wiki/Eden*
The error trace is as follows:
Traceback (most recent call last):
File "F:\Pen Drive 8 GB\PDF\Code\wiki.py", line 103, in <module>
d.search_and_store("Biomedical_engineering", subcategory_depth=2, path=PATH)
File "F:\Pen Drive 8 GB\PDF\Code\wiki.py", line 98, in search_and_store
self.search_and_store(subcat_result['title'], subcategory_depth-1, path)
File "F:\Pen Drive 8 GB\PDF\Code\wiki.py", line 98, in search_and_store
self.search_and_store(subcat_result['title'], subcategory_depth-1, path)
File "F:\Pen Drive 8 GB\PDF\Code\wiki.py", line 76, in search_and_store
if self.write_page_text(path, page_result):
File "F:\Pen Drive 8 GB\PDF\Code\wiki.py", line 44, in write_page_text
txt_file = open(file_path, 'w')
OSError: [Errno 22] Invalid argument: 'F:\\Code\\Wikipedia\\DATASETS\\Biomedical Engineering/Eden*.txt'
As you can see clearly, the algorithm scrapes the data of the pages without having any special character's, but why it raising the aforementioned error.
The MWE is very large. If anybody suggests, then I can share the same.
Please suggest something, as I am trying this since long and frustrated. I don't even have idea what I am doing wrong? Please help.
Any small help is deeply appreciated.
Thanks in Advance.

Error I do not understand [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed last month.
Improve this question
I generated this error in Python 3.5:
Traceback (most recent call last):
File "C:\Users\Owner\AppData\Local\Programs\Python\Python35\lib\shelve.py", line 111, in __getitem__
value = self.cache[key]
KeyError: 'P4_vegetables'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Owner\Documents\Python\Allotment\allotment.py", line 217, in
main_program()
File "C:\Users\Owner\Documents\Python\Allotment\allotment.py", line 195, in main_program
main_program()
File "C:\Users\Owner\Documents\Python\Allotment\allotment.py", line 49, in main_program
print("Plot 4 - ", s["P4_vegetables"])
File "C:\Users\Owner\AppData\Local\Programs\Python\Python35\lib\shelve.py", line 113, in __getitem__
f = BytesIO(self.dict[key.encode(self.keyencoding)])
File "C:\Users\Owner\AppData\Local\Programs\Python\Python35\lib\dbm\dumb.py", line 141, in __getitem__
pos, siz = self._index[key] # may raise KeyError
KeyError: b'P4_vegetables'
It has been a while, but in case somebody comes across this: The following error
Traceback (most recent call last):
File "filepath", line 111, in __getitem__
value = self.cache[key]
KeyError: 'item1'
can occur if one attempts to retrieve the item outside of the with block. The shelf is closed once we start executing code outside of the with block. Therefore, any operation performed on the shelf outside of the with block in which the shelf is opened will be considered an invalid operation. For example,
import shelve
with shelve.open('ShelfTest') as item:
item['item1'] = 'item 1'
item['item2'] = 'item 2'
item['item3'] = 'item 3'
item['item4'] = 'item 4'
print(item['item1']) # no error, since shelf file is still open
# If we try print after file is closed
# an error will be thrown
# This is quite common
print(item['item1']) #error. It has been closed, so you can't retrieve it.
Hope this helps anyone who comes across a similar issue as the original poster.
It means that the dictionary (or of whatever type it is) cache does not contain the key named key which is of value 'P4_vegetables'. Make sure you added the key before using it.

Randomly hitting Joblib exception in Sklearn on parallel Grid Search

I am running gridSearchCV in parallel with n_jobs > 1, but randomly hit the following crash in joblib:
TypeError: Cannot create a consistent method resolution
order (MRO) for bases JoblibException, Exception
Here is the complete stack trace:
Traceback (most recent call last):
File "example_sklearn.py", line 92, in <module>
main()
File "example_sklearn.py", line 76, in main
).fit(X_train, y_train)
File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py",
line 372, in fit for clf_params in grid for train, test in cv)
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py",
line 516, in __call__self.retrieve()
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py",
line 448, in retrieve exception_type = _mk_exception(exception.etype)[0]
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/my_exceptions.py",
line 61, in _mk_exception__str__=JoblibException.__str__),
TypeError: Cannot create a consistent method resolution
order (MRO) for bases JoblibException, Exception
Any pointers on what this really is, and how I can debug this. Is this a known issue with sklearn
I had the exact same exception, exactly while using the GridSearchCV.
If you look at the exception, it is complaining about not being able to understand how exactly it should choose between two parent classes JoblibException and Exception. This is a bug in the joblib package, that the inheritance is improper.
But other than than, there exist another problem, which is the source of the exception itself. It's getting an exception while retrieve()ing, and while passing the exception, you get the error.
The second problem (which is the source of the exception), seems to be fixed in later versions of joblib. But scikit-learn is still using an old version (I will submit a pull request with the changed file soon).
A temporary workaround would be to install your own version of joblib using
easy_install joblib
and then go to the sklearn/exterlan folder, remove/rename the joblib folder, and create a symbolic link to your own joblib using:
ln -s /path/to/joblib joblib
EDIT: Seems somebody has had already fixed the problem. My version was also old.

Resources