I'm trying to do topic modeling with Gensim and Mallet (link).
When I set mallet_path and then pass it to gensim, I get the error
subprocess.CalledProcessError: returned non-zero exit status 1
And I get prompted to update Java (which I have done).
Any hints on how to solve it?
import gensim

mallet_path = '/Users/username/mallet-2.0.8/bin/mallet'
ldamallet = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus, num_topics=20, id2word=id2word)
Traceback (most recent call last):
File "<pyshell#85>", line 1, in <module>
ldamallet = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus, num_topics=20, id2word=id2word)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/gensim/models/wrappers/ldamallet.py", line 132, in __init__
self.train(corpus)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/gensim/models/wrappers/ldamallet.py", line 273, in train
self.convert_input(corpus, infer=False)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/gensim/models/wrappers/ldamallet.py", line 262, in convert_input
check_output(args=cmd, shell=True)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/gensim/utils.py", line 1918, in check_output
raise error
subprocess.CalledProcessError: Command '/Users/username/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /var/folders/76/hdlh6w8d3nbb4m424wx3010w0000gn/T/adc98e_corpus.txt --output /var/folders/76/hdlh6w8d3nbb4m424wx3010w0000gn/T/adc98e_corpus.mallet' returned non-zero exit status 1.
In the bin directory, open the mallet file in a text editor and increase the MEMORY limit. It worked for me.
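For reference, in MALLET 2.0.8 the launcher at bin/mallet is a shell script that sets the Java heap size near the top. Assuming your copy still has the default value (it may differ by version), the edit looks like this:

# in mallet-2.0.8/bin/mallet, change the heap setting, e.g. from
MEMORY=1g
# to
MEMORY=4g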
I have tried a lot of different ways, but I can't figure this problem out. I am not an expert in Python; can anyone explain how I can solve it? Please help.
import re
import subprocess

# List all saved Wi-Fi profiles, then query each one for its stored key.
command = "netsh wlan show profile"
ssid = subprocess.check_output(command, shell=True)
ssid = ssid.decode("utf-8")
ssid_list = re.findall(r'(?:Profile\s*:\s)(.*)', ssid)
for ssid_name in ssid_list:
    subprocess.check_output(["netsh", "wlan", "show", "profile", ssid_name, "key=clear"])
and it gives this error:
Traceback (most recent call last):
File "C:/Users/Prakash/Desktop/Hacking tool/Execute_sys_cmd_payload.py", line 18, in <module>
subprocess.check_output(["netsh","wlan","show","profile",ssid_name,"key=clear"])
File "C:\Users\Prakash\AppData\Local\Programs\Python\Python38\lib\subprocess.py", line 411, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "C:\Users\Prakash\AppData\Local\Programs\Python\Python38\lib\subprocess.py", line 512, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['netsh', 'wlan', 'show', 'profile', 'Prakash_WiFi\r', 'key=clear']' returned non-zero exit status 1.
Process finished with exit code 1
With check_output this exception will be raised whenever a command exits abnormally, as indicated by a non-zero return code.
A possible solution is to use subprocess.run instead. This runs the process and returns a CompletedProcess instance, which has a returncode attribute that you can check:
import subprocess as sp

command = ["netsh", "wlan", "show", ssid_name, "key=clear"]
netsh = sp.run(command, stdout=sp.PIPE)  # shell=True is unnecessary with an argument list
if netsh.returncode != 0:
    print(f'netsh error: {netsh.returncode}')
else:
    pass  # process netsh.stdout here
From reading this page, it seems that the command you should use is netsh wlan show (without profile).
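Separately, the traceback shows the likely trigger: the failing command contains 'Prakash_WiFi\r', i.e. the profile name captured by re.findall keeps a trailing carriage return from netsh's Windows line endings. Stripping it before the second call is a one-line fix; a minimal sketch against the original loop:

for ssid_name in ssid_list:
    ssid_name = ssid_name.strip()  # drop the trailing '\r' captured from the netsh output
    sp.check_output(["netsh", "wlan", "show", "profile", ssid_name, "key=clear"])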
I am trying to do hyperparameter tuning on AI Engine for a DNN regressor using the TensorFlow Estimator API. But after submitting the job, it shows that the job failed, and I get this error in the job details.
Can someone help?
Hyperparameter Tuning Trial #1 Failed before any other successful trials were completed. The failed trial had parameters: learning_rate=0.0019937718716419557, num-layers=2, first-layer-size=148, scale-factor=0.7910729020312588, . The trial's error message was: The replica master 0 exited with a non-zero status of 1.
Traceback (most recent call last):
[...]
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 507, in _build_internal
restore_sequentially, reshape)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 385, in _AddShardedRestoreOps
name="restore_shard"))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
restore_sequentially)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
tensor_name = dnn/hiddenlayer_0/bias; shape in shape_and_slice spec [148] does not match the shape stored in checkpoint: [117]
[[node save/RestoreV2_1 (defined at /usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py:1403) ]]
Looks like you are using the same output directory for all the trials, so trial #1 is trying to read trial #2's checkpoint (perhaps because it is the latest one in the directory) and failing because the architecture is different.
Make sure to use a different output directory for each hyperparameter training run. There are two ways to do this:
Use the --job-dir as the output directory.
Append the hyperparameter trial number to the output directory you are using now:
import json
import os

# TF_CONFIG is set by the training service; task.trial holds the trial number.
trial = json.loads(os.environ.get('TF_CONFIG', '{}')).get('task', {}).get('trial', '')
output_dir = os.path.join(output_dir, trial)
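To see what that produces, here is a quick check with a hypothetical base directory and the TF_CONFIG value the service would set for, say, trial 7:

import json
import os

os.environ['TF_CONFIG'] = '{"task": {"trial": "7"}}'  # what the service sets for trial #7
output_dir = 'gs://my-bucket/output'                  # hypothetical base directory
trial = json.loads(os.environ.get('TF_CONFIG', '{}')).get('task', {}).get('trial', '')
print(os.path.join(output_dir, trial))                # prints: gs://my-bucket/output/7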
I started analysing text and eventually ran into the need to download corpora, using PyCharm 2019 as my IDE. I'm not really sure what the traceback message wants me to do, since I already used PyCharm's own library-import interface to enable the corpora. Why does an error stating that the corpora are not available to the code keep reappearing?
I imported TextBlob and tried a line like from textblob import TextBlob; see the code below.
from textblob import TextBlob
TextBlob(train['tweet'][1]).words
print("\nPRINT TOKENIZATION") # own instruction to allow for knowing what code result delivers
print(TextBlob(train['tweet'][1]).words)
….
I tried to install via nltk, with no luck: an error when downloading 'brown.tei'.
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\jcst\AppData\Local\Programs\Python\Python37-32\lib\tkinter__init__.py", line 1705, in call
return self.func(*args)
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\nltk\downloader.py", line 1796, in _download
return self._download_threaded(*e)
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\nltk\downloader.py", line 2082, in _download_threaded
assert self._download_msg_queue == []
AssertionError
Traceback (most recent call last):
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\textblob\decorators.py", line 35, in decorated
return func(*args, **kwargs)
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\textblob\tokenizers.py", line 57, in tokenize
return nltk.tokenize.sent_tokenize(text)
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\nltk\tokenize__init__.py", line 104, in sent_tokenize
tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\nltk\data.py", line 870, in load
opened_resource = _open(resource_url)
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\nltk\data.py", line 995, in _open
return find(path, path + ['']).open()
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\nltk\data.py", line 701, in find
raise LookupError(resource_not_found)
LookupError:
Resource punkt not found.
Please use the NLTK Downloader to obtain the resource:
import nltk
nltk.download('punkt')
For more information see: https://www.nltk.org/data.html
Attempted to load tokenizers/punkt/english.pickle
Searched in:
- 'C:\Users\jcst/nltk_data'
- 'C:\Users\jcst\PycharmProjects\TextMining\venv\nltk_data'
- 'C:\Users\jcst\PycharmProjects\TextMining\venv\share\nltk_data'
- 'C:\Users\jcst\PycharmProjects\TextMining\venv\lib\nltk_data'
- 'C:\Users\jcst\AppData\Roaming\nltk_data'
- 'C:\nltk_data'
- 'D:\nltk_data'
- 'E:\nltk_data'
- ''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/jcst/PycharmProjects/TextMining/ModuleImportAndTrainFileIntro.py", line 151, in
TextBlob(train['tweet'][1]).words
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\textblob\decorators.py", line 24, in get
value = obj.dict[self.func.name] = self.func(obj)
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\textblob\blob.py", line 649, in words
return WordList(word_tokenize(self.raw, include_punc=False))
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\textblob\tokenizers.py", line 73, in word_tokenize
for sentence in sent_tokenize(text))
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\textblob\base.py", line 64, in itokenize
return (t for t in self.tokenize(text, *args, **kwargs))
File "C:\Users\jcst\PycharmProjects\TextMining\venv\lib\site-packages\textblob\decorators.py", line 38, in decorated
raise MissingCorpusError()
textblob.exceptions.MissingCorpusError:
Looks like you are missing some required data for this feature.
To download the necessary data, simply run
python -m textblob.download_corpora
or use the NLTK downloader to download the missing data: http://nltk.org/data.html
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues.
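The message itself names the fix: fetch the missing data into one of the searched directories. A minimal sketch, run once from the same interpreter the project uses (the download lands in an NLTK default location such as C:\Users\jcst\AppData\Roaming\nltk_data):

# option 1: everything TextBlob needs, from a terminal:
#   python -m textblob.download_corpora
# option 2: just the missing tokenizer, via NLTK:
import nltk
nltk.download('punkt')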
I was trying the netjsonconfig command-line utility, specifically the "convert an OpenWRT tar.gz to NetJSON and print to standard output (with 4-space indentation)" example:
netjsonconfig --native network --backend openwrt --method json -a indent="    "
But the following error shows:
ubuntu@ip-172-31-21-48:~/netjsontest$ netjsonconfig --native backup.tar.gz --backend openwrt --method json -a indent="    "
Traceback (most recent call last):
File "/usr/local/bin/netjsonconfig", line 180, in <module>
instance = backend_class(**options)
File "/usr/local/lib/python2.7/dist-packages/netjsonconfig/backends/base/backend.py", line 47, in __init__
self.parse(native)
File "/usr/local/lib/python2.7/dist-packages/netjsonconfig/backends/base/backend.py", line 280, in parse
self.to_netjson()
File "/usr/local/lib/python2.7/dist-packages/netjsonconfig/backends/base/backend.py", line 293, in to_netjson
value = converter.to_netjson()
File "/usr/local/lib/python2.7/dist-packages/netjsonconfig/backends/base/converter.py", line 108, in to_netjson
result = self.to_netjson_loop(block, result, index + 1)
File "/usr/local/lib/python2.7/dist-packages/netjsonconfig/backends/openwrt/converters/wireless.py", line 118, in to_netjson_loop
interface = self.__get_netjson_interface(block)
File "/usr/local/lib/python2.7/dist-packages/netjsonconfig/backends/openwrt/converters/wireless.py", line 246, in __get_netjson_interface
if interface['name'] == wifi['ifname']:
KeyError: 'ifname'
Python version: 2.7.6
OS: Ubuntu 14.04
Can anyone help me get this fixed?
Edit:
http://netjsonconfig.openwisp.org/en/stable/general/commandline_utility.html
What does network contain?
The exception you're getting looks like a bug, though: you shouldn't get a raw traceback, but a clear failure message.
It is probably better to open an issue at https://github.com/openwisp/netjsonconfig
I am using python 3.5.
I downloaded the Stanford parser and extracted it. I have also set the environment variable, and it got set properly. But when I tried to parse a sentence, I got an error.
This is the error:
Traceback (most recent call last):
File "<pyshell#15>", line 1, in <module>
sp.parse("this is a sentence".split())
File "C:\Users\MAHESH\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\parse\api.py", line 45, in parse
return next(self.parse_sents([sent], *args, **kwargs))
File "C:\Users\MAHESH\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\parse\stanford.py", line 120, in parse_sents
cmd, '\n'.join(' '.join(sentence) for sentence in sentences), verbose))
File "C:\Users\MAHESH\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\parse\stanford.py", line 216, in _execute
stdout=PIPE, stderr=PIPE)
File "C:\Users\MAHESH\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\internals.py", line 134, in java
raise OSError('Java command failed : ' + str(cmd))
OSError: Java command failed : ['java.exe', '-mx1000m', '-cp', 'C:/Users/MAHESH/stanfordparser/stanford-parser-full-2015-04-20\\stanford-parser-3.5.2-models.jar;C:/Users/MAHESH/stanfordparser/stanford-parser-full-2015-04-20\\ejml-0.23.jar;C:/Users/MAHESH/stanfordparser/stanford-parser-full-2015-04-20\\stanford-parser-3.5.2-javadoc.jar;C:/Users/MAHESH/stanfordparser/stanford-parser-full-2015-04-20\\stanford-parser-3.5.2-models.jar;C:/Users/MAHESH/stanfordparser/stanford-parser-full-2015-04-20\\stanford-parser-3.5.2-sources.jar;C:/Users/MAHESH/stanfordparser/stanford-parser-full-2015-04-20\\stanford-parser.jar', 'edu.stanford.nlp.parser.lexparser.LexicalizedParser', '-model', 'edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz', '-sentences', 'newline', '-outputFormat', 'penn', '-tokenized', '-escaper', 'edu.stanford.nlp.process.PTBEscapingProcessor', '-encoding', 'utf8', 'C:\\Users\\MAHESH\\AppData\\Local\\Temp\\tmp1jcjvrl1']