python how to detect if type is 'datetime.time' - python-3.x

I am trying to test whether a value's type is 'datetime.time' and, if so, convert it to 'datetime.datetime'. My code snippet is below. x_values is a Series, and each element of the Series is a 'datetime.time'.
...
x_values = x.loc[:, "processed_time"]
print(x_values.dtypes)
print(type(x_values.iloc[0]))
print(x_values)
if isinstance(x_values.iloc[0], datetime.time):
    x_values = pd.to_datetime(x_values, format='%H:%M:%S')
...
But the program errors out at the test with:
Traceback (most recent call last):
File "/Users/.../risk_calculations.py", line 282, in plot_risk
if isinstance(x_values.iloc[0], datetime.time):
TypeError: isinstance() arg 2 must be a type or tuple of types
object
<class 'datetime.time'>
1387 00:55:14
1388 10:02:01
1389 10:02:02
1390 10:02:02
1391 10:02:08
...
6417 14:36:49
6418 14:36:51
6419 15:24:52
6420 15:36:59
6422 16:21:03
Name: processed_time, Length: 3621, dtype: object
This Stack answer seemed closest to addressing my problem, and I think I have implemented its suggestion correctly. Note that the print statements show that the type is in fact <class 'datetime.time'>, as required (I think) by isinstance, so I don't understand why the error occurs. I know I can make it work if I replace the 'if' statement with:
if 'datetime.time' in str(type(x_values.iloc[0])):
...
But that seems kludgy. Is there a more correct test for an instance of 'datetime.time'?
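One common cause of this exact TypeError is that the file imports the class with `from datetime import datetime`, so `datetime.time` refers to a method of the datetime class rather than to the `time` type. The question does not show its imports, so this is an assumption. A minimal sketch using only the standard library, where `value` is a hypothetical stand-in for `x_values.iloc[0]`:

```python
import datetime as dt  # module alias, so dt.time is unambiguously the class

# Hypothetical stand-in for x_values.iloc[0] from the question.
value = dt.time(10, 2, 1)

# With the module imported, dt.time is a real type and isinstance works.
is_time = isinstance(value, dt.time)

# If the file instead did `from datetime import datetime`, then
# `datetime.time` is a method of the datetime class, not a type,
# and isinstance() rejects it with a TypeError.
from datetime import datetime as datetime_cls
try:
    isinstance(value, datetime_cls.time)
    raised = False
except TypeError:
    raised = True

# One way to promote a time to a full datetime: attach a date.
as_datetime = dt.datetime.combine(dt.date(1900, 1, 1), value)
```

If that import is indeed the cause, importing the module itself (`import datetime`) or the class directly (`from datetime import time`) should let the isinstance test work without the string comparison.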

Related

Encountered an internal AutoML error- ClientException: Message: No objects to concatenate

I am trying to implement hierarchical time series forecasting with Azure AutoML pipelines.
I followed this notebook for the implementation:
https://github.com/Azure/azureml-examples/blob/main/v1/python-sdk/tutorials/automl-with-azureml/forecasting-hierarchical-timeseries/auto-ml-forecasting-hierarchical-timeseries.ipynb
When I ran the training pipeline on a compute instance it worked, but when I run the same pipeline on a compute cluster it breaks at the hts-proportion-calculation step.
This is the error I am getting:
system error:
Encountered an internal AutoML error. Error Message/Code: ClientException. Additional Info: ClientException:
      Message: No objects to concatenate
      InnerException: None
      ErrorResponse
{
"error": {
"message": "No objects to concatenate"
}
}
logs :
Loading arguments for scenario proportions-calculation
adding argument --input-medatadata
adding argument --hts-graph
adding argument --enable-event-logger
Input arguments dict is {'--input-medatadata': '/mnt/azureml/cr/j/85509be625484b6caa3c1d97b7ab2e33/cap/data-capability/wd/INPUT_automl_training_workspaceblobstore/azureml/17ca5ae7-7269-4246-888f-e781071e3f5c/automl_training', '--hts-graph': '/mnt/azureml/cr/j/85509be625484b6caa3c1d97b7ab2e33/cap/data-capability/wd/INPUT_hts_graph_workspaceblobstore/azureml/a2c1b15a-c895-41e8-b6a6-1ca37ebe9e77/hts_graph', '--enable-event-logger': None}
Unknown file to proceed outputs.txt
processing: outputs.txt with type None.
Cleaning up all outstanding Run operations, waiting 300.0 seconds
3 items cleaning up...
Cleanup took 0.001676321029663086 seconds
Traceback (most recent call last):
File "proportions_calculation_wrapper.py", line 47, in <module>
runtime_wrapper.run()
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_many_models/automl_pipeline_step_wrapper.py", line 63, in run
self._run()
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 44, in _run
proportions_calculation(self.arguments_dict, self.event_logger, script_run=self.step_run)
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 173, in proportions_calculation
proportion_files_list, forecasting_parameters.time_column_name, graph.label_column_name
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 92, in calculate_time_agg_sum_for_all_files
df = pd.concat(pool.map(concat_func, files_batches), ignore_index=True)
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 304, in concat
sort=sort,
File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 351, in __init__
raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
Please let me know how I can resolve this issue.
This error occurred because the iteration timeout was not less than the experiment timeout, but the system error and logs are misleading: they point to the pandas error "No objects to concatenate", raised at this line:
df = pd.concat(pool.map(concat_func, files_batches), ignore_index=True)
The error can be avoided by setting iteration_timeout_minutes to a value smaller than the experiment timeout (experiment_timeout_hours).
I had set iteration_timeout_minutes=60, which equals the one-hour experiment timeout and caused the error. Settings with iteration_timeout_minutes=10:
automl_settings = AutoMLConfig(
    task="forecasting",
    primary_metric="normalized_root_mean_squared_error",
    experiment_timeout_hours=1,
    label_column_name=label_column_name,
    track_child_runs=False,
    forecasting_parameters=forecasting_parameters,
    pipeline_fetch_max_batch_size=15,
    model_explainability=model_explainability,
    n_cross_validations="auto",  # Feel free to set to a small integer (>=2) if runtime is an issue.
    cv_step_size="auto",
    # The following settings are specific to this sample and should be adjusted according to your own needs.
    iteration_timeout_minutes=10,
    iterations=15,
)
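Since the logs do not mention the timeouts at all, a small guard before building the configuration can catch the mismatch early. This is only a sketch: the helper name `iteration_timeout_fits` is hypothetical and not part of the Azure ML SDK, and it simply encodes the rule described above.

```python
def iteration_timeout_fits(experiment_timeout_hours, iteration_timeout_minutes):
    """Return True when one iteration's timeout is strictly below the experiment's."""
    experiment_timeout_minutes = experiment_timeout_hours * 60
    return iteration_timeout_minutes < experiment_timeout_minutes


# The combination that failed: 60 minutes per iteration, 1 hour overall.
failing_config_ok = iteration_timeout_fits(1, 60)

# The combination used in the settings above: 10 minutes per iteration.
working_config_ok = iteration_timeout_fits(1, 10)
```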
We were able to run the sample successfully on a compute cluster created as follows:
from azureml.core.compute import ComputeTarget, AmlCompute
# Name your cluster
compute_name = "hts-compute"
if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and type(compute_target) is AmlCompute:
        print("Found compute target: " + compute_name)
else:
    print("Creating a new compute target...")
    provisioning_config = AmlCompute.provisioning_configuration(
        vm_size="STANDARD_D16S_V3", max_nodes=20
    )
    # Create the compute target
    compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)
    # Can poll for a minimum number of nodes and for a specific timeout.
    # If no min node count is provided it will use the scale settings for the cluster
    compute_target.wait_for_completion(
        show_output=True, min_node_count=None, timeout_in_minutes=20
    )
    # For a more detailed view of current cluster status, use the 'status' property
    print(compute_target.status.serialize())

How to use strategy.scope() in RoBERTa?

I built a model using BERT for an NLI problem, and it ran without problems. However, when I adapted it to RoBERTa and used strategy.scope(), it raised an error that I don't know how to solve. I would appreciate any pointers.
´´´
max_len1 = 515  # 128*4 for the premise plus 128*4 for the hypothesis
def build_model1():
    input_word_ids = tf.keras.Input(shape=(max_len1,), dtype=tf.int32, name="input_word_ids")
    input_mask = tf.keras.Input(shape=(max_len1,), dtype=tf.int32, name="input_mask")
    input_type_ids = tf.keras.Input(shape=(max_len1,), dtype=tf.int32, name="input_type_ids")
    embedding = model([input_word_ids, input_mask, input_type_ids])[0]
    output = tf.keras.layers.Dense(3, activation='softmax')(embedding[:, 0, :])
    model3 = tf.keras.Model(inputs=[input_word_ids, input_mask, input_type_ids], outputs=output)
    model3.compile(tf.keras.optimizers.Adam(lr=1e-5),
                   loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model3
with strategy.scope():
    model3 = build_model1()
    model3.summary()
WARNING:tensorflow:The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
WARNING:tensorflow:AutoGraph could not transform <bound method Socket.send of <zmq.sugar.socket.Socket object at 0x7f2425631d00>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module, class, method, function, traceback, frame, or code object was expected, got cython_function_or_method
To silence this warning, decorate the function with #tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <bound method Socket.send of <zmq.sugar.socket.Socket object at 0x7f2425631d00>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module, class, method, function, traceback, frame, or code object was expected, got cython_function_or_method
To silence this warning, decorate the function with #tf.autograph.experimental.do_not_convert
WARNING: AutoGraph could not transform <bound method Socket.send of <zmq.sugar.socket.Socket object at 0x7f2425631d00>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module, class, method, function, traceback, frame, or code object was expected, got cython_function_or_method
To silence this warning, decorate the function with #tf.autograph.experimental.do_not_convert
WARNING:tensorflow:The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
WARNING:tensorflow:AutoGraph could not transform <function wrap at 0x7f243c214d40> and will run it as-is.
Cause: while/else statement not yet supported
To silence this warning, decorate the function with #tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function wrap at 0x7f243c214d40> and will run it as-is.
Cause: while/else statement not yet supported
To silence this warning, decorate the function with #tf.autograph.experimental.do_not_convert
WARNING:tensorflow:The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
WARNING: AutoGraph could not transform <function wrap at 0x7f243c214d40> and will run it as-is.
Cause: while/else statement not yet supported
To silence this warning, decorate the function with #tf.autograph.experimental.do_not_convert
WARNING:tensorflow:The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-24-e91a2e7e4b41> in <module>()
1 with strategy.scope():
----> 2 model3 = build_model1()
3 model3.summary()
2 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py in _validate_compile(self, optimizer, metrics, **kwargs)
2533 'with strategy.scope():\n'
2534 ' model=_create_model()\n'
-> 2535 ' model.compile(...)' % (v, strategy))
2536
2537 # Model metrics must be created in the same distribution strategy scope
ValueError: Variable (<tf.Variable 'tfxlm_roberta_model/roberta/encoder/layer_._0/attention/self/query/kernel:0' shape=(1024, 1024) dtype=float32, numpy=
array([[-0.00294119, -0.00129846, 0.00517603, ..., 0.03835522,
0.0218797 , 0.02100084],
[-0.00933813, -0.05062149, 0.01634834, ..., -0.02387142,
0.0113477 , -0.02262339],
[-0.02023344, -0.04181184, -0.00581416, ..., -0.00609464,
0.00801133, 0.00512151],
...,
[-0.02129102, -0.03157991, -0.04071935, ..., 0.04682101,
0.01948426, 0.00312433],
[-0.04902648, -0.01055507, 0.01377375, ..., 0.00845209,
0.01616496, -0.01041171],
[ 0.00759454, -0.00162496, -0.00215843, ..., -0.03199947,
-0.03871808, 0.04949447]], dtype=float32)>) was not created in the distribution strategy scope
of (<tensorflow.python.distribute.tpu_strategy.TPUStrategy object at 0x7f21fcbbb210>). It is most
likely due to not all layers or the model or optimizer being created outside the distribution
strategy scope. Try to make sure your code looks similar to the following.
with strategy.scope():
model=_create_model()
model.compile(...)
´´´
The same code, as I said above, works perfectly for BERT. For RoBERTa, I only changed the tokenizer and the model loading.
I managed to solve it. After investigating, I found that the RoBERTa implementation requires more than just calling the model.

Pyomo: sending options="threads" to cbc solver causes an error

Multithreading can be enabled from the command line:
$cbc -threads=6
Welcome to the CBC MILP Solver
Version: 2.9.9
Build Date: Aug 21 2017
$command line - cbc -threads=6 (default strategy 1)
threads was changed from 0 to 6
But when I try to enable this option in Pyomo code:
opt = SolverFactory('cbc')
result = opt.solve(instance, options="threads=4")
I get an error:
File "/usr/local/lib/python3.9/dist-packages/pyomo/opt/base/solvers.py", line 561, in solve
self.options.update(kwds.pop('options', {}))
File "/usr/local/lib/python3.9/dist-packages/pyutilib/misc/misc.py", line 360, in update
if type(d[k]) is dict:
TypeError: string indices must be integers
Any ideas?
The options keyword argument expects a dictionary. If you want to use the same syntax as the command line, use options_string instead:
opt.solve(instance, options_string="threads=4")
Or pass the option as a dictionary:
opt.solve(instance, options={"threads": 4})
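The traceback hints at why a string fails: the update routine indexes its argument with `d[k]`, which works for a dictionary but not for a string. A minimal sketch of that behavior follows. The function `update_options` and its loop are assumptions made for illustration; only the `type(d[k]) is dict` check appears in the traceback.

```python
def update_options(current, d):
    """Hypothetical, simplified stand-in for the options update in the traceback."""
    for k in d:  # assumed loop; iterating a string yields single characters
        if type(d[k]) is dict:  # the check shown in the traceback
            current.setdefault(k, {}).update(d[k])
        else:
            current[k] = d[k]
    return current


# A dictionary is indexed by its keys, so the update succeeds.
ok = update_options({}, {"threads": 4})

# A string is indexed by integers, so d["t"] raises TypeError.
try:
    update_options({}, "threads=4")
    string_failed = False
except TypeError:
    string_failed = True
```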

for loop over list KeyError: 664

I am trying to iterate over this list of words:
CTCCTC TCCTCT CCTCTC CTCTCC TCTCCC CTCCCA TCCCAA CCCAAA CCAAAC CAAACT
CTGGGC TGGGCC GGGCCA GGCCAA GCCAAT CCAATG CAATGC AATGCC ATGCCT TGCCTG GCCTGC
TGCCAG GCCAGG CCAGGA CAGGAG AGGAGG GGAGGG GAGGGG AGGGGC GGGGCT GGGCTG GGCTGG GCTGGT CTGGTC
TGGTCT GGTCTG GTCTGG TCTGGA CTGGAC TGGACA GGACAC GACACT ACACTA CACTAT
ATTCAG TTCAGC TCAGCC CAGCCA AGCCAG GCCAGT CCAGTC CAGTCA AGTCAA GTCAAC TCAACA CAACAC AACACA
ACACAA CACAAG ACAAGG AGGTGG GGTGGC GTGGCC TGGCCT GGCCTG GCCTGC CCTGCA CTGCAC
TGCACT GCACTC CACTCG ACTCGA CTCGAG TCGAGG CGAGGT GAGGTT AGGTTC GGTTCC
TATATA ATATAC TATACC ATACCT TACCTG ACCTGG CCTGGT CTGGTA TGGTAA GGTAAT GTAATG TAATGG AATGGA
I am using a for loop to read each item in the list and pass it to mk_model.vector.
The code is as follows:
for x in all_seq_sentences[:]:
    mk_model.vector(x)
    print(x)
Usually, mk_model.vector("AGT") returns an array from the trained dna2vec model. But here, instead of running the model, it throws this error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-144-77c47b13e98a> in <module>
1 for x in all_seq_sentences[:]:
----> 2 mk_model.vector(x)
3 print(x)
4
~/Desktop/DNA2vec/dna2vec/dna2vec/multi_k_model.py in vector(self, vocab)
35
36 def vector(self, vocab):
---> 37 return self.data[len(vocab)].model[vocab]
38
39 def unitvec(self, vec):
KeyError: 664
Looking forward to some help here
The problem was that the for loop treated each whole line as a single item, so mk_model.vector received one 664-character string and looked up a model for k-mers of length 664 (hence KeyError: 664). Splitting each line with .split() solves it. For reference, see https://python-reference.readthedocs.io/en/latest/docs/str/split.html
Working code:
for i in all_seq_sentences:
    word = i.split()
    print(word[0])
Then use a nested loop to call mk_model.vector on each word:
vec_of_all_seq = []
for sentence in all_seq_sentences:
    sentence = sentence.split()
    for word in sentence:
        vec_of_all_seq.append(mk_model.vector(word))
The vector representations returned by mk_model.vector are collected in the list vec_of_all_seq.
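The effect can be reproduced without dna2vec installed. The sketch below uses a hypothetical stand-in class, `StubMultiKModel`, that only mimics the length-keyed lookup visible in the traceback. The supported lengths (3 to 8) and the returned values are assumptions for illustration, not the real model's output.

```python
class StubMultiKModel:
    """Hypothetical stand-in for dna2vec's MultiKModel, keyed by k-mer length."""

    def __init__(self, supported_lengths):
        # Maps k-mer length -> per-length lookup (here just a dict cache).
        self.data = {k: {} for k in supported_lengths}

    def vector(self, vocab):
        # Mirrors the lookup in the traceback: the length selects the sub-model.
        table = self.data[len(vocab)]
        return table.setdefault(vocab, [float(len(vocab))])


mk_model = StubMultiKModel(supported_lengths=range(3, 9))
all_seq_sentences = [
    "CTCCTC TCCTCT CCTCTC",
    "CTGGGC TGGGCC GGGCCA GGCCAA",
]

# Passing a whole line fails: its length (20) is not a supported k.
try:
    mk_model.vector(all_seq_sentences[0])
    whole_line_failed = False
except KeyError:
    whole_line_failed = True

# Splitting on whitespace yields individual 6-mers the model can look up.
vec_of_all_seq = [
    mk_model.vector(word)
    for sentence in all_seq_sentences
    for word in sentence.split()
]
```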

Pandas: Data Frame has 2 columns as 1

q.head()
Outputs
            Weekly_Sales
Date
2010-02-28     131963.08
2010-03-31      91237.14
2010-04-30     150516.76
2010-05-31      66694.15
2010-06-30      66740.70
The problem I'm facing is that I want to plot the 'Date' column against the 'Weekly_Sales' column. I've already used the command
q=y.resample('M',on='Date').sum()
to convert the weekly data to monthly, which results in the DataFrame above.
type(q)
outputs "class 'pandas.core.frame.DataFrame'", showing that q is a DataFrame. However, q doesn't seem to have two separate columns, as shown here:
q.Weekly_Sales
outputs
Date
2010-02-28 131963.08
2010-03-31 91237.14
2010-04-30 150516.76
2010-05-31 66694.15
2010-06-30 66740.70
2010-07-31 81915.01
2010-08-31 64578.81
2010-09-30 71913.27
2010-10-31 134644.53
2010-11-30 92161.40
2010-12-31 173983.88
2011-01-31 69146.59
2011-02-28 125762.63
2011-03-31 82823.34
2011-04-30 165056.95
2011-05-31 68251.72
2011-06-30 62978.57
2011-07-31 78856.23
2011-08-31 59061.95
2011-09-30 87756.41
2011-10-31 98806.83
2011-11-30 98537.51
2011-12-31 174512.07
2012-01-31 70205.35
2012-02-29 134683.30
2012-03-31 114680.54
2012-04-30 125600.12
2012-05-31 70792.98
2012-06-30 83646.54
2012-07-31 66468.79
2012-08-31 83045.57
2012-09-30 76137.90
2012-10-31 96244.56
Freq: M, Name: Weekly_Sales, dtype: float64
whereas
q.Date
outputs
Traceback (most recent call last):
File "<pyshell#8>", line 1, in <module>
q.Date
File "C:\Program Files (x86)\Python36-32\lib\site-packages\pandas\core\generic.py", line 3614, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'Date'
Since both columns appear under q.Weekly_Sales, how do I separate them into two columns and finally plot them?
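Date is the index of q, not a column. Selecting with double brackets [[ ]] returns the single column as a DataFrame rather than a Series; calling reset_index() then turns the Date index into a regular column: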
new_s=q[['Weekly_Sales']].reset_index()
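A small self-contained sketch of the same idea, using a hypothetical three-row sample built to resemble q (the frame is constructed directly here instead of through resample):

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of q from the question.
q = pd.DataFrame(
    {"Weekly_Sales": [131963.08, 91237.14, 150516.76]},
    index=pd.to_datetime(["2010-02-28", "2010-03-31", "2010-04-30"]).rename("Date"),
)

# 'Date' is the index, so it is not listed among the columns.
columns_before = list(q.columns)

# reset_index() moves the index into an ordinary column.
new_s = q[["Weekly_Sales"]].reset_index()
columns_after = list(new_s.columns)
```

For plotting alone, resetting the index may not be necessary: the dates are also available as q.index, which can be passed as the x values.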
