How to debug Ecto.Migration constraints?

Enforcing the format of phone numbers (such as "5551234567") in the schema works fine using Ecto.Changeset.validate_format/4 with the ~r/^\d{10}$/ regex, so I decided to add a constraint to the database as well, but it throws an exception.
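For context, a minimal sketch of the changeset-level validation that already works (the module name and changeset function are illustrative; the field names follow the migration below):

defmodule ANV.PhoneNumber do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key {:id, :binary_id, autogenerate: true}
  schema "phone_numbers" do
    field :phone_number, :string

    timestamps()
  end

  def changeset(phone_number, attrs) do
    phone_number
    |> cast(attrs, [:phone_number])
    |> validate_format(:phone_number, ~r/^\d{10}$/)
  end
end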
The migration:
defmodule ANV.Repo.Migrations.CreatePhoneNumbers do
  use Ecto.Migration

  def change do
    create table(:phone_numbers, primary_key: false) do
      add :id, :uuid, primary_key: true
      add :phone_number, :string

      timestamps()
    end

    create constraint(
      :phone_numbers,
      :phone_number_must_be_a_ten_digit_string,
      check: "phone_number ~ '^\d{10}$'"
      # This variant doesn't work either
      # check: "phone_number ~ $$^\d{10}\Z$$"
    )
  end
end
I can't find a hint about what I'm doing wrong in the error below, and adding Ecto.Changeset.check_constraint/3, as the error message suggests, doesn't surface any changeset errors either; the update still fails.
iex(2)> Repo.preload(admin, :phone_numbers) \
...(2)> |> Ecto.Changeset.change() \
...(2)> |> Ecto.Changeset.put_assoc(:phone_numbers, [%{phone_number: "5551234567"}]) \
...(2)> |> Repo.update()
[debug] QUERY OK source="phone_numbers" db=1.0ms queue=1.5ms
SELECT p0."id", p0."phone_number", p0."user_id", p0."inserted_at", p0."updated_at", p0."user_id" FROM "phone_numbers" AS p0 WHERE (p0."user_id" = $1) ORDER BY p0."user_id" [<<151, 69, 143, 150, 60, 216, 64, 125, 152, 42, 217, 7, 215, 117, 58, 30>>]
[debug] QUERY OK db=0.2ms
begin []
[debug] QUERY ERROR db=4.7ms
INSERT INTO "phone_numbers" ("phone_number","user_id","inserted_at","updated_at","id") VALUES ($1,$2,$3,$4,$5) ["5551234567", <<151, 69, 143, 150, 60, 216, 64, 125, 152, 42, 217, 7, 215, 117, 58, 30>>, ~N[2019-09-05 21:39:43], ~N[2019-09-05 21:39:43], <<217, 77, 187, 38, 201, 82, 78, 117, 156, 144, 64, 248, 249, 55, 227, 70>>]
[debug] QUERY OK db=0.3ms
rollback []
** (Ecto.ConstraintError) constraint error when attempting to insert struct:
* phone_number_must_be_a_ten_digit_string (check_constraint)
If you would like to stop this constraint violation from raising an
exception and instead add it as an error to your changeset, please
call `check_constraint/3` on your changeset with the constraint
`:name` as an option.
The changeset has not defined any constraint.
(ecto) lib/ecto/repo/schema.ex:687: anonymous fn/4 in Ecto.Repo.Schema.constraints_to_errors/3
(elixir) lib/enum.ex:1336: Enum."-map/2-lists^map/1-0-"/2
(ecto) lib/ecto/repo/schema.ex:672: Ecto.Repo.Schema.constraints_to_errors/3
(ecto) lib/ecto/repo/schema.ex:274: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
(ecto) lib/ecto/association.ex:662: Ecto.Association.Has.on_repo_change/5
(ecto) lib/ecto/association.ex:432: anonymous fn/8 in Ecto.Association.on_repo_change/7
(elixir) lib/enum.ex:1948: Enum."-reduce/3-lists^foldl/2-0-"/3
(ecto) lib/ecto/association.ex:428: Ecto.Association.on_repo_change/7
(elixir) lib/enum.ex:1948: Enum."-reduce/3-lists^foldl/2-0-"/3
(ecto) lib/ecto/association.ex:392: Ecto.Association.on_repo_change/4
(ecto) lib/ecto/repo/schema.ex:837: Ecto.Repo.Schema.process_children/5
(ecto) lib/ecto/repo/schema.ex:914: anonymous fn/3 in Ecto.Repo.Schema.wrap_in_transaction/6
(ecto_sql) lib/ecto/adapters/sql.ex:890: anonymous fn/3 in Ecto.Adapters.SQL.checkout_or_transaction/4
(db_connection) lib/db_connection.ex:1415: DBConnection.run_transaction/4
I also reconfigured PostgreSQL logging (following 19.8. Error Reporting and Logging) to log everything, but the only related lines are just as unhelpful (or I'm missing something obvious):
2019-09-05 22:26:33.060 UTC [15340] ERROR: new row for relation "phone_numbers" violates check constraint "phone_number_must_be_a_ten_digit_string"
2019-09-05 22:26:33.060 UTC [15340] DETAIL: Failing row contains (e4ce295a-9a7b-4a38-8e24-6f7191f711ce, 5551234567, 98b5b906-16f3-4ea0-875b-f8d4539efe06, 2019-09-05 22:26:33, 2019-09-05 22:26:33).
2019-09-05 22:26:33.060 UTC [15340] STATEMENT: INSERT INTO "phone_numbers" ("phone_number","user_id","inserted_at","updated_at","id") VALUES ($1,$2,$3,$4,$5)

The backslash needs to be escaped. It should have been
check: "phone_number ~ '^\\d{10}$'".
From the initial migration:
create constraint(
  :phone_numbers,
  :phone_number_must_be_a_ten_digit_string,
  check: "phone_number ~ '^\d{10}$'"
)
The same regex but input directly in psql:
ALTER TABLE phone_numbers
ADD CONSTRAINT phone_number_must_be_a_ten_digit_string2
CHECK (phone_number ~ '^\d{10}$');
Comparison side-by-side (using \d+ phone_numbers in psql):
Check constraints:
"phone_number_must_be_a_ten_digit_string" CHECK (phone_number::text ~ '^^?{10}$'::text)
"phone_number_must_be_a_ten_digit_string2" CHECK (phone_number::text ~ '^\d{10}$'::text)

Related

Encountered an internal AutoML error - ClientException: Message: No objects to concatenate

I am trying to implement hierarchical time series forecasting on Azure AutoML pipelines.
I followed this notebook for the implementation:
https://github.com/Azure/azureml-examples/blob/main/v1/python-sdk/tutorials/automl-with-azureml/forecasting-hierarchical-timeseries/auto-ml-forecasting-hierarchical-timeseries.ipynb
When I ran the training pipeline on a compute instance it worked, but when I run the same pipeline on a compute cluster it breaks at the hts-proportion-calculation step.
This is the error I am getting:
system error:
Encountered an internal AutoML error. Error Message/Code: ClientException. Additional Info: ClientException:
      Message: No objects to concatenate
      InnerException: None
      ErrorResponse
{
  "error": {
    "message": "No objects to concatenate"
  }
}
Logs:
Loading arguments for scenario proportions-calculation
adding argument --input-medatadata
adding argument --hts-graph
adding argument --enable-event-logger
Input arguments dict is {'--input-medatadata': '/mnt/azureml/cr/j/85509be625484b6caa3c1d97b7ab2e33/cap/data-capability/wd/INPUT_automl_training_workspaceblobstore/azureml/17ca5ae7-7269-4246-888f-e781071e3f5c/automl_training', '--hts-graph': '/mnt/azureml/cr/j/85509be625484b6caa3c1d97b7ab2e33/cap/data-capability/wd/INPUT_hts_graph_workspaceblobstore/azureml/a2c1b15a-c895-41e8-b6a6-1ca37ebe9e77/hts_graph', '--enable-event-logger': None}
Unknown file to proceed outputs.txt
processing: outputs.txt with type None.
Cleaning up all outstanding Run operations, waiting 300.0 seconds
3 items cleaning up...
Cleanup took 0.001676321029663086 seconds
Traceback (most recent call last):
  File "proportions_calculation_wrapper.py", line 47, in <module>
    runtime_wrapper.run()
  File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_many_models/automl_pipeline_step_wrapper.py", line 63, in run
    self._run()
  File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 44, in _run
    proportions_calculation(self.arguments_dict, self.event_logger, script_run=self.step_run)
  File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 173, in proportions_calculation
    proportion_files_list, forecasting_parameters.time_column_name, graph.label_column_name
  File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/azureml/train/automl/runtime/_hts/proportions_calculation.py", line 92, in calculate_time_agg_sum_for_all_files
    df = pd.concat(pool.map(concat_func, files_batches), ignore_index=True)
  File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 304, in concat
    sort=sort,
  File "/azureml-envs/azureml_e34d0633ffc4cb2fa25d91e3da5f59be/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 351, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
Please let me know how I can resolve this issue.
This error occurred because the iteration timeout was not less than the experiment timeout; the system error and logs are somewhat misleading. The logs point at pandas' "No objects to concatenate", raised from

df = pd.concat(pool.map(concat_func, files_batches), ignore_index=True)

but the error is fixed by setting the iteration timeout to a value smaller than the experiment timeout. I had set iteration_timeout_minutes=60, which is not less than the experiment_timeout_hours=1 (60 minutes) used below, and that caused the error.
automl_settings = AutoMLConfig(
    task="forecasting",
    primary_metric="normalized_root_mean_squared_error",
    experiment_timeout_hours=1,
    label_column_name=label_column_name,
    track_child_runs=False,
    forecasting_parameters=forecasting_parameters,
    pipeline_fetch_max_batch_size=15,
    model_explainability=model_explainability,
    n_cross_validations="auto",  # Feel free to set to a small integer (>=2) if runtime is an issue.
    cv_step_size="auto",
    # The following settings are specific to this sample and should be adjusted according to your own needs.
    iteration_timeout_minutes=10,  # must stay below the experiment timeout (60 minutes here)
    iterations=15,
)
We are able to run the sample successfully using the compute cluster as given below.
from azureml.core.compute import ComputeTarget, AmlCompute

# Name your cluster
compute_name = "hts-compute"

if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and type(compute_target) is AmlCompute:
        print("Found compute target: " + compute_name)
else:
    print("Creating a new compute target...")
    provisioning_config = AmlCompute.provisioning_configuration(
        vm_size="STANDARD_D16S_V3", max_nodes=20
    )
    # Create the compute target
    compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)

    # Can poll for a minimum number of nodes and for a specific timeout.
    # If no min node count is provided it will use the scale settings for the cluster.
    compute_target.wait_for_completion(
        show_output=True, min_node_count=None, timeout_in_minutes=20
    )

    # For a more detailed view of current cluster status, use the 'status' property.
    print(compute_target.status.serialize())

PyAlgoTrade: How to use resampleBarFeed with multiple instruments?

I am resampling a few instruments with pyalgotrade.
I have a base barfeed for 1-minute data, which is working fine.
I have added a resampler to resample to 2 minutes, as follows:
class Strategy(strategy.BaseStrategy):
    def __init__(self, instruments, feed, brk):
        strategy.BaseStrategy.__init__(self, feed, brk)
        self.__position = None
        self.__instrument = instruments
        self._resampledBF = self.resampleBarFeed(
            2 * bar.Frequency.MINUTE, self.resampledOnBar_2minute
        )
        self.info("initialised strategy")
I got this error:
2022-09-08 12:36:00,396 strategy [INFO] 1-MIN: INSTRUMENT1: Date: 2022-09-08 12:35:00+05:30 Open: 17765.55 High: 17774.5 Low: 17765.35 Close: 1777
    myStrategy.run()
  File "pyalgotrade\pyalgotrade\strategy\__init__.py", line 514, in run
    self.__dispatcher.run()
  File "pyalgotrade\pyalgotrade\dispatcher.py", line 109, in run
    eof, eventsDispatched = self.__dispatch()
  File "pyalgotrade\pyalgotrade\dispatcher.py", line 97, in __dispatch
    if self.__dispatchSubject(subject, smallestDateTime):
  File "pyalgotrade\pyalgotrade\dispatcher.py", line 75, in __dispatchSubject
    ret = subject.dispatch() is True
  File "pyalgotrade\pyalgotrade\feed\__init__.py", line 106, in dispatch
    dateTime, values = self.getNextValuesAndUpdateDS()
  File "pyalgotrade\pyalgotrade\feed\__init__.py", line 81, in getNextValuesAndUpdateDS
    dateTime, values = self.getNextValues()
  File "pyalgotrade\pyalgotrade\barfeed\__init__.py", line 101, in getNextValues
    raise Exception(
Exception: Bar date times are not in order. Previous datetime was 2022-09-08 12:34:00+05:30 and current datetime is 2022-09-08 12:34:00+05:30
However, the error does not occur if the self._resampledBF = self.resampleBarFeed(...) line is commented out.
Also, while searching online, I found a similar report and a possible fix in an earlier Google Groups thread: https://groups.google.com/g/pyalgotrade/c/v9ht1Bfz5Ds/m/ojF8uH8sFwAJ
The solution recommended there was:
Sorry never mind, I fixed it. Using current timestamp instead of the one from IB and that fixed it.
Not sure if this has been resolved.
I would like to know how to resolve the error while resampling.

I am getting this error: MermaidExtension.extendMarkdown() missing 1 required positional argument: 'md_globals'

This was working not too long ago (I probably inadvertently upgraded a library somewhere). All of my libraries are up to date.
Here is the stack trace:
File "C:\Users\jorda\Documents\projects\python\poolBoy\flaskApp.py", line 422, in about
html += markdown.markdown(text, extensions=['md_mermaid', 'fenced_code', 'tables'])
File "C:\Users\jorda\Documents\projects\python\poolBoy\venv\lib\site-packages\markdown\core.py", line 386, in markdown
md = Markdown(**kwargs)
File "C:\Users\jorda\Documents\projects\python\poolBoy\venv\lib\site-packages\markdown\core.py", line 96, in __init__
self.registerExtensions(extensions=kwargs.get('extensions', []),
File "C:\Users\jorda\Documents\projects\python\poolBoy\venv\lib\site-packages\markdown\core.py", line 125, in registerExtensions
ext.extendMarkdown(self)
TypeError: MermaidExtension.extendMarkdown() missing 1 required positional argument: 'md_globals'
Any suggestions would be most appreciated!
I must have upgraded Markdown inadvertently. Python-Markdown 3.2 removed the deprecated md_globals argument from extendMarkdown(), so it now calls ext.extendMarkdown(self) with a single argument, while md_mermaid still defines the old two-argument signature. I added the following to my requirements and it is working fine now:
Markdown<3.2
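If you would rather keep the newer Markdown and can patch the extension, a common compatibility shim is to accept md_globals optionally, since Markdown 3.2 stopped passing it. A sketch (the actual processor registration is elided):

from markdown.extensions import Extension

class MermaidExtension(Extension):
    # Markdown < 3.2 passes md_globals; Markdown >= 3.2 does not.
    # An optional parameter keeps the extension working on both.
    def extendMarkdown(self, md, md_globals=None):
        ...  # register the mermaid processors on md as before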

Dataflow job fails with HttpError, NotImplementedError

I'm running a Dataflow job which I think should work, but it fails after 1.5 hours with what look like network errors. It works fine when run against a subset of the data.
The first trouble sign is a whole string of warnings like this:
Refusing to split <dataflow_worker.shuffle.GroupedShuffleRangeTracker object at 0x7f2bcb629950> at b'\xa4r\xa6\x85\x00\x01': proposed split position is out of range [b'\xa4^E\xd2\x00\x01', b'\xa4r\xa6\x85\x00\x01'). Position of last group processed was b'\xa4r\xa6\x84\x00\x01'.
Then there are four errors which seem to be about writing CSV files to GCS:
Error in _start_upload while inserting file gs://(redacted).csv:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/gcsio.py", line 565, in _start_upload
    self._client.objects.Insert(self._insert_request, upload=self._upload)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 1156, in Insert
    upload=upload, upload_config=upload_config)
  File "/usr/local/lib/python3.7/site-packages/apitools/base/py/base_api.py", line 731, in _RunMethod
    return self.ProcessHttpResponse(method_config, http_response, request)
  File "/usr/local/lib/python3.7/site-packages/apitools/base/py/base_api.py", line 737, in ProcessHttpResponse
    self.__ProcessHttpResponse(method_config, http_response, request))
  File "/usr/local/lib/python3.7/site-packages/apitools/base/py/base_api.py", line 604, in __ProcessHttpResponse
    http_response, method_config=method_config, request=request)
apitools.base.py.exceptions.HttpError: HttpError accessing <https://www.googleapis.com/resumable/upload/storage/v1/b/(redacted).csv&uploadType=resumable&upload_id=(redacted)>: response: <{'content-type': 'text/plain; charset=utf-8', 'x-guploader-uploadid': '(redacted)', 'content-length': '0', 'date': 'Wed, 08 Jul 2020 22:17:28 GMT', 'server': 'UploadServer', 'status': '503'}>, content <>

Error in _start_upload while inserting file gs://(redacted).csv:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/gcsio.py", line 565, in _start_upload
    self._client.objects.Insert(self._insert_request, upload=self._upload)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 1156, in Insert
    upload=upload, upload_config=upload_config)
  File "/usr/local/lib/python3.7/site-packages/apitools/base/py/base_api.py", line 715, in _RunMethod
    http_request, client=self.client)
  File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 908, in InitializeUpload
    return self.StreamInChunks()
  File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 1020, in StreamInChunks
    additional_headers=additional_headers)
  File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 971, in __StreamMedia
    self.RefreshResumableUploadState()
  File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 873, in RefreshResumableUploadState
    self.stream.seek(self.progress)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/filesystemio.py", line 301, in seek
    offset, whence, self.position, self.last_block_position))
NotImplementedError: offset: 0, whence: 0, position: 411, last: 411
The Dataflow job ID is 2020-07-07_13_08_31-7649894576933400587 -- if anyone from Google Cloud Support is able to look at this I'd be very grateful. Thanks very much.
P.S. I asked a similar question last year (Dataflow job fails at BigQuery write with backend errors); the resolution was to use --experiments=use_beam_bq_sink, which I am already doing.
You can safely ignore the "Refusing to split" errors. They just mean that the split position the Dataflow service proposed was probably received by the worker after that position had already been read, so the worker has to ignore the split request.
The "Error in _start_upload while inserting" error looks more problematic and seems similar to https://issues.apache.org/jira/browse/BEAM-7014. I suspect it is a rare flake, though, so I'm not sure it was the reason for your job failure (the job only fails if the same work item fails four times).
Can you contact Google Cloud Support so that they can look into your job?
I will mention this in the JIRA.

Vim errorformat for jshint (with example)

I am trying to write an errorformat (efm) for this output:
file: app/assets/javascripts/topbar.js
line 17, col 3, Missing semicolon.
line 19, col 19, 'is_mobile' was used before it was defined.
line 21, col 1965, Expected '{' and instead saw 'check'.
line 25, col 18, 'onScroll' was used before it was defined.
file: app/assets/javascripts/trends.js
line 2, col 55, Missing semicolon.
line 6, col 27, 'trendTypeSelected' was used before it was defined.
line 7, col 32, Expected '===' and instead saw '=='.
JSHint check failed
but I think I am missing the concept of the file name appearing on a separate line. Can someone help me out with this?
This is what I have so far and it does not work:
%-Pfile:\ %f,line\ %l\,\ col\ %c\,\ %m
Thanks!
Solution:
set efm=%-Pfile:\ %f,%*[\ ]line\ %l\\,\ col\ %c\\,\ %m,%-Q,%-GJSHint\ check\ failed
Vim provides %P, which pushes the parsed file (%f) onto the stack, see :help errorformat-separate-filename.
So for your error output, it would be something like %+Pfile: %f,...
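A quick way to test it (jshint_output.txt is a hypothetical file containing the jshint output above):

" With the errorformat above set, load the saved jshint output into the
" quickfix list; each entry then carries the file name pushed by %P.
cgetfile jshint_output.txt
copen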
