Spark server logs are trimmed - apache-spark

I am trying to fetch the spark server logs for the particular event. But it is getting trimmed in the middle itself
For Eg: If i have a process instance Id of 862939, i will be having a list of events such as for Start, Exception and Completed.
Start/Completed Event Logs: { timestamp:dd/mm/yyy,"log
level"="INFO","source"="/path", "exception"="","message"=""}
Exception logs:{ timestamp:dd/mm/yyy,"log
level"="ERROR","source"="/path",
"exception"="java.lang.NumberFormatException: for input string: {
As you could see the exception logs get truncated in the middle itself without even having all the other fields followed by exception. Could anyone help me to get the full logs for the exception.

Related

sometime Elasticsearch throwing exception while saving logs to elasticsearch

While saving logs in elasticsearch. some time logs save in ES but sometime its throwing exception
mapper_parsing_exception] object mapping for [meta.user_details.permissions] tried to parse field [null] as object, but found a concrete value
I find a solution online that I need to delete index then reindex but I cannot do that.
any other solution.

Why does it shows same logs twice (one as info and one as error with same message) on stackdriver?

I am preparing some logging using python, but whenever I run my code it generates the logs but shows twice on stackdriver console (one as info and one as error). Anyone have idea on how to deal with this problem.
my code:
import logging
from google.cloud import logging as gcp_logging
log_client = gcp_logging.Client()
log_client.setup_logging()
# here executing some bigquery operations
logging.info("Query result loaded into temporary table: {}".format(temporary_table))
# here executing some bigquery operations
logging.error("Query executed with empty result set.")
When I run above code it show above logs twice on stackdriver.
Info:2019-10-17T11:54:02.504Z cf-mycloudfunctionname Query result loaded into temporary table: mytable
Error:2019-10-17T11:54:02.505Z cf-mycloudfunctionname Query result loaded into temporary table: mytable
What I can see is that both (error & info) has been recognized as a plane text, so it sent the same message as info for stderr and stdout, this is why you are getting two same messages.
What you need to do is to correct phrase the this two logs into an structured JSON, this to be recognized by stackdriver as one entity with the correct payload to be displayed.
Additional, you can configure the stackdriver agent to send the logs as you need to be sent, take a look to this document.
Also It will depend from where you are trying to retrieve this logs GCE, GKE, BQ. In some cases its preferible to change the structure of fluentd directly instead of the stackdriver agent.

Logstash 6.2 - full persistent queue (wrong mapping?)

My queue is almost full and I see this errors in my log file:
[2018-05-16T00:01:33,334][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"2018.05.15-el-mg_papi-prod", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x608d85c1>], :response=>{"index"=>{"_index"=>"2018.05.15-el-mg_papi-prod", "_type"=>"doc", "_id"=>"mHvSZWMB8oeeM9BTo0V2", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [papi_request_json.query.disableFacets]", "caused_by"=>{"type"=>"i_o_exception", "reason"=>"Current token (VALUE_TRUE) not numeric, can not use numeric value accessors\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper#56b8442f; line: 1, column: 555]"}}}}}
[2018-05-16T00:01:37,145][INFO ][org.logstash.beats.BeatsHandler] [local: 0:0:0:0:0:0:0:1:5000, remote: 0:0:0:0:0:0:0:1:50222] Handling exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 69
[2018-05-16T00:01:37,147][INFO ][org.logstash.beats.BeatsHandler] [local: 0:0:0:0:0:0:0:1:5000, remote: 0:0:0:0:0:0:0:1:50222] Handling exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 84
...
[2018-05-16T15:28:09,981][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})
[2018-05-16T15:28:09,982][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})
[2018-05-16T15:28:09,982][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})
If I understand first warning, problem is with mapping. I have a lot of files in my queue Logstash folder. My questions is:
How to empty my queue, can i just delete all files from logstash queue folder(all logs will be lost)? And then resend all the data to logstash to proper index?
How can I determine where exactly is problem in mapping, or which servers sending data of wrong type?
I have pipeline on port 5000 named testing-pipeline just for checking if Logstash is active from nagios. What is that [INFO ][org.logstash.beats.BeatsHandler] logs?
If I understand correctly, [INFO ][logstash.outputs.elasticsearch] is just logs about retrying to process logstash queue?
On all servers is FIlebeat 6.2.2. Thank you for your help.
All pages in queue could be deleted but it is not the proper solution. In my case, the queue was full because there was events with different mapping of index. In Elasticsearch 6, you cannot send documents with different mapping to the same index so the logs stacked in queue because of this logs (even if there is only one wrong event, all others will not be processed). So how to process all data you can process an skip the wrong one? Solution is to configure DLQ (dead letter queue). Every event with response code 400 or 404 is moved to DLQ so others could be process. The data from DLQ can be processed later with pipeline.
Wrong mapping can be determined by error log "error"=>{"type"=>"mapper_parsing_exception", ..... }. To specify exact place with wrong mapping, you have to compare mapping of events and indices.
[INFO ][org.logstash.beats.BeatsHandler] was caused by Nagios server. The check did not consist of valid request, that's why the Handling exception. The check should test if Logstash service is active. Now I check Logstas service on localhost:9600, for more info here.
[INFO ][logstash.outputs.elasticsearch] means that Logstash trying to process the queue but index is locked ([FORBIDDEN/12/index read-only / allow delete (api)]) because the indices was set to read-only state. Elasticsearch, when there is not enough space on server, automatically configure indices to read-only. This can be change by cluster.routing.allocation.disk.watermark.low, for more info here.

UserErrorSqlBulkCopyInvalidColumnLength - Azure SQL Database

I am configuring a Salesforce to Azure SQL Database data copy using Azure Data Factory. There appears to be an issue with the column length, but I am unable to identify which column is actually causing an issue.
How can I gain more insight into exactly what is causing my problem? or what column is really invalid?
{
"dataRead":18560714,
"dataWritten":0,
"rowsRead":15514,
"rowsCopied":0,
"copyDuration":34,
"throughput":533.109,
"errors":[
{
"Code":9123,
"Message":"ErrorCode=UserErrorSqlBulkCopyInvalidColumnLength,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=SQL Bulk Copy failed due to received an invalid column length from the bcp client.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=The service has encountered an error processing your request. Please try again. Error code 4815.\r\nA severe error occurred on the current command. The results, if any, should be discarded.,Source=.Net SqlClient Data Provider,SqlErrorNumber=40197,Class=20,ErrorCode=-2146232060,State=1,Errors=[{Class=20,Number=40197,State=1,Message=The service has encountered an error processing your request. Please try again. Error code 4815.,},{Class=20,Number=0,State=0,Message=A severe error occurred on the current command. The results, if any, should be discarded.,},],'",
"EventType":0,
"Category":5,
"Data":{
},
"MsgId":null,
"ExceptionType":null,
"Source":null,
"StackTrace":null,
"InnerEventInfos":[
]
}
],
"effectiveIntegrationRuntime":"DefaultIntegrationRuntime (East US 2)",
"usedCloudDataMovementUnits":4,
"usedParallelCopies":1,
"executionDetails":[
{
"source":{
"type":"Salesforce"
},
"sink":{
"type":"AzureSqlDatabase"
},
"status":"Failed",
"start":"2018-03-01T18:07:37.5732769Z",
"duration":34,
"usedCloudDataMovementUnits":4,
"usedParallelCopies":1,
"detailedDurations":{
"queuingDuration":5,
"timeToFirstByte":24,
"transferDuration":4
}
}
]
}
"Message":"ErrorCode=UserErrorSqlBulkCopyInvalidColumnLength,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=SQL
Bulk Copy failed due to received an invalid column length from the bcp
client.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=The
service has encountered an error processing your request. Please try
again. Error code 4815.\r\nA severe error occurred on the current
command. The results, if any, should be
discarded.,Source=.Net SqlClient Data
Provider,SqlErrorNumber=40197,Class=20,ErrorCode=-2146232060,State=1,Errors=[{Class=20,Number=40197,State=1,Message=The service has encountered an error processing your request. Please try
again. Error code 4815.,},{Class=20,Number=0,State=0,Message=A severe
error occurred on the current command. The results, if any,
should be discarded.
I have seen that error when trying copy a string from the source to the destination but the column receiving the string on the destination is shorter than the source.
My suggestion is to make sure you selected data types big enough on the destination so they can receive the data from the source. Make sure also the sink has the columns shown in the preview.
To identify which column may be involved, exclude big columns one by one from the source.

Loopback 3 discards error information on multiple validation errors, turning 422 to 500, how can I solve that?

I'm migrating from Loopback 2 tot 3.
I currently have an issue with validation errors and strong-error-handler
When I post a bulk create which results in multiple validation errors, those get returned as an array of ValidationErrors.
Those errors get grouped by strong-error handler in a 500 internal server error, which is how it was before, but the details of the errors get discarded, when debug is set to false.
In my example I upload an array of tags, but for each tag, a uniqueness validation is executed. When 2 or more tags are already in the database, I have an array of errors, instead of a single validation error
I need a way to determine why the validation failed on the client side, but the details of the errors are discarded now.
Am I doing something wrong here, or should this be considered as a bug?
From the strongloop error handler documentation in loopback,
In production mode, strong-error-handler omits details from error responses to prevent leaking sensitive information:
More information
For 5xx errors, the output contains only the status code and the status name from the HTTP specification.
For 4xx errors, the output contains the full error message (error.message) and the contents of the details property (error.details) that ValidationError typically uses to provide machine-readable details about validation problems. It also includes error.code to allow a machine-readable error code to be passed through which could be used, for example, for translation.
Am I doing something wrong here, or should this be considered as a bug?
No this is the intended behaviour
Safe error fields
You can set the stack trace as "safe-error-field" so that it will be displayed in production.
For example, the stack field is not displayed by default if you run the loopback in production mode.
If you still want to display the stack field, then change the config json in the server/middleware.json
"final:after": {
"strong-error-handler": {
"params": {
"safeFields": ["stack"]
}
}
}

Resources