BIServerComponent - Could not get details about the specified host - components

I have an OBIEE 11g installation on a Red Hat machine, but I'm having problems getting it running. I can start WebLogic and its services, so I'm able to enter the WebLogic console and Enterprise Manager, but problems arise when I try to start the OBIEE components with the opmnctl command.
Set up OBIEE Components
cd /home/Oracle/Middleware/instances/instance1/bin/
./opmnctl startall
Check the status of components
cd /home/Oracle/Middleware/instances/instance1/bin/
./opmnctl status
Processes in Instance: instance1
---------------------------------+--------------------+---------+---------
ias-component                     | process-type       |     pid | status
---------------------------------+--------------------+---------+---------
coreapplication_obiccs1           | OracleBIClusterCo~ |    8221 | Alive
coreapplication_obisch1           | OracleBIScheduler~ |     N/A | Down
coreapplication_obijh1            | OracleBIJavaHostC~ |    8726 | Alive
coreapplication_obips1            | OracleBIPresentat~ |    6921 | Alive
coreapplication_obis1             | OracleBIServerCom~ |     N/A | Down
OracleBIServerComponent is in "Down" status, and I have checked the problem in Enterprise Manager. The content of the log is:
[2014-03-20T10:23:47.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] [] [ecid: 004xGPJYrnnFw00Fzzx0g00006Gb000000] [tid: fb3c8c60] [36007] Loading repository /home/Oracle/Middleware/instances/instance1/bifoundation/OracleBIServerComponent/coreapplication_obis1/repository/Justizia_BI0011.rpd.
[2014-03-20T10:23:48.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] [] [ecid: 004xGPJYrnnFw00Fzzx0g00006Gb000000] [tid: 43cf5940] [14055] Loading subject area: JUSTICIA_F1 ...
[2014-03-20T10:23:48.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] [] [ecid: 004xGPJYrnnFw00Fzzx0g00006Gb000000] [tid: 43cf5940] [14056] Finished loading subject area: JUSTICIA_F1.
[2014-03-20T10:23:48.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] [] [ecid: 004xGPJYrnnFw00Fzzx0g00006Gb000000] [tid: fb3c8c60] [85003] MDX Member Name Cache subsystem started successfully.
[2014-03-20T10:23:48.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] [] [ecid: 004xGPJYrnnFw00Fzzx0g00006Gb000000] [tid: fb3c8c60] [85004] MDX Member Name Cache subsystem recovered entries: 0, size: 0 bytes.
[2014-03-20T10:23:52.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] [] [ecid: 004xGPJYrnnFw00Fzzx0g00006Gb000000] [tid: fb3c8c60] [85008] MDX Member Name Cache subsystem statistics - Entries: 0, Sizes: 0 bytes, Overall Queries: 0, Hits: 0(0%), Misses: 0(0%).
[2014-03-20T10:23:52.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] [] [ecid: 004xGPJYrnnFw00Fzzx0g00006Gb000000] [tid: fb3c8c60] [85007] MDX Member Name Cache subsystem stopped.
[2014-03-20T10:23:53.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] [] [ecid: 004xGPJYrnnFw00Fzzx0g00006Gb000000] [tid: fb3c8c60] [14058] Unloaded all subject areas.
[2014-03-20T10:23:53.000+00:00] [OracleBIServerComponent] [ERROR:1] [] [] [ecid: ] [tid: fb3c8c60] [nQSError: 46131] Unresolved host name: JustiziaInf. Could not get details about the specified host
The important part of the log is:
[nQSError: 46131] Unresolved host name: JustiziaInf. Could not get details about the specified host
Any idea?
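Since [nQSError: 46131] is a name-resolution failure, a quick sanity check (a minimal sketch, assuming the unresolved name is exactly the JustiziaInf shown in the log) is to test whether the Red Hat machine can resolve that hostname at all; if it cannot, an /etc/hosts or DNS entry is the usual remedy.
# Minimal resolution check; "JustiziaInf" is taken from the nQSError message above.
import socket

hostname = "JustiziaInf"
try:
    print(f"{hostname} resolves to {socket.gethostbyname(hostname)}")
except socket.gaierror as err:
    # The OS cannot resolve the name, which matches nQSError 46131.
    # A typical fix is an /etc/hosts entry such as:  <server-ip>  JustiziaInf
    print(f"{hostname} does not resolve: {err}")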

Related

How can I fix an out-of-memory issue in a Kusto query?

I have an issue with memory usage in an Azure dashboard. The query below runs well and produces the expected result, but from time to time it throws a "memory usage exceeded" error.
let Requests = requests
| extend MyCustomer= trim_end('/',tostring(split(customDimensions.source,'//')[1])), OrderID= tostring(customDimensions.subject),
Environment=trim_start(#'[\d.][\d.][\d.]',tostring(split(cloud_RoleName,'-')[4])), Data= tostring(customDimensions.data), state= tostring(customDimensions.type),
OperationName = tostring(customDimensions.OperationName),Category = tostring(customDimensions.Category)
| where OrderID != "" and MyCustomer !contains "eurotracs"
| project success, resultCode, state,ProcessingDate = timestamp, MyCustomer, OrderID, Environment, operation_Id,OperationName,Category
| join kind=fullouter (exceptions
| extend Method = innermostMethod,ReasonMessage=innermostMessage
| distinct ReasonMessage,Method, operation_Id,method )
on operation_Id
| sort by OrderID desc, ProcessingDate desc;
Requests
|extend
Test1_OK =iif(split(method, ".")[-1] contains "Test1", "NO","YES")
,Test2_OK =iif(split(method, ".")[-1] contains "Test2", "NO","YES")
,Test3_OK =iif(split(method, ".")[-1] !contains "Test1" and split(method, ".")[-1] !contains "Test2", "NO","YES")
| project ProcessingDate,OrderID,success,Test1_OK,Test2_OK,Test3_OK, ProcessingStateDetail =split(method, ".")[-1],ReasonMessage, state
| order by ProcessingDate

Use RabbitMQ as producer and Celery as consumer

I'm trying to use RabbitMQ, Celery, and a Flask app to simply update the database. ProcedureAPI.py is an API that gets the data, inserts records into the database, and pushes the data to the RabbitMQ server. Celery gets the data from the RabbitMQ queue and updates the database.
I'm new to this, please point out what I'm doing wrong.
consumer.py
from celery import Celery
import sqlite3
import time

# app = Celery('Task_Queue')
# default_config = 'celeryconfig'
# app.config_from_object(default_config)
app = Celery('tasks', backend='rpc://', broker='pyamqp://guest:guest@localhost')

@app.task(serializer='json')
def updateDB(x):
    x = x["item"]
    with sqlite3.connect("test.db") as conn:
        time.sleep(5)
        conn.execute('''updateQuery''', [x])
        # app.log(f"{x['item']} status is updated as completed!")
    return x
ProcedureAPI.py
from flask import Flask, request, jsonify
import pandas as pd
import sqlite3
import json
import pika
import configparser

parser = configparser.RawConfigParser()
configFilePath = 'appconfig.conf'
parser.read(configFilePath)

# RabbitMQ config
rmq_username = parser.get('general', 'rmq_USERNAME')
rmq_password = parser.get('general', 'rmq_PASSWORD')
host = parser.get('general', 'rmq_IP')
port = parser.get('general', 'rmq_PORT')

# Database
DATABASE = parser.get('general', 'DATABASE_FILE')

app = Flask(__name__)

conn_credentials = pika.PlainCredentials(rmq_username, rmq_password)
connection = pika.BlockingConnection(pika.ConnectionParameters(
    host=host,
    port=port,
    credentials=conn_credentials))
channel = connection.channel()

@app.route('/create', methods=['POST'])
def create_main():
    if request.method == "POST":
        print(DATABASE)
        with sqlite3.connect(DATABASE) as conn:
            conn.execute('''CREATE TABLE table1
                            (feild1 INTEGER PRIMARY KEY,  -- AUTOINCREMENT
                             feild2 varchar(20) NOT NULL,
                             feild3 varchar(20) DEFAULT 'pending');''')
        return "Table created", 202

@app.route('/getData', methods=['GET'])
def display_main():
    if request.method == "GET":
        with sqlite3.connect(DATABASE) as conn:
            df = pd.read_sql_query("SELECT * from table1", conn)
        df_list = df.values.tolist()
        JSONP_data = jsonify(df_list)
        return JSONP_data, 200

@app.route('/', methods=['POST'])
def update_main():
    if request.method == "POST":
        updatedata = request.get_json()
        with sqlite3.connect(DATABASE) as conn:
            conn.execute("INSERT_Query")
            print("Records Inserted successfully")
        channel.queue_declare(queue='celery', durable=True)
        channel.basic_publish(exchange='celery',
                              routing_key='celery',
                              body=json.dumps(updatedata),
                              properties=pika.BasicProperties(delivery_mode=2))
        return updatedata, 202

# main driver function
if __name__ == '__main__':
    app.run()
Config file (appconfig.conf)
[general]
# RabbitMQ server (broker) IP address
rmq_IP=127.0.0.1
# RabbitMQ server (broker) TCP port number (5672 or 5671 for SSL)
rmq_PORT=5672
# queue name (storage node hostname)
rmq_QUEUENAME=Task_Queue
# RabbitMQ authentication
rmq_USERNAME=guest
rmq_PASSWORD=guest
DATABASE_FILE=test.db
# log file
receiver_LOG_FILE=cmdmq_receiver.log
sender_LOG_FILE=cmdmq_sender.log
Run the Celery worker
celery -A consumer worker --pool=solo -l info
The error I got:
(env1) PS C:\Users\USER\Desktop\Desktop\Jobs Search\nodepython\flaskapp> celery -A consumer worker --pool=solo -l info
-------------- celery@DESKTOP-FRBNH77 v5.2.0 (dawn-chorus)
--- ***** -----
-- ******* ---- Windows-10-10.0.19041-SP0 2021-11-12 17:35:04
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: tasks:0x1ec10c9c5c8
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: rpc://
- *** --- * --- .> concurrency: 12 (solo)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. consumer.updateDB
[2021-11-12 17:35:04,546: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2021-11-12 17:35:04,571: INFO/MainProcess] mingle: searching for neighbors
[2021-11-12 17:35:05,594: INFO/MainProcess] mingle: all alone
[2021-11-12 17:35:05,605: INFO/MainProcess] celery@DESKTOP-FRBNH77 ready.
[2021-11-12 17:35:14,952: WARNING/MainProcess] Received and deleted unknown message. Wrong destination?!?
The full contents of the message body was: body: '{"item": "1BOOK"}' (17b)
{content_type:None content_encoding:None
delivery_info:{'consumer_tag': 'None4', 'delivery_tag': 1, 'redelivered': False, 'exchange':
'celery', 'routing_key': 'celery'} headers={}}
Any reference code or suggestion will be a great help.
It looks like you haven't declared the exchange and bound it to the queue that you want to route to:
channel.exchange_declare(exchange='exchange_name', exchange_type="type_of_exchange")
channel.queue_bind(exchange='exchange_name', queue='your_queue_name')
Producer : P
Exchange : E
Queue : Q
Bind : B
The producer (your pika script) is not able to send a message directly to a queue; it needs an intermediary, so the message is routed as
P >> E >> B >> Q
The exchange routes the request to one or multiple queues, depending on the exchange type.
A binding (as the name suggests) is used to bind an exchange to a queue, depending on the exchange type.
For more detail, please refer to:
https://hevodata.com/learn/rabbitmq-exchange-type/
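For reference, a minimal pika sketch of the declare/bind/publish flow described above, using the 'celery' exchange and queue names from the question (the exchange type and durability flags here are assumptions and must match whatever the consumer declares):
# Producer-side sketch: declare the exchange, declare the queue, bind them, then publish (P >> E >> B >> Q).
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
channel = connection.channel()

channel.exchange_declare(exchange='celery', exchange_type='direct', durable=True)
channel.queue_declare(queue='celery', durable=True)
channel.queue_bind(exchange='celery', queue='celery', routing_key='celery')

channel.basic_publish(
    exchange='celery',
    routing_key='celery',
    body=json.dumps({"item": "1BOOK"}),
    properties=pika.BasicProperties(delivery_mode=2),  # persistent message
)
connection.close()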

Find logs of POD in AKS using Log Analytics Query

There is an AKS cluster running that is connected to Log Analytics in Azure.
I'm trying to view logs of named PODs using the following query snippet:
let KubePodLogs = (clustername:string, podnameprefix:string) {
let ContainerIdList = KubePodInventory
| where ClusterName =~ clustername
| where Name startswith strcat(podnameprefix, "-")
| where strlen(ContainerID)>0
| distinct ContainerID;
ContainerLog
| where ContainerID in (ContainerIdList)
| join (KubePodInventory | project ContainerID, Name, PodLabel, Namespace, Computer) on ContainerID
| project TimeGenerated, Node=Computer, Namespace, PodName=Name1, PodLabel, ContainerID, LogEntry
};
KubePodLogs('aks-my-cluster', 'my-service') | order by TimeGenerated desc
The above query does return rows for the matching PODs, but not all of the rows that are actually available.
Trying to get results of the partial queries by inspecting POD details:
KubePodInventory
| where ClusterName =~ 'aks-my-cluster'
| where Name startswith 'my-service-'
| where strlen(ContainerID)>0
| distinct ContainerID;
gives me a container-id. Now feeding this container-id into another query shows more results than the combined query above. Why?
ContainerLog
| where ContainerID == "aec001...fc31"
| order by TimeGenerated desc
| project TimeGenerated, ContainerID, LogEntry
One thing I noticed is that the results of the latter, simple query contain log entries whose LogEntry field is parsed from the JSON-formatted output of the POD. In those results I can expand LogEntry into more fields corresponding to the original JSON data of that POD's log output.
I.e. it seems like the combined query (with a join) skips those JSON LogEntry ContainerLog entries, but why?
As far as I can see, the combined query doesn't filter on the LogEntry field in any way.
A changed query seems to produce the results I would expect:
I exchanged the join for a lookup and used more columns in the distinct over the KubePodInventory results.
let KubePodLogs = (clustername:string, podnameprefix:string) {
let ContainerIdList = KubePodInventory
| where ClusterName =~ clustername
| where Name startswith strcat(podnameprefix, "-")
| where strlen(ContainerID)>0
| distinct ContainerID, PodLabel, Namespace, PodIp, Name;
ContainerLog
| where ContainerID in (ContainerIdList)
| lookup kind=leftouter (ContainerIdList) on ContainerID
| project-away Image, ImageTag, Repository, Name, TimeOfCommand
| project-rename PodName=Name1
};
KubePodLogs('aks-my-cluster', 'my-service') | order by TimeGenerated desc

Tensorflow checkpoints are not correctly saved when using gcloud compute unit instead of local

When I train locally using google cloud buckets as data source and destination with:
gcloud ml-engine local train --module-name trainer.task_v2s --package-path trainer/
I get normal results and checkpoints are saved properly every 20 steps, since my dataset is 400 examples and I use a batch size of 20: 400/20 = 20 steps = 1 epoch. These files get saved in my model dir in the bucket:
model.ckpt-0.data-00000-of-00001
model.ckpt-0.index
model.ckpt-0.meta
model.ckpt-20.data-00000-of-00001
model.ckpt-20.index
model.ckpt-20.meta
Furthermore my local GPU is properly engaged:
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1018 G /usr/lib/xorg/Xorg 212MiB |
| 0 1889 G compiz 69MiB |
| 0 5484 C ...rtualenvs/my_project/bin/python 2577MiB |
+-----------------------------------------------------------------------------+
When I now try to use a gcloud compute unit:
gcloud ml-engine jobs submit training my_job_name \
--module-name trainer.task_v2s --package-path trainer/ \
--staging-bucket gs://my-bucket --region europe-west1 \
--scale-tier BASIC_GPU --runtime-version 1.8 --python-version 3.5
It takes around the same time to save a checkpoint, but it is saved in 1-step increments, even though the data sources have not changed. The loss is also decreasing much more slowly, as it would if only one example were being trained. This is how the files look:
model.ckpt-0.data-00000-of-00001
model.ckpt-0.index
model.ckpt-0.meta
model.ckpt-1.data-00000-of-00001
model.ckpt-1.index
model.ckpt-1.meta
The GPU is also not getting engaged at all:
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
I'm using a custom estimator with no configured cluster spec, as I assume you only need that for distributed computing, and my run_config looks like this:
Using config: {'_master': '', '_num_ps_replicas': 0, '_session_config': None, '_task_id': 0, '_model_dir': 'gs://my_bucket/model_dir', '_save_checkpoints_steps': None, '_tf_random_seed': None, '_task_type': 'master', '_keep_checkpoint_max': 5, '_evaluation_master': '', '_device_fn': None, '_save_checkpoints_secs': 600, '_save_summary_steps': 100, '_cluster_spec': , '_log_step_count_steps': 100, '_is_chief': True, '_global_id_in_cluster': 0, '_num_worker_replicas': 1, '_service': None, '_keep_checkpoint_every_n_hours': 10000, '_train_distribute': None}
From the logs I can also see the TF_CONFIG environment variable:
{'environment': 'cloud', 'cluster': {'master': ['127.0.0.1:2222']}, 'job': {'python_version': '3.5', 'run_on_raw_vm': True, 'package_uris': ['gs://my-bucket/my-project10/27cb2041a4ae5a14c18d6e7f8622d9c20789e3294079ad58ab5211d8e09a2669/MyProject-0.9.tar.gz'], 'runtime_version': '1.8', 'python_module': 'trainer.task_v2s', 'scale_tier': 'BASIC_GPU', 'region': 'europe-west1'}, 'task': {'cloud': 'qc6f9ce45ab3ea3e9-ml', 'type': 'master', 'index': 0}}
My guess is that I need to configure something I haven't, but I have no idea what. I also get some warnings at the beginning, but I don't think they have anything to do with this:
google-cloud-vision 0.29.0 has requirement requests<3.0dev,>=2.18.4, but you'll have requests 2.13.0 which is incompatible.
I just found my error: I needed to put tensorflow-gpu instead of tensorflow in my setup.py. Even better, as rhaertel80 stated, is to omit tensorflow altogether.
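For illustration, a minimal setup.py along those lines; the package name and version are assumptions mirroring the package URI in TF_CONFIG above, and since the ML Engine runtime already ships TensorFlow, the dependency can usually be left out entirely.
# Hypothetical trainer setup.py; 'MyProject' and '0.9' mirror the package URI in TF_CONFIG.
from setuptools import find_packages, setup

setup(
    name='MyProject',
    version='0.9',
    packages=find_packages(),
    install_requires=[
        # 'tensorflow-gpu==1.8.0',  # only if you must pin TF yourself; plain 'tensorflow'
        #                           # is the CPU-only build for 1.x releases
    ],
)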

How to zip two columns in pyspark? [duplicate]

This question already has answers here:
How to zip two array columns in Spark SQL
(4 answers)
Closed 2 years ago.
I use Python 3.6 and PySpark 2.3.0. In the following example I have only two items in item, but I can also have more information like first_name, last_name, city.
I have a data frame with the following schema:
 |-- email: string (nullable = true)
 |-- item: struct (nullable = true)
 |    |-- item: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- data: string (nullable = true)
 |    |    |    |-- fieldid: string (nullable = true)
 |    |    |    |-- fieldname: string (nullable = true)
 |    |    |    |-- fieldtype: string (nullable = true)
This is my output:
+-----+-----------------------------------------------------------------------------------------+
|email|item |
+-----+-----------------------------------------------------------------------------------------+
|x |[[[Gmail, 32, Email Client, dropdown], [Device uses Proxy Server, 33, Device, dropdown]]]|
|y |[[[IE, 32, Email Client, dropdown], [Personal computer, 33, Device, dropdown]]] |
+-----+-----------------------------------------------------------------------------------------+
I want to transform this data frame to:
+-----+------------+------------------------+
|email|Email Client|Device                  |
+-----+------------+------------------------+
|x    |Gmail       |Device uses Proxy Server|
|y    |IE          |Personal computer       |
+-----+------------+------------------------+
I do some transformations:
df = df.withColumn('item', df.item.item)
df = df.withColumn('column_names', df.item.fieldname)
df = df.withColumn('column_values', df.item.data)
And now my output is:
+-----+----------------------+---------------------------------+
|email|column_names |column_values |
+-----+----------------------+---------------------------------+
|x |[Email Client, Device]|[Gmail, Device uses Proxy Server]|
|y |[Email Client, Device]|[IE, Personal computer] |
+-----+----------------------+---------------------------------+
From here I want a method to zip these columns.
You asked how to zip the arrays, but you can actually get to your desired output without the intermediate steps of creating the column_names and column_values columns.
Use the getItem() function to grab the desired values by index:
import pyspark.sql.functions as f
df = df.select(
'email',
f.col('item.data').getItem(0).alias('Email Client'),
f.col('item.data').getItem(1).alias('Device')
)
df.show(truncate=False)
#+-----+------------+------------------------+
#|email|Email Client|Device |
#+-----+------------+------------------------+
#|x |Gmail |Device uses Proxy Server|
#|y |IE |Personal computer |
#+-----+------------+------------------------+
This assumes that the Email Client field is always at index 0 and Device is at index 1.
If you can't assume that the fields are always in the same order in each row, another option is to create a map from the values in the column_names and column_values using pyspark.sql.functions.create_map().
This function takes a:
list of column names (string) or list of Column expressions that [are] grouped as key-value pairs, e.g. (key1, value1, key2, value2, ...).
We iterate over the items in column_names and column_values to create a list of the pairs, and then use list(chain.from_iterable(...)) to flatten the list.
After the list is made, you can select the field by name.
from itertools import chain
# first create a map type column called 'map'
df = df.select(
'email',
f.create_map(
list(
chain.from_iterable(
[[f.col('column_names').getItem(i), f.col('column_values').getItem(i)]
for i in range(2)]
)
)
).alias('map')
)
df.show(truncate=False)
#+-----+--------------------------------------------------------------+
#|email|map |
#+-----+--------------------------------------------------------------+
#|x |Map(Email Client -> Gmail, Device -> Device uses Proxy Server)|
#|y |Map(Email Client -> IE, Device -> Personal computer) |
#+-----+--------------------------------------------------------------+
# now select the fields by key
df = df.select(
'email',
f.col('map').getField("Email Client").alias("Email Client"),
f.col('map').getField("Device").alias("Device")
)
This assumes that there will always be at least 2 elements in each array.
If you wanted to zip lists of arbitrary length, you would have to use a udf.
# define the udf (ArrayType and StringType come from pyspark.sql.types)
from pyspark.sql.types import ArrayType, StringType

zip_lists = f.udf(lambda x, y: [list(z) for z in zip(x, y)], ArrayType(ArrayType(StringType())))
# use the udf to zip the lists
df.select(
'email',
zip_lists(f.col('column_names'), f.col('column_values')).alias('zipped')
).show(truncate=False)
#+-----+-----------------------------------------------------------+
#|email|zipped |
#+-----+-----------------------------------------------------------+
#|x |[[Email Client, Gmail], [Device, Device uses Proxy Server]]|
#|y |[[Email Client, IE], [Device, Personal computer]] |
#+-----+-----------------------------------------------------------+
Or you could use a udf to create the map:
from pyspark.sql.types import MapType, StringType

make_map = f.udf(lambda x, y: dict(zip(x, y)), MapType(StringType(), StringType()))
df.select(
'email',
make_map(f.col('column_names'), f.col('column_values')).alias('map')
).show(truncate=False)
#+-----+--------------------------------------------------------------+
#|email|map |
#+-----+--------------------------------------------------------------+
#|x |Map(Device -> Device uses Proxy Server, Email Client -> Gmail)|
#|y |Map(Device -> Personal computer, Email Client -> IE) |
#+-----+--------------------------------------------------------------+
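As an aside (not applicable to the PySpark 2.3.0 used in the question), Spark 2.4+ has a built-in map_from_arrays function that removes the need for a udf; a sketch, assuming the same column_names and column_values columns:
# Spark 2.4+ only: build the map from the two array columns without a udf.
import pyspark.sql.functions as f

df = df.select(
    'email',
    f.map_from_arrays(f.col('column_names'), f.col('column_values')).alias('map')
)
df = df.select(
    'email',
    f.col('map').getItem('Email Client').alias('Email Client'),
    f.col('map').getItem('Device').alias('Device')
)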
