Spark stream data from IBM MQ - apache-spark

I want to stream data from IBM MQ. I have tried out this code I found on Github.
I am able to stream data from the Queue but each time it streams, it takes all the data from it. I just want to take the current data that is pushed into the queue. I looked up on many sites but didn't find the correct solution.
In Kafka we had something like KafkaStreamUtils for streaming the near-real-time data. Is there anything similar to that in IBM MQ so that it streams only the latest data?

The sample in the link you provided shows that it calls the following method to recieve from the the IBM MQ:
CustomMQReciever(String host , int port, String qm, String channel, String qn)
If you review CustomMQReciever here you can see that it is only Browsing the messages from the queue. This means the message will still be on the queue and the next time you connect you will receive the same messages:
MQQueueBrowser browser = (MQQueueBrowser) qSession.createBrowser(queue);
If you wanted to remove the messages from the queue you would need to call a method that does consume them from the queue instead of browsing them from the queue. Below is an example of changes to CustomMQReciever.java that should accomplish what you want:
Under the initConnection() change the above code to the following to cause it to remove the messages from the queue:
MQMessageConsumer consumer = (MQMessageConsumer) qSession.createConsumer(queue);
Get rid of:
enumeration= browser.getEnumeration();
Under receive() change the following:
while (!isStopped() && enumeration.hasMoreElements() )
{
receivedMessage= (JMSMessage) enumeration.nextElement();
String userInput = convertStreamToString(receivedMessage);
//System.out.println("Received data :'" + userInput + "'");
store(userInput);
}
To something like this:
while (!isStopped() && (receivedMessage = consumer.receiveNoWait()) != null))
{
String userInput = convertStreamToString(receivedMessage);
//System.out.println("Received data :'" + userInput + "'");
store(userInput);
}

Related

Convert http payload to string for debugging

I am using this C++ code on an ESP8266 to check on several http accessible data. Once in a while this device acts strangelay and I would like to be able to look at the data it gets. So I collect all payloads to be able to check it via rest api. The following code ALMOST works
String payloadAll = "start ###";
http.begin(client, getLink2);
httpCode = http.GET();
if ( httpCode > 0 ) { payload = http.getString(); payloadAll = payloadAll + payload.c_str() + "### ENDE-PAYLOAD1 ###"; }
the output looks like this:
start ###Internals: DEF 1 60 192.168.40.1:1502 TCP DeviceName 192.168.40.1:1502 EXPECT idle FD 6 FUUID 5xxxxx-fxxx-9xxx-fxxx-exxxxxxxxxxx IODev SolarEdge Interval 60 LASTOPEN 1647580317.55657 MODBUSID 1 MODE master MODEL SE8K-XXXXXXXXXX### ENDE-PAYLOAD1 ###
That is the first line of the payload only. I don't know much about the formast the requested data is in but once it is in the String, I can search ind find keyword and such from the lines I don't see in the output, so I know the data is there. If I leave out the ".c_str()" part, I will only get "start ###### ENDE-PAYLOAD1 ###"
How can I get all the lines - formatting is not important - it can all be one line, I just need to be able to see it as plain text.
Thanks!

Disconnect tweepy stream

I have created a global tweepy stream and I have one function where I want to disconnect it if it's active and after to start to track the word received as argument (with async=True).
When I call the function for the first time it's working properly but after I receive errors.
Here is the sequence where I m creating the stream:
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth = api.auth, listener = myStreamListener)
And here I use it:
def cauta_dupa_cuvant(word):
myStream.disconnect()
myStream.filter(languages=["en"], track = [word],is_async=True)
Any idea?

Akka: How to ensure that message has been received?

I have an actor Dispenser. What it does is it
dispenses some objects by request
listens to arriving new ones
Code follows
class Dispenser extends Actor {
override def receive: Receive = {
case Get =>
context.sender ! getObj()
case x: SomeType =>
addObj(x)
}
}
In real processing it doesn't matter whether 1 ms or even few seconds passed since new object was sent until the dispenser starts to dispense it, so there's no code tracking it.
But now I'm writing test for the dispenser and I want to be sure that firstly it receives new object and only then it receives a Get request.
Here's the test code I came up with:
val dispenser = system.actorOf(Props.create(classOf[Dispenser]))
dispenser ! obj
Thread.sleep(100)
val task = dispenser ? Get()
val result = Await.result(task, timeout)
check(result)
It satisfies one important requirement - it doesn't change original code. But it is
At least 100ms seconds slow even on very high performance boxes
Unstable and fails sometimes because 100 ms or any other constant doesn't provide any guaranties.
And the question is how to make a test that satisfies requirement and doesn't have cons above (neither any other obvious cons)
You can take out the Thread.sleep(..) and your test will be fine. Akka guarantees the ordering you need.
With the code
dispenser ! obj
val task = dispenser ? Get()
dispenser will process obj before Get deterministically because
The same thread puts obj then Get in the actor's mailbox, so they're in the correct order in the actor's mailbox
Actors process messages sequentially and one-at-a-time, so the two messages will be received by the actor and processed in the order they're queued in the mailbox.
(..if there's nothing else going on that's not in your sample code - routers, async processing in getObj or addObj, stashing, ..)
Akka FSM module is really handy for testing underlying state and behavior of the actor and does not require to change its implementation specifically for tests.
By using TestFSMRef one can get actors current state and and data by:
val testActor = TestFSMRef(<actors constructor or Props>)
testActor.stateName shouldBe <state name>
testActor.stateData shouldBe <state data>
http://doc.akka.io/docs/akka/2.4.1/scala/fsm.html

Why a simple publish subscribe is not working with zeromq?

I want to establish publish subscribe communication between to machines.
The two machines, that I have, are ryu-primary and ryu-secondary
The steps I follow in each of the machines are as follows.
In the initializer for ryu-primary (IP address is 192.168.241.131)
self.context = zmq.Context()
self.sub_socket = self.context.socket(zmq.SUB)
self.pub_socket = self.context.socket(zmq.PUB)
self.pub_port = 5566
self.sub_port = 5566
def establish_zmq_connection(self): # Socket to talk to server
print( "Connection to ryu-secondary..." )
self.sub_socket.connect( "tcp://192.168.241.132:%s" % self.sub_port )
def listen_zmq_connection(self):
print( 'Listen to zmq connection' )
self.pub_socket.bind( "tcp://*:%s" % self.pub_port )
def recieve_messages(self):
while True:
try:
string = self.sub_socket.recv( flags=zmq.NOBLOCK )
print( 'flow mod messages recieved {}'.format(string) )
return string
except zmq.ZMQError:
break
def push_messages(self,msg):
self.pub_socket.send( "%s" % (msg) )
From ryu-secondary (IP address - 192.168.241.132)
In the initializer
self.context = zmq.Context()
self.sub_socket = self.context.socket(zmq.SUB)
self.pub_socket = self.context.socket(zmq.PUB)
self.pub_port = 5566
self.sub_port = 5566
def establish_zmq_connection(self): # Socket to talk to server
print( "Connection to ryu-secondary..." )
self.sub_socket.connect( "tcp://192.168.241.131:%s" % self.sub_port )
def listen_zmq_connection(self):
print( 'Listen to zmq connection' )
self.pub_socket.bind( "tcp://*:%s" % self.pub_port )
def recieve_messages(self):
while True:
try:
string = self.sub_socket.recv( flags=zmq.NOBLOCK )
print( 'flow mod messages recieved {}'.format(string) )
return string
except zmq.ZMQError:
break
def push_messages(self,msg):
print( 'pushing message to publish socket' )
self.pub_socket.send( "%s" % (msg) )
These are the functions that I have.
I am calling on ryu-secondary:
establish_zmq_connections()
push_messages()
But I am not recieving those messages on ryu-primary, when I call
listen_zmq_connection()
recieve_messages()
Can someone point out to me what I am doing wrong?
Repair the PUB/SUB messaging pattern setup
There are several important steps in making the PUB/SUB pattern work.
All this is well described in the ZeroMQ documentation.
You need not repeat both pub & sub parts of code on both sides, the more that it masks, as A side-effect thereof, the case if you mix the pub and sub socket addresses/ports/calls/etc in an "opposite" node code and you do not see such a principal collision.
your code defines the initial form of PUB-archetype, that is expected to .push_messages()
your code defines the initial form of SUB-archetype, that is expected to .receive_messages()
your code does not show, how do you control who goes first on a connection setup -- whether .bind() or .connect() appears at random or before/after the other
your code does not show any subscription setup, after the SUB-archetype was instantiated. A default value upon a socket instantiation does need to be modified via a .setsockopt( zmq.SUBSCRIBE = '') method, otherwise there is a prohibitive filter that does not allow any ( yet unsubscribed ) message to pass through and got-output ( "received" ) on the SUB-side
Must modify a default SUB-side subscription filter, it is prohibitive
You may have noticed from the ZeroMQ documentation, that until setup otherwise, the sub-side does filter-out all incoming messages.
http://api.zeromq.org/2-1:zmq-setsockopt
"The ZMQ_SUBSCRIBE option shall establish a new message filter on a ZMQ_SUB socket. Newly created ZMQ_SUB sockets shall filter out all incoming messages, therefore you should call this option to establish an initial message filter.
An empty option_value of length zero shall subscribe to all incoming messages. A non-empty option_value shall subscribe to all messages beginning with the specified prefix. Multiple filters may be attached to a single ZMQ_SUB socket, in which case a message shall be accepted if it matches at least one filter."
Class-method pre-configuration of a Context instance possible
There is another possibility for a python code using pyzmq 13.0+. There you may also setup this via a Context class-method .setsockopt( zmq.SUBSCRIBE, "" ) et al, but such call has to precede the new socket instantiation from a Context-instance pre-configured this way.

Firebase... Add/Update Firebase Using node.js Script

I have arbitrary JSON that is sensibly laid out like this:
[
{
"id":100,
"name":"Buckeye, AZ",
"status":"OPEN",
"address":{
"street":"416 S Watson RD",
"city":"Buckeye"
...
}
}
]
I've written a node.js script like this for proof of concept (why I'm using node is that the JS API seems better supported than REST or Ruby for this. I could be wrong):
http = require('http')
Firebase = require('firebase')
all_sites_url = "http://supercharge.info/service/supercharge/allSites"
firebase_url = "https://tesla-supercharger.firebaseio.com/"
http.get(all_sites_url, (res) ->
body = ""
res.on "data", (chunk) ->
body += chunk
return
res.on "end", ->
response = JSON.parse(body)
all_sites = response
send_to_firebase(response)
return
return
).on "error", (e) ->
console.log "Got error: ", e
return
send_to_firebase = (response) ->
firebase_ref = new Firebase(firebase_url)
for charger in response
console.log charger
new_child = firebase_ref.push()
new_child.set {id: charger.id, data: charger}, (error) ->
if error
console.log "Data cound not be saved #{error}"
else
console.log "Data saved successfully"
The result is a unique id generated by Firebase, which has as a child a data and an id child. The data child has the expected information like name, status, etc.
What I'd prefer is to generate a key-value pair. E.g., for an id of 100:
- 100
- name
- address
street
city
etc. So my first question is how to accomplish this or if it is even sensible.
After the first time around, this data (call it the data from an external server) will be there and a mobile app will have added some fields. These are not present in the data already there. Next time I fetch data from the external server, I want to update things that have changed that the server would know about, like status. I don't want to tamper with things that only the mobile devices would know about like remote_observations.
I know I'm seeming a bit dense here, but I'm trying to put together a sensible data model that will be updatable from that server using a CRON job and incrementally updatable from a bunch of mobile devices.
Any help is much appreciated.
UPDATE: I have found that this works for getting the structure I want:
send_to_firebase = (response) ->
firebase_ref = new Firebase(firebase_url)
for charger in response
firebase_ref.child(charger.id).update charger, (error) ->
if error
console.log "Data could not be saved #{error}"
else
responses_pending += 1
console.log "Data saved successfully : #{responses_pending} pending"
firebase_ref.on 'value', ->
console.log "value received rp=#{responses_pending}"
process.exit() if (responses_pending -= 1) < 1
So the code I settled on is this:
http = require('http')
Firebase = require('firebase')
firebase_url = '/path/to/your/firebase'
# code to get JSON of the form:
{
"id":100,
"name":"Buckeye, AZ",
"status":"OPEN",
"address":{"street":"416 S Watson RD",
"city":"Buckeye",
"state":"AZ",
"zip":"85326",
"country":"USA"},
... etc.
}
# Asynchronous get of JSON hash from some server or other.
get_my_fine_JSON().on 'complete', (response) ->
send_to_firebase(response)
send_to_firebase = (response) ->
firebase_ref = new Firebase(firebase_url)
length = response.length
for charger in response
firebase_ref.child(charger.id).update charger, (error) ->
if error
console.log "Data could not be saved #{error}"
else
console.log "Data saved successfully"
process.exit() if length -= 1 is 0
Discussion:
The idea was to have a Firebase structure like this:
- 100
- address
street: "123 Main Street"
etc.
That's reason 1 why id is pulled up to be the primary key. Reason 2 is so that I can uniquely identify an object pulled off the external server as the "same" one in my Firebase and apply any updates necessary.
Epiphany 1: Update is more like upsert. If the key is there, whatever hash you supply replaces matching values. If it's not there, then Firebase happily adds it. Which is way cool because it covers both the push and patch cases.
Epiphany 2: This process will hang waiting for events if nothing tells it to stop. That's why the countdown index, length is decremented until the code has upserted (for lack of a better term) each item.
Observation 1: Doing this in node.js is super fast compared with REST using Python or Ruby. And this upsert stuff is wicked cool if I'm understanding it right.
Observation 2: There isn't a ton of wisdom out there as of this writing regarding writing node shell scripts to do this kind of stuff. Maybe it's a good idea, maybe a bad one. I don't know.
Observation 3: Because of the asynchronous nature of node and the Firebase Javascript API (both GOOD THINGs), terminating a process before the last bit is done can be tricky because your process has to hang on just long enough to complete its last request/response with Firebase. This is, as mentioned before, done in the completion handler of the update. Otherwise we wouldn't necessarily be complete when the process exited.
Caveat 1: Related to observation 2, this could be a bad idea, but I haven't been able to find resources that speak to the problem.
Caveat 2: This could be a horrid abuse or misunderstanding of the Firebase update API. I am reporting observed behavior in the limited case of my specific data. YMMV.
Caveat 3: I'm hoping the process lifetime is as I suggest it is in observation 3.
A note to the decaffeinated: The Javascript for this is so trivially different that it shouldn't be too tough to translate. Or go to js2coffee and paste the Coffeescript into the right pane to get real Javascript in the left pane that you can tune.

Resources