Timeout for Lambda: Export CloudFront logs from S3 to Elasticsearch - Node.js

I wrote a Lambda function to send CloudFront logs to Elasticsearch.
The workflow is as follows:
1. CloudFront sends logs (compressed in .gz format) to S3.
2. The bucket sends a notification, which is caught by the Lambda function.
3. The Lambda is triggered, decompresses the logs, and sends them to Elasticsearch.
For this I use s3-to-logstore combined with winston-parser.
The Lambda is indeed triggered, but only part of the logs makes it to Elasticsearch, because the function times out (I set the timeout to the maximum of 5 minutes).
I suspect decompressing the .gz logs takes some time, but the files are at most 30 KB, which is not much and should not take long.
I was inspired by this example, and here is my function:
var s3ToLogstore = require('s3-to-logstore');
var winston = require('winston');
require('winston-elasticsearch');
var elasticsearch = require('elasticsearch');

var client = new elasticsearch.Client({
  host: process.env.ES_HOST,
  log: 'trace'
});

var transport = new winston.transports.Elasticsearch({
  indexPrefix: process.env.ES_INDEXPREFIX,
  client: client
});

var options = {
  format: process.env.FORMAT,
  transport: transport,
  reformatter: function(data) {
    data.environment = process.env.STAGE;
    data.origin = process.env.FORMAT;
    return data;
  }
};

exports.handler = s3ToLogstore(options);
The CloudWatch logs look completely fine, and there are no errors in them. The Lambda just times out, and I can't figure out why.
Any help would be greatly appreciated.

Most likely the elasticsearch client keeps a connection open, so the Lambda never stops. Try setting the keepAlive property to false:
var client = new elasticsearch.Client({
  host: process.env.ES_HOST,
  log: 'trace',
  keepAlive: false
});
Refer to keepAlive and related properties.
https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/configuration.html
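If disabling keep-alive isn't enough, a related option (my assumption, not something from the original answer) is to tell Lambda not to wait for the event loop to drain before returning. A minimal sketch, replacing the last line of the function above and assuming s3-to-logstore returns a standard (event, context, callback) handler:
var wrapped = s3ToLogstore(options);

exports.handler = function(event, context, callback) {
  // Return as soon as the callback fires, even if the Elasticsearch
  // client still holds open sockets (hypothetical wrapper, see above).
  context.callbackWaitsForEmptyEventLoop = false;
  wrapped(event, context, callback);
};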

Related

AWS Greengrass V2 Node Publishing problem with aws-iot-sdk-v2 JS

For the past few days I've been trying to solve the problem of publishing a message from a Lambda to the AWS cloud, using Greengrass v2.
The Python code was even provided in the documentation and only had to be slightly reworked.
When it comes to the SDK v2 for JS, the documentation contains only a minimal mention of the publish function in the AWS-CRT library.
I tried to create code using components from this library, but it looks like the library also requires a script with parameters.
This is my code, which requires installing the aws-iot-device-sdk-v2 JS package:
const iotsdk = require("aws-iot-device-sdk-v2");
const mqtt = iotsdk.mqtt;
const os = require("os");
const util = require("util");

const GROUP_ID = process.env.GROUP_ID;
const THING_NAME = process.env.AWS_IOT_THING_NAME;
const THING_ARN = process.env.AWS_IOT_THING_ARN;

(topic = "gg/message"),
  (payload = JSON.stringify({ message: util.format("ping") }));

function greengrassHelloWorldRun() {
  mqtt.MqttClientConnection.prototype.publish(topic, payload);
}

console.log(topic);
console.log(payload);

setInterval(greengrassHelloWorldRun, 5000);

exports.handler = function (event, context) {
  console.log("event: " + JSON.stringify(event));
  console.log("context: " + JSON.stringify(context));
};
I get errors about arguments and NAPI.
The same errors also appear in the Greengrass logs when using this function as a Lambda component.
Does anyone have an example of how to publish a message on a topic using a Node Lambda with SDK v2?
After contacting AWS Support I know it is impossible:
AWS doesn't support mqttProxy IPC for SDK v2 JS yet.
@ChristopherTal I'm also using the Greengrass SDKs for JS, and indeed they're still a work in progress. But I was able to send messages to IoT Core from Greengrass using the JS SDKs.
A few things to mention:
You seem to be using the aws-iot-device-sdk-v2 SDK, which is for things.
The aws-greengrass-core-sdk npm package is made for components.
It is important to distinguish between things and components and decide which is doing what.
To send data to IoT Core from a thing, you do indeed need to use MQTT. On the deployment page of the Greengrass console, you need to revise the deployment and add the following components:
MQTT Broker
MQTT Bridge
Client device auth
This way your thing connects to the local MQTT broker through the client device auth component, and the MQTT bridge decides how the traffic is routed. You can find the full details in the AWS Greengrass documentation for these components.
I even realised this using the standard mqtt npm package. You need to create a certificate and a thing (using a Lambda or the console) and use those certificates to access the broker:
const mqtt = require('mqtt')
const fs = require('fs')

const ca = fs.readFileSync(locationOfTheCA)
const key = fs.readFileSync(locationOfThePrivateKey)
const cert = fs.readFileSync(locationOfTheCertificate)

console.log('Welcome to MQTT Connector')

const client = mqtt.connect('mqtts://localhost:8883', {
  clientId: 'yourThingNameHere',
  ca: ca,
  key: key,
  cert: cert
})

client.on('connect', function () {
  console.log('Connected to MQTT')
  /* client.subscribe('$aws/*', function (err) {
    if (!err) {
      // client.publish('presence', 'Hello mqtt')
    }
  }) */
})

client.on('message', function (topic, message) {
  // message is a Buffer
  console.log(message.toString())
  client.end()
})
Hopefully this helps you out!
Warm regards
Hacor

Heroku Node.js RedisCloud Redis::CannotConnectError on localhost instead of REDISCLOUD_URL

When I try to connect my Node.js application to Redis Cloud on Heroku, I get the following error:
Redis::CannotConnectError: Error connecting to Redis on 127.0.0.1:6379 (ECONNREFUSED)
I have even tried setting the Redis URL and port directly in the code to test it out. But it still tries to connect to localhost on Heroku instead of the Redis Cloud URL.
const {Queue} = require('bullmq');
const Redis = require('ioredis');

// Redis server connection configuration
const conn = new Redis(
  'redis://rediscloud:mueSEJFadzE9eVcjFei44444RIkNO@redis-15725.c9.us-east-1-4.ec2.cloud.redislabs.com:15725'
);

console.log('\n==================================================\n');
console.log(conn.options, process.env.REDISCLOUD_URL);

const defaultQueue = () => {
  // Initialize a queue instance by passing the queue name & Redis connection
  const queue = new Queue('default', {conn});
  return queue;
};

module.exports = defaultQueue;
Complete Dump of the Logs https://pastebin.com/N9awJYL9
Set REDISCLOUD_URL in the .env file as follows:
REDISCLOUD_URL=redis://rediscloud:password@hostname:port
import * as Redis from 'ioredis';
export const redis = new Redis(process.env.REDISCLOUD_URL);
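For completeness, here is a minimal sketch (my addition, not part of the original answer) of wiring that env-based connection into BullMQ. Note that BullMQ reads the option key named connection, so the shorthand {conn} from the question (i.e. {conn: conn}) is ignored and BullMQ falls back to localhost:
const {Queue} = require('bullmq');
const Redis = require('ioredis');

// Build the connection from the Heroku-provided env var.
const conn = new Redis(process.env.REDISCLOUD_URL);

// BullMQ looks for the `connection` option; an unknown key like `conn`
// is silently ignored, leaving the default 127.0.0.1:6379.
const defaultQueue = new Queue('default', { connection: conn });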
I just had a hard time trying to find out how to connect; the solution below worked for me.
Edit:
Although I had passed the parameters to connect to Redis Cloud, it connected to the local Redis installed on my machine. Sorry for that!
I will leave my answer here, just in case anyone needs to connect to a local Redis.
let express = require('express');
const Redis = require('ioredis');

const pwd = 'your_pwd';
const host = 'host';
const port = '1234';

// URL of the form rediss://:password@host:port
const redisConfig = `rediss://:${pwd}@${host}:${port}`;

const client = new Redis(redisConfig);

client.on('connect', function() {
  console.log('-->> CONNECTED');
});

client.on('error', function(error) {
  console.error('REDIS ERROR', error);
});
Just wanted to post my case in case someone has the same problem as me.
In my situation I was trying to use Redis with Bull, so I needed the URL/host/port data to make this happen.
Here is the info:
https://devcenter.heroku.com/articles/node-redis-workers
But basically you can start your worker like this:
let REDIS_URL = process.env.REDISCLOUD_URL || 'redis://127.0.0.1:6379';

// Once you have the Redis info ready, create your task queue
const queue = new Queue('new-queue', REDIS_URL);
In case you are using a local Redis, meaning 'redis://127.0.0.1:6379', remember to run redis-server:
https://redis.io/docs/getting-started/

How to connect to AWS ElasticSearch using npm elasticsearch and http-aws-es?

I am using the npm elasticsearch package to search my AWS ES domain. Everything seems to work fine when I use Postman to make POST requests with my AWS IAM credentials.
I wanted to do the same in my code (Node.js). I referred to this answer:
How to make calls to elasticsearch apis through NodeJS?
Here is the code:
const elasticsearch = require('elasticsearch');
const awsHttpClient = require('http-aws-es');
const AWS = require('aws-sdk');

const client = new elasticsearch.Client({
  host: 'my-aws-es-endpoint',
  connectionClass: awsHttpClient,
  amazonES: {
    region: 'us-east-1',
    credentials: new AWS.Credentials('my-access-key', 'my-secret-key')
  }
});
But when I run client.search(), it fails with an error:
Elasticsearch ERROR: 2018-10-31T15:12:22Z
Error: Request error, retrying
POST https://my-endpoint.us-east-1.es.amazonaws.com/my-index/student/_search => Data must be a string or a buffer
It also gives me a warning:
Elasticsearch WARNING: 2018-10-31T15:12:22Z
Unable to revive connection: https://my-endpoint.us-east-1.es.amazonaws.com/
When I use just the aws-sdk, it works fine (probably because I sign the request there?).
Can someone suggest what I am doing wrong here?
I was able to solve this by specifying the region. There is a problem with the elasticsearch client where it's not able to pick up the region we specify in:
amazonES: {
  region: 'us-east-1',
  credentials: new AWS.Credentials('my-access-key', 'my-secret-key')
}
I solved it by specifying the region using AWS.config.region before the above code:
AWS.config.region = 'us-east-1';
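Put together, the working setup would look roughly like this (a sketch assembled from the snippets above; the endpoint and keys are placeholders):
const elasticsearch = require('elasticsearch');
const awsHttpClient = require('http-aws-es');
const AWS = require('aws-sdk');

// Set the region globally *before* creating the client,
// so the http-aws-es connector can pick it up when signing requests.
AWS.config.region = 'us-east-1';

const client = new elasticsearch.Client({
  host: 'my-aws-es-endpoint',
  connectionClass: awsHttpClient,
  amazonES: {
    region: 'us-east-1',
    credentials: new AWS.Credentials('my-access-key', 'my-secret-key')
  }
});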

How do I intercept outgoing tcp messages in node?

How can I write a simple stream which intercepts messages?
For example, say I want to log (or eventually transform) a message being sent over the wire by a user's socket.write(...) call.
Following is a minimal program which attempts to do this:
const net = require('net');
const stream = require('stream');

const socket = new net.Socket();

const transformer = new stream.Transform({
  transform(chunk, e, cb) {
    console.log("OUT:" + chunk.toString());
    cb();
  }
});

//const client = socket.pipe(transformer); // <= prints "OUT:" on client, but nothing on server
const client = transformer.pipe(socket);   // <= prints nothing on client, but "hello world" on server

socket.on('data', (data) => { console.log("IN:" + data.toString()); });

socket.connect(1234, 'localhost', () => { client.write("hello world"); });
When I do socket.pipe(transformer), the client prints "OUT:" (like I want) but doesn't actually send anything to the server. When I swap the pipe locations, transformer.pipe(socket), nothing gets printed on the client but the message gets sent to the server.
Although not listed here, I also tried using a Writable stream, which does print the message on the client, but it is never sent to the server (even if I do a this.push(...) inside the Writable stream, it still doesn't seem to reach the server).
What am I missing here?
EDIT: Reformatted the code for clarity and updated the text
It looks like I needed to change the following line
socket.connect(1234, 'localhost', () => { client.write("hello world"); });
to this
socket.connect(1234, 'localhost', () => { transformer.write("hello world"); });
This is based on @Mr.Phoenix's comment. I expected .pipe() to return a new stream which I could use. I believe that is how Java's Netty framework does it, and I kept expecting Node streams to work the same way.
You're not writing any data out of the stream.
You need to either call this.push(chunk) or change the cb() call to cb(null, chunk).
See the docs on implementing transform streams for more info.
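Combining both answers, a transform that logs and forwards each chunk could look like this (a minimal sketch, assuming the same local server on port 1234 as in the question):
const net = require('net');
const stream = require('stream');

const socket = new net.Socket();

const transformer = new stream.Transform({
  transform(chunk, encoding, cb) {
    console.log("OUT:" + chunk.toString());
    // Forward the chunk downstream instead of swallowing it.
    cb(null, chunk);
  }
});

// Pipe the transformer into the socket, and write into the transformer.
transformer.pipe(socket);

socket.on('data', (data) => { console.log("IN:" + data.toString()); });

socket.connect(1234, 'localhost', () => { transformer.write("hello world"); });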

Can't receive redis data from socket io

I'm building a realtime visualization using Redis as a pub/sub messenger between Python and Node. There's an always-running Python script which sets a Redis hash with HMSET. That side of the app is working fine: if I enter an example command such as HGETALL sellers-80183917 in a Redis client, I get the proper data back.
The problem is on the JS side. I'm using the socket.io and redis Node.js libraries to listen to the Redis instance and publish the results online through a d3.js viz.
I run the following code with node:
var express = require('express');
var app = express();
var redis = require('redis');

app.use(express.static(__dirname + '/public'));

var http = require('http').Server(app);
var io = require('socket.io')(http);
var sredis = require('socket.io-redis');

io.adapter(sredis({ host: 'localhost', port: 6379 }));

redisSubscriber = redis.createClient(6379, 'localhost', {});

redisSubscriber.on('message', function(channel, message) {
  io.emit(channel, message);
});

app.get('/sellers/:seller_id', function(req, res) {
  var seller_id = req.params.seller_id;
  redisSubscriber.subscribe('sellers-'.concat(seller_id));
  res.render('seller.ejs', { seller: seller_id });
});

http.listen(3000, '127.0.0.1', function() {
  console.log('listening on *:3000');
});
And this is the relevant part of the seller.ejs file that receives the user requests and outputs the viz:
var socket = io('http://localhost:3000');
var stats;
var seller_key = 'sellers-'.concat(<%= seller %>);

socket.on(seller_key, function(msg) {
  stats = [];
  console.log('Im in');
  var seller = $.parseJSON(msg);
  var items = seller['items'];
  for (item in items) {
    var item_data = items[item];
    stats.push({
      'title': item_data['title'],
      'today_visits': item_data['today_visits'],
      'sold_today': item_data['sold_today'],
      'conversion_rate': item_data['conversion_rate']
    });
  }
  setupData(stats);
});
The problem is that the socket.on() handler never receives anything, and I don't see where the problem is, as everything else seems to be working fine.
I think that you might be confused as to what Pub/Sub in Redis actually is. It's not a way to listen to changes on hashes; you can have a Pub/Sub channel called sellers-1, and you can have a hash with the key sellers-1, but those are unrelated to each other.
As documented here:
Pub/Sub has no relation to the key space.
There is a thing called keyspace notifications that can be used to listen to changes in the key space (through Pub/Sub channels); however, this feature isn't enabled by default because it'll take up more resources.
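For illustration, enabling notifications for hash commands and subscribing to them might look like this (a sketch; per the Redis docs, 'K' enables keyspace-channel events and 'h' covers hash commands):
// Use a separate, non-subscriber connection for configuration commands,
// since a subscribed client can only issue (un)subscribe commands.
var redisConfigClient = redis.createClient(6379, 'localhost', {});
redisConfigClient.config('SET', 'notify-keyspace-events', 'Kh');

// Events for the key sellers-80183917 in db 0 arrive on this channel;
// the published message is the command name, e.g. 'hset'.
redisSubscriber.subscribe('__keyspace@0__:sellers-80183917');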
Perhaps an easier method would be to publish a message after the HMSET, so any subscribers would know that the hash got changed (they would then retrieve the hash contents themselves, or the published message would contain the relevant data).
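Sketched on the Node side for consistency (the always-running Python script would do the equivalent), assuming a non-subscriber client named redisPublisher and a seller payload sellerData:
// Update the hash, then tell subscribers about it on a matching channel.
redisPublisher.hmset('sellers-80183917', sellerData, function(err) {
  if (err) throw err;
  // The message itself carries the relevant data, so subscribers
  // don't need to re-read the hash.
  redisPublisher.publish('sellers-80183917', JSON.stringify(sellerData));
});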
This brings us to the next possible issue: you only have one subscriber connection, redisSubscriber.
From what I understand of the Node.js Redis driver, calling .subscribe() on such a connection would remove any previous subscriptions in favor of the new one. So if you were previously subscribed to the sellers-1 channel and subscribe to sellers-2, you wouldn't receive messages from the sellers-1 channel anymore.
You can listen on multiple channels by either passing an array of channels, or by passing them as arguments:
redisSubscriber.subscribe([ 'sellers-1', 'sellers-2', ... ])
// Or:
redisSubscriber.subscribe('sellers-1', 'sellers-2', ... )
You would obviously have to track each "active" seller subscription. Either that, or create a new connection for each subscription, which also isn't ideal.
It's probably a better idea to have a single Pub/Sub channel on which all changes get published, instead of a separate channel for each seller.
Finally: if your seller IDs aren't hard to guess (for instance, if they're based on an incremental integer value), it would be trivial for someone to write a client that listens in on any seller channel they'd like. It might not be a problem, but it is something to be aware of.
