How to make sure an FTP stream is fully downloaded? - node.js

This is my code
const execute = async() => {
const dir = `c://users/xxx/desktop/test/`
const ftp = new PromiseFtp()
try {
await ftp.connect(config)
for (let word of word_name) {
let lists = await ftp.list(`/loc/${word}/`)
let file = lists.slice(-2)[0] //get recent file
let file_path = `${dir}${file.name}`
const stream = await ftp.get(`/loc/${word}/${file.name}`)
await new Promise((resolve, reject) => {
stream.once('end', resolve)
stream.once('error', reject)
stream.pipe(fs.createWriteStream(file_path))
})
let byte = fs.readFileSync(file_path )
byte = iconv.decode(byte, 'big5')
console.log(byte)
}
} catch(e) {
console.error(e)
} finally {
return ftp.end()
}
}
I use the promise-ftp module. Sometimes every byte value in the loop comes back correct, but sometimes one or more of them come back as undefined. I think the cause is the asynchronous stream. I tried stream.once('finish') and other approaches, but they did not work either. Can anyone explain why? Thanks.
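A possible explanation, offered as a sketch rather than a confirmed diagnosis: 'end' is emitted by the readable (FTP) stream once its data has been consumed, while 'finish' is emitted by the writable (file) stream once everything has been flushed. A readable stream never emits 'finish', so listening for it on the FTP stream cannot work, and waiting only for 'end' may let readFileSync run before the file is complete. Waiting on the write side avoids both problems. The helper name saveStream below is hypothetical, and stream/promises assumes Node.js 15 or later.

```javascript
const fs = require('fs')
const { pipeline } = require('stream/promises')

// Hypothetical helper: pipes any readable stream into a file and resolves
// only after the write stream has finished, then returns the file contents.
async function saveStream(readable, filePath) {
  // pipeline() rejects if either stream errors and resolves once the
  // destination has finished writing.
  await pipeline(readable, fs.createWriteStream(filePath))
  return fs.promises.readFile(filePath)
}
```

Inside the loop this could replace the manual Promise, for example `const byte = await saveStream(stream, file_path)`, followed by the existing iconv.decode call.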

Related

How to use await instead of then with a promise?

How do I correctly resolve a Promise.all(...)? After resolving the promise for a set of asynchronous requests (simple database queries in supabase-pg SQL), I iterate over the result in a forEach and make a new request for each item.
I try to save what each request returns in a new array. The items print fine in the console, but the response comes back empty. I understand that the response is being sent before the promises have finished resolving, but I don't understand why.
In an answer to a previous question I was told to use await before the then, but I didn't quite understand how to do it.
What am I doing wrong?
export const getReportMonthly = async(req: Request & any, res: Response, next: NextFunction) => {
try {
let usersxData: UsersxModalidadxRolxJob[] = [];
let data_monthly: HoursActivityWeeklySummary[] = [];
let attendance_schedule: AttendanceSchedule[] = [];
let time_off_request: TimeOffRequestRpc[] = [];
let configs: IndicatorConfigs[] = [];
const supabaseService = new SupabaseService();
const promises = [
supabaseService.getSummaryWeekRpcWihoutFreelancers(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
data_monthly = dataFromDB as any;
}),
supabaseService.getUsersEntity(res).then(dataFromDB => {
usersxData = dataFromDB as any;
}),
supabaseService.getAttendaceScheduleRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
attendance_schedule = dataFromDB as any;
}),
supabaseService.getTimeOffRequestRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
time_off_request = dataFromDB as any;
}),
supabaseService.getConfigs(res).then(dataFromDB => {
configs = dataFromDB;
}),
];
let attendanceInMonthly = new Array();
await Promise.all(promises).then(() => {
attendance_schedule.forEach(element => {
let start_date = element.date_start.toString();
let end_date = element.date_end.toString();
supabaseService.getTrackedByDateAndIDArray(start_date, end_date).then(item => {
console.log(item);
attendanceInMonthly.push(item);
});
});
})
res.json(attendanceInMonthly)
} catch (error) {
console.log(error);
res.status(500).json({
title: 'API-CIT Error',
message: 'Internal server error'
});
}
If you await a promise, you can store its result in a variable and work with it normally.
So instead of your current code, you could use the following modified code:
export const getReportMonthly = async(req: Request & any, res: Response, next: NextFunction) => {
try {
let usersxData: UsersxModalidadxRolxJob[] = [];
let data_monthly: HoursActivityWeeklySummary[] = [];
let attendance_schedule: AttendanceSchedule[] = [];
let time_off_request: TimeOffRequestRpc[] = [];
let configs: IndicatorConfigs[] = [];
const supabaseService = new SupabaseService();
const promises = [
supabaseService.getSummaryWeekRpcWihoutFreelancers(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
data_monthly = dataFromDB as any;
}),
supabaseService.getUsersEntity(res).then(dataFromDB => {
usersxData = dataFromDB as any;
}),
supabaseService.getAttendaceScheduleRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
attendance_schedule = dataFromDB as any;
}),
supabaseService.getTimeOffRequestRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
time_off_request = dataFromDB as any;
}),
supabaseService.getConfigs(res).then(dataFromDB => {
configs = dataFromDB;
}),
];
const resolvedPromises = await Promise.all(promises)
const attendanceInMonthly = await Promise.all(
resolvedPromises.map(
async (element) => {
let start_date = element.date_start.toString();
let end_date = element.date_end.toString();
return supabaseService.getTrackedByDateAndIDArray(start_date, end_date)
}
)
)
console.log(attendanceInMonthly) // this should be your final resolved result
res.json(attendanceInMonthly)
} catch (error) {
console.log(error);
res.status(500).json({
title: 'API-CIT Error',
message: 'Internal server error'
});
}
Your code should look something like this. I am not sure it solves your problem exactly, because your code has some syntax errors which you will have to fix yourself.
If I understand correctly, you launch a few requests, among which one (getAttendaceScheduleRpc, which assigns attendance_schedule) is used to launch some extra requests again, and you need to wait for all of these (including the extra requests) before returning?
In that case, the immediate issue is that you perform your extra requests in "subqueries", but you do not wait for them.
A very simple solution would be to properly separate those 2 steps, somewhat like in DerHerrGammler's answer, but using attendance_schedule instead of resolvedPromises as input for the 2nd step:
let attendanceInMonthly = new Array();
await Promise.all(promises);
await Promise.all(attendance_schedule.map(async (element) => {
let start_date = element.date_start.toString();
let end_date = element.date_end.toString();
const item = await supabaseService.getTrackedByDateAndIDArray(start_date, end_date);
console.log(item);
attendanceInMonthly.push(item);
}));
res.json(attendanceInMonthly);
If you are really looking to fine-tune your performance, you could take advantage of the fact that your extra requests depend only on the result of one of your initial requests (getAttendaceScheduleRpc), so you could launch them as soon as the latter is fulfilled, instead of waiting for all the promises of the 1st step:
let attendance_schedule: AttendanceSchedule[] = [];
let attendanceInMonthly = new Array();
const promises = [
supabaseService.getAttendaceScheduleRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
attendance_schedule = dataFromDB as any;
// Immediately launch your extra (2nd step) requests, without waiting for other 1st step requests
// Make sure to return a Promise that fulfills once all the new extra
// requests are done.
return Promise.all(attendance_schedule.map(async (element) => {
let start_date = element.date_start.toString();
let end_date = element.date_end.toString();
const item = await supabaseService.getTrackedByDateAndIDArray(start_date, end_date);
console.log(item);
attendanceInMonthly.push(item);
}));
}),
// etc. for the rest of 1st step requests
];
await Promise.all(promises);
res.json(attendanceInMonthly);
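As a self-contained illustration of the two-step pattern, here is a minimal sketch. getSchedule and getTracked are hypothetical stand-ins for the supabaseService calls, not part of the original code. It returns each item from the map callback instead of pushing into a shared array, so Promise.all yields the results in the same order as the input regardless of which request finishes first.

```javascript
// Hypothetical stand-ins for supabaseService.getAttendaceScheduleRpc and
// supabaseService.getTrackedByDateAndIDArray.
const getSchedule = async () => [
  { date_start: '2023-01-01', date_end: '2023-01-07' },
  { date_start: '2023-01-08', date_end: '2023-01-14' }
]
const getTracked = async (start, end) => ({ start, end })

async function buildReport() {
  // Step 1: wait for the request whose result drives the extra requests.
  const schedule = await getSchedule()
  // Step 2: one promise per element; Promise.all resolves with the
  // results in input order once every request has completed.
  return Promise.all(
    schedule.map((element) => getTracked(element.date_start, element.date_end))
  )
}
```

In the route handler this would end with something like `res.json(await buildReport())`.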

fs.watchFile() a JSON file until a specific value appears

So I have a JSON file that changes continuously, and I need to read it AFTER a value called auth-token is written to the file. Here is what I have now:
const json = fs.readFileSync("some-json.json")
const headers = JSON.parse(json);
return headers
But it reads the file before anything can be written to it. Is there any way I can use fs.watchFile() and watch the file UNTIL the value is written?
Thanks
You can use fs.watch although its behavior is a bit unreliable with multiple events triggered upon file change (but I don't think it would be a problem here).
Here is a small sample:
const { watch } = require('fs');
const { readFile } = require('fs/promises');
const file = 'some-json.json';
(async () => {
  const result = await new Promise((resolve) => {
    const watcher = watch(file, async () => {
      try {
        // Read by the known path: the filename passed to the callback
        // may be null and is not a full path.
        const fileContent = await readFile(file);
        const headers = JSON.parse(fileContent.toString());
        if (headers['auth-token']) { // or whatever test you need here
          watcher.close();
          resolve(headers);
        }
      } catch (e) {
        // The file may be read mid-write; ignore and wait for the next event.
      }
    });
  });
  console.log(result);
})();
Note that if your file gets modified many times before it contains the desired header, it might be preferable to replace the usage of fs.watch by a setInterval to read the file at regular intervals until it contains the value you expect.
Here is what it would look like:
const { readFile } = require('fs/promises');
(async () => {
  const waitingTime = 1000;
  const result = await new Promise((resolve) => {
    const interval = setInterval(async () => {
      try {
        const fileContent = await readFile('some-json.json');
        const headers = JSON.parse(fileContent.toString());
        if (headers['auth-token']) { // or whatever test you need here
          clearInterval(interval);
          resolve(headers);
        }
      } catch (e) {
        // Missing file or partially written JSON: retry on the next tick.
      }
    }, waitingTime);
  });
  console.log(result);
})();
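Neither approach stops if the token never shows up. A polling variant with a deadline might look like the sketch below. The function name waitForKey and its options are hypothetical, and timers/promises assumes Node.js 15 or later.

```javascript
const { readFile } = require('fs/promises')
const { setTimeout: sleep } = require('timers/promises')

// Hypothetical helper: polls a JSON file until `key` holds a truthy value,
// or throws once `timeoutMs` has elapsed.
async function waitForKey(file, key, { intervalMs = 1000, timeoutMs = 30000 } = {}) {
  const deadline = Date.now() + timeoutMs
  while (Date.now() < deadline) {
    try {
      const data = JSON.parse(await readFile(file, 'utf8'))
      if (data[key]) return data
    } catch (e) {
      // Missing file or half-written JSON: try again on the next round.
    }
    await sleep(intervalMs)
  }
  throw new Error(`Timed out waiting for "${key}" in ${file}`)
}
```

It could then be called as `const headers = await waitForKey('some-json.json', 'auth-token')`.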

Stream (Geo)JSON file and get startByte and endByte of each JSON record in the file

For very large JSON/GeoJSON files, I'd like to create a primitive key/value store that keeps track of the starting positions and lengths of each JSON record in the file. This way, I could look up individual records at a later stage without reading the whole file into memory (Using the fd.read API). Somewhat similar to a super simple database, but read-only and without the extra overhead.
The issue I'm facing is that I don't know how I could determine the starting position and byte length of each JSON record / GeoJSON feature in the original file.
Here's some pseudo-code showcasing what I'm trying to achieve. Note that in reality the geojsonStream.parse callback doesn't receive the startByte and length arguments.
Thanks for your help, also happy about any feedback outlining why this might be a bad idea :)
import geojsonStream from 'geojson-stream'
import { open } from 'fs/promises'
import { Buffer } from 'buffer'
function getFeaturePositionsInFile(fd) {
return new Promise((resolve,reject) => {
const featurePositionsInFile = []
const stream = fd
.createReadStream()
.pipe(geojsonStream.parse((building, index, startByte, length) => {
// The startByte and length callback arguments are not real unfortunately :(
featurePositionsInFile.push({
index,
startPosition: startByte,
length
})
}))
stream.on('end', () => resolve(featurePositionsInFile))
stream.on('error', reject)
})
}
async function readSingleFeatureFromFile(fd, startPosition, length) {
const buff = Buffer.alloc(length)
const offset = 0
const { buffer } = await fd.read(buff, offset, length, startPosition)
return JSON.parse(buffer.toString())
}
const fd = await open('buildings.geojson')
const featurePositionsInFile = await getFeaturePositionsInFile(fd)
const featureIndexToRead = 0
const { startPosition, length } = featurePositionsInFile[featureIndexToRead]
const singleFeature = await readSingleFeatureFromFile(fd, startPosition, length)
Alright, since I couldn't find a suitable package for my needs, I created a simple (naïve) solution using RegExp to extract single GeoJSON features.
It works given that:
The GeoJSON has a properties object, and that object is the last key in the parent GeoJSON object.
The GeoJSON (properties) consists solely of ASCII characters.
For GeoJSON files containing non-ASCII characters, the byte counting is off. I tried but couldn't really find out what exactly I'm doing wrong, so any help is appreciated!
For a more general solution, I guess one would need to implement the byte counting logic in an existing library such as stream-json
import { open } from 'fs/promises'
import { Buffer } from 'buffer'
const HIGHWATERMARK = 64 * 1024 / 8
function getFeaturePositionsInFile(fd) {
return new Promise((resolve,reject) => {
const featurePositionsInFile = []
const stream = fd.createReadStream({highWaterMark: HIGHWATERMARK, autoClose: false});
// this RegEx will solely work with standard GeoJSON without any foreign members:
// https://datatracker.ietf.org/doc/html/rfc7946#section-6.1
// The properties object has to be present, and has to be that last key in the GeoJSON object
const jsonExtractor = /\{[\n\r\s]*?"type":[\n\r\s]*?"Feature"[\S\s]*?\}(?:[\n\r\s]*\})+/g
let string = ''
let endPos = 0
stream.on('data', (d) => {
const section = d.toString()
const sectionLength = (new TextEncoder().encode(section)).length
string += section
endPos+= sectionLength
let match
let latestEndPositionInString = 0
while ((match = jsonExtractor.exec(string)) != null) {
const startPositionInString = match.index
const featureString = match[0]
const endPositionInString = startPositionInString + featureString.length
const curStringLength = (new TextEncoder().encode(string)).length
// calculate starting position in file
const startPosition = endPos - curStringLength + startPositionInString
// calculate number of bytes in feature
const byteLength = (new TextEncoder().encode(featureString)).length
// store info for later in our lookup array
featurePositionsInFile.push({
startPosition,
byteLength
})
if (endPositionInString > latestEndPositionInString) {
latestEndPositionInString = endPositionInString
}
}
// remove features from string to free memory
string = string.substring(latestEndPositionInString)
})
stream.on('end', () => resolve(featurePositionsInFile))
stream.on('error', reject)
})
}
async function readSingleFeatureFromFile(fd, startPosition, length) {
const buff = Buffer.alloc(length)
const offset = 0
const { buffer } = await fd.read(buff, offset, length, startPosition)
const featureString = buffer.toString()
return JSON.parse(featureString)
}
async function getFeature(featureIndexToRead, featurePositionsInFile) {
const { startPosition, byteLength } = featurePositionsInFile[featureIndexToRead]
const singleFeature = await readSingleFeatureFromFile(fd, startPosition, byteLength)
return singleFeature
}
// source: https://raw.githubusercontent.com/node-geojson/geojson-stream/master/test/data/featurecollection.geojson
const path = 'featurecollection.geojson'
// -> has 3 features
const fd = await open(path, 'r');
const featurePositionsInFile = await getFeaturePositionsInFile(fd)
// get nth (e.g. 3rd) feature in file
const thirdFeature = await getFeature(2, featurePositionsInFile)
console.log(thirdFeature)
// done! make sure to close the filehandle
await fd.close()
https://gist.github.com/chrispahm/c226cca151b25147869288600151a5f8
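On the non-ASCII byte counting: two things can throw the numbers off. Calling toString() on each chunk can cut a multi-byte UTF-8 character in half at a chunk boundary, and match.index is an offset in characters of the string, not in bytes of the file. The sketch below is one way to address both, using StringDecoder and Buffer.byteLength. The function name indexFeatures is hypothetical, it takes any iterable of Buffers so that it can be tried without a file, and it reuses the RegExp from the answer with the same limitations.

```javascript
const { StringDecoder } = require('string_decoder')

// Same pattern as in the answer above; it carries the same limitations.
const jsonExtractor = /\{[\n\r\s]*?"type":[\n\r\s]*?"Feature"[\S\s]*?\}(?:[\n\r\s]*\})+/g

// chunks: any iterable of Buffers, e.g. the 'data' chunks of a read stream.
function indexFeatures(chunks) {
  const decoder = new StringDecoder('utf8')
  const positions = []
  let pending = ''     // decoded text not yet assigned to a feature
  let pendingStart = 0 // byte offset in the file where `pending` begins

  for (const chunk of chunks) {
    // decoder.write() holds back an incomplete multi-byte sequence
    // until the next chunk completes it.
    pending += decoder.write(chunk)
    let consumed = 0
    let match
    jsonExtractor.lastIndex = 0
    while ((match = jsonExtractor.exec(pending)) !== null) {
      positions.push({
        // Convert the character offset into a byte offset.
        startPosition: pendingStart + Buffer.byteLength(pending.slice(0, match.index)),
        byteLength: Buffer.byteLength(match[0])
      })
      consumed = match.index + match[0].length
    }
    // Drop the text that has been indexed and advance the byte offset.
    pendingStart += Buffer.byteLength(pending.slice(0, consumed))
    pending = pending.slice(consumed)
  }
  return positions
}
```

Measuring the byte length of the text before each match is not the fastest option for very large buffers, but it keeps the bookkeeping easy to follow.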

Upload to S3 with file buffers

I am trying to upload files to S3 in bulk.
If I upload with a callback it works properly, but I want to push all the returned data into an array and then do something with it afterwards. That doesn't work.
I looked online and saw answers suggesting async/await or recursion, but neither works for me. I even tried reduce, with no luck either.
Example of my reduce:
return files.reduce((accumulator, current) => {
const {path, buffer} = current;
const s3 = new AWS.S3();
// https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
s3.putObject(awsS3sdkParams(path, buffer), function ( err, data ) {
const { protocol, host } = this.request.httpRequest.endpoint;
data.params = this.request.params;
data.params.url = `${protocol}//${host}/${data.params.Key}`;
return [...accumulator, data];
});
}, []);
Example using recursion:
const result = [];
const helper = (files) => {
const {path, buffer} = files[0];
const s3 = new AWS.S3();
// https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
s3.putObject(UploadService.awsS3sdkParams(path, buffer), function (err, data){
const { protocol, host } = this.request.httpRequest.endpoint;
data.params = this.request.params;
data.params.url = `${protocol}//${host}/${data.params.Key}`;
UtilsService.clDebug(data, 'data');
result.push(data);
files.shift();
if(files.length > 0) return helper(files);
});
};
helper(files);
return results;
Example using a promise:
const result = [];
for(let {path, buffer} of files){
const s3 = new AWS.S3();
s3.putObject(awsS3sdkParams(path, buffer)).promise()
.then(file => {
result.push(file);
})
.catch(err => {
console.log(err, 'errs');
});
}
I pretty much understand why result is always [], but how can I make it work?
The reason I cannot use async/await is that when I tried it, the files were either uploaded with bad data, so that I could not even open them, or the keys came out the same...
Does anyone have any other suggestions or advice?
Thanks in advance.
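All three attempts return before the upload callbacks have run, which is why the array is still empty. One way to collect every result is to create one promise per file and wait for all of them. The sketch below is only an illustration of that idea: uploadAll is a hypothetical helper, and the upload itself is passed in as a function so the pattern can be shown without the AWS SDK. It does not address the bad data or duplicate keys mentioned above.

```javascript
// putObject: any function (path, buffer) => Promise. With the code from the
// question this could be, for example:
//   (path, buffer) => s3.putObject(awsS3sdkParams(path, buffer)).promise()
async function uploadAll(files, putObject) {
  // Start every upload, then wait until each one has either succeeded
  // or failed, so that one bad file does not hide the other results.
  const settled = await Promise.allSettled(
    files.map(({ path, buffer }) => putObject(path, buffer))
  )
  return settled.map((outcome, i) => ({
    path: files[i].path,
    ok: outcome.status === 'fulfilled',
    data: outcome.status === 'fulfilled' ? outcome.value : undefined,
    error: outcome.status === 'rejected' ? outcome.reason : undefined
  }))
}
```

The caller can then await the whole batch, for example `const result = await uploadAll(files, upload)`, and work with the array afterwards.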

How to download many files in a loop in Node.js?

I am taking my first steps with Node.js and I don't understand async code.
I would like to write code that downloads 1k files from links.
In my code I used the package "node-downloader-helper",
and this package works for one or two files.
When the loop is bigger, it downloads at most 70 files, and all of them are downloaded incorrectly.
Here is my code:
let uniq = [array with 1000 links];
async.forEachOf(uniq, function (value, key, callback) {
const dl = new DownloaderHelper(value, 'E:/XAMPP/htdocs/mydownloads/pdf/',{fileName: key+".pdf"});
dl.on('end', () => console.log('Download Completed'))
dl.start();
}, function (err) {
if (err) console.error(err.message);
});
Solution:
const save = async (link, index) => {
const dl = new DownloaderHelper(link, linkToDirectory)
await dl.start();
}
const forLoop = async _ => {
console.log('Start')
for (let index = 0; index < uniq.length; index++) {
const link = uniq[index]
const response = await save(link, index)
console.log(response)
}
console.log('End')
}
forLoop();
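The loop above downloads one file at a time. If that turns out to be too slow for 1k links, a middle ground is to keep a small, fixed number of downloads in flight. The helper below is a generic sketch of that idea. The name mapWithLimit is hypothetical and it is not part of node-downloader-helper. It assumes the task function returns a promise that settles when the download is done.

```javascript
// Runs task(item, index) for every item, with at most `limit` tasks in flight.
async function mapWithLimit(items, limit, task) {
  const results = new Array(items.length)
  let next = 0
  // Each worker keeps taking the next unclaimed index until none are left.
  async function worker() {
    while (next < items.length) {
      const index = next++
      results[index] = await task(items[index], index)
    }
  }
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker())
  await Promise.all(workers)
  return results
}
```

With the save function from the solution, a call could look like `await mapWithLimit(uniq, 5, (link, index) => save(link, index))`, where 5 is an arbitrary limit to tune.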
