IPFS streams not buffers - node.js

I'm trying to create an API GET URL that can be used in <img src='...'/> and actually loads the image from IPFS.
I'm getting the file from IPFS, and I can send it as a buffer via Fastify, but I can't send it as a stream.
Here's the working buffer version using ipfs.cat:
import { concat as uint8ArrayConcat } from "uint8arrays/concat";
import all from "it-all";

fastify.get(
  "/v1/files/:username/:cid",
  async function (request: any, reply: any) {
    const { cid }: { cid: string } = request.params;
    const ipfs = create();
    const data = uint8ArrayConcat(await all(ipfs.cat(cid)));
    reply.type("image/png").send(data);
  }
);
Docs for ipfs cat
Docs for fastify reply buffers
I also tried sending it as a stream, to avoid loading the whole file into the server's memory...
import { concat as uint8ArrayConcat } from "uint8arrays/concat";
import all from "it-all";
import { Readable } from "stream";
...
fastify.get(
  "/v1/files/:username/:cid",
  async function (request: any, reply: any) {
    const { cid }: { cid: string } = request.params;
    const ipfs = create();
    const bufferToStream = async (buffer: any) => {
      const readable = new Readable({
        read() {
          this.push(buffer);
          this.push(null);
        },
      });
      return readable;
    };
    const data = uint8ArrayConcat(await all(ipfs.cat(cid)));
    const str = await bufferToStream(data);
    reply.send(str);
  }
);
With a new error:
Error [ERR_STREAM_WRITE_AFTER_END]: write after end
Here I'm trying to push chunks into the stream myself:
import { concat as uint8ArrayConcat } from "uint8arrays/concat";
import all from "it-all";
import { Readable } from "stream";

fastify.get(
  "/v1/files/:username/:cid",
  async function (request: any, reply: any) {
    const { cid }: { cid: string } = request.params;
    const ipfs = create();
    const myStream = new Readable();
    myStream._read = () => {};
    const pushChunks = async () => {
      for await (const chunk of ipfs.cat(cid)) {
        myStream.push(chunk);
      }
    };
    pushChunks();
    reply.send(myStream);
  }
);
The error now is:
INFO (9617): stream closed prematurely
And here I'm trying to dump it all into the stream at once:
import { concat as uint8ArrayConcat } from "uint8arrays/concat";
import all from "it-all";
import { Readable } from "stream";

fastify.get(
  "/v1/files/:username/:cid",
  async function (request: any, reply: any) {
    const { cid }: { cid: string } = request.params;
    const ipfs = create();
    var myStream = new Readable();
    myStream._read = () => {};
    myStream.push(uint8ArrayConcat(await all(ipfs.cat(cid))));
    myStream.push(null);
    reply.send(myStream);
  }
);
With the error:
WARN (14295): response terminated with an error with headers already sent
Is there any benefit to converting it to a stream? Hasn't IPFS already loaded it into memory?

Is there any benefit to converting it to a stream? Hasn't IPFS already loaded it into memory?
The ipfs module returns the file as many chunks, each of them a byte array.
So a file is the sum of these chunks.
Now, if you push all these chunks into an array and then call uint8ArrayConcat, all the chunks end up in your server's memory at once.
So, if the file is 10 GB, your server holds a 10 GB byte array in memory.
Since this is certainly unwanted, you should push every chunk from the IPFS file straight to the response. This way each chunk only passes through the server's memory transiently and is never accumulated, so you never hold the 10 GB file in memory, only a tiny slice of it at a time.
Since ipfs.cat returns an async iterator, you could handle it manually or use something like async-iterator-to-stream and write:
const ipfsStream = asyncIteratorToStream(ipfs.cat(cid))
return reply.send(ipfsStream)
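If you'd rather not add a dependency, Node's built-in Readable.from() can wrap the async iterator directly. A minimal sketch of that manual route, assuming the same create() setup as in the question:

import { Readable } from "stream";

fastify.get(
  "/v1/files/:username/:cid",
  async function (request: any, reply: any) {
    const { cid }: { cid: string } = request.params;
    const ipfs = create();
    // Readable.from pulls one chunk at a time from ipfs.cat, so the file
    // streams through the response without being buffered in memory.
    return reply.type("image/png").send(Readable.from(ipfs.cat(cid)));
  }
);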
As a follow-up, I'm sharing this awesome resource about Node.js streams and buffers.

Related

Upload file with apollo-upload-client and graphql-yoga 3.x

In the 3.x versions of graphql-yoga, file uploads use the scalar type File in queries, but apollo-upload-client uses Upload, so how can I make it work across those frameworks?
The easy answer is that it just works if you use Upload instead of File in the query.
Slightly off topic, but you can build a simpler solution by just sending a File directly. You need to remove apollo-upload-client on the client, and adjust the backend accordingly. Here is a pure file-upload example.
schema.graphql
scalar File

extend type Mutation {
  profileImageUpload(file: File!): String!
}
resolver.ts
// assumes the usual imports at the top of the file, e.g.:
// import sharp from 'sharp'
// import fs from 'fs'
profileImageUpload: async (_, { file }: { file: File }) => {
  // get a ReadableStream from the blob
  const readableStream = file.stream()
  const stream = readableStream.getReader()
  console.log('file', file.type)

  let _file: Buffer | undefined
  while (true) {
    // for each iteration: value is the next blob fragment
    const { done, value } = await stream.read()
    if (done) {
      // no more data in the stream
      console.log('all blob processed.')
      break
    }
    if (value)
      _file = Buffer.concat([_file || Buffer.alloc(0), Buffer.from(value)])
  }

  if (_file) {
    const image = sharp(_file)
    const metadata = await image.metadata()
    console.log(metadata, 'metadata')
    try {
      const webpImage = await sharp(_file).resize(600, 600).webp().toBuffer()
      fs.writeFileSync('test.webp', webpImage)
      console.log(webpImage, 'image')
    }
    catch (error) {
      console.error(error)
    }
  }
  return 'a'
},

stream array of objects nested in objects in nodejs

I fetch data from an API using Node.js.
I get a response with the following structure (the response is saved via a stream into a JSON file):
{"data":{"total":40,"data":[{"date":"20220914","country":"PL","data1":1,"data2":2,"data3":3,"data4":"4"},{"date":"20220914","country":"DE","data1":21,"data2":22,"data3":23,"data4":"24"},{"date":"20220914","country":"DE","data1":21,"data2":22,"data3":23,"data4":"24"},{"date":"20220914","country":"PL","data1":1,"data2":2,"data3":3,"data4":"4"}], "total_page":1,"page":1,"page_size":100},"success":true,"code":"0","request_id":"123"}
Now I would like to read the file as a stream and do some transforms on each object, however I am not able to retrieve it object by object.
The problem is that the array I'm interested in is nested under the .data.data keys, and I don't know how to get each element of the array one by one and modify it.
import { pipeline, Transform } from 'stream';
import { promisify } from 'util';
import fs from 'fs';

public async processData() {
  await this.api.getReport();
  const reader = fs.createReadStream('./response.json');
  const writer = fs.createWriteStream('properFormat.txt');
  const asyncPipeline = promisify(pipeline);

  const newFormatedData = (object: Record<string, string>) => {
    // Here I would like to take into consideration only values
    // for example with the keys: date, country and data1
    console.log(object.toString());
  };

  const formatData = new Transform({
    objectMode: true,
    transform(chunk, encoding, done) {
      this.push(newFormatedData(chunk));
      done();
    },
  });

  asyncPipeline(reader, formatData, writer);
}
Thank you for any hints on this!
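One possible approach (not from the original post) is a streaming JSON parser such as stream-json: pick out the nested array and stream its elements one by one, so the whole response is never held in memory. A rough sketch, assuming stream-json's Pick filter accepts the dot-separated path 'data.data' for the nested array:

import fs from 'fs';
import { chain } from 'stream-chain';
import { parser } from 'stream-json';
import { pick } from 'stream-json/filters/Pick';
import { streamArray } from 'stream-json/streamers/StreamArray';

const jsonPipeline = chain([
  fs.createReadStream('./response.json'),
  parser(),
  pick({ filter: 'data.data' }),  // descend into the nested .data.data array
  streamArray(),                  // emits { key, value } for each array element
  // keep only the fields of interest, one line per record
  ({ value }) => `${value.date};${value.country};${value.data1}\n`,
]);

jsonPipeline.pipe(fs.createWriteStream('properFormat.txt'));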

remix passing args to async server side functions

I'm really a beginner with this stuff, so I beg your pardon for my (silly) questions.
I want to use async functions inside .tsx pages; specifically, these functions are fetch calls to Shopify to get data and ioredis calls to write and read some data.
I know that Remix uses action and loader functions, so to manage the Shopify calls I figured out this:
export const loader: LoaderFunction = async ({ params }) => {
  return json(await GetProductById(params.id as string));
};

async function GetProductById(id: string) {
  const ops = ...;
  const endpoint = ...;
  const response = await fetch(endpoint, ops);
  const json = await response.json();
  return json;
};

export function FetchGetProductById(id: number) {
  const fetcher = useFetcher();
  useEffect(() => {
    fetcher.load(`/query/getproductid/${id}`);
  }, []);
  return fetcher.data;
}
With this solution I can get the data whenever I want by just calling FetchGetProductById, but my problem is that I need to send more complex data (like objects) to the loader.
How may I do that?
In Remix, the loader only handles GET requests, so data must be in the URL, either via params or searchParams (query string).
If you want to pass data in the body of the request, then you'll need to use POST and create an action.
NOTE: Remix uses FormData and not JSON to send data. You will need to convert your JSON into a string.
export const action = async ({ request }: ActionArgs) => {
  const formData = await request.formData();
  const object = JSON.parse(formData.get("json") as string);
  return json(object);
};

export default function Route() {
  const fetcher = useFetcher();
  useEffect(() => {
    if (fetcher.state !== 'idle' || fetcher.data) return;
    fetcher.submit(
      {
        json: JSON.stringify({ a: 1, message: "hello", b: true }),
      },
      { method: "post" }
    );
  }, [fetcher]);
  return <pre>{JSON.stringify(fetcher.data, null, 2)}</pre>;
}
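For completeness, a sketch of the GET side mentioned above (not from the original answer): simple values can ride along in searchParams and be read inside the loader. This assumes a Remix version that exports LoaderArgs, and the currency parameter is purely illustrative:

export const loader = async ({ request, params }: LoaderArgs) => {
  // loaded on the client with e.g. fetcher.load(`/query/getproductid/${id}?currency=EUR`)
  const url = new URL(request.url);
  const currency = url.searchParams.get("currency") ?? "USD";
  const product = await GetProductById(params.id as string);
  return json({ product, currency });
};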

Upload file using Fetch and calc the progress using readable streams not work as expected (Browser)

I am trying to upload a file using fetch and streams, because I need to calculate the upload progress.
But on the server I receive a corrupt file of 0 bytes. I really don't know how to use the ReadableStream feature, which is really new.
Client (Browser)
const fileReadble = (reader) => new ReadableStream({
  start(controller) {
    return pump()
    function pump() {
      return reader.read().then(({ done, value }) => {
        if (done) {
          controller.close()
          return
        }
        controller.enqueue(value)
        return pump()
      })
    }
  },
})

const uploadVideo = async (file) => {
  const fileReader = file.stream().getReader()
  const response = await fetch('http://localhost:3005', {
    method: 'POST',
    body: fileReadble(fileReader),
  })
  if (!response.ok) {
    console.error(`unexpected response ${response.statusText}`)
    return
  }
}
Server (Node)
import http from 'node:http'
import { createWriteStream } from 'node:fs'

http
  .createServer((req) => {
    if (req.method !== 'POST') return
    req.pipe(createWriteStream('./video.mp4'))
  })
  .listen(3005)
Note: I am using tus-js-client on the server to upload the videos to Vimeo; for security reasons, I need to send the file using fetch.
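Not part of the original post, but two details are worth flagging as likely causes: fetch() only accepts a ReadableStream body when the request is sent with duplex: 'half', and browsers reportedly restrict streaming request bodies to HTTP/2+ connections, which a plain Node http server does not provide. A hedged sketch of the client that also counts bytes for progress:

const uploadWithProgress = async (file, onProgress) => {
  const reader = file.stream().getReader()
  let sent = 0
  const body = new ReadableStream({
    async pull(controller) {
      const { done, value } = await reader.read()
      if (done) {
        controller.close()
        return
      }
      sent += value.byteLength
      onProgress(sent, file.size) // progress callback, e.g. to update a progress bar
      controller.enqueue(value)
    },
  })
  const response = await fetch('http://localhost:3005', {
    method: 'POST',
    body,
    duplex: 'half', // required by fetch() for stream bodies; may need a cast in TypeScript
  })
  if (!response.ok) {
    console.error(`unexpected response ${response.statusText}`)
  }
}

Note that this measures how much of the file has been handed to fetch, not how much the server has actually acknowledged.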

Using csv-parse with highlandjs

I would like to do a bit of parsing on CSV files to convert them to JSON and extract data out of them. I'm using highland as a stream-processing library. I am creating an array of CSV parsing streams using:
import { readdir as readdirCb, createReadStream } from 'fs';
import { promisify } from 'util';
import _ from 'highland';
import parse from 'csv-parse';

const readdir = promisify(readdirCb);

const LOGS_DIR = './logs';
const options = '-maxdepth 1';

async function main() {
  const files = await readdir(LOGS_DIR);
  const stream = _(files)
    .map(filename => createReadStream(`${LOGS_DIR}/${filename}`))
    .map(parse);
}

main();
I have tried to use the stream like this:
const stream = _(files)
  .map(filename => createReadStream(`${LOGS_DIR}/${filename}`))
  .map(parse)
  .each(stream => {
    stream.on('parseable', () => {
      let record;
      while (record = stream.read()) { console.log(record); }
    });
  });
This does not produce any records. I am not sure how to proceed in order to receive the JSON for each row of each CSV file.
EDIT:
Writing a function like this works for an individual file:
import parse from 'csv-parse';
import transform from 'stream-transform';
import { createReadStream } from 'fs';
export default function retrieveApplicationIds(filename) {
  console.log('Parsing file', filename);
  return createReadStream(filename).pipe(parser).pipe(getApplicationId).pipe(recordUniqueId);
}
EDIT 2:
I have tried using the concat streams approach:
const LOGS_DIR = './logs';

function concatStreams(streamArray, streamCounter = streamArray.length) {
  return streamArray.reduce((mergedStream, stream) => {
    // pipe each stream of the array into the merged stream
    // prevent the automated 'end' event from firing
    mergedStream = stream.pipe(mergedStream, { end: false });
    // rewrite the 'end' event handler
    // Every time one of the streams ends, the counter is decremented.
    // Once the counter reaches 0, the mergedStream can emit its 'end' event.
    stream.once('end', () => --streamCounter === 0 && mergedStream.emit('end'));
    return mergedStream;
  }, new PassThrough());
}

async function main() {
  const files = await readdir(LOGS_DIR);
  const streams = files.map(parseFile);
  const combinedStream = concatStreams(streams);
  combinedStream.pipe(process.stdout);
}

main();
When I use this, I get the following warning:
(node:1050) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 unpipe listeners added to [Transformer]. Use emitter.setMaxListeners() to increase limit
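One hedged alternative (not from the original post) keeps everything inside highland instead of a hand-rolled PassThrough: wrap each csv-parse stream in a highland stream and let .merge() combine them, which also avoids piling listeners onto a single merged stream. This assumes csv-parse's columns: true option so each row arrives as an object:

import { readdir as readdirCb, createReadStream } from 'fs';
import { promisify } from 'util';
import _ from 'highland';
import parse from 'csv-parse';

const readdir = promisify(readdirCb);
const LOGS_DIR = './logs';

async function main() {
  const files = await readdir(LOGS_DIR);
  _(files)
    // one highland stream per file, wrapping the node csv-parse stream
    .map(filename => _(createReadStream(`${LOGS_DIR}/${filename}`).pipe(parse({ columns: true }))))
    // merge the stream of streams into a single stream of row objects
    .merge()
    // serialize each row as one JSON line
    .map(record => JSON.stringify(record) + '\n')
    .pipe(process.stdout);
}

main();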
