I'm looking for a way to track all requests made from a website in zombie.js. The idea is to get information about all loaded content (e.g. tracking pixels for ads, analytics tags, images, CSS ...). Basically, the Network Monitor from the dev tools, but in a headless browser.
I'm currently stuck at this point:
var Browser = require("zombie");
var url = "http://stackoverflow.com/";
var browser = new Browser();
browser.visit(url, function(err) {
  for (var i = browser.resources.length - 1; i >= 0; i--) {
    console.log(browser.resources[i].request.url)
  }
})
This is probably the most basic setup, and it does not track anything except some .js requests. I also can't track files that are loaded by an external script. The best example is Google Tag Manager, which "hides" all files that it loads itself.
It would be great if somebody had an idea how to solve this issue.
Thanks in advance
Daniel
What you are looking for is called resources, and you can access them via browser.resources, like this:
browser.visit(url).then(function() {
  console.log(browser.resources); // array with downloaded resources
});
You can also add pipeline handlers to monitor the resources in real time as they are downloaded:
browser.pipeline.addHandler(function(browser, request, response) {
  console.log(request, response);
  return response;
});
browser.visit(url).then(function() {
  console.log('successful visit');
});
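Once the list is available, it can help to group the URLs by host so that third-party requests (ad pixels, analytics) stand out. A minimal sketch, assuming each entry has the { request: { url } } shape used in the question's loop; groupByHost is a hypothetical helper, not part of Zombie:

```javascript
// Sketch: group resource URLs by hostname.
// Assumption: each entry looks like { request: { url: '...' } }.
function groupByHost(resources) {
  const byHost = {};
  for (const resource of resources) {
    const url = resource && resource.request && resource.request.url;
    if (!url) continue; // skip entries without a request URL
    let host;
    try {
      host = new URL(url).hostname;
    } catch (err) {
      host = '(invalid url)';
    }
    (byHost[host] = byHost[host] || []).push(url);
  }
  return byHost;
}

// Plain objects standing in for browser.resources:
const sample = [
  { request: { url: 'http://stackoverflow.com/' } },
  { request: { url: 'https://www.google-analytics.com/analytics.js' } },
  { request: { url: 'https://www.google-analytics.com/collect?v=1' } },
];
console.log(groupByHost(sample));
```

Inside the visit callback this could be called as groupByHost(browser.resources), provided the entries really have that shape in your Zombie version.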
I'm building a personal portfolio website using Vue.js, and I'm attempting to build a form that lets me add to my portfolio later. I'm storing the text data in Firebase, but I also want to be able to upload and access pictures. I'm attempting to upload through a form and save with node:fs, using the following:
import { writeFile } from 'node:fs'
export function saveImages (data:FileList, toDoc: string) {
  const reader = new FileReader()
  const imageNames = []
  console.log(toDoc)
  for (let i = 0; i < data.length; i++) {
    imageNames.push(toDoc + '/' + data[i].name)
    reader.readAsBinaryString(data[i])
    reader.onloadend = function (e) {
      if (e.target?.readyState === FileReader.DONE) {
        const imageFile = e.target.result as string
        if (imageFile) {
          writeFile('./assets/' + data[i].name, imageFile, 'binary', (err) =>
            console.log('was unable to save file ' + data[i].name + ' => ' + err)
          )
        }
      }
    }
  }
  return imageNames
}
When I attempt to call saveImages, I get the error
ERROR in node:fs
Module build failed: UnhandledSchemeError: Reading from "node:fs" is not handled by plugins (Unhandled scheme).
Webpack supports "data:" and "file:" URIs by default.
You may need an additional plugin to handle "node:" URIs.
As pointed out in the comments, the Node.js fs module cannot be used in the frontend. Here is why:
While developing your Vue.js app, remember that you always have to run a development server so that the app is compiled into a browser-readable format. The resulting page is not delivered as a set of .vue and .js files; everything is bundled into an HTML file and some additional .js files.
While the development server is running, the source directory of your app sits on a server, but that is not the directory that is delivered to the browser.
On a production server, static assets are built for your Vue.js app, and these also contain only .html and .js files.
When a client (browser) accesses a route, some static files are delivered to the browser, and all the code you write in your .vue files runs on the client's machine, not on a server. So you cannot interact with server-side directories.
Therefore, you should look into a backend server framework that you can connect to your frontend, so that users can upload files to a server where they are saved. You then set up your Vue app to communicate with the backend. Here are some additional resources:
Node modules with vue: StackOverflow
Express.js: popular backend framework for Node.js
Express.js: Deliver files
Blog article on Express.js file upload (Attacomsian)
You might also want to take a look at how to serve static files with Express: once the files are uploaded and the server has received them, it can store them in a static directory, where you can access them without needing separate API routes.
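To make the backend idea concrete, here is a minimal sketch of an upload endpoint. It uses only Node's built-in modules instead of Express to stay self-contained. The /upload route, the name query parameter, the uploads folder and the helper names are all assumptions made for illustration:

```javascript
const fs = require('fs');
const http = require('http');
const path = require('path');

const UPLOAD_DIR = path.join(process.cwd(), 'uploads'); // hypothetical target folder

// Reduce a client-supplied name to a bare file name so a request cannot
// write outside UPLOAD_DIR (e.g. "../../etc/passwd" becomes "passwd").
function safeFileName(name) {
  const base = path.basename(String(name).replace(/\\/g, '/'));
  return base === '.' || base === '..' ? '' : base;
}

// Handle PUT /upload?name=<file name> by streaming the request body to disk.
function handleUpload(req, res) {
  const url = new URL(req.url, 'http://localhost');
  const name = safeFileName(url.searchParams.get('name') || '');
  if (req.method !== 'PUT' || url.pathname !== '/upload' || !name) {
    res.writeHead(400);
    res.end('expected PUT /upload?name=<file name>');
    return;
  }
  fs.mkdirSync(UPLOAD_DIR, { recursive: true });
  const out = fs.createWriteStream(path.join(UPLOAD_DIR, name));
  req.pipe(out);
  out.on('finish', () => { res.writeHead(201); res.end('saved ' + name); });
  out.on('error', (err) => { res.writeHead(500); res.end(String(err)); });
}

// The server only starts when startServer(port) is called.
function startServer(port) {
  return http.createServer(handleUpload).listen(port);
}
```

From the Vue side, each File in the FileList could then be sent with something like fetch('/upload?name=' + encodeURIComponent(file.name), { method: 'PUT', body: file }). A real deployment would also need authentication, size limits and CORS or proxy configuration, which this sketch leaves out.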
I have a challenge I'm running into and cannot seem to find an answer for it anywhere on the web. I'm working on a personal project; it's a Node.js application that uses the request and cheerio packages to hit an end-point and scrape some data... However, the endpoint is a Facebook page... and the display of its content is dependent upon whether the user is logged in or not.
In short, the app seeks to scrape the user's saved links, you know, all that stuff you add to your "save for later" but never actually go back to (at least in my case). The end-point, then, is https://www.facebook.com/saved. If you are logged into Facebook in your browser, clicking that link will take you where the application needs to go. However, since the application isn't going through the browser that has your credentials and session saved, I'm running into a bit of an issue...
Yes, using the request module I'm able to successfully reach "a" part of Facebook, but not the one I need... My question really is: how should I begin to handle this challenge?
This is all the code I have for the app so far:
var express = require('express');
var fs = require('fs');
var request = require('request');
var cheerio = require('cheerio');
var app = express();
app.get('/scrape', (req, res) => {
  // Workspace
  var url = 'https://www.facebook.com/saved';
  request(url, (err, response, html) => {
    if (err) console.log(err);
    res.send(JSON.stringify(html));
  })
})
app.listen('8081', () => {
  console.log('App listening on port 8081');
})
Any input will be greatly appreciated... Currently, I'm on hold...! How could I hit this end-point with credentials (safely) provided by the user, so that the application can legitimately get past authentication and reach the desired end-point?
I don't think you can accomplish that with the request and cheerio modules alone, since you need to make a POST request with your login information.
A headless browser is more appropriate for this kind of project if you want it to be a scraper. Try CasperJS or PhantomJS. They give you more flexibility, but they are not Node.js modules, so you need to go a step further if you want to integrate them with Express.
One Node.js module I know of that lets you post is Osmosis. If you can make .login(user, pw) work, that would be great, but I don't think it can successfully log in to Facebook.
An API, if one is available, would be a much nicer solution, but I'm assuming you have already looked and found nothing there for what you need.
My personal choice would be a Robotic Process Automation (RPA) tool. WinAutomation, for example, is a great tool for web manipulation and scraping. It's a completely different approach, but it does the job well and can be set up faster than coding it by hand.
I don't understand why JavaScript can't make FTP calls. Why do we have to make such a request through a server?
Even browsers have the ability to authenticate against and browse an FTP server. Maybe the browser APIs could be used for this?
Ok, answering my own question here.
I went through the Mozilla docs on XMLHttpRequest. They specifically say:
Despite its name, XMLHttpRequest can be used to retrieve any type of data, not just XML, and it supports protocols other than HTTP (including file and ftp).
So, I am satisfied with this. JavaScript can make calls to ftp using this.
The title of this question suggests the asker wants to know whether an FTP transfer can be implemented in JavaScript. However, judging by the asker's own answer, the question was really whether URLs with the FTP protocol can be used from JS and possibly from HTML tags. The answer is yes. Even a simple <a> tag supports FTP URLs in the href attribute. Hope this helps new readers. And yes, the XMLHttpRequest AJAX object does allow calling a URL with the FTP protocol.
Cheers.
There is a JavaScript library at http://ftp.apixml.net/ that allows FTP file uploads via JavaScript.
In this case, technically, the ftpjs server makes the FTP connection to the FTP server, but the instructions are passed via JavaScript. So this particular library is designed primarily to let developers add a basic file upload mechanism without writing server-side code.
Under the hood, it uses the HTML5 FileReader to read the file to a base64 string, and then posts this using CORS AJAX back to the server.
// Script from http://ftp.apixml.net/
// Copyright 2017 http://ftp.apixml.net/, DO NOT REMOVE THIS COPYRIGHT NOTICE
var Ftp = {
  createCORSRequest: function (method, url) {
    var xhr = new XMLHttpRequest();
    if ("withCredentials" in xhr) {
      // Check if the XMLHttpRequest object has a "withCredentials" property.
      // "withCredentials" only exists on XMLHTTPRequest2 objects.
      xhr.open(method, url, true);
    } else if (typeof XDomainRequest != "undefined") {
      // Otherwise, check if XDomainRequest.
      // XDomainRequest only exists in IE, and is IE's way of making CORS requests.
      xhr = new XDomainRequest();
      xhr.open(method, url);
    } else {
      // Otherwise, CORS is not supported by the browser.
      xhr = null;
    }
    return xhr;
  },
  upload: function(token, files) {
    var file = files[0];
    var reader = new FileReader();
    reader.readAsDataURL(file);
    reader.addEventListener("load",
      function() {
        var base64 = this.result;
        var xhr = Ftp.createCORSRequest('POST', "http://ftp.apixml.net/upload.aspx");
        if (!xhr) {
          throw new Error('CORS not supported');
        }
        xhr.onreadystatechange = function() {
          if (xhr.readyState == 4 && xhr.status == 200) {
            Ftp.callback(file);
          }
        };
        xhr.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
        xhr.send("token=" + token + "&data=" + encodeURIComponent(base64) + "&file=" + file.name);
      },
      false);
  },
  callback: function(){}
};
It is very difficult to transfer a big file to a backup server over FTP from a web application without going through HTTP.
Let's say S1 is the client browser,
S2 is the server that hosts the code,
and S3 is the file backup server, and we want to upload a 2 GB file from S1 using FTP.
use case diagram
This can be done with a Java applet. We can embed an uploader applet in the web application; the applet runs inside the browser sandbox.
See this link:
sample code for ftp using applet
Note that Java has to be enabled in your browser for this to work.
I'm working on a NodeJS project that requires me to obtain driving directions on the server.
It seems like an obvious choice to use the Google Javascript API Version 3. But it seems like it was made only to be used on HTML pages, and not on server-only scripts. Even loading the API requires a script-tag or document.write.
Then I turned to node-googlemaps which is based on the Google Maps API. Sadly, this also does not work for two reasons:
the returned travel steps do not contain the path field
the license does not allow using that API without displaying a map to a user
What can I do? Are there any workarounds or other APIs I could use?
Best, Boris
Actually, you can do it on the backend or the frontend, and the approach is basically the same.
All you have to do is send a request to the endpoint with the right parameters, and the API will return everything you need.
So, roughly, it would be something like this:
var https = require('https');
var options = {
  host: 'maps.googleapis.com',
  // the web service is called over HTTPS and expects an API key; YOUR_API_KEY is a placeholder
  path: '/maps/api/directions/json?origin=Toronto&destination=Montreal&avoid=highways&mode=bicycling&key=YOUR_API_KEY'
};
var callback = function(response) {
  // variable that will save the result
  var result = '';
  // every time you have a new piece of the result
  response.on('data', function(chunk) {
    result += chunk;
  });
  // when you get everything back
  response.on('end', function() {
    // res is assumed to be the response object of the surrounding Express route handler
    res.send(result);
  });
};
https.request(options, callback).end();
And here's the documentation's link if you want to dig deeper on this: https://developers.google.com/maps/documentation/directions/?hl=nl
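If the query has to be built from user input, it is safer to let URLSearchParams do the escaping than to concatenate strings. Below is a small sketch of that, plus a helper that pulls the step instructions out of the parsed response. Both function names are made up for illustration, and the routes[].legs[].steps[].html_instructions shape is taken from the Directions documentation, so check it against the JSON you actually get back:

```javascript
// Build the request path for the Directions web service from a plain object.
// URLSearchParams takes care of URL-encoding the values.
function buildDirectionsPath(params) {
  const query = new URLSearchParams(params).toString();
  return '/maps/api/directions/json?' + query;
}

// Collect the instruction text of every step on the first route.
// Assumption: body is the parsed JSON with routes[].legs[].steps[].
function extractSteps(body) {
  const route = body && body.routes && body.routes[0];
  if (!route) return [];
  const steps = [];
  for (const leg of route.legs || []) {
    for (const step of leg.steps || []) {
      steps.push(step.html_instructions);
    }
  }
  return steps;
}

// Usage with a hand-written stand-in for a parsed response:
const fakeBody = {
  routes: [{ legs: [{ steps: [
    { html_instructions: 'Head north' },
    { html_instructions: 'Turn left' },
  ] }] }],
};
console.log(buildDirectionsPath({ origin: 'Toronto', destination: 'Montreal', mode: 'bicycling' }));
console.log(extractSteps(fakeBody));
```

In the request above, the result of buildDirectionsPath would go into options.path, and extractSteps(JSON.parse(result)) would be called in the 'end' handler.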
Cheers,
The following code works if run in the console itself:
var $inputbox = $('input#inputfield');
var SPACE_KEYCODE = 32;
var space_down = $.Event( 'keyup', { which: SPACE_KEYCODE } );
$inputbox.trigger(space_down)
I can see the event being triggered and the page responding.
However when running the same code in a content script via a Chrome extension, it fails silently. Logging the results of '$inputbox.trigger(space_down)' shows it correctly returning the element.
The intention here is to have the existing page JS respond to the keyboard event from the extension. Is this possible?
Although I haven't been able to find documentation on whether events are kept separate between the content script's JS 'world' and the origin site's world, I wrote the following in the content script to load some JS via window.location, which makes it run in the context of the origin site.
// In order to send keyboard events, we'll need to send them from the page's JS
var load_into_page_context = function(file) {
  var file_url = chrome.extension.getURL(file);
  $.get(file_url, function(script_contents) {
    window.location = 'javascript:'+script_contents
  })
}
load_into_page_context("injectme.js");
This will load injectme.js (bundled with the extension) into the window.location, and make the generated keyboard events activate the origin site's event handlers.
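For what it's worth, Chrome's extension docs describe content scripts as running in an "isolated world": they share the DOM with the page but not its JavaScript objects. That would explain why events triggered through the extension's own jQuery never reach handlers registered by the page's jQuery. A commonly used alternative to the javascript: URL is to append a script element that points at the bundled file. A minimal sketch, where injectScript is a hypothetical helper and injectme.js is assumed to be listed under web_accessible_resources in the manifest:

```javascript
// Inject a script file into the page by appending a <script src=...> element.
// `doc` is the page's document and `scriptUrl` the URL of the bundled file;
// both are passed in so the helper has no hidden dependencies.
function injectScript(doc, scriptUrl) {
  const script = doc.createElement('script');
  script.src = scriptUrl;
  // Remove the element once it has loaded; the code it ran stays in the page context.
  script.onload = function () {
    if (script.parentNode) script.parentNode.removeChild(script);
  };
  (doc.head || doc.documentElement).appendChild(script);
  return script;
}

// In a content script this might be called as:
// injectScript(document, chrome.runtime.getURL('injectme.js'));
```

chrome.runtime.getURL is the current name for what the snippet above calls chrome.extension.getURL.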