Make a Chrome extension download all PDFs with `declarativeNetRequest`

I have to migrate a chrome extension from MV2 to MV3, and that means replacing usages of the blocking chrome.webRequest API with declarativeNetRequest. One usage is this:
function enableDownloadPDFListener() {
  chrome.webRequest.onHeadersReceived.addListener(
    downloadPDFListener,
    { urls: ['<all_urls>'] },
    ['blocking', 'responseHeaders']
  );
}

function downloadPDFListener(details) {
  const header = details.responseHeaders.find(
    e => e.name.toLowerCase() === 'content-type'
  );
  if (header && header.value === 'application/pdf') {
    const headerDisposition = details.responseHeaders.find(
      e => e.name.toLowerCase() === 'content-disposition'
    );
    if (headerDisposition) {
      headerDisposition.value = headerDisposition.value.replace('inline', 'attachment');
    } else {
      details.responseHeaders.push({ name: 'Content-Disposition', value: 'attachment' });
    }
  }
  return { responseHeaders: details.responseHeaders };
}
Explanation: This function intercepts requests, checks if their Content-Type header is application/pdf, and if that's the case, sets Content-Disposition: attachment to force downloading the file. We have this functionality to save our employees time when downloading lots of PDF files from various websites.
The problem I'm facing is that this API is deprecated and can't be used in Manifest V3, and I wasn't able to migrate it to the declarativeNetRequest API. I tried the following:
[
  {
    "id": 1,
    "priority": 1,
    "action": {
      "type": "modifyHeaders",
      "responseHeaders": [
        {
          "header": "content-disposition",
          "operation": "set",
          "value": "attachment"
        }
      ]
    },
    "condition": {
      // what should I put here?
    }
  }
]
But I don't know how to filter requests by their Content-Type response header. From what I understand, this is currently not possible. Is there any other way to get this functionality in Chrome's MV3?
I tried { "urlFilter": "*.pdf" } as a condition, which isn't strictly correct, but might be good enough. However, although the extension badge indicates that the rule was matched, the Content-Disposition header isn't set in the network tab, and the file isn't downloaded. What went wrong here?

At a glance, I don't think there's a condition that would work here. It seems like a reasonable use case, though, and one that might make a good feature request at https://bugs.chromium.org/p/chromium.
In the meantime, could you have the extension inject a content script that listens for click events on links? Alternatively, you could wait for the PDF to open, then close the tab and trigger a download instead.
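A minimal sketch of the content-script workaround suggested above. Everything here is an assumption, not from the original post: PDF links are recognized by URL suffix (a heuristic that misses PDFs served from extensionless URLs), a background service worker with the "downloads" permission performs the actual download, and the message type name "download-pdf" is made up for illustration:

```javascript
// Content-script sketch: intercept clicks on links that look like PDFs
// and relay them to the background service worker for downloading.
function looksLikePdf(url) {
  try {
    // The base URL only matters for relative hrefs.
    const { pathname } = new URL(url, 'https://example.invalid/');
    return pathname.toLowerCase().endsWith('.pdf');
  } catch (e) {
    return false;
  }
}

if (typeof document !== 'undefined') {
  document.addEventListener('click', (event) => {
    const link = event.target.closest('a[href]');
    if (link && looksLikePdf(link.href)) {
      event.preventDefault();
      // chrome.downloads isn't available to content scripts, so relay the URL.
      chrome.runtime.sendMessage({ type: 'download-pdf', url: link.href });
    }
  });
}

// Corresponding service-worker side (for reference):
// chrome.runtime.onMessage.addListener((msg) => {
//   if (msg.type === 'download-pdf') chrome.downloads.download({ url: msg.url });
// });
```

Unlike the header-rewriting approach, this only catches PDFs reached via link clicks, not ones opened by scripts or redirects.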

Related

How can I replace my webRequest blocking with declarativeNetRequest?

I'm trying to migrate my Chrome extension to Manifest V3, and I'm having some trouble with webRequest. I don't know how to replace the source code below with declarativeNetRequest. Also, what should I add to my manifest.json to make this work? When I read the Chrome Developers documentation, it mentions something about rules, and I don't know what those rules are. Thank you so much for your help.
chrome.webRequest.onHeadersReceived.addListener(info => {
  const headers = info.responseHeaders; // original headers
  for (let i = headers.length - 1; i >= 0; --i) {
    let header = headers[i].name.toLowerCase();
    if (header === "x-frame-options" || header === "frame-options") {
      headers.splice(i, 1); // Remove the header
    }
    if (header === "content-security-policy") { // csp header is found
      // modifying frame-ancestors; this implies that the directive is already present
      headers[i].value = headers[i].value.replace("frame-ancestors", "frame-ancestors " + window.location.href);
    }
  }
  // Something is messed up still, trying to bypass CORS when getting the largest GIF on some pages
  headers.push({
    name: 'Access-Control-Allow-Origin',
    value: window.location.href
  });
  // return modified headers
  return { responseHeaders: headers };
}, {
  urls: ["<all_urls>"], // match all pages
  types: ["sub_frame"]  // for framing only
}, ["blocking", "responseHeaders"]);
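For what it's worth, the static half of this listener (stripping `X-Frame-Options`/`Frame-Options` from sub-frame responses) does map onto a declarativeNetRequest rule; it's the dynamic parts (splicing `window.location.href` into CSP and `Access-Control-Allow-Origin`) that have no declarative equivalent, because rule values must be fixed strings. A sketch of the static rule, which would go in a JSON file referenced from `declarative_net_request.rule_resources` in the manifest, alongside the `declarativeNetRequest` permission:

```json
[
  {
    "id": 1,
    "priority": 1,
    "action": {
      "type": "modifyHeaders",
      "responseHeaders": [
        { "header": "x-frame-options", "operation": "remove" },
        { "header": "frame-options", "operation": "remove" }
      ]
    },
    "condition": {
      "resourceTypes": ["sub_frame"]
    }
  }
]
```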

how to access data from json converted from xml nodejs api

I have developed a Node.js API which accepts XML as input. I was able to receive the XML and convert it to the JSON below.
var data = {
  "ns0:service1": {
    "$": {
      "xmlns:ns0": "http://www.google.com"
    },
    "ns0:messageheader": {
      "$": {
        "version": "1.0",
        "xmlns:ns1": "http://www.google.com/logo"
      },
      "ns1:sourcesystemcode": "MUST",
      "ns1:operation": "Process",
      "ns1:targetsystemlist": {
        "ns1:targetsystemcode": "TEST1",
        "ns1:targetsystemname": "TEST1"
      }
    },
    "ns0:messagedata": {
      "ns3:messagedata": {
        "$": {
          "xmlns:ns3": "http://www.google.com/logo2"
        },
        "ns3:somessagerequestdata": {
          "ns3:sorequestorderheader": {
            "ns3:sourcecode": "TEST1",
            "ns3:msgdate": "2014-05-28T11:48:31",
            "ns3:deliveryaddress": {
              "ns3:name": "John",
              "ns3:streetname": "Latin",
              "ns3:housenumber": "53"
            },
            "ns3:customeraddress": "",
            "ns3:sorequestline": {
              "ns3:orderid": "ord_001",
              "ns3:linetype": "testing",
              "ns3:itemnumber": "001",
              "ns3:itemdescription": "iphonex",
              "ns3:quantity": "1"
            }
          }
        }
      }
    }
  }
}
How can I access values like "ord_001", "testing", "001", "iphonex" and "1" in Node.js?
One way is to use bracket notation:
data["ns0:service1"]["ns0:messagedata"]["ns3:messagedata"]["ns3:somessagerequestdata"]["ns3:sorequestorderheader"]["ns3:sorequestline"]["ns3:orderid"] will get you to ord_001.
data["ns0:service1"]["ns0:messagedata"]["ns3:messagedata"]["ns3:somessagerequestdata"]["ns3:sorequestorderheader"]["ns3:sorequestline"] will get you to the whole object you're looking for.
The way those objects are set up is pretty rough, though. Note that JSON.parse(data) won't work here, because data is already a JavaScript object rather than a JSON string.

Training Microsoft Custom Vision model via rest api

I am working on a simple nodejs console utility that will upload images for the training of a Custom Vision model. I do this mainly because the customvision web app won't let you tag multiple images at once.
tl;dr: How to post images into the CreateImagesFromFiles API endpoint?
I cannot figure out how to pass the images that I want to upload. The documentation just defines a string as the type of one of the properties (contents, I guess). I tried passing a path to a local file, a URL to an online file, and even a base64-encoded image as a string. None of them worked.
They have a testing console (the blue "Open API testing console" button on the linked docs page), but once again it's vague and won't tell you what kind of data it actually expects.
The code here isn't that relevant, but maybe it helps...
const options = {
  host: 'southcentralus.api.cognitive.microsoft.com',
  path: `/customvision/v2.0/Training/projects/${projectId}/images/files`,
  method: 'POST',
  headers: {
    'Training-Key': trainingKey,
    'Content-Type': 'application/json'
  }
};
const data = {
  images: [
    {
      name: 'xxx',
      contents: 'iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAEklEQVR42mP8z8AARKiAkQaCAFxlCfyG/gCwAAAAAElFTkSuQmCC',
      tagIds: [],
      regions: []
    }
  ],
  tagIds: []
};
const req = http.request(options, res => {
  ...
});
req.write(JSON.stringify(data));
req.end();
Response:
BODY: { "statusCode": 404, "message": "Resource not found" }
No more data in response.
I got it working using the "API testing console" feature, so I can help you identify your issue (but sorry, I'm not an expert in Node.js, so I will guide you with C# code).
Format of content for API
You are right: the documentation is not clear about the content the API is expecting. I did some searching and found a project in a Microsoft GitHub repository called Cognitive-CustomVision-Windows, here.
What I saw is that they use a class called ImageFileCreateEntry, whose signature is visible here:
public ImageFileCreateEntry(string name = default(string), byte[] contents = default(byte[]), IList<System.Guid> tagIds = default(IList<System.Guid>))
So I guessed it's using a byte[].
You can also see in their sample how they did for this "batch" mode:
// Or uploaded in a single batch
var imageFiles = japaneseCherryImages.Select(img => new ImageFileCreateEntry(Path.GetFileName(img), File.ReadAllBytes(img))).ToList();
trainingApi.CreateImagesFromFiles(project.Id, new ImageFileCreateBatch(imageFiles, new List<Guid>() { japaneseCherryTag.Id }));
Then this byte array is serialized with Newtonsoft.Json: if you look at its documentation (here), it says that byte[] values are converted to base64-encoded strings. That's our target.
Implementation
As you mentioned that you tried a base64-encoded image, I gave it a try to check. I took my Stack Overflow profile picture, which I downloaded locally. Then, using the following, I got the base64-encoded string:
Image img = Image.FromFile(@"\\Mac\Home\Downloads\Picto.jpg");
byte[] arr;
using (MemoryStream ms = new MemoryStream())
{
    img.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg);
    arr = ms.ToArray();
}
var content = Convert.ToBase64String(arr);
Later on, I called the API with no tags to ensure that the image is posted and visible:
POST https://southcentralus.api.cognitive.microsoft.com/customvision/v2.2/Training/projects/MY_PROJECT_ID/images/files HTTP/1.1
Host: southcentralus.api.cognitive.microsoft.com
Training-Key: MY_OWN_TRAINING_KEY
Content-Type: application/json
{
  "images": [
    {
      "name": "imageSentByApi",
      "contents": "/9j/4AAQSkZJRgA...TOO LONG FOR STACK OVERFLOW...",
      "tagIds": [],
      "regions": []
    }
  ],
  "tagIds": []
}
Response received: 200 OK
{
  "isBatchSuccessful": true,
  "images": [{
    "sourceUrl": "imageSentByApi",
    "status": "OK",
    "image": {
      "id": "GENERATED_ID_OF_IMAGE",
      "created": "2018-11-05T22:33:31.6513607",
      "width": 328,
      "height": 328,
      "resizedImageUri": "https://irisscuprodstore.blob.core.windows.net/...",
      "thumbnailUri": "https://irisscuprodstore.blob.core.windows.net/...",
      "originalImageUri": "https://irisscuprodstore.blob.core.windows.net/..."
    }
  }]
}
And my image is here in Custom Vision portal!
Debugging your code
In order to debug, you should first try to submit your content again with the tagIds and regions arrays empty, as in my test, and then provide the content of the API reply.

How can I open a tab without loading it in a Google Chrome extension?

The only thing I could think of was using chrome.tabs.discard, and below is my current code:
var test_button = document.getElementById('test_button');
test_button.onclick = function(element) {
  var to_load = { "url": "https://stackoverflow.com", "active": false, "selected": false };
  chrome.tabs.create(to_load, function(tab) {
    chrome.tabs.discard(tab.id);
  });
};
However, rather than preventing the page from loading, calling chrome.tabs.discard before it has loaded results in Chrome replacing it with about:blank.
The only "solution" I found was to wait for the tab to load, but waiting for it to load before unloading it defeats the purpose, especially if I'm opening a large amount of tabs at once.
Any help would be appreciated.
The solution is to call chrome.tabs.discard only after the tab's URL value has updated, for example:
var tabs_to_unload = {};

chrome.tabs.onUpdated.addListener(function(tabId, changeInfo, changedTab) {
  if (tabs_to_unload[tabId] == true) {
    // We can only discard the tab once its URL is updated,
    // otherwise it's replaced with about:blank
    if (changeInfo.url) {
      chrome.tabs.discard(tabId);
      delete tabs_to_unload[tabId];
    }
  }
});

var test_button = document.getElementById('test_button');
test_button.onclick = function(element) {
  var to_load = { "url": "https://stackoverflow.com", "active": false, "selected": false };
  chrome.tabs.create(to_load, function(tab) {
    tabs_to_unload[tab.id] = true;
  });
};
In my case, the exact code was a bit different, as I was performing these actions from within a popup, and the variables and listeners registered by its script only lived as long as the popup, but the principle behind it was the same.

Cache all images with onHeadersReceived

I'm trying to modify the response headers of images to save bandwidth and improve response time. These are my files:
manifest.json
{
  "name": "Cache all images",
  "version": "1.0",
  "description": "",
  "background": { "scripts": ["cacheImgs.js"] },
  "permissions": [ "<all_urls>", "webRequest", "webRequestBlocking" ],
  "icons": { "48": "48.png" },
  "manifest_version": 2
}
cacheImgs.js
var expDate = new Date(Date.now() + 1000 * 3600 * 24 * 365).toUTCString();
var newHeaders = [
  { name: "Access-Control-Allow-Origin", value: "*" },
  { name: "Cache-Control", value: "public, max-age=31536000" },
  { name: "Expires", value: expDate },
  { name: "Pragma", value: "cache" }
];

function handler(details) {
  var headers = details.responseHeaders;
  for (var i in headers) {
    if (headers[i].name.toLowerCase() == 'content-type' && headers[i].value.toLowerCase().match(/^image\//)) {
      // The inner loops use their own index variables so they don't
      // clobber the outer `i` (the original code shadowed it with var).
      for (var k in newHeaders) {
        var didSet = false;
        for (var j in headers) {
          if (headers[j].name.toLowerCase() == newHeaders[k].name.toLowerCase()) {
            headers[j].value = newHeaders[k].value;
            didSet = true; // was `did_set`, a typo that left didSet false
            break;
          }
        }
        if (!didSet) { headers.push(newHeaders[k]); }
      }
      break;
    }
  }
  console.log(headers);
  return { responseHeaders: headers };
}

var requestFilter = { urls: ['<all_urls>'], types: ['image'] };
var extraInfoSpec = ['blocking', 'responseHeaders'];
chrome.webRequest.onHeadersReceived.addListener(handler, requestFilter, extraInfoSpec);
The console.log fires many times and I can see the new headers. The problem is that when I open the Chrome developer tools on the page, in the Network tab, I see the same original headers on the images. Also note the 'blocking' value in extraInfoSpec, so this is supposed to be synchronous. Has anyone run into the same thing?
UPDATE
Now I see the modified response headers in the network panel.
But I only see them on images whose initiator is the web page itself. Images whose initiator is jquery.min.js don't get their response headers changed.
There are two relevant issues here.
First, the headers displayed in the developer tools are those that are received from the server. Modifications by extensions do not show up (http://crbug.com/258064).
Second (and this is actually more important!), modifying cache headers such as Cache-Control has no influence on Chromium's caching behavior, because the caching directives have already been processed by the time the webRequest API is notified of the headers.
See http://crbug.com/355232 - "Setting caching headers in webRequest.onHeadersReceived does not influence caching"
After doing some research, it turns out my previous answer was wrong. This is actually a Chrome bug - Chrome's DevTools Network panel will only show the actual headers received from the server. However, the headers you've injected will still have the desired effect.
Another extension developer identified the issue here and provided a link to the Chrome defect report.