How can I invalidate Google Cloud CDN cache from my express server? - node.js

Is there a way to invalidate / clear cached content on Cloud CDN from my express server?
For example, suppose I'm generating server-rendered content to make it readily available, and I update a specific route on my website, such as by editing a blog post. I need to do the following:
export const editBlogPostHandler = (req, res, next) => {
  // 1. UPDATE BLOGPOST WITH SLUG some-blogpost-slug ON DB
  // 2. INVALIDATE /some-blogpost-slug ROUTE ON CLOUD CDN CACHE
  //    THIS IS NECESSARY FOR NEW REQUESTS TO GET FRESH DATA RATHER THAN A STALE DATA RESPONSE
};
How can I do that from my express server?
From Cloud CDN - Invalidating Cached Content:
You can invalidate cached content from Cloud CDN through these methods:
Using the console:
Using gcloud SDK:

There is an API endpoint for that:
https://cloud.google.com/compute/docs/reference/rest/v1/urlMaps/invalidateCache
POST https://compute.googleapis.com/compute/v1/projects/{project}/global/urlMaps/{resourceId}/invalidateCache

As a complement to Alexandre's accepted answer, here are more details on how to use this endpoint:
POST https://compute.googleapis.com/compute/v1/projects/{project}/global/urlMaps/{resourceId}/invalidateCache
To get the resourceId, you can call the urlMaps list endpoint, which returns the urlMaps resources and their associated ids:
GET https://compute.googleapis.com/compute/v1/projects/{project}/global/urlMaps
Once you've got the resourceId, you also need to specify the path of the file/folder that you wish to invalidate in the request body (wildcard paths also work):
{ "path": "/folder/file.mp4" }
In the response body, you will find the id of the compute operation. If you want to check the progress of this operation, you can query it using the Compute Operation Global Get method.
In addition, to avoid running the same request several times, it is advised to pass a unique requestId parameter in the form of a UUID (as specified in RFC 4122).
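To tie this back to the Express handler in the question, here is a minimal sketch of the call. It assumes Node 18+ (for the global fetch) and that you already have an OAuth 2.0 access token for a service account allowed to invalidate the cache; how you obtain that token is not shown. The helper names (buildInvalidationRequest, invalidateCdnPath) and the parameter names are made up for illustration.

```javascript
const { randomUUID } = require('node:crypto');

// Builds the URL and fetch options for urlMaps.invalidateCache.
// `project`, `urlMap`, and `accessToken` are assumptions supplied by the caller.
function buildInvalidationRequest({ project, urlMap, path, accessToken }) {
  // A fresh UUID as requestId lets the API ignore accidental retries of the same request.
  const requestId = randomUUID();
  const url =
    'https://compute.googleapis.com/compute/v1/projects/' +
    `${encodeURIComponent(project)}/global/urlMaps/${encodeURIComponent(urlMap)}` +
    `/invalidateCache?requestId=${requestId}`;
  return {
    url,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${accessToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ path }),
    },
  };
}

// Sends the request; resolves with the compute operation described in the answer.
async function invalidateCdnPath(params, fetchImpl = globalThis.fetch) {
  const { url, options } = buildInvalidationRequest(params);
  const res = await fetchImpl(url, options);
  if (!res.ok) {
    throw new Error(`Cache invalidation failed: HTTP ${res.status}`);
  }
  return res.json();
}
```

In editBlogPostHandler you would call something like await invalidateCdnPath({ project, urlMap, path: '/some-blogpost-slug', accessToken }) after the database update succeeds.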

Related

How can I configure an Azure Function to work with dynamic configuration using Azure App Configuration?

I'm trying to write a multi-tenant Azure Function. When a clientId comes from a client as a parameter, I need to fetch that client's configuration from Azure App Configuration using labels.
Let me explain in pseudocode.
I'm using a .NET 6 isolated Azure Function.
I set up the configuration in Program.cs as in the code below:
new HostBuilder()
    .ConfigureAppConfiguration(builder =>
    {
        string cs = "connection-string"; // actually comes from an environment variable
        builder.AddAzureAppConfiguration(options =>
        {
            options.Connect(cs);
            // options.Select(KeyFilter.Any, Environment.GetEnvironmentVariable("ClientId"));
            // The line above works, but I don't want to do it that way: I need to read the clientId when I receive the request.
        });
    })
Let's assume that we made a middleware and a request came in:
public RequestMiddleware()
{
    // Here I need to inject some service
}

// Invoke method of the middleware
public async Task Invoke(FunctionContext context, FunctionExecutionDelegate next)
{
    HttpRequestData requestData = await context.GetHttpRequestDataAsync();
    requestData.Headers.TryGetValues("client-id", out var clientId);
    var id = clientId.First();
    // Here I need to configure App Configuration to use the id as a label, as I did in Program.cs,
    // like this: options.Select(KeyFilter.Any, id);
}
Is it possible to load configuration by label during the request lifecycle? Please keep in mind that Azure App Configuration in the standard tier has a request limit, 3000 per hour, so it won't be a good idea to load the configuration on every request.
What you can do is use a static Dictionary as a cache: whenever a request with a new client id arrives, you fetch the configuration and add it to the dictionary for future requests.
This document discusses strategies you should think about when using App Configuration in multi-tenant applications.
https://learn.microsoft.com/en-us/azure/architecture/guide/multitenant/service/app-configuration
From your description, it looks like you want to load tenant-specific config on demand instead of all at once at app startup, so you shouldn't need the code in Program.cs. I would recommend using prefixes instead of labels to differentiate the config data for each tenant.
Check out the application-side caching section to learn more about what options you have. I imagine that in the middleware, upon every request, you will first check whether you have a cached IConfiguration instance for the tenant; if not, you will load the config for this tenant and cache it; if yes, you serve the request based on the config in your cache.
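The check-then-load-then-cache flow described above can be sketched in a few lines. This is written in JavaScript only to keep it short (the question itself is C#), and loadConfig is a hypothetical function standing in for whatever actually queries App Configuration for one tenant. The cache stores the pending promise, so concurrent requests for the same new tenant trigger only one load.

```javascript
// Returns a getConfig(tenantId) function backed by an in-memory cache.
// `loadConfig` is a hypothetical async loader supplied by the caller.
function createTenantConfigCache(loadConfig) {
  const cache = new Map(); // tenantId -> Promise resolving to that tenant's config

  return function getConfig(tenantId) {
    if (!cache.has(tenantId)) {
      const pending = Promise.resolve(loadConfig(tenantId)).catch((err) => {
        // Do not keep failed loads in the cache, so a later request can retry.
        cache.delete(tenantId);
        throw err;
      });
      cache.set(tenantId, pending);
    }
    return cache.get(tenantId);
  };
}
```

Note that this sketch never expires entries; if tenant configuration can change, you would also need some refresh or expiry policy, which is not shown here.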

Setting up Swagger UI in Firebase Functions Server

I've developed an API on Firebase Cloud Functions and I want to include a docs path to it.
I'm using swagger and I could successfully test it locally (localhost:PORT/docs), but when I deploy the function to Firebase it doesn't work: it redirects me to an authorization page.
I think I figured out why this is:
Let's say the name of my Cloud function is cfunc. Then the base url for it is something like https://region-name-project-name.cloudfunctions.net/cfunc. Based on how I included the swagger documentation:
const swaggerDoc = require('./docs/swagger.config.json')

app.use(
  '/docs',
  allowCors,
  swaggerUi.serve,
  swaggerUi.setup(swaggerDoc, {
    customCssUrl: '/assets/swagger.css',
    customSiteTitle: 'My Function Title',
    customfavIcon: '/assets/logo.ico',
    swaggerOptions: {
      supportedSubmitMethods: [] // to disable the "Try it out" button
    }
  })
)
the docs should be located at https://region-name-project-name.cloudfunctions.net/cfunc/docs. When I access that URL while watching the "Network" tab in my browser DevTools, it attempts a GET at that URL with response 304 and then redirects to https://region-name-project-name.cloudfunctions.net/docs. That redirect is what brings up the Google Authentication page: there's no Cloud Function named "docs", so Google thinks I'm trying to access something else in Firebase Cloud Functions (the same thing happens if I request something like https://region-name-project-name.cloudfunctions.net/tomato).
But I still don't know how to fix this redirect or why it's happening. I tried adding the Cloud Function URL to the host parameter of the swagger.config.json file, and some modifications to CORS, like allowing more Request Methods, adding json as content type, allowing authentication on headers, but nothing seems to be working.
Hope I was clear enough; if not, tell me any other info you need (it's one of my first posts here :B)
Found the SOLUTION
After testing a BUNCH of different things, I found out that the redirect always removes one segment of the path. For example, I changed the docs endpoint to '/something/docs', and when accessing the corresponding URL, https://region-name-project-name.cloudfunctions.net/cfunc/something/docs, it redirected to https://region-name-project-name.cloudfunctions.net/cfunc/docs. That did not bring up the Google Authentication page, but it was no longer a valid path for my docs, so it returned 'Cannot GET /cfunc/docs'.
For some reason this redirection DOES NOT happen if you add an extra forward slash ('/') at the end of the documentation URL. So, in the first case, where the endpoint for the documentation is only '/docs', accessing the URL https://region-name-project-name.cloudfunctions.net/cfunc/docs/ does it. I do not know why that is, I'm probably posting an Issue on the swagger repo, but if someone has some extra data on why or how to make it work otherwise it would be awesome to hear.
Hope this helps someone else!
EDIT:
Oh, and another thing I forgot: it's apparently better to set up swagger-ui as if you were using an express Router, even if you are not (maybe Firebase loads the Cloud Function with something like a router). So instead of app.use('/docs', swaggerUi.serve, swaggerUi.setup(swaggerDoc)), do app.use('/docs', swaggerUi.serve) and then app.get('/docs', swaggerUi.setup(swaggerDoc)).
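If you would rather not rely on users typing the trailing slash, one possible workaround is a tiny middleware, registered before the swagger routes, that redirects the bare docs path to the full trailing-slash URL. This is only a sketch under an assumption I have not verified against Firebase: that inside the function req.url does not include the function name (so a request to /cfunc/docs arrives as /docs). The helper name and both parameters are hypothetical.

```javascript
// Express-style middleware factory. `docsPath` is the route inside the app
// (e.g. '/docs') and `functionPrefix` is the Cloud Function name segment
// (e.g. '/cfunc') that the platform is assumed to strip from req.url.
function redirectToTrailingSlash(docsPath, functionPrefix) {
  return function (req, res, next) {
    const [pathname, query] = req.url.split('?');
    if (pathname === docsPath) {
      // Redirect to an absolute path that keeps the function name and adds the slash.
      const target = `${functionPrefix}${docsPath}/` + (query ? `?${query}` : '');
      res.writeHead(302, { Location: target });
      res.end();
      return;
    }
    next();
  };
}
```

Usage would look like app.use(redirectToTrailingSlash('/docs', '/cfunc')) placed above the app.use('/docs', ...) line.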

How to make a private call when using SSR Nuxt?

I am writing a headless solution for a WordPress website and noticed that for one particular endpoint, I need to authenticate to pull some data that will be used publicly. But, I'm concerned that where I'm using it will expose it to the web.
In my store/index.js I use the nuxtServerInit action method to execute some actions and I pass them some objects they need to fulfill their tasks:
async nuxtServerInit ({ dispatch }, { $axios, app }) {
await dispatch('initialize', { $axios, app })
},
$axios is passed because it will be used to query the API, and app is passed to help build the options to authenticate the request.
Is this a security vulnerability in Nuxt SSR? I think it is. If so, where are the only valid areas you can use secrets? asyncData ()?
If you're using SSR, you can use the privateRuntimeConfig runtime object and pass your secret in the nuxt.config.js file
export default {
privateRuntimeConfig: {
apiSecret: process.env.API_SECRET
}
}
If you read the documentation of nuxtServerInit, you can see that
Vuex action that is called only on server-side to pre-populate the store
Since this method is server-side only, you can use apiSecret (in my example) and it should be totally fine security-wise.
PS: Keep in mind that everything beyond what runs on the server (i.e. in Node.js, such as nuxtServerInit) is "public". Your Vue.js client-side lifecycle hooks such as mounted(), fetch(), and asyncData() are public, because their code will be visible in your browser's devtools.
Also, is your endpoint really that critical? If so, nuxtServerInit is a good way to go. If you need to fetch some more data in a "private way", you'll need to proxy it through some backend to hide the sensitive info and retrieve only the useful public data.
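To make the privateRuntimeConfig suggestion concrete, here is a rough sketch of the store actions. It assumes Nuxt 2.13+ (where the context exposes $config) and the @nuxtjs/axios module; the endpoint path, the mutation name, and the title field are invented for illustration. In a real store/index.js the object would be exported as actions. Since the Vuex state is serialized into the page sent to the browser, the sketch commits only public data and never the secret itself.

```javascript
const actions = {
  // Runs on the server only, so reading the private runtime config here is safe.
  async nuxtServerInit({ dispatch }, { $axios, $config }) {
    await dispatch('initialize', { $axios, apiSecret: $config.apiSecret });
  },

  async initialize({ commit }, { $axios, apiSecret }) {
    // Hypothetical authenticated WordPress endpoint.
    const data = await $axios.$get('/wp-json/private/v1/settings', {
      headers: { Authorization: `Bearer ${apiSecret}` },
    });
    // Commit only fields that are meant to be public; do not store apiSecret in state.
    commit('setPublicSettings', { title: data.title });
  },
};
```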

Downsides of an API which neglects http method and path

I'm wondering what the downsides would be for a production server whose api is totally ignorant of the HTTP request path. For example, an api which is fully determined by query parameters, or even fully determined by the http body.
let server = require('http').createServer(async (req, res) => {
  let { headers, method, path, query, body } = await parseRequest(req);
  // `headers` is an Object representing headers
  // `method` is 'get', 'post', etc.
  // `path` could look like /api/v2/people
  // `query` could look like { filter: 'age>17&age<35', page: 7 }
  // `body` could be some (potentially large) http body

  // MOST apis would use all these values to determine a response...
  // let response = determineResponse(headers, method, path, query, body);

  // But THIS api completely ignores everything except for `query` and `body`
  let response = determineResponse(query, body);
  doSendResponse(res, response); // Sets response headers, etc, sends response
});
The above server's API is quite strange. It completely ignores the path, method, and headers. Most APIs primarily consider method and path, and look like this:
method  path              description
GET     /api              Metadata about api
GET     /api/v1           Old version of api
GET     /api/v2           Current api
GET     /api/v2/people    Make "people" db queries
POST    /api/v2/people    Insert a new person into db
GET     /api/v2/vehicles  Make "vehicle" db queries
POST    /api/v2/vehicles  Insert a new vehicle into db
...
This API only considers url query, and looks very different:
url query                                description
<empty>                                  Metadata about api
apiVersion=1                             Old version of api
apiVersion=2                             Current api
apiVersion=2&table=people&action=query   Make "people" db queries
apiVersion=2&table=people&action=insert  Add new people to db
...
Implementing this kind of api, and ensuring clients use the correct api schema is not necessarily an issue. I am instead wondering about what other issues could arise for my app, due to writing an api with this kind of schema.
Would this be detrimental for SEO?
Would this be detrimental to performance? (caching?)
Are there additional issues that occur when an api is ignorant of method and url path?
That's indeed very unusual, but it's basically how an RPC web API would work.
There would not be any SEO issue as far as I know.
Performance/caching should be the same, as the full "path" is composed of the same parameters in the end.
It however would be complicated to use with anything that doesn't expect it (express router, fancy http clients, etc.).
The only fundamental difference I see is how browsers treat POST requests as special (e.g. won't ever be created just with a link), and your API would expose deletion/creation of data just with a link. That's more or less dangerous depending on your scenario.
My advice would be: don't do that, stick to standards unless you have a very good reason not to.
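If you do go the query-only route anyway, one way to limit the "delete with a link" risk mentioned above is to still look at the method for actions that change data. The sketch below is only an illustration: the action names come from the question's table plus a couple of guesses (update, delete), and the function name is hypothetical.

```javascript
// Actions that modify data and therefore should not be reachable by a plain link (GET).
const MUTATING_ACTIONS = new Set(['insert', 'update', 'delete']);

// Decides whether an RPC-style call described by the query may proceed.
function checkRpcCall(method, query) {
  const action = query.action || 'metadata';
  if (MUTATING_ACTIONS.has(action) && method.toUpperCase() !== 'POST') {
    return { ok: false, status: 405, reason: `"${action}" requires POST` };
  }
  return { ok: true, action };
}
```

For example, checkRpcCall('get', { table: 'people', action: 'insert' }) is rejected with status 405, while the same query sent with 'post' is allowed.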

Cloudfront cache with GraphQL?

At my company we're using GraphQL for production apps, but only for private resources.
For now our public APIs are REST APIs with a CloudFront service for caching. We want to convert them to GraphQL APIs, but the question is: how do we handle caching properly with GraphQL?
We thought of using a GET GraphQL endpoint and caching on the query string, but we are a bit afraid of the size of the requested URL (we support IE9+ and sell to schools that sometimes have really dumb proxies and firewalls).
So we would like to use a POST GraphQL endpoint, but CloudFront cannot cache a request based on its body.
Does anyone have an idea or best practice to share?
Thanks
The two best options today are:
Use a specialized caching solution, like FastQL.io
Use persisted queries with GET, where some queries are saved on your server and accessed by name via GET
*Full disclosure: I started FastQL after running into these issues without a good solution.
I am not sure if it has a specific name, but I've seen a pattern in the wild where the GraphQL queries themselves are hosted on the backend, each with a specific id.
It's much less flexible, as it requires pre-defined queries to be baked in.
The client would just send arguments/params and ID of said pre-defined query to use and that would be your cache key. Similar to how HTTP caching would work with an authenticated request to /my-profile with CloudFront serving different responses based on auth token in headers.
How the client sends it depends on your backends implementation of graphQL.
You could either pass it as a white listed header or query string.
So if the backend has defined a query that looks like
(Using pseudocode)
const MyQuery = gql`
  query HeroNameAndFriends($episode: Int) {
    hero(episode: $episode) {
      name
      friends {
        name
      }
    }
  }
`
Then your request would be to something like api.app.com/graphQL/MyQuery?episode=3.
That being said, have you actually measured that your queries wouldn't fit in a GET request? I'd say go with GET requests if CDN Caching is what you need and use the approach mentioned above for the requests that don't fit the limits.
Edit: Seems it has a name: Automatic Persisted Queries. https://www.apollographql.com/docs/apollo-server/performance/apq/
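Here is a rough sketch of the server side of that pattern: a table of pre-defined queries keyed by name, and a function that turns a GET URL such as /graphQL/MyQuery?episode=3 into the query text plus typed variables. The table layout and the function name are my own invention, and executing the query against a schema is left out.

```javascript
// Pre-defined queries, keyed by the name used in the URL.
// Each entry also knows how to turn query-string values into typed variables.
const PERSISTED_QUERIES = new Map([
  ['MyQuery', {
    query: `query HeroNameAndFriends($episode: Int) {
      hero(episode: $episode) { name friends { name } }
    }`,
    parseVariables: (params) =>
      params.has('episode') ? { episode: Number(params.get('episode')) } : {},
  }],
]);

// Resolves a GET URL to { query, variables }, or null for unknown query names.
function resolvePersistedQuery(requestUrl) {
  // The base is only needed to parse relative URLs; its value is irrelevant here.
  const url = new URL(requestUrl, 'https://api.app.com');
  const name = url.pathname.split('/').filter(Boolean).pop();
  const entry = PERSISTED_QUERIES.get(name);
  if (!entry) return null;
  return { query: entry.query, variables: entry.parseVariables(url.searchParams) };
}
```

Because the whole request is now described by the path and query string, the CDN can use the URL as the cache key.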
Another alternative that lets you keep POST requests is to use Lambda@Edge on your CloudFront distribution, with DynamoDB tables storing your cached responses, similar to how Cloudflare Workers do it:
async function handleRequest(event) {
  let cache = caches.default
  let response = await cache.match(event.request)
  if (!response) {
    response = await fetch(event.request)
    if (response.ok) {
      event.waitUntil(cache.put(event.request, response.clone()))
    }
  }
  return response
}
Some reading material on that
https://aws.amazon.com/blogs/networking-and-content-delivery/lambdaedge-design-best-practices/
https://aws.amazon.com/blogs/networking-and-content-delivery/leveraging-external-data-in-lambdaedge/
An option I've explored on paper but not yet implemented is to use Lambda@Edge in request trigger mode to transform a client POST into a GET, which can then result in a cache hit.
This way clients can still use POST to send GQL requests, and you're working with a small number of controlled services within AWS when trying to work out the max URL length for the converted GET request (and these limits are generally quite high).
There will still be a length limit, but once you have 16kB+ GQL requests, it's probably time to take the other suggestion of using predefined queries on server and just reference them by name.
It does have the disadvantage that request trigger Lambdas run on every request, even a cache hit, so will generate some cost, although the lambda itself should be very fast/simple.
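The URL-building part of that idea can be sketched independently of Lambda@Edge. The function below takes the JSON body of a GraphQL POST and produces the equivalent GET URL, sorting variable keys so that logically identical requests map to the same cache key. The function names and the 8192-character limit are placeholders I picked, not values from AWS; how (or whether) this can be wired into a request trigger is not shown.

```javascript
const MAX_URL_LENGTH = 8192; // placeholder limit; measure the real one for your setup

// Recursively sorts object keys so JSON.stringify output is stable.
function sortKeys(value) {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.keys(value).sort().map((key) => [key, sortKeys(value[key])])
    );
  }
  return value;
}

// Converts a GraphQL POST body (JSON string) into a GET URL, or returns null
// when the result is too long and the request should stay a POST.
function graphqlPostToGetUrl(endpoint, postBody) {
  const { query, variables = {}, operationName } = JSON.parse(postBody);
  const params = new URLSearchParams();
  params.set('query', query);
  if (operationName) params.set('operationName', operationName);
  params.set('variables', JSON.stringify(sortKeys(variables)));
  const url = `${endpoint}?${params.toString()}`;
  return url.length <= MAX_URL_LENGTH ? url : null;
}
```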

Resources