Varnish ESI for lots of small bits of information

I've got a standard blog-type application with posts and users that can add those posts to their favorites.
Goals
When a user looks at a list of posts, they should see an indication (an image) of whether
each post is a favorite. Anonymous users don't have any favorites.
The list of posts needs to be cached in Varnish (for both anonymous and logged-in users) because it's expensive to calculate.
Ideas
Cache the list page in Varnish and use ESI to fetch the favorites information...
... for each post for the user making the current request. Downside: 50 ESI requests per page (basically the N+1 problem).
... as a JSON object which is then stored on the page. On the client, this object is read and the DOM is manipulated to indicate favorites information. Downside: doesn't work for users without JavaScript.
... as a CSS snippet which is stored in the page. The CSS determines what to display for each post. Downside: only works for stylable content (i.e., images). Not possible to display text information.
Am I missing any possibilities to accomplish what I want? Idea 3 seems to be the cleverest answer, but it wouldn't work if I also wanted to display the date the user favorited the post.

Idea 2 makes a lot of sense. It makes pages nicely cacheable, and only sacrifices the 'favorite functionality' for people without JavaScript in their browser.
Who are those people anyway? Still surfing with Lynx? ;). And would they accept cookies to make your login mechanism (required for personal favorites) work in the first place, or even log in at all?
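If you go with idea 2, one way to wire it up is to let the ESI fragment inject a small JSON blob into the cached page and mark the favorites with a few lines of JavaScript. A minimal sketch, assuming the backend renders something like <script id="favorites-data" type="application/json">{"12": true, "17": true}</script> inside the ESI include and that each post element carries a data-post-id attribute (both names are invented for illustration):
// Read the ESI-injected favorites JSON and flag each favorited post.
document.addEventListener('DOMContentLoaded', function () {
  var holder = document.getElementById('favorites-data');
  if (!holder) return; // anonymous users: the ESI fragment stays empty

  var favorites = JSON.parse(holder.textContent || '{}');

  var posts = document.querySelectorAll('[data-post-id]');
  Array.prototype.forEach.call(posts, function (post) {
    if (favorites[post.getAttribute('data-post-id')]) {
      post.classList.add('is-favorite'); // a CSS rule shows the favorite image
    }
  });
});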

Related

Chrome Extension - Detecting New Posts on a 4chan thread

I'm writing a Chrome extension which needs to be able to detect whenever a thread on 4chan has been updated (threads autoupdate). I've tried using a MutationObserver, but it is being set off too many times (4chan thread pages change often for a variety of reasons outside of new posts, including hovering over/expanding images, viewing post replies, opening the reply dialog, etc.)
I'm at work, so I can't visit 4chan to help you more specifically :^)
But you should find out the distinguishing class or attribute of each post, e.g. by using Chrome's Inspect, store the last post ID (i.e. the dubs, trips, quads-meter), and on each mutation event check whether the current last post ID matches the previously stored one, to determine whether there have been new posts.
There is probably a more efficient way to detect only posts, but this should do the job just fine, if you're already familiar with MutationObserver.
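For what it's worth, here is a rough sketch of that approach. The '.thread' and '.postContainer' selectors are guesses; inspect the real markup and adjust:
// Watch the thread container and compare the ID of the last post after
// each batch of mutations; hover previews etc. won't change that ID.
var thread = document.querySelector('.thread');

function currentLastId() {
  var posts = thread.querySelectorAll('.postContainer');
  return posts.length ? posts[posts.length - 1].id : null;
}

var lastSeenId = currentLastId();

var observer = new MutationObserver(function () {
  var latest = currentLastId();
  if (latest !== lastSeenId) {
    lastSeenId = latest;
    console.log('New post detected:', latest);
    // ...do whatever the extension needs to do here
  }
});

// childList only (no subtree) already filters out most of the noise
// from image expansion, hover previews and the reply dialog.
observer.observe(thread, { childList: true });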

How to provide command line arguments using HTTP under Node

Is it possible to pass command line arguments using HTTP under Node.js? This seems like a simple thing to do but I can not seem to find out how or if it is even possible. I am struggling a little with the async nature of node so may be missing something fundamental here!
Thanks,
Will
You have a few choices of how to pass state info from one script to another. One of the simplest and most portable has been around since the beginning - when you get the user data posted from page1, send it along in hidden form elements of page2. Then a post of page2 will have the user input on the new form elements and automatically include the hidden form element values as well. Of course you can use the data in the page1 post to otherwise determine what goes on page2. And so on to page3, etc.
The other common choice is cookies. You leave a cookie on the user's browser when they view page1 and then query the browser for it in your code for page2. This is totally portable in modern browsers, but the user can turn off cookies and then it won't work.
Another option is session variables in your node.js scripts. These are pretty easy to work with, but some servers use cookies behind the scenes and they could be off. You might want to read up on that one.
None of those three require JavaScript in the browser, which the Ajax option does. In that single-page mode you can keep all the state info you want in the JavaScript code because the page never gets reloaded. That gets a little tougher for a beginner, and there's also the possibility that JavaScript is off. If you are developing a rich, interactive app, you can expect your users to have JS enabled. But for a website with a few pages to sequence for casual visitors it may not always be on.
So, I'd suggest you try the hidden form elements to get started. Something like:
<input type="hidden" name="whatever" value="data-from-page1-post" />
If you put that onto a form in page2, it will come back in the post.
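A bare-bones sketch with Node's built-in http module (no framework assumed), just to show the round trip. The field name whatever matches the example above, and escaping of the echoed value is omitted for brevity:
var http = require('http');
var querystring = require('querystring');

http.createServer(function (req, res) {
  if (req.method === 'POST') {
    // "page2": read the fields posted from page1 and carry them forward
    var body = '';
    req.on('data', function (chunk) { body += chunk; });
    req.on('end', function () {
      var fields = querystring.parse(body);
      res.writeHead(200, { 'Content-Type': 'text/html' });
      res.end('<form method="post">' +
        '<input type="hidden" name="whatever" value="' + (fields.whatever || '') + '" />' +
        '<input type="submit" value="Next" /></form>');
    });
  } else {
    // "page1": the initial form
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end('<form method="post">' +
      '<input type="text" name="whatever" />' +
      '<input type="submit" value="Go" /></form>');
  }
}).listen(3000);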
Have fun...

Friendly URLs when using a Record ID for dynamic content

I've read a bit on the matter of friendly urls and I'm a little unsure as to what is better.
I currently have my website using a structure of http://www.domain.com/page.php?id=2
I am using the record ID to determine the content of the page. My record IDs are numeric and increment as new pages are added. The content of existing pages can change completely over time, but it still uses the same record ID (this is a CMS, so the client may do this).
The way I understand it I have two options for friendly urls:
http://www.domain.com/page/2
http://www.domain.com/some-text-describing-the-page
Now because I identify the content by the record id, I would assume the first option would make more sense.
My client seems to want option two.
After some reading I found two conflicting points.
Tim Berners-Lee (the architect of the WWW) states that you want a URI which will have the potential to remain the same 2 months, 2 years, 200 years from now. So you DO NOT want to use a page title or something similar for your pages. If your page content changes, you are either forced to leave the URI alone (so it no longer describes the content), or change the URI and be stuck with dangling links. You can read his article here (http://www.w3.org/Provider/Style/URI)
However, a number of other people on the internet (with no known authority, as far as I can tell) clearly state that you need a descriptive yet short URI for the best SEO value. From what I read, this is mostly for the purpose of backlinks and having keywords in the anchor text, since people often just use the link itself as the anchor text. So having keywords in the link itself helps search engines know what the link is about without a custom title.
It seems to me the difference has to do with long term VS short term.
Am I grasping this correctly?
If I am to use a slug-style URI as defined by the user, do I just have to let my user type whatever they want into a field and check it against the current database to see if it exists? If so, am I supposed to anticipate static links by running a query for the known record ID and then using the result to generate the URL, which would just be rewritten back to the format http://www.domain.com/page.php?id=2?
It seems to me that would be a lot of extra overhead.
I would suggest something in the middle of those two:
http://www.domain.com/page/2/some-text-describing-the-page
or without page:
http://www.domain.com/2/some-text-describing-the-page
You can still get the page ID from the URL, and there is a title as well! And, what's even more important, you're still able to get the correct content even when the page title changes later.
So think about a situation like this: a user creates a page, it receives ID=4 and its title is My great title. From that information the URL is generated, e.g. http://www.domain.com/page/4/my-great-title. After 2 months the user changes the title to This title is better than the last one!. The URL changes as well, to http://www.domain.com/page/4/this-title-is-better-than-the-last-one. However, there is still 4 within the URL, so you're able to show the right content! You can also check whether the rest of the URL is current, and redirect (301 would be the best) to the new one to let search engines know that the URL changed.
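In code, the routing could look roughly like this (Express is assumed here, and findPageById is a hypothetical lookup that returns { id, slug, content } or null):
var express = require('express');
var app = express();

app.get('/page/:id/:slug?', function (req, res) {
  var page = findPageById(req.params.id); // only the numeric id matters
  if (!page) return res.status(404).send('Not found');

  // Stale or missing slug? 301 to the canonical URL so search engines
  // learn about the new title.
  if (req.params.slug !== page.slug) {
    return res.redirect(301, '/page/' + page.id + '/' + page.slug);
  }

  res.send(page.content);
});

app.listen(3000);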

TYPO3: How to count page impressions on every page with an extension

I need to count the page impressions of every page on a TYPO3 site into the db.
So I think I need an extension which is called on every page impression and increases an 'impressions' column in the db for the specific page.
I'm new to typo3 and new to extension development as well. Is there a way to include an extbase-extension on every page so some php-script get called?
(Update)
I want to add more information:
I don't need a counter which counts all PIs. The counter needs to be page-related, so it makes sense to extend the pages table in TYPO3. Another requirement is that the extension should be done with Extbase.
I'm new to typo3 and new to extension development as well. Is there a way to include an extbase-extension on every page so some php-script get called?
Once your plugin is configured you can include it with page.1234 < plugin.tx_yourextension_pi1 on any page. 1234 determines the position on your page.
The script should be USER_INT, so it's not being cached (mind you, this will cost loads of performance as previously stated by #norwebian)
As you don't want to output anything, make sure the controller stays empty as well.
Did you do a quick search in the extension repository? Trying a search for "page counter" reveals four relevant extensions.
"Sys_stat" is the closest thing to an "official" solution, it is really just enabling a few settings already existent. It has been reported to fill up the database with too much data, though.
"Generic Visitor Counter" would be my favourite, I believe (if I was going for a page counter at all), it is recently updated and seems simple enough.
You should really consider a proper stats extension, though. Both ics_awstats and ke_stats have been in my toolset.
YMMV. Be aware that if your site is popular, stats gathering quickly gets out of hand. On the other hand, if you go for a simple counter, including uncached extensions will cost performance.
I am not sure if I really understood what you want and need. After all, page impressions are not the same as page views, though I couldn't tell you the on-page difference right now. So am I right in assuming that you mean page views?
If yes: I would take the following approach:
A separate, autonomous extension with a JavaScript for asynchronous calling of an API and a table for storing page views / page impressions.
Each page globally binds a JavaScript that initializes itself.
Once the DOM is ready, it sends a call to an AJAX API endpoint with the URL of the page as a parameter.
The endpoint takes only the URL.
For each unique URL, a record including a counter is created or updated.
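The client-side half could be as small as this (the /api/pageview endpoint name is made up; the server side would create or increment a row keyed by the URL):
// Once the DOM is ready, report the current URL to the counting endpoint.
document.addEventListener('DOMContentLoaded', function () {
  fetch('/api/pageview', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url: window.location.pathname }),
    keepalive: true // fire-and-forget, don't delay the page
  }).catch(function () {
    // counting is best-effort; ignore failures
  });
});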
Extending the table for the pages doesn't make sense to me. What are you doing with a website that consists of news overviews, news details, press and blog sections, a dealer search and a store with product pages?
I would keep the statistics table standalone.
If you expand the table a bit and add date and time - no simple increment of hits - you can even identify the hottest pages of the week, the month, etc.
--
My approach won't increase/delay page load time much, if at all, and will have little noticeable impact even on heavily requested websites.
With the AJAX endpoint, it's then up to you how you deploy it and how much of the CMS framework you want to load.

How do I hide Drupal nodes that shouldn't be directly accessed from users and search engines?

I have seen many somewhat similar questions, but nothing quite what I'm looking for. So at the risk of being told this is a duplicate... here it goes.
I've found that there are times I have a node that simply contains content that will be displayed somewhere else, but shouldn't be viewed directly. That is, no one should ever go to node/1234, but the content in node 1234 should be displayed somewhere else.
For example, I create an about page with tabbed content using views. So there are "About Me", "About Us" and "About Them" pages. All of these are displayed in a single page with tabs using Views. So I don't want people to get directly to the "About Us" node because then they wouldn't see the tabs for the other pages. At the same time, I don't want Google giving people a direct link to this node, I want to limit access so users can only get to it through the View (i.e., the tab).
So I need to restrict access to the node, remove it from the Drupal search results, and make sure Google doesn't pick up on it. Any suggestions?
---- Note ----
I've accepted the answer from mingos (thanks btw) because even though it's not a full answer / solution, it gave me some good things to think about. Additional answers are still welcome.
In Drupal 7 you can use: http://drupal.org/project/internal_nodes
Description: Some content/nodes should never be viewed directly; they should only be visible through something else such as Views or Panels. This module denies access to node/[nid] URLs while allowing the content to stay published and otherwise viewable.
Full disclosure: I am the creator and co-maintainer of Internal Nodes. I found this question while searching to see how the module could be found on Google.
Tough one.
If you want to have many nodes like this and do the "displaying elsewhere" dynamically, I can't think of anything right now (at 2:20 AM I rarely can).
If there is one such page (or very few), I'd restrict access to it by any available means (Permissions, Nodeaccess, Content Access, TAC, whatever) and then create special themes for the pages where the restricted content should be displayed. The themes would contain database queries, fetching content from the restricted nodes.
Another possibility might be to create a special theme for the hidden nodes in question (perhaps all belonging to the same content type?). Make the full node display nothing (or a message saying access is restricted) and add a ROBOTS meta tag asking Google not to index the page. Make the teaser view available though - you can display it freely inside a view, but since /node/1234 is the FULL view, the actual content will be unavailable there.
Dunno if this solves your problem, hope it helps at least a bit.
I found this page after running into this same problem.
What I found worked for me might be part of the answer you need:
Take a look at the Page Manager Redirect Module http://drupal.org/project/page_manager_redirect . I just started playing with it.
It uses the Page Manager module of CTools to redirect one page to another. What makes this most powerful is that Page Manager uses Contexts. So, if you want to redirect all pages of a particular content type, you can do so.
I just started to use it (instead of Taxonomy Redirect and Path Redirect) to redirect (301 response code) my taxonomy terms for a particular vocabulary to particular nodes.
In your instance, you should be able to use contexts to filter for specific pages.
Of course this doesn't solve the problem of these nodes coming up in search results.
There is also another module, Rabbit Hole, which has functionality similar to Internal Nodes but works for all entities, not only nodes.
I am having the same problem, and am currently thinking of the following solution, where all the content of a node is to be displayed to certain users (permission-based):
- unpublish node
- create a new published checkbox
- create a view with fields that shows all the content
Haven't tested it thoroughly yet, but it seems to work.
The node is to be displayed to the creator (only one in permission 1), some of it to permission 2 and all of it to permission 3.
Any comments on this solution?
I assume this will also exclude it from search, but permissions 2 and 3 need to be able to search it. Still haven't figured that one out.
I used Rules module with an "entity is of bundle" and the built-in "Page redirect" action.
There is a really easy way to do this if you only want to show a content type through a view.
create a content type and make its content unpublished.
create a view and in the filter options set the filter to "Content: Published (No)"
The view will give anonymous users access to the content through the view, but they won't have access to the unpublished content at its direct link.
