How to filter user input that edits the html/css of a website (like in Tumblr)? [closed] - security

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
Tumblr allows users to edit the HTML and CSS of their blogs through a templating system. In fact, it even allows users to include their own external scripts in their pages. Doing this obviously opens a lot of security holes for Tumblr; yet it seems to be doing fine. What's even more interesting is how Tumblr's team managed to do all this on a LAMP stack.
One thing that I found out through pinging different Tumblr sites is that the blog sites are dispersed through multiple servers.
Nonetheless, hacking the root of just one of those servers could compromise a ton of data. Furthermore, while some could argue that Tumblr may just be doing manual checks on each of its blogging sites, it still seems pretty risky and impractical for Tumblr's team to do so because of the amount of data Tumblr has. Because of this, I think there are still some aspects that manual checking hasn't covered yet, especially in terms of how Tumblr's team filters user input before it enters their database.
My main question: How does Tumblr (or any other similar site) filter its user input and thereby, prevent hacking and exploits from happening?

What is Tumblr
Tumblr is a microblogging service that lets its users post multimedia and short text blogs on its website.
Formatting and styling a blog
Every blogging service lets its users edit and share content. At the same time, they also let their users style their blogs depending on the type of service they provide.
For instance, a company blog would rarely have a garden image as its background, and a shopkeeper would rarely show a beach image, unless they are present at that place or include such objects in their work.
What Tumblr does
Well, they just keep checking the files for any errors!
As a general blogging platform, it is necessary to allow users to upload files and style their blogs. At the same time, it is the company's job to keep control of how its service is used!
So Tumblr keeps a close eye on these things. They also do not allow uploads of files that could infect the system, and are well known to delete accounts if anything fishy is caught!
Tumblr allows users to upload files and multimedia that are used to style their blogs. They use a separate platform to store all such files! So when you upload a file, it does not get executed on their system. They fetch it from the server or hard drive where these files are stored and then serve you the blog that includes those files.
What would I do
I would do the same: first upload and save the files in a separate place where, if executed, they do not harm my system even if they are infected by a virus. Not all users upload viruses, but once one does, you should use an antivirus system to detect and remove the virus and, at the same time, block that account.
I would let users use my service; it is the user's job to upload content, and it is my job to prevent hacking.

All this stuff (HTML/CSS/external scripts) does not run on Tumblr's machines, so to them it does not matter; you are responsible for what runs on your own PC. As for JavaScript, it lives in a sandbox.
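The usual server-side complement to that sandboxing is whitelist filtering of user markup before it is stored: keep a small set of known-safe tags, drop all attributes, and escape everything else. A minimal sketch in Python using only the standard library (the tag whitelist here is an illustrative assumption, not what Tumblr actually allows):

```python
from html.parser import HTMLParser
import html

# Illustrative whitelist; a real service would allow far more.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br"}

class Sanitizer(HTMLParser):
    """Keep only whitelisted tags; escape all text; drop all attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attributes (onclick, style, ...) are dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(html.escape(data))  # neutralize metacharacters in text

def sanitize(markup: str) -> str:
    p = Sanitizer()
    p.feed(markup)
    p.close()
    return "".join(p.out)

print(sanitize("<b>hi</b><script>alert(1)</script>"))  # <b>hi</b>alert(1)
```

A whitelist is preferred over a blacklist because anything you forgot to forbid fails safe instead of failing open.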


Are there methods to trace text files that were web scraped, finding the user involved, even after the text has been machine translated?

Information
I’m a bit new and want to create a site with these capabilities but don’t know where to start, please point out if I violated any rules or should write differently.
I'll be a bit specific here: there is a web novel site where content is hidden behind a subscription.
So you would need to be logged into an account.
The novels are viewed through the site's viewer, which cannot be selected/highlighted and then copied.
Question
If you web scrape and download the chapters as .txt files and then machine translate them using something like Google Translate, is there a way to track the uploader of the MTL or when the MTL file is shared?
Of a similar nature, there are aggregator sites that host non-MTL'd novels, but the translation teams have hidden lines telling readers to go to their official site. The lines aren't on the official site, though; they appear only when the text has been copied. How is that possible?
I've read a little about JWT, and I'm assuming they can find the user when they're on the website, but what about in the text itself?
Additional Question (if above is possible, don’t have to answer this just curious)
If it is possible to embed some identifying token, is there a way to break it, perhaps by converting the file into an encrypted EPUB?
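One plausible mechanism for the hidden-lines trick (I can't confirm which technique any specific site uses) is watermarking with zero-width Unicode characters: invisible when rendered, but they survive copy-and-paste into a .txt file. A toy sketch in Python, with a made-up per-user ID scheme:

```python
# Sketch: embed a user ID as invisible zero-width characters in text.
# ZWSP encodes bit 0, ZWNJ encodes bit 1 (an arbitrary scheme for illustration).
ZERO = "\u200b"  # zero-width space
ONE = "\u200c"   # zero-width non-joiner

def embed_watermark(text: str, user_id: int) -> str:
    """Append user_id, encoded as 16 invisible bits, to the text."""
    bits = format(user_id, "016b")
    mark = "".join(ONE if b == "1" else ZERO for b in bits)
    return text + mark

def extract_watermark(text: str):
    """Recover the user ID if an invisible watermark is present, else None."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    return int(bits, 2) if bits else None

marked = embed_watermark("Chapter 1: The journey begins.", 4242)
print(marked == "Chapter 1: The journey begins.")  # False: invisible chars were added
print(extract_watermark(marked))  # 4242
```

Note that machine translation will usually strip or scramble such characters, which is one reason an MTL'd copy is harder to trace than a verbatim one, and why re-encoding into another format (your EPUB idea) can also destroy the mark.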

Htaccess and url for multilingual site [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I'm ready to use the subdirectory format for my multilingual website.
My first question is:
For SEO, do I have to translate the page name in the URL, or is it useless?
Example:
- Same filename
site.com/fr/login
site.com/en/login
OR
- Different filename
site.com/fr/connexion
site.com/en/login
Then, when a user is on site.com: should I redirect them to site.com/en or site.com/fr depending on the user's IP? Or should I set a default locale and have my URLs like site.com/page and site.com/fr/page?
Finally, what is the best way to get the local from the user's current URL?
Parsing the URL to get /fr or /en, or adding a GET parameter to the URL with lang=fr (hidden with htaccess)?
Thanks :)
As a precondition, I assume that you are not using frameworks/libraries. Furthermore, I have never solved similar problems using only .htaccess (as the title of your question requests) and thus don't know if it is possible to do so. Nevertheless, the following guidelines may help you.
Your first question
In general, a web page's file name and path have influence on its ranking. Furthermore, having page names and paths in native languages might help your users memorize the most important of your URLs even without bookmarking them.
Nevertheless, I would never translate the page names or directories of pages which are part of a web application (as opposed to informational or promotional pages).
The login page you mentioned is a good example. I am nearly sure that you do not want your site to be found because of its contents on its login page. Actually, there are many websites which exclude login pages and other application pages from being indexed at all.
Instead, in SEO terms, put your effort into your promotional and informational pages. Provide valuable content, explain what is special about you or your site, and do everything you could that those pages get properly indexed. IMHO, static HTML pages are the best choice for doing so.
Furthermore, if you translate the names of pages which belong to your actual application, you will run into massive trouble. For example, after a successful login, your application will probably transfer the user to his personal dashboard, which will probably be based on another HTML template/page. If you have translated that page name into different languages, then your application will have to take care to send the user to the right version. Basically, that means you need as many versions of your application as languages you want to support. Of course, there are tricks to make life easier, but this will be a constant pain and definitely in no way worth the effort.
To summarize: Create static pages which show your USP (unique selling proposition) and provide valuable content to users (for example, sophisticated tutorials and so on). Translate those pages, including names and paths, and SEO them in every way you can. But regarding the actual application, optimizing its pages is kind of pointless and even counterproductive.
Your second question
I would never use IP based redirecting for several reasons.
First, there are many customers in countries which are not their home country. For example, do you really want to redirect all native English speakers to your Hungarian pages because they are currently in Hungary for a business trip?
Second, more and more users today are using VPNs for different reasons, thereby often hiding the country where they currently are.
Third, which IP address belongs to which provider or country is highly volatile; you would have to constantly update your databases to keep up.
There are more reasons, but I think you already have got the idea.
Fortunately, there is a solution to your problem (but see "Final remark" below): Every browser, when fetching a page from a server, tells the server the preferred and accepted languages. For example, Apache can directly use that information in RewriteRule statements and redirect the user to the correct page.
If you can't alter your Server's configuration, then you can evaluate the respective header in your CGI program.
When doing your research, look for the Accept-Language HTTP 1.1 header. A good starting point probably is here.
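If you end up evaluating the header in your own code rather than in Apache, the parsing is straightforward; here is a minimal sketch (Python purely for illustration, and the supported-language list is an assumption based on your /en and /fr examples):

```python
def pick_language(accept_language, supported=("en", "fr"), default="en"):
    """Pick the best supported language from an Accept-Language header.

    Handles values like "fr-CH, fr;q=0.9, en;q=0.8" (RFC 7231, section 5.3.5).
    """
    candidates = []
    for part in accept_language.split(","):
        lang, _, q = part.strip().partition(";q=")
        try:
            quality = float(q) if q else 1.0  # missing q-value means q=1
        except ValueError:
            quality = 0.0
        primary = lang.strip().split("-")[0].lower()  # "fr-CH" -> "fr"
        if primary in supported:
            candidates.append((quality, primary))
    # Highest-quality supported language wins; fall back to the default.
    return max(candidates)[1] if candidates else default

print(pick_language("fr-CH, fr;q=0.9, en;q=0.8"))  # fr
```

Your handler would then redirect a bare site.com request to site.com/fr/ or site.com/en/ based on the result.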
Your third question
You are possibly mixing up two different things in your third question. A locale is not the same as a language. On one hand, you are asking "...to get the local from...", and on the other hand, you say "...lang=fr...", giving the impression that you want to get the language.
If you want to get the language: See my answer to your second question, or parse the language from the current path (as you already have suggested).
If you want to get the locale, things are more complicated. The only reasonable automatic method is to derive the locale from the language, but this will often fail. For example, I generally prefer the English language when doing research, but on the other hand, I am located in Germany and thus would like dates and times in the format I am used to, so deriving my desired locale from my preferred language will fail.
Unfortunately, there is no HTTP header which could tell the server which locale the user prefers. As a starting point, this article may help you.
See the final remark (next section) on how to solve this problem.
Final remark
As the article linked above already states: The only reliable way to satisfy the user is to let him choose his language, his locale and his time zone within your application. You could store the user's choices either in cookies or in your back-end database; each has its own advantages and disadvantages.
I usually use a combination of all methods (HTTP headers, cookies, database) in my projects.
Think about humans first. Is URL translation important for users in France? Some people may think it's fine to get translated words in the URL; users from other locales may think otherwise. Search engines take user behavioral factors into account, so your SEO will do better if your solution is more convenient for users.
It would be nice if users got the expected language version. A site can help them by suggesting a language version based on IP, HTTP headers, cookies and so on. Some people may prefer another language; some people may be on a trip. So it's still important to let them choose a language version manually.
Please read the manuals and analyze competitors' sites when in doubt.
I usually see most websites give URLs like site.com/en and site.com/fr, as you mention, but it's up to you how you want to present the website to users. I prefer to make site.com/en the default and give the user the option to select his language.
If you are still confused, the link referred to below may be useful.
See the referral link here
Should you translate paths?
If possible, by all means - this will help users of that language feel like "first class citizens". Login routes probably won't have much impact on SEO, but translating URLs on content pages may well help them be more discoverable.
You can read Google's recommendations on multi-regional and multilingual sites, which state that it's "fine to translate words in the URL".
Should you redirect based on IP?
This can help first time users, but there are a few things to bear in mind:
How accurate will this be? E.g. if I speak English but I visit France and then view your site - I will get French. If your target market is mobile-toting globe-trotters, then it may not be the best choice. Is looking at Accept-Language, for example, any better?
Will geolocating the IP address on every request introduce any performance problems for your servers? Or make too many calls to an external geocoding service? Make sure to carry out capacity planning, and reduce calls where you already know the locale (e.g. from a cookie, or the user explicitly stating their preference).
Even if you guess a preferred locale, always allow an explicit user preference to override that. There's nothing more frustrating than moving between pages, and having the site decide that it knows better what language you understand :-)
As long as you make it easy to switch between sites, you shouldn't need a specific landing page. It doesn't hurt to pop up a banner if you're unsure whether you should redirect a user (for example, amazon.com will show a banner to a UK user, giving them the option of switching sites - but not deciding for them)
How should you structure your URLs?
Having the language somewhere in the URL (either as a subdomain or a folder) is probably best for SEO. Don't forget to update your sitemap as well, to indicate to crawlers that there are alternate content pages in different languages.
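For the sitemap, Google's documented format for pointing crawlers at language alternates uses xhtml:link annotations; a sketch using the URLs from the question:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://site.com/en/login</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://site.com/en/login"/>
    <xhtml:link rel="alternate" hreflang="fr" href="https://site.com/fr/connexion"/>
  </url>
</urlset>
```

Each URL entry lists every language version of itself, including its own, so crawlers can connect the set of alternates.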

FTP file upload feature on Wordpress page [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more.
Closed 7 years ago.
I have this WordPress page that a colleague of mine has been working on. The user has to be able to log in and upload some files. We are thinking about using this plugin for the login:
https://wordpress.org/plugins/profile-builder/
When logged in, the user has access to a page with a feature where he/she can select a file from his/her hard disk. The selected file should then be uploaded to an FTP server. When the upload is complete, the page should display a list of the files the user has already uploaded and have an option to delete each individual file.
I realise that this requires sending commands to the FTP server (LIST, DELE, MKD, etc.).
I have considered making my own Node.js server and letting it handle the FTP upload, but I am also thinking that there should be a WordPress plugin for this. I have tried searching, but all I can find are instructions on how to use WordPress' own FTP upload function to deploy a site, which is not particularly helpful for me. Also, I don't have any experience with WordPress, but I have some web experience, so making sense of it shouldn't be difficult.
So have any of you guys done something similar before and maybe know a plugin? Or made your own server application for something like this?
Thank you.
The main problems with user uploads in wp-upload are security and the file types accepted by the WP uploader. It's better to use your own code that sanitizes the file according to your parameters.
You can try:
https://wordpress.org/plugins/frontend-uploader/
or
https://wordpress.org/plugins/wp-file-upload/
Well, this is fairly simple if you have some PHP skills. You don't even need any kind of FTP access. So here is a basic breakdown of how it should be done the "WordPress way".
1) Separate dashboard/user areas for different users (the page they see after login)
Well, it's not a good idea to give users access to the backend/admin area, and it is not very user-friendly either, so you can look into the theme-my-login plugin, which does a great job of creating login pages and separate pages where users are redirected after login.
2) Then you need an upload area where users can upload the movies from their computer
For this you can look at WordPress attachment functions. There are functions for uploading, deleting, etc. Just create a simple form, grab the uploaded file, and pass it through these functions (you should look into sanitizing and validating the data properly); you might also need to increase the maximum upload size through php.ini, etc.
wp_insert_attachment, wp_delete_attachment, wp_get_attachment_url
3) A repository of all movies uploaded by specific user
This is a piece of cake with user meta. The attachment functions described above come with action hooks; hooks are like triggers that fire when someone performs some action. So once the attachment is uploaded, you can hook into that action, grab the ID and name of the uploaded attachment, and save them to the logged-in user's meta. It's better if you use an array of values and encode it as JSON; this way a list of all the user's movies can be stored in a single database entry, which is very efficient.
For creating and updating user meta you should use functions like update_user_meta, delete_user_meta, etc.
Now, to show all movies by a user, you can use something like get_user_meta($user_id, 'movie_list', true).

Is it possible to prevent man in the browser attack at the server with hardware device [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Recently I found a hardware device that can prevent bot attacks by changing HTML DOM elements on the fly. The details are mentioned here:
The HTML input elements' id and name attributes, and also the form element's action, are replaced with random strings before the page is sent to the client. After the client submits, the hardware device replaces those values with the originals. So the server code remains unchanged, and bots cannot work on fixed input names and ids.
That is the overall idea, BUT they have also claimed that this product can solve the man-in-the-browser attack.
http://techxplore.com/news/2014-01-world-botwall.html :
Shape Security claims that the added code to a web site won't cause any noticeable delays to the user interface (or how it appears) and that it works against other types of attacks as well, such as account takeover, and man-in-the-browser. They note that their approach works because it deflects attacks in real time whereas code for botnets is changed only when it installs (to change its signature).
Theoretically, is it possible to prevent a man-in-the-browser attack at the server?!
Theoretically, is it possible to prevent a man-in-the-browser attack at the server?!
Nope. Clearly the compromised client can do anything a real user can.
Making your pages more resistant to automation is potentially an arms race of updates and countermeasures. Obfuscation like this can at best make it annoying enough to automate your site that it's not worth it to the attacker; that is, you try to make yourself no longer the "low-hanging fruit".
They note that their approach works because it deflects attacks in real time whereas code for botnets is changed only when it installs (to change its signature).
This seems pretty meaningless. Bots naturally can update their own code. Indeed banking trojans commonly update themselves to work around changes to account login pages. Unless the service includes live updates pushed out to the filter boxes to work around these updates, you still don't win.
(Such an Automation Arms Race As A Service would be an interesting proposition. However, I would be worried about new obfuscation features breaking your applications. For example, imagine what would happen with the trivial form-field-renaming example on the linked site if your own client-side scripts were relying on those names. Or indeed, if your whole site was a client-side Single Page App, this would have no effect.)

Are there best practices for testing security in an Agile development shop? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
Regarding Agile development, what are the best practices for testing security per release?
If it is a monthly release, are there shops doing pen-tests every month?
What's your application domain? It depends.
Since you used the word "Agile", I'm guessing it's a web app. I have a nice easy answer for you.
Go buy a copy of Burp Suite (it's the #1 Google result for "burp" --- a sure endorsement!); it'll cost you 99EU, or ~$180USD, or $98 Obama Dollars if you wait until November.
Burp works as a web proxy. You browse through your web app using Firefox or IE or whatever, and it collects all the hits you generate. These hits get fed to a feature called "Intruder", which is a web fuzzer. Intruder will figure out all the parameters you provide to each one of your query handlers. It will then try crazy values for each parameter, including SQL, filesystem, and HTML metacharacters. On a typical complex form post, this is going to generate about 1500 hits, which you'll look through to identify scary --- or, more importantly in an Agile context, new --- error responses.
Fuzzing every query handler in your web app at each release iteration is the #1 thing you can do to improve application security without instituting a formal "SDLC" and adding headcount. Beyond that, review your code for the major web app security hot spots:
Use only parameterized prepared SQL statements; don't ever simply concatenate strings and feed them to your database handle.
Filter all inputs to a white list of known good characters (alnum, basic punctuation), and, more importantly, output filter data from your query results to "neutralize" HTML metacharacters to HTML entities (quot, lt, gt, etc).
Use long random hard-to-guess identifiers anywhere you're currently using simple integer row IDs in query parameters, and make sure user X can't see user Y's data just by guessing those identifiers.
Test every query handler in your application to ensure that they function only when a valid, logged-on session cookie is presented.
Turn on the XSRF protection in your web stack, which will generate hidden form token parameters on all your rendered forms, to prevent attackers from creating malicious links that will submit forms for unsuspecting users.
Use bcrypt --- and nothing else --- to store hashed passwords.
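The first two points in the list above can be sketched briefly (Python with the built-in sqlite3 module, purely for illustration; your actual database driver and templating layer will differ):

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Parameterized statement: the driver keeps data separate from SQL syntax,
# so a hostile value cannot change the query's structure.
hostile = "x'); DROP TABLE users; --"
conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))

row = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()

# Output filtering: neutralize HTML metacharacters before rendering,
# so stored data cannot inject markup into your pages.
safe = html.escape(row[0])
print(safe)  # x&#x27;); DROP TABLE users; --
```

The same two habits (placeholders on the way into the database, entity encoding on the way out) close off the bulk of SQL injection and stored XSS regardless of stack.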
I'm no expert on Agile development, but I would imagine that integrating some basic automated pen-test software into your build cycle would be a good start. I have seen several software packages out there that will do basic testing and are well suited for automation.
I'm not a security expert, but I think the most important thing you should be aware of, before testing security, is what you are trying to protect. Only if you know what you are trying to protect can you do a proper analysis of your security measures, and only then can you start testing those implemented measures.
Very abstract, I know. However, I think it should be the first step of every security audit.
Unit testing, defensive programming and lots of logs
Unit testing
Make sure you unit test as early as possible (e.g. that the password is encrypted before sending, that the SSL tunnel is working, etc.). This will prevent your programmers from accidentally making the program insecure.
Defensive programming
I personally call this Paranoid Programming, but Wikipedia is never wrong (sarcasm). Basically, you add checks to your functions that validate all the inputs:
Are the user's cookies valid?
Is the user still logged in?
Are the function's parameters protected against SQL injection? (Even though you know the inputs are generated by your own functions, test them anyway.)
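Those checks can be sketched as a single defensively written function (Python for illustration; the session shape and error types are assumptions):

```python
def delete_post(session: dict, post_id) -> str:
    """Defensively re-validate every assumption, even for internal callers."""
    # Is the user's session/cookie valid, and is the user still logged in?
    if not session.get("user_id") or not session.get("authenticated"):
        raise PermissionError("not logged in")
    # Is the parameter the type and range we expect? Never trust that it is.
    if not isinstance(post_id, int) or post_id <= 0:
        raise ValueError("post_id must be a positive integer")
    # The database call itself would use a parameterized query, so the
    # value could never alter the SQL even if validation were bypassed.
    return f"deleted post {post_id}"
```

The point is that each function enforces its own preconditions instead of trusting its callers, so a single forgotten check elsewhere does not become an exploit.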
Logging
Log everything like crazy. It's easier to remove logs than to add them. A user logged in? Log it. A user hit a 404? Log it. The admin edited/deleted a post? Log it. Someone was able to access a restricted page? Log it.
Don't be surprised if your log file reaches 15+ MB during your development phase. During beta, you can decide which logs to remove. If you want, you can add a flag to decide when a certain event is logged.
