Adding/calling non-AQL server side functions in ArangoDB? - arangodb

I know that ArangoDB allows for custom AQL functions to be defined server side, but is it possible to define server-side Javascript functions that can be called during transactions?
I'm using the Java driver to connect to ArangoDB, which performs transactions by sending a Javascript function as a string.
To avoid sending a large and complex string every time (one transaction I'm using is ~500 lines long), I'd prefer to have it stored server-side, and called more simply from Java.
E.g. instead of running something like:
String action = "function (params) {"
+ "const db = require('@arangodb').db;"
+ "return db._query('FOR i IN test RETURN i._key').toArray();"
+ "}";
String[] keys = arango.db().transaction(action, String[].class, new TransactionOptions());
I'd like to call something like:
String action = "my_function";
String[] keys = arango.db().transaction(action, String[].class, new TransactionOptions());
Or:
String action = "function (params) {"
+ "const my_function = require('somefunc');"
+ "return my_function(params);"
+ "}";
Is this possible to achieve?

You can write custom endpoints with Foxx.
It is a microservice framework written in JavaScript that lets you execute complex JS code on the server-side, with access to ArangoDB's internal JS API (db object etc.). You can encapsulate whatever business logic in it, e.g. your 500 lines of transaction code and add some parameters to control it from outside without having to send all the code each time.
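A minimal sketch of that idea, assuming a hypothetical Foxx route `/keys` and a helper named `listKeys` (neither comes from the question). The transaction body is kept as a plain function that receives `db`, so the logic can be exercised outside ArangoDB with a stand-in object; the Foxx wiring is shown only as comments because it runs only inside the server.

```javascript
// Transaction logic kept as a plain function so it can live in a Foxx service file.
// `db` is injected; inside ArangoDB it would come from require('@arangodb').db.
function listKeys(db, params) {
  return db._query('FOR i IN test RETURN i._key').toArray();
}

// Hypothetical Foxx wiring (runs only inside ArangoDB, so it is commented out here):
// const createRouter = require('@arangodb/foxx/router');
// const router = createRouter();
// module.context.use(router);
// router.post('/keys', function (req, res) {
//   res.json(listKeys(require('@arangodb').db, req.body));
// });

// Stand-in db object, an assumption used only to run the sketch outside ArangoDB.
const fakeDb = { _query: () => ({ toArray: () => ['a', 'b'] }) };
console.log(listKeys(fakeDb, {}));
```

The Java client would then call the service's HTTP endpoint with a few parameters instead of sending the whole function body as a string.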

Related

Convert AutoQuery query string to SqlExpression

I am trying to re-create AutoQuery queries outside of a service request. I am doing this because I give user option to save a request and then use that data elsewhere. I save the query string data so I am trying to create a query from the saved query string.
I need 2 things.
1) query that returns the complete data not limited by default autoquery page size
2) query that returns the count
I tried making the query like this:
IAutoQueryDb _autoQuery = HostContext.TryResolve<IAutoQueryDb>();
var dto = new MyQueryDbClass();
Dictionary<string, string> pars = GetParameters();
var query = _autoQuery.CreateQuery(dto, pars);
The problem with this is that the generated query has the table name of the response object and not the actual table, so it doesn't work. Also I am unable to call ToCountStatement() on it. It is also limited by my default page size.
Is there a way to convert the AutoQuery query string to a SqlExpression so I can execute it and also get count statement?
The CreateQuery() API returns a populated SqlExpression<Table> similar to what would have been created if manually constructing the query yourself, e.g:
SqlExpression<Table> query = _autoQuery.CreateQuery(dto, pars);
To clear the paging info you can call .Limit() without arguments which will clear any populated Offset/Rows values:
query.Limit();
The Custom AutoQuery Implementations docs show an example of how AutoQuery executes the query behind the scenes, e.g. you can get the total with:
var total = Db.Count(query);

Thread pool with Apps Script on Spreadsheet

I have a Google Spreadsheet with internal Apps Script code which processes each row of the sheet and performs a URL fetch with the row data. The URL will provide a value which will be added to the values returned by each row's processing.
For now the code is processing 1 row at a time with a simple for:
var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
var sheet = spreadsheet.getActiveSheet();
var range = sheet.getDataRange();
for(var i=1 ; i<range.getValues().length ; i++) {
var payload = {
// retrieve data from the row and make payload object
};
var options = {
"method":"POST",
"payload" : payload
};
var result = UrlFetchApp.fetch("http://.......", options);
var text = result.getContentText();
// Save result for final processing
// (with multi-thread function this value will be the return of the function)
}
Please note that this is only a simple example, in the real case the working function will be more complex (like 5-6 http calls, where the output of some of them are used as input to the next one, ...).
For the example let's say that there is a generic "function" which executes some sort of processing and provides a result as output.
In order to speed up the process, I'd like to try to implement some sort of "multi-thread" processing, so I can process multiple rows at the same time.
I already know that javascript does not offer a multi-thread handling, but I read about WebWorker which seems to create an async processing of a function.
My goal is to obtain some sort of ThreadPool (like 5 threads at a time) and send every row that need to be processed to the pool, obtaining as output the result of each function.
When all the rows finished the processing, a final action will be performed gathering all the results of each function.
So the capabilities I'm looking for are:
managed "ThreadPool" where I can submit an N amount of tasks to be performed
possibility to obtain a resulting value from each task processed by the pool
possibility to determine that all the tasks have been processed, so a final "event" can be executed
I already see that there are some ready-to-use libraries like:
https://www.hamsters.io/wiki#thread-pool
http://threadsjs.readthedocs.io/en/latest/
https://github.com/andywer/threadpool-js
but they work with NodeJS. Due to the nature of Apps Script, I need a simpler approach that relies only on native JS. Also, it seems that minified JS is not accepted by the Apps Script editor, so I also need the "expanded" version.
Do you know a simple ThreadPool in JS where I can submit a function to be executed and get back a Promise for the result?
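The three capabilities listed above can be sketched in plain JavaScript as a small promise pool. This is only a sketch under assumptions: `runPool` and `rowTasks` are hypothetical names, it needs a runtime with Promise and async/await support, and it limits how many asynchronous tasks are in flight rather than creating real threads, so a synchronous call inside a task would still block.

```javascript
// Minimal promise pool: runs at most `limit` tasks at once.
// `tasks` is an array of functions that each return a value or a Promise.
function runPool(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker keeps claiming the next unstarted task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workers = [];
  for (let i = 0; i < Math.min(limit, tasks.length); i++) {
    workers.push(worker());
  }
  // Resolves once every worker has drained the queue: the final "event".
  return Promise.all(workers).then(() => results);
}

// Usage: five hypothetical row tasks, two in flight at a time.
const rowTasks = [1, 2, 3, 4, 5].map((row) => () => Promise.resolve(row * 10));
runPool(rowTasks, 2).then((all) => console.log(all));
```

Results are stored by task index, so the final array keeps the row order regardless of which task finishes first.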

Couchbase java sdk 1.4.7 numeric key in view Query not returning results

The view definition emits a string field from the document as a key. The field value can be all numeric or alphanumeric. Query using key with all numeric value does not return any row but alphanumeric key returns data.
On server web console and rest api, I could see the row so view is getting updated properly and hence leaning to believe that issue is with java sdk client.
Below is the code I use to query.
CouchbaseClient couchBaseDAO; // = initialize client.
String corelationId = "12345678";
Query query = new Query();
query.setKey(corelationId);
ViewResponse result = couchBaseDAO.query(queryConfig, query);
JSONArray jsonArray = new JSONArray();
if(result != null){
for(ViewRow row: result){
jsonArray.put(row.getValue());
}
}
return jsonArray.toString();
Map:
function(doc,meta) {
if(doc!=null && doc.requestData!=null) {
emit(doc.requestData.corelationId, [doc.request.id, doc.status]);
}
}
If I changed key to alphanumeric, it works.
String corelationId = "ab-12-09-a-123";
Java HotSpot 7.
Couchbase java sdk 1.4.7
Couchbase Server 3.0.3
Solution
Based on the information given in answer below, below are two options you have
Option 1 Server side map change
If you are building a new map, then go for it. Harmonize your key so that it is always a string: emit("" + doc.requestData.corelationId, ...);
If your view already exists then all your existing documents will not change right away.
Option 2 Client side change
If you are like me and option 1 is not possible, harmonize your key in your code. This overrides the SDK's logic of treating it as numeric.
corelationId = StringUtils.isNumeric(corelationId)?"\""+corelationId+"\"":corelationId;
Your view emits the corelationId as it is, in its original type. You said that in the documents it was alternating between a numerical value and a string.
If you pass the key to the SDK as a Long it will work.
(I suspect that in the web ui you naturally typed in 12345678 in the key field and not "12345678", so you did the correct equivalent of using a Long in the web UI)
If you cannot know the correct type to use for each key you search, harmonize the key type in the map function so that you know always to use strings:
emit("" + doc.requestData.corelationId, ...);
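To see what the harmonized map function emits, here is a small sketch that runs it outside Couchbase. The `emit` stand-in, the `emitted` array and the two sample documents are assumptions made for illustration; only the map function body follows the view shown above.

```javascript
// Stand-in for the view engine's emit(): collects emitted rows in an array.
const emitted = [];
function emit(key, value) {
  emitted.push([key, value]);
}

// Map function with the key coerced to a string, as suggested above.
function map(doc, meta) {
  if (doc != null && doc.requestData != null) {
    emit("" + doc.requestData.corelationId, [doc.request.id, doc.status]);
  }
}

// Hypothetical documents: one numeric id, one alphanumeric id.
map({ requestData: { corelationId: 12345678 }, request: { id: 'r1' }, status: 'ok' }, {});
map({ requestData: { corelationId: 'ab-12-09-a-123' }, request: { id: 'r2' }, status: 'ok' }, {});

// Both keys are now strings, so a string key can be used for every lookup.
console.log(emitted.map((row) => typeof row[0]));
```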

how do we add url parameters? (EJS + Node + Express)

I understood how we parse the url parameters in express routes, as in the example
How to get GET (query string) variables in Express.js on Node.js?
But where do the url parameters come from in the first place?
EDIT:
Apparently, I can build such a query with jQuery (i.e. $.get). I can append params to this query object. It's cool, but I'm still trying to understand how we achieve this in the query that renders the page as a whole.
An example: when I choose the oldest tab below, how does SO add ?answertab=oldest to the URL so it becomes:
https://stackoverflow.com/questions/30516497/how-do-we-add-url-parameters-ejs-node-express?answertab=oldest#tab-top
The string you're looking at is a serialization of the values of a form, or some other such method of inputting data. To get a sense of this, have a look at jQuery's built-in .serialize() method.
You can construct that string manually as well, and that's pretty straight forward as well. The format is just ?var1=data1&var2=data2 etc. If you have a JSON object {"name": "Tim", "age": 22} then you could write a very simple function to serialize this object:
function serializeObject(obj) {
  var keys = Object.keys(obj);
  var str = "?";
  for (var i = 0; i < keys.length; i++) {
    var key = keys[i]; // declared locally so it does not leak into the global scope
    str += encodeURIComponent(key) + "=" + encodeURIComponent(obj[key]);
    // Separate pairs with "&", but not after the last one
    if (i < keys.length - 1) str += "&";
  }
  return str;
}
Running serializeObject({"name": "Tim", "age": 22}) will output '?name=Tim&age=22'. This could be used to generate a link or whatnot.
The page author writes them so. This is how they "come in the first place". The authors of an HTML page decide (or are told by website designers) where to take the user when he clicks on a particular anchor element on it. If they want users to GET a page with some query parameters (which their server handles), they simply add query string of their choice to the link's href attribute.
Take a look at the href attribute of the oldest tab you clicked:
<a
class="youarehere"
href="/questions/30516497/how-do-we-add-url-parameters-ejs-node-express?answertab=oldest#tab-top"
title="Answers in the order they were provided"
>
oldest
</a>
When you clicked it, the browser simply took you to the path indicated in the href attribute, /questions/30516497/how-do-we-add-url-parameters-ejs-node-express?answertab=oldest#tab-top, relative to the base URL http://stackoverflow.com. So the address bar changed.
stackoverflow.com may have its own system of generating dynamic HTML pages. Their administrators and page authors have configured their server to handle particular query parameters and have put in place their own methods to make sure that links on their pages point to the URL(including query string) they wish.
You need to provide URIs with query strings of your choice (you can build them using url.format and querystring.stringify) to your template system to render. Then make your express routes process them and generate pages depending on their value.
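As a rough illustration of that last step, the sketch below builds the link from the question with Node's built-in `querystring` and `url` modules. Passing a plain object to `url.format` uses Node's legacy URL API; the variable names are assumptions, and handing `href` to a template is left out.

```javascript
const querystring = require('querystring');
const url = require('url');

// Serialize the parameters: { answertab: 'oldest' } becomes 'answertab=oldest'.
const query = querystring.stringify({ answertab: 'oldest' });

// Assemble path, query string and fragment into one relative link.
const href = url.format({
  pathname: '/questions/30516497/how-do-we-add-url-parameters-ejs-node-express',
  search: query,
  hash: 'tab-top'
});

console.log(query);
console.log(href);
```

The resulting string is what would go into the anchor's href attribute when the page is rendered; the route then reads the value back from the request's query parameters.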

How does stackoverflow display codes without compromising their security? [duplicate]

Is there a catchall function somewhere that works well for sanitizing user input for SQL injection and XSS attacks, while still allowing certain types of HTML tags?
It's a common misconception that user input can be filtered. PHP even has a (now deprecated) "feature", called magic-quotes, that builds on this idea. It's nonsense. Forget about filtering (or cleaning, or whatever people call it).
What you should do, to avoid problems, is quite simple: whenever you embed a piece of data within foreign code, you must treat it according to the formatting rules of that code. But you must understand that such rules can be too complicated to follow manually. For example, in SQL, the rules for strings, numbers and identifiers are all different. For your convenience, in most cases there is a dedicated tool for such embedding. For example, when you need to use a PHP variable in an SQL query, you have to use a prepared statement, which will take care of all the proper formatting/treatment.
Another example is HTML: If you embed strings within HTML markup, you must escape it with htmlspecialchars. This means that every single echo or print statement should use htmlspecialchars.
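The same rule can be illustrated outside PHP. The sketch below is a JavaScript analogue, not htmlspecialchars itself: `escapeHtml` is a hypothetical helper that replaces the five characters that are significant in HTML text and attribute values.

```javascript
// Hypothetical helper: escape a value before embedding it in HTML markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// A user-supplied string that would otherwise be interpreted as markup.
console.log(escapeHtml('<b>Tom & "Jerry"</b>'));
```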
A third example could be shell commands: If you are going to embed strings (such as arguments) to external commands, and call them with exec, then you must use escapeshellcmd and escapeshellarg.
Also, a very compelling example is JSON. The rules are so numerous and complicated that you would never be able to follow them all manually. That's why you should never ever create a JSON string manually, but always use a dedicated function, json_encode() that will correctly format every bit of data.
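A short sketch of why hand-built JSON goes wrong, using JavaScript's built-in JSON.stringify as the counterpart of json_encode(). The sample value and variable names are made up for illustration.

```javascript
// A value containing characters that are special inside a JSON string.
const name = 'Tim "The Tester"\n';

// Building the JSON by hand leaves the quotes and the newline unescaped.
const manual = '{"name": "' + name + '"}';
let manualIsValid = true;
try {
  JSON.parse(manual);
} catch (e) {
  manualIsValid = false;
}

// The dedicated encoder escapes everything that needs escaping.
const encoded = JSON.stringify({ name: name });

console.log(manualIsValid);
console.log(JSON.parse(encoded).name === name);
```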
And so on and so forth ...
The only case where you need to actively filter data is if you're accepting preformatted input, for example if you let your users post HTML markup that you plan to display on the site. However, you would be wise to avoid this at all costs, since no matter how well you filter it, it will always be a potential security hole.
Do not try to prevent SQL injection by sanitizing input data.
Instead, do not allow data to be used in creating your SQL code. Use Prepared Statements (i.e. using parameters in a template query) that uses bound variables. It is the only way to be guaranteed against SQL injection.
Please see my website http://bobby-tables.com/ for more about preventing SQL injection.
No. You can't generically filter data without any context of what it's for. Sometimes you'd want to take a SQL query as input and sometimes you'd want to take HTML as input.
You need to filter input on a whitelist -- ensure that the data matches some specification of what you expect. Then you need to escape it before you use it, depending on the context in which you are using it.
The process of escaping data for SQL - to prevent SQL injection - is very different from the process of escaping data for (X)HTML, to prevent XSS.
PHP has the new nice filter_input functions now, that for instance liberate you from finding 'the ultimate e-mail regex' now that there is a built-in FILTER_VALIDATE_EMAIL type
My own filter class (uses JavaScript to highlight faulty fields) can be initiated by either an ajax request or normal form post. (see the example below)
<?php
/**
* Pork Formvalidator. validates fields by regexes and can sanitize them. Uses PHP filter_var built-in functions and extra regexes
* @package pork
*/
/**
* Pork.FormValidator
* Validates arrays or properties by setting up simple arrays.
* Note that some of the regexes are for dutch input!
* Example:
*
* $validations = array('name' => 'anything','email' => 'email','alias' => 'anything','pwd'=>'anything','gsm' => 'phone','birthdate' => 'date');
* $required = array('name', 'email', 'alias', 'pwd');
* $sanitize = array('alias');
*
* $validator = new FormValidator($validations, $required, $sanitize);
*
* if($validator->validate($_POST))
* {
* $_POST = $validator->sanitize($_POST);
* // now do your saving, $_POST has been sanitized.
* die($validator->getScript()."<script type='text/javascript'>alert('saved changes');</script>");
* }
* else
* {
* die($validator->getScript());
* }
*
* To validate just one element:
* $validated = new FormValidator()->validate('blah@bla.', 'email');
*
* To sanitize just one element:
* $sanitized = new FormValidator()->sanitize('<b>blah</b>', 'string');
*
* @package pork
* @author SchizoDuckie
* @copyright SchizoDuckie 2008
* @version 1.0
* @access public
*/
class FormValidator
{
public static $regexes = Array(
'date' => "^[0-9]{1,2}[-/][0-9]{1,2}[-/][0-9]{4}\$",
'amount' => "^[-]?[0-9]+\$",
'number' => "^[-]?[0-9,]+\$",
'alfanum' => "^[0-9a-zA-Z ,.-_\\s\?\!]+\$",
'not_empty' => "[a-z0-9A-Z]+",
'words' => "^[A-Za-z]+[A-Za-z \\s]*\$",
'phone' => "^[0-9]{10,11}\$",
'zipcode' => "^[1-9][0-9]{3}[a-zA-Z]{2}\$",
'plate' => "^([0-9a-zA-Z]{2}[-]){2}[0-9a-zA-Z]{2}\$",
'price' => "^[0-9.,]*(([.,][-])|([.,][0-9]{2}))?\$",
'2digitopt' => "^\d+(\,\d{2})?\$",
'2digitforce' => "^\d+\,\d\d\$",
'anything' => "^[\d\D]{1,}\$"
);
private $validations, $sanitations, $mandatories, $errors, $corrects, $fields;
public function __construct($validations=array(), $mandatories = array(), $sanitations = array())
{
$this->validations = $validations;
$this->sanitations = $sanitations;
$this->mandatories = $mandatories;
$this->errors = array();
$this->corrects = array();
}
/**
* Validates an array of items (if needed) and returns true or false
*
*/
public function validate($items)
{
$this->fields = $items;
$havefailures = false;
foreach($items as $key=>$val)
{
if((strlen($val) == 0 || !array_key_exists($key, $this->validations)) && array_search($key, $this->mandatories) === false)
{
$this->corrects[] = $key;
continue;
}
$result = self::validateItem($val, $this->validations[$key]);
if($result === false) {
$havefailures = true;
$this->addError($key, $this->validations[$key]);
}
else
{
$this->corrects[] = $key;
}
}
return(!$havefailures);
}
/**
*
* Adds unvalidated class to thos elements that are not validated. Removes them from classes that are.
*/
public function getScript() {
$output = ''; // start empty so the .= below is defined when there are no errors
if(!empty($this->errors))
{
$errors = array();
foreach($this->errors as $key=>$val) { $errors[] = "'INPUT[name={$key}]'"; }
$output = '$$('.implode(',', $errors).').addClass("unvalidated");';
$output .= "new FormValidator().showMessage();";
}
if(!empty($this->corrects))
{
$corrects = array();
foreach($this->corrects as $key) { $corrects[] = "'INPUT[name={$key}]'"; }
$output .= '$$('.implode(',', $corrects).').removeClass("unvalidated");';
}
$output = "<script type='text/javascript'>{$output} </script>";
return($output);
}
/**
*
* Sanitizes an array of items according to the $this->sanitations
* sanitations will be standard of type string, but can also be specified.
* For ease of use, this syntax is accepted:
* $sanitations = array('fieldname', 'otherfieldname'=>'float');
*/
public function sanitize($items)
{
foreach($items as $key=>$val)
{
if(array_search($key, $this->sanitations) === false && !array_key_exists($key, $this->sanitations)) continue;
$items[$key] = self::sanitizeItem($val, $this->validations[$key]);
}
return($items);
}
/**
*
* Adds an error to the errors array.
*/
private function addError($field, $type='string')
{
$this->errors[$field] = $type;
}
/**
*
* Sanitize a single var according to $type.
* Allows for static calling to allow simple sanitization
*/
public static function sanitizeItem($var, $type)
{
$flags = NULL;
switch($type)
{
case 'url':
$filter = FILTER_SANITIZE_URL;
break;
case 'int':
$filter = FILTER_SANITIZE_NUMBER_INT;
break;
case 'float':
$filter = FILTER_SANITIZE_NUMBER_FLOAT;
$flags = FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND;
break;
case 'email':
$var = substr($var, 0, 254);
$filter = FILTER_SANITIZE_EMAIL;
break;
case 'string':
default:
$filter = FILTER_SANITIZE_STRING;
$flags = FILTER_FLAG_NO_ENCODE_QUOTES;
break;
}
$output = filter_var($var, $filter, $flags);
return($output);
}
/**
*
* Validates a single var according to $type.
* Allows for static calling to allow simple validation.
*
*/
public static function validateItem($var, $type)
{
if(array_key_exists($type, self::$regexes))
{
$returnval = filter_var($var, FILTER_VALIDATE_REGEXP, array("options"=> array("regexp"=>'!'.self::$regexes[$type].'!i'))) !== false;
return($returnval);
}
$filter = false;
switch($type)
{
case 'email':
$var = substr($var, 0, 254);
$filter = FILTER_VALIDATE_EMAIL;
break;
case 'int':
$filter = FILTER_VALIDATE_INT;
break;
case 'boolean':
$filter = FILTER_VALIDATE_BOOLEAN;
break;
case 'ip':
$filter = FILTER_VALIDATE_IP;
break;
case 'url':
$filter = FILTER_VALIDATE_URL;
break;
}
return ($filter === false) ? false : (filter_var($var, $filter) !== false);
}
}
Of course, keep in mind that you need to do your SQL query escaping too, depending on what type of db you are using (mysql_real_escape_string() is useless for an SQL Server, for instance). You probably want to handle this automatically at the appropriate application layer, like an ORM. Also, as mentioned above: for outputting to HTML use the other dedicated PHP functions like htmlspecialchars ;)
To really allow HTML input, with stripped classes and/or tags, depend on one of the dedicated XSS validation packages. DO NOT WRITE YOUR OWN REGEXES TO PARSE HTML!
No, there is not.
First of all, SQL injection is an input filtering problem, and XSS is an output escaping one - so you wouldn't even execute these two operations at the same time in the code lifecycle.
Basic rules of thumb
For SQL query, bind parameters
Use strip_tags() to filter out unwanted HTML
Escape all other output with htmlspecialchars() and be mindful of the 2nd and 3rd parameters here.
To address the XSS issue, take a look at HTML Purifier. It is fairly configurable and has a decent track record.
As for the SQL injection attacks, the solution is to use prepared statements. The PDO library and mysqli extension support these.
PHP 5.2 introduced the filter_var function.
It supports a great deal of SANITIZE, VALIDATE filters.
Methods for sanitizing user input with PHP:
Use Modern Versions of MySQL and PHP.
Set charset explicitly:
$mysqli->set_charset("utf8");
$pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password);
$pdo->exec("set names utf8");
$pdo = new PDO(
"mysql:host=$host;dbname=$db", $user, $pass,
array(
PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
)
);
mysql_set_charset('utf8') [deprecated in PHP 5.5.0, removed in PHP 7.0.0].
Use secure charsets:
Select utf8, latin1, ascii.., don't use vulnerable charsets big5, cp932, gb2312, gbk, sjis.
Use specialized functions:
MySQLi prepared statements:
$stmt = $mysqli->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
$param = "' OR 1=1 /*";
$stmt->bind_param('s', $param);
$stmt->execute();
PDO::quote() - places quotes around the input string (if required) and escapes special characters within the input string, using a quoting style appropriate to the underlying driver:
$pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password); // explicitly set the character set
$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false); // disable emulated prepared statements so there is no fallback for statements that MySQL can't prepare natively (to prevent injection)
$var = $pdo->quote("' OR 1=1 /*"); // not only escapes the literal, but also quotes it (in single-quote ' characters)
$stmt = $pdo->query("SELECT * FROM test WHERE name = $var LIMIT 1");
PDO prepared statements: compared with MySQLi prepared statements, PDO supports more database drivers and named parameters:
$pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password); // explicitly set the character set
$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false); // disable emulated prepared statements so there is no fallback for statements that MySQL can't prepare natively (to prevent injection)
$stmt = $pdo->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
$stmt->execute(["' OR 1=1 /*"]);
mysql_real_escape_string [deprecated in PHP 5.5.0, removed in PHP 7.0.0].
mysqli_real_escape_string escapes special characters in a string for use in an SQL statement, taking into account the current charset of the connection. Prepared statements are still recommended: they are not simply escaped strings, since a statement is prepared with a complete query execution plan, including which tables and indexes it will use, which also makes them efficient.
Use single quotes (' ') around your variables inside your query.
Check the variable contains what you are expecting for:
If you are expecting an integer, use:
ctype_digit: check for numeric character(s);
$value = (int) $value;
$value = intval($value);
$var = filter_var('0755', FILTER_VALIDATE_INT, $options);
For Strings use:
is_string(): find whether the type of a variable is string.
filter_var(): filters a variable with a specified filter:
$email = filter_var($email, FILTER_SANITIZE_EMAIL);
$newstr = filter_var($str, FILTER_SANITIZE_STRING);
filter_input(): gets a specific external variable by name and optionally filters it:
$search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);
preg_match(): perform a regular expression match;
Write Your own validation function.
One trick that can help in the specific circumstance where you have a page like /mypage?id=53 and you use the id in a WHERE clause is to ensure that id definitely is an integer, like so:
if (isset($_GET['id'])) {
$id = $_GET['id'];
settype($id, 'integer');
$result = mysql_query("SELECT * FROM mytable WHERE id = '$id'");
# now use the result
}
But of course that only cuts out one specific attack, so read all the other answers. (And yes I know that the code above isn't great, but it shows the specific defence.)
There's no catchall function, because there are multiple concerns to be addressed.
SQL Injection - Today, generally, every PHP project should be using prepared statements via PHP Data Objects (PDO) as a best practice, preventing an error from a stray quote as well as a full-featured solution against injection. It's also the most flexible & secure way to access your database.
Check out (The only proper) PDO tutorial for pretty much everything you need to know about PDO. (Sincere thanks to top SO contributor, @YourCommonSense, for this great resource on the subject.)
XSS - Sanitize data on the way in...
HTML Purifier has been around a long time and is still actively updated. You can use it to sanitize malicious input, while still allowing a generous & configurable whitelist of tags. Works great with many WYSIWYG editors, but it might be heavy for some use cases.
In other instances, where we don't want to accept HTML/Javascript at all, I've found this simple function useful (and has passed multiple audits against XSS):
/* Prevent XSS input */
function sanitizeXSS () {
$_GET = filter_input_array(INPUT_GET, FILTER_SANITIZE_STRING);
$_POST = filter_input_array(INPUT_POST, FILTER_SANITIZE_STRING);
$_REQUEST = (array)$_POST + (array)$_GET + (array)$_REQUEST;
}
XSS - Sanitize data on the way out... unless you guarantee the data was properly sanitized before you add it to your database, you'll need to sanitize it before displaying it to your user, we can leverage these useful PHP functions:
When you call echo or print to display user-supplied values, use htmlspecialchars unless the data was properly sanitized and is allowed to display HTML.
json_encode is a safe way to provide user-supplied values from PHP to Javascript
Do you call external shell commands using the exec() or system() functions, or the backtick operator? If so, in addition to SQL Injection & XSS you might have an additional concern to address: users running malicious commands on your server. You need to use escapeshellcmd if you'd like to escape the entire command OR escapeshellarg to escape individual arguments.
What you are describing here is two separate issues:
Sanitizing / filtering of user input data.
Escaping output.
1) User input should always be assumed to be bad.
Using prepared statements and/or filtering with mysql_real_escape_string is definitely a must.
PHP also has filter_input built in which is a good place to start.
2) This is a large topic, and it depends on the context of the data being output. For HTML there are solutions such as htmlpurifier out there.
As a rule of thumb, always escape anything you output.
Both issues are far too big to go into in a single post, but there are lots of posts which go into more detail:
Methods PHP output
Safer PHP output
If you're using PostgreSQL, the input from PHP can be escaped with pg_escape_literal()
$username = pg_escape_literal($_POST['username']);
From the documentation:
pg_escape_literal() escapes a literal for querying the PostgreSQL database. It returns an escaped literal in the PostgreSQL format.
You never sanitize input.
You always sanitize output.
The transforms you apply to data to make it safe for inclusion in an SQL statement are completely different from those you apply for inclusion in HTML are completely different from those you apply for inclusion in Javascript are completely different from those you apply for inclusion in LDIF are completely different from those you apply to inclusion in CSS are completely different from those you apply to inclusion in an Email....
By all means validate input - decide whether you should accept it for further processing or tell the user it is unacceptable. But don't apply any change to representation of the data until it is about to leave PHP land.
A long time ago someone tried to invent a one-size fits all mechanism for escaping data and we ended up with "magic_quotes" which didn't properly escape data for all output targets and resulted in different installation requiring different code to work.
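A small sketch of the point about context-specific transforms, in JavaScript rather than PHP. The raw value is kept unchanged and is only encoded at the moment it is placed into a URL, a JSON document or HTML; the variable names and the inline HTML escaper are assumptions for illustration.

```javascript
// The stored value stays exactly as the user entered it.
const raw = 'Tom & "Jerry" <b>';

// URL context: percent-encoding.
const forUrl = encodeURIComponent(raw);

// JSON context: quoting and backslash escapes.
const forJson = JSON.stringify(raw);

// HTML context: entity escaping of the characters that are special in markup.
const htmlEntities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' };
const forHtml = raw.replace(/[&<>"']/g, (ch) => htmlEntities[ch]);

console.log(forUrl);
console.log(forJson);
console.log(forHtml);
```

The three results differ from one another, which is why a single up-front transformation cannot be correct for every destination.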
The easiest way to avoid mistakes in sanitizing input and escaping data is to use a PHP framework like Symfony or Nette, or part of such a framework (templating engine, database layer, ORM).
Templating engines like Twig or Latte have output escaping on by default, so you don't have to check manually whether you have properly escaped your output for its context (HTML or JavaScript part of a web page).
The framework sanitizes input automatically, and you shouldn't use the $_POST, $_GET or $_SESSION variables directly, but through mechanisms like routing, session handling etc.
And for database (model) layer there are ORM frameworks like Doctrine or wrappers around PDO like Nette Database.
You can read more about it here - What is a software framework?
Just wanted to add that on the subject of output escaping, if you use php DOMDocument to make your html output it will automatically escape in the right context. An attribute (value="") and the inner text of a <span> are not equal.
To be safe against XSS read this:
OWASP XSS Prevention Cheat Sheet
The PHP filter extension has many of the functions needed for checking external user input, and it is designed to make data sanitization easier and quicker.
PHP filters can comfortably sanitize & validate the external input.
