CQRS Thin Read Layer: Where does full text search fit in? - domain-driven-design

I Use a CQRS thin read layer to provide denormalized lists/reporting data for the UI.
In some parts of my application I want to provide a search box so the user can filter through the data.
Lucene.NET is my full text search engine of choice at the moment, as I've implemented it before and am very happy with it.
But where does the searching side of things fit in with CQRS?
I see two options but there are probably more.
1] My Controller can pass the search string to a search layer (Lucene.NET) which returns a list of ID that I can then pass to the CQRS read layer. The read layer will take these IDs and assemble them into a WHERE ID IN (1,2,3) clause, ultimately returning a DataTable or IEnumerable back to the controller.
List<int> ids = searchLayer.SearchCustomers("searchString");
result = readLayer.GetCustomers(ids);
2] My thin read layer can have searching coded directly into it, so I just call
readLayer.GetListOfCustomers("search string", page, page1);

Remember that using CQRS doesn't mean that you use it in every part of your application. Slicing an application into smaller components allows using various architectural principles and patterns as they make sense. A full text search API might be one of these components.

Without knowing all the details of your application, I would lean towards a single entry point from your controller into your search layer (your option #2 above). I think your controller shouldn't know that it needs to call layer #1 for full-text enabled searching and layer #2 for just regular WHERE clause type searching.
I would assume you would have two different 'contexts' (e.g. SQLContext and LuceneContext), and these would be dependencies injected into your ReadLayer. Your read layer logic should then make the decision on when to use LuceneContext and when to use SQLContext; your controller won't know and shouldn't know.
This also allows you to swap out LuceneContext for MongoContext if there's a compelling reason in the future without your controller knowing or having to change.
And if you use interfaces for everything (e.g. ISearchContext is implemented by LuceneContext and MongoContext), then really the only change for swapping out contexts is in your IoC container's initialization/rules.
So, go with option #2, inject in your dependencies, have your controller just work through your read layer, and you should be good to go. Hopefully this helps!

Related

REST API-server dilemma

I plan to build a server that will use a REST API on NODE.JS with Meteor.
What are the differences between these two methods of writing an API:
1.http://meteorpedia.com/read/REST_API
example: someSrver.com/post/:_id
someSrver.com/post?id=_id
Thanks
I think there's no really difference at least you are handling request correctly.
I think the second style suit better if you have to pass server a lot of parameters for example a service that provide high res image, where you can specify tile size, coordinates and other things like that.
If you use api as an interface to database usually first style is used.
I've develop a rest interface using this library: https://atmospherejs.com/nimble/restivus , it's very easy to use and it use first style.
So, you should read up on REST principles and API design if you want to really understand the why's.
But generally, the rule of thumb is that a URL should represent a resource. The paths generally represent a given "thing" and the querystring represents some kind of filter on that "thing".
So if you "post" object is logically its own entity (e.g.a blog post), then it'd have its own unique url such that a GET to www.example.com/posts/:id would return the one specific blog entry you're talking about.
GET /posts would map to a list of all posts, for example, and GET /posts?tagged=cheetahs would get you a list of all posts filtered to return just those with the tag of 'cheetahs' assigned to them.
This is all rules of thumb and standards. The implementation really doesn't matter and most servers don't care; but there is value in following the standards as they tend to be more maintainable, elegant, and help you not have to make a million design decisions. If you ever want others to integrate with you, it makes it easier for them to know what to expect as well.
According to the URI standard, the query is for non-hierarchical filters while, the path is for hierarchical ones.
I would use query if we are talking about a filtered collection, so the result will be a representation of a collection, for example json []. On the other hand if we are talking about an item, then I would use the path and the json would be an object/item {}. But this is only my own style, you can use which one you prefer. (The URI structure has only routing purposes if you use REST with HATEOAS. I assume you don't.)
In REST the URL/URI is the address of an item or a collection of items. So, to get all addresses for customer 2 you could do this:
/api/customer/2/addresses
If however you wanted just those addresses with a postcode you could go:
/api/customer/2/addresses?withPostcode=1
In this case, the first URL/URI represents a thing/things whereas the second has a modifier, restriction or filter applied to it.
Therefore someSrver.com/post/:_id means get me the post which is known by that ID (though ideally it would be someSrver.com/posts/:_id - note the plural). Whereas the second one (someSrver.com/post?id=_id) implies that everything to the left of the question mark has already narrowed down your thing/things and they now need filtering by an ID property (in this case) on the thing.
It's a subtle distinction in many cases, but I'd sum it up as the first applying a selector/location and the second applying a selector/location with a filter.
Although I didn't implement REST API server in node yet, I want to share with you few important points when you design your server:
Try to use flat paths for the controllers, nested paths are causing confusion.
Avoid custom HTTP methods such as PUT, HEAD and PATCH, not all firewalls like them.
Use applicative error codes with HTTP 200 OK and use the HTTP protocol error codes only for HTTP errors.
See more: http://restafar.com/create-new-rest-server/

DDD: Using Value Objects inside controllers?

When you receive arguments in string format from the UI inside you controller, do you pass strings to application service (or to command) directly ?
Or, do you create value objects from the strings inside the controller ?
new Command(new SomeId("id"), Weight.create("80 kg"), new Date())
or
new Command("id", "80 kg", new Date())
new Command("id", "80", "kg", new Date())
Maybe it is not important, but it bothers me.
The question is, should we couple value objects from the domain to (inside) the controller ?
Imagine you don't have the web between you application layer and the presentation layer (like android activity or swing), would you push the use of value objects in the UI ?
Another thing, do you serialize/unserialize value objects into/from string like this ?
Weight weight = Weight.create("80 kg");
weight.getValue().equals(80.0);
weight.getUnit().equals(Unit.KILOGRAMS);
weight.toString().equals("80 kg");
In the case of passing strings into commands, I would rather pass "80 kg" instead of "80" and "kg".
Sorry if the question is not relevant or funny.
Thank you.
UPDATE
I came across that post while I was searching information about a totally different topic : Value Objects in CQRS - where to use
They seem to prefer primitives or DTOs, and keep VOs inside the domain.
I've also taken a look at the book of V. Vernon (Implementing DDD), and it talks about (exactly -_-) that in chapter 14 (p. 522)
I've noticed he's using commands without any DTOs.
someCommand.setId("id");
someCommand.setWeightValue("80");
someCommand.setWeightUnit("kg");
someCommand.setOtherWeight("80 kg");
someCommand.setDate("17/03/2015 17:28:35");
someCommand.setUserName("...");
someCommand.setUserAttribute("...");
someCommand.setUserOtherAttributePartA("...");
someCommand.setUserOtherAttributePartB("...");
It is the command object that would be mapped by the controller. Value objects initialization would appeare in the command handler method, and they would throw something in case of bad value (self validation in initialization).
I think I'm starting to be less bothered, but some other opinions would be welcomed.
As an introduction, this is highly opinionated and I'm sure everyone has different ideas on how it should work. But my endeavor here is to outline a strategy with some good reasons behind it so you can make your own evaluation.
Pass Strings or Parse?
My personal preference here is to parse everything in the Controller and send down the results to the Service. There are two main phases to this approach, each of which can spit back error conditions:
1. Attempt to Parse
When a bunch of strings come in from the UI, I think it makes sense to attempt to interpret them immediately. For easy targets like ints and bools, these conversions are trivial and model binders for many web frameworks handle them automatically.
For more complex objects like custom classes, it still makes sense to handle it in this location so that all parsing occurs in the same location. If you're in a framework which provides model binding, much of this parsing is probably done automatically; if not - or you're assembling a more complex object to be sent to a service - you can do it manually in the Controller.
Failure Condition
When parsing fails ("hello" is entered in an int field or 7 is entered for a bool) it's pretty easy to send feedback to the user before you even have to call the service.
2. Validate and Commit
Even though parsing has succeeded, there's still the necessity to validate that the entry is legitimate and then commit it. I prefer to handle validation in the service level immediately prior to committing. This leaves the Controller responsible for parsing and makes it very clear in the code that validation is occurring for every piece of data that gets committed.
In doing this, we can eliminate an ancillary responsibility from the Service layer. There's no need to make it parse objects - its single purpose is to commit information.
Failure Condition
When validation fails (someone enters an address on the moon, or enters a date of birth 300 years in the past), the failure should be reported back up to the caller (Controller, in this case). While the user probably makes no distinction between failure to parse and failure to validate, it's an important difference for the software.
Push Value Objects to UI?
I would accept parsed objects as far up the stack as possible, every time. If you can have someone else's framework handle that bit of transformation, why not do it? Additionally, the closer to the UI that the objects can live, the easier it is to give good, quick feedback to the user about what they're doing.
A Note on Coupling
Overall, pushing objects up the stack does result in greater coupling. However, writing software for a particular domain does involve being tightly coupled to that domain, whatever it is. If a few more components are tightly coupled to some concepts that are ubiquitous throughout the domain - or at least to the API touchpoints of the service being called - I don't see any real reduction in architectural integrity or flexibility occurring.
Parse One Big String or Components?
In general, it tends to be easiest to just pass the entire string into the Parse() method to get sorted through. Take your example of "80 kg":
"80 kg" and "120 lbs" may both be valid weight inputs
If you're passing in strings to a Parse() method, it's probably doing some fairly heavy lifting anyway. Expecting it to split a string based on a space is not overbearing.
It's far easier to call Weight.create(inputString) than it is to split inputString by " ", then call Weight.create(split[0], split[1]).
It's easier to maintain a single-string-input Parse() function as well. If some new requirement comes in that the Weight class has to support pounds and ounces, a new valid input may be "120 lbs 6 oz". If you're splitting up the input, you now need four arguments. Whereas if it's entirely encapsulated within the Parse() logic, there's no burden to outside consumers. This makes the code more extensible and flexible.
The difference between a DTO and a VO is that a DTO has no behavior, it's a simple container designed to pass data around from component to component. Besides, you rarely need to compare two DTO's and they are generally transient.
A Value Object can have behavior. Two VO's are compared by value rather than reference, which means for instance two Address value objects with the same data but that are different object instances will be equal. This is useful because they are generally persisted in one form or another and there are more occasions to compare them.
It turns out that in a DDD application, VO's will be declared and used in your Domain layer more often than not since they belong to the domain's Ubiquitous Language and because of separation of concerns. They can sometimes be manipulated in the Application layer but typically won't be sent between the UI layer and Application. We use DTO's for that instead.
Of course, this is debatable and depends a lot on the layers you choose to build your application out of. There might be cases when crunching your layered architecture down to 2 layers will be beneficial, and when using business objects directly in the UI won't be that bad.

Meaning of Leaky Abstraction?

What does the term "Leaky Abstraction" mean? (Please explain with examples. I often have a hard time grokking a mere theory.)
Here's a meatspace example:
Automobiles have abstractions for drivers. In its purest form, there's a steering wheel, accelerator and brake. This abstraction hides a lot of detail about what's under the hood: engine, cams, timing belt, spark plugs, radiator, etc.
The neat thing about this abstraction is that we can replace parts of the implementation with improved parts without retraining the user. Let's say we replace the distributor cap with electronic ignition, and we replace the fixed cam with a variable cam. These changes improve performance but the user still steers with the wheel and uses the pedals to start and stop.
It's actually quite remarkable... a 16 year old or an 80 year old can operate this complicated piece of machinery without really knowing much about how it works inside!
But there are leaks. The transmission is a small leak. In an automatic transmission you can feel the car lose power for a moment as it switches gears, whereas in CVT you feel smooth torque all the way up.
There are bigger leaks, too. If you rev the engine too fast, you may do damage to it. If the engine block is too cold, the car may not start or it may have poor performance. And if you crank the radio, headlights, and AC all at the same time, you'll see your gas mileage go down.
It simply means that your abstraction exposes some of the implementation details, or that you need to be aware of the implementation details when using the abstraction. The term is attributed to Joel Spolsky, circa 2002. See the wikipedia article for more information.
A classic example are network libraries that allow you to treat remote files as local. The developer using this abstraction must be aware that network problems may cause this to fail in ways that local files do not. You then need to develop code to handle specifically errors outside the abstraction that the network library provides.
Wikipedia has a pretty good definition for this
A leaky abstraction refers to any implemented abstraction, intended to reduce (or hide) complexity, where the underlying details are not completely hidden
Or in other words for software it's when you can observe implementation details of a feature via limitations or side effects in the program.
A quick example would be C# / VB.Net closures and their inability to capture ref / out parameters. The reason they cannot be captured is due to an implementation detail of how the lifting process occurs. This is not to say though that there is a better way of doing this.
Here's an example familiar to .NET developers: ASP.NET's Page class attempts to hide the details of HTTP operations, particularly the management of form data, so that developers don't have to deal with posted values (because it automatically maps form values to server controls).
But if you wander beyond the most basic usage scenarios the Page abstraction begins to leak and it becomes hard to work with pages unless you understand the class' implementation details.
One common example is dynamically adding controls to a page - the value of dynamically-added controls won't be mapped for you unless you add them at just the right time: before the underlying engine maps the incoming form values to the appropriate controls. When you have to learn that, the abstraction has leaked.
Well, in a way it is a purely theoretical thing, though not unimportant.
We use abstractions to make things easier to comprehend. I may operate on a string class in some language to hide the fact that I'm dealing with an ordered set of characters that are individual items. I deal with an ordered set of characters to hide the fact that I'm dealing with numbers. I deal with numbers to hide the fact that I'm dealing with 1s and 0s.
A leaky abstraction is one that doesn't hide the details its meant to hide. If call string.Length on a 5-character string in Java or .NET I could get any answer from 5 to 10, because of implementation details where what those languages call characters are really UTF-16 data-points which can represent either 1 or .5 of a character. The abstraction has leaked. Not leaking it though means that finding the length would either require more storage space (to store the real length) or change from being O(1) to O(n) (to work out what the real length is). If I care about the real answer (often you don't really) you need to work on the knowledge of what is really going on.
More debatable cases happen with cases like where a method or property lets you get in at the inner workings, whether they are abstraction leaks, or well-defined ways to move to a lower level of abstraction, can sometimes be a matter people disagree on.
I'll continue in the vein of giving examples by using RPC.
In the ideal world of RPC, a remote procedure call should look like a local procedure call (or so the story goes). It should be completely transparent to the programmer such that when they call SomeObject.someFunction() they have no idea if SomeObject (or just someFunction for that matter) are locally stored and executed or remotely stored and executed. The theory goes that this makes programming simpler.
The reality is different because there's a HUGE difference between making a local function call (even if you're using the world's slowest interpreted language) and:
calling through a proxy object
serializing your parameters
making a network connection (if not already established)
transmitting the data to the remote proxy
having the remote proxy restore the data and call the remote function on your behalf
serializing the return value(s)
transmitting the return values to the local proxy
reassembling the serialized data
returning the response from the remote function
In time alone that's about three orders (or more!) of magnitude difference. Those three+ orders of magnitude are going to make a huge difference in performance that will make your abstraction of a procedure call leak rather obviously the first time you mistakenly treat an RPC as a real function call. Further a real function call, barring serious problems in your code, will have very few failure points outside of implementation bugs. An RPC call has all of the following possible problems that will get slathered on as failure cases over and above what you'd expect from a regular local call:
you might not be able to instantiate your local proxy
you might not be able to instantiate your remote proxy
the proxies may not be able to connect
the parameters you send may not make it intact or at all
the return value the remote sends may not make it intact or at all
So now your RPC call which is "just like a local function call" has a whole buttload of extra failure conditions you don't have to contend with when doing local function calls. The abstraction has leaked again, even harder.
In the end RPC is a bad abstraction because it leaks like a sieve at every level -- when successful and when failing both.
What is abstraction?
Abstraction is a way of simplifying the world.
It means you don't have to worry about what is actually happening under the hood.
Example: Flying a 737/747 is "abstracted" away
Planes are complicated systems: it involves: jet engines, oxygen systems, electrical systems, landing gear systems etc.
...but the pilot doesn't have to worry about it... all of that is "abstracted away". The only thing a pilot needs to focus on is yoke (i.e. steering wheel of the plane).
He pushes the yoke left to go left, and right to go right etc.
....that is in an ideal world. In reality, flying a plane is much more complicated. Because many details ARE NOT "abstracted away".
Leaky Abstractions in 737 Example
Pilots in reality have to worry about a LOT of things: wind speed, thrust, angles of attack, fuel, altitude, weather problems, angles of descent. Computers can help the pilot in these tasks, but not everything is automated / simplified......not everything is "abstracted away".
e.g. If the pilot pulls up too hard on the column - the plane will obey, but then the plane might stall, and that's really bad.
In other words, it is not enough for the pilot to simply control the steering wheel without knowing anything else.........nooooo.......the pilot must know about the underlying risks and limitations of the plane before the pilot flies one.......the pilot must know how the plane works, and how the plane flies; the pilot must know implementation details..... that pulling up too hard will lead to a stall, or that landing too steeply will destroy the plane etc.
Those things are not abstracted away. A lot of things are abstracted, but not everything. The abstraction is "leaky".
Leaky Abstractions in Code
......it's the same thing in your code. If you don't know the underlying implementation details, then you're gonna have problems.
ORMs abstract a lot of the hassle in dealing with database queries, but if you've ever done something like:
User.all.each do |user|
puts user.name # let's print each user's name
end
Then you will realise that's a nice way to kill your app. You need to know that calling User.allwith 25 million users is going to spike your memory usage, and is going to cause problems. You need to know some underlying details. The abstraction is leaky.
An example in the django ORM many-to-many example:
Notice in the Sample API Usage that you need to .save() the base Article object a1 before you can add Publication objects to the many-to-many attribute. And notice that updating the many-to-many attribute saves to the underlying database immediately, whereas updating a singular attribute is not reflected in the db until the .save() is called.
The abstraction is that we are working with an object graph, where single-value attributes and mult-value attributes are just attributes. But the implementation as a relational database backed data store leaks... as the integrity system of the RDBS appears through the thin veneer of an object interface.
The fact that at some point, which will guided by your scale and execution, you will be needed to get familiar with the implementation details of your abstraction framework in order to understand why it behave that way it behave.
For example, consider this SQL query:
SELECT id, first_name, last_name, age, subject FROM student_details;
And its alternative:
SELECT * FROM student_details;
Now, they do look like a logically equivalent solutions, but the performance of the first one is better due the individual column names specification.
It's a trivial example but eventually it comes back to Joel Spolsky quote:
All non-trivial abstractions, to some degree, are leaky.
At some point, when you will reach a certain scale in your operation, you will want to optimize the way your DB (SQL) works. To do it, you will need to know the way relational databases works. It was abstracted to you in the beginning, but it's leaky. You need to learn it at some point.
Assume, we have the following code in a library:
Object[] fetchDeviceColorAndModel(String serialNumberOfDevice)
{
//fetch Device Color and Device Model from DB.
//create new Object[] and set 0th field with color and 1st field with model value.
}
When the consumer calls the API, they get an Object[]. The consumer has to understand that the first field of the object array has color value and second field is the model value. Here the abstraction has leaked from library to the consumer code.
One of the solutions is to return an object which encapsulates Model and Color of the Device. The consumer can call that object to get the model and color value.
DeviceColorAndModel fetchDeviceColorAndModel(String serialNumberOfTheDevice)
{
//fetch Device Color and Device Model from DB.
return new DeviceColorAndModel(color, model);
}
Leaky abstraction is all about encapsulating state. very simple example of leaky abstraction:
$currentTime = new DateTime();
$bankAccount1->setLastRefresh($currentTime);
$bankAccount2->setLastRefresh($currentTime);
$currentTime->setTimestamp($aTimestamp);
class BankAccount {
// ...
public function setLastRefresh(DateTimeImmutable $lastRefresh)
{
$this->lastRefresh = $lastRefresh;
} }
and the right way(not leaky abstraction):
class BankAccount
{
// ...
public function setLastRefresh(DateTime $lastRefresh)
{
$this->lastRefresh = clone $lastRefresh;
}
}
more description here.

Freebase: Format search result to list all properties of object of unknown type(s)

I'm trying to write a MQL query to format a search result in freebase (the "output" parameter in the search API). I essentially want to find the (simple) values of all the properties of a given search result (without knowing anything about the types of the result a priori). By "simple", I mean only the default properties if the values are complex objects.
E.g., if I search for "Yo La Tengo" and this takes me to the result for "/en/yo_la_tengo", I want to be able to get the group's members (I just need names, not instruments or dates started), albums (again, just names), films contributed to (again, just names), etc.
Is there a simple way to do this with a search output query, given that I know nothing about the types? I imagine there's some sort of reflection magic I can use, and I've tried mucking about with "/type/reflect", but I'm not getting anywhere. I'm brand-new to MQL (though I have extensive SQL experience), so this is a little daunting. Any ideas?
Edit: So to clarify, I think the problem I'm seeing is due to mediator types like "performance" (an actor in a film) or "marriage". E.g., with a query about Yo La Tengo, I can see most (all?) information that I'm interested in, but a similar query about [The Muppet Movie]( freebase.com/api/service/search?limit=1&mql_output=%5B%7B%22%2Ftype%2Freflect%2Fany_reverse%22%3A%5B%7B%7D%5D%2C%22%2Ftype%2Freflect%2Fany_master%22%3A%5B%7B%7D%5D%2C%22%2Ftype%2Freflect%2Fany_value%22%3A%5B%7B%7D%5D%7D%5D&query=The%20Muppet%20Movie -- sorry, SO thinks I'm a spammer so I can't make this a link), I don't see Frank Oz reference at all (probably because his performance is referenced instead). Is there a generic way for me to "follow" mediator types to get all their properties? E.g., is there a single output MQL that would allow me to get the actor in a performance (when linked form a film search result) and give the the spouse in a marriage (when linked from a person)?
Querying not only every property, but then following those properties another ply deep in the graph for all search results is going to be an incredibly expensive operation. What is the use case for this? Do you really have a UI where the user can see and effectively absorb all this information? To answer the question directly though, it's not possible to unpack mediator types automatically using mql_output on the search API.
I'd suggest combining a basic set of information on the search query with a deeper set of information on a topic that the user has expressed interest (e.g. by hovering over). This UI experience would be similar to that of Freebase Suggest.
In the years since the question was originally asked there have been some additional useful things added such as the "notable" pseudo-property which lets you see what the topic is notable for.
Of course everyone also needs to be moving to the new API, so the queries would be:
https://www.googleapis.com/freebase/v1/search?query=%22the%20muppet%20movie%22&limit=1&indent=true
https://www.googleapis.com/freebase/v1/topic/en/the_muppet_movie
AFAIK there is no way to do this in outright MQL, but you can:
Get all the properties of an object or type of object, then
Programmatically construct another MQL query to get those objects you want to know more about.
Look at this example:
[{
"type|=": [
"/film/actor",
"/tv/tv_actor",
"/celebrities/celebrity"
],
"*": [{}]
}]​
It grabs all the properties of all objects that have the type actor, tv_actor, or celebrity. When you run it, you'll see all the possible "follow" points you can explore.
This is not exactly what you want, but it should get you closer.

standard way of encoding pagination info on a restful url get?

I think my question would be better explained with a couple of examples...
GET http://myservice/myresource/?name=xxx&country=xxxx&_page=3&_page_len=10&_order=name asc
that is, on the one hand I have conditions ( name=xxx&country=xxxx ) and on the other hand I have parameters affecting the query ( _page=3&_page_len=10&_order=name asc )
now, I though about using some special prefix ( "_" in thes case ) to avoid collisions between conditions and parameters ( what if my resourse has an "order" property? )
is there some standard way to handle these situations?
--
I found this example (just to pick one)
http://www.peej.co.uk/articles/restfully-delicious.html
GET http://del.icio.us/api/peej/bookmarks/?tag=mytag&dt=2009-05-30&start=1&end=2
but in this case condition fields are already defined (there is no start nor end property)
I'm looking for some general solution...
--
edit, a more detailed example to clarify
Each item is completely indepent from one another... let's say that my resources are customers, and that (luckily) I have a couple millions of them in my db.
so the url could be something like
http://myservice/customers/?country=argentina,last_operation=2009-01-01..2010-01-01
It should give me all the customers from argentina that bought anything in the last year
Now I'd like to use this service to build a browse page, or to fill a combo with ajax, for example, so the idea was to add some metada to control what info should I get
to build the browse page I would add
http://...,_page=1,_page_len=10,_order=state,name
and to fill an autosuggest combo with ajax
http://...,_page=1,_page_len=100,_order=state,name,name=what_ever_type_the_user*
to fill the combo with the first 100 customers matching what the user typed...
my question was if there was some standard (written or not) way of encoding this kind of stuff in a restfull url manner...
While there is no standard, Web API Design (by Apigee) is a great book of advice when creating Web APIs. I treat it as a sort of standard, and follow its recommendations whenever I can.
Under "Pagination and partial response" they suggest (page 17):
Use limit and offset
We recommend limit and offset. It is more common, well understood in leading databases, and easy for developers.
/dogs?limit=25&offset=50
There's no standard or convention which defines a way to do this, but using underscores (one or two) to denote meta-info isn't a bad idea. This is what's used to specify member variables by convention in some languages.
Note:
I started writing this as a comment to my previous answer. Then I was going to add it as an edit, but I think that it belongs as a separate answer instead. This is a completely different approach and a separate answer in its own right since it is a different approach.
The more that I have been thinking about this, I think that you really have two different resources that you have to deal with:
A page of resources
Each resource that is collected into the page
I may have missed something (could be... I've been guilty of misinterpretation). Since a page is a resource in its own right, the paging meta-information is really an attribute of the resource so placing it in the URL isn't necessarily the wrong approach. If you consider what can be cached downstream for a page and/or referred to as a resource in the future, the resource is defined by the paging attributes and the query parameters so they should both be in the URL. To continue with my entirely too lengthy response, the page resource would be something like:
http://.../myresource/page-10/3?name=xxx&country=yyy&order=name&orderby=asc
I think that this gets to the core of your original question. If the page itself is a resource, then the URI should describe the page so something like page-10 is my way of saying "a page of 10 items" and the next portion of the page is the page number. The query portion contains the filter.
The other resource names each item that the page contains. How the items are identified should be controlled by what the resources are. I think that a key question is whether the result resources stand on their own or not. How you represent the item resources differs based on this concept.
If the item representations are only appropriate when in the context of the page, then it might be appropriate to include the representation inline. If you do this, then identify them individually and make sure that you can retrieve them using either URI fragment syntax or an additional path element. It seems that the following URLs should result in the fifth item on the third page of ten items:
http://.../myresource/page-10/3?...#5
http://.../myresource/page-10/3/5?...
The largest factor in deciding between these two is how strongly coupled the individual item is with the page. The fragment syntax is considerably more binding than the path element IMHO.
Now, if the item resources are free-standing and the page is simply the result of a query (which I think is likely the case here), then the page resource should be an ordered list of URLs for each item resource. The item resource should be independent of the page resource in this case. You might want to use a URI that is based on the identifying attribute of the item itself. So you might end up with something like:
http://.../myresource/item/42
http://.../myresource/item/307E8599-AD9B-4B32-8612-F8EAF754DFDB
The key deciding factor is whether the items are freestanding resources or not. If they are not, then they are derived from the page URI. If they are freestanding, then they should have their are defined by their own resources and should be included in the page resource as links instead.
I know that the RESTful folk tend to dislike the usage of HTTP headers, but has anyone actually looked into using the HTTP ranges to solve pagination. I wrote a ISAPI extension a few years back that included pagination information along with other non-property information in the URI and I never really like the feel of it. I was thinking about doing something like:
GET http://...?name=xxx&country=xxxx&_orderby=name&_order=asc HTTP/1.1
Range: pageditems=20-29
...
This puts the result set parameters (e.g., _orderby and _order) in the URI and the selection as a Range header. I have a feeling that most HTTP implementations would screw this up though especially since support for non-byte ranges is a MAY in RFC2616. I started thinking more seriously about this after doing a bunch of work with RTSP. The Range header in RTSP is a nice example of extending ranges to handle time as well as bytes.
I guess another way of handling this is to make a separate request for each item on the page as an individual resource in its own right. If your representation allows for this, then you might want to consider it. It is more likely that intermediate caching would work very well with this approach. So your resources would be defined as:
myresource/name=xxx;country=xxx/orderby=name;order=asc/20/
myresource/name=xxx;country=xxx/orderby=name;order=asc/21/
myresource/name=xxx;country=xxx/orderby=name;order=asc/22/
myresource/name=xxx;country=xxx/orderby=name;order=asc/23/
myresource/name=xxx;country=xxx/orderby=name;order=asc/24/
I'm not sure if anyone has tried something like this or not. This would make URIs constructible which is always a useful property IMHO. The bonus to this approach is that the individual responses could be cached and the server is free to optimize handling of collecting pages of items and what not in the most efficient way. The basic idea is to have the client specify the query in the URI and the index of them item that it wants to retrieve. No need to push the idea of a "page" into the resource or even to make it visible. The client can iteratively retrieve objects until it's page is full or it receives a 404.
There is a downside of course... the HTTP server and infrastructure has to support pipelining or the cost of creation/destruction of connections might kill the idea outright.

Resources