We want to implement a character counter in our Javascript data entry form, so the user gets immediate keystroke feedback as to how many characters he has typed and how many he has left (something like "25/100", indicating current string length is 25 and 100 is the max allowed).
To do this, I would like to write a service that returns a list of dto property names and their max allowed lengths.
{Name='SmallComment', MaxLength=128}
{Name='BigComment', MaxLength=512}
The best way I can think of to do this would be to create an instance of the validator for that dto and iterate through it to pull out the .Length(min,max) rules. I had other ideas as well, like storing the max lengths in an attribute, but this would require rewriting all the validators to set up the rules based on the attributes.
Whatever solution is best, the goal is to store the max length for each property in a single place, so that changing that length affects the validation rule and the service data passed down to the javascript client.
If you want to maintain a single source of reference for both client/server I would take a metadata approach and provide a Service that returns the max lengths to the client for all types, something like:
public class ValidationMetadataServices : Service
{
public object Any(GetFieldMaxLengths request)
{
return new GetFieldMaxLengthsResponse {
Type1 = GetFieldMaxLengths<Type1>(),
Type2 = GetFieldMaxLengths<Type2>(),
Type3 = GetFieldMaxLengths<Type3>(),
};
}
static Dictionary<string,int> GetFieldMaxLengths<T>()
{
var to = new Dictionary<string,int>();
typeof(T).GetPublicProperties()
.Where(p => p.FirstAttribute<StringLengthAttribute>() != null)
.Each(p => to[p.PropertyName] =
p.FirstAttribute<StringLengthAttribute>().MaximumLength);
return to;
}
}
But FluentValidation uses Static properties so that would require manually specifying a rule for each property that validates against the length from the property metadata attribute.
Related
Suppose I have two tables USER_GROUP and USER_GROUP_DATASOURCE. I have a classic relation where one userGroup can have multiple dataSources and one DataSource simply is a String.
Due to some reasons, I have a custom RecordMapper creating a Java UserGroup POJO. (Mainly compatibility with the other code in the codebase, always being explicit on whats happening). This mapper sometimes creates simply POJOs containing data only from the USER_GROUP table, sometimes also the left joined dataSources.
Currently, I am trying to write the Multiset query along with the custom record mapper. My query thus far looks like this:
List<UserGroup> = ctx
.select(
asterisk(),
multiset(select(USER_GROUP_DATASOURCE.DATASOURCE_ID)
.from(USER_GROUP_DATASOURCE)
.where(USER_GROUP.ID.eq(USER_GROUP_DATASOURCE.USER_GROUP_ID))
).as("datasources").convertFrom(r -> r.map(Record1::value1))
)
.from(USER_GROUP)
.where(condition)
.fetch(new UserGroupMapper()))
Now my question is: How to create the UserGroupMapper? I am stuck right here:
public class UserGroupMapper implements RecordMapper<Record, UserGroup> {
#Override
public UserGroup map(Record rec) {
UserGroup grp = new UserGroup(rec.getValue(USER_GROUP.ID),
rec.getValue(USER_GROUP.NAME),
rec.getValue(USER_GROUP.DESCRIPTION)
javaParseTags(USER_GROUP.TAGS)
);
// Convention: if we have an additional field "datasources", we assume it to be a list of dataSources to be filled in
if (rec.indexOf("datasources") >= 0) {
// How to make `rec.getValue` return my List<String>????
List<String> dataSources = ?????
grp.dataSources.addAll(dataSources);
}
}
My guess is to have something like List<String> dataSources = rec.getValue(..) where I pass in a Field<List<String>> but I have no clue how I could create such Field<List<String>> with something like DSL.field().
How to get a type safe reference to your field from your RecordMapper
There are mostly two ways to do this:
Keep a reference to your multiset() field definition somewhere, and reuse that. Keep in mind that every jOOQ query is a dynamic SQL query, so you can use this feature of jOOQ to assign arbitrary query fragments to local variables (or return them from methods), in order to improve code reuse
You can just raw type cast the value, and not care about type safety. It's always an option, evne if not the cleanest one.
How to improve your query
Unless you're re-using that RecordMapper several times for different types of queries, why not do use Java's type inference instead? The main reason why you're not getting type information in your output is because of your asterisk() usage. But what if you did this instead:
List<UserGroup> = ctx
.select(
USER_GROUP, // Instead of asterisk()
multiset(
select(USER_GROUP_DATASOURCE.DATASOURCE_ID)
.from(USER_GROUP_DATASOURCE)
.where(USER_GROUP.ID.eq(USER_GROUP_DATASOURCE.USER_GROUP_ID))
).as("datasources").convertFrom(r -> r.map(Record1::value1))
)
.from(USER_GROUP)
.where(condition)
.fetch(r -> {
UserGroupRecord ug = r.value1();
List<String> list = r.value2(); // Type information available now
// ...
})
There are other ways than the above, which is using jOOQ 3.17+'s support for Table as SelectField. E.g. in jOOQ 3.16+, you can use row(USER_GROUP.fields()).
The important part is that you avoid the asterisk() expression, which removes type safety. You could even convert the USER_GROUP to your UserGroup type using USER_GROUP.convertFrom(r -> ...) when you project it:
List<UserGroup> = ctx
.select(
USER_GROUP.convertFrom(r -> ...),
// ...
I'm new to JOOQ and currently fail to map a joined query to Map<K, List<V>>: the list always only contains one element.
Here's my code:
DSL.using(...)
.select(ORDER.fields())
.select(ORDER_ITEM_ARTICLE.fields())
.from(ORDER)
.leftOuterJoin(ORDER_ITEM_ARTICLE).on(ORDER.ID.eq(ORDER_ITEM_ARTICLE.ORDER_ID))
// to Map<InOutOrder, List<OrderItemArticle>>
.fetchGroups(
r -> r.into(ORDER).into(InOutOrder.class),
r -> r.into(ORDER_ITEM_ARTICLE).into(OrderItemArticle.class)
)
// map to InOutOrder
.entrySet().stream().map( e -> {
// e.getValue() always returns list with only 1 element?!
e.getKey().articles = e.getValue();
return e.getKey();
})
.collect(Collectors.toList())
;
Say I have 1 row in ORDER and 2 corresponding rows in ORDER_ITEM_ARTICLE. Running the SQL returned by .getSQL() (after .fetchGroups()), returns me 2 rows as expected, so I assumed the fetchGroups() call will populate my list with two entries as well?!
What am I missing?
Thanks!
Update:
As requested, the InOutOrder class:
public class InOutOrder extends Order {
public List<OrderItemArticle> articles;
public List<OrderItemOther> others;
public List<OrderItemCost> costs;
public List<OrderContact> contacts;
public List<EmailJob> emailJobs;
}
So this is just an extension of the JOOQ POJO class and is used for JSON communication with the API clients...
fetchGroups() simply puts objects in a LinkedHashMap. You have to adhere to the usual Map contract, which means implementing equals() and hashCode(). Without it, each object you're creating (or which jOOQ is creating for you) will use identity comparison, so you get every "value" only once in the result.
I have the following c# code:
private XElement BuildXmlBlob(string id, Part part, out int counter)
{
// return some unique xml particular to the parameters passed
// remember to increment the counter also before returning.
}
Which is called by:
var counter = 0;
result.AddRange(from rec in listOfRecordings
from par in rec.Parts
let id = GetId("mods", rec.CKey + par.UniqueId)
select BuildXmlBlob(id, par, counter));
Above code samples are symbolic of what I am trying to achieve.
According to the Eric Lippert, the out keyword and linq does not mix. OK fair enough but can someone help me refactor the above so it does work? A colleague at work mentioned accumulator and aggregate functions but I am novice to Linq and my google searches were bearing any real fruit so I thought I would ask here :).
To Clarify:
I am counting the number of parts I might have which could be any number of them each time the code is called. So every time the BuildXmlBlob() method is called, the resulting xml produced will have a unique element in there denoting the 'partNumber'.
So if the counter is currently on 7, that means we are processing 7th part so far!! That means XML returned from BuildXmlBlob() will have the counter value embedded in there somewhere. That's why I need it somehow to be passed and incremented every time the BuildXmlBlob() is called per run through.
If you want to keep this purely in LINQ and you need to maintain a running count for use within your queries, the cleanest way to do so would be to make use of the Select() overloads that includes the index in the query to get the current index.
In this case, it would be cleaner to do a query which collects the inputs first, then use the overload to do the projection.
var inputs =
from recording in listOfRecordings
from part in recording.Parts
select new
{
Id = GetId("mods", recording.CKey + part.UniqueId),
Part = part,
};
result.AddRange(inputs.Select((x, i) => BuildXmlBlob(x.Id, x.Part, i)));
Then you wouldn't need to use the out/ref parameter.
XElement BuildXmlBlob(string id, Part part, int counter)
{
// implementation
}
Below is what I managed to figure out on my own:.
result.AddRange(listOfRecordings.SelectMany(rec => rec.Parts, (rec, par) => new {rec, par})
.Select(#t => new
{
#t,
Id = GetStructMapItemId("mods", #t.rec.CKey + #t.par.UniqueId)
})
.Select((#t, i) => BuildPartsDmdSec(#t.Id, #t.#t.par, i)));
I used resharper to convert it into a method chain which constructed the basics for what I needed and then i simply tacked on the select statement right at the end.
As a part of a project, me and a few others are currently working on a URL classifier. What we are trying to implement is actually quite simple : we simply look at the URL and find relevant keywords occuring within it and classify the page accordingly.
Eg : If the url is : http://cnnworld/sports/abcd, we would classify it under the category "sports"
To accomplish this, we have a database with mappings of the format : Keyword -> Category
Now what we are currently doing is, for each URL, we keep reading all the data items within the database, and using String.find() method to see if the keyword occurs within the URL. Once this is found, we stop.
But this approach has a few problems, the main ones being :
(i) Our database is very big and such repeated querying runs extremely slowly
(ii) A page may belong to more than one category and our approach does not handle such cases. Of-course, one simple way to ensure this would be to continue querying the database even once a category match is found, but this would only make things even slower.
I was thinking of alternatives and was wondering if the reverse could be done - Parse the url, find words occuring within it and then query the database for those words only.
A naive algorithm for this would run in O( n^2 ) - query the database for all substrings that occur within the url.
I was wondering if there was any better approach to accomplish this. Any ideas ?? Thank you in advance :)
In our commercial classifier we have a database of 4m keywords :) and we also search the body of the HTML, there are number of ways to solve this:
Use Aho-Corasick, we have used a modified algorithm specially to work with web content, for example treat: tab, space, \r, \n as space, as only one, so two spaces would be considered as one space, and also ignore lower/upper case.
Another option is to put all your keywords inside a tree (std::map for example) so the search becomes very fast, the downside is that this takes memory, and a lot, but if it's on a server, you wouldn't feel it.
I think your suggestion of breaking apart the URL to find useful bits and then querying for just those items sounds like a decent way to go.
I tossed together some Java that might help illustrate code-wise what I think this would entail. The most valuable portions are probably the regexes, but I hope the general algorithm of it helps some as well:
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.List;
public class CategoryParser
{
/** The db field that keywords should be checked against */
private static final String DB_KEYWORD_FIELD_NAME = "keyword";
/** The db field that categories should be pulled from */
private static final String DB_CATEGORY_FIELD_NAME = "category";
/** The name of the table to query */
private static final String DB_TABLE_NAME = "KeywordCategoryMap";
/**
* This method takes a URL and from that text alone determines what categories that URL belongs in.
* #param url - String URL to categorize
* #return categories - A List<String&rt; of categories the URL seemingly belongs in
*/
public static List<String> getCategoriesFromUrl(String url) {
// Clean the URL to remove useless bits and encoding artifacts
String normalizedUrl = normalizeURL(url);
// Break the url apart and get the good stuff
String[] keywords = tokenizeURL(normalizedUrl);
// Construct the query we can query the database with
String query = constructKeywordCategoryQuery(keywords);
System.out.println("Generated Query: " + query);
// At this point, you'd need to fire this query off to your database,
// and the results you'd get back should each be a valid category
// for your URL. This code is not provided because it's very implementation specific,
// and you already know how to deal with databases.
// Returning null to make this compile, even though you'd obviously want to return the
// actual List of Strings
return null;
}
/**
* Removes the protocol, if it exists, from the front and
* removes any random encoding characters
* Extend this to do other url cleaning/pre-processing
* #param url - The String URL to normalize
* #return normalizedUrl - The String URL that has no junk or surprises
*/
private static String normalizeURL(String url)
{
// Decode URL to remove any %20 type stuff
String normalizedUrl = url;
try {
// I've used a URLDecoder that's part of Java here,
// but this functionality exists in most modern languages
// and is universally called url decoding
normalizedUrl = URLDecoder.decode(url, "UTF-8");
}
catch(UnsupportedEncodingException uee)
{
System.err.println("Unable to Decode URL. Decoding skipped.");
uee.printStackTrace();
}
// Remove the protocol, http:// ftp:// or similar from the front
if (normalizedUrl.contains("://"))
{
normalizedUrl = normalizedUrl.split(":\\/\\/")[1];
}
// Room here to do more pre-processing
return normalizedUrl;
}
/**
* Takes apart the url into the pieces that make at least some sense
* This doesn't guarantee that each token is a potentially valid keyword, however
* because that would require actually iterating over them again, which might be
* seen as a waste.
* #param url - Url to be tokenized
* #return tokens - A String array of all the tokens
*/
private static String[] tokenizeURL(String url)
{
// I assume that we're going to use the whole URL to find tokens in
// If you want to just look in the GET parameters, or you want to ignore the domain
// or you want to use the domain as a token itself, that would have to be
// processed above the next line, and only the remaining parts split
String[] tokens = url.split("\\b|_");
// One could alternatively use a more complex regex to remove more invalid matches
// but this is subject to your (?:in)?ability to actually write the regex you want
// These next two get rid of tokens that are too short, also.
// Destroys anything that's not alphanumeric and things that are
// alphanumeric but only 1 character long
//String[] tokens = url.split("(?:[\\W_]+\\w)*[\\W_]+");
// Destroys anything that's not alphanumeric and things that are
// alphanumeric but only 1 or 2 characters long
//String[] tokens = url.split("(?:[\\W_]+\\w{1,2})*[\\W_]+");
return tokens;
}
private static String constructKeywordCategoryQuery(String[] keywords)
{
// This will hold our WHERE body, keyword OR keyword2 OR keyword3
StringBuilder whereItems = new StringBuilder();
// Potential query, if we find anything valid
String query = null;
// Iterate over every found token
for (String keyword : keywords)
{
// Reject invalid keywords
if (isKeywordValid(keyword))
{
// If we need an OR
if (whereItems.length() > 0)
{
whereItems.append(" OR ");
}
// Simply append this item to the query
// Yields something like "keyword='thisKeyword'"
whereItems.append(DB_KEYWORD_FIELD_NAME);
whereItems.append("='");
whereItems.append(keyword);
whereItems.append("'");
}
}
// If a valid keyword actually made it into the query
if (whereItems.length() > 0)
{
query = "SELECT DISTINCT(" + DB_CATEGORY_FIELD_NAME + ") FROM " + DB_TABLE_NAME
+ " WHERE " + whereItems.toString() + ";";
}
return query;
}
private static boolean isKeywordValid(String keyword)
{
// Keywords better be at least 2 characters long
return keyword.length() > 1
// And they better be only composed of letters and numbers
&& keyword.matches("\\w+")
// And they better not be *just* numbers
// && !keyword.matches("\\d+") // If you want this
;
}
// How this would be used
public static void main(String[] args)
{
List<String> soQuestionUrlClassifications = getCategoriesFromUrl("http://stackoverflow.com/questions/10046178/pattern-matching-for-url-classification");
List<String> googleQueryURLClassifications = getCategoriesFromUrl("https://www.google.com/search?sugexp=chrome,mod=18&sourceid=chrome&ie=UTF-8&q=spring+is+a+new+service+instance+created#hl=en&sugexp=ciatsh&gs_nf=1&gs_mss=spring%20is%20a%20new%20bean%20instance%20created&tok=lnAt2g0iy8CWkY65Te75sg&pq=spring%20is%20a%20new%20bean%20instance%20created&cp=6&gs_id=1l&xhr=t&q=urlencode&pf=p&safe=off&sclient=psy-ab&oq=url+en&gs_l=&pbx=1&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.,cf.osb&fp=2176d1af1be1f17d&biw=1680&bih=965");
}
}
The Generated Query for the SO link would look like:
SELECT DISTINCT(category) FROM KeywordCategoryMap WHERE keyword='stackoverflow' OR keyword='com' OR keyword='questions' OR keyword='10046178' OR keyword='pattern' OR keyword='matching' OR keyword='for' OR keyword='url' OR keyword='classification'
Plenty of room for optimization, but I imagine it to be much faster than checking the string for every possible keyword.
Aho-corasick algorithm is best for searching intermediate string with one traversal. You can form a tree (aho-corasick tree) of your keyword. At the last node contains a number mapped with a particular keyword.
Now, You just need to traverse the URL string on the tree. When you got some number (work as flag in our scenario), it means that we got some mapped category. Go on with that number on hash map and find respective category for further use.
I think this will help you.
Go to this link: good animation of aho-corasick by ivan
If you have (many) fewer categories than keywords, you could create a regex for each category, where it would match any of the keywords for that category. Then you'd run your URL against each category's regex. This would also address the issue of matching multiple categories.
I have an application that requires mappings between string values, so essentially a container that can hold key values pairs. Instead of using a dictionary or a name-value collection I used a resource file that I access programmatically in my code. I understand resource files are used in localization scenarios for multi-language implementations and the likes. However I like their strongly typed nature which ensures that if the value is changed the application does not compile.
However I would like to know if there are any important cons of using a *.resx file for simple key-value pair storage instead of using a more traditional programmatic type.
There are two cons which I can think of out of the blue:
it requires I/O operation to read key/value pair, which may result in significant performance decrease,
if you let standard .Net logic to resolve loading resources, it will always try to find the file corresponding to CultureInfo.CurrentUICulture property; this could be problematic if you decide that you actually want to have multiple resx-es (i.e. one per language); this could result in even further performance degradation.
BTW. Couldn't you just create helper class or structure containing properties, like that:
public static class GlobalConstants
{
private const int _SomeInt = 42;
private const string _SomeString = "Ultimate answer";
public static int SomeInt
{
get
{
return _SomeInt;
}
}
public static string SomeString
{
get
{
return _SomeString;
}
}
}
You can then access these properties exactly the same way, as resource files (I am assuming that you're used to this style):
textBox1.Text = GlobalConstants.SomeString;
textBox1.Top = GlobalConstants.SomeInt;
Maybe it is not the best thing to do, but I firmly believe this is still better than using resource file for that...