Pagination of the result of a JdbcPagingItemReader limited to the first page - pagination

I'm missing some details how to execute the pagination of a SQL Select (of almost 100.000 record)in Spring Batch.
My batch has no parallelism, neither partitioning, neither remote chunking.
It has only execute one query , process every record and writes the result in a CSV file.
It 'snt any custom class of ItemReader or InputStream.
In my class BatchConfig I have my input Bean that prepares the JDBCPagingItemReader
#StepScope
#Bean(name = "myinput")
public JdbcPagingItemReader<MyDTO> input(DataSource dataSource, PagingQueryProvider queryProvider, {other jobparams)){...}
inside I call a method of an object that set the JDBCPagingItemReader to return
public JdbcPagingItemReader<MyDTO> myMethod(/**various params: dataSource, size of the pagination, queryProvider **/){
JdbcPagingItemReader<MyDTO> databaseReader = new JdbcPagingItemReader<MyDTO>();
databaseReader.setDataSource(dataSource);
databaseReader.setPageSize(Integer.parseInt(size));
Map<String, Object> params = new HashMap<String, Object>();
//my jobparams are putted in the params
databaseReader.setParameterValues(params);
databaseReader.setRowMapper(new MyMapper());
databaseReader.setQueryProvider(queryProvider);
return databaseReader;
}
Another class declares the queryProvider
public SqlPagingQueryProviderFactoryBean queryProvider(DataSource dataSource) {
SqlPagingQueryProviderFactoryBean queryProvider = new SqlPagingQueryProviderFactoryBean();
queryProvider.setDataSource(dataSource);
queryProvider.setSelectClause(select().toString());
queryProvider.setFromClause(from().toString());
queryProvider.setWhereClause(where().toString());
queryProvider.setSortKeys(this.sortBy());// I declare only 1 field in descending order
return queryProvider;
}
At this point, I have 2 questions:
I verified that using the same pageSize and modifying the sorting field, the number of record in the final CSV file changes: I read that the sorting field has to be a primary key but my select is about a views, not a physical table: is the primary key in sortby() mandatory in this case?
I verified that the method databaseReader.setPageSize() limit the number of the read record by my SELECT, but I expected a pagination that read all the data. Now the batch read only the first page of result and does'nt move forward.
My idea is to use the partition but I see that is a bit over-engineerized and I'm thinking to neglet some point in my code: do you have sime suggest, please?
I read this question (Spring Batch: JdbcPagingItemReader pagination) and the solution of #Mahmoud Ben Hassine, but unfortunately I can't test in my enviroment because the lack critical mass of datain db.

Related

Cassandra Trigger Exception: InvalidQueryException: table of additional mutation does not match primary update table

i am using Cassandra Trigger on a table. I am following the example and loading trigger jar with 'nodetool reloadtriggers'. Then i am using
'CREATE TRIGGER mytrigger ON ..'
command from cqlsh to create trigger on my table.
Adding an entry into that table , my audit table is being populated.
But calling a method from within my Java application, which persists an entry into my table by using
'session.execute(BoundStatement)' i am getting this exception:
InvalidQueryException: table of additional mutation does not match primary update table
Why does the insertion into the table and the audit work when doing it directly with cqlsh and why does it fail when doing pretty much exactly the same with the Java application?
i am using this as AuditTrigger, very simplified(left out all of the other operations other than Row insertion:
public class AuditTrigger implements ITrigger {
private Properties properties = loadProperties();
public Collection<Mutation> augment(Partition update) {
String auditKeyspace = properties.getProperty("keyspace");
String auditTable = properties.getProperty("table");
CFMetaData metadata = Schema.instance.getCFMetaData(auditKeyspace,
auditTable);
PartitionUpdate.SimpleBuilder audit =
PartitionUpdate.simpleBuilder(metadata, UUIDGen.getTimeUUID());
if (row.primaryKeyLivenessInfo().timestamp() != Long.MIN_VALUE) {
// Row Insertion
JSONObject obj = new JSONObject();
obj.put("message_id", update.metadata().getKeyValidator()
.getString(update.partitionKey().getKey()));
audit.row().add("operation", "ROW INSERTION");
}
audit.row().add("keyspace_name", update.metadata().ksName)
.add("table_name", update.metadata().cfName)
.add("primary_key", update.metadata().getKeyValidator()
.getString(update.partitionKey()
.getKey()));
return Collections.singletonList(audit.buildAsMutation());
It seems like using BoundStatement, the trigger fails:
session.execute(boundStatement);
, using a regular cql queryString works though.
session.execute(query)
We are using Boundstatement everywhere within our application though and cannot change that.
Any help would be appreciated.
Thanks

how can i fill controller with data - SQL Function in Entity Framework?

i have create an sql function in my database that take to Date params and get data from 5 tables.
after that add it to project as entity framework from database and the code generated is:
[DbFunction("Dr_EmploEntities", "SelectEmployee")]
public virtual IQueryable SelectEmployee(Nullable frm_date, Nullable to_date)
{
var frm_dateParameter = frm_date.HasValue ?
new ObjectParameter("frm_date", frm_date) :
new ObjectParameter("frm_date", typeof(DateTime));
var to_dateParameter = to_date.HasValue ?
new ObjectParameter("to_date", to_date) :
new ObjectParameter("to_date", typeof(DateTime));
return ((IObjectContextAdapter)this).ObjectContext.CreateQuery("[Dr_EmploEntities].[SelectEmployee](#frm_date, #to_date)", frm_dateParameter, to_dateParameter);
}
public DbSet SelectEmployee_Result { get; set; }
as you see i have now "SelectEmployee_Result" that don't take any params, and "SelectEmployee" that take two date params.
after that i have create an controller for "SelectEmployee_Result" class.
after that i run my project Index View that working with "SelectEmployee_Result" class give me err:
"The type 'SelectEmployee_Result' is mapped as a complex type. The Set method, DbSet objects, and DbEntityEntry objects can only be used with entity types, not complex types."
and i make breakpoint and see that "SelectEmployee_Result" has no data so i change the Index Code in controller and fill "SelectEmployee" with two date params
and when run got same err msg too.
so how can i fill "SelectEmployee_Result" from the beginning with data between two dates to let me use it in all views ?
all what i need here is view data i got i edit before saving it in database Like using DataTable but i need to do that from Entity with sql function
and what is difference between "SelectEmployee" that is my function name and that is need two params and "SelectEmployee_Result"?

Document not available in query direct after store

I'm trying to store a "Role" object and then get a list of Roles, as shown here:
public class Role
{
public Guid RoleId { get; set; }
public string RoleName { get; set; }
public string RoleDescription { get; set; }
}
//Function store:
private void StoreRole(Role role)
{
using (var docSession = docStore.OpenSession())
{
docSession.Store(role);
docSession.SaveChanges();
}
}
// then it return and a function calls this
public List<Role> GetRoles()
{
using (var docSession = docStore.OpenSession())
{
var Roles = from roles in docSession.Query<Role>() select roles;
return Roles.ToList();
}
}
However, in the GetRoles I am missing the last inserted record/document. If I wait 200ms and then call this function the item is there.
So I am not in sync. ?!
How can I solve this, or alternately how could I know when the result is in the document store for querying?
I've used transactions, but cannot figure this out. Update and delete are just fine, but when inserting I need to delay my 'List' call.
You are treating RavenDB as if it is a relational database, and it isn't. Load and Store are ACID operations in RavenDB, Query is not. Indexes (necessary for queries) are updated asynchronously, and in fact, temporary indexes may have to be built from scratch when you do a session.Query<T>() without a durable index specified. So, if you are trying to query for information you JUST stored, or if you are doing the FIRST query that requires a temporary index to be created, you probably won't get the data you expect.
There are methods of customizing your query to wait for non-stale results but you shouldn't lean on these too much because they're indicative of a bad design - it is better to figure out a better way to do the same thing in a way that embraces eventual consistency, either changing your model (so you get consistency via Load/Store - perhaps you could have one document that defines ALL of the roles in a list?) or by changing the application flow so you don't need to Store and then immediately Query.
An additional way of solving this is to query the index with WaitForNonStaleResultsAsOfLastWrite() turned on inside the save function. That way when the save is completed the index will be updated to at least include the change you just made.
You can read more about this here

Spring Data JPA Pagination (Pageable) with Dynamic Queries

I have a simple query as follows "select * from USERS". I also use Pageable to enable pagination.
This query may have optional predicates based on the given parameters being null or not.
For example if "code" parameter is given and not null, then the query becomes
"select * from USERS where code = :code";
As far as I know I cannot implement this using #Query annotation. I can implement a custom repository and use EntityManager to create a dynamic query.
However, I am not sure how I can integrate "Pageable" with that to get back paginated results.
How can I achieve this?
This is very easy to do in Spring Data using QueryDSL (as alternative to the criteria API). It is supported out of the box with the following method of QueryDSLPredicateExecutor where you can just pass null as the Predicate if no restrictions are to be applied:
Page<T> findAll(com.mysema.query.types.Predicate predicate,
Pageable pageable)
Using QueryDSL may not be an option for you however if you look at the following series of tutorials you might get some ideas.
http://www.petrikainulainen.net/programming/spring-framework/spring-data-jpa-tutorial-part-nine-conclusions/
The scenario you have is actually discussed by the author in the comments to part 9 of his guide.
Getting page results for querydsl queries is somehow complicated since you need two queries: one for the total number of entries, and one for the list of entries you need in the page.
You could use the following superclass:
public class QueryDslSupport<E, Q extends EntityPathBase<E>> extends QueryDslRepositorySupport {
public QueryDslSupport(Class<E> clazz) {
super(clazz);
}
protected Page<E> readPage(JPAQuery query, Q qEntity, Pageable pageable) {
if (pageable == null) {
return readPage(query, qEntity, new QPageRequest(0, Integer.MAX_VALUE));
}
long total = query.clone(super.getEntityManager()).count(); // need to clone to have a second query, otherwise all items would be in the list
JPQLQuery pagedQuery = getQuerydsl().applyPagination(pageable, query);
List<E> content = total > pageable.getOffset() ? pagedQuery.list(qEntity) : Collections.<E> emptyList();
return new PageImpl<>(content, pageable, total);
}
}
You have to use querydsl and build your where depending on not null parameter for example
BooleanBuilder where = new BooleanBuilder();
...
if(code != null){
where.and(YOURENTITY.code.eq(code));
}
and after execute the query
JPAQuery query = new JPAQuery(entityManager).from(..)
.leftJoin( .. )
...
.where(where)
and use your own page
MaPage<YOURENTITY> page = new MaPage<YOURENTITY>();
page.number = pageNumber+1;
page.content = query.offset(pageNumber*pageSize).limit(pageSize).list(...);
page.totalResult = query.count();
I create MyPage like that
public class MaPage<T> {
public List<T> content;
public int number;
public Long totalResult;
public Long totalPages;
...
}
it works but if in your query you got a fetch then you gonna have this warning
nov. 21, 2014 6:48:54 AM org.hibernate.hql.internal.ast.QueryTranslatorImpl list
WARN: HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!
and it will slow down your request So the solution is to get ride of the fetch and define a #BatchSize(size=10) and use Hibernate.initialize(....) to fetch data in collections and other object type.
Display data from related entities to avoid the lazy initialization exception with setting up #BatchSize
How to execute a JPAQuery with pagination using Spring Data and QueryDSL
The information here is obsolete. Have your Repository implement the QueryDslPredicateExecutor and paging comes for free.

Subsonic - Where do i include my busines logic or custom validation

Im using subsonic 2.2
I tried asking this question another way but didnt get the answer i was looking for.
Basically i ususally include validation at page level or in my code behind for my user controls or aspx pages. However i haev seen some small bits of info advising this can be done within partial classes generated from subsonic.
So my question is, where do i put these, are there particular events i add my validation / business logic into such as inserting, or updating. - If so, and validation isnt met, how do i stop the insert or update. And if anyone has a code example of how this looks it would be great to start me off.
Any info greatly appreciated.
First you should create a partial class for you DAL object you want to use.
In my project I have a folder Generated where the generated classes live in and I have another folder Extended.
Let's say you have a Subsonic generated class Product. Create a new file Product.cs in your Extended (or whatever) folder an create a partial class Product and ensure that the namespace matches the subsonic generated classes namespace.
namespace Your.Namespace.DAL
{
public partial class Product
{
}
}
Now you have the ability to extend the product class. The interesting part ist that subsonic offers some methods to override.
namespace Your.Namespace.DAL
{
public partial class Product
{
public override bool Validate()
{
ValidateColumnSettings();
if (string.IsNullOrEmpty(this.ProductName))
this.Errors.Add("ProductName cannot be empty");
return Errors.Count == 0;
}
// another way
protected override void BeforeValidate()
{
if (string.IsNullOrEmpty(this.ProductName))
throw new Exception("ProductName cannot be empty");
}
protected override void BeforeInsert()
{
this.ProductUUID = Guid.NewGuid().ToString();
}
protected override void BeforeUpdate()
{
this.Total = this.Net + this.Tax;
}
protected override void AfterCommit()
{
DB.Update<ProductSales>()
.Set(ProductSales.ProductName).EqualTo(this.ProductName)
.Where(ProductSales.ProductId).IsEqualTo(this.ProductId)
.Execute();
}
}
}
In response to Dan's question:
First, have a look here: http://github.com/subsonic/SubSonic-2.0/blob/master/SubSonic/ActiveRecord/ActiveRecord.cs
In this file lives the whole logic I showed in my other post.
Validate: Is called during Save(), if Validate() returns false an exception is thrown.
Get's only called if the Property ValidateWhenSaving (which is a constant so you have to recompile SubSonic to change it) is true (default)
BeforeValidate: Is called during Save() when ValidateWhenSaving is true. Does nothing by default
BeforeInsert: Is called during Save() if the record is new. Does nothing by default.
BeforeUpdate: Is called during Save() if the record is new. Does nothing by default.
AfterCommit: Is called after sucessfully inserting/updating a record. Does nothing by default.
In my Validate() example, I first let the default ValidatColumnSettings() method run, which will add errors like "Maximum String lenght exceeded for column ProductName" if product name is longer than the value defined in the database. Then I add another errorstring if ProductName is empty and return false if the overall error count is bigger than zero.
This will throw an exception during Save() so you can't store the record in the DB.
I would suggest you call Validate() yourself and if it returns false you display the elements of this.Errors at the bottom of the page (the easy way) or (more elegant) you create a Dictionary<string, string> where the key is the columnname and the value is the reason.
private Dictionary<string, string> CustomErrors = new Dictionary<string, string>
protected override bool Validate()
{
this.CustomErrors.Clear();
ValidateColumnSettings();
if (string.IsNullOrEmpty(this.ProductName))
this.CustomErrors.Add(this.Columns.ProductName, "cannot be empty");
if (this.UnitPrice < 0)
this.CustomErrors.Add(this.Columns.UnitPrice, "has to be 0 or bigger");
return this.CustomErrors.Count == 0 && Errors.Count == 0;
}
Then if Validate() returns false you can add the reason directly besides/below the right field in your webpage.
If Validate() returns true you can safely call Save() but keep in mind that Save() could throw other errors during persistance like "Dublicate Key ...";
Thanks for the response, but can you confirm this for me as im alittle confused, if your validating the column (ProductName) value within validate() or the beforevalidate() is string empty or NULL, doesnt this mean that the insert / update has already been actioned, as otherwise it wouldnt know that youve tried to insert or update a null value from the UI / aspx fields within the page to the column??
Also, within asp.net insert or updating events we use e.cancel = true to stop the insert update, if beforevalidate failes does it automatically stop the action to insert or update?
If this is the case, isnt it eaiser to add page level validation to stop the insert or update being fired in the first place.
I guess im alittle confused at the lifecyle for these methods and when they come into play

Resources