Does the Accessor of the DataStax Cassandra Java driver use pagination? - cassandra

DataStax's Java driver for Cassandra provides Accessors. Refer here
With reference to their example below, do they paginate and fetch records in batches, or is there a risk of the queries timing out?
@Accessor
public interface UserAccessor {
    @Query("SELECT * FROM user")
    Result<User> getAll();
}
When I say pagination, do they internally do something similar to the following?
Statement stmt = new SimpleStatement("SELECT * FROM user");
stmt.setFetchSize(24);
ResultSet rs = session.execute(stmt);

Yes, there is a fetch size used behind the scenes. The driver will auto-page for you as needed.
You will probably want to set a fetch size via @QueryParameters. The default at this time is 5k; see DEFAULT_FETCH_SIZE.

Here is an example of how I am using fetchSize in the @QueryParameters annotation within an Accessor:
@Accessor
public interface UserAccessor {
    @Query("SELECT * FROM users")
    @QueryParameters(fetchSize = 1000)
    Result<User> getAllUsers();
}
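For context, a hedged sketch of how such an accessor is typically obtained and iterated with the driver's mapping module (the contact point and keyspace are placeholders). Iterating the Result pulls rows page by page at the configured fetch size; only roughly one page is held in memory at a time, rather than the whole table:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.mapping.MappingManager;
import com.datastax.driver.mapping.Result;

Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Session session = cluster.connect("my_keyspace");      // placeholder keyspace
MappingManager manager = new MappingManager(session);

UserAccessor accessor = manager.createAccessor(UserAccessor.class);
Result<User> users = accessor.getAllUsers();

// Each iteration step may trigger a fetch of the next page behind the scenes,
// so the full result set is never materialized at once.
for (User user : users) {
    System.out.println(user);
}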

Related

Pagination of the result of a JdbcPagingItemReader limited to the first page

I'm missing some details of how to execute the pagination of a SQL SELECT (of almost 100,000 records) in Spring Batch.
My batch has no parallelism, no partitioning, and no remote chunking.
It only executes one query, processes every record, and writes the result to a CSV file.
There is no custom ItemReader or InputStream class.
In my BatchConfig class I have my input Bean that prepares the JdbcPagingItemReader:
@StepScope
@Bean(name = "myinput")
public JdbcPagingItemReader<MyDTO> input(DataSource dataSource, PagingQueryProvider queryProvider /*, other job params */) { ... }
Inside it I call a method of an object that sets up the JdbcPagingItemReader to return:
public JdbcPagingItemReader<MyDTO> myMethod(/** various params: dataSource, size of the pagination, queryProvider **/) {
    JdbcPagingItemReader<MyDTO> databaseReader = new JdbcPagingItemReader<MyDTO>();
    databaseReader.setDataSource(dataSource);
    databaseReader.setPageSize(Integer.parseInt(size));
    Map<String, Object> params = new HashMap<String, Object>();
    // my job params are put into the params map
    databaseReader.setParameterValues(params);
    databaseReader.setRowMapper(new MyMapper());
    databaseReader.setQueryProvider(queryProvider);
    return databaseReader;
}
Another class declares the queryProvider
public SqlPagingQueryProviderFactoryBean queryProvider(DataSource dataSource) {
    SqlPagingQueryProviderFactoryBean queryProvider = new SqlPagingQueryProviderFactoryBean();
    queryProvider.setDataSource(dataSource);
    queryProvider.setSelectClause(select().toString());
    queryProvider.setFromClause(from().toString());
    queryProvider.setWhereClause(where().toString());
    queryProvider.setSortKeys(this.sortBy()); // I declare only 1 field, in descending order
    return queryProvider;
}
At this point, I have 2 questions:
I verified that, using the same pageSize and modifying the sorting field, the number of records in the final CSV file changes. I read that the sorting field has to be a primary key, but my select is on a view, not a physical table: is a primary key in sortBy() mandatory in this case?
I verified that databaseReader.setPageSize() limits the number of records read by my SELECT, but I expected pagination that reads all the data. Right now the batch reads only the first page of results and doesn't move forward.
My idea is to use partitioning, but that seems a bit over-engineered, and I suspect I am overlooking something in my code: do you have any suggestions, please?
I read this question (Spring Batch: JdbcPagingItemReader pagination) and the solution of @Mahmoud Ben Hassine, but unfortunately I can't test it in my environment because of the lack of a critical mass of data in the db.
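As a hedged illustration of the sort-key point above (the column names are made up, not from the original view): JdbcPagingItemReader builds the WHERE clause of each subsequent page from the sort-key values of the last row read, so the sort key must uniquely identify a row of the view, otherwise pages can skip or repeat records. A sortBy() along these lines would satisfy that:

import java.util.LinkedHashMap;
import java.util.Map;
import org.springframework.batch.item.database.Order;

private Map<String, Order> sortBy() {
    // The combination of columns must be unique per row of the view.
    Map<String, Order> sortKeys = new LinkedHashMap<>();
    sortKeys.put("CREATED_AT", Order.DESCENDING); // assumed business column
    sortKeys.put("ROW_ID", Order.ASCENDING);      // assumed unique tiebreaker
    return sortKeys;
}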

Insert data into cassandra using datastax driver

We are trying to insert data from a CSV file into Cassandra using the DataStax driver for Java. What are the available methods to do so?
We are currently running cqlsh to load from a CSV file.
The question is quite vague. Usually, you should be able to provide code, and give an example of something that isn't working quite right for you.
That being said, I just taught a class (this week) on this subject for our developers at work. So I can give you some quick examples.
First of all, you should have a separate class built to handle your Cassandra connection objects. I usually build it with a couple of constructors so that it can be called in a couple different ways. But each essentially calls a connect method, which looks something like this:
public void connect(String[] nodes, String user, String pwd, String dc) {
    QueryOptions qo = new QueryOptions();
    qo.setConsistencyLevel(ConsistencyLevel.LOCAL_ONE);
    cluster = Cluster.builder()
        .addContactPoints(nodes)
        .withCredentials(user, pwd)
        .withQueryOptions(qo)
        .withLoadBalancingPolicy(
            new TokenAwarePolicy(
                DCAwareRoundRobinPolicy.builder()
                    .withLocalDc(dc)
                    .build()
            )
        )
        .build();
    session = cluster.connect();
}
With that in place, I also write a few simple methods to expose some functionality of the session object:
public ResultSet query(String strCQL) {
    return session.execute(strCQL);
}

public PreparedStatement prepare(String strCQL) {
    return session.prepare(strCQL);
}

public ResultSet query(BoundStatement bStatement) {
    return session.execute(bStatement);
}
With those in place, I can then call these methods from within a service layer. A simple INSERT (preparing a statement and binding values to it) looks like this:
String[] nodes = {"10.6.8.2","10.6.6.4"};
CassandraConnection conn = new CassandraConnection(nodes, "aploetz", "flynnLives", "West-DC");
String userID = "Aaron";
String value = "whatever";
String strINSERT = "INSERT INTO stackoverflow.timestamptest "
+ "(userid, activetime, value) "
+ "VALUES (?,dateof(now()),?)";
PreparedStatement pIStatement = conn.prepare(strINSERT);
BoundStatement bIStatement = new BoundStatement(pIStatement);
bIStatement.bind(userID, value);
conn.query(bIStatement);
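Since the original question is about loading a CSV file, here is a hedged sketch that reuses that prepared statement in a simple read loop (the file name and the assumption that each line is "userid,value" are mine, not from the question). Binding a once-prepared statement per line is much cheaper than re-parsing the INSERT each time:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

try (BufferedReader reader = new BufferedReader(new FileReader("users.csv"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        String[] cols = line.split(",");           // assumes no quoted commas
        BoundStatement bound = new BoundStatement(pIStatement);
        bound.bind(cols[0], cols[1]);              // userid, value
        conn.query(bound);                         // executes the bound INSERT
    }
} catch (IOException e) {
    e.printStackTrace();
}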
In addition, the DataStax Java Driver has a folder called "examples" in their Git repo. Here's a link to the "basic" examples, which I recommend reading further.

Cassandra Trigger Exception: InvalidQueryException: table of additional mutation does not match primary update table

I am using a Cassandra trigger on a table. I am following the example and loading the trigger jar with 'nodetool reloadtriggers'. Then I am using the
'CREATE TRIGGER mytrigger ON ..'
command from cqlsh to create the trigger on my table.
When adding an entry into that table, my audit table is populated.
But when calling a method from within my Java application, which persists an entry into my table using
'session.execute(BoundStatement)', I get this exception:
InvalidQueryException: table of additional mutation does not match primary update table
Why do the insertion into the table and the audit work when doing it directly with cqlsh, and why does it fail when doing pretty much exactly the same thing from the Java application?
I am using this as the AuditTrigger, very simplified (I left out all operations other than row insertion):
public class AuditTrigger implements ITrigger {
    private Properties properties = loadProperties();

    public Collection<Mutation> augment(Partition update) {
        String auditKeyspace = properties.getProperty("keyspace");
        String auditTable = properties.getProperty("table");
        CFMetaData metadata = Schema.instance.getCFMetaData(auditKeyspace, auditTable);
        PartitionUpdate.SimpleBuilder audit =
            PartitionUpdate.simpleBuilder(metadata, UUIDGen.getTimeUUID());
        // 'row' comes from iterating the partition's rows (elided in this simplified version)
        if (row.primaryKeyLivenessInfo().timestamp() != Long.MIN_VALUE) {
            // Row insertion
            JSONObject obj = new JSONObject();
            obj.put("message_id", update.metadata().getKeyValidator()
                .getString(update.partitionKey().getKey()));
            audit.row().add("operation", "ROW INSERTION");
        }
        audit.row().add("keyspace_name", update.metadata().ksName)
            .add("table_name", update.metadata().cfName)
            .add("primary_key", update.metadata().getKeyValidator()
                .getString(update.partitionKey().getKey()));
        return Collections.singletonList(audit.buildAsMutation());
    }
}
It seems that when using a BoundStatement, the trigger fails:
session.execute(boundStatement);
Using a regular CQL query string works, though:
session.execute(query)
We are using BoundStatement everywhere in our application, though, and cannot change that.
Any help would be appreciated.
Thanks

Spring Data JPA Pagination (Pageable) with Dynamic Queries

I have a simple query as follows "select * from USERS". I also use Pageable to enable pagination.
This query may have optional predicates based on the given parameters being null or not.
For example if "code" parameter is given and not null, then the query becomes
"select * from USERS where code = :code";
As far as I know I cannot implement this using the @Query annotation. I can implement a custom repository and use EntityManager to create a dynamic query.
However, I am not sure how I can integrate "Pageable" with that to get back paginated results.
How can I achieve this?
This is very easy to do in Spring Data using QueryDSL (as an alternative to the Criteria API). It is supported out of the box with the following method of QueryDslPredicateExecutor, where you can just pass null as the Predicate if no restrictions are to be applied:
Page<T> findAll(com.mysema.query.types.Predicate predicate,
                Pageable pageable)
Using QueryDSL may not be an option for you; however, if you look at the following series of tutorials you might get some ideas.
http://www.petrikainulainen.net/programming/spring-framework/spring-data-jpa-tutorial-part-nine-conclusions/
The scenario you have is actually discussed by the author in the comments to part 9 of his guide.
Getting page results for QueryDSL queries is somewhat complicated, since you need two queries: one for the total number of entries, and one for the list of entries you need in the page.
You could use the following superclass:
public class QueryDslSupport<E, Q extends EntityPathBase<E>> extends QueryDslRepositorySupport {

    public QueryDslSupport(Class<E> clazz) {
        super(clazz);
    }

    protected Page<E> readPage(JPAQuery query, Q qEntity, Pageable pageable) {
        if (pageable == null) {
            return readPage(query, qEntity, new QPageRequest(0, Integer.MAX_VALUE));
        }
        // clone to get a second query; otherwise all items would end up in the list
        long total = query.clone(super.getEntityManager()).count();
        JPQLQuery pagedQuery = getQuerydsl().applyPagination(pageable, query);
        List<E> content = total > pageable.getOffset() ? pagedQuery.list(qEntity) : Collections.<E> emptyList();
        return new PageImpl<>(content, pageable, total);
    }
}
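For illustration, a hedged sketch of a concrete repository built on this support class (the entity, Q-type, and field names are assumptions, not from the original post). It applies the optional predicate only when the parameter is given and delegates paging to readPage above:

public class UserQueryRepository extends QueryDslSupport<User, QUser> {

    // Registered as a Spring bean so the EntityManager gets injected by the superclass.
    public UserQueryRepository() {
        super(User.class);
    }

    public Page<User> findByOptionalCode(String code, Pageable pageable) {
        QUser user = QUser.user;
        JPAQuery query = new JPAQuery(getEntityManager()).from(user);
        if (code != null) {
            query.where(user.code.eq(code)); // predicate only when the parameter is present
        }
        return readPage(query, user, pageable);
    }
}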
You have to use QueryDSL and build your where clause depending on the non-null parameters, for example:
BooleanBuilder where = new BooleanBuilder();
...
if (code != null) {
    where.and(YOURENTITY.code.eq(code));
}
and afterwards execute the query:
JPAQuery query = new JPAQuery(entityManager).from(..)
.leftJoin( .. )
...
.where(where)
and use your own page:
MaPage<YOURENTITY> page = new MaPage<YOURENTITY>();
page.number = pageNumber+1;
page.content = query.offset(pageNumber*pageSize).limit(pageSize).list(...);
page.totalResult = query.count();
I create MaPage like this:
public class MaPage<T> {
    public List<T> content;
    public int number;
    public Long totalResult;
    public Long totalPages;
    ...
}
It works, but if your query has a fetch then you are going to get this warning:
nov. 21, 2014 6:48:54 AM org.hibernate.hql.internal.ast.QueryTranslatorImpl list
WARN: HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!
and it will slow down your request. So the solution is to get rid of the fetch, define a @BatchSize(size=10), and use Hibernate.initialize(....) to fetch data in collections and other object types.
Display data from related entities to avoid the lazy initialization exception by setting up @BatchSize.
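A hedged sketch of what that @BatchSize setup can look like (the Invoice/InvoiceLine entities here are hypothetical, not from the original code):

import java.util.List;
import javax.persistence.*;
import org.hibernate.Hibernate;
import org.hibernate.annotations.BatchSize;

@Entity
public class Invoice {
    @Id
    private Long id;

    // Lazy collection; Hibernate loads the lines of up to 10 invoices per query
    // instead of one query per invoice, so the join fetch can be dropped from
    // the paged query and the HHH000104 warning goes away.
    @OneToMany(mappedBy = "invoice", fetch = FetchType.LAZY)
    @BatchSize(size = 10)
    private List<InvoiceLine> lines;

    public List<InvoiceLine> getLines() {
        return lines;
    }
}

// After loading the page, initialize only the collections you actually need:
for (Invoice invoice : page.content) {
    Hibernate.initialize(invoice.getLines());
}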
How to execute a JPAQuery with pagination using Spring Data and QueryDSL
The information here is obsolete. Have your Repository implement the QueryDslPredicateExecutor and paging comes for free.
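A hedged sketch of that approach (the repository and entity names are assumptions): the repository interface just extends QueryDslPredicateExecutor, and the paged findAll overload does the counting and slicing for you.

public interface UserRepository
        extends JpaRepository<User, Long>, QueryDslPredicateExecutor<User> {
}

// Usage: build the predicate from the optional parameters, then page the result.
BooleanBuilder where = new BooleanBuilder();
if (code != null) {
    where.and(QUser.user.code.eq(code));
}
Page<User> page = userRepository.findAll(where, new PageRequest(0, 20));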

Why no SQL for NHibernate 3 Query?

Why is no SQL being generated when I run my Nhibernate 3 query?
public IQueryable<Chapter> FindAllChapters()
{
    using (ISession session = NHibernateHelper.OpenSession())
    {
        var chapters = session.QueryOver<Chapter>().List();
        return chapters.AsQueryable();
    }
}
If I run the query below I can see that the SQL that gets created.
public IQueryable<Chapter> FindAllChapters()
{
    using (ISession session = NHibernateHelper.OpenSession())
    {
        var resultDTOs = session.CreateSQLQuery("SELECT Title FROM Chapter")
            .AddScalar("Title", NHibernateUtil.String)
            .List();
        // Convert resultDTOs into IQueryable<Chapter>
    }
}
Linq to NHibernate (like Linq to entities) uses delayed execution. You are returning IQueryable<Chapter> which means that you might add further filtering before using the data, so no query is executed.
If you called .ToList() or .List() (I forget which is in the API), then it would actually produce data and execute the query.
In other words, right now you have an unexecuted query.
Added: Also use Query() not QueryOver(). QueryOver is like detached criteria.
For more info, google "delayed execution linq" for articles like this
