Accumulo: Equivalent of MongoDB createIndex() - accumulo

I am new to Accumulo and I am trying to retrieve all rowIDs corresponding to a column family/qualifier. In MongoDB, this could be done by creating an index on the field using createIndex(). IS there any way of doing the same in Accumulo?

Extracting RowIds from Column Family and Qualifier can be done in the above mentioned way its recommended to make groups of family for better performance in this case.
And for searching on Values , I recomend you to make reverse Indexes

RowIds in Accumulo can be retrieved in you know the family or qualifier
Code for Java
Scanner scan = conn.createScanner("tableName", new Authorizations());
//for scanning on only Column Family
scan.fetchColumnFamily(new Text("CF"));
//for Scanning on both family and qualifier
scan.fetchColumn(new Text("CF"),new Text("CQ"));
for (Entry<Key, Value> entry : scan) {
System.out.println(entry.getKey().getRow());
}
Accumulo Shell
After selecting the specific table, use cammand
scan -c CF:CQ

Related

Howto expose a native SQL function as a predicate

I have a table in my database which stores a list of string values as a jsonb field.
create table rabbits_json (
rabbit_id bigserial primary key,
name text,
info jsonb not null
);
insert into rabbits_json (name, info) values
('Henry','["lettuce","carrots"]'),
('Herald','["carrots","zucchini"]'),
('Helen','["lettuce","cheese"]');
I want to filter my rows checking if info contains a given value.
In SQL, I would use ? operator:
select * from rabbits_json where info ? 'carrots';
If my googling skills are fine today, I believe that this is not implemented yet in JOOQ:
https://github.com/jOOQ/jOOQ/issues/9997
How can I use a native predicate in my query to write an equivalent query in JOOQ?
For anything that's not supported natively in jOOQ, you should use plain SQL templating, e.g.
Condition condition = DSL.condition("{0} ? {1}", RABBITS_JSON.INFO, DSL.val("carrots"));
Unfortunately, in this specific case, you will run into this issue here. With JDBC PreparedStatement, you still cannot use ? for other usages than bind variables. As a workaround, you can:
Use Settings.statementType == STATIC_STATEMENT to prevent using a PreparedStatement in this case
Use the jsonb_exists_any function (not indexable) instead of ?, see https://stackoverflow.com/a/38370973/521799

unable to upsert using java datastax driver

I am unable to upsert a row using the datastax driver.
The data in the Cassandra table is stored like follows:
tag | partition_info
------------+--------------------------------------------------
sometag | {{year: 2018, month: 1}, {year: 2018, month: 2}}
tag is primary key and partition_info is a UDT
CREATE TYPE codingjedi.tag_partitions (
year bigint,
month bigint
);
I want that if a tag doesn't exist then it gets created. If tag exists then the new udt value gets appended to old one. I suppose I cannot use insert as it overrides previous value i.e. this will not work
QueryBuilder.insertInto(tableName).value("tag",model.tag)
.value("partition_info",setAsJavaSet(Set(partitionsInfo)))
I am trying to use update but it isn't working. Datastax driver gives error java.lang.IllegalArgumentException for following query
QueryBuilder.update(tableName).`with`(QueryBuilder.append("partition_info",setAsJavaSet(Set(partitionsInfo))))
.where(QueryBuilder.eq("tag", id.tag))
I tried using add and append for primary key but but got the error PRIMARY KEY part tag found in SET part
QueryBuilder.update(tableName).`with`(QueryBuilder.add("tag",id.tag))
.and(QueryBuilder.append("partition_info",setAsJavaSet(Set(partitionsInfo)))) .where(QueryBuilder.eq("tag", id.tag))
You're using the incorrect operation in your update statement - you're using append, but it's used to append data to columns of list types. You can use instead either add if you're adding a single value (your case, so you wont even need to wrap data into Set explicitly), or addAll if you're adding multiple values.
QueryBuilder.update(tableName)
.`with`(QueryBuilder.add("partition_info", partitionsInfo))
.where(QueryBuilder.eq("tag", id.tag))

Neo4j-php retrieve node

I have been exclusively using cypher queries of this client for Neo4j because there is no out of the box way of doing many things. One of those id to get nodes. There is no way to retrieve them without knowing their id, which is very low level. Any idea on how to run a
$client->findOne('property','value');
?
It should be straightforward but it isn't from the documentation.
Make Indexes on the properties you want to search, from a newly created $personNode
$personIndex = new \Everyman\Neo4j\NodeIndex($client, 'person');
$personIndex->add($personNode, 'name', $personNode->name);
Then later to search, the new PHP object $personIndex will reference the same, populated index as above.
$personIndex = new \Everyman\Neo4j\NodeIndex($client, 'person');
$match = $personIndex->findOne('name', 'edoceo');

How to check if a Cassandra table exists

Is there an easy way to check if table (column family) is defined in Cassandra using CQL (or API perhaps, using com.datastax.driver)?
Right now I am leaning towards executing SELECT 1 FROM table and checking for exception but maybe there is a better way?
As of 1.1 you should be able to query the system keyspace, schema_columnfamilies column family. If you know which keyspace you want to check, this CQL should list all column families in a keyspace:
SELECT columnfamily_name
FROM schema_columnfamilies WHERE keyspace_name='myKeyspaceName';
The report describing this functionality is here: https://issues.apache.org/jira/browse/CASSANDRA-2477
Although, they do note that some of the system column names have changed between 1.1 and 1.2. So you might have to mess around with it a little to get your desired results.
Edit 20160523 - Cassandra 3.x Update:
Note that for Cassandra 3.0 and up, you'll need to make a few adjustments to the above query:
SELECT table_name
FROM system_schema.tables WHERE keyspace_name='myKeyspaceName';
The Java driver (since you mentioned it in your question) also maintains a local representation of the schema.
Driver 3.x and below:
KeyspaceMetadata ks = cluster.getMetadata().getKeyspace("myKeyspace");
TableMetadata table = ks.getTable("myTable");
boolean tableExists = (table != null);
Driver 4.x and above:
Metadata metadata = session.getMetadata();
boolean tableExists =
metadata.getKeyspace("myKeyspace")
.flatMap(ks -> ks.getTable("myTable"))
.isPresent();
I just needed to manually check for the existence of a table using cqlsh.
Possibly useful general info.
describe keyspace_name.table_name
If it doesn't exist you'll get 'table_name' not found in keyspace 'keyspace'
If it does exist you'll get a description of the table.
For the .NET driver CassandraCSharpDriver version 3.17.1 the following code creates a table if it doesn't exist yet:
var ks = _cassandraSession.Cluster.Metadata.GetKeyspace(keyspaceName);
var tableNames = ks.GetTablesNames();
if(!tableNames.Contains(tableName.ToLowerInvariant()))
{
var stmt = new SimpleStatement($"CREATE TABLE {tableName} (id text PRIMARY KEY, name text, price decimal, volume int, time timestamp)");
_cassandraSession.Execute(stmt);
}
You will need to adapt the list of table columns to your needs. This can also be awaited by using await _cassandraSession.ExecuteAsync(stmt).ConfigureAwait(false) in an async method.
Also, I want to mention that I'm using Cassandra version 4.0.1.

Astyanax key range query

trying to write a query which will paginate through all rows in a column family using astyanax client and RowSliceQuery.
keyspace.prepareQuery(COLUMN_FAMILY).getKeyRange(null, null, null, null, 100);
Done this successfully using hector where 1st call is done with null start and end keys. After retrieving 1st page I use last key from the result to make query for second page and etc. This is code for 1st page using hector.
HFactory.createRangeSlicesQuery(keyspace,
LongSerializer.get(), new CompositeSerializer(),
BytesArraySerializer.get())
.setColumnFamily(COLUMN_FAMILY)
.setRange(null, null, false, 100).setRowCount(100);
Now when I am trying to do this with astyanax I am getting errors about null and non-null keys and tokens. Not sure what tokens do in this query. Also I am able to use allRows(), but would like to do this using key range query as it gives me more flexibility.
Does anybody have an example of key range query using astyanax? I cannot find an example neither in "getting started" documentation or anywhere else on the net.
Thanks!
Anton
What you are referring to is the getRowRange method:
keyspace.prepareQuery(CF_STANDARD1)
.getRowRange(startKey, endKey, startToken, endToken, count)
Note however that this works only when the ByteOrderedPartitioner is used. Since by default Cassandra uses the Murmur3Partitioner, this will usually not work. Using an index to do this instead is recommended. Astyanax also provides the reverse index search recipe which takes advantage of a second column family which stores your keys as columns to allow efficient range searches on the original data.
Check this sample code. I hope this code will help you in doing the paging.
IndexQuery<String, String> query = keyspace
.prepareQuery(CF_STANDARD1).searchWithIndex()
.setRowLimit(10).autoPaginateRows(true).addExpression()
.whereColumn("Index2").equals().value(42);
Best,

Resources