How to read all rows with data types from Cassandra using Java?

I want to read all rows with data types for a given keyspace and column family in Cassandra.
To read data I tried using CQL like below:
CqlQuery<String, String, ByteBuffer> cqlQuery = new CqlQuery<String, String, ByteBuffer>(keyspaceOperator, se, se, be);
cqlQuery.setQuery("select * from colfam1");
QueryResult<CqlRows<String, String, ByteBuffer>> result = cqlQuery.execute();
I have also tried the Hector slice query API:
Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");
Keyspace keyspace = HFactory.createKeyspace("rajesh", cluster);
SliceQuery<String, String, String> sliceQuery = HFactory.createSliceQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
sliceQuery.setColumnFamily("colfam1").setKey("key123");
sliceQuery.setRange("", "", false, 4);
QueryResult<ColumnSlice<String, String>> result = sliceQuery.execute();
Both ways I was able to read all rows, but I was not able to read the data types.
Can anyone help me read row values along with their data types from Cassandra using Java?

Reading rows with values is very simple, but I wanted to read the metadata as well. Here is the solution for that:
public Map<String, ArrayList<String>> getMetaData(Client _client, String _keyspace)
        throws SQLException, NotFoundException, InvalidRequestException, TException {
    ArrayList<String> columnfamilyNames = new ArrayList<String>();
    ArrayList<String> columnNames = new ArrayList<String>();
    ArrayList<String> validationClasses = new ArrayList<String>();
    Map<String, ArrayList<String>> metadataMapList = new HashMap<String, ArrayList<String>>();

    // The keyspace definition holds one CfDef per column family
    KsDef keyspaceDefinition = _client.describe_keyspace(_keyspace);
    List<CfDef> columnDefinition = keyspaceDefinition.getCf_defs();
    for (int i = 0; i < columnDefinition.size(); i++) {
        List<ColumnDef> columnMetadata = columnDefinition.get(i).getColumn_metadata();
        for (int j = 0; j < columnMetadata.size(); j++) {
            columnfamilyNames.add(columnDefinition.get(i).getName());
            columnNames.add(new String(columnMetadata.get(j).getName()));
            // The validation class is the column's data type,
            // e.g. org.apache.cassandra.db.marshal.UTF8Type
            validationClasses.add(columnMetadata.get(j).getValidation_class());
        }
    }
    metadataMapList.put("columnfamilyName", columnfamilyNames);
    metadataMapList.put("ColumnName", columnNames);
    metadataMapList.put("validationClass", validationClasses);
    return metadataMapList;
}
FYI: here I used the Thrift client.
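For completeness, here is a minimal, untested sketch of calling it over a raw Thrift connection; the host, port, and the assumption that getMetaData is reachable as a static helper are mine, not part of the original answer:
import java.util.ArrayList;
import java.util.Map;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class MetadataExample {
    public static void main(String[] args) throws Exception {
        // Open a framed Thrift connection to the local node (9160 is the default Thrift RPC port)
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();

        // Keyspace name taken from the question; getMetaData is the method shown above
        Map<String, ArrayList<String>> metadata = getMetaData(client, "rajesh");
        System.out.println(metadata.get("ColumnName"));       // column names
        System.out.println(metadata.get("validationClass"));  // their data types (validation classes)

        transport.close();
    }
}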

Related

Cannot insert into Cassandra table, getting SyntaxError

I have an assignment where I have to build a Cassandra database. I have connected Cassandra with IntelliJ, I'm writing in Java, and the output is shown on the command line.
My keyspace farm_db contains a couple of tables into which I would like to insert data. I would like to insert data with two columns and a list, all in one row, in the table 'farmers'. This is a part of my database so far:
cqlsh:farm_db> use farm_db;
cqlsh:farm_db> Describe tables;
farmers foods_dairy_eggs foods_meat
foods_bread_cookies foods_fruit_vegetables
cqlsh:farm_db> select * from farmers;
farmer_id | delivery | the_farmer
-----------+----------+------------
This is what I'm trying to do:
[Picture of what I'm trying to do][1]
I need to insert the collection types 'list' and 'map' into 'farmers', but after a couple of failed attempts with that I tried using a HashMap and an ArrayList instead. I think this could work, but I seem to have an error in my syntax and I have no idea what the problem is:
Exception in thread "main" com.datastax.driver.core.exceptions.SyntaxError: line 1:31 mismatched input 'int' expecting ')' (INSERT INTO farmers (farmer_id [int]...)
Am I missing something or am I doing something wrong?
This is my code:
public class FarmersClass {
    public static String serverIP = "127.0.0.1";
    public static String keyspace = "";

    //Create db
    public void crateDatabase(String databaseName) {
        Cluster cluster = Cluster.builder()
                .addContactPoints(serverIP)
                .build();
        keyspace = databaseName;
        Session session = cluster.connect();
        String create_db_query = "CREATE KEYSPACE farm_db WITH replication "
                + "= {'class':'SimpleStrategy', 'replication_factor':1};";
        session.execute(create_db_query);
    }

    //Create table
    public void createFarmersTable() {
        Cluster cluster = Cluster.builder()
                .addContactPoints(serverIP)
                .build();
        Session session = cluster.connect("farm_db");
        String create_farm_table_query = "CREATE TABLE farmers(farmer_id int PRIMARY KEY, the_farmer Map <text, text>, delivery list<text>); ";
        session.execute(create_farm_table_query);
    }

    //Insert data in table 'farmer'.
    public void insertFarmers(int id, HashMap<String, String> the_farmer, ArrayList<String> delivery) {
        Cluster cluster = Cluster.builder()
                .addContactPoints(serverIP)
                .build();
        Session session = cluster.connect("farm_db");
        String insert_query = "INSERT INTO farmers (farmer_id int PRIMARY KEY, the_farmer, delivery) values (" + id + "," + the_farmer + "," + delivery + ");";
        System.out.println(insert_query);
        session.execute(insert_query);
    }
}

public static void main(String[] args) {
    FarmersClass farmersClass = new FarmersClass();
    // FarmersClass.crateDatabase("farm_db");
    // FarmersClass.createFarmersTable();

    //Collection type map
    HashMap<String, String> the_farmer = new HashMap<>();
    the_farmer.put("Name", "Ana Petersen ");
    the_farmer.put("Farmhouse", "The great farmhouse");
    the_farmer.put("Foods", "Fruits & Vegetables");

    //Collection type list
    ArrayList<String> delivery = new ArrayList<String>();
    String delivery_1 = "Village 1";
    String delivery_2 = "Village 2";
    delivery.add(delivery_1);
    delivery.add(delivery_2);

    FarmersClass.insertFarmers(1, the_farmer, delivery);
}
The problem is the syntax of your CQL INSERT query:
String insert_query = \
"INSERT INTO farmers (farmer_id int PRIMARY KEY, the_farmer, delivery) \
values (" + id + "," + the_farmer + "," + delivery + ");";
You've incorrectly added int PRIMARY KEY in the list of columns.
The correct format is:
INSERT INTO table_name (pk, col2, col3) VALUES ( ... )
For details and examples, see CQL INSERT. Cheers!
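To illustrate the fix (an untested sketch, assuming the same com.datastax.driver.core Session and driver version as in the question), insertFarmers could list only the column names and use a prepared statement so the driver serializes the map and list for you:
import java.util.List;
import java.util.Map;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public void insertFarmers(Session session, int id, Map<String, String> the_farmer, List<String> delivery) {
    // The column list contains only column names: no types, no PRIMARY KEY
    PreparedStatement ps = session.prepare(
            "INSERT INTO farmers (farmer_id, the_farmer, delivery) VALUES (?, ?, ?)");
    // The driver maps java.util.Map to map<text, text> and java.util.List to list<text>
    session.execute(ps.bind(id, the_farmer, delivery));
}
This also avoids the quoting problems that come from concatenating the HashMap and ArrayList toString() output straight into the CQL string.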

Spark: cleaner way to build Dataset out of Spark streaming

I want to create an API which looks like this
public Dataset<Row> getDataFromKafka(SparkContext sc, String topic, StructType schema);
where:
topic - the Kafka topic name from which the data will be consumed
schema - the schema information for the Dataset
So my function contains the following code:
JavaStreamingContext jsc = new JavaStreamingContext(javaSparkContext, Durations.milliseconds(2000L));
JavaPairInputDStream<String, String> directStream = KafkaUtils.createDirectStream(
        jsc, String.class, String.class,
        StringDecoder.class, StringDecoder.class,
        kafkaConsumerConfig(), topics
);
Dataset<Row> dataSet = sqlContext.createDataFrame(javaSparkContext.emptyRDD(), schema);
DataSetHolder holder = new DataSetHolder(dataSet);
LongAccumulator stopStreaming = sc.longAccumulator("stop");
directStream.foreachRDD(rdd -> {
    RDD<Row> rows = rdd.values().map(value -> {
        //get type of message from value
        Row row = null;
        if (END == msg) {
            stopStreaming.add(1);
            row = null;
        } else {
            row = new GenericRow(/*row data created from values*/);
        }
        return row;
    }).filter(row -> row != null).rdd();
    holder.union(sqlContext.createDataFrame(rows, schema));
    holder.get().count();
});
jsc.start();
//stop stream if stopStreaming value is greater than 0; it is spawned as a new thread.
return holder.get();
Here DataSetHolder is a wrapper class around Dataset<Row> to combine the results of all the RDDs.
class DataSetHolder {
    private Dataset<Row> df = null;

    public DataSetHolder(Dataset<Row> df) {
        this.df = df;
    }

    public void union(Dataset<Row> frame) {
        this.df = df.union(frame);
    }

    public Dataset<Row> get() {
        return df;
    }
}
This doesn't look good at all, but I had to do it this way. I am wondering what the right way to do it is, or whether Spark provides anything for this.
Update
After consuming all the data from the stream (i.e. from the Kafka topic), we create a DataFrame out of it so that a data analyst can register it as a temp table and fire any query against it to get meaningful results.
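For what it's worth, a minimal sketch of that last step (assuming Spark 2.x, the holder from the code above, and a made-up view name):
// Register the accumulated Dataset<Row> so an analyst can query it with plain SQL
Dataset<Row> accumulated = holder.get();
accumulated.createOrReplaceTempView("kafka_events");   // hypothetical view name

Dataset<Row> counts = sqlContext.sql("SELECT COUNT(*) AS n FROM kafka_events");
counts.show();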

Lucene - Sorting Date as NumericField

While trying to sort datetime (long) numeric fields I always get a FormatException.
Adding the numeric field:
doc.Add(new NumericField("creationDate", Field.Store.YES, true)
    .SetLongValue(DateTime.UtcNow.Ticks));
Add sorting:
// boolean query
var sortField = new SortField("creationDate", SortField.LONG, true);
var inverseSort = new Sort(sortField);
var results = searcher.Search(query, null, 100, inverseSort); // exception thrown here
Inspecting the index, I can verify that 'creationDate' field is storing "long" values. What could be causing this exception?
EDIT:
Query
var query = new BooleanQuery();
foreach (var termQuery in incomingProps.Select(kvp => new TermQuery(new Term(kvp.Key, kvp.Value.ToLowerInvariant()))))
{
    query.Add(new BooleanClause(termQuery, Occur.MUST));
}
return query;
Version: Lucene.Net 3.0.3
UPDATE:
This issue is occurring again, now with INT values.
I downloaded Lucene.Net source code and debugged the issue.
So it's somewhere in the FieldCache, when trying to parse the value "`\b\0\0\0" to Integer, which seems a bit odd.
I'm adding these values as numeric fields:
doc.Add(new NumericField(VersionNum, int.MaxValue, Field.Store.YES,
true).SetIntValue(VersionValue));
I get the exception when I'm supposed to get at least 1 hit back.
After inspecting the index, I can see the field's term and the field text (screenshots not reproduced here).
EDIT:
I've hardcoded an int value and added a few segments:
doc.Add(new Field(VersionNum, NumericUtils.IntToPrefixCoded(1), Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
Which resulted in the version field being stored as shown in the index (screenshot not reproduced here).
And still, when I try to sort, I get the parsing error:
var sortVersion = new SortField(VersionNum, SortField.INT, true);
For every exception, Lucene is trying to parse " \b\0\0\0 ".
Looking at the prefix-coded value stored as a string, 1 would translate to " \b\0\0\0\1 ", I'm guessing?
Is Lucene perhaps leaving some garbage behind in the FieldCache?
Here's a unit test that tries to capture what you're asking. The test passes. Can you explain what the difference from your code is? (Posting a full failing test would help us understand what you're doing. :-))
using System;
using System.Linq;
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Lucene.Net.Search;
using Lucene.Net.Index;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Documents;
using Lucene.Net.Store;

namespace SO_answers
{
    [TestClass]
    public class UnitTest1
    {
        [TestMethod]
        public void TestShopping()
        {
            var item = new Dictionary<string, string>
            {
                {"field1", "value1" },
                {"field2", "value2" },
                {"field3", "value3" }
            };

            var writer = CreateIndex();
            Add(writer, item);
            writer.Flush(true, true, true);

            var searcher = new IndexSearcher(writer.GetReader());
            var result = Search(searcher, item);
            Assert.AreEqual(1, result.Count);

            writer.Dispose();
        }

        private List<string> Search(IndexSearcher searcher, Dictionary<string, string> values)
        {
            var query = new BooleanQuery();
            foreach (var termQuery in values.Select(kvp => new TermQuery(new Term(kvp.Key, kvp.Value.ToLowerInvariant()))))
                query.Add(new BooleanClause(termQuery, Occur.MUST));

            return Search(searcher, query);
        }

        private List<string> Search(IndexSearcher searcher, Query query)
        {
            var sortField = new SortField("creationDate", SortField.LONG, true);
            var inverseSort = new Sort(sortField);
            var results = searcher.Search(query, null, 100, inverseSort); // exception thrown here

            var result = new List<string>();
            var matches = results.ScoreDocs;
            foreach (var item in matches)
            {
                var id = item.Doc;
                var doc = searcher.Doc(id);
                result.Add(doc.GetField("creationDate").StringValue);
            }
            return result;
        }

        IndexWriter CreateIndex()
        {
            var directory = new RAMDirectory();
            var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);
            var writer = new IndexWriter(directory, analyzer, new IndexWriter.MaxFieldLength(1000));
            return writer;
        }

        void Add(IndexWriter writer, IDictionary<string, string> values)
        {
            var document = new Document();
            foreach (var kvp in values)
                document.Add(new Field(kvp.Key, kvp.Value.ToLowerInvariant(), Field.Store.YES, Field.Index.ANALYZED));

            document.Add(new NumericField("creationDate", Field.Store.YES, true).SetLongValue(DateTime.UtcNow.Ticks));
            writer.AddDocument(document);
        }
    }
}

Persisting data to DynamoDB using Apache Spark

I have an application where:
1. I read JSON files from S3 into a DataFrame using SqlContext.read.json.
2. I then do some transformations on the DataFrame.
3. Finally I want to persist the records to DynamoDB, using one of the record values as the key and the rest of the JSON parameters as values/columns.
I am trying something like:
JobConf jobConf = new JobConf(sc.hadoopConfiguration());
jobConf.set("dynamodb.servicename", "dynamodb");
jobConf.set("dynamodb.input.tableName", "my-dynamo-table");   // Pointing to DynamoDB table
jobConf.set("dynamodb.endpoint", "dynamodb.us-east-1.amazonaws.com");
jobConf.set("dynamodb.regionid", "us-east-1");
jobConf.set("dynamodb.throughput.read", "1");
jobConf.set("dynamodb.throughput.read.percent", "1");
jobConf.set("dynamodb.version", "2011-12-05");
jobConf.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat");
jobConf.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat");

DataFrame df = sqlContext.read().json("s3n://mybucket/abc.json");
RDD<String> jsonRDD = df.toJSON();
JavaRDD<String> jsonJavaRDD = jsonRDD.toJavaRDD();

PairFunction<String, Text, DynamoDBItemWritable> keyData = new PairFunction<String, Text, DynamoDBItemWritable>() {
    public Tuple2<Text, DynamoDBItemWritable> call(String row) {
        DynamoDBItemWritable writeable = new DynamoDBItemWritable();
        try {
            System.out.println("JSON : " + row);
            JSONObject jsonObject = new JSONObject(row);
            System.out.println("JSON Object: " + jsonObject);

            Map<String, AttributeValue> attributes = new HashMap<String, AttributeValue>();

            AttributeValue attributeValue = new AttributeValue();
            attributeValue.setS(row);
            attributes.put("values", attributeValue);

            AttributeValue attributeKeyValue = new AttributeValue();
            attributeValue.setS(jsonObject.getString("external_id"));
            attributes.put("primary_key", attributeKeyValue);

            AttributeValue attributeSecValue = new AttributeValue();
            attributeValue.setS(jsonObject.getString("123434335"));
            attributes.put("creation_date", attributeSecValue);

            writeable.setItem(attributes);
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return new Tuple2(new Text(row), writeable);
    }
};

JavaPairRDD<Text, DynamoDBItemWritable> pairs = jsonJavaRDD
        .mapToPair(keyData);
Map<Text, DynamoDBItemWritable> map = pairs.collectAsMap();
System.out.println("Results : " + map);
pairs.saveAsHadoopDataset(jobConf);
However I do not see any data getting written to DynamoDB. Nor do I get any error messages.
I'm not sure, but yours seems more complex than it may need to be.
I've used the following to write an RDD to DynamoDB successfully:
val ddbInsertFormattedRDD = inputRDD.map { case (skey, svalue) =>
  val ddbMap = new util.HashMap[String, AttributeValue]()

  val key = new AttributeValue()
  key.setS(skey.toString)
  ddbMap.put("DynamoDbKey", key)

  val value = new AttributeValue()
  value.setS(svalue.toString)
  ddbMap.put("DynamoDbValue", value)

  val item = new DynamoDBItemWritable()
  item.setItem(ddbMap)

  (new Text(""), item)
}

val ddbConf = new JobConf(sc.hadoopConfiguration)
ddbConf.set("dynamodb.output.tableName", "my-dynamo-table")
ddbConf.set("dynamodb.throughput.write.percent", "0.5")
ddbConf.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat")
ddbConf.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat")

ddbInsertFormattedRDD.saveAsHadoopDataset(ddbConf)
Also, have you checked that you have provisioned enough write capacity?
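For the Java side of the question, here is a minimal sketch of the equivalent write configuration, reusing the pairs RDD already built in the question; note that this answer sets dynamodb.output.tableName where the question sets dynamodb.input.tableName:
// Minimal write configuration (emr-dynamodb-hadoop output format), mirroring the Scala snippet above
JobConf ddbConf = new JobConf(sc.hadoopConfiguration());
ddbConf.set("dynamodb.output.tableName", "my-dynamo-table");
ddbConf.set("dynamodb.endpoint", "dynamodb.us-east-1.amazonaws.com");
ddbConf.set("dynamodb.throughput.write.percent", "0.5");
ddbConf.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat");
ddbConf.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat");

// 'pairs' is the JavaPairRDD<Text, DynamoDBItemWritable> from the question
pairs.saveAsHadoopDataset(ddbConf);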

How to bind String[] values in listview using hashmap in android?

Hi, I am trying to get values into a HashMap from .NET web services in Android. I have a customized adapter, and I am trying to do this:
SoapObject folderResponse = (SoapObject) envelope.getResponse();
Log.i("AllFolders", folderResponse.toString());
String[] folderslist = new String[folderResponse.getPropertyCount()];
//getting values using folderslist.
ArrayList<HashMap<String, String>> hashfoldersList = new ArrayList<HashMap<String, String>>();
//But I want hashfoldersList in my customized adapter.

for (i = 0; i < folderResponse.getPropertyCount(); i++) {
    SoapObject SingleFolder = (SoapObject) folderResponse.getProperty(i);
    Log.i("SingleFolder", SingleFolder.toString());
    ID = SingleFolder.getProperty(0).toString();
    KEY_Name = SingleFolder.getProperty(1).toString();
    ParentID = SingleFolder.getProperty(2).toString();
    CreatedBy = SingleFolder.getProperty(3).toString();
    System.out.println(ID);
    System.out.println(KEY_Name);
    System.out.println(ParentID);
    System.out.println(CreatedBy);
    SoapPrimitive Record = (SoapPrimitive) SingleFolder.getProperty(1);
    Log.i("Record", Record.toString());
    {
        folderslist[i] = SingleFolder.getProperty(0).toString();
    }

    XMLParser parser = new XMLParser();
    String xml = parser.getXmlFromUrl(URL); // getting XML from URL
    org.w3c.dom.Document doc = parser.getDomElement(xml); // getting DOM element
    NodeList nl = (NodeList) doc.getElementsByTagName(ID);
    // looping through all song nodes <song>
    for (int i = 0; i < nl.getLength(); i++) {
        // creating new HashMap
        HashMap<String, String> map = new HashMap<String, String>();
        Element e = (Element) nl.item(i);
        // adding each child node to HashMap key => value
        map.put(ID, parser.getValue(e, ID));
        map.put(KEY_Name, parser.getValue(e, KEY_Name));
        map.put(ParentID, parser.getValue(e, ParentID));
        map.put(CreatedBy, parser.getValue(e, CreatedBy));
        foldersList.add(map);
    }

    listview = (ListView) findViewById(R.id.listview);
    adapter = new LazyAdapter(this, hashfoldersList);
    //My customized adapter.
    listview.setAdapter(adapter);
    listview.setOnItemClickListener(this);
}
}
Please suggest how to get the values into the list using ArrayList<HashMap<String, String>> hashfoldersList, since right now I am using String[] folderslist. When I insert hashfoldersList, it gives an error. Please suggest. Thanks.
Mate, you should have a SimpleAdapter to take the strings and put them inside a ListView.
Second, you should have 2 TextViews:
SimpleAdapter adapter = new SimpleAdapter(this, list,
R.layout.your_activity, new String[] { "", "" },
new int[] { R.id.textview1, R.id.textView2 }
);
listView1.setAdapter(adapter);
hope it helps you!!
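Applied to the question's hashfoldersList, a rough sketch might look like the following; the layout and TextView ids are made up, and the map keys are the same variables used when the HashMaps were built:
// Bind the ArrayList<HashMap<String, String>> directly instead of String[] folderslist.
// R.layout.list_item, R.id.textName and R.id.textCreatedBy are hypothetical resource names.
SimpleAdapter adapter = new SimpleAdapter(
        this,
        hashfoldersList,
        R.layout.list_item,
        new String[] { KEY_Name, CreatedBy },   // keys used in map.put(...) above
        new int[] { R.id.textName, R.id.textCreatedBy });

ListView listview = (ListView) findViewById(R.id.listview);
listview.setAdapter(adapter);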
