I'm using Cassandra 3.10 and I need to count how many data modifications I have. To do that I'm going to use a Cassandra trigger, but the result of a trigger's work has to be a collection of Mutation objects:
public interface ITrigger {
public Collection<Mutation> augment(Partition update);
}
But what I end up with is a CounterMutation:
@Override
public Collection<Mutation> augment(Partition update) {
String keyspaceName = update.metadata().ksName;
//CFMetaData metadata = Schema.instance.getCFMetaData(keyspaceName, cfName);
long timestamp = System.currentTimeMillis();
String cfName = "user_product_count_cf";
CFMetaData counterCfMetadata = Schema.instance.getCFMetaData(keyspaceName, cfName);
ByteBuffer key = toByteBuffer("test-user-key");
PartitionUpdate.SimpleBuilder builder = PartitionUpdate.simpleBuilder(counterCfMetadata, key);
ByteBuffer columnName = toByteBuffer("test-counter-column");
ByteBuffer countValue = CounterContext.instance().createLocal(1);
builder.timestamp(timestamp).row(Clustering.make(columnName)).add("value", countValue);
Mutation mutation = builder.buildAsMutation();
//TODO this line does not work
//return Collections.singletonList(mutation);
CounterMutation cMutation = new CounterMutation(mutation, ConsistencyLevel.ONE);
//I do not understand what I have to do next.
//Now I use the next line and it works, but I'm not sure it is good practice.
return Collections.singletonList(cMutation.applyCounterMutation());
}
So how do I convert a CounterMutation into a Mutation?
You can use the QueryProcessor class.
Example:
@Override
public Collection<Mutation> augment(Partition update) {
String keyspaceName = update.metadata().ksName;
String cfName = "user_product_count_cf";
QueryProcessor.process(
QueryBuilder.update(keyspaceName,cfName).with(incr("test-counter-column",1)).where(eq("test-user-key", bindMarker())).toString(),
ConsistencyLevel.LOCAL_QUORUM,
Arrays.asList(key) // Put the partition key value here
);
return Collections.EMPTY_LIST;
}
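Put together, a minimal sketch of a complete trigger class built around that call. The keyspace, table, column, and key values are taken from the question; the QueryBuilder imports and the toByteBuffer helper are assumptions (the shaded DataStax driver bundled with Cassandra provides QueryBuilder), and the partition-key column name user_key is hypothetical, so adjust it to your schema:
import static com.datastax.driver.core.querybuilder.QueryBuilder.bindMarker;
import static com.datastax.driver.core.querybuilder.QueryBuilder.eq;
import static com.datastax.driver.core.querybuilder.QueryBuilder.incr;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import org.apache.cassandra.cql3.QueryProcessor;
import org.apache.cassandra.db.ConsistencyLevel;
import org.apache.cassandra.db.Mutation;
import org.apache.cassandra.db.partitions.Partition;
import org.apache.cassandra.triggers.ITrigger;
import com.datastax.driver.core.querybuilder.QueryBuilder;
public class UserProductCountTrigger implements ITrigger {
    @Override
    public Collection<Mutation> augment(Partition update) {
        String keyspaceName = update.metadata().ksName;
        String cfName = "user_product_count_cf";
        // Let Cassandra apply the counter update itself instead of trying
        // to turn a CounterMutation back into a plain Mutation.
        // "user_key" is a hypothetical partition-key column name.
        QueryProcessor.process(
                QueryBuilder.update(keyspaceName, cfName)
                        .with(incr("test-counter-column", 1))
                        .where(eq("user_key", bindMarker()))
                        .toString(),
                ConsistencyLevel.LOCAL_QUORUM,
                Arrays.asList(toByteBuffer("test-user-key")));
        // The counter has already been updated, so there is nothing left to return.
        return Collections.emptyList();
    }
    private static ByteBuffer toByteBuffer(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }
}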
I am trying to append values to a column of type set via the Java API.
It seems that the connector disregards the CollectionBehavior I am setting,
and always overwrites the previous collection.
Even when I use CollectionRemove, the value to be removed is added to the collection.
I am following the example as shown in:
https://datastax-oss.atlassian.net/browse/SPARKC-340?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
I am using:
spark-core_2.11 2.2.0
spark-cassandra-connector_2.11 2.0.5
Cassandra 2.1.17
Could it be that this feature is not supported in those versions?
Here is the implementation code:
// CASSANDRA TABLE
CREATE TABLE test.profile (
id text PRIMARY KEY,
dates set<bigint>
);
// ENTITY
public class ProfileRow {
public static final Map<String, String> namesMap;
static {
namesMap = new HashMap<>();
namesMap.put("id", "id");
namesMap.put("dates", "dates");
}
private String id;
private Set<Long> dates;
public ProfileRow() {}
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public Set<Long> getDates() {
return dates;
}
public void setDates(Set<Long> dates) {
this.dates = dates;
}
}
public void execute(JavaSparkContext context) {
List<ProfileRow> elements = new LinkedList<>();
ProfileRow profile = new ProfileRow();
profile.setId("fGxTObQIXM");
Set<Long> dates = new HashSet<>();
dates.add(1l);
profile.setDates(dates);
elements.add(profile);
JavaRDD<ProfileRow> rdd = context.parallelize(elements);
RDDAndDStreamCommonJavaFunctions<T>.WriterBuilder wb = javaFunctions(rdd)
.writerBuilder("test", "profile", mapToRow(ProfileRow.class, ProfileRow.namesMap));
CollectionColumnName appendColumn = new CollectionColumnName("dates", Option.empty(), CollectionAppend$.MODULE$);
scala.collection.Seq<ColumnRef> columnRefSeq = JavaApiHelper.toScalaSeq(Arrays.asList(appendColumn));
SomeColumns columnSelector = SomeColumns$.MODULE$.apply(columnRefSeq);
wb.withColumnSelector(columnSelector);
wb.saveToCassandra();
}
Thanks,
Shai
I found the answer. There are 2 things I had to change:
Add the primary key column to the column selector.
WriterBuilder.withColumnSelector() generates a new instance of WriterBuilder, so I had to store the new instance.
Updated code:
RDDAndDStreamCommonJavaFunctions<T>.WriterBuilder wb = javaFunctions(rdd)
.writerBuilder("test", "profile", mapToRow(ProfileRow.class, ProfileRow.namesMap));
ColumnName pkColumn = new ColumnName("id", Option.empty());
CollectionColumnName appendColumn = new CollectionColumnName("dates", Option.empty(), CollectionAppend$.MODULE$);
scala.collection.Seq<ColumnRef> columnRefSeq = JavaApiHelper.toScalaSeq(Arrays.asList(pkColumn, appendColumn));
SomeColumns columnSelector = SomeColumns$.MODULE$.apply(columnRefSeq);
wb = wb.withColumnSelector(columnSelector);
wb.saveToCassandra();
I am trying to query by a field that is not the key against a remote Hazelcast instance; the goal is to create many remote instances and use them to store serialized objects.
What I noticed is that if I do both the put and the query in the same run, the lookup works, as follows:
My class:
public class Provider implements Serializable {
private String resourceId;
private String first;
public String getResourceId() {
return resourceId;
}
public void setResourceId(String resourceId) {
this.resourceId = resourceId;
}
public String getFirst() {
return first;
}
public void setFirst(String first) {
this.first = first;
}
}
Code:
/*********** map initialization ************/
Config config = new Config();
config.getNetworkConfig().setPublicAddress(host + ":" + port);
instance = Hazelcast.newHazelcastInstance(config);
map = instance.getMap("providerWithIndex");
String first = "XXXX"
/***** adding item ***************/
Provider provider = new Provider();
provider.setResourceId("11111");
provider.setFirst( first);
map.put(provider);
/********** getting items ************/
EntryObject e = new PredicateBuilder().getEntryObject();
Predicate predicate = e.get( "first" ).equal( first ) ;
Collection<Provider> providers = map.values(predicate);
Once I run the put and the get in separate runs, the result is 0, so the same code returns no results.
My only thought is that it only does a local fetch and not a remote one, but I hope I simply have a bug somewhere.
Your code looks good, apart from the fact that you can't do map.put(provider); you need to pass a key, e.g. map.put(someKey, provider).
The query looks & works fine.
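For completeness, a minimal sketch of the corrected flow, assuming Hazelcast 3.x imports and using the resourceId as the map key (any stable unique key would do):
import java.util.Collection;
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.query.EntryObject;
import com.hazelcast.query.Predicate;
import com.hazelcast.query.PredicateBuilder;
public class ProviderQueryExample {
    public static void main(String[] args) {
        HazelcastInstance instance = Hazelcast.newHazelcastInstance(new Config());
        IMap<String, Provider> map = instance.getMap("providerWithIndex");
        Provider provider = new Provider();
        provider.setResourceId("11111");
        provider.setFirst("XXXX");
        // put() always takes an explicit key; here the resourceId is used as the key
        map.put(provider.getResourceId(), provider);
        // The predicate query on the non-key field is unchanged; it also works in a
        // later, separate run, because the entry is now actually stored in the map.
        EntryObject e = new PredicateBuilder().getEntryObject();
        Predicate predicate = e.get("first").equal("XXXX");
        Collection<Provider> providers = map.values(predicate);
        System.out.println("Matching providers: " + providers.size());
    }
}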
I am new to Android. I need help solving the error below.
Got stuck here.
public class ForecastFragment extends Fragment {
private ArrayAdapter<String> mForecastAdapter;
public ForecastFragment() {
}
@Override
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
// Add this line in order for this fragment to handle menu events.
setHasOptionsMenu(true);
}
@Override
public void onCreateOptionsMenu(Menu menu, MenuInflater inflater) {
inflater.inflate(R.menu.forecastfragment, menu);
}
@Override
public boolean onOptionsItemSelected(MenuItem item) {
// Handle action bar item clicks here. The action bar will
// automatically handle clicks on the Home/Up button, so long
// as you specify a parent activity in AndroidManifest.xml.
int id = item.getItemId();
if (id == R.id.action_refresh) {
FetchWeatherTask weatherTask = new FetchWeatherTask();
weatherTask.execute("94043");
return true;
}
return super.onOptionsItemSelected(item);
}
@Override
public View onCreateView(LayoutInflater inflater, ViewGroup container,
Bundle savedInstanceState) {
// Create some dummy data for the ListView. Here's a sample weekly forecast
String[] data = {
"Mon 6/23 - Sunny - 31/17",
"Tue 6/24 - Foggy - 21/8",
"Wed 6/25 - Cloudy - 22/17",
"Thurs 6/26 - Rainy - 18/11",
"Fri 6/27 - Foggy - 21/10",
"Sat 6/28 - TRAPPED IN WEATHERSTATION - 23/18",
"Sun 6/29 - Sunny - 20/7"
};
List<String> weekForecast = new ArrayList<String>(Arrays.asList(data));
// Now that we have some dummy forecast data, create an ArrayAdapter.
// The ArrayAdapter will take data from a source (like our dummy forecast) and
// use it to populate the ListView it's attached to.
mForecastAdapter =
new ArrayAdapter<String>(
getActivity(), // The current context (this activity)
R.layout.list_item_forecast, // The name of the layout ID.
R.id.list_item_forecast_textview, // The ID of the textview to populate.
weekForecast);
View rootView = inflater.inflate(R.layout.fragment_main, container, false);
// Get a reference to the ListView, and attach this adapter to it.
ListView listView = (ListView) rootView.findViewById(R.id.listview_forecast);
listView.setAdapter(mForecastAdapter);
return rootView;
}
public class FetchWeatherTask extends AsyncTask<String, Void, String[]> {
private final String LOG_TAG = FetchWeatherTask.class.getSimpleName();
/* The date/time conversion code is going to be moved outside the asynctask later,
* so for convenience we're breaking it out into its own method now.
*/
private String getReadableDateString(long time){
// Because the API returns a unix timestamp (measured in seconds),
// it must be converted to milliseconds in order to be converted to valid date.
SimpleDateFormat shortenedDateFormat = new SimpleDateFormat("EEE MMM dd");
return shortenedDateFormat.format(time);
}
/**
* Prepare the weather high/lows for presentation.
*/
private String formatHighLows(double high, double low) {
// For presentation, assume the user doesn't care about tenths of a degree.
long roundedHigh = Math.round(high);
long roundedLow = Math.round(low);
String highLowStr = roundedHigh + "/" + roundedLow;
return highLowStr;
}
/**
* Take the String representing the complete forecast in JSON Format and
* pull out the data we need to construct the Strings needed for the wireframes.
*
* Fortunately parsing is easy: constructor takes the JSON string and converts it
* into an Object hierarchy for us.
*/
private String[] getWeatherDataFromJson(String forecastJsonStr, int numDays)
throws JSONException {
// These are the names of the JSON objects that need to be extracted.
final String OWM_LIST = "list";
final String OWM_WEATHER = "weather";
final String OWM_TEMPERATURE = "temp";
final String OWM_MAX = "max";
final String OWM_MIN = "min";
final String OWM_DESCRIPTION = "main";
JSONObject forecastJson = new JSONObject(forecastJsonStr);
JSONArray weatherArray = forecastJson.getJSONArray(OWM_LIST);
// OWM returns daily forecasts based upon the local time of the city that is being
// asked for, which means that we need to know the GMT offset to translate this data
// properly.
// Since this data is also sent in-order and the first day is always the
// current day, we're going to take advantage of that to get a nice
// normalized UTC date for all of our weather.
Time dayTime = new Time();
dayTime.setToNow();
// we start at the day returned by local time. Otherwise this is a mess.
int julianStartDay = Time.getJulianDay(System.currentTimeMillis(), dayTime.gmtoff);
// now we work exclusively in UTC
dayTime = new Time();
String[] resultStrs = new String[numDays];
for(int i = 0; i < weatherArray.length(); i++) {
// For now, using the format "Day, description, hi/low"
String day;
String description;
String highAndLow;
// Get the JSON object representing the day
JSONObject dayForecast = weatherArray.getJSONObject(i);
// The date/time is returned as a long. We need to convert that
// into something human-readable, since most people won't read "1400356800" as
// "this saturday".
long dateTime;
// Cheating to convert this to UTC time, which is what we want anyhow
dateTime = dayTime.setJulianDay(julianStartDay+i);
day = getReadableDateString(dateTime);
// description is in a child array called "weather", which is 1 element long.
JSONObject weatherObject = dayForecast.getJSONArray(OWM_WEATHER).getJSONObject(0);
description = weatherObject.getString(OWM_DESCRIPTION);
// Temperatures are in a child object called "temp". Try not to name variables
// "temp" when working with temperature. It confuses everybody.
JSONObject temperatureObject = dayForecast.getJSONObject(OWM_TEMPERATURE);
double high = temperatureObject.getDouble(OWM_MAX);
double low = temperatureObject.getDouble(OWM_MIN);
highAndLow = formatHighLows(high, low);
resultStrs[i] = day + " - " + description + " - " + highAndLow;
}
for (String s : resultStrs) {
Log.v(LOG_TAG, "Forecast entry: " + s);
}
return resultStrs;
}
@Override
protected String[] doInBackground(String... params) {
// If there's no zip code, there's nothing to look up. Verify size of params.
if (params.length == 0) {
return null;
}
// These two need to be declared outside the try/catch
// so that they can be closed in the finally block.
HttpURLConnection urlConnection = null;
BufferedReader reader = null;
// Will contain the raw JSON response as a string.
String forecastJsonStr = null;
String format = "json";
String units = "metric";
int numDays = 7;
try {
// Construct the URL for the OpenWeatherMap query
// Possible parameters are available at OWM's forecast API page, at
// http://openweathermap.org/API#forecast
final String FORECAST_BASE_URL =
"http://api.openweathermap.org/data/2.5/forecast/daily?";
final String QUERY_PARAM = "q";
final String FORMAT_PARAM = "mode";
final String UNITS_PARAM = "units";
final String DAYS_PARAM = "cnt";
final String APPID_PARAM = "02867cfd75153da1eda43a17f213ffc5";
Uri builtUri = Uri.parse(FORECAST_BASE_URL).buildUpon()
.appendQueryParameter(QUERY_PARAM, params[0])
.appendQueryParameter(FORMAT_PARAM, format)
.appendQueryParameter(UNITS_PARAM, units)
.appendQueryParameter(DAYS_PARAM, Integer.toString(numDays))
.appendQueryParameter(APPID_PARAM, BuildConfig.OPEN_WEATHER_MAP_API_KEY)
.build();
URL url = new URL(builtUri.toString());
Log.v(LOG_TAG, "Built URI " + builtUri.toString());
// Create the request to OpenWeatherMap, and open the connection
urlConnection = (HttpURLConnection) url.openConnection();
urlConnection.setRequestMethod("GET");
urlConnection.connect();
// Read the input stream into a String
InputStream inputStream = urlConnection.getInputStream();
StringBuffer buffer = new StringBuffer();
if (inputStream == null) {
// Nothing to do.
return null;
}
reader = new BufferedReader(new InputStreamReader(inputStream));
String line;
while ((line = reader.readLine()) != null) {
// Since it's JSON, adding a newline isn't necessary (it won't affect parsing)
// But it does make debugging a *lot* easier if you print out the completed
// buffer for debugging.
buffer.append(line + "\n");
}
if (buffer.length() == 0) {
// Stream was empty. No point in parsing.
return null;
}
forecastJsonStr = buffer.toString();
Log.v(LOG_TAG, "Forecast string: " + forecastJsonStr);
} catch (IOException e) {
Log.e(LOG_TAG, "Error ", e);
// If the code didn't successfully get the weather data, there's no point in attempting
// to parse it.
return null;
} finally {
if (urlConnection != null) {
urlConnection.disconnect();
}
if (reader != null) {
try {
reader.close();
} catch (final IOException e) {
Log.e(LOG_TAG, "Error closing stream", e);
}
}
}
try {
return getWeatherDataFromJson(forecastJsonStr, numDays);
} catch (JSONException e) {
Log.e(LOG_TAG, e.getMessage(), e);
e.printStackTrace();
}
// This will only happen if there was an error getting or parsing the forecast.
return null;
}
}
}
The generated BuildConfig.java where the error occurs:
public final class BuildConfig {
public static final boolean DEBUG = Boolean.parseBoolean("true");
public static final String APPLICATION_ID = "com.example.patels.sunshine";
public static final String BUILD_TYPE = "debug";
public static final String FLAVOR = "";
public static final int VERSION_CODE = 1;
public static final String VERSION_NAME = "1.0";
// Fields from build type: debug
public static final String OPEN_WEATHER_MAP_API_KEY = MyOpenWeatherMapApiKey;
}
I have written this code and got stuck here; I am unable to solve the error.
Please help me out. Thank you.
Under the app folder you can find the build.gradle file; make the changes below in it.
Since you are using a String you have to use this syntax:
it.buildConfigField "String" , "OPEN_WEATHER_MAP_API_KEY" , "\"MyOpenWeatherMapApiKey\""
The last parameter has to be a String.
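For context, a minimal sketch of where that line could live in app/build.gradle; the buildTypes.each wrapper is an assumption inferred from the it. prefix above, and MyActualApiKey is just a placeholder for the key you obtain:
// app/build.gradle (module: app) -- sketch only
android {
    buildTypes.each {
        // the third argument must itself be a quoted Java String literal
        it.buildConfigField "String", "OPEN_WEATHER_MAP_API_KEY", "\"MyActualApiKey\""
    }
}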
You should create an account at http://openweathermap.org/ and when you register with your email, you get an API key for your account. Use this API key to replace MyOpenWeatherMapApiKey in the following line:
public static final String OPEN_WEATHER_MAP_API_KEY = MyOpenWeatherMapApiKey;
Another alternative would be to write the API key in the Gradle file, as already proposed.
I want to measure the time that the execution of the combineByKey function takes. I always get a result of 20-22 ms (HashPartitioner) and ~350 ms (without partitioning) with the code below, independent of the file size I use (file0: ~300 kB, file1: ~3 GB, file2: ~8 GB)! Can this be true? Or am I doing something wrong?
JavaPairRDD<Integer, String> pairRDD = null;
JavaPairRDD<Integer, String> partitionedRDD = null;
JavaPairRDD<Integer, Float> consumptionRDD = null;
boolean partitioning = true; //or false
int partitionsCount = 100; // between 1 and 200 I can't see any difference in the duration!
SparkConf conf = new SparkConf();
JavaSparkContext sc = new JavaSparkContext(conf);
input = sc.textFile(path);
pairRDD = mapToPair(input);
partitionedRDD = partition(pairRDD, partitioning, partitionsCount);
long duration = System.currentTimeMillis();
consumptionRDD = partitionedRDD.combineByKey(createCombiner, mergeValue, mergeCombiners);
duration = System.currentTimeMillis() - duration; // Measured time always the same, independent of file size (~20ms with / ~350ms without partitioning)
// Do an action
Tuple2<Integer, Float> test = consumptionRDD.takeSample(true, 1).get(0);
sc.stop();
Some helper methods (shouldn't matter):
// merging function for a new dataset
private static Function2<Float, String, Float> mergeValue = new Function2<Float, String, Float>() {
public Float call(Float sumYet, String dataSet) throws Exception {
String[] data = dataSet.split(",");
float value = Float.valueOf(data[2]);
sumYet += value;
return sumYet;
}
};
// function to sum the consumption
private static Function2<Float, Float, Float> mergeCombiners = new Function2<Float, Float, Float>() {
public Float call(Float a, Float b) throws Exception {
a += b;
return a;
}
};
private static JavaPairRDD<Integer, String> partition(JavaPairRDD<Integer, String> pairRDD, boolean partitioning, int partitionsCount) {
if (partitioning) {
return pairRDD.partitionBy(new HashPartitioner(partitionsCount));
} else {
return pairRDD;
}
}
private static JavaPairRDD<Integer, String> mapToPair(JavaRDD<String> input) {
return input.mapToPair(new PairFunction<String, Integer, String>() {
public Tuple2<Integer, String> call(String debsDataSet) throws Exception {
String[] data = debsDataSet.split(",");
int houseId = Integer.valueOf(data[6]);
return new Tuple2<Integer, String>(houseId, debsDataSet);
}
});
}
The web UI provides you with details on the jobs/stages that your application has run. It details the time for each of them, and you can filter various metrics such as Scheduler Delay, Task Deserialization Time, and Result Serialization Time.
The default port for the web UI is 8080. Completed applications are listed there, and you can then click on the name, or craft the URL like this: x.x.x.x:8080/history/app-[APPID] to access those details.
I don't believe any other "built-in" methods exist to monitor the running time of a task/stage. Otherwise, you may want to go deeper and use a JVM debugging framework.
EDIT: combineByKey is a transformation, which means that it is not immediately applied to your RDD, as opposed to actions (read more about the lazy behaviour of RDDs here, chapter 3.1). I believe the time difference you're observing comes from the time Spark takes to create the actual data structures when partitioning or not.
If there is a difference, you'll see it when an action runs (takeSample here).
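To make that concrete, here is a minimal sketch of timing around an action rather than the transformation, reusing the RDD and combiner functions from the question (count() is just a stand-in action):
long start = System.currentTimeMillis();
// The transformation only records the lineage, so this stays cheap regardless of input size.
JavaPairRDD<Integer, Float> consumptionRDD =
        partitionedRDD.combineByKey(createCombiner, mergeValue, mergeCombiners);
long afterTransform = System.currentTimeMillis();
// The action is what actually triggers the shuffle and the aggregation.
long keyCount = consumptionRDD.count();
long afterAction = System.currentTimeMillis();
System.out.println("combineByKey (lazy): " + (afterTransform - start) + " ms");
System.out.println("count (does the real work): " + (afterAction - afterTransform) + " ms for " + keyCount + " keys");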
I am reading Kafka streaming messages using Spark Streaming.
Now I want to set Cassandra as my output.
I have created a table "test_table" in Cassandra with the columns "key text PRIMARY KEY" and "value text".
I have mapped the data successfully into JavaDStream<Tuple2<String,String>> data like this:
JavaSparkContext sc = new JavaSparkContext("local[4]", "SparkStream",conf);
JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(3000));
JavaPairReceiverInputDStream<String, String> messages = KafkaUtils.createStream(jssc, args[0], args[1], topicMap );
JavaDStream<Tuple2<String,String>> data = messages.map(new Function< Tuple2<String,String>, Tuple2<String,String> >()
{
public Tuple2<String,String> call(Tuple2<String, String> message)
{
return new Tuple2<String,String>( message._1(), message._2() );
}
}
);
Then I have created a List:
List<TestTable> list = new ArrayList<TestTable>();
where TestTable is my custom class having the same structure as my Cassandra table, with members "key" and "value":
class TestTable
{
String key;
String val;
public TestTable() {}
public TestTable(String k, String v)
{
key=k;
val=v;
}
public String getKey(){
return key;
}
public void setKey(String k){
key=k;
}
public String getVal(){
return val;
}
public void setVal(String v){
val=v;
}
public String toString(){
return "Key:"+key+",Val:"+val;
}
}
Please suggest how I can add the data from the JavaDStream<Tuple2<String,String>> data into the List<TestTable> list.
I am doing this so that I can subsequently use
JavaRDD<TestTable> rdd = sc.parallelize(list);
javaFunctions(rdd, TestTable.class).saveToCassandra("testkeyspace", "test_table");
to save the RDD data into Cassandra.
I had tried coding this way:
messages.foreachRDD(new Function<Tuple2<String,String>, String>()
{
public List<TestTable> call(Tuple2<String,String> message)
{
String k = message._1();
String v = message._2();
TestTable tbl = new TestTable(k,v);
list.put(tbl);
}
}
);
but it seems some type mismatch is happening.
Please help.
Assuming that the intention of this program is to save the streaming data from Kafka into Cassandra, it's not necessary to dump the JavaDStream<Tuple2<String,String>> data into a List<TestTable> list.
The Spark-Cassandra connector by DataStax supports this functionality directly through the Spark Streaming extensions.
It should be sufficient to use such extensions on the JavaDStream:
javaFunctions(data).writerBuilder("testkeyspace", "test_table", mapToRow(TestTable.class)).saveToCassandra();
instead of draining the data into an intermediary list.
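For illustration, a minimal sketch of that approach, mapping the tuples to TestTable first so that mapToRow(TestTable.class) has a matching element type. The static javaFunctions/mapToRow imports from the connector's japi package are assumptions, TestTable would need to implement Serializable, and note that its val property does not match the table's value column, so a renamed property or an explicit column mapping would likely be needed:
// Assumed static imports:
//   import static com.datastax.spark.connector.japi.CassandraStreamingJavaUtil.javaFunctions;
//   import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;
JavaDStream<TestTable> rows = data.map(
        new Function<Tuple2<String, String>, TestTable>()
        {
            public TestTable call(Tuple2<String, String> tuple)
            {
                return new TestTable(tuple._1(), tuple._2());
            }
        });
// Writes every micro-batch straight to Cassandra; no intermediate List is needed.
javaFunctions(rows)
        .writerBuilder("testkeyspace", "test_table", mapToRow(TestTable.class))
        .saveToCassandra();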