Spark Streaming either running the same command twice or sending the same mail twice - apache-spark

My Spark application sends two mails when I send just one string to my Kafka topic. Here is the relevant part of the code:
JavaDStream<String> lines = kafkaStream.map( /* returns the 2nd value of the tuple */ );
lines.foreachRDD(new VoidFunction<JavaRDD<String>>() {
    // [... some stuff ...]
    JavaRDD<String[]> flagAddedRDD = associatedToPersonRDD.map(new Function<String[], String[]>() {
        @Override
        public String[] call(String[] arg0) throws Exception {
            String[] s = new String[arg0.length + 1];
            System.arraycopy(arg0, 0, s, 0, arg0.length);
            int a = FilePrinter.getAge(arg0[CSVExampleDevice.LENGTH + People.BIRTH_DATE]);
            int p = Integer.parseInt(arg0[CSVExampleDevice.PULSE]);
            if (((p <= 45 || p >= 185) && (a <= 12 || a >= 70))
                    || (p >= 190 || p <= 40)) {
                s[arg0.length] = "1";
                Mailer.sendMail(mailTo, arg0);
            } else {
                s[arg0.length] = "0";
            }
            return s;
        }
    });
I cannot understand why two emails are sent, because the transformation after this one saves to a file, and that file contains only one line. The Mailer.sendMail method:
public static void sendMail(String whoTo, String[] whoIsDying) {
    Properties props = new Properties();
    props.put("mail.smtp.host", "mail.***.com"); // edited
    props.put("mail.smtp.port", "25");
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    Session session = Session.getInstance(props);
    try {
        Message message = new MimeMessage(session);
        message.setFrom(new InternetAddress("***my-email***")); // edited
        for (String string : whoTo.split(","))
            message.addRecipient(Message.RecipientType.TO,
                    new InternetAddress(string));
        message.setSubject(whoIsDying[PersonClass.TIMESTAMP]);
        message.setText("trial");
        System.out.println("INFO: sent mail");
        Transport.send(message);
    } catch (MessagingException e) {
        throw new RuntimeException(e);
    }
}

The reason this happens is that I call two actions at the end of the transformation chain:
FilePrinter.saveAssociatedAsCSV(associatedSavePath, unifiedAssociatedStringRDD.collect()); // first action
JavaRDD<String[]> enrichedWithWeatherRDD = flagAddedRDD.map(new Function<String[], String[]>() { /* some more stuff */ });
JavaRDD<String> unifiedEnrichedStringRDD = enrichedWithWeatherRDD.map(unifyArrayIntoString);
FilePrinter.saveEnrichedAsCSV(enrichedSavePath, unifiedEnrichedStringRDD.collect()); // second action
Each action re-evaluates the whole chain of transformations, and the mail-sending map sits upstream of both actions, so it runs twice.
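A minimal sketch of one way to avoid the re-evaluation, assuming the RDD names from the snippet above and that both save paths share flagAddedRDD in their lineage: persist the RDD whose map sends the mail before calling the two actions. Note that even a cached RDD can be recomputed if partitions are evicted or a task is retried, so a side effect like sending mail is generally safer in a foreachPartition than in a map.
import org.apache.spark.storage.StorageLevel; // at the top of the file

// Sketch only: persist the RDD whose map() sends the mail, so the two
// collect() actions below reuse its computed partitions instead of
// rebuilding the lineage (and sending the mail) a second time.
flagAddedRDD.persist(StorageLevel.MEMORY_AND_DISK());

FilePrinter.saveAssociatedAsCSV(associatedSavePath, unifiedAssociatedStringRDD.collect()); // first action
FilePrinter.saveEnrichedAsCSV(enrichedSavePath, unifiedEnrichedStringRDD.collect());       // second action

flagAddedRDD.unpersist(); // release the cached partitions once both actions are done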

Related

Printing Duplicate Records from the Data View In Acumatica

I am trying to print all records of a data view into a file using a for loop in my Acumatica customization. Unfortunately I end up printing the first record every time, resulting in duplicated records, and I'm unable to track where I'm going wrong. Please assist.
Here goes my code:
public class MayBankGIROProcess : PXGraph<MayBankGIROProcess>
{
    public PXSelect<MayBankGIRO> Document; // This is my data view

    public PXAction<MayBankGiroFilter> createTextFile;
    [PXUIField(DisplayName = "Create Text File")]
    [PXButton()]
    public virtual IEnumerable CreateTextFile(PXAdapter adapter)
    {
        List<string> myList = new List<string> { };
        foreach (MayBankGIRO dacRecord in this.Document.Select()) // this is the loop which reads the data records
        {
            myList.Add(dacRecord.ReordType + "|" + dacRecord.CustomerReferenceNumber + "|" + dacRecord.ClientBatchID + "|");
            // The above line is printing only the first record of the data view every time.
        }
        string filename = "DAWN" + ".txt";
        Download(myList, filename);
        return adapter.Get();
    }

    public static void Download(List<string> lines, string name) // method generating the file
    {
        var bytes = default(byte[]);
        using (MemoryStream stream = new MemoryStream())
        {
            StreamWriter sw = new StreamWriter(stream);
            foreach (string line in lines)
            {
                sw.WriteLine(line);
            }
            stream.Position = 0;
            bytes = stream.ToArray();
            sw.Close();
        };
        PX.SM.FileInfo textDoc = new PX.SM.FileInfo(name, null, bytes);
        if (textDoc != null)
        {
            throw new PXRedirectToFileException(textDoc, true);
        }
        else
        {
            PXTrace.WriteInformation("Could not generate file");
        }
    }
}
[Generated Text File with all duplicate Record][1]
[1]: https://i.stack.imgur.com/Kllmk.png
[Original Record from database][2]
[2]: https://i.stack.imgur.com/Rbr9k.png
This usually happens when the report is pulling from a SQL view DAC which doesn't have a unique key defined. Add IsKey = true to DAC fields until the SQL view returns unique records, and the duplicates should go away.

Converting UnixTimestamp to TIMEUUID for Cassandra

I'm learning all about Apache Cassandra 3.x.x and I'm trying to develop some stuff to play around. The problem is that I want to store data into a Cassandra table which contains these columns:
id (UUID - Primary Key) | Message (TEXT) | REQ_Timestamp (TIMEUUID) | Now_Timestamp (TIMEUUID)
REQ_Timestamp has the time when the message left the client at frontend level. Now_Timestamp, on the other hand, is the time when the message is finally stored in Cassandra. I need both timestamps because I want to measure the amount of time it takes to handle the request from its origin until the data is safely stored.
Creating the Now_Timestamp is easy, I just use the now() function and it generates the TIMEUUID automatically. The problem arises with REQ_Timestamp. How can I convert that Unix Timestamp to a TIMEUUID so Cassandra can store it? Is this even possible?
The architecture of my backend is this: I get the data as JSON from the frontend into a web service that processes it and stores it in Kafka. Then, a Spark Streaming job takes that Kafka log and puts it in Cassandra.
This is my WebService that puts the data in Kafka.
#Path("/")
public class MemoIn {
#POST
#Path("/in")
#Consumes(MediaType.APPLICATION_JSON)
#Produces(MediaType.TEXT_PLAIN)
public Response goInKafka(InputStream incomingData){
StringBuilder bld = new StringBuilder();
try {
BufferedReader in = new BufferedReader(new InputStreamReader(incomingData));
String line = null;
while ((line = in.readLine()) != null) {
bld.append(line);
}
} catch (Exception e) {
System.out.println("Error Parsing: - ");
}
System.out.println("Data Received: " + bld.toString());
JSONObject obj = new JSONObject(bld.toString());
String line = obj.getString("id_memo") + "|" + obj.getString("id_writer") +
"|" + obj.getString("id_diseased")
+ "|" + obj.getString("memo") + "|" + obj.getLong("req_timestamp");
try {
KafkaLogWriter.addToLog(line);
} catch (Exception e) {
e.printStackTrace();
}
return Response.status(200).entity(line).build();
}
}
Here's my Kafka Writer
package main.java.vcemetery.webservice;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;
import org.apache.kafka.clients.producer.Producer;

public class KafkaLogWriter {
    public static void addToLog(String memo) throws Exception {
        // private static Scanner in;
        String topicName = "MemosLog";
        /*
         * First, we set the properties of the Kafka log
         */
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("batch.size", 16384);
        props.put("linger.ms", 1);
        props.put("buffer.memory", 33554432);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // We create the producer
        Producer<String, String> producer = new KafkaProducer<>(props);
        // We send the line to the producer
        producer.send(new ProducerRecord<>(topicName, memo));
        // We close the producer
        producer.close();
    }
}
And finally here's what I have of my Spark Streaming job
public class MemoStream {
    public static void main(String[] args) throws Exception {
        Logger.getLogger("org").setLevel(Level.ERROR);
        Logger.getLogger("akka").setLevel(Level.ERROR);
        // Create the context with a 10 second batch size
        SparkConf sparkConf = new SparkConf().setAppName("KafkaSparkExample").setMaster("local[2]");
        JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "group1");
        kafkaParams.put("auto.offset.reset", "latest");
        kafkaParams.put("enable.auto.commit", false);

        /* Create a collection with the topics to subscribe to; in this case only one topic */
        Collection<String> topics = Arrays.asList("MemosLog");

        final JavaInputDStream<ConsumerRecord<String, String>> kafkaStream =
                KafkaUtils.createDirectStream(
                        ssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams)
                );

        kafkaStream.mapToPair(record -> new Tuple2<>(record.key(), record.value()));
        // Split each bucket of Kafka data into a splittable stream of memos
        JavaDStream<String> stream = kafkaStream.map(record -> (record.value().toString()));
        // Then, we split each stream into lines or memos
        JavaDStream<String> memos = stream.flatMap(x -> Arrays.asList(x.split("\n")).iterator());
        /*
         * To split each memo into sections of ids and messages, we have to escape the pipe character with \\
         */
        JavaDStream<String> sections = memos.flatMap(y -> Arrays.asList(y.split("\\|")).iterator());
        sections.print();

        sections.foreachRDD(rdd -> {
            rdd.foreachPartition(partitionOfRecords -> {
                // We establish the connection with Cassandra
                Cluster cluster = null;
                try {
                    cluster = Cluster.builder()
                            .withClusterName("VCemeteryMemos") // cluster name
                            .addContactPoint("127.0.0.1")      // host IP
                            .build();
                } finally {
                    if (cluster != null) cluster.close();
                }
                while (partitionOfRecords.hasNext()) {
                }
            });
        });

        ssc.start();
        ssc.awaitTermination();
    }
}
Thank you in advance.
Cassandra has no function to convert from a UNIX timestamp to a TIMEUUID; you have to do the conversion on the client side.
Ref: https://docs.datastax.com/en/cql/3.3/cql/cql_reference/timeuuid_functions_r.html
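A minimal sketch of that client-side conversion, assuming the DataStax Java driver 3.x (its UUIDs utility class) is on the classpath and that req_timestamp is in milliseconds since the epoch (multiply by 1000 first if it is in seconds):
import java.util.UUID;
import com.datastax.driver.core.utils.UUIDs;

// Sketch only: build a TIMEUUID-compatible value from the frontend's timestamp.
long reqTimestampMillis = obj.getLong("req_timestamp"); // field name taken from the question's JSON
UUID reqTimeuuid = UUIDs.startOf(reqTimestampMillis);   // time-based UUID with the non-time bits zeroed

// The UUID can then be bound to the TIMEUUID column, e.g. via a prepared statement:
// session.execute(insert.bind(UUIDs.random(), memoText, reqTimeuuid, UUIDs.timeBased()));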

Load report failed- Crystal Report in c#

I have a C# application in VS2012 in which I need to generate Crystal Reports 13.0.x. This application has been running fine for the last 2 years or so. Recently I made some add-ons, and after that it gives the error
Load Report Failed
The strange thing is that this Crystal Report is generated around 100 times a day, and the error only appears occasionally in between. After it does, the whole application has to be exited, and then it works fine again. Because of this I am not able to replicate the error at my end.
Here is my code:
public partial class ChangeOrderList : Form
{
    ConnectionClass connectionclass = new ConnectionClass();
    NewOrderBL NObl = new NewOrderBL();
    DailySalesReportBL DSRbl = new DailySalesReportBL();

    public ChangeOrderList()
    {
        InitializeComponent();
    }

    private void ChangeOrderList_Load(object sender, EventArgs e)
    {
        ///////////////////////// To count Lunch Buffet ///////////////////
        DataTable dtlb = DSRbl.selectBuffet(DateTime.Today.Date.ToString(), DateTime.Today.Date.ToString());
        string date = dtlb.Rows[0][0].ToString();
        ////////////////////////////////////////////////////////////////
        try
        {
            string sqlqry = "Select KOTNo,TableNo,WaiterName,ItemCode,ItemName,Quantity,Status,Foodtype from tblOrderChange where KOTNo=@kotno and Quantity>'0.00' and (Category!='Appetizer' and Category!='Indian Breads' and Category!='Desserts' and Category!='Beverages' and Category!='Tandoori')";
            SqlCommand cmd = new SqlCommand(sqlqry, connectionclass.con);
            cmd.Parameters.AddWithValue("@kotno", NewOrderBL.KOTNo);
            SqlDataAdapter adapter = new SqlDataAdapter(cmd);
            DataSet1 ds = new DataSet1();
            adapter.Fill(ds, "tblOrderChange");
            if (ds.Tables["tblOrderChange"].Rows.Count == 0)
            {
                MessageBox.Show("No Data Found", this.Text, MessageBoxButtons.OK, MessageBoxIcon.Information);
            }
            if (deliverybl.order == "Delivery")
            {
                //PrintDelivery printorder = new PrintDelivery();
                ChangeOrderdelivery printorder = new ChangeOrderdelivery();
                printorder.SetDataSource(ds);
                crystalReportViewer1.ReportSource = printorder;
                System.Drawing.Printing.PrintDocument printDocument = new System.Drawing.Printing.PrintDocument();
                printorder.PrintOptions.PrinterName = printDocument.PrinterSettings.PrinterName;
                printorder.PrintOptions.PrinterName = "EPSON TM-U220 Receipt";
                printorder.PrintToPrinter(1, false, 0, 0);
            }
            else
            {
                crystalReportViewer1.RefreshReport();
                ParameterFields paramFields = new ParameterFields();
                ParameterField paramField = new ParameterField();
                ParameterDiscreteValue paramDiscreteValue = new ParameterDiscreteValue();
                paramField.Name = "LBqty";
                paramDiscreteValue.Value = date;
                paramField.CurrentValues.Add(paramDiscreteValue);
                paramFields.Add(paramField);
                PrintChangeOrderList printchangeorder = new PrintChangeOrderList();
                printchangeorder.SetDataSource(ds);
                printchangeorder.SetParameterValue("LBqty", date);
                crystalReportViewer1.ReportSource = printchangeorder;
                System.Drawing.Printing.PrintDocument printDocument = new System.Drawing.Printing.PrintDocument();
                printchangeorder.PrintOptions.PrinterName = printDocument.PrinterSettings.PrinterName;
                printchangeorder.PrintOptions.PrinterName = "EPSON TM-U220 Receipt";
                printchangeorder.PrintToPrinter(1, false, 0, 0);
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message, this.Text, MessageBoxButtons.OK, MessageBoxIcon.Information);
        }
        finally { connectionclass.disconnect(); }
        onlinebl.crystalreport = "";
        this.DialogResult = DialogResult.OK;
    }

    private void btnExit_Click(object sender, EventArgs e)
    {
        onlinebl.crystalreport = "";
        this.DialogResult = DialogResult.OK;
    }
I have been banging my head on this for a long time. Everywhere I search it talks about the report path, but I am not using a path anywhere, so I am not able to understand where the fault is.
If you need any more info or the code, please let me know. Thanks.
Check your class:
ChangeOrderdelivery printorder = new ChangeOrderdelivery();
printorder.SetDataSource(ds);
crystalReportViewer1.ReportSource = printorder;
This line probably has a report path hiding in it, I guess:
ChangeOrderdelivery printorder = new ChangeOrderdelivery();
I just deleted the internet temp files, and since then the client has not complained about the error. I am keeping an eye on it to see whether that resolved the issue.

How to get values from SqlDataReader into a for-loop execution?

Here I am coding to get each and every StudyUID (as a string) from the database into a SqlDataReader, but I need to know how to pass the reader's values into a for loop for execution.
I need to read each and every StudyUID for execution. Here is the code:
public void automaticreport()
{
    //string autsdyid = "";
    SqlConnection con = new SqlConnection(constr);
    con.Open();
    string autoquery = "Select StudyUID From StudyTable Where status='2'";
    SqlCommand cmd = new SqlCommand(autoquery, con);
    SqlDataReader rdr = cmd.ExecuteReader();
    for ()
    {
        // how to call each StudyUID from the database through the for loop?
        if (!this.reportchk)
        {
            Reportnew cf = new Reportnew();
            ThreadPool.QueueUserWorkItem((WaitCallback)(o => cf.ReportRetrive(this, autsdyid, true)));
        }
        else
        {
            int num = (int)System.Windows.Forms.MessageBox.Show("Reports checking in progress, Please wait sometime and try again later", "OPTICS", MessageBoxButtons.OK, MessageBoxIcon.Asterisk);
        }
        con.Close();
    }
}
Like @R.T. and others have mentioned, you can use the Read method on the data reader. Looking at your sample code, you might want to refactor it slightly to follow more of the SOLID principles and to make sure you're not leaking database connections.
Here's an example of code that has been refactored a bit.
public void automaticreport()
{
    foreach (var autsdyid in LoadStudyIdentifiers())
    {
        if (!this.reportchk)
        {
            Reportnew cf = new Reportnew();
            ThreadPool.QueueUserWorkItem((WaitCallback)(o => cf.ReportRetrive(this, autsdyid, true)));
        }
        else
        {
            int num = (int)System.Windows.Forms.MessageBox.Show("Reports checking in progress, Please wait sometime and try again later", "OPTICS", MessageBoxButtons.OK, MessageBoxIcon.Asterisk);
        }
    }
}

private string[] LoadStudyIdentifiers()
{
    var results = new List<string>();
    // A using statement closes the database connection even when there are errors,
    // avoiding exhaustion of the database connection pool.
    using (var con = new SqlConnection(constr))
    {
        con.Open();
        var autoquery = "Select StudyUID From StudyTable Where status='2'";
        using (var cmd = new SqlCommand(autoquery, con))
        {
            SqlDataReader rdr = cmd.ExecuteReader();
            while (rdr.Read())
            {
                results.Add(rdr.GetString(rdr.GetOrdinal("StudyUID")));
            }
        }
    }
    return results.ToArray();
}
Note: I wrote this in Notepad, so there is no guarantee it will compile, but it should give an indication as to how you could refactor your code.
if (rdr.HasRows)
{
    while (rdr.Read())
    {
        Console.WriteLine(rdr.GetString(rdr.GetOrdinal("columnName")));
    }
}
You can use something like:
while (reader.Read())
{
    string value = reader.GetString(reader.GetOrdinal("columnName"));
}
You may use the while loop like this:
while (rdr.Read())
{
    string s = rdr.GetString(rdr.GetOrdinal("Column"));
    // Apply logic to retrieve here
}

Is it possible to use IMAP Query Terms in Javamail with GMail?

I am trying to programmatically retrieve the call log messages that are backed up from my Android phone by a little application called SMSBackup (highly recommended).
What I want to do is to be able to retrieve the call logs for a particular day. I have tried the following program, using JavaMail:
public List<CallLogEntry> getCallLog(String username, String password, Date date, TimeZone tz) {
    Store store = null;
    try {
        store = MailUtils.getGmailImapStore(username, password);
        Folder folder = store.getDefaultFolder();
        if (folder == null)
            throw new Exception("No default folder");
        Folder inboxfolder = folder.getFolder("Call log");
        if (inboxfolder == null)
            throw new Exception("No INBOX");
        inboxfolder.open(Folder.READ_ONLY);

        Date fromMidnight = new Date(TimeUtils.fromMidnight(date.getTime(), tz));
        Date toMidnight = new Date(TimeUtils.toMidnight(date.getTime(), 0, tz));
        SentDateTerm fromTerm = new SentDateTerm(SentDateTerm.GT, fromMidnight);
        SentDateTerm toTerm = new SentDateTerm(SentDateTerm.LT, toMidnight);
        AndTerm searchTerms = new AndTerm(fromTerm, toTerm);

        Message[] msgs = inboxfolder.search(searchTerms);
        FetchProfile fp = new FetchProfile();
        fp.add("Subject");
        fp.add("Content");
        fp.add("From");
        fp.add("SentDate");
        inboxfolder.fetch(msgs, fp);

        List<CallLogEntry> callLog = new ArrayList<CallLogEntry>();
        for (Message message : msgs) {
            CallLogEntry entry = new CallLogEntry();
            entry.subject = message.getSubject();
            entry.body = (String) message.getContent();
            callLog.add(entry);
        }
        inboxfolder.close(false);
        store.close();
        return callLog;
    } catch (NoSuchProviderException ex) {
        ex.printStackTrace();
    } catch (MessagingException ex) {
        ex.printStackTrace();
    } catch (Exception ex) {
        ex.printStackTrace();
    } finally {
        try {
            if (store != null)
                store.close();
        } catch (MessagingException ex) {
            ex.printStackTrace();
        }
    }
    return null;
}
My two utility methods (fromMidnight / toMidnight):
public static final long fromMidnight(long time, TimeZone tz) {
    Calendar c = Calendar.getInstance(tz);
    c.setTimeInMillis(time);
    c.set(Calendar.HOUR_OF_DAY, 0);
    c.set(Calendar.MINUTE, 0);
    c.set(Calendar.SECOND, 0);
    c.set(Calendar.MILLISECOND, 1);
    return c.getTimeInMillis();
}

public static final long toMidnight(long time, int nDays, TimeZone tz) {
    Calendar c = Calendar.getInstance(tz);
    c.setTimeInMillis(time + nDays * MILLIS_IN_DAY);
    c.set(Calendar.HOUR_OF_DAY, 23);
    c.set(Calendar.MINUTE, 59);
    c.set(Calendar.SECOND, 59);
    c.set(Calendar.MILLISECOND, 999);
    return c.getTimeInMillis();
}
However, for some reason:
- while it eventually executes, it takes about 3 minutes to complete
- I'm getting back the entire call log, i.e. the entire content of the "Call log" folder in my mailbox
What am I missing?
The main thing that you're missing is that the underlying IMAP SEARCH syntax supports only dates, not date-times. So your query will result in JavaMail issuing the command:
A001 SEARCH SENTBEFORE 16-JAN-2011 SENTSINCE 16-JAN-2011 ALL
(Put a breakpoint in IMAPProtocol.issueSearch() to see this.)
GMail appears to freak out on this query, which logically cannot match any messages. Try switching your logic to a single term using SentDateTerm.EQ (which maps to SENTON) and it should work:
SentDateTerm term = new SentDateTerm(SentDateTerm.EQ, date);
Message[] msgs = inboxfolder.search(term);
