Apache Spark: Optimization Filter - apache-spark

This is more of a design question.
I have a list of patient IDs, and I need to put each one into one of three buckets.
The bucket a patient goes into depends entirely on the following RDDs:
case class Diagnostic(patientID:String, date: Date, code: String)
case class LabResult(patientID: String, date: Date, testName: String, value: Double)
case class Medication(patientID: String, date: Date, medicine: String)
Right now I'm basically scanning each RDD 3-4 times per patient_id per bucket to see whether the patient belongs in that bucket. This runs extremely slowly. Is there anything I can do to improve it?
For example, for bucket 1 I have to check whether patient_id 1 has a diagnostic with a code of 1 (there may be several diagnostics for that patient) and whether patient_id 1 has a medication where medicine is foo.
Right now I'm doing this as two filters (one on each RDD).
Ugly code example:
if (labResult.filter({ lab =>
  val testName = lab.testName
  testName.contains("glucose")
}).count == 0) {
  return false
} else if (labResult.filter({ lab =>
  val testName = lab.testName
  val testValue = lab.value
  // all the built in rules
  (testName == "hba1c" && testValue >= 6.0) ||
  (testName == "hemoglobin a1c" && testValue >= 6.0) ||
  (testName == "fasting glucose" && testValue >= 110) ||
  (testName == "fasting blood glucose" && testValue >= 110) ||
  (testName == "glucose" && testValue >= 110) ||
  (testName == "glucose, serum" && testValue >= 110)
}).count > 0) {
  return false
} else if (diagnostic.filter({ diagnosis =>
  val code = diagnosis.code
  (code == "790.21") ||
  (code == "790.22") ||
  (code == "790.2") ||
  (code == "790.29") ||
  (code == "648.81") ||
  (code == "648.82") ||
  (code == "648.83") ||
  (code == "648.84") ||
  (code == "648.0") ||
  (code == "648.01") ||
  (code == "648.02") ||
  (code == "648.03") ||
  (code == "648.04") ||
  (code == "791.5") ||
  (code == "277.7") ||
  (code == "v77.1") ||
  (code == "256.4") ||
  // was: code == "250.*", which only matches that literal string;
  // assuming the intent was "any 250.x code"
  code.startsWith("250.")
}).count > 0) {
  return false
}
true

Related

What can I do so my code doesn't delete a 0 in an array?

I'm trying to make a calculator in Haxe. It is almost done but has a bug: it fails whenever some part of the equation results in 0.
This is how I concatenate the digits and push them into the number array. cn is the variable that accumulates the digits of a number, ci is a counter local to this while loop, and c is the main counter, incremented by an outer while loop that reads the items of the input array:
var cn = '';
var ci = c;
if (input[c] == '-') {
    number.push('+');
    cn = '-';
    ci ++;
}
while (input[ci] == '0' || input[ci] == '1' || input[ci] == '2' || input[ci] == '3' || input[ci] == '4' || input[ci] == '5' || input[ci] == '6' || input[ci] == '7' || input[ci] == '8' || input[ci] == '9' || input[ci] == '.') {
    if(ci == input.length) {
        break;
    }
    cn += input[ci];
    ci++;
}
number.push(cn);
c += cn.length;
This is the part of the code used to calculate the addition and subtraction:
for (i in 0 ... number.length) {
    trace(number);
    if (number[c] == '+') {
        number[c-1] = ''+(Std.parseFloat(number[c-1])+Std.parseFloat(number[c+1]));
        number.remove(number[c+1]);
        number.remove(number[c]);
    }
    else {
        c++;
    }
}
Example:
12+13-25+1: When my code reads this input, it turns it into an array ([1,2,+,1,3,-,2,5,+,1]), then concatenates the digits into numbers ([12,+,13,-,25,+,1]), and finally looks for the operators (+, -, * and /) to perform each operation (e.g. 12+13), replacing "12" with the result (25) and removing the "+" and the "13". This part works well, and then the code does 25-25=0.
The problem starts here: the equation becomes 0+1, and when the code processes it, the 0 vanishes and the 1 is removed, so the output is "+" when the expected output is "1".
remove uses indexOf here, so it deletes the first element equal to the given value rather than the element at a given position, which is not ideal. I suggest using splice instead:
number.splice(c,1);
number.splice(c,1);
https://try.haxe.org/#D3E38
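The failure described above can be reproduced outside Haxe. The sketch below uses Python, whose list.remove also deletes the first element equal to a value, as a stand-in for Haxe's remove. The function names are hypothetical, and it uses integer arithmetic instead of Std.parseFloat to keep the tokens short, so treat it as an illustration of the mechanism rather than a port of the calculator.

```python
def reduce_by_value(tokens, c):
    """Apply the '+' at index c, then delete the operands by VALUE (the buggy way)."""
    tokens = list(tokens)  # work on a copy
    tokens[c - 1] = str(int(tokens[c - 1]) + int(tokens[c + 1]))
    # remove() deletes the first element equal to the argument, which may be
    # the freshly written result at c - 1 rather than the operand at c + 1.
    tokens.remove(tokens[c + 1])
    tokens.remove(tokens[c])
    return tokens


def reduce_by_index(tokens, c):
    """Apply the '+' at index c, then delete the operands by POSITION."""
    tokens = list(tokens)
    tokens[c - 1] = str(int(tokens[c - 1]) + int(tokens[c + 1]))
    del tokens[c]  # the operator
    del tokens[c]  # the right operand, which has shifted down to index c
    return tokens


print(reduce_by_value(["0", "+", "1"], 1))  # ['+']
print(reduce_by_index(["0", "+", "1"], 1))  # ['1']
```

With 0+1 the result "1" is written at index 0, so removing by value deletes that result first and then the remaining "1", leaving only the operator. Deleting by position, as the two splice calls do, is not affected by equal values elsewhere in the array.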

How to use WITHIN with 2 geo indexes in one collection?

I have 2 geo indexes in the collection.
I need to find all the documents using the first geo index, and then, separately, using the other geo index.
LET cFrom = (
  FOR c IN WITHIN("city", 22.5455400, 114.0683000, 3000, "geofrom")
    FOR r IN rcity
      FILTER r._from == c._id && r.user == "5010403" && r.type == "freight-m"
      LIMIT 1
      RETURN r)
LET cTo = (
  FOR c IN WITHIN("city", 55.7522200, 37.6155600, 3000, "geoTo")
    FOR r IN rcity
      FILTER r._to == c._id && r.user == "5010403" && r.type == "freight-r"
      LIMIT 100
      RETURN r)

Cannot perform option-mapped operation with type: (Boolean, _57) => R

I have the following filter:
type DatabaseID = Long
val filter = moderators.filter(m =>
(m.created < before) &&
(m.userType inSet userTypeList) &&
(if(true) m.mcID === mcIDFilter else true)
)
where m.mcID has type Rep[Option[models.DatabaseID]] and mcIDFilter has type Option[models.DatabaseID].
Why am I getting the following error?
Cannot perform option-mapped operation
with type: (Boolean, _57) => R
for base type: (Boolean, Boolean) => Boolean
_57? What is it?
I have replaced the condition with true for simplicity. If I remove the line with the condition, or replace m.mcID === mcIDFilter with just true, the code compiles fine.
Also, if I remove the if expression, it compiles without error:
val filter = moderators.filter(m =>
(m.created < before) &&
(m.userType inSet userTypeList) &&
m.mcID === mcIDFilter
)
I found that this error appears when the operands do not have the same type.
I also tried:
val filter = moderators.filter(m =>
(m.created < before) &&
(m.userType inSet userTypeList) &&
(if(true) m.mcID === mcIDFilter else true:Rep[Boolean])
)
but without success.
OK, I found how to make this compile. It's ugly, but it works:
val filter = moderators.filter(m =>
(m.created < before) &&
(m.userType inSet userTypeList) &&
(if(true) m.mcID === mcIDFilter else Some(true):Rep[Option[Boolean]])
)

Comparing 2 strings (Days of week) no effect

I am trying to validate sVenueDay (text entered via a textbox) to make sure the value entered is a valid day. I entered "Sunday" into txtBoxVenueDay.Text. When running the program, the message "Input entered not valid day" is displayed even though "Sunday" is a valid day. I tried using the !sVenueDay.Equals("Sunday") form (and likewise for the other days), but it made no difference :/
string sVenueDay = txtBoxVenueDay.Text;
if (sVenueDay != "Monday" || sVenueDay != "Tuesday" || sVenueDay != "Wednesday" || sVenueDay != "Thursday" || sVenueDay != "Friday" || sVenueDay != "Saturday" || sVenueDay != "Sunday")
{
lblOutput.Text = "Input entered not valid day";
return;
}
else
lblOutput.Text = "Valid day";
You're checking whether it's not equal to "Monday" or it's not equal to "Tuesday". Can you propose which string is equal to both "Monday" and "Tuesday"? :)
I suspect you want:
if (sVenueDay != "Monday" && sVenueDay != "Tuesday" && ...)
Or, rather more usefully:
private static readonly HashSet<string> ValidDays = new HashSet<string>(
new[] { "Monday", "Tuesday", ... });
...
if (!ValidDays.Contains(sVenueDay))
{
...
}
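To make the difference concrete, here is a small Python sketch of the three conditions discussed above: the original chain of != joined with ||, the corrected chain joined with &&, and the set lookup. The function names are hypothetical; the day names are the ones from the question.

```python
VALID_DAYS = {"Monday", "Tuesday", "Wednesday", "Thursday",
              "Friday", "Saturday", "Sunday"}


def rejects_with_or(day):
    # Original condition: true as soon as the input differs from ANY one name.
    return any(day != name for name in VALID_DAYS)


def rejects_with_and(day):
    # Corrected condition: true only if the input differs from EVERY name.
    return all(day != name for name in VALID_DAYS)


def rejects_with_set(day):
    # Same result as the corrected condition, expressed as a set lookup.
    return day not in VALID_DAYS


for day in ("Sunday", "Funday"):
    print(day, rejects_with_or(day), rejects_with_and(day), rejects_with_set(day))
```

No string can equal all seven names at once, so the || version rejects every input, including "Sunday". The && version and the set lookup reject only inputs that match none of the names.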
With ||, evaluation stops as soon as one condition is true, so the remaining conditions are not checked. Test for the valid days first and treat everything else as invalid. Try something like the following (you could also use the Equals method):
string sVenueDay = txtBoxVenueDay.Text;
if (sVenueDay == "Monday" || sVenueDay == "Tuesday" || sVenueDay == "Wednesday" || sVenueDay == "Thursday" || sVenueDay == "Friday" || sVenueDay == "Saturday" || sVenueDay == "Sunday")
{
    lblOutput.Text = "valid day";
    return;
}
else
    lblOutput.Text = "Input entered not Valid day";

Finding the number of days in a month

I am writing a program that displays the number of days in a month provided by the user. I am writing it at the dataflow level. As I am new to Verilog, I don't know whether if/else conditions or case statements can be used at the dataflow level; using an if/else statement would make this program a piece of cake. If they cannot be used, how can I implement the following idea at the dataflow level?
if(month==4 || month==6 || month==9|| month==11)
days=30;
else
if(month==2 && leapyear==1)
days=29;
Here is my incomplete Verilog code:
module LeapYear(year,month,leapOrNot,Days);
input year,month;
output leapOrNot,Days;
//if (year % 400 == 0) || ( ( year % 100 != 0) && (year % 4 == 0 ))
leapOrNot=((year&400)===0) && ((year % 100)!==0 || (year & 4)===0);
Days=((month & 4)===4 ||(month & 6)===6 ||(month & 9)===9 ||(month & 11)===11 )
You cannot use if/else in a continuous assignment, but you can use the conditional operator, which is functionally equivalent.
Try this:
assign Days = (month == 4 || month == 6 || month == 9 || month == 11) ? 30 :
(month == 2 && leapyear == 1) ? 29;
That mirrors what you put in your question, but it is not a complete answer: the conditional operator needs a final else value, and you are missing the conditions where Days is equal to 28 or 31.
EDIT:
Here's how to combine all the conditions into a single assign statement using the conditional operator.
assign Days = (month == 4 || month == 6 || month == 9 || month == 11) ? 30 :
(month == 2 && leapyear == 1) ? 29 :
(month == 2 && leapyear == 0) ? 28 :
31;
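One way to sanity-check a conditional chain like this before simulating it is a small software reference model. The Python sketch below mirrors the chain above and the leap-year rule from the comment in the question, then compares the result with the standard library's calendar module. The function names are hypothetical, and this is only a checking aid, not part of the Verilog design.

```python
import calendar


def is_leap_year(year):
    # Rule from the comment in the question's code.
    return year % 400 == 0 or (year % 100 != 0 and year % 4 == 0)


def days_in_month(month, leap):
    # Same order of tests as the chained conditional operator.
    return (30 if month in (4, 6, 9, 11)
            else 29 if month == 2 and leap
            else 28 if month == 2
            else 31)


def matches_calendar(year):
    # calendar.monthrange returns (weekday of the 1st, number of days).
    return all(
        days_in_month(m, is_leap_year(year)) == calendar.monthrange(year, m)[1]
        for m in range(1, 13)
    )


print([matches_calendar(y) for y in (1900, 2000, 2023, 2024)])
```

The years 1900 and 2000 are included because they exercise the divisible-by-100 and divisible-by-400 branches of the leap-year rule.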
