Writing a genetic algorithm with VBA in Microsoft Project / Excel

Would it be possible to write VBA code for a genetic algorithm in MS Project (MSP), the same way as in Excel?
I want to optimize the duration of the project's tasks to minimize total cost. For this, I need to run a genetic algorithm that changes the duration of each task randomly. If I can write this algorithm in VBA inside Microsoft Project, MSP will level resources based on the new duration of each task, and from the new total duration of the project I can calculate the new cost.
I know there are VBA genetic algorithm codes available for optimization in Excel; is it possible to write the same code in MSP's VBA?
If it is not, what about exporting data from MSP, importing it into Excel to run the algorithm, and then importing the results back into MSP?
Thanks a lot.

"I want to optimize the duration of the project's tasks to minimize total cost; for this I need to run a genetic algorithm to change the duration of each task randomly."
Yes, you can do this in MS Project using VBA (Task object methods and properties). And you need to do this in MS Project since you rely on Project's CPM engine and leveling capabilities.
Algorithm
1. Loop through all tasks and change each duration to a random number. Skip summary tasks (their Duration is read-only), skip tasks that are complete, and optionally skip tasks that have already started.
2. Level the resources.
3. Calculate the total cost.
4. Store the durations for all tasks in the Duration1 field and store the total cost in the Cost1 field of the first task.
5. Repeat steps 1-4 nine times, storing the durations in Duration2-Duration10 and the total cost in Cost2-Cost10.
6. Determine the iteration that had the lowest cost and move those durations to the Duration1 field.
7. Repeat steps 5-6 until the cost is optimized sufficiently.
8. Move the durations from the Duration1 field to the Duration field, skipping tasks per step 1.

Related

Calculate project work risk completion rate

Calculate project work risk completion rate (in Excel) based on:
Deadline
Work effort / work-effort weight
# of tasks completed
Given
Project unit: Milestone-1 through Milestone-4, plus MVP-1 (which includes all milestones)
Work effort for each milestone (e.g. Milestone-1 = 3 points or small work effort, Milestone-2 = 5 or medium, Milestone-3 = 8 or large)
Work-effort weight for each Milestone (in %)
% or # of completed tasks per milestone
How do I include time in the equation? Say we have a start date and a projected due date or duration (e.g. x weeks, days, or months); I need to calculate the risk of completing a task (a milestone, and the entire MVP) on time based on the current # of tasks completed.
In other words, what is the risk (small/medium/large) that a Milestone/Milestones/MVP will be completed on time (say, Mar-31, 2023) based on the number of tasks completed (15 of 40)?
Please let me know if I need to clarify anything.
I really appreciate any help you can provide.
The image (current view) is missing the time/deadline value, so the risk shown is inaccurate.

Azure cost analysis for a particular subscription using Python SDK

So I'm trying to automate fetching the current cost and cost forecast for a particular subscription (like what is shown under Cost Analysis for that subscription) using the Python SDK, but I haven't been able to find a single API that does this yet.
I've tried using UsageAggregate and RateCard, but I haven't really figured out a way to find the cost for the current month to date. If there is an API that I'm missing, or if I need to calculate monthly costs myself, I'd appreciate any code snippets or help.
If you already have the usage and the ratecard data, then you must combine them.
Take the meterId of the usage data and get the related ratecard data.
The ratecard data contains the MeterRates and the IncludedQuantity, which you need for the calculation.
There may be multiple meter rates and an included quantity because costs often differ by usage tier (e.g. the first 10 calls free, the first 3 GB free, ...).
The consumption is reset on the 14th of the month (the start of the billing period in this example). That's why you have to read the data for the whole billing period, starting on the 14th of each month; it's the only way to get the correct consumption.
So, if you are using e.g. Azure Functions, you have a usage of 100,000 units per day, and you want the costs for the 20th through the 30th, the calculation works as follows:
Read the data from the 14th to the 30th. That is 17 days, so 1,700,000 units were used. The first 400,000 units are free (the IncludedQuantity), which in this example covers the first 4 days.
From unit 400,001 onward you apply the meter rate (€0.0000134928) and calculate the cost: 1,300,000 * 0.0000134928 = ~€17.54.
Fortunately, Azure Functions has only one rate. If the rate changed, e.g. after 5,000,000 units, you would have to take that into account as well. Once you have the cost for the whole period, you can filter down to your dates (the 20th to the 30th) to get the result.
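To make that arithmetic concrete, here is a small Python sketch of the same calculation; the daily usage, included quantity, and meter rate are the example numbers from above, not values fetched from the API:

```python
# Reproduces the worked example above: 100,000 units/day, billing period
# starting on the 14th, cost wanted for the 20th-30th (example values only).
daily_usage = 100_000          # units consumed per day
included_quantity = 400_000    # free units per billing period (from the ratecard data)
meter_rate = 0.0000134928      # EUR per unit above the included quantity (from the ratecard data)

days_in_period = 17            # 14th through 30th inclusive
total_units = daily_usage * days_in_period                  # 1,700,000
billable_units = max(0, total_units - included_quantity)    # 1,300,000
cost = billable_units * meter_rate
print(f"Cost for the billing period so far: ~{cost:.2f} EUR")  # ~17.54 EUR
```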
I implemented this calculation in C# and published it as a NuGet package here. It also contains a sample console application which you can use to export the data.
I know I am a bit late to the party, but after struggling with the same problem, I managed to create the code for getting the cost of a resource group using azure.mgmt.costmanagement.
Link to cost management API
Code sample is in my answer here
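For orientation, here is a minimal sketch of what a month-to-date cost query can look like with the azure-mgmt-costmanagement package. Treat it as an assumption to verify against the current SDK docs rather than a definitive snippet: the subscription ID is a placeholder, and model/field names can differ slightly between package versions.

```python
# Hedged sketch: query month-to-date actual cost for a subscription with
# azure.mgmt.costmanagement (verify model names against your SDK version).
from azure.identity import DefaultAzureCredential
from azure.mgmt.costmanagement import CostManagementClient
from azure.mgmt.costmanagement.models import (
    QueryAggregation,
    QueryDataset,
    QueryDefinition,
)

subscription_id = "<your-subscription-id>"  # placeholder
scope = f"/subscriptions/{subscription_id}"

client = CostManagementClient(credential=DefaultAzureCredential())

definition = QueryDefinition(
    type="ActualCost",            # current cost, as shown in Cost Analysis
    timeframe="MonthToDate",      # current billing month so far
    dataset=QueryDataset(
        granularity="Daily",
        aggregation={"totalCost": QueryAggregation(name="Cost", function="Sum")},
    ),
)

result = client.query.usage(scope=scope, parameters=definition)
for row in result.rows:           # each row follows the order of result.columns
    print(row)
```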

Spark optimize "DataFrame.explain" / Catalyst

I've got a complex piece of software which performs really complex SQL queries (well, not queries exactly, Spark plans, you know). The plans are dynamic: they change based on user input, so I can't "cache" them.
I've got a phase in which Spark takes 1.5-2 minutes building the plan. Just to make sure, I added "logXXX", then explain(true), then "logYYY", and the explain alone takes 1 minute 20 seconds to execute.
I've tried breaking the lineage, but this seems to cause worse performance because the actual execution time becomes longer.
I can't parallelize driver work (already did, but this task can't be overlapped with anything else).
Any ideas/guide on how to improve the plan builder in Spark? (like for example, flags to try enabling/disabling and such...)
Is there a way to cache plans in Spark? (so I can run that in parallel and then execute it)
I've tried disabling all possible optimizer rules, setting min iterations to 30... but nothing seems to affect that concrete point :S
I tried disabling wholeStageCodegen and it helped a little, but the execution gets longer, so :).
Thanks!
PS: The plan does contain multiple unions (<20, but with quite complex plans inside each union) which are the cause of the time, but splitting them apart also hurts execution time.
Just in case it helps someone (and if no-one provides more insights).
Since I couldn't manage to reduce the optimizer time (and, well, I'm not sure reducing it would even be good, as I might lose execution time), I worked around it.
One of the last parts of my plan was scanning two big tables and getting one column from each of them (using windows, aggregations, etc.).
So I split my code into two parts:
1- The big plan (cached)
2- The small plan which scans and aggregates the two big tables (cached)
And added one more part:
3- Left join/enrich the big plan with the output of "2" (this takes about 10 seconds, the dataset is not that big) and finish the remaining computation.
Now I launch both actions (1 and 2) in parallel (using driver-level parallelism/threads), cache the resulting DataFrames, wait for both, and afterwards perform 3.
With this, while the Spark driver (thread 1) is calculating the big plan (~2 minutes), the executors are already executing part "2" (which has a small plan but big scans/shuffles); then both get "mixed" in about 10-15 seconds, which is a good improvement in execution time on top of the 1:30 I save while calculating the plan.
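In rough PySpark terms, the structure looks like the sketch below. This is not my actual code: build_big_plan, build_small_aggregate, and the join column "key" are placeholders standing in for the real plans.

```python
# Rough sketch of the parallel-materialization approach described above.
# build_big_plan / build_small_aggregate / "key" are placeholders.
from concurrent.futures import ThreadPoolExecutor

from pyspark.sql import DataFrame, SparkSession

spark = SparkSession.builder.getOrCreate()

def materialize(df: DataFrame) -> DataFrame:
    # cache + count forces Spark to plan and execute this branch now
    df.cache()
    df.count()
    return df

big_df = build_big_plan(spark)           # part 1: the big, slow-to-plan DataFrame
small_df = build_small_aggregate(spark)  # part 2: scans/aggregates the two big tables

# Driver-level parallelism: while thread 1 is stuck in the optimizer for the
# big plan, thread 2's small plan is already running on the executors.
with ThreadPoolExecutor(max_workers=2) as pool:
    big_future = pool.submit(materialize, big_df)
    small_future = pool.submit(materialize, small_df)
    big_df, small_df = big_future.result(), small_future.result()

# Part 3: enrich the big plan with the small aggregate and finish the job.
result = big_df.join(small_df, on="key", how="left")
```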
Comparing times:
Before: 1:30 Spark optimizing time + 6 minutes execution time.
Now: max(1:30 Spark optimizing time + 4 minutes execution time, 0:02 Spark optimizing time + 2 minutes execution time) + 15 seconds joining both parts.
Not so much, but quite a few "expensive" people will be waiting for it to finish :)

Using Timer for batch operations

I'm new to using Micrometer and am trying to see if there's a way to use a Timer that would also include a count of the number of items in a batch processing scenario. Since I'm processing the batch with Java streams, I didn't see an obvious way to record the timer for each item processed, so I was looking for a way to set a batch size attribute. One way I think that could work is to use the FunctionTimer from https://micrometer.io/docs/concepts#_function_tracking_timers, but I believe that requires the app to maintain a persistent monotonically increasing set of values for the total count and total time.
Is there a simpler way this can be done? Ultimately this data will be fed to New Relic. I've also tried setting tags for the batch size, but those seem to be reported as strings so I can't do any type of aggregation on the values.
Thanks!
A timer is intended for measuring an action and at a minimum results in two measurements: a count and a duration.
So a timer will work perfectly for your batch processing. In the Java stream, a peek operation might be a good place to put a timer.
If you are processing 20 elements and only measuring the time for all 20 elements together, you would need to create a new Counter for measuring the batch size. You could then divide the timer's total duration by your counter to get a per-item duration, or divide it by the timer's total count to get a per-batch duration.
Feel free to add code snippets if you would like feedback for those.

Find documents in MongoDB with non-typical limit

I have a problem, but I have no idea how to resolve it.
I've got a PointValues collection in MongoDB.
The PointValue schema has 3 fields:
dataPoint (ref to DataPoint schema)
value (Number)
time (Date)
There is one pointValue for every hour (24 per day).
I have an API method to get PointValues for a specified DataPoint and time range. The problem is that I need to limit the result to a maximum of 1000 points. The typical limit(1000) approach isn't a good fit, because I need points across the whole specified time range, with a time step that depends on the time range and the number of point values.
So... for example:
Requesting data for 1 year = 1 * 365 * 24 = 8760 values.
It should return 1000 values, i.e. roughly 1 value per (24 / (1000 / 365)) = ~9 hours.
I have no idea what method I should use to filter the data like that in MongoDB.
Thanks for the help.
Sampling exactly like that in the database would be quite hard to do and likely not very performant. An option that gives you a similar result is an aggregation pipeline that $groups documents by $year, $dayOfYear, and $hour (and $minute and $second if you need smaller intervals) and keeps the $first value in each group. That way you can sample values by time step, but your choice of step lengths is limited to what you have date operators for. So "hourly" samples are easy, but "9-hourly" samples get complicated. When this query is performance-critical and frequent, you might want to consider creating additional collections with daily, hourly, minutely etc. DataPoints so you don't need to perform that aggregation on every request.
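As an illustration, such an hourly-sampling pipeline could look like the following in PyMongo. The field names (dataPoint, time, value) follow your schema, but the database/collection names and the filter values are assumptions:

```python
# Hedged PyMongo sketch: keep the first PointValue per hour for one DataPoint
# within a time range. Database/collection names and filter values are examples.
from datetime import datetime

from bson import ObjectId
from pymongo import MongoClient

coll = MongoClient()["mydb"]["pointvalues"]

data_point_id = ObjectId("5f0000000000000000000000")  # the requested DataPoint (placeholder)
range_start = datetime(2023, 1, 1)
range_end = datetime(2024, 1, 1)

pipeline = [
    {"$match": {"dataPoint": data_point_id,
                "time": {"$gte": range_start, "$lt": range_end}}},
    {"$sort": {"time": 1}},
    {"$group": {
        "_id": {"year": {"$year": "$time"},
                "day": {"$dayOfYear": "$time"},
                "hour": {"$hour": "$time"}},
        "time": {"$first": "$time"},
        "value": {"$first": "$value"},
    }},
    {"$sort": {"time": 1}},
]
hourly_samples = list(coll.aggregate(pipeline))
```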
But your documents are quite lightweight, because the actual payload is in a different collection. So you might consider getting all the results in the requested time range and doing the skipping on the application layer. You might also want to combine this with the aggregation described above to pre-reduce the dataset: first use an aggregation pipeline to get hourly results into the application, then skip through the result set in steps of 9 documents. Whether or not this makes sense depends on how many documents you expect.
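For example, continuing the sketch above, stepping the hourly result down to roughly 9-hour intervals on the application side is a one-liner, and it keeps a full year (8760 hourly samples) comfortably under the 1000-point limit:

```python
# Keep every 9th hourly sample (~9-hour step): 8760 / 9 ≈ 973 points for a year.
nine_hourly_samples = hourly_samples[::9]
```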
Also remember to create a sorted index on the time-field.
