Parsing percent values from string in Excel - excel

I am currently working on a financial model where each team receives different types of news, such as this:
1,379,BondD,News regarding Dayaria Milk Products,Dayaria Milk
Products' sales increase by 25% in comparison to an analyst consensus
of 23%. The market is currently questioning whether it is due to an
increase in sales of organic milk given the health trends, or due to
the temporary increase in prices after the shortage in North America.
P.S. I didn't make the names.
This item reports a sales increase and contains two percent values, but only one of them actually matters.
I already have a parsing function that looks for words such as 'increase' and 'decrease', but I haven't been able to figure out how to differentiate the % values that are useful from the ones that are not.
Thanks ahead of time.

If the increase or decrease percentage always follows those words, use the InStr function first to find the word and then again to find the following % sign.
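' Position of the keyword in the text
longNumber1 = InStr(1, textVariable, "increase")
' Position of the first "%" after the keyword, moved back three characters
longNumber2 = InStr(longNumber1, textVariable, "%") - 3
' The three characters immediately before the "%" sign, e.g. " 25"
stringVariable = Mid(textVariable, longNumber2, 3)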
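For comparison, here is a rough Python sketch of the same idea: find the keyword, then take the first % value that follows it. The function name and the keyword list are made up for illustration, and it assumes the relevant percentage always comes after the keyword, as in the sample news item.

```python
import re

def percent_after_keyword(text, keywords=("increase", "decrease")):
    """Return (keyword, percent) for the first % value after a keyword, or None."""
    # Keyword (optionally with a suffix such as "d" or "s"), then the
    # nearest number that is directly followed by a percent sign.
    pattern = (r"\b(" + "|".join(map(re.escape, keywords)) + r")\w*\b"
               r"[^%]*?(\d+(?:\.\d+)?)\s*%")
    match = re.search(pattern, text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    return match.group(1).lower(), float(match.group(2))

news = ("Dayaria Milk Products' sales increase by 25% in comparison to an "
        "analyst consensus of 23%.")
result = percent_after_keyword(news)
print(result)
```

Only the 25% is picked up here; the consensus figure of 23% is ignored because no keyword precedes it directly.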

Related

How can I determine the 'total cost' from a tiered pricing structure using standard formulas in Excel?

I'm trying to evaluate various tiered pricing structures (for say, electricity plans) using Excel (more-or-less) to see what costing/plan is 'optimal', given some existing usage data I have.
Consider an example 'Table of Usage & Rates' (with fictitious but easily manipulated values):-
For a daily usage value of 120, we'd have 100 (in the 1st tier) and 20 (in the 2nd tier). The amount used within a tier gets charged at a certain rate (the 'factor')... and each 'tier charge' is added together to form a total charge for the day.
So, we can calculate:-
100 x 8 = 800 ...a part of the total
20 x 4 = 80 ...another part of the total
...and that's all, giving a total of 880.
...but how to do that in a single formula within a cell?
I've done some pretty decent explorations for a few hours today, as I can't nut out how to deal with this... and most suggestions talk about multiple =IF formulas (cumbersome and unscalable - I shouldn't need to recode cell contents if I split/add another tier)... and suggestions with =VLOOKUP just don't 'click' with me ( = I don't understand them).
I'm actually using 'PlanMaker', a component of Softmaker's 'Office 2021' product to create/maintain this spreadsheet.. and there is no VBA-like plugin available.
I'd appreciate a method of attack, if anyone can suggest something, please...
So:
=product(100,8)+product(20,4)
or if we assume Factor starts in B9 then =product(A9,B9)+product(A10,B10)+product(A11,B11)
then take the sum of those results etc assuming A9 is the amount used.
You can also use:
=sumproduct(A9:A11,B9:B11)
for the same result; it needs only one cell and a lot less typing.
You can include a 3rd array in sumproduct (or as many as needed) such as a binary value to include in the calculation or not.
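As a rough illustration of what the SUMPRODUCT approach computes, here is a small Python sketch that first splits the usage across tiers and then sums amount times rate. The tier table (first 100 units at 8, everything above at 4) is an assumption taken from the worked example in the question, and the function name is made up.

```python
def tiered_total(usage, tiers):
    """tiers: list of (tier_size, rate); a tier_size of None means no upper limit."""
    remaining = usage
    amounts, rates = [], []
    for size, rate in tiers:
        # Amount of usage that falls inside this tier
        used = remaining if size is None else min(remaining, size)
        amounts.append(used)
        rates.append(rate)
        remaining -= used
    # Same calculation as =SUMPRODUCT(amounts, rates)
    return sum(a * r for a, r in zip(amounts, rates))

# Assumed tiers: first 100 units at rate 8, the rest at rate 4
tiers = [(100, 8), (None, 4)]
total = tiered_total(120, tiers)
print(total)
```

Adding or splitting a tier only changes the `tiers` list, which is the same scalability benefit the SUMPRODUCT formula has over nested IFs.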

"IF" function for analysis of hospital lab frequency

I work for a hospital that is part of a larger network. We were recently asked by our corporate overlords to address the use of a specific laboratory test. In general, this test should only be performed daily, which should be considered to correspond to a 24-hour period from the last draw. Sometimes, however, based on when people arrive at the hospital (e.g. 7pm), and in the interest of bundling labs for a single draw, they may be drawn sooner to coincide with routine testing, i.e. 5am. It would never otherwise be necessary to repeat the test within a short (8-hour) window, particularly on the same day.
We have been asked to validate whether we are adhering to this general practice, as testing any more frequently than that, say, within 12h of a previous test, has no real clinical value and thus adds unnecessary cost.
To address this issue I was given a dataset that among other items includes all instances the lab was performed including collection date and time.
Please see the HIPAA-safe example below (to be clear, no real data, and the identifiers are not real); the actual dataset has over 4,174 entries corresponding to 1,328 unique persons. Everyone had at least one test performed; not everyone had more than one.
I THINK what I want to do is an IF formula that reads the preceding row to 1) check whether it is the same person and 2) if so, subtract the time stamps to display the difference in time, which I can then filter, create a histogram from, etc. Does this seem like a reasonable approach? Is there a preferable method to facilitate analysis? Do any other forms of analysis come to mind?
=IF(B2=B1, D2-D1, "n/a")
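The same per-person check can be sketched outside Excel. The Python below is only an illustration with invented identifiers and timestamps: it assumes the rows are sorted by person and then by collection time, which the IF formula also relies on, and the 12-hour cut-off is the one mentioned in the question.

```python
from datetime import datetime

def hours_since_previous(rows):
    """rows: list of (person_id, collected_at), sorted by person and then time."""
    gaps = []
    previous = None
    for person, collected in rows:
        if previous is not None and previous[0] == person:
            # Same person as the row above: hours since the previous draw
            gaps.append((collected - previous[1]).total_seconds() / 3600)
        else:
            gaps.append(None)  # first draw for this person, the "n/a" case
        previous = (person, collected)
    return gaps

# Invented sample data, not taken from the real dataset
rows = [
    ("P1", datetime(2021, 1, 1, 19, 0)),
    ("P1", datetime(2021, 1, 2, 5, 0)),
    ("P1", datetime(2021, 1, 3, 5, 0)),
    ("P2", datetime(2021, 1, 1, 8, 0)),
]
gaps = hours_since_previous(sorted(rows))
too_soon = [g for g in gaps if g is not None and g < 12]
print(gaps, too_soon)
```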
example data set with formula:
any other forms of analysis come to mind?
By the looks of it you should consider taking the values under "Results" into account, assuming there is a band that might be considered 'normal' readings. The "one in 24 hours is sufficient" rule of thumb may well be appropriate for a series of values within the 'normal' band but not so much so if readings are close to 'danger level'.
That is, in some cases a higher than 'standard' frequency of monitoring may be in the patient's interest, even if not hospital policy, so it may be worth separating the "less than 24 hours interval" readings into those where the higher frequency provided information of little value (eg readings remaining within a 'normal' band) from any that crossed into or out of the band and/or large changes in value. This though may be more a matter of statistical analysis than programming and depend upon whether any action might be taken as a result of such "extra" readings.

Excel - Way to calculate paying salaries

I work at a restaurant chain with about 170 employees. Currently we pay most of them by cash, and our administration team counts 'what bank notes' to pay them manually.
So for example if one of our waiters gets paid 2152 Euro, then the function should calculate the following:
100 Euro * 21
50 Euro * 1
2 Euro * 1
Anything like that in excel?
This is just one way to do it. I chose to use a helper column to keep the formulas simple for you
Here is how it looks:
This formula goes in B3 and you should drag it down. This column shows the remainder due after each note (e.g. 52€ remaining after 2100€ in 100€ notes has been accounted for).
=MOD(B2,A3)
Then place this formula in C3 and drag it down
=IF(B2<>B3,(B2-B3)/A3,0)
That will give you how many of each note to dispense to the worker.
Note
This method simply assumes you want to issue the fewest notes. It would be a good idea to have a table which allows you to indicate if you are out of a particular note so that the model uses 2x€50 notes rather than 100€ notes, for example.
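The helper-column logic (remainder via MOD, then the count per note) can be sketched in Python as below. The list of denominations is an assumption, since the answer's table is not shown, and the `unavailable` parameter is a hypothetical way to model being out of a particular note.

```python
def notes_for(amount, denominations, unavailable=()):
    """Greedy breakdown: largest notes first, skipping any that are unavailable."""
    breakdown = {}
    remaining = amount
    for note in sorted(denominations, reverse=True):
        if note in unavailable:
            continue
        # count = how many of this note fit, remaining = what is still due
        count, remaining = divmod(remaining, note)
        if count:
            breakdown[note] = count
    return breakdown, remaining

# Assumed denominations, with 100 as the largest note as in the question
EURO = [100, 50, 20, 10, 5, 2, 1]
breakdown, left_over = notes_for(2152, EURO)
print(breakdown, left_over)
```

Calling `notes_for(2152, EURO, unavailable={100})` falls back to 50€ notes, which is the situation the note above describes.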
Perhaps you can use truncation and division to calculate the number of notes for each denomination.
For example, to get the number of 100 notes, and then the number of 50 notes from what is left:
NumberOf100Note = Trunc(givenAmount/100,0)
NumberOf50Note = Trunc((givenAmount - 100 * NumberOf100Note) / 50, 0)
and so on for the smaller notes.
These are easy to build in Excel.

Simulation in Excel using probability

I am trying to create a spreadsheet that can find the most likely probability that a student scored a specific grade on a test.
Only one student can score a grade and only one grade can have a student.
I have limited information about each student.
There are 5 students (1,2,3,4,5)
and the grades possible are only (100,90,80,70,60)
In the spreadsheet a 1 denotes that the student DIDN'T score that grade.
Does anyone know how to make a simulation that I can find the most likely probability of what student scored what grade?
Link:
https://docs.google.com/spreadsheets/d/1a8uUIRzUKsY3DolTM1A0ISqMd-42WCUCiDsxmUT5TKI/edit?usp=sharing
Based on your response in comments, each student has an equal likelihood of getting each grade. No simulation is necessary.
If you want to simulate it anyway, don't use Excel*. Create a vector of students, and pair it with a shuffled vector of the grades. Lather, rinse, repeat as many times as you want to see that the student-to-grade matching is uniformly distributed.
* - To get an idea of how bad Excel can be for random variate generation, enable the Analysis Toolpak, go to "Data -> Data Analysis" on the ribbon, and select "Random Number Generation". Fill in the tabs that you want 10 variables, number of random numbers 2000, a "Normal" distribution, leave the mean and std dev at 0 and 1, and enter a "Random Seed" value of 123. You will find that the resulting table contains 3 instances of the value "-9.35764". Values that extreme should occur about once per twenty thousand years if you generate a billion a second. Getting three of them is so extreme that it should happen once per 10^30 times the current estimated age of the universe. Conclude that a) it's your lucky day, or b) Excel sucks at random numbers, and despite being informed about this as far back as 1998 Microsoft hasn't bothered to fix it.
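The shuffle-and-pair approach described in the answer could look roughly like this in Python. It is only a sketch: the function name, the seed and the number of trials are arbitrary choices, and it assumes (as the answer does) that every student is equally likely to get each grade.

```python
import random
from collections import Counter

def simulate(students, grades, trials, seed=0):
    """Count how often each (student, grade) pairing occurs over many shuffles."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        shuffled = grades[:]
        rng.shuffle(shuffled)  # one random one-to-one assignment
        counts.update(zip(students, shuffled))
    return counts

students = [1, 2, 3, 4, 5]
grades = [100, 90, 80, 70, 60]
trials = 20000
counts = simulate(students, grades, trials)
frequencies = {pair: n / trials for pair, n in counts.items()}
for pair in sorted(frequencies):
    print(pair, round(frequencies[pair], 3))
```

Each of the 25 student-grade pairs should come out close to 0.2, which is the uniform distribution the answer predicts.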

View Collation with Couchbase

We are using couchbase as our nosql store and loving it for its capabilities.
There is however an issue that we are running in with creating associations
via view collation. This can be thought of akin to a join operation.
While our data sets are confidential I am illustrating the problem with this model.
The volume of data is considerable, so it cannot be processed in memory. Let's say we have data on ice-creams, zip codes and the average temperature of the day.
One type of document contains a zipcode to icecream mapping
and the other one has transaction data of an ice-cream being sold in a particular zip.
The problem is to be able to determine a set of top ice-creams sold by the temperature of a given day.
We crunch this corpus with a view to emit two outputs: one is a zipcode-to-temperature mapping, while the other
represents an ice-cream sale in a zip code:
Key Value
[zip1] temp1
[zip1,ice_cream1] 1
[zip2,ice_cream2] 1
The view collation here is a mechanism to create an association between the ice_cream sale, the zip and the average temperature, i.e. a join.
We have a constraint that the temperature lookup happens only once in 24 hours when the zip is first seen and that is the valid
avg temperature to use for that day. eg lookup happened at 12:00 pm on Jan 1st, next lookup does not happen till 12:00 pm Jan 2nd.
However the avg temperature that is accepted in the 1st lookup is valid only for Jan 1st and that on the 2nd lookup only for Jan 2
including the first half of the day.
Now things get complicated when I want to do the same query with a time component involved, concretely associating the average temperature of a
day with the ice-creams that were sold on that day in that zip, e.g. x vanilla ice-creams were sold when the average temperature for that day was 70 F:
Key Value
[y,m,d,zip1] temp1
[y,m,d,zip2,ice_cream2 ] 1
[y,m,d2,zip1,ice_cream1] 1
This has an interesting impact on the queries: if I query for the last 1 day, I cannot make any associations between the ice-cream and temperature before the
first lookup happens, since that is when the two keys align. The net effect being that I lose the ice-cream counts for that day before that temperature lookup
happens. I was wondering if any of you have faced similar issues and if you are aware of a pattern or solution so as not to lose those counts.
First, welcome to StackOverflow, and thank you for the great question.
I understand the specific issue that you are having, but what I don't understand is the scale of your data - so please forgive me if I appear to be leading down the wrong path with what I am about to suggest. We can work back and forth on this answer depending on how it might suit your specific needs.
First, you have discovered that CB does not support joins in its queries. I am going to suggest that this is not really an issue if CB is used properly. The conceptual model for how Couchbase should be used to filter out data is as follows:
1. Create CB view to be as precise as possible
2. Select records as precisely as possible from CB using the view
3. Fine-filter records as necessary in data-access layer (also perform any joins) before sending on to rest of application.
From your description, it sounds to me as though you are trying to be too clever with your CB view query. I would suggest one of two courses of action:
Manually look-up the value that you want when this happens with a second view query.
Look up more records than you need, then fine-filter afterward (step 3 above).
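To make the "join in the data-access layer" idea concrete, here is a rough Python sketch. It does not use any Couchbase API: it assumes the two sets of view rows (temperatures and sales) have already been fetched by separate queries, and the row shapes, names and sample values are invented for illustration.

```python
from collections import defaultdict

def join_sales_with_temperature(temp_rows, sale_rows):
    """temp_rows: ((date, zip), temp); sale_rows: ((date, zip, ice_cream), count)."""
    temps = dict(temp_rows)  # (date, zip) -> average temperature
    totals = defaultdict(int)
    for (date, zip_code, ice_cream), count in sale_rows:
        # Matched by key rather than by row order, so sales recorded before
        # the day's temperature lookup are still counted.
        temp = temps.get((date, zip_code))  # None if no lookup exists for that day
        totals[(temp, ice_cream)] += count
    return dict(totals)

# Invented rows in the shape of the keyed view output from the question
temp_rows = [(((2014, 1, 1), "zip1"), 70)]
sale_rows = [
    (((2014, 1, 1), "zip1", "vanilla"), 1),
    (((2014, 1, 1), "zip1", "vanilla"), 1),
    (((2014, 1, 1), "zip2", "mint"), 1),
]
joined = join_sales_with_temperature(temp_rows, sale_rows)
print(joined)
```

Whether this is practical depends on how many rows a single day's query returns, which is the question of scale raised at the start of the answer.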
