How to analyze data in Excel


I have an Excel document containing a large amount of data that needs analysis.
The data is basically objects with corresponding error messages. Typical output is:
**REC NO07/121007163**
Valuation for 0001 IFRS16 Balance sheet valuation
Asset transactions already posted need to be reversed
The valuation could not be completed
**REC NO07/121007165**
Valuation for 0001 IFRS16 Balance sheet valuation
Asset transactions already posted need to be reversed
The valuation could not be completed
**REC NO07/121007220**
Valuation for 0001 IFRS16 Balance sheet valuation
Closing balance 5 070,00 NOK liability available
Difference 5 070,00- NOK between clearing and expense available
**REC NO07/121007221**
Valuation for 0001 IFRS16 Balance sheet valuation
Closing balance 5 070,00 NOK liability available
Difference 5 070,00- NOK between clearing and expense available
What you see in bold above, is the object. This is not in bold in Excel, but I have made it bold here to explain. Everything in-between is the error message for that object.
The length (number of lines) of the error message could vary between objects.
What I would like to do, is basically convert the above to this:
REC NO07/121007163 Valuation for 0001 IFRS16 Balance sheet valuation. Asset transactions already posted need to be reversed. The valuation could not be completed
REC NO07/121007165 Valuation for 0001 IFRS16 Balance sheet valuation. Asset transactions already posted need to be reversed. The valuation could not be completed
REC NO07/121007220 Valuation for 0001 IFRS16 Balance sheet valuation. Closing balance 5 070,00 NOK liability available. Difference 5 070,00- NOK between clearing and expense available
REC NO07/121007221 Valuation for 0001 IFRS16 Balance sheet valuation. Closing balance 5 070,00 NOK liability available. Difference 5 070,00- NOK between clearing and expense available
I am adding a tab between the object and the error message.
I am combining all lines of the error message with ". "
Is this possible in Excel and if yes, is there anyone that could help me with that?
Thank you
Best regards
Antonis
I have tried to do this with formulas in Excel but as the number of lines for each error message varies, I was not able to solve it.

Assuming every object line starts with REC, and that there are no Excel version constraints (none are listed in the question's tags), you can use the following formula in cell B1:
=LET(A, A1:A16, m, ROWS(A), seq, SEQUENCE(m), idx, FILTER(seq, (LEFT(A,3)="REC")),
start, idx+1, end, VSTACK(DROP(idx-1,1), m), MAP(start, end,
LAMBDA(s,e, INDEX(A,s-1)&" "&TEXTJOIN(". ",, FILTER(A, (seq>=s) * (seq<=e))))))
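Basically, it first finds the row positions of the object lines (idx) and from those derives the start and end rows of each error message. MAP then iterates over each start/end pair: FILTER selects the message rows, TEXTJOIN concatenates them with ". ", and INDEX(A,s-1) prefixes the object line. Note that the formula separates the object from the message with a space; replace " " with CHAR(9) to get the tab the question asks for.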
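For comparison, the same grouping logic can be sketched outside Excel. This is a minimal Python sketch, not part of the answer above; it assumes the lines have been exported to a list of strings and that every object line starts with "REC". The function name and parameters are made up for illustration.

```python
def flatten_messages(lines, prefix="REC", sep=". ", delimiter="\t"):
    """Turn 'object line followed by message lines' into one line per object."""
    records = []  # each entry is (object line, list of its message lines)
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank rows
        if text.startswith(prefix):
            records.append((text, []))  # a new object starts here
        elif records:
            records[-1][1].append(text)  # message line of the current object
    # Object, then a tab, then the message lines joined with ". "
    return [header + delimiter + sep.join(body) for header, body in records]


sample = [
    "REC NO07/121007163",
    "Valuation for 0001 IFRS16 Balance sheet valuation",
    "Asset transactions already posted need to be reversed",
    "The valuation could not be completed",
    "REC NO07/121007220",
    "Valuation for 0001 IFRS16 Balance sheet valuation",
    "Closing balance 5 070,00 NOK liability available",
    "Difference 5 070,00- NOK between clearing and expense available",
]
result = flatten_messages(sample)
for row in result:
    print(row)
```

Because a new record is opened whenever an object line appears, the number of message lines per object can vary freely, which is the part that is hard to express with fixed-offset formulas.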

Related

How to calculate quartiles in Excel on negative and positive total returns for different funds?

This may have been asked before; if so, please point me to it.
I have the following:
Column A Column B Column C:
Name: 2020 2019
Fund A -5% +2.5%
Fund B -2.5% -5%
Fund C -7% +9%
Fund D -12% -2%
I have tried different Excel formulas:
=MATCH(F4, QUARTILE.INC($F$4:$F$15, ROW(INDIRECT("1:4"))-1),1)
I have also tried using..
=VLOOKUP(PERCENTRANK($F$4:$F$15,F4),{0,1;0.25,2;0.5,3;0.75,4},2,1)
Both work fine when the data set contains only positive total returns for each fund, but they do the opposite when applied to negative returns, i.e. they rank the worst-performing fund as Q1.
I am looking for an Excel formula that works for both negative and positive returns and gives their quartile ranking.
Any help?

How to tell excel to copy from one sheet to another with the condition?

I have an Excel sheet with around 4000 rows.
It holds treatment details for patients, and many patients came in for tests repeatedly.
I want to create a new sheet that collects information from the old sheet.
Conditions:
I need only one row to represent each patient.
From the second column onwards, the information has to be filled in as follows.
For example, if the data is entered in the following way:
patient_id test1 test2 test3
001 1 0 1
001 0 1 0
.
.
.
002 1 1 1
002 0 0 0
.
.
.
003 1 0 0
.
.
.
In the new sheet, the first column should show the patient ID. The second column should return 1 if she has at least one 1 in test1 across all of her visits, and 0 otherwise.
I don't know how many times a patient comes in for tests; it is not uniform.
The same applies to the second and third test columns.
How can I do that?
I hope the patient name column can be carried over in the same way.
If this is not possible in Excel but can be done easily in other software, I would like to know that.
Thanks for helping!

Using a pivot table you can filter on patient ID and test.
To insert a pivot table, go to the Insert tab and select PivotTable.
In the filter on patient ID, simply select the patient ID you want to see.
More extensive tutorials on how to use and set up pivot tables can be found on the internet.
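Since the question also asks whether other software could do this, here is a minimal Python sketch of the "one row per patient, 1 if any visit had a 1" rule. It is an illustration only: it assumes the rows have been read into tuples of (patient_id, test1, test2, test3) with 0/1 values, and the function name is hypothetical.

```python
def collapse_visits(rows):
    """Collapse repeated visits into one row per patient.

    A test column becomes 1 if the patient has a 1 in that column
    on at least one visit, otherwise it stays 0.
    """
    summary = {}  # patient_id -> list of 0/1 flags, in first-seen order
    for patient_id, *flags in rows:
        if patient_id not in summary:
            summary[patient_id] = list(flags)
        else:
            # max() over 0/1 values acts as a logical OR per column
            summary[patient_id] = [
                max(old, new) for old, new in zip(summary[patient_id], flags)
            ]
    return summary


visits = [
    ("001", 1, 0, 1),
    ("001", 0, 1, 0),
    ("002", 1, 1, 1),
    ("002", 0, 0, 0),
    ("003", 1, 0, 0),
]
per_patient = collapse_visits(visits)
for patient_id, flags in per_patient.items():
    print(patient_id, *flags)
```

The number of visits per patient does not need to be known in advance, because each row is simply folded into the running result for its patient ID.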

Test data on next record

I would like to know if it is possible to read the next record when we are using SyncSORT (SyncTool) based on a certain condition.
Example of the input
Sort key will be account nbr + descending record type + amount
account nbr amount record type
11111111111 10 reversal not in the output
11111111111 10 deposit not in the output
33333333333 20 deposit in the output
44444444444 15 deposit in the output
55555555555 20 reversal in the output
55555555555 10 deposit in the output
66666666666 30 reversal in the output no match
When a reversal record is read, a deposit with the same amount should follow; in that case neither the reversal nor the deposit should be in the output file. It is possible that the amounts of the reversal and the deposit differ; in that case both records should be in the output file.
output
33333333333 20 deposit
44444444444 15 deposit
55555555555 20 reversal
55555555555 10 deposit
66666666666 30 reversal

Yes. As long as your SyncSORT is up-to-date enough.
You need to use JOINKEYS. Specify the same DSN for both input datasets, and indicate that they are SORTED. There is an undocumented feature which allows the use of JNFnCNTL files, like DFSORT.
In JNF1CNTL (which is a "preprocessor" for the first JOINKEYS dataset) temporarily add a sequence number to each record. The default is that the sequence starts at one. Here it is useful to be explicit...
Because, in JNF2CNTL you want to do the same thing, but start the sequence at zero (START=0).
The key for each of the JOINKEYS is the sequence number.
Use JOIN UNPAIRED,F1. Define a REFORMAT with all the data from the first file, and data for comparison from the second file.
This is what a four-record dataset would look like if you imagine the join:
- - A 0
A 1 B 1
B 2 C 2
C 3 D 3
D 4 - -
Because you specify JOIN UNPAIRED,F1 you won't actually see the mismatched A 0 (because that is on F2) but you will see the mismatched D 4.
If you look at your REFORMAT record, you now have data from the "current" record, and data from the "next" record.
Then there's a little more work to select only the records you want. But, dinner first...
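The remaining selection step is not spelled out in the answer. As a rough illustration of the "compare the current record with the next record" logic, here is a Python sketch. It is not SyncSORT control-statement syntax; it assumes the records are already sorted as described in the question and are held as (account, amount, record_type) tuples, and the function name is hypothetical.

```python
def drop_matched_reversals(records):
    """Drop each reversal that is immediately followed by a deposit
    on the same account with the same amount, together with that deposit."""
    output = []
    i = 0
    while i < len(records):
        current = records[i]
        # The "next" record, or None when current is the last one
        nxt = records[i + 1] if i + 1 < len(records) else None
        if (
            nxt is not None
            and current[2] == "reversal"
            and nxt[2] == "deposit"
            and current[0] == nxt[0]  # same account number
            and current[1] == nxt[1]  # same amount
        ):
            i += 2  # matched pair: keep neither record
            continue
        output.append(current)
        i += 1
    return output


records = [
    ("11111111111", 10, "reversal"),
    ("11111111111", 10, "deposit"),
    ("33333333333", 20, "deposit"),
    ("44444444444", 15, "deposit"),
    ("55555555555", 20, "reversal"),
    ("55555555555", 10, "deposit"),
    ("66666666666", 30, "reversal"),
]
kept = drop_matched_reversals(records)
for account, amount, record_type in kept:
    print(account, amount, record_type)
```

In the JOINKEYS approach the same comparison would be made on the REFORMAT record, which carries the current record's fields next to the next record's fields.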

COUNTING IN VBA

I have a number of clients in an excel spreadsheet (by client name), each associated with a particular item. For example
12345 1
12345 2
12345 2
23451 1
23451 3
55667 1
55667 2
89001 3
99999 1
99999 2
I need to count the number of distinctly different items for each client - in the above example, client 12345 has bought 3 items (output is 3); client 23451 has bought 2 items (output 2); client 89001 has bought one item (output 1). I'm sure it's a COUNT feature which looks to the previous column A and breaks/restarts the count if the client number changes, but I'm having a devil of a time doing it. Any help would be deeply appreciated.

Have you considered the SUM function instead? The COUNT function counts the amount of cells that are being used, the SUM adds up the integers within the cells.
Yes, or a better suggestion from David in the comments, the SumIf function.
Thanks David!
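To make the counting rule concrete, here is a small Python sketch. It is only an illustration and not the VBA the title asks for. The question's expected output for client 12345 is 3, which matches the number of rows, while the number of distinct item values for that client is 2, so the sketch reports both. The function name is hypothetical.

```python
from collections import defaultdict


def count_items(rows):
    """Return {client: (number of rows, number of distinct items)}."""
    purchases = defaultdict(list)
    for client, item in rows:
        purchases[client].append(item)
    return {
        client: (len(items), len(set(items)))
        for client, items in purchases.items()
    }


rows = [
    (12345, 1), (12345, 2), (12345, 2),
    (23451, 1), (23451, 3),
    (55667, 1), (55667, 2),
    (89001, 3),
    (99999, 1), (99999, 2),
]
counts = count_items(rows)
for client, (total, distinct) in counts.items():
    print(client, total, distinct)
```

Grouping by client first means the data does not have to be sorted by client number, unlike an approach that restarts a counter whenever the value in column A changes.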

Create an excel formula for "buy one, get the rest 50% off"

I need to create a formula in excel that will kind of do a "buy one item, get the rest at 50% off".
I need excel to pick the most expensive item and charge it at full value, then charge the rest at 50% of their value:
Item A=$30
Item B=$21
If on day one, item A was bought 2 times, and then item B was used once, I need excel to pick out the most expensive item of the day (which would be item A) and charge it at 100% of its value ($30), and then for the second item A, charge it at 50% of its value ($15), and item B would also be at 50% of its value ($10.5). So the total charge for the day would be $55.50.
I have set up names for each item that correlates to its price. If I put =sum(itemA) in a cell in excel, then it comes up with 30.
I have it set up so that I can put in the number of each item that was bought and excel can multiply it for me: =sum(itemA*2) --> 60. I just need to figure out the 50% discount for all of the items bought in one day.
Please help, and let me know if there is anymore info that I need to share!!!!
ADDITIONAL:
I have added three items using the name function under "define name". Item A is equal to 30, item B equal to 21, item C equal to 15. So this is what I have set up for example, for day one:
Item Quantity Total price
item A 2 60 =sum(itemA*2)
item B 1 21 =sum (itemB*1)
item C 0 0 =sum (itemc*0)
total daily charges: 81 =sum(C2:C4)
total daily charges with discount: 55.5 (THIS IS WHERE I NEED THE FORMULA!)
ADDITIONAL:
Ok, so after working with this formula, I have another question:
I have two sets of this data, and excel will pick the most expensive item of the two sets and charge it at 100% and then charge the rest at 50%. However, I now need a way to separate out the charges for the two sets of data and get their totals. So, for example:
Item A=30, item B=21, item C=15
Set one: item A used 2x, item B used 1x
Set two: item B used 1x, item C used 1x
Excel picks item A (as this is the highest in both sets) and charges it at 100% (30), then charges the rest of the items at 50% (43.5). The total that is charged is 73.5
Now I need excel to separate out the charges by set.
So set one, the charge is 55.5
set two, the charge is 18.
Please let me know if additional details are needed.

Assuming a layout as in A:E below, three added columns might suit, with:
in F2: =MAX(IF(A:A=G2,C:C))
in G2: =IF(A2<>A3,A2,"")
in H2: =IF(G2=0,"",0.5*SUMIF(A:A,G2,E:E))+F2/2
each copied down to suit.
The first is an array formula, so it is entered with Ctrl+Shift+Enter.
The first identifies the daily maximum unit price (before discount).
The second identifies the row that carries the daily summary.
The third performs the calculation (same approach as @Ron Rosenfeld).

Depends on how you set things up, and you don't show that. It might be simpler to use an algorithm that computes 50% of everything, then add back 50% of the most expensive item. So if you have three columns: Items Prices Quantity, (where Prices = Price/Item) you could use a formula like:
=0.5*MAX(Prices)+SUMPRODUCT(Prices,Quantity)*0.5
If some entries in your "Quantity" column might be zero or blank, then use this formula instead:
=SUMPRODUCT(MAX(Prices*(Quantity>0))+SUMPRODUCT(Prices,Quantity))*0.5
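The arithmetic behind these formulas can be checked with a short Python sketch. It is an illustration under the question's example prices (A=30, B=21, C=15); the function names are hypothetical. The split by set follows one possible reading of the follow-up question, where the single full-price item is credited to the first set that contains the most expensive item.

```python
def discounted_total(prices, quantities):
    """Charge everything at 50%, then add back 50% of the most
    expensive item that was actually bought (quantity > 0)."""
    bought = [p for p, q in zip(prices, quantities) if q > 0]
    if not bought:
        return 0.0
    full = sum(p * q for p, q in zip(prices, quantities))
    return 0.5 * (max(bought) + full)


def charges_by_set(prices, sets):
    """Split the charge across sets; `sets` is a list of quantity lists."""
    charges = [0.5 * sum(p * q for p, q in zip(prices, qs)) for qs in sets]
    best = None  # (price, index of the set holding the full-price item)
    for index, qs in enumerate(sets):
        for p, q in zip(prices, qs):
            if q > 0 and (best is None or p > best[0]):
                best = (p, index)
    if best is not None:
        charges[best[1]] += 0.5 * best[0]  # top up one item to 100%
    return charges


prices = [30, 21, 15]  # item A, item B, item C
day_one = discounted_total(prices, [2, 1, 0])
both_sets = discounted_total(prices, [2, 2, 1])
split = charges_by_set(prices, [[2, 1, 0], [0, 1, 1]])
print(day_one, both_sets, split)
```

With the example data this reproduces the figures in the question: 55.5 for day one, 73.5 for both sets combined, and 55.5 and 18 when split by set.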
