Removing some rows based on criteria - solving this by pivot tables - Excel

I have an Excel data set. The 3rd column contains phone numbers. I have to delete the rows whose phone numbers have fewer than 10 digits. The data is very large and not even one mistake is acceptable, so I want to use pivot tables or an automation script. Pivot tables would be better because the number of digits is variable and the column number is variable.
Where I'm stuck: whenever I use pivot tables to do this, the original tabular format is lost. I get a cross-tabular format, which I don't want. Here is the sample data.
date time number count
1-Sep-09 15:29:44 9800000005 1
2-Sep-09 10:07:03 333333 1
3-Sep-09 9:53:46 9800000004 1
7-Sep-09 14:47:31 9800000005 1
10-Sep-09 10:51:39 9800000001 1
12-Sep-09 14:52:50 9800000002 1
13-Sep-09 8:28:28 333333 1
17-Sep-09 10:32:13 9800000001 1
18-Sep-09 9:01:42 9800000005 1

I don't think a cube or code is necessary.
Try adding a calculation to show the length of the phone number to cell E2 with the formula
=len(C2)
(assuming that number appears in C2) - then copy this formula down to the rest of column E.
You can then apply an auto-filter to the table, and use a custom filter on column E to show all rows where the length is greater than or equal to 10.
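If a script is preferred after all, the same helper-column idea can be sketched in Python. This is only an illustration, assuming the rows have already been read into tuples; the function name keep_valid_rows and its parameters are hypothetical, and len(str(...)) stands in for =LEN(C2).

```python
def keep_valid_rows(rows, number_col=2, min_length=10):
    """Keep rows whose phone number is at least min_length characters long."""
    kept = []
    for row in rows:
        # Same test as the helper column: length of the value in column C.
        if len(str(row[number_col])) >= min_length:
            kept.append(row)
    return kept


# A few rows taken from the sample data in the question.
sample = [
    ("1-Sep-09", "15:29:44", "9800000005", 1),
    ("2-Sep-09", "10:07:03", "333333", 1),
    ("3-Sep-09", "9:53:46", "9800000004", 1),
]
filtered = keep_valid_rows(sample)
print(filtered)
```

Because the column index and the minimum length are parameters, the same function covers the case where the column number or the required digit count changes.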

Related

Generate two false Booleans every ten rows in excel

I need to add a column to my spreadsheet that generates two "false" values at random positions within every ten frames (rows).
So for example rows 1 through 10 could read:
true
true
true
False
true
false
true
true
true
true
and then repeat that for rows 11 through 20, but with the false values randomly placed in different positions, and so on. I want to write a formula that does this for me.
With Office 365:
In the first cell where you want the list to be created, put:
=LET(rws,1000,arr,RANDARRAY(10,rws/10),seq,SEQUENCE(rws,,0),INDEX(MAKEARRAY(10,rws/10,LAMBDA(i,j,INDEX(BYCOL(arr,LAMBDA(v,MATCH(SMALL(v,i),v,0))),1,j)<9)),MOD(seq,10)+1,INT(seq/10)+1))
Change the 1000 to the number of rows desired.
If one does not have Office 365 then put this in the second row of a column and copy it down.
=IF(COUNTIF(INDEX(A:A,MIN(ROW($ZZ1)-MOD(ROW($ZZ1)-1,10)+1,ROW()-1)):INDEX(A:A,ROW()-1),FALSE)>=2,TRUE,IF(COUNTIF(INDEX(A:A,MIN(ROW($ZZ1)-MOD(ROW($ZZ1)-1,10)+1,ROW()-1)):INDEX(A:A,ROW()-1),TRUE)>=8,FALSE,RANDBETWEEN(0,9)<8))
Be aware:
Each cell is chosen randomly in sequence, so FALSE will appear in the last cells of each group of 10 more often than it would in a truly uniform arrangement. One can play with the RANDBETWEEN(0,9)<8 test to try to even this out.
BRUTE FORCE METHOD
There are 10!/(8!*2!) = 45 ways of arranging your True/False requirements
I personally didn't have anything better to do with my time so I wrote out all possible combinations in 45 columns.
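For anyone who would rather not type the 45 columns by hand, the count can be checked and the table generated with a short Python sketch. The function name all_patterns is hypothetical; it simply enumerates every way of choosing 2 FALSE positions out of 10.

```python
from itertools import combinations


def all_patterns(size=10, false_count=2):
    """Return every True/False list of length size with exactly false_count Falses."""
    patterns = []
    for false_positions in combinations(range(size), false_count):
        patterns.append([i not in false_positions for i in range(size)])
    return patterns


patterns = all_patterns()
print(len(patterns))
```

Each inner list corresponds to one of the columns in the possibility table.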
The concept with this methodology is to randomly write out one of the 45 columns every 10 rows. One of the problems here is that using random in a formula does not mean you will be able to use the same random value in the next row of the formula.
A potential random problem side step
In order to make a random result accessible by multiple formula calculations one can spit out the results in a helper column. For this solution we will be randomly selecting from 45 possible columns, so in the first column the following formula is used and copied down. The number of rows will be equal to the number of 10 groupings you will use.
Start in A1 and copy down
=RANDBETWEEN(1,45)
How to make each formula in a group of ten pick the same random number
For demonstration purposes the next column is to generate integers starting at 1 and increasing by 1 after every 10 rows. For the demonstration it would need to be copied down a number of rows equal to the number of results needed (10 * number of groups of 10). Ultimately this formula can be embedded in the final formula.
Start in B1 and copy down
=INT((ROW(A1)-1)/10)+1
For demonstration purposes the next column is to generate integers starting at 1 and increasing by 1 row but resetting to 1 after the 10th row. For the demonstration it would need to be copied down a number of rows equal to the number of results needed (10 * number of groups of 10). Ultimately this formula can be embedded in the final formula.
Start in C1 and copy down
=MOD(ROW(A1)-1,10)+1
So now there is a way of indexing the column you need and what row of that column you need.
Indexing the solution
In the next column the INDEX function is used (twice) to find which column and row to look in within the list of all possible combinations. In this demo, the list of all possible combinations is written out in F1:AX10.
First we start by indexing which random column to use. Since the random numbers are written in column A starting in row 1 I used the following formula:
=INDEX(A:A,B1)
To get the row reference I used the following formula:
=C1
I then took those two formulas and combined them to pull data from the possibility table as follows:
Start in D1
=INDEX($F$1:$AX$10,C1,INDEX(A:A,B1))
Tidying it up
We can't eliminate the random number column, as we need something quasi-static for the formulas to refer to. I say quasi-static because RANDBETWEEN is a volatile function, which means it recalculates every time the sheet recalculates. However, we can place the formulas from B and C into D. This results in the formula in D looking like:
=INDEX($F$1:$AX$10,MOD(ROW(A1)-1,10)+1,INDEX(A:A,INT((ROW(A1)-1)/10)+1))
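The indexing logic of the brute force method can be sketched in Python to see how the pieces fit together. This is a rough model only: build_column is a hypothetical name, the picks list plays the role of the random helper column A, and the indices are zero-based where the sheet formulas are one-based.

```python
import random
from itertools import combinations


def build_column(n_rows, rng=None):
    """Build n_rows True/False values with exactly two Falses per full group of ten."""
    rng = rng or random.Random()
    # The possibility table (F1:AX10 in the demo): 45 patterns of length 10.
    patterns = [[i not in pair for i in range(10)]
                for pair in combinations(range(10), 2)]
    n_groups = (n_rows + 9) // 10
    # Helper column A: one random pattern number per group of ten.
    picks = [rng.randrange(len(patterns)) for _ in range(n_groups)]
    result = []
    for r in range(n_rows):
        group = r // 10  # zero-based version of INT((ROW(A1)-1)/10)+1
        pos = r % 10     # zero-based version of MOD(ROW(A1)-1,10)+1
        result.append(patterns[picks[group]][pos])
    return result


column = build_column(30, random.Random(0))
print(column)
```

Storing the pick once per group is what lets all ten rows of a group read the same random value, which is the reason for the helper column in the sheet.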
It's not clear which version of Excel you're using so this approach will work for all versions:
the starting point is C12:L13, where the formula in row 12 is
=RANDBETWEEN(1,5)
and the formula in row 13 is
=RANDBETWEEN(6,10)
These results determine the positions of the FALSE values in the range starting with cell C1 where the formula is
=NOT(OR(ROW()=C$12,ROW()=C$13))
The array formula in A1:A10 is
=INDEX($C$1:$L$10,,1+MOD(RANDBETWEEN(1,100),10))
column B is just an indexing column containing the formula
=1+MOD(ROW()-1,10)
which, coupled with the conditional formatting in column A illustrates that the positions of the FALSE values are different in each 10-row sequence.
(you will notice that the random numbers generated in columns I and J happen to be the same so, if this is a concern, you could extend the 'helper range' beyond 10 columns in order to augment randomness)

How can you in Excel total predefined values depending on cell values

I have a spreadsheet with two tabs. The first one contains Vehicle Types and a numeric score value.
The second tab has cells that each list a variety of these vehicle types, and the neighbouring cell should show the total score for the vehicle types present.
See images below for illustration.
Is there a way via formula to get the total, in column B in sheet 2, of the corresponding numeric values of column A from sheet 1?
For example, as per the illustration, B2 in sheet 2 would total 3, since in sheet 1 bus has a score of 1 and car 2.
Update:
As per the answer below, I have used the formula;
=SUMPRODUCT(ISNUMBER(FIND(" "&sheet1!A$2:A$4&" "," "&SUBSTITUTE(A4,CHAR(10)," ")&" "))*sheet1!B$2:B$4)
However, I am unfortunately getting zero as the value. Changing the line breaks in column A in sheet2 I am duly able to get the total. Is there a way to do it so irrespective of how the list is presented in the column the total will work?
I think you are after something like this:
Formula in E2:
=SUMPRODUCT(VLOOKUP(FILTERXML("<t><s>"&SUBSTITUTE(D2,CHAR(10),"</s><s>")&"</s></t>","//s"),A$2:B$4,2,FALSE))
If one has O365 you could just use SUM instead since it would auto-CSE the formula.
If you don't have Excel 2013 or later, you could try the following as another option (shorter but not my favourite):
=SUMPRODUCT(ISNUMBER(FIND(" "&A$2:A$4&" "," "&SUBSTITUTE(D2,CHAR(10)," ")&" "))*B$2:B$4)
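The logic behind both formulas (split the cell on line breaks, look up each item, add the scores) can be sketched in Python. The names total_score and scores are hypothetical, and the trimming, lower-casing and skipping of unknown names are assumptions added here to tolerate variations in how the list is typed; the formulas above do not do all of that.

```python
def total_score(cell_text, scores):
    """Sum the scores of the vehicle types listed one per line in cell_text."""
    total = 0
    for line in cell_text.splitlines():
        name = line.strip().lower()
        if name:
            # Unknown names contribute 0 (an assumption for this sketch).
            total += scores.get(name, 0)
    return total


# Scores taken from the question: bus = 1, car = 2.
scores = {"bus": 1, "car": 2}
print(total_score("bus\ncar", scores))
```

Splitting with splitlines() handles both CHAR(10) and Windows-style line endings, which is one way a list can differ between cells.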

Excel Formula to count unique values based on three different criteria

I am trying to get a row by row count of unique invoices in a spreadsheet. I want excel to do this by reading either 1 for unique or zero for duplicate.
I have had success with =IF(COUNTIF($C$3:C3,C3)>1,0,1).
This has given me an accurate count based on one specific column, but I have not had any luck advancing this beyond the one column. I would like this formula to be based on three criteria, not just two.
A          B       C               D          E           F           G
Vendor ID  Name 1  Invoice Number  Inv Date   Sum Amount  Acctg Date  Unique#
00001      A       0000001         3/16/2015  5.00        5/11/2016   1
00010      M       0000001         9/14/2015  10.00       5/24/2016   1
00010      M       0000001         9/4/2015   15.00       5/24/2016   0
00005      K       0000285         4/8/2016   20.00       4/18/2016   1
000106     O       000042          6/7/2016   30.00       6/21/2016   1
000107     H       006333          4/5/2016   6.00        4/11/2016   1
000107     H       006333          4/5/2016   6.00        4/12/2016   1
There are duplicates in all the columns because of how I needed to pull the report. I would like a pull down formula that would give me unique values of A, C, F in a 1,0 format on each row line by comparing each of them against a total combination of each of three columns. Please note vendor M having a duplicate invoice number vs vendor H which has two distinct invoices based on the criteria.
This will be a large drain on resources because of the size of the data. I am looking at around 20-90k lines, but maybe someone can show me a better mousetrap? VBA macro? Match Index? Anyway, onwards to the failures!
Please feel free to explain why they didn't work, or how they could. Also please ignore column locations compared to my example as I was moving things around quite frequently.
=A&C&F then use If(countif('ColumnX')), but this didn't work correctly as I found data that was listed as a repeat when it was actually unique. I think the root problem with doing this was combining the date and general formats into one cell.
=SUMPRODUCT((1/COUNTIFS(E3:E1000,E3:E1000,J3:J1000,J3:J1000,G3:G1000,G3:G1000)))
Multiple versions of AND with IF(CountIF)
Multiple versions of =A&C AND CountIF (Date)
I have also looked at the following questions in SE and found them helpful, but ultimately not what I specifically needed, or I failed at implementation.
Simple Pivot Table to Count Unique Values
I tried this unsuccessfully based on unique invoices, need three criteria not just one.
Count unique values in Excel
See above.
Excel Formula: Count Unique Values in a Row Based on Corresponding Value in Another Row
This looks like it should work, but I tried and failed to correctly adapt to my problem.
Excel - Return Count of Unique Values Based on Two Columns
This also should work perfectly with addition of third column. Formula yelled at me and called me names. Mentioned something about can't fix stupid.
Please let me know if any parts of the question are unclear. I did my best to not duplicate and trim the information down. Thanks in advance!
If I am understanding your problem correctly, basically you want column G to check if the current row is a duplicate (based on columns A, C and F) of any rows above it. If it is, return a 0, else return a 1.
If that is what you are looking to achieve, you can do so using the COUNTIFS() function to know if there are any duplicates above the row and then simply check if the count = 0 or is > 0 (=0 means it's unique, >0 means it is a duplicate).
Your formula for column G would look as follows:
G2: 1 (obviously we know it is unique since there are no values above it to be a duplicate of)
G3: =IF(COUNTIFS($A$2:A2,A3,$C$2:C2,C3,$F$2:F2,F3)=0,1,0)
then, drag G3 downwards.
Hope this is what you were looking for.
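Since the question also asks about alternatives for 20-90k lines, here is a hedged Python sketch of the same "have I seen this combination above?" check. The function name unique_flags is hypothetical; key_cols=(0, 2, 5) assumes columns A, C and F are the first, third and sixth fields of each row.

```python
def unique_flags(rows, key_cols=(0, 2, 5)):
    """Return 1 for the first occurrence of each key combination, 0 for repeats."""
    seen = set()
    flags = []
    for row in rows:
        key = tuple(row[i] for i in key_cols)
        flags.append(0 if key in seen else 1)
        seen.add(key)
    return flags


# Rows copied from the sample table in the question (columns A to F).
rows = [
    ("00001", "A", "0000001", "3/16/2015", 5.00, "5/11/2016"),
    ("00010", "M", "0000001", "9/14/2015", 10.00, "5/24/2016"),
    ("00010", "M", "0000001", "9/4/2015", 15.00, "5/24/2016"),
    ("00005", "K", "0000285", "4/8/2016", 20.00, "4/18/2016"),
    ("000106", "O", "000042", "6/7/2016", 30.00, "6/21/2016"),
    ("000107", "H", "006333", "4/5/2016", 6.00, "4/11/2016"),
    ("000107", "H", "006333", "4/5/2016", 6.00, "4/12/2016"),
]
flags = unique_flags(rows)
print(flags)
```

The set lookup makes this a single pass over the data, whereas the expanding COUNTIFS range rescans all earlier rows for every row.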

Excel: If Cell in Column = text value of X, then display text (in the same row, but different column) on another sheet

This is a confusing request.
I have an excel tab with a lot of data, for now I'll focus on 3 points of that data.
Team
Quarter
Task Name
In one tab I have a long list of this data displaying all the tasks for all the teams and what Quarter they will be on.
I WANT to load another tab, take that data (from the original tab) and insert it in a non-list format. So I would have Quarters 1, 2, 3, 4 as columns going across the screen, and Team Groups going down. I want each "task" that is labeled as Q1 to be listed in the Q1 section of that Team's "Block".
So something like this: "If Column A=TeamA,AND Quarter=Q1, then insert Task Name ... here."
Basically, if the formula = true, I want to print a list of those items within that team section of the excel document.
I'd like to be able to add/move things around at the data level, and have things automatically shift in the Display tab. I honestly have no idea where to start.
If there is never a possibility that there could be more than 1 task for a given team and quarter, then you can use a formula solution.
Given a data setup like this (in a sheet named 'Sheet1'):
And expected results like this (in a different sheet):
The formula in cell B2 and copied over and down is:
=IFERROR(INDEX(Sheet1!$C$2:$C$7,MATCH(1,INDEX((Sheet1!$A$2:$A$7=$A2)*(Sheet1!$B$2:$B$7=B$1),),0)),"")
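What the INDEX/MATCH formula does for each grid cell can be sketched in Python. The names build_grid and records, and the sample tasks, are made up for illustration; as with MATCH(...,0), only the first task found for a team and quarter is shown, and a missing combination gives an empty string like the IFERROR fallback.

```python
def build_grid(records, teams, quarters):
    """Return one row per team holding the first matching task for each quarter."""
    lookup = {}
    for team, quarter, task in records:
        # setdefault keeps the first match, as MATCH(1, ..., 0) does.
        lookup.setdefault((team, quarter), task)
    return [[lookup.get((t, q), "") for q in quarters] for t in teams]


# Hypothetical data in the Team / Quarter / Task Name layout.
records = [
    ("TeamA", "Q1", "Plan"),
    ("TeamB", "Q2", "Ship"),
    ("TeamA", "Q1", "Ignored duplicate"),
]
grid = build_grid(records, ["TeamA", "TeamB"], ["Q1", "Q2"])
print(grid)
```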
I came across a similar situation. When I had to insert values into a table from an Excel sheet, I needed all the information for a product in one row instead of multiple rows. In Excel my data looks like:
ProductID----OrderID
9353510---- 1212259
9650934---- 1381676
9572474---- 1381677
9632365---- 1374217
9353182---- 1212260
9353182---- 1219361
9353182---- 1212815
9353513---- 1130308
9353320---- 1130288
9360957---- 1187479
9353077---- 1104558
9353077---- 1130926
9353124---- 1300853
I wanted single row for each product in shape of
(ProductID,'OrdersIDn1,OrderIDn2,.....')
For a quick solution I fixed it with a third column, column C, to number the orders of each product:
=IF(A2<>A1,1,IF(A2=A1,C1+1,1))
and fourth Column D as a placeholder to concatenate with previous row value of same product:
=IF(A2=A1,D1&","&TEXT(B2,"########"),TEXT(B2,"########"))
Then Column E is the final column I required to hide/blank out duplicate row values and keep only the correct one:
=IF(A2<>A3,"("&A2&",'"&D2&"'),","")
Final Output required is only from Column E
ProductID  Order Id  Sno  PlaceHolder              Required Column
9353510    1212259   1    1212259                  (9353510,'1212259'),
9650934    1381676   1    1381676                  (9650934,'1381676'),
9572474    1381677   1    1381677                  (9572474,'1381677'),
9632365    1374217   1    1374217                  (9632365,'1374217'),
9353182    1212260   1    1212260
9353182    1219361   2    1212260,1219361
9353182    1212815   3    1212260,1219361,1212815  (9353182,'1212260,1219361,1212815'),
9353513    1130308   1    1130308                  (9353513,'1130308'),
9353320    1130288   1    1130288                  (9353320,'1130288'),
9360957    1187479   1    1187479                  (9360957,'1187479'),
9353077    1104558   1    1104558
9353077    1130926   2    1104558,1130926          (9353077,'1104558,1130926')
You will notice that the final value appears only on the row with the highest Sno for each product, which is what avoids duplication.
In your case Product could be Team, Order could be Quarter, and the output could be
(Team,Q1,Q2,....),
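The helper-column approach amounts to grouping order IDs by product and joining them. A minimal Python sketch of that grouping is below; orders_per_product is a hypothetical name, and the output omits the trailing comma that the column E formula appends.

```python
def orders_per_product(pairs):
    """Group order IDs by product ID and format each group as (id,'o1,o2,...')."""
    grouped = {}
    for product_id, order_id in pairs:
        grouped.setdefault(product_id, []).append(str(order_id))
    # Dicts keep insertion order, so products appear in first-seen order.
    return [f"({pid},'{','.join(orders)}')" for pid, orders in grouped.items()]


# A few rows from the ProductID / OrderID data above.
pairs = [
    (9353510, 1212259),
    (9353182, 1212260),
    (9353182, 1219361),
    (9353182, 1212815),
]
lines = orders_per_product(pairs)
print(lines)
```

Unlike the spreadsheet formulas, which compare each row with its neighbour, this does not rely on rows for the same product being adjacent.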
Based on my understanding of your summary above, you want to put non-numerical data into a grid of teams and quarters.
The offset worksheet function will work well for this in conjunction with the match or vlookup functions. I have often done this task by doing the following steps.
In the data table, concatenate the Team and Quarter columns so that there is a unique lookup value in the leftmost column of the table (Note: you can eventually hide this column for ease of reading).
Note: You will want to name the input range for best formula management. Ideally use an Excel Table (2007 or greater) or create a dynamically named range with the offset and CountA functions working together (http://tinyurl.com/yfhfsal)
First, VLOOKUP arguments are VLOOKUP(Lookup_Value,Table_Array,Col_Index_num,[Range Lookup]) See http://tinyurl.com/22t64x7
In the first cell of your output area you would have a VLOOKUP formula that would look like this
=Vlookup(TeamName&Quarter,Input_List,Column#_Where_Tasks_Are,False)
The lookup value should reference the cells where you have the team names listed down the side and the quarter names listed across the top. The input list is the range on the sheet where your data is stored. The third argument is the number of the column in which the tasks are listed in your source data, and False tells the function to use only an exact match in your output.

In Excel 2007, how can I SUMIFS indices of multiple columns from a named range?

I am analysing library statistics relating to loans made by particular user categories. The loan data forms the named range LoansToApril2013. Excel 2007 is quite happy for me to use an index range as the sum range in a SUMIF:
=SUMIF(INDEX(LoansToApril2013,0,3),10,INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6))
Here 10 indicates a specific user category, and this sums loans made to that group from three columns. By "index range" I'm referring to the
INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6)
sum_range value.
However, if I switch to using a SUMIFS to add further criteria, Excel returns a #VALUE error if an index range is used. It will only accept a single index.
=SUMIFS(INDEX(LoansToApril2013,0,4),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
works fine
=SUMIFS(INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
returns #value, and I'm not sure why.
Interestingly,
=SUMIFS(INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,4),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
is also accepted and returns the same as the first one with a single index.
I haven't been able to find any documentation or comments relating to this. Does anyone know if there is an alternative structure that would allow SUMIFS to conditionally sum index values from three columns? I'd rather not use three separate formulae and add them together, though it's possible.
SUMIFS is modelled after an array formula, so the sum range and every criteria range need to be the same size. The last formula works because the range from column 4 to column 4 of LoansToApril2013 is still a single column.
The second-to-last formula has a sum range that is 3 columns wide while the criteria ranges are 1 column wide, which causes the error.
SUMIFS can't do that, but SUMPRODUCT can.
Example:
X 1 1 1
Y 2 2 2
Z 3 3 3
starting in A1
the formula =SUMPRODUCT((A1:A3="X")*B1:D3) gives the answer 3, and altering the value X in the formula to Y or Z changes the returned value to the appropriate sum of the lines.
Note that this will not work if you have text in the area - it will return #VALUE!
If you can't avoid the text, then you need an array formula. Using the same example, the formula would be =SUM(IF(A1:A3="X",B1:D3)), and to enter it as an array formula, you need to use CTRL+SHIFT+ENTER to enter the formula - you should notice that excel puts { } around the formula. It treats any text as zero, so it will successfully add up the numbers it finds even if you have text in one of the boxes (e.g. change one of the 1's in the example to be blah and the total will be 2 - the formula will add the two remaining 1s in the line)
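To make the mechanics of the example concrete, here is a rough Python equivalent of the array formula version: pick the rows whose label matches, then add every numeric cell in those rows. The name conditional_sum is hypothetical, and skipping non-numeric cells models the way the SUM(IF(...)) form copes with text.

```python
def conditional_sum(labels, values, wanted):
    """Sum all numeric cells in the rows whose label equals wanted."""
    total = 0
    for label, row in zip(labels, values):
        if label == wanted:
            # Text cells are skipped, so a stray "blah" does not break the sum.
            total += sum(v for v in row if isinstance(v, (int, float)))
    return total


labels = ["X", "Y", "Z"]
values = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(conditional_sum(labels, values, "X"))
```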
The two answers above and a bit of searching allowed me to find a formula that worked. I'll put it here for posterity, because questions with no final outcome are a pain for future readers.
=SUMPRODUCT( (INDEX(LoansToApril2013,0,3)=C4) * (INDEX(LoansToApril2013,0,1)="PTFBL") * INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6))
This totals up values in columns 4-6 of the LoansToApril2013 range, where the value in column 3 equals the value in C4 (a.k.a. "the cell to the left of this one with the formula") AND the value in column 1 is "PTFBL".
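Despite appearances, the asterisks are not there to scale the values. Each bracketed criterion evaluates to an array of TRUE/FALSE values, which Excel treats as 1/0 when multiplied, so multiplying them acts as a logical AND and only rows that meet every criterion contribute to the sum. I found an explanation on this page, but basically the asterisks are adding criteria to the function. Note that criteria are enclosed in their own brackets, while the range isn't.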
If you want to use named ranges whose names are held in cells, you need to wrap them in INDIRECT inside the INDEX calls.
I used that formula to check for conditions in two columns, and then SUM the results in a table which has 12 columns for the months (the column is chosen by a helper cell which is 1 to 12 [L4]).
So you can do the following. If:
Dept (1 column name range [C6]) = Sales [D6];
Region (1 column name range [C3]) = USA [D3];
SUM figures in the 12 column monthly named range table [E7] for that 1 single month [L4] for those people/products/line item
Just copy the formula across your report page which has columns 1-12 for the months and you get a monthly summary report with 2 conditions.
=SUMPRODUCT( (INDEX(INDIRECT($C$6),0,1)=$D$6) * (INDEX(INDIRECT($C$3),0,1)=$D$3) * INDEX(INDIRECT($E7),0,L$4))
