Working with a HUGE spreadsheet - Excel

I have about 300,000 records in this spreadsheet, and there are a couple hundred columns!
One of the columns is the Social Security number, and I need to replace it with some random identifier. I can't really do a VLOOKUP because that is too taxing, so I think I am going to write a macro.
Can anyone please suggest how I can do this?
Please note that the Social Security numbers appear multiple times, so I need them to map correctly to the new unique identifier.

Create a hash based on the current SSN.
An example is here using SHA1 hash. Plenty of other options exist, including creating your own.
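As a rough illustration of the hashing idea outside Excel, here is a minimal Python sketch. It assumes the SSNs are available as strings. The key `SECRET_KEY` and the function name `pseudonymize` are hypothetical, and the keyed variant (HMAC with SHA-1) is my own addition rather than something the linked example is known to use. The same SSN always produces the same identifier, so repeated values map consistently.

```python
import hashlib
import hmac

# Hypothetical secret key (an assumption, not part of the original answer).
SECRET_KEY = b"replace-with-a-private-key"

def pseudonymize(ssn: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable 40-character hex identifier for one SSN (HMAC-SHA1)."""
    # Strip dashes and whitespace so "123-45-6789" and "123456789" match.
    normalized = ssn.replace("-", "").strip()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha1).hexdigest()

# Made-up sample values for illustration only.
ssns = ["123-45-6789", "987-65-4321", "123-45-6789"]
ids = [pseudonymize(s) for s in ssns]
```

The first and third entries of `ids` are identical because they come from the same SSN.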

Why not simply enter a random number in the column in question with =RAND(), double-click the bottom corner of the cell to copy the formula to the bottom of your sheet, then copy the column and paste it over itself with Paste Special > Values to get rid of the formula?
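Since the question needs repeated SSNs to share one identifier, a per-row random number would have to be combined with a lookup. The following is only a sketch of that idea in Python, assuming the SSN column has been read into a list. The function name `assign_random_ids` and the sample values are hypothetical.

```python
import secrets

def assign_random_ids(ssns):
    """Give every distinct SSN one random identifier and reuse it on repeats."""
    mapping = {}   # SSN -> identifier
    used = set()   # identifiers already handed out
    result = []
    for ssn in ssns:
        if ssn not in mapping:
            new_id = secrets.token_hex(8)
            # Collisions are extremely unlikely, but retry to guarantee uniqueness.
            while new_id in used:
                new_id = secrets.token_hex(8)
            used.add(new_id)
            mapping[ssn] = new_id
        result.append(mapping[ssn])
    return result, mapping

# Made-up sample values for illustration only.
ids, mapping = assign_random_ids(["111-11-1111", "222-22-2222", "111-11-1111"])
```

Here `ids[0]` and `ids[2]` are equal, and `mapping` holds two entries. The dictionary lookup takes the place of the VLOOKUP the question wanted to avoid.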

Related

Making an auto-updating list of goals sorted by difficulty

Project Details
I am currently making a checklist of long-term goals, each goal has difficulty self-assigned to it between 1 and 9, and I want to group these goals by difficulty, either in the one-column or in different tables.
As well as this, I've used a checkbox (Form Control) which I will tick once the goal is completed, which will strike through the goal as well as italicize it. The general idea is something like this. The goal cell is formatted when the corresponding checkbox value is TRUE.
My Issue
My problem is that when I want to add new goals to the list, I will need to manually move and adjust the goal list so that the goals are grouped by difficulty.
I would prefer to add goals to the bottom of the list and then use the sort tool to sort the table by difficulty (lowest to highest), or potentially find a way of adding the goal details to one table and it being grouped by difficulty automatically in another table.
Attempted Solutions
1. When I sort the table by difficulty, the correct goal cell is formatted but the checkbox does not move with the sort. (Here I changed the sort from smallest-to-largest to largest-to-smallest to highlight the difference.)
I believed this was because I included the checkbox reference cell inside the table.
2. Knowing this, I moved the checkbox reference cells outside of the table. After sorting the table, the checkboxes now move to the correct cell, however the incorrect goals are crossed out. This makes sense, as the cells the goals are formatted on haven't moved.
3. Changing the reference from locked to unlocked (e.g. =$D$2 to =D2) does not seem to change the result in either scenario. Neither does using a formula to populate the goal column based on another table containing the list.
Despite knowing all this, I have not been able to figure out a way to move the checkboxes correctly when sorting while also sorting the reference cells, so that the correct goal is crossed off. I suspect the solution will need one of:
VBA code (which I have no experience in)
a smarter formula for conditional formatting
using ActiveX checkboxes (Which I would prefer not to)
potentially changing the layout of my whole list into separate same-difficulty tables.
Or something really obvious that I'm missing
Hopefully what I'm after isn't impossible, and any help would be greatly appreciated.
Cheers :)

Excel: Find and select multiple cells

I am an excel noob with a massive and weirdly organized spreadsheet that contains several columns with hundreds of account number values.
I have identified 70 of these accounts that are "special" based on criteria not in this spreadsheet. I wish to select EVERY cell containing one of the 70 identified account IDs, but I can't figure out how to do this. The Excel search function doesn't have any OR operator.
Is there a simple way to do this? If not can someone please be really specific when describing the hard way to do it? I'm sure this can be done somehow.
Thank you so much!
And in addition to Tim’s question: do you have the 70 accounts in a table somewhere? If so, you might be able to use VLOOKUP.
Well, you can just hold Ctrl and click on every cell you want to select, without releasing the key.
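For a sheet this size, it may help to see the "OR search" as a membership test against the list of 70 IDs. This is just a sketch in Python, assuming the sheet has been loaded as a list of rows. The function name `find_matching_cells` and the sample account IDs are made up for illustration.

```python
def find_matching_cells(grid, special_ids):
    """Return 1-based (row, column) positions of cells holding a wanted ID."""
    # Compare trimmed text so numbers and padded strings still match.
    wanted = {str(x).strip() for x in special_ids}
    hits = []
    for r, row in enumerate(grid, start=1):
        for c, value in enumerate(row, start=1):
            if str(value).strip() in wanted:
                hits.append((r, c))
    return hits

# Made-up sample data for illustration only.
grid = [
    ["A100", "B200", "C300"],
    ["D400", "A100", "E500"],
    ["F600", "G700", "B200"],
]
special = ["A100", "B200"]
hits = find_matching_cells(grid, special)
# hits == [(1, 1), (1, 2), (2, 2), (3, 3)]
```

Inside Excel, the same test could probably be expressed with a COUNTIF against the list of IDs in a conditional formatting rule, but that is a suggestion rather than something from the answers above.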

Excel 2013: Count unique values or IDs in column B with a condition which needs to be filtered in column A

I am a noob in excel, hence pardon me for any mistakes made.
This question must have been answered before but I couldn't find the right string to make it work for me.
There are around 500 rows and 20 columns (Yes, it is a report)
Column A has a few values (eg: Problem, Change, Request, etc.)
Column B has ticket numbers assigned to each entry. (No, I don't work for a call center, these are Datacenter Operations tickets)
Column B has several duplicate ticket numbers, as many people worked on same ticket OR the ticket was reopened for some reason.
I wish to take a count of unique ticket numbers from Column B when the condition in Column A is Change only.
So if there are 500 ticket numbers, 250 are duplicates for sure, and only 25% of the rest will be Change tickets.
I am not supposed to use a Pivot or filter hence asking this question.
Need a formula to retrieve the count with the condition.
I may put the formula in Sheet2 or in the rightmost column; please don't worry about that, I will take care of those things.
Many thanks in advance.
Adding to the question,
Let me help you with some data.
Change CRQ1110001
Problem INC1110001
Change CRQ1110001
Problem INC1110001
Change CRQ1110003
Problem INC1110003
Change CRQ1110004
Problem INC1110004
Change CRQ1110004
Change CRQ1110004
Problem INC1110005
Now I wish to only consider Change here without considering duplicate values.
Maybe this helps.
Thanks again.
Based on this website (count unique values), with a small change, this formula should work; expand the ranges to cover your entire data.
=SUM(IF(FREQUENCY(IF(A2:A10="change",IF(LEN(B2:B10)>0,MATCH(B2:B10,B2:B10,0),""),""), IF(A2:A10="change",IF(LEN(B2:B10)>0,MATCH(B2:B10,B2:B10,0),""),""))>0,1))
Entered with ctrl+shift+enter as it's an array formula.
Note that if you do this over the entire column A:A it will take quite a bit of time to compute as it has to go through a lot of calculations in the array formula.
If your "ticket numbers" in column B are actual numbers then you can use this formula
=SUM(IF(FREQUENCY(IF(A2:A500="Change",B2:B500),B2:B500),1))
confirmed with CTRL+SHIFT+ENTER
If not numeric you need to use a version as per gtwebb's suggestion
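To check what the formulas should return, here is a small Python sketch run against the sample data from the question. The function name `count_unique_tickets` is hypothetical. The comparison ignores case, on the assumption that this matches how Excel's = comparison treats "change" and "Change".

```python
# Sample rows copied from the question: (column A, column B).
rows = [
    ("Change", "CRQ1110001"),
    ("Problem", "INC1110001"),
    ("Change", "CRQ1110001"),
    ("Problem", "INC1110001"),
    ("Change", "CRQ1110003"),
    ("Problem", "INC1110003"),
    ("Change", "CRQ1110004"),
    ("Problem", "INC1110004"),
    ("Change", "CRQ1110004"),
    ("Change", "CRQ1110004"),
    ("Problem", "INC1110005"),
]

def count_unique_tickets(rows, category):
    """Count distinct non-empty ticket numbers whose type matches category."""
    wanted = category.casefold()
    return len({ticket for kind, ticket in rows
                if kind.casefold() == wanted and ticket})

change_count = count_unique_tickets(rows, "Change")
# change_count == 3 (CRQ1110001, CRQ1110003, CRQ1110004)
```

For the sample data, the expected answer for Change is therefore 3.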

Filter all unique items like Google Docs

Is there a quick/easy way to filter all unique items in an Excel 2013 column similar to the Google Docs "Unique" function?
This is not a pretty answer, but it works.
Paste this as an array formula into cell B2:
=LOOKUP(2, 1/((COUNTIF(B$1:B1, A:A)=0)*(A:A<>"")), A:A)
With the column that needs to be filtered in A:A
Then drag / copy it down as far as is required.
See it online in Google Spreadsheets
Caveats:
Does not retain original order (resulting order is in fact the reverse)
Does not automatically expand to cover all cells
Not fast, not pretty, not transparent
Footnotes:
It is trivial to use IFERROR() to filter out the #N/A errors, but I've not done this to keep the answer concise
In the same vein, the header of column A is currently also returned. This can be fixed by changing A:A to A$2:A$25 in all 3 locations
Original question was for Excel 2013, all of this should work there, but I wrote and tested it in Excel 2016
I would love to hear suggestions on how to make the formula automatically expand down as far as required.
Use the Unique records only feature in Advanced Filter.
Under the DATA tab there is this: "Remove Duplicates". It'll do what you want.
There isn't an equivalent to =unique() in Excel, and I hate having to work without it.
Without =unique(), trying to find all of the unique values in a large array of data is impossible. Take a dozen columns of a hundred+ entries, see what the unique values are across the whole mess, and pop them nicely into a new column. I can't figure out how to do it in Excel, but in Gdocs it's simple:
=unique(transpose(split(ArrayFormula(concatenate(A:M&",")),",")))
Using Filters, or PivotTables, or whatever, just doesn't cut it, and I haven't been able to find any hacked together ridiculous excel formula to do anything similar.
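For comparison, the "unique values across many columns" operation described above is short outside a spreadsheet. This is a minimal Python sketch, assuming each column has been read into a list. The function name `unique_values` and the sample data are hypothetical.

```python
def unique_values(columns):
    """Return distinct non-blank values across all columns, in first-seen order."""
    seen = {}  # dicts keep insertion order, so this doubles as an ordered set
    for column in columns:
        for value in column:
            if value not in ("", None):
                seen.setdefault(value, None)
    return list(seen)

# Made-up sample columns for illustration only.
columns = [
    ["apple", "pear", "apple", ""],
    ["pear", "plum"],
]
result = unique_values(columns)
# result == ["apple", "pear", "plum"]
```

Unlike the LOOKUP-based formula earlier in this thread, this keeps the original order of first appearance.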
filter your data in spreadsheets
This might prove to be of some help to you.

rotating columns into rows with excel

I have an excel sheet that looks like this:
Don't ask me how this happened, but somehow things that should be columns are in this sheet as rows...
You can see the repeating Account numbers and the words in column C; imagine everything rotated 180 degrees, with blanks or nulls inserted for any field that doesn't exist for that specific Account number.
In short it should end up looking like this:
I can't think of an easy way to do this inside of Excel. But perhaps with some VBA code?
What would be the easiest solution, and how do I implement it?
I know this question isn't very clear, but if you leave some directing comments, I would be happy to edit this question till it makes sense. Thanks!
You don't need VBA for this :)
I am taking a selective example to show how you can proceed.
Let's say your data looks like this
Now, create a Pivot of your data. This will give you unique Account numbers. See the screenshot below.
Then create a table with the respective headers as shown in your screenshot and copy and paste the unique Account Number from your pivot there.
Now you are ready to pull up your data. So as per the above screenshots, enter this formula in cell B13. Note this is an array formula; you have to press CTRL + SHIFT + ENTER.
=INDEX($D$2:$D$5,MATCH(1,($C$2:$C$5=B12)*($A$2:$A$5=A13),0))
Simply do that for the rest :)
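The pivot plus INDEX/MATCH steps amount to a "long to wide" reshape. Here is a rough Python sketch of the same logic, assuming each source row is an (account, field, value) triple like columns A, C and D in the formula. The function name `rows_to_columns` and the sample records are made up, since the original screenshots are not available.

```python
def rows_to_columns(records):
    """Turn (account, field, value) rows into one row per account."""
    fields = []   # column headers, in first-seen order
    table = {}    # account -> {field: value}
    for account, field, value in records:
        if field not in fields:
            fields.append(field)
        table.setdefault(account, {})[field] = value
    header = ["Account"] + fields
    # Missing fields become blanks, as the question asks.
    body = [[account] + [values.get(f, "") for f in fields]
            for account, values in table.items()]
    return [header] + body

# Made-up sample records for illustration only.
records = [
    (1001, "Name", "Alice"),
    (1001, "City", "Paris"),
    (1002, "Name", "Bob"),
]
wide = rows_to_columns(records)
# wide == [["Account", "Name", "City"],
#          [1001, "Alice", "Paris"],
#          [1002, "Bob", ""]]
```

Account 1002 has no City row, so that cell comes out blank.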
