VBScript to get total count of unique values in spreadsheet - excel

UPDATED with questions still:
So I've used Count before and haven't had an issue. However, I am trying to get a total for each unique value in a spreadsheet. I'm using VBScript because I need this to run outside of Excel for several reasons.
So if my data is like :
Bob
Bob
Ted
Ann
Ann
I'm looking for the results to be
Bob =2
Ted = 1
Ann = 2
etc...
This is what I have so far, which gets me my unique count but not a total for the unique items...
Dim objDict,item,arr,cRow, result,count
Set objDict = CreateObject("Scripting.Dictionary")
arr= .Sheets(1).Range("A2:A" & iLastRow )
For Each key In arr
If key <>"" Then
If Not objDict.Exists(key) Then objDict.Add key, Nothing
else
objDict.key("name").Item = objDict.Item() + 1
end if
End If
Next
I've updated with Mark's suggestion, yet I continue to get hung up on the key.
I'm not understanding how to use it in this situation. In the help file Mark supplied, the example adds a key and an item. I'm just adding keys, so what am I missing here?
Thanks for any help in this. Just pointers in the right direction are most welcome.
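A minimal sketch of the counting pattern the question is reaching for, assuming the values arrive as an array (for example the 2-D array a multi-cell range returns). `CountValues` and `ShowCounts` are hypothetical names, not from the original post. The point is that each key's item holds the running count, so the item is what gets incremented. The code is written without type declarations so that it should also run under VBScript.

```vba
' Hypothetical helper: returns a Scripting.Dictionary that maps each
' non-empty value to the number of times it occurs in the array.
Function CountValues(values)
    Dim dict, v
    Set dict = CreateObject("Scripting.Dictionary")
    For Each v In values
        If v <> "" Then
            If dict.Exists(v) Then
                ' The key is already known: bump its item (the count).
                dict(v) = dict(v) + 1
            Else
                ' First occurrence: store the key with a count of 1.
                dict.Add v, 1
            End If
        End If
    Next
    Set CountValues = dict
End Function

' Usage with the sample data from the question.
Sub ShowCounts()
    Dim counts, k, report
    Set counts = CountValues(Array("Bob", "Bob", "Ted", "Ann", "Ann"))
    For Each k In counts.Keys
        report = report & k & " = " & counts(k) & vbCrLf
    Next
    MsgBox report
End Sub
```

Keys are compared case-sensitively by default, so "bob" and "Bob" would be counted separately unless the dictionary's CompareMode is changed before any key is added.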

Related

Excel non-uniform data extraction

I've had a really hard time tracking down a solution for this--though I'm sure it's out there. Just not sure of the exact wording to get what I'm looking for.
I have a huge data set where some of the data is missing information, so it is not uniform. I want to extract just the name into one column and the e-mail into the next column.
The best way I can narrow this down is that there is a blank row between entries, and the name is always in the first cell of each entry.
Example:
John Doe
John Doe's Company
(555) 555-5555
John.doe#johndoe.com
John Doe
(555) 555-5555
John Doe
Jane Doe's Company
John.doe#johndoe.com
With the results wanted being (in two excel columns):
John Doe | john.doe#johndoe.com
John Doe |
John Doe | john.doe#johndoe.com
Any suggestions on the best way to do this would be appreciated. To complicate things, if there is no e-mail I would want to ignore that set completely, but I could just check that manually.
VBA coding:
1. Indicate in Row1 the initial row where the data begins.
2. Place a flag in this case the word "end" to indicate the end of the information.
3. Create a second sheet
Sub ToList()
    Dim Row1 As Long, Row2 As Long
    Dim field As String, fieldprev As String
    Dim hasName As Boolean

    Row1 = 1 'First row of the source data
    Row2 = 1 'First row of the output list
    Do
        hasName = False
        Do
            field = Trim(Sheets(1).Cells(Row1, 1))
            'The first non-blank line of a block is the name
            If field <> "" And LCase(field) <> "end" And Not hasName Then
                Sheets(2).Cells(Row2, 1) = field
                hasName = True
            End If
            Row1 = Row1 + 1
        Loop Until field = "" Or LCase(field) = "end"
        'The last line of the block sits two rows back; keep it only if it looks like an e-mail
        fieldprev = Sheets(1).Cells(Row1 - 2, 1)
        If InStr(fieldprev, "#") > 0 Then
            Sheets(2).Cells(Row2, 2) = fieldprev
        End If
        Row2 = Row2 + 1
    Loop Until LCase(field) = "end"
End Sub
Extracting the e-mail address shouldn't be too difficult, as you just need to search for a string containing the # character. A series of search() and mid() functions can be used to separate out the individual words. Search for each instance of a space and use that value in a mid() function. Then search for # in the results and you should find the e-mail address. Extracting the name will be more difficult if the original data is very messy.
However I second the comment above about using an external script, especially for a large dataset. Excel isn't really designed for the sort of thing you are describing here.

Need formula for excel, to subtract the number "9" to each number individually and

I want you to have some fun. I need something specific.
First I must explain what I do. I use a simple code for product prices at my retail store, because I don't want people to work out the real price for themselves. So I change the original number into another one by subtracting each digit from 9.
Normally I manually write down all the prices with this codification for every product.
So.. for example number 10 would be 89. (9-1 = 8) and (9-0 = 9)
Other examples:
$128 = 871
$75 = 24
$236 = 763
$9 = 0
Finally i put 2 number nines (9) at the beginning of the codified price also, to confuse people who might think that number could be the price.
So the examples i used before are like this:
99871 (means $128)
9924 (means $75)
99763 (means $236)
990 (means $9)
Remember that i need 2 (two) nines before the real price. The real prices never start with 0 so, the nines at the beginning exist only to confuse people.
Ok. So, now that you understand, here comes the 2nd part.
I have an Excel file with hundreds of my products added, with prices, descriptions, etc., and I decided it is time to use a printer and start printing this information from Excel. I have software to do that, but first I need to have the coded prices in Excel as well.
The fun part begins when I want to convert the real prices that are already in my Excel document into a new column AUTOMATICALLY. That way I don't have to type all the prices again in coded form for the old items and the new ones I add in the future.
Can someone help me with this? Is it even possible?
I tried =9999-A1, but it works only for two-digit numbers. If the real price is 5, I get three nines: 9994 (code). And if the price is 234, I get only one nine: 9765 (code). The condition is that the code must start with exactly TWO nines.
Thank you very much in advance!
Though you have asked for a formula, I am suggesting a VBA program, which seems very convenient to me.
Open the VBE, insert a module, and paste in the program. Change the lines indicated to suit your sheet names etc.
Sub NumberCode()
    Dim c As Range
    Dim LR As Long 'Long, because Integer overflows past row 32,767
    Dim sht As Worksheet
    Dim s As Long
    Dim digits As String
    Dim code As String

    Set sht = Worksheets("Sheet1") ' change as per your requirement
    LR = sht.Cells(sht.Rows.Count, "A").End(xlUp).Row
    For Each c In sht.Range("A1:A" & LR).Cells
        digits = CStr(c.Value)
        code = "99" 'every code starts with two nines
        'Build the code as text, one digit at a time, so long prices cannot overflow
        For s = 1 To Len(digits)
            code = code & (9 - CLng(Mid(digits, s, 1)))
        Next s
        c.Offset(0, 1).Value = code
    Next c
    MsgBox "Number coding finished"
End Sub
Sample sheet of results is appended below.
I will be using helper cells but you could dump it all into one cell if you want since you are only dealing with 4 characters.
For the purpose of this example, I am assuming your original price list starts in B11.
=IFERROR(9-MID($B11,COLUMN(A1),1),"")
Place that in D11 and copy to the right three more times so you have it from D11 to G11. That formula strips off 1 character from your price and subtracts that character from 9. When you go the next column it repeats itself. If you do not have that many characters, it will return "".
In C11 you will build your number based on the adjacent 4 columns using this formula:
="99"&D11&E11&F11&G11
It places 99 in front then adds the numbers from the adjacent 4 columns.
Select cells C11 to G11 and copy and paste downward beside your data column as far as you need to go.
An alternate more concise method would be:
=REPT(9,LEN(B11)+2)-B11
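The concise formula works because subtracting a digit from 9 never needs a borrow: a price with n digits taken away from a run of n+2 nines leaves each digit's complement, with the two leading nines untouched. A minimal sketch of the same arithmetic as a worksheet function follows. `EncodePrice` and `TestEncodePrice` are hypothetical names, and the sketch assumes prices are positive whole numbers of at most seven digits so that the run of nines still fits in a Long.

```vba
' Hypothetical UDF: =EncodePrice(A1) returns the coded price as text.
Function EncodePrice(ByVal price As Long) As String
    Dim nines As String
    ' String(n, "9") builds a run of n nines: two more than the price has digits.
    nines = String(Len(CStr(price)) + 2, "9")
    EncodePrice = CStr(CLng(nines) - price)
End Function

' Checks against the examples given in the question.
Sub TestEncodePrice()
    Debug.Assert EncodePrice(128) = "99871"
    Debug.Assert EncodePrice(75) = "9924"
    Debug.Assert EncodePrice(236) = "99763"
    Debug.Assert EncodePrice(9) = "990"
End Sub
```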
Perhaps I'm missing something, though simply:
=REPT(9,2+LEN(A1))-A1
seems good to me.
Regards

Is there a way to randomly select a cell based on weighted probability in Excel?

Say there's a list of names:
Peter
Andrew
James
John
Philip
Thomas
Matthew
I want to select one name randomly, and the formulas I'm currently using to do this are =RANDBETWEEN(1,7) and =VLOOKUP(A3,$A$6:$B$12,2).
However, is there a way to give each name a weight so that there is a higher chance for one particular name to be selected, because the only way I can think of doing that would be to add duplicate names to the list:
Peter
Peter
Peter
Peter
Peter
Andrew
Andrew
Andrew
James
James
John
Philip
Thomas
Thomas
Thomas
Thomas
Matthew
Matthew
This way Peter would have the greatest probability of being randomly selected since the name appears the most, but I'd prefer not to do it this way if there's a more efficient way of doing this.
Any response is appreciated.
You can add an additional column to the left of the names and fill it with ascending threshold numbers that imply your desired probabilities.
1 Peter
10 Andrew
15 James
25 John
75 Philip
95 Thomas
100 Matthew
Now use RANDBETWEEN from 1 to 100, and instead of calling VLOOKUP with FALSE (exact match), call it with TRUE (approximate match). It returns the row with the largest threshold that is less than or equal to the random number.
=VLOOKUP(A3,$A$6:$B$12,2,1)
This method requires no additional columns or helper cells.
Instead of RANDBETWEEN(1,7) use the following formula instead:
=CHOOSE(VLOOKUP(RANDBETWEEN(0,99),{0,1;28,2;45,3;55,4;60,5;65,6;87,7},2,1),1,2,3,4,5,6,7)
That will give you a weighting approximately equal to your longer list.
Then go ahead and use your =VLOOKUP(A3,$A$6:$B$12,2) to return the name.
One way to achieve this would be to use a combination of RANDBETWEEN with INDEX and MATCH.
Using MATCH's last parameter, which specifies the type of comparison to apply, you can effectively match on ranges of values (e.g. 1-50 gives X, 51-75 gives Y, 76-100 gives Z). With match type 1, MATCH returns the position of the largest value that is less than or equal to the lookup value, so the extra column must be sorted ascending and hold the lower bound of each range, starting at 1. This means that if you add an extra column next to your names and assign such a bound to each, you will effectively be able to create weighted probabilities.
Try adding lower bounds between 1 and 100 to all names and try the following:
=INDEX(A:A, MATCH(RANDBETWEEN(1,100),B:B,1))
This is assuming your names are in A and your lower bounds are in B.
I stumbled upon the same need and solved it using VBA.
In the following VBA code, simply state the names and the weights on line 2:
Private Sub CommandButton1_Click()
    RandomName = WeightedRnd(Array("Peter", "Andrew", "James"), Array(34, 33, 33))
    MsgBox(RandomName)
End Sub
Function WeightedRnd(items As Variant, weights As Variant) As Variant
    Dim myItems(1 To 100) As Variant
    Dim weight As Variant
    Dim item As Variant
    Dim myNumber As Variant
    i = 1
    n = 0
    For Each weight In weights
        For p = 1 To weight
            myItems(i) = items(n)
            i = i + 1
        Next
        n = n + 1
    Next
    n = UBound(myItems) - LBound(myItems) + 1
    pick = getRandom(1, n)
    WeightedRnd = myItems(pick)
End Function
Function getRandom(lowerbound, upperbound)
    Randomize
    getRandom = Int((upperbound - lowerbound + 1) * Rnd + lowerbound)
End Function
NB: Make sure the sum of your weights is equal to 100 or to the upperbound of myItems array: Dim myItems(1 To 100) As Variant
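If keeping the weights at a total of exactly 100 is inconvenient, a running total of the weights avoids the fixed-size lookup array. This is a sketch of that alternative, not the code from the answer above: `WeightedPick` and `DemoWeightedPick` are hypothetical names, and the sketch assumes both arrays have the same bounds and that the weights are non-negative with a positive sum.

```vba
' Hypothetical alternative to WeightedRnd: weights may sum to any positive total.
Function WeightedPick(items As Variant, weights As Variant) As Variant
    Dim total As Double, target As Double, running As Double
    Dim i As Long

    For i = LBound(weights) To UBound(weights)
        total = total + weights(i)
    Next i

    ' Rnd returns a value in [0, 1), so target lies in [0, total).
    target = Rnd * total

    ' Walk the running total until it passes the target.
    For i = LBound(weights) To UBound(weights)
        running = running + weights(i)
        If target < running Then
            WeightedPick = items(i)
            Exit Function
        End If
    Next i

    ' Guard against floating-point rounding: fall back to the last item.
    WeightedPick = items(UBound(items))
End Function

Sub DemoWeightedPick()
    Randomize ' seed the generator once, not on every draw
    MsgBox WeightedPick(Array("Peter", "Andrew", "James"), Array(5, 3, 2))
End Sub
```

With weights 5, 3 and 2, Peter should come up about half the time, Andrew about 30% and James about 20%.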

Loop through a combination of numbers

I am trying to think of a way to loop through a number of combinations making sure that I go through each available combination without repeat. Let me explain. I have a set of numbers, for example
20,000
25,000
27,000
29,000
and I would like to alter this set of numbers via a loop and copy the new numbers into a different sheet so that my formulas on that sheet can calculate whatever I need them to calculate. For example, the first couple of iterations might look something like this:
1st
20,000 x 1.001
25,000 x 1
27,000 x 1
29,000 x 1
2nd
20,002 x 1.001
25,000 x 1.001
27,000 x 1
29,000 x 1
The first row of numbers should never exceed the second. So 20,000 should only go as high as 25,000.
I was able to set up a system whereby I set up a matrix and then loop through a random set of combinations using =rand() however this does not ensure I hit every combination and also repeats combinations.
Can anyone explain the math behind this and also how I would use a loop to accomplish my goal?
Thank you!
Try starting with smaller numbers.
See if this works for you.
Sub looper()
    'First Array
    Dim myArray(9) As Double
    For i = 1 To 10
        myArray(i - 1) = i
    Next i
    'Second Array
    Dim myOtherArray(9) As Double
    For i = 1 To 10
        myOtherArray(i - 1) = i
    Next i
    'Loop through each one
    For Each slot In myArray
        For Each otherSlot In myOtherArray
            Debug.Print (slot & " * " & otherSlot & " = " & slot * otherSlot)
        Next otherSlot
    Next slot
End Sub
GD user1813558,
Your question contains too little detail and is too broadly scoped to allow an accurate answer.
Are your numbers arbitrary (i.e. the ones you provided are 'just' samples) or will they be fixed as per your indicated numbers?
Will there always be only 4 numbers?
Is the distribution of your start numbers (i.e. their difference value) always as per your indication: 0, +5000, +2000, +2000?
Will the results of all 'loops' (or iterations) need to be copied to a different sheet? (i.e. looping from 20.000 to 25.000 by a factor of 1.001 per step would require about 223 iterations, and subsequently sheets, before the result starts exceeding 25.000.)
Does a new sheet need to be created for each iteration result, do the sheets already exist, or will the result be copied to the same sheet for every iteration?
In short, please provide a more accurate question.
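The figure of about 223 iterations can be checked with a logarithm: the number of times 20,000 can be multiplied by 1.001 without exceeding 25,000 is the whole part of ln(25000/20000) / ln(1.001). A small sketch follows; `CountSteps` is a hypothetical name, and the sketch assumes the multiplier is applied repeatedly to the running value.

```vba
Sub CountSteps()
    Dim steps As Long
    ' VBA's Log function is the natural logarithm.
    steps = Int(Log(25000 / 20000) / Log(1.001))
    Debug.Print steps ' prints 223
End Sub
```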

Excel/VBA: How to Keep the first row containing first occurrence of var and remove the rest and repeat?

Problem:
I have about 50,000 rows in Excel. Each row contains the text domain=[a-Z0-9],
where [a-Z0-9] is a placeholder for a string of numbers and letters, like a GUID. Let's call one such domain ID abc123. It identifies a single domain, but across the 50,000 rows it is not a unique key for the table, so I need to make it unique by removing all the other rows where the domain ID is abc123. I have to do this for every domain, so I can't hard-code a specific one; I need a script to figure this out. The domain ID is always in the same column, and there are many different domain IDs that repeat.
Sample
column 2
abunchofstuff3123123khafadkfh23k4h23kh*DomainID=abc123*
Pseudo Code
//Whenever there is a value for domain in row i col 2
//does it already exist in ListOfUniqueDomains?
//if so then remove this row
//else add to the ListOfUniqueDomains
How would one do this with Excel/VBA?
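The pseudocode above can be sketched in VBA with a Scripting.Dictionary as the list of unique domains. This is a sketch under stated assumptions, not a tested solution for the asker's data: `RemoveDuplicateDomains` and `DomainId` are hypothetical names, the marker text is assumed to be "domain=" with the ID ending at the next "&" or at the end of the cell, the data is assumed to be in column 2 of the active sheet, and the cells are assumed to contain no error values.

```vba
' Sketch: keep the first row for each domain ID in column 2, delete the rest.
Sub RemoveDuplicateDomains()
    Dim seen As Object, ws As Worksheet
    Dim lastRow As Long, r As Long
    Dim id As String
    Dim toDelete As Range

    Set seen = CreateObject("Scripting.Dictionary")
    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, 2).End(xlUp).Row

    For r = 1 To lastRow
        id = DomainId(CStr(ws.Cells(r, 2).Value))
        If Len(id) > 0 Then
            If seen.Exists(id) Then
                ' Collect duplicates; deleting inside the loop would shift rows.
                If toDelete Is Nothing Then
                    Set toDelete = ws.Rows(r)
                Else
                    Set toDelete = Union(toDelete, ws.Rows(r))
                End If
            Else
                seen.Add id, r
            End If
        End If
    Next r

    If Not toDelete Is Nothing Then toDelete.Delete
End Sub

' Hypothetical helper: text between "domain=" and the next "&" (or end of string).
Function DomainId(ByVal text As String) As String
    Const MARKER As String = "domain="
    Dim startPos As Long, endPos As Long

    startPos = InStr(1, text, MARKER, vbTextCompare)
    If startPos = 0 Then Exit Function ' no marker: return an empty string
    startPos = startPos + Len(MARKER)

    endPos = InStr(startPos, text, "&")
    If endPos = 0 Then endPos = Len(text) + 1
    DomainId = Mid$(text, startPos, endPos - startPos)
End Function
```

Rows without a recognisable domain ID are left alone. With many thousands of duplicates, building the Union range may become slow; looping from the last row upwards and deleting directly is a common alternative, but it needs a first pass to record which row holds the first occurrence of each ID.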
UPDATED ANSWER
So I really liked the idea of using Pivot Tables but I still had to extract the domain ID so I thought I'd post the solution to that portion here. I actually stole the function from some other website while googling but I lost the original post to give proper credit. So forgive me if that person is you but give yourself a pat on the back and I'll buy you lunch if you're in my neighborhood (easy everyone).
So in my case I had 2 delimiters (=, &) for the string domain=abc123&, which is embedded in a longer string. So to extract the domain ID I did the following.
Public Function extract_value(str As String) As String
    Dim openPos As Long
    Dim closePos As Long
    openPos = InStr(str, "=") 'position of the first equal sign
    If openPos = 0 Then Exit Function 'no "=" found: return an empty string
    closePos = InStr(openPos + 1, str, "&") 'first "&" after the equal sign
    If closePos = 0 Then closePos = Len(str) + 1 'no "&": take the rest of the string
    'Mid takes a start position and a length, not an end position,
    'so the length is the distance between the two delimiters.
    extract_value = Mid(str, openPos + 1, closePos - openPos - 1)
End Function
Is VBA even a good idea for something like this?
Thanks
I've done this for some fairly large files (50k rows) where I needed to extract only unique elements. What I've done is quite simple: use a pivot table. This way you don't even need VBA, but if you want to process the data further, it's still very simple to update the table and extract data.
One of the reasons I really love this method is that it is extremely easy and powerful at the same time. You have no looping or algorithm to write, it's all right there in the Excel features.

Resources