Is there a faster way to replace accented characters? - excel

I have this code that replaces all accented characters except in row 6. However, the macro takes a long time because it goes through every cell and every letter. Is there any way to make it faster by having it ignore cells that don't contain any accents?
Const sFm As String = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
Const sTo As String = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
Dim i As Long, employeews As Worksheet
Dim rowsix() As Variant
Set employeews = DestWb.Sheets(1)
'Don't replace row 6
rowsix = employeews.Rows(6).Value
For i = 1 To Len(sFm)
employeews.Cells.Replace Mid(sFm, i, 1), Mid(sTo, i, 1), LookAt:=xlPart, MatchCase:=True
Next i
employeews.Rows(6).Value = rowsix

Putting comment as an answer so the code is more readable:
I would choose the range I want to replace values within, then loop through the special characters and replace each one across the whole range at once. The only real caveat to remember is that this will also affect formulas.
Dim accentArr As Variant, noAccentArr As Variant
'accentArr and noAccentArr need to have the same upper bound for this approach!
accentArr = Array("Š", "Ž", "š") 'quick mockup
noAccentArr = Array("S", "Z", "s")
Dim i As Long
For i = LBound(accentArr) To UBound(accentArr)
'no parentheses around the arguments: Replace is called as a statement here
ws.Range("a1:z5").Replace accentArr(i), noAccentArr(i), LookAt:=xlPart, MatchCase:=True
Next i
Rather than going character by character in each cell, you at least do a mass replace for each specific character. This also allows a second Range() to start at row 7, so as not to include row 6.
Postscript, see: Split string into array of characters? if you want to utilize the existing string without having to manually split out the string of characters into an array.
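As a rough sketch of the idea above, the existing from/to strings can be walked with Mid$ so that no array has to be typed out by hand, and the replace can be limited to rows 1 to 5 and rows 7 onwards. The name ReplaceCharsOutsideRowSix and its parameters are hypothetical and not part of the original code; the sketch also assumes neither string contains the Replace wildcards *, ? or ~.

```vba
Option Explicit

' Hypothetical helper: replaces each character of fromChars with the character
' at the same position in toChars, everywhere on ws except row 6.
Sub ReplaceCharsOutsideRowSix(ByVal ws As Worksheet, _
                              ByVal fromChars As String, ByVal toChars As String)
    Dim lastRow As Long, i As Long
    If Len(fromChars) <> Len(toChars) Then Err.Raise 5, , "Strings must have equal length."
    ' Last row of the used range, so the lower block stops where the data stops.
    lastRow = ws.UsedRange.Row + ws.UsedRange.Rows.Count - 1
    For i = 1 To Len(fromChars)
        ' Mid$ picks the i-th character of each string.
        ws.Rows("1:5").Replace What:=Mid$(fromChars, i, 1), _
                               Replacement:=Mid$(toChars, i, 1), _
                               LookAt:=xlPart, MatchCase:=True
        If lastRow >= 7 Then
            ws.Rows("7:" & lastRow).Replace What:=Mid$(fromChars, i, 1), _
                                            Replacement:=Mid$(toChars, i, 1), _
                                            LookAt:=xlPart, MatchCase:=True
        End If
    Next i
End Sub
```

With the question's constants this would be called as ReplaceCharsOutsideRowSix DestWb.Sheets(1), sFm, sTo. Because row 6 is never touched, it no longer has to be saved and restored.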

In line with what everyone else is saying, and not really knowing what you consider bad performance, you could try something like the following. It uses a dictionary populated from your from and to strings, split into characters and their replacements, where the from character is the key and the to character is the item. The Keys() and Items() of the dictionary are arrays, so you can loop over them rather than slicing the strings each time, and the dictionary remains available for later calls. Declaring d As Scripting.Dictionary requires a reference to Microsoft Scripting Runtime.
Private d As Scripting.Dictionary
Const sFrom As String = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
Const sTo As String = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
Sub PopulateReplacements()
Dim s As String
Dim l As Long
Set d = New Scripting.Dictionary
For l = 1 To Len(sFrom)
If Not d.Exists(Mid(sFrom, l, 1)) Then _
d.Add Mid(sFrom, l, 1), Mid(sTo, l, 1)
Next l
End Sub
Sub TestReplacing()
Dim s As String
Dim l As Long
s = "ÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔ"
s = "_Ÿ_À_Á_Â_Ã_Ä_Å_Ç_È_É_Ê_Ë_Ì_Í_Î_Ï_Ð_Ñ_"
s = sFrom
If d Is Nothing Then
PopulateReplacements
End If
For l = 0 To UBound(d.Keys())
s = Replace(s, d.Keys()(l), d.Items()(l))
Next l
Debug.Print s
End Sub
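To connect this back to the original question (skipping cells that contain no accents), one option is to do the work in memory: read the used range into a Variant array once, fix only the strings that contain something outside printable ASCII, and write the array back once. This is only a sketch under assumptions: ReplaceChars and ReplaceAccentsInMemory are hypothetical names, the module uses the default Option Compare Binary, and writing the array back turns formulas into plain values.

```vba
Option Explicit

' Hypothetical helper: returns txt with each character of fromChars replaced
' by the character at the same position in toChars.
Function ReplaceChars(ByVal txt As String, ByVal fromChars As String, _
                      ByVal toChars As String) As String
    Dim i As Long
    ' Cheap pre-check: text made only of printable ASCII (space to ~) cannot
    ' contain an accented character, so it is returned unchanged.
    If txt Like "*[! -~]*" Then
        For i = 1 To Len(fromChars)
            txt = Replace(txt, Mid$(fromChars, i, 1), Mid$(toChars, i, 1), , , vbBinaryCompare)
        Next i
    End If
    ReplaceChars = txt
End Function

Sub ReplaceAccentsInMemory(ByVal ws As Worksheet, ByVal fromChars As String, _
                           ByVal toChars As String)
    Dim data As Variant, r As Long, c As Long, firstRow As Long
    With ws.UsedRange
        firstRow = .Row
        data = .Value                           ' one read from the sheet
        If Not IsArray(data) Then Exit Sub      ' single-cell range: nothing to loop over
        For r = 1 To UBound(data, 1)
            If firstRow + r - 1 <> 6 Then       ' leave worksheet row 6 alone
                For c = 1 To UBound(data, 2)
                    If VarType(data(r, c)) = vbString Then
                        data(r, c) = ReplaceChars(data(r, c), fromChars, toChars)
                    End If
                Next c
            End If
        Next r
        .Value = data                           ' one write; formulas become values
    End With
End Sub
```

A call such as ReplaceAccentsInMemory DestWb.Sheets(1), sFrom, sTo touches the worksheet only twice, however many characters are in the replacement strings.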

Related

Table (ListObject) Assigning values using UDF

Above is a simplified example, but what I want to achieve with my UDF is to accept the string in the Collated column and a delimiter, which will be used to break the string apart into substrings that are assigned sequentially to the columns Q1, Q2, Q3, Q4. There may be fewer than 4 substrings generated, but there will never be more than 4.
Function DECONS(Subject As String, Delim As String) As String
' takes an input string "Subject" and separates it using "Delim" as the delimiter
' If the desired element exceeds the number of unique substrings the function returns a blank result
' Hardcoded for max 4 substrings
' initializes temporary variables
Dim i As Long
Dim r() As String
' uses built in VBA function to split the passed string using the deliminating character
r = Split(Subject, Delim)
' increases the size of r() to 4 elements and fills extra elements with blanks
For i = UBound(r) + 1 To 3
ReDim Preserve r(i)
r(i) = ""
Next i
' my sad attempt at what I want to achieve
Dim loT As ListObject
Set loT = ThisWorksheet.ListObjects("TT")
For i = 1 To 4
loT.ListColumn(i + 1).Range.Value = r(i - 1)
Next i
End Function
I've done the text manipulation using the VBA split function which yields an array of strings but I'm hitting a wall as to how to assign the substrings to the relevant columns. My attempt can be seen above. I've done a fair bit of reading but I'm still not comfortable enough in VBA/Excel to figure this out on my own yet. Is this more complex when in a ListObject than outside a table object?
Unfortunately I can't return the string array from the function and then assign it to multiple cells as the Table Object doesn't allow array operations. I had a work around where I would return a specified element, ie the 3rd, and I would call the function in each column and output the one corresponding value. However, the method is not elegant and does a lot of unnecessary repeated computation.
Try
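Sub DECONS(Delim As String)
' Splits every value in the table's first column on Delim and writes
' up to four substrings into the next four columns.
Dim objList As ListObject
Dim vDB As Variant, vSplit As Variant, v As Variant
Dim vR() As Variant, n As Long, r As Long
Dim i As Long
Set objList = ActiveSheet.ListObjects("TT")
' assumes the table has at least two data rows, so .Value is a 2-D array
vDB = objList.DataBodyRange.Columns(1).Value
r = UBound(vDB, 1)
ReDim vR(1 To r, 1 To 4)
For i = 1 To r
vSplit = Split(vDB(i, 1), Delim) 'use the delimiter passed in, not a hard-coded "\"
n = 0
For Each v In vSplit
If n = 4 Then Exit For 'never write past the fourth output column
n = n + 1
vR(i, n) = v
Next v
Next i
'Range("b2").Resize(r, 4) = vR
objList.DataBodyRange.Columns(2).Range("a1").Resize(r, 4) = vR
End Sub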

Generating Random Strings of Characters including Special Characters

I have a code that generates random strings of characters (passwords) that contain exactly 8 characters. My issue is that it populates only numbers and characters.
1) I would like to include special characters to the mix and to make sure at least one special character is always included in each string.
2) I need to add a condition that each string always contains at least one upper case character, at least one lower case, at least one number.
This is the code I have but I can't figure out how to include these conditions to make it work as I need. I've tried a lot of googling to make it work specifically for this code but can't figure it out. Can you please advise?
Sub MakeRandom()
Dim J As Integer
Dim K As Integer
Dim L As Double
Dim iTemp As Integer
Dim sNumber As String
Dim bOK As Boolean
Range("G5:G148").Activate
Randomize
L = InputBox("Amount of Passwords:")
For J = 1 To L
sNumber = ""
For K = 1 To 8
Do
iTemp = Int((122 - 48 + 1) * Rnd + 48)
Select Case iTemp
Case 48 To 57, 97 To 122
bOK = True
Case Else
bOK = False
End Select
Loop Until bOK
bOK = False
sNumber = sNumber & Chr(iTemp)
Next K
ActiveCell.Value = sNumber
ActiveCell.Offset(1, 0).Select
Next J
End Sub
I'd appreciate any useful advice.
Why not introduce a long string containing all legal characters, and then have your algorithm pick characters from random positions in this string? That way you have easy control over what can appear in your passwords.
To ensure the conditions hold, I would place the password generation in a loop that only ends when all conditions are satisfied. As long as this is actually possible (that is, your conditions can all be true at the same time), the loop will end with probability 1, because every attempt has the same nonzero chance of producing a valid password.
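The loop described above might be sketched as follows. This is an illustration only: the function names, the ALLOWED constant and the particular set of symbols in it are assumptions, and the Like patterns rely on the default Option Compare Binary so that [a-z] and [A-Z] are case-sensitive.

```vba
Option Explicit

' Assumed character pool; change the symbols to whatever is legal for you.
Const ALLOWED As String = _
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!$%^&*()-_+"

' True when s has at least one lower case letter, one upper case letter,
' one digit and one character that is neither a letter nor a digit.
Function MeetsRules(ByVal s As String) As Boolean
    MeetsRules = (s Like "*[a-z]*") And (s Like "*[A-Z]*") And _
                 (s Like "*[0-9]*") And (s Like "*[!a-zA-Z0-9]*")
End Function

' Keeps generating candidates from random positions of ALLOWED until one
' satisfies every rule. Call Randomize once before the first use.
Function RandomPassword(Optional ByVal pwLength As Long = 8) As String
    Dim candidate As String, i As Long
    If pwLength < 4 Then Err.Raise 5, , "At least 4 characters are needed to satisfy all rules."
    Do
        candidate = vbNullString
        For i = 1 To pwLength
            candidate = candidate & Mid$(ALLOWED, Int(Len(ALLOWED) * Rnd) + 1, 1)
        Next i
    Loop Until MeetsRules(candidate)
    RandomPassword = candidate
End Function
```

In the original macro, the inner For K loop could then be replaced by a single assignment such as sNumber = RandomPassword(8).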
I have something that might help you. I use a different method but the result should be similar. I've adapted it to your constraints, but may have missed something.
Mine works like this: I have a sheet called ChrSrc with all characters that I want to be able to include in the string. The characters have been divided into four columns. From column A to D it's lower case letters, upper case letters, numbers, special characters.
An array is made to store the characters and create the random string. The array contains 3 'columns'. The first gives a number between 1 and 4 to determine which column the character should come from. The first four are always 1, 2, 3, 4 to make sure every type of character is used at least once. The other 4 (or more if the random string is longer) are filled randomly.
The second 'column' is then filled with the actual character to add to the random string. And lastly the third 'column' is filled with zeros. These will be used to track which characters have been used in the random string.
Once the array has been filled, the Do While loop is used to pick the order of the characters randomly. After a character has been added, its zero in the array is changed to a one, to make sure every character gets used exactly once.
By the end your random string is in the variable RandomString and you can write it to a cell or do whatever with it.
If you want to create multiple strings in one go, I would suggest writing a small caller sub, that calls this one x amount of times. Or add a loop and inputbox into this one.
Hope that helps.
Sub CreateString()
Dim StringArray() As Variant
Dim PositionCount As Long
Dim Lr As Long
Dim RandomString As String
Dim arrIndex As Long
Dim Loopcount As Long
Dim StringLength As Long
StringLength = 8
ReDim StringArray(1 To StringLength, 1 To 3)
For PositionCount = 1 To StringLength
If PositionCount > 4 Then
StringArray(PositionCount, 1) = Random(4)
Else
StringArray(PositionCount, 1) = PositionCount
End If
'lastrow for each character category, adjust as needed
Select Case StringArray(PositionCount, 1)
Case Is <= 2
Lr = 26
Case Is = 3
Lr = 10
Case Is = 4
Lr = 17
End Select
StringArray(PositionCount, 2) = ThisWorkbook.Sheets("ChrSrc").Cells(Random(Lr), StringArray(PositionCount, 1))
StringArray(PositionCount, 3) = 0
Next
Do While Len(RandomString) < StringLength
arrIndex = Random(StringLength)
If StringArray(arrIndex, 3) = 0 Then
RandomString = RandomString & StringArray(arrIndex, 2)
StringArray(arrIndex, 3) = 1
End If
Loopcount = Loopcount + 1
Loop
End Sub
Function Random(Max As Long)
Random = Int(Max * Rnd) + 1
End Function
Where possible, it's advisable to avoid using Activate and Select. In your case, you could create an array of random strings and then write the array to the sheet. The length of the array could be controlled by the value returned by InputBox.
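Your code may benefit from: L = Application.InputBox("Amount of Passwords:", Type:=1) which, if I'm reading the documentation correctly, validates that the input is numeric. Note that the Type argument belongs to Application.InputBox; the plain VBA InputBox function does not accept it.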
My understanding is that you should normally use type Long instead of Integer (as Integers are now converted to Longs under the hood). Moreover, in this context, where you're receiving arbitrary user input, an Integer can hold at most 32,767. If you enter 32768 (for example) or any larger number into the InputBox, you should see an unhandled overflow error.
With reference to the approach described in this answer (https://stackoverflow.com/a/57903244/8811778):
Option Explicit
Private Function CreateRandomString(Optional ByVal lengthOfOutput As Long = 8, Optional ByVal minimumCountOfNumbers As Long = 1, Optional ByVal minimumCountOfLetters As Long = 1, Optional ByVal minimumCountOfSymbols As Long = 1) As String
Dim countRemaining As Long
countRemaining = lengthOfOutput - (minimumCountOfLetters + minimumCountOfNumbers + minimumCountOfSymbols)
Debug.Assert countRemaining >= 0
Const LETTERS_ALLOWED As String = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
Const NUMBERS_ALLOWED As String = "0123456789"
Const SYMBOLS_ALLOWED As String = "!""£$%^&*()-_+[]{};:'##" ' Change as necessary, I do not know what symbols you want included.
Dim toJoin() As String
ReDim toJoin(1 To 4)
toJoin(1) = GetRandomCharactersFromText(LETTERS_ALLOWED, minimumCountOfLetters, duplicatesAllowed:=False)
toJoin(2) = GetRandomCharactersFromText(NUMBERS_ALLOWED, minimumCountOfNumbers, duplicatesAllowed:=False)
toJoin(3) = GetRandomCharactersFromText(SYMBOLS_ALLOWED, minimumCountOfSymbols, duplicatesAllowed:=False)
' I arbitrarily pad the rest of the string with random letters, but you can change this logic.
toJoin(4) = GetRandomCharactersFromText(LETTERS_ALLOWED, countRemaining, duplicatesAllowed:=False)
Dim outputString As String
outputString = Join(toJoin, vbNullString)
' This step is meant to scramble the characters in the string.
' Otherwise, the returned string's structure would reflect the code above:
' • w letters, followed by x numbers, followed by y symbols, followed by z characters
' which stops it being pseudo-random.
outputString = GetRandomCharactersFromText(outputString, Len(outputString), False)
CreateRandomString = outputString
End Function
Private Function RandomBetween(ByVal lowerLimit As Long, ByVal upperLimit As Long) As Long
' Could use Application.RandBetween instead (maybe). But maybe there is some performance difference.
' https://learn.microsoft.com/en-us/office/vba/language/reference/user-interface-help/rnd-function
RandomBetween = Int((upperLimit - lowerLimit + 1) * Rnd + lowerLimit)
End Function
Private Function GetRandomCharactersFromText(ByVal someText As String, ByVal numberOfCharactersToGet As Long, Optional ByVal duplicatesAllowed As Boolean = True) As String
' Returns n characters from a given string. Characters are chosen pseudo-randomly.
' "duplicatesAllowed" controls whether a given index can be chosen more than once.
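' Nothing to pick: return an empty string, since ReDim ...(1 To 0) would raise an error.
If numberOfCharactersToGet < 1 Then Exit Function
Dim chosenIndexes() As Long
ReDim chosenIndexes(1 To numberOfCharactersToGet)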
Dim characterIndex As Long
For characterIndex = 1 To numberOfCharactersToGet
Do While True
Dim randomCharacterIndex As Long
randomCharacterIndex = RandomBetween(1, Len(someText))
If duplicatesAllowed Then Exit Do
If IsError(Application.Match(randomCharacterIndex, chosenIndexes, 0)) Then Exit Do
Loop
chosenIndexes(characterIndex) = randomCharacterIndex
Next characterIndex
Dim chosenCharacters() As String
ReDim chosenCharacters(1 To numberOfCharactersToGet)
For characterIndex = 1 To numberOfCharactersToGet
randomCharacterIndex = chosenIndexes(characterIndex)
chosenCharacters(characterIndex) = Mid(someText, randomCharacterIndex, 1)
Next characterIndex
GetRandomCharactersFromText = Join(chosenCharacters, vbNullString)
End Function
This is just my interpretation of the approach posted by the user. That user may have implemented their approach differently.
The majority of the work is done by the GetRandomCharactersFromText function.
You can probably get rid of the nested For K = 1 to 8 loop and replace with something like ActiveCell.Value = CreateRandomString(lengthOfOutput:=8, minimumCountOfNumbers:=1, minimumCountOfSymbols:=1) (although you should avoid using ActiveCell and, in general, relying on objects to be active).
Lastly, based on certain parts of your code, it seems this code is meant to generate passwords. I'm no security expert so I will refrain from providing security-related suggestions/advice. You may or may not benefit from reading https://xkcd.com/936/ and the related discussion https://security.stackexchange.com/a/6096/71460.

How to ignore special characters when referencing a string in excel Vba

I am trying to use vba to read client feedback and reference it to a set of keywords mapped to categories. However the problem I am having is that on occasion, clients use special characters such as “ -^<* ‘ in their comments and this is breaking my code as soon as it hit such a string.
How can I make my code ignore these special characters and keep moving down the rows to search for criteria? Thanks in advance
First place the data to be "cleaned-up" in column A, then run:
Sub Kleanup()
Dim A As Range, aa As Range, L As Long, i As Long
Dim CH As String, temp As String
Set A = Range("A:A")
For Each aa In Intersect(A, ActiveSheet.UsedRange)
If aa <> "" Then
L = Len(aa)
temp = ""
For i = 1 To L
CH = Mid(aa, i, 1)
If CH Like "[A-Za-z0-9]" Then
temp = temp & CH
End If
Next i
aa.Value = temp
End If
Next aa
End Sub
It will remove all characters except 0 through 9 and upper case letters and lower case letters.

How to highlight and list missing values in a letter-number sequence

I'm using Excel 2010. I have a column of data that contains four-character sequences. Each sequence consists of a letter of the alphabet followed by a number from 100-999. So A100, A101, A102... A999, then B100, B101... all the way up until Z999. There are 24,000 of these sequences (i.e. rows).
I first tried the following VBA script:
Function MissingNumbers(Rng As Range) As String
Dim X As Long, MaxNum As Long
MaxNum = WorksheetFunction.Max(Rng)
ReDim Nums(1 To MaxNum)
For X = 1 To MaxNum
If Rng.Find(X, LookAt:=xlWhole) Is Nothing Then
MissingNumbers = MissingNumbers & ", " & X
End If
Next
MissingNumbers = Mid(MissingNumbers, 3)
End Function
...but I ran into an error when calling it with =MissingNumbers(A1:A23400), which I'm guessing is because the function can't parse the alphabetic letters at the beginning of the sequences.
My question: what formula can I use to both highlight and separately list missing values in my first column?
For extra clarification, missing values in the range
C996
C998
C999
D101
...would be...
C996
*C997*
C998
C999
*D100*
D101
Thanks in advance.
There may be a faster way to do this, but this has been tested and produced results on a small sample.
Function MissingSequence(Rng As Range) As String
Dim iCnt As Integer
For iCnt = 65 To 90 'ASCII character codes for the letters A-Z
Dim iNum As Integer
For iNum = 100 To 999
Dim sCheck As String
sCheck = Chr(iCnt) & iNum
If Rng.Find(sCheck, lookat:=xlWhole) Is Nothing Then
Dim sMissingNumbers As String
sMissingNumbers = sMissingNumbers & "," & sCheck
End If
Next
Next
MissingSequence = Mid(sMissingNumbers, 2)
End Function
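If calling Rng.Find once per candidate (23,400 times) turns out to be slow on the full column, one possible alternative is to read the range into an array once and test membership with a dictionary. The sketch below is an assumption-laden variant, not the tested code above: MissingSequenceFast is a hypothetical name, the dictionary is late-bound via CreateObject, and the comparison is case-sensitive, so "a100" would not count as "A100".

```vba
Option Explicit

' Hypothetical faster variant: one pass over the cells, then one pass over
' the 26 x 900 expected codes, with no worksheet searches in between.
Function MissingSequenceFast(Rng As Range) As String
    Dim seen As Object
    Dim data As Variant, cellValue As Variant
    Dim letterCode As Long, num As Long
    Dim code As String, missing As String

    Set seen = CreateObject("Scripting.Dictionary")
    data = Rng.Value
    If IsArray(data) Then
        For Each cellValue In data
            ' Error cells cannot be converted to text, so they are skipped.
            If Not IsError(cellValue) Then seen(CStr(cellValue)) = True
        Next cellValue
    ElseIf Not IsError(data) Then
        seen(CStr(data)) = True                 ' single-cell range
    End If

    For letterCode = Asc("A") To Asc("Z")
        For num = 100 To 999
            code = Chr$(letterCode) & num
            If Not seen.Exists(code) Then missing = missing & "," & code
        Next num
    Next letterCode
    MissingSequenceFast = Mid$(missing, 2)      ' drop the leading comma
End Function
```

It is called the same way, for example =MissingSequenceFast(A1:A23400). If many codes are missing, the result can exceed the 32,767 characters a cell can hold, in which case writing the missing codes to a separate column from a Sub would be safer.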

Logically parse a string in excel to trim near duplicates

Assume the string:
item1, item1N, item1Z, item1fhg, item1_any_letters, item2, item3, item3N, item3H
my goal output is simply
item1, item2, item3
this is about a 100,000 line Excel file currently, but can be migrated to another program etc if needed temporarily.
Essentially I need to determine duplicates (any initial phrase ending in a number) with no regard to letters after the number. Some phrases might have for example "Brand item2, Brand item34" as well, the only determining factor of a duplicate is any and all terminology AFTER the number.
Any ideas on where to begin with this? Each string usually has between 2 and 500 values in it, separated by a comma and a space. No comma follows the final value.
Sub Tester()
Dim re As Object, match As Object
Dim dict As Object
Dim arr, arrItems, x As Long, y As Long
Dim val, matches, valMatch
Set dict = CreateObject("scripting.dictionary")
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "([\w ]+\d+)"
re.ignorecase = True
re.Global = True
arr = ActiveSheet.Range("A1:A100").Value
For x = LBound(arr, 1) To UBound(arr, 1)
arrItems = Split(arr(x, 1), ",")
dict.RemoveAll
For y = LBound(arrItems) To UBound(arrItems)
val = Trim(arrItems(y))
If re.Test(val) Then
Set matches = re.Execute(val)
valMatch = matches(0).Value
If Not dict.exists(valMatch) Then dict.Add valMatch, 1
End If
Next y
Debug.Print arr(x, 1)
Debug.Print Join(dict.keys, ",") 'where do you want this?
Next x
End Sub
A VBA approach that is somewhat similar to Tim's for the first pathway:
Use a RegExp to remove the invalid characters (characters after a number and before a comma)
Eliminate the duplicates with either
a) a Dictionary
b) Excel's inbuilt remove duplicates functionality (writes to a sheet)
Const strDelim = ", "
Sub TestMe()
Dim strTest As String
Dim x
strTest = "item1, item1N, item1Z, item1fhg, item1_any_letters, item2, item3, item3N, item3H"
x = Split(DeDupe(strTest), strDelim)
'fix last element
x(UBound(x)) = Left$(x(UBound(x)), Len(x(UBound(x))) - 1)
Call Method2(x)
End Sub
Sub Method2(ByVal x)
Dim objDic As Object
Dim y As Variant
Set objDic = CreateObject("Scripting.Dictionary")
Dim lngRow As Long
For lngRow = LBound(x) To UBound(x)
objDic(x(lngRow)) = 1
Next lngRow
MsgBox Join(objDic.keys, strDelim)
End Sub
Function DeDupe(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "(.+?\d+)[^\d]+(,|$)"
DeDupe = .Replace(strIn, "$1,")
End With
End Function
Option B
'another potential option. Not applied in this code
Sub Method1(ByVal x)
Dim y As Variant
Dim rng1 As Range
With ActiveSheet
.[a1].Resize(UBound(x) + 1, 1) = Application.Transpose(x)
.Columns("A").RemoveDuplicates Columns:=1, Header:=xlNo
y = Application.Transpose(Range([a1], Cells(Rows.Count, "A").End(xlUp)))
End With
MsgBox Join(y, strDelim)
End Sub
This is probably imperfect, since it's a quick hack which only removes the rightmost non-digit strings. You will need some regexp knowledge to tune it to your needs.
Anyway, follow the "installation" steps given here, save the module, and you will be able to write in your sheet a formula such as
=S(A1;"[^0-9]*$";"")
in, say, the B1 cell. If A1 cell contains "Item 1234 blah blah", then B1 will now contain "Item 1234". Drag the formula in all cells of column B, and save values to another Excel file for sorting (or you can try sorting and sub-totaling in-place).
Unfortunately, I do not believe that doing this in 100,000+ cells is practical (I even advise against subtotaling in-place).
You would be much better served by installing text tools (sed, grep, uniq...) for Windows, and running your file through a filter. Assuming that each row represents one item as above, a filter such as
sed -e 's/^\([^0-9][^0-9]*[0-9][0-9]*\).*/\1/g' | sort | uniq -c | sort -rn
would get your 100,000 line file and return something like
79283 Item 1
1234 Item 2
993 Item 3
..........
(on some platforms you could have written (\D+\d+) instead of ([^0-9]..., but I'm unsure of the Windows behaviour).
An even better choice of tools would be (Strawberry)Perl, which has CSV support too, or Python language.
