'Creating Bucket in VBA with string values
I am trying to create a bucket of corresponding regions from string values of country codes. I have written a code using for loop and nested if conditions as below: Data:
Column A. Column B
Country code. Region.
Code:
Sub bucket()
Dim i As Integer
For i = 2 To 4321
If Range("A" & i).Value = "IN" Or Range("A" & i).Value = "CN" Then
Range("B" & i).Value = "ASIA"
ElseIf Range("A" & i).Value = "UK" Or Range("A" & i).Value = "GB" Then
Range("B" & 1).Value = "EMEA"
ElseIf Range("A" & i).Value = "US" Or Range("A" & i).Value = "CAN" Then
Range("B" & i).Value = "USAI"
Else
Range("B" & i).Value = "other"
End If
Next i
End Sub
I just wanted to check is there any alternative solution for string buckets in vba.
Solution 1:[1]
A cleaner alternative would be Select Case
sub bucket()
Dim i As Integer
For i = 2 To 4321
Select Case Range("A" & i)
Case "IN","CN"
Range("B" & i).Value = "ASIA"
Case "UK", "GB"
Range("B" & i).Value = "EMEA"
Case "US", "CA"
Range("B" & i).Value = "USAI"
Case Else
Range("B" & i).Value = "other"
End Select
Next i
End sub
Or to get even fancier:
sub bucket()
Dim i As Integer
For i = 2 To 4321
With Range("B" & i)
Select Case UCase$(Range("A" & i).Value)
Case "IN","CN"
.Value = "ASIA"
Case "UK", "GB"
.Value = "EMEA"
Case "US", "CA"
.Value = "USAI"
Case Else
.Value = "other"
End Select
End With
Next
End sub
These differences make it easier to read, easier to add more options later, and easier to modify if the columns were to change later.
Solution 2:[2]
An alternative with Switch() function:
Sub bucket()
Dim i As Long
Dim str As String
Dim var As Variant
For i = 2 To 4321
str = UCase$(Cells(i, 1).Value)
var = Switch(str = "IN", "ASIA", str = "UK", "EMEA", str = "US", "USAI", _
str = "CN", "ASIA", str = "GB", "EMEA", str = "CA", "USAI")
If IsNull(var) Then var = "other"
Cells(i, 2).Value = var
Next
End Sub
Or another one with no loops:
Sub bucket()
With Range("B2:B4321")
.FormulaR1C1 = "=IF(OR(RC[-1]=""IN"",RC[-1]=""CN""),""ASIA"", IF(OR(RC[-1]=""UK"",RC[-1]=""GB""),""EMEA"", IF(OR(RC[-1]=""US"",RC[-1]=""CA""),""USAI"",""other"")))"
.Value = .Value
End With
End Sub
which can also be written as follows:
Sub bucket()
With Range("B2:B4321")
.FormulaR1C1 = "=IF(OR(RC[-1]=""IN"",RC[-1]=""CN""),""ASIA""," & _
"IF(OR(RC[-1]=""UK"",RC[-1]=""GB""),""EMEA""," & _
"IF(OR(RC[-1]=""US"",RC[-1]=""CA""),""USAI""," & _
"""other"")))"
.Value = .Value
End With
End Sub
Solution 3:[3]
Here's another option, using Scripting.Dictionary. It's probably as close to a "bucket" as you'll get in VBA. It stores key-value pairs and returns them quickly.
I pulled out the population of the dictionary into a separate function -- while not strictly necessary, it makes it easier to update the items in your dictionary in future.
Sub MapContinents()
Dim d As Object, inputRange As Range, r As Range
Set inputRange = Sheet1.Range("A2:A4321")
Set d = GetContinentMapping
For Each r In inputRange
If d.Exists(r.Value) Then
r.Offset(, 1).Value = d(r.Value)
Else
r.Offset(, 1).Value = "Other"
End If
Next
End Sub
Private Function GetContinentMapping()
Dim d As Object
Set d = CreateObject("Scripting.Dictionary")
d.Add "IN", "ASIA"
d.Add "CN", "ASIA"
d.Add "UK", "EMEA"
d.Add "GB", "EMEA"
d.Add "USA", "USAI"
d.Add "CAN", "USAI"
Set GetContinentMapping = d
End Function
Edit: If you wish to use the approach above, but also enjoy speed improvements (see T.M's great suggestions and speed comparisons), you can adapt the code above to use VBA arrays. Replacing the first sub above with the code below will speed up run time considerably. The trade off (imo) is that you lose a bit of readability, but it might be well worth it!
Sub MapContinents()
Dim d As Object
Dim inputRange As Range
Dim inputData As Variant, v As Variant, outputData As Variant, i As Long
Set inputRange = Sheet1.Range("A2:A4321")
inputData = inputRange.Value
ReDim outputData(1 To UBound(inputData), 1 To 1)
Set d = GetContinentMapping
For i = 1 To UBound(inputData)
If d.Exists(inputData(i, 1)) Then
outputData(i, 1) = d(inputData(i, 1))
Else
outputData(i, 1) = "Other"
End If
Next
inputRange.Offset(, 1).Value = outputData
End Sub
Solution 4:[4]
Some speed considerations
- A. Looping through a cell range by means of VBA can be time-consuming,
- B. it is faster to process an array instead.
- C. Addressing, however an entire table range dynamically with an Excel formula can increase the processing speed considerably.
C. If you dispose of the newer dynamic array functions, it would be sufficient to
insert a single formula (e.g. in cell B2) resulting in a so called spill range.
I want to demonstrate especially the Let() formula which allows to avoid some redundancies in traditional formulae.
It groups formula variables with their code or value assignments in form of
argument pairs before defining the result logic as last argument eventually.
Note also the use of Match applied on two data fields, where Index(table,,2), for example defines the needed
2nd column of the lookup table.
=LET(data,$A$2:$A$4321,table,Sheet2!A$2:A$8,result,INDEX(INDEX(table,,2),MATCH(data,INDEX(table,,1),0)),IF(ISERROR(result),"Other",result))"
Of course you might also do this (once) using VBA
(assuming a lookup table e.g. in Sheet2!A2:A8):
Sub FormulaApproach365() ' 0.02 secs needed
Dim t As Double: t = Timer
Sheet1.Range("B2").Formula2 = _
"=LET(data,$A$2:$A$4321,table,Sheet2!A$2:A$8,result,INDEX(INDEX(table,,2),MATCH(data,INDEX(table,,1),0)),IF(ISERROR(result),""Other"",result))"
Debug.Print "Formula approach 365", Format(Timer - t, "0.00 secs needed")
End Sub
Note that you can obtain similar results with =XLOOKUP(A2:A12,Sheet2!A2:A8,Sheet2!B2:B8,"Other"), though not as fast
- D. Alternative in dynamic array versions:
Sub FormulaApproachDynamic() ' 0.03 secs needed
Dim t As Double: t = Timer
Sheet1.Range("B2").Formula2 = _
"=IF(--ISNUMBER(MATCH(A2:A4321,Sheet2!A2:A8,0)),INDEX(Sheet2!B2:B8,MATCH(A2:A4321,Sheet2!A2:A8,0)),""Other"")"
Debug.Print "Formula approach DA", Format(Timer - t, "0.00 secs needed")
End Sub
- E. More backwards compatible approach
Instead of this dynamic spill approach in D. above you can also apply a .FormulaArray = ... assignment
upon the corresponding entire column B range (CSE, i.e. CtrlShiftEnter).
Time needed
Note: This doesn't intend by no means to qualify the mentioned answers, as speed was no requirement of the original post!
| ad) | Approach | Time needed | Post |
|---|---|---|---|
| A.) | Range Loop | 1.23 seconds | braX |
| B.) | I. Range Loop/dictionary | 0.85 seconds | CallumDA |
| " | II. Array Loop/dictionary | 0.03 seconds | CallumDA - Edit/2022-02-02 |
| C.) | LET Formula365 | 0.02 seconds | see this post |
| D.) | Dynamic Array | 0.03 seconds | " - " |
| E.) | CSE | 0.03 seconds | " - " |
Doesn't include DisplayName 's valid Switch approach repeating each condition within the formula, as the formula length may be a serious limit.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | |
| Solution 4 |
