I have an excel file with the following links:
These links are connected to files with the following data:
I want the yellow part of the Notepad file to be read into the yellow cells of the .xlsx file (the Notepad window shows an opened .tbl file). The dotted parts differ for each version number. (This code is used to check that the right discount curve is used.) However, the discount_curve.tbl format is the only format that the next programme in the chain can handle. Therefore, each file has the same name, just in a different folder.
Is there a way for Excel/VBA to read every third line, with the file to read depending on the folder link? I strongly prefer to have the whole process automated, since there are very many version numbers. Furthermore, I do not want to change the file formatting, since I want the process to be as clean as possible.
Could someone help me out?
Kind regards.
Please try the following function if the data to be extracted is in a single file, on every third row. It returns a 2D array that can be written to the target range in one step:
Function extractThirdLine(filePath As String) As Variant
Dim arrTxt, i As Long, arrFin, k As Long
'read the file content in an array:
arrTxt = Split(CreateObject("Scripting.FileSystemObject").OpenTextFile(filePath, 1).ReadAll, vbCrLf)
ReDim arrFin(1 To Int(UBound(arrTxt) / 3) + 1, 1 To 1)
For i = 2 To UBound(arrTxt) Step 3 'start from 2: arrTxt is zero-based, so index 2 is the third line
k = k + 1
arrFin(k, 1) = arrTxt(i) 'build the final array containing the necessary rows
Next i
extractThirdLine = arrFin
End Function
Your picture does not show the row and column headers. So, supposing that the range you show is in columns "A:C" and the extracted data should go in column "D:D", use it like this:
Sub testExtractThirdLine()
Dim filePath As String, arrVal, el
filePath = "your text file full name" 'please write here the correct file name
arrVal = extractThirdLine(filePath)
Range("D1").Resize(UBound(arrVal), 1).value = arrVal
End Sub
If the range you show is not the one I supposed, you can easily adapt Range("D1") to the column immediately after that range, with its row set to the first row of the range in question.
If something is not clear enough, please do not hesitate to ask for clarification.
Edited:
But if each row has its own file containing the third line to extract, and the path to that file is obtained by concatenating the three columns, the next function will do the job:
Function extractLine(filePath As String) As String
extractLine = Split(CreateObject("Scripting.FileSystemObject").OpenTextFile(filePath, 1).ReadAll, vbCrLf)(2)
End Function
It can be called as:
Sub extractStrings()
Dim i As Long, arr, arrFin, lastRow As Long
lastRow = Range("A" & rows.count).End(xlUp).Row 'supposing that 'C:\' exists in A:A column
arr = Range("A2:C" & lastRow).value
ReDim arrFin(1 To UBound(arr), 1 To 1)
For i = 1 To UBound(arr)
arrFin(i, 1) = extractLine(arr(i, 1) & arr(i, 2) & arr(i, 3))
Next i
'drop the processed array content at once:
Range("D2").Resize(UBound(arrFin), 1).value = arrFin
End Sub
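Both functions above split the file content on vbCrLf. If a .tbl file happens to use LF-only line endings, Split would return a single element and the index (2) would raise "Subscript out of range". A minimal sketch that normalises line endings first, assuming the content has already been read into a string (the name NthLine is hypothetical, not part of the code above):

```vba
' Hypothetical helper: returns line n (1-based) of a text, or "" if the text has fewer lines.
Function NthLine(ByVal content As String, ByVal n As Long) As String
    Dim lines() As String
    ' Normalise CRLF and lone CR to LF so that one Split handles every case
    content = Replace(content, vbCrLf, vbLf)
    content = Replace(content, vbCr, vbLf)
    lines = Split(content, vbLf)
    ' Split returns a zero-based array, so line n lives at index n - 1
    If n >= 1 And n - 1 <= UBound(lines) Then NthLine = lines(n - 1)
End Function
```

It could then be used in extractLine as NthLine(CreateObject("Scripting.FileSystemObject").OpenTextFile(filePath, 1).ReadAll, 3).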
Seems like you're looking for common I/O operations, i.e. reading a file line by line.
A pretty good example is shown [here][1].
To reach your goal, we need to add an If condition that extracts every third line of your text files.
Modulo division is a good helper here.
For example, let 'i' be the line number;
then the condition looks something like this:
If (i mod 3) = 0 Then ...
It means that we're looking for every 'i' that leaves a remainder of 0 when divided by 3.
This way our code will look something like this:
Sub ReadFileLineByLine()
Dim my_file As Integer
Dim text_line As String
Dim file_name As String
Dim i As Long 'Long avoids an overflow on files with more than 32767 lines
file_name = "C:\text_file.txt"
my_file = FreeFile()
Open file_name For Input As my_file
i = 1
While Not EOF(my_file)
Line Input #my_file, text_line
If (i mod 3) = 0 Then
Cells(i, "A").Value = text_line
End If
i = i + 1
Wend
Close #my_file 'release the file handle when done
End Sub
[1]: https://excel.officetuts.net/vba/read-a-text-file/#:~:text=Reading%20a%20file%20line%20by%20line,-Let's%20read%20text&text=Open%20VBA%20Edit%20(Alt%20%2B%20F11,and%20insert%20the%20following%20code.&text=First%2C%20a%20new%20file%20is,places%20it%20inside%20a%20worksheet.
You can create a user-defined function that reads the lines from the given file and returns the third one.
Here is such a function (disclaimer: there is no error handling in this code, and it can probably be improved a lot):
Function Get3rdLine(filename As String)
Dim f As Long
f = FreeFile
Open filename For Input As f
Line Input #f, Get3rdLine ' just ignore this line
Line Input #f, Get3rdLine ' and this one too
Line Input #f, Get3rdLine ' and return this one
Close #f
End Function
You can call it with the path of the file you want to read from:
=Get3rdLine(CONCATENATE(A1,B1,C1)) for example if your path is defined by cells A1, B1 and C1.
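As the disclaimer says, the function above has no error handling: a missing file or a file with fewer than three lines raises a run-time error. A minimal sketch of a more defensive variant follows, assuming it is acceptable to return #N/A in those cases (the name GetNthLine and the parameter n are hypothetical). Note that Line Input treats only CR or CRLF as a line terminator, so this sketch assumes the .tbl files use Windows line endings.

```vba
' Hypothetical variant: returns line n (1-based) of the file, or #N/A on any problem.
Function GetNthLine(ByVal fileName As String, ByVal n As Long) As Variant
    Dim f As Integer, i As Long, textLine As String
    GetNthLine = CVErr(xlErrNA)          ' default result: #N/A
    If n < 1 Or Len(fileName) = 0 Then Exit Function
    On Error GoTo CleanUp                ' any I/O error leaves #N/A in place
    f = FreeFile
    Open fileName For Input As #f
    Do While Not EOF(f)
        Line Input #f, textLine
        i = i + 1
        If i = n Then
            GetNthLine = textLine        ' found the requested line
            Exit Do
        End If
    Loop
CleanUp:
    If f <> 0 Then Close #f              ' always release the file handle
End Function
```

From a worksheet it could be called as =GetNthLine(CONCATENATE(A1,B1,C1),3).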
Each configuration is set-up as a different Part number that makes up one complete device. We want to find out how many of each full configuration we have sold to tell the highest selling combinations. The configuration numbers relate to an option of the product for instance 12345-W means wireless.
We need to find out how many of each string of configurations we have to count them. This has a few columns that will help to find the configuration. You can look at the line number and the sales number to ensure that they are in the same grouping on the sales order. For instance one whole config that makes up a finished product will have line number 1 for all parts associated with the finished product, going down the sales order numerically. We can use this in combination with the sales order to come up with the Configuration String. Then we can look at the config column and part column to differentiate the base config from the options. The "C" in the Config column tells us it's the base model, the "X" tells us it's an option.
With this information we need to create the configuration string shown as a manual example in blueish/purple column, photo linked below. Once we have the string it's just counting the "C" config option only to avoid double counting then we can make a Pivot Table to tell how many duplicates of the same option there are and filter for C only. These are multiple different products, and multiple different configurations of different products.
Here is the Set-up:
Data example: the Blue column is an example manually of what is needed
Some thoughts I had were an If, then statement, Concatenation IF, or a Macro. But nothing has worked out so far. Any advice would be greatly appreciated!
To be able to do that efficiently, you need to ensure that the data is sorted in the following way:
by Sales
by linenum
by Config
by Part
It needs to be sorted by all 4 columns at once and in this order or you might end up with messed up data!
It is also assumed that the option numbers are always prefixed with the corresponding base number plus a dash.
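The prefix assumption above can be made explicit in a small helper. This is only a sketch under that assumption (the name OptionSuffix is hypothetical): it returns the part of an option number after the base number and dash, or an empty string when the option does not start with that prefix.

```vba
' Hypothetical helper: OptionSuffix("12345", "12345-W") returns "W".
Function OptionSuffix(ByVal basePart As String, ByVal optionPart As String) As String
    ' Only strip the prefix when the option really starts with "<base>-"
    If Left$(optionPart, Len(basePart) + 1) = basePart & "-" Then
        ' Skip the base number and the dash, keep the rest
        OptionSuffix = Mid$(optionPart, Len(basePart) + 2)
    End If
End Function
```

An empty result for a non-matching option makes it easy to spot rows that break the assumption instead of silently appending a wrong fragment.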
Then we read the columns we need into arrays (for faster processing) and loop through that data line by line. When we find a C in Config, we remember that row number (for comparing the X options and for writing the output). We move on to the next row and check whether it is an option and whether it belongs to the remembered base config. If so, we append the option's suffix (the part after the base number and dash) to the base config.
In the end we write the array data back to the cells.
Option Explicit
Public Sub GetFullConfigFromOptions()
Dim ws As Worksheet ' define your sheet here
Set ws = ThisWorkbook.Worksheets("Sheet1")
Dim LastRow As Long
LastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
' read data columns into array
Dim ArrSalesNo() As Variant 'define Sales column
ArrSalesNo = ws.Range("A1").Resize(RowSize:=LastRow).Value
Dim ArrConfig() As Variant 'define Config column
ArrConfig = ws.Range("B1").Resize(RowSize:=LastRow).Value
Dim ArrPartNo() As Variant 'define Part column
ArrPartNo = ws.Range("C1").Resize(RowSize:=LastRow).Value
Dim ArrLineNo() As Variant 'define Linenum column
ArrLineNo = ws.Range("E1").Resize(RowSize:=LastRow).Value
' create output array
Dim ArrOut() As Variant 'define String column
ArrOut = ws.Range("D1").Resize(RowSize:=LastRow).Value
Dim CurrentConfigRow As Long
Dim iRow As Long
For iRow = 2 To LastRow
If ArrConfig(iRow, 1) = "C" Then
' is config line …
CurrentConfigRow = iRow ' remember line number for output
If iRow = LastRow Then
' base without options (in last row)
ArrOut(CurrentConfigRow, 1) = ArrPartNo(CurrentConfigRow, 1) ' write base part number to output (without dash)
ElseIf Not ArrSalesNo(CurrentConfigRow + 1, 1) = ArrSalesNo(CurrentConfigRow, 1) Or _
Not ArrLineNo(CurrentConfigRow + 1, 1) = ArrLineNo(CurrentConfigRow, 1) Or _
Not ArrConfig(CurrentConfigRow + 1, 1) = "X" Then
' base without options
ArrOut(CurrentConfigRow, 1) = ArrPartNo(CurrentConfigRow, 1) ' write base part number to output (without dash)
Else
' base with options
ArrOut(CurrentConfigRow, 1) = ArrPartNo(CurrentConfigRow, 1) & "-" ' write base part number to output (including dash)
End If
Else
' check if it is an option line of the remembered config line
If ArrSalesNo(iRow, 1) = ArrSalesNo(CurrentConfigRow, 1) And _
ArrLineNo(iRow, 1) = ArrLineNo(CurrentConfigRow, 1) And _
ArrConfig(iRow, 1) = "X" Then
' is option line (so append to output)
ArrOut(CurrentConfigRow, 1) = ArrOut(CurrentConfigRow, 1) & Mid$(ArrPartNo(iRow, 1), Len(ArrPartNo(CurrentConfigRow, 1)) + 2)
End If
End If
Next iRow
' write output to cells
ws.Range("D1").Resize(RowSize:=LastRow).Value = ArrOut
End Sub
The output will look like
I wrote some code to get all the file names in a folder into a worksheet. I use this to check the accuracy of the file names (please see the diagram below).
When I run the macro, the file names of the destination folder appear under the System Reports column. Then I use some formulas to match the file names against the "Actual Names" column and flag the result to the user.
The issue with my code is that the order of the file names displayed in the worksheet changes from day to day, even though the file names and their order in the destination folder stay the same.
How do I solve this problem?
Sub GetFiles_Name()
Dim x As String, y As Variant
x = "D:\Reports\*"
y = GetFileList(x)
Select Case IsArray(y)
Case True
MsgBox UBound(y)
Sheets("Cost").Range("H6:H11").Select
Selection.ClearContents
For i = LBound(y) To UBound(y)
Sheets("Cost").Cells(i, 8).Rows("6").Value = y
Next i
Case False
MsgBox "No Matching Files Found!"
End Select
End Sub
Function GetFileList(FileSpec As String) As Variant
Dim FileArray() As Variant
Dim FileCount As Integer
Dim FileName As String
On Error GoTo NoFilesFound
FileCount = 0
FileName = Dir(FileSpec)
If FileName = "" Then GoTo NoFilesFound
Do While FileName <> ""
FileCount = FileCount + 1
ReDim Preserve FileArray(1 To FileCount)
FileArray(FileCount) = FileName
FileName = Dir()
Loop
GetFileList = FileArray
Exit Function
NoFilesFound:
GetFileList = False
End Function
I can see several problems with your Sub GetFiles_Name():
The variable y is an array, but you're using it as if it were a single value on this line:
Sheets("Cost").Cells(i, 8).Rows("6").Value = y
When you do that, VBA will take the first element of the y array and use it in each cell. Has your code actually ever worked as you show in your picture?
By writing Sheets("Cost").Range("H6:H11").ClearContents you assume that there will always be 6 files (rows 6 to 11). Is that really the case? I would rather use something more flexible (here I assume that G5 is the header of your Actual Names column):
Dim lastRow As Long: lastRow = Sheets("Cost").Range("G5").End(xlDown).Row
Sheets("Cost").Range("H6:H" & lastRow).ClearContents
Also notice that you don't need to .Select first and then clear the Selection. You can directly .ClearContents on the range without selecting.
Finally, in order not to be dependent on the order of the System Reports column files, you should look for each file and, if matched, just write it close to it. It would look like this:
For i = LBound(y) To UBound(y)
Set matched = Range("G6:G" & lastRow).Find(y(i), LookAt:=xlWhole) '<-- I assume "G" is the column with the file names moving in order
If Not matched Is Nothing Then '<-- if I found the file in the list
matched.Offset(0, 1) = y(i) '<-- put the file name in the adjacent column H
End If
Next i
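As a side note on the changing order itself: the documentation for Dir states that file names are retrieved in no particular order and suggests sorting the returned array. A minimal sketch of that alternative, using a plain insertion sort (the name SortStringsInPlace is hypothetical, and case-insensitive comparison is an assumption):

```vba
' Hypothetical helper: sorts a one-dimensional array of strings in place, ignoring case.
Sub SortStringsInPlace(ByRef arr As Variant)
    Dim i As Long, j As Long, tmp As Variant
    For i = LBound(arr) + 1 To UBound(arr)
        tmp = arr(i)
        j = i - 1
        ' Shift larger elements one slot to the right
        Do While j >= LBound(arr)
            If StrComp(arr(j), tmp, vbTextCompare) <= 0 Then Exit Do
            arr(j + 1) = arr(j)
            j = j - 1
        Loop
        arr(j + 1) = tmp
    Next i
End Sub
```

It could be called right after y = GetFileList(x), inside the Case True branch, as SortStringsInPlace y.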
Please be aware that I am working with a series of ~1000 line medical information databases. Due to the size of the databases, manual manipulation of the data is too time consuming. As such, I have attempted to learn VBA and code an Excel 2010 macro using VBA to help me accomplish parsing certain data. The desired output is to split certain characters from a provided string on each line of the database as follows:
99204 - OFFICE/OUTPATIENT VISIT, NEW
will need to be split into
Active Row Active Column = 99204 ActiveRow Active Column+3 = OFFICE/OUTPATIENT VISIT, NEW
I have researched this topic using Walkenbach's "Excel 2013: Power Programming with VBA" and a fair amount of web resources, including this awesome site, but have been unable to develop a fully-workable solution using VBA in Excel. The code for my current macro is:
Sub EasySplit()
Dim text As String
Dim a As Integer
Dim name As Variant
text = ActiveCell.Value
name = Split(text, "-", 2)
For a = 0 To 1
Cells(1, a + 3).Value = Trim(name(a))
Next a
End Sub
The code uses the "-" character as a delimiter to split the input string into two substrings (I have limited the output strings to 2, as there exists in some input strings multiple "-" characters). I have trimmed the second string output to remove leading spaces.
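The split step described above can be isolated in a small function, which makes it easier to check independently of where the output is written. This is only a sketch (the name SplitCodeAndDescription is hypothetical); it returns a two-element array and leaves the second element empty when the input contains no "-".

```vba
' Hypothetical helper: splits "99204 - OFFICE/OUTPATIENT VISIT, NEW" into code and description.
Function SplitCodeAndDescription(ByVal source As String) As Variant
    Dim parts() As String
    ' Limit of 2 keeps any further "-" characters inside the description
    parts = Split(source, "-", 2)
    If UBound(parts) < 1 Then
        ' No delimiter found (or empty input): return the text unchanged
        SplitCodeAndDescription = Array(Trim$(source), "")
    Else
        SplitCodeAndDescription = Array(Trim$(parts(0)), Trim$(parts(1)))
    End If
End Function
```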
The trouble that I am having is that the output is being presented at the top of the activesheet, instead of on the activerow.
Thank you in advance for any help. I have been working on this for 2 days and although I have made some progress, I feel that I have reached an impasse. I think that the issue is somewhere in the
Cells(1, a + 3).Value = Trim(name(a))
code, specifically with "Cells()".
Thank you Conrad Frix!
Yeah, funnily enough, just after I posted I had a brainstorm and modified the code to read:
Sub EasySplit()
Dim text As String
Dim a As Integer
Dim name As Variant
text = ActiveCell.Value
name = Split(text, "-", 2)
For a = 0 To 1
ActiveCell.Offset(0, 3 + a).Value = Trim(name(a))
Next a
End Sub
Not quite the column1, column4 output that I want (the output lands 3 and 4 columns to the right of the active cell), but it will work for my purpose.
Now I need to incorporate a loop so that the code runs on each successive cell in the column (downwards, step 1) skipping all bolded cells, until it hits an empty cell.
Modified answer to modified request.
This will start on row 1 and continue until a blank cell is found in column A. If you would like to start on a different row, perhaps row 2 if you have headers, change the
i = 1
line to
i = 2
I added a check on the upper bound of our variant before doing the output writes, in case the macro is run again on already formatted cells. (Does nothing instead of erroring out)
Sub EasySplit()
Dim initialText As String
Dim i As Long
Dim name As Variant
i = 1
Do While Trim(Cells(i, 1)) <> ""
If Not Cells(i, 1).Font.Bold Then
initialText = Cells(i, 1).text
name = Split(initialText, "-", 2)
If Not UBound(name) < 1 Then
Cells(i, 1) = Trim(name(0))
Cells(i, 4) = Trim(name(1))
End If
End If
i = i + 1
Loop
End Sub
Just add a variable to keep track of the active row and then use that in place of the constant 1.
e.g.
Dim iRow As Long
iRow = ActiveCell.Row
For a = 0 To 1
Cells(iRow, a + 3).Value = Trim(name(a))
Next a
Alternate method utilizing TextToColumns. This code also avoids using a loop, making it more efficient and much faster. Comments have been added to assist with understanding the code.
EDIT: I have expanded the code to make it more versatile by using a temp worksheet. You can then output the two columns to wherever you'd like. As stated in your original question, the output is now to columns 1 and 4.
Sub tgr()
Const DataCol As String = "A" 'Change to the correct column letter
Const HeaderRow As Long = 1 'Change to be the correct header row
Dim rngOriginal As Range 'Use this variable to capture your original data
'Capture the original data, starting in Data column and the header row + 1
Set rngOriginal = Range(DataCol & HeaderRow + 1, Cells(Rows.Count, DataCol).End(xlUp))
If rngOriginal.Row < HeaderRow + 1 Then Exit Sub 'No data
'We will be using a temp worksheet, and to avoid a prompt when we delete the temp worksheet we turn off alerts
'We also turn off screenupdating to prevent "screen flickering"
Application.DisplayAlerts = False
Application.ScreenUpdating = False
'Move the original data to a temp worksheet to perform the split
'To avoid having leading/trailing spaces, replace all instances of " - " with simply "-"
'Lastly, move the split data to desired locations and remove the temp worksheet
With Sheets.Add.Range("A1").Resize(rngOriginal.Rows.Count)
.Value = rngOriginal.Value
.Replace " - ", "-"
.TextToColumns .Cells, xlDelimited, Other:=True, OtherChar:="-"
rngOriginal.Value = .Value
rngOriginal.Offset(, 3).Value = .Offset(, 1).Value
.Worksheet.Delete
End With
'Now that all operations have completed, turn alerts and screenupdating back on
Application.DisplayAlerts = True
Application.ScreenUpdating = True
End Sub
You can do this in a single shot without looping using the VBA equivalent of entering this formula, then taking values only
as a formula
=IF(NOT(ISERROR(FIND("-",A1))),RIGHT(A1,LEN(A1)-FIND("-",A1)-1 ),A1)
code
Sub Quicker()
Dim rng1 As Range
Set rng1 = Range([a1], Cells(Rows.Count, "A").End(xlUp))
With rng1.Offset(0, 3)
.FormulaR1C1 = "=IF(NOT(ISERROR(FIND(""-"",RC[-3]))),RIGHT(RC[-3],LEN(RC[-3])-FIND(""-"",RC[-3])-1 ),RC[-3])"
.Value = .Value
End With
End Sub
I need a macro to get some data from an Excel spreadsheet prior to importing it into MySql linking table.
There is a column of charity names and a column with a list of id's separated by commas (these represent charity types)
For example:
Column A
CharityName1
CharityName2
CharityName3
CharityName4
Column B
100, 101,104
(empty)
104
100,105
I would like this to write a new csv file as follows
1,100
1,101
1,104
3,104
4,100
4,105
Thanks in advance for any help
This code will quickly create a csv file c:\temp\dump.csv with this format
[Updated to handle your format
I note that you may have lost data as Excel has applied scientific notation to your fields. For now I have added an ugly workaround to pad out the 0's. Should B2 be a 30 digit field?]
Sub GetEm()
Dim x()
Dim lngCnt As Long
Dim lngCnt2 As Long
Dim lngE As Long
Dim objFSO As Object
Dim objTF As Object
Dim vArr
Dim vArrElem
Set objFSO = CreateObject("scripting.filesystemobject")
Set objTF = objFSO.createtextfile("c:\temp\dump.csv", 2)
x = Application.Transpose(Range("B1", Cells(Rows.Count, "B").End(xlUp)))
For lngCnt = 1 To UBound(x)
lngE = InStr(x(lngCnt), "E")
If lngE > 0 Then
x(lngCnt) = CStr(Replace(Replace(x(lngCnt), ".", vbNullString), "E+", vbNullString) & Application.Rept("0", Right$(x(lngCnt), 2) - lngE + 1))
End If
If Len(x(lngCnt)) > 0 Then
If Len(x(lngCnt)) Mod 3 = 0 Then
For lngCnt2 = 1 To Len(x(lngCnt)) Step 3
objTF.writeline lngCnt & ",'" & Mid$(x(lngCnt), lngCnt2, 3)
Next
End If
End If
Next
objTF.Close
End Sub
I would iterate through the second column and take the values from each cell into an array; let's call it mainArray. (This iterates rows and cols, be warned: How to iterate through a variable-column-length range in Excel using VBA)
Then I would split each value on the ',' delimiter and store the pieces in an array called cellArray, with the number of the row they were taken from as the first value. Then, replace the original cell value in mainArray with the new cellArray. ( String-Manipulation: Split this String delimited by a - character? )
So cell B1 would become cellArray = { 1, 100, 101, 104 } which would be the first value in mainArray. Do this for each cell in column B for the used range.
Then I would create a new csv ( How to create a separate CSV file from VBA? ) and then input the data into it.
To input the data I would loop through each of my saved arrays and store as CellValue = array[0] + ", " + array[i]
Lastly, I would save my new CSV file.
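The steps above can be sketched roughly as follows. This assumes the names are in column A and the id lists in column B of the active sheet starting at row 1, and that the row number serves as the charity id; the output path C:\temp\links.csv and both procedure names are hypothetical.

```vba
' Hypothetical helper: turns (1, "100, 101,104") into "1,100", "1,101", "1,104", one per line.
Function LinkRows(ByVal rowId As Long, ByVal idList As String) As String
    Dim ids() As String, i As Long, id As String, result As String
    ids = Split(idList, ",")
    For i = LBound(ids) To UBound(ids)
        id = Trim$(ids(i))               ' tolerate spaces after the commas
        If Len(id) > 0 Then result = result & rowId & "," & id & vbCrLf
    Next i
    LinkRows = result                    ' empty string for an empty cell
End Function

' Writes one "row,id" line per id to a CSV file (path is an assumption).
Sub WriteLinkCsv()
    Dim ws As Worksheet, lastRow As Long, r As Long, f As Integer
    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    f = FreeFile
    Open "C:\temp\links.csv" For Output As #f
    For r = 1 To lastRow
        ' Trailing semicolon: LinkRows already ends each line with vbCrLf
        Print #f, LinkRows(r, CStr(ws.Cells(r, "B").Value));
    Next r
    Close #f
End Sub
```

Rows with an empty column B produce no output lines, which matches the example where CharityName2 is skipped.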
I have a simple problem that I'm hoping to resolve without using VBA but if that's the only way it can be solved, so be it.
I have a file with multiple rows (all one column). Each row has data that looks something like this:
1 7.82E-13 >gi|297848936|ref|XP_00| 4-hydroxide gi|297338191|gb|23343|randomrandom
2 5.09E-09 >gi|168010496|ref|xp_00| 2-pyruvate
etc...
What I want is some way to extract the string of numbers that begin with "gi|" and end with a "|". For some rows this might mean as many as 5 gi numbers, for others it'll just be one.
What I would hope the output would look like would be something like:
297848936,297338191
168010496
etc...
Here is a very flexible VBA answer using the regex object. What the function does is extract every single sub-group match it finds (stuff inside the parenthesis), separated by whatever string you want (default is ", "). You can find info on regular expressions here: http://www.regular-expressions.info/
You would call it like this, assuming that first string is in A1:
=RegexExtract(A1,"gi[|](\d+)[|]")
Since this looks for all occurrences of "gi|" followed by a series of digits and then another "|", for the first line in your question, this would give you this result:
297848936, 297338191
Just run this down the column and you're all done!
Function RegexExtract(ByVal text As String, _
ByVal extract_what As String, _
Optional separator As String = ", ") As String
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
Dim i As Long, j As Long
Dim result As String
RE.pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)
For i = 0 To allMatches.count - 1
For j = 0 To allMatches.Item(i).submatches.count - 1
result = result & (separator & allMatches.Item(i).submatches.Item(j))
Next
Next
If Len(result) <> 0 Then
result = Right$(result, Len(result) - Len(separator))
End If
RegexExtract = result
End Function
Here it is (assuming data is in column A)
=VALUE(LEFT(RIGHT(A1,LEN(A1) - FIND("gi|",A1) - 2),
FIND("|",RIGHT(A1,LEN(A1) - FIND("gi|",A1) - 2)) -1 ))
Not the nicest formula, but it will work to extract the number.
I just noticed that you have two values per row, with the output separated by commas. You will need to check whether there is a second match, third match, etc. to make it work for multiple numbers per cell.
In reference to your exact sample (assuming 2 values maximum per cell) the following code will work:
=IF(ISNUMBER(FIND("gi|",$A1,FIND("gi|", $A1)+1)),CONCATENATE(LEFT(RIGHT($A1,LEN($A1)
- FIND("gi|",$A1) - 2),FIND("|",RIGHT($A1,LEN($A1) - FIND("gi|",$A1) - 2)) -1 ),
", ",LEFT(RIGHT($A1,LEN($A1) - FIND("gi|",$A1,FIND("gi|", $A1)+1)
- 2),FIND("|",RIGHT($A1,LEN($A1) - FIND("gi|",$A1,FIND("gi|", $A1)+1) - 2))
-1 )),LEFT(RIGHT($A1,LEN($A1) - FIND("gi|",$A1) - 2),
FIND("|",RIGHT($A1,LEN($A1) - FIND("gi|",$A1) - 2)) -1 ))
How's that for ugly? A VBA solution may be better for you, but I'll leave this here for you.
To go up to 5 numbers, well, study the pattern and repeat it manually in the formula. It will get long!
I'd probably split the data first on the | delimiter using the convert text to columns wizard.
In Excel 2007 that is on the Data tab, Data Tools group and then choose Text to Columns. Specify Other: and | as the delimiter.
From the sample data you posted it looks like after you do this the numbers will all be in the same columns so you could then just delete the columns you don't want.
As the others presented solutions without VBA, I'll present one that does use it. Now it's your call whether to use it or not.
Just saw that @Issun presented a solution with regex, very nice! Either way, I'll present a 'modest' solution for the question, using only 'plain' VBA.
Option Explicit
Option Base 0
Sub findGi()
Dim oCell As Excel.Range
Set oCell = Sheets(1).Range("A1")
'Loops through every row until empty cell
While Not oCell.Value = ""
oCell.Offset(0, 1).Value2 = GetGi(oCell.Value)
Set oCell = oCell.Offset(1, 0)
Wend
End Sub
Private Function GetGi(ByVal sValue As String) As String
Dim sResult As String
Dim vArray As Variant
Dim vItem As Variant
Dim iCount As Integer
vArray = Split(sValue, "|")
iCount = 0
'Loops through the array...
For Each vItem In vArray
'Searches for the 'Gi' factor...
If vItem Like "*gi" And UBound(vArray) > iCount + 1 Then
'Concatenates the results...
sResult = sResult & vArray(iCount + 1) & ","
End If
iCount = iCount + 1
Next vItem
'And removes trail comma
If Len(sResult) > 0 Then
sResult = Left(sResult, Len(sResult) - 1)
End If
GetGi = sResult
End Function
Open your Excel file in Google Sheets and use a regular expression with REGEXEXTRACT.
Sample Usage
=REGEXEXTRACT("My favorite number is 241, but my friend's is 17", "\d+")
Tip: REGEXEXTRACT will return 241 in this example because it returns the first matching case.
In your case
=REGEXEXTRACT(A1,"gi[|](\d+)[|]")