I should start by saying I'm new at this, but I've been asked to get something done at work. I'm using Excel 2008 on a Mac.
I've created a data set that is roughly 3000 rows x 95 columns. The first column is a concatenation of product descriptions, manufacturers, etc. The remaining columns each correspond to a keyword, and I've used the following formula to identify and display the matching keywords in each of the 3000 rows:
=IF(ISNUMBER(SEARCH(C$3,$A3)),C$2,"") .
This has left me with data scattered throughout the sheet. I now need to combine the data from each row into one cell, with each discovered keyword separated by |, but I need to ignore blank cells so that I have a result like the following:
Keyword1|Keyword3|Keyword4|
and not like this:
Keyword1||Keyword3|Keyword4||||||||||||
Anyone have any ideas?
Thanks
You could use & like this:
="keyword1"&"|"&"keyword2"&... which does not skip blank cells
or if keyword1 is in cell A1 and keyword 2 in B1 then
=IF(OR(A1="",B1=""),"",A1&"|"&B1)
which will leave the result blank if either A1 or B1 is blank
I ran into this years ago when I was trying to create lists of potential racing picks from a list of horses in a race. I never found something I liked, but used several different methods as I changed my preferences over time. Here is one somewhat generic solution:
=SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(SUBSTITUTE(A7&"|"&B7&"|"&...&DM7," ","~"),"|"," "))," ","|"),"~"," ")
where the series starting with A7 and ending with DM7 would be replaced with the concatenation you have with multiple "|" symbols.
The formula looks for existing spaces and replaces them with an unused character (in this case, I used a "~"), then it replaces "|" with a " " and uses the TRIM command to eliminate leading and ending spaces as well as excessive spaces in between. Then it replaces the remaining spaces with "|" and puts back the initial spaces by substituting a space for any "~".
Obviously, this is a bit simpler if your keywords have no spaces, but that was not stated.
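To make the order of the substitutions easier to follow, here is the same idea as a small Python sketch. It is not an Excel formula, only an illustration of the logic; the function name and the sample strings are made up for the example, and "~" is assumed not to occur in the data.

```python
def collapse_separators(joined, sep="|", placeholder="~"):
    """Drop empty fields from a joined string, mirroring the SUBSTITUTE/TRIM chain."""
    protected = joined.replace(" ", placeholder)   # hide the real spaces
    spaced = protected.replace(sep, " ")           # turn separators into spaces
    # Like Excel's TRIM: strip leading/trailing spaces and collapse runs of spaces.
    trimmed = " ".join(part for part in spaced.split(" ") if part)
    # Spaces become separators again, then the hidden spaces are restored.
    return trimmed.replace(" ", sep).replace(placeholder, " ")


print(collapse_separators("Keyword1||Keyword3|Keyword4||||"))
print(collapse_separators("red widget||blue gadget|"))
```

Note that this approach also removes the trailing separator, so the result ends with the last keyword rather than with "|".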
Addendum....
Thinking about this last night, I think getting to one concatenated string based on 3000 rows of data with up to 95 columns may be impossible. This is likely to run up against the string size limit of 32,767 characters: if even 5% of those cells hold just a single character, you will have 14,250 characters plus a nearly equal number of separators, roughly 28,500 in total, and real keywords are longer than one character. I do not see how you would do this using the technique I posted. You could use this technique on a row-by-row basis and then cut and paste the results into a plain text file, but I am beginning to think a native Excel solution would be nearly impossible, especially without VBA, which we could have used to write the text out.
While my earlier suggestion might work well for relatively small ranges, it will not work for data approaching the size of the original request. Excel 2008 for Mac does not support VBA, so it cannot run what follows, but if you have access to a version of Excel that does, this suggestion may help: it uses VBA to put the concatenated data into a text file rather than trying to leave it in the spreadsheet.
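The estimate is easy to check with a few lines of Python (used here only as a calculator; the function name is made up, and the 5% fill rate and value lengths are assumptions taken from the paragraph above, not measurements of the real data).

```python
CELL_TEXT_LIMIT = 32767  # maximum number of characters one Excel cell can hold


def joined_length(rows, cols, fill_percent, chars_per_value):
    """Length of all non-blank values joined with one-character separators."""
    values = rows * cols * fill_percent // 100
    return values * chars_per_value + max(values - 1, 0)


for chars in (1, 2):
    total = joined_length(3000, 95, 5, chars)
    print(chars, total, total <= CELL_TEXT_LIMIT)
```

With one character per value the total stays just under the limit; with two characters per value it is already well over, which is the more realistic case for keywords.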
I am still trying to figure out a way to do this without using a string or VBA.
I am including some step-by-step information in case it is useful to you or others.
Open the spreadsheet with the 3000 rows and 95 columns of data.
Open the Visual Basic Editor and Choose Insert / Module from the menu.
Paste the following code into the editor window:
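Option Explicit

' Joins the non-blank cells of rngRange, separated by strSeparator.
Function ConcatNonblankWithSeparator(rngRange As Range, strSeparator As String) As String
    Dim rngCell As Range
    Dim strReturn As String
    For Each rngCell In rngRange
        ' MsgBox rngCell.Address
        If Len(rngCell.Value) > 0 Then
            ' MsgBox "before: " & strReturn
            strReturn = strReturn & rngCell.Value & strSeparator
            ' MsgBox "after: " & strReturn
        End If
    Next rngCell
    ' MsgBox strReturn
    ' MsgBox Len(strReturn)
    ' MsgBox Len(strSeparator)
    ' Strip the trailing separator. Skip this when nothing was found,
    ' because Left() raises an error on a negative length.
    If Len(strReturn) > 0 Then
        strReturn = Left(strReturn, Len(strReturn) - Len(strSeparator))
    End If
    ConcatNonblankWithSeparator = strReturn
End Function

Sub ConcatRange()
    Dim rngCell As Range
    Dim strSeparator As String
    Dim strFileName As String
    Dim intFile As Integer
    ' Application.PathSeparator avoids hard-coding "\" in the path.
    strFileName = Application.DefaultFilePath & Application.PathSeparator & "TestDelimOutput.txt"
    Set rngCell = Sheets("Data").Range("A1:DX3000")
    strSeparator = "|"
    intFile = FreeFile ' next unused file number
    Open strFileName For Output As #intFile
    ' Print # writes the text as-is; Write # would wrap it in double quotes.
    Print #intFile, ConcatNonblankWithSeparator(rngCell, strSeparator)
    Close #intFile
End Sub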
The first line protects you (or more likely me) from not defining variables before using them.
The Function is a fairly generic concatenation routine: it takes a range and a separator string and returns the non-blank values in the range, joined by that separator. The MsgBox commands can be uncommented (by removing the ' at the start of the line) if you want to see what is going on. "MsgBox strReturn" is unlikely to display the full concatenated data on a large range because of that function's size limit, but it may be useful during testing.
The Sub identifies the specific range to be processed and the separator to be used, and writes the result to a text file. If you want a different separator (perhaps " | " rather than "|"), you would specify it here. If you want a different sheet name, you would replace "Data" with the name of your sheet (in quotes). If you have a different range, you would replace the "A1:DX3000" with your range (also in quotes). If you want to use a different file name, you would specify that in the strFileName variable (along with a path to the location you want to use).
I am a bit of a hacker when it comes to VBA, so someone may suggest some stylistic or technical improvements, but this should get you started. I did create a 700,000 byte file with this, so it will handle a lot of data.
I hope this points you in a better direction than my earlier post.
What I tried next were some manual steps. I tried using a Macintosh CSV file but could not work with it because I was having trouble interpreting the end-of-line characters, so I settled for letting Excel create an MS-DOS text file in this process. Your actual process may need to be adapted to your environment. I used MS Word instead of Notepad or Notepad++ in the hope that Microsoft has already adapted its search/replace functionality to the Mac.
With the data in a sheet called "Data" in columns A1..DX3000, I created a second sheet with the following formula in A1:
=SUBSTITUTE(TRIM(SUBSTITUTE(SUBSTITUTE(Data!A1&","&Data!B1&","&Data!C1&","&Data!D1&","&Data!E1&","&Data!F1&","&Data!G1&","&Data!H1&","&Data!I1&","&Data!J1&","&Data!K1&","&Data!L1&","&Data!M1&","&Data!N1&","&Data!O1&","&Data!P1&","&Data!Q1&","&Data!R1&","&Data!S1&","&Data!T1&","&Data!U1&","&Data!V1&","&Data!W1&","&Data!X1&","&Data!Y1&","&Data!Z1&","&Data!AA1&","&Data!AB1&","&Data!AC1&","&Data!AD1&","&Data!AE1&","&Data!AF1&","&Data!AG1&","&Data!AH1&","&Data!AI1&","&Data!AJ1&","&Data!AK1&","&Data!AL1&","&Data!AM1&","&Data!AN1&","&Data!AO1&","&Data!AP1&","&Data!AQ1&","&Data!AR1&","&Data!AS1&","&Data!AT1&","&Data!AU1&","&Data!AV1&","&Data!AW1&","&Data!AX1&","&Data!AY1&","&Data!AZ1&","&Data!BA1&","&Data!BB1&","&Data!BC1&","&Data!BD1&","&Data!BE1&","&Data!BF1&","&Data!BG1&","&Data!BH1&","&Data!BI1&","&Data!BJ1&","&Data!BK1&","&Data!BL1&","&Data!BM1&","&Data!BN1&","&Data!BO1&","&Data!BP1&","&Data!BQ1&","&Data!BR1&","&Data!BS1&","&Data!BT1&","&Data!BU1&","&Data!BV1&","&Data!BW1&","&Data!BX1&","&Data!BY1&","&Data!BZ1&","&Data!CA1&","&Data!CB1&","&Data!CC1&","&Data!CD1&","&Data!CE1&","&Data!CF1&","&Data!CG1&","&Data!CH1&","&Data!CI1&","&Data!CJ1&","&Data!CK1&","&Data!CL1&","&Data!CM1&","&Data!CN1&","&Data!CO1&","&Data!CP1&","&Data!CQ1&","&Data!CR1&","&Data!CS1&","&Data!CT1&","&Data!CU1&","&Data!CV1&","&Data!CW1&","&Data!CX1&","&Data!CY1&","&Data!CZ1&","&Data!DA1&","&Data!DB1&","&Data!DC1&","&Data!DD1&","&Data!DE1&","&Data!DF1&","&Data!DG1&","&Data!DH1&","&Data!DI1&","&Data!DJ1&","&Data!DK1&","&Data!DL1&","&Data!DM1&","&Data!DN1&","&Data!DO1&","&Data!DP1&","&Data!DQ1&","&Data!DR1&","&Data!DS1&","&Data!DT1&","&Data!DU1&","&Data!DV1&","&Data!DW1&","&Data!DX1&","," ","~"),","," "))," ",",")
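I copied this all the way down to A3000. This gave me the non-blank values of each row on one line, separated by the delimiter used in the formula. As written above that delimiter is a comma; to get "|" instead, replace each quoted "," with "|". Note also that, unlike the earlier formula, this one has no final SUBSTITUTE of "~" back to " ", so any spaces inside the values come out as "~". Obviously, you would need to adapt this to the actual range you are processing. By working one row at a time, I did not run into the 32,767-character text limit. Hopefully, you won't either.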
I saved this as an MS-DOS text file.
I opened it in MS Word and used its Find/Replace command to replace the paragraph break character (^p) with a "|" and re-saved this as a text file. The resulting file differed from what I wanted to create only by having an extra "|" at the end.
Hopefully this will work for you as is; if not, you may need to save the file as a Macintosh text file so that the Mac version of MS Word can replace the end-of-line characters.
Not sure if this helps you achieve what you need. It certainly could work if you only need to do this once or occasionally, but it may be a bit of a pain if this is a frequent task.
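If the manual steps become a chore, the same row-by-row idea can be scripted outside Excel. The following is only a sketch in Python, assuming the sheet has been saved as a comma-separated file; the function names and the sample data are invented for the illustration, and the sample is read from a string so the example is self-contained.

```python
import csv
import io


def join_nonblank(cells, sep="|"):
    """Join the cells of one row, skipping blank (or whitespace-only) cells."""
    return sep.join(cell for cell in cells if cell.strip())


def keywords_per_row(csv_text, sep="|"):
    """Return one joined string per row of the CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return [join_nonblank(row, sep) for row in reader]


# Hypothetical sample: two rows of four keyword columns each.
sample = "Keyword1,,Keyword3,Keyword4\n,Keyword2,,\n"
rows = keywords_per_row(sample)
print(rows)
# Everything in one string, as in the Word find/replace step:
print("|".join(row for row in rows if row))
```

For a real file you would pass the file's contents (or adapt the reader to open the file directly) instead of the sample string.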
In Excel 2010, I am writing VBA to take the SUM of a range of filtered values, and store that result into a variable. The code looks like this:
With Sheets("Output")
    .Range("$A:$ZZ").AutoFilter field:=ColIndex(AB), Criteria1:="x"
    y = Application.WorksheetFunction.Sum( _
        Range(Cells(2, "AC"), Cells(10, "AC")).SpecialCells(xlCellTypeVisible) _
    )
End With
This does work, but only when I have manually toyed with the data. When I try to use this formula on my data set unedited, I get a blank result. The problem seems to lie with the data when unedited.
Each number gets the Number Stored as Text (NSaT) error. Changing the type from Text to General causes nothing to happen. I have to open the cell for editing, and then remove focus from the cell for the type to kick in. After that, I can change it back and forth from General to Text, and Excel immediately recognizes this and updates the cell. The Sum function will, at this point, recognize both General and Text field types as a number.
Is there a VBA solution for dealing with these NSaT errors? I have attempted to use 'NumberFormat' on the column, but it does not help. I have also tried manually copying and pasting the data again, even using the special As Value option, but it still has the NSaT error until manually toyed with.
Using VBA, I import a csv file and put a bunch of data into several columns.
One of these columns has a date and time. As I need to be able to use just the 'time' part of these cells, I try to convert the entire column to Time by using (and just about every other variation)
Cells(x, y).EntireColumn.NumberFormat = "hh:mm:ss"
or
Range("C1").NumberFormat = "hh:mm:ss"
Range("C1").EntireColumn.NumberFormat = "hh:mm:ss"
However, this does not convert the entire column. I've tried every other way I can find of selecting the entire column and changing it (through VB), but still only a portion of the column is converted.
If I double-click on these unconverted cells and press Enter, they change to the correct format. I realise this is a common problem relating to calculation, but my workbook is set to automatic calculation and I've tried setting this in VB too. This doesn't change anything.
The only pattern I can find is that the cells stop being converted when the Day reaches double digits. For example:
Column C
01/05/2013 7:28:56
03/05/2013 13:24:53
07/05/2013 20:13:24
09/05/2013 8:29:22
12/05/2013 9:28:56
15/05/2013 21:14:25
17/05/2013 7:28:56
Becomes:
Column C
7:28:56
13:24:53
20:13:24
8:29:22
12/05/2013 9:28:56
15/05/2013 21:14:25
17/05/2013 7:28:56
In the formula bar at the top, each cell still shows the whole date and time. I'm not sure if this is related, but it doesn't seem to matter for the calculations I have to perform using the time.
Essentially I have to take the time for a cell in column C and the time from another Cell (also in Date/Time format) and check the difference. After some research I decided the best way was to convert all the cells to a time format and then do my calculations.
Alternatively I could try converting the column to text and using a Split function (using space as a delimiter) and pulling the time out, but I'm having trouble doing this too, as once again trying to convert the entire column to text stops at the double digits for date.
Thanks for reading through all that, any thoughts and help would be appreciated.
Edit: Realised some of my syntax was incorrect in my post, this was however correct inside my macro
Another edit: I think this definitely has something to do with the date format. I just realised that before I format them, the dates are m/dd/yyyy, and when it gets to actual double-digit days it changes to dd/mm/yyyy, and that's when the problem occurs...
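The pattern in that last edit is what you would expect when day-first dates are read month-first: a value only fails once the first number cannot be a month. Here is a small sketch of that effect in Python rather than VBA, purely to illustrate the parsing logic; the function name is made up and the two timestamps are taken from the sample column above.

```python
from datetime import datetime

FORMATS = (
    ("month_first", "%m/%d/%Y %H:%M:%S"),
    ("day_first", "%d/%m/%Y %H:%M:%S"),
)


def parse_both_ways(text):
    """Try both readings of the date; None marks a reading that is impossible."""
    results = {}
    for label, fmt in FORMATS:
        try:
            results[label] = datetime.strptime(text, fmt)
        except ValueError:
            results[label] = None
    return results


for stamp in ("07/05/2013 20:13:24", "15/05/2013 21:14:25"):
    print(stamp, parse_both_ways(stamp))
```

The first value parses under both readings (as 5 July or as 7 May), so a wrong interpretation goes unnoticed; the second can only be read day-first, so a month-first import leaves it behind as text.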
In order to avoid confusion, and as the date always seems to occupy the same width, I recommend that you:
1) import this column as text
2) then go over the whole column
For Each C In Range("A:A").Cells
    If C <> "" Then
        ' ....
    End If
Next C
3) cut away the leading 11 positions (the 10-character date plus the space), e.g. C = Mid(C, 12)
4) convert the remaining string to a time, e.g. C = CDate(C) (... yes it works with a time as well, because a time is a fractional part of a date)
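The logic of steps 3 and 4 can be sketched as follows. This is Python, not VBA, and is only meant to show the string handling; the function names are invented, and it assumes every value has a 10-character date followed by one space, as in the sample column.

```python
from datetime import date, datetime


def time_part(stamp):
    """Drop the leading 11 characters ('dd/mm/yyyy ') and parse the rest as a time."""
    return datetime.strptime(stamp[11:], "%H:%M:%S").time()


def seconds_between(stamp_a, stamp_b):
    """Difference between the two times of day, ignoring the dates."""
    anchor = date(2000, 1, 1)  # arbitrary date, only needed so the times can be subtracted
    a = datetime.combine(anchor, time_part(stamp_a))
    b = datetime.combine(anchor, time_part(stamp_b))
    return (b - a).total_seconds()


print(time_part("01/05/2013 7:28:56"))
print(seconds_between("01/05/2013 7:28:56", "03/05/2013 13:24:53"))
```

Because the date is cut off as text before anything is converted, the day/month order never gets a chance to be misread.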
Alternatively you may want to capture the date part and bring it into shape, too. See here for some ideas using worksheet formulas.