I get an Excel data feed file from an outside source with vendor information. I copy it and paste values into my own XLSM and run some VBA scripts on the data. In short, the script aggregates the data, removing duplicate rows and leaving one list of unique vendor numbers in column A with the aggregated data for each vendor. Then the script uploads the data into an SQL database for other departments.
Yesterday while uploading, my database spit out an error claiming an INSERT attempt with a duplicate value where dupes are not allowed. Strange... I went back to the Excel data and sorted on vendor number to discover that there are tons of duplicate rows. What? This never happened before.
I highlighted the sorted column A and copied it off to another sheet, then used the Remove Duplicates tool. Excel reported that there are no duplicates when I'm literally staring at easily 10 cases of duplicate vendor numbers. Unfortunately, this is a TEXT column because other vendor numbers have letters in them, like AX6058.
The top two cells A1 and A2 are showing vendor number 7003 and 7003. As a test, when I say =A1=A2 the test returns FALSE.
I made sure both cells were TEXT... =A1=A2 still returns FALSE
I changed both cells to GENERAL... =A1=A2 still returns FALSE
I TRIM()'ed both cells for spaces... =A1=A2 still returns FALSE
I tested the LEN() of both cells... they are both 4
I forced both cells to be NUMBER with no decimals and =A1=A2 is still FALSE!!
If I add 1 to 7003 = 7004, then subtract 1 to get back to 7003 for both cells, then =A1=A2 finally returns TRUE after the math manipulation.
So now my VBA script has to pick out the textual vendor numbers that are truly numeric and do this goofy math work to ensure that the duplicate vendor numbers are TRUE-ly equal so they can be aggregated.
Has anyone seen this before? Or have some funky trick or setting so I don't have to do the goofy +1 / -1 math in my script?
As I said, I am xlPastingValues:
'copy vendor ID (in column 6)
ActiveSheet.Columns(6).Copy
Sheets("Tempcopy").Columns(1).PasteSpecial xlPasteValues
Thanks,
John
EDIT: My Excel is 2010, but I don't know the version used to provide the source data.
Using @BrakNicku's comments, I added an interim sheet that forces the correct data types, then copies the data into the worksheet for processing.
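The fix in the edit above suggests the two 7003 cells held different data types (one a number, one text), which would also explain why the +1 / -1 round trip made them equal. As a rough illustration of the "force one data type before comparing" idea, here is a minimal Python sketch. The function name `normalize_vendor_id` and the sample values are assumptions for illustration only, not part of the original workbook or script.

```python
def normalize_vendor_id(value):
    """Return a vendor ID as canonical text (hypothetical helper).

    A numeric 7003 (or 7003.0) and the text "7003" both become "7003",
    while alphanumeric IDs such as "AX6058" pass through unchanged.
    """
    # Whole-number floats like 7003.0 should not turn into "7003.0".
    if isinstance(value, float) and value.is_integer():
        value = int(value)
    # Convert everything to text and drop stray surrounding spaces.
    return str(value).strip()


# Hypothetical column with mixed types, as it might arrive from a feed.
raw_ids = [7003, "7003", 7003.0, " 7003 ", "AX6058"]

# After normalization, duplicates collapse as expected.
unique_ids = sorted({normalize_vendor_id(v) for v in raw_ids})
```

Normalizing to text rather than to numbers keeps IDs like AX6058 valid, which matches the reason the column has to be a TEXT column in the first place.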
The entire workbook is a few sheets long, but essentially I'm working with a base sheet that has around 8000 lines of data with about 10 columns. The end goal of this project is to be able to input a start date, an end date and a keyword, and then filter the result one last time with another keyword. So far, I've been able to filter the original data down to the date range and the first keyword. The problem arises when the second keyword is inside a block of text that varies and is never quite the same. For example, one row contains
12T Q1FY23 Unscheduled/Emergency Maintenance
While another row contains
12T Q4FY23 ERT Spill Stations
There are hundreds of variations of this, including ones that don't start with "12T". The starting data is subject to change, so I can't use Excel tables and filter it that way: once a filter is applied, the table won't update when new source data is input (unless there is a way to do this that I just don't know about). So ultimately, I need the equivalent of a table's "contains" and/or "does not contain" filter, written as formulas. Formulas seem to work well with this dynamic, subject-to-change source data, so I'd like to stick with them, as I did for the earlier filtering by date range and by the other keyword. The difference is that the other keyword was a static value filling the whole cell, not one embedded within a longer string like the "12T".
Please let me know if this is too vague or if any more material is needed to help answer this question. Attached is a sample image of what I'm working with on the original sheet. I'd like to be able to extract the rows containing only "12T", and not the ones containing "12T-M", for example, using only a formula. Assume that the data starts at A3 and ends at C8. I should also mention, just to be completely clear, that I'm trying to copy these rows dynamically into another sheet so that only the relevant data can be viewed.
To be extra clear, I first filter the original data with the following formula:
=INDEX(Sheet1!$A$6:$N$6796, SMALL(IF(COUNTIF('12T'!$H$11,Sheet1!$G$6:$G$6796), MATCH(ROW(Sheet1!$A$6:$N$6796),ROW(Sheet1!$A$6:$N$6796)), ""), ROWS(B3:$B$3)), COLUMNS(Sheet1!$A$6:A6))
The "Sheet1" reference is the sheet with the original data, and "12T" is the sheet that contains the filtering keywords (the dates and the number keyword). This formula extracts all of the rows of the original dataset in Sheet1 that contain a specific keyword, in this case "5351 - Facilities: Maintenance: Building". These extracted rows of data are deposited as an array (entered with Ctrl+Shift+Enter) in a new sheet labeled "Xtract".
In this same sheet, I then filter out this array with the date range in mind. With the starting and ending date, I first calculate the number of instances that a date falls within the date range with the following formula.
=SUMPRODUCT(($A$2:$A$671>=Q2)*($A$2:$A$671<=Q3))
I use this result in the following formula in conjunction with the filtered data (filtered with the previous keyword) to filter it further so that I only get the rows of data that have their date in the date range.
=FILTER(A2:O671,(A2:A671>=Q2)*(A2:A671<=Q3),"No data")
This is also entered as an array, and is also in the "Xtract" sheet. With this filtered data set, I want to filter it one last time, so that only the rows of data that contain, for example, "12T" or "728M" in one of the cells (in which the respective cell can be written as "12T Q1FY23 UEM") can get extracted and placed into a final array. All of this is automatically updated simply by entering the values in this section I have shown below.
I can't use a table to filter the data, at least not that I know of, because if I filter a table by this logic ("contains '12T'" and "does not contain '-M'" to get only rows that contain 12T but not 12T-M or anything that's not 12T) then once I change the date range or the other keyword, the table won't update properly. If there's anything else I can add to help clarify, please let me know.
Add a column to the left containing the formula =FIND("12T ",B1) and copy it down.
Note the space after T.
Rows matching that will have 1; rows not matching will have #VALUE! so you can sort on them.
P.S. if #VALUE! is ugly, you can use =NOT(ISERROR(FIND("12T ",B2)))
After a lot of searching and referencing my old work/internet, I found the formula to answer my problem. I understand this might not be the most clear since I can't quite provide the excel workbook I'm working with, but the goal of this was to automate all of the filtering so that no matter if data is added or not, when you change the filters, it will stay updated correctly. From the filtered data that I had already worked with, all I had to do to put it into another sheet was use the following formula:
=FILTER(XtractFilters!T2:AF900,ISNUMBER(SEARCH("12T *",XtractFilters!T2:T900)))
This finds all of the data containing a specific substring, which in this case specifically was "12T ", denoting the space as well. So all of the filtered results are then filtered once again so that only the rows where "12T " was found get returned. The range is just the entire range of data and then the column is the one containing the text where "12T " could be found.
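To make the logic of that "contains" filter concrete outside Excel, here is a small Python sketch of the same idea: keep only the rows whose text column contains the substring "12T " (with the trailing space), ignoring case as SEARCH does. The function name `filter_contains` is hypothetical, and the last two sample rows are made-up placeholders; only the first two descriptions come from the question.

```python
def filter_contains(rows, keyword, column=0):
    """Keep rows whose text in `column` contains `keyword`, ignoring case."""
    needle = keyword.lower()
    return [row for row in rows if needle in str(row[column]).lower()]


sample_rows = [
    ("12T Q1FY23 Unscheduled/Emergency Maintenance",),
    ("12T Q4FY23 ERT Spill Stations",),
    ("12T-M Q2FY23 Placeholder",),   # hypothetical row, should be excluded
    ("728M Q1FY23 Placeholder",),    # hypothetical row, should be excluded
]

# The trailing space is what separates "12T ..." from "12T-M ...".
matches = filter_contains(sample_rows, "12T ")
```

Like the SEARCH-based formula, this matches the substring anywhere in the cell, so a value such as "X12T ..." would also be kept. If that matters, the test would need to anchor on the start of the text instead.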
I have been working on a solution to this problem for a few hours now and I am basically nowhere, except knowing that I don't know how to do it... So here goes.
I want to take the original data that I have in Excel that have 'code#s' for each 'category#'. With those 'code#s', I can look up the 'category#' name.
This has been so challenging because there are a varying number of categories for every 'title#'.
I have tried printing the 'category#' name next to 'title#', but it is seemingly impossible because Excel goes through every row in the original data and gives a True, False or #N/A instead of selecting and printing only the true statements without copying down a thousand rows. I want it to go through all the possibilities and only select the categories based on the criteria that they have the same 'title#' and their lookup code matches somewhere in the lookup table.
Thanks if you can offer any sort of help.
Here are some of the formulas I have tried:
IF(AND($M$5=TOP_TREND_CONTRIBUTORS!$W$2:$W$253,MATCH(TOP_TREND_CONTRIBUTORS!$A$2:$A$253,'Category Lookup'!$D$3:$D$30,0)<>"#N/A"),TOP_TREND_CONTRIBUTORS!$A$2:$A$253,FALSE)
...where M5 and W:W are the 'title#' and A:A is the code for the lookup. In that part I am trying to say that a row is valid if its code appears in the lookup table and the 'title#s' are equal. In the last part I am trying to get it to print the 'code#s' that are valid. But that only works when I drag the formula down all the rows.
Maybe I'm missing something, but I just tried to get from your original data and lookup table to the final result. I used VLOOKUP to put categories next to titles and then used pivot table to present the data in the way you wanted (after changing some settings of pivot table and fields). Is that what you need? (some words are in Polish, it doesn't matter).
So, I have a fairly involved workbook.
Sheet 1: A database where the user enters a list of instruments as well as some data about the instruments in a vertical column.
Sheet 2: A sheet that contains the exact same information as sheet 1 but displays it in a different format. Automatically populates based on entries from Sheet 1. (Not useful in this question)
There exists a macro on Sheet 1 that is executed by clicking a button. This macro takes every column from Sheet 1 and creates a new Sheet for each column. Each new sheet, Sheet 3, is renamed to the first value in the column of Sheet 1 that it represents.
i.e., There are 4 columns in Sheet 1 with the first value in each column being: LS-ALPHA, LS-BRAVO, LS-CHARLIE, LS-DELTA. My macro will create 4 new sheets called LS-ALPHA, LS-BRAVO, LS-CHARLIE, LS-DELTA.
The first cell (technically H2) on each of the new sheets contains a formula to reference the sheet name.
=MID(CELL("filename",A1),FIND("]",CELL("filename",A1))+1,255)
i.e., H2 on the LS-ALPHA sheet will actually say "LS-ALPHA", H2 on the LS-BRAVO sheet will say LS-BRAVO, etc.
Every other data cell on the new sheet will automatically look up that value on the main sheet (Sheet 1) to determine what column it is from. Then, it will go below that value and get the contents from some cell x rows below.
=LOOKUP(H2,'Database (Cols)'!D2:AN2,'Database (Cols)'!D3:AN3)
This works absolutely perfectly. It does everything well.
Except, not always.
If I rename the columns to "LS-A, LS-B, LS-C, LS-D", it works. If I rename the columns to "LS-AA, LS-AB, LS-AC, LS-AD", it works. If I rename the columns to "LS-AAA, LS-AAB, LS-AAC, LS-AAD", it works.
However, if I rename the columns to something like "LS-TTF, LS-TTD, LS-TSD, LS-TSF", they all break somehow... None of the links on the sheets work any more. Some of them point to the wrong column, if they show anything at all. This issue is incredibly peculiar. I don't know why these names in particular break it, nor do I know what other names would also break it.
What happens when it 'breaks': All of the references seem to find the last available column in the LOOKUP. Three of the four sheets all use values from the fourth column when they aren't supposed to. Then, one sheet just gives me errors (#N/A). When I step through the calculation, it is looking for the correct value in the LOOKUP function, it's just not returning the right thing....
I can't really give much more information without showing you what's happening so I've included a working spreadsheet and a broken spreadsheet. The sheets have been generated from the macro so you don't have to mess with it. The working and broken files are below:
Working: https://drive.google.com/file/d/0B9zbU-BeMQNfSmRrWVhKVW9RN3M/view?usp=drivesdk
Broken: https://drive.google.com/file/d/0B9zbU-BeMQNfd1FUemwxQjQwMEE/view?usp=drivesdk
Note, the echo column is for debugging purposes. I was trying to see if they would all show echo instead of delta. Apparently, they don't.
From the help for the LOOKUP function:
IMPORTANT: The values in lookup_vector must be placed in ascending
order: ..., -2, -1, 0, 1, 2, ..., A-Z, FALSE, TRUE; otherwise, LOOKUP
might not return the correct value. Uppercase and lowercase text are
equivalent.
The set of values which work correctly - "LS-A, LS-B, LS-C, LS-D" - are in alphabetical order. The set of values which don't work correctly - "LS-TTF, LS-TTD, LS-TSD, LS-TSF" - are not in alphabetical order. Also, LOOKUP doesn't necessarily find an exact match - as specified in the help:
If the LOOKUP function can't find the lookup_value, the function
matches the largest value in lookup_vector that is less than or equal
to lookup_value.
To fix, either:
reorder the non-working set of values to be in alphabetical order (although you still won't guarantee an exact match), or
switch to using the HLOOKUP function instead. Ensure that the Range_lookup parameter is false to require an exact match. Sample usage: =HLOOKUP(H2,'Database (Cols)'!D2:AN3,2,FALSE)
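To see why an approximate-match lookup goes wrong on unsorted headers, here is an illustrative Python sketch. It is not Excel's actual implementation (which is not documented in detail); it only assumes a binary search that returns the last entry less than or equal to the lookup value, as the help text describes. The function names and the result labels "col 1" to "col 4" are hypothetical.

```python
from bisect import bisect_right


def approximate_lookup(value, lookup_vector, result_vector):
    """Binary-search lookup that assumes lookup_vector is sorted ascending."""
    keys = [k.upper() for k in lookup_vector]   # comparison ignores case
    pos = bisect_right(keys, value.upper())
    if pos == 0:
        return None                             # stands in for #N/A
    return result_vector[pos - 1]


def exact_lookup(value, lookup_vector, result_vector):
    """Linear exact-match lookup; order of lookup_vector does not matter."""
    keys = [k.upper() for k in lookup_vector]
    try:
        return result_vector[keys.index(value.upper())]
    except ValueError:
        return None                             # stands in for #N/A


results = ["col 1", "col 2", "col 3", "col 4"]
sorted_headers = ["LS-A", "LS-B", "LS-C", "LS-D"]
unsorted_headers = ["LS-TTF", "LS-TTD", "LS-TSD", "LS-TSF"]

ok = approximate_lookup("LS-B", sorted_headers, results)          # "col 2"
wrong = approximate_lookup("LS-TTF", unsorted_headers, results)   # "col 4"
right = exact_lookup("LS-TTF", unsorted_headers, results)         # "col 1"
```

With the sorted headers the binary search lands on the right column. With the unsorted set it skips past the real position of LS-TTF and ends on the last column, which resembles the "everything points at the fourth column" symptom in the question. The exact-match version is unaffected by order.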
I had a similar problem because I was wrongly using lookup. To find a value in a vector, I had to use
=MATCH("KEY";F5:F48;0)
instead of
=LOOKUP("KEY";F5:F48)
LOOKUP just didn't work for my objectives.
I have two Excel sheets in a document used to make charts from number estimates. One sheet is a database query, run via a plugin, that imports data into Excel. When pulling the data from my database, you can see the plugin correctly does not populate blanks with a 0 in columns C and D.
[Image: RallyQuery]
The second sheet is used to perform calculations in order to make the charts. If you look closely, columns G and H should exactly match C and D from the previous picture; however, Excel has added zeros in cells that were blank in the database. The formula used to pull the data from the first sheet is:
=IFERROR(AgileCentralQueryResultList[@PlanEstimate],"")
I need Excel to stop replacing blanks with zeros without removing any true zeros that were correct in the database. All of the answers I've found also remove the true zeros, which will not work.
[Image: Totals]
You need to adjust your formula a little bit. A blank cell apparently is not going to produce an error, so you may want to drop the IFERROR part of your formula and go with an IF. If you have source cells that contain actual error results, then you will need to do a second IF.
Option 1
=IF(AgileCentralQueryResultList[@PlanEstimate]="","",AgileCentralQueryResultList[@PlanEstimate])
Option 2
=IF(OR(AgileCentralQueryResultList[@PlanEstimate]="",ISERROR(AgileCentralQueryResultList[@PlanEstimate])),"",AgileCentralQueryResultList[@PlanEstimate])
Option 3
=IFERROR(IF(AgileCentralQueryResultList[@PlanEstimate]="","",AgileCentralQueryResultList[@PlanEstimate]),"")
I would try option 1 first and see if you need to take that extra step for error checking using option 2.
Using all the great changes from "Forward Ed" I was able to add my IFERROR check back to his formula to get it all to be happy again. Thanks a ton Ed, credit goes to you for guiding me.
=IFERROR(IF(OR(AgileCentralQueryResultList[@PlanEstimate]="",ISERROR(AgileCentralQueryResultList[@PlanEstimate])),"",AgileCentralQueryResultList[@PlanEstimate]),"")
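The point of the IF test is to treat "blank" and "zero" as two different cases instead of letting a blank be coerced to 0. A minimal Python sketch of that distinction follows. The function name `show_estimate` is hypothetical, and `None` is only an assumed stand-in for a blank source cell.

```python
def show_estimate(value):
    """Return "" for a blank source value, otherwise the value unchanged.

    None and "" are treated as blank; a genuine 0 is kept as 0.
    """
    if value is None or value == "":
        return ""
    return value


# Hypothetical PlanEstimate column: a real zero, two blanks and a number.
source = [0, None, "", 3.5]
displayed = [show_estimate(v) for v in source]
```

The key design choice is testing for blank explicitly before passing the value through, rather than relying on an error check, since a blank is not an error.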
I use the Google Analytics API to export some data into Excel. Because the API limits the number of dimensions and metrics that can be exported at the same time, I have to make this export using different queries, which are placed into different sheets.
I want to consolidate all this information in just one sheet, so in every sheet I create a unique ID from values shared across the sheets (using CONCATENATE) and do a VLOOKUP to consolidate the data from every sheet into the first sheet. It works like a charm for 99.5% of the data. But some IDs return #N/A, although I have manually checked that they match exactly using =B1='Sheet2'!B191, which returns TRUE.
I generate those unique IDs with =CONCATENATE(TRIM(B1),TRIM(C1),TRIM(D1)...), so I do not believe there are blank spaces preventing the match. I have even pasted those IDs as values and I still get those #N/A errors.
I am not able to find a cause for this rare behaviour!
Agustín
Non-printing characters can cause a number of issues. In this case they were apparently resolved by CLEAN, which:
Removes the first 32 nonprinting characters in the 7-bit ASCII code (values 0 through 31) from text.
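As a rough illustration of what that description means, here is a Python sketch that drops the characters with codes 0 through 31 from a string. It is an approximation written from the help text quoted above, not Excel's own code, and the sample ID is made up.

```python
def clean(text):
    """Remove characters with codes 0 through 31 (ASCII control characters)."""
    return "".join(ch for ch in text if ord(ch) >= 32)


# Hypothetical ID with a tab in the middle and a line feed at the end.
dirty_id = "AB\t12\n"
clean_id = clean(dirty_id)   # "AB12"
```

TRIM alone would not help with an ID like this, because the tab and line feed are not spaces. That may be why the concatenated IDs still failed to match even after trimming each part.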