Need to compare data in two Excel files - excel

I need to compare two Excel files. One is extracted from a database and saved as CSV. The other is a cumulative report containing all records for that day. I need to check whether all the data in the cumulative report is also in the CSV file extracted from the database. I know VLOOKUP, but I am not sure whether it can compare entire records. Many files have 4,000 to 5,000 records with 50 columns each. Is there another option? Are there any free ETL tools?

I decided to use the Beyond Compare tool to compare the two files:
1. Sort both files on an ID field.
2. Compare them using Beyond Compare.
It's a nice tool.

You can indeed use VLOOKUP.
A simple way to solve this is to:
1. Create a database connection for the saved CSV data in Sheet1.
2. Copy and link the second file (the cumulative report) into the same workbook in Sheet2.
Then use VLOOKUP or simple IF statements to compare the data.
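For files of this size, the same lookup-and-compare idea can also be scripted. The following is only a minimal sketch, assuming both files are saved as CSV, share the same column headers, and have a unique key column (called `ID` here, which is an assumption, as is the tiny sample data):

```python
import csv
import io


def load_by_id(csv_text, id_field):
    """Parse CSV text into {id: row_dict}, stripping whitespace from values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    table = {}
    for row in reader:
        clean = {k: (v or "").strip() for k, v in row.items() if k is not None}
        table[clean[id_field]] = clean
    return table


def compare_reports(report_text, extract_text, id_field="ID"):
    """Check every report row against the extract.

    Returns (missing_ids, mismatches), where mismatches maps an ID to
    {column: (report_value, extract_value)} for each differing column.
    """
    report = load_by_id(report_text, id_field)
    extract = load_by_id(extract_text, id_field)
    missing = [rid for rid in report if rid not in extract]
    mismatches = {}
    for rid, row in report.items():
        if rid not in extract:
            continue
        diffs = {
            col: (val, extract[rid].get(col))
            for col, val in row.items()
            if extract[rid].get(col) != val
        }
        if diffs:
            mismatches[rid] = diffs
    return missing, mismatches


# Made-up sample data; real files would be read with open(path, newline="").
report_csv = "ID,Name,Amount\n1,Ann,10\n2,Bob,20\n3,Cy,30\n"
extract_csv = "ID,Name,Amount\n1,Ann,10\n2,Bob,25\n"

missing, mismatches = compare_reports(report_csv, extract_csv)
print("Missing from extract:", missing)
print("Differences:", mismatches)
```

Values are compared as strings, so `10` and `10.0` would count as different; normalising numeric columns first may be needed for real data.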

Related

Delimited File with Varying Number of Rows Azure Data Factory

I have a delimited file separated by hashes that looks somewhat like this,
value#value#value#value#value#value##value
value#value#value#value##value#####value#####value
value#value#value#value###value#value####value##value
As you can see, when separated by hashes, there are more columns in the 2nd and 3rd rows than there are in the first. I want to be able to ingest this into a database using an ADF Data Flow after some transformations. However, whenever I try to do any kind of mapping, I only ever see 7 columns (the number of columns in the first row).
Is there any way to get all of the values, with as many columns as there are in the row with the most items? I do not mind the nulls.
Note: I do not have a header row for this.
Azure Data Factory cannot directly import the schema from the row with the maximum number of columns. Hence, it is important to make sure every row in your file has the same number of columns.
You can use Azure Functions to validate your file and update it so that all rows have an equal number of columns.
You could try keeping a local file that contains the row with the maximum number of columns and importing the schema from that file; otherwise you will have to go with Azure Functions, converting the file there and then triggering the pipeline.
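The file-fixing step suggested above could look roughly like the sketch below, whether it runs inside an Azure Function or as a local pre-processing script. The function name `pad_rows` is hypothetical, and the code assumes `#` never appears inside a value:

```python
def pad_rows(lines, delimiter="#"):
    """Pad every row with empty trailing fields so all rows share one column count."""
    rows = [line.rstrip("\n").split(delimiter) for line in lines if line.strip()]
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Appending empty strings adds trailing delimiters, which load as nulls/empties.
    return [delimiter.join(row + [""] * (width - len(row))) for row in rows]


# The three sample rows from the question.
sample = [
    "value#value#value#value#value#value##value",
    "value#value#value#value##value#####value#####value",
    "value#value#value#value###value#value####value##value",
]

padded = pad_rows(sample)
widths = {len(line.split("#")) for line in padded}
for line in padded:
    print(line)
```

After padding, every row (including the first, which is the one the schema import reads) has the same number of fields, so the mapping should expose all columns.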

Excel sheets with scores by same ID of person (Kahoot) - How to extract and summarize scores from several quizzes?

I've used Kahoot in the classroom and have several Excel files with scores from quizzes.
Students took the quizzes using unique IDs. In each file, scores are visible for each ID (but ordered by success on that quiz). Some students are also missing or entered wrong IDs (I'll ignore those).
Now I would like to collect all scores for all student IDs in one sheet and summarize them by student ID.
How can I do that most efficiently?
Any pointer or advice is appreciated.
Thanks,
B.
Here's a high level guide to getting what you want along with a sample in this file.
Step 1 - Combine Files to Sheet with Unified Columns
Objective
The goal here is to:
Combine all of your data from other files to single sheet
Merge the data to be in a single column for each field (i.e. Column A has ID, Column B has score).
No breaks in rows.
No formulas.
To illustrate, I made this fake list based loosely on your description.
Method
You can probably do this manually, but a macro could also be used. If you expect to do this year after year, you might look into VBA to open and close the files in a folder. However, since that wasn't part of the question, you can just copy and paste (better yet, make a kid do it!). Just make sure there's only one header for each column and that all of the data records align. If you have any formulas, use Paste Values.
Step 2 - Show Summation
There are a couple of ways this could be done. A pivot table is probably the most sensible because you could include each quiz as a column and see the total. You could also use a pivot table to do averages by student, etc.
To make a pivot table, I would recommend looking on YouTube; the tutorials there will do a better job of explaining than I can.
On that same file I made as an example, I included some tabs to illustrate the power of pivot tables and a couple graphs.
Hope that helps. If you have specific technical questions on this, you might consider asking separately.
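If you would rather script the two steps than do them by hand, the logic is small. This is only a sketch with made-up IDs and scores; it assumes each quiz has already been read into a list of (student ID, score) pairs, and the function name is hypothetical:

```python
from collections import defaultdict


def summarize_scores(quizzes):
    """Combine quizzes into one table keyed by student ID and total the scores.

    quizzes: {quiz_name: [(student_id, score), ...]}
    Returns (per_student, totals) where per_student[id][quiz_name] is the score
    on that quiz and totals[id] is the sum over all quizzes taken.
    """
    per_student = defaultdict(dict)
    for quiz_name, rows in quizzes.items():
        for student_id, score in rows:
            previous = per_student[student_id].get(quiz_name, 0)
            per_student[student_id][quiz_name] = previous + score
    totals = {sid: sum(scores.values()) for sid, scores in per_student.items()}
    return dict(per_student), totals


# Made-up data: rows are ordered by score within each quiz, as in the exports.
quizzes = {
    "Quiz1": [("S02", 900), ("S01", 700)],
    "Quiz2": [("S01", 800), ("S03", 500)],
}

per_student, totals = summarize_scores(quizzes)
for sid in sorted(totals):
    print(sid, per_student[sid], "total:", totals[sid])
```

This mirrors what the pivot table does: one row per student ID, one column per quiz, plus a total. Students missing from a quiz simply have no entry for it.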

Power Query to fetch values based on 2 criteria

I am trying to create a query based on the requirement below. So far I have only been able to build a query based on a single criterion. Your help would be much appreciated.
1) I have opened a target workbook where I want to fetch matching values for its two columns (temperature and density) from multiple workbooks saved in a particular folder. Here I am referring to the New Query > From File > From Folder option.
2) In my target workbook I have Observed Density and Observed Temperature, and I want to extract the volume and weight correction factors from the pool of workbooks picked up in step 1 (all the workbooks in that folder contain not only Observed Density and Observed Temperature but also columns with the corresponding weight and volume correction factors).
That's it. I just want to know whether this can be achieved using Power Query, or whether VBA is a must to get results. If so, any hints would be much appreciated.
I think you are on the right track. I would finish building the Query using the From Folder option to turn the data from all the workbooks into a single dataset. This query does not need to be loaded into an Excel table.
Then I would start a new query based on the 2 columns in your target workbook. They will need to be in a table or named range. In that query you can add a Merge step to match to rows in the From Folder query. Then Expand the results to add the columns you need from the matching rows.
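To make the Merge step concrete, here is the same two-column match written out in Python. It is an illustration of the logic only, not Power Query code; the field names and numbers are made up, and it assumes the key values match exactly (no rounding differences between workbooks):

```python
def merge_on_two_keys(targets, lookup_rows):
    """Left-join targets to lookup_rows on (density, temperature).

    Each target row gains 'vcf' and 'wcf' from the matching lookup row,
    or None for both when no row matches.
    """
    index = {(r["density"], r["temperature"]): r for r in lookup_rows}
    merged = []
    for t in targets:
        match = index.get((t["density"], t["temperature"]))
        merged.append({
            **t,
            "vcf": match["vcf"] if match else None,
            "wcf": match["wcf"] if match else None,
        })
    return merged


# Stand-in for the combined From Folder dataset (values are invented).
lookup_rows = [
    {"density": 850.0, "temperature": 25.0, "vcf": 0.9912, "wcf": 0.8489},
    {"density": 850.0, "temperature": 30.0, "vcf": 0.9868, "wcf": 0.8489},
    {"density": 860.0, "temperature": 25.0, "vcf": 0.9915, "wcf": 0.8589},
]
# Stand-in for the two observed columns in the target workbook.
targets = [
    {"density": 850.0, "temperature": 30.0},
    {"density": 870.0, "temperature": 25.0},
]

result = merge_on_two_keys(targets, lookup_rows)
for row in result:
    print(row)
```

In the Merge dialog the equivalent is selecting both key columns in each query (in the same order) with a left outer join, then expanding the correction-factor columns.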

Import 2 or more columns from Excel into 1 column Access

I have an Excel report that is the output of an opinion tool. This Excel file contains all the responses that people submit for my quiz. For multiple-choice questions, the tool outputs one column per option, and only the selected option's column contains data. For example, if my quiz is like this:
Q1 Your name:
R1 =
Q2 Options
opt 1
opt 2
opt 3
The Excel report will appear like this
Excel Report
So I want the import from Excel to Access to automatically merge those columns, so that the Access table has only two headers: "Q1 Your name:" and "Q2 Options".
For context, I will make some other edits to the imported table and then copy it to another Access table (table 2), so I would also accept a way to merge those Access columns before copying to the other table, something like "insert from this column, and if it is empty, insert from that column". I'm not good at writing queries, sorry. Only table 2 will keep information; the first table is a temporary one, so I will delete its contents daily and preserve the important data in table 2.
Thanks for the support
The simplest way I can see to achieve your goal is to concatenate the three columns, since by the sound of it you will only ever have a value in one column per question per record. You could do this in Excel prior to the import, you could use a calculated field on the table, or you could build a query that concatenates all your questions. My suggestion would be Excel, since using the =CONCATENATE() function is probably going to be the easiest option for you.
If you do import your raw data into Access you will need to assign unique column names, i.e. Q2_Op1, Q2_Op2, Q2_Op3.
The query syntax to concatenate these fields into one would be something like:
SELECT Q1_Name, [Q2_Op1] & [Q2_Op2] & [Q2_Op3] AS Q2_Options
FROM Table1;
Where Q1_Name, Q2_Op1, Q2_Op2, Q2_Op3 are the column names on the imported data table.
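If the export is ever pre-processed outside Excel or Access, the same merge is easy to express in a script. The sketch below is an assumption-laden illustration: the function name is hypothetical, rows are plain dictionaries, and at most one option column per question is expected to hold a value:

```python
def merge_option_columns(row, option_columns, merged_name):
    """Collapse several option columns into one field.

    Non-empty option values are concatenated, which equals the single
    selected option when only one column is filled in.
    """
    values = [
        row[col].strip()
        for col in option_columns
        if row.get(col) and row[col].strip()
    ]
    merged = {k: v for k, v in row.items() if k not in option_columns}
    merged[merged_name] = "".join(values)
    return merged


# Made-up response using the column names suggested above.
raw = {"Q1_Name": "Ana", "Q2_Op1": "", "Q2_Op2": "opt 2", "Q2_Op3": ""}
clean = merge_option_columns(raw, ["Q2_Op1", "Q2_Op2", "Q2_Op3"], "Q2_Options")
print(clean)
```

A response with no option selected ends up with an empty string in the merged field.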

Excel VB project - best practice

I'm not a VB developer, nor am I very familiar with Excel. Anyway, I have a project that must be done using MS Excel (I cannot use Access).
The system is to provide a ratio analysis (and some other analyses) of companies, where data from an annual report needs to be entered into the system. Then, based on data from several reports, I can derive graphs and all other information.
My question
Now I can store the data in a single sheet, using it as a database. It'll be like
CompanyName Year Data1 Data2 Data3...
Here the CompanyName can be duplicated, since data for many years can be entered. If I use this method, each time I derive company data I have to search for the relevant rows in the worksheet and keep lots of data in an array as I read through those rows and produce the final result.
Or I can use a separate worksheet for each company. Then I only have to search for the relevant sheet name and perform operations in that worksheet itself.
So what is the best way to do this?
Thanks
Whatever way works. IMO you could create a defined range (or many) and issue SQL against it just like it was Access table(s). I'm for keeping all like data on the same worksheet even for different companies; but that's just my 2 cents. You can create a pivot to separate out the information and slice/dice it however needed
Since someone liked the comment as an answer:...
It might be simpler to do some of this just using formulas and Excel functions. The basic approach would be to keep the data on one sheet and sort it by year within company so that all the years for a company are grouped together. Then use Filter to create a list of unique companies. These steps get repeated each time you add new data.
Then create 2 formulas for each company: the first uses MATCH to find the first row containing the company name and the second uses COUNTIF to find how many rows there are for the company. Then you can use OFFSET(firstrow,0,ColumnIndex,NumberOfRows,1) (or similar; OFFSET takes a reference, a row offset, a column offset, then a height and a width) to get the required range of data for charts, ratio analysis, etc.
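The MATCH / COUNTIF / OFFSET combination is easier to follow when written out as plain code. This Python sketch only illustrates the idea, with made-up company names and figures; it relies on the data being sorted so that each company's rows are contiguous:

```python
def company_block(rows, company):
    """Return the contiguous block of rows for one company.

    rows must be sorted by company, then year. Each row is a tuple
    (company_name, year, value, ...).
    """
    names = [row[0] for row in rows]
    first = names.index(company)      # like MATCH(company, names, 0), but 0-based
    count = names.count(company)      # like COUNTIF(names, company)
    return rows[first:first + count]  # like OFFSET with height = count


# Made-up data, sorted by company and then by year.
data = [
    ("Acme", 2019, 1.2),
    ("Acme", 2020, 1.4),
    ("Birch", 2019, 0.8),
    ("Birch", 2020, 0.9),
    ("Birch", 2021, 1.1),
]

birch = company_block(data, "Birch")
print(birch)
```

As with MATCH returning #N/A, `names.index` raises `ValueError` when the company is not in the list, so a real workbook or script needs to handle that case.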

Resources