How to "protect" certain fields (to avoid row spacing) in a huge Excel data set

I have a huge data set, and every time I add a new row in the middle of the table, the tables on the left and right sides are disrupted. Is there any way to "lock" these tables while I edit? I can't find a solution on the sheet protection tab.
The red part is the table that should stay undisturbed, but it always gets one row of spacing when I add a row. I have to manually select exactly which row and how many columns in the middle part, which is time-consuming.

Related

Two tables with no unique values that I need to match and return value based on date range and zipcode fields

I have two sets of data; the first (Wind Claims) contains a StartDate, EndDate, and Zip Code field. The second (PLRB Wind) contains a Date, Zip Code, and Wind Speed field.
My goal is to get the Wind Speed from the PLRB Wind tab to the Wind Claims tab if the Date from the PLRB Wind tab is between the StartDate and EndDate on the wind Claims tab AND the Zip Code from the PLRB Wind tab matches the Zip Code on the Wind Claims tab. The point is to identify the wind speed where damage was reported.
I have tried a couple of formulas. With this one I actually got results, but only 1227 out of 16822. I wouldn't expect a 100% match, but definitely many more than I am getting. I think the reason is that this formula looks for the specific date rather than the date range:
=XLOOKUP(Z2&N2,'PLRB Wind'!$I$2:$I$78525&'PLRB Wind'!$D$2:$D$78525,'PLRB Wind'!$M$2:$M$78525,"")
I also tried an Index Match (this is just the Match piece of the formula)
=MATCH(1,IF('PLRB Wind'!D2>=$B$2:$B$16823,IF('PLRB Wind'!D2<='Wind Claims'!$C$2:$C$16823,IF('PLRB Wind'!I2='Wind Claims'!$Z$2:$Z$16823,1))),0)
Thank you in advance for looking at this. I appreciate any help you might be able to provide!
I'd use Power Query for this. Do you know what Power Query is? I was upset when I found out about it because of all the useful ways I could have been using it before.
You might feel differently, though. Create a new copy of your workbook for this just in case you hate it. :-)
In the "Data" ribbon of Excel, in the Get & Transform section, there's a "From Table" button. Highlight your PLRB table (including the column titles) and click that "From Table" button to create a new query from it. It will create the table and the query.
A Power Query editor window will pop up, presenting your query as two steps, listed in the middle of the right sidebar. The first step gets the information from your worksheet. The second step changes the data types. Click the icon to the left of each date column's title to change the type from datetime to date because why not. On the right sidebar, change the query name to PLRB.
Now click "Close & Load" on the home ribbon. It will create a new tab with the results of your table. Leave it for now. You can delete that tab later and it won't delete the query.
So, back to your worksheet, highlight the column-title row and data rows for the first three columns of the Wind Claims table. Create another query with "From Table". Call it WindClaimsInput. Again, correct the datetime columns to date columns.
OKAY, so now you have two queries. They both read from your workbook, but they could have read from another workbook, a text file, etc. If you like this solution, then your final form might be a workbook that doesn't actually have any source data in it, just queries that get the raw data from elsewhere and a tab that presents the third query we're about to make.
Now for the fun part.
While still in the Power Query editor editing your WindClaimsInput query, near the left edge of the "Home" ribbon there's a button named "Manage". Click it, then click "Reference" to create a third query that starts from the old one. Remember, queries are only instructions. We aren't copying data until we run the queries.
Now, find the button to add a custom column. It should open a dialog box asking for the column name and formula. Name it "PLRB" and use this formula:
Table.SelectRows(PLRB, (r) => (r[Date] >= [CATFromDt] and r[Date] <= [CATThruDt] and r[ZipCode] = [ClaimZip]))
Table.SelectRows is a Power Query function that takes two arguments:
The table (or a query that returns a table), and,
A function that runs on each record (aka row) of the table and returns true/false. In this case, we created a function that takes one argument (r) and returns true or false.
So the above formula says "Give me a table of all rows in PLRB for the given ClaimZip zip code that also have a Date between CATFromDt and CATThruDt." Since it's a column formula, it runs once per row in Wind Claims.
Now you have a table where the last column is another table! Specifically, the rows from PLRB that are relevant for that Wind claims row. You can single-click on any of those cells in that last column to see the subtable.
To the right of the last column's title will be a little "expand" icon. Click it and choose to aggregate by max wind speed. (The right edge of the "wind speed" choice will let you change it to maximum, or average, or whatever you like.) Uncheck "Use original column name as prefix". Click OK. Don't worry, you can delete this new step and try again if I didn't describe it well.
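For anyone who wants to sanity-check the logic outside Excel, here is a minimal Python sketch of what the custom column and the max aggregation do together. It is not Power Query code. The sample rows and the WindSpeed key are made up for illustration; the other field names mirror the ones used in the formula above.

```python
from datetime import date

# Hypothetical sample data; only the field names come from the formula above.
plrb = [
    {"Date": date(2020, 8, 3), "ZipCode": "33101", "WindSpeed": 52},
    {"Date": date(2020, 8, 4), "ZipCode": "33101", "WindSpeed": 67},
    {"Date": date(2020, 8, 4), "ZipCode": "70112", "WindSpeed": 80},
    {"Date": date(2020, 9, 1), "ZipCode": "33101", "WindSpeed": 90},
]
claims = [
    {"CATFromDt": date(2020, 8, 1), "CATThruDt": date(2020, 8, 5), "ClaimZip": "33101"},
    {"CATFromDt": date(2020, 8, 1), "CATThruDt": date(2020, 8, 5), "ClaimZip": "99999"},
]


def matching_rows(claim, plrb_rows):
    """Apply the same test as the row filter: date inside the range and equal zip code."""
    return [
        r for r in plrb_rows
        if claim["CATFromDt"] <= r["Date"] <= claim["CATThruDt"]
        and r["ZipCode"] == claim["ClaimZip"]
    ]


def max_wind_speed(claim, plrb_rows):
    """Aggregate the matching rows to their maximum wind speed (None when nothing matches)."""
    speeds = [r["WindSpeed"] for r in matching_rows(claim, plrb_rows)]
    return max(speeds) if speeds else None


# One result per claim row, like the expanded column.
results = [max_wind_speed(c, plrb) for c in claims]
```

The first claim picks up two PLRB rows and keeps the larger speed; the second has no matching zip code, so it stays empty. The September row is ignored even though its zip code matches, which is the date-range behavior an exact-date lookup cannot give you.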
Hit "Close and Load" to see it in your workbook. If it looks right, great! Otherwise, feel free to go back and edit some more.
And now you're done! Unlike formulas, the query doesn't refresh automatically. When you want to refresh your output based on your input tables, you can refresh that query or click "Refresh All" in the "Data" ribbon.
In the "Data" ribbon of Excel, in the "Get & Transform" section, there's a "Show Queries" button that toggles a sidebar listing the queries you've made. You probably only want to keep loading your third query, so you can change the "Load to..." of the other two queries to "Connection Only".
Sorry I can't do screenshots right now.

Is there a way to match values in Excel when an individual cell has multiple values?

Above is a picture of my Excel sheet. I have 2 columns of data that have multiple data points in them (separated by commas). This is how my data is spit out after running an online psychology experiment. I'm hesitant to split text to columns because some lines only have 3 values and other lines have 20+. Essentially, I need to match values in one column to values in the second column. For example, the first value in column G needs to match with the first value in column H. The second value needs to match with the second value, etc. I don't need to match up every value in both columns, however. I only need a (defined) subset of values.
I'm not sure if this is possible to do in Excel (or any Excel add-on) without separating the values into separate columns, but any help is appreciated!
I've seen this before in survey data - the output uses "packed data" where each cell contains many values. You will need Excel 2010+ for Windows (or Excel 365) for this solution. Otherwise, there is a solution that is also Mac compatible and does not involve VBA, but it takes time to construct. This approach should take you 10 minutes - a lot of steps, but it is just clicking.
Let's say that these are your data in two columns in a table.
Click anywhere inside the table. Open the Data tab and click on From Table/Range:
This will convert your data into an Excel Table and ask you if your table has headers - yes it does. Click OK.
This will open the Power Query (PQ) editor (congratulations, you are now a step closer to being a data scientist, so take a selfie with this screen in the background and share it on social media).
You will see in the Applied Steps on the right hand side that PQ has helpfully detected the data type in a step called Changed Type. You need to undo that because it will likely think that your comma separated numbers are just one giant number. So click the X on the left side of that step.
On the right side, you can expand out Queries as shown above. Right click on your table and select Duplicate.
NB: This is not the most efficient way to do this, but I think this is something you just want to do one time and you probably don't want to go hacking through the Advanced Editor.
So now you have two tables:
Rename Table1 (2) to Output in the box on the right hand side just to create some clarity.
Right Click on the Response RT column in Output and Remove it. Click on Table1 and do the same thing to the Response column. So now you have Table1 with only the Response RT and Output with only the Responses. Now we will parse these into rows of cleaned data.
Parse Table1
First, in Table1, click on the Response RT column and in the Home tab you will see Split Column. 1) Click on that and choose By Delimiter.
2) It will default to Comma, but you need to click on Advanced options and choose the Rows radio button.
Click OK and it should turn your data into rows of separated numbers and change the type (this time helpfully) to decimal.
Now you need to add an index. 3) Go to the Add Column tab and click on Add Index, starting from 1.
Parse Output Table
Now go back to Output and repeat steps 1), 2) and 3) for it as well. Then you will have to take an extra step to clean up your text column. Right-Click on the Response column and choose Transform > Trim on the data.
That will get rid of those spurious spaces.
Merge Them Back Together
While you still have the Output table selected, go to the Home tab and choose Merge Queries.
It will bring up this window:
Choose Table1 from the bottom dropdown. Click on Index on both tables and click OK. You will get something like this:
Click on the button on the top right of the Table1 column and then unselect Index and Use original column name as prefix.
Click OK. Right click the Index column and Remove it. You now have your answer, but you still need to bring it back to Excel.
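If it helps to see the whole split, index, trim, and merge sequence in one place, here is a rough Python sketch of the same idea. It is not what Power Query runs internally. The sample cells and the split_to_rows helper are hypothetical; only the column names Response and Response RT come from the steps above.

```python
# Hypothetical packed cells, one row per line of the experiment output.
rows = [
    {"Response": "a, b, c", "Response RT": "1.2,0.8,1.5"},
    {"Response": "d, e", "Response RT": "0.9,1.1"},
]


def split_to_rows(rows, column, convert=str):
    """Split each comma-separated cell into one value per row, trim it,
    and number the resulting rows starting from 1 (the Index column)."""
    values = [
        convert(part.strip())
        for row in rows
        for part in row[column].split(",")
    ]
    return {index: value for index, value in enumerate(values, start=1)}


responses = split_to_rows(rows, "Response")               # the Output query
times = split_to_rows(rows, "Response RT", convert=float)  # the Table1 query

# Merge on the shared index, keeping every Output row even without a match.
output = [(responses[i], times.get(i)) for i in responses]
```

Note that the pairing relies on both columns holding the same number of values in every original row. If one cell is short by a value, every later pair shifts by one, so it is worth spot-checking a few rows of the result.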
Putting it back in Excel
Click on Close & Load To on the left-hand side of the Home tab. To keep things simple, just click OK.
It will put both Output and Table1 into your workbook as worksheets. (This is why I said it is not the most efficient approach; you can always delete the Table1 worksheet. Excel will complain when you do, but you can ignore it.) Output is your answer.
Congratulations, you just did an ETL (extract, transform, and load) operation in data analytics. Take another selfie with the answer and share it on social media.

Set specific height for all existing/new rows of a table

I have used Excel for quite some time now, but only with traditional formulas. Now I am trying to create a more elaborate document for my business, using VBA to expand my possibilities.
I have done some research and started practicing but found something that I can’t understand how to do yet.
I have a worksheet with a table on it. Since this is a fresh document it has nothing but the header and a blank row below it. What I always did when writing the information was type text in any of the columns that belong to the row immediately below the table (without actually inserting a new row myself). This creates a new row for the table where most of its format is kept, such as text size and formulas. However, I noticed that the row height doesn’t automatically change to that of the previous rows of the table, and changing it manually for each new row is very time consuming.
I would really appreciate it if anyone could share the Excel VBA code needed to fix this issue. I have thought of two possible approaches. The first is code that automatically sets all the rows of the table to a height of 20 (I want all rows to be the same size, so no problem with that), assuming this would also affect new rows as I add them the way I mentioned. The second is code that automatically sets every new row of the table to a height of 20. It doesn't matter which one you choose; the easiest will be just fine. Thanks in advance.

Hiding certain columns on an Excel table

I've been trying to hide table columns on my Excel spreadsheet. I could hide entire columns if my data were not in table form, but I cannot do that here because of the information underneath the table. For the purposes of this spreadsheet, that information needs to stay below. So I can't really convert the table, and I can't hide the information that is irrelevant.
Does anyone have a solution for this (this seems like a basic problem but I'm relatively new to Excel)?
You don't mention whether the table above changes in number of rows, but another option is to use Data ---> GROUP on the rows of the table and then collapse them. Select ALL rows relevant to the table and then click GROUP. To the left of the row numbers you'll have a line to click (with a + or -) to expand or collapse the data. Visually, it will look like only the data below is present, and you can set print ranges to cover only the data below.
Hope that helps
You can only hide full columns. If hiding the data in the table is important, then the data below needs to be moved to a different sheet. Or, if it only needs to be hidden when printed, then you can change the font color to match the background color.

Is there a way to shrink an Excel table to fit the data?

I have an Excel table that several other places in the spreadsheet use for various reasons, and then I realized that the table had bad data. I collected new data, and there were fewer rows in the new data set than in the previous table.
Is there a way I can simply shrink the table to reflect the new count of data?
Not at all sure I understand your requirement, but I'm guessing you want to reduce the size of a Table without deleting entire rows in your spreadsheet (because of content present elsewhere in your spreadsheet in the same rows as your Table data).
If so, merely select the area of your Table to be deleted and press Delete. If you want to remove the formatting that remains, select the angle icon shown at the extreme bottom right of your Table and drag it up to suit.
I am assuming the (Table) rows to be deleted are a contiguous block at the bottom of the Table.

Resources