I have some data (say hits) for different cache sizes, say 1M, 2M and 4M, in three columns. I want to check that Hits(1M) < Hits(2M) < Hits(4M). One way is to write two comparison operations, but I have many columns. Is there a way to check for something like an 'ascending' order relationship between columns?
How many rows are you dealing with?
One solution (if you're not dealing with an overwhelmingly large number of rows) would be to take each row, add the header of each column above it, and then sort that row along with its own set of headers (a two-row sort across all your columns). The resulting order of each header row would give you your desired answer.
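If sorting feels heavy-handed, a small VBA user-defined function can check each row directly. This is just a sketch, not something from the thread: the function name IsAscending and the assumption that each row's hit counts sit in adjacent cells are mine.

' Returns True if the values in a single-row range increase strictly
' from left to right, e.g. Hits(1M) < Hits(2M) < Hits(4M).
Public Function IsAscending(rng As Range) As Boolean
    Dim i As Long
    IsAscending = True
    For i = 2 To rng.Columns.Count
        If rng.Cells(1, i).Value <= rng.Cells(1, i - 1).Value Then
            IsAscending = False
            Exit Function
        End If
    Next i
End Function

Placed in a standard module, it can be called from the worksheet, e.g. =IsAscending(B2:D2), and filled down; adding more cache-size columns only means widening the range.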
Looking to use VBA to compare two tables, each with three columns, against each other. Beginner here and very lost.
They may each have a different number of entries, and there may be some entries in table A that aren't in table B, and vice versa.
Some of the individual columns may match, but I'm trying to work out how to make sure all three columns are compared as one record against all three columns in the other table.
For example:
xyz123 55.50 12/07/21 compared with XYZ123 54.55 12/07/21 will show up as not a match, because the middle column is a different number.
I have attached a picture below. For the most part, and unlike the photo, each table will be in a completely random order, and it's unlikely that the entry in table 1, row 1 will be the same as the entry in table 2, row 1.
Ideally, I'm trying to create two new tables to the right of the original tables: the first holding the entries that table 1 has but table 2 does not, and the second holding the entries that table 2 has but table 1 does not.
I have attached an example below of the end result I'm looking for. The four rows on the left are entries that the first table has but the second table doesn't, and the rows to the right are the entries that the second table has but the first table does not.
I've tried searching for this but haven't found anything that matches what I've got, and I'm struggling to adapt someone else's code to my specific problem.
Any help on this would be greatly appreciated.
Maybe not a direct answer to your problem, but is this data also in a database somewhere, or are you familiar with MS Access? You could open the tables in Access, and this kind of thing is pretty easy to do with databases.
If not, then yes, it is doable with VBA. There are numerous ways of doing it.
The simplest is to step through one table a row at a time and compare each row with every row in the other table, recording whether or not a match was found; a sketch of that approach is below. This works well for small tables and is quick to write, but for large tables it is wasteful and may take a long time to complete.
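Here is a minimal VBA sketch of that brute-force comparison. The sheet layout (table 1 in columns A:C, table 2 in columns E:G, both starting on row 2, output in column I) and the case-insensitive match on the code column are assumptions, so adjust them to your workbook.

' Lists every row of table 1 (A:C) that has no exact match in table 2 (E:G).
' Run it again with the column references swapped to get the rows that
' table 2 has and table 1 does not.
Sub ListUnmatchedRows()
    Dim ws As Worksheet
    Dim r1 As Long, r2 As Long, outRow As Long
    Dim last1 As Long, last2 As Long
    Dim found As Boolean

    Set ws = ActiveSheet
    last1 = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    last2 = ws.Cells(ws.Rows.Count, "E").End(xlUp).Row
    outRow = 2

    For r1 = 2 To last1
        found = False
        For r2 = 2 To last2
            ' All three columns must agree for the rows to count as a match.
            If UCase$(CStr(ws.Cells(r1, "A").Value)) = UCase$(CStr(ws.Cells(r2, "E").Value)) _
               And ws.Cells(r1, "B").Value = ws.Cells(r2, "F").Value _
               And ws.Cells(r1, "C").Value = ws.Cells(r2, "G").Value Then
                found = True
                Exit For
            End If
        Next r2
        If Not found Then
            ws.Cells(outRow, "I").Resize(1, 3).Value = ws.Cells(r1, "A").Resize(1, 3).Value
            outRow = outRow + 1
        End If
    Next r1
End Sub

For larger tables, a Scripting.Dictionary keyed on the concatenated column values avoids the inner loop, but the nested version above mirrors the simple approach described.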
Data:
I have two datasets, set up in Excel as matrices: the first column holds an ID and there are lots of rows, and each of the remaining columns has its own ID number as its header, so roughly 500 rows and around 45 columns.
Like ID, ColumnB, ColumnC
The other matrix has the same headers but in a different order; that does not seem to matter.
Challenge:
So I need to find the differences between the two. I made a left anti join on ID, and that gives me the IDs that are in one dataset and not the other, right? I do one for each direction, so I get the IDs that are missing from each dataset (matrix).
I need to do the same trick even when both IDs are present, so that I end up with only the rows that differ somewhere: if a row ID has an "X" in ColumnB in dataset 1 but no "X" in ColumnB in dataset 2, I want to include it in my new table. In other words, if the two rows being compared differ in even one of the columns, I need to know about it and want that row in my new data; only the data with a difference.
Tried:
I tried selecting not just the ID column but all the columns in the left anti join setup, but that does not seem to work at all.
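Not the Power Query fix itself, but for comparison, here is how the "different in any column" test could be sketched in VBA by building a compound key from every column of a row. The sheet names Data1, Data2 and Diff are placeholders, and it assumes the columns have first been arranged in the same order on both sheets.

' Copies to the "Diff" sheet every row of Data1 whose full set of column
' values (ID plus all the data columns) does not appear anywhere in Data2.
' Swap the two sheet names and run again for the opposite direction.
Sub RowsMissingFromData2()
    Dim src As Worksheet, other As Worksheet, outSh As Worksheet
    Dim seen As Object
    Dim r As Long, c As Long, outRow As Long
    Dim lastRow As Long, lastCol As Long
    Dim k As String

    Set src = Worksheets("Data1")
    Set other = Worksheets("Data2")
    Set outSh = Worksheets("Diff")
    Set seen = CreateObject("Scripting.Dictionary")

    ' Remember the compound key of every row in Data2.
    lastRow = other.Cells(other.Rows.Count, 1).End(xlUp).Row
    lastCol = other.Cells(1, other.Columns.Count).End(xlToLeft).Column
    For r = 2 To lastRow
        k = ""
        For c = 1 To lastCol
            k = k & "|" & CStr(other.Cells(r, c).Value)
        Next c
        seen(k) = True
    Next r

    ' Any Data1 row whose compound key is unknown differs in at least one column.
    outRow = 2
    lastRow = src.Cells(src.Rows.Count, 1).End(xlUp).Row
    lastCol = src.Cells(1, src.Columns.Count).End(xlToLeft).Column
    For r = 2 To lastRow
        k = ""
        For c = 1 To lastCol
            k = k & "|" & CStr(src.Cells(r, c).Value)
        Next c
        If Not seen.Exists(k) Then
            src.Rows(r).Copy outSh.Rows(outRow)
            outRow = outRow + 1
        End If
    Next r
End Sub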
I have two tables with the same data but in different rows. I want to sort them so that the tables line up side by side, each duplicate row in front of its duplicate.
attached photo
In a new worksheet, copy the code data from one table and append to that a copy of the code data from the other. Apply Remove Duplicates to that column and sort ascending.
Now use that sheet to look up (with VLOOKUP) the Description, UoM and Unit Price from one of your tables into three separate columns (say columns 2, 3 and 4), and look up the same fields from the other table into a further three columns (say 5, 6 and 7).
Wrap both formulae in IFERROR(....,"") to reduce noise.
I take it any numbering will be applied independently in the new sheet (i.e. the No. column is not required to be copied there).
Incidentally, you have a lot of unconventional hyphens (e.g. L-80 is never normally written other than as L80), m as a unit of measure for OCTG leads to many problems, and with competent staff a structured catalogue could be advisable for high-value stock and long-term storage.
I have imported a bunch of data using PowerQuery into a single table and am building dashboard reporting. I have been using Pivot Tables to build my reports, which has worked fine so far.
However, I've come to a point where I want to simply show the count of multiple columns (calculated fields). So I have columns A, B, C and D, and want to show the count of each. But I don't want them to be subsets (or children) of one another, and I don't want to build a bunch of Pivot Tables (the file is already getting pretty big, and I want the counts row by row for easy viewing). Any suggestions?
Also, I am using the "Columns" field already to show the counts by certain weeks (week one, week two, etc.).
Thanks,
-A
Thanks for the follow-up. Within PowerPivot, I have four calculated fields/columns that are True/False. I want to know how many times each of those columns was marked "True" (I can rename the "True" field to distinguish which field it's referencing). But I don't want four pivot tables. Right now I can only think of making four pivot tables, filtering out the False values in each one, then hiding rows so the "True" counts stack on top of one another. If I put all four fields together in the same Pivot, the three below the first become subsets. I don't want subsets, just occurrence counts.
Does this help provide clarification?
If I understand you correctly, here's an example that shows what you're trying to achieve:
The table on the left has the TRUE/FALSE entries and the PivotTable on the right just shows the number of true items in each of those columns.
The format of the DAX measure to produce these count totals is:
[Count of A]=CALCULATE(COUNTROWS(PetFacts),PetFacts[A]=TRUE)
(Apologies to any parrot owners who may get upset that I have inadvertently re-classified their pets as cold-blooded!)
I have a wide column family used as a 'timeline' index, where column names are timestamps. In order to prevent hotspots, I shard the CF by month so that each month has its own row in the CF.
I query the CF for a slice range between two dates and limit the number of columns returned based on the page's records per page, say to 10.
The problem is that if my date range spans several months, I get 10 columns returned from each row, even if there are 10 matching columns in the first row, which would already satisfy my paging requirement.
I can see the logic in this, but it strikes me as a real inefficiency if I have to retrieve redundant records from potentially multiple nodes when I only need the first 10 matching columns, regardless of how many rows they span.
So my question is: am I better off doing a single Get operation on the first row, then another Get on the second row if the first call doesn't return 10 records, and so on until I have the required number of records (or hit the row limit), or should I just accept the redundancy and discard the unneeded records?
I would sample your queries and record how many rows you needed to fetch for each one in order to get your 10 results and build a histogram of those numbers. Then, based on the histogram, figure out how many rows you would need to fetch at once in order to complete, say, 90% of your lookups with only a single query to Cassandra. That's a good start, at least.
If you almost always need to fetch more than one row, consider splitting your timeline by larger chunks than a month. Or, if you want to take a more flexible approach, use different bucket sizes based on the traffic for each individual timeline: http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra (see the "Variable Time Bucket Sizes" section).