SAS proc transpose and output to excel - excel

Another SAS question from me (I noticed these don't come up here that often...):
I have a data set containing something like this:
Name | Category | Level | Score
John | cat1 | 1 | 80
John | cat1 | 2 | 70
John | cat1 | 3 | 10
John | cat2 | 1 | 60
John | cat2 | 2 | 95
John | cat2 | 3 | 43
John | cat2 | 4 | 28
And the output (excel format) should look like:
| cat1 | cat2 |
name | 1 | 2 | 3 | 1 | 2 | 3 | 4 |
John | 80 | 70 |10 |60 |95 |43 |28 |
What I do now is use proc transpose to get the data in the right order and then proc export to go to .xls.
This works fine, except for one thing: I can't get the second layer of subdivision to work. So right now, before my proc transpose, I concatenate category and level in my dataset (e.g. making it '1_cat1') and then transpose on this value, giving me the following output:
name | 1_cat1 | 2_cat1 | 3_cat1 | 1_cat2 | 2_cat2 | 3_cat2 | 4_cat2 |
John | 80 | 70 | 10 | 60 | 95 | 43 | 28 |
Is there any way to get the first, desired output ?

Assuming that you're familiar with your ODS options to get it into Excel (I'm just lazily using html and saving it as .xls, but you could use tagsets etc. instead), here is a PROC REPORT solution to display the data in the format you're looking for. Check out the use of across variables in proc report. There's probably a way to suppress the column that isn't used under cat1 in the output, but I can't recall it right now.
/* Build the sample data from the question. */
data testData;
    infile datalines dsd delimiter='|';
    input name $ category $ level score;
    datalines;
John | cat1 | 1 | 80
John | cat1 | 2 | 70
John | cat1 | 3 | 10
John | cat2 | 1 | 60
John | cat2 | 2 | 95
John | cat2 | 3 | 43
John | cat2 | 4 | 28
;
run;

/* Write the report as HTML to a file with an .xls extension so Excel opens it. */
ods html file="C:\SomePath\MyFile.xls";

proc report data=testData;
    /* The commas nest the items: score under level under category. */
    columns name category,level,score;
    define name / group;
    define category / across '';
    define level / across '';
    define score / sum '';
run;

ods html close;

I don't think you will be able to go directly to your desired output using proc transpose, since you want each category to span multiple levels. You might want to research two other procedures, REPORT and TABULATE. I believe you can do this directly from either one, but it has been years since I used them. A third option is to create an XML file with ODS, which lets you control pretty much exactly how you want the output to appear, though it takes a little more effort to learn.
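For readers without SAS at hand, the target layout can be sketched in plain Python. The point it illustrates is the same one the across variables make: each column is identified by a (category, level) pair instead of a concatenated string such as '1_cat1'. The function name `two_level_pivot` and the list-of-rows result are assumptions made for this illustration, not anything PROC REPORT or PROC TABULATE produces.

```python
# Long-format rows from the question: (name, category, level, score).
rows = [
    ("John", "cat1", 1, 80),
    ("John", "cat1", 2, 70),
    ("John", "cat1", 3, 10),
    ("John", "cat2", 1, 60),
    ("John", "cat2", 2, 95),
    ("John", "cat2", 3, 43),
    ("John", "cat2", 4, 28),
]


def two_level_pivot(rows):
    """Return two header rows followed by one data row per name."""
    # Each output column is a (category, level) pair, sorted by category then level.
    columns = sorted({(category, level) for _, category, level, _ in rows})

    names = []   # names in order of first appearance
    scores = {}  # name -> {(category, level): score}
    for name, category, level, score in rows:
        if name not in scores:
            names.append(name)
        scores.setdefault(name, {})[(category, level)] = score

    # Top header: print each category once, above its first level.
    top = [""]
    previous = None
    for category, _ in columns:
        top.append(category if category != previous else "")
        previous = category

    bottom = ["name"] + [str(level) for _, level in columns]
    body = [[name] + [scores[name].get(col, "") for col in columns] for name in names]
    return [top, bottom] + body


table = two_level_pivot(rows)
for line in table:
    print(" | ".join(str(cell) for cell in line))
```

Only the combinations that actually occur become columns here, so cat1 gets three levels and cat2 gets four.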

Related

Adding fields to Pivot Table from another datasource using VBA

I've created Pivot Tables before using VBA, but my professor recently gave us a bonus task that, although not required, is driving me nuts.
Use a VBA Macro to write Region, District, and Store Name to your first report to create a new report
1) My first report looks like this:
Location | Sum of ActNetSales | Sum of PlanNetSales
----------|--------------------|---------------------
1 | $76,170 | $65,172
100 | $163,691 | $140,057
101 | $34,724 | $29,710
104 | $70,501 | $60,322
106 | $113,826 | $97,391
2) Below is the data source for the above report.
Division | Year | Week | Location | SchedDept | PlanNetSales | ActNetSales | AreaCategory
----------|------|------|----------|-----------|--------------|-------------|--------------
5 | 2018 | 10 | 520 | 541 | 1943.2 | 2271.115 | Non-Comm
5 | 2018 | 10 | 520 | 608 | 4378.4 | 5117.255 | Non-Comm
5 | 2018 | 10 | 520 | 1059 | 1044.8 | 1221.11 | Comm
5 | 2018 | 10 | 520 | 1126 | 6308 | 7372.475 | Non-Comm
3) My professor wants me to add the following information to the above table: Region, District and Store Name. However, these 3 fields are from a different data source than the above report. Below is the data source for the 3 fields I've listed.
Division | Location | LocationName | Region | RegionName | District | DistrictName
----------|----------|--------------|--------|------------|----------|--------------
5 | 1 | Location 1 | 3 | Region 3 | 18 | District 18
5 | 4 | Location 4 | 5 | Region 5 | 32 | District 32
5 | 5 | Location 5 | 3 | Region 3 | 19 | District 19
5 | 6 | Location 6 | 5 | Region 5 | 28 | District 28
I've already produced what he's asking for by joining the 2 tables (I built a unique key by concatenating the foreign keys, Location and Division, and used a basic INDEX/MATCH) and creating a Pivot Table from the result, but I want to try my best to solve the bonus! Unfortunately, I don't have Power Query, so I had to do it this way. I've searched for this and can't find any good resources. Is there anything you can suggest, or can you point me in the right direction? Thank you!
Is it cheating to modify your table under (2) to add the columns region, district, and storename using VLOOKUP on the third table? The second table would then have raw data, and extra columns of constructed data, effectively joining it to the third table using the Excel VLOOKUP trick rather than an actual SQL table join.
Then you can just use the expanded, joined table as your one Pivot Table source.
Cheating is legal in love, war, and IT solutions.
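The lookup-then-pivot idea can be sketched in Python to show what the joined table has to contain before it is summarised. The lookup rows are taken from the question's third table; the sales figures are made-up values for illustration only, and the function name `join_and_sum` is hypothetical. Unmatched locations are labelled "#N/A" to mirror what VLOOKUP would show.

```python
# Lookup keyed on (Division, Location), as in the question's third table.
locations = {
    (5, 1): ("Region 3", "District 18", "Location 1"),
    (5, 4): ("Region 5", "District 32", "Location 4"),
    (5, 5): ("Region 3", "District 19", "Location 5"),
    (5, 6): ("Region 5", "District 28", "Location 6"),
}

# Hypothetical fact rows: (Division, Location, PlanNetSales, ActNetSales).
# These numbers are invented for the example, not taken from the question.
sales = [
    (5, 1, 100.0, 120.0),
    (5, 1, 50.0, 40.0),
    (5, 4, 200.0, 210.0),
    (5, 9, 10.0, 12.0),  # no matching row in the lookup table
]


def join_and_sum(sales, locations):
    """Attach region/district/store name to each fact row, then total per store."""
    missing = ("#N/A", "#N/A", "#N/A")
    totals = {}  # (region, district, store) -> [sum of actual, sum of plan]
    for division, location, plan, actual in sales:
        key = locations.get((division, location), missing)
        bucket = totals.setdefault(key, [0.0, 0.0])
        bucket[0] += actual
        bucket[1] += plan
    return totals


report = join_and_sum(sales, locations)
for (region, district, store), (actual, plan) in report.items():
    print(region, district, store, actual, plan)
```

Keying the lookup on the (Division, Location) pair plays the same role as the concatenated key described in the question.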

SUMIFS on filtered data?

I am looking for a way to do a SUMIFS that uses a filtered list. I would like to:
Grab all the sales from sheet "Sales" where Group = "Flowers", AND
Store # on sheet "Sales" matches the filtered Store # list on sheet "Report"
The following code will work only when there is no filter on the Store #'s:
=SUMIFS(Sales!C:C,Sales!B:B,"=FLOWERS",Sales!A:A,Report!A:A)
Sheet 1 Name = Report
Row (filtered) Store # (A)
====================|==============|
| 21 | 13 |
| 36 | 28 |
| 81 | 75 |
| 84 | 78 |
Sheet 2 Name = Sales
Store # (A) Group (B) Sales (C)
===========|==============|=============|
| 21 | Flowers | $100 |
| 36 | Flowers | $200 |
| 81 | Bread | $500 |
| 1 | Flowers | $600 |
| 3 | Flowers | $100 |
| 36 | Bread | $200 |
| 8 | Bread | $100 |
| 84 | Flowers | $300 |
Is there any way for me to accomplish this, so that when the filtered list changes the total changes too, similar to a subtotal?
The easiest way seems to be a PivotTable: Group (B) for FILTERS, Store # (A) for ROWS and Sum of Sales (C) for VALUES, then filter Group (B) to select 'Flowers' and filter the rows to suit.
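The logic being asked for (sum only the rows that match the group and whose store is currently visible) can be written out as a small Python sketch. It assumes that 21, 36, 81 and 84 are the visible store numbers on the Report sheet, since those are the values that also appear on the Sales sheet; the function name `sum_visible` is hypothetical.

```python
# Rows from the Sales sheet: (store, group, sales).
sales = [
    (21, "Flowers", 100),
    (36, "Flowers", 200),
    (81, "Bread", 500),
    (1, "Flowers", 600),
    (3, "Flowers", 100),
    (36, "Bread", 200),
    (8, "Bread", 100),
    (84, "Flowers", 300),
]

# Assumed set of store numbers left visible by the filter on the Report sheet.
visible_stores = {21, 36, 81, 84}


def sum_visible(sales, visible_stores, group):
    """Total the sales for one group, counting only stores that are visible."""
    wanted = group.lower()  # SUMIFS text criteria are case-insensitive
    return sum(
        amount
        for store, row_group, amount in sales
        if row_group.lower() == wanted and store in visible_stores
    )


total = sum_visible(sales, visible_stores, "Flowers")
print(total)
```

Changing `visible_stores` changes the total, which is the subtotal-like behaviour the question describes.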

Adding Columns to Excel As List From Other Sheet Grows

Background
I'm creating a grade book in Excel for my wife. I have sheets for the overall grade, classwork, exams, and participation.
The three sections of work (classwork, exams, and participation) each have a variable number of items, and each item has a different number of points possible. Each section has a weight in the overall grade.
I have this up and running with a fixed number of items per section, but I'd like to create a template that can be updated from class to class and year to year.
Here's the problem:
On the classwork sheet, I'd like to be able to enter new assignments and their point value and have that automatically update the master grade sheet on my first sheet tab. Is there any way to add columns in a section of one worksheet (the master grade sheet) when new rows are added to another worksheet (the list of assignments)?
It is possible to achieve this without using VBA. The reason you will have difficulty achieving this, however, is that you've violated normal form in the table you've already built. It appears the pertinent data you're looking for is each student's score on each assignment. If this is correct, the level of granularity you want is the Assignment, not the Student.
There are some fairly quick ways to modify your existing work to account for this. I've written out some sample data below. Take a look and see if it helps.
Sample Original Table
+---------+------+------------+------------+
| Student | Quiz | Thumbnails | Watercolor |
+---------+------+------------+------------+
| Paul | 3 | 10 | 90 |
| Frank | 4 | 10 | 95 |
| Mary | 5 | 10 | 70 |
| Ellen | | 10 | 85 |
| Sue | 6 | 10 | 92 |
| Anton | 5 | 10 | 87 |
+---------+------+------------+------------+
Image of the data is below ( note I have highlighted the blank value ).
Sample Normal Table
+---------+-------------+-----------+-------+
| Student | Assignment | New_Score | Score |
+---------+-------------+-----------+-------+
| Paul | Quiz | | 3 |
| Frank | Quiz | | 4 |
| Mary | Quiz | | 5 |
| Ellen | Quiz | | 0 |
| Sue | Quiz | | 6 |
| Anton | Quiz | | 5 |
| Paul | Thumbnails | | 10 |
| Frank | Thumbnails | | 10 |
| Mary | Thumbnails | | 10 |
| Ellen | Thumbnails | | 10 |
| Sue | Thumbnails | | 10 |
| Anton | Thumbnails | | 10 |
| Paul | Watercolor | | 90 |
| Frank | Watercolor | | 95 |
| Mary | Watercolor | | 70 |
| Ellen | Watercolor | | 85 |
| Sue | Watercolor | | 92 |
| Anton | Watercolor | | 87 |
| Mary | ExtraCredit | 10 | 10 |
| Ellen | ExtraCredit | 8 | 8 |
| Sue | ExtraCredit | 9 | 9 |
| Anton | ExtraCredit | 10 | 10 |
+---------+-------------+-----------+-------+
Image of the data is below. The Score column reaches back to your old table and grabs the score you've already entered for the students, so you won't have to do this all manually. The formula for this is =INDEX(non_normal,MATCH([@Student],non_normal[Student],0),MATCH([@Assignment],non_normal[#Headers],0)).
This assumes you've formatted the old data as an Excel Table ( ctrl+t ) and named it non_normal ( alt+j+t+i ). Note that the unsubmitted assignment for Ellen comes through with a score of zero using this method. I've added a column named New_Score so that you can add new student-assignment submission combinations to the table without having to modify your old non_normal table ( which was the trouble in the OP ). With this column added, the formula in the Score column can be changed to =IF(NOT(ISBLANK([@[New_Score]])),[@[New_Score]],INDEX(non_normal,MATCH([@Student],non_normal[Student],0),MATCH([@Assignment],non_normal[#Headers],0))) which takes the New_Score value if available and the original score if not.
The orange cells are new student-assignment submission combinations. Note you do not need to add a row for every student, just add a row whenever a student submits an assignment.
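The reshaping step described above (one row per student and assignment, with a blank treated as zero and New_Score taking priority when present) can be sketched in Python as follows. The names `unpivot` and `effective_score` are hypothetical, and only three of the sample students are included to keep the example short.

```python
# A slice of the original wide table; None marks Ellen's blank Quiz cell.
wide = [
    {"Student": "Paul", "Quiz": 3, "Thumbnails": 10, "Watercolor": 90},
    {"Student": "Frank", "Quiz": 4, "Thumbnails": 10, "Watercolor": 95},
    {"Student": "Ellen", "Quiz": None, "Thumbnails": 10, "Watercolor": 85},
]


def unpivot(wide, id_column="Student"):
    """Turn one row per student into one row per (student, assignment)."""
    assignments = [column for column in wide[0] if column != id_column]
    long_rows = []
    for assignment in assignments:
        for row in wide:
            score = row[assignment]
            long_rows.append({
                "Student": row[id_column],
                "Assignment": assignment,
                # A blank cell comes through as zero, as in the answer.
                "Score": 0 if score is None else score,
            })
    return long_rows


def effective_score(new_score, old_score):
    """Mirror the IF(NOT(ISBLANK(...))) formula: prefer New_Score when given."""
    return new_score if new_score is not None else old_score


long_rows = unpivot(wide)
for row in long_rows:
    print(row["Student"], row["Assignment"], row["Score"])
```

With the data in this shape, a new assignment is just more rows, so nothing about the table's columns has to change.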
Sample Assignments Table
+-------------+-----------------+
| Assignment | Points_Possible |
+-------------+-----------------+
| Quiz | 6 |
| Thumbnails | 10 |
| Watercolor | 100 |
| ExtraCredit | |
+-------------+-----------------+
I've added the ExtraCredit assignment with a possible max score of zero/blank ( since not completing extra credit shouldn't count against a student ).
Payoff - Back to the Original Table
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Sum of Score | Column Labels | | | | | |
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Row Labels | Quiz | Thumbnails | Watercolor | ExtraCredit | Grand Total | |
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Anton | 5 | 10 | 87 | 10 | 112 | 96.6% |
| Ellen | 0 | 10 | 85 | 8 | 103 | 88.8% |
| Frank | 4 | 10 | 95 | | 109 | 94.0% |
| Mary | 5 | 10 | 70 | 10 | 95 | 81.9% |
| Paul | 3 | 10 | 90 | | 103 | 88.8% |
| Sue | 6 | 10 | 92 | 9 | 117 | 100.9% |
+--------------+---------------+------------+------------+-------------+-------------+--------+
Using the image below, you pivot your newly normalized data into a Pivot Table ( alt+n+v ). Now, simply adding a new assignment to the normal_assignment Table will cause that assignment to appear in a new column when you refresh the Pivot Table ( alt+a+r+a ).
The % score on the right of the Pivot Table is calculated using the following formula ( with the sample Pivot Table starting in cell $M$2 ): =GETPIVOTDATA("Score",$M$2,"Student",M4)/SUM(assignment[Points_Possible])
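As a check on the percentage column, the same arithmetic can be sketched in Python: each student's total score divided by the sum of Points_Possible, where the blank ExtraCredit entry adds nothing to the denominator. The function name `percent_scores` is hypothetical, and only three students from the sample are included.

```python
# Points possible per assignment; None stands for the blank ExtraCredit cell.
points_possible = {"Quiz": 6, "Thumbnails": 10, "Watercolor": 100, "ExtraCredit": None}

# Normalized scores for three of the sample students: (student, assignment, score).
scores = [
    ("Anton", "Quiz", 5), ("Anton", "Thumbnails", 10),
    ("Anton", "Watercolor", 87), ("Anton", "ExtraCredit", 10),
    ("Frank", "Quiz", 4), ("Frank", "Thumbnails", 10), ("Frank", "Watercolor", 95),
    ("Sue", "Quiz", 6), ("Sue", "Thumbnails", 10),
    ("Sue", "Watercolor", 92), ("Sue", "ExtraCredit", 9),
]


def percent_scores(scores, points_possible):
    """Return each student's total as a percentage of all points possible."""
    # Blank entries are skipped, so extra credit can only raise a grade.
    denominator = sum(p for p in points_possible.values() if p is not None)
    totals = {}
    for student, _, score in scores:
        totals[student] = totals.get(student, 0) + score
    return {student: round(100 * total / denominator, 1) for student, total in totals.items()}


percentages = percent_scores(scores, points_possible)
print(percentages)  # {'Anton': 96.6, 'Frank': 94.0, 'Sue': 100.9}
```

The denominator is 116 (6 + 10 + 100), which is why Sue's 117 points come out above 100%.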
I've uploaded the raw sample file for this to my public repo if you'd like to pull it and take a peek at the source. Credit to sensefulsolutions for text-to-table conversion.
Hope this is what you need!

Convert Rows into Columns with same values

Let's say I've a table
ID | Item | Purchased
17 | Chocolate | 1304
17 | Biscuit | 1209
17 | Jelly | 657
17 | Milk | 2234
18 | Chocolate | 1000
19 | Jelly | 2387
I want to convert the rows into columns for each Item through Pivot tables in Excel
ID | Chocolate_Purchased | Biscuit_Purchased | Jelly_Purchased | Milk_Purchased
17 | 1304 | 1209 | 657 | 2234
18 | 1000 | | |
19 | | | 2387 |
How do I do that in Excel?
One simple way is with a Pivot table, although you may need to do some massaging to get exactly the output format you want.
With ID-->Rows; Item-->Columns and Purchased-->Values, you can easily produce a Pivot looking like:
And you can do all kinds of different things with the formatting.
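To make the intended result concrete, here is a minimal Python sketch of the same pivot: IDs down the side, one column per Item, and Purchased summed in the cells. The function name `pivot_purchases` is hypothetical, and the `_Purchased` suffix is added only to match the headers requested in the question.

```python
# Rows from the question: (ID, Item, Purchased).
purchases = [
    (17, "Chocolate", 1304),
    (17, "Biscuit", 1209),
    (17, "Jelly", 657),
    (17, "Milk", 2234),
    (18, "Chocolate", 1000),
    (19, "Jelly", 2387),
]


def pivot_purchases(purchases):
    """Return a header row plus one row per ID, with blanks where nothing was bought."""
    items = []  # item names in order of first appearance
    table = {}  # ID -> {item: summed amount}
    for id_, item, amount in purchases:
        if item not in items:
            items.append(item)
        row = table.setdefault(id_, {})
        row[item] = row.get(item, 0) + amount  # sum duplicates, like "Sum of Purchased"

    header = ["ID"] + [f"{item}_Purchased" for item in items]
    body = [[id_] + [table[id_].get(item, "") for item in items] for id_ in sorted(table)]
    return [header] + body


result = pivot_purchases(purchases)
for line in result:
    print(" | ".join(str(cell) for cell in line))
```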

Excel Combining Multiple Rows

I feel like I am missing something simple I would like to do with Excel but I am asking the question incorrectly on Google...here it goes.
I'm taking a look at some Excel sheets for a friend who runs a race timing company. At the end of a race he has an excel sheet with the following format for a series of races
Name | Gender | Age | Race 1 | Race 2 | Race 3
Bob | M | 20 | 1 | |
Al | M | 24 | 2 | |
Bob | M | 20 | | 2 |
Al | M | 24 | | 1 |
::Assume we don't care about time right now, just place::
I would like to do "something" (again, I'm not sure what the proper term is; merge in Excel actually merges two adjacent cells together) so that I can get the final output such that
Name | Gender | Age | Race 1 | Race 2 | Race 3
Bob | M | 20 | 1 | 2 |
Al | M | 24 | 2 | 1 |
I'm not sure how to collapse the data for the like rows together.
I'm not opposed to writing a little VBA, but I am thinking this is a built in Excel function but I'm not sure what it is called or how to make it "dance".
Thanks!
PivotTable.
The data format is making life a bit more difficult than it needs to be. Rather than having individual columns for race #1, race #2, race #3 etc, it would make life easier to have a column called "Race Number" and arrange the data like this:
Name | Gender | Age | Race Number | Place
Bob | M | 20 | 1 | 1
Al | M | 24 | 1 | 2
Bob | M | 20 | 2 | 2
Al | M | 24 | 2 | 1
This would make things like PivotTable (as suggested by Jason) a lot easier to work with.
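If the sheet has to stay in its current layout, the collapsing step itself is simple to describe: group rows by Name, Gender and Age, and keep the first non-blank value seen in each race column. A minimal Python sketch of that idea follows; the function name `collapse_rows` is hypothetical, and None stands for an empty cell.

```python
# Rows as in the question; None marks an empty cell.
rows = [
    {"Name": "Bob", "Gender": "M", "Age": 20, "Race 1": 1, "Race 2": None, "Race 3": None},
    {"Name": "Al", "Gender": "M", "Age": 24, "Race 1": 2, "Race 2": None, "Race 3": None},
    {"Name": "Bob", "Gender": "M", "Age": 20, "Race 1": None, "Race 2": 2, "Race 3": None},
    {"Name": "Al", "Gender": "M", "Age": 24, "Race 1": None, "Race 2": 1, "Race 3": None},
]


def collapse_rows(rows, key_columns=("Name", "Gender", "Age")):
    """Merge rows that share the key columns, filling blanks from later rows."""
    merged = {}  # key tuple -> combined row
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        target = merged.setdefault(key, dict(row))
        for column, value in row.items():
            # Only fill cells that are still empty; the first value seen wins.
            if target.get(column) is None and value is not None:
                target[column] = value
    return list(merged.values())


collapsed = collapse_rows(rows)
for row in collapsed:
    print(row)
```

This treats two people with the same name, gender and age as one runner, so a real timing sheet would probably want a bib number or similar identifier in the key.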
