I'm trying to pick the 3 variables that are most correlated with a certain variable. I ran a correlation analysis between this variable and a set of other variables using the following code:
ods trace on;
proc corr data=all_base_corr outp=corr1;
var DEPVAR;
with VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 VAR9;
ods select PearsonCorr ;
run;
ods trace off;
I only selected the PearsonCorr table in the output because that's all I'm interested in. However, it also gives me statistics other than the correlation values, such as Mean, Std and N, and I'm not sure how to get rid of those in the output.
Basically I want a table with the list of variables in one column and the correlation value in a second column, so that I can sort it and pick the top 3 correlated variables.
I appreciate any feedback and solutions.
Thank you,
SE
With PROC CORR, the OUTP= data set stores several statistics, and its _TYPE_ variable identifies which statistic each row holds (MEAN, STD, N, CORR). Filter on _TYPE_ to keep only the correlation rows. The details are here:
http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_appssds_sect006.htm
Please let me know if that's not what you're looking for.
Best of luck!
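The selection step the question asks for (one row per variable, one correlation value, sort, keep the top 3) can be sketched outside SAS as well. The following is a minimal illustration in Python, not SAS code. The data values are made up, and ranking by absolute correlation is an assumption about what "top 3 correlated" means here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def top_correlated(dep, candidates, k=3):
    """Return the k (name, r) pairs with the largest |r| against dep."""
    scores = {name: pearson(dep, values) for name, values in candidates.items()}
    # Assumption: strong negative correlations count as "highly correlated".
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)[:k]

# Made-up illustration data; the names mirror the question, the numbers do not.
depvar = [1, 2, 3, 4, 5]
others = {
    "VAR1": [2, 4, 6, 8, 10],
    "VAR2": [5, 4, 3, 2, 1],
    "VAR3": [1, 3, 2, 5, 4],
    "VAR4": [2, 1, 2, 1, 2],
}

for name, r in top_correlated(depvar, others):
    print(name, round(r, 3))
```

Drop the abs() in the sort key if only positive correlations should count.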
Beginner SAS user here. I'm trying to rename column variables from NHANES data, but the code I am using does not work. The column names are long and drawn out, so it's been nearly impossible for me to recode them into a simpler format. Example and code are below; any assistance is greatly appreciated! For example, I'm trying to rename Respondent sequence number to ID, but SAS has issues with the spaces in the original name, if that makes sense.
data NHANES.Combined;
set NHANES.Combined;
rename Respondent sequence number = ID; run;
Image of Data Table
You have to use the names in the RENAME statement, not the labels you are looking at in the VIEWTABLE window. If you actually have a name with spaces in it (which you will not with NHANES data) then use a name literal in the code so SAS can parse out what parts of the command line represent the variable names.
rename 'non standard name'n = standard_name ;
Run PROC CONTENTS on your dataset to see the variable names and their attributes (TYPE, LENGTH, FORMAT, LABEL).
I have a table that shows me a chemical concentration value based on temperature, pH and ammonia. The way I measure these variables, the ammonia level is always one of these six values (on top of the table), so it works as a categorical variable.
I need a way to interpolate on this table, based on these 3 variables. I tried using a combination of INDEX and MATCH, but I was not able to achieve what I wanted. Then I thought of "dividing" the table in intervals to "reduce" one variable and use an IF function to select which interval to interpolate based on the third variable (I was thinking pH or Ammonia), but I can't figure out a way to change intervals dynamically like this.
Can anyone think of an alternative to accomplish what I'm trying to do? If possible I would like to avoid using VBA, but if there is no other way I have no problem using it.
Thank you for the help!
I'm attaching an example of the table below.
Assuming that PH is in Column A:
=INDEX(A:H;MATCH(6,8;A:A;0)+MATCH(25;B:B;0)-2;MATCH(2;2:2;0))
Where the -2 needs to be changed to the number of rows BEFORE the first 22 in Temp.
This also assumes that the pattern of 22;25;28 in Temp is the same for every pH
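The formula above looks up exact grid values. For values between grid points, one common approach is bilinear interpolation over pH and temperature within the chosen ammonia column. The Python sketch below only illustrates that logic and is not an Excel solution. The layout (a dict keyed by (pH, temperature) pairs), the function names and all numbers are hypothetical placeholders, not the poster's table.

```python
from bisect import bisect_right

def bracket(grid, x):
    """Return the two neighbouring grid values around x (grid sorted, >= 2 points)."""
    if not grid[0] <= x <= grid[-1]:
        raise ValueError(f"{x} is outside the table range")
    i = max(1, min(bisect_right(grid, x), len(grid) - 1))
    return grid[i - 1], grid[i]

def lerp(x, x0, x1, y0, y1):
    """Linear interpolation between (x0, y0) and (x1, y1)."""
    if x1 == x0:
        return y0
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def interpolate(table, ph_grid, temp_grid, ph, temp, ammonia):
    """Bilinear interpolation over pH and temperature for one ammonia column."""
    p0, p1 = bracket(ph_grid, ph)
    t0, t1 = bracket(temp_grid, temp)
    # Interpolate along temperature at each bracketing pH, then along pH.
    low = lerp(temp, t0, t1, table[(p0, t0)][ammonia], table[(p0, t1)][ammonia])
    high = lerp(temp, t0, t1, table[(p1, t0)][ammonia], table[(p1, t1)][ammonia])
    return lerp(ph, p0, p1, low, high)

# Placeholder values only: {(pH, temperature): {ammonia level: concentration}}.
table = {
    (6.0, 22): {2: 10.0}, (6.0, 25): {2: 20.0},
    (7.0, 22): {2: 30.0}, (7.0, 25): {2: 40.0},
}
print(interpolate(table, [6.0, 7.0], [22, 25], ph=6.5, temp=23.5, ammonia=2))
```

Because ammonia is categorical, it is used only to choose the column and is never interpolated.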
Column A has numbers from 1 to 5, and in column B I want to concatenate the number from column A with the relevant ordinal suffix (st, nd, rd, th), as indicated in the image below. Any help will be greatly appreciated!
Without using VBA, your best option would be the "CHOOSE()" function.
Try something like this for any number > 0:
=IF(AND(MOD(ABS(A1),100)>10,MOD(ABS(A1),100)<14),"th",CHOOSE(MOD(ABS(A1),10)+1,"th","st","nd","rd","th","th","th","th","th","th"))
You can set up a named "key" separately, much like the table you are showing, and then reference the key to replace any number with the desired output.
You can then indexmatch/vlookup the number, referencing the table, to find the output.
For ex:
=vlookup($A1,key,2,FALSE)
You could use nested IF functions and RIGHT, like this:
=IF(OR(RIGHT(H2,2)="11",RIGHT(H2,2)="12",RIGHT(H2,2)="13"),CONCAT(H2,"th"),IF(RIGHT(H2,1)="1",CONCAT(H2,"st"),IF(RIGHT(H2,1)="2",CONCAT(H2,"nd"),IF(RIGHT(H2,1)="3",CONCAT(H2,"rd"),CONCAT(H2,"th")))))
This is probably not the fastest option performance-wise.
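The rule that the formulas above encode is easier to see written out as a small function: numbers ending in 11, 12 or 13 take "th", and otherwise the last digit decides. Here is a minimal Python sketch of that rule, for illustration only; the function name is hypothetical.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    magnitude = abs(n)
    if 11 <= magnitude % 100 <= 13:
        # 11, 12, 13 (and 111, 212, ...) are the exceptions to the last-digit rule.
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(magnitude % 10, "th")
    return f"{n}{suffix}"

for value in (1, 2, 3, 4, 11, 21, 112):
    print(ordinal(value))
```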
I would like to solve this either in Excel or in SPSS:
I have categorical data (each number representing a medical diagnosis) that are combined into single cells. In other words, a row (patient) has multiple diagnoses. However, I would like to know the frequencies of each diagnosis. What is the best way to go about this? (See picture for reference)
For SPSS:
First just creating some sample data to demonstrate on:
data list free/e_cerv_dis_state (a20).
begin data
"{1/2/3/6}" "{1/2/4}" "{2/4/5}" "{1/5/6}" "{4}" "{4/5/6}" "{1/2/3/4/5/6}"
end data.
Now the following code will create a separate variable for each possible diagnosis, and will put a 1 in it if the diagnosis exists in the original variable.
do repeat vr=diag1 to diag9/vl=1 to 9.
compute vr=char.index(e_cerv_dis_state, string(vl, f1) ) > 0.
end repeat.
freq diag1 to diag6.
Note this will only work for up to 9 diagnoses. If you have more than that the solution will have to be adapted to multiple digits.
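If the codes can have more than one digit, splitting each cell on the "/" delimiter avoids the single-digit limitation. The Python sketch below illustrates that idea and is not SPSS syntax. It reuses the sample strings from above, and the function name is hypothetical.

```python
from collections import Counter

def diagnosis_counts(cells):
    """Count how many patients (cells) contain each diagnosis code."""
    counts = Counter()
    for cell in cells:
        codes = cell.strip().strip("{}").split("/")
        # A set ensures a code repeated within one cell is counted once per patient.
        counts.update({code.strip() for code in codes if code.strip()})
    return counts

cells = ["{1/2/3/6}", "{1/2/4}", "{2/4/5}", "{1/5/6}", "{4}", "{4/5/6}", "{1/2/3/4/5/6}"]
for code, count in sorted(diagnosis_counts(cells).items()):
    print(code, count)
```

Splitting on the delimiter means a code such as 12 is not mistaken for 1 and 2.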
Assuming that the number of columns is fairly regular, I would suggest using Text to Columns and then using COUNTIF on the resulting cells to count the value you want. However, a more robust and reproducible solution would involve using SQL. You can download the free version of SQL Express here: https://www.microsoft.com/en-gb/sql-server/sql-server-downloads
Then you can import your table of data; here's how to do that: How to import an Excel file into SQL Server?
Once the data is in a SQL database, it is easier to query for the answers you want. For example, you can use a select statement like this:
SELECT count(e_cerv_dis_state)
FROM diagnoses -- hypothetical table name; use the name you gave the imported table
WHERE e_cerv_dis_state = '6'
It would also be possible to use a CASE WHEN statement to add-in the names of the diagnoses.
I've tried to build a very simple model in order to learn how to use CombiTable in OpenModelica. I want to output the value of the table (in particular y3). Can you help me, please? I show you pictures with the errors. Thank you very much. I'm using the latest version, 1.12.0-64bit.
In line 3 you are setting columns=2:size(table,2), but the variable table is not defined anywhere in your script. You're probably thinking of table as an array; if that's the case, you have to define it.
If that's not the case and you're just testing Modelica, you can use a constant, columns=2:N, where N is the number of columns in your table in the file.